Changeset f5bf3c2
- Timestamp:
- Mar 20, 2025, 1:21:53 PM (31 hours ago)
- Branches:
- master
- Parents:
- 62b5940
- Location:
- doc/theses/fangren_yu_MMath
- Files:
-
- 2 edited
Legend:
- Unmodified
- Added
- Removed
-
TabularUnified doc/theses/fangren_yu_MMath/intro.tex ¶
r62b5940 rf5bf3c2 14 14 In certain cases, if the resolver fails to find an exact assertion match, it attempts to find a \emph{best} match using reasonable type conversions. 15 15 Hence, \CFA follows the current trend of replacing nominal inheritance with traits composed of assertions for type matching. 16 The over-arching goal i s to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language.17 Together, the resulting \CFAtype-system has a number of unique features making it different from other programming languages with expressive, static, type-systems.16 The over-arching goal in \CFA is to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language. 17 Together, the resulting type-system has a number of unique features making it different from other programming languages with expressive, static, type-systems. 18 18 19 19 … … 25 25 These rules are then used to transform a language expression to a hardware expression. 26 26 Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism. 27 Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined onceat compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.27 Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation. 28 28 Expressibility, generalization, and safety are all bound up in a language's type system, and hence, directly affect the capability, build time, and correctness of program development. 29 29 … … 35 35 \end{quote} 36 36 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files. 37 Experience from \CC and \CFA developers is that the type system implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions. 38 In many cases, a programmer has no idea there are name clashes, as they are silently resolved, simplifying the development process. 39 Depending on the language, any ambiguous cases are resolved explicitly using some form of qualification and/or cast. 37 Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions. 38 In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process. 39 40 \newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure. 41 This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique. 42 This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user. 43 As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification. 44 While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated. 45 Similarly, lexical nesting is another place where overloading occurs. 46 For example, in object-oriented programming, class memeber names \newterm{shadow} names within members. 47 Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members. 48 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@. 49 Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing. 50 Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast. 51 52 Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}: 53 \begin{quote} 54 In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments. 55 There may be several rules of limited extent which reduce the number of cases, but these are themselves ad hoc both in scope and content. 56 All the ordinary arithmetic operators and functions come into this category. 57 It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00} 58 \end{quote} 59 where a \newterm{transfer function} is an implicit conversion to help find a matching overload: 60 \begin{quote} 61 The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap. 62 Thus for example 3 may be an integer or a real and it may be necessary to change it from one type to the other. 63 The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00} 64 \end{quote} 65 The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument. 66 A similar differentiation is applicable for overloading and default parameters. 67 \begin{cquote} 68 \setlength{\tabcolsep}{10pt} 69 \begin{tabular}{@{}lll@{}} 70 \multicolumn{1}{c}{\textbf{different implementations}} & \multicolumn{1}{c}{\textbf{same implementation}} \\ 71 \begin{cfa} 72 void foo( int ); 73 void foo( int, int ); 74 \end{cfa} 75 & 76 \begin{cfa} 77 void foo( int, int = 5 ); // default value 78 79 \end{cfa} 80 \end{tabular} 81 \end{cquote} 82 However, this distinguishing characteristic is vague. 83 For example, should the operation @abs@ be overloaded or polymorphic or both? 84 \begin{cquote} 85 \setlength{\tabcolsep}{10pt} 86 \begin{tabular}{@{}lll@{}} 87 \multicolumn{1}{c}{\textbf{overloading}} & \multicolumn{1}{c}{\textbf{polymorphic}} \\ 88 \begin{cfa} 89 int abs( int ); 90 double abs( double ); 91 \end{cfa} 92 & 93 \begin{cfa} 94 forall( T | { void ?{}( T &, zero_t ); int ?<?( T, T ); T -?( T ); } ) 95 T abs( T ); 96 \end{cfa} 97 \end{tabular} 98 \end{cquote} 99 Here, there are performance advantages for having specializations and code-reuse advantages for the generalization. 100 101 The Strachey definitions raise several questions. 102 \begin{enumerate}[leftmargin=*] 103 \item 104 Is overloading polymorphism? 105 106 \noindent 107 In type theory, polymorphism allows an overloaded type name to represent multiple different types. 108 For example, generic types overload the type name for a container type. 109 \begin{cfa} 110 @List@<int> li; @List@<double> ld; @List@<struct S> ls; 111 \end{cfa} 112 For subtyping, a derived type masquerades as a base type, where the base and derived names cannot be overloaded. 113 Instead, the mechanism relies on structural typing among the types. 114 In both cases, the polymorphic mechanisms apply in the type domain and the names are in the type namespace. 115 In the following C example: 116 \begin{cfa} 117 struct S {}; 118 struct S S; 119 enum E { E }; 120 \end{cfa} 121 the names @S@ and @E@ exist is the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names. 122 123 On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type. 124 \begin{cfa} 125 @double@ foo( @int@ ); @int@ foo( @void@ ); @int@ foo( @double, double@ ); 126 @double@ foo; @char@ foo; @int@ foo; 127 \end{cfa} 128 Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous. 129 Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types. 130 131 \item 132 Does ad-hoc polymorphism have a single systematic way of determining the type of the result from the type of the arguments? 133 134 \noindent 135 For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing. 136 This matching is just as robust as other polymorphic analysis. 137 The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches. 138 Note, conversion issues apply to non-overloaded and overloaded functions. 139 Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy). 140 The difference in opinion results when the language conversion rules differ from a programmer's expectations. 141 However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types. 142 \CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects. 143 That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns. 144 Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right. 145 146 \item 147 Why are there two forms of \emph{overloading} (regular and type class) in different programming languages? 148 149 \noindent 150 \newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading). 151 If a return type is specified, the compiler does not have to inference the routine body. 152 For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators. 153 In this context, there is a fixed set of overloads for a given name that are completely specified. 154 Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context. 155 If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions). 156 As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments. 157 158 \newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing. 159 \begin{lstlisting}[language=Haskell] 160 f( int, int ); f( int, float ); -- return types to be determined 161 g( int, int ); g( float, int ); 162 let x = curry f( 3, _ ); -- which f 163 let y = curry g( _ , 3 ); -- which g 164 \end{lstlisting} 165 For the currying to succeed, there cannot be overloaded function names resulting in ambiguities. 166 To allow currying to succeed requires an implicit disambiguating mechanism, \ie a kind of transfer function. 167 A type class~\cite{typeclass} is a mechanism to convert overloading into parametric polymorphism. 168 Parametric polymorphism has enough information to disambiguate the overloaded names because it removes the type inferencing. 169 \begin{cfa} 170 forall( T | T has + $and$ - ) T f$\(_1\)$( T ); 171 forall( T | T has * $and$ - ) T f$\(_2\)$( T ); 172 x = f$\(_1\)$( x ); // if x has + and - but not * 173 y = f$\(_2\)$( y ); // if y has * and - but not + 174 \end{cfa} 175 Here, the types of @x@ and @y@ are combined in the type-class contraints to provide secondary infomration for disambiguation. 176 This approach handles many overloading cases because the contraints overlap completely or are disjoint 177 178 A type class (trait) generically abstracts the set of the operations used in a function's implementation. 179 A type-class instance binds a specific type to the generic operations to form concrete instances, giving a name type-class. 180 Then Qualified types concisely express the operations required to convert an overloaded 181 The name type-class is used as a transfer function to convert an overloaded routine into a polymorphic routine that is uniquely qualified with the name type-class. 182 \begin{cfa} 183 void foo_int_trait( special int trait for operations in this foo ); 184 void foo_int_int_trait( special (int, int) trait for operations in this foo ); 185 \end{cfa} 186 187 188 In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one. 189 Hence, the programmer does specify any additional information for the overloading to work. 190 Explicit overloading occurs when the compiler has to be told what operations are associated with a type programmer separately defines the associate type and subsequently associates the type with overloaded name. 191 \end{enumerate} 192 193 \begin{comment} 194 Date: Mon, 24 Feb 2025 11:26:12 -0500 195 Subject: Re: overloading 196 To: "Peter A. Buhr" <pabuhr@uwaterloo.ca> 197 CC: <f37yu@uwaterloo.ca>, <ajbeach@uwaterloo.ca>, <mlbrooks@uwaterloo.ca>, 198 <alvin.zhang@uwaterloo.ca>, <lseo@plg.uwaterloo.ca>, 199 <j82liang@uwaterloo.ca> 200 From: Gregor Richards <gregor.richards@uwaterloo.ca> 201 202 Yes. 203 204 With valediction, 205 - Gregor Richards 206 207 On 2/24/25 11:22, Peter A. Buhr wrote: 208 > Gregor Richards <gregor.richards@uwaterloo.ca> writes: 209 > In Haskell, `+` works for both because of typeclasses (inclusion 210 > polymorphism), and so is also not an unresolved type. 211 > 212 > I'm making this up. The Haskell type-class is a trait, like an interface or 213 > abstract class, and its usage/declaration/binding creates a specific trait 214 > instance for bound types, which is a vtable filled with the typed routines 215 > instantiated/located for the trait. The vtables are present at runtime and 216 > passed implicitly to ad-hoc polymorphic routines allowing differentiate of 217 > overloaded functions based on the number of traits and their specialization. 218 > (Major geek talk, YA! 8-) 219 > 220 > On 2/21/25 23:04, Fangren Yu wrote: 221 > > In a statically typed language I would rather have definitions like 222 > > double x = x+x be ambiguous than "an unresolved type" as the latter 223 > > sounds like a weaker version of a generic type, and being able to make 224 > > something generic without explicitly saying so is probably not a good 225 > > idea. Giving the unspecified parameter type an arbitrary preference is 226 > > the second best option IMO (does ML give you a warning on such not 227 > > fully inferred types?) 228 > > ------------------------------------------------------------------------ 229 > > *From:* Gregor Richards <gregor.richards@uwaterloo.ca> 230 > > *Sent:* Wednesday, February 19, 2025 9:55:23 PM 231 > > *To:* Peter Buhr <pabuhr@uwaterloo.ca> 232 > > *Cc:* Andrew James Beach <ajbeach@uwaterloo.ca>; Michael Leslie Brooks 233 > > <mlbrooks@uwaterloo.ca>; Fangren Yu <f37yu@uwaterloo.ca>; 234 > > j82liang@uwaterloo.ca <j82liang@uwaterloo.ca>; Alvin Zhang 235 > > <alvin.zhang@uwaterloo.ca>; lseo@plg.uwaterloo.ca <lseo@plg.uwaterloo.ca> 236 > > *Subject:* Re: overloading 237 > > Jives with what I was saying, albeit not exactly the same; it's a result 238 > > of the same problem. 239 > > 240 > > 'doubles' refers to an unresolved 'double', and the latter can't be 241 > > resolved without the former, so you can't compile 'double' unless you 242 > > know what its arguments are. The solutions are: 243 > > 244 > > * Typeclasses make it possible by compiling with a handle. When you 245 > > call a function that takes a typeclass value as an argument, it takes an 246 > > extra, hidden argument internally which is the typeclass handle. That 247 > > handle tells the callee how to use the typeclass functions with this 248 > > particular value. And, of course, you hope that some aggressive inlining 249 > > gets rid of the dynamic dispatch :). But, no per se overloading 250 > > supported, only inclusion polymorphism. 251 > > 252 > > * If you do whole-world compilation, then you just compile what you 253 > > particularly need in context. If you call 'doubles' with a 254 > > float,int,int, then you compile that version. But, currying is unsolved. 255 > > 256 > > * If you do C++-style templating, this is a less severe problem, as 257 > > you compile it with the use of 'doubles', not with the definition. But, 258 > > either no currying, or you have to specify the extra types explicitly so 259 > > it knows what to curry, so no inference. 260 > > 261 > > * If you do Java-style generics, ... just kidding. 262 > > 263 > > In a language like Haskell or OCaml, if you want to compile this 264 > > modularly and have the code with the implementation, then naively it 265 > > would have to make eight implementations. But, even that is only true if 266 > > (x, y, z) is a single tuple argument. Currying is still the killer. If 267 > > you call `doubles 3`, the return is supposed to be some x -> y -> (int, x, 268 > > y), where 'x' and 'y' are "some type on which I can call a 'double' 269 > > function, but I don't know which double function yet because I don't 270 > > know what type". Even *writing* that type is hard enough, but having 271 > > values of that type float around at runtime? Yikes. 272 > > 273 > > To put it a different way: In C++ (and presumably CFA?), you can 274 > > overload all you want to, but you can't get a function pointer to an 275 > > unresolved overload. The function pointer is to a *particular* overload, 276 > > not the set of possible overloads. Well, in a functional language, 277 > > function pointers are the lifeblood of the language. 278 > > 279 > > With valediction, 280 > > - Gregor Richards 281 > > 282 > > On 2/19/25 21:25, Peter A. Buhr wrote: 283 > > > In the "Type Classes" chapter I sent out, the author says the 284 > > following. Does it 285 > > > jive with what you are saying about currying? BTW, I do not know who 286 > > wrote the 287 > > > book chapter. 288 > > > 289 > > > 290 > > ========================================================================== 291 > > > 292 > > > Suppose we have a language that overloads addition + and 293 > > multiplication *, 294 > > > providing versions that work over values of type Int and type Float. 295 > > Now, 296 > > > consider the double function, written in terms of the overloaded 297 > > addition 298 > > > operation: 299 > > > 300 > > > double x = x + x 301 > > > 302 > > > What does this definition mean? A naive interpretation would be to 303 > > say that 304 > > > double is also overloaded, defining one function of type Int -> Int 305 > > -> Int and a 306 > > > second of type Float -> Float -> Float. All seems fine, until we 307 > > consider the 308 > > > function 309 > > > 310 > > > doubles (x,y,z) = (double x, double y, double z) 311 > > > 312 > > > Under the proposed scheme, this definition would give rise to eight 313 > > different 314 > > > versions! This approach has not been widely used because of the 315 > > exponential 316 > > > growth in the number of versions. 317 > > > 318 > > > To avoid this blow-up, language designers have sometimes restricted the 319 > > > definition of overloaded functions. In this approach, which was 320 > > adopted in 321 > > > Standard ML, basic operations can be overloaded, but not functions 322 > > defined in 323 > > > terms of them. Instead, the language design specifies one of the 324 > > possible 325 > > > versions as the meaning of the function. For example, Standard ML give 326 > > > preference to the type int over real, so the type (and 327 > > implementation) of the 328 > > > function double would be int -> int. If the programmer wanted to 329 > > define a double 330 > > > function over floating point numbers, she would have to explicitly 331 > > write the 332 > > > type of the function in its definition and give the function a name 333 > > distinct 334 > > > from the double function on integers. This approach is not particularly 335 > > > satisfying, because it violates a general principle of language 336 > > design: giving 337 > > > the compiler the ability to define features that programmers cannot. 338 > 339 > [2:text/html Show Save:noname (10kB)] 340 \end{comment} 40 341 41 342 … … 58 359 Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type. 59 360 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process. 60 Without implicit conversions, programmers must write an exponential number of functions covering all possible exact-match cases among all possible types.61 361 This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values. 62 362 Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast. … … 188 488 % https://dl.acm.org/doi/10.1145/75277.75283 189 489 190 \begin{minipage}{0.75\textwidth} 191 \begin{tabular}{@{}r|ccccccc@{}} 192 Feat.\,{\textbackslash}\,Lang. & Ada & \CC & \CFA & Java & Scala & Swift & Haskell \\ 490 \begin{minipage}{\linewidth} 491 \setlength{\tabcolsep}{5pt} 492 \begin{tabular}{@{}r|cccccccc@{}} 493 Feature\,{\textbackslash}\,Language & Ada & \CC & \CFA & Java & Scala & Swift & Rust & Haskell \\ 193 494 \hline 194 Oper./Func./Meth. name & O\footnote{except assignment}/F & O/F/M & O/F & M & O/M & O/F/M \\ 195 parameter number & yes & yes & yes & yes & yes & yes \\ 196 parameter types & yes & yes & yes & yes & yes & yes \\ 197 parameter name & no & no & no & no & yes & yes \\ 198 return type & yes & no & yes & no & no & yes \\ 199 Safe/Unsafe arg. conv. & none & yes\footnote{no conversions allowed during template parameter deduction} & S/U & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable} & S & no\footnote{literals only, Int -> Double (Safe)} \\ 200 generic & no & yes\footnote{compile-time only, using template expansion} & yes & yes & yes & yes 495 Operator/Function/Method name & O\footnote{except assignment}/F & O/F/M & O/F & M & O/M & O/F/M & X & X \\ 496 generic name & no & yes\footnote{compile-time only, using template expansion} & yes & yes & yes & yes & X & X \\ 497 parameter number & yes & yes & yes & yes & yes & yes & X & X \\ 498 parameter types & yes & yes & yes & yes & yes & yes & X & X \\ 499 parameter name & no & no & no & no & yes & yes & X & X \\ 500 return type & yes & no & yes & no & no & yes & X & X \\ 501 Safe/Unsafe argument conversion & none & yes\footnote{no conversions allowed during template parameter deduction} & S/U 502 & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable} & S 503 & no\footnote{literals only, Int -> Double (Safe)} & X & X 201 504 \end{tabular} 202 505 \end{minipage} -
TabularUnified doc/theses/fangren_yu_MMath/test.adb ¶
r62b5940 rf5bf3c2 16 16 Function Func( V1 : Integer; V2 : Float ) return Float is begin return Float(V1) + V2; end; 17 17 18 Function "-"( L : Integer; R : Integer ) return Integer is begin return 3; end;18 Function "-"( L, R : Integer ) return Integer is begin return L + (-R); end; -- prevent infinite recusion 19 19 20 20 i : Integer; … … 26 26 Re, Im : Float; 27 27 end record; 28 c : Complex := (Re => 1.0, Im => 1.0); 28 29 Procedure Grind (X : Complex) is begin Put_Line( "Grind1" ); end; 29 30 Procedure Grind (X : Unbounded_String) is begin Put_Line( "Grind2" ); end; 30 c : Complex;31 31 32 32 generic … … 39 39 Put_Line( "XXX" ); return X + X; -- The formal operator "*". 40 40 end twice; 41 42 generic43 type T is private;44 with function "+"( X, Y: T ) return T;45 function twice( X : T; Y : T ) return T;46 47 -- generic units cannot be overloaded48 function twice( X: T; Y : T ) return T is49 begin50 Put_Line( "XXX" ); return X + X; -- The formal operator "*".51 end twice;52 41 53 42 function Int_Twice is new Twice( Integer, "+" => "+" ); 54 43 function float_Twice is new Twice( float, "+" => "+" ); 44 45 -- generic units cannot be overloaded 46 -- generic 47 -- type T is private; 48 -- with function "+"( X, Y: T ) return T; 49 -- function twice( X : T; Y : T ) return T; 50 -- 51 -- function twice( X: T; Y : T ) return T is 52 -- begin 53 -- Put_Line( "XXX" ); return X + X; -- The formal operator "*". 54 -- end twice; 55 55 begin 56 56 i := Random;
Note: See TracChangeset
for help on using the changeset viewer.