Changeset cdbb909
- Timestamp:
- Sep 2, 2024, 2:54:53 PM (3 months ago)
- Branches:
- master
- Children:
- db19e1d
- Parents:
- d6b7d1d
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/fangren_yu_MMath/intro.tex
rd6b7d1d rcdbb909 6 6 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files. 7 7 \begin{quote} 8 There are only two hard things in Computer Science: cache invalidation and naming things. --- Phil Karlton8 There are only two hard things in Computer Science: cache invalidation and \emph{naming things}. --- Phil Karlton 9 9 \end{quote} 10 10 Experience from \CC and \CFA developers is that the type system implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions. … … 15 15 16 16 17 \subsection{Operator Overloading} 18 19 Virtually all programming languages overload the arithmetic operators across the basic types using the number and type of parameters and returns. 17 \section{Types} 18 19 \begin{quote} 20 Some are born great, some achieve greatness, and some have greatness thrust upon them. Twelfth Night, Act II Scene 5, William Shakespeare 21 \end{quote} 22 23 All computers have multiple types because computer architects optimize the hardware around a few basic types with well defined (mathematical) operations: boolean, integral, floating-point, and occasionally strings. 24 A programming language and its compiler present ways to declare types that ultimately map into those provided by the underlying hardware. 25 These language types are thrust upon programmers, with their syntactic and semantic rules, and resulting restrictions. 26 A language type-system defines these rules and uses them to understand how an expression is to be evaluated by the hardware. 27 Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism. 28 Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined once at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation. 29 Expressibility, generalization, and safety are all bound up in a language's type system, and hence, directly affect the capability, build time, and correctness of program development. 30 31 32 \section{Operator Overloading} 33 34 Virtually all programming languages overload the arithmetic operators across the basic computational types using the number and type of parameters and returns. 20 35 Like \CC, \CFA also allows these operators to be overloaded with user-defined types. 21 The syntax for operator names uses the @'?'@ character to denote a parameter, \eg prefix and infix increment operators: @?++@, @++?@, and @?+?@. 36 The syntax for operator names uses the @'?'@ character to denote a function parameter, \eg prefix, postfix, and infix increment operators: @++?@, @?++@, and @?+?@. 37 Here, a user-defined type is extended with an addition operation with the same syntax as builtin types. 22 38 \begin{cfa} 23 39 struct S { int i, j }; … … 25 41 S s1, s2; 26 42 s1 = s1 @+@ s2; $\C[1.75in]{// infix call}$ 27 s1 = @?+?@( s1, s2 ); $\C{// direct call }\CRT$43 s1 = @?+?@( s1, s2 ); $\C{// direct call using operator name}\CRT$ 28 44 \end{cfa} 29 45 The type system examines each call size and selects the best matching overloaded function based on the number and types of the arguments. 30 If there are mixed-mode operands, @2 + 3.5@, the type system, like in C/\CC, attempts (safe) conversions, converting the argument type(s) to the parameter type(s). 31 32 33 \subsection{Function Overloading} 46 If there are mixed-mode operands, @2 + 3.5@, the \CFA type system, like C/\CC, attempts (safe) conversions, converting one or more of the argument type(s) to the parameter type(s). 47 Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be the same type. 48 Note, without implicit conversions, programmers must write an exponential number of functions covering all possible exact-match cases among all possible types. 49 This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values. 50 51 52 \section{Function Overloading} 34 53 35 54 Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns. 36 55 \begin{cfa} 37 void f( void ); $\C[ 1.75in]{// (1): no parameter}$56 void f( void ); $\C[2in]{// (1): no parameter}$ 38 57 void f( char ); $\C{// (2): overloaded on the number and parameter type}$ 39 58 void f( int, int ); $\C{// (3): overloaded on the number and parameter type}$ … … 45 64 46 65 Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name. 47 \begin{cfa} 48 int f( void ); $\C[1.75in]{// (4); overloaded on return type}$ 49 double f( void ); $\C{// (5); overloaded on return type}$ 50 int i = f(); $\C{// select (4)}$ 51 double d = f(); $\C{// select (5)}\CRT$ 52 \end{cfa} 53 54 55 \subsection{Variable Overloading} 56 Unlike almost all programming languages, \CFA has variable overloading within a scope, along with shadow overloading in nested scopes. 66 For example, in many programming languages with overloading, the following functions are ambiguous without using the return type. 67 \begin{cfa} 68 int f( int ); $\C[2in]{// (1); overloaded on return type and parameter}$ 69 double f( int ); $\C{// (2); overloaded on return type and parameter}$ 70 int i = f( 3 ); $\C{// select (1)}$ 71 double d = f( 3 ); $\C{// select (2)}\CRT$ 72 \end{cfa} 73 However, if the type system looks at the return type, there is an exact match for each call, which matches with programmer intuition and expectation. 74 This capability can be taken to the extreme, where there are no function parameters. 75 \begin{cfa} 76 int random( void ); $\C[2in]{// (1); overloaded on return type}$ 77 double random( void ); $\C{// (2); overloaded on return type}$ 78 int i = random(); $\C{// select (1)}$ 79 double d = random(); $\C{// select (2)}\CRT$ 80 \end{cfa} 81 Again, there is an exact match for each call. 82 If there is no exact match, a set of minimal conversions can be added to find a best match, as for operator overloading. 83 84 85 \section{Variable Overloading} 86 87 Unlike most programming languages, \CFA has variable overloading within a scope, along with shadow overloading in nested scopes. 88 (Shadow overloading is also possible for functions, if a language supports nested function declarations, \eg \CC named, nested, lambda functions.) 57 89 \begin{cfa} 58 90 void foo( double d ); 59 int v; $\C[ 1.75in]{// (1)}$91 int v; $\C[2in]{// (1)}$ 60 92 double v; $\C{// (2) variable overloading}$ 61 93 foo( v ); $\C{// select (2)}$ … … 66 98 } 67 99 \end{cfa} 68 The \CFA type system simply treats overloaded variables as an overloaded function returning a value with no parameters. 100 It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems, but variable overloading within a scope is often considered extremely dangerous. 101 102 In \CFA, the type system simply treats overloaded variables as an overloaded function returning a value with no parameters. 69 103 Hence, no significant effort is required to support this feature. 104 Leveraging the return type to disambiguate is essential because variables have no parameters. 105 \begin{cfa} 106 int MAX = 2147483647; $\C[2in]{// (1); overloaded on return type}$ 107 double MAX = ...; $\C{// (2); overloaded on return type}$ 108 int i = MAX; $\C{// select (1)}$ 109 double d = MAX; $\C{// select (2)}\CRT$ 110 \end{cfa} 111 112 113 \section{Type Inferencing} 114 115 One of the first and powerful type-inferencing system is Hindley--Milner~\cite{Damas82}. 116 Here, the type resolver starts with the types of the program constants used for initialization and these constant types flow throughout the program, setting all variable and expression types. 117 \begin{cfa} 118 auto f() { 119 x = 1; y = 3.5; $\C{// set types from constants}$ 120 x = // expression involving x, y and other local initialized variables 121 y = // expression involving x, y and other local initialized variables 122 return x, y; 123 } 124 auto w = f(); $\C{// typing flows outwards}$ 125 126 void f( auto x, auto y ) { 127 x = // expression involving x, y and other local initialized variables 128 y = // expression involving x, y and other local initialized variables 129 } 130 s = 1; t = 3.5; $\C{// set types from constants}$ 131 f( s, t ); $\C{// typing flows inwards}$ 132 \end{cfa} 133 In both overloads of @f@, the type system works from the constant initializations inwards and/or outwards to determine the types of all variables and functions. 134 Note, like template meta-programming, there can be a new function generated for the second @f@ depending on the types of the arguments, assuming these types are meaningful in the body of the @f@. 135 Inferring type constraints, by analysing the body of @f@ is possible, and these constraints must be satisfied at each call site by the argument types; 136 in this case, parametric polymorphism can allow separate compilation. 137 In languages with type inferencing, there is often limited overloading to reduce the search space, which introduces the naming problem. 138 Return-type inferencing goes in the opposite direction to Hindley--Milner: knowing the type of the result and flowing back through an expression to help select the best possible overloads, and possibly converting the constants for a best match. 139 140 In simpler type inferencing systems, such as C/\CC/\CFA, there are more specific usages. 141 \begin{cquote} 142 \setlength{\tabcolsep}{10pt} 143 \begin{tabular}{@{}lll@{}} 144 \multicolumn{1}{c}{\textbf{gcc / \CFA}} & \multicolumn{1}{c}{\textbf{\CC}} \\ 145 \begin{cfa} 146 #define expr 3.0 * i 147 typeof(expr) x = expr; 148 int y; 149 typeof(y) z = y; 150 \end{cfa} 151 & 152 \begin{cfa} 153 154 auto x = 3.0 * 4; 155 int y; 156 auto z = y; 157 \end{cfa} 158 & 159 \begin{cfa} 160 161 // use type of initialization expression 162 163 // use type of initialization expression 164 \end{cfa} 165 \end{tabular} 166 \end{cquote} 167 The two important capabilities are: 168 \begin{itemize}[topsep=0pt] 169 \item 170 Not determining or writing long generic types, \eg, given deeply nested generic types. 171 \begin{cfa} 172 typedef T1(int).T2(float).T3(char).T ST; $\C{// \CFA nested type declaration}$ 173 ST x, y, x; 174 \end{cfa} 175 This issue is exaggerated with \CC templates, where type names are 100s of characters long, resulting in unreadable error messages. 176 \item 177 Ensuring the type of secondary variables, always matches a primary variable. 178 \begin{cfa} 179 int x; $\C{// primary variable}$ 180 typeof(x) y, z, w; $\C{// secondary variables match x's type}$ 181 \end{cfa} 182 If the type of @x@ changes, the types of the secondary variables correspondingly update. 183 \end{itemize} 184 Note, the use of @typeof@ is more restrictive, and possibly safer, than general type-inferencing. 185 \begin{cfa} 186 int x; 187 type(x) y = ... // complex expression 188 type(x) z = ... // complex expression 189 \end{cfa} 190 Here, the types of @y@ and @z@ are fixed (branded), whereas with type inferencing, the types of @y@ and @z@ are potentially unknown. 191 192 193 \section{Type-Inferencing Issues} 194 195 Each kind of type-inferencing systems has its own set of issues that flow onto the programmer in the form of restrictions and/or confusions. 196 \begin{enumerate}[leftmargin=*] 197 \item 198 There can be large blocks of code where all declarations are @auto@. 199 As a result, understanding and changing the code becomes almost impossible. 200 Types provide important clues as to the behaviour of the code, and correspondingly to correctly change or add new code. 201 In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on a fancy IDE that can re-engineer types. 202 \item 203 The problem of unknown types becomes acute when the concrete type must be used, \eg, given: 204 \begin{cfa} 205 auto x = @...@ 206 \end{cfa} 207 and the need to write a routine to compute using @x@ 208 \begin{cfa} 209 void rtn( @...@ parm ); 210 rtn( x ); 211 \end{cfa} 212 A programmer must re-engineer the type of @x@'s initialization expression, reconstructing the possibly long generic type-name. 213 In this situation, having the type name or its short alias is essential. 214 \item 215 There is the conundrum in type inferencing of when to \emph{brand} a type. 216 That is, when is the type of the variable/function more important than the type of its initialization expression. 217 For example, if a change is made in an initialization expression, it can cause cascading type changes and/or errors. 218 At some point, a variable's type needs to remain constant and the expression needs to be modified or in error when it changes. 219 Often type-inferencing systems allow \newterm{branding} a variable or function type; 220 now the complier can report a mismatch on the constant. 221 \begin{cfa} 222 void f( @int@ x, @int@ y ) { // brand function prototype 223 x = // expression involving x, y and other local initialized variables 224 y = // expression involving x, y and other local initialized variables 225 } 226 s = 1; t = 3.5; 227 f( s, @t@ ); // type mismatch 228 \end{cfa} 229 In Haskell, it is common for programmers to brand (type) function parameters. 230 \end{enumerate} 231 232 \CFA's type system is trying to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint the right types for an expression computation. 233 Type inferencing defeats this goal because there is no left-hand type. 234 Fundamentally, type inferencing tries to magic away types from the programmer. 235 However, this results in lazy programming with the potential for poor performance and safety concerns. 236 Types are as important as control-flow, and should not be masked, even if it requires the programmer to think! 237 A similar example is garbage collection, where storage management is masked, resulting in poor program design and performance. 238 The entire area of Computer-Science data-structures is obsessed with time and space, and that obsession should continue into regular programming. 239 Understanding space and time issues are an essential part of the programming craft. 240 Given @typedef@ and @typeof@ in \CFA, and the strong need to use the left-hand type in resolution, implicit type-inferencing is unsupported. 241 Should a significant need arise, this feature can be revisited. 242 243 244 \section{Polymorphism} 245 246 247 248 \section{Contributions}
Note: See TracChangeset
for help on using the changeset viewer.