Changeset 1e7e6f38 for doc/theses
- Timestamp:
- May 11, 2025, 10:19:45 PM (5 months ago)
- Branches:
- master
- Children:
- 98c77b2
- Parents:
- 9a1aec6
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/fangren_yu_MMath/intro.tex
r9a1aec6 r1e7e6f38 30 30 31 31 \section{Overloading} 32 32 \label{s:Overloading} 33 34 \vspace*{-5pt} 33 35 \begin{quote} 34 36 There are only two hard things in Computer Science: cache invalidation and \emph{naming things}. --- Phil Karlton 35 37 \end{quote} 38 \vspace*{-5pt} 36 39 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files. 37 40 Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguate the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions. … … 39 42 40 43 Disambiguating among overloads is implemented by examining each call site and selecting the best matching overloaded function based on criteria like the types and number of arguments and the return context. 41 Since the hardware does not support mixed-mode operands, @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type.44 Since the hardware does not support mixed-mode operands, such as @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type. 42 45 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process. 43 46 This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values. … … 49 52 As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification. 50 53 While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated. 51 Similarly, lexical nesting is another place where overloading occurs.54 Similarly, lexical nesting is another place where duplicate naming issues arise. 52 55 For example, in object-oriented programming, class member names \newterm{shadow} names within members. 53 Some programmers ,qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.54 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@ .55 Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing .56 Some programmers qualify all member names with @class::@ or @this->@ to make them unique from names defined in members. 57 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@, silently changing the meaning of @i@ at lower scope levels. 58 Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing problems. 56 59 Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast. 57 58 Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}: 60 For example, if variables can be overloaded, shadowed variables of different type can produce ambiguities, indicating potential problems in lower scopes. 61 62 Formally, overloading is defined by Strachey as one kind of \newterm{ad hoc polymorphism}: 63 \vspace*{-5pt} 59 64 \begin{quote} 60 65 In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments. … … 63 68 It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00} 64 69 \end{quote} 70 \vspace*{-5pt} 65 71 where a \newterm{transfer function} is an implicit conversion to help find a matching overload: 72 \vspace*{-5pt} 66 73 \begin{quote} 67 74 The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap. … … 69 76 The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00} 70 77 \end{quote} 78 \vspace*{-5pt} 71 79 The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument. 72 80 A similar differentiation is applicable for overloading and default parameters. … … 128 136 \end{cfa} 129 137 the overloaded names @S@ and @E@ are separated into the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names. 130 In general, types are not overloaded because inferencing them is difficult to imagine in a statically programming language.138 In general, types are not overloaded because inferencing them is difficult to imagine in a statically typed programming language. 131 139 \begin{cquote} 132 140 \setlength{\tabcolsep}{26pt} … … 181 189 \noindent 182 190 \newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading). 183 In functional programming-languages, there is always a return type (except for a monad).191 In functional programming-languages, there is always a return type. 184 192 If a return type is specified, the compiler does not have to inference the function body. 185 193 For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators. … … 214 222 Hence, parametric overloading requires additional information about the universal types to make them useful. 215 223 216 This additional information often comes as a set of operations a type must supply (@trait@/-@concept@) and these operations can then be used in the body of the function. 217 \begin{cfa} 218 forall( T | T ?@++@( T, T ) ) T inc( T t ) { return t@++@; } 224 This additional information often comes as a set of operations that must be supply for a type, \eg \CFA/Rust/Go have traits, \CC template has concepts, Haskell has type-classes. 225 These operations can then be used in the body of the function to manipulate the type's value. 226 Here, a type binding to @T@ must have available a @++@ operation with the specified signature. 227 \begin{cfa} 228 forall( T | @T ?++( T, T )@ ) // trait 229 T inc( T t ) { return t@++@; } // change type value 219 230 int i = 3 220 231 i = inc( i ) … … 222 233 \end{cfa} 223 234 Given a qualifying trait, are its elements inferred or declared? 224 In the above example, the type system infers @int@ for @T@, infers it needs a @++@ operator that takes an @int@ and returns an @int@, and finds this function in the enclosing environment (\eg standard prelude).235 In the example, the type system infers @int@ for @T@, infers it needs an appropriately typed @++@ operator, and finds it in the enclosing environment, possibly in the language's prelude defining basic types and their operations. 225 236 This implicit inferencing is expensive if matched with implicit conversions when there is no exact match. 226 237 Alternatively, types opt-in to traits via declarations. … … 420 431 \subsection{Operator Overloading} 421 432 422 Virtually all programming languages provide general overloading of the arithmetic operatorsacross the basic computational types using the number and type of parameters and returns.433 Many programming languages provide general overloading of the arithmetic operators~\cite{OperOverloading} across the basic computational types using the number and type of parameters and returns. 423 434 However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded. 424 435 Like \CC, \CFA allows general operator overloading for user-defined types. … … 438 449 \subsection{Function Overloading} 439 450 440 Both \CFA and \CC allow general overloading for functions, as long as their prototypes differ in the number and type of parameters and returns. 451 Many programming languages provide general overloading for functions~\cite{FuncOverloading}, as long as their prototypes differ in the number and type of parameters. 452 A few programming languages also use the return type for selecting overloaded functions \see{below}. 441 453 \begin{cfa} 442 454 void f( void ); $\C[2in]{// (1): no parameter}$ … … 478 490 \begin{cfa} 479 491 void foo( double d ); 480 int v; 492 int v; $\C[2in]{// (1)}$ 481 493 double v; $\C{// (2) variable overloading}$ 482 494 foo( v ); $\C{// select (2)}$ … … 487 499 } 488 500 \end{cfa} 489 It is interesting that shadow overloadingis considered a normal programming-language feature with only slight software-engineering problems.501 It is interesting that shadowing \see{namespace pollution in \VRef{s:Overloading}} is considered a normal programming-language feature with only slight software-engineering problems. 490 502 However, variable overloading within a scope is often considered dangerous, without any evidence to corroborate this claim. 491 503 In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems. … … 554 566 The following covers these issues, and why this scheme is not amenable with the \CFA type system. 555 567 556 One of the first and powerful type-inferencing systemis Hindley--Milner~\cite{Damas82}.568 One of the first and most powerful type-inferencing systems is Hindley--Milner~\cite{Damas82}. 557 569 Here, the type resolver starts with the types of the program constants used for initialization and these constant types flow throughout the program, setting all variable and expression types. 558 570 \begin{cfa} … … 579 591 Note, return-type inferencing goes in the opposite direction to Hindley--Milner: knowing the type of the result and flowing back through an expression to help select the best possible overloads, and possibly converting the constants for a best match. 580 592 581 In simpler type-inferencing systems, such as C/\CC/\CFA, there are more specific usages.593 There are multiple ways to indirectly specify a variable's type, \eg from a prior variable or expression. 582 594 \begin{cquote} 583 595 \setlength{\tabcolsep}{10pt} … … 606 618 \end{tabular} 607 619 \end{cquote} 608 The two important capabilities are: 620 Here, @type(expr)@ computes the same type as @auto@ righ-hand expression. 621 The advantages are: 609 622 \begin{itemize}[topsep=0pt] 610 623 \item … … 616 629 This issue is exaggerated with \CC templates, where type names are 100s of characters long, resulting in unreadable error messages. 617 630 \item 618 Ensuring the type of secondary variables ,match a primary variable.631 Ensuring the type of secondary variables match a primary variable. 619 632 \begin{cfa} 620 633 int x; $\C{// primary variable}$ … … 625 638 \end{itemize} 626 639 Note, the use of @typeof@ is more restrictive, and possibly safer, than general type-inferencing. 640 \begin{cquote} 641 \setlength{\tabcolsep}{20pt} 642 \begin{tabular}{@{}ll@{}} 627 643 \begin{cfa} 628 644 int x; … … 630 646 type(x) z = ... // complex expression 631 647 \end{cfa} 632 Here, the types of @y@ and @z@ are fixed (branded), whereas with type inferencing, the types of @y@ and @z@ are potentially unknown. 648 & 649 \begin{cfa} 650 int x; 651 auto y = ... // complex expression 652 auto z = ... // complex expression 653 \end{cfa} 654 \end{tabular} 655 \end{cquote} 656 On the left, the types of @y@ and @z@ are fixed (branded), whereas on the right, the types of @y@ and @z@ can fluctuate. 633 657 634 658 635 659 \subsection{Type-Inferencing Issues} 636 660 637 Each kind of type-inferencing system has its own set of issues that flow ontothe programmer in the form of convenience, restrictions, or confusions.661 Each kind of type-inferencing system has its own set of issues that affect the programmer in the form of convenience, restrictions, or confusions. 638 662 639 663 A convenience is having the compiler use its overarching program knowledge to select the best type for each variable based on some notion of \emph{best}, which simplifies the programming experience. … … 657 681 As a result, understanding and changing the code becomes almost impossible. 658 682 Types provide important clues as to the behaviour of the code, and correspondingly to correctly change or add new code. 659 In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on a fancyIDE that can re-engineer types for them.683 In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on an IDE that can re-engineer types for them. 660 684 For example, given: 661 685 \begin{cfa} … … 670 694 In this situation, having the type name or its short alias is essential. 671 695 672 \CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint the right types within an expression.696 \CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint correct types within an expression. 673 697 Type inferencing defeats this goal because there is no left-hand type. 674 Fundamentally, type inferencing tries to magic away variable types from the programmer. 675 However, this results in lazy programming with the potential for poor performance and safety concerns. 676 Types are as important as control-flow in writing a good program, and should not be masked, even if it requires the programmer to think! 677 A similar issue is garbage collection, where storage management is magicked away, often resulting in poor program design and performance.\footnote{ 678 There are full-time Java consultants, who are hired to find memory-management problems in large Java programs.} 679 The entire area of Computer-Science data-structures is obsessed with time and space, and that obsession should continue into regular programming. 680 Understanding space and time issues is an essential part of the programming craft. 681 Given @typedef@ and @typeof@ in \CFA, and the strong desire to use the left-hand type in resolution, the decision was made not to support implicit type-inferencing in the type system. 698 Fundamentally, type inferencing tries to remove explicit typing from programming. 699 However, writing down types is an important aspect of good programming, as it provides a check of the programmer's expected type and the actual type. 700 Thinking carefully about types is similar to thinking carefully about date structures, often resulting in better performance and safety. 701 Similarly, thinking carefully about storage management in unmanaged languages is an important aspect of good programming, versus implicit storage management (garbage collection) in managed language.\footnote{ 702 There are full-time Java consultants, who are hired to find memory-management problems in large Java programs, \eg Monika Beckworth.} 703 Given @typedef@ and @typeof@, and the strong desire to use the left-hand type in resolution, no attempt has been made in \CFA to support implicit type-inferencing. 682 704 Should a significant need arise, this decision can be revisited. 683 705 … … 702 724 int i, * ip = identity( &i ); 703 725 \end{cfa} 704 Unlike \CC template functions, \CFA polymorphic functions are compatible with C\emph{separate compilation}, preventing compilation and code bloat.726 Unlike \CC template functions, \CFA polymorphic functions are compatible with \emph{separate compilation}, preventing compilation and code bloat. 705 727 706 728 To constrain polymorphic types, \CFA uses \newterm{type assertions}~\cite[pp.~37-44]{Alphard} to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type variable. … … 710 732 int val = twice( twice( 3 ) ); $\C{// val == 12}$ 711 733 \end{cfa} 712 Parametric polymorphism and assertions occur in existing type-unsafe (@void *@) Cfunctions, like @qsort@ for sorting an array of unknown values.734 The closest approximation to parametric polymorphism and assertions in C is type-unsafe (@void *@) functions, like @qsort@ for sorting an array of unknown values. 713 735 \begin{cfa} 714 736 void qsort( void * base, size_t nmemb, size_t size, int (*cmp)( const void *, const void * ) ); … … 761 783 The @sized@ assertion passes size and alignment as a data object has no implicit assertions. 762 784 Both assertions are used in @malloc@ via @sizeof@ and @_Alignof@. 763 In practi se, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided producing a type-safe call to @malloc@ or @memalign@.785 In practice, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided producing a type-safe call to @malloc@ or @memalign@. 764 786 765 787 This mechanism is used to construct type-safe wrapper-libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions. … … 787 809 forall( T @| sumable( T )@ ) // use trait 788 810 T sum( T a[$\,$], size_t size ) { 789 @T@ total = 0; 811 @T@ total = 0; // initialize by 0 constructor 790 812 for ( i; size ) 791 total @+=@ a[i]; 813 total @+=@ a[i]; // select appropriate + 792 814 return total; 793 815 } … … 821 843 Write bespoke data structures for each context. 822 844 While this approach is flexible and supports integration with the C type checker and tooling, it is tedious and error prone, especially for more complex data structures. 845 823 846 \item 824 847 Use @void *@-based polymorphism, \eg the C standard library functions @bsearch@ and @qsort@, which allow for the reuse of code with common functionality. 825 848 However, this approach eliminates the type checker's ability to ensure argument types are properly matched, often requiring a number of extra function parameters, pointer indirection, and dynamic allocation that is otherwise unnecessary. 849 826 850 \item 827 Use preprocessor macros, similar to \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret. 828 Furthermore, writing and using complex preprocessor macros is difficult and inflexible. 851 Use an internal macro capability, like \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret. 852 Furthermore, writing complex template macros is difficult and complex. 853 854 \item 855 Use an external macro capability, like M4~\cite{M4}, to generate code that is generic code, but errors may be difficult to interpret. 856 Like internal macros, writing and using external macros is equally difficult and complex. 829 857 \end{enumerate} 830 858 … … 903 931 general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member} 904 932 & O\footnote{except assignment}/F & O/F/M & V/O/F & M\footnote{not universal} & O/M & O/F/M & no & no \\ 905 general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter number, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type}933 general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter count, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type} 906 934 & T/\#//R\footnote{parameter names can be used to disambiguate among overloads but not create overloads} 907 935 & T/\# & T/\#/R & T/\# & T/\#/N/R & T/\#/N/R & T/\#/N & T/R \\ … … 1000 1028 1001 1029 However, the parameter operations are severely restricted because universal types have few operations. 1002 For example, swift provides a @print@ operation for its universal type, and the java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc.1030 For example, Swift provides a @print@ operation for its universal type, and the Java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc. 1003 1031 This restricted mechanism still supports a few useful functions, where the parameters are abstract entities, \eg: 1004 1032 \begin{swift} … … 1031 1059 1032 1060 \begin{figure} 1033 \setlength{\tabcolsep}{1 5pt}1061 \setlength{\tabcolsep}{12pt} 1034 1062 \begin{tabular}{@{}ll@{}} 1035 1063 \multicolumn{1}{c}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{Haskell}} \\ … … 1037 1065 forall( T ) trait sumable { 1038 1066 void ?{}( T &, zero_t ); 1039 T ?+=?( T &, T ); 1040 }; 1067 T ?+=?( T &, T ); }; 1041 1068 forall( T | sumable( T ) ) 1042 1069 T sum( T a[], size_t size ) { 1043 1070 T total = 0; 1044 1071 for ( i; size ) total += a[i]; 1045 return total; 1046 } 1072 return total; } 1047 1073 struct S { int i, j; }; 1048 1074 void ?{}( S & s, zero_t ) { s.[i, j] = 0; } … … 1050 1076 void ?{}( S & s, int i, int j ) { s.[i, j] = [i, j]; } 1051 1077 S ?+=?( S & l, S r ) { l.[i, j] += r.[i, j]; } 1052 1053 int main() { 1054 int ia[] = { 1, 2, 3 }; 1055 sout | sum( ia, 3 ); // trait inference 1056 double da[] = { 1.5, 2.5, 3.5 }; 1057 sout | sum( da, 3 ); // trait inference 1058 S sa[] = { {1, 1}, {2, 2}, {3, 3 } }; 1059 sout | sum( sa, 3 ).[i, j]; // trait inference 1060 } 1078 int main() { // trait inferencing 1079 sout | sum( (int []){ 1, 2, 3 }, 3 ); 1080 sout | sum( (double []){ 1.5, 2.5, 3.5 }, 3 ); 1081 sout | sum( (S []){ {1,1}, {2,2}, {3,3} }, 3 ).[i, j]; } 1082 1083 1084 1061 1085 \end{cfa} 1062 1086 & … … 1065 1089 szero :: a 1066 1090 sadd :: a -> a -> a 1067 1068 1091 ssum :: Sumable a $=>$ [a] -> a 1069 1092 ssum (x:xs) = sadd x (ssum xs) 1070 1093 ssum [] = szero 1071 1072 1073 1074 1094 data S = S Int Int deriving Show 1075 1095 @instance Sumable Int@ where … … 1078 1098 @instance Sumable Float@ where 1079 1099 szero = 0.0 1080 1100 sadd = (+) 1081 1101 @instance Sumable S@ where 1082 1102 szero = S 0 0 … … 1088 1108 \end{haskell} 1089 1109 \end{tabular} 1090 \caption{Implicit ly/ExplicitlyTrait Inferencing}1091 \label{f:Implicit lyExplicitlyTraitInferencing}1110 \caption{Implicit/Explicit Trait Inferencing} 1111 \label{f:ImplicitExplicitTraitInferencing} 1092 1112 \end{figure} 1093 1113 1094 1114 One differentiating feature among these specialization techniques is the ability to implicitly or explicitly infer the trait information at a class site. 1095 \VRef[Figure]{f:Implicit lyExplicitlyTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell.1115 \VRef[Figure]{f:ImplicitExplicitTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell. 1096 1116 Here, the \CFA type system inferences the trait functions at each call site, so no additional specification is necessary by the programmer. 1097 1117 The Haskell program requires the programmer to explicitly bind the trait and to each type that can be summed. … … 1103 1123 \end{ada} 1104 1124 Finally, there is a belief that certain type systems cannot support general overloading, \eg Haskell. 1105 As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on the opinion of the language designers and the type system they choose,not any reason in type theory.1125 As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on design choices made by the language designers not any reason in type theory. 1106 1126 1107 1127 The fourth row classifies if conversions are attempted beyond exact match. … … 1122 1142 The details of compiler optimization work are covered in a previous technical report~\cite{Yu20}, which essentially forms part of this thesis. 1123 1143 \item 1124 Th ethesis presents a systematic review of the new features added to the \CFA language and its type system.1144 This thesis presents a systematic review of the new features added to the \CFA language and its type system. 1125 1145 Some of the more recent inclusions to \CFA, such as tuples and generic structure types, were not well tested during development due to the limitation of compiler performance. 1126 1146 Several issues coming from the interactions of various language features are identified and discussed in this thesis;
Note:
See TracChangeset
for help on using the changeset viewer.