Changeset 7d405eb


Ignore:
Timestamp:
Apr 7, 2025, 9:22:23 PM (6 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
9fbc40e
Parents:
0393fda8
Message:

more proofreading, change file names

Location:
doc/theses/fangren_yu_MMath
Files:
5 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/fangren_yu_MMath/conclusion.tex

    r0393fda8 r7d405eb  
    33The goal of this thesis is to ...
    44
    5 \section{Future Work}
  • doc/theses/fangren_yu_MMath/future.tex

    r0393fda8 r7d405eb  
    55\section{Closed trait types}
    66
    7 \CFA as it currently is does not have any closed types, as new functions can be added at any time. It is also possible to locally declare a function,\footnote{Local functions are not a standard feature in C but supported by mainstream C compilers such as gcc, and allowed in \CFA too.} or a function pointer variable to make a type satisfy a certain trait temporarily and be used as such in this limited scope. However, the lack of closed types in such a "duck typing" scheme proposes two problems. For library implementors, it is common to not want the defined set of operations to be overwritten and cause the behavior of polymorphic invocations to change. For the compiler, it means caching and reusing the result of resolution is not reliable as newly introduced declarations can participate in assertion resolution, making a previously invalid expression valid, or the other way around by introducing ambiguity in assertions. Sometimes those interfaces are fairly complicated, for example the I/O library traits \textbf{istream} and \textbf{ostream} each has over 20 operations. Without the ability to store and reuse assertion resolution results, each time the compiler encounters an I/O operation in the source code (mainly the pipe operator \textbf{?|?} used to represent stream operations in \CFA) it has to resolve the same set of assertions again, causing a lot of repetitive work. Previous experiments have shown that the I/O assertions often account for over half of the number of assertions resolved in a \CFA translation unit. Introducing a way to eliminate the need of doing such repetitive assertion resolutions that are very unlikely to change by new overloads can therefore provide significant improvement to the performance of the compiler.
     7\CFA as it currently is does not have any closed types, as new functions can be added at any time. It is also possible to locally declare a function,\footnote{Local functions are not a standard feature in C but supported by mainstream C compilers such as gcc, and allowed in \CFA too.} or a function pointer variable to make a type satisfy a certain trait temporarily and be used as such in this limited scope. However, the lack of closed types in such a "duck typing" scheme proposes two problems. For library implementors, it is common to not want the defined set of operations to be overwritten and cause the behaviour of polymorphic invocations to change. For the compiler, it means caching and reusing the result of resolution is not reliable as newly introduced declarations can participate in assertion resolution, making a previously invalid expression valid, or the other way around by introducing ambiguity in assertions. Sometimes those interfaces are fairly complicated, for example the I/O library traits \textbf{istream} and \textbf{ostream} each has over 20 operations. Without the ability to store and reuse assertion resolution results, each time the compiler encounters an I/O operation in the source code (mainly the pipe operator \textbf{?|?} used to represent stream operations in \CFA) it has to resolve the same set of assertions again, causing a lot of repetitive work. Previous experiments have shown that the I/O assertions often account for over half of the number of assertions resolved in a \CFA translation unit. Introducing a way to eliminate the need of doing such repetitive assertion resolutions that are very unlikely to change by new overloads can therefore provide significant improvement to the performance of the compiler.
    88
    99The output stream trait in \CFA looks like this:
  • doc/theses/fangren_yu_MMath/intro.tex

    r0393fda8 r7d405eb  
    11\chapter{Introduction}
    22
    3 This thesis is exploratory work I did to understand, fix, and extend the \CFA type-system, specifically, the \newterm{type-resolver} used to satisfy call-site assertions among overloaded variable and function names to allow polymorphic routine calls.
     3This thesis is exploratory work I did to understand, fix, and extend the \CFA type-system, specifically, the \newterm{type-resolver} used to satisfy call-site assertions among overloaded variable and function names to allow polymorphic function calls.
    44Assertions are the operations a function uses within its body to perform its computation.
    55For example, a polymorphic function summing an array needs a size, zero, assignment, and plus for the array element-type, and a subscript operation for the array type.
     
    77T sum( T a[$\,$], size_t size ) {
    88        @T@ total = { @0@ };  $\C[1.75in]{// size, 0 for type T}$
    9         for ( size_t i = 0; i < size; i += 1 )
     9        for ( i; size )
    1010                total @+=@ a@[@i@]@; $\C{// + and subscript for T}\CRT$
    1111        return total;
     
    3737Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
    3838In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process.
     39
     40Disambiguating among overloads is implemented by examining each call site and selecting the best matching overloaded function based on criteria like the types and number of arguments and the return context.
     41Since the hardware does not support mixed-mode operands, @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type.
     42Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
     43This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
     44Depending on the language, mix-mode conversions can be explicitly controlled using a cast.
    3945
    4046\newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
     
    176182\newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading).
    177183In functional programming-languages, there is always a return type (except for a monad).
    178 If a return type is specified, the compiler does not have to inference the routine body.
     184If a return type is specified, the compiler does not have to inference the function body.
    179185For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
    180186In this context, there is a fixed set of overloads for a given name that are completely specified.
     
    229235
    230236\begin{comment}
    231 \begin{lstlisting}[language=Haskell]
     237\begin{haskell}
    232238f( int, int );  f( int, float ); -- return types to be determined
    233239g( int, int );  g( float, int );
    234240let x = curry f( 3, _ ); -- which f
    235241let y = curry g( _ , 3 ); -- which g
    236 \end{lstlisting}
     242\end{haskell}
    237243For the currying to succeed, there cannot be overloaded function names resulting in ambiguities.
    238244To allow currying to succeed requires an implicit disambiguating mechanism, \ie a kind of transfer function.
     
    251257A type-class instance binds a specific type to the generic operations to form concrete instances, giving a name type-class.
    252258Then Qualified types concisely express the operations required to convert an overloaded
    253 The name type-class is used as a transfer function to convert an overloaded routine into a polymorphic routine that is uniquely qualified with the  name type-class.
     259The name type-class is used as a transfer function to convert an overloaded function into a polymorphic function that is uniquely qualified with the  name type-class.
    254260\begin{cfa}
    255261void foo_int_trait( special int trait for operations in this foo );
     
    283289> I'm making this up. The Haskell type-class is a trait, like an interface or
    284290> abstract class, and its usage/declaration/binding creates a specific trait
    285 > instance for bound types, which is a vtable filled with the typed routines
     291> instance for bound types, which is a vtable filled with the typed functions
    286292> instantiated/located for the trait. The vtables are present at runtime and
    287 > passed implicitly to ad-hoc polymorphic routines allowing differentiate of
     293> passed implicitly to ad-hoc polymorphic functions allowing differentiate of
    288294> overloaded functions based on the number of traits and their specialization.
    289295> (Major geek talk, YA! 8-)
     
    416422Virtually all programming languages provide general overloading of the arithmetic operators across the basic computational types using the number and type of parameters and returns.
    417423However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded.
    418 Like \CC, \CFA does allow general operator overloading for user-defined types.
     424Like \CC, \CFA allows general operator overloading for user-defined types.
    419425The \CFA syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
    420426Here, a user-defined type is extended with an addition operation with the same syntax as a builtin type.
     
    426432s1 = @?+?@( s1, s2 );   $\C{// direct call}\CRT$
    427433\end{cfa}
    428 The type system examines each call site and selects the best matching overloaded function based on the number and types of arguments.
    429 If there are mixed-mode operands, @2 + 3.5@, the type system attempts (safe) conversions, like in C/\CC, converting the argument type(s) to the parameter type(s).
    430 Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type.
    431 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
    432 This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
    433 Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast.
     434\CFA excludes overloading the comma operator, short-circuit logical operators \lstinline{&&} and \lstinline{||}, and ternary conditional operator \lstinline{?} (\eg @max = x > y ? x : y@), all of which are short-hand control structures rather than operators.
     435\CC \emph{does} overload the comma and short-circuit logical operators, where overloading the comma operator is esoteric and rarely used, and the short-circuit operators do not exhibit short-circuit semantics, both of which are confusing and/or inconsistent.
    434436
    435437
     
    660662auto x = @...@
    661663\end{cfa}
    662 and the need to write a routine to compute using @x@
     664and the need to write a function to compute using @x@
    663665\begin{cfa}
    664666void rtn( @type of x@ parm );
     
    712714void qsort( void * base, size_t nmemb, size_t size, int (*cmp)( const void *, const void * ) );
    713715\end{cfa}
    714 Here, the polymorphism is type-erasure, and the parametric assertion is the comparison routine, which is explicitly passed.
     716Here, the polymorphism is type-erasure, and the parametric assertion is the comparison function, which is explicitly passed.
    715717\begin{cfa}
    716718enum { N = 5 };
     
    767769
    768770\subsection{Traits}
     771\label{s:Traits}
    769772
    770773\CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each function declaration.
     
    772775\begin{tabular}{@{}l|@{\hspace{10pt}}l@{}}
    773776\begin{cfa}
    774 trait @sumable@( T ) {
     777forall( T ) trait @sumable@ {
    775778        void @?{}@( T &, zero_t ); // 0 literal constructor
    776         T ?+?( T, T );           // assortment of additions
    777         T @?+=?@( T &, T );
     779        T @?+=?@( T &, T );              // assortment of additions
     780        T ?+?( T, T );
    778781        T ++?( T & );
    779782        T ?++( T & );
     
    782785&
    783786\begin{cfa}
    784 forall( T @| sumable( T )@ ) // use trait
     787forall( T @| sumable( T )@ )   // use trait
    785788T sum( T a[$\,$], size_t size ) {
    786         @T@ total = { @0@ };  // initialize by 0 constructor
    787         for ( size_t i = 0; i < size; i += 1 )
    788                 total @+=@ a[i]; // select appropriate +
     789        @T@ total = 0;          // initialize by 0 constructor
     790        for ( i; size )
     791                total @+=@ a[i];    // select appropriate +
    789792        return total;
    790793}
     
    804807};
    805808\end{cfa}
    806 These implicit routines are used by the @sumable@ operator @?+=?@ for the right side of @?+=?@ and return.
     809These implicit functions are used by the @sumable@ operator @?+=?@ for the right side of @?+=?@ and return.
    807810
    808811If the array type is not a builtin type, an extra type parameter and assertions are required, like subscripting.
     
    860863The difference between fixed and dynamic is the complexity and cost of field access.
    861864For fixed, field offsets are computed (known) at compile time and embedded as displacements in instructions.
    862 For dynamic, field offsets are compile-time computed at the call site, stored in an array of offset values, passed as a polymorphic parameter, and added to the structure address for each field dereference within a polymorphic routine.
     865For dynamic, field offsets are compile-time computed at the call site, stored in an array of offset values, passed as a polymorphic parameter, and added to the structure address for each field dereference within a polymorphic function.
    863866See~\cite[\S~3.2]{Moss19} for complete implementation details.
    864867
     
    899902general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member}
    900903                                                & O\footnote{except assignment}/F       & O/F/M & V/O/F & M\footnote{not universal}     & O/M   & O/F/M & no    & no    \\
    901 general constraints\footnote{parameter \# $\Rightarrow$ number, T $\Rightarrow$ type, N $\Rightarrow$ name; R $\Rightarrow$ return type}
    902                                                 & \#/T/R        & \#/T  & \#/T/R        & \#/T  & \#/T/N/R      & \#/T/N/R      & \#/T/N        & T/R \\
    903 type parameters                 & no            & yes   & yes           & yes   & yes           & yes           & yes   & yes \\
    904 type constraint\footnote{T $\Rightarrow$ trait/concept, B $\Rightarrow$ type bounds, TC $\Rightarrow$ type class}
    905                                                 & no            & T             & T                     & T             & B                     & T                     & T     & TC \\
     904general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter number, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type}
     905                                                & T/\#/N/R      & T/\#  & T/\#/R        & T/\#  & T/\#/N/R      & T/\#/N/R      & T/\#/N        & T/R \\
     906univ. type constraints\footnote{A $\Rightarrow$ concept, T $\Rightarrow$ interface/trait/type-class, B $\Rightarrow$ type bounds}
     907                                                & no\footnote{generic cannot be overloaded}             & C             & T                     & B             & B                     & T                     & T                     & T \\
    906908Safe/Unsafe arg. conv.  & no            & S/U\footnote{no conversions allowed during template parameter deduction}      & S/U
    907909                                                & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive (builtin) type}  & S
     
    948950\end{tabular}
    949951\end{cquote}
     952
    950953The second row classifies the specialization mechanisms used to distinguish among the general overload capabilities.
    951954\begin{cquote}
    952955\begin{tabular}{@{}llll@{}}
    953 \multicolumn{1}{c}{\textbf{number}} & \multicolumn{1}{c}{\textbf{type}} & \multicolumn{1}{c}{\textbf{name}} & \multicolumn{1}{c}{\textbf{return}} \\
    954 \begin{lstlisting}[language=swift]
     956\multicolumn{1}{c}{\textbf{type}} & \multicolumn{1}{c}{\textbf{number}} & \multicolumn{1}{c}{\textbf{name}} & \multicolumn{1}{c}{\textbf{return}} \\
     957\begin{swift}
     958func foo( _ x : Int )
     959func foo( _ x : Double )
     960foo( 3 )
     961foo( 3.5 )
     962\end{swift}
     963&
     964\begin{swift}
    955965func foo( _ x : Int )
    956966func foo( _ x : Int, _ y : Int )
    957967foo( 3 )
    958968foo( 3, 3 )
    959 \end{lstlisting}
    960 &
    961 \begin{lstlisting}[language=swift]
    962 func foo( _ x : Int )
    963 func foo( _ x : Double )
    964 foo( 3 )
    965 foo( 3.5 )
    966 \end{lstlisting}
    967 &
    968 \begin{lstlisting}[language=swift]
     969\end{swift}
     970&
     971\begin{swift}
    969972func foo( x : Int )
    970973func foo( y : Int )
    971974foo( x : 3 )
    972975foo( y : 3 );
    973 \end{lstlisting}
    974 &
    975 \begin{lstlisting}[language=swift]
     976\end{swift}
     977&
     978\begin{swift}
    976979func foo() -> Int
    977980func foo() -> String
    978981var i : Int = foo()
    979982var s : String = foo();
    980 \end{lstlisting}
     983\end{swift}
    981984\end{tabular}
    982985\end{cquote}
    983 The third row classifies if generic functions can be overloaded based on the number and differing type variables.
    984 \begin{cfa}
    985 forall( T ) T foo( T t );
    986 forall( T ) T foo( T t, T s );
    987 forall( T, U ) T foo( T t, U s );
    988 \end{cfa}
    989 The fourth row classifies the mechanism used to specialize the type variables to make them practical.
    990 \begin{cfa}
    991 forall( T | T ?+?( T, T ) ) T foo( T t );
    992 \end{cfa}
    993 The fifth row classifies if conversions are attempted beyond exact match.
    994 \begin{cfa}
    995 int foo( double ); // 1
    996 double foo( int ); // 2
    997 int i = foo( 3 ); // 1 : 3 => 3.0
    998 double d = foo( 3.5 ); // 1 : int result => double
     986
     987The third row classifies if generic functions can be overloaded based on differing type variables, number of type variables, and type-variable constraints.
     988The following examples illustrates the first two overloading cases.
     989\begin{swift}
     990func foo<T>( a : T );  $\C[2.25in]{// foo1}$
     991func foo<T>( a : T,  b : T );  $\C{// foo2}$
     992func foo<T,U>( a : T,  b :  U );  $\C{// foo3}$
     993foo( a : 3.5 );  $\C{// foo1}$
     994foo( a : 2, b : 2 );  $\C{// foo2}$
     995foo( a : 2, b : 2.5 );  $\C{// foo3}\CRT$
     996\end{swift}
     997Here, the overloaded calls are disambiguated using argument types and their number.
     998
     999However, the parameter operations are severely restricted because universal types have few operations.
     1000For example, swift provides a @print@ operation for its universal type, and the java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc.
     1001This restricted mechanism still supports a few useful functions, where the parameters are abstract entities, \eg:
     1002\begin{swift}
     1003func swap<T>( _ a : inout T, _ b : inout T ) { let temp : T = a;  a = b;  b = temp; }
     1004var ix = 4, iy = 3;
     1005swap( &ix, &iy );
     1006var fx = 4.5, fy = 3.5;
     1007swap( &fx, &fy );
     1008\end{swift}
     1009To make a universal function useable, an abstract description is needed for the operations used on the parameters within the function body.
     1010Type matching these operations can occur by discover using techniques like \CC template expansion, or explicit stating, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds.
     1011The mechanism chosen can affect separate compilation or require runtime type information (RTTI).
     1012\begin{description}
     1013\item[concept] ~
     1014\begin{c++}
     1015concept range = requires( T& t ) {
     1016        ranges::begin(t); // equality-preserving for forward iterators
     1017        ranges::end (t);
     1018};
     1019\end{c++}
     1020\item[traits/type-classes] ~
     1021\begin{cfa}
     1022forall( T | { T ?+?( T, T ) } )
     1023\end{cfa}
     1024\item[inheritance/type bounds] ~
     1025\begin{scala}
     1026class CarPark[Vehicle  >: bike  <: bus](val lot : Vehicle)
     1027\end{scala}
     1028\end{description}
     1029
     1030\begin{figure}
     1031\setlength{\tabcolsep}{15pt}
     1032\begin{tabular}{@{}ll@{}}
     1033\multicolumn{1}{c}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{Haskell}} \\
     1034\begin{cfa}
     1035forall( T ) trait sumable {
     1036        void ?{}( T &, zero_t );
     1037        T ?+=?( T &, T );
     1038};
     1039forall( T | sumable( T ) )
     1040T sum( T a[], size_t size ) {
     1041        T total = 0;
     1042        for ( i; size ) total += a[i];
     1043        return total;
     1044}
     1045struct S { int i, j; };
     1046void ?{}( S & s, zero_t ) { s.[i, j] = 0; }
     1047void ?{}( S & s ) { s{0}; }
     1048void ?{}( S & s, int i, int j ) { s.[i, j] = [i, j]; }
     1049S ?+=?( S & l, S r ) { l.[i, j] += r.[i, j]; }
     1050
     1051int main() {
     1052        int ia[] = { 1, 2, 3 };
     1053        sout | sum( ia, 3 );        // trait inference
     1054        double da[] = { 1.5, 2.5, 3.5 };
     1055        sout | sum( da, 3 );        // trait inference
     1056        S sa[] = { {1, 1}, {2, 2}, {3, 3 } };
     1057        sout | sum( sa, 3 ).[i, j]; // trait inference
     1058}
     1059\end{cfa}
     1060&
     1061\begin{haskell}
     1062class Sumable a where
     1063        szero :: a
     1064        sadd :: a -> a -> a
     1065
     1066ssum ::  Sumable a $=>$ [a] -> a
     1067ssum (x:xs) = sadd x (ssum xs)
     1068ssum [] = szero
     1069
     1070
     1071
     1072data S = S Int Int deriving Show
     1073@instance Sumable Int@ where
     1074        szero = 0
     1075        sadd = (+)
     1076@instance Sumable Float@ where
     1077        szero = 0.0
     1078   sadd = (+)
     1079@instance Sumable S@ where
     1080        szero = S 0 0
     1081        sadd (S l1 l2) (S r1 r2) = S (l1 + r1) (l2 + r2)
     1082main = do
     1083        print ( ssum [1::Int, 2, 3] )
     1084        print ( ssum [1.5::Float, 2.5, 3.5] )
     1085        print ( ssum [(S 1 1), (S 2 2), (S 3 3)] )
     1086\end{haskell}
     1087\end{tabular}
     1088\caption{Implicitly/Explicitly Trait Inferencing}
     1089\label{f:ImplicitlyExplicitlyTraitInferencing}
     1090\end{figure}
     1091
     1092One differentiating feature among these specialization techniques is the ability to implicitly or explicitly infer the trait information at a class site.
     1093\VRef[Figure]{f:ImplicitlyExplicitlyTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell.
     1094Here, the \CFA type system inferences the trait functions at each call site, so no additional specification is necessary by the programmer.
     1095The Haskell program requires the programmer to explicitly bind the trait and to each type that can be summed.
     1096(The extra casts in the Haskell program are necessary to differentiate between the two kinds of integers and floating-point numbers.)
     1097Ada also requires explicit binding, creating a new function name for each generic instantiation.
     1098\begin{ada}
     1099function int_Twice is new Twice( Integer, "+" => "+" );
     1100function float_Twice is new Twice( Float, "+" => "+" );
     1101\end{ada}
     1102Finally, there is a belief that certain type systems cannot support general overloading, \eg Haskell.
     1103As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on the opinion of the language designers and the type system they choose, not any reason in type theory.
     1104
     1105The fourth row classifies if conversions are attempted beyond exact match.
     1106\begin{cfa}
     1107int foo( double ); $\C[1.75in]{// 1}$
     1108double foo( int ); $\C{// 2}$
     1109int i = foo( 3 ); $\C{// 1 : 3 to 3.0 argument conversion => all conversions are safe}$
     1110double d = foo( 3.5 ); $\C{// 1 : int to double result conversion => all conversions are safe}\CRT$
    9991111\end{cfa}
    10001112
  • doc/theses/fangren_yu_MMath/resolution.tex

    r0393fda8 r7d405eb  
    22\label{c:content2}
    33
    4 The \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
     4Recapping, the \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
    55in addition, C's multiple implicit type-conversions must be respected.
    66This generality leads to internal complexity and correspondingly higher compilation cost directly related to type resolution.
    77The reason is that the type resolver must analyze \emph{each} component of an expression with many possible forms of overloading, type restrictions, and conversions.
    8 Designing a ruleset that behaves as expected, \ie matches C programmer expectations, and implementing an efficient algorithm is a very challenging task.
    9 
    10 My first work on the \CFA type-system was as a co-op student.
     8Designing a ruleset that is expressive, behaves as expected, \ie matches C programmer expectations, and can be efficiently implemented is a very challenging task.
     9
     10I first worked on the \CFA type-system as a co-op student.
    1111At that time, compiling a medium-sized \CFA program using advanced polymorphism took multiple minutes (5+ minutes)~\cite[\S~5]{Yu20}.
    1212I found a number of type-resolution problems, which resulted in the following changes to the type-resolution algorithm.
     
    2424\end{enumerate}
    2525\VRef[Table]{t:SelectedFileByCompilerBuild} shows improvements for selected tests with accumulated reductions in compile time across each of the 5 fixes.
    26 The large reduction in compilation time significantly improved the development of the \CFA's runtime because of its frequent compilation cycles.
     26To this day, the large reduction in compilation time significantly improves the development of the \CFA's runtime because of its frequent compilation cycles.
    2727
    2828\begin{table}[htb]
     
    5959
    6060\section{Expression Cost-Model}
     61\label{s:ExpressionCostModel}
    6162
    6263\CFA has been using a total-order expression cost-model to resolve ambiguity of overloaded expressions from the very beginning.
    63 Most \CFA operators can be overloaded in \CFA\footnote{
    64 \CFA excludes overloading the comma operator, short-circuit logical operators \lstinline{&&} and \lstinline{||}, and ternary conditional operator \lstinline{?}, all of which are short-hand control structures rather than operators.
    65 \CC overloads the comma and short-circuit logical operators, where the overloaded comma is esoteric short-circuit operators do not exhibit short-circuit semantics, which is confusing.};
     64Most \CFA operators can be overloaded in \CFA;
    6665hence, they are treated the same way as other function calls with the same rules for overload resolution.
    6766
    68 In \CFA, candidates of an overloaded expression are ranked by numerical cost elements, which account any type conversions needed from a call-site argument type to the matching function parameter type, as well as polymorphic types and restrictions introduced by the @forall@ clause.
    69 Currently, the expression cost has the following components, ranked from highest to lowest.
    70 \begin{enumerate}
     67In \CFA, candidates of an overloaded expression are ranked by numerical cost elements, which accounts for any type conversions needed from a call-site argument type to the matching function parameter type, as well as polymorphic types and restrictions introduced by the @forall@ clause.
     68Currently, the expression cost has the following components, ranked from highest to lowest. where lower cost is better.
     69\begin{enumerate}[leftmargin=*]
    7170\item \textbf{Unsafe} cost representing a narrowing conversion of arithmetic types, \eg @int@ to @short@, and qualifier-dropping conversions for pointer and reference types.
     71Narrowing conversions have the potential to lose (truncation) data.
     72A programmer must decide if the computed data-range can safely be shorted in the smaller storage.
     73Warnings for unsafe conversions are helpful.
     74\begin{cfa}
     75void f( short p );
     76f( 3L ); $\C[2.5in]{// unsafe conversion, possible warning}\CRT$
     77\end{cfa}
     78
    7279\item \textbf{Polymorphic} cost for a function parameter type that is or contains a polymorphic type variable.
     80The fewer polymorphic parameters means a more specific function, which may be able to compute faster or more accurately, like specialization templates in \CC.
     81\begin{cfa}
     82forall( T ) f( T p, int i ); $\C[2.5in]{// 1}$
     83forall( T ) f( T p, T i );   $\C{// 2}$
     84f( 3, 4 ); $\C{// 1}\CRT$
     85\end{cfa}
     86
    7387\item \textbf{Safe} cost representing a widening conversion \eg @short@ to @int@, qualifier-adding conversions for pointer and reference types, and value conversion for enumeration constants.
     88Even when conversions are safe, the fewest conversions it ranked better, \eg @short@ to @int@ versus @short@ to @long int@.
     89\begin{cfa}
     90void f( long int p ); $\C[2.5in]{// 1}$
     91void f( double p );   $\C{// 2}$
     92f( 3h ); $\C{// 1, short constant}\CRT$
     93\end{cfa}
     94
    7495\item \textbf{Variable} cost that counts the number of polymorphic variables, if any, introduced by the @forall@ clause in the function declaration.
     96Fewer polymorphic variables implies more specificity.
     97\begin{cfa}
     98forall( T, T ) f( T p1, T p2 ); $\C[2.5in]{// 1}$
     99forall( T, U ) f( T p1, U p2 ); $\C{// 2}$
     100f( 3, 4 ); $\C{// 1}$
     101f( 3, 4.5 ); $\C{// 2}\CRT$
     102\end{cfa}
     103
    75104\item \textbf{Specialization} cost counting the number of restrictions introduced by type assertions.
     105Fewer restriction means fews parametric variables passed at the function call giving better performance.
     106\begin{cfa}
     107forall( T | { T ?+?( T, T ) } ) void f( T ); $\C[3.25in]{// 1}$
     108forall( T | { T ?+?( T, T ), void f( T, T ) } ); $\C{// 2}$
     109f( 42 ); $\C{// 1}\CRT$
     110\end{cfa}
    76111\end{enumerate}
    77 The comparison of cost tuple is by lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.
     112Cost tuples are compared by lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.
    78113At a subexpression level, the lowest cost candidate for each result type is included as a possible interpretation of the expression;
    79114at the top level, all possible interpretations of different types are considered (generating a total ordering) and the overall lowest cost is selected as the final interpretation of the expression.
     
    112147\end{comment}
    113148
    114 In many languages that support function/method overloading, such as \CC and Java, a partial-order system decides which interpretation of the expression gets selected.
    115 Hence, sometimes the candidates are incomparable (none are considered a better match), and only when one candidate is considered better than all others (maximal with respect to the partial order) is the expression unambiguous.
    116 Specifically, the resolution algorithms used in \CC and Java are greedy, selecting the best match for each sub-expression without considering the higher-level ones (bottom-up).
     149In many languages that support function/method overloading, such as \CC and Java, a partial-order system decides which interpretation of the expression is selected.
     150Hence, sometimes the candidates are incomparable (none are considered a best match), and only when one candidate is considered better than all others (maximal with respect to the partial order) is the expression unambiguous.
     151Specifically, the resolution algorithms used in \CC and Java are greedy, selecting the best match for each subexpression without considering the higher-level ones (bottom-up).
    117152Therefore, at each resolution step, the arguments are already given unique interpretations, so the ordering only needs to compare different sets of conversion targets (function parameter types) on the same set of input.
    118153\begin{cfa}
    119154@generate a C++ example here@
     155
     156read more
    120157\end{cfa}
    121158
     
    124161\begin{cfa}
    125162@generate a CFA example here@
     163
     164read more
    126165\end{cfa}
    127166so it is possible that both the selected function and the set of arguments are different, and cannot be compared with a partial-ordering system.
     
    129168
    130169Ada is another programming language that has overloading based on return type.
    131 Although Ada also allows implicit type conversions of function arguments, it is fairly conservative on resolving ambiguities.
     170Although Ada also allows implicit type conversions of function arguments, it is fairly conservative on resolving ambiguities~\cite[\S~8.6]{Ada22}.
    132171\begin{cfa}
    133172@generate an Ada example here@
    134 \end{cfa}
    135 There are only two preference rules in Ada overload resolution, one for primitive arithmetic operators and one for universal access types (analogous to @void *@ in C);
     173
     174section 8.6 the context of overload resolution
     175page 468, item number 28 - 30
     176\end{cfa}
     177There are only two preference rules in Ada overload resolution, one for primitive arithmetic operators and one for universal access types (pointer type);
     178\begin{ada}
     179Function "-"( L, R : Float ) return Integer is begin
     180        return Integer(L) + (-Integer(R)); --  prevent infinite recursion
     181end;
     182Integer i;
     183i := 7 - 3; -- prefer
     184i := 7.2 - 3.5
     185\end{ada}
    136186any other cases where an expression has multiple legal interpretations are considered ambiguous.
    137 The current overload resolution system for \CFA is on the other end of the spectrum, as it tries to order every legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results.
     187The current overload resolution system for \CFA is on the other end of the spectrum, as it tries to order every legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
    138188
    139189Interestingly, the \CFA cost-based model can sometimes make expression resolution too permissive because it always attempts to select the lowest cost option, and only when there are multiple options tied at the lowest cost does it report the expression is ambiguous.
    140190The reason is that there are so many elements in the cost tuple, the type system ``tries too hard to discover a match'', and therefore, ties are uncommon.
    141 Other than the case of multiple exact matches, where all have cost zero, incomparable candidates under a partial ordering of being more specific can often have different expression costs since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections.
     191Other than the case of multiple exact matches, where all have cost zero, incomparable candidates under a partial ordering can often have different expression costs since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections.
    142192
    143193There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is clearly justifiable, and one of them is straight up wrong.
     
    152202Since polymorphic cost is ranked above safe conversion cost, option 2 is considered to have lower cost and gets selected.
    153203
    154 In contrast, the template deduction and overload resolution rules in \CC selects option 1 (converting @forall@ to the equivalent function template declaration).
    155 \CC performs template argument deduction and overload candidate ranking in two separate steps.
     204In contrast, the template inferencing and overload resolution rules in \CC selects option 1 (converting @forall@ to the equivalent function template declaration).
     205\CC performs template argument inferencing and overload candidate ranking in two separate steps.
    156206\begin{itemize}
    157207\item
     
    166216This difference could be explained as compensating for \CFA polymorphic functions being separately compiled versus template inlining;
    167217hence, calling them requires passing type information and assertions increasing the runtime cost.
     218We are learning that having the type system consider performance may be inappropriate.
     219
    168220\item
    169221Having a lower total polymorphic cost does not always mean a function is more specialized.
     
    185237\item
    186238A generic type can require more type variables to describe a more specific type.
    187 For example, a generic function taking a @pair@-type, requires two type variables.
     239For example, consider a generic function taking a @pair@-type requires two type variables.
    188240\begin{cfa}
    189241forall( T, U ) void f( pair( T, U ) ); $\C[2.75in]{// 1}$
     
    193245forall( T ) void f( T );                $\C{// 2}\CRT$
    194246\end{cfa}
    195 Passing a @pair@ variable to @f@ gives a cost of 1 poly, 2 variable for the @pair@ overload, versus a cost of 1 poly, 1 variable for the unconstrained overload.
    196 Programmer expectation is to select the former, but the cost model selects the latter;
    197 either could work.
     247Passing a @pair@ variable to @f@
     248\begin{cfa}
     249pair p;
     250f( p );
     251\end{cfa}
     252gives a cost of 1 poly, 2 variable for the @pair@ overload, versus a cost of 1 poly, 1 variable for the unconstrained overload.
     253Programmer expectation is to select option 1 because of the exact match, but the cost model selects 2;
     254while either could work, the type system should select a call that meets expectation of say the call is ambiguous, forcing the programmer to mediate.
    198255As a result, simply counting the number of polymorphic type variables is no longer correct to order the function candidates as being more constrained.
    199256\end{enumerate}
     
    205262Hence, the \CC template-specialization ordering can be applied to \CFA with a slight modification.
    206263
    207 At the meantime, some other improvements have been made to the expression cost system.
     264In the meantime, some other improvements have been made to the expression cost system.
    208265Most notably, the expression resolution algorithm now consistently uses the globally minimal cost interpretation, as discussed in a previous technical report.
    209266While implementing the change, there are also two detailed issues that need to be addressed for the new rules to fully work.
     
    238295For example, @long x = abs( 42 )@ resolves to @long abs( long )@ with @int@ argument 42 converted to @long@ or @int abs( int )@ converting the result to @long@.
    239296In this example, the choice could be arbitrary because both yield identical results.
    240 However, for @int i = abs( -9223372036854775807LL )@, the result is @-1@, suggesting some kind of type error rather than silently generating a logically incorrect result.
    241 There are multiple overload groupings of C functions into a single \CFA name, so usage should not report an ambiguity or warning unless there is a significant chance of error.
     297However, for @int i = abs( -9223372036854775807LL )@, the result is @-1@ due to the narrowing conversion from @long@ to @int@ on the assignment, suggesting at warning to the programmer to reconsider the type of @i@ or the compiler generates an error as the argument is too large for an @int@ parameter.
     298The \CFA system library has multiple overload groupings of C functions into a single name, so usage should not report an ambiguity or warning unless there is a significant chance of error.
    242299
    243300While testing the effects of the updated cost rule, the following example was found in the \CFA standard library.
     
    255312When asked, the developer expected the floating-point overload because of return-type overloading.
    256313This mistake went unnoticed because the truncated component was insignificant in the performance logging.
    257 Investigation of this example leads to the decision that the resolution algorithm favors a lower conversion cost up the expression tree when the total global cost is equal.
     314To correct this mistake, I changed the resolution algorithm to favour a lower conversion cost up the expression tree when the total global cost is equal.
    258315
    259316Another change addresses the issue that C arithmetic expressions have unique meanings governed by its arithmetic conversion rules.
     317From the C language reference manual:
    260318\begin{enumerate}[leftmargin=*,topsep=5pt,itemsep=4pt]
    261319\item
     
    290348\VRef[Rule]{p:SignedToUnsignedSafe} says an unsigned type is safely convertible to an signed type with greater rank, while \VRef[rule]{p:UnsignedToSignedUnsafe} says a signed type is unsafely convertible to an unsigned type.
    291349Therefore, these two rules cover every possible case.
    292 \VRef[Rule]{p:Miscellaneous} is the catch-all to accommodate any kinds of exotic integral representations.
    293 These conversions are applied greedily at local points within an expression.
     350\VRef[Rule]{p:Miscellaneous} is the odd-ball rule because it is really a demotion because signed to unsigned (marked in red) is unsafe.
     351Finally, assignment allows demotion of any larger type into a smaller type, with a possibly lossy conversion (and often no warning).
     352These promotion conversions are applied greedily at local points within an expression.
    294353Because there is no overloading in C, except for builtin operators, no cost model is needed to differentiate among alternatives that could result in ambiguous matches.
    295354
     
    319378This is demonstrated in the following example, adapted from the C standard library:
    320379\begin{cfa}
    321 unsigned long long x;
    322 (unsigned)( x >> 32 );
     380        unsigned long long x;
     381        (unsigned)( x >> 32 );
    323382\end{cfa}
    324383\vspace{5pt}
     
    328387However, this break from C semantics is not backwards compatible, so to maintain C compatibility, the \CFA resolver selects the lowest-cost interpretation of the cast argument for which a conversion or coercion to the target type exists (upcasting to @unsigned long long@ in the example above, due to the lack of unsafe downcasts), using the cost of the conversion itself only as a tie-breaker.\cite[pp.~46-47]{Moss19}
    329388\end{cquote}
    330 However, a cast expression is not necessary to have such inconsistency to C semantics.
     389However, a cast expression is unnecessary to have such inconsistency to C semantics.
    331390An implicit argument-parameter type conversion in a function calls can replicate this issue without an explicit cast.
    332391\begin{cfa}
     
    343402The idea here is to first look for the best integral alternative because integral calculations are exact and cheap.
    344403If no integral solution is found, than there are different rules to select among floating-point alternatives.
    345 This approach reduces the search space by partitioning: only look at integral operations, and then only look at float-point operations.
     404This approach reduces the search space by partitioning into two categories.
    346405
    347406
     
    352411\CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing).
    353412This restriction makes the unification problem more tractable in \CFA, as the argument types at each call site are usually all specified.
    354 There is a single exception case when the function return type contains a free type variable that does not occur in any of the argument types, and subsequently passed into the parent expression. One such example is the \CFA wrapper for @malloc@ which also infers size argument from the deducted return type.
    355 \begin{cfa}
    356 forall (T*) T* malloc() {
    357         return (T*) malloc (sizeof(T)); // calls C malloc with the size inferred from context
     413There is a single exception case when the function return type contains a free type variable that does not occur in any of the argument types, and subsequently passed into the parent expression.
     414One such example is the \CFA wrapper for @malloc@ which also infers size argument from the inferred return type.
     415\begin{cfa}
     416forall( T * ) T * malloc() {
     417        return ( T *)malloc( sizeof(T) ); // calls C malloc with the size inferred from context
    358418}
     419int * i = malloc();   // infer int for T from LHS
    359420\end{cfa}
    360421A top level expression whose type still contains an unbounded type variable is considered ill-formed as such an expression is inherently ambiguous.
     
    363424Some additions have been made in order to accommodate for the newly added type features to the language.
    364425To summarize, the \CFA type unification has two minor variants: an \textit{exact} mode and an \textit{inexact} mode.
     426The exact mode is required since the type matching algorithm operates recursively and the inner types often have to match exactly, \eg there is no common type for the pointer types @int *@ and @long *@ while there is for @int@ and @long@.
    365427The inexact mode is applied at top level argument-parameter matching, and attempts to find an assignment to the type variables such that the argument types can be converted to parameter types with minimal cost as defined in the previous section.
    366 The exact mode is required since the type matching algorithm operates recursively and the inner types often have to match exactly, for example there is no common type for the pointer types \textbf{int*} and \textbf{long*} while there is for @int@ and @long@.
    367428With the introduction of generic record types, the parameters must match exactly as well; currently there are no covariance or contravariance supported for the generics.
    368429
    369 One simplification was made to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed in declarations.
     430One simplification was made to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed.
    370431The polymorphic function declarations themselves are still treated as function pointer types internally, however the change means that formal parameter types can no longer be polymorphic.
    371 Previously it is possible to write function prototypes such as
    372 \begin{cfa}
    373 void f( forall( T | { T -?( T ); } ) T (*p)( T, T ) );
    374 \end{cfa}
    375 Notably, the unification algorithm implemented in the \CFA compiler has never managed to trace the assertion parameters on the formal types at all, and the problem of determining if two assertion sets are compatible may very likely be undecidable in general, given the ability of synthesizing more complicated types by the nesting of generics.
    376 Eventually, the reason of not allowing such constructs is that they mostly do not provide useful type features for actual programming tasks.
    377 A subroutine of a program operates on the arguments provided at the call site together with (if any) local and global variables, and even though the subroutine can be polymorphic, the types will be supported at each call site.
    378 On each invocation the types to be operate on can be determined from the arguments provided, and therefore there should not be a need to pass a polymorphic function pointer, which can take any type in principle.
    379 
    380 For example, consider a polymorphic function that takes one argument of type @T@ and another pointer to a subroutine
    381 \begin{cfa}
    382 forall (T) void f (T x, forall (U) void (*g) (U));
    383 \end{cfa}
    384 Making @g@ polymorphic in this context would almost certainly be unnecessary, since it can only be called inside the body of @f@ and the types of the argument would have been known anyways, although it can potentially depend on @T@.
    385 Moreover, requesting a function parameter to be able to potentially work on any input type at all would always impose too much constraint on the arguments, as it only needs to make each calls inside the body of @f@ valid.
    386 
    387 Rewriting the prototype to
    388 \begin{cfa}
    389 forall (T) void f (T x, void (*g) (T));
    390 \end{cfa}
    391 will be sufficient (or potentially, some compound type synthesized from @T@), in which case @g@ is no longer a polymorphic type on itself.
    392 The ``monomorphization'' conversion is readily supported in \CFA, either by explicitly assigning a polymorphic function name to a compatible function pointer type, or implicitly done in deducing assertion parameters (which will be discussed in the next section).
    393 Such technique can be directly applied to argument passing, which is essentially just assignment to function parameter variables.
    394 There might be some edge cases where the supplied subroutine @g@ is called on arguments of different types inside the body of @f@ and so declared as polymorphic, but such use case is rare and the benefit of allowing such constructs seems to be minimal in practice.
    395 
    396 The result of this change is that the unification algorithm no longer needs to distinguish open and closed type-variables, as the latter is not allowed to exist.
     432Previously it was possible to write function prototypes such as
     433\begin{cfa}
     434void f( forall( T | { T -?( T ); } ) T (@*p@)( T, T ) );
     435\end{cfa}
     436Notably, the unification algorithm implemented in the \CFA compiler has never managed to trace the assertion parameters on the formal types at all, and the problem of determining if two assertion sets are compatible is likely undecidable in general, given the ability of synthesizing more complicated types by the nesting of generics.
     437Essentially, the reason for disallowing this construct is that it does not provide a useful type feature.
     438A function operates on the call-site arguments together with any local and global variables.
     439When the function is polymorphic, the types are inferred at each call site.
     440On each invocation, the types to be operate on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
     441For example, consider a polymorphic function that takes one argument of type @T@ and polymorphic function pointer.
     442\begin{cfa}
     443forall( T ) void f( T x, forall( U ) void (* g)( U ) );
     444\end{cfa}
     445Making @g@ polymorphic in this context is almost certainly unnecessary, since it can only be called inside the body of @f@ and the types of the argument must be known, although it can potentially depend on @T@.
     446Moreover, requesting a function parameter to be able to work on any input type would impose too much constraint on the arguments, as it only needs to make each call inside the body of @f@ valid.
     447
     448Hence, rewriting the prototype (or potentially, some compound type synthesized from @T@):
     449\begin{cfa}
     450forall( T ) void f( T x, void (* g)( T ) );
     451\end{cfa}
     452is sufficient, so @g@ is no longer a polymorphic type itself.
     453This \emph{monomorphization} conversion is readily supported in \CFA, either by explicitly assigning a polymorphic function name to a compatible function pointer type, or implicitly done in inferring assertion parameters (which is discussed next).
     454This technique is applicable to argument passing, which is just an assignment to a function parameter variable.
     455In theory, there might be edge cases where the supplied function @g@ is called on arguments of different types inside the body of @f@ and so needs to be polymorphic, but this case is rare and its benefit seems to be minimal in practice.
     456
     457The result of this change is that the unification algorithm no longer needs to distinguish open (unbounded) and closed (bounded) type-variables, as the latter is not allowed to exist.
    397458The only type variables that need to be handled are those introduced by the @forall@ clause from the function prototype.
    398 The subtype relationship between function types is now also rendered redundant since none of the function parameter or return types can be polymorphic, and no basic types or non-polymorphic function types are subtypes of any other type.
    399 Therefore the goal of (exact) type unification now simply becomes finding a substitution that produces identical types.
    400 The assertion set need to be resolved is also always just the declarations on the function prototype, which also simplifies the assertion satisfaction algorithm by a bit, as will be discussed further in the next section.
    401 
    402 The type unification results are stored in a type environment data structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and some other extra information, such as whether the bound type is allowed to be opaque (i.e.
    403 a forward declaration without definition in scope), and whether the bounds are allowed to be widened.
    404 In the more general approach commonly used in functional languages, the unification variables are given a lower bound and an upper bound to account for covariance and contravariance of types.
    405 \CFA currently does not implement any variance with its generic types, and does not allow polymorphic function types, therefore no explicit upper bound is needed and one simple binding value for each equivalence class suffices.
    406 However, since type conversions are allowed in \CFA, the type environment needs to keep track on which type variables are allowed conversions.
    407 This behavior is notably different from \CC template argument deduction which enforces an exact match everywhere unless the template argument types are explicitly given.
    408 For example, a polymorphic maximum function in \CFA can be called with arguments of different arithmetic types and the result follows the usual arithmetic conversion rules, while such expression is not allowed by \CC:
    409 \begin{cfa}
    410 forall (T | {int ?<? (T, T); }) T max (T, T);
    411 
    412 max (42, 3.14); // OK, T=double; requires explicit type annotation in C++ such as max<double>(42, 3.14);
    413 \end{cfa}
    414 
    415 From a theoretical point of view, the simplified implementation of type environment has its shortcomings. There are some cases that do not work nicely with this implementation and some compromise has to be made. A more detailed discussion follows at the end of this chapter.
    416 
    417 
    418 \section{Satisfaction of Assertions}
    419 
    420 The assertion satisfaction problem greatly increases the complexity of \CFA expression resolution.
    421 Past experiments have shown that the majority of time is spent in resolving the assertions for those expressions that takes the longest time to resolve.
    422 Even though a few heuristics-based optimizations are introduced to the compiler now, this remains to be the most costly part of compiling a \CFA program.
     459The subtype relationship among function types is now also rendered redundant since none of the function parameter or return types can be polymorphic, and no basic types or non-polymorphic function types are subtypes of any other type.
     460Therefore, the goal of (exact) type unification becomes finding a substitution that produces identical types.
     461The assertion set that needs to be resolved is just the declarations on the function prototype, which also simplifies the assertion satisfaction algorithm, which is discussed further in the next section.
     462
     463An implementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened.
     464In the general approach commonly used in functional languages, the unification variables are given a lower bound and an upper bound to account for covariance and contravariance of types.
     465\CFA does not implement any variance with its generic types and does not allow polymorphic function types, therefore no explicit upper bound is needed and one binding value for each equivalence class suffices.
     466However, to handle type conversions, the type environment needs to keep track of which type variables can be converted.
     467This behaviour is notably different from \CC template argument inferencing that enforces an exact match everywhere unless the template argument types are explicitly given.
     468For example, a polymorphic maximum function in \CFA can be called with arguments of different arithmetic types and the result follows the usual arithmetic conversion rules.
     469\begin{cfa}
     470forall( T | {int ?<? ( T, T ); } ) T max ( T, T );
     471max( 42, 3.14 );   $\C[2.5in]{\LstCommentStyle // \CFA implicitly infers T == double}$
     472max<double>(42, 3.14); $\C{\LstCommentStyle // \CC requires explicit type specification}\CRT$
     473\end{cfa}
     474From a theoretical standpoint, the simplified implementation of the type environment has its shortcomings.
     475Some cases do not work nicely with this implementation, and hence, some compromise has to be made.
     476A more detailed discussion follows in \VRef{s:CompilerImplementationConsiderations}.
     477
     478
     479\section{Assertion Satisfaction}
     480
     481The assertion-satisfaction problem greatly increases the complexity of \CFA expression resolution.
     482Past experiments have shown that the majority of compilation time is spent in resolving the assertions for those expressions that takes the longest time to resolve.
     483Even though a few heuristics-based optimizations have been introduced to the compiler, this remains the most costly part of a \CFA compilation.
    423484The major difficulty of resolving assertions is that the problem can become recursive, since the expression used to satisfy an outstanding assertion can have its own assertions, and in theory this can go on indefinitely.
    424485Detecting infinite recursion cases in general is not algorithmically possible and it is not attempted in the compiler.
    425486Instead, a fixed maximum depth of recursive assertions is imposed.
    426 This approach is also taken by \CC compilers as template argument deduction is also similarly undecidable in general.
    427 
    428 
    429 In previous versions of \CFA this number was set at 4; as the compiler becomes more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and in most occasions it does not lead to trouble.
    430 Very rarely there will be a case where the infinite recursion produces an exponentially growing assertion set, causing minutes of time wasted before the limit is reached.
    431 Fortunately it is very hard to run into this situation with realistic \CFA code, and the ones that were found all have some clear characteristics, which can be prevented by some clever tricks.
    432 In fact, some of the performance optimizations come from analyzing these problematic cases.
    433 One example of such will be presented later in this section.
    434 
    435 While the assertion satisfaction problem in isolation looks like just another expression to resolve, the recursive nature makes some techniques applied to expression resolution without assertions no longer possible.
    436 The most significant impact is that the type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means that if one expression has multiple associated assertions, they are not independent as the changes to the type environment must be compatible for all the assertions to be resolved.
     487This approach is also taken by \CC compilers as template argument inferencing is also similarly undecidable in general.
     488
     489
     490In previous versions of \CFA, this number was set at 4; as the compiler becomes more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it does not lead to problems.
     491Only rarely is there a case where the infinite recursion produces an exponentially growing assertion set, causing minutes of time wasted before the limit is reached.
     492Fortunately, it is very hard to generate this situation with realistic \CFA code, and the ones that have occurred have clear characteristics, which can be prevented by alternative approaches.
     493In fact, some of my performance optimizations come from analyzing these problematic cases.
     494One example is analysed in this section.
     495
     496While the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible.
     497The most significant impact is that type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means if one expression has multiple associated assertions it is dependent, as the changes to the type environment must be compatible for all the assertions to be resolved.
    437498Particularly, if one assertion parameter can be resolved in multiple different ways, all of the results need to be checked to make sure the change to type variable bindings are compatible with other assertions to be resolved.
    438 A naive algorithm that simply picks any pending assertion to resolve and continue in a depth-first search could be very inefficient and especially prone of falling into an infinite loop, while in many cases it can be avoided by examining other assertions that can provide insight on the desired type binding: if one assertion parameter can only be matched by a unique option, we can then update the type bindings confidently without the need of backtracking.
    439 
    440 
    441 The algorithm currently used in \CFA compiler is designed by Aaron Moss through a simplified prototype experiment that captures most of \CFA type system features and ported back to the actual language.
    442 It can be described as a mix of breadth- and depth-first search in a staged approach.
    443 
    444 
     499A naive algorithm that simply picks any pending assertion to resolve and continue in a depth-first search could be very inefficient and especially prone to an infinite loop.
     500In many cases, these problems can be avoided by examining other assertions that provide insight on the desired type binding: if one assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking.
     501
     502The Moss algorithm currently used in \CFA was developed using a simplified type-simulator that capture most of \CFA type-system features.
     503The simulation results were then ported back to the actual language.
     504The simulator used a mix of breadth- and depth-first search in a staged approach.
    445505To resolve a set of assertions, the algorithm first attempts to resolve each assertion item individually.
    446506There are three possible outcomes on resolving each assertion:
    447507\begin{enumerate}
    448 \item If no matches are found, the algorithm terminates with a failure.
    449 \item If exactly one match is found, the type environment is updated immediately, and used in resolving any remaining assertions.
    450 \item If multiple matches are found, the assertion candidates with their updated type environments are stored in a list that will be checked for compatibility at the end.
     508\item If no matches are found, the algorithm terminates with a failure (ambiguity).
     509\item If exactly one match is found, the type environment is updated immediately with this result, affecting the resolution of remaining assertions.
     510\item If multiple matches are found, the assertion candidates with their updated type environments are stored in a list that is checked for compatibility at the end.
    451511\end{enumerate}
    452512When all assertion items are resolved successfully, the algorithm attempts to combine the ambiguously resolved assertions to produce mutually compatible assignments.
    453 If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until there are none pending resolution or the recursion limit is reached which results in a failure.
    454 
    455 It has been discovered in practice that the efficiency of such algorithm can sometimes be very sensitive to the order of resolving assertions.
    456 Suppose an unbound type variable @T@ appears in two assertions, one can be uniquely resolved and allow the type @T@ to be inferred immediately, and another has many ways to be resolved, each results in @T@ being bound to a different concrete type.
    457 If the first assertion is examined first by the algorithm, the deducted type can then be utilized in resolving the second assertion and eliminate many incorrect options without producing the list of candidates pending further checks.
    458 In practice, this have a significant impact when an unbound type @T@ is declared to satisfy the basic \emph{object assertions}\footnote{The term is borrowed from object-oriented languages although \CFA is not object-oriented in principle.} of having a default constructor, destructor, and copy assignment operations.
    459 Since they are defined for every type currently in scope, there are often hundreds or even thousands of matches to these functions with an unspecified operand type, and in most of the cases the value of @T@ can be deduced by resolving another assertion first, which then allows specific object lifetime functions to be looked up since they are sorted internally by the operand type, and greatly reduces the number of wasted resolution attempts.
    460 
    461 Currently this issue also causes the capability of the assertion resolution algorithm to be limited.
    462 Assertion matching is implemented to be more restricted than expression resolution in general, in that the parameter types must match exactly, rather than just merely callable.
    463 If one function declaration includes an assertion of @void f(T)@ and only a @f(long)@ is currently in scope, trying to resolve the assertion with @T=int@ would not work.
    464 Loosening the assertion matching requirement causes many assertion variables to have multiple matches and makes the delayed combination step too costly.
    465 
    466 Given all the issues caused by assertion resolution potentially creating new type variable bindings, a natural progression is to put some restrictions on free type variables such that all the type variables will be bound when the expression reaches assertion resolution stage.
     513If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until there are none pending resolution or the recursion limit is reached, which results in a failure.
     514
     515However, in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions.
     516Suppose an unbound type variable @T@ appears in two assertions:
     517\begin{cfa}
     518forall( @T@ | { void foo( @T@ ), void bar( @T@ ) } ) T f( T );
     519void foo( int );
     520void bar( int );   void bar( double );   ...
     521f( 3 );
     522\end{cfa}
     523where one can be uniquely resolved and allow the type @T@ to be inferred immediately, and another has many ways to be resolved, each resulting in @T@ being bound to a different concrete type.
     524If the first assertion is examined by the algorithm, the inferred type can then be utilized in resolving the second assertion eliminating many incorrect options without producing a list of candidates requiring further checks.
     525In practice, this have a significant impact when an unbound type @T@ is declared to satisfy the basic \emph{object assertions}\footnote{The term is borrowed from object-oriented languages although \CFA is not object-oriented.} of having a default constructor, destructor, and copy assignment operations.
     526Since these functions are implicitly defined for almost every type in scope, there can be hundreds or even thousands of matches to these functions with an unspecified operand type.
     527In most cases, the value of @T@ can be inferred by resolving other assertions first, and then the object lifetime-functions can easily be fulfilled since functions are sorted internally by the operand type;
     528because of its size, this optimization greatly reduces the number of wasted resolution attempts.
     529
     530This issue also limits the capability of the assertion resolution algorithm.
     531Assertion matching is implemented to be more restrictive than expression resolution in general, in that the parameter types must match exactly, rather than just merely callable.
     532If a function declaration includes the assertion @void f(T)@ and only a @f(long)@ is in scope, trying to resolve the assertion with @T == int@ does not work.
     533However, loosening the assertion matching requirement causes many assertion variables to have multiple matches and makes the delayed combination step too costly.
     534
     535Given all the issues caused by assertion resolution creating new type variable bindings, a natural progression is to restrict free type-variables such that all the type variables are bound when the expression reaches the assertion resolution stage.
    467536A type variable introduced by the @forall@ clause of function declaration can appear in parameter types, return types and assertion variables.
    468 If it appears in the parameter types, it will be bound when matching the arguments to parameters at the call site.
    469 If it only appears in the return type, it can be eventually figured out by the context in principle.
    470 The current implementation in \CFA compiler does not do enough return type deduction as it performs eager assertion resolution, and the return type information cannot be known in general before the parent expression is resolved, unless the expression is in an initialization context, in which the type of variable to be initialized is certainly known.
     537If it appears in parameter types, it can be bound when matching the arguments to parameters at the call site.
     538If it only appears in the return type, it can be eventually be determined from the call-site context.
     539Currently, type resolution cannot do enough return-type inferencing while performing eager assertion resolution: the return type information is unknown before the parent expression is resolved, unless the expression is an initialization context where the variable type is known.
    471540By delaying the assertion resolution until the return type becomes known, this problem can be circumvented.
    472 The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in some assertion variables.
    473 Such case is very rare and it is not evident that forcing every type variable to appear at least once in parameter or return types limits the expressiveness of \CFA type system to a significant extent.
    474 In the next chapter I will discuss about a proposal of including type declarations in traits rather than having all type variables appear in the trait parameter list, which could be helpful for providing equivalent functionality of having an unbound type parameter in assertion variables, and also addressing some of the variable cost issue discussed in section 4.1.
    475 
    476 
    477 \subsection*{Caching Assertion Resolution Results}
     541The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables (associate types).
     542\begin{cfa}
     543forall( T | { void foo( @T@ ) } ) int f( float ) {
     544        @T@ t;
     545}
     546\end{cfa}
     547This case is rare so forcing every type variable to appear at least once in parameter or return types limits does not limit the expressiveness of \CFA type system to a significant extent.
     548The next section presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which is provides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}.
     549
     550
     551\subsection{Caching Assertion Resolution Results}
    478552
    479553In Aaron Moss's prototype design and experiments, a potential optimization of caching the result of already resolved assertions is discussed.
    480 Based on the experiment results, this approach can improve the performance of expression resolution in general, and sometimes allow hard instances of assertion resolution problems to be solved that are otherwise infeasible, for example when the resolution would encounter infinite loops.
    481 
    482 The problem that makes this approach tricky to be implemented correctly is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
    483 If we were to cache those results that cause the type bindings to be modified, it would be necessary to store the changes to type bindings too, and in case where multiple candidates can be used to satisfy one assertion parameter, all of them needs to be cached including those that are not eventually selected, since the side effect can produce different results depending on the context.
    484 
    485 In the original design of \CFA that includes unrestricted polymorphic formal parameters that can have assertions on themselves, the problem is even more complicated as new declarations can be introduced in scope during expression resolution.
    486 Here is one such example taken from Bilson:
     554Based on the experiment results, this approach can improve the performance of expression resolution in general, and sometimes allow difficult instances of assertion resolution problems to be solved that are otherwise infeasible, \eg when the resolution encounters an infinite loop.
     555
     556The tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
     557If the modifications are cached, \ie the results that cause the type bindings to be modified, it is also necessary to store the changes to type bindings, too.
     558Furthermore, in cases where multiple candidates can be used to satisfy one assertion parameter, all of them must be cached including those that are not eventually selected, since the side effect can produce different results depending on the context.
     559
     560The original design of \CFA includes unrestricted polymorphic formal parameters with assertions on themselves, making the problem more complicated as new declarations can be introduced in scope during expression resolution.
     561Here is an example taken from Bilson:
    487562\begin{cfa}
    488563void f( forall( T | { T -?( T ); } ) T (*p)( T, T ) );
    489 forall( U, V | { U -?( U ); V -?( V ); } ) U g( U, V ) );
    490 f( g );
    491 \end{cfa}
    492 The inner assertion parameter on the \textit{closed} type variable @T@ is used to satisfy the assertions on @U@ and @V@ in this example.
    493 
    494 However, as per the previous discussions on this topic, polymorphic function pointer types have been removed from \CFA, since correctly implementing assertion matching is not possible in general.
    495 Without closed parameters (and therefore no have-set for assertions) the set of declarations in scope remains unchanged while resolving any expression.
    496 The current \CFA implementation also does not attempt to widen any already bound type parameters to satisfy an assertion.
    497 Note that such restriction does mean that certain kinds of expressions cannot be resolved, for example:
    498 \begin{cfa}
    499 forall (T | {void f(T);}) void g(T);
    500 void f (long);
    501 g(42);
     564forall( U, V | { U -?( U ); V -?( V ); } )  U @g@( U, V ) );
     565f( @g@ );
     566\end{cfa}
     567The inner assertion parameter on the \emph{closed} type-variable @T@ is used to satisfy the assertions on @U@ and @V@ in this example.
     568
     569However, as per the previous discussions on this topic, polymorphic function-pointer types have been removed from \CFA, since correctly implementing assertion matching is impossible in general.
     570Without closed parameters (and therefore no have-set for assertions) the set of declarations in scope remains unchanged, while resolving an expression.
     571The current \CFA implementation also does not attempt to widen any bound type-parameters to satisfy an assertion.
     572Note, this restriction does mean certain kinds of expressions cannot be resolved, \eg:
     573\begin{cfa}
     574forall( T | { void f( T ); } )  void g( T );
     575void f( long );
     576g( 42 );
    502577\end{cfa}
    503578The call @g(42)@ is rejected since no attempt is made to widen the parameter type @T@ from @int@ to @long@.
    504 Such problem could be mitigated if we allow inexact matches of assertions, but cannot be eliminated completely, if @T@ is matched in a parameterized type, including pointers and references:
    505 \begin{cfa}
    506 forall (T | {void f(T*);}) void g(T);
    507 void f (long *);
    508 g(42);
    509 \end{cfa}
    510 Here the only way to resolve the call @g(42)@ is to allow assertion resolution to widen the parameter @T@, since even with inexact matching, @int*@ cannot be converted to @long*@.
     579Such problem could be mitigated by allowing inexact matches of assertions, but cannot be eliminated completely, if @T@ is matched in a parameterized type, including pointers and references:
     580\begin{cfa}
     581forall( T | {void f( T*);})  void g( T );
     582void f( long * );
     583g( 42 );
     584\end{cfa}
     585Here the only way to resolve the call @g(42)@ is to allow assertion resolution to widen the parameter @T@, but even with inexact matching, @int *@ cannot be converted to @long *@.
    511586
    512587
    513588\section{Compiler Implementation Considerations}
     589\label{s:CompilerImplementationConsiderations}
     590
    514591\CFA is still an experimental language and there is no formal specification of expression resolution rules yet.
    515 Presently there is also only one reference implementation, namely the cfa-cc translator, which is under active development and the specific behavior of the implementation can change frequently as new features are added.
    516 
    517 Ideally, the goal of expression resolution involving polymorphic functions would be to find the set of type variable assignments such that the global conversion cost is minimal and all assertion variables can be satisfied.
    518 Unfortunately, with a lot of complications involving implicit conversions and assertion variables, fully achieving this goal is not realistic. The \CFA compiler is specifically not designed to accommodate for all edge cases, either.
    519 Instead it makes a few restrictions to simplify the algorithm so that most expressions that will be encountered in actual code can still pass type checking within a reasonable amount of time.
    520 
    521 As previously mentioned, \CFA polymorphic type resolution is based on a modified version of unification algorithm, where both equivalence (exact) and subtyping (inexact) relations are considered. However, the implementation of type environment is simplified; it only stores a tentative type binding with a flag indicating whether \textit{widening} is possible for an equivalence class of type variables.
    522 Formally speaking, this means the type environment used in \CFA compiler is only capable of representing \textit{lower bound} constraints.
    523 This simplification can still work well most of the time, given the following properties of the existing \CFA type system and the resolution algorithms in use:
    524 
     592Currently, there is only one reference implementation, namely the @cfa-cc@ translator, which is under active development and the specific behaviour of the implementation can change frequently as new features are added.
     593(This situation is standard for any new programming language.)
     594
     595Ideally, the goal of expression resolution involving polymorphic functions is to find the set of type variable assignments such that the global conversion cost is minimal and all assertion variables can be satisfied.
     596Unfortunately, there are a lot of complications involving implicit conversions and assertion variables;
     597hence, fully achieving this goal is unrealistic.
     598And as noted, the \CFA compiler is not covering all the edge cases for its current type-system design.
     599Instead it makes a few restrictions to simplify the resolution algorithm so most expressions in actual code still pass type checking within a reasonable amount of time.
     600
     601As previously mentioned, \CFA polymorphic type resolution is based on a modified version of a unification algorithm, where both equivalence (exact) and subtyping (inexact) relations are considered.
     602However, the implementation of the type environment is simplified;
     603it only stores a tentative type binding with a flag indicating whether \emph{widening} is possible for an equivalence class of type variables.
     604Formally speaking, this means the type environment used in \CFA is only capable of representing \emph{lower-bound} constraints.
     605This simplification works most of the time, given the following properties of the existing \CFA type system and the resolution algorithms:
    525606\begin{enumerate}
    526         \item Type resolution almost exclusively proceeds in bottom-up order, which naturally produces lower bound constraints. Since all identifiers can be overloaded in \CFA, not much definite information can be gained from top-down. In principle it would be possible to detect non-overloaded function names and perform top-down resolution for those; however, the prototype experiments have shown that such optimization does not give a meaningful performance benefit, and therefore it is not implemented.
    527         \item Few nontrivial subtyping relationships are present in \CFA, namely the arithmetic types presented in \VRef[Figure]{f:CFACurrArithmeticConversions}, and qualified pointer/reference types. In particular, \CFA lacks the nominal inheritance subtyping present in object-oriented languages, and the generic types do not support covariance on type parameters. As a result, named types such as structs are always matched by strict equivalence, including any type parameters should they exist.
    528         \item Unlike in functional programming where subtyping between arrow types exists, \ie if $T_2 <: T_1$ and $U_1 <: U_2$ then $T_1 \rightarrow T_2 <: U_1 \rightarrow U_2$, \CFA uses C function pointer types and the parameter/return types must match exactly to be compatible.
     607        \item
     608        Type resolution almost exclusively proceeds in bottom-up order, which naturally produces lower bound constraints.
     609        Since all identifiers can be overloaded in \CFA, little information can be gained from top-down analysis.
     610        In principle, it is possible to detect non-overloaded function names and perform top-down resolution for those;
     611        however, Moss' prototype experiments showed this optimization does not give a meaningful performance benefit, and therefore was not implemented.
     612        \item
     613        Few nontrivial subtyping relationships are present in \CFA, \eg the arithmetic types presented in \VRef[Figure]{f:CFACurrArithmeticConversions} and qualified pointer/reference types.
     614        The reason is that \CFA lacks nominal inheritance subtyping present in object-oriented languages, and the generic types do not support covariance on type parameters.
     615        As a result, named types such as structures are always matched by strict equivalence, even when type parameters exist.
     616        \item
     617        Unlike functional programming where subtyping between arrow types exists, \ie if $T_2 <: T_1$ and $U_1 <: U_2$ then $T_1 \rightarrow T_2 <: U_1 \rightarrow U_2$, \CFA uses C function pointer types and the parameter/return types must match exactly to be compatible.
    529618\end{enumerate}
    530619
    531 \CFA does attempt to incorporate type information propagated from upstream in the case of variable declaration with initializer, since the type of the variable being initialized is definitely known. It is known that the current type environment representation is flawed in handling such type deduction when the return type in the initializer is polymorphic, and an inefficient workaround has to be performed in certain cases. An annotated example is included in the \CFA compiler source code:
    532 
    533 \begin{cfa}
    534 // If resolution is unsuccessful with a target type, try again without, since it
    535 // will sometimes succeed when it wouldn't with a target type binding.
    536 // For example:
    537 forall( otype T ) T & ?[]( T *, ptrdiff_t );
     620\CFA does attempt to incorporate upstream type information propagated from variable declaration with initializer, since the type of the variable being initialized is definitely known.
     621However, the current type-environment representation is flawed in handling such type inferencing, when the return type in the initializer is polymorphic.
     622Currently, an inefficient workaround is performed to efficiency the necessary effect.
     623The following is an annotated example of the workaround.
     624\begin{cfa}
     625// If resolution is unsuccessful with a target type, try again without, since it sometimes succeeds
     626// when it does not with a target type binding. For example:
     627forall( T ) T & ?[]( T *, ptrdiff_t );
    538628const char * x = "hello world";
    539629int ch = x[0];
     
    542632// * (x: const char *) is unified with int *, which fails
    543633\end{cfa}
    544 
    545 The problem here is that we can only represent the constraints $T = int$ and $int <: T$, but since the type information flows in the opposite direction, the proper constraint for this case is $T <: int$, which cannot be represented in the simplified type environment. Currently, an attempt to resolve with equality constraint generated from the initialized variable is still made, since it is often the correct type binding (especially in the case where the initialized variable is a struct), and when such attempt fails, the resolution algorithm is rerun without the initialization context.
    546 
    547 One additional remark to make here is that \CFA does not provide a mechanism to explicitly specify values for polymorphic type parameters. In \CC for example, users may specify template arguments in angle brackets, which could be useful when automatic deduction fails, \eg @max<double>(42, 3.14)@.
    548 There are some partial workarounds such as adding casts to the arguments, but they are not guaranteed to work in all cases. If a type parameter appears in the function return type, however, using the ascription (return) cast will force the desired type binding, since the cast only compiles when the expression type matches exactly with the target.
    549 
    550 \section{Related Work}
    551 
    552 
    553 
     634The problem is that the type system can only represent the constraints $T = int$ and $int <: T$, but since the type information flows in the opposite direction, the proper constraint for this case is $T <: int$, which cannot be represented in the simplified type environment. Currently, an attempt to resolve with equality constraint generated from the initialized variable is still made, since it is often the correct type binding (especially in the case where the initialized variable is a structure), and when such attempt fails, the resolution algorithm is rerun without the initialization context.
     635
     636One additional remark to make here is that \CFA does not provide a mechanism to explicitly specify values for polymorphic type parameters. In \CC for example, users may specify template arguments in angle brackets, which is necessary when automatic inferencing fails, \eg @max<double>(42, 3.14)@.
     637There are some partial workarounds such as adding casts to the arguments, but they are not guaranteed to work in all cases.
     638If a type parameter appears in the function return type, however, using an ascription (return) cast
     639\begin{cfa}
     640(@return@ double (*)( double, double ))max( 42, 3.14 );   $\C[2.5in]{// tell resolver what type to use}\CRT$
     641\end{cfa}
     642forces the desired type binding, since the cast only compiles when the expression type matches exactly with the target.
  • doc/theses/fangren_yu_MMath/uw-ethesis.tex

    r0393fda8 r7d405eb  
    9595\lstset{language=cfa,belowskip=-1pt} % set default language to CFA
    9696\lstnewenvironment{c++}[1][]{\lstset{language=[GNU]C++,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
     97\lstnewenvironment{haskell}[1][]{\lstset{language=Haskell,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
     98\lstnewenvironment{swift}[1][]{\lstset{language=swift,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
     99\lstnewenvironment{scala}[1][]{\lstset{language=scala,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
     100\lstnewenvironment{ada}[1][]{\lstset{language=Ada,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
    97101
    98102\newcommand{\PAB}[1]{{\color{red}PAB: #1}}
     
    211215\input{intro}
    212216\input{background}
    213 \input{content1}
    214 \input{content2}
    215 \input{performance}
     217\input{features}
     218\input{resolution}
     219\input{future}
    216220\input{conclusion}
    217221
Note: See TracChangeset for help on using the changeset viewer.