Changeset 07dbcba


Timestamp: Sep 20, 2024, 8:56:04 AM (2 months ago)
Author: Peter A. Buhr <pabuhr@…>
Branches: master
Children: b723b63
Parents: b9b8643
Message: some proofreading of Fangren's thesis
Location: doc/theses/fangren_yu_MMath
Files: 3 edited

  • doc/theses/fangren_yu_MMath/content1.tex

\end{cfa}
and the hoisted type names can clash with global type names.
For good reasons, \CC chose to change this semantics:
\begin{cquote}
\begin{description}[leftmargin=*,topsep=0pt,itemsep=0pt,parsep=0pt]
\item[Change:] A struct is a scope in C++, not in C.
\item[Rationale:] Class scope is crucial to C++, and a struct is a class.
…
The latter is probably rare.
\end{description}
\hfill ISO/IEC 14882:1998 (\CC Programming Language Standard)~\cite[C.1.2.3.3]{ANSI98:C++}
\end{cquote}
However, there is no syntax to access from a variable through a type to a field.
\begin{cfa}
  • doc/theses/fangren_yu_MMath/content2.tex

\chapter{Resolution Algorithms}
\label{c:content2}

\CFA's type system is powerful but fairly complicated, which leads to high compilation cost.
The compiler needs to analyze each expression under many possible forms of overloading.
Variables can be overloaded in \CFA, and functions can be overloaded by argument types as well as return types.
Combined with the polymorphism introduced by forall clauses and generic types, the complexity of expression analysis grows very quickly.
Designing a rule set that behaves mostly as expected, and implementing it as an algorithm efficient enough for practical use, are both very challenging tasks.
As the \CFA translator's performance has improved to the level where it can compile a mid-sized program in a reasonable amount of time, the development of \CFA's standard library has also sped up, and many new features utilizing the expressiveness of \CFA's type system have been implemented, such as generic container types similar to those in \CC's standard template library.
During this development, many weaknesses and design flaws of the \CFA type system have been discovered.
Some of those problems arise from the newly introduced language features described in the previous chapter, and fixing their unexpected interactions with the type system is especially difficult.
This chapter describes the type resolution rules currently in use and some major problems that have been identified.
Not all of these problems have solutions yet, because fixing them may require redesigning parts of the \CFA language at a larger scale.


\section{Expression Cost Model}

\CFA has used an expression cost model to resolve the ambiguity of overloaded expressions from the very beginning.
Since most operators can be overloaded in \CFA (excluding a few with special semantics, such as the comma operator and the short-circuit logical operators \&\& and ||, which require their operands to be evaluated in order), they are treated the same way as other function calls, and the same overload resolution rules apply to them.
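
For illustration, a minimal sketch of an operator overload using \CFA's operator-identifier syntax; the @MyComplex@ type and its addition are hypothetical:
\begin{cfa}
struct MyComplex { double re, im; };
MyComplex ?+?( MyComplex a, MyComplex b ) {  // the '?'s mark the operand positions
	return (MyComplex){ a.re + b.re, a.im + b.im };
}
// an expression a + b on MyComplex values now resolves like any other overloaded call
\end{cfa}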

In \CFA, the candidates of an overloaded expression are ranked by numerical cost elements, which account for the type conversions needed from the argument types to the corresponding declared function parameter types, as well as the polymorphic types and restrictions introduced by the forall clause.
Currently, the expression cost used in \CFA has the following components, ranked from higher to lower importance:
\begin{enumerate}
\item \textbf{Unsafe} cost, representing a narrowing conversion of arithmetic types, e.g. @int@ to @short@, and qualifier-dropping conversions for pointer and reference types;
\item \textbf{Polymorphic} cost, where the function parameter type is or contains a polymorphic type variable;
\item \textbf{Safe} cost, representing a widening conversion, e.g. @short@ to @int@, qualifier-adding conversions for pointer and reference types, and value conversions for enumeration constants;
\item \textbf{Variable} cost, counting the number of polymorphic type variables, if any, introduced by the forall clause in the function declaration;
\item \textbf{Specialization} cost, counting the number of restrictions introduced by type assertions.
\end{enumerate}
Cost tuples are compared in lexicographic order, starting from the most significant term (unsafe cost), where the lower value wins and ties are broken by the next term (polymorphic cost), and so on.
At the sub-expression level, the lowest cost candidate for each result type is included as a possible interpretation of the expression; at the top level, all possible interpretations of different types are considered, and the overall lowest cost one is selected as the final interpretation of the expression.
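
As a minimal sketch of this ranking (the overloads of @g@ are hypothetical), a candidate needing only a safe conversion beats one needing an unsafe conversion, regardless of the later tuple elements:
\begin{cfa}
void g( short );  // candidate 1
void g( long );   // candidate 2
g( 42 );  // int argument: candidate 1 needs a narrowing (unsafe) conversion, candidate 2
          // a widening (safe) one, so candidate 2 is selected
\end{cfa}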

In many languages that support function and operator overloading, such as \CC and Java, a partial ordering system decides which interpretation of an expression is selected: sometimes candidates are incomparable (none is considered a better match), and only when one candidate is better than all others (maximal with respect to the partial order) is the expression unambiguous.

In \CFA, such a system is problematic because of the presence of return-type overloading of functions and overloading of variables.
The resolution algorithms used in \CC and Java are greedy: they select the best match for each sub-expression without considering the enclosing ones, so at each step of resolution the arguments already have unique interpretations, and the ordering only needs to compare different sets of conversion targets (function parameter types) on the same set of inputs.
However, \CFA expression resolution considers multiple interpretations of argument sub-expressions with different types, so it is possible that both the selected function and the set of arguments differ between candidates, which therefore cannot be compared under a partial ordering system.
This situation arises quite often in \CFA, even in forms as simple as the expression @f(a)@ where both the function name @f@ and the variable name @a@ are overloaded, and we do not want the resolution algorithm to report too many expressions as ambiguous (and therefore compilation errors), restricting the flexibility of \CFA too much.
Previous documentation and papers on \CFA expression resolution never explain why such a cost system is used; this is a plausible guess at the original motivation for introducing the cost system to \CFA.

Conversely, using such a cost-based model can sometimes make \CFA expression resolution too permissive: the system always attempts to select the lowest cost option, and only when multiple options tie at the lowest cost does it report the expression as ambiguous.
With so many elements in the cost tuple, ties are expected to be uncommon.
Other than the case of multiple exact matches, which all have cost zero, candidates that are incomparable under a more-specific partial ordering can often have different expression costs, since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections.

Ada is another programming language that has overloading based on return type.
Although Ada also allows implicit type conversions of function arguments, it is fairly conservative in resolving ambiguities.
There are only two ``preference'' rules in Ada overload resolution, one for primitive arithmetic operators and one for universal access types (analogous to @void *@ in C); any other case where an expression has multiple legal interpretations is considered ambiguous.
The current overload resolution system for \CFA is on the other end of the spectrum: it tries to order every legal interpretation of an expression and chooses the best one according to cost, occasionally giving unexpected results.

There are currently at least three situations where the polymorphic cost element of the cost model does not yield a candidate selection that is clearly justifiable, and one of them is outright wrong:
\begin{enumerate}
\item Polymorphic exact match versus non-polymorphic inexact match: consider the following declarations
\begin{cfa}
forall (T) void f (T); // 1
void f (long); // 2

f (42); // currently selects 2
\end{cfa}
Under the current cost model, option 1 incurs a polymorphic cost from matching the argument type @int@ to the type variable @T@, and option 2 incurs a safe cost from the integer promotion of @int@ to @long@.
Since polymorphic cost ranks above safe conversion cost, option 2 is considered to have the lower cost and is selected.

In contrast, the template deduction and overload resolution rules in \CC select option 1 instead (converting the forall clause to the equivalent function template declaration).
\CC performs template argument deduction and overload candidate ranking in two separate steps: in the first step, the type parameters are deduced for each primary function template, and if the corresponding template instantiation succeeds, the resulting function prototype is added to the resolution candidate set.
In the second step, the implicit conversions (if any) applied to the argument types are compared after stripping top-level qualifiers and references; an exact match is preferred, followed by basic type promotions (roughly corresponding to safe conversions in \CFA), and then other kinds of conversions (roughly corresponding to unsafe conversions in \CFA).
Only when the type conversions are the same does it prioritize a non-template candidate.
In this example, option 1 produces the prototype @void f(int)@, which gives an exact match and therefore takes priority.
The \CC resolution rules effectively make option 2 a specialization that applies only to type @long@ exactly,\footnote{\CC does have explicit template specializations; however, they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply to all integral types below @long@.
Such a discrepancy could be explained as a design decision: since \CFA polymorphic functions are real entities and are separately compiled, calling them requires passing type information and thus has an actual runtime cost.

\item Having a lower total polymorphic cost does not always mean a function is more specialized.
The following example is taken from Aaron Moss's thesis, which discusses some improvements to the \CFA expression cost model, where he claims the following function prototypes are increasingly more constrained:
\begin{cfa}
forall(T, U) void f(T, U); // 1, polymorphic
forall(T) void f(T, T); // 2, less polymorphic
forall(T) void f(T, int); // 3, even less polymorphic
forall(T) void f(T*, int); // 4, least polymorphic
\end{cfa}
This argument is not entirely correct.
Although it is true that both the sequences 1,2 and 1,3,4 are increasingly more constrained on the argument types, option 2 is not comparable to either option 3 or 4; they actually describe independent constraints on the two arguments.
In natural language, option 3 says that the second argument must have type @int@, while option 2 says that the two arguments must have the same type.
These two constraints can be satisfied independently; therefore, neither should be considered a better match when resolving a call to @f@ with argument types @(int, int)@, and reporting such an expression as ambiguous is the most appropriate action (see the sketch following this list).
This is a limitation of using a numerical cost value: it cannot express these logically complicated cases.

\item Finally, the introduction of generic types means that more type variables may be required to describe a more specific type, so simply counting the number of polymorphic type variables is no longer correct in general for ordering the function candidates as more constrained.
Suppose a generic pair type is defined; then a function that takes an arbitrary pair requires two type variables
\begin{cfa}
forall (T,U) void f (pair(T,U)); // 1
\end{cfa}
while a function that takes any type at all requires only one:
\begin{cfa}
forall (T) void f (T); // 2
\end{cfa}
Passing a pair variable to @f@ gives a cost of 1 poly, 2 variable for the pair overload, and a cost of 1 poly, 1 variable for the unconstrained overload.
Clearly, the former should be chosen, but the cost model cannot handle that correctly.
\end{enumerate}
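
As a sketch of the incomparability in case 2 (hypothetical declarations), consider a call satisfying both independent constraints:
\begin{cfa}
forall(T) void f(T, T);    // 2, arguments must have the same type
forall(T) void f(T, int);  // 3, second argument must be int
int i, j;
f( i, j );  // both constraints hold independently with T = int; neither candidate is
            // more specific, so reporting the call as ambiguous is the appropriate outcome
\end{cfa}
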
These inconsistencies do not seem easily solvable, and currently the \CFA codebase has to work around these known defects.
One potential path is a mix of the conversion cost and a \CC-like partial ordering of specializations.
Observe that in the \CFA cost tuple, the first three elements (unsafe, polymorphic and safe conversions) are related to the argument types, while the other elements (polymorphic variable and assertion counts) are properties of the function declaration independent of the arguments.
This means it may be reasonable to have an ordering that compares the argument conversion costs first and uses the partial ordering of specializations as a tiebreaker.
The algorithm used for \CC template specialization ordering can be applied to \CFA with some slight modifications.

In the meantime, some other improvements have been made to the expression cost system.
Most notably, the expression resolution algorithm now consistently uses the globally minimal cost interpretation, as discussed in a previous technical report.
While implementing this change, two detailed issues needed to be addressed for the new rules to fully work.

The first issue deals with an idiom commonly used in \CFA that would cause many overloads to have equal costs.
These kinds of expressions are so ubiquitous in \CFA code that we do not want them to be deemed ambiguous in the language.
Many C library functions have multiple versions for different argument types; for example, there are absolute value functions defined for the basic arithmetic types under different names, since C does not support any kind of overloading:
\begin{cfa}
int abs (int);
long labs (long);
double fabs (double);
float fabsf (float);
\end{cfa}
It is cumbersome for programmers to remember all these function names and select the correct one; even worse, if an incorrect version is picked, the program still compiles but with undesired conversions, which can sometimes even change the result, such as using the @int@ version on a floating point argument.
In \CFA, all of these functions are renamed to simply @abs@.
This causes multiple overloads to have the same total cost when some conversion is needed.
For example, @long x = abs(42);@ could either call @long abs(long)@ with the argument 42 converted to @long@, or call @int abs(int)@ and convert the result to @long@.
In this example the choice is arbitrary because both yield identical results.
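
Written out, the two tied interpretations are (a sketch; costs follow the model above):
\begin{cfa}
long x = abs( 42 );
// interpretation 1: abs( (long)42 ), one safe conversion on the argument
// interpretation 2: (long)abs( 42 ), one safe conversion on the result
// the cost tuples are identical, so neither interpretation wins on cost alone
\end{cfa}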
In other cases, the choice can have an actual impact on the final result.
While testing the effects of the updated cost rule, we found this piece of code in the \CFA standard library:
\begin{cfa}
static inline __readyQ_avg_t __to_readyQ_avg(unsigned long long intsc) {
	if(unlikely(0 == intsc)) return 0.0;
	else return log2(intsc); // implicit conversion happens here
} // __readyQ_avg_t is defined to be double
\end{cfa}
This is a helper function for performance logging that calculates the geometric mean of a counter by summing the logarithms of its values.
The function @log2@ is similarly overloaded in \CFA for both integer and floating point types; in this case, however, the integer overload returns an integer, so the fractional part of the logarithm is truncated.
With the previous cost rules, the integer version of @log2@ is selected, and when experimenting with the updated cost rules this was first flagged as an ambiguous expression.
I reported this issue to the author of the library code, who replied that the expectation was that \CFA would choose the floating point overload through return-type overload selection.
This mistake went unnoticed because it sits inside a performance logging function that serves no critical purpose; its only effect is that the performance data becomes inaccurate, as the fractional parts are truncated before the sum.
Investigating this example led to the decision that, when total expression costs are equal, matching the return type higher up in the expression tree is prioritized.
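
A minimal sketch of the resulting tie-breaking rule, assuming the two overloads @int log2(int)@ and @double log2(double)@:
\begin{cfa}
double d = log2( 4 );  // both overloads incur one safe conversion (on the argument or on
                       // the result), so the costs tie; the floating point overload wins
                       // because its return type matches the initialization target
\end{cfa}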

Another change addresses the issue that C arithmetic expressions have unique meanings governed by the arithmetic promotion rules, whereas in \CFA they are all modelled as function calls for overload resolution purposes.
The previous, partially greedy resolution rules pick the locally optimal match, which naturally matches the C rules.
Care must be taken to maintain the C semantics when switching to the total expression cost approach.

This problem was already partially recognized, in the form of handling cast expressions, when Aaron Moss suggested overload resolution by total cost.
To quote the example directly:

If a cast argument has an unambiguous interpretation as a conversion argument then it must be interpreted as such, even if the ascription interpretation would have a lower overall cost.
This is demonstrated in the following example, adapted from the C standard library:
\begin{cfa}
unsigned long long x;
(unsigned)(x >> 32);
\end{cfa}
In C semantics, this example is unambiguously upcasting 32 to @unsigned long long@, performing the shift, then downcasting the result to @unsigned@, at cost (1, 0, 3, 1, 0, 0, 0).
If ascription were allowed to be a first-class interpretation of a cast expression, it would be cheaper to select the @unsigned@ interpretation of @?>>?@ by downcasting x to @unsigned@ and upcasting 32 to @unsigned@, at a total cost of (1, 0, 1, 1, 0, 0, 0).
However, this break from C semantics is not backwards compatible, so to maintain C compatibility, the \CFA resolver selects the lowest-cost interpretation of the cast argument for which a conversion or coercion to the target type exists (upcasting to @unsigned long long@ in the example above, due to the lack of unsafe downcasts), using the cost of the conversion itself only as a tie-breaker.

However, a cast expression is not necessary to create such an inconsistency with C semantics: any implicit argument-parameter type conversion in a function call can replicate this issue without an explicit cast.
For example,
\begin{cfa}
unsigned long long x;
void f (unsigned);
f (x >> 32);
\end{cfa}
has the same effect as using an explicit cast to coerce the type of the expression @x >> 32@ to @unsigned@.
This shows that the problem fundamentally comes not from cast expressions, but from modelling the C built-in operators as overloaded functions.
A different rule is enforced in selecting the built-in function candidates to fix this problem: if an expression has any legal interpretations as a C built-in operation, only the lowest cost one is kept, regardless of result type.
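
A sketch of this rule in action, reusing the shift example (the overloads of @g@ are hypothetical):
\begin{cfa}
unsigned long long x;
void g (unsigned);
void g (unsigned long long);
g (x >> 32);  // of the built-in interpretations of x >> 32, only the lowest cost one on
              // unsigned long long survives (converting just 32), so g(unsigned long long)
              // is selected and the shift keeps its C meaning
\end{cfa}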


\section{Type Unification}

Type unification is the algorithm that assigns a value to each (free) type parameter such that the types of the provided arguments and function parameters match.

\CFA does not attempt to do any type \textit{inference}: it has no anonymous functions (i.e. lambdas, commonly found in functional programming and also used in \CC and Java), and variable types must all be explicitly defined (no auto typing).
This makes the unification problem more tractable in \CFA, as the argument types at each call site are usually all specified.
There is a single exception: when the function return type contains a free type variable that does not occur in any of the argument types, and is subsequently passed into the parent expression.
A top level expression whose type still contains an unbound type variable is considered ill-formed, as such an expression is inherently ambiguous.
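
A minimal sketch of this exception case; the allocator prototype is modelled on, but not identical to, the \CFA library's @malloc@ wrapper:
\begin{cfa}
forall( T ) T * alloc( void );  // T occurs only in the return type
int * p = alloc();  // OK: T is bound to int by the initialization context
alloc();  // ill-formed at the top level: T remains unbound
\end{cfa}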

The unification algorithm in \CFA was originally presented in Richard Bilson's thesis and remains the basis of the algorithm currently in use.
Some additions have been made to accommodate the type features added to the language since.
To summarize, \CFA type unification has two minor variants: an \textit{exact} mode and an \textit{inexact} mode.
The inexact mode is applied at top level argument-parameter matching, and attempts to find an assignment to the type variables such that the argument types can be converted to the parameter types with minimal cost, as defined in the previous section.
The exact mode is required because the type matching algorithm operates recursively and inner types often have to match exactly; for example, there is no common type for the pointer types @int *@ and @long *@, while there is one for @int@ and @long@.
With the introduction of generic record types, their type parameters must also match exactly, as no covariance or contravariance is currently supported for generics.
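
A minimal sketch of the two modes (hypothetical prototypes):
\begin{cfa}
forall( T ) void g( T, T );
forall( T ) void h( T *, T * );
int i;  long l;
g( i, l );  // OK: top level matching is inexact, so T binds to long and i safely widens
int * pi;  long * pl;
h( pi, pl );  // fails: the pointed-to types must unify exactly, and int and long cannot
\end{cfa}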

One simplification was made to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed in declarations.
Polymorphic function declarations themselves are still treated as function pointer types internally; however, the change means that formal parameter types can no longer be polymorphic.
Previously, it was possible to write function prototypes such as
\begin{cfa}
void f( forall( T | { T -?( T ); } ) T (*p)( T, T ) );
\end{cfa}
Notably, the unification algorithm implemented in the \CFA compiler never managed to trace the assertion parameters on such formal types at all, and the problem of determining whether two assertion sets are compatible may well be undecidable in general, given the ability to synthesize more complicated types by nesting generics.
Ultimately, the reason for not allowing such constructs is that they mostly do not provide useful type features for actual programming tasks.
A subroutine operates on the arguments provided at the call site together with any local and global variables, and even though the subroutine can be polymorphic, the types are supplied at each call site.
On each invocation, the types to operate on can be determined from the arguments provided, so there should never be a need to pass a polymorphic function pointer, which can take any type in principle.

For example, consider a polymorphic function that takes one argument of type @T@ and a pointer to a subroutine
\begin{cfa}
forall (T) void f (T x, forall (U) void (*g) (U));
\end{cfa}
Making @g@ polymorphic in this context is almost certainly unnecessary, since it can only be called inside the body of @f@, where its argument types are already known (although they can potentially depend on @T@).
Moreover, requiring a function parameter to work on any input type at all imposes too strong a constraint on the arguments: it only needs to make each call inside the body of @f@ valid.

Rewriting the prototype to
\begin{cfa}
forall (T) void f (T x, void (*g) (T));
\end{cfa}
is sufficient (or potentially, some compound type synthesized from @T@), in which case @g@ no longer has a polymorphic type itself.
This ``monomorphization'' conversion is readily supported in \CFA, either done explicitly, by assigning a polymorphic function name to a compatible function pointer type, or implicitly, when deducing assertion parameters (discussed in the next section).
The technique can be applied directly to argument passing, which is essentially just assignment to the function's parameter variables.
There might be edge cases where the supplied subroutine @g@ is called on arguments of different types inside the body of @f@ and so must be declared polymorphic, but such use cases are rare, and the benefit of allowing such constructs seems minimal in practice.
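
A minimal sketch of the explicit direction of this conversion; @identity@ is a hypothetical polymorphic function:
\begin{cfa}
forall( T ) T identity( T x ) { return x; }
int (* ip)( int ) = identity;  // the polymorphic name binds to a compatible monomorphic
                               // function pointer type, fixing T to int
\end{cfa}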

The result of this change is that the unification algorithm no longer needs to distinguish ``open'' and ``closed'' type variables, as the latter no longer exist.
The only type variables that need to be handled are those introduced by the @forall@ clause of the function prototype.
The subtype relationship between function types is also rendered redundant, since no function parameter or return type can be polymorphic, and no basic type or non-polymorphic function type is a subtype of any other type.
Therefore, the goal of (exact) type unification simply becomes finding a substitution that produces identical types.
The assertion set that needs to be resolved is also always just the declarations on the function prototype, which simplifies the assertion satisfaction algorithm somewhat, as discussed further in the next section.

The type unification results are stored in a type environment data structure, which represents all the type variables currently in scope as equivalence classes, together with their bound types and some extra information, such as whether the bound type is allowed to be opaque (i.e. a forward declaration without a definition in scope) and whether the bounds are allowed to be widened.
In the more general approach commonly used in functional languages, each unification variable is given a lower bound and an upper bound to account for covariance and contravariance of types.
\CFA currently implements no variance for its generic types and does not allow polymorphic function types; therefore, no explicit upper bound is needed, and one binding value for each equivalence class suffices.
However, since type conversions are allowed in \CFA, the type environment needs to keep track of which type variables allow conversions.
This behavior is notably different from \CC template argument deduction, which enforces an exact match everywhere unless the template argument types are explicitly given.
For example, a polymorphic maximum function in \CFA can be called with arguments of different arithmetic types, and the result follows the usual arithmetic conversion rules, while such an expression is not allowed in \CC:
\begin{cfa}
forall (T | {int ?<? (T, T); }) T max (T, T);

max (42, 3.14); // OK, T=double; C++ requires an explicit type annotation, such as max<double>(42, 3.14)
\end{cfa}
The current \CFA documentation does not include a formal set of rules for type unification.
In practice, the algorithm implemented in the \CFA translator can be summarized as follows: given a function signature forall$(T_1, ..., T_n)$ $f(p_1, ..., p_m)$ and argument types $(a_1, ..., a_m)$, the unification algorithm performs the following steps:\footnote{This assumes argument tuples are already expanded into their individual components.}
\begin{enumerate}
\item The type environment is initialized as the union of the type environments of all the arguments, plus $(T_1, ..., T_n)$ as free variables.
The inclusion of the argument environments serves to resolve polymorphic return types that need to be deduced.
\item Initially, all type variables
\end{enumerate}

\section{Satisfaction of Assertions}
The assertion satisfaction problem greatly increases the complexity of \CFA expression resolution.
Past experiments have shown that for the expressions that take the longest to resolve, the majority of the time is spent resolving their assertions.
Even though a few heuristic optimizations have since been introduced into the compiler, assertion satisfaction remains the most costly part of compiling a \CFA program.
The major difficulty in resolving assertions is that the problem can become recursive: the expression used to satisfy an outstanding assertion can have assertions of its own, and in theory this can go on indefinitely.
Detecting such infinite recursion in general is algorithmically impossible and is not attempted in the compiler.
Instead, a fixed maximum depth of recursive assertions is imposed.
\CC compilers take the same approach, as template argument deduction is similarly undecidable in general.

In previous versions of \CFA, this depth limit was set at 4; as the compiler has become more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8, and on most occasions this does not lead to trouble.
Very rarely, the infinite recursion produces an exponentially growing assertion set, wasting minutes of time before the limit is reached.
Fortunately, it is very hard to run into this situation with realistic \CFA code, and the cases that were found all share some clear characteristics, so they can be prevented by some clever tricks.
In fact, some of the performance optimizations come from analyzing these problematic cases.
One such example is presented later in this section.

While the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature renders some techniques used in assertion-free expression resolution inapplicable.
The most significant impact is that type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means that if one expression has multiple associated assertions, they are not independent: the changes to the type environment must be compatible for all the assertions to be resolved.
In particular, if one assertion parameter can be resolved in multiple different ways, all of the results need to be checked to make sure the changes to the type-variable bindings are compatible with the other assertions to be resolved.
A naive algorithm that simply picks any pending assertion and continues in a depth-first search can be very inefficient and is especially prone to falling into an infinite loop, while in many cases this can be avoided by examining other assertions that provide insight on the desired type binding: if an assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking.
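
A contrived sketch of this compatibility requirement (all declarations are hypothetical): the assertion on @p@ alone admits two bindings for @T@, but only the one that agrees with the unique match for @q@ can be kept:
\begin{cfa}
forall( T | { void p( T ); void q( T ); } ) T g( void );
void p( int );
void p( double );  // p alone allows T = int or T = double
void q( double );  // q has a unique match, forcing T = double
g();               // the candidates chosen for p and q must agree on T = double
\end{cfa}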

The algorithm currently used in the \CFA compiler was designed by Aaron Moss through a simplified prototype experiment that captures most of the \CFA type-system features, and was then ported back to the actual language.
It can be described as a mix of breadth- and depth-first search in a staged approach.

To resolve a set of assertions, the algorithm first attempts to resolve each assertion item individually.
There are three possible outcomes for resolving each assertion:
\begin{enumerate}
\item If no matches are found, the algorithm terminates with a failure.
\item If exactly one match is found, the type environment is updated immediately and used in resolving any remaining assertions.
\item If multiple matches are found, the assertion candidates with their updated type environments are stored in a list that is checked for compatibility at the end.
\end{enumerate}
When all assertion items are resolved successfully, the algorithm attempts to combine the ambiguously resolved assertions to produce mutually compatible assignments.
If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until none are pending resolution or the recursion limit is reached, which results in a failure.

It has been discovered in practice that the efficiency of this algorithm can sometimes be very sensitive to the order in which assertions are resolved.
Suppose an unbound type variable @T@ appears in two assertions: one can be uniquely resolved, allowing the type @T@ to be inferred immediately, while the other can be resolved in many ways, each binding @T@ to a different concrete type.
If the unique assertion is examined first, the deduced type can be utilized in resolving the second assertion, eliminating many incorrect options without producing a list of candidates pending further checks.
In practice, this has a significant impact when an unbound type @T@ is declared to satisfy the basic ``object assertions''\footnote{The term is borrowed from object-oriented languages, although \CFA is not object-oriented in principle.} of having a default constructor, destructor, and copy-assignment operations.
Since these operations are defined for every type currently in scope, there are often hundreds or even thousands of matches to these functions with an unspecified operand type; in most cases, the value of @T@ can be deduced by resolving another assertion first, which then allows the specific object-lifetime functions to be looked up directly, since they are sorted internally by operand type, greatly reducing the number of wasted resolution attempts.

Currently, this issue also limits the capability of the assertion resolution algorithm.
Assertion matching is implemented to be more restricted than expression resolution in general, in that the parameter types must match exactly, rather than merely being callable.
If a function declaration includes an assertion @void f(T)@ and only an @f(long)@ is currently in scope, trying to resolve the assertion with @T = int@ does not work.
Loosening the assertion matching requirement would cause many assertion variables to have multiple matches and make the delayed combination step too costly.

Given all the issues caused by assertion resolution potentially creating new type-variable bindings, a natural progression is to restrict free type variables such that all type variables are bound when the expression reaches the assertion resolution stage.
A type variable introduced by the @forall@ clause of a function declaration can appear in parameter types, return types, and assertion variables.
If it appears in the parameter types, it is bound when matching the arguments to parameters at the call site.
If it only appears in the return type, it can in principle eventually be figured out from the context.
The current implementation of the \CFA compiler does not do enough return-type deduction because it performs eager assertion resolution, and the return-type information cannot generally be known before the parent expression is resolved, unless the expression is in an initialization context, where the type of the variable being initialized is certainly known.
By delaying assertion resolution until the return type becomes known, this problem can be circumvented.
The truly problematic case occurs when a type variable does not appear in either the parameter or return types and only appears in some assertion variables.
Such cases are very rare, and it is not evident that forcing every type variable to appear at least once in the parameter or return types limits the expressiveness of the \CFA type system to a significant extent.
In the next chapter, I discuss a proposal to include type declarations in traits, rather than having all type variables appear in the trait parameter list, which could provide functionality equivalent to having an unbound type parameter in assertion variables, and also address some of the variable cost issues discussed in Section 4.1.
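
A contrived sketch of this problematic case (the declarations are hypothetical): @U@ below appears only in an assertion variable, so neither the arguments nor the call context can ever bind it, and only assertion resolution itself can choose a type for it:
\begin{cfa}
forall( T, U | { U convert( T ); } ) void g( T );  // U appears only in the assertion
double convert( int );
g( 42 );  // T = int is bound by the argument, but U = double only by assertion resolution
\end{cfa}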

\subsection*{Caching Assertion Resolution Results}

In Aaron Moss's prototype design and experiments, a potential optimization of caching the results of already resolved assertions is discussed.
Based on the experimental results, this approach can improve the performance of expression resolution in general, and sometimes allows hard instances of assertion resolution to be solved that are otherwise infeasible, for example when the resolution would encounter infinite loops.

What makes this approach tricky to implement correctly is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
If the results that modify type bindings were cached, it would be necessary to store the changes to the type bindings too; and when multiple candidates can satisfy one assertion parameter, all of them need to be cached, including those that are not eventually selected, since the side effects can produce different results depending on the context.
     318
     319In the original design of \CFA that includes unrestricted polymorphic formal parameters that can have assertions on themselves, the problem is even more complicated as new declarations can be introduced in scope during expression resolution.
     320Here is one such example taken from Bilson:
     321\begin{cfa}
     322void f( forall( T | { T -?( T ); } ) T (*p)( T, T ) );
     323forall( U, V | { U -?( U ); V -?( V ); } ) U g( U, V ) );
     324f( g );
     325\end{cfa}
     326The inner assertion parameter on the \textit{closed} type variable @T@ is used to satisfy the assertions on @U@ and @V@ in this example.

However, as per the previous discussions on this topic, polymorphic function pointer types have been removed from \CFA, since correctly implementing assertion matching for them is not possible in general.
Without closed parameters (and therefore without a have-set for assertions), the set of declarations in scope remains unchanged while resolving any expression.
The current \CFA implementation also does not attempt to widen any already-bound type parameters to satisfy an assertion.
Note that this restriction does mean certain kinds of expressions cannot be resolved, for example:
\begin{cfa}
forall( T | { void f( T ); } ) void g( T );
void f( long );
g( 42 );
\end{cfa}
The call @g(42)@ is rejected since no attempt is made to widen the parameter type @T@ from @int@ to @long@.
This problem could be mitigated by allowing inexact matches of assertions, but it cannot be eliminated completely when @T@ is matched inside a parameterized type, including pointers and references:
\begin{cfa}
forall( T | { void f( T * ); } ) void g( T );
void f( long * );
g( 42 );
\end{cfa}
Here, the only way to resolve the call @g(42)@ is to allow assertion resolution to widen the parameter @T@, since even with inexact matching, @int *@ cannot be converted to @long *@.


\section{Compiler Implementation Considerations}

  • doc/theses/fangren_yu_MMath/intro.tex

\section{Type-Inferencing Issues}

Each kind of type-inferencing system has its own set of issues that flow onto the programmer in the form of restrictions and/or confusions.
\begin{enumerate}[leftmargin=*]
\item
     

\section{Contributions}

\begin{comment}
From: Andrew James Beach <ajbeach@uwaterloo.ca>
To: Peter Buhr <pabuhr@uwaterloo.ca>, Michael Leslie Brooks <mlbrooks@uwaterloo.ca>,
    Fangren Yu <f37yu@uwaterloo.ca>, Jiada Liang <j82liang@uwaterloo.ca>
Subject: Re: Haskell
Date: Fri, 30 Aug 2024 16:09:06 +0000

Do you mean:

one = 1

And then write a bunch of code that assumes it is an Int or Integer (which are roughly int and Int in Cforall) and then replace it with:

one = 1.0

And have that crash? That is actually enough, for some reason Haskell is happy to narrow the type of the first literal (Num a => a) down to Integer but will not do the same for (Fractional a => a) and Rational (which is roughly Integer for real numbers). Possibly a compatibility thing since before Haskell had polymorphic literals.

Now, writing even the first version will fire a -Wmissing-signatures warning, because it does appear to be understood that just from a documentation perspective, people want to know what types are being used. Now, if you have the original case and start updating the signatures (adding one :: Fractional a => a), you can eventually get into issues, for example:

import Data.Array (Array, Ix, (!))
atOne :: (Ix a, Fractional a) => Array a b -> b  -- In CFA: forall(a | Ix(a) | Fractional(a), b) b atOne(Array(a, b) const & array)
atOne = (! one)

Which compiles and is fine except for the slightly awkward fact that I don't know of any types that are both Ix and Fractional types. So you might never be able to find a way to actually use that function. If that is good enough you can reduce that to three lines and use it.

Something that just occurred to me, after I did the above examples, is: Are there any classic examples in literature I could adapt to Haskell?

Andrew

PS, I think it is too obvious of a significant change to work as a good example but I did mock up the structure of what I am thinking you are thinking about with a function. If this helps here it is.

doubleInt :: Int -> Int
doubleInt x = x * 2

doubleStr :: String -> String
doubleStr x = x ++ x

-- Missing Signature
action = doubleInt  -- replace with doubleStr

main :: IO ()
main = print $ action 4
\end{comment}