Changeset 8b1dc66 for doc/theses/aaron_moss_PhD
 Timestamp:
 Mar 5, 2019, 3:10:06 PM (4 years ago)
 Branches:
 aaronthesis, armeh, cleanupdtors, enum, forallpointerdecay, jacob/cs343translation, jenkinssandbox, master, newast, newastuniqueexpr, pthreademulation, qualifiedEnum
 Children:
 3fad907
 Parents:
 6a787f8
 Location:
 doc/theses/aaron_moss_PhD/phd
 Files:

 3 edited
Legend:
 Unmodified
 Added
 Removed

doc/theses/aaron_moss_PhD/phd/resolutionheuristics.tex
r6a787f8 r8b1dc66 1 \chapter{Resolution Heuristics}1 \chapter{Resolution Algorithms} 2 2 \label{resolutionchap} 3 3 … … 7 7 The main task of the \CFACC{} typechecker is \emph{expression resolution}, determining which declarations the identifiers in each expression correspond to. 8 8 Resolution is a straightforward task in C, as no declarations share identifiers, but in \CFA{} the name overloading features discussed in Section~\ref{overloadingsec} generate multiple candidate declarations for each identifier. 9 I refer to a given matching between identifiers and declarations in an expression as an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{polyfuncsec} and~\ref{implicitconvsec}, each of which increase the proportion of feasible candidate interpretations. 10 To choose between feasible interpretations, \CFA{} defines a \emph{conversion cost} to rank interpretations; the expression resolution problem is thus to find the unique minimalcost interpretation for an expression, reporting an error if no such interpretation exists. 9 A given matching between identifiers and declarations in an expression is an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{polyfuncsec} and~\ref{implicitconvsec}, each of which increase the number of valid candidate interpretations. 10 To choose among valid interpretations, a \emph{conversion cost} is used to rank interpretations. 11 Hence, the expression resolution problem is to find the unique minimalcost interpretation for an expression, reporting an error if no such interpretation exists. 11 12 12 13 \section{Expression Resolution} … … 17 18 Briefly, \CFACC{} keeps a mapping from type variables to the concrete types they are bound to as an auxiliary data structure during expression resolution; Chapter~\ref{envchap} describes this \emph{environment} data structure in more detail. 18 19 A \emph{unification} algorithm is used to simultaneously check two types for equivalence with respect to the substitutions in an environment and update that environment. 19 Essentially, unification recursively traverses the structure of both types, checking them for equivalence, and when it encounters a type variable it replaces it with the concrete type it is bound to; if the type variable has not yet been bound, the unification algorithm assigns the equivalent type as the bound type of the variable, after performing various consistency checks.20 Ditchfield \cite{Ditchfield92} and Bilson\cite{Bilson03} describe the semantics of \CFA{} unification in more detail.20 Essentially, unification recursively traverses the structure of both types, checking them for equivalence, and when it encounters a type variable, it replaces it with the concrete type it is bound to; if the type variable has not yet been bound, the unification algorithm assigns the equivalent type as the bound type of the variable, after performing various consistency checks. 21 Ditchfield~\cite{Ditchfield92} and Bilson~\cite{Bilson03} describe the semantics of \CFA{} unification in more detail. 21 22 22 23 \subsection{Conversion Cost} \label{convcostsec} … … 28 29 \begin{itemize} 29 30 \item If either operand is a floatingpoint type, the common type is the size of the largest floatingpoint type. If either operand is !_Complex!, the common type is also !_Complex!. 30 \item If both operands are of integral type, the common type has the same size\footnote{Technically, the C standard defines a notion of \emph{rank} in \cite[\S{}6.3.1.1]{C11}, a distinct value for each \lstinline{signed} and \lstinline{unsigned} pair; integral types of the same size thus may have distinct ranks. For instance, if \lstinline{int} and \lstinline{long} are the same size, \lstinline{long} will havegreater rank. The standarddefined types are declared to have greater rank than any types of the same size added as compiler extensions.} as the larger type.31 \item If both operands are of integral type, the common type has the same size\footnote{Technically, the C standard defines a notion of \emph{rank} in \cite[\S{}6.3.1.1]{C11}, a distinct value for each \lstinline{signed} and \lstinline{unsigned} pair; integral types of the same size thus may have distinct ranks. For instance, though \lstinline{int} and \lstinline{long} may have the same size, \lstinline{long} always has greater rank. The standarddefined types are declared to have greater rank than any types of the same size added as compiler extensions.} as the larger type. 31 32 \item If the operands have opposite signedness, the common type is !signed! if the !signed! operand is strictly larger, or !unsigned! otherwise. If the operands have the same signedness, the common type shares it. 32 33 \end{itemize} 33 34 34 Beginning with the work of Bilson \cite{Bilson03}, \CFA{} defines a \emph{conversion cost} for each function call in a way that generalizes C's conversion rules.35 Beginning with the work of Bilson~\cite{Bilson03}, \CFA{} defines a \emph{conversion cost} for each function call in a way that generalizes C's conversion rules. 35 36 Loosely defined, the conversion cost counts the implicit conversions utilized by an interpretation. 36 37 With more specificity, the cost is a lexicographicallyordered tuple, where each element corresponds to a particular kind of conversion. … … 41 42 42 43 \begin{cfa} 43 void f (char, long); $\C{// (1,0,1)}$44 void f (short, long); $\C{// (1,0,1)}$45 forall(otype T) void f (T, long); $\C{// (0,1,1)}$46 void f (long, long); $\C{// (0,0,2)}$47 void f (int, unsigned long); $\C{// (0,0,2)}$48 void f (int, long); $\C{// (0,0,1)}$49 \end{cfa} 50 51 Note that safe and unsafe conversions are handled differently; \CFA{} counts distance of safe conversions (\eg{} !int! to !long! is cheaper than !int! to !unsigned long!), while only counting the number of unsafe conversions (\eg{} !int! to !char! and !int! to !short! both have unsafe cost 1, as in the first two declarationsabove).52 These costs are summed over the parameters in a call; in the example above, the cost of the two !int! to !long! conversions for the fourth declaration sum equal to the one !int! to !unsigned long! conversion in the fifth.44 void f$\(_1\)$(char, long); $\C{// (1,0,1)}$ 45 void f$\(_2\)$(short, long); $\C{// (1,0,1)}$ 46 forall(otype T) void f$\(_3\)$(T, long); $\C{// (0,1,1)}$ 47 void f$\(_4\)$(long, long); $\C{// (0,0,2)}$ 48 void f$\(_5\)$(int, unsigned long); $\C{// (0,0,2)}$ 49 void f$\(_6\)$(int, long); $\C{// (0,0,1)}$ 50 \end{cfa} 51 52 Note that safe and unsafe conversions are handled differently; \CFA{} counts distance of safe conversions (\eg{} !int! to !long! is cheaper than !int! to !unsigned long!), while only counting the number of unsafe conversions (\eg{} !int! to !char! and !int! to !short! both have unsafe cost 1, as in !f!$_1$ and !f!$_2$ above). 53 These costs are summed over the parameters in a call; in the example above, the cost of the two !int! to !long! conversions for !f!$_4$ sum equal to the one !int! to !unsigned long! conversion for !f!$_5$. 53 54 54 55 \begin{figure} … … 68 69 As part of adding reference types to \CFA{} (see Section~\ref{typefeaturessec}), Schluntz added a new $reference$ element to the cost tuple, which counts the number of implicit referencetorvalue conversions performed so that candidate interpretations can be distinguished by how closely they match the nesting of reference types; since references are meant to act almost indistinguishably from lvalues, this $reference$ element is the least significant in the lexicographic comparison of cost tuples. 69 70 70 I havealso refined the \CFA{} cost model as part of this thesis work.71 Bilson's \CFA{} cost model includes the cost of polymorphic type bindings from a function's type assertions in the $poly$ element of the cost tuple; this has the effect of making moreconstrained functions more expensive than lessconstrained functions, as in the following example :72 73 \begin{cfa} 74 forall(dtype T  { T& ++?(T&); }) T& advance (T&, int);75 forall(dtype T  { T& ++?(T&); T& ?+=?(T&, int)}) T& advance (T&, int);71 I also refined the \CFA{} cost model as part of this thesis work. 72 Bilson's \CFA{} cost model includes the cost of polymorphic type bindings from a function's type assertions in the $poly$ element of the cost tuple; this has the effect of making moreconstrained functions more expensive than lessconstrained functions, as in the following example, based on differing iterator types: 73 74 \begin{cfa} 75 forall(dtype T  { T& ++?(T&); }) T& advance$\(1\)$(T& i, int n); 76 forall(dtype T  { T& ++?(T&); T& ?+=?(T&, int)}) T& advance$\(2\)$(T& i, int n); 76 77 \end{cfa} 77 78 78 79 In resolving a call to !advance!, the binding to the !T&! parameter in the assertions is added to the $poly$ cost in Bilson's model. 79 However, type assertions actually make a function \emph{less} polymorphic, and as such functions with more type assertions should be preferred in type resolution. 80 In the example above, the moreconstrained second function can be implemented more efficiently, and as such should be chosenwhenever its added constraint can be satisfied.81 A s such, a $specialization$ element is now included in the \CFA{} cost tuple, the values of which are always negative.80 However, type assertions actually make a function \emph{less} polymorphic, and as such functions with more type assertions should be preferred in type resolution. 81 In the example above, if the meaning of !advance! is ``increment !i! !n! times'', !advance!$_1$ requires an !n!iteration loop, while !advance!$_2$ can be implemented more efficiently with the !?+=?! operator; as such, !advance!$_2$ should be chosen over !advance!$_1$ whenever its added constraint can be satisfied. 82 Accordingly, a $specialization$ element is now included in the \CFA{} cost tuple, the values of which are always negative. 82 83 Each type assertion subtracts 1 from $specialization$, so that moreconstrained functions cost less, and thus are chosen over lessconstrained functions, all else being equal. 83 84 A more sophisticated design would define a partial order over sets of type assertions by set inclusion (\ie{} one function would only cost less than another if it had a strict superset of assertions, rather than just more total assertions), but I did not judge the added complexity of computing and testing this order to be worth the gain in specificity. … … 87 88 88 89 \begin{cfa} 89 forall(otype T, otype U) void f (T, U); $\C[3.25in]{// polymorphic}$90 forall(otype T) void f (T, T); $\C[3.25in]{// less polymorphic}$91 forall(otype T) void f (T, int); $\C[3.25in]{// even less polymorphic}$92 forall(otype T) void f (T*, int); $\C[3.25in]{// least polymorphic}$90 forall(otype T, otype U) void f$\(_1\)$(T, U); $\C[3.25in]{// polymorphic}$ 91 forall(otype T) void f$\(_2\)$(T, T); $\C[3.25in]{// less polymorphic}$ 92 forall(otype T) void f$\(_3\)$(T, int); $\C[3.25in]{// even less polymorphic}$ 93 forall(otype T) void f$\(_4\)$(T*, int); $\C[3.25in]{// least polymorphic}$ 93 94 \end{cfa} 94 95 95 96 The new cost model accounts for the fact that functions with more polymorphic variables are less constrained by introducing a $var$ cost element that counts the number of type variables on a candidate function. 96 In the example above, the first !f! has $var = 2$, while the remainder have $var = 1$. 97 The new cost model also accounts for a nuance unhandled by Ditchfield or Bilson, in that it makes the more specific fourth function above cheaper than the more generic third function. 98 The fourth function is presumably somewhat optimized for handling pointers, but the prior \CFA{} cost model could not account for the more specific binding, as it simply counted the number of polymorphic unifications. 97 In the example above, !f!$_1$ has $var = 2$, while the others have $var = 1$. 98 The new cost model also accounts for a nuance unhandled by Ditchfield or Bilson, in that it makes the more specific !f!$_4$ cheaper than the more generic !f!$_3$; !f!$_4$ is presumably somewhat optimized for handling pointers, but the prior \CFA{} cost model could not account for the more specific binding, as it simply counted the number of polymorphic unifications. 99 99 100 100 In the modified model, each level of constraint on a polymorphic type in the parameter list results in a decrement of the $specialization$ cost element, which is shared with the count of assertions due to their common nature as constraints on polymorphic type bindings. … … 104 104 For multiargument generic types, the leastspecialized polymorphic parameter sets the specialization cost, \eg{} the specialization cost of !pair(T, S*)! is $1$ (from !T!) rather than $2$ (from !S!). 105 105 Specialization cost is not counted on the return type list; since $specialization$ is a property of the function declaration, a lower specialization cost prioritizes one declaration over another. 106 User programmers can choose between functions with varying parameter lists by adjusting the arguments, but the same is not true of varying return types, so the return types are omitted from the $specialization$ element.106 User programmers can choose between functions with varying parameter lists by adjusting the arguments, but the same is not true in general of varying return types\footnote{In particular, as described in Section~\ref{exprcostsec}, cast expressions take the cheapest valid and convertable interpretation of the argument expression, and expressions are resolved as a cast to \lstinline{void}. As a result of this, including return types in the $specialization$ cost means that a function with return type \lstinline{T*} for some polymorphic type \lstinline{T} would \emph{always} be chosen over a function with the same parameter types returning \lstinline{void}, even for \lstinline{void} contexts, an unacceptably counterintuitive result.}, so the return types are omitted from the $specialization$ element. 107 107 Since both $vars$ and $specialization$ are properties of the declaration rather than any particular interpretation, they are prioritized less than the interpretationspecific conversion costs from Bilson's original 3tuple. 108 108 … … 112 112 However, Bilson's design results in inconsistent and somewhat surprising costs, with conversion to the nextlarger samesign type generally (but not always) double the cost of conversion to the !unsigned! type of the same size. 113 113 In the redesign, for consistency with the approach of the usual arithmetic conversions, which select a common type primarily based on size, but secondarily on sign, arcs in the new graph are annotated with whether they represent a sign change, and such sign changes are summed in a new $sign$ cost element that lexicographically succeeds $safe$. 114 This means that sign conversions are approximately the same cost as widening conversions, but slightly more expensive (as opposed to less expensive in Bilson's graph) .114 This means that sign conversions are approximately the same cost as widening conversions, but slightly more expensive (as opposed to less expensive in Bilson's graph), so maintaining the same signedness is consistently favoured. 115 115 116 116 With these modifications, the current \CFA{} cost tuple is as follows: … … 125 125 Nonetheless, some salient details are repeated here for the sake of completeness. 126 126 127 On a theoretical level, the resolver algorithmtreats most expressions as if they were function calls.127 On a theoretical level, the resolver treats most expressions as if they were function calls. 128 128 Operators in \CFA{} (both those existing in C and added features like constructors) are all modelled as function calls. 129 In terms of the core argumentparameter matching algorithm, the overloaded variables of \CFA{} are not handled differently fromzeroargument function calls, aside from a different pool of candidate declarations and setup for different code generation.129 In terms of the core argumentparameter matching algorithm, overloaded variables are handled the same as zeroargument function calls, aside from a different pool of candidate declarations and setup for different code generation. 130 130 Similarly, an aggregate member expression !a.m! can be modelled as a unary function !m! that takes one argument of the aggregate type. 131 Literals do not require sophisticated resolution, as the syntactic form of each implies their result types (!42! is !int!, !"hello"! is !char*!, \etc{}), though struct literalsrequire resolution of the implied constructor call.132 133 Since most expressions can be treated as function calls, nested function calls are the primary component of expression resolution problem instances.134 Each function call has an \emph{identifier} whichmust match the name of the corresponding declaration, and a possiblyempty list of \emph{arguments}.131 Literals do not require sophisticated resolution, as in C the syntactic form of each implies their result types (!42! is !int!, !"hello"! is !char*!, \etc{}), though struct literals (\eg{} !(S){ 1, 2, 3 }! for some struct !S!) require resolution of the implied constructor call. 132 133 Since most expressions can be treated as function calls, nested function calls are the primary component of complexity in expression resolution. 134 Each function call has an \emph{identifier} that must match the name of the corresponding declaration, and a possiblyempty list of \emph{arguments}. 135 135 These arguments may be function call expressions themselves, producing a tree of functioncall expressions to resolve, where the leaf expressions are generally nullary functions, variable expressions, or literals. 136 136 A single instance of expression resolution consists of matching declarations to all the identifiers in the expression tree of a toplevel expression, along with inserting any conversions and satisfying all assertions necessary for that matching. 137 137 The cost of a functioncall expression is the sum of the conversion costs of each argument type to the corresponding parameter and the total cost of each subexpression, recursively calculated. 138 \CFA{} expression resolution must produce either the unique lowestcost interpretation of the toplevel expression, or an appropriate error message if none suchexists.138 \CFA{} expression resolution must produce either the unique lowestcost interpretation of the toplevel expression, or an appropriate error message if none exists. 139 139 The cost model of \CFA{} precludes a greedy bottomup resolution pass, as constraints and costs introduced by calls higher in the expression tree can change the interpretation of those lower in the tree, as in the following example: 140 140 141 141 \begin{cfa} 142 142 void f(int); 143 double g (int); $\C[4.5in]{// g1}$144 int g (double); $\C[4.5in]{// g2}$143 double g$\(_1\)$(int); 144 int g$\(_2\)$(long); 145 145 146 146 f( g(42) ); 147 147 \end{cfa} 148 148 149 !g1! is the cheapest interpretation of !g(42)!, with cost $(0,0,0,0,0,0)$ since the argument type is an exact match, but to downcast the return type of !g1! to an !int! suitable for !f! requires an unsafe conversion for a total cost of $(1,0,0,0,0,0)$. 150 If !g2! is chosen, on the other hand, there is a safe upcast from the !int! type of !42! to !double!, but no cast on the return of !g!, for a total cost of $(0,0,1,0,0,0)$; as this is cheaper, !g2! is chosen. 151 Due to this design, all feasible interpretations of subexpressions must in general be propagated to the top of the expression tree before any can be eliminated, a lazy form of expression resolution, as opposed to the eager expression resolution allowed by C, where each expression can be resolved given only the resolution of its immediate subexpressions. 152 153 If there are no feasible interpretations of the toplevel expression, expression resolution fails and must produce an appropriate error message. 154 If any subexpression has no feasible interpretations, the process can be shortcircuited and the error produced at that time. 155 If there are multiple feasible interpretations of a toplevel expression, ties are broken based on the conversion cost, calculated as above. 156 If there are multiple minimalcost feasible interpretations of a toplevel expression, that expression is said to be \emph{ambiguous}, and an error must be produced. 149 Considered independently, !g!$_1$!(42)! is the cheapest interpretation of !g(42)!, with cost $(0,0,0,0,0,0)$ since the argument type is an exact match. 150 However, in context, an unsafe conversion is required to downcast the return type of !g!$_1$ to an !int! suitable for !f!, for a total cost of $(1,0,0,0,0,0)$ for !f( g!$_1$!(42) )!. 151 If !g!$_2$ is chosen, on the other hand, there is a safe upcast from the !int! type of !42! to !long!, but no cast on the return of !g!$_2$, for a total cost of $(0,0,1,0,0,0)$ for !f( g!$_2$!(42) )!; as this is cheaper, !g!$_2$ is chosen. 152 Due to this design, all valid interpretations of subexpressions must in general be propagated to the top of the expression tree before any can be eliminated, a lazy form of expression resolution, as opposed to the eager expression resolution allowed by C, where each expression can be resolved given only the resolution of its immediate subexpressions. 153 154 If there are no valid interpretations of the toplevel expression, expression resolution fails and must produce an appropriate error message. 155 If any subexpression has no valid interpretations, the process can be shortcircuited and the error produced at that time. 156 If there are multiple valid interpretations of a toplevel expression, ties are broken based on the conversion cost, calculated as above. 157 If there are multiple minimalcost valid interpretations of a toplevel expression, that expression is said to be \emph{ambiguous}, and an error must be produced. 157 158 Multiple minimalcost interpretations of a subexpression do not necessarily imply an ambiguous toplevel expression, however, as the subexpression interpretations may be disambiguated based on their return type or by selecting a moreexpensive interpretation of that subexpression to reduce the overall expression cost, as in the example above. 158 159 … … 160 161 An interpretation can only be selected if all the type assertions in the !forall! clause on the corresponding declaration can be satisfied with a unique minimalcost set of satisfying declarations. 161 162 Type assertion satisfaction is tested by performing type unification on the type of the assertion and the type of the declaration satisfying the assertion. 162 That is, a declaration whichsatisfies a type assertion must have the same name and type as the assertion after applying the substitutions in the type environment.163 That is, a declaration that satisfies a type assertion must have the same name and type as the assertion after applying the substitutions in the type environment. 163 164 Assertionsatisfying declarations may be polymorphic functions with assertions of their own that must be satisfied recursively. 164 165 This recursive assertion satisfaction has the potential to introduce infinite loops into the type resolution algorithm, a situation which \CFACC{} avoids by imposing a hard limit on the depth of recursive assertion satisfaction (currently 4); this approach is also taken by \CC{} to prevent infinite recursion in template expansion, and has proven to be effective and not unduly restrictive of the expressive power of \CFA{}. … … 167 168 In C, cast expressions can serve two purposes, \emph{conversion} (\eg{} !(int)3.14!), which semantically converts a value to another value in a different type with a different bit representation, or \emph{coercion} (\eg{} !void* p; (int*)p;!), which assigns a different type to the same bit value. 168 169 C provides a set of builtin conversions and coercions, and user programmers are able to force a coercion over a conversion if desired by casting pointers. 169 The overloading features in \CFA{} introduce a third cast semantic, \emph{ascription} (\eg{} !int x; double x; (int)x;!), which selects the overload whichmostclosely matches the cast type.170 The overloading features in \CFA{} introduce a third cast semantic, \emph{ascription} (\eg{} !int x; double x; (int)x;!), which selects the overload that mostclosely matches the cast type. 170 171 However, since ascription does not exist in C due to the lack of overloadable identifiers, if a cast argument has an unambiguous interpretation as a conversion argument then it must be interpreted as such, even if the ascription interpretation would have a lower overall cost. 171 172 This is demonstrated in the following example, adapted from the C standard library: … … 178 179 In C semantics, this example is unambiguously upcasting !32! to !unsigned long long!, performing the shift, then downcasting the result to !unsigned!, at total cost $(1,0,3,1,0,0,0)$. 179 180 If ascription were allowed to be a firstclass interpretation of a cast expression, it would be cheaper to select the !unsigned! interpretation of !?>>?! by downcasting !x! to !unsigned! and upcasting !32! to !unsigned!, at a total cost of $(1,0,1,1,0,0,0)$. 180 However, this break from C semantics i ntroduces a backwards compatibility break, so to maintain C compatibilitythe \CFA{} resolver selects the lowestcost interpretation of the cast argument for which a conversion or coercion to the target type exists (upcasting to !unsigned long long! in the example above, due to the lack of unsafe downcasts), using the cost of the conversion itself only as a tiebreaker.181 However, this break from C semantics is not backwards compatibile, so to maintain C compatibility, the \CFA{} resolver selects the lowestcost interpretation of the cast argument for which a conversion or coercion to the target type exists (upcasting to !unsigned long long! in the example above, due to the lack of unsafe downcasts), using the cost of the conversion itself only as a tiebreaker. 181 182 For example, in !int x; double x; (int)x;!, both declarations have zerocost interpretations as !x!, but the !int x! interpretation is cheaper to cast to !int!, and is thus selected. 182 183 Thus, in contrast to the lazy resolution of nested functioncall expressions discussed above, where final interpretations for each subexpression are not chosen until the toplevel expression is reached, cast expressions introduce eager resolution of their argument subexpressions, as if that argument was itself a toplevel expression. … … 187 188 While this theoretical result is daunting, its implications can be mitigated in practice. 188 189 \CFACC{} does not solve one instance of expression resolution in the course of compiling a program, but rather thousands; therefore, if the worst case of expression resolution is sufficiently rare, worstcase instances can be amortized by morecommon easy instances for an acceptable overall runtime, as shown in Section~\ref{instanceexprsec}. 189 Secondly, while a programmer \emph{can} deliberately generate a program designed for inefficient compilation\footnote{ see for instance \cite{Haberman16}, which generates arbitrarily large \CC{} template expansions from a fixedsize source file.}, source code tends to follow common patterns.190 Secondly, while a programmer \emph{can} deliberately generate a program designed for inefficient compilation\footnote{See for instance \cite{Haberman16}, which generates arbitrarily large \CC{} template expansions from a fixedsize source file.}, source code tends to follow common patterns. 190 191 Programmers generally do not want to run the full compiler algorithm in their heads, and as such keep mental shortcuts in the form of language idioms. 191 192 If the compiler can be tuned to handle idiomatic code more efficiently, then the reduction in runtime for idiomatic (but otherwise difficult) resolution instances can make a significant difference in total compiler runtime. … … 193 194 \subsection{Worstcase Analysis} \label{resnanalysissec} 194 195 195 Expression resolution has a number of components whichcontribute to its runtime, including argumentparameter type unification, recursive traversal of the expression tree, and satisfaction of type assertions.196 Expression resolution has a number of components that contribute to its runtime, including argumentparameter type unification, recursive traversal of the expression tree, and satisfaction of type assertions. 196 197 197 198 If the bound type for a type variable can be looked up or mutated in constant time (as asserted in Table~\ref{envboundstable}), then the runtime of the unification algorithm to match an argument to a parameter is usually proportional to the complexity of the types being unified. … … 206 207 To resolve the outermost !wrap!, the resolver must check that !pair(pair(int))! unifies with itself, but at three levels of nesting, !pair(pair(int))! is more complex than either !pair(T)! or !T!, the types in the declaration of !wrap!. 207 208 Accordingly, the cost of a single argumentparameter unification is $O(d)$, where $d$ is the depth of the expression tree, and the cost of argumentparameter unification for a single candidate for a given function call expression is $O(pd)$, where $p$ is the number of parameters. 208 This does not, however, account for the higher costs of unifying two polymorphic type variables, which may in the worst case result in a recursive unification of all type variables in the expression (as discussed in Chapter~\ref{envchap}).209 This bound does not, however, account for the higher costs of unifying two polymorphic type variables, which may in the worst case result in a recursive unification of all type variables in the expression (as discussed in Chapter~\ref{envchap}). 209 210 Since this recursive unification reduces the number of type variables, it may happen at most once, for an added $O(p^d)$ cost for a toplevel expression with $O(p^d)$ type variables. 210 211 211 212 Implicit conversions are also checked in argumentparameter matching, but the cost of checking for the existence of an implicit conversion is again proportional to the complexity of the type, $O(d)$. 212 Polymorphism also introduces a potential expense here; for a monomorphic function there is only one potential implicit conversion from argument type to parameter type, while if the parameter type is an unbound polymorphic type 213 Polymorphism also introduces a potential expense here; for a monomorphic function there is only one potential implicit conversion from argument type to parameter type, while if the parameter type is an unbound polymorphic typevariable then any implicit conversion from the argument type could potentially be considered a valid binding for that type variable. 213 214 \CFA{}, however, requires exact matches for the bound type of polymorphic parameters, removing this problem. 214 215 An interesting question for future work is whether loosening this requirement incurs a significant compiler runtime cost in practice; preliminary results from the prototype system described in Chapter~\ref{exprchap} suggest it does not. 215 216 216 217 Considering the recursive traversal of the expression tree, polymorphism again greatly expands the worstcase runtime. 217 Let ting $i$ be the number of candidate declarations for each function call, if all of these candidates are monomorphicthen there are no more than $i$ unambiguous interpretations of the subexpression rooted at that function call.218 Let $i$ be the number of candidate declarations for each function call; if all of these candidates are monomorphic, then there are no more than $i$ unambiguous interpretations of the subexpression rooted at that function call. 218 219 Ambiguous minimalcost subexpression interpretations may also be collapsed into a single \emph{ambiguous interpretation}, as the presence of such a subexpression interpretation in the final solution is an error condition. 219 One safe pruning operation during expression resolution is to discard all subexpression interpretations with greaterthanminimal cost for their return type, as such interpretations will neverbeat the minimalcost interpretation with their return type for the overall optimal solution.220 As such, with no polymorphism each declaration willgenerate no more than one minimalcost interpretation with its return type, so the number of possible subexpression interpretations is $O(i)$ (note that in C, which lacks overloading, $i \leq 1$).220 One safe pruning operation during expression resolution is to discard all subexpression interpretations with greaterthanminimal cost for their return type, as such interpretations cannot beat the minimalcost interpretation with their return type for the overall optimal solution. 221 As such, with no polymorphism, each declaration can generate no more than one minimalcost interpretation with its return type, so the number of possible subexpression interpretations is $O(i)$ (note that in C, which lacks overloading, $i \leq 1$). 221 222 With polymorphism, however, a single declaration (like !wrap! above) can have many concrete return types after type variable substitution, and could in principle have a different concrete return type for each combination of argument interpretations. 222 223 Calculated recursively, the bound on the total number of candidate interpretations is $O(i^{p^d})$, each with a distinct type. … … 226 227 Since the size of the expression is $O(p^d)$, letting $n = p^d$ this simplifies to $O(i^n \cdot n^2)$ 227 228 228 This already highbound does not yet account for the cost of assertion satisfaction, however.229 This bound does not yet account for the cost of assertion satisfaction, however. 229 230 \CFA{} uses type unification on the assertion type and the candidate declaration type to test assertion satisfaction; this unification calculation has cost proportional to the complexity of the declaration type after substitution of bound type variables; as discussed above, this cost is $O(d)$. 230 231 If there are $O(a)$ type assertions on each declaration, there are $O(i)$ candidates to satisfy each assertion, for a total of $O(ai)$ candidates to check for each declaration. … … 234 235 235 236 It is clear that assertion satisfaction costs can be very large, and in fact a method for heuristically reducing these costs is one of the key contributions of this thesis, but it should be noted that the worstcase analysis is a particularly poor match for actual code in the case of assertions. 236 It is reasonable to assume that most code compiles without errors, as in an activelydeveloped project the code will becompiled many times, generally with relatively few new errors introduced between compiles.237 It is reasonable to assume that most code compiles without errors, as an activelydeveloped project is compiled many times, generally with relatively few new errors introduced between compiles. 237 238 However, the worstcase bound for assertion satisfaction is based on recursive assertion satisfaction calls exceeding the limit, which is an error case. 238 In practice, then, the depth of recursive assertion satisfaction should be bounded by a small constant for errorfree code, which will accountfor the vast majority of problem instances.239 240 Similarly, uses of polymorphism like those that generate the $O(d)$ bound on unification or the $O(i^{p^d})$ bound on number of candidates are particular enough to berare, but not completely absent.239 In practice, then, the depth of recursive assertion satisfaction should be bounded by a small constant for errorfree code, which accounts for the vast majority of problem instances. 240 241 Similarly, uses of polymorphism like those that generate the $O(d)$ bound on unification or the $O(i^{p^d})$ bound on number of candidates are rare, but not completely absent. 241 242 This analysis points to type unification, argumentparameter matching, and assertion satisfaction as potentially costly elements of expression resolution, and thus profitable targets for algorithmic investigation. 242 243 Type unification is discussed in Chapter~\ref{envchap}, while the other aspects are covered below. … … 251 252 \begin{figure}[h] 252 253 \centering 253 \begin{subfigure}[h]{3 .5in}254 \begin{subfigure}[h]{3in} 254 255 \begin{cfa} 255 int *p; $\C[1in]{// pi}$256 char *p; $\C[1in]{// pc}$256 char *p$\(_1\)$; 257 int *p$\(_2\)$; 257 258 258 double *f(int*, int*); $\C[1in]{// fd}$259 char *f(char*, int*); $\C[1in]{// fc}$259 char *f$\(_1\)$(char*, int*); 260 double *f$\(_2\)$(int*, int*); 260 261 261 f ( f( p, p ), p);262 f$\(_A\)$( f$\(_B\)$( p$\(_A\)$, p$\(_B\)$ ), p$\(_C\)$ ); 262 263 \end{cfa} 263 \end{subfigure}~\begin{subfigure}[h]{2 in}264 \end{subfigure}~\begin{subfigure}[h]{2.5in} 264 265 \includegraphics{figures/resolutiondag} 265 266 \end{subfigure} 266 \caption[Resolution DAG for a simple expression.]{Resolution DAG for a simple expression . Functions that do not have a valid argument matching are covered with an \textsf{X}.} \label{resdagfig}267 \caption[Resolution DAG for a simple expression.]{Resolution DAG for a simple expression, annotated with explanatory subscripts. Functions that do not have a valid argument matching are covered with an \textsf{X}.} \label{resdagfig} 267 268 \end{figure} 268 269 269 Note that some interpretations may be part of more than one superinterpretation, as with the second $pi$ in the bottom row, while some valid subexpression interpretations, like $fd$ in the middle row, are not used in any interpretation of their superexpression.270 Note that some interpretations may be part of more than one superinterpretation, as with the !p!$_2$ interpretation of !p!$_B$, while some valid subexpression interpretations, like the !f!$_2$ interpretation of !f!$_B$, are not used in any interpretation of their superexpression. 270 271 271 272 Overload resolution was first seriously considered in the development of compilers for the Ada programming language, with different algorithms making various numbers of passes over the expression DAG, these passes being either topdown or bottomup. … … 281 282 Cormack's algorithm can be modified to request argument interpretations of \emph{any} type when provided an unbound parameter type, but this eliminates any pruning gains that could be provided by the algorithm. 282 283 283 Ganzinger and Ripken~\cite{Ganzinger80} propose an approach (later refined by Pennello~\etal{}~\cite{Pennello80}) that uses a topdown filtering pass followed by a bottomup filtering pass to reduce the number of candidate interpretations; they prove that for the Ada programming language a small number of such iterations is sufficient to converge to a solution for the overload resolution problem.284 Ganzinger and Ripken~\cite{Ganzinger80} propose an approach (later refined by Pennello~\etal{}~\cite{Pennello80}) that uses a topdown filtering pass followed by a bottomup filtering pass to reduce the number of candidate interpretations; they prove that a small number of such iterations is sufficient to converge to a solution for the overload resolution problem in the Ada programming language. 284 285 Persch~\etal{}~\cite{PW:overload} developed a similar twopass approach where the bottomup pass is followed by the topdown pass. 285 286 These approaches differ from Baker, Bilson, or Cormack in that they take multiple passes over the expression tree to yield a solution by applying filtering heuristics to all expression nodes. … … 309 310 List(Declaration) candidates = decls_matching( a.name ); 310 311 List(List(Declaration)) alternatives = {} 311 for ( Declaration c ;candidates ) {312 for ( Declaration c : candidates ) { 312 313 Environment newEnv = env; 313 314 if ( unify( adjType, c.type, newEnv ) ) { … … 323 324 One shortcoming of this approach is that if an earlier assertion has multiple valid candidates, later assertions may be checked many times due to the structure of the recursion. 324 325 Satisfying declarations for assertions are not completely independent of each other, since the unification process may produce new type bindings in the environment, and these bindings may not be compatible between independentlychecked assertions. 325 Nonetheless, with the environment data 326 Nonetheless, with the environment datastructures discussed in Chapter~\ref{envchap}, I have found it more efficient to produce a list of possiblysatisfying declarations for each assertion once, then check their respective environments for mutual compatibility when combining lists of assertions together. 326 327 327 328 Another improvement I have made to the assertion resolution scheme in \CFACC{} is to consider all assertionsatisfying combinations at one level of recursion before attempting to recursively satisfy any !newNeed! assertions. 328 329 Monomorphic functions are cheaper than polymorphic functions for assertion satisfaction because they are an exact match for the environmentadjusted assertion type, whereas polymorphic functions require an extra type binding. 329 Thus, if there is any mutuallycompatible set of assertionsatisfying declarations which does not include any polymorphic functions (and associated recursive assertions), then the optimal set of assertions willnot require any recursive !newNeed! satisfaction.330 More generally, due to the \CFA{} cost 330 Thus, if there is any mutuallycompatible set of assertionsatisfying declarations that does not include any polymorphic functions (and associated recursive assertions), then the optimal set of assertions does not require any recursive !newNeed! satisfaction. 331 More generally, due to the \CFA{} costmodel changes I introduced in Section~\ref{convcostsec}, the conversion cost of an assertionsatisfying declaration is no longer dependent on the conversion cost of its own assertions. 331 332 As such, all sets of mutuallycompatible assertionsatisfying declarations can be sorted by their summed conversion costs, and the recursive !newNeed! satisfaction pass is required only to check the feasibility of the minimalcost sets. 332 This significantly reduces wasted work relative to Bilson's approach, as well as avoiding generation of deeplyrecursive assertion setsfor a significant performance improvement relative to Bilson's \CFACC{}.333 This optimization significantly reduces wasted work relative to Bilson's approach, as well as avoiding generation of deeplyrecursive assertion sets, for a significant performance improvement relative to Bilson's \CFACC{}. 333 334 334 335 Making the conversion cost of an interpretation independent of the cost of satisfying its assertions has further benefits. 335 Bilson's algorithm checks assertions for all subexpression interpretations immediately, including those which are not ultimately used; I have developed a \emph{deferred} variant of assertion checking whichwaits until a toplevel interpretation has been generated to check any assertions.336 Bilson's algorithm checks assertions for all subexpression interpretations immediately, including those that are not ultimately used; I have developed a \emph{deferred} variant of assertion checking that waits until a toplevel interpretation has been generated to check any assertions. 336 337 If the assertions of the minimalcost toplevel interpretation cannot be satisfied then the nextmostminimalcost interpretation's assertions are checked, and so forth until a minimalcost satisfiable interpretation (or ambiguous set thereof) is found, or no toplevel interpretations are found to have satisfiable assertions. 337 In the common case where the code actually does compile this saves the work of checking assertions for ultimatelyrejected interpretations, though it does rule out some pruning opportunities for subinterpretations with unsatisfiable assertions or which are more expensive than a minimalcost polymorphic function with the same return type.338 In the common case where the code actually does compile, this saves the work of checking assertions for ultimatelyrejected interpretations, though it does rule out some pruning opportunities for subinterpretations with unsatisfiable assertions or which are more expensive than a minimalcost polymorphic function with the same return type. 338 339 The experimental results in Chapter~\ref{exprchap} indicate that this is a worthwhile tradeoff. 339 340 … … 346 347 347 348 The first thing that makes this expression so difficult is that it is 23 levels deep; Section~\ref{resnanalysissec} indicates that the worstcase bounds on expression resolution are exponential in expression depth. 348 Secondly, the !??! operator is significantly overloaded in \CFA{}  there are 74 such operators in the \CFA{} standard library, and while 9 are arithmetic operators inherited from C, the rest are polymorphic I/O operators that look s omething like this:349 Secondly, the !??! operator is significantly overloaded in \CFA{}  there are 74 such operators in the \CFA{} standard library, and while 9 are arithmetic operators inherited from C, the rest are polymorphic I/O operators that look similar to: 349 350 350 351 \begin{cfa} … … 353 354 \end{cfa} 354 355 355 Note that !ostream! is a trait with 25 type assertions, and that the output operators for the other arithmetic types are also feasiblefor the !int!type parameters due to implicit conversions.356 Note that !ostream! is a trait with 25 type assertions, and that the output operators for the other arithmetic types are also valid for the !int!type parameters due to implicit conversions. 356 357 On this instance, deferred assertion satisfaction saves wasted work checking assertions on the wrong output operators, but does nothing about the 23 repeated checks of the 25 assertions to determine that !ofstream! (the type of !sout!) satisfies !ostream!. 357 358 358 359 To solve this problem, I have developed a \emph{cached} variant of assertion checking. 359 During the course of checking the assertions of a single toplevel expression, I cache the results ofeach assertion checked.360 During the course of checking the assertions of a single toplevel expression, the results are cached for each assertion checked. 360 361 The search key for this cache is the assertion declaration with its type variables substituted according to the type environment to distinguish satisfaction of the same assertion for different types. 361 This adjusted assertion declaration is then run through the \CFA{} name 362 This adjusted assertion declaration is then run through the \CFA{} namemangling algorithm to produce an equivalent stringtype key. 362 363 363 364 The assertion satisfaction aspect of \CFA{} expression resolution bears some similarity to satisfiability problems from logic, and as such other languages with similar trait and assertion mechanisms make use of logicprogram solvers in their compilers. 364 365 For instance, Matsakis~\cite{Matsakis17} and the Rust team have developed a PROLOGbased engine to check satisfaction of Rust traits. 365 The combination of the assertion satisfaction elements of the problem with the conversion 366 Expressing assertion resolution as a satisfiability problem ignores the cost optimization aspect, which is necessary to decide betweenwhat are often many possible satisfying assignments of declarations to assertions.366 The combination of the assertion satisfaction elements of the problem with the conversioncost model of \CFA{} makes this logicsolver approach difficult to apply in \CFACC{}, however. 367 Expressing assertion resolution as a satisfiability problem ignores the cost optimization aspect, which is necessary to decide among what are often many possible satisfying assignments of declarations to assertions. 367 368 On the other hand, the deeplyrecursive nature of the satisfiability problem makes it difficult to adapt to optimizing solver approaches such as linear programming. 368 369 To maintain a welldefined programming language, any optimization algorithm used must provide an exact (rather than approximate) solution; this constraint also rules out a whole class of approximatelyoptimal generalized solvers. … … 372 373 373 374 As the results in Chapter~\ref{exprchap} show, the algorithmic approaches I have developed for \CFA{} expression resolution are sufficient to build a practicallyperformant \CFA{} compiler. 374 This work may also be of use to other compiler construction projects, notably to members of the \CC{} community as they implement the new Concepts \cite{C++Concepts} standard, which includes type assertions similar to those used in \CFA{}, as well as a Cderived implicit conversion system.375 This work may also be of use to other compiler construction projects, notably to members of the \CC{} community as they implement the new Concepts \cite{C++Concepts} standard, which includes type assertions similar to those used in \CFA{}, as well as the Cderived implicit conversion system already present in \CC{}. 375 376 376 377 I have experimented with using expression resolution rather than type unification to check assertion satisfaction; this variant of the expression resolution problem should be investigated further in future work. … … 378 379 Anecdotally, this flexibility matches userprogrammer expectations better, as small type differences (\eg{} the presence or absence of a reference type, or the usual conversion from !int! to !long!) no longer break assertion satisfaction. 379 380 Practically, the resolver prototype discussed in Chapter~\ref{exprchap} uses this model of assertion satisfaction, with no apparent deficit in performance; the generated expressions that are resolved to satisfy the assertions are easier than the general case because they never have nested subexpressions, which eliminates much of the theoretical differences between unification and resolution. 380 The main challenge to implement this approach in \CFACC{} would beapplying the implicit conversions generated by the resolution process in the codegeneration for the thunk functions that \CFACC{} uses to pass type assertions to their requesting functions with the proper signatures.381 The main challenge to implement this approach in \CFACC{} is applying the implicit conversions generated by the resolution process in the codegeneration for the thunk functions that \CFACC{} uses to pass type assertions to their requesting functions with the proper signatures. 381 382 382 383 Though performance of the existing algorithms is promising, some further optimizations do present themselves. 383 The refined cost model discussed in Section~\ref{convcostsec} is more expressive, but alsorequires more than twice as many fields; it may be fruitful to investigate more tightlypacked inmemory representations of the costtuple, as well as comparison operations that require fewer instructions than a full lexicographic comparison.384 The refined cost model discussed in Section~\ref{convcostsec} is more expressive, but requires more than twice as many fields; it may be fruitful to investigate more tightlypacked inmemory representations of the costtuple, as well as comparison operations that require fewer instructions than a full lexicographic comparison. 384 385 Integer or vector operations on a morepacked representation may prove effective, though dealing with the negativevalued $specialization$ field may require some effort. 385 386 … … 388 389 While the checks for mutual compatibility are naturally more serial, there may be some benefit to parallel resolution of the subproblem instances. 389 390 390 The resolver prototype built for this project and described in Chapter~\ref{exprchap} would be a suitable vehicle for many of these further experiments, and thus a technical contribution of continuing utility.391 The resolver prototype built for this project and described in Chapter~\ref{exprchap} would be a suitable vehicle for many of these further experiments, and thus a technical contribution of continuing utility.
Note: See TracChangeset
for help on using the changeset viewer.