1 | \chapter{Resolution Heuristics} |
---|
2 | \label{resolution-chap} |
---|
3 | |
---|
4 | % consider using "satisfaction" throughout when talking about assertions |
---|
5 | % "valid" instead of "feasible" interpretations |
---|
6 | |
---|
7 | The main task of the \CFACC{} type-checker is \emph{expression resolution}, determining which declarations the identifiers in each expression correspond to. |
---|
8 | Resolution is a straightforward task in C, as no declarations share identifiers, but in \CFA{} the name overloading features discussed in Section~\ref{overloading-sec} generate multiple candidate declarations for each identifier. |
---|
9 | I refer to a given matching between identifiers and declarations in an expression as an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{poly-func-sec} and~\ref{implicit-conv-sec}, each of which increase the proportion of feasible candidate interpretations. |
---|
10 | To choose between feasible interpretations, \CFA{} defines a \emph{conversion cost} to rank interpretations; the expression resolution problem is thus to find the unique minimal-cost interpretation for an expression, reporting an error if no such interpretation exists. |
---|
11 | |
---|
12 | \section{Expression Resolution} |
---|
13 | |
---|
14 | \subsection{Type Unification} |
---|
15 | |
---|
16 | The polymorphism features of \CFA{} require binding of concrete types to polymorphic type variables. |
---|
17 | Briefly, \CFACC{} keeps a mapping from type variables to the concrete types they are bound to as an auxiliary data structure during expression resolution; Chapter~\ref{env-chap} describes this \emph{environment} data structure in more detail. |
---|
18 | A \emph{unification} algorithm is used to simultaneously check two types for equivalence with respect to the substitutions in an environment and update that environment. |
---|
19 | Essentially, unification recursively traverses the structure of both types, checking them for equivalence, and when it encounters a type variable it replaces it with the concrete type it is bound to; if the type variable has not yet been bound, the unification algorithm assigns the equivalent type as the bound type of the variable, after performing various consistency checks. |
---|
20 | Ditchfield\cite{Ditchfield92} and Bilson\cite{Bilson03} describe the semantics of \CFA{} unification in more detail. |
---|
21 | |
---|
22 | \subsection{Conversion Cost} |
---|
23 | |
---|
24 | C does not have an explicit cost model for implicit conversions, but the ``usual arithmetic conversions''\cit{} used to decide which arithmetic operators to use define one implicitly. |
---|
25 | Beginning with the work of Bilson\cite{Bilson03}, \CFA{} has defined a \emph{conversion cost} for each function call in a way that generalizes C's conversion rules. |
---|
26 | Loosely defined, the conversion cost counts the implicit conversions utilized by an interpretation. |
---|
27 | With more specificity, the cost is a lexicographically-ordered tuple, where each element corresponds to a particular kind of conversion. |
---|
28 | In Bilson's \CFA{} design, conversion cost is a 3-tuple, $(unsafe, poly, safe)$, where $unsafe$ is the count of unsafe (narrowing) conversions, $poly$ is the count of polymorphic type bindings, and $safe$ is the sum of the degree of safe (widening) conversions. |
---|
29 | The following example lists the cost in the Bilson model of calling each of the following functions with two !int! parameters: |
---|
30 | |
---|
31 | \begin{cfa} |
---|
32 | void f(char, long); $\C{// (1,0,1)}$ |
---|
33 | forall(otype T) void f(T, long); $\C{// (0,1,1)}$ |
---|
34 | void f(long, long); $\C{// (0,0,2)}$ |
---|
35 | void f(int, unsigned long); $\C{// (0,0,2)}$ |
---|
36 | void f(int, long); $\C{// (0,0,1)}$ |
---|
37 | \end{cfa} |
---|
38 | |
---|
39 | Note that safe and unsafe conversions are handled differently; \CFA{} counts ``distance'' of safe conversions (\eg{} !int! to !long! is cheaper than !int! to !unsigned long!), while only counting the number of unsafe conversions (\eg{} !int! to !char! and !int! to !short! both have unsafe cost 1). |
---|
40 | |
---|
41 | As part of adding reference types to \CFA{} (see Section~\ref{type-features-sec}), Schluntz added a new $reference$ element to the cost tuple, which counts the number of implicit reference-to-rvalue conversions performed so that candidate interpretations can be distinguished by how closely they match the nesting of reference types; since references are meant to act almost indistinguishably from lvalues, this $reference$ element is the least significant in the lexicographic comparison of cost tuples. |
---|
42 | |
---|
43 | I have also refined the \CFA{} cost model as part of this thesis work. |
---|
44 | Bilson's \CFA{} cost model includes the cost of polymorphic type bindings from a function's type assertions in the $poly$ element of the cost tuple; this has the effect of making more-constrained functions more expensive than less-constrained functions. |
---|
45 | However, type assertions actually make a function \emph{less} polymorphic, and as such functions with more type assertions should be preferred in type resolution. |
---|
46 | As an example, some iterator-based algorithms can work on a forward iterator that only provides an increment operator, but are more efficient on a random-access iterator that can be incremented by an arbitrary number of steps in a single operation. |
---|
47 | The random-access iterator has more type constraints, but should be chosen whenever those constraints can be satisfied. |
---|
48 | As such, I have added a $specialization$ element to the \CFA{} cost tuple, the values of which are always negative. |
---|
49 | Each type assertion subtracts 1 from $specialization$, so that more-constrained functions will cost less and thus be chosen over less-constrained functions, all else being equal. |
---|
50 | A more sophisticated design would define a partial order over sets of type assertions by set inclusion (\ie{} one function would only cost less than another if it had a strict superset of assertions, rather than just more total assertions), but I did not judge the added complexity of computing and testing this order to be worth the gain in specificity. |
---|
51 | |
---|
52 | I have also incorporated an unimplemented aspect of Ditchfield's earlier cost model. |
---|
53 | In the example below, adapted from \cite[89]{Ditchfield92}, Bilson's cost model only distinguished between the first two cases by accounting extra cost for the extra set of !otype! parameters, which, as discussed above, is not a desirable solution: |
---|
54 | |
---|
55 | \begin{cfa} |
---|
56 | forall(otype T, otype U) void f(T, U); $\C[3.25in]{// polymorphic}$ |
---|
57 | forall(otype T) void f(T, T); $\C[3.25in]{// less polymorphic}$ |
---|
58 | forall(otype T) void f(T, int); $\C[3.25in]{// even less polymorphic}$ |
---|
59 | forall(otype T) void f(T*, int); $\C[3.25in]{// least polymorphic}$ |
---|
60 | \end{cfa} |
---|
61 | |
---|
62 | I account for the fact that functions with more polymorphic variables are less constrained by introducing a $var$ cost element that counts the number of type variables on a candidate function. |
---|
63 | In the example above, the first !f! has $var = 2$, while the remainder have $var = 1$. |
---|
64 | My new \CFA{} cost model also accounts for a nuance un-handled by Ditchfield or Bilson, in that it makes the more specific fourth function above cheaper than the more generic third function. |
---|
65 | The fourth function is presumably somewhat optimized for handling pointers, but the prior \CFA{} cost model could not account for the more specific binding, as it simply counted the number of polymorphic unifications. |
---|
66 | |
---|
67 | In my modified model, each level of constraint on a polymorphic type in the parameter list results in a decrement of the $specialization$ cost element. |
---|
68 | Thus, all else equal, if both a binding to !T! and a binding to !T*! are available, \CFA{} will pick the more specific !T*! binding. |
---|
69 | This process is recursive, such that !T**! produces a -2 specialization cost, as opposed to the -1 cost for !T*!. |
---|
70 | This works similarly for generic types, \eg{} !box(T)! also has specialization cost -1. |
---|
71 | For multi-argument generic types, the least-specialized polymorphic parameter sets the specialization cost, \eg{} the specialization cost of !pair(T, S*)! is -1 (from !T!) rather than -2 (from !S!). |
---|
72 | Since the user programmer provides parameters, but cannot provide guidance on return type, specialization cost is not counted for the return type list. |
---|
73 | Since both $vars$ and $specialization$ are properties of the declaration rather than any particular interpretation, they are prioritized less than the interpretation-specific conversion costs from Bilson's original 3-tuple. |
---|
74 | The current \CFA{} cost tuple is thus as follows: |
---|
75 | |
---|
76 | \begin{equation*} |
---|
77 | (unsafe, poly, safe, vars, specialization, reference) |
---|
78 | \end{equation*} |
---|
79 | |
---|
80 | \subsection{Expression Cost} |
---|
81 | |
---|
82 | The mapping from \CFA{} expressions to cost tuples is described by Bilson in \cite{Bilson03}, and remains effectively unchanged modulo the refinements to the cost tuple described above. |
---|
83 | Nonetheless, some salient details are repeated here for the sake of completeness. |
---|
84 | |
---|
85 | On a theoretical level, the resolver algorithm treats most expressions as if they were function calls. |
---|
86 | Operators in \CFA{} (both those existing in C and added features like constructors) are all modelled as function calls. |
---|
87 | In terms of the core argument-parameter matching algorithm, the overloaded variables of \CFA{} are not handled differently from zero-argument function calls, aside from a different pool of candidate declarations and setup for different code generation. |
---|
88 | Similarly, an aggregate member expression !a.m! can be modelled as a unary function !m! that takes one argument of the aggregate type. |
---|
89 | Literals do not require sophisticated resolution, as the syntactic form of each implies their result types (\eg{} !42! is int, !"hello"! is !char*!, \etc{}), though struct literals require resolution of the implied constructor call. |
---|
90 | |
---|
91 | Since most expressions can be treated as function calls, nested function calls are the primary component of expression resolution problem instances. |
---|
92 | Each function call has an \emph{identifier} which must match the name of the corresponding declaration, and a possibly-empty list of \emph{arguments}. |
---|
93 | These arguments may be function call expressions themselves, producing a tree of function-call expressions to resolve, where the leaf expressions are generally nullary functions, variable expressions, or literals. |
---|
94 | A single instance of expression resolution consists of matching declarations to all the identifiers in the expression tree of a top-level expression, along with inserting any conversions and assertions necessary for that matching. |
---|
95 | The cost of a function-call expression is the sum of the conversion costs of each argument type to the corresponding parameter and the total cost of each subexpression, recursively calculated. |
---|
96 | \CFA{} expression resolution must produce either the unique lowest-cost interpretation of the top-level expression, or an appropriate error message if none such exists. |
---|
97 | The cost model of \CFA{} precludes a simple bottom-up resolution pass, as constraints and costs introduced by calls higher in the expression tree can change the interpretation of those lower in the tree, as in the following example: |
---|
98 | |
---|
99 | \begin{cfa} |
---|
100 | void f(int); |
---|
101 | double g(int); $\C{// g1}$ |
---|
102 | int g(double); $\C{// g2}$ |
---|
103 | |
---|
104 | f( g(42) ); |
---|
105 | \end{cfa} |
---|
106 | |
---|
107 | !g1! is the cheapest interpretation of !g(42)!, with cost $(0,0,0,0,0,0)$ since the argument type is an exact match, but to downcast the return type of !g1! to an !int! suitable for !f! requires an unsafe conversion for a total cost of $(1,0,0,0,0,0)$. |
---|
108 | If !g2! is chosen, on the other hand, there is a safe upcast from the !int! type of !42! to !double!, but no cast on the return of !g!, for a total cost of $(0,0,1,0,0,0)$; as this is cheaper, $g2$ is chosen. |
---|
109 | Due to this design, in general all feasible interpretations of subexpressions must be propagated to the top of the expression tree before any can be eliminated, a lazy form of expression resolution, as opposed to the eager expression resolution allowed by C, where each expression can be resolved given only the resolution of its immediate subexpressions. |
---|
110 | |
---|
111 | If there are no feasible interpretations of the top-level expression, expression resolution fails and must produce an appropriate error message. |
---|
112 | If any subexpression has no feasible interpretations, the process can be short-circuited and the error produced at that time. |
---|
113 | If there are multiple feasible interpretations of a top-level expression, ties are broken based on the conversion cost, calculated as above. |
---|
114 | If there are multiple minimal-cost feasible interpretations of a top-level expression, that expression is said to be \emph{ambiguous}, and an error must be produced. |
---|
115 | Multiple minimal-cost interpretations of a subexpression do not necessarily imply an ambiguous top-level expression, however, as the subexpression interpretations may be disambiguated based on their return type or by selecting a more-expensive interpretation of that subexpression to reduce the overall expression cost, as above. |
---|
116 | |
---|
117 | The \CFA{} resolver uses type assertions to filter out otherwise-feasible subexpression interpretations. |
---|
118 | An interpretation can only be selected if all the type assertions in the !forall! clause on the corresponding declaration can be satisfied with a unique minimal-cost set of satisfying declarations. |
---|
119 | Type assertion satisfaction is tested by performing type unification on the type of the assertion and the type of the declaration satisfying the assertion. |
---|
120 | That is, a declaration which satisfies a type assertion must have the same name and type as the assertion after applying the substitutions in the type environment. |
---|
121 | Assertion-satisfying declarations may be polymorphic functions with assertions of their own that must be satisfied recursively. |
---|
122 | This recursive assertion satisfaction has the potential to introduce infinite loops into the type resolution algorithm, a situation which \CFACC{} avoids by imposing a hard limit on the depth of recursive assertion satisfaction (currently 4); this approach is also taken by \CC{} to prevent infinite recursion in template expansion, and has proven to be both effective an not unduly restrictive of the language's expressive power. |
---|
123 | |
---|
124 | Cast expressions must be treated somewhat differently than functions for backwards compatibility purposes with C. |
---|
125 | In C, cast expressions can serve two purposes, \emph{conversion} (\eg{} !(int)3.14!), which semantically converts a value to another value in a different type with a different bit representation, or \emph{coercion} (\eg{} !void* p; (int*)p;!), which assigns a different type to the same bit value. |
---|
126 | C provides a set of built-in conversions and coercions, and user programmers are able to force a coercion over a conversion if desired by casting pointers. |
---|
127 | The overloading features in \CFA{} introduce a third cast semantics, \emph{ascription} (\eg{} !int x; double x; (int)x;!), which selects the overload which most-closely matches the cast type. |
---|
128 | However, since ascription does not exist in C due to the lack of overloadable identifiers, if a cast argument has an unambiguous interpretation as a conversion argument then it must be interpreted as such, even if the ascription interpretation would have a lower overall cost, as in the following example, adapted from the C standard library: |
---|
129 | |
---|
130 | \begin{cfa} |
---|
131 | unsigned long long x; |
---|
132 | (unsigned)(x >> 32); |
---|
133 | \end{cfa} |
---|
134 | |
---|
135 | In C semantics, this example is unambiguously upcasting !32! to !unsigned long long!, performing the shift, then downcasting the result to !unsigned!, at total cost $(1,0,4,0,0,0)$. |
---|
136 | If ascription were allowed to be a first-class interpretation of a cast expression, it would be cheaper to select the !unsigned! interpretation of $?>>?$ by downcasting !x! to !unsigned! and upcasting !32! to !unsigned!, at a total cost of $(1,0,1,0,0,0)$. |
---|
137 | However, this break from C semantics introduces a backwards compatibility break, so to maintain C compatibility the \CFA{} resolver selects the lowest-cost interpretation of the cast argument for which a conversion or coercion to the target type exists (upcasting to !unsigned long long! in the example above, due to the lack of unsafe downcasts), only using the cost of the conversion itself as a tie-breaker. |
---|
138 | For example, in !int x; double x; (int)x;!, both declarations have zero-cost interpretations as !x!, but the !int x! interpretation is cheaper to cast to !int!, and is thus selected. |
---|
139 | Thus, in contrast to the lazy resolution of nested function-call expressions discussed above, where final interpretations for each subexpression are not chosen until the top-level expression is reached, cast expressions introduce eager resolution of their argument subexpressions, as if that argument was itself a top-level expression. |
---|
140 | |
---|
141 | \section{Resolution Algorithms} |
---|
142 | |
---|
143 | \CFA{} expression resolution is not, in general, polynomial in the size of the input expression, as shown in Section~\ref{resn-analysis-sec}. |
---|
144 | While this theoretical result is daunting, its implications can be mitigated in practice. |
---|
145 | \CFACC{} does not solve one instance of expression resolution in the course of compiling a program, but rather thousands; therefore, if the worst case of expression resolution is sufficiently rare, worst-case instances can be amortized by more-common easy instances for an acceptable overall runtime. |
---|
146 | Secondly, while a programmer \emph{can} deliberately generate a program designed for inefficient compilation\footnote{see for instance \cite{Haberman16}, which generates arbitrarily large \CC{} template expansions from a fixed-size source file.}, source code tends to follow common patterns. |
---|
147 | Programmers generally do not want to run the full compiler algorithm in their heads, and as such keep mental shortcuts in the form of language idioms. |
---|
148 | If the compiler can be tuned to handle idiomatic code more efficiently, then the reduction in runtime for idiomatic (but otherwise difficult) resolution instances can make a significant difference in total compiler runtime. |
---|
149 | |
---|
150 | \subsection{Worst-case Analysis} \label{resn-analysis-sec} |
---|
151 | |
---|
152 | Expression resolution has a number of components which contribute to its runtime, including argument-parameter type unification, recursive traversal of the expression tree, and satisfaction of type assertions. |
---|
153 | |
---|
154 | If the bound type for a type variable can be looked up or mutated in constant time (as asserted in Table~\ref{env-bounds-table}), then the runtime of the unification algorithm to match an argument to a parameter is proportional to the complexity of the types being unified. |
---|
155 | In C, complexity of type representation is bounded by the most-complex type explicitly written in a declaration, effectively a small constant; in \CFA{}, however, polymorphism can generate more-complex types: |
---|
156 | |
---|
157 | \begin{cfa} |
---|
158 | forall(otype T) pair(T) wrap(T x, T y); |
---|
159 | |
---|
160 | wrap(wrap(wrap(1, 2), wrap(3, 4)), wrap(wrap(5, 6), wrap(7, 8))); |
---|
161 | \end{cfa} |
---|
162 | |
---|
163 | To resolve the outermost !wrap!, the resolver must check that !pair(pair(int))! unifies with itself, but at three levels of nesting, !pair(pair(int))! is more complex than either !pair(T)! or !T!, the types in the declaration of !wrap!. |
---|
164 | Accordingly, the cost of a single argument-parameter unification is $O(d)$, where !d! is the depth of the expression tree, and the cost of argument-parameter unification for a single candidate for a given function call expression is $O(pd)$, where $p$ is the number of parameters. |
---|
165 | |
---|
166 | Considering the recursive traversal of the expression tree, polymorphism again greatly expands the worst-case runtime. |
---|
167 | Letting $i$ be the number of candidate declarations for each function call, if all of these candidates are monomorphic then there are no more than $i$ unambiguous interpretations of the subexpression rooted at that function call. |
---|
168 | Ambiguous minimal-cost subexpression interpretations may also be collapsed into a single \emph{ambiguous interpretation}, as the presence of such a subexpression interpretation in the final solution is an error condition. |
---|
169 | One safe pruning operation during expression resolution is to discard all subexpression interpretations with greater-than-minimal cost for their return type, as such interpretations will never beat the minimal-cost interpretation with their return type for the overall optimal solution. |
---|
170 | As such, with no polymorphism each declaration will generate no more than one minimal-cost interpretation with its return type, so the number of possible subexpression interpretations is $O(i)$ (note that in C, which lacks overloading, $i \leq 1$). |
---|
171 | With polymorphism, however, a single declaration (like !wrap! above) can have many concrete return types after type variable substitution, and could in principle have a different concrete return type for each combination of argument interpretations. |
---|
172 | Calculated recursively, the bound on the total number of candidate interpretations is $O(i^{p^d})$, each with a distinct type. |
---|
173 | |
---|
174 | Given these calculations of number of subexpression interpretations and matching costs, the upper bound on runtime for generating candidates for a single subexpression $d$ levels up from the leaves is $O( i^{p^d} \cdot pd )$. |
---|
175 | Since there are $O(p^d)$ subexpressions in a single top-level expression, the total worst-case cost of argument-parameter matching with the overloading and polymorphism features of \CFA{} is $O( i^{p^d} \cdot pd \cdot p^d )$. |
---|
176 | |
---|
177 | This already high bound does not yet account for the cost of assertion resolution, though. |
---|
178 | % continue from here |
---|
179 | |
---|
180 | % need to discuss that n == p^d, that that i^p^d bound is **super** rare, and will usually be closer to $i$ |
---|
181 | |
---|
182 | % missing a +1 somewhere in these bounds; also, pd is O(n) () |
---|
183 | |
---|
184 | \subsection{Related Work} |
---|
185 | |
---|
186 | \subsection{Contributions} |
---|
187 | |
---|
188 | One improvement I have implemented to the assertion resolution scheme in \CFACC{} is to change the search for satisfying declarations from depth-first to breadth-first. |
---|
189 | If there are $n$ candidates to satisfy a single assertion, the candidate with the lowest conversion cost among those that have all their assertions satisfied should be chosen. |
---|
190 | Bilson's \CFACC{} would immediately attempt to recursively satisfy the assertions of each candidate that passed type unification; this work is wasted if that candidate is not selected to satisfy the assertion, and the wasted work may be quite substantial if the candidate function produces deeply-recursive assertions. |
---|
191 | I have modified \CFACC{} to first sort assertion satisfaction candidates by conversion cost, and only resolve recursive assertions until a unique minimal-cost candidate or an ambiguity is detected. |
---|
192 | |
---|
193 | \section{Experimental Results} |
---|
194 | |
---|
195 | % use Jenkins daily build logs to rebuild speedup graph with more data |
---|
196 | |
---|
197 | % look back at Resolution Algorithms section for threads to tie up "does the algorithm look like this?" |
---|
198 | |
---|
199 | \section{Conclusion \& Future Work} |
---|
200 | |
---|
201 | I have experimented with using expression resolution rather than type unification to choose assertion resolutions; this path should be investigated further in future work. |
---|
202 | This approach is more flexible than type unification, allowing for conversions to be applied to functions to satisfy assertions. |
---|
203 | Anecdotally, this flexibility matches user-programmer expectations better, as small type differences (\eg{} the presence or absence of a reference type, or the usual conversion from !int! to !long!) no longer break assertion satisfaction. |
---|
204 | Practically, the resolver prototype uses this model of assertion satisfaction, with no apparent deficit in performance; the generated expressions that are resolved to satisfy the assertions are easier than the general case because they never have nested subexpressions, which eliminates much of the theoretical differences between unification and resolution. |
---|
205 | The main challenge to implement this approach in \CFACC{} would be applying the implicit conversions generated by the resolution process in the code-generation for the thunk functions that \CFACC{} uses to pass type assertions with the proper signatures. |
---|
206 | |
---|
207 | % Discuss possibility of parallel subexpression resolution |
---|
208 | |
---|
209 | % Mention relevance of work to C++20 concepts |
---|