% source: doc/theses/fangren_yu_MMath/content2.tex @ f5bf3c2
% Last change 985ff5f, checked in by Peter A. Buhr <pabuhr@…>: small proofreading changes
\chapter{Resolution Algorithms}
\label{c:content2}

The \CFA type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types are supported, and generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
in addition, C's multiple implicit type-conversions must be respected.
This generality leads to internal complexity and correspondingly higher compilation cost, directly related to type resolution.
The reason is that the type resolver must analyze \emph{each} component of an expression with many possible forms of overloading, type restrictions, and conversions.
Designing a ruleset that behaves as expected, \ie matches C programmer expectations, and implementing an efficient algorithm for it is a very challenging task.

My first work on the \CFA type-system was as a co-op student.
At that time, compiling a medium-sized \CFA program using advanced polymorphism took multiple minutes (5+ minutes)~\cite[\S~5]{Yu20}.
I found a number of type-resolution problems, which resulted in the following changes to the type-resolution algorithm.
\begin{enumerate}[itemsep=0pt]
\item
new AST data structure
\item
special symbol table and argument-dependent lookup
\item
late assertion-satisfaction
\item
revised function-type representation
\item
skip pruning on expressions for function types
\end{enumerate}
\VRef[Table]{t:SelectedFileByCompilerBuild} shows improvements for selected tests with accumulated reductions in compile time across the 5 fixes.
The large reduction in compilation time significantly improved the development of the \CFA runtime because of its frequent compilation cycles.

\begin{table}[htb]
\centering
\caption{Compile time of selected files by compiler build in seconds}
\label{t:SelectedFileByCompilerBuild}
\lstset{deletekeywords={mutex,thread}}
\setlength{\extrarowheight}{1pt}
\vspace*{-4pt}
\begin{tabular}{l|r|rrrrr}
				& \multicolumn{1}{c|}{Original} & \multicolumn{5}{c}{Accumulative fixes}	\\
\cline{3-7}
Test case		& \multicolumn{1}{c|}{resolver} & \multicolumn{1}{c}{1} & \multicolumn{1}{c}{2} & \multicolumn{1}{c}{3} & \multicolumn{1}{c}{4} & \multicolumn{1}{c}{5} \\
\hline
@lib/fstream@	& 86.4	& 25.2	& 10.8	& 9.5	& 7.8	& 7.1	\\
@lib/mutex@		& 210.4	& 77.4	& 16.7	& 15.1	& 12.6	& 11.7	\\
@lib/vector@	& 17.2	& 8.9	& 3.1	& 2.8	& 2.4	& 2.2	\\
@lib/stdlib@	& 16.6	& 8.3	& 3.2	& 2.9	& 2.6	& 2.4	\\
@test/io2@		& 300.8	& 53.6	& 43.2	& 27.9	& 19.1	& 16.3	\\
@test/thread@	& 210.9	& 73.5	& 17.0	& 15.1	& 12.6	& 11.8	\\
\end{tabular}
\medskip
\newline
Results are the average of 5 runs (3 runs if time exceeds 100 seconds)
\end{table}

Since then, many new features utilizing the expressiveness of \CFA's type system have been implemented, such as generic container types similar to those in \CC's standard template library.
During the development of multiple \CFA programs and libraries, more weaknesses and design flaws have been discovered within the type system.
Some of those problems arise from the newly introduced language features described in the previous chapter.
In addition, fixing unexpected interactions within the type system has presented challenges.
This chapter describes in detail the type-resolution rules currently in use and some major problems that have been identified.
Not all of those problems have immediate solutions, because fixing them may require redesigning parts of the \CFA type system at a larger scale, which correspondingly affects the language design.


\section{Expression Cost-Model}

\CFA has used a total-order expression cost-model to resolve the ambiguity of overloaded expressions from the very beginning.
Most \CFA operators can be overloaded\footnote{
\CFA excludes overloading the comma operator, the short-circuit logical operators \lstinline{&&} and \lstinline{||}, and the ternary conditional operator \lstinline{?}, all of which are short-hand control structures rather than operators.
\CC overloads the comma and short-circuit logical operators, where the overloaded comma is esoteric and the overloaded short-circuit operators do not exhibit short-circuit semantics, which is confusing.};
hence, they are treated the same way as other function calls, with the same rules for overload resolution.

In \CFA, candidates for an overloaded expression are ranked by numerical cost elements, which account for any type conversions needed from a call-site argument type to the matching function parameter type, as well as polymorphic types and restrictions introduced by the @forall@ clause.
Currently, the expression cost has the following components, ranked from highest to lowest.
\begin{enumerate}
\item \textbf{Unsafe} cost representing a narrowing conversion of arithmetic types, \eg @int@ to @short@, and qualifier-dropping conversions for pointer and reference types.
\item \textbf{Polymorphic} cost for a function parameter type that is or contains a polymorphic type variable.
\item \textbf{Safe} cost representing a widening conversion, \eg @short@ to @int@, qualifier-adding conversions for pointer and reference types, and value conversion for enumeration constants.
\item \textbf{Variable} cost that counts the number of polymorphic variables, if any, introduced by the @forall@ clause in the function declaration.
\item \textbf{Specialization} cost counting the number of restrictions introduced by type assertions.
\end{enumerate}
Cost tuples are compared in lexicographical order, from unsafe (highest) to specialization (lowest), with ties resolved by moving to the next lower element.
At the subexpression level, the lowest-cost candidate for each result type is included as a possible interpretation of the expression;
at the top level, all possible interpretations of different types are considered (generating a total ordering) and the overall lowest cost is selected as the final interpretation of the expression.
Glen Ditchfield first proposed this costing model~\cite[\S~4.4.5]{Ditchfield92} to generate a resolution behaviour that is reasonable to C programmers based on existing conversions in the C programming language.
This model carried over into the first implementation of the \CFA type-system by Richard Bilson~\cite[\S~2.2]{Bilson03}, and was extended but not redesigned by Aaron Moss~\cite[chap.~4]{Moss19}.
Moss's work began to show problems with the underlying costing model;
these design issues are part of this work.
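
As a small illustration of the lexicographical comparison, consider the following hypothetical overloads:
\begin{cfa}
void g( short );		$\C[2.5in]{// 1 unsafe cost: int to short narrows}$
void g( long );			$\C{// 2 safe cost: int to long widens}$
g( 42 );				$\C{// selects 2, since any unsafe cost outranks any safe cost}\CRT$
\end{cfa}
Option 1 has cost tuple (1, 0, 0, 0, 0) and option 2 has (0, 0, 1, 0, 0); because the unsafe element is most significant, option 2 compares lower and is selected.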

\begin{comment}
From: Richard Bilson <rcbilson@gmail.com>
Date: Tue, 10 Dec 2024 09:54:59 -0500
Subject: Re: cost model
To: "Peter A. Buhr" <pabuhr@fastmail.fm>

I didn't invent it although I may have refined it somewhat. The idea of the
conversion cost is from Glen's thesis, see for instance page 90

On Tue, Dec 10, 2024 at 9:35AM Peter A. Buhr <pabuhr@fastmail.fm> wrote:
> Cforall has a costing model based on an N-tuple using lexicographical ordering.
> (unsafe, polymorphic, safe, variable, specialization)
>
> Did you invent this costing model as a cheap and cheerful way to encode Glen's
> type system?

From: Richard Bilson <rcbilson@gmail.com>
Date: Tue, 10 Dec 2024 17:04:26 -0500
Subject: Re: cost model
To: "Peter A. Buhr" <pabuhr@fastmail.fm>

Yes, I think that's fair to say.

On Tue, Dec 10, 2024 at 2:22PM Peter A. Buhr <pabuhr@fastmail.fm> wrote:
> Found it thanks. But Glen never implemented anything (small examples). So it
> feels like you took his conversion-cost idea and created an implementation table
> with the lexicographical comparison.
\end{comment}

In many languages that support function/method overloading, such as \CC and Java, a partial-order system decides which interpretation of an expression gets selected.
Hence, the candidates are sometimes incomparable (none is considered a better match), and only when one candidate is considered better than all others (maximal with respect to the partial order) is the expression unambiguous.
Specifically, the resolution algorithms used in \CC and Java are greedy, selecting the best match for each sub-expression without considering the higher-level ones (bottom-up).
Therefore, at each resolution step, the arguments are already given unique interpretations, so the ordering only needs to compare different sets of conversion targets (function parameter types) on the same inputs.
For example, the following \CC call is rejected as ambiguous, because each candidate is better for one argument and worse for the other, making the candidates incomparable:
\begin{cfa}
void f( int, double );		$\C[2.5in]{// 1}$
void f( double, int );		$\C{// 2}$
f( 1, 1 );					$\C{// ambiguous: 1 better on first argument, 2 better on second}\CRT$
\end{cfa}

In \CFA, trying to use such a system is problematic because of the presence of return-type overloading of functions and variables.
Specifically, \CFA expression resolution considers multiple interpretations of argument subexpressions with different types, \eg:
\begin{cfa}
int x;					$\C[2.5in]{// 1}$
double x;				$\C{// 2}$
void f( int );			$\C{// 3}$
void f( double );		$\C{// 4}$
f( x );					$\C{// 3 applied to 1, or 4 applied to 2: both exact matches}\CRT$
\end{cfa}
so it is possible that both the selected function and the set of arguments are different, and hence the interpretations cannot be compared with a partial-ordering system.
This situation arises often in \CFA, even in the simple expression @f(x)@, where both the function name @f@ and the variable name @x@ are overloaded.

Ada is another programming language that has overloading based on return type.
Although Ada also allows implicit type conversions of function arguments, it is fairly conservative in resolving ambiguities.
For example (an illustrative sketch), an overloaded Ada function is disambiguated by the type expected from the context, and rejected when the context does not determine a unique interpretation:
\begin{cfa}
function F return Integer;		-- 1
function F return Float;		-- 2
I : Integer := F;				-- selects 1 from the expected type Integer
X : Float := Float( F );		-- ambiguous: the conversion accepts either overload
\end{cfa}
There are only two preference rules in Ada overload resolution, one for primitive arithmetic operators and one for universal access types (analogous to @void *@ in C);
any other case where an expression has multiple legal interpretations is considered ambiguous.
The current overload resolution system for \CFA is at the other end of the spectrum, as it tries to order every legal interpretation of an expression and chooses the best one according to cost, occasionally giving unexpected results.

Interestingly, the \CFA cost-based model can sometimes make expression resolution too permissive, because it always attempts to select the lowest-cost option, and only when multiple options tie at the lowest cost does it report the expression as ambiguous.
Because there are so many elements in the cost tuple, the type system ``tries too hard to discover a match'', and therefore, ties are uncommon.
Other than the case of multiple exact matches, where all have cost zero, candidates that are incomparable under a partial ordering by specialization can often have different expression costs, since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections.

There are currently at least three situations where the polymorphic cost element of the cost model does not yield a candidate selection that is clearly justifiable, and one of them is simply wrong.
\begin{enumerate}[leftmargin=*]
\item Polymorphic exact match versus non-polymorphic inexact match.
\begin{cfa}
forall( T ) void f( T );		$\C[2.5in]{// 1}$
void f( long );					$\C{// 2}$
f( 42 );						$\C{// currently selects 2}\CRT$
\end{cfa}
Under the current cost model, option 1 incurs a polymorphic cost from matching the argument type @int@ to type variable @T@, and option 2 incurs a safe cost from integer promotion of type @int@ to @long@.
Since polymorphic cost is ranked above safe conversion cost, option 2 is considered to have lower cost and gets selected.

In contrast, the template deduction and overload resolution rules in \CC select option 1 (converting @forall@ to the equivalent function template declaration).
\CC performs template argument deduction and overload candidate ranking in two separate steps.
\begin{itemize}
\item
In the first step, the type parameters are deduced for each primary function template, and if the corresponding template instantiation succeeds, the resulting function prototype is added to the resolution candidate set.
\item
In the second step, the implicit conversions (if any) applied to argument types are compared after taking away top-level qualifiers and references.
\CC then prefers an exact match, followed by basic type promotions (roughly corresponding to safe conversions in \CFA), and then other kinds of conversions (roughly corresponding to unsafe conversions in \CFA).
Only when the type conversions are the same does it prioritize a non-template candidate.
\end{itemize}
In this example, option 1 produces the prototype @void f( int )@, which gives an exact match and therefore takes priority.
The \CC resolution rules effectively make option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations; however, they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types below @long@.
This difference could be explained as compensating for \CFA polymorphic functions being separately compiled versus template inlining;
hence, calling them requires passing type information and assertions, increasing the runtime cost.
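
The \CC behaviour can be seen directly in the equivalent template declarations (a sketch of this example in \CC):
\begin{cfa}
template< typename T > void f( T );		$\C[2.5in]{// 1}$
void f( long );							$\C{// 2}$
f( 42 );								$\C{// deduces T = int for 1, an exact match, so 1 is selected}\CRT$
\end{cfa}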
\item
Having a lower total polymorphic cost does not always mean a function is more specialized.
The following example is from Moss's thesis, which discusses some improvements to the \CFA cost-model.
He claims the following function prototypes are increasingly more constrained:
\begin{cfa}
forall( T, U ) void f( T, U );	$\C[2.75in]{// 1 polymorphic}$
forall( T ) void f( T, T );		$\C{// 2 less polymorphic}$
forall( T ) void f( T, int );	$\C{// 3 even less polymorphic}$
forall( T ) void f( T *, int );	$\C{// 4 least polymorphic}\CRT$
\end{cfa}
This argument is not entirely correct.
Although it is true that both the sequences 1, 2 and 1, 3, 4 are increasingly more constrained on the argument types, option 2 is not comparable to either option 3 or 4;
they actually describe independent constraints on the two arguments.
Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@.
Because the two constraints can be satisfied independently, neither should be considered a better match when trying to resolve a call to @f@ with argument types @(int, int)@;
reporting such an expression as ambiguous is more appropriate.
The example illustrates the limitation of using a numerical cost value, as it cannot represent these complicated cases.
\item
A generic type can require more type variables to describe a more specific type.
For example, a generic function taking a @pair@-type requires two type variables.
\begin{cfa}
forall( T, U ) void f( pair( T, U ) ); $\C[2.75in]{// 1}$
\end{cfa}
Add a function taking any type.
\begin{cfa}
forall( T ) void f( T );		$\C{// 2}\CRT$
\end{cfa}
Passing a @pair@ variable to @f@ gives a cost of 1 poly, 2 variable for the @pair@ overload, versus a cost of 1 poly, 1 variable for the unconstrained overload.
Programmer expectation is to select the former, but the cost model selects the latter;
either could work.
As a result, simply counting the number of polymorphic type variables is no longer a correct way to order the function candidates by how constrained they are.
\end{enumerate}

These inconsistencies are not easily solvable in the current cost-model, meaning the current \CFA codebase has to work around these defects.
One potential solution is to mix the conversion cost and a \CC-like partial ordering of specializations.
For example, observe that the first three elements (unsafe, polymorphic and safe conversions) in the \CFA cost-tuple are related to the argument/parameter types, while the other two elements (polymorphic variable and assertion counts) are properties of the function declaration.
Using this observation, it is reasonable to have an ordering that compares the argument/parameter conversion costs first and uses the partial ordering of specializations as a tiebreaker.
Hence, the \CC template-specialization ordering can be applied to \CFA with a slight modification.

In the meantime, some other improvements have been made to the expression cost system.
Most notably, the expression resolution algorithm now consistently uses the globally minimal cost interpretation, as discussed in a previous technical report.
While implementing the change, two detailed issues needed to be addressed for the new rules to fully work.

The first deals with a common idiom in \CFA that creates many overloads with equal cost.
Many C library functions have multiple versions for different argument types.
For example, the absolute-value function is defined for basic arithmetic types with different names, since C does not support overloading.
\begin{cquote}
\begin{tabular}{@{}lll@{}}
\begin{cfa}
int abs( int );
long labs( long );
long long int llabs( long long int );
\end{cfa}
&
\begin{cfa}
double fabs( double );
float fabsf( float );
long double fabsl( long double );
\end{cfa}
&
\begin{cfa}
cabs, cabsf, $and$ cabsl
$for$ _Complex

\end{cfa}
\end{tabular}
\end{cquote}
It is cumbersome for programmers to remember these function names and select the correct one.
If an incorrect version is picked, the program compiles but with potential negative consequences, such as using an integral version with a floating-point argument.
In \CFA, these functions are wrapped by functions with the overloaded name @abs@, which results in multiple overloads with the same total cost when some conversion is needed.
For example, @long x = abs( 42 )@ resolves to @long abs( long )@ with @int@ argument 42 converted to @long@, or @int abs( int )@ with the result converted to @long@.
In this example, the choice could be arbitrary because both yield identical results.
However, for @int i = abs( -9223372036854775807LL )@, the result is @-1@, where some kind of type error is preferable to silently generating a logically incorrect result.
There are multiple overload groupings of C functions into a single \CFA name, so usage should not report an ambiguity or warning unless there is a significant chance of error.
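
The wrapper idiom can be sketched as follows (illustrative only; the library wrappers may differ in detail):
\begin{cfa}
long int abs( long int v ) { return labs( v ); }		$\C[3in]{// wraps C labs}$
double abs( double v ) { return fabs( v ); }			$\C{// wraps C fabs}$
long x = abs( 42 );		$\C{// tie: convert 42 to long for abs( long ), or convert the int result of C abs to long}\CRT$
\end{cfa}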

While testing the effects of the updated cost rule, the following example was found in the \CFA standard library.
\begin{cfa}
static inline double __to_readyQ_avg( unsigned long long intsc ) {
	if ( unlikely( 0 == intsc ) ) return 0.0;
	else return log2( @intsc@ ); // implicit conversion happens here
}
\end{cfa}
This helper function is used for performance logging as part of computing a geometric mean;
it is called during summing of logarithmic values.
However, the function name @log2@ is overloaded in \CFA for both integer and floating-point types.
In this case, the integer overload returns an integral result, truncating any small fractional part of the logarithm, so the sum is slightly incorrect.
When experimenting with the updated cost rules, the resolver flagged the @log2@ call as an ambiguous expression.
When asked, the developer expected the floating-point overload because of return-type overloading.
This mistake went unnoticed because the truncated component was insignificant in the performance logging.
\PAB{Not sure I understand this: The conclusion is that matching the return type higher up in the expression tree is better, in cases where the total expression cost is equal.}

Another change addresses the issue that C arithmetic expressions have unique meanings, governed by the usual arithmetic conversion rules.
\begin{enumerate}[leftmargin=*,topsep=5pt,itemsep=4pt]
\item
First, if the corresponding real type of either operand is @long double@, the other operand is converted, without change of type domain, to a type whose corresponding real type is @long double@.
\item
Otherwise, if the corresponding real type of either operand is @double@, the other operand is converted, without change of type domain, to a type whose corresponding real type is @double@.
\item
Otherwise, if the corresponding real type of either operand is @float@, the other operand is converted, without change of type domain, to a type whose corresponding real type is @float@.\footnote{
For example, addition of a \lstinline{double _Complex} and a \lstinline{float} entails just the conversion of the \lstinline{float} operand to \lstinline{double} (and yields a \lstinline{double _Complex} result).}
\item
Otherwise, the integer promotions are performed on both operands.
Then the following rules are applied to the promoted operands:
\begin{enumerate}[topsep=5pt,itemsep=4pt]
\item
If both operands have the same type, then no further conversion is needed.
\item
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
\item
\label{p:SignedToUnsignedSafe}
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
\item
\label{p:UnsignedToSignedUnsafe}
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
\item
\label{p:Miscellaneous}
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

\hfill\cite[\S~6.3.1.8]{C11}
\end{enumerate}
\end{enumerate}
\VRef[Figure]{f:CExpressionConversions} shows the C arithmetic conversions graphically.
\VRef[Rule]{p:SignedToUnsignedSafe} converts a signed type to an unsigned type of greater or equal rank, while \VRef[rule]{p:UnsignedToSignedUnsafe} converts an unsigned type to a signed type that can represent all of its values.
Therefore, these two rules cover every possible mixed-sign case.
\VRef[Rule]{p:Miscellaneous} is the catch-all to accommodate any kind of exotic integral representation.
These conversions are applied greedily at local points within an expression.
Because there is no overloading in C, except for builtin operators, no cost model is needed to differentiate among alternatives that could result in ambiguous matches.
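
For example, the signed-to-unsigned rule produces the following well-known C behaviour (a small illustrative fragment, assuming 32-bit @int@):
\begin{cfa}
int i = -1;
unsigned int u = 1;
if ( i < u ) ...	$\C{// false: i is converted to unsigned, so the comparison is 4294967295 < 1}$
\end{cfa}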

\begin{figure}
\input{C_expression_conversion.pstex_t}
\caption{C Expression Conversions: T1 operator T2}
\label{f:CExpressionConversions}

\smallskip
\input{CFA_curr_arithmetic_conversion.pstex_t}
\caption{\CFA Total-Ordering Expression Conversions}
\label{f:CFACurrArithmeticConversions}

\smallskip
\input{CFA_arithmetic_conversion.pstex_t}
\caption{\CFA Partial-Ordering Expression Conversions}
\label{f:CFAArithmeticConversions}
\end{figure}

\VRef[Figure]{f:CFACurrArithmeticConversions} shows the current \CFA total-order arithmetic-conversions graphically.
Here, the unsafe cost of signed to unsigned is factored into the ranking, so a safe conversion is selected over an unsafe one.
Furthermore, an integral option is taken before considering a floating option.
This model locally matches the C approach, but provides an ordering when there are many overloaded alternatives.
However, as Moss pointed out, overload resolution by total cost has problems, \eg handling cast expressions.
\begin{cquote}
\ldots if a cast argument has an unambiguous interpretation as a conversion argument then it must be interpreted as such, even if the ascription interpretation would have a lower overall cost.
This is demonstrated in the following example, adapted from the C standard library:
\begin{cfa}
unsigned long long x;
(unsigned)( x >> 32 );
\end{cfa}
\vspace{5pt}
In C semantics, this example is unambiguously upcasting 32 to @unsigned long long@, performing the shift, then downcasting the result to @unsigned@, at cost (1, 0, 3, 1, 0, 0, 0).
If ascription were allowed to be a first-class interpretation of a cast expression, it would be cheaper to select the @unsigned@ interpretation of @?>>?@ by downcasting @x@ to @unsigned@ and upcasting 32 to @unsigned@, at a total cost of (1, 0, 1, 1, 0, 0, 0).
\PAB{This example feels incorrect. Assume a word size of 4 bits (long long). Given the value 1001, shift it >> 2, gives 10, which fits in a half word (int). Now truncate 1001 to a halfword 01 and shift it >> 2, gives 00. So the lower-cost alternative does generate the correct result. Basically truncation is an unsafe operation and hence has a huge negative cost.}
However, this break from C semantics is not backwards compatible, so to maintain C compatibility, the \CFA resolver selects the lowest-cost interpretation of the cast argument for which a conversion or coercion to the target type exists (upcasting to @unsigned long long@ in the example above, due to the lack of unsafe downcasts), using the cost of the conversion itself only as a tie-breaker.\cite[pp.~46-47]{Moss19}
\end{cquote}
However, a cast expression is not necessary to produce such an inconsistency with C semantics.
An implicit argument-parameter type conversion in a function call can replicate this issue without an explicit cast.
\begin{cfa}
unsigned long long x;
void f( unsigned );
f( x >> 32 );
\end{cfa}
The argument conversion has the same effect as using an explicit cast to coerce the type of expression @x >> 32@ to @unsigned@.
This example shows the problem does not come from cast expressions, but from modelling the C builtin operators as overloaded functions.
As a result, a different rule is used to select the builtin function candidates to fix this problem:
if an expression has any legal interpretations as a C builtin operation, only the lowest-cost one is kept, regardless of the result type.

\VRef[Figure]{f:CFAArithmeticConversions} shows an alternative \CFA partial-order arithmetic-conversions graphically.
The idea here is to first look for the best integral alternative, because integral calculations are exact and cheap.
If no integral solution is found, then there are different rules to select among floating-point alternatives.
This approach reduces the search space by partitioning: first look only at integral operations, and then only at floating-point operations.


\section{Type Unification}

Type unification is the algorithm that assigns a value to each free type variable such that the types of the provided arguments and function parameters match.

\CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing).
This restriction makes the unification problem more tractable in \CFA, as the argument types at each call site are usually all specified.
There is a single exception: when the function return-type contains a free type variable that does not occur in any of the argument types, the type variable must be bound by the parent expression into which the result is passed.
For example (using a sketch of the \CFA @malloc@ prototype), the return-type variable is bound from the initialization context:
\begin{cfa}
forall( T & | sized( T ) ) T * malloc( void );
int * p = malloc();		$\C{// T bound to int from the left-hand side}$
\end{cfa}
A top-level expression whose type still contains an unbound type variable is considered ill-formed, as such an expression is inherently ambiguous.

The unification algorithm in \CFA was originally presented in Richard Bilson's thesis and remains the basis of the algorithm currently in use.
Some additions have been made to accommodate the newly added type features of the language.
To summarize, \CFA type unification has two minor variants: an \textit{exact} mode and an \textit{inexact} mode.
The inexact mode is applied at top-level argument-parameter matching, and attempts to find an assignment to the type variables such that the argument types can be converted to the parameter types with minimal cost, as defined in the previous section.
The exact mode is required since the type-matching algorithm operates recursively and the inner types often have to match exactly; for example, there is no common type for the pointer types @int *@ and @long *@, while there is for @int@ and @long@.
With the introduction of generic record types, the type parameters must match exactly as well; currently, neither covariance nor contravariance is supported for generics.
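
The distinction between the two modes can be seen in the following sketch:
\begin{cfa}
forall( T ) void g( T, T );
int i;  long l;
g( l, l );			$\C[3in]{// T => long, exact match}$
g( i, l );			$\C{// T => long, inexact mode widens int to long at a safe cost}$
int * pi;  long * pl;
g( pi, pi );		$\C{// T => int *}$
g( pi, pl );		$\C{// error: pointed-to types must unify exactly}\CRT$
\end{cfa}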

One simplification was made to the \CFA language that makes modelling the type system easier: polymorphic function-pointer types are no longer allowed in declarations.
Polymorphic function declarations themselves are still treated as function pointer types internally; however, the change means that formal parameter types can no longer be polymorphic.
Previously, it was possible to write function prototypes such as
\begin{cfa}
void f( forall( T | { T -?( T ); } ) T (*p)( T, T ) );
\end{cfa}
Notably, the unification algorithm implemented in the \CFA compiler has never correctly handled the assertion parameters on such formal types, and the problem of determining whether two assertion sets are compatible may well be undecidable in general, given the ability to synthesize more complicated types by nesting generics.
Ultimately, the reason for disallowing such constructs is that they mostly do not provide useful type features for actual programming tasks.
A subroutine of a program operates on the arguments provided at the call site together with (if any) local and global variables, and even though the subroutine can be polymorphic, the types are supplied at each call site.
On each invocation, the types to be operated on can be determined from the arguments provided, and therefore there should not be a need to pass a polymorphic function pointer, which can take any type in principle.
377
378For example, consider a polymorphic function that takes one argument of type @T@ and another pointer to a subroutine
379\begin{cfa}
380forall (T) void f (T x, forall (U) void (*g) (U));
381\end{cfa}
382Making @g@ polymorphic in this context would almost certainly be unnecessary, since it can only be called inside the body of @f@ and the types of the argument would have been known anyways, although it can potentially depend on @T@.
383Moreover, requesting a function parameter to be able to potentially work on any input type at all would always impose too much constraint on the arguments, as it only needs to make each calls inside the body of @f@ valid.
384
Rewriting the prototype to
\begin{cfa}
forall (T) void f (T x, void (*g) (T));
\end{cfa}
is sufficient (or potentially, with some compound type synthesized from @T@), in which case @g@ is no longer polymorphic by itself.
This ``monomorphization'' conversion is readily supported in \CFA, either explicitly, by assigning a polymorphic function name to a compatible function-pointer type, or implicitly, when deducing assertion parameters (discussed in the next section).
The technique applies directly to argument passing, which is essentially just assignment to function parameter variables.
There may be edge cases where the supplied subroutine @g@ is called with arguments of different types inside the body of @f@, and hence must be declared polymorphic, but such use cases are rare and the benefit of allowing them appears minimal in practice.
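As a small illustrative sketch (the function names here are hypothetical, not from the \CFA library), the monomorphization conversion allows a polymorphic function to be used wherever a compatible monomorphic function pointer is expected:
\begin{cfa}
forall (T) void print_one (T x);         // polymorphic function
void (*pi) (int) = print_one;            // explicit monomorphization: T bound to int
forall (T) void f (T x, void (*g) (T));
f (42, print_one);                       // implicit monomorphization at argument passing: T = int
\end{cfa}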

The result of this change is that the unification algorithm no longer needs to distinguish open and closed type-variables, as the latter can no longer exist.
The only type variables that need to be handled are those introduced by the @forall@ clause of the function prototype.
The subtype relationship between function types is now also redundant, since no function parameter or return type can be polymorphic, and neither basic types nor non-polymorphic function types are subtypes of any other type.
Therefore, the goal of (exact) type unification simply becomes finding a substitution that produces identical types.
The assertion set that needs to be resolved is also always just the declarations on the function prototype, which somewhat simplifies the assertion-satisfaction algorithm, as discussed further in the next section.
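As a minimal sketch of this simplified goal (the declarations are hypothetical), exact unification just finds a substitution that makes the parameter types identical to the argument types:
\begin{cfa}
forall (T) void h (T x, T * p);
int i;  int * ip;
h (i, ip);   // substituting T = int makes the parameter types (int, int *) identical to the argument types
\end{cfa}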

The type-unification results are stored in a type environment data structure, which represents all the type variables currently in scope as equivalence classes, together with their bound types and some extra information, such as whether the bound type is allowed to be opaque (\ie a forward declaration without a definition in scope), and whether the bounds are allowed to be widened.
In the more general approach commonly used in functional languages, the unification variables are given a lower bound and an upper bound to account for covariance and contravariance of types.
\CFA currently does not implement any variance with its generic types and does not allow polymorphic function types; therefore, no explicit upper bound is needed and a single binding value for each equivalence class suffices.
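A minimal sketch (the field and type names are hypothetical, not the actual compiler data structure) of the information recorded per equivalence class:
\begin{cfa}
struct EqvClass {
	TypeVarSet vars;       // type variables unified into this class
	Type * bound;          // the single bound type, if any
	bool allowWidening;    // may the binding be widened by conversions?
	bool allowOpaque;      // may the bound type be a forward declaration?
};
\end{cfa}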
However, since type conversions are allowed in \CFA, the type environment needs to keep track of which type variables permit conversions.
This behavior is notably different from \CC template-argument deduction, which enforces an exact match everywhere unless the template arguments are explicitly given.
For example, a polymorphic maximum function in \CFA can be called with arguments of different arithmetic types, with the result following the usual arithmetic conversion rules, while such an expression is not allowed by \CC:
\begin{cfa}
forall (T | {int ?<? (T, T); }) T max (T, T);

max (42, 3.14); // OK, T=double; requires explicit type annotation in C++ such as max<double>(42, 3.14);
\end{cfa}
The current \CFA documentation does not include a formal set of rules for type unification.
In practice, the algorithm implemented in the \CFA translator can be summarized as follows: given a function signature @forall@$(T_1, ..., T_n)\, f(p_1, ..., p_m)$ and argument types $(a_1, ..., a_m)$, the unification algorithm performs the following steps:\footnote{This assumes argument tuples are already expanded to their individual components.}
\begin{enumerate}
\item The type environment is initialized as the union of all type environments of the arguments, plus $(T_1, ..., T_n)$ as free variables.
The inclusion of argument environments serves the purpose of resolving polymorphic return types that need to be deduced.
\item Initially, all type variables
\end{enumerate}


\section{Satisfaction of Assertions}

The assertion satisfaction problem greatly increases the complexity of \CFA expression resolution.
Past experiments have shown that the majority of time is spent resolving the assertions of those expressions that take the longest to resolve.
Even though a few heuristic-based optimizations have been introduced into the compiler, this remains the most costly part of compiling a \CFA program.
The major difficulty of resolving assertions is that the problem can become recursive, since the expression used to satisfy an outstanding assertion can have its own assertions, and in theory this can go on indefinitely.
Detecting infinite recursion in the general case is not algorithmically possible and is not attempted in the compiler.
Instead, a fixed maximum depth of recursive assertions is imposed.
This approach is also taken by \CC compilers, as template-argument deduction is similarly undecidable in general.
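As a hedged sketch of how the recursion can arise (the declaration is contrived), an assertion may be satisfiable only by the polymorphic function that introduced it, generating a new assertion at every step:
\begin{cfa}
forall (T | { void f (T *); }) void f (T);
f (42);   // f(int) needs f(int *), which only the polymorphic f can supply,
          // in turn needing f(int **), then f(int ***), and so on;
          // the fixed recursion depth cuts off this chain
\end{cfa}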


In previous versions of \CFA, this limit was set at 4; as the compiler has become more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8, and on most occasions it does not lead to trouble.
Very rarely, a case arises where the infinite recursion produces an exponentially growing assertion set, wasting minutes of time before the limit is reached.
Fortunately, it is very hard to run into this situation with realistic \CFA code, and the cases that were found all share some clear characteristics, which can be prevented by some clever tricks.
In fact, some of the performance optimizations come from analyzing these problematic cases.
One such example is presented later in this section.

While the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques used for assertion-free expression resolution no longer possible.
The most significant impact is that type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means that if one expression has multiple associated assertions, they are not independent: the changes to the type environment must be compatible for all the assertions to be resolved.
In particular, if one assertion parameter can be resolved in multiple different ways, all the results need to be checked to make sure the changes to the type-variable bindings are compatible with the other assertions being resolved.
A naive algorithm that simply picks any pending assertion to resolve and continues in a depth-first search can be very inefficient and especially prone to falling into an infinite loop, while in many cases this can be avoided by examining other assertions that provide insight into the desired type binding: if an assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking.


The algorithm currently used in the \CFA compiler was designed by Aaron Moss through a simplified prototype experiment that captures most \CFA type-system features, and then ported back to the actual language.
It can be described as a mix of breadth- and depth-first search in a staged approach.


To resolve a set of assertions, the algorithm first attempts to resolve each assertion item individually.
There are three possible outcomes when resolving each assertion:
\begin{enumerate}
\item If no matches are found, the algorithm terminates with a failure.
\item If exactly one match is found, the type environment is updated immediately and used in resolving any remaining assertions.
\item If multiple matches are found, the assertion candidates with their updated type environments are stored in a list that is checked for compatibility at the end.
\end{enumerate}
When all assertion items are resolved successfully, the algorithm attempts to combine the ambiguously resolved assertions to produce mutually compatible assignments.
If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until none remain pending or the recursion limit is reached, resulting in a failure.
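A small sketch (all declarations hypothetical) of a selected candidate introducing a new assertion that is resolved in a further round:
\begin{cfa}
forall (T | { void h (T); }) void g (T);
forall (U | { void k (U); }) void h (U);   // candidate for h(int), introduces assertion k(int)
void k (int);                              // satisfies the new assertion
g (42);   // round 1 resolves h(int) with the polymorphic h;
          // round 2 resolves its own assertion k(int)
\end{cfa}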

It has been discovered in practice that the efficiency of such an algorithm can sometimes be very sensitive to the order in which assertions are resolved.
Suppose an unbound type variable @T@ appears in two assertions: one can be uniquely resolved, allowing the type @T@ to be inferred immediately, while the other can be resolved in many ways, each resulting in @T@ being bound to a different concrete type.
If the first assertion is examined first by the algorithm, the deduced type can then be utilized in resolving the second assertion, eliminating many incorrect options without producing a list of candidates pending further checks.
In practice, this has a significant impact when an unbound type @T@ is declared to satisfy the basic \emph{object assertions}\footnote{The term is borrowed from object-oriented languages, although \CFA is not object-oriented in principle.} of having default constructor, destructor, and copy-assignment operations.
Since these are defined for every type currently in scope, there are often hundreds or even thousands of matches to these functions with an unspecified operand type; in most cases, the value of @T@ can be deduced by resolving another assertion first, which then allows the specific object-lifetime functions to be looked up directly, since they are sorted internally by operand type, greatly reducing the number of wasted resolution attempts.

Currently, this issue also limits the capability of the assertion-resolution algorithm.
Assertion matching is implemented to be more restrictive than expression resolution in general, in that the parameter types must match exactly, rather than merely being callable.
If a function declaration includes an assertion @void f(T)@ and only an @f(long)@ is currently in scope, trying to resolve the assertion with @T=int@ does not work.
Loosening the assertion-matching requirement causes many assertion variables to have multiple matches and makes the delayed combination step too costly.

Given all the issues caused by assertion resolution potentially creating new type-variable bindings, a natural progression is to restrict free type variables such that all type variables are bound by the time an expression reaches the assertion-resolution stage.
A type variable introduced by the @forall@ clause of a function declaration can appear in parameter types, return types, and assertion variables.
If it appears in the parameter types, it is bound when matching the arguments to parameters at the call site.
If it appears only in the return type, it can in principle be determined from the context eventually.
The current implementation in the \CFA compiler does not perform enough return-type deduction because it does eager assertion resolution, and the return-type information cannot be known in general before the parent expression is resolved, unless the expression is in an initialization context, where the type of the variable being initialized is known.
By delaying assertion resolution until the return type becomes known, this problem can be circumvented.
The truly problematic case occurs when a type variable appears in neither the parameter nor return types and only in some assertion variables.
Such cases are very rare, and it is not evident that forcing every type variable to appear at least once in the parameter or return types significantly limits the expressiveness of the \CFA type system.
In the next chapter, I discuss a proposal to include type declarations in traits, rather than having all type variables appear in the trait parameter list, which could provide functionality equivalent to an unbound type parameter in assertion variables, and also address some of the variable cost issue discussed in Section~4.1.


\subsection*{Caching Assertion Resolution Results}

In Aaron Moss's prototype design and experiments, a potential optimization of caching the results of already-resolved assertions is discussed.
Based on the experimental results, this approach can improve the performance of expression resolution in general, and sometimes allows hard instances of the assertion resolution problem to be solved that are otherwise infeasible, for example, when the resolution would encounter an infinite loop.

What makes this approach tricky to implement correctly is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
If results that cause the type bindings to be modified were cached, it would be necessary to store the changes to the type bindings too; and in cases where multiple candidates can be used to satisfy one assertion parameter, all of them need to be cached, including those not eventually selected, since the side effects can produce different results depending on the context.

In the original design of \CFA, which includes unrestricted polymorphic formal parameters that can have assertions on themselves, the problem is even more complicated, as new declarations can be introduced into scope during expression resolution.
Here is one such example taken from Bilson:
\begin{cfa}
void f( forall( T | { T -?( T ); } ) T (*p)( T, T ) );
forall( U, V | { U -?( U ); V -?( V ); } ) U g( U, V );
f( g );
\end{cfa}
The inner assertion parameter on the \textit{closed} type variable @T@ is used to satisfy the assertions on @U@ and @V@ in this example.

However, as per the previous discussion on this topic, polymorphic function-pointer types have been removed from \CFA, since correctly implementing assertion matching for them is not possible in general.
Without closed parameters (and therefore no have-set for assertions), the set of declarations in scope remains unchanged while resolving any expression.
The current \CFA implementation also does not attempt to widen any already-bound type parameters to satisfy an assertion.
Note that this restriction means certain kinds of expressions cannot be resolved, for example:
\begin{cfa}
forall (T | {void f(T);}) void g(T);
void f (long);
g(42);
\end{cfa}
The call @g(42)@ is rejected since no attempt is made to widen the parameter type @T@ from @int@ to @long@.
This problem could be mitigated by allowing inexact matches of assertions, but it cannot be eliminated completely when @T@ is matched inside a parameterized type, including pointers and references:
\begin{cfa}
forall (T | {void f(T*);}) void g(T);
void f (long *);
g(42);
\end{cfa}
Here, the only way to resolve the call @g(42)@ is to allow assertion resolution to widen the parameter @T@, since even with inexact matching, @int*@ cannot be converted to @long*@.


\section{Compiler Implementation Considerations}