\chapter{Resolution Heuristics}
\label{resolution-chap}

The main task of the \CFACC{} type-checker is \emph{expression resolution}: determining which declarations the identifiers in each expression correspond to. Resolution is a straightforward task in C, as each declaration has a unique identifier, but in \CFA{} the name-overloading features discussed in Section~\ref{overloading-sec} generate multiple candidate declarations for each identifier. I refer to a given matching between identifiers and declarations in an expression as an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{poly-func-sec} and~\ref{implicit-conv-sec}, each of which increases the number of feasible candidate interpretations. To choose among feasible interpretations, \CFA{} defines a \emph{conversion cost} to rank interpretations; the expression resolution problem is thus to find the unique minimal-cost interpretation for an expression, reporting an error if no such interpretation exists.

\section{Expression Resolution}

\subsection{Conversion Cost}

C does not have an explicit cost model for implicit conversions, but the ``usual arithmetic conversions''\cit{} used to decide which arithmetic operators to apply define one implicitly. Beginning with the work of Bilson\cite{Bilson03}, \CFA{} has defined a \emph{conversion cost} for each function call in a way that generalizes C's conversion rules. Loosely, the conversion cost counts the implicit conversions used by an interpretation; more precisely, it is a lexicographically-ordered tuple in which each element corresponds to a particular kind of conversion. In Bilson's \CFA{} design, conversion cost is a three-tuple, $(unsafe, poly, safe)$, where $unsafe$ is the count of unsafe (narrowing) conversions, $poly$ is the count of polymorphic type bindings, and $safe$ is the sum of the degrees of safe (widening) conversions. The following example lists the cost in Bilson's model of calling each of these functions with two !int! arguments:

\begin{cfa}
void f(char, long);					$\C{// (1,0,1)}$
forall(otype T) void f(T, long);	$\C{// (0,1,1)}$
void f(long, long);					$\C{// (0,0,2)}$
void f(int, unsigned long);			$\C{// (0,0,2)}$
void f(int, long);					$\C{// (0,0,1)}$
\end{cfa}

Note that safe and unsafe conversions are handled differently: \CFA{} counts the ``distance'' of safe conversions (\eg{} !int! to !long! is cheaper than !int! to !unsigned long!), while only counting the number of unsafe conversions (\eg{} !int! to !char! and !int! to !short! both have unsafe cost 1).

As part of adding reference types to \CFA{} (see Section~\ref{type-features-sec}), Schluntz added a new $reference$ element to the cost tuple, which counts the number of implicit reference-to-rvalue conversions performed, so that candidate interpretations can be distinguished by how closely they match the nesting of reference types. Since references are meant to act almost indistinguishably from lvalues, this $reference$ element is the least significant in the lexicographic comparison of cost tuples.

I have also refined the \CFA{} cost model as part of this thesis work. Bilson's \CFA{} cost model includes the cost of polymorphic type bindings from a function's type assertions in the $poly$ element of the cost tuple; this has the effect of making more-constrained functions more expensive than less-constrained functions. However, type assertions actually make a function \emph{less} polymorphic, and as such functions with more type assertions should be preferred in type resolution. As an example, some iterator-based algorithms can work on a forward iterator that only provides an increment operator, but are more efficient on a random-access iterator that can be incremented by an arbitrary number of steps in a single operation. The random-access iterator has more type constraints, but should be chosen whenever those constraints can be satisfied. As such, I have added a $specialization$ element to the \CFA{} cost tuple, the values of which are always negative; each type assertion subtracts 1 from $specialization$, so that more-constrained functions are chosen over less-constrained functions, all else being equal. A more sophisticated design would define a partial order over sets of type assertions by set inclusion (\ie{} one function would only cost less than another if it has all of the same assertions, plus more, rather than simply more assertions in total), but I did not judge the added complexity of computing and testing this order to be worth the gain in specificity.
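As a concrete sketch of this iterator example, consider the following pair of !advance! overloads; the function name, trait assertions, and operator signatures here are hypothetical illustrations rather than \CFA{} standard-library declarations:

\begin{cfa}
forall(otype T | { T& ++?(T&); })					$\C{// 1 assertion: specialization -1}$
void advance(T& it, int n) { while (n-- > 0) ++it; }	$\C{// forward: n increments}$

forall(otype T | { T& ++?(T&); T& ?+=?(T&, int); })	$\C{// 2 assertions: specialization -2}$
void advance(T& it, int n) { it += n; }				$\C{// random-access: one step}$
\end{cfa}

When both overloads are feasible for a call, the extra assertion gives the random-access version the lower (more negative) $specialization$ cost, so it is selected, as desired.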
I have also incorporated an unimplemented aspect of Ditchfield's earlier cost model. In the example below, adapted from \cite[89]{Ditchfield}, Bilson's cost model only distinguishes the first two cases by accounting extra cost for the extra set of !otype! parameters, which, as discussed above, is not a desirable solution:

\begin{cfa}
forall(otype T, otype U) void f(T, U);	$\C{// polymorphic}$
forall(otype T) void f(T, T);			$\C{// less polymorphic}$
forall(otype T) void f(T, int);			$\C{// even less polymorphic}$
forall(otype T) void f(T*, int);		$\C{// least polymorphic}$
\end{cfa}

I account for the fact that functions with more polymorphic variables are less constrained by introducing a $vars$ cost element that counts the number of type variables on a candidate function. In the example above, the first !f! has $vars = 2$, while the remainder have $vars = 1$.

My new \CFA{} cost model also accounts for a nuance handled by neither Ditchfield nor Bilson, in that it makes the more specific fourth function above cheaper than the more generic third function. The fourth function is presumably somewhat optimized for handling pointers, but the prior \CFA{} cost model could not account for the more specific binding, as it simply counted the number of polymorphic unifications. In my modified model, each level of constraint on a polymorphic type in the parameter list results in a decrement of the $specialization$ cost element; thus, all else being equal, if both a binding to !T! and a binding to !T*! are available, \CFA{} picks the more specific !T*! binding. This process is recursive, such that !T**! produces a specialization cost of $-2$, as opposed to the $-1$ cost for !T*!. It works similarly for generic types, \eg{} !box(T)! also has specialization cost $-1$; for multi-argument generic types, the least-specialized polymorphic parameter sets the specialization cost, \eg{} the specialization cost of !pair(T, S*)! is $-1$ (from !T!) rather than $-2$ (from !S!). Since the programmer provides the function arguments, but cannot provide guidance on the return type, specialization cost is not counted for the return-type list.
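The following sketch summarizes these specialization rules; the !g! overloads are hypothetical, and the generic types !box! and !pair! are assumed to be declared elsewhere:

\begin{cfa}
forall(otype T) void g(T);						$\C{// specialization 0}$
forall(otype T) void g(T*);						$\C{// specialization -1}$
forall(otype T) void g(T**);					$\C{// specialization -2}$
forall(otype T) void g(box(T));					$\C{// specialization -1}$
forall(otype T, otype S) void g(pair(T, S*));	$\C{// -1 (from T), not -2 (from S)}$

int** p;
g(p);	$\C{// resolves to g(T**), the most specific feasible binding}$
\end{cfa}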
The final \CFA{} cost tuple is thus as follows:
\begin{equation*}
	(unsafe, poly, safe, vars, specialization, reference)
\end{equation*}

\subsection{Expression Cost}

Having defined the cost model ...

% mention cast expression as "eager"
% Mention relevance of work to C++20 concepts