# Changeset b44a7c5

Ignore:
Timestamp:
Jul 26, 2016, 10:06:53 AM (7 years ago)
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, ctor, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, memory, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
c967ef9
Parents:
ef3b335
Message:

Start expression resolution section of comp II

File:
1 edited

### Legend:

Unmodified
 ref3b335 \section{Expression Resolution} % TODO cite Baker, Cormack, etc. \subsection{Argument-Directed} \subsection{Parameter-Directed} \textbf{TODO: Richard's algorithm isn't Baker (Cormack?), disentangle from this section \ldots}. The expression resolution algorithm used by the existing iteration of {CFA-CC} is based on Baker's\cite{Baker82} algorithm for overload resolution in Ada. The essential idea of this algorithm is to first find the possible interpretations of the most deeply nested subexpressions, then to use these interpretations to recursively generate valid interpretations of their superexpressions. To simplify matters, the only expressions considered in this discussion of the algorithm are function application and literal expressions; other expression types can generally be considered to be variants of one of these for the purposes of the resolver, \eg variables are essentially zero-argument functions. If we consider expressions as graph nodes with arcs connecting them to their subexpressions, these expressions form a DAG, generated by the algorithm from the bottom up. Literal expressions are represented by leaf nodes, annotated with the type of the expression, while a function application will have a reference to the function declaration chosen, as well as arcs to the interpretation nodes for its argument expressions; functions are annotated with their return type (or types, in the case of multiple return values). \textbf{TODO: Figure} Baker's algorithm was designed to account for name overloading; Richard Bilson\cite{Bilson03} extended this algorithm to also handle polymorphic functions, implicit conversions \& multiple return types when designing the original \CFA compiler. The core of the algorithm is a function which Baker refers to as $gen\_calls$. $gen\_calls$ takes as arguments the name of a function $f$ and a list containing the set of possible subexpression interpretations $S_j$ for each argument of the function and returns a set of possible interpretations of calling that function on those arguments. The subexpression interpretations are generally either singleton sets generated by the single valid interpretation of a literal expression, or the results of a previous call to $gen\_calls$. If there are no valid interpretations of an expression, the set returned by $gen\_calls$ will be empty, at which point resolution can cease, since each subexpression must have at least one valid interpretation to produce an interpretation of the whole expression. On the other hand, if for some type $T$ there is more than one valid interpretation of an expression with type $T$, all interpretations of that expression with type $T$ can be collapsed into a single \emph{ambiguous expression} of type $T$, since the only way to disambiguate expressions is by their return types. If a subexpression interpretation is ambiguous, than any expression interpretation containing it will also be ambiguous. In the variant of this algorithm including implicit conversions, the interpretation of an expression as type $T$ is ambiguous only if there is more than one \emph{minimal-cost} interpretation of the expression as type $T$, as cheaper expressions are always chosen in preference to more expensive ones. Given this description of the behaviour of $gen\_calls$, its implementation is quite straightforward: for each function declaration $f_i$ matching the name of the function, consider each of the parameter types $p_j$ of $f_i$, attempting to match the type of an element of $S_j$ to $p_j$ (this may include checking of implicit conversions). If no such element can be found, there is no valid interpretation of the expression using $f_i$, while if more than one such (minimal-cost) element is found than an ambiguous interpretation with the result type of $f_i$ is produced. In the \CFA variant, which includes polymorphic functions, it is possible that a single polymorphic function definition $f_i$ can produce multiple valid interpretations by different choices of type variable bindings; these interpretations are unambiguous so long as the return type of $f_i$ is different for each type binding. If all the parameters $p_j$ of $f_i$ can be uniquely matched to a candidate interpretation, then a valid interpretation based on $f_i$ and those $p_j$ is produced. $gen\_calls$ collects the produced interpretations for each $f_i$ and returns them; a top level expression is invalid if this list is empty, ambiguous if there is more than one (minimal-cost) result, or if this single result is ambiguous, and valid otherwise. In this implementation, resolution of a single top-level expression takes time $O(\ldots)$, where \ldots. \textbf{TODO:} \textit{Look at 2.3.1 in Richard's thesis when working out complexity; I think he does get the Baker algorithm wrong on combinations though, maybe\ldots} \textbf{TODO: Basic Lit Review} \textit{Look at 2.4 in Richard's thesis for any possible more-recent citations of Baker\ldots} \textit{Look back at Baker's related work for other papers that look similar to what you're doing, then check their citations as well\ldots} \textit{Look at Richard's citations in 2.3.2 w.r.t. type data structures\ldots} \textit{CormackWright90 seems to describe a solution for the same problem, mostly focused on how to find the implicit parameters} \section{Proposal} \section{Completion Timeline}