Context Navigation

Reverse Diff

Changes in / [72e2ea0:5ae36ed]

Location:

doc/aaron_comp_II

Files:

: 4 deleted
: 1 edited

comp_II.tex (modified) (4 diffs)
conversion_dag.eps (deleted)
conversion_dag.odg (deleted)
resolution_dag.eps (deleted)
resolution_dag.odg (deleted)

Legend:

: Unmodified
: Added
: Removed

doc/aaron_comp_II/comp_II.tex

-              r72e2ea0
+              r5ae36ed
 \setlength{\headsep}{0.25in}
-\usepackage{caption}
-\usepackage{subcaption}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \newcommand{\bigO}[1]{O\!\left( #1 \right)}
+\newcommand{\bigO}[1]{O\left( #1 \right)}
 \begin{document}
 …
 If cross-argument resolution dependencies cannot be completely eliminated, effective caching strategies to reduce duplicated work between equivalent argument-parameter matches in different combinations may mitigate the asymptotic defecits of the whole-combination matching approach.
 The final area of investigation is heuristics and algorithmic approaches to reduce the number of argument interpretations considered in the common case; if argument-parameter matches cannot be made independent, even small reductions in $i$ should yield significant reductions in the $i^{p+1}$ resolver runtime factor.
 The discussion below presents a number of largely orthagonal axes for expression resolution algorithm design to be investigated, noting prior work where applicable.
-Though some of the proposed improvements to the expression resolution algorithm are based on heuristics rather than asymptoticly superior algorithms, it should be noted that user programmers often employ idioms and other programming patterns to reduce the mental burden of producing correct code, and if these patterns can be identified and exploited by the compiler then the significant reduction in expression resolution time for common, idiomatic expressions should result in lower total compilation time even for code including difficult-to-resolve expressions that push the expression resolver to its theoretical worst case.
 \subsection{Argument-Parameter Matching}
+The first axis for consideration is argument-parameter matching direction --- whether the type matching for a candidate function to a set of candidate arguments is directed by the argument types or the parameter types.
+All expression resolution algorithms form a DAG of interpretations, some explicitly, some implicitly; in this DAG, arcs point from function-call interpretations to argument interpretations, as below:
+\begin{figure}[h]
+\centering
+\begin{subfigure}[h]{2in}
+\begin{lstlisting}
+int *p;  // $p_i$
+char *p; // $p_c$
+double *f(int*, int*); // $f_d$
+char *f(char*, char*); // $f_c$
+f( f( p, p ), p );
+\end{lstlisting}
+\end{subfigure}~\begin{subfigure}[h]{2in}
+\includegraphics{resolution_dag}
+\end{subfigure}
+\caption{Resolution DAG for a simple expression. Functions that do not have a valid argument matching are covered with an \textsf{X}.}\label{fig:res_dag}
+\end{figure}
+Note that some interpretations may be part of more than one super-interpretation, as with $p_i$ in the bottom row, while some valid subexpression interpretations, like $f_d$ in the middle row, are not used in any interpretation of their containing expression.
+\subsubsection{Argument-directed (Bottom-up)}
+Baker's algorithm for expression resolution~\cite{Baker82} pre-computes argument candidates, from the leaves of the expression tree up.
+The first axis we consider is argument-parameter matching --- whether the type matching for a candidate function to a set of candidate arguments is directed by the argument types or the parameter types.
+\subsubsection{Argument-directed (``Bottom-up'')}
+Baker's algorithm for expression resolution\cite{Baker82} pre-computes argument candidates, from the leaves of the expression tree up.
 For each candidate function, Baker attempts to match argument types to parameter types in sequence, failing if any parameter cannot be matched.
 Bilson~\cite{Bilson03} similarly pre-computes argument candidates in the original \CFA compiler, but then explicitly enumerates all possible argument combinations for a multi-parameter function; these argument combinations are matched to the parameter types of the candidate function as a unit rather than individual arguments.
 This approach is less efficient than Baker's approach, as the same argument may be compared to the same parameter many times, but allows a more straightforward handling of polymorphic type-binding and multiple return-types.
 It is possible the efficiency losses here relative to Baker could be significantly reduced by keeping a memoized cache of argument-parameter type comparisons and reading previously-seen argument-parameter matches from this cache rather than recomputing them.
 \subsubsection{Parameter-directed (Top-down)}
 Unlike Baker and Bilson, Cormack's algorithm~\cite{Cormack81} requests argument candidates that match the type of each parameter of each candidate function, from the top-level expression down; memoization of these requests is presented as an optimization.
+Bilson\cite{Bilson03} similarly pre-computes argument candidates in the original \CFA compiler, but then explicitly enumerates all possible argument combinations for a multi-parameter function; these argument combinations are matched to the parameter types of the candidate function as a unit rather than individual arguments.
+This is less efficient than Baker's approach, as the same argument may be compared to the same parameter many times, but allows a more straightforward handling of polymorphic type binding and multiple return types.
+It is possible the efficiency losses here relative to Baker could be significantly reduced by application of memoization to the argument-parameter type comparisons.
+\subsubsection{Parameter-directed (``Top-down'')}
+Unlike Baker and Bilson, Cormack's algorithm\cite{Cormack81} requests argument candidates which match the type of each parameter of each candidate function, from the top-level expression down; memoization of these requests is presented as an optimization.
 As presented, this algorithm requires the result of the expression to have a known type, though an algorithm based on Cormack's could reasonably request a candidate set of any return type, though such a set may be quite large.
 \subsubsection{Hybrid}
 This proposal includes the investigation of hybrid top-down/bottom-up argument-parameter matching.
+A reasonable hybrid approach might take a top-down approach when the expression to be matched has a fixed type, and a bottom-up approach in untyped contexts.
+This approach may involve switching from one type to another at different levels of the expression tree.
+For instance:
+A reasonable hybrid approach might be to take a top-down approach when the expression to be matched is known to have a fixed type, and a bottom-up approach in untyped contexts.
+This may include switches from one type to another at different levels of the expression tree, for instance:
 \begin{lstlisting}
 forall(otype T)
 …
 int x = f( f( '!' ) );
 \end{lstlisting}
+The outer call to ©f© must have a return type that is (implicitly convertable to) ©int©, so a top-down approach is used to select \textit{(1)} as the proper interpretation of ©f©. \textit{(1)}'s parameter ©x©, however, is an unbound type variable, and can thus take a value of any complete type, providing no guidance for the choice of candidate for the inner call to ©f©. The leaf expression ©'!'©, however, determines a zero-cost interpretation of the inner ©f© as \textit{(2)}, providing a minimal-cost expression resolution where ©T© is bound to ©void*©.
+Deciding when to switch between bottom-up and top-down resolution to minimize wasted work in a hybrid algorithm is a necessarily heuristic process, and though finding good heuristics for which subexpressions to swich matching strategies on is an open question, one reasonable approach might be to set a threshold $t$ for the number of candidate functions, and to use top-down resolution for any subexpression with fewer than $t$ candidate functions, to minimize the number of unmatchable argument interpretations computed, but to use bottom-up resolution for any subexpression with at least $t$ candidate functions, to reduce duplication in argument interpretation computation between the different candidate functions.
+\subsubsection{Common Subexpression Caching}
+With any of these argument-parameter approaches, it may be a useful optimization to cache the resolution results for common subexpressions; in Figure~\ref{fig:res_dag} this optimization would result in the list of interpretations $[p_c, p_i]$ for ©p© only being calculated once, and re-used for each of the three instances of ©p©.
+Here, the outer call to ©f© must have a return type that is (implicitly convertable to) ©int©, so a top-down approach could be used to select \textit{(1)} as the proper interpretation of ©f©. \textit{(1)}'s parameter ©x© here, however, is an unbound type variable, and can thus take a value of any complete type, providing no guidance for the choice of candidate for the inner ©f©. The leaf expression ©'!'©, however, gives us a zero-cost interpretation of the inner ©f© as \textit{(2)}, providing a minimal-cost expression resolution where ©T© is bound to ©void*©.
+Deciding when to switch between bottom-up and top-down resolution in a hybrid algorithm is a necessarily heuristic process, and though finding good heuristics for it is an open question, one reasonable approach might be to switch from top-down to bottom-up when the number of candidate functions exceeds some threshold.
 \subsection{Implicit Conversion Application}
 Baker's and Cormack's algorithms do not account for implicit conversions\footnote{Baker does briefly comment on an approach for handling implicit conversions.}; both assume that there is at most one valid interpretation of a given expression for each distinct type.
+Baker's\cite{Baker82} and Cormack's\cite{Cormack81} algorithms do not account for implicit conversions\footnote{Baker does briefly comment on an approach for handling implicit conversions.}; both assume that there is at most one valid interpretation of a given expression for each distinct type.
 Integrating implicit conversion handling into their algorithms provides some choice of implementation approach.

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changes in / [72e2ea0:5ae36ed]

Legend:

doc/aaron_comp_II/comp_II.tex

Download in other formats: