Ignore:
File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/aaron_comp_II/comp_II.tex

    rce76eb9 re394618  
    3737\setlength{\headsep}{0.25in}
    3838
     39\usepackage{caption}
     40\usepackage{subcaption}
     41
    3942%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    4043
     
    6265%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    6366
    64 \newcommand{\bigO}[1]{O\left( #1 \right)}
     67\newcommand{\bigO}[1]{O\!\left( #1 \right)}
    6568
    6669\begin{document}
     
    404407If cross-argument resolution dependencies cannot be completely eliminated, effective caching strategies to reduce duplicated work between equivalent argument-parameter matches in different combinations may mitigate the asymptotic defecits of the whole-combination matching approach.
    405408The final area of investigation is heuristics and algorithmic approaches to reduce the number of argument interpretations considered in the common case; if argument-parameter matches cannot be made independent, even small reductions in $i$ should yield significant reductions in the $i^{p+1}$ resolver runtime factor.
     409
    406410The discussion below presents a number of largely orthagonal axes for expression resolution algorithm design to be investigated, noting prior work where applicable.
     411Though some of the proposed improvements to the expression resolution algorithm are based on heuristics rather than asymptoticly superior algorithms, it should be noted that user programmers often employ idioms and other programming patterns to reduce the mental burden of producing correct code, and if these patterns can be identified and exploited by the compiler then the significant reduction in expression resolution time for common, idiomatic expressions should result in lower total compilation time even for code including difficult-to-resolve expressions that push the expression resolver to its theoretical worst case.
    407412
    408413\subsection{Argument-Parameter Matching}
    409 The first axis we consider is argument-parameter matching --- whether the type matching for a candidate function to a set of candidate arguments is directed by the argument types or the parameter types.
    410 
    411 \subsubsection{Argument-directed (``Bottom-up'')}
    412 Baker's algorithm for expression resolution\cite{Baker82} pre-computes argument candidates, from the leaves of the expression tree up.
     414The first axis for consideration is argument-parameter matching direction --- whether the type matching for a candidate function to a set of candidate arguments is directed by the argument types or the parameter types.
     415All expression resolution algorithms form a DAG of interpretations, some explicitly, some implicitly; in this DAG, arcs point from function-call interpretations to argument interpretations, as below:
     416\begin{figure}[h]
     417\centering
     418\begin{subfigure}[h]{2in}
     419\begin{lstlisting}
     420int *p;  // $p_i$
     421char *p; // $p_c$
     422
     423double *f(int*, int*); // $f_d$
     424char *f(char*, char*); // $f_c$
     425
     426f( f( p, p ), p );
     427\end{lstlisting}
     428\end{subfigure}~\begin{subfigure}[h]{2in}
     429\includegraphics{resolution_dag}
     430\end{subfigure}
     431\caption{Resolution DAG for a simple expression. Functions that do not have a valid argument matching are covered with an \textsf{X}.}\label{fig:res_dag}
     432\end{figure}
     433
     434Note that some interpretations may be part of more than one super-interpretation, as with $p_i$ in the bottom row, while some valid subexpression interpretations, like $f_d$ in the middle row, are not used in any interpretation of their containing expression.
     435
     436\subsubsection{Argument-directed (Bottom-up)}
     437Baker's algorithm for expression resolution~\cite{Baker82} pre-computes argument candidates, from the leaves of the expression tree up.
    413438For each candidate function, Baker attempts to match argument types to parameter types in sequence, failing if any parameter cannot be matched.
    414439
    415 Bilson\cite{Bilson03} similarly pre-computes argument candidates in the original \CFA compiler, but then explicitly enumerates all possible argument combinations for a multi-parameter function; these argument combinations are matched to the parameter types of the candidate function as a unit rather than individual arguments.
    416 This is less efficient than Baker's approach, as the same argument may be compared to the same parameter many times, but allows a more straightforward handling of polymorphic type binding and multiple return types.
    417 It is possible the efficiency losses here relative to Baker could be significantly reduced by application of memoization to the argument-parameter type comparisons.
    418 
    419 \subsubsection{Parameter-directed (``Top-down'')}
    420 Unlike Baker and Bilson, Cormack's algorithm\cite{Cormack81} requests argument candidates which match the type of each parameter of each candidate function, from the top-level expression down; memoization of these requests is presented as an optimization.
     440Bilson~\cite{Bilson03} similarly pre-computes argument candidates in the original \CFA compiler, but then explicitly enumerates all possible argument combinations for a multi-parameter function; these argument combinations are matched to the parameter types of the candidate function as a unit rather than individual arguments.
     441This approach is less efficient than Baker's approach, as the same argument may be compared to the same parameter many times, but allows a more straightforward handling of polymorphic type-binding and multiple return-types.
     442It is possible the efficiency losses here relative to Baker could be significantly reduced by keeping a memoized cache of argument-parameter type comparisons and reading previously-seen argument-parameter matches from this cache rather than recomputing them.
     443
     444\subsubsection{Parameter-directed (Top-down)}
     445Unlike Baker and Bilson, Cormack's algorithm~\cite{Cormack81} requests argument candidates that match the type of each parameter of each candidate function, from the top-level expression down; memoization of these requests is presented as an optimization.
    421446As presented, this algorithm requires the result of the expression to have a known type, though an algorithm based on Cormack's could reasonably request a candidate set of any return type, though such a set may be quite large.
    422447
    423448\subsubsection{Hybrid}
    424449This proposal includes the investigation of hybrid top-down/bottom-up argument-parameter matching.
    425 A reasonable hybrid approach might be to take a top-down approach when the expression to be matched is known to have a fixed type, and a bottom-up approach in untyped contexts.
    426 This may include switches from one type to another at different levels of the expression tree, for instance:
     450A reasonable hybrid approach might take a top-down approach when the expression to be matched has a fixed type, and a bottom-up approach in untyped contexts.
     451This approach may involve switching from one type to another at different levels of the expression tree.
     452For instance:
    427453\begin{lstlisting}
    428454forall(otype T)
     
    433459int x = f( f( '!' ) );
    434460\end{lstlisting}
    435 Here, the outer call to ©f© must have a return type that is (implicitly convertable to) ©int©, so a top-down approach could be used to select \textit{(1)} as the proper interpretation of ©f©. \textit{(1)}'s parameter ©x© here, however, is an unbound type variable, and can thus take a value of any complete type, providing no guidance for the choice of candidate for the inner ©f©. The leaf expression ©'!'©, however, gives us a zero-cost interpretation of the inner ©f© as \textit{(2)}, providing a minimal-cost expression resolution where ©T© is bound to ©void*©.
    436 
    437 Deciding when to switch between bottom-up and top-down resolution in a hybrid algorithm is a necessarily heuristic process, and though finding good heuristics for it is an open question, one reasonable approach might be to switch from top-down to bottom-up when the number of candidate functions exceeds some threshold.
     461The outer call to ©f© must have a return type that is (implicitly convertable to) ©int©, so a top-down approach is used to select \textit{(1)} as the proper interpretation of ©f©. \textit{(1)}'s parameter ©x©, however, is an unbound type variable, and can thus take a value of any complete type, providing no guidance for the choice of candidate for the inner call to ©f©. The leaf expression ©'!'©, however, determines a zero-cost interpretation of the inner ©f© as \textit{(2)}, providing a minimal-cost expression resolution where ©T© is bound to ©void*©.
     462
     463Deciding when to switch between bottom-up and top-down resolution to minimize wasted work in a hybrid algorithm is a necessarily heuristic process, and though finding good heuristics for which subexpressions to swich matching strategies on is an open question, one reasonable approach might be to set a threshold $t$ for the number of candidate functions, and to use top-down resolution for any subexpression with fewer than $t$ candidate functions, to minimize the number of unmatchable argument interpretations computed, but to use bottom-up resolution for any subexpression with at least $t$ candidate functions, to reduce duplication in argument interpretation computation between the different candidate functions.
     464
     465\subsubsection{Common Subexpression Caching}
     466With any of these argument-parameter approaches, it may be a useful optimization to cache the resolution results for common subexpressions; in Figure~\ref{fig:res_dag} this optimization would result in the list of interpretations $[p_c, p_i]$ for ©p© only being calculated once, and re-used for each of the three instances of ©p©.
    438467
    439468\subsection{Implicit Conversion Application}
    440 Baker's\cite{Baker82} and Cormack's\cite{Cormack81} algorithms do not account for implicit conversions\footnote{Baker does briefly comment on an approach for handling implicit conversions.}; both assume that there is at most one valid interpretation of a given expression for each distinct type.
     469Baker's and Cormack's algorithms do not account for implicit conversions\footnote{Baker does briefly comment on an approach for handling implicit conversions.}; both assume that there is at most one valid interpretation of a given expression for each distinct type.
    441470Integrating implicit conversion handling into their algorithms provides some choice of implementation approach.
    442471
Note: See TracChangeset for help on using the changeset viewer.