Ignore:
Timestamp:
Jan 23, 2019, 5:46:49 PM (6 years ago)
Author:
Aaron Moss <a3moss@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
3deb316
Parents:
104a13e
Message:

thesis: further analysis of expression resolution

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/aaron_moss_PhD/phd/resolution-heuristics.tex

    r104a13e re451c0f  
    164164Accordingly, the cost of a single argument-parameter unification is $O(d)$, where !d! is the depth of the expression tree, and the cost of argument-parameter unification for a single candidate for a given function call expression is $O(pd)$, where $p$ is the number of parameters.
    165165
     166Implicit conversions are also checked in argument-parameter matching, but the cost of checking for the existence of an implicit conversion is again proportional to the complexity of the type, $O(d)$.
     167Polymorphism again introduces a potential expense here; for a monomorphic function there is only one potential implicit conversion from argument type to parameter type, while if the parameter type is an unbound polymorphic type variable then any implicit conversion from the argument type could potentially be considered a valid binding for that type variable.
     168\CFA{}, however, requires exact matches for the bound type of polymorphic parameters, removing this problem.
     169An interesting question for future work is whether loosening this requirement incurs significant runtime cost in practice.
     170
    166171Considering the recursive traversal of the expression tree, polymorphism again greatly expands the worst-case runtime.
    167172Letting $i$ be the number of candidate declarations for each function call, if all of these candidates are monomorphic then there are no more than $i$ unambiguous interpretations of the subexpression rooted at that function call.
     
    174179Given these calculations of number of subexpression interpretations and matching costs, the upper bound on runtime for generating candidates for a single subexpression $d$ levels up from the leaves is $O( i^{p^d} \cdot pd )$.
    175180Since there are $O(p^d)$ subexpressions in a single top-level expression, the total worst-case cost of argument-parameter matching with the overloading and polymorphism features of \CFA{} is $O( i^{p^d} \cdot pd \cdot p^d )$.
     181Since the size of the expression is $O(p^d)$, letting $n = p^d$ this simplifies to $O(i^n \cdot n^2)$
    176182
    177183This already high bound does not yet account for the cost of assertion resolution, though.
    178 % continue from here
    179 
    180 % need to discuss that n == p^d, that that i^p^d bound is **super** rare, and will usually be closer to $i$
    181 
    182 % missing a +1 somewhere in these bounds; also, pd is O(n) ()
     184\CFA{} uses type unification on the assertion type and the candidate declaration type to test assertion satisfaction; this unification calculation has cost proportional to the complexity of the declaration type after substitution of bound type variables; as discussed above, this cost is $O(d)$.
     185If there are $O(a)$ type assertions on each declaration, there are $O(i)$ candidates to satisfy each assertion, for a total of $O(ai)$ candidates to check for each declaration.
     186However, each assertion candidate may generate another $O(a)$ assertions, recursively until the assertion recursion limit $r$ is reached, for a total cost of $O((ai)^r \cdot d)$.
     187Now, $a$, $i$, and $r$ are properties of the set of declarations in scope, or the language spec in the case of $r$, so $(ai)^r$ is essentially a constant, albeit a very large one.
     188It is not uncommon in \CFA{} to have functions with dozens of assertions, and common function names (\eg{} !?{}!, the constructor) can have hundreds of overloads.
     189
     190It is clear that assertion resolution costs can be very large, and in fact a method for heuristically reducing them is one of the key contributions of this thesis, but it should be noted that the worst-case analysis is a particularly poor match for actual code in the case of assertions.
     191It is reasonable to assume that most code compiles without errors, as in an actively-developed project the code will be compiled many times, generally with relatively few new errors introduced between compiles.
     192However, the worst-case bound for assertion resolution is based on recursive assertion satisfaction exceeding the limit, which is an error case.
     193In practice, then, the depth of recursive assertion satisfaction should be bounded by a small constant for error-free code, which will account for the vast majority of problem instances.
     194
     195Similarly, uses of polymorphism like those that generate the $O(d)$ bound on unification or the $O(i^{p^d})$ bound on number of candidates are particular enough to be rare, but not completely absent.
     196This analysis points to type unification, argument-parameter matching, and assertion satisfaction as potentially costly elements of expression resolution, and thus potentially profitable targets for tuning on realistic data.
     197Type unification is discussed in Chapter~\ref{env-chap}, while the other aspects are covered below.
     198
     199% also, unification of two classes is not particularly cheap ... the bounds above may be optimistic
    183200
    184201\subsection{Related Work}
Note: See TracChangeset for help on using the changeset viewer.