Changeset cf01d0b for doc/theses/aaron_moss_PhD
- Timestamp: Apr 23, 2019, 11:52:28 AM
- Branches: ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children: c4b5486
- Parents: 8f55e8e9
- Location: doc/theses/aaron_moss_PhD/phd
- Files: 9 edited
Legend:
- Unmodified lines are prefixed with a space, removed lines with -, added lines with +.
doc/theses/aaron_moss_PhD/phd/background.tex
@@ -17,4 +17,4 @@
 Due to this complexity, the expression-resolution pass in \CFACC{} requires 95\% of compiler runtime on some source files, making an efficient procedure for expression resolution a requirement for a performant \CFA{} compiler.
 
-The selection of features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
+The features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
 In some cases the interactions of multiple features make this design a significantly more complex problem than any individual feature; in other cases a feature that does not by itself add any complexity to expression resolution triggers previously rare edge cases more frequently.

@@ -93,7 +93,7 @@
 The addition of !one_t! allows generic algorithms to handle the unit value uniformly for types where it is meaningful; a simple example of this is that polymorphic functions\footnote{discussed in Section~\ref{poly-func-sec}} in the \CFA{} prelude define !++x! and !x++! in terms of !x += 1!, allowing users to idiomatically define all forms of increment for a type !T! by defining the single function !T& ?+=?(T&, one_t)!; analogous overloads for the decrement operators are also present, and programmers can override any of these functions for a particular type if desired.
 
-\CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and !forall(dtype T) T* 0!.
+\CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and the polymorphic variable !forall(dtype T) T* 0!.
 While designing \CFA{} generic types (see Chapter~\ref{generic-chap}), it was discovered that the parametric polymorphic zero variable is not generalizable to other types; though all null pointers have the same in-memory representation, the same cannot be said of the zero values of arbitrary types.
-As such, variables that could represent !0! and !1! were phased out in favour of functions that could generate those values for a given type as appropriate.
+As such, polymorphic variables, and in particular variables for !0! and !1!, were phased out in favour of functions that could generate those values for a given type as appropriate.
 
 \section{Polymorphic Functions} \label{poly-func-sec}

@@ -149,15 +149,3 @@
 To avoid such infinite loops, \CFACC{} imposes a fixed limit on the possible depth of recursion, similar to that employed by most \CC{} compilers for template expansion; this restriction means that there are some otherwise semantically well-typed expressions that cannot be resolved by \CFACC{}.
 
-% \subsection{Deleted Declarations}
-
-% Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
-
-% \begin{cfa}
-% int somefn(char) = void;
-% \end{cfa}
-
-% This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers. A similar effect can be produced on an ad-hoc basis at the appropriate call sites through use of casts to determine the function type. In both cases, the deleted-function form is clearer and more concise.}.
-% Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
-% If this deleted declaration is selected as the unique minimal-cost interpretation of an expression than an error is produced.
-
 \subsection{Traits}

@@ -195,6 +183,5 @@
 
 Traits, however, are significantly more powerful than nominal-inheritance interfaces; firstly, due to the scoping rules of the declarations that satisfy a trait's type assertions, a type may not satisfy a trait everywhere that that type is declared, as with !char! and the !nominal! trait above.
-Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}.
-Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
+Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}. Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
 
 \begin{cfa}

@@ -217,8 +204,21 @@
 In this example above, !(list_iterator, int)! satisfies !pointer_like! by the user-defined dereference function, and !(list_iterator, list)! also satisfies !pointer_like! by the built-in dereference operator for pointers.
 Given a declaration !list_iterator it!, !*it! can be either an !int! or a !list!, with the meaning disambiguated by context (\eg{} !int x = *it;! interprets !*it! as !int!, while !(*it).value = 42;! interprets !*it! as !list!).
-While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, few such systems could model both relationships simultaneously.
+While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, the author is unaware of any nominal-inheritance system that could model both relationships simultaneously.
+Further comparison of \CFA{} polymorphism with other languages can be found in Section~\ref{generic-related-sec}.
 
 The flexibility of \CFA{}'s implicit trait-satisfaction mechanism provides programmers with a great deal of power, but also blocks some optimization approaches for expression resolution.
-The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
+The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
+
+\subsection{Deleted Declarations}
+
+Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
+
+\begin{cfa}
+int somefn(char) = void;
+\end{cfa}
+
+This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers.}.
+Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
+If this deleted declaration is selected as the unique minimal-cost interpretation of an expression then an error is produced, allowing \CFA{} programmers to guide the expression resolver away from undesirable solutions.
 
 \section{Implicit Conversions} \label{implicit-conv-sec}

@@ -252,5 +252,5 @@
 The C standard makes heavy use of the concept of \emph{lvalue}, an expression with a memory address; its complement, \emph{rvalue} (a non-addressable expression) is not explicitly named in the standard.
 In \CFA{}, the distinction between lvalue and rvalue can be re-framed in terms of reference and non-reference types, with the benefit of being able to express the difference in user code.
-\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!)
+\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!).
 To make reference types more easily usable in legacy pass-by-value code, \CFA{} also adds an implicit rvalue-to-lvalue conversion, implemented by storing the value in a compiler-generated temporary variable and passing a reference to that temporary.
 To mitigate the ``!const! hell'' problem present in \CC{}, there is also a qualifier-dropping lvalue-to-lvalue conversion implemented by copying into a temporary:

@@ -258,11 +258,10 @@
 \begin{cfa}
 const int magic = 42;
-
 void inc_print( int& x ) { printf("%d\n", ++x); }
 
-print_inc( magic ); $\C{// legal; implicitly generated code in red below:}$
+inc_print( magic ); $\C{// legal; implicitly generated code in red below:}$
 
 `int tmp = magic;` $\C{// to safely strip const-qualifier}$
-`print_inc( tmp );` $\C{// tmp is incremented, magic is unchanged}$
+`inc_print( tmp );` $\C{// tmp is incremented, magic is unchanged}$
 
 \end{cfa}

@@ -272,5 +271,5 @@
 \CFA{} supports all of these use cases without further added syntax.
 The key to this syntax-free feature support is an observation made by the author that the address of a reference is a lvalue.
-In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference, and pointers to references by repeating the address-of operator:
+In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference\footnote{The syntactic difference between reference initialization and reference assignment is unfortunate, but preserves the ability to pass function arguments by reference (a reference initialization context) without added syntax.}, and pointers to references by repeating the address-of operator:
 
 \begin{cfa}

@@ -281,5 +280,5 @@
 \end{cfa}
 
-For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of normalization conversions and adapters.
+For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of matching reference to non-reference types during type-checking.
 
 \subsection{Resource Management} \label{ctor-sec}
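The new \subsection{Deleted Declarations} hunk is easier to follow with a worked call. The sketch below is illustrative only: the !twice! overload set is hypothetical, and just the !... = void;! deletion syntax is taken from the changeset.

\begin{cfa}
forall(otype T | { T ?+?(T, T); })
T twice( T x ) { return x + x; }  // generic doubling; would otherwise match char

char twice( char ) = void;        // deleted overload: exempt char from matching

int  i = twice( 2 );              // ok: T bound to int
char c = twice( 'x' );            // error: the deleted monomorphic overload is an
                                  // exact match, so it is the minimal-cost
                                  // interpretation and resolution is rejected
\end{cfa}

Because the deleted exact match costs less than the polymorphic binding, it wins resolution and triggers the error, steering the resolver exactly as the hunk describes.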
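The reference re-binding discussion near the end of this file's diff elides its code example; the following is a minimal sketch of the observation that the address of a reference is an lvalue (variable names are invented, the !&r = &y! form follows the prose):

\begin{cfa}
int x = 2, y = 3;
int & r = x;  // initialization: r aliases x
r = 7;        // assignment through the reference: x is now 7
&r = &y;      // re-binding: assign to the (lvalue) address of r, so r now aliases y
\end{cfa}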
doc/theses/aaron_moss_PhD/phd/conclusion.tex
@@ -7,5 +7,5 @@
 The combination of these practical improvements and added features significantly improve the viability of \CFA{} as a practical programming language.
 
-Further improvements to the \CFA{} type-system are still possible, however.
+Further improvements to the \CFA{} type system are still possible, however.
 One area suggested by this work is development of a scheme for user-defined conversions; to integrate properly with the \CFA{} conversion model, there would need to be a distinction between safe and unsafe conversions, and possibly a way to denote conversions as explicit-only or non-chainable.
 Another place for ongoing effort is improvement of compilation performance; I believe the most promising direction for that is rebuilding the \CFA{} compiler on a different framework than Bilson's \CFACC{}.
doc/theses/aaron_moss_PhD/phd/experiments.tex
@@ -7,5 +7,5 @@
 
 \CFACC{} can generate realistic test inputs for the resolver prototype from equivalent \CFA{} code;
-the generated test inputs currently comprise all \CFA{} code currently in existence, 9,000 lines drawn primarily from the standard library and compiler test suite.
+the generated test inputs currently comprise all \CFA{} code currently in existence\footnote{Though \CFA{} is backwards-compatible with C, the lack of \lstinline{forall} functions and name overloading in C mean that the larger corpus of C code does not provide challenging test instances for \CFACC{}}, 9,000 lines drawn primarily from the standard library and compiler test suite.
 \CFACC{} is also instrumented to produce a number of code metrics.
 These metrics were used to construct synthetic test inputs during development of the resolver prototype; these synthetic inputs provided useful design guidance, but the performance results presented in this chapter are based on the more realistic directly-generated inputs.

@@ -36,5 +36,5 @@
 Generic named types are used to represent the built-in parameterized types of \CFA{} as well; !T*! is encoded as \texttt{\#\$ptr<T>}.
 \CFA{} arrays are also represented as pointers, to simulate array-to-pointer decay, while top-level reference types are replaced by their referent to simulate the variety of reference conversions.
-\emph{Function types} have first-class representation in the prototype as well; \CFA{} function function pointers are represented as variables with the appropriate function type, though \CFA{} polymorphic function pointers cannot be represented, as the prototype system function type does not store information about type assertions.
+\emph{Function types} have first-class representation in the prototype as well; \CFA{} function pointers are represented as variables with the appropriate function type, though \CFA{} polymorphic function pointers cannot be represented, as the prototype system stores information about type assertions in function declarations rather than in the function type.
 \emph{Void} and \emph{tuple types} are also supported in the prototype, to express the multiple-return-value functions in \CFA{}, though varargs functions and !ttype! tuple-typed type variables are absent from the prototype system.
 The prototype system also does not represent type qualifiers (\eg{} !const!, !volatile!), so all such qualifiers are stripped during conversion to the prototype system.

@@ -116,5 +116,5 @@
 As a matter of experimental practicality, test runs that exceeded 8~GB of peak resident memory usage are excluded from the data set.
 This restriction is justifiable by real-world use, as a compiler that is merely slow may be accommodated with patience, but one that uses in excess of 8~GB of RAM may be impossible to run on many currently deployed computer systems.
-8~GB of RAM is not typical of the memory usage of the best-peforming two variants, \textsc{bu-dca-bas} and \textsc{bu-dca-per}, which were able to run all 131 test inputs to completion with maximum memory usage of 70~MB and 78~MB, respectively.
+8~GB of RAM is not typical of the memory usage of the best-performing two variants, \textsc{bu-dca-bas} and \textsc{bu-dca-per}, which were able to run all 131 test inputs to completion with maximum memory usage of 70~MB and 78~MB, respectively.
 However, this threshold did eliminate a significant number of algorithm-test variants, with the worst-performing variant, \textsc{td-imm-inc}, only completing 62 test inputs within the memory bound.
 Full results for tests completed by algorithm variant are presented in Figure~\ref{tests-completed-fig}.

@@ -151,5 +151,5 @@
 % \end{figure}
 
-It can be seen from these results that that the top-down, immediate assertion-satisfaction (\textsc{td-imm-*}) variants are particularly inefficient, as they check a significant number of assertions without filtering to determine if the arguments can be made to fit.
+It can be seen from these results that the top-down, immediate assertion-satisfaction (\textsc{td-imm-*}) variants are particularly inefficient, as they check a significant number of assertions without filtering to determine if the arguments can be made to fit.
 It is also clear that the bottom-up (\textsc{bu}) traversal order is better than both top-down (\textsc{td}) and the Bilson-style bottom-up-combined (\textsc{co}) orders.
 While the advantage of \textsc{bu} over \textsc{co} is clear, in that it performs less redundant work if a prefix of a combination fails, the advantage of \textsc{bu} over \textsc{td} provides an answer for an open question from Baker \cite{Baker82}.

@@ -256,5 +256,5 @@
 \section{Conclusion}
 
-As can be seen from the prototype results, per-expression benchmarks, and \CFACC{}, the dominant factor in the cost of \CFA{} expression resolution is assertion satisfaction.
+The dominant factor in the cost of \CFA{} expression resolution is assertion satisfaction.
 Reducing the total number of assertions satisfied, as in the deferred satisfaction algorithm, is consistently effective at reducing runtime, and caching results of these satisfaction problem instances has shown promise in the prototype system.
 The results presented here also demonstrate that a bottom-up approach to expression resolution is superior to top-down, settling an open question from Baker~\cite{Baker82}.
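To make the type encoding in the second hunk concrete, the comments below sketch how a few \CFA{} declarations are represented in the resolver prototype; only the \texttt{\#\$ptr<T>} encoding, array-to-pointer decay, reference replacement, and qualifier stripping are stated in the changeset, so the specific declarations are illustrative assumptions:

\begin{cfa}
int * p;          // prototype: a variable of generic named type #$ptr<int>
int a[10];        // arrays decay to pointers: also represented as #$ptr<int>
int & r = *p;     // top-level reference replaced by its referent: plain int
const int c = 0;  // qualifiers are not represented: encoded as plain int
\end{cfa}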
doc/theses/aaron_moss_PhD/phd/generic-bench.tex
@@ -1,5 +1,6 @@
 \chapter{Generic Stack Benchmarks} \label{generic-bench-app}
 
-Throughout, !/***/! designates a counted redundant type annotation; code reformatted slightly for brevity.
+This appendix includes the generic stack code for all four language variants discussed in Section~\ref{generic-performance-sec}. Throughout, !/***/! designates a counted redundant type annotation; these include !sizeof! on a known type, repetition of a type name in initialization or return statements, and type-specific helper functions.
+The code is reformatted slightly for brevity.
 
 \section{C}
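The three categories of counted annotation in the expanded sentence read more easily with an example apiece; the snippets below are hypothetical C-style code in the spirit of the benchmark, not lines quoted from the appendix:

\begin{cfa}
struct node * n = malloc( sizeof(struct node) /***/ );  // sizeof on a known type
struct stack s = (struct stack /***/){ NULL };          // type name repeated in an initialization
int max_int( int a, int b ) { return a < b ? b : a; }   // a type-specific helper function /***/
\end{cfa}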
doc/theses/aaron_moss_PhD/phd/generic-types.tex
@@ -144,5 +144,5 @@
 
 \CC{}, Java, and other languages use \emph{generic types} to produce type-safe abstract data types.
-Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and from which this chapter is closely based.
+Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and on which this chapter is closely based.
 \CFA{} generic types integrate efficiently and naturally with the existing polymorphic functions in \CFA{}, while retaining backward compatibility with C in layout and support for separate compilation.
 A generic type can be declared in \CFA{} by placing a !forall! specifier on a !struct! or !union! declaration, and instantiated using a parenthesized list of types after the generic name.

@@ -189,5 +189,5 @@
 A key insight for this design is that C already possesses a handful of built-in generic types (\emph{derived types} in the language of the standard \cite[\S{}6.2.5]{C11}), notably pointer (!T*!) and array (!T[]!), and that user-definable generics should act similarly.
 
-\subsection{Related Work}
+\subsection{Related Work} \label{generic-related-sec}
 
 One approach to the design of generic types is that taken by \CC{} templates \cite{C++}.

@@ -202,4 +202,4 @@
 Java \cite{Java8} has another prominent implementation for generic types, introduced in Java~5 and based on a significantly different approach than \CC{}.
 The Java approach has much more in common with the !void*!-polymorphism shown in Figure~\ref{void-generic-fig}; since in Java nearly all data is stored by reference, the Java approach to polymorphic data is to store pointers to arbitrary data and insert type-checked implicit casts at compile-time.
-This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model and garbage collector.
+This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model.
 To use this model, a more C-like language such as \CFA{} would be required to dynamically allocate internal storage for variables, track their lifetime, and properly clean them up afterward.

@@ -277,5 +277,5 @@
 A \emph{dtype-static} type has polymorphic parameters but is still concrete.
 Polymorphic pointers are an example of dtype-static types; given some type variable !T!, !T! is a polymorphic type, but !T*! has a fixed size and can therefore be represented by a !void*! in code generation.
-In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait because the compiler does not know their size and alignment) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.
+In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait, which is satisfied by all types the compiler knows the size and alignment of) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.
 More precisely, a type is concrete if and only if all of its !sized! type parameters are concrete, and a concrete type is dtype-static if any of its type parameters are (possibly recursively) polymorphic.
 To illustrate, the following code using the !pair! type from above has each use of !pair! commented with its class:

@@ -425,6 +425,6 @@
 Since these languages are all C-based and compiled with the same compiler backend, maximal-performance benchmarks should show little runtime variance, differing only in length and clarity of source code.
 A more illustrative comparison measures the costs of idiomatic usage of each language's features.
-The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other other languages, code for which is included in Appendix~\ref{generic-bench-app}.
-The experiment uses element types !int! and !pair(short, char)! and pushes $N = 40M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
+The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other languages, code for which is included in Appendix~\ref{generic-bench-app}.
+The experiment uses element types !int! and !pair(short, char)! and pushes $N = 4M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
 
 \begin{cfa}

@@ -451,5 +451,5 @@
 
 The four versions of the benchmark implemented are C with !void*!-based polymorphism, \CFA{} with parametric polymorphism, \CC{} with templates, and \CC{} using only class inheritance for polymorphism, denoted \CCV{}.
-The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, mimicking a Java-like interface; in particular, runtime checks are necessary to safely downcast objects.
+The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, a language design similar to Java 4; in particular, runtime checks are necessary to safely downcast objects.
 The most notable difference among the implementations is the memory layout of generic types: \CFA{} and \CC{} inline the stack and pair elements into corresponding list and pair nodes, while C and \CCV{} lack such capability and, instead, must store generic objects via pointers to separately allocated objects.
 Note that the C benchmark uses unchecked casts as C has no runtime mechanism to perform such checks, whereas \CFA{} and \CC{} provide type safety statically.

@@ -481,8 +481,8 @@
 
 The C and \CCV{} variants are generally the slowest and have the largest memory footprint, due to their less-efficient memory layout and the pointer indirection necessary to implement generic types in those languages; this inefficiency is exacerbated by the second level of generic types in the pair benchmarks.
-By contrast, the \CFA{} and \CC{} variants run in roughly equivalent time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.
+By contrast, the \CFA{} and \CC{} variants run in noticeably less time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.
 \CCV{} is slower than C largely due to the cost of runtime type checking of downcasts (implemented with !dynamic_cast!); the outlier for \CFA{}, pop !pair!, results from the complexity of the generated-C polymorphic code.
 The gcc compiler is unable to optimize some dead code and condense nested calls; a compiler designed for \CFA{} could more easily perform these optimizations.
-Finally, the binary size for \CFA{} is larger because of static linking with \CFA{} libraries.
+Finally, the binary size for \CFA{} is larger because of static linking with the \CFA{} prelude library, which includes function definitions for all the built-in operators.
 
 \CFA{} is also competitive in terms of source code size, measured as a proxy for programmer effort.
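For readers without the surrounding chapter, a minimal sketch of the declaration-and-instantiation syntax the first hunks describe, using the !pair! type the later hunks refer to (field names follow the published summary \cite{Moss18}, and the function !f! is a hypothetical illustration of the dtype-static classification):

\begin{cfa}
forall( otype R, otype S ) struct pair {
	R first;
	S second;
};

pair( const char *, int ) p = { "magic", 42 };   // concrete instantiation
forall( otype T ) void f( pair( T *, T * ) q );  // dtype-static use: both members are
                                                 // pointers, so the layout of q is fixed
                                                 // and each T* lowers to void*
\end{cfa}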
doc/theses/aaron_moss_PhD/phd/introduction.tex
@@ -26,5 +26,5 @@
 The new features make \CFA{} more powerful and expressive than C, while maintaining strong backward-compatibility with both C code and the procedural paradigm expected by C programmers.
 Unlike other popular C extensions like \CC{} and Objective-C, \CFA{} adds modern features to C without imposing an object-oriented paradigm to use them.
-However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type-system.
+However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type system.
 
 This thesis is focused on making \CFA{} a more powerful and expressive language, both by adding new features to the \CFA{} type system and ensuring that both added and existing features can be efficiently implemented in \CFACC{}, the \CFA{} reference compiler.
doc/theses/aaron_moss_PhD/phd/resolution-heuristics.tex
@@ -6,5 +6,5 @@
 A given matching between identifiers and declarations in an expression is an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{poly-func-sec} and~\ref{implicit-conv-sec}, each of which increase the number of valid candidate interpretations.
 To choose among valid interpretations, a \emph{conversion cost} is used to rank interpretations.
-Hence, the expression resolution problem is to find the unique minimal-cost interpretation for an expression, reporting an error if no such interpretation exists.
+Hence, the expression resolution problem is to find the unique minimal-cost interpretation for an expression, reporting an error if no such unique interpretation exists.
 
 \section{Expression Resolution}

@@ -48,5 +48,5 @@
 With more specificity, the cost is a lexicographically-ordered tuple, where each element corresponds to a particular kind of conversion.
 In Bilson's design, conversion cost is a 3-tuple, $(unsafe, poly, safe)$, where $unsafe$ is the count of unsafe (narrowing) conversions, $poly$ is the count of polymorphic type bindings, and $safe$ is the sum of the degree of safe (widening) conversions.
-Degree of safe conversion is calculated as path weight in a directed graph of safe conversions between types; Bilson's version and the current version of this graph are in Figures~\ref{bilson-conv-fig} and~\ref{extended-conv-fig}, respectively.
+Degree of safe conversion is calculated as path weight in a directed graph of safe conversions between types; Bilson's version of this graph is in Figure~\ref{bilson-conv-fig}.
 The safe conversion graph is designed such that the common type $c$ of two types $u$ and $v$ is compatible with the C standard definitions from \cite[\S{}6.3.1.8]{C11} and can be calculated as the unique type minimizing the sum of the path weights of $\overrightarrow{uc}$ and $\overrightarrow{vc}$.
 The following example lists the cost in the Bilson model of calling each of the following functions with two !int! parameters, where the interpretation with the minimum total cost will be selected:

@@ -124,4 +124,5 @@
 In the redesign, for consistency with the approach of the usual arithmetic conversions, which select a common type primarily based on size, but secondarily on sign, arcs in the new graph are annotated with whether they represent a sign change, and such sign changes are summed in a new $sign$ cost element that lexicographically succeeds $safe$.
 This means that sign conversions are approximately the same cost as widening conversions, but slightly more expensive (as opposed to less expensive in Bilson's graph), so maintaining the same signedness is consistently favoured.
+This refined conversion graph is shown in Figure~\ref{extended-conv-fig}.
 
 With these modifications, the current \CFA{} cost tuple is as follows:

@@ -257,5 +258,5 @@
 
 Pruning possible interpretations as early as possible is one way to reduce the real-world cost of expression resolution, provided that a sufficient proportion of interpretations are pruned to pay for the cost of the pruning algorithm.
-One opportunity for interpretation pruning is by the argument-parameter type matching, but the literature provides no clear answers on whether candidate functions should be chosen according to their available arguments, or whether argument resolution should be driven by the available function candidates.
+One opportunity for interpretation pruning is by the argument-parameter type matching, but the literature \cite{Baker82,Bilson03,Cormack81,Ganzinger80,Pennello80,PW:overload} provides no clear answers on whether candidate functions should be chosen according to their available arguments, or whether argument resolution should be driven by the available function candidates.
 For programming languages without implicit conversions, argument-parameter matching is essentially the entirety of the expression resolution problem, and is generally referred to as ``overload resolution'' in the literature.
 All expression-resolution algorithms form a DAG of interpretations, some explicitly, some implicitly; in this DAG, arcs point from function-call interpretations to argument interpretations, as in Figure~\ref{res-dag-fig}

@@ -305,4 +306,4 @@
 The assertion satisfaction algorithm designed by Bilson~\cite{Bilson03} for the original \CFACC{} is the most-relevant prior work to this project.
 Before accepting any subexpression candidate, Bilson first checks that that candidate's assertions can all be resolved; this is necessary due to Bilson's addition of assertion satisfaction costs to candidate costs (discussed in Section~\ref{conv-cost-sec}).
-If this subexpression interpretation ends up not being used in the final resolution, than the (sometimes substantial) work of checking its assertions ends up wasted.
+If this subexpression interpretation ends up not being used in the final resolution, then the (sometimes substantial) work of checking its assertions ends up wasted.
 Bilson's assertion checking function recurses on two lists, !need! and !newNeed!, the current declaration's assertion set and those implied by the assertion-satisfying declarations, respectively, as detailed in the pseudo-code below (ancillary aspects of the algorithm are omitted for clarity):
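The overload-cost example that the second hunk introduces is elided from this view; as a rough sketch of how the Bilson 3-tuple ranks candidates for a call with two !int! arguments (the overload set and cost annotations are illustrative, not quoted from the thesis):

\begin{cfa}
void g( int, int );    // (0,0,0): exact match
void g( long, long );  // (0,0,2): two safe int-to-long widenings, path weight 1 each
void g( char, char );  // (2,0,0): two unsafe narrowing conversions
\end{cfa}

Because the tuple is compared lexicographically, $(0,0,0) < (0,0,2) < (2,0,0)$: any number of safe conversions is preferred to a single unsafe one, and the exact match is the unique minimal-cost interpretation.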