Context Navigation

← Previous Changeset
Next Changeset →

Changeset cf01d0b

Timestamp:

Apr 23, 2019, 11:52:28 AM (5 years ago)

Author:

Aaron Moss <a3moss@…>

Branches:

ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum

Children:

Parents:

Message:

thesis: typo-fixing revisions from Werner, Ondrej

Location:

Files:

: 10 edited

bibliography/pl.bib (modified) (1 diff)
theses/aaron_moss_PhD/phd/background.tex (modified) (9 diffs)
theses/aaron_moss_PhD/phd/conclusion.tex (modified) (1 diff)
theses/aaron_moss_PhD/phd/experiments.tex (modified) (5 diffs)
theses/aaron_moss_PhD/phd/figures/union-find-with-classes.eps (modified) (previous)
theses/aaron_moss_PhD/phd/figures/union-find-with-classes.odg (modified) (previous)
theses/aaron_moss_PhD/phd/generic-bench.tex (modified) (1 diff)
theses/aaron_moss_PhD/phd/generic-types.tex (modified) (7 diffs)
theses/aaron_moss_PhD/phd/introduction.tex (modified) (1 diff)
theses/aaron_moss_PhD/phd/resolution-heuristics.tex (modified) (5 diffs)

Legend:

: Unmodified
: Added
: Removed

doc/bibliography/pl.bib

r8f55e8e9	rcf01d0b
973	973	contributer = {pabuhr@plg},
974	974	author = {Aaron Moss and Robert Schluntz and Peter A. Buhr},
975		title = {\textsf{C}$\mathbf{\forall}$ : Adding Modern Programming Language Features to C},
	975	title = {\textsf{C}$\mathbf{\forall}$ : Adding Modern Programming Language Features to {C}},
976	976	journal = spe,
977	977	volume = 48,

doc/theses/aaron_moss_PhD/phd/background.tex

-                      r8f55e8e9
+                      rcf01d0b
 Due to this complexity, the expression-resolution pass in \CFACC{} requires 95\% of compiler runtime on some source files, making an efficient procedure for expression resolution a requirement for a performant \CFA{} compiler.
 The selection of features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
+The features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
 In some cases the interactions of multiple features make this design a significantly more complex problem than any individual feature; in other cases a feature that does not by itself add any complexity to expression resolution triggers previously rare edge cases more frequently.
 …
 The addition of !one_t! allows generic algorithms to handle the unit value uniformly for types where it is meaningful; a simple example of this is that polymorphic functions\footnote{discussed in Section~\ref{poly-func-sec}} in the \CFA{} prelude define !++x! and !x++! in terms of !x += 1!, allowing users to idiomatically define all forms of increment for a type !T! by defining the single function !T& ?+=?(T&, one_t)!; analogous overloads for the decrement operators are also present, and programmers can override any of these functions for a particular type if desired.
 \CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and !forall(dtype T) T* 0!.
+\CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and the polymorphic variable !forall(dtype T) T* 0!.
 While designing \CFA{} generic types (see Chapter~\ref{generic-chap}), it was discovered that the parametric polymorphic zero variable is not generalizable to other types; though all null pointers have the same in-memory representation, the same cannot be said of the zero values of arbitrary types.
 As such, variables that could represent !0! and !1! were phased out in favour of functions that could generate those values for a given type as appropriate.
+As such, polymorphic variables, and in particular variables for !0! and !1!, were phased out in favour of functions that could generate those values for a given type as appropriate.
 \section{Polymorphic Functions} \label{poly-func-sec}
 …
 To avoid such infinite loops, \CFACC{} imposes a fixed limit on the possible depth of recursion, similar to that employed by most \CC{} compilers for template expansion; this restriction means that there are some otherwise semantically well-typed expressions that cannot be resolved by \CFACC{}.
-% \subsection{Deleted Declarations}
-% Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
-% \begin{cfa}
-% int somefn(char) = void;
-% \end{cfa}
-% This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers. A similar effect can be produced on an ad-hoc basis at the appropriate call sites through use of casts to determine the function type. In both cases, the deleted-function form is clearer and more concise.}.
-% Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
-% If this deleted declaration is selected as the unique minimal-cost interpretation of an expression than an error is produced.
 \subsection{Traits}
 …
 Traits, however, are significantly more powerful than nominal-inheritance interfaces; firstly, due to the scoping rules of the declarations that satisfy a trait's type assertions, a type may not satisfy a trait everywhere that that type is declared, as with !char! and the !nominal! trait above.
+Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}.
+Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
+Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}. Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
 \begin{cfa}
 …
 In this example above, !(list_iterator, int)! satisfies !pointer_like! by the user-defined dereference function, and !(list_iterator, list)! also satisfies !pointer_like! by the built-in dereference operator for pointers.
 Given a declaration !list_iterator it!, !*it! can be either an !int! or a !list!, with the meaning disambiguated by context (\eg{} !int x = *it;! interprets !*it! as !int!, while !(*it).value = 42;! interprets !*it! as !list!).
+While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, few such systems could model both relationships simultaneously.
+While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, the author is unaware of any nominal-inheritance system that could model both relationships simultaneously.
+Further comparison of \CFA{} polymorphism with other languages can be found in Section~\ref{generic-related-sec}.
 The flexibility of \CFA{}'s implicit trait-satisfaction mechanism provides programmers with a great deal of power, but also blocks some optimization approaches for expression resolution.
+The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
+The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
+\subsection{Deleted Declarations}
+Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
+\begin{cfa}
+int somefn(char) = void;
+\end{cfa}
+This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers.}.
+Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
+If this deleted declaration is selected as the unique minimal-cost interpretation of an expression then an error is produced, allowing \CFA{} programmers to guide the expression resolver away from undesirable solutions.
 \section{Implicit Conversions} \label{implicit-conv-sec}
 …
 The C standard makes heavy use of the concept of \emph{lvalue}, an expression with a memory address; its complement, \emph{rvalue} (a non-addressable expression) is not explicitly named in the standard.
 In \CFA{}, the distinction between lvalue and rvalue can be re-framed in terms of reference and non-reference types, with the benefit of being able to express the difference in user code.
 \CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!)
+\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!).
 To make reference types more easily usable in legacy pass-by-value code, \CFA{} also adds an implicit rvalue-to-lvalue conversion, implemented by storing the value in a compiler-generated temporary variable and passing a reference to that temporary.
 To mitigate the ``!const! hell'' problem present in \CC{}, there is also a qualifier-dropping lvalue-to-lvalue conversion implemented by copying into a temporary:
 …
 \begin{cfa}
 const int magic = 42;
 void inc_print( int& x ) { printf("%d\n", ++x); }
 print_inc( magic ); $\C{// legal; implicitly generated code in red below:}$
+inc_print( magic ); $\C{// legal; implicitly generated code in red below:}$
 `int tmp = magic;` $\C{// to safely strip const-qualifier}$
 `print_inc( tmp );` $\C{// tmp is incremented, magic is unchanged}$
+`inc_print( tmp );` $\C{// tmp is incremented, magic is unchanged}$
 \end{cfa}
 …
 \CFA{} supports all of these use cases without further added syntax.
 The key to this syntax-free feature support is an observation made by the author that the address of a reference is a lvalue.
 In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference, and pointers to references by repeating the address-of operator:
+In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference\footnote{The syntactic difference between reference initialization and reference assignment is unfortunate, but preserves the ability to pass function arguments by reference (a reference initialization context) without added syntax.}, and pointers to references by repeating the address-of operator:
 \begin{cfa}
 …
 \end{cfa}
 For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of normalization conversions and adapters.
+For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of matching reference to non-reference types during type-checking.
 \subsection{Resource Management} \label{ctor-sec}

doc/theses/aaron_moss_PhD/phd/conclusion.tex

r8f55e8e9	rcf01d0b
7	7	The combination of these practical improvements and added features significantly improve the viability of \CFA{} as a practical programming language.
8	8
9		Further improvements to the \CFA{} type-system are still possible, however.
	9	Further improvements to the \CFA{} type system are still possible, however.
10	10	One area suggested by this work is development of a scheme for user-defined conversions; to integrate properly with the \CFA{} conversion model, there would need to be a distinction between safe and unsafe conversions, and possibly a way to denote conversions as explicit-only or non-chainable.
11	11	Another place for ongoing effort is improvement of compilation performance; I believe the most promising direction for that is rebuilding the \CFA{} compiler on a different framework than Bilson's \CFACC{}.

doc/theses/aaron_moss_PhD/phd/experiments.tex

-                      r8f55e8e9
+                      rcf01d0b
 \CFACC{} can generate realistic test inputs for the resolver prototype from equivalent \CFA{} code;
 the generated test inputs currently comprise all \CFA{} code currently in existence, 9,000 lines drawn primarily from the standard library and compiler test suite.
+the generated test inputs currently comprise all \CFA{} code currently in existence\footnote{Though \CFA{} is backwards-compatible with C, the lack of \lstinline{forall} functions and name overloading in C mean that the larger corpus of C code does not provide challenging test instances for \CFACC{}}, 9,000 lines drawn primarily from the standard library and compiler test suite.
 \CFACC{} is also instrumented to produce a number of code metrics.
 These metrics were used to construct synthetic test inputs during development of the resolver prototype; these synthetic inputs provided useful design guidance, but the performance results presented in this chapter are based on the more realistic directly-generated inputs.
 …
 Generic named types are used to represent the built-in parameterized types of \CFA{} as well; !T*! is encoded as \texttt{\#\$ptr<T>}.
 \CFA{} arrays are also represented as pointers, to simulate array-to-pointer decay, while top-level reference types are replaced by their referent to simulate the variety of reference conversions.
 \emph{Function types} have first-class representation in the prototype as well; \CFA{} function function pointers are represented as variables with the appropriate function type, though \CFA{} polymorphic function pointers cannot be represented, as the prototype system function type does not store information about type assertions.
+\emph{Function types} have first-class representation in the prototype as well; \CFA{} function  pointers are represented as variables with the appropriate function type, though \CFA{} polymorphic function pointers cannot be represented, as the prototype system stores information about type assertions in function declarations rather than in the function type.
 \emph{Void} and \emph{tuple types} are also supported in the prototype, to express the multiple-return-value functions in \CFA{}, though varargs functions and !ttype! tuple-typed type variables are absent from the prototype system.
 The prototype system also does not represent type qualifiers (\eg{} !const!, !volatile!), so all such qualifiers are stripped during conversion to the prototype system.
 …
 As a matter of experimental practicality, test runs that exceeded 8~GB of peak resident memory usage are excluded from the data set.
 This restriction is justifiable by real-world use, as a compiler that is merely slow may be accommodated with patience, but one that uses in excess of 8~GB of RAM may be impossible to run on many currently deployed computer systems.
 ~GB of RAM is not typical of the memory usage of the best-peforming two variants, \textsc{bu-dca-bas} and \textsc{bu-dca-per}, which were able to run all 131 test inputs to completion  with maximum memory usage of 70~MB and 78~MB, respectively.
+~GB of RAM is not typical of the memory usage of the best-performing two variants, \textsc{bu-dca-bas} and \textsc{bu-dca-per}, which were able to run all 131 test inputs to completion  with maximum memory usage of 70~MB and 78~MB, respectively.
 However, this threshold did eliminate a significant number of algorithm-test variants, with the worst-performing variant, \textsc{td-imm-inc}, only completing 62 test inputs within the memory bound.
 Full results for tests completed by algorithm variant are presented in Figure~\ref{tests-completed-fig}.
 …
 % \end{figure}
 It can be seen from these results that that the top-down, immediate assertion-satisfaction (\textsc{td-imm-*}) variants are particularly inefficient, as they check a significant number of assertions without filtering to determine if the arguments can be made to fit.
+It can be seen from these results that the top-down, immediate assertion-satisfaction (\textsc{td-imm-*}) variants are particularly inefficient, as they check a significant number of assertions without filtering to determine if the arguments can be made to fit.
 It is also clear that the bottom-up (\textsc{bu}) traversal order is better than both top-down (\textsc{td}) and the Bilson-style bottom-up-combined (\textsc{co}) orders.
 While the advantage of \textsc{bu} over \textsc{co} is clear, in that it performs less redundant work if a prefix of a combination fails, the advantage of \textsc{bu} over \textsc{td} provides an answer for an open question from Baker \cite{Baker82}.
 …
 \section{Conclusion}
 As can be seen from the prototype results, per-expression benchmarks, and \CFACC{}, the dominant factor in the cost of \CFA{} expression resolution is assertion satisfaction.
+The dominant factor in the cost of \CFA{} expression resolution is assertion satisfaction.
 Reducing the total number of assertions satisfied, as in the deferred satisfaction algorithm, is consistently effective at reducing runtime, and caching results of these satisfaction problem instances has shown promise in the prototype system.
 The results presented here also demonstrate that a bottom-up approach to expression resolution is superior to top-down, settling an open question from Baker~\cite{Baker82}.

doc/theses/aaron_moss_PhD/phd/generic-bench.tex

-                      r8f55e8e9
+                      rcf01d0b
 \chapter{Generic Stack Benchmarks} \label{generic-bench-app}
+Throughout, !/***/! designates a counted redundant type annotation; code reformatted slightly for brevity.
+This appendix includes the generic stack code for all four language variants discussed in Section~\ref{generic-performance-sec}. Throughout, !/***/! designates a counted redundant type annotation; these include !sizeof! on a known type, repetition of a type name in initialization or return statements, and type-specific helper functions.
+The code is reformatted slightly for brevity.
 \section{C}

doc/theses/aaron_moss_PhD/phd/generic-types.tex

-                      r8f55e8e9
+                      rcf01d0b
 \CC{}, Java, and other languages use \emph{generic types} to produce type-safe abstract data types.
 Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and from which this chapter is closely based.
+Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and on which this chapter is closely based.
 \CFA{} generic types integrate efficiently and naturally with the existing polymorphic functions in \CFA{}, while retaining backward compatibility with C in layout and support for separate compilation.
 A generic type can be declared in \CFA{} by placing a !forall! specifier on a !struct! or !union! declaration, and instantiated using a parenthesized list of types after the generic name.
 …
 A key insight for this design is that C already possesses a handful of built-in generic types (\emph{derived types} in the language of the standard \cite[\S{}6.2.5]{C11}), notably pointer (!T*!) and array (!T[]!), and that user-definable generics should act similarly.
 \subsection{Related Work}
+\subsection{Related Work} \label{generic-related-sec}
 One approach to the design of generic types is that taken by \CC{} templates \cite{C++}.
 …
 Java \cite{Java8} has another prominent implementation for generic types, introduced in Java~5 and based on a significantly different approach than \CC{}.
 The Java approach has much more in common with the !void*!-polymorphism shown in Figure~\ref{void-generic-fig}; since in Java nearly all data is stored by reference, the Java approach to polymorphic data is to store pointers to arbitrary data and insert type-checked implicit casts at compile-time.
 This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model and garbage collector.
+This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model.
 To use this model, a more C-like language such as \CFA{} would be required to dynamically allocate internal storage for variables, track their lifetime, and properly clean them up afterward.
 …
 A \emph{dtype-static} type has polymorphic parameters but is still concrete.
 Polymorphic pointers are an example of dtype-static types; given some type variable !T!, !T! is a polymorphic type, but !T*! has a fixed size and can therefore be represented by a !void*! in code generation.
 In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait because the compiler does not know their size and alignment) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.
+In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait, which is satisfied by all types the compiler knows the size and alignment of) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.
 More precisely, a type is concrete if and only if all of its !sized! type parameters are concrete, and a concrete type is dtype-static if any of its type parameters are (possibly recursively) polymorphic.
 To illustrate, the following code using the !pair! type from above has each use of !pair! commented with its class:
 …
 Since these languages are all C-based and compiled with the same compiler backend, maximal-performance benchmarks should show little runtime variance, differing only in length and clarity of source code.
 A more illustrative comparison measures the costs of idiomatic usage of each language's features.
 The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other other languages, code for which is included in Appendix~\ref{generic-bench-app}.
 The experiment uses element types !int! and !pair(short, char)! and pushes $N = 40M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
+The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other languages, code for which is included in Appendix~\ref{generic-bench-app}.
+The experiment uses element types !int! and !pair(short, char)! and pushes $N = 4M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
 \begin{cfa}
 …
 The four versions of the benchmark implemented are C with !void*!-based polymorphism, \CFA{} with parametric polymorphism, \CC{} with templates, and \CC{} using only class inheritance for polymorphism, denoted \CCV{}.
 The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, mimicking a Java-like interface; in particular, runtime checks are necessary to safely downcast objects.
+The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, a language design similar to Java 4; in particular, runtime checks are necessary to safely downcast objects.
 The most notable difference among the implementations is the memory layout of generic types: \CFA{} and \CC{} inline the stack and pair elements into corresponding list and pair nodes, while C and \CCV{} lack such capability and, instead, must store generic objects via pointers to separately allocated objects.
 Note that the C benchmark uses unchecked casts as C has no runtime mechanism to perform such checks, whereas \CFA{} and \CC{} provide type safety statically.
 …
 The C and \CCV{} variants are generally the slowest and have the largest memory footprint, due to their less-efficient memory layout and the pointer indirection necessary to implement generic types in those languages; this inefficiency is exacerbated by the second level of generic types in the pair benchmarks.
 By contrast, the \CFA{} and \CC{} variants run in roughly equivalent time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.
+By contrast, the \CFA{} and \CC{} variants run in noticably less time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.
 \CCV{} is slower than C largely due to the cost of runtime type checking of downcasts (implemented with !dynamic_cast!); the outlier for \CFA{}, pop !pair!, results from the complexity of the generated-C polymorphic code.
 The gcc compiler is unable to optimize some dead code and condense nested calls; a compiler designed for \CFA{} could more easily perform these optimizations.
 Finally, the binary size for \CFA{} is larger because of static linking with \CFA{} libraries.
+Finally, the binary size for \CFA{} is larger because of static linking with the \CFA{} prelude library, which includes function definitions for all the built-in operators.
 \CFA{} is also competitive in terms of source code size, measured as a proxy for programmer effort.

doc/theses/aaron_moss_PhD/phd/introduction.tex

r8f55e8e9	rcf01d0b
26	26	The new features make \CFA{} more powerful and expressive than C, while maintaining strong backward-compatibility with both C code and the procedural paradigm expected by C programmers.
27	27	Unlike other popular C extensions like \CC{} and Objective-C, \CFA{} adds modern features to C without imposing an object-oriented paradigm to use them.
28		However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type-system.
	28	However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type system.
29	29
30	30	This thesis is focused on making \CFA{} a more powerful and expressive language, both by adding new features to the \CFA{} type system and ensuring that both added and existing features can be efficiently implemented in \CFACC{}, the \CFA{} reference compiler.

doc/theses/aaron_moss_PhD/phd/resolution-heuristics.tex

-                      r8f55e8e9
+                      rcf01d0b
 A given matching between identifiers and declarations in an expression is an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{poly-func-sec} and~\ref{implicit-conv-sec}, each of which increase the number of valid candidate interpretations.
 To choose among valid interpretations, a \emph{conversion cost} is used to rank interpretations.
 Hence, the expression resolution problem is to find the unique minimal-cost interpretation for an expression, reporting an error if no such interpretation exists.
+Hence, the expression resolution problem is to find the unique minimal-cost interpretation for an expression, reporting an error if no such unique interpretation exists.
 \section{Expression Resolution}
 …
 With more specificity, the cost is a lexicographically-ordered tuple, where each element corresponds to a particular kind of conversion.
 In Bilson's design, conversion cost is a 3-tuple, $(unsafe, poly, safe)$, where $unsafe$ is the count of unsafe (narrowing) conversions, $poly$ is the count of polymorphic type bindings, and $safe$ is the sum of the degree of safe (widening) conversions.
 Degree of safe conversion is calculated as path weight in a directed graph of safe conversions between types; Bilson's version and the current version of this graph are in Figures~\ref{bilson-conv-fig} and~\ref{extended-conv-fig}, respectively.
+Degree of safe conversion is calculated as path weight in a directed graph of safe conversions between types; Bilson's version of this graph is in Figure~\ref{bilson-conv-fig}.
 The safe conversion graph is designed such that the common type $c$ of two types $u$ and $v$ is compatible with the C standard definitions from \cite[\S{}6.3.1.8]{C11} and can be calculated as the unique type minimizing the sum of the path weights of $\overrightarrow{uc}$ and $\overrightarrow{vc}$.
 The following example lists the cost in the Bilson model of calling each of the following functions with two !int! parameters, where the interpretation with the minimum total cost will be selected:
 …
 In the redesign, for consistency with the approach of the usual arithmetic conversions, which select a common type primarily based on size, but secondarily on sign, arcs in the new graph are annotated with whether they represent a sign change, and such sign changes are summed in a new $sign$ cost element that lexicographically succeeds $safe$.
 This means that sign conversions are approximately the same cost as widening conversions, but slightly more expensive (as opposed to less expensive in Bilson's graph), so maintaining the same signedness is consistently favoured.
+This refined conversion graph is shown in Figure~\ref{extended-conv-fig}.
 With these modifications, the current \CFA{} cost tuple is as follows:
 …
 Pruning possible interpretations as early as possible is one way to reduce the real-world cost of expression resolution, provided that a sufficient proportion of interpretations are pruned to pay for the cost of the pruning algorithm.
 One opportunity for interpretation pruning is by the argument-parameter type matching, but the literature provides no clear answers on whether candidate functions should be chosen according to their available arguments, or whether argument resolution should be driven by the available function candidates.
+One opportunity for interpretation pruning is by the argument-parameter type matching, but the literature \cite{Baker82,Bilson03,Cormack81,Ganzinger80,Pennello80,PW:overload} provides no clear answers on whether candidate functions should be chosen according to their available arguments, or whether argument resolution should be driven by the available function candidates.
 For programming languages without implicit conversions, argument-parameter matching is essentially the entirety of the expression resolution problem, and is generally referred to as ``overload resolution'' in the literature.
 All expression-resolution algorithms form a DAG of interpretations, some explicitly, some implicitly; in this DAG, arcs point from function-call interpretations to argument interpretations, as in Figure~\ref{res-dag-fig}
 …
 The assertion satisfaction algorithm designed by Bilson~\cite{Bilson03} for the original \CFACC{} is the most-relevant prior work to this project.
 Before accepting any subexpression candidate, Bilson first checks that that candidate's assertions can all be resolved; this is necessary due to Bilson's addition of assertion satisfaction costs to candidate costs (discussed in Section~\ref{conv-cost-sec}).
 If this subexpression interpretation ends up not being used in the final resolution, than the (sometimes substantial) work of checking its assertions ends up wasted.
+If this subexpression interpretation ends up not being used in the final resolution, then the (sometimes substantial) work of checking its assertions ends up wasted.
 Bilson's assertion checking function recurses on two lists, !need! and !newNeed!, the current declaration's assertion set and those implied by the assertion-satisfying declarations, respectively, as detailed in the pseudo-code below (ancillary aspects of the algorithm are omitted for clarity):

Note: See TracChangeset for help on using the changeset viewer.

Download in other formats: