Changeset cf01d0b


Ignore:
Timestamp:
Apr 23, 2019, 11:52:28 AM (5 years ago)
Author:
Aaron Moss <a3moss@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
c4b5486
Parents:
8f55e8e9
Message:

thesis: typo-fixing revisions from Werner, Ondrej

Location:
doc
Files:
10 edited

Legend:

Unmodified
Added
Removed
  • doc/bibliography/pl.bib

    r8f55e8e9 rcf01d0b  
    973973    contributer = {pabuhr@plg},
    974974    author      = {Aaron Moss and Robert Schluntz and Peter A. Buhr},
    975     title       = {\textsf{C}$\mathbf{\forall}$ : Adding Modern Programming Language Features to C},
     975    title       = {\textsf{C}$\mathbf{\forall}$ : Adding Modern Programming Language Features to {C}},
    976976    journal     = spe,
    977977    volume      = 48,
  • doc/theses/aaron_moss_PhD/phd/background.tex

    r8f55e8e9 rcf01d0b  
    1717Due to this complexity, the expression-resolution pass in \CFACC{} requires 95\% of compiler runtime on some source files, making an efficient procedure for expression resolution a requirement for a performant \CFA{} compiler.
    1818
    19 The selection of features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
     19The features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
    2020In some cases the interactions of multiple features make this design a significantly more complex problem than any individual feature; in other cases a feature that does not by itself add any complexity to expression resolution triggers previously rare edge cases more frequently.
    2121
     
    9393The addition of !one_t! allows generic algorithms to handle the unit value uniformly for types where it is meaningful; a simple example of this is that polymorphic functions\footnote{discussed in Section~\ref{poly-func-sec}} in the \CFA{} prelude define !++x! and !x++! in terms of !x += 1!, allowing users to idiomatically define all forms of increment for a type !T! by defining the single function !T& ?+=?(T&, one_t)!; analogous overloads for the decrement operators are also present, and programmers can override any of these functions for a particular type if desired.
    9494
    95 \CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and !forall(dtype T) T* 0!.
     95\CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and the polymorphic variable !forall(dtype T) T* 0!.
    9696While designing \CFA{} generic types (see Chapter~\ref{generic-chap}), it was discovered that the parametric polymorphic zero variable is not generalizable to other types; though all null pointers have the same in-memory representation, the same cannot be said of the zero values of arbitrary types.
    97 As such, variables that could represent !0! and !1! were phased out in favour of functions that could generate those values for a given type as appropriate.
     97As such, polymorphic variables, and in particular variables for !0! and !1!, were phased out in favour of functions that could generate those values for a given type as appropriate.
    9898
    9999\section{Polymorphic Functions} \label{poly-func-sec}
     
    149149To avoid such infinite loops, \CFACC{} imposes a fixed limit on the possible depth of recursion, similar to that employed by most \CC{} compilers for template expansion; this restriction means that there are some otherwise semantically well-typed expressions that cannot be resolved by \CFACC{}.
    150150
    151 % \subsection{Deleted Declarations}
    152 
    153 % Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
    154 
    155 % \begin{cfa}
    156 % int somefn(char) = void;
    157 % \end{cfa}
    158 
    159 % This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers. A similar effect can be produced on an ad-hoc basis at the appropriate call sites through use of casts to determine the function type. In both cases, the deleted-function form is clearer and more concise.}.
    160 % Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
    161 % If this deleted declaration is selected as the unique minimal-cost interpretation of an expression than an error is produced.
    162 
    163151\subsection{Traits}
    164152
     
    195183
    196184Traits, however, are significantly more powerful than nominal-inheritance interfaces; firstly, due to the scoping rules of the declarations that satisfy a trait's type assertions, a type may not satisfy a trait everywhere that that type is declared, as with !char! and the !nominal! trait above.
    197 Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}.
    198 Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
     185Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}. Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
    199186
    200187\begin{cfa}
     
    217204In this example above, !(list_iterator, int)! satisfies !pointer_like! by the user-defined dereference function, and !(list_iterator, list)! also satisfies !pointer_like! by the built-in dereference operator for pointers.
    218205Given a declaration !list_iterator it!, !*it! can be either an !int! or a !list!, with the meaning disambiguated by context (\eg{} !int x = *it;! interprets !*it! as !int!, while !(*it).value = 42;! interprets !*it! as !list!).
    219 While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, few such systems could model both relationships simultaneously.
     206While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, the author is unaware of any nominal-inheritance system that could model both relationships simultaneously.
     207Further comparison of \CFA{} polymorphism with other languages can be found in Section~\ref{generic-related-sec}.
    220208
    221209The flexibility of \CFA{}'s implicit trait-satisfaction mechanism provides programmers with a great deal of power, but also blocks some optimization approaches for expression resolution.
    222 The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
     210The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
     211
     212\subsection{Deleted Declarations}
     213
     214Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
     215
     216\begin{cfa}
     217int somefn(char) = void;
     218\end{cfa}
     219
     220This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers.}.
     221Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
     222If this deleted declaration is selected as the unique minimal-cost interpretation of an expression then an error is produced, allowing \CFA{} programmers to guide the expression resolver away from undesirable solutions.
    223223
    224224\section{Implicit Conversions} \label{implicit-conv-sec}
     
    252252The C standard makes heavy use of the concept of \emph{lvalue}, an expression with a memory address; its complement, \emph{rvalue} (a non-addressable expression) is not explicitly named in the standard.
    253253In \CFA{}, the distinction between lvalue and rvalue can be re-framed in terms of reference and non-reference types, with the benefit of being able to express the difference in user code.
    254 \CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!)
     254\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!).
    255255To make reference types more easily usable in legacy pass-by-value code, \CFA{} also adds an implicit rvalue-to-lvalue conversion, implemented by storing the value in a compiler-generated temporary variable and passing a reference to that temporary.
    256256To mitigate the ``!const! hell'' problem present in \CC{}, there is also a qualifier-dropping lvalue-to-lvalue conversion implemented by copying into a temporary:
     
    258258\begin{cfa}
    259259const int magic = 42;
    260 
    261260void inc_print( int& x ) { printf("%d\n", ++x); }
    262261
    263 print_inc( magic ); $\C{// legal; implicitly generated code in red below:}$
     262inc_print( magic ); $\C{// legal; implicitly generated code in red below:}$
    264263
    265264`int tmp = magic;` $\C{// to safely strip const-qualifier}$
    266 `print_inc( tmp );` $\C{// tmp is incremented, magic is unchanged}$
     265`inc_print( tmp );` $\C{// tmp is incremented, magic is unchanged}$
    267266\end{cfa}
    268267
     
    272271\CFA{} supports all of these use cases without further added syntax.
    273272The key to this syntax-free feature support is an observation made by the author that the address of a reference is a lvalue.
    274 In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference, and pointers to references by repeating the address-of operator:
     273In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference\footnote{The syntactic difference between reference initialization and reference assignment is unfortunate, but preserves the ability to pass function arguments by reference (a reference initialization context) without added syntax.}, and pointers to references by repeating the address-of operator:
    275274
    276275\begin{cfa}
     
    281280\end{cfa}
    282281
    283 For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of normalization conversions and adapters.
     282For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of matching reference to non-reference types during type-checking.
    284283
    285284\subsection{Resource Management} \label{ctor-sec}
  • doc/theses/aaron_moss_PhD/phd/conclusion.tex

    r8f55e8e9 rcf01d0b  
    77The combination of these practical improvements and added features significantly improve the viability of \CFA{} as a practical programming language.
    88
    9 Further improvements to the \CFA{} type-system are still possible, however.
     9Further improvements to the \CFA{} type system are still possible, however.
    1010One area suggested by this work is development of a scheme for user-defined conversions; to integrate properly with the \CFA{} conversion model, there would need to be a distinction between safe and unsafe conversions, and possibly a way to denote conversions as explicit-only or non-chainable.
    1111Another place for ongoing effort is improvement of compilation performance; I believe the most promising direction for that is rebuilding the \CFA{} compiler on a different framework than Bilson's \CFACC{}.
  • doc/theses/aaron_moss_PhD/phd/experiments.tex

    r8f55e8e9 rcf01d0b  
    77
    88\CFACC{} can generate realistic test inputs for the resolver prototype from equivalent \CFA{} code;
    9 the generated test inputs currently comprise all \CFA{} code currently in existence, 9,000 lines drawn primarily from the standard library and compiler test suite.
     9the generated test inputs currently comprise all \CFA{} code currently in existence\footnote{Though \CFA{} is backwards-compatible with C, the lack of \lstinline{forall} functions and name overloading in C mean that the larger corpus of C code does not provide challenging test instances for \CFACC{}}, 9,000 lines drawn primarily from the standard library and compiler test suite.
    1010\CFACC{} is also instrumented to produce a number of code metrics.
    1111These metrics were used to construct synthetic test inputs during development of the resolver prototype; these synthetic inputs provided useful design guidance, but the performance results presented in this chapter are based on the more realistic directly-generated inputs.
     
    3636Generic named types are used to represent the built-in parameterized types of \CFA{} as well; !T*! is encoded as \texttt{\#\$ptr<T>}.
    3737\CFA{} arrays are also represented as pointers, to simulate array-to-pointer decay, while top-level reference types are replaced by their referent to simulate the variety of reference conversions.
    38 \emph{Function types} have first-class representation in the prototype as well; \CFA{} function function pointers are represented as variables with the appropriate function type, though \CFA{} polymorphic function pointers cannot be represented, as the prototype system function type does not store information about type assertions.
     38\emph{Function types} have first-class representation in the prototype as well; \CFA{} function  pointers are represented as variables with the appropriate function type, though \CFA{} polymorphic function pointers cannot be represented, as the prototype system stores information about type assertions in function declarations rather than in the function type.
    3939\emph{Void} and \emph{tuple types} are also supported in the prototype, to express the multiple-return-value functions in \CFA{}, though varargs functions and !ttype! tuple-typed type variables are absent from the prototype system.
    4040The prototype system also does not represent type qualifiers (\eg{} !const!, !volatile!), so all such qualifiers are stripped during conversion to the prototype system.
     
    116116As a matter of experimental practicality, test runs that exceeded 8~GB of peak resident memory usage are excluded from the data set.
    117117This restriction is justifiable by real-world use, as a compiler that is merely slow may be accommodated with patience, but one that uses in excess of 8~GB of RAM may be impossible to run on many currently deployed computer systems.
    118 8~GB of RAM is not typical of the memory usage of the best-peforming two variants, \textsc{bu-dca-bas} and \textsc{bu-dca-per}, which were able to run all 131 test inputs to completion  with maximum memory usage of 70~MB and 78~MB, respectively.
     1188~GB of RAM is not typical of the memory usage of the best-performing two variants, \textsc{bu-dca-bas} and \textsc{bu-dca-per}, which were able to run all 131 test inputs to completion  with maximum memory usage of 70~MB and 78~MB, respectively.
    119119However, this threshold did eliminate a significant number of algorithm-test variants, with the worst-performing variant, \textsc{td-imm-inc}, only completing 62 test inputs within the memory bound.
    120120Full results for tests completed by algorithm variant are presented in Figure~\ref{tests-completed-fig}.
     
    151151% \end{figure}
    152152
    153 It can be seen from these results that that the top-down, immediate assertion-satisfaction (\textsc{td-imm-*}) variants are particularly inefficient, as they check a significant number of assertions without filtering to determine if the arguments can be made to fit.
     153It can be seen from these results that the top-down, immediate assertion-satisfaction (\textsc{td-imm-*}) variants are particularly inefficient, as they check a significant number of assertions without filtering to determine if the arguments can be made to fit.
    154154It is also clear that the bottom-up (\textsc{bu}) traversal order is better than both top-down (\textsc{td}) and the Bilson-style bottom-up-combined (\textsc{co}) orders.
    155155While the advantage of \textsc{bu} over \textsc{co} is clear, in that it performs less redundant work if a prefix of a combination fails, the advantage of \textsc{bu} over \textsc{td} provides an answer for an open question from Baker \cite{Baker82}.
     
    256256\section{Conclusion}
    257257
    258 As can be seen from the prototype results, per-expression benchmarks, and \CFACC{}, the dominant factor in the cost of \CFA{} expression resolution is assertion satisfaction.
     258The dominant factor in the cost of \CFA{} expression resolution is assertion satisfaction.
    259259Reducing the total number of assertions satisfied, as in the deferred satisfaction algorithm, is consistently effective at reducing runtime, and caching results of these satisfaction problem instances has shown promise in the prototype system.
    260260The results presented here also demonstrate that a bottom-up approach to expression resolution is superior to top-down, settling an open question from Baker~\cite{Baker82}.
  • doc/theses/aaron_moss_PhD/phd/generic-bench.tex

    r8f55e8e9 rcf01d0b  
    11\chapter{Generic Stack Benchmarks} \label{generic-bench-app}
    22
    3 Throughout, !/***/! designates a counted redundant type annotation; code reformatted slightly for brevity.
     3This appendix includes the generic stack code for all four language variants discussed in Section~\ref{generic-performance-sec}. Throughout, !/***/! designates a counted redundant type annotation; these include !sizeof! on a known type, repetition of a type name in initialization or return statements, and type-specific helper functions.
     4The code is reformatted slightly for brevity.
    45
    56\section{C}
  • doc/theses/aaron_moss_PhD/phd/generic-types.tex

    r8f55e8e9 rcf01d0b  
    144144
    145145\CC{}, Java, and other languages use \emph{generic types} to produce type-safe abstract data types.
    146 Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and from which this chapter is closely based.
     146Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and on which this chapter is closely based.
    147147\CFA{} generic types integrate efficiently and naturally with the existing polymorphic functions in \CFA{}, while retaining backward compatibility with C in layout and support for separate compilation.
    148148A generic type can be declared in \CFA{} by placing a !forall! specifier on a !struct! or !union! declaration, and instantiated using a parenthesized list of types after the generic name.
     
    189189A key insight for this design is that C already possesses a handful of built-in generic types (\emph{derived types} in the language of the standard \cite[\S{}6.2.5]{C11}), notably pointer (!T*!) and array (!T[]!), and that user-definable generics should act similarly.
    190190
    191 \subsection{Related Work}
     191\subsection{Related Work} \label{generic-related-sec}
    192192
    193193One approach to the design of generic types is that taken by \CC{} templates \cite{C++}.
     
    202202Java \cite{Java8} has another prominent implementation for generic types, introduced in Java~5 and based on a significantly different approach than \CC{}.
    203203The Java approach has much more in common with the !void*!-polymorphism shown in Figure~\ref{void-generic-fig}; since in Java nearly all data is stored by reference, the Java approach to polymorphic data is to store pointers to arbitrary data and insert type-checked implicit casts at compile-time.
    204 This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model and garbage collector.
     204This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model.
    205205To use this model, a more C-like language such as \CFA{} would be required to dynamically allocate internal storage for variables, track their lifetime, and properly clean them up afterward.
    206206
     
    277277A \emph{dtype-static} type has polymorphic parameters but is still concrete.
    278278Polymorphic pointers are an example of dtype-static types; given some type variable !T!, !T! is a polymorphic type, but !T*! has a fixed size and can therefore be represented by a !void*! in code generation.
    279 In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait because the compiler does not know their size and alignment) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.
     279In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait, which is satisfied by all types the compiler knows the size and alignment of) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.
    280280More precisely, a type is concrete if and only if all of its !sized! type parameters are concrete, and a concrete type is dtype-static if any of its type parameters are (possibly recursively) polymorphic.
    281281To illustrate, the following code using the !pair! type from above has each use of !pair! commented with its class:
     
    425425Since these languages are all C-based and compiled with the same compiler backend, maximal-performance benchmarks should show little runtime variance, differing only in length and clarity of source code.
    426426A more illustrative comparison measures the costs of idiomatic usage of each language's features.
    427 The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other other languages, code for which is included in Appendix~\ref{generic-bench-app}.
    428 The experiment uses element types !int! and !pair(short, char)! and pushes $N = 40M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
     427The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other languages, code for which is included in Appendix~\ref{generic-bench-app}.
     428The experiment uses element types !int! and !pair(short, char)! and pushes $N = 4M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
    429429
    430430\begin{cfa}
     
    451451
    452452The four versions of the benchmark implemented are C with !void*!-based polymorphism, \CFA{} with parametric polymorphism, \CC{} with templates, and \CC{} using only class inheritance for polymorphism, denoted \CCV{}.
    453 The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, mimicking a Java-like interface; in particular, runtime checks are necessary to safely downcast objects.
     453The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, a language design similar to Java 4; in particular, runtime checks are necessary to safely downcast objects.
    454454The most notable difference among the implementations is the memory layout of generic types: \CFA{} and \CC{} inline the stack and pair elements into corresponding list and pair nodes, while C and \CCV{} lack such capability and, instead, must store generic objects via pointers to separately allocated objects.
    455455Note that the C benchmark uses unchecked casts as C has no runtime mechanism to perform such checks, whereas \CFA{} and \CC{} provide type safety statically.
     
    481481
    482482The C and \CCV{} variants are generally the slowest and have the largest memory footprint, due to their less-efficient memory layout and the pointer indirection necessary to implement generic types in those languages; this inefficiency is exacerbated by the second level of generic types in the pair benchmarks.
    483 By contrast, the \CFA{} and \CC{} variants run in roughly equivalent time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.
     483By contrast, the \CFA{} and \CC{} variants run in noticably less time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.
    484484\CCV{} is slower than C largely due to the cost of runtime type checking of downcasts (implemented with !dynamic_cast!); the outlier for \CFA{}, pop !pair!, results from the complexity of the generated-C polymorphic code.
    485485The gcc compiler is unable to optimize some dead code and condense nested calls; a compiler designed for \CFA{} could more easily perform these optimizations.
    486 Finally, the binary size for \CFA{} is larger because of static linking with \CFA{} libraries.
     486Finally, the binary size for \CFA{} is larger because of static linking with the \CFA{} prelude library, which includes function definitions for all the built-in operators.
    487487
    488488\CFA{} is also competitive in terms of source code size, measured as a proxy for programmer effort.
  • doc/theses/aaron_moss_PhD/phd/introduction.tex

    r8f55e8e9 rcf01d0b  
    2626The new features make \CFA{} more powerful and expressive than C, while maintaining strong backward-compatibility with both C code and the procedural paradigm expected by C programmers.
    2727Unlike other popular C extensions like \CC{} and Objective-C, \CFA{} adds modern features to C without imposing an object-oriented paradigm to use them.
    28 However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type-system.
     28However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type system.
    2929
    3030This thesis is focused on making \CFA{} a more powerful and expressive language, both by adding new features to the \CFA{} type system and ensuring that both added and existing features can be efficiently implemented in \CFACC{}, the \CFA{} reference compiler.
  • doc/theses/aaron_moss_PhD/phd/resolution-heuristics.tex

    r8f55e8e9 rcf01d0b  
    66A given matching between identifiers and declarations in an expression is an \emph{interpretation}; an interpretation also includes information about polymorphic type bindings and implicit casts to support the \CFA{} features discussed in Sections~\ref{poly-func-sec} and~\ref{implicit-conv-sec}, each of which increase the number of valid candidate interpretations.
    77To choose among valid interpretations, a \emph{conversion cost} is used to rank interpretations.
    8 Hence, the expression resolution problem is to find the unique minimal-cost interpretation for an expression, reporting an error if no such interpretation exists.
     8Hence, the expression resolution problem is to find the unique minimal-cost interpretation for an expression, reporting an error if no such unique interpretation exists.
    99
    1010\section{Expression Resolution}
     
    4848With more specificity, the cost is a lexicographically-ordered tuple, where each element corresponds to a particular kind of conversion.
    4949In Bilson's design, conversion cost is a 3-tuple, $(unsafe, poly, safe)$, where $unsafe$ is the count of unsafe (narrowing) conversions, $poly$ is the count of polymorphic type bindings, and $safe$ is the sum of the degree of safe (widening) conversions.
    50 Degree of safe conversion is calculated as path weight in a directed graph of safe conversions between types; Bilson's version and the current version of this graph are in Figures~\ref{bilson-conv-fig} and~\ref{extended-conv-fig}, respectively.
     50Degree of safe conversion is calculated as path weight in a directed graph of safe conversions between types; Bilson's version of this graph is in Figure~\ref{bilson-conv-fig}.
    5151The safe conversion graph is designed such that the common type $c$ of two types $u$ and $v$ is compatible with the C standard definitions from \cite[\S{}6.3.1.8]{C11} and can be calculated as the unique type minimizing the sum of the path weights of $\overrightarrow{uc}$ and $\overrightarrow{vc}$.
    5252The following example lists the cost in the Bilson model of calling each of the following functions with two !int! parameters, where the interpretation with the minimum total cost will be selected:
     
    124124In the redesign, for consistency with the approach of the usual arithmetic conversions, which select a common type primarily based on size, but secondarily on sign, arcs in the new graph are annotated with whether they represent a sign change, and such sign changes are summed in a new $sign$ cost element that lexicographically succeeds $safe$.
    125125This means that sign conversions are approximately the same cost as widening conversions, but slightly more expensive (as opposed to less expensive in Bilson's graph), so maintaining the same signedness is consistently favoured.
     126This refined conversion graph is shown in Figure~\ref{extended-conv-fig}.
    126127
    127128With these modifications, the current \CFA{} cost tuple is as follows:
     
    257258
    258259Pruning possible interpretations as early as possible is one way to reduce the real-world cost of expression resolution, provided that a sufficient proportion of interpretations are pruned to pay for the cost of the pruning algorithm.
    259 One opportunity for interpretation pruning is by the argument-parameter type matching, but the literature provides no clear answers on whether candidate functions should be chosen according to their available arguments, or whether argument resolution should be driven by the available function candidates.
     260One opportunity for interpretation pruning is by the argument-parameter type matching, but the literature \cite{Baker82,Bilson03,Cormack81,Ganzinger80,Pennello80,PW:overload} provides no clear answers on whether candidate functions should be chosen according to their available arguments, or whether argument resolution should be driven by the available function candidates.
    260261For programming languages without implicit conversions, argument-parameter matching is essentially the entirety of the expression resolution problem, and is generally referred to as ``overload resolution'' in the literature.
    261262All expression-resolution algorithms form a DAG of interpretations, some explicitly, some implicitly; in this DAG, arcs point from function-call interpretations to argument interpretations, as in Figure~\ref{res-dag-fig}
     
    305306The assertion satisfaction algorithm designed by Bilson~\cite{Bilson03} for the original \CFACC{} is the most-relevant prior work to this project.
    306307Before accepting any subexpression candidate, Bilson first checks that that candidate's assertions can all be resolved; this is necessary due to Bilson's addition of assertion satisfaction costs to candidate costs (discussed in Section~\ref{conv-cost-sec}).
    307 If this subexpression interpretation ends up not being used in the final resolution, than the (sometimes substantial) work of checking its assertions ends up wasted.
     308If this subexpression interpretation ends up not being used in the final resolution, then the (sometimes substantial) work of checking its assertions ends up wasted.
    308309Bilson's assertion checking function recurses on two lists, !need! and !newNeed!, the current declaration's assertion set and those implied by the assertion-satisfying declarations, respectively, as detailed in the pseudo-code below (ancillary aspects of the algorithm are omitted for clarity):
    309310
Note: See TracChangeset for help on using the changeset viewer.