Ignore:
Timestamp:
Apr 26, 2019, 4:59:48 PM (5 years ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
5b11c25
Parents:
ffe2fad (diff), 1bc5975 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' into ctxswitch

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/aaron_moss_PhD/phd/background.tex

    rffe2fad r673cd63  
    55To provide background for the contributions in subsequent chapters, this chapter provides a summary of the features of \CFA{} at the time this work was conducted.
    66
    7 The core design of \CFA{} is laid out in Glen Ditchfield's 1992 PhD thesis, \emph{Contextual Polymorphism} \cite{Ditchfield92}; in that thesis, Ditchfield presents the theoretical underpinnings of the \CFA{} polymorphism model.
     7Glen Ditchfield laid out the core design of \CFA{} in his 1992 PhD thesis, \emph{Contextual Polymorphism} \cite{Ditchfield92}; in that thesis, Ditchfield presents the theoretical underpinnings of the \CFA{} polymorphism model.
    88Building on Ditchfield's design for contextual polymorphism as well as KW-C \cite{Buhr94a}, an earlier set of (largely syntactic) extensions to C, Richard Bilson \cite{Bilson03} built the first version of the \CFA{} compiler, \CFACC{}, in the early 2000's.
    99This early \CFACC{} provided basic functionality, but incorporated a number of algorithmic choices that have failed to scale as \CFA{} has developed, lacking the runtime performance for practical use; this thesis is substantially concerned with rectifying those deficits.
     
    1313Notable features added during this period include generic types (Chapter~\ref{generic-chap}), constructors and destructors \cite{Schluntz17}, improved support for tuples \cite{Schluntz17}, reference types \cite{Moss18}, first-class concurrent and parallel programming support \cite{Delisle18}, as well as numerous pieces of syntactic sugar and the start of an idiomatic standard library \cite{Moss18}.
    1414
    15 The selection of features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
     15This thesis is primarily concerned with the \emph{expression resolution} portion of \CFA{} type-checking; resolution is discussed in more detail in Chapter~\ref{resolution-chap}, but is essentially determining which declarations the identifiers in each expression correspond to.
     16In C, no simultaneously-visible declarations share identifiers, hence expression resolution in C is not difficult.
     17In \CFA{}, multiple added features make the resolution process significantly more complex.
     18Due to this complexity, the expression-resolution pass in \CFACC{} requires 95\% of compiler runtime on some source files, making a new, more efficient procedure for expression resolution a requirement for a performant \CFA{} compiler.
     19
     20The features presented in this chapter are chosen to elucidate the design constraints of the work presented in this thesis.
    1621In some cases the interactions of multiple features make this design a significantly more complex problem than any individual feature; in other cases a feature that does not by itself add any complexity to expression resolution triggers previously rare edge cases more frequently.
    1722
     
    2732
    2833Another aspect of \CFA{}'s procedural paradigm is that it retains C's translation-unit-based encapsulation model, rather than class-based encapsulation such as in \CC{}.
    29 This design choice implies that separate compilation must be maintained to allow headers to act as an encapsulation boundary, rather than the header-only libraries used by \CC{} templates.
     34As such, any language feature that requires code to be exposed in header files (\eg{} \CC{} templates) also eliminates encapsulation in \CFA{}.
     35Given this constraint, \CFA{} is carefully designed to allow separate compilation for its added language features under the existing C usage patterns.
    3036
    3137\section{Name Overloading} \label{overloading-sec}
     
    8894The addition of !one_t! allows generic algorithms to handle the unit value uniformly for types where it is meaningful; a simple example of this is that polymorphic functions\footnote{discussed in Section~\ref{poly-func-sec}} in the \CFA{} prelude define !++x! and !x++! in terms of !x += 1!, allowing users to idiomatically define all forms of increment for a type !T! by defining the single function !T& ?+=?(T&, one_t)!; analogous overloads for the decrement operators are also present, and programmers can override any of these functions for a particular type if desired.
    8995
    90 \CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and !forall(dtype T) T* 0!.
     96\CFA{} previously allowed !0! and !1! to be the names of polymorphic variables, with separate overloads for !int 0!, !int 1!, and the polymorphic variable !forall(dtype T) T* 0!.
    9197While designing \CFA{} generic types (see Chapter~\ref{generic-chap}), it was discovered that the parametric polymorphic zero variable is not generalizable to other types; though all null pointers have the same in-memory representation, the same cannot be said of the zero values of arbitrary types.
    92 As such, variables that could represent !0! and !1! were phased out in favour of functions that could generate those values for a given type as appropriate.
     98As such, polymorphic variables, and in particular variables for !0! and !1!, were phased out in favour of functions that could generate those values for a given type as appropriate.
    9399
    94100\section{Polymorphic Functions} \label{poly-func-sec}
     
    105111The type variable !T! is transformed into a set of additional implicit parameters to !identity!, which encode sufficient information about !T! to create and return a variable of that type.
    106112\CFA{} passes the size and alignment of the type represented by an !otype! parameter, as well as a default constructor, copy constructor, assignment operator, and destructor.
    107 Types which do not have one or more of these operators visible cannot be bound to !otype! parameters, but may be bound to un-constrained !dtype! (``data type'') type variables.
     113Types that do not have one or more of these operators visible cannot be bound to !otype! parameters, but may be bound to un-constrained !dtype! (``data type'') type variables.
    108114In this design, the runtime cost of polymorphism is spread over each polymorphic call, due to passing more arguments to polymorphic functions; the experiments in Chapter~\ref{generic-chap} indicate that this overhead is comparable to that of \CC{} virtual function calls.
    109115% \TODO{rerun experiments, possibly look at vtable variant}
     
    115121\subsection{Type Assertions}
    116122
    117 Since bare polymorphic types do not provide a great range of available operations, \CFA{} provides a \emph{type assertion} mechanism to provide further information about a type\footnote{Subscript not included in source code.\label{sub-foot}}:
     123Since bare polymorphic types do not provide a great range of available operations, \CFA{} provides a \emph{type assertion} mechanism to provide further information about a type\footnote{This example introduces a convention used throughout this thesis of disambiguating overloaded names with subscripts; the subscripts do not appear in \CFA{} source code.}:
    118124
    119125\begin{cfa}
     
    129135
    130136Monomorphic specializations of polymorphic functions can themselves be used to satisfy type assertions.
    131 For instance, !twice! could have been defined like this\footref{sub-foot}:
     137For instance, !twice! could have been defined like this:
    132138
    133139\begin{cfa}
     
    136142\end{cfa}
    137143
    138 Specializing this polymorphic function with !S = double! produces a monomorphic function which can  be used to satisfy the type assertion on !four_times!.
     144Specializing this polymorphic function with !S = double! produces a monomorphic function which can be used to satisfy the type assertion on !four_times!.
    139145\CFACC{} accomplishes this by creating a wrapper function calling !twice!$_2$ with !S! bound to !double!, then providing this wrapper function to !four_times!\footnote{\lstinline{twice}$_2$ could also have had a type parameter named \lstinline{T}; \CFA{} specifies renaming of the type parameters, which would avoid the name conflict with the type variable \lstinline{T} of \lstinline{four_times}}.
    140146However, !twice!$_2$ also works for any type !S! that has an addition operator defined for it.
    141147
    142 Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration in the current scope.
     148Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration in the current scope\footnote{\CFACC{} actually performs a type-unification computation for assertion satisfaction rather than an expression resolution computation; see Section~\ref{assn-sat-sec} for details.}.
    143149If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as it is possible that that function is examined as a candidate for its own assertion unboundedly repeatedly.
    144150To avoid such infinite loops, \CFACC{} imposes a fixed limit on the possible depth of recursion, similar to that employed by most \CC{} compilers for template expansion; this restriction means that there are some otherwise semantically well-typed expressions that cannot be resolved by \CFACC{}.
    145 
    146 % \subsection{Deleted Declarations}
    147 
    148 % Particular type combinations can be exempted from matching a given polymorphic function through use of a \emph{deleted function declaration}:
    149 
    150 % \begin{cfa}
    151 % int somefn(char) = void;
    152 % \end{cfa}
    153 
    154 % This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers. A similar effect can be produced on an ad-hoc basis at the appropriate call sites through use of casts to determine the function type. In both cases, the deleted-function form is clearer and more concise.}.
    155 % Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
    156 % If this deleted declaration is selected as the unique minimal-cost interpretation of an expression than an error is produced.
    157151
    158152\subsection{Traits}
     
    190184
    191185Traits, however, are significantly more powerful than nominal-inheritance interfaces; firstly, due to the scoping rules of the declarations that satisfy a trait's type assertions, a type may not satisfy a trait everywhere that that type is declared, as with !char! and the !nominal! trait above.
    192 Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}. 
     186Secondly, because \CFA{} is not object-oriented and types do not have a closed set of methods, existing C library types can be extended to implement a trait simply by writing the requisite functions\footnote{\CC{} only allows partial extension of C types, because constructors, destructors, conversions, and the assignment, indexing, and function-call operators may only be defined in a class; \CFA{} does not have any of these restrictions.}.
    193187Finally, traits may be used to declare a relationship among multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems\footnote{This example uses \CFA{}'s reference types, described in Section~\ref{type-features-sec}.}:
    194188
     
    212206In this example above, !(list_iterator, int)! satisfies !pointer_like! by the user-defined dereference function, and !(list_iterator, list)! also satisfies !pointer_like! by the built-in dereference operator for pointers.
    213207Given a declaration !list_iterator it!, !*it! can be either an !int! or a !list!, with the meaning disambiguated by context (\eg{} !int x = *it;! interprets !*it! as !int!, while !(*it).value = 42;! interprets !*it! as !list!).
    214 While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation, few such systems could model both relationships simultaneously.
     208While a nominal-inheritance system with associated types could model one of those two relationships by making !El! an associated type of !Ptr! in the !pointer_like! implementation,
     209I am unaware of any nominal-inheritance system that can model both relationships simultaneously.
     210Further comparison of \CFA{} polymorphism with other languages can be found in Section~\ref{generic-related-sec}.
    215211
    216212The flexibility of \CFA{}'s implicit trait-satisfaction mechanism provides programmers with a great deal of power, but also blocks some optimization approaches for expression resolution.
    217 The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
     213The ability of types to begin or cease to satisfy traits when declarations go into or out of scope makes caching of trait satisfaction judgments difficult, and the ability of traits to take multiple type parameters can lead to a combinatorial explosion of work in any attempt to pre-compute trait satisfaction relationships.
     214
     215\subsection{Deleted Declarations}
     216
     217Particular type combinations can be exempted from matching a given polymorphic function through the use of a \emph{deleted function declaration}:
     218
     219\begin{cfa}
     220int somefn(char) = void;
     221\end{cfa}
     222
     223This feature is based on a \CCeleven{} feature typically used to make a type non-copyable by deleting its copy constructor and assignment operator\footnote{In previous versions of \CC{}, a type could be made non-copyable by declaring a private copy constructor and assignment operator, but not defining either. This idiom is well-known, but depends on some rather subtle and \CC{}-specific rules about private members and implicitly-generated functions; the deleted function form is both clearer and less verbose.} or forbidding some interpretations of a polymorphic function by specifically deleting the forbidden overloads\footnote{Specific polymorphic function overloads can also be forbidden in previous \CC{} versions through use of template metaprogramming techniques, though this advanced usage is beyond the skills of many programmers.}.
     224Deleted function declarations are implemented in \CFACC{} by adding them to the symbol table as usual, but with a flag set that indicates that the function is deleted.
     225If this deleted declaration is selected as the unique minimal-cost interpretation of an expression then an error is produced, allowing \CFA{} programmers to guide the expression resolver away from undesirable solutions.
    218226
    219227\section{Implicit Conversions} \label{implicit-conv-sec}
     
    224232One contribution of this thesis, discussed in Section~\ref{conv-cost-sec}, is a number of refinements to this cost model to more efficiently resolve polymorphic function calls.
    225233
    226 The expression resolution problem, which is the focus of Chapter~\ref{resolution-chap}, is to find the unique minimal-cost interpretation of each expression in the program, where all identifiers must be matched to a declaration, and implicit conversions or polymorphic bindings of the result of an expression may increase the cost of the expression.
     234In the context of these implicit conversions, the expression resolution problem can be restated as finding the unique minimal-cost interpretation of each expression in the program, where all identifiers must be matched to a declaration, and implicit conversions or polymorphic bindings of the result of an expression may increase the cost of the expression.
    227235While semantically valid \CFA{} code always has such a unique minimal-cost interpretation, \CFACC{} must also be able to detect and report as errors expressions that have either no interpretation or multiple ambiguous minimal-cost interpretations.
    228236Note that which subexpression interpretation is minimal-cost may require contextual information to disambiguate.
     
    241249Given some type !T!, a !T&! (``reference to !T!'') is essentially an automatically dereferenced pointer.
    242250These types allow seamless pass-by-reference for function parameters, without the extraneous dereferencing syntax present in C; they also allow easy aliasing of nested values with a similarly convenient syntax.
    243 A particular improvement is removing syntactic special cases for operators that take or return mutable values; for example, the use !a += b! of a compound assignment operator now matches its signature, !int& ?+=?(int&, int)!, as opposed to the syntactic special cases in earlier versions of \CFA{} to automatically take the address of the first argument to !+=! and to mark its return value as mutable.
     251The addition of reference types also eliminated two syntactic special-cases present in previous versions of \CFA{}.
     252Consider the a call !a += b! to a compound assignment operator.
     253The previous declaration for that operator is !lvalue int ?+=?(int*, int)!.
     254To mutate the left argument, the built-in operators were special-cased to implicitly take the address of that argument, while the special !lvalue! syntax was used to mark the return type of a function as a mutable reference.
     255With references, this declaration is re-written as !int& ?+=?(int&, int)!.
     256The reference semantics generalize the implicit address-of on the left argument and allow it to be used in user-declared functions, while also subsuming the (now removed) !lvalue! syntax for function return types.
    244257
    245258The C standard makes heavy use of the concept of \emph{lvalue}, an expression with a memory address; its complement, \emph{rvalue} (a non-addressable expression) is not explicitly named in the standard.
    246259In \CFA{}, the distinction between lvalue and rvalue can be re-framed in terms of reference and non-reference types, with the benefit of being able to express the difference in user code.
    247 \CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!)
     260\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!).
    248261To make reference types more easily usable in legacy pass-by-value code, \CFA{} also adds an implicit rvalue-to-lvalue conversion, implemented by storing the value in a compiler-generated temporary variable and passing a reference to that temporary.
    249262To mitigate the ``!const! hell'' problem present in \CC{}, there is also a qualifier-dropping lvalue-to-lvalue conversion implemented by copying into a temporary:
     
    251264\begin{cfa}
    252265const int magic = 42;
    253 
    254266void inc_print( int& x ) { printf("%d\n", ++x); }
    255267
    256 print_inc( magic ); $\C{// legal; implicitly generated code in red below:}$
     268inc_print( magic ); $\C{// legal; implicitly generated code in red below:}$
    257269
    258270`int tmp = magic;` $\C{// to safely strip const-qualifier}$
    259 `print_inc( tmp );` $\C{// tmp is incremented, magic is unchanged}$
     271`inc_print( tmp );` $\C{// tmp is incremented, magic is unchanged}$
    260272\end{cfa}
    261273
     
    265277\CFA{} supports all of these use cases without further added syntax.
    266278The key to this syntax-free feature support is an observation made by the author that the address of a reference is a lvalue.
    267 In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference, and pointers to references by repeating the address-of operator:
     279In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference\footnote{The syntactic difference between reference initialization and reference assignment is unfortunate, but preserves the ability to pass function arguments by reference (a reference initialization context) without added syntax.}, and pointers to references by repeating the address-of operator:
    268280
    269281\begin{cfa}
     
    274286\end{cfa}
    275287
    276 For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of normalization conversions and adapters.
    277 
    278 \subsection{Resource Management}
     288For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of matching reference to non-reference types during type-checking.
     289
     290\subsection{Resource Management} \label{ctor-sec}
    279291
    280292\CFA{} also supports the RAII (``Resource Acquisition is Initialization'') idiom originated by \CC{}, thanks to the object lifetime work of Robert Schluntz \cite{Schluntz17}.
Note: See TracChangeset for help on using the changeset viewer.