Changeset f9c7d27


Ignore:
Timestamp:
Sep 17, 2018, 11:43:41 AM (6 years ago)
Author:
Aaron Moss <a3moss@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, no_list, persistent-indexer, pthread-emulation, qualifiedEnum
Children:
91a950c
Parents:
3271166
Message:

Draft reference types and resource management subsections of thesis background

Location:
doc/theses/aaron_moss_PhD/phd
Files:
3 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/aaron_moss_PhD/phd/background.tex

    r3271166 rf9c7d27  
    2121It is important to note that \CFA{} is not an object-oriented language.
    2222This is a deliberate choice intended to maintain the applicability of the mental model and language idioms already possessed by C programmers.
    23 This choice is in marked contrast to \CC{}, which, though it has backward-compatibility with C on the source code level, is a much larger and more complex language, and requires extensive developer re-training before they can write idiomatic, efficient code in \CC{}'s object-oriented paradigm.
     23This choice is in marked contrast to \CC{}, which, though it has backward-compatibility with C on the source code level, is a much larger and more complex language, and requires extensive developer re-training to write idiomatic, efficient code in \CC{}'s object-oriented paradigm.
    2424
    2525\CFA{} does have a system of implicit type conversions derived from C's ``usual arithmetic conversions''; while these conversions may be thought of as something like an inheritance hierarchy, the underlying semantics are significantly different and such an analogy is loose at best.
     
    6262struct counter { int x; };
    6363
    64 counter& `++?`(counter& c) { ++c.x; return c; }  $\C{// pre-increment}$
    65 counter `?++`(counter& c) {  $\C{// post-increment}$
     64counter& `++?`(counter& c) { ++c.x; return c; }  $\C[2in]{// pre-increment}$
     65counter `?++`(counter& c) {  $\C[2in]{// post-increment}$
    6666        counter tmp = c; ++c; return tmp;
    6767}
    68 bool `?<?`(const counter& a, const counter& b) {  $\C{// comparison}$
     68bool `?<?`(const counter& a, const counter& b) {  $\C[2in]{// comparison}$
    6969        return a.x < b.x;
    7070}
     
    9191One benefit of this design is that it allows polymorphic functions to be separately compiled.
    9292The forward declaration !forall(otype T) T identity(T);! uniquely defines a single callable function, which may be implemented in a different file.
    93 The fact that there is only one implementation of each polymorphic function also reduces compile times relative to the template-expansion approach taken by \CC{}, as well as reducing binary sizes and runtime pressure on instruction cache at by re-using a single version of each function.
     93The fact that there is only one implementation of each polymorphic function also reduces compile times relative to the template-expansion approach taken by \CC{}, as well as reducing binary sizes and runtime pressure on instruction cache by re-using a single version of each function.
    9494
    9595\subsubsection{Type Assertions}
     
    117117
    118118This version of !twice! works for any type !S! that has an addition operator defined for it, and it could be used to satisfy the type assertion on !four_times!.
    119 \CFACC{} accomplishes this by creating a wrapper function calling !twice // (2)! with !S! bound to !double!, then providing this wrapper function to !four_times!\footnote{\lstinline{twice // (2)} could also have had a type parameter named \lstinline{T}; \CFA{} specifies renaming of the type parameters, which would avoid the name conflict with the type variable \lstinline{T} of \lstinline{four_times}}.
    120 
    121 Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration \emph{in the current scope}.
    122 If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as it is possible that that function is examined as a candidate for its own type assertion unboundedly repeatedly.
     119\CFACC{} accomplishes this by creating a wrapper function calling !twice//(2)! with !S! bound to !double!, then providing this wrapper function to !four_times!\footnote{\lstinline{twice // (2)} could also have had a type parameter named \lstinline{T}; \CFA{} specifies renaming of the type parameters, which would avoid the name conflict with the type variable \lstinline{T} of \lstinline{four_times}}.
     120
     121Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration in the current scope.
     122If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as it is possible that that function is examined as a candidate for its own assertion unboundedly repeatedly.
    123123To avoid such infinite loops, \CFACC{} imposes a fixed limit on the possible depth of recursion, similar to that employed by most \CC{} compilers for template expansion; this restriction means that there are some semantically well-typed expressions that cannot be resolved by \CFACC{}.
    124 \TODO{Update this with final state} One contribution made in the course of this thesis was modifying \CFACC{} to use the more flexible expression resolution algorithm for assertion matching, rather than the previous simpler approach of unification on the types of the functions.
     124\TODO{Update this with final state} One contribution made in the course of this thesis was modifying \CFACC{} to use the more flexible expression resolution algorithm for assertion matching, rather than the simpler but limited previous approach of unification on the types of the functions.
    125125
    126126\subsubsection{Deleted Declarations}
     
    175175\begin{cfa}
    176176trait pointer_like(`otype Ptr, otype El`) {
    177         El& *?(Ptr);  $\C{Ptr can be dereferenced to El}$
     177        El& *?(Ptr);  $\C{// Ptr can be dereferenced to El}$
    178178};
    179179
    180180struct list {
    181181        int value;
    182         list* next; $\C{may omit struct on type names}$
     182        list* next; $\C{// may omit struct on type names}$
    183183};
    184184
     
    200200
    201201In addition to the multiple interpretations of an expression produced by name overloading and polymorphic functions, for backward compatibility \CFA{} must support all of the implicit conversions present in C, producing further candidate interpretations for expressions.
    202 As mentioned above, C does not have an inheritance hierarchy of types, but the C standard's rules for the ``usual arithmetic conversions''\cit{} define which of the built-in tyhpes are implicitly convertable to which other types, and the relative cost of any pair of such conversions from a single source type.
    203 \CFA{} adds to the usual arithmetic conversions rules defining the cost of binding a polymorphic type variable in a function call; such bindings are cheaper than any \emph{unsafe} (narrowing) conversion, \eg{} !int! to !char!, but more expensive than any \emph{safe} (widening) conversion, \eg{} !int! to !double!.
     202As mentioned above, C does not have an inheritance hierarchy of types, but the C standard's rules for the ``usual arithmetic conversions'\cit{} define which of the built-in types are implicitly convertible to which other types, and the relative cost of any pair of such conversions from a single source type.
     203\CFA{} adds rules to the usual arithmetic conversions defining the cost of binding a polymorphic type variable in a function call; such bindings are cheaper than any \emph{unsafe} (narrowing) conversion, \eg{} !int! to !char!, but more expensive than any \emph{safe} (widening) conversion, \eg{} !int! to !double!.
    204204One contribution of this thesis, discussed in Section \TODO{add to resolution chapter}, is a number of refinements to this cost model to more efficiently resolve polymorphic function calls.
    205205
     
    208208Note that which subexpression interpretation is minimal-cost may require contextual information to disambiguate.
    209209For instance, in the example in Section~\ref{overloading-sec}, !max(max, -max)! cannot be unambiguously resolved, but !int m = max(max, -max)! has a single minimal-cost resolution.
    210 While the interpretation !int m = (int)max((double)max, -(double)max)! is also a valid interpretation, it is not minimal-cost due to the unsafe cast from the !double! result of !max! to the !int!\footnote{The two \lstinline{double} casts function as type ascriptions selecting \lstinline{double max} rather than casts from \lstinline{int max} to \lstinline{double}, and as such are zero-cost.}.
    211 These contextual effects make the expression resolution problem for \CFA{} both theoretically and practically difficult, but the observation driving the work in Chapter~\ref{resolution-chap} is that of the many top-level expressions in a given program, most will likely be straightforward and idiomatic so that programmers writing and maintaining the code can easily understand them; it follows that effective heuristics for common cases can bring down compiler runtime enough that a small proportion of harder-to-resolve expressions should not increase compiler runtime or memory usage inordinately.
     210While the interpretation !int m = (int)max((double)max, -(double)max)! is also a valid interpretation, it is not minimal-cost due to the unsafe cast from the !double! result of !max! to !int!\footnote{The two \lstinline{double} casts function as type ascriptions selecting \lstinline{double max} rather than casts from \lstinline{int max} to \lstinline{double}, and as such are zero-cost.}.
     211These contextual effects make the expression resolution problem for \CFA{} both theoretically and practically difficult, but the observation driving the work in Chapter~\ref{resolution-chap} is that of the many top-level expressions in a given program, most are straightforward and idiomatic so that programmers writing and maintaining the code can easily understand them; it follows that effective heuristics for common cases can bring down compiler runtime enough that a small proportion of harder-to-resolve expressions does not inordinately increase overall compiler runtime or memory usage.
    212212
    213213\subsection{Type Features} \label{type-features-sec}
    214214
     215The name overloading and polymorphism features of \CFA{} have the greatest effect on language design and compiler runtime, but there are a number of other features in the type system which have a smaller effect but are useful for code examples.
     216These features are described here.
     217
    215218\subsubsection{Reference Types}
    216219
    217 % TODO mention contribution on reference rebind
    218 
    219 \subsubsection{Lifetime Management}
     220One of the key ergonomic improvements in \CFA{} is reference types, designed and implemented by Robert Schluntz\cite{Schluntz17}.
     221Given some type !T!, a !T&! (``reference to !T!'') is essentially an automatically dereferenced pointer.
     222These types allow seamless pass-by-reference for function parameters, without the extraneous dereferencing syntax present in C; they also allow easy easy aliasing of nested values with a similarly convenient syntax.
     223A particular improvement is removing syntactic special cases for operators which take or return mutable values; for example, the use !a += b! of a compound assignment operator now matches its signature, !int& ?+=?(int&, int)!, as opposed to the previous syntactic special cases to automatically take the address of the first argument to !+=! and to mark its return value as mutable.
     224
     225The C standard makes heavy use of the concept of \emph{lvalue}, an expression with a memory address; its complement, \emph{rvalue} (a non-addressable expression) is not explicitly named.
     226In \CFA{}, the distinction between lvalue and rvalue can be reframed in terms of reference and non-reference types, with the benefit of being able to express the difference in user code.
     227\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!)
     228To make reference types more easily usable in legacy pass-by-value code, \CFA{} also adds an implicit rvalue-to-lvalue conversion, implemented by storing the value in a fresh compiler-generated temporary variable and passing a reference to that temporary.
     229To mitigate the ``!const! hell'' problem present in \CC{}, there is also a qualifier-dropping lvalue-to-lvalue conversion, also implemented by copying into a temporary:
     230
     231\begin{cfa}
     232const int magic = 42;
     233
     234void inc_print( int& x ) { printf("%d\n", ++x); }
     235
     236print_inc( magic ); $\C{// legal; implicitly generated code in red below:}$
     237
     238`int tmp = magic;` $\C{// to safely strip const-qualifier}$
     239`print_inc( tmp );` $\C{// tmp is incremented, magic is unchanged}$
     240\end{cfa}
     241
     242Despite the similar syntax, \CFA{} references are significantly more flexible than \CC{} references.
     243The primary issue with \CC{} references is that it is impossible to extract the address of the reference variable rather than the address of the referred-to variable.
     244This breaks a number of the usual compositional properties of the \CC{} type system, \eg{} a reference cannot be re-bound to another variable, nor is it possible to take a pointer to, array of, or reference to a reference.
     245\CFA{} supports all of these use cases \TODO{test array} without further added syntax.
     246The key to this syntax-free feature support is an observation made by the author that the address of a reference is a lvalue.
     247In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference, and pointers to references by repeating the address-of operator:
     248
     249\begin{cfa}
     250int x = 2, y = 3;
     251int& r = x;  $\C{// r aliases x}$
     252&r = &y; $\C{// r now aliases y}$
     253int** p = &&r; $\C{// p points to r}$
     254\end{cfa}
     255
     256For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of normalization conversions and adapters.
     257
     258\subsubsection{Resource Management}
     259
     260\CFA{} also supports the RAII (``Resource Acquisition is Initialization'') idiom originated by \CC{}, thanks to the object lifetime work of Robert Schluntz\cite{Schluntz17}.
     261This idiom allows a safer and more principled approach to resource management by tying acquisition of a resource to object initialization, with the corresponding resource release executed automatically at object finalization.
     262A wide variety of conceptual resources may be conveniently managed by this scheme, including heap memory, file handles, and software locks.
     263
     264\CFA{}'s implementation of RAII is based on special constructor and destructor operators, available via the !x{ ... }! constructor syntax and !^x{ ... }! destructor syntax.
     265Each type has an overridable compiler-generated zero-argument constructor, copy constructor, assignment operator, and destructor, as well as a field-wise constructor for each appropriate prefix of the member fields of !struct! types.
     266For !struct! types the default versions of these operators call their equivalents on each field of the !struct!.
     267The main implication of these object lifetime functions for expression resolution is that they are all included as implicit type assertions for !otype! type variables, with a secondary effect being an increase in code size due to the compiler-generated operators.
     268Due to these implicit type assertions, assertion resolution is pervasive in \CFA{} polymorphic functions, even those without explicit type assertions.
     269Implicitly-generated code is shown in red in the following example:
     270
     271\begin{cfa}
     272struct kv {
     273        int key;
     274        char* value;
     275};
     276
     277`void ?{} (kv& this) {` $\C[3in]{// default constructor}$
     278`       this.key{};` $\C[3in]{// call recursively on members}$
     279`       this.value{};
     280}
     281
     282void ?{} (kv& this, int key) {` $\C[3in]{// partial field constructor}$
     283`       this.key{ key };
     284        this.value{};` $\C[3in]{// default-construct missing fields}$
     285`}
     286
     287void ?{} (kv& this, int key, char* value) {` $\C[3in]{// complete field constructor}$
     288`       this.key{ key };
     289        this.value{ value };
     290}
     291
     292void ?{} (kv& this, kv that) {` $\C[3in]{// copy constructor}$
     293`       this.key{ that.key };
     294        this.value{ that.value };
     295}
     296
     297kv ?=? (kv& this, kv that) {` $\C[3in]{// assignment operator}$
     298`       this.key = that.key;
     299        this.value = that.value;
     300}
     301
     302void ^?{} (kv& this) {` $\C[3in]{// destructor}$
     303`       ^this.key{};
     304        ^this.value{};
     305}`
     306
     307forall(otype T `| { void ?{}(T&); void ?{}(T&, T); T ?=?(T&, T); void ^?{}(T&); }`)
     308void foo(T);
     309\end{cfa}
    220310
    221311\subsubsection{0 and 1 Literals}
     312
     313% TODO mention own motivating contribution
     314
     315% TODO mention future work in user-defined implicit conversions
     316
     317\subsubsection{Tuple Types}
     318
     319% TODO "precludes some matching strategies"
  • doc/theses/aaron_moss_PhD/phd/cfa-macros.tex

    r3271166 rf9c7d27  
    2020\newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}}
    2121
    22 \newcommand{\C}[2][2in]{\hfill\makebox[#1][l]{\LstCommentStyle{#2}}}
     22\newcommand{\C}[2][3.5in]{\hfill\makebox[#1][l]{\LstCommentStyle{#2}}}
    2323
    2424% CFA programming language, based on ANSI C (with some gcc additions)
  • doc/theses/aaron_moss_PhD/phd/generic-types.tex

    r3271166 rf9c7d27  
    55
    66% TODO discuss layout function algorithm, application to separate compilation
     7% TODO put a static const field in for _n_fields for each generic, describe utility for separate compilation
    78
    89% TODO mention impetus for zero_t design
    910
    1011% TODO mention use in tuple-type implementation
     12
     13% TODO pull benchmarks from Moss et al.
Note: See TracChangeset for help on using the changeset viewer.