Changeset f9c7d27

Timestamp:

Sep 17, 2018, 11:43:41 AM

Author:

Aaron Moss <a3moss@…>

Branches:

ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, stuck-waitfor-destruct

Children:

Parents:

Message:

Draft reference types and resource management subsections of thesis background

Location:

doc/theses/aaron_moss_PhD/phd

Files:

: 3 edited

background.tex (modified) (7 diffs)
cfa-macros.tex (modified) (1 diff)
generic-types.tex (modified) (1 diff)

doc/theses/aaron_moss_PhD/phd/background.tex

-              r3271166
+              rf9c7d27
 It is important to note that \CFA{} is not an object-oriented language.
 This is a deliberate choice intended to maintain the applicability of the mental model and language idioms already possessed by C programmers.
-This choice is in marked contrast to \CC{}, which, though it has backward-compatibility with C on the source code level, is a much larger and more complex language, and requires extensive developer re-training before they can write idiomatic, efficient code in \CC{}'s object-oriented paradigm.
+This choice is in marked contrast to \CC{}, which, though it has backward-compatibility with C on the source code level, is a much larger and more complex language, and requires extensive developer re-training to write idiomatic, efficient code in \CC{}'s object-oriented paradigm.
 \CFA{} does have a system of implicit type conversions derived from C's ``usual arithmetic conversions''; while these conversions may be thought of as something like an inheritance hierarchy, the underlying semantics are significantly different and such an analogy is loose at best.
 …
 struct counter { int x; };
-counter& `++?`(counter& c) { ++c.x; return c; }  $\C{// pre-increment}$
-counter `?++`(counter& c) {  $\C{// post-increment}$
+counter& `++?`(counter& c) { ++c.x; return c; }  $\C[2in]{// pre-increment}$
+counter `?++`(counter& c) {  $\C[2in]{// post-increment}$
         counter tmp = c; ++c; return tmp;
+}
-bool `?<?`(const counter& a, const counter& b) {  $\C{// comparison}$
+bool `?<?`(const counter& a, const counter& b) {  $\C[2in]{// comparison}$
         return a.x < b.x;
+}
 …
 One benefit of this design is that it allows polymorphic functions to be separately compiled.
 The forward declaration !forall(otype T) T identity(T);! uniquely defines a single callable function, which may be implemented in a different file.
-The fact that there is only one implementation of each polymorphic function also reduces compile times relative to the template-expansion approach taken by \CC{}, as well as reducing binary sizes and runtime pressure on instruction cache at by re-using a single version of each function.
+The fact that there is only one implementation of each polymorphic function also reduces compile times relative to the template-expansion approach taken by \CC{}, as well as reducing binary sizes and runtime pressure on instruction cache by re-using a single version of each function.
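As a rough illustration of why one implementation can serve every instantiation, the C sketch below shows one plausible lowering of !forall(otype T) T identity(T);!. The name !identity_poly!, its parameter layout, and the use of !memcpy! in place of a copy constructor are assumptions made for this example; they are not taken from the code \CFACC{} actually generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lowering of forall(otype T) T identity(T):
 * the type variable T is represented by its size, and values travel
 * through untyped pointers, so this single body can be compiled in its
 * own translation unit and linked against any caller. */
void identity_poly(size_t size_T, void *ret, const void *arg) {
    assert(ret != NULL && arg != NULL);
    memcpy(ret, arg, size_T);   /* stands in for T's copy constructor */
}

/* Call sites for two different bindings of T reuse the same body. */
int identity_int(int x) {
    int r;
    identity_poly(sizeof(int), &r, &x);
    return r;
}

double identity_double(double x) {
    double r;
    identity_poly(sizeof(double), &r, &x);
    return r;
}
```

Only the declaration of !identity_poly! has to be visible to callers, which is the property that permits separate compilation; a template-style approach would instead emit one body per binding of !T!.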
 \subsubsection{Type Assertions}
 …
 This version of !twice! works for any type !S! that has an addition operator defined for it, and it could be used to satisfy the type assertion on !four_times!.
-\CFACC{} accomplishes this by creating a wrapper function calling !twice // (2)! with !S! bound to !double!, then providing this wrapper function to !four_times!\footnote{\lstinline{twice // (2)} could also have had a type parameter named \lstinline{T}; \CFA{} specifies renaming of the type parameters, which would avoid the name conflict with the type variable \lstinline{T} of \lstinline{four_times}}.
-Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration \emph{in the current scope}.
-If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as it is possible that that function is examined as a candidate for its own type assertion unboundedly repeatedly.
+\CFACC{} accomplishes this by creating a wrapper function calling !twice//(2)! with !S! bound to !double!, then providing this wrapper function to !four_times!\footnote{\lstinline{twice // (2)} could also have had a type parameter named \lstinline{T}; \CFA{} specifies renaming of the type parameters, which would avoid the name conflict with the type variable \lstinline{T} of \lstinline{four_times}}.
+Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration in the current scope.
+If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as the function may be examined as a candidate for its own assertion an unbounded number of times.
 To avoid such infinite loops, \CFACC{} imposes a fixed limit on the possible depth of recursion, similar to that employed by most \CC{} compilers for template expansion; this restriction means that there are some semantically well-typed expressions that cannot be resolved by \CFACC{}.
-\TODO{Update this with final state} One contribution made in the course of this thesis was modifying \CFACC{} to use the more flexible expression resolution algorithm for assertion matching, rather than the previous simpler approach of unification on the types of the functions.
+\TODO{Update this with final state} One contribution made in the course of this thesis was modifying \CFACC{} to use the more flexible expression resolution algorithm for assertion matching, rather than the simpler but limited previous approach of unification on the types of the functions.
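The wrapper mechanism described above can be sketched in C. The definitions of !twice! and !four_times! are elided in this excerpt, so the sketch assumes that !four_times! applies !twice! two times and that !twice! adds its argument to itself; the function-pointer types and every name ending in !_poly! or !_wrapper! are hypothetical and only illustrate the idea of passing assertions as hidden parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Assertion "S ?+?(S, S)" passed as an explicit function pointer. */
typedef void (*add_fn)(void *ret, const void *a, const void *b);
/* Assertion "T twice(T)" in the shape four_times expects: no extra parameters. */
typedef void (*twice_fn)(void *ret, const void *x);

/* Polymorphic twice (2): requires the addition assertion for S. */
static void twice_poly(add_fn add, void *ret, const void *x) {
    add(ret, x, x);
}

/* Polymorphic four_times: requires the twice assertion for T. */
static void four_times_poly(twice_fn twice, void *ret, const void *x) {
    assert(twice != NULL);
    twice(ret, x);     /* ret = twice(x) */
    twice(ret, ret);   /* ret = twice(ret); operands are read before ret is written */
}

/* Built-in addition on double, adapted to the untyped calling convention. */
static void add_double(void *ret, const void *a, const void *b) {
    *(double *)ret = *(const double *)a + *(const double *)b;
}

/* Generated wrapper: twice (2) with S bound to double. It supplies the
 * addition assertion itself, so its signature matches twice_fn. */
static void twice_double_wrapper(void *ret, const void *x) {
    twice_poly(add_double, ret, x);
}

/* What a call four_times(x) with x of type double might expand to. */
double four_times_double(double x) {
    double r;
    four_times_poly(twice_double_wrapper, &r, &x);
    return r;
}
```

The wrapper is needed because !twice_poly! takes an extra assertion parameter that !four_times_poly! knows nothing about; resolving that inner assertion is the recursive step that the depth limit bounds.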
 \subsubsection{Deleted Declarations}
 …
 \begin{cfa}
 trait pointer_like(`otype Ptr, otype El`) {
-        El& *?(Ptr);  $\C{Ptr can be dereferenced to El}$
+        El& *?(Ptr);  $\C{// Ptr can be dereferenced to El}$
 };
 struct list {
         int value;
-        list* next; $\C{may omit struct on type names}$
+        list* next; $\C{// may omit struct on type names}$
 };
 …
 In addition to the multiple interpretations of an expression produced by name overloading and polymorphic functions, for backward compatibility \CFA{} must support all of the implicit conversions present in C, producing further candidate interpretations for expressions.
-As mentioned above, C does not have an inheritance hierarchy of types, but the C standard's rules for the ``usual arithmetic conversions''\cit{} define which of the built-in tyhpes are implicitly convertable to which other types, and the relative cost of any pair of such conversions from a single source type.
-\CFA{} adds to the usual arithmetic conversions rules defining the cost of binding a polymorphic type variable in a function call; such bindings are cheaper than any \emph{unsafe} (narrowing) conversion, \eg{} !int! to !char!, but more expensive than any \emph{safe} (widening) conversion, \eg{} !int! to !double!.
+As mentioned above, C does not have an inheritance hierarchy of types, but the C standard's rules for the ``usual arithmetic conversions''\cit{} define which of the built-in types are implicitly convertible to which other types, and the relative cost of any pair of such conversions from a single source type.
+\CFA{} adds rules to the usual arithmetic conversions defining the cost of binding a polymorphic type variable in a function call; such bindings are cheaper than any \emph{unsafe} (narrowing) conversion, \eg{} !int! to !char!, but more expensive than any \emph{safe} (widening) conversion, \eg{} !int! to !double!.
 One contribution of this thesis, discussed in Section \TODO{add to resolution chapter}, is a number of refinements to this cost model to more efficiently resolve polymorphic function calls.
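The ordering described above (safe conversions cheapest, then polymorphic bindings, then unsafe conversions) can be modelled by a tuple compared lexicographically. The three-field layout and the names !cost!, !cost_compare!, and !cheapest! below are assumptions for illustration; the text does not specify how \CFACC{} represents costs, and a real resolver would report ties as ambiguities rather than silently keeping the first candidate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost tuple: how many conversions of each kind an
 * interpretation needs, most significant field first. */
struct cost {
    unsigned unsafe;  /* narrowing conversions, e.g. int to char */
    unsigned poly;    /* polymorphic type-variable bindings */
    unsigned safe;    /* widening conversions, e.g. int to double */
};

/* Lexicographic comparison: negative if a is cheaper, positive if b is
 * cheaper, zero if the two costs are equal. */
int cost_compare(struct cost a, struct cost b) {
    if (a.unsafe != b.unsafe) return a.unsafe < b.unsafe ? -1 : 1;
    if (a.poly != b.poly) return a.poly < b.poly ? -1 : 1;
    if (a.safe != b.safe) return a.safe < b.safe ? -1 : 1;
    return 0;
}

/* Index of the minimal-cost candidate; on a tie the earliest one is kept. */
size_t cheapest(const struct cost *candidates, size_t n) {
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (cost_compare(candidates[i], candidates[best]) < 0) best = i;
    }
    return best;
}
```

Under this model a single polymorphic binding beats any interpretation that needs an unsafe conversion, and any number of safe conversions beats a single polymorphic binding, which is a stronger statement than the text makes.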
 …
 Note that which subexpression interpretation is minimal-cost may require contextual information to disambiguate.
 For instance, in the example in Section~\ref{overloading-sec}, !max(max, -max)! cannot be unambiguously resolved, but !int m = max(max, -max)! has a single minimal-cost resolution.
-While the interpretation !int m = (int)max((double)max, -(double)max)! is also a valid interpretation, it is not minimal-cost due to the unsafe cast from the !double! result of !max! to the !int!\footnote{The two \lstinline{double} casts function as type ascriptions selecting \lstinline{double max} rather than casts from \lstinline{int max} to \lstinline{double}, and as such are zero-cost.}.
-These contextual effects make the expression resolution problem for \CFA{} both theoretically and practically difficult, but the observation driving the work in Chapter~\ref{resolution-chap} is that of the many top-level expressions in a given program, most will likely be straightforward and idiomatic so that programmers writing and maintaining the code can easily understand them; it follows that effective heuristics for common cases can bring down compiler runtime enough that a small proportion of harder-to-resolve expressions should not increase compiler runtime or memory usage inordinately.
+While the interpretation !int m = (int)max((double)max, -(double)max)! is also a valid interpretation, it is not minimal-cost due to the unsafe cast from the !double! result of !max! to !int!\footnote{The two \lstinline{double} casts function as type ascriptions selecting \lstinline{double max} rather than casts from \lstinline{int max} to \lstinline{double}, and as such are zero-cost.}.
+These contextual effects make the expression resolution problem for \CFA{} both theoretically and practically difficult, but the observation driving the work in Chapter~\ref{resolution-chap} is that of the many top-level expressions in a given program, most are straightforward and idiomatic so that programmers writing and maintaining the code can easily understand them; it follows that effective heuristics for common cases can bring down compiler runtime enough that a small proportion of harder-to-resolve expressions does not inordinately increase overall compiler runtime or memory usage.
 \subsection{Type Features} \label{type-features-sec}
+The name overloading and polymorphism features of \CFA{} have the greatest effect on language design and compiler runtime, but there are a number of other features in the type system which have a smaller effect but are useful for code examples.
+These features are described here.
 \subsubsection{Reference Types}
+% TODO mention contribution on reference rebind
+\subsubsection{Lifetime Management}
+One of the key ergonomic improvements in \CFA{} is reference types, designed and implemented by Robert Schluntz\cite{Schluntz17}.
+Given some type !T!, a !T&! (``reference to !T!'') is essentially an automatically dereferenced pointer.
+These types allow seamless pass-by-reference for function parameters, without the extraneous dereferencing syntax present in C; they also allow easy aliasing of nested values with a similarly convenient syntax.
+A particular improvement is removing syntactic special cases for operators which take or return mutable values; for example, the use !a += b! of a compound assignment operator now matches its signature, !int& ?+=?(int&, int)!, as opposed to the previous syntactic special cases to automatically take the address of the first argument to !+=! and to mark its return value as mutable.
+The C standard makes heavy use of the concept of \emph{lvalue}, an expression with a memory address; its complement, \emph{rvalue} (a non-addressable expression) is not explicitly named.
+In \CFA{}, the distinction between lvalue and rvalue can be reframed in terms of reference and non-reference types, with the benefit of being able to express the difference in user code.
+\CFA{} references preserve the existing qualifier-dropping implicit lvalue-to-rvalue conversion from C (\eg{} a !const volatile int&! can be implicitly copied to a bare !int!).
+To make reference types more easily usable in legacy pass-by-value code, \CFA{} also adds an implicit rvalue-to-lvalue conversion, implemented by storing the value in a fresh compiler-generated temporary variable and passing a reference to that temporary.
+To mitigate the ``!const! hell'' problem present in \CC{}, there is also a qualifier-dropping lvalue-to-lvalue conversion, also implemented by copying into a temporary:
+\begin{cfa}
+const int magic = 42;
+void print_inc( int& x ) { printf("%d\n", ++x); }
+print_inc( magic ); $\C{// legal; implicitly generated code in red below:}$
+`int tmp = magic;` $\C{// to safely strip const-qualifier}$
+`print_inc( tmp );` $\C{// tmp is incremented, magic is unchanged}$
+\end{cfa}
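In plain C terms, the qualifier-dropping conversion amounts to copying into a mutable temporary and passing its address. The sketch below is an assumed desugaring for illustration only: the reference parameter becomes a pointer, and !print_inc! is given an !int! return value (which the \CFA{} version does not have) purely so that the effect can be checked.

```c
#include <assert.h>
#include <stdio.h>

static const int magic = 42;

/* void print_inc( int& x ) with the reference lowered to a pointer.
 * Returning the incremented value is an addition for this example. */
int print_inc(int *x) {
    assert(x != NULL);
    ++*x;
    printf("%d\n", *x);
    return *x;
}

/* What the call print_inc( magic ) is described as expanding to. */
int call_with_const(void) {
    int tmp = magic;          /* fresh temporary safely strips the const qualifier */
    return print_inc(&tmp);   /* tmp is incremented, magic is unchanged */
}
```

Because only the temporary is modified, the !const! object is never written to, which is what makes the implicit conversion safe.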
+Despite the similar syntax, \CFA{} references are significantly more flexible than \CC{} references.
+The primary issue with \CC{} references is that it is impossible to extract the address of the reference variable rather than the address of the referred-to variable.
+This breaks a number of the usual compositional properties of the \CC{} type system, \eg{} a reference cannot be re-bound to another variable, nor is it possible to take a pointer to, array of, or reference to a reference.
+\CFA{} supports all of these use cases \TODO{test array} without further added syntax.
+The key to this syntax-free feature support is an observation made by the author that the address of a reference is an lvalue.
+In C, the address-of operator !&x! can only be applied to lvalue expressions, and always produces an immutable rvalue; \CFA{} supports reference re-binding by assignment to the address of a reference, and pointers to references by repeating the address-of operator:
+\begin{cfa}
+int x = 2, y = 3;
+int& r = x;  $\C{// r aliases x}$
+&r = &y; $\C{// r now aliases y}$
+int** p = &&r; $\C{// p points to r}$
+\end{cfa}
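One way to read this example is through the pointer that implements each reference. The C sketch below is an assumed desugaring, not \CFACC{} output: !r! becomes a pointer that ordinary uses dereference automatically, !&r = &y! becomes assignment to that pointer, and !&&r! is the pointer's own address. The two compound assignments do not appear in the original example and are added only to make the aliasing observable.

```c
#include <assert.h>

/* Runs the desugared example and returns x + y afterwards. */
int rebind_demo(void) {
    int x = 2, y = 3;
    int *r = &x;        /* int& r = x;      r aliases x */
    *r += 10;           /* r += 10;         writes through to x */
    assert(x == 12 && y == 3);

    r = &y;             /* &r = &y;         r now aliases y */
    int **p = &r;       /* int** p = &&r;   p points to r */
    **p += 100;         /* writes through p and r to y */
    assert(x == 12 && y == 103);

    return x + y;
}
```

Seen this way, re-binding needs no new syntax: the address of the reference is the one place where the underlying pointer is exposed as an assignable object.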
+For better compatibility with C, the \CFA{} team has chosen not to differentiate function overloads based on top-level reference types, and as such their contribution to the difficulty of \CFA{} expression resolution is largely restricted to the implementation details of normalization conversions and adapters.
+\subsubsection{Resource Management}
+\CFA{} also supports the RAII (``Resource Acquisition is Initialization'') idiom originated by \CC{}, thanks to the object lifetime work of Robert Schluntz\cite{Schluntz17}.
+This idiom allows a safer and more principled approach to resource management by tying acquisition of a resource to object initialization, with the corresponding resource release executed automatically at object finalization.
+A wide variety of conceptual resources may be conveniently managed by this scheme, including heap memory, file handles, and software locks.
+\CFA{}'s implementation of RAII is based on special constructor and destructor operators, available via the !x{ ... }! constructor syntax and !^x{ ... }! destructor syntax.
+Each type has an overridable compiler-generated zero-argument constructor, copy constructor, assignment operator, and destructor, as well as a field-wise constructor for each appropriate prefix of the member fields of !struct! types.
+For !struct! types the default versions of these operators call their equivalents on each field of the !struct!.
+The main implication of these object lifetime functions for expression resolution is that they are all included as implicit type assertions for !otype! type variables, with a secondary effect being an increase in code size due to the compiler-generated operators.
+Due to these implicit type assertions, assertion resolution is pervasive in \CFA{} polymorphic functions, even those without explicit type assertions.
+Implicitly-generated code is shown in red in the following example:
+\begin{cfa}
+struct kv {
+        int key;
+        char* value;
+};
+`void ?{} (kv& this) {` $\C[3in]{// default constructor}$
+`       this.key{};` $\C[3in]{// call recursively on members}$
+`       this.value{};
+}
+void ?{} (kv& this, int key) {` $\C[3in]{// partial field constructor}$
+`       this.key{ key };
+        this.value{};` $\C[3in]{// default-construct missing fields}$
+`}
+void ?{} (kv& this, int key, char* value) {` $\C[3in]{// complete field constructor}$
+`       this.key{ key };
+        this.value{ value };
+}
+void ?{} (kv& this, kv that) {` $\C[3in]{// copy constructor}$
+`       this.key{ that.key };
+        this.value{ that.value };
+}
+kv ?=? (kv& this, kv that) {` $\C[3in]{// assignment operator}$
+`       this.key = that.key;
+        this.value = that.value;
+        return this;
+}
+void ^?{} (kv& this) {` $\C[3in]{// destructor}$
+`       ^this.key{};
+        ^this.value{};
+}`
+forall(otype T `| { void ?{}(T&); void ?{}(T&, T); T ?=?(T&, T); void ^?{}(T&); }`)
+void foo(T);
+\end{cfa}
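To show why these implicit assertions make assertion resolution pervasive, the C sketch below passes the four lifetime functions to a polymorphic function as a hidden record. The record !otype_ops!, the name !foo_poly!, the body given to !foo!, and the !live_kv! counter are all hypothetical; the assignment operator is also simplified to return nothing. None of this is claimed to match the code \CFACC{} generates.

```c
#include <assert.h>
#include <stddef.h>

struct kv { int key; char *value; };

/* Hypothetical record of the implicit otype assertions, plus sizeof(T). */
struct otype_ops {
    size_t size;
    void (*ctor)(void *this);                         /* void ?{}(T&)    */
    void (*copy_ctor)(void *this, const void *that);  /* void ?{}(T&, T) */
    void (*assign)(void *this, const void *that);     /* T ?=?(T&, T)    */
    void (*dtor)(void *this);                         /* void ^?{}(T&)   */
};

static int live_kv = 0;   /* constructed but not yet destroyed kv objects */
int kv_live_count(void) { return live_kv; }

void kv_ctor(void *this) {
    struct kv *k = this;
    k->key = 0;
    k->value = NULL;
    ++live_kv;
}
void kv_copy_ctor(void *this, const void *that) {
    *(struct kv *)this = *(const struct kv *)that;   /* field-wise copy */
    ++live_kv;
}
void kv_assign(void *this, const void *that) {
    *(struct kv *)this = *(const struct kv *)that;
}
void kv_dtor(void *this) { (void)this; --live_kv; }

const struct otype_ops kv_ops = {
    sizeof(struct kv), kv_ctor, kv_copy_ctor, kv_assign, kv_dtor
};

/* forall(otype T) void foo(T) with an invented body: copy the argument
 * into a local, assign to it once, then destroy the local at scope exit. */
void foo_poly(const struct otype_ops *T, const void *arg) {
    max_align_t storage[4];                /* room for one local of type T */
    assert(T->size <= sizeof storage);
    T->copy_ctor(storage, arg);            /* T local = arg; */
    T->assign(storage, arg);               /* local = arg;   */
    T->dtor(storage);                      /* ^local{};      */
}
```

Even though !foo! declares no explicit assertions, every call site must find and supply all four functions for the bound type, so each polymorphic call involves some assertion resolution.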
 \subsubsection{0 and 1 Literals}
+% TODO mention own motivating contribution
+% TODO mention future work in user-defined implicit conversions
+\subsubsection{Tuple Types}
+% TODO "precludes some matching strategies"

doc/theses/aaron_moss_PhD/phd/cfa-macros.tex

-              r3271166
+              rf9c7d27
 \newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}}
 
-\newcommand{\C}[2][2in]{\hfill\makebox[#1][l]{\LstCommentStyle{#2}}}
+\newcommand{\C}[2][3.5in]{\hfill\makebox[#1][l]{\LstCommentStyle{#2}}}
 
 % CFA programming language, based on ANSI C (with some gcc additions)

doc/theses/aaron_moss_PhD/phd/generic-types.tex

-              r3271166
+              rf9c7d27
 % TODO discuss layout function algorithm, application to separate compilation
+% TODO put a static const field in for _n_fields for each generic, describe utility for separate compilation
 % TODO mention impetus for zero_t design
 % TODO mention use in tuple-type implementation
+% TODO pull benchmarks from Moss et al.
