Changeset 12d3187


Ignore:
Timestamp:
May 1, 2017, 1:39:55 PM (5 years ago)
Author:
Rob Schluntz <rschlunt@…>
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
2055098
Parents:
b3d70eb
Message:

final revisions to thesis

Location:
doc/rob_thesis
Files:
2 added
8 edited

Legend:

Unmodified
Added
Removed
  • doc/rob_thesis/cfa-format.tex

    rb3d70eb r12d3187  
    1 \usepackage{xcolor}
    2 \usepackage{listings}
     1% \usepackage{xcolor}
     2% \usepackage{listings}
    33% \usepackage{booktabs}
    44% \usepackage{array}
     
    103103
    104104\renewcommand{\ttdefault}{pcr}
     105
     106\newcommand{\basicstylesmall}{\scriptsize\ttfamily\color{basicCol}}
    105107
    106108\lstdefinestyle{defaultStyle}{
     
    131133  style=defaultStyle
    132134}
    133 \lstMakeShortInline[basewidth=0.5em,breaklines=true,basicstyle=\normalsize\ttfamily\color{basicCol}]@  % single-character for \lstinline
     135\lstMakeShortInline[basewidth=0.5em,breaklines=true,breakatwhitespace,basicstyle=\normalsize\ttfamily\color{basicCol}]@  % single-character for \lstinline
    134136
    135137\lstnewenvironment{cfacode}[1][]{
     
    195197\newcommand{\one}{\lstinline{one_t}\xspace}
    196198\newcommand{\ateq}{\lstinline{\@=}\xspace}
     199
     200\newenvironment{newtext}{\color{red}}{\ignorespacesafterend}
    197201
    198202% \lstset{ %
  • doc/rob_thesis/conclusions.tex

    rb3d70eb r12d3187  
    66On the surface, the work may appear as a rehash of similar mechanisms in \CC.
    77However, every added feature is different than its \CC counterpart, often with extended functionality, better integration with C and its programmers, and always supports separate compilation.
    8 All of these new features are being used by the \CFA development-team to build the \CFA runtime system.
     8All of these new features are being used extensively by the \CFA development-team to build the \CFA runtime system.
     9In particular, the concurrency system is built on top of RAII, library functions @new@ and @delete@ are used to manage dynamically allocated objects, and tuples are used to provide uniform interfaces to C library routines such as @div@ and @remquo@.
    910
    1011\section{Constructors and Destructors}
     
    245246That is, structure fields can be accessed and modified by any block of code without restriction, so while it is possible to ensure that an object is initially set to a valid state, it is not possible to ensure that it remains in a consistent state throughout its lifetime.
    246247A popular technique for ensuring consistency in object-oriented programming languages is to provide access modifiers such as @private@, which provides compile-time checks that only privileged code accesses private data.
    247 This approach could be added to \CFA, but it requires an idiomatic way of specifying what code is privileged.
     248This approach could be added to \CFA, but it requires an idiomatic way of specifying what code is privileged and what data is protected.
    248249One possibility is to tie access control into an eventual module system.
     250
     251\begin{sloppypar}
     252The current implementation of implicit subobject-construction is currently an all-or-nothing check.
     253That is, if a subobject is conditionally constructed, \eg within an if-statement, no implicit constructors for that object are added.
     254\begin{cfacode}
     255struct A { ... };
     256void ?{}(A * a) { ... }
     257
     258struct B {
     259  A a;
     260};
     261void ?{}(B * b) {
     262  if (...) {
     263    (&b->a){};  // explicitly constructed
     264  } // does not construct in else case
     265}
     266\end{cfacode}
     267This behaviour is unsafe and breaks the guarantee that constructors fully initialize objects.
     268This situation should be properly handled, either by examining all paths and inserting implicit constructor calls only in the paths missing construction, or by emitting an error or warning.
     269\end{sloppypar}
    249270
    250271\subsection{Tuples}
     
    252273This feature ties nicely into named tuples, as seen in D and Swift.
    253274
    254 Currently, tuple flattening and structuring conversions are 0-cost.
     275Currently, tuple flattening and structuring conversions are 0-cost conversions in the resolution algorithm.
    255276This makes tuples conceptually very simple to work with, but easily causes unnecessary ambiguity in situations where the type system should be able to differentiate between alternatives.
    256277Adding an appropriate cost function to tuple conversions will allow tuples to interact with the rest of the programming language more cohesively.
  • doc/rob_thesis/ctordtor.tex

    rb3d70eb r12d3187  
    66% doesn't seem possible to do this without allowing ttype on generic structs?
    77
    8 Since \CFA is a true systems language, it does not provide a garbage collector.
    9 As well, \CFA is not an object-oriented programming language, \ie, structures cannot have routine members.
     8Since \CFA is a true systems language, it does not require a garbage collector.
     9As well, \CFA is not an object-oriented programming language, \ie, structures cannot have methods.
     10While structures can have function pointer members, this is different from methods, since methods have implicit access to structure members and methods cannot be reassigned.
    1011Nevertheless, one important goal is to reduce programming complexity and increase safety.
    1112To that end, \CFA provides support for implicit pre/post-execution of routines for objects, via constructors and destructors.
     
    3233The key difference between assignment and initialization being that assignment occurs on a live object (\ie, an object that contains data).
    3334It is important to note that this means @x@ could have been used uninitialized prior to being assigned, while @y@ could not be used uninitialized.
    34 Use of uninitialized variables yields undefined behaviour, which is a common source of errors in C programs.
     35Use of uninitialized variables yields undefined behaviour \cite[p.~558]{C11}, which is a common source of errors in C programs.
    3536
    3637Initialization of a declaration is strictly optional, permitting uninitialized variables to exist.
     
    7071int x2 = opaque_get(x, 2);
    7172\end{cfacode}
    72 This pattern is cumbersome to use since every access becomes a function call.
     73This pattern is cumbersome to use since every access becomes a function call, requiring awkward syntax and a performance cost.
    7374While useful in some situations, this compromise is too restrictive.
    7475Furthermore, even with this idiom it is easy to make mistakes, such as forgetting to destroy an object or destroying it multiple times.
    7576
    7677A constructor provides a way of ensuring that the necessary aspects of object initialization is performed, from setting up invariants to providing compile- and run-time checks for appropriate initialization parameters.
    77 This goal is achieved through a guarantee that a constructor is called implicitly after every object is allocated from a type with associated constructors, as part of an object's definition.
    78 Since a constructor is called on every object of a managed type, it is impossible to forget to initialize such objects, as long as all constructors perform some sensible form of initialization.
     78This goal is achieved through a \emph{guarantee} that a constructor is called \emph{implicitly} after every object is allocated from a type with associated constructors, as part of an object's \emph{definition}.
     79Since a constructor is called on every object of a managed type, it is \emph{impossible} to forget to initialize such objects, as long as all constructors perform some sensible form of initialization.
    7980
    8081In \CFA, a constructor is a function with the name @?{}@.
     
    114115In other words, a default constructor is a constructor that takes a single argument: the @this@ parameter.
    115116
    116 In \CFA, a destructor is a function much like a constructor, except that its name is \lstinline!^?{}! and it takes only one argument.
     117In \CFA, a destructor is a function much like a constructor, except that its name is \lstinline!^?{}! \footnote{Originally, the name @~?{}@ was chosen for destructors, to provide familiarity to \CC programmers. Unforunately, this name causes parsing conflicts with the bitwise-not operator when used with operator syntax (see section \ref{sub:syntax}.)} and it takes only one argument.
    117118A destructor for the @Array@ type can be defined as:
    118119\begin{cfacode}
     
    135136On line 2, @z@ is initialized with the value of @x@, while on line 3, @y@ is assigned the value of @x@.
    136137The key distinction between initialization and assignment is that a value to be initialized does not hold any meaningful values, whereas an object to be assigned might.
    137 In particular, these cases cannot be handled the same way because in the former case @z@ does not currently own an array, while @y@ does.
     138In particular, these cases cannot be handled the same way because in the former case @z@ has no array, while @y@ does.
     139A \emph{copy constructor} is used to perform initialization using another object of the same type.
    138140
    139141\begin{cfacode}[emph={other}, emphstyle=\color{red}]
     
    151153}
    152154\end{cfacode}
    153 The two functions above handle these cases.
    154 The first function is called a \emph{copy constructor}, because it constructs its argument by copying the values from another object of the same type.
     155The two functions above handle the cases of initialization and assignment.
     156The first function is called a copy constructor, because it constructs its argument by copying the values from another object of the same type.
    155157The second function is the standard copy-assignment operator.
     158\CFA does not currently have the concept of reference types, so the most appropriate type for the source object in copy constructors and assignment operators is a value type.
     159Appropriate care is taken in the implementation to avoid recursive calls to the copy constructor.
    156160The four functions (default constructor, destructor, copy constructor, and assignment operator) are special in that they safely control the state of most objects.
    157161
     
    216220A * y = malloc();  // copy construct: ?{}(&y, malloc())
    217221
     222^?{}(&x);   // explicit destroy x, in different order
    218223?{}(&x);    // explicit construct x, second construction
     224^?{}(y);    // explicit destroy y
    219225?{}(y, x);  // explit construct y from x, second construction
    220 ^?{}(&x);   // explicit destroy x, in different order
    221 ^?{}(y);    // explicit destroy y
    222226
    223227// implicit ^?{}(&y);
     
    279283\end{cfacode}
    280284In this example, @malloc@ dynamically allocates storage and initializes it using a constructor, all before assigning it into the variable @x@.
     285Intuitively, the expression-resolver determines that @malloc@ returns some type @T *@, as does the constructor expression since it returns the type of its argument.
     286This type flows outwards to the declaration site where the expected type is known to be @X *@, thus the first argument to the constructor must be @X *@, narrowing the search space.
     287
    281288If this extension is not present, constructing dynamically allocated objects is much more cumbersome, requiring separate initialization of the pointer and initialization of the pointed-to memory.
    282289\begin{cfacode}
     
    300307It should be noted that this technique is not exclusive to @malloc@, and allows a user to write a custom allocator that can be idiomatically used in much the same way as a constructed @malloc@ call.
    301308
    302 It should be noted that while it is possible to use operator syntax with destructors, destructors invalidate their argument, thus operator syntax with destructors is a statement and does not produce a value.
     309While it is possible to use operator syntax with destructors, destructors invalidate their argument, thus operator syntax with destructors is void-typed expression.
    303310
    304311\subsection{Function Generation}
     
    308315If the translator can expect these functions to exist, then it can unconditionally attempt to resolve them.
    309316Moreover, the existence of a standard interface allows polymorphic code to interoperate with new types seamlessly.
     317While automatic generation of assignment functions is present in previous versions of \CFA, the the implementation has been largely rewritten to accomodate constructors and destructors.
    310318
    311319To mimic the behaviour of standard C, the default constructor and destructor for all of the basic types and for all pointer types are defined to do nothing, while the copy constructor and assignment operator perform a bitwise copy of the source parameter (as in \CC).
     320This default is intended to maintain backwards compatibility and performance, by not imposing unexpected operations for a C programmer, as a zero-default behaviour would.
     321However, it is possible for a user to define such constructors so that variables are safely zeroed by default, if desired.
     322%%%%%%%%%%%%%%%%%%%%%%%%%% line width %%%%%%%%%%%%%%%%%%%%%%%%%%
     323\begin{cfacode}
     324void ?{}(int * i) { *i = 0; }
     325forall(dtype T) void ?{}(T ** p) { *p = 0; }  // any pointer type
     326void f() {
     327  int x;    // initialized to 0
     328  int * p;  // initialized to 0
     329}
     330\end{cfacode}
     331%%%%%%%%%%%%%%%%%%%%%%%%%% line width %%%%%%%%%%%%%%%%%%%%%%%%%%
    312332
    313333There are several options for user-defined types: structures, unions, and enumerations.
    314334To aid in ease of use, the standard set of four functions is automatically generated for a user-defined type after its definition is completed.
    315335By auto-generating these functions, it is ensured that legacy C code continues to work correctly in every context where \CFA expects these functions to exist, since they are generated for every complete type.
     336As well, these functions are always generated, since they may be needed by polymorphic functions.
     337With that said, the generated functions are not called implicitly unless they are non-trivial, and are never exported, making it simple for the optimizer to strip them away when they are not used.
    316338
    317339The generated functions for enumerations are the simplest.
     
    338360}
    339361\end{cfacode}
    340 In the future, \CFA will introduce strongly-typed enumerations, like those in \CC.
     362In the future, \CFA will introduce strongly-typed enumerations, like those in \CC, wherein enumerations create a new type distinct from @int@ so that integral values require an explicit cast to be stored in an enumeration variable.
    341363The existing generated routines are sufficient to express this restriction, since they are currently set up to take in values of that enumeration type.
    342364Changes related to this feature only need to affect the expression resolution phase, where more strict rules will be applied to prevent implicit conversions from integral types to enumeration types, but should continue to permit conversions from enumeration types to @int@.
     
    492514In addition to freedom, \ateq provides a simple path for migrating legacy C code to \CFA, in that objects can be moved from C-style initialization to \CFA gradually and individually.
    493515It is worth noting that the use of unmanaged objects can be tricky to get right, since there is no guarantee that the proper invariants are established on an unmanaged object.
    494 It is recommended that most objects be managed by sensible constructors and destructors, except where absolutely necessary.
     516It is recommended that most objects be managed by sensible constructors and destructors, except where absolutely necessary, such as memory-mapped devices, trigger devices, I/O controllers, etc.
    495517
    496518When a user declares any constructor or destructor, the corresponding intrinsic/generated function and all field constructors for that type are hidden, so that they are not found during expression resolution until the user-defined function goes out of scope.
     
    545567\end{cfacode}
    546568However, if the translator sees a sub-object used within the body of a constructor, but does not see a constructor call that uses the sub-object as the target of a constructor, then the translator assumes the object is to be implicitly constructed (copy constructed in a copy constructor and default constructed in any other constructor).
     569To override this rule, \ateq can be used to force the translator to trust the programmer's discretion.
     570This form of \ateq is not yet implemented.
    547571\begin{cfacode}
    548572void ?{}(A * a) {
     
    556580}
    557581
     582void ?{}(A * a, int x) {
     583  // object forwarded to another constructor,
     584  // does not implicitly construct any members
     585  (&a){};
     586}
     587
    558588void ^?{}(A * a) {
    559589  ^(&a->x){}; // explicit destructor call
    560590} // z, y, w implicitly destructed, in this order
    561591\end{cfacode}
    562 If at any point, the @this@ parameter is passed directly as the target of another constructor, then it is assumed that constructor handles the initialization of all of the object's members and no implicit constructor calls are added.
    563 To override this rule, \ateq can be used to force the translator to trust the programmer's discretion.
    564 This form of \ateq is not yet implemented.
     592If at any point, the @this@ parameter is passed directly as the target of another constructor, then it is assumed the other constructor handles the initialization of all of the object's members and no implicit constructor calls are added to the current constructor.
    565593
    566594Despite great effort, some forms of C syntax do not work well with constructors in \CFA.
     
    619647The body of @A@ has been omitted, since only the constructor interfaces are important.
    620648
    621 It should be noted that unmanaged objects can still make use of designations and nested initializers in \CFA.
     649It should be noted that unmanaged objects, i.e. objects that have only trivial constructors, can still make use of designations and nested initializers in \CFA.
    622650It is simple to overcome this limitation for managed objects by making use of compound literals, so that the arguments to the constructor call are explicitly typed.
     651%%%%%%%%%%%%%%%%%%%%%%%%%% line width %%%%%%%%%%%%%%%%%%%%%%%%%%
     652\begin{cfacode}
     653struct B { int x; };
     654struct C { int y; };
     655struct A { B b; C c; };
     656void ?{}(A *, B);
     657void ?{}(A *, C);
     658
     659A a = {
     660  (C){ 10 } // disambiguate with compound literal
     661};
     662\end{cfacode}
     663%%%%%%%%%%%%%%%%%%%%%%%%%% line width %%%%%%%%%%%%%%%%%%%%%%%%%%
    623664
    624665\subsection{Implicit Destructors}
     
    744785\end{cfacode}
    745786
     787While \CFA supports the GCC computed-goto extension, the behaviour of managed objects in combination with computed-goto is undefined.
     788\begin{cfacode}
     789void f(int val) {
     790  void * l = val == 0 ? &&L1 : &&L2;
     791  {
     792      A x;
     793    L1: ;
     794      goto *l;  // branches differently depending on argument
     795  }
     796  L2: ;
     797}
     798\end{cfacode}
     799Likewise, destructors are not executed at scope-exit due to a computed-goto in \CC, as of g++ version 6.2.
     800
    746801\subsection{Implicit Copy Construction}
    747802\label{s:implicit_copy_construction}
     
    750805Exempt from these rules are intrinsic and built-in functions.
    751806It should be noted that unmanaged objects are subject to copy constructor calls when passed as arguments to a function or when returned from a function, since they are not the \emph{target} of the copy constructor call.
    752 That is, since the parameter is not marked as an unmanaged object using \ateq, it is be copy constructed if it is returned by value or passed as an argument to another function, so to guarantee consistent behaviour, unmanaged objects must be copy constructed when passed as arguments.
     807That is, since the parameter is not marked as an unmanaged object using \ateq, it is copy constructed if it is returned by value or passed as an argument to another function, so to guarantee consistent behaviour, unmanaged objects must be copy constructed when passed as arguments.
    753808These semantics are important to bear in mind when using unmanaged objects, and could produce unexpected results when mixed with objects that are explicitly constructed.
    754809\begin{cfacode}
    755 struct A;
     810struct A { ... };
    756811void ?{}(A *);
    757812void ?{}(A *, A);
     
    810865It should be noted that reference types will allow specifying that a value does not need to be copied, however reference types do not provide a means of preventing implicit copy construction from uses of the reference, so the problem is still present when passing or returning the reference by value.
    811866
     867Adding implicit copy construction imposes the additional runtime cost of the copy constructor for every argument and return value in a function call.
     868This cost is necessary to maintain appropriate value semantics when calling a function.
     869In the future, return-value-optimization (RVO) can be implemented for \CFA to elide unnecessary copy construction and destruction of temporary objects.
     870This cost is not present for types with trivial copy constructors and destructors.
     871
    812872A known issue with this implementation is that the argument and return value temporaries are not guaranteed to have the same address for their entire lifetimes.
    813873In the previous example, since @_retval_f@ is allocated and constructed in @f@, then returned by value, the internal data is bitwise copied into the caller's stack frame.
     
    908968\subsection{Array Initialization}
    909969Arrays are a special case in the C type-system.
    910 C arrays do not carry around their size, making it impossible to write a standalone \CFA function that constructs or destructs an array while maintaining the standard interface for constructors and destructors.
     970Type checking largely ignores size information for C arrays, making it impossible to write a standalone \CFA function that constructs or destructs an array, while maintaining the standard interface for constructors and destructors.
    911971Instead, \CFA defines the initialization and destruction of an array recursively.
    912972That is, when an array is defined, each of its elements is constructed in order from element 0 up to element $n-1$.
     
    11471207\end{cfacode}
    11481208
     1209This implementation comes at the runtime cost of an additional branch for every @static@ local variable, each time the function is called.
     1210Since initializers are not required to be compile-time constant expressions, they can involve global variables, function arguments, function calls, etc.
     1211As a direct consequence, @static@ local variables cannot be initialized with an attribute-constructor routines like global variables can.
     1212However, in the case where the variable is unmanaged and has a compile-time constant initializer, a C-compliant initializer is generated and the additional cost is not present.
     1213\CC shares the same semantics for its @static@ local variables.
     1214
    11491215\subsection{Polymorphism}
    11501216As mentioned in section \ref{sub:polymorphism}, \CFA currently has 3 type-classes that are used to designate polymorphic data types: @otype@, @dtype@, and @ftype@.
     
    11721238These additions allow @f@'s body to create and destroy objects of type @T@, and pass objects of type @T@ as arguments to other functions, following the normal \CFA rules.
    11731239A point of note here is that objects can be missing default constructors (and eventually other functions through deleted functions), so it is important for \CFA programmers to think carefully about the operations needed by their function, as to not over-constrain the acceptable parameter types and prevent potential reuse.
     1240
     1241These additional assertion parameters impose a runtime cost on all managed temporary objects created in polymorphic code, even those with trivial constructors and destructors.
     1242This cost is necessary because polymorphic code does not know the actual type at compile-time, due to separate compilation.
     1243Since trivial constructors and destructors either do not perform operations or are simply bit-wise copy operations, the imposed cost is essentially the cost of the function calls.
     1244
     1245\section{Summary}
     1246
     1247When creating a new object of a managed type, it is guaranteed that a constructor is be called to initialize the object at its definition point, and is destructed when the object's lifetime ends.
     1248Destructors are called in the reverse order of construction.
     1249
     1250Every argument passed to a function is copy constructed into a temporary object that is passed by value to the functions and destructed at the end of the statement.
     1251Function return values are copy constructed inside the function at the return statement, passed by value to the call-site, and destructed at the call-site at the end of the statement.
     1252
     1253Every complete object type has a default constructor, copy constructor, assignment operator, and destructor.
     1254To accomplish this, these functions are generated as appropriate for new types.
     1255User-defined functions shadow built-in and automatically generated functions, so it is possible to specialize the behaviour of a type.
     1256Furthermore, default constructors and aggregate field constructors are hidden when \emph{any} constructor is defined.
     1257
     1258Objects dynamically allocated with @malloc@, \ateq objects, and objects with only trivial constructors and destructors are unmanaged.
     1259Unmanaged objects are never the target of an implicit constructor or destructor call.
  • doc/rob_thesis/intro.tex

    rb3d70eb r12d3187  
    1919Unfortunately, \CC is actively diverging from C, so incremental additions require significant effort and training, coupled with multiple legacy design-choices that cannot be updated.
    2020
    21 The remainder of this section describes some of the important new features that currently exist in \CFA, to give the reader the necessary context in which the new features presented in this thesis must dovetail.
     21The current implementation of \CFA is a source-to-source translator from \CFA to GNU C \cite{GCCExtensions}.
     22
     23The remainder of this section describes some of the important features that currently exist in \CFA, to give the reader the necessary context in which the new features presented in this thesis must dovetail.
    2224
    2325\subsection{C Background}
    2426\label{sub:c_background}
     27In the context of this work, the term \emph{object} refers to a region of data storage in the execution environment, the contents of which can represent values \cite[p.~6]{C11}.
     28
    2529One of the lesser-known features of standard C is \emph{designations}.
    2630Designations are similar to named parameters in languages such as Python and Scala, except that they only apply to aggregate initializers.
     31Note that in \CFA, designations use a colon separator, rather than an equals sign as in C, because this syntax is one of the few places that conflicts with the new language features.
    2732\begin{cfacode}
    2833struct A {
     
    4348Later initializers override earlier initializers, so a sub-object for which there is more than one initializer is only initialized by its last initializer.
    4449These semantics can be seen in the initialization of @a0@, where @x@ is designated twice, and thus initialized to @8@.
    45 Note that in \CFA, designations use a colon separator, rather than an equals sign as in C, because this syntax is one of the few places that conflicts with the new language features.
    4650
    4751C also provides \emph{compound literal} expressions, which provide a first-class mechanism for creating unnamed objects.
     
    5761Compound literals create an unnamed object, and result in an lvalue, so it is legal to assign a value into a compound literal or to take its address \cite[p.~86]{C11}.
    5862Syntactically, compound literals look like a cast operator followed by a brace-enclosed initializer, but semantically are different from a C cast, which only applies basic conversions and coercions and is never an lvalue.
     63
     64The \CFA translator makes use of several GNU C extensions, including \emph{nested functions} and \emph{attributes}.
     65Nested functions make it possible to access data that is lexically in scope in the nested function's body.
     66\begin{cfacode}
     67int f() {
     68  int x = 0;
     69  void g() {
     70    x++;
     71  }
     72  g();  // changes x
     73}
     74\end{cfacode}
     75Nested functions come with the usual C caveat that they should not leak into the containing environment, since they are only valid as long as the containing function's stack frame is active.
     76
     77Attributes make it possible to inform the compiler of certain properties of the code.
     78For example, a function can be marked as deprecated, so that legacy APIs can be identified and slowly removed, or as \emph{hot}, so that the compiler knows the function is called frequently and should be aggresively optimized.
     79\begin{cfacode}
     80__attribute__((deprecated("foo is deprecated, use bar instead")))
     81void foo();
     82__attribute__((hot)) void bar(); // heavily optimized
     83
     84foo();  // warning
     85bar();
     86\end{cfacode}
    5987
    6088\subsection{Overloading}
     
    6492C provides a small amount of built-in overloading, \eg + is overloaded for the basic types.
    6593Like in \CC, \CFA allows user-defined overloading based both on the number of parameters and on the types of parameters.
    66   \begin{cfacode}
    67   void f(void);  // (1)
    68   void f(int);   // (2)
    69   void f(char);  // (3)
    70 
    71   f('A');        // selects (3)
    72   \end{cfacode}
     94\begin{cfacode}
     95void f(void);  // (1)
     96void f(int);   // (2)
     97void f(char);  // (3)
     98
     99f('A');        // selects (3)
     100\end{cfacode}
    73101In this case, there are three @f@ procedures, where @f@ takes either 0 or 1 arguments, and if an argument is provided then it may be of type @int@ or of type @char@.
    74102Exactly which procedure is executed depends on the number and types of arguments passed.
    75103If there is no exact match available, \CFA attempts to find a suitable match by examining the C built-in conversion heuristics.
    76   \begin{cfacode}
    77   void g(long long);
    78 
    79   g(12345);
    80   \end{cfacode}
     104The \CFA expression resolution algorithm uses a cost function to determine the interpretation that uses the fewest conversions and polymorphic type bindings.
     105\begin{cfacode}
     106void g(long long);
     107
     108g(12345);
     109\end{cfacode}
    81110In the above example, there is only one instance of @g@, which expects a single parameter of type @long long@.
    82111Here, the argument provided has type @int@, but since all possible values of type @int@ can be represented by a value of type @long long@, there is a safe conversion from @int@ to @long long@, and so \CFA calls the provided @g@ routine.
    83112
     113Overloading solves the problem present in C where there can only be one function with a given name, requiring multiple names for functions that perform the same operation but take in different types.
     114This can be seen in the example of the absolute value functions C:
     115\begin{cfacode}
     116// stdlib.h
     117int abs(int);
     118long int labs(long int);
     119long long int llabs(long long int);
     120\end{cfacode}
     121In \CFA, the functions @labs@ and @llabs@ are replaced by appropriate overloads of @abs@.
     122
    84123In addition to this form of overloading, \CFA also allows overloading based on the number and types of \emph{return} values.
    85124This extension is a feature that is not available in \CC, but is available in other programming languages such as Ada \cite{Ada95}.
    86   \begin{cfacode}
    87   int g();         // (1)
    88   double g();      // (2)
    89 
    90   int x = g();     // selects (1)
    91   \end{cfacode}
     125\begin{cfacode}
     126int g();         // (1)
     127double g();      // (2)
     128
     129int x = g();     // selects (1)
     130\end{cfacode}
    92131Here, the only difference between the signatures of the different versions of @g@ is in the return values.
    93132The result context is used to select an appropriate routine definition.
    94133In this case, the result of @g@ is assigned into a variable of type @int@, so \CFA prefers the routine that returns a single @int@, because it is an exact match.
     134
     135Return-type overloading solves similar problems to parameter-list overloading, in that multiple functions that perform similar operations can have the same, but produce different values.
     136One use case for this feature is to provide two versions of the @bsearch@ routine:
     137\begin{cfacode}
     138forall(otype T | { int ?<?( T, T ); })
     139T * bsearch(T key, const T * arr, size_t dimension) {
     140  int comp(const void * t1, const void * t2) {
     141    return *(T *)t1 < *(T *)t2 ? -1 : *(T *)t2 < *(T *)t1 ? 1 : 0;
     142  }
     143  return (T *)bsearch(&key, arr, dimension, sizeof(T), comp);
     144}
     145forall(otype T | { int ?<?( T, T ); })
     146unsigned int bsearch(T key, const T * arr, size_t dimension) {
     147  T *result = bsearch(key, arr, dimension);
     148  // pointer subtraction includes sizeof(T)
     149  return result ? result - arr : dimension;
     150}
     151double key = 5.0;
     152double vals[10] = { /* 10 floating-point values */ };
     153
     154double * val = bsearch( 5.0, vals, 10 ); // selection based on return type
     155int posn = bsearch( 5.0, vals, 10 );
     156\end{cfacode}
     157The first version provides a thin wrapper around the C @bsearch@ routine, converting untyped @void *@ to the polymorphic type @T *@, allowing the \CFA compiler to catch errors when the type of @key@, @arr@, and the target at the call-site do not agree.
     158The second version provides an alternate return of the index in the array of the selected element, rather than its address.
    95159
    96160There are times when a function should logically return multiple values.
     
    145209
    146210An extra quirk introduced by multiple return values is in the resolution of function calls.
    147   \begin{cfacode}
    148   int f();            // (1)
    149   [int, int] f();     // (2)
    150 
    151   void g(int, int);
    152 
    153   int x, y;
    154   [x, y] = f();       // selects (2)
    155   g(f());             // selects (2)
    156   \end{cfacode}
     211\begin{cfacode}
     212int f();            // (1)
     213[int, int] f();     // (2)
     214
     215void g(int, int);
     216
     217int x, y;
     218[x, y] = f();       // selects (2)
     219g(f());             // selects (2)
     220\end{cfacode}
    157221In this example, the only possible call to @f@ that can produce the two @int@s required for assigning into the variables @x@ and @y@ is the second option.
    158222A similar reasoning holds calling the function @g@.
     223
     224This duality between aggregation and aliasing can be seen in the C standard library in the @div@ and @remquo@ functions, which return the quotient and remainder for a division of integer and floating-point values, respectively.
     225\begin{cfacode}
     226typedef struct { int quo, rem; } div_t; // from stdlib.h
     227div_t div( int num, int den );
     228double remquo( double num, double den, int * quo );
     229div_t qr = div( 13, 5 );            // return quotient/remainder aggregate
     230int q;
     231double r = remquo( 13.5, 5.2, &q ); // return remainder, alias quotient
     232\end{cfacode}
     233@div@ aggregates the quotient/remainder in a structure, while @remquo@ aliases a parameter to an argument.
     234Alternatively, a programming language can directly support returning multiple values, \eg in \CFA:
     235\begin{lstlisting}
     236[int, int] div(int num, int den);               // return two integers
     237[double, double] div( double num, double den ); // return two doubles
     238int q, r;                     // overloaded variable names
     239double q, r;
     240[q, r] = div(13, 5);          // select appropriate div and q, r
     241[q, r] = div(13.5, 5.2);
     242\end{lstlisting}
    159243
    160244In \CFA, overloading also applies to operator names, known as \emph{operator overloading}.
    161245Similar to function overloading, a single operator is given multiple meanings by defining new versions of the operator with different signatures.
    162246In \CC, this can be done as follows
    163   \begin{cppcode}
    164   struct A { int i; };
    165   int operator+(A x, A y);
    166   bool operator<(A x, A y);
    167   \end{cppcode}
     247\begin{cppcode}
     248struct A { int i; };
     249A operator+(A x, A y);
     250bool operator<(A x, A y);
     251\end{cppcode}
    168252
    169253In \CFA, the same example can be written as follows.
    170   \begin{cfacode}
    171   struct A { int i; };
    172   int ?+?(A x, A y);    // '?'s represent operands
    173   bool ?<?(A x, A y);
    174   \end{cfacode}
     254\begin{cfacode}
     255struct A { int i; };
     256A ?+?(A x, A y);    // '?'s represent operands
     257int ?<?(A x, A y);
     258\end{cfacode}
    175259Notably, the only difference is syntax.
    176260Most of the operators supported by \CC for operator overloading are also supported in \CFA.
     
    179263Finally, \CFA also permits overloading variable identifiers.
    180264This feature is not available in \CC.
    181   \begin{cfacode}
    182   struct Rational { int numer, denom; };
    183   int x = 3;               // (1)
    184   double x = 1.27;         // (2)
    185   Rational x = { 4, 11 };  // (3)
    186 
    187   void g(double);
    188 
    189   x += 1;                  // chooses (1)
    190   g(x);                    // chooses (2)
    191   Rational y = x;          // chooses (3)
    192   \end{cfacode}
     265\begin{cfacode}
     266struct Rational { int numer, denom; };
     267int x = 3;               // (1)
     268double x = 1.27;         // (2)
     269Rational x = { 4, 11 };  // (3)
     270
     271void g(double);
     272
     273x += 1;                  // chooses (1)
     274g(x);                    // chooses (2)
     275Rational y = x;          // chooses (3)
     276\end{cfacode}
    193277In this example, there are three definitions of the variable @x@.
    194278Based on the context, \CFA attempts to choose the variable whose type best matches the expression context.
     
    208292Due to these rewrite rules, the values @0@ and @1@ have the types \zero and \one in \CFA, which allow for overloading various operations that connect to @0@ and @1@ \footnote{In the original design of \CFA, @0@ and @1@ were overloadable names \cite[p.~7]{cforall}.}.
    209293The types \zero and \one have special built-in implicit conversions to the various integral types, and a conversion to pointer types for @0@, which allows standard C code involving @0@ and @1@ to work as normal.
    210   \begin{cfacode}
    211   // lvalue is similar to returning a reference in C++
    212   lvalue Rational ?+=?(Rational *a, Rational b);
    213   Rational ?=?(Rational * dst, zero_t) {
    214     return *dst = (Rational){ 0, 1 };
    215   }
    216 
    217   Rational sum(Rational *arr, int n) {
    218     Rational r;
    219     r = 0;     // use rational-zero_t assignment
    220     for (; n > 0; n--) {
    221       r += arr[n-1];
    222     }
    223     return r;
    224   }
    225   \end{cfacode}
     294\begin{cfacode}
     295// lvalue is similar to returning a reference in C++
     296lvalue Rational ?+=?(Rational *a, Rational b);
     297Rational ?=?(Rational * dst, zero_t) {
     298  return *dst = (Rational){ 0, 1 };
     299}
     300
     301Rational sum(Rational *arr, int n) {
     302  Rational r;
     303  r = 0;     // use rational-zero_t assignment
     304  for (; n > 0; n--) {
     305    r += arr[n-1];
     306  }
     307  return r;
     308}
     309\end{cfacode}
    226310This function takes an array of @Rational@ objects and produces the @Rational@ representing the sum of the array.
    227311Note the use of an overloaded assignment operator to set an object of type @Rational@ to an appropriate @0@ value.
     
    232316In particular, \CFA supports the notion of parametric polymorphism.
    233317Parametric polymorphism allows a function to be written generically, for all values of all types, without regard to the specifics of a particular type.
    234 For example, in \CC, the simple identity function for all types can be written as
    235   \begin{cppcode}
    236   template<typename T>
    237   T identity(T x) { return x; }
    238   \end{cppcode}
    239 \CC uses the template mechanism to support parametric polymorphism. In \CFA, an equivalent function can be written as
    240   \begin{cfacode}
    241   forall(otype T)
    242   T identity(T x) { return x; }
    243   \end{cfacode}
     318For example, in \CC, the simple identity function for all types can be written as:
     319\begin{cppcode}
     320template<typename T>
     321T identity(T x) { return x; }
     322\end{cppcode}
     323\CC uses the template mechanism to support parametric polymorphism. In \CFA, an equivalent function can be written as:
     324\begin{cfacode}
     325forall(otype T)
     326T identity(T x) { return x; }
     327\end{cfacode}
    244328Once again, the only visible difference in this example is syntactic.
    245329Fundamental differences can be seen by examining more interesting examples.
    246 In \CC, a generic sum function is written as follows
    247   \begin{cppcode}
    248   template<typename T>
    249   T sum(T *arr, int n) {
    250     T t;  // default construct => 0
    251     for (; n > 0; n--) t += arr[n-1];
    252     return t;
    253   }
    254   \end{cppcode}
     330In \CC, a generic sum function is written as follows:
     331\begin{cppcode}
     332template<typename T>
     333T sum(T *arr, int n) {
     334  T t;  // default construct => 0
     335  for (; n > 0; n--) t += arr[n-1];
     336  return t;
     337}
     338\end{cppcode}
    255339Here, the code assumes the existence of a default constructor, assignment operator, and an addition operator over the provided type @T@.
    256340If any of these required operators are not available, the \CC compiler produces an error message stating which operators could not be found.
    257341
    258 A similar sum function can be written in \CFA as follows
    259   \begin{cfacode}
    260   forall(otype T | **R**{ T ?=?(T *, zero_t); T ?+=?(T *, T); }**R**)
    261   T sum(T *arr, int n) {
    262     T t = 0;
    263     for (; n > 0; n--) t = t += arr[n-1];
    264     return t;
    265   }
    266   \end{cfacode}
     342A similar sum function can be written in \CFA as follows:
     343\begin{cfacode}
     344forall(otype T | **R**{ T ?=?(T *, zero_t); T ?+=?(T *, T); }**R**)
     345T sum(T *arr, int n) {
     346  T t = 0;
     347  for (; n > 0; n--) t = t += arr[n-1];
     348  return t;
     349}
     350\end{cfacode}
    267351The first thing to note here is that immediately following the declaration of @otype T@ is a list of \emph{type assertions} that specify restrictions on acceptable choices of @T@.
    268352In particular, the assertions above specify that there must be an assignment from \zero to @T@ and an addition assignment operator from @T@ to @T@.
    269353The existence of an assignment operator from @T@ to @T@ and the ability to create an object of type @T@ are assumed implicitly by declaring @T@ with the @otype@ type-class.
    270354In addition to @otype@, there are currently two other type-classes.
     355
     356@dtype@, short for \emph{data type}, serves as the top type for object types; any object type, complete or incomplete, can be bound to a @dtype@ type variable.
     357To contrast, @otype@, short for \emph{object type}, is a @dtype@ with known size, alignment, and an assignment operator, and thus bind only to complete object types.
     358With this extra information, complete objects can be used in polymorphic code in the same way they are used in monomorphic code, providing familiarity and ease of use.
     359The third type-class is @ftype@, short for \emph{function type}, matching only function types.
    271360The three type parameter kinds are summarized in \autoref{table:types}
    272361
     
    275364    \begin{tabular}{|c||c|c|c||c|c|c|}
    276365                                                                                                    \hline
    277     name    & object type & incomplete type & function type & can assign value & can create & has size \\ \hline
     366    name    & object type & incomplete type & function type & can assign & can create & has size \\ \hline
    278367    @otype@ & X           &                 &               & X                & X          & X        \\ \hline
    279368    @dtype@ & X           & X               &               &                  &            &          \\ \hline
     
    288377In contrast, the explicit nature of assertions allows \CFA's polymorphic functions to be separately compiled, as the function prototype states all necessary requirements separate from the implementation.
    289378For example, the prototype for the previous sum function is
    290   \begin{cfacode}
    291   forall(otype T | **R**{ T ?=?(T *, zero_t); T ?+=?(T *, T); }**R**)
    292   T sum(T *arr, int n);
    293   \end{cfacode}
     379\begin{cfacode}
     380forall(otype T | **R**{ T ?=?(T *, zero_t); T ?+=?(T *, T); }**R**)
     381T sum(T *arr, int n);
     382\end{cfacode}
    294383With this prototype, a caller in another translation unit knows all of the constraints on @T@, and thus knows all of the operations that need to be made available to @sum@.
    295384
    296385In \CFA, a set of assertions can be factored into a \emph{trait}.
    297386\begin{cfacode}
    298   trait Addable(otype T) {
    299     T ?+?(T, T);
    300     T ++?(T);
    301     T ?++(T);
    302   }
    303   forall(otype T | Addable(T)) void f(T);
    304   forall(otype T | Addable(T) | { T --?(T); }) T g(T);
    305   forall(otype T, U | Addable(T) | { T ?/?(T, U); }) U h(T, U);
     387trait Addable(otype T) {
     388  T ?+?(T, T);
     389  T ++?(T);
     390  T ?++(T);
     391}
     392forall(otype T | Addable(T)) void f(T);
     393forall(otype T | Addable(T) | { T --?(T); }) T g(T);
     394forall(otype T, U | Addable(T) | { T ?/?(T, U); }) U h(T, U);
    306395\end{cfacode}
    307396This capability allows specifying the same set of assertions in multiple locations, without the repetition and likelihood of mistakes that come with manually writing them out for each function declaration.
    308397
    309 An interesting application of return-type resolution and polymorphism is a type-safe version of @malloc@.
     398An interesting application of return-type resolution and polymorphism is a polymorphic version of @malloc@.
    310399\begin{cfacode}
    311400forall(dtype T | sized(T))
     
    321410The built-in trait @sized@ ensures that size and alignment information for @T@ is available in the body of @malloc@ through @sizeof@ and @_Alignof@ expressions respectively.
    322411In calls to @malloc@, the type @T@ is bound based on call-site information, allowing \CFA code to allocate memory without the potential for errors introduced by manually specifying the size of the allocated block.
     412
     413\subsection{Planned Features}
     414
     415One of the planned features \CFA is \emph{reference types}.
     416At a high level, the current proposal is to add references as a way to cleanup pointer syntax.
     417With references, it will be possible to store any address, as with a pointer, with the key difference being that references are automatically dereferenced.
     418\begin{cfacode}
     419int x = 0;
     420int * p = &x;  // needs &
     421int & ref = x; // no &
     422
     423printf("%d %d\n", *p, ref); // pointer needs *, ref does not
     424\end{cfacode}
     425
     426It is possible to add new functions or shadow existing functions for the duration of a scope, using normal C scoping rules.
     427One application of this feature is to reverse the order of @qsort@.
     428\begin{cfacode}
     429forall(otype T | { int ?<?( T, T ); })
     430void qsort(const T * arr, size_t size) {
     431  int comp(const void * t1, const void * t2) {
     432    return *(T *)t1 < *(T *)t2 ? -1 : *(T *)t2 < *(T *)t1 ? 1 : 0;
     433  }
     434  qsort(arr, dimension, sizeof(T), comp);
     435
     436}
     437double vals[10] = { ... };
     438qsort(vals, 10);                // ascending order
     439{
     440  int ?<?(double x, double y) { // locally override behaviour
     441    return x > y;
     442  }
     443  qsort(vals, 10);              // descending sort
     444}
     445\end{cfacode}
     446Currently, there is no way to \emph{remove} a function from consideration from the duration of a scope.
     447For example, it may be desirable to eliminate assignment from a scope, to reduce accidental mutation.
     448To address this desire, \emph{deleted functions} are a planned feature for \CFA.
     449\begin{cfacode}
     450forall(otype T) void f(T *);
     451
     452int x = 0;
     453f(&x);  // might modify x
     454{
     455  int ?=?(int *, int) = delete;
     456  f(&x);   // error, no assignment for int
     457}
     458\end{cfacode}
     459Now, if the deleted function is chosen as the best match, the expression resolver emits an error.
    323460
    324461\section{Invariants}
     
    450587\end{javacode}
    451588In Java 7, a new \emph{try-with-resources} construct was added to alleviate most of the pain of working with resources, but ultimately it still places the burden squarely on the user rather than on the library designer.
    452 Furthermore, for complete safety this pattern requires nested objects to be declared separately, otherwise resources that can throw an exception on close can leak nested resources \cite{TryWithResources}.
     589Furthermore, for complete safety this pattern requires nested objects to be declared separately, otherwise resources that can throw an exception on close can leak nested resources \footnote{Since close is only guaranteed to be called on objects declared in the try-list and not objects passed as constructor parameters, the @B@ object may not be closed in @new A(new B())@ if @A@'s close raises an exception.} \cite{TryWithResources}.
    453590\begin{javacode}
    454591public void write(String filename, String msg) throws Exception {
     
    521658% these are declared in the struct, so they're closer to C++ than to CFA, at least syntactically. Also do not allow for default constructors
    522659% D has a GC, which already makes the situation quite different from C/C++
    523 The programming language, D, also manages resources with constructors and destructors \cite{D}.
    524 In D, @struct@s are stack allocated and managed via scoping like in \CC, whereas @class@es are managed automatically by the garbage collector.
     660The programming language D also manages resources with constructors and destructors \cite{D}.
     661In D, @struct@s are stack allocatable and managed via scoping like in \CC, whereas @class@es are managed automatically by the garbage collector.
    525662Like Java, using the garbage collector means that destructors are called indeterminately, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up.
    526663Since D supports RAII, it is possible to use the same techniques as in \CC to ensure that resources are released in a timely manner.
     
    755892
    756893Type-safe variadic functions are added to \CFA and discussed in Chapter 4.
     894
     895\section{Contributions}
     896\label{s:contributions}
     897
     898No prior work on constructors or destructors had been done for \CFA.
     899I did both the design and implementation work.
     900While the overall design is based on constructors and destructors in object-oriented C++, it had to be re-engineered into non-object-oriented \CFA.
     901I also had to make changes to the \CFA expression-resolver to integrate constructors and destructors into the type system.
     902
     903Prior work on the design of tuples for \CFA was done by Till, and some initial implementation work by Esteves.
     904I largely took the Till design but added tuple indexing, which exists in a number of programming languages with tuples, simplified the implicit tuple conversions, and integrated with the \CFA polymorphism and assertion satisfaction model.
     905I did a new implementation of tuples, and extensively
     906augmented initial work by Bilson to incorporate tuples into the \CFA expression-resolver and type-unifier.
     907
     908No prior work on variadic functions had been done for \CFA.
     909I did both the design and implementation work.
     910While the overall design is based on variadic templates in C++, my design is novel in the way it is incorporated into the \CFA polymorphism model, and is engineered into \CFA so it dovetails with tuples.
  • doc/rob_thesis/thesis-frontpgs.tex

    rb3d70eb r12d3187  
    8686%\newpage
    8787
    88 % % A C K N O W L E D G E M E N T S
    89 % % -------------------------------
     88% A C K N O W L E D G E M E N T S
     89% -------------------------------
    9090
    91 % \begin{center}\textbf{Acknowledgements}\end{center}
     91\begin{center}\textbf{Acknowledgements}\end{center}
    9292
    93 % % I would like to thank all the little people who made this possible.
    94 % TODO
    95 % \cleardoublepage
    96 % %\newpage
     93I would like to thank my supervisor, Professor Peter Buhr, for all of his help, including reading the many drafts of this thesis and providing guidance throughout my degree.
     94This work would not have been as enjoyable, nor would it have been as strong without Peter's knowledge, help, and encouragement.
     95
     96I would like to thank my readers, Professors Gregor Richards and Patrick Lam for all of their helpful feedback.
     97
     98Thanks to Aaron Moss and Thierry Delisle for many helpful discussions, both work-related and not, and for all of the work they have put into the \CFA project.
     99This thesis would not have been the same without their efforts.
     100
     101I thank Glen Ditchfield and Richard Bilson, for all of their help with both the design and implementation of \CFA.
     102
     103I thank my partner, Erin Blackmere, for all of her love and support.
     104Without her, I would not be who I am today.
     105
     106Thanks to my parents, Bob and Jackie Schluntz, for their love and support throughout my life, and for always encouraging me to be my best.
     107
     108Thanks to my best friends, Travis Bartlett, Abraham Dubrisingh, and Kevin Wu, whose companionship is always appreciated.
     109The time we've spent together over the past 4 years has always kept me entertained.
     110An extra shout-out to Kaleb Alway, Max Bardakov, Ten Bradley, and Ed Lee, with whom I've shared many a great meal; thank you for being my friend.
     111
     112Finally, I would like to acknowledge financial support in the form of a David R. Cheriton Graduate Scholarship and a corporate partnership with Huawei Ltd.
     113
     114\cleardoublepage
     115%\newpage
    97116
    98117% % D E D I C A T I O N
  • doc/rob_thesis/thesis.tex

    rb3d70eb r12d3187  
    118118\usepackage[pdftex]{graphicx} % For including graphics N.B. pdftex graphics driver
    119119
     120\usepackage{xcolor}
     121\usepackage{listings}
     122
    120123\input{cfa-format.tex}
    121124
     
    138141    pdftitle={Resource Management and Tuples in \CFA},    % title: CHANGE THIS TEXT!
    139142    pdfauthor={Rob Schluntz},    % author: CHANGE THIS TEXT! and uncomment this line
    140 %    pdfsubject={Subject},  % subject: CHANGE THIS TEXT! and uncomment this line
     143    pdfsubject={Programming Languages},  % subject: CHANGE THIS TEXT! and uncomment this line
    141144%    pdfkeywords={keyword1} {key2} {key3}, % list of keywords, and uncomment this line if desired
    142145    pdfnewwindow=true,      % links in new window
  • doc/rob_thesis/tuples.tex

    rb3d70eb r12d3187  
    161161\end{cfacode}
    162162
     163\begin{sloppypar}
    163164In addition to variables of tuple type, it is also possible to have pointers to tuples, and arrays of tuples.
    164 Tuple types can be composed of any types, except for array types, since arrays do not carry their size around, which makes tuple assignment difficult when a tuple contains an array.
     165Tuple types can be composed of any types, except for array types, since array assignment is disallowed, which makes tuple assignment difficult when a tuple contains an array.
    165166\begin{cfacode}
    166167[double, int] di;
     
    169170\end{cfacode}
    170171This examples declares a variable of type @[double, int]@, a variable of type pointer to @[double, int]@, and an array of ten @[double, int]@.
     172\end{sloppypar}
    171173
    172174\subsection{Tuple Indexing}
     
    212214The flexible structure of tuples permits a simple and expressive function-call syntax to work seamlessly with both single- and multiple-return-value functions, and with any number of arguments of arbitrarily complex structure.
    213215
    214 In \KWC \cite{Buhr94a,Till89}, a precursor to \CFA, there were 4 tuple coercions: opening, closing, flattening, and structuring.
     216In \KWC \cite{Buhr94a,Till89}, there were 4 tuple coercions: opening, closing, flattening, and structuring.
    215217Opening coerces a tuple value into a tuple of values, while closing converts a tuple of values into a single tuple value.
    216218Flattening coerces a nested tuple into a flat tuple, \ie it takes a tuple with tuple components and expands it into a tuple with only non-tuple components.
     
    218220
    219221In \CFA, the design has been simplified to require only the two conversions previously described, which trigger only in function call and return situations.
     222This simplification is a primary contribution of this thesis to the design of tuples in \CFA.
    220223Specifically, the expression resolution algorithm examines all of the possible alternatives for an expression to determine the best match.
    221224In resolving a function call expression, each combination of function value and list of argument alternatives is examined.
     
    254257Let $L_i$ for $i$ in $[0, n)$ represent each component of the flattened left side, $R_i$ represent each component of the flattened right side of a multiple assignment, and $R$ represent the right side of a mass assignment.
    255258
    256 For a multiple assignment to be valid, both tuples must have the same number of elements when flattened. Multiple assignment assigns $R_i$ to $L_i$ for each $i$.
     259For a multiple assignment to be valid, both tuples must have the same number of elements when flattened.
     260For example, the following is invalid because the number of components on the left does not match the number of components on the right.
     261\begin{cfacode}
     262[int, int] x, y, z;
     263[x, y] = z;   // multiple assignment, invalid 4 != 2
     264\end{cfacode}
     265Multiple assignment assigns $R_i$ to $L_i$ for each $i$.
    257266That is, @?=?(&$L_i$, $R_i$)@ must be a well-typed expression.
    258267In the previous example, @[x, y] = z@, @z@ is flattened into @z.0, z.1@, and the assignments @x = z.0@ and @y = z.1@ happen.
     
    265274
    266275Both kinds of tuple assignment have parallel semantics, such that each value on the left side and right side is evaluated \emph{before} any assignments occur.
    267 As a result, it is possible to swap the values in two variables without explicitly creating any temporary variables or calling a function,
     276As a result, it is possible to swap the values in two variables without explicitly creating any temporary variables or calling a function.
    268277\begin{cfacode}
    269278int x = 10, y = 20;
     
    296305\subsection{Tuple Construction}
    297306Tuple construction and destruction follow the same rules and semantics as tuple assignment, except that in the case where there is no right side, the default constructor or destructor is called on each component of the tuple.
     307As constructors and destructors did not exist in previous versions of \CFA or in \KWC, this is a primary contribution of this thesis to the design of tuples.
    298308\begin{cfacode}
    299309struct S;
     
    433443\section{Casting}
    434444In C, the cast operator is used to explicitly convert between types.
    435 In \CFA, the cast operator has a secondary use, which is type ascription, since it force the expression resolution algorithm to choose the lowest cost conversion to the target type.
     445In \CFA, the cast operator has a secondary use, which is type ascription, since it forces the expression resolution algorithm to choose the lowest cost conversion to the target type.
    436446That is, a cast can be used to select the type of an expression when it is ambiguous, as in the call to an overloaded function.
    437447\begin{cfacode}
     
    487497\section{Polymorphism}
    488498Due to the implicit flattening and structuring conversions involved in argument passing, @otype@ and @dtype@ parameters are restricted to matching only with non-tuple types.
     499The integration of polymorphism, type assertions, and monomorphic specialization of tuple-assertions are a primary contribution of this thesis to the design of tuples.
    489500\begin{cfacode}
    490501forall(otype T, dtype U)
     
    524535It is also important to note that these calls could be disambiguated if the function return types were different, as they likely would be for a reasonable implementation of @?+?@, since the return type is used in overload resolution.
    525536Still, these semantics are a deficiency of the current argument matching algorithm, and depending on the function, differing return values may not always be appropriate.
    526 These issues could be rectified by applying an appropriate cost to the structuring and flattening conversions, which are currently 0-cost conversions.
     537These issues could be rectified by applying an appropriate conversion cost to the structuring and flattening conversions, which are currently 0-cost conversions in the expression resolver.
    527538Care would be needed in this case to ensure that exact matches do not incur such a cost.
    528539\begin{cfacode}
     
    559570\section{Implementation}
    560571Tuples are implemented in the \CFA translator via a transformation into generic types.
     572Generic types are an independent contribution developed at the same time.
     573The transformation into generic types and the generation of tuple-specific code are primary contributions of this thesis to tuples.
     574
    561575The first time an $N$-tuple is seen for each $N$ in a scope, a generic type with $N$ type parameters is generated.
    562576For example,
  • doc/rob_thesis/variadic.tex

    rb3d70eb r12d3187  
    55\section{Design Criteria} % TODO: better section name???
    66C provides variadic functions through the manipulation of @va_list@ objects.
    7 A variadic function is one which contains at least one parameter, followed by @...@ as the last token in the parameter list.
    8 In particular, some form of \emph{argument descriptor} is needed to inform the function of the number of arguments and their types.
     7In C, a variadic function is one which contains at least one parameter, followed by @...@ as the last token in the parameter list.
     8In particular, some form of \emph{argument descriptor} or \emph{sentinel value} is needed to inform the function of the number of arguments and their types.
    99Two common argument descriptors are format strings or counter parameters.
    10 It is important to note that both of these mechanisms are inherently redundant, because they require the user to explicitly specify information that the compiler already knows.
     10It is important to note that both of these mechanisms are inherently redundant, because they require the user to explicitly specify information that the compiler already knows \footnote{While format specifiers can convey some information the compiler does not know, such as whether to print a number in decimal or hexadecimal, the number of arguments is wholly redundant.}.
    1111This required repetition is error prone, because it is easy for the user to add or remove arguments without updating the argument descriptor.
    1212In addition, C requires the programmer to hard code all of the possible expected types.
     
    152152That is to say, the programmer who writes @sum@ does not need full program knowledge of every possible data type, unlike what is necessary to write an equivalent function using the standard C mechanisms.
    153153
     154\begin{sloppypar}
    154155Going one last step, it is possible to achieve full generality in \CFA, allowing the summation of arbitrary lists of summable types.
    155156\begin{cfacode}
     
    170171\end{cfacode}
    171172The \CFA translator requires adding explicit @double ?+?(int, double)@ and @double ?+?(double, int)@ functions for this call to work, since implicit conversions are not supported for assertions.
     173\end{sloppypar}
    172174
    173175A notable limitation of this approach is that it heavily relies on recursive assertions.
     
    226228In the call to @new@, @Array@ is selected to match @T@, and @Params@ is expanded to match @[int, int, int, int]@. To satisfy the assertions, a constructor with an interface compatible with @void ?{}(Array *, int, int, int)@ must exist in the current scope.
    227229
    228 The @new@ function provides the combination of type-safe @malloc@ with a constructor call, so that it becomes impossible to forget to construct dynamically-allocated objects.
     230The @new@ function provides the combination of polymorphic @malloc@ with a constructor call, so that it becomes impossible to forget to construct dynamically-allocated objects.
    229231This approach provides the type-safety of @new@ in \CC, without the need to specify the allocated type, thanks to return-type inference.
    230232
Note: See TracChangeset for help on using the changeset viewer.