source: doc/theses/andrew_beach_MMath/existing.tex @ 0cfa768

ADTast-experimentalenumforall-pointer-decayjacob/cs343-translationpthread-emulationqualifiedEnum
Last change on this file since 0cfa768 was be497c6, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: Used Peter's feedback for the existing chapter.

  • Property mode set to 100644
File size: 14.8 KB
RevLine 
[382edbe]1\chapter{\CFA{} Existing Features}
[553f8abe]2\label{c:existing}
[f28fdee]3
[382edbe]4\CFA is an open-source project extending ISO C with
[6c79bef]5modern safety and productivity features, while still ensuring backwards
6compatibility with C and its programmers.  \CFA is designed to have an
7orthogonal feature-set based closely on the C programming paradigm
8(non-object-oriented) and these features can be added incrementally to an
9existing C code-base allowing programmers to learn \CFA on an as-needed basis.
10
[382edbe]11Only those \CFA features pertaining to this thesis are discussed.
[be497c6]12A familiarity with
[382edbe]13C or C-like languages is assumed.
[6c79bef]14
[9af0fe2d]15\section{Overloading and \lstinline{extern}}
[6c79bef]16\CFA has extensive overloading, allowing multiple definitions of the same name
[67c6a47]17to be defined~\cite{Moss18}.
[f42a6b8]18\begin{cfa}
19char i; int i; double i;
20int f(); double f();
21void g( int ); void g( double );
22\end{cfa}
[6c79bef]23This feature requires name mangling so the assembly symbols are unique for
24different overloads. For compatibility with names in C, there is also a syntax
25to disable name mangling. These unmangled names cannot be overloaded but act as
26the interface between C and \CFA code.  The syntax for disabling/enabling
27mangling is:
[f28fdee]28\begin{cfa}
[edc6ea2]29// name mangling on by default
[6c79bef]30int i; // _X1ii_1
[21f2e92]31extern "C" {  // disables name mangling
[6c79bef]32        int j; // j
[21f2e92]33        extern "Cforall" {  // enables name mangling
[6c79bef]34                int k; // _X1ki_1
35        }
[edc6ea2]36        // revert to no name mangling
[6e7b969]37}
[edc6ea2]38// revert to name mangling
[6c79bef]39\end{cfa}
40Both forms of @extern@ affect all the declarations within their nested lexical
41scope and transition back to the previous mangling state when the lexical scope
42ends.
43
44\section{Reference Type}
[03c0e44]45\CFA adds a reference type to C as an auto-dereferencing pointer.
46They work very similarly to pointers.
[382edbe]47Reference-types are written the same way as a pointer-type but each
[03c0e44]48asterisk (@*@) is replaced with a ampersand (@&@);
[21f2e92]49this includes cv-qualifiers and multiple levels of reference.
[03c0e44]50
[382edbe]51Generally, references act like pointers with an implicate dereferencing
52operation added to each use of the variable.
53These automatic dereferences may be disabled with the address-of operator
54(@&@).
[21f2e92]55
[382edbe]56% Check to see if these are generating errors.
57\begin{minipage}{0,5\textwidth}
[03c0e44]58With references:
59\begin{cfa}
60int i, j;
61int & ri = i;
62int && rri = ri;
63rri = 3;
[f42a6b8]64&ri = &j;
[03c0e44]65ri = 5;
66\end{cfa}
67\end{minipage}
[382edbe]68\begin{minipage}{0,5\textwidth}
[03c0e44]69With pointers:
[6c79bef]70\begin{cfa}
71int i, j;
[03c0e44]72int * pi = &i
73int ** ppi = &pi;
74**ppi = 3;
[21f2e92]75pi = &j;
[03c0e44]76*pi = 5;
[f28fdee]77\end{cfa}
[03c0e44]78\end{minipage}
[6e7b969]79
[be497c6]80References are intended to be used when the indirection of a pointer is
81required, but the address is not as important as the value and dereferencing
82is the common usage.
[382edbe]83Mutable references may be assigned to by converting them to a pointer
[be497c6]84with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above.
85% ???
[6e7b969]86
[382edbe]87\section{Operators}
[6c79bef]88
[be497c6]89\CFA implements operator overloading by providing special names, where
90operator expressions are translated into function calls using these names.
91An operator name is created by taking the operator symbols and joining them with
[1e567ab]92@?@s to show where the arguments go.
[382edbe]93For example,
[be497c6]94infixed multiplication is @?*?@, while prefix dereference is @*?@.
[edc6ea2]95This syntax make it easy to tell the difference between prefix operations
96(such as @++?@) and post-fix operations (@?++@).
[f42a6b8]97
[be497c6]98As an example, here are the addition and equality operators for a point type.
[382edbe]99\begin{cfa}
[6071efc]100point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
[be497c6]101int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
[382edbe]102{
103        assert(point{1, 2} + point{3, 4} == point{4, 6});
104}
105\end{cfa}
[be497c6]106Note that this syntax works effectively but a textual transformation,
107the compiler converts all operators into functions and then resolves them
108normally. This means any combination of types may be used,
109although nonsensical ones (like @double ?==?(point, int);@) are discouraged.
110This feature is also used for all builtin operators as well,
111although those are implicitly provided by the language.
[382edbe]112
113%\subsection{Constructors and Destructors}
[be497c6]114In \CFA, constructors and destructors are operators, which means they are
115functions with special operator names rather than type names in \Cpp.
116Both constructors and destructors can be implicity called by the compiler,
117however the operator names allow explicit calls.
[f42a6b8]118% Placement new means that this is actually equivant to C++.
[382edbe]119
[21f2e92]120The special name for a constructor is @?{}@, which comes from the
[382edbe]121initialization syntax in C, \eg @Example e = { ... }@.
[be497c6]122\CFA generates a constructor call each time a variable is declared,
123passing the initialization arguments to the constructor.
[edc6ea2]124\begin{cfa}
[21f2e92]125struct Example { ... };
126void ?{}(Example & this) { ... }
[f42a6b8]127{
128        Example a;
129        Example b = {};
130}
[21f2e92]131void ?{}(Example & this, char first, int num) { ... }
[f42a6b8]132{
133        Example c = {'a', 2};
134}
[edc6ea2]135\end{cfa}
[f42a6b8]136Both @a@ and @b@ will be initalized with the first constructor,
[be497c6]137@b@ because of the explicit call and @a@ implicitly.
138@c@ will be initalized with the second constructor.
[f42a6b8]139Currently, there is no general way to skip initialation.
[be497c6]140% I don't use @= anywhere in the thesis.
[f42a6b8]141
[6c79bef]142% I don't like the \^{} symbol but $^\wedge$ isn't better.
[be497c6]143Similarly, destructors use the special name @^?{}@ (the @^@ has no special
[382edbe]144meaning).
[6c79bef]145\begin{cfa}
[21f2e92]146void ^?{}(Example & this) { ... }
[6c79bef]147{
[f42a6b8]148        Example d;
[be497c6]149        ^?{}(d);
150
151        Example e;
152} // Implicit call of ^?{}(e);
[6c79bef]153\end{cfa}
[edc6ea2]154
[be497c6]155Whenever a type is defined, \CFA creates a default zero-argument
[edc6ea2]156constructor, a copy constructor, a series of argument-per-field constructors
157and a destructor. All user constructors are defined after this.
[6e7b969]158
159\section{Polymorphism}
[6c79bef]160\CFA uses parametric polymorphism to create functions and types that are
161defined over multiple types. \CFA polymorphic declarations serve the same role
[29c9b23]162as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
[6c79bef]163accomplished by passing argument operations to associate \emph{parameters} at
164the call site, and these parameters are used in the function to differentiate
165among the types the function operates on.
166
167Polymorphic declarations start with a universal @forall@ clause that goes
168before the standard (monomorphic) declaration. These declarations have the same
169syntax except they may use the universal type names introduced by the @forall@
170clause.  For example, the following is a polymorphic identity function that
171works on any type @T@:
172\begin{cfa}
[edc6ea2]173forall( T ) T identity( T val ) { return val; }
174int forty_two = identity( 42 );
175char capital_a = identity( 'A' );
[6c79bef]176\end{cfa}
[382edbe]177Each use of a polymorphic declaration resolves its polymorphic parameters
[edc6ea2]178(in this case, just @T@) to concrete types (@int@ in the first use and @char@
179in the second).
[6e7b969]180
[6c79bef]181To allow a polymorphic function to be separately compiled, the type @T@ must be
182constrained by the operations used on @T@ in the function body. The @forall@
[382edbe]183clause is augmented with a list of polymorphic variables (local type names)
[6c79bef]184and assertions (constraints), which represent the required operations on those
185types used in a function, \eg:
[f28fdee]186\begin{cfa}
[382edbe]187forall( T | { void do_once(T); } )
[6c79bef]188void do_twice(T value) {
189        do_once(value);
190        do_once(value);
[6e7b969]191}
[f28fdee]192\end{cfa}
[6c79bef]193
194A polymorphic function can be used in the same way as a normal function.  The
195polymorphic variables are filled in with concrete types and the assertions are
196checked. An assertion is checked by verifying each assertion operation (with
197all the variables replaced with the concrete types from the arguments) is
198defined at a call site.
[edc6ea2]199\begin{cfa}
200void do_once(int i) { ... }
201int i;
202do_twice(i);
203\end{cfa}
204Any object with a type fulfilling the assertion may be passed as an argument to
205a @do_twice@ call.
[6c79bef]206
207Note, a function named @do_once@ is not required in the scope of @do_twice@ to
[29c9b23]208compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
[be497c6]209allows local replacement of the specific parametric functions needs for a
[6c79bef]210call.
[f28fdee]211\begin{cfa}
[edc6ea2]212void do_once(double y) { ... }
[6e7b969]213int quadruple(int x) {
[382edbe]214        void do_once(int & y) { y = y * 2; }
[21f2e92]215        do_twice(x);
[6c79bef]216        return x;
[6e7b969]217}
[f28fdee]218\end{cfa}
[6c79bef]219Specifically, the complier deduces that @do_twice@'s T is an integer from the
[21f2e92]220argument @x@. It then looks for the most specific definition matching the
[6c79bef]221assertion, which is the nested integral @do_once@ defined within the
222function. The matched assertion function is then passed as a function pointer
[21f2e92]223to @do_twice@ and called within it.
[382edbe]224The global definition of @do_once@ is ignored, however if quadruple took a
[be497c6]225@double@ argument, then the global definition would be used instead as it
226would then be a better match.
227\todo{cite Aaron's thesis (maybe)}
[6c79bef]228
[be497c6]229To avoid typing long lists of assertions, constraints can be collected into
230convenient a package called a @trait@, which can then be used in an assertion
[6c79bef]231instead of the individual constraints.
[f28fdee]232\begin{cfa}
[6c79bef]233trait done_once(T) {
234        void do_once(T);
[6e7b969]235}
[f28fdee]236\end{cfa}
[6c79bef]237and the @forall@ list in the previous example is replaced with the trait.
[f28fdee]238\begin{cfa}
[edc6ea2]239forall(dtype T | done_once(T))
[f28fdee]240\end{cfa}
[6c79bef]241In general, a trait can contain an arbitrary number of assertions, both
242functions and variables, and are usually used to create a shorthand for, and
243give descriptive names to, common groupings of assertions describing a certain
244functionality, like @sumable@, @listable@, \etc.
245
[be497c6]246Polymorphic structures and unions are defined by qualifying an aggregate type
[6c79bef]247with @forall@. The type variables work the same except they are used in field
248declarations instead of parameters, returns, and local variable declarations.
[f28fdee]249\begin{cfa}
[edc6ea2]250forall(dtype T)
[6e7b969]251struct node {
[9b0bb79]252        node(T) * next;
[edc6ea2]253        T * data;
[be497c6]254};
[edc6ea2]255node(int) inode;
[f28fdee]256\end{cfa}
[edc6ea2]257The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
258template usage, a polymorphic type usage must specify a type parameter.
[6e7b969]259
[6c79bef]260There are many other polymorphism features in \CFA but these are the ones used
261by the exception system.
[6e7b969]262
[67c6a47]263\section{Control Flow}
264\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
265The two features that interact with
266the exception system are @coroutine@ and @thread@; they and their supporting
[6c79bef]267constructs are described here.
268
269\subsection{Coroutine}
270A coroutine is a type with associated functions, where the functions are not
271required to finish execution when control is handed back to the caller. Instead
272they may suspend execution at any time and be resumed later at the point of
273last suspension. (Generators are stackless and coroutines are stackful.) These
274types are not concurrent but share some similarities along with common
275underpinnings, so they are combined with the \CFA threading library. Further
276discussion in this section only refers to the coroutine because generators are
277similar.
278
279In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
280aggregate type like @struct,@ except the structure is implicitly modified by
281the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
282restricted by the type system to types that provide this special trait.  The
283coroutine structure acts as the interface between callers and the coroutine,
284and its fields are used to pass information in and out of coroutine interface
285functions.
286
287Here is a simple example where a single field is used to pass (communicate) the
288next number in a sequence.
[f28fdee]289\begin{cfa}
[6e7b969]290coroutine CountUp {
[9b0bb79]291        unsigned int next;
[be497c6]292};
[6c79bef]293CountUp countup;
[f28fdee]294\end{cfa}
[67c6a47]295Each coroutine has a @main@ function, which takes a reference to a coroutine
[6c79bef]296object and returns @void@.
[382edbe]297%[numbers=left] Why numbers on this one?
[f42a6b8]298\begin{cfa}
[edc6ea2]299void main(CountUp & this) {
[f42a6b8]300        for (unsigned int next = 0 ; true ; ++next) {
[be497c6]301                this.next = next;
[edc6ea2]302                suspend;$\label{suspend}$
[6c79bef]303        }
[6e7b969]304}
[f28fdee]305\end{cfa}
[6c79bef]306In this function, or functions called by this function (helper functions), the
[f42a6b8]307@suspend@ statement is used to return execution to the coroutine's caller
308without terminating the coroutine's function.
[6c79bef]309
310A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
311The first resume calls the @main@ function at the top. Thereafter, resume calls
312continue a coroutine in the last suspended function after the @suspend@
[be497c6]313statement. In this case there is only one and, hence, the difference between
314subsequent calls is the state of variables inside the function and the
315coroutine object.
316The return value of @resume@ is a reference to the coroutine, to make it
317convent to access fields of the coroutine in the same expression.
318Here is a simple example in a helper function:
319\begin{cfa}
320unsigned int get_next(CountUp & this) {
321        return resume(this).next;
322}
323\end{cfa}
324
325When the main function returns the coroutine halts and can no longer be
326resumed.
[6e7b969]327
[67c6a47]328\subsection{Monitor and Mutex Parameter}
[6c79bef]329Concurrency does not guarantee ordering; without ordering results are
330non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
331(mutual exclusion) parameters. A monitor is another kind of aggregate, where
332the compiler implicitly inserts a lock and instances are compatible with
333@mutex@ parameters.
334
335A function that requires deterministic (ordered) execution, acquires mutual
336exclusion on a monitor object by qualifying an object reference parameter with
337@mutex@.
[f42a6b8]338\begin{cfa}
339void example(MonitorA & mutex argA, MonitorB & mutex argB);
340\end{cfa}
[6c79bef]341When the function is called, it implicitly acquires the monitor lock for all of
342the mutex parameters without deadlock.  This semantics means all functions with
343the same mutex type(s) are part of a critical section for objects of that type
344and only one runs at a time.
[6e7b969]345
[67c6a47]346\subsection{Thread}
[6c79bef]347Functions, generators, and coroutines are sequential so there is only a single
348(but potentially sophisticated) execution path in a program. Threads introduce
349multiple execution paths that continue independently.
[6e7b969]350
[6c79bef]351For threads to work safely with objects requires mutual exclusion using
352monitors and mutex parameters. For threads to work safely with other threads,
353also requires mutual exclusion in the form of a communication rendezvous, which
[67c6a47]354also supports internal synchronization as for mutex objects. For exceptions,
355only two basic thread operations are important: fork and join.
[6e7b969]356
[6c79bef]357Threads are created like coroutines with an associated @main@ function:
[f28fdee]358\begin{cfa}
[6e7b969]359thread StringWorker {
[6c79bef]360        const char * input;
361        int result;
[6e7b969]362};
363void main(StringWorker & this) {
[6c79bef]364        const char * localCopy = this.input;
365        // ... do some work, perhaps hashing the string ...
366        this.result = result;
[6e7b969]367}
[6c79bef]368{
369        StringWorker stringworker; // fork thread running in "main"
[be497c6]370} // Implicit call to join(stringworker), waits for completion.
[f28fdee]371\end{cfa}
[6c79bef]372The thread main is where a new thread starts execution after a fork operation
373and then the thread continues executing until it is finished. If another thread
374joins with an executing thread, it waits until the executing main completes
375execution. In other words, everything a thread does is between a fork and join.
376
377From the outside, this behaviour is accomplished through creation and
378destruction of a thread object.  Implicitly, fork happens after a thread
379object's constructor is run and join happens before the destructor runs. Join
380can also be specified explicitly using the @join@ function to wait for a
381thread's completion independently from its deallocation (\ie destructor
382call). If @join@ is called explicitly, the destructor does not implicitly join.
Note: See TracBrowser for help on using the repository browser.