source: doc/theses/andrew_beach_MMath/existing.tex @ b0b89a8

ADTast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since b0b89a8 was 1e567ab, checked in by Peter A. Buhr <pabuhr@…>, 3 years ago

fix error in proofreading

  • Property mode set to 100644
File size: 15.2 KB
RevLine 
[382edbe]1\chapter{\CFA{} Existing Features}
[553f8abe]2\label{c:existing}
[f28fdee]3
[382edbe]4\CFA is an open-source project extending ISO C with
[6c79bef]5modern safety and productivity features, while still ensuring backwards
6compatibility with C and its programmers.  \CFA is designed to have an
7orthogonal feature-set based closely on the C programming paradigm
8(non-object-oriented) and these features can be added incrementally to an
9existing C code-base allowing programmers to learn \CFA on an as-needed basis.
10
[382edbe]11Only those \CFA features pertaining to this thesis are discussed.
[fa7dbf1]12% Also, only new features of \CFA will be discussed,
13A familiarity with
[382edbe]14C or C-like languages is assumed.
[6c79bef]15
[9af0fe2d]16\section{Overloading and \lstinline{extern}}
[6c79bef]17\CFA has extensive overloading, allowing multiple definitions of the same name
[67c6a47]18to be defined~\cite{Moss18}.
[fa7dbf1]19\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
20char @i@; int @i@; double @i@;
21int @f@(); double @f@();
22void @g@( int ); void @g@( double );
23\end{lstlisting}
[6c79bef]24This feature requires name mangling so the assembly symbols are unique for
25different overloads. For compatibility with names in C, there is also a syntax
26to disable name mangling. These unmangled names cannot be overloaded but act as
27the interface between C and \CFA code.  The syntax for disabling/enabling
28mangling is:
[f28fdee]29\begin{cfa}
[edc6ea2]30// name mangling on by default
[6c79bef]31int i; // _X1ii_1
[21f2e92]32extern "C" {  // disables name mangling
[6c79bef]33        int j; // j
[21f2e92]34        extern "Cforall" {  // enables name mangling
[6c79bef]35                int k; // _X1ki_1
36        }
[edc6ea2]37        // revert to no name mangling
[6e7b969]38}
[edc6ea2]39// revert to name mangling
[6c79bef]40\end{cfa}
41Both forms of @extern@ affect all the declarations within their nested lexical
42scope and transition back to the previous mangling state when the lexical scope
43ends.
44
45\section{Reference Type}
[03c0e44]46\CFA adds a reference type to C as an auto-dereferencing pointer.
47They work very similarly to pointers.
[382edbe]48Reference-types are written the same way as a pointer-type but each
[03c0e44]49asterisk (@*@) is replaced with a ampersand (@&@);
[21f2e92]50this includes cv-qualifiers and multiple levels of reference.
[03c0e44]51
[382edbe]52Generally, references act like pointers with an implicate dereferencing
53operation added to each use of the variable.
54These automatic dereferences may be disabled with the address-of operator
55(@&@).
[21f2e92]56
[382edbe]57% Check to see if these are generating errors.
58\begin{minipage}{0,5\textwidth}
[03c0e44]59With references:
60\begin{cfa}
61int i, j;
62int & ri = i;
63int && rri = ri;
64rri = 3;
[fa7dbf1]65&ri = &j; // rebindable
[03c0e44]66ri = 5;
67\end{cfa}
68\end{minipage}
[382edbe]69\begin{minipage}{0,5\textwidth}
[03c0e44]70With pointers:
[6c79bef]71\begin{cfa}
72int i, j;
[03c0e44]73int * pi = &i
74int ** ppi = &pi;
75**ppi = 3;
[21f2e92]76pi = &j;
[03c0e44]77*pi = 5;
[f28fdee]78\end{cfa}
[03c0e44]79\end{minipage}
[6e7b969]80
[fa7dbf1]81References are intended for pointer situations where dereferencing is the common usage,
82\ie the value is more important than the pointer.
[382edbe]83Mutable references may be assigned to by converting them to a pointer
84with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above
[6e7b969]85
[382edbe]86\section{Operators}
[6c79bef]87
[fa7dbf1]88\CFA implements operator overloading by providing special names, where
89operator usages are translated into function calls using these names.
90An operator name is created by taking the operator symbols and joining them with
[1e567ab]91@?@s to show where the arguments go.
[382edbe]92For example,
[fa7dbf1]93infixed multiplication is @?*?@, while prefix dereference is @*?@.
[edc6ea2]94This syntax make it easy to tell the difference between prefix operations
95(such as @++?@) and post-fix operations (@?++@).
[fa7dbf1]96For example, plus and equality operators are defined for a point type.
[382edbe]97\begin{cfa}
[6071efc]98point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
[fa7dbf1]99int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
[382edbe]100{
101        assert(point{1, 2} + point{3, 4} == point{4, 6});
102}
103\end{cfa}
[fa7dbf1]104Note these special names are not limited to builtin
105operators, and hence, may be used with arbitrary types.
106\begin{cfa}
107double ?+?( int x, point y ); // arbitrary types
108\end{cfa}
109% Some ``near misses", that are that do not match an operator form but looks like
110% it may have been supposed to, will generate warning but otherwise they are
111% left alone.
112Because operators are never part of the type definition they may be added
113at any time, including on built-in types.
[382edbe]114
115%\subsection{Constructors and Destructors}
116
[fa7dbf1]117\CFA also provides constructors and destructors as operators, which means they
118are functions with special operator names rather than type names in \Cpp.
119While constructors and destructions are normally called implicitly by the compiler,
120the special operator names, allow explicit calls.
121
122% Placement new means that this is actually equivalent to C++.
[382edbe]123
[21f2e92]124The special name for a constructor is @?{}@, which comes from the
[382edbe]125initialization syntax in C, \eg @Example e = { ... }@.
[fa7dbf1]126\CFA generates a constructor call each time a variable is declared,
127passing the initialization arguments to the constructor.
[edc6ea2]128\begin{cfa}
[21f2e92]129struct Example { ... };
130void ?{}(Example & this) { ... }
131void ?{}(Example & this, char first, int num) { ... }
[fa7dbf1]132Example a;              // implicit constructor calls
133Example b = {};
134Example c = {'a', 2};
135\end{cfa}
136Both @a@ and @b@ are initialized with the first constructor,
137while @c@ is initialized with the second.
138Constructor calls can be replaced with C initialization using special operator \lstinline{@=}.
139\begin{cfa}
140Example d @= {42};
[edc6ea2]141\end{cfa}
[6c79bef]142% I don't like the \^{} symbol but $^\wedge$ isn't better.
[fa7dbf1]143Similarly, destructors use the special name @^?{}@ (the @^@ has no special
[382edbe]144meaning).
[fa7dbf1]145% These are a normally called implicitly called on a variable when it goes out
146% of scope. They can be called explicitly as well.
[6c79bef]147\begin{cfa}
[21f2e92]148void ^?{}(Example & this) { ... }
[6c79bef]149{
[fa7dbf1]150        Example e;      // implicit constructor call
151        ^?{}(e);                // explicit destructor call
152        ?{}(e);         // explicit constructor call
153} // implicit destructor call
[6c79bef]154\end{cfa}
[edc6ea2]155
[fa7dbf1]156Whenever a type is defined, \CFA creates a default zero-argument
[edc6ea2]157constructor, a copy constructor, a series of argument-per-field constructors
158and a destructor. All user constructors are defined after this.
[6e7b969]159
160\section{Polymorphism}
[6c79bef]161\CFA uses parametric polymorphism to create functions and types that are
162defined over multiple types. \CFA polymorphic declarations serve the same role
[29c9b23]163as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
[6c79bef]164accomplished by passing argument operations to associate \emph{parameters} at
165the call site, and these parameters are used in the function to differentiate
166among the types the function operates on.
167
168Polymorphic declarations start with a universal @forall@ clause that goes
169before the standard (monomorphic) declaration. These declarations have the same
170syntax except they may use the universal type names introduced by the @forall@
171clause.  For example, the following is a polymorphic identity function that
172works on any type @T@:
173\begin{cfa}
[edc6ea2]174forall( T ) T identity( T val ) { return val; }
175int forty_two = identity( 42 );
176char capital_a = identity( 'A' );
[6c79bef]177\end{cfa}
[382edbe]178Each use of a polymorphic declaration resolves its polymorphic parameters
[edc6ea2]179(in this case, just @T@) to concrete types (@int@ in the first use and @char@
180in the second).
[6e7b969]181
[6c79bef]182To allow a polymorphic function to be separately compiled, the type @T@ must be
183constrained by the operations used on @T@ in the function body. The @forall@
[382edbe]184clause is augmented with a list of polymorphic variables (local type names)
[6c79bef]185and assertions (constraints), which represent the required operations on those
186types used in a function, \eg:
[f28fdee]187\begin{cfa}
[382edbe]188forall( T | { void do_once(T); } )
[6c79bef]189void do_twice(T value) {
190        do_once(value);
191        do_once(value);
[6e7b969]192}
[f28fdee]193\end{cfa}
[6c79bef]194
195A polymorphic function can be used in the same way as a normal function.  The
196polymorphic variables are filled in with concrete types and the assertions are
197checked. An assertion is checked by verifying each assertion operation (with
198all the variables replaced with the concrete types from the arguments) is
199defined at a call site.
[edc6ea2]200\begin{cfa}
201void do_once(int i) { ... }
202int i;
203do_twice(i);
204\end{cfa}
205Any object with a type fulfilling the assertion may be passed as an argument to
206a @do_twice@ call.
[6c79bef]207
208Note, a function named @do_once@ is not required in the scope of @do_twice@ to
[29c9b23]209compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
[fa7dbf1]210allows local replacement of the specific parametric functions needs for a
[6c79bef]211call.
[f28fdee]212\begin{cfa}
[edc6ea2]213void do_once(double y) { ... }
[6e7b969]214int quadruple(int x) {
[382edbe]215        void do_once(int & y) { y = y * 2; }
[21f2e92]216        do_twice(x);
[6c79bef]217        return x;
[6e7b969]218}
[f28fdee]219\end{cfa}
[6c79bef]220Specifically, the complier deduces that @do_twice@'s T is an integer from the
[21f2e92]221argument @x@. It then looks for the most specific definition matching the
[6c79bef]222assertion, which is the nested integral @do_once@ defined within the
223function. The matched assertion function is then passed as a function pointer
[21f2e92]224to @do_twice@ and called within it.
[382edbe]225The global definition of @do_once@ is ignored, however if quadruple took a
[fa7dbf1]226@double@ argument, then the global definition would be used instead as it
227is a better match.
[382edbe]228% Aaron's thesis might be a good reference here.
[6c79bef]229
230To avoid typing long lists of assertions, constraints can be collect into
[fa7dbf1]231convenient package called a @trait@, which can then be used in an assertion
[6c79bef]232instead of the individual constraints.
[f28fdee]233\begin{cfa}
[6c79bef]234trait done_once(T) {
235        void do_once(T);
[6e7b969]236}
[f28fdee]237\end{cfa}
[6c79bef]238and the @forall@ list in the previous example is replaced with the trait.
[f28fdee]239\begin{cfa}
[edc6ea2]240forall(dtype T | done_once(T))
[f28fdee]241\end{cfa}
[6c79bef]242In general, a trait can contain an arbitrary number of assertions, both
243functions and variables, and are usually used to create a shorthand for, and
244give descriptive names to, common groupings of assertions describing a certain
245functionality, like @sumable@, @listable@, \etc.
246
[fa7dbf1]247Polymorphic structures and unions are defined by qualifying an aggregate type
[6c79bef]248with @forall@. The type variables work the same except they are used in field
249declarations instead of parameters, returns, and local variable declarations.
[f28fdee]250\begin{cfa}
[edc6ea2]251forall(dtype T)
[6e7b969]252struct node {
[9b0bb79]253        node(T) * next;
[edc6ea2]254        T * data;
[6e7b969]255}
[edc6ea2]256node(int) inode;
[f28fdee]257\end{cfa}
[edc6ea2]258The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
259template usage, a polymorphic type usage must specify a type parameter.
[6e7b969]260
[6c79bef]261There are many other polymorphism features in \CFA but these are the ones used
262by the exception system.
[6e7b969]263
[67c6a47]264\section{Control Flow}
265\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
266The two features that interact with
267the exception system are @coroutine@ and @thread@; they and their supporting
[6c79bef]268constructs are described here.
269
270\subsection{Coroutine}
271A coroutine is a type with associated functions, where the functions are not
272required to finish execution when control is handed back to the caller. Instead
273they may suspend execution at any time and be resumed later at the point of
274last suspension. (Generators are stackless and coroutines are stackful.) These
275types are not concurrent but share some similarities along with common
276underpinnings, so they are combined with the \CFA threading library. Further
277discussion in this section only refers to the coroutine because generators are
278similar.
279
280In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
281aggregate type like @struct,@ except the structure is implicitly modified by
282the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
283restricted by the type system to types that provide this special trait.  The
284coroutine structure acts as the interface between callers and the coroutine,
285and its fields are used to pass information in and out of coroutine interface
286functions.
287
288Here is a simple example where a single field is used to pass (communicate) the
289next number in a sequence.
[f28fdee]290\begin{cfa}
[6e7b969]291coroutine CountUp {
[9b0bb79]292        unsigned int next;
[fa7dbf1]293};
[6c79bef]294CountUp countup;
[fa7dbf1]295for (10) sout | resume(countup).next; // print 10 values
[f28fdee]296\end{cfa}
[67c6a47]297Each coroutine has a @main@ function, which takes a reference to a coroutine
[6c79bef]298object and returns @void@.
[382edbe]299%[numbers=left] Why numbers on this one?
[fa7dbf1]300\begin{cfa}[numbers=left,numberstyle=\scriptsize\sf]
[edc6ea2]301void main(CountUp & this) {
[fa7dbf1]302        for (unsigned int up = 0;; ++up) {
303                this.next = up;
[edc6ea2]304                suspend;$\label{suspend}$
[6c79bef]305        }
[6e7b969]306}
[f28fdee]307\end{cfa}
[6c79bef]308In this function, or functions called by this function (helper functions), the
[fa7dbf1]309@suspend@ statement is used to return execution to the coroutine's resumer
310without terminating the coroutine's function(s).
[6c79bef]311
312A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
313The first resume calls the @main@ function at the top. Thereafter, resume calls
314continue a coroutine in the last suspended function after the @suspend@
315statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
316a reference to the coroutine structure and returns the same reference. The
317return value allows easy access to communication variables defined in the
318coroutine object. For example, the @next@ value for coroutine object @countup@
319is both generated and collected in the single expression:
320@resume(countup).next@.
[6e7b969]321
[67c6a47]322\subsection{Monitor and Mutex Parameter}
[6c79bef]323Concurrency does not guarantee ordering; without ordering results are
324non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
325(mutual exclusion) parameters. A monitor is another kind of aggregate, where
326the compiler implicitly inserts a lock and instances are compatible with
327@mutex@ parameters.
328
329A function that requires deterministic (ordered) execution, acquires mutual
330exclusion on a monitor object by qualifying an object reference parameter with
331@mutex@.
[fa7dbf1]332\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
333void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB);
334\end{lstlisting}
[6c79bef]335When the function is called, it implicitly acquires the monitor lock for all of
336the mutex parameters without deadlock.  This semantics means all functions with
337the same mutex type(s) are part of a critical section for objects of that type
338and only one runs at a time.
[6e7b969]339
[67c6a47]340\subsection{Thread}
[6c79bef]341Functions, generators, and coroutines are sequential so there is only a single
342(but potentially sophisticated) execution path in a program. Threads introduce
343multiple execution paths that continue independently.
[6e7b969]344
[6c79bef]345For threads to work safely with objects requires mutual exclusion using
346monitors and mutex parameters. For threads to work safely with other threads,
347also requires mutual exclusion in the form of a communication rendezvous, which
[67c6a47]348also supports internal synchronization as for mutex objects. For exceptions,
349only two basic thread operations are important: fork and join.
[6e7b969]350
[6c79bef]351Threads are created like coroutines with an associated @main@ function:
[f28fdee]352\begin{cfa}
[6e7b969]353thread StringWorker {
[6c79bef]354        const char * input;
355        int result;
[6e7b969]356};
357void main(StringWorker & this) {
[6c79bef]358        const char * localCopy = this.input;
359        // ... do some work, perhaps hashing the string ...
360        this.result = result;
[6e7b969]361}
[6c79bef]362{
363        StringWorker stringworker; // fork thread running in "main"
[fa7dbf1]364} // implicitly join with thread / wait for completion
[f28fdee]365\end{cfa}
[6c79bef]366The thread main is where a new thread starts execution after a fork operation
367and then the thread continues executing until it is finished. If another thread
368joins with an executing thread, it waits until the executing main completes
369execution. In other words, everything a thread does is between a fork and join.
370
371From the outside, this behaviour is accomplished through creation and
372destruction of a thread object.  Implicitly, fork happens after a thread
373object's constructor is run and join happens before the destructor runs. Join
374can also be specified explicitly using the @join@ function to wait for a
375thread's completion independently from its deallocation (\ie destructor
376call). If @join@ is called explicitly, the destructor does not implicitly join.
Note: See TracBrowser for help on using the repository browser.