source: doc/theses/andrew_beach_MMath/existing.tex @ 34fcc13

ADTast-experimentalenumforall-pointer-decayjacob/cs343-translationpthread-emulationqualifiedEnum
Last change on this file since 34fcc13 was 6cf21ed8, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: First pass on adding missing citations.

  • Property mode set to 100644
File size: 14.8 KB
Line 
1\chapter{\CFA{} Existing Features}
2\label{c:existing}
3
4\CFA is an open-source project extending ISO C with
5modern safety and productivity features, while still ensuring backwards
6compatibility with C and its programmers.  \CFA is designed to have an
7orthogonal feature-set based closely on the C programming paradigm
8(non-object-oriented) and these features can be added incrementally to an
9existing C code-base allowing programmers to learn \CFA on an as-needed basis.
10
11Only those \CFA features pertaining to this thesis are discussed.
12A familiarity with
13C or C-like languages is assumed.
14
15\section{Overloading and \lstinline{extern}}
16\CFA has extensive overloading, allowing multiple definitions of the same name
17to be defined~\cite{Moss18}.
18\begin{cfa}
19char i; int i; double i;
20int f(); double f();
21void g( int ); void g( double );
22\end{cfa}
23This feature requires name mangling so the assembly symbols are unique for
24different overloads. For compatibility with names in C, there is also a syntax
25to disable name mangling. These unmangled names cannot be overloaded but act as
26the interface between C and \CFA code.  The syntax for disabling/enabling
27mangling is:
28\begin{cfa}
29// name mangling on by default
30int i; // _X1ii_1
31extern "C" {  // disables name mangling
32        int j; // j
33        extern "Cforall" {  // enables name mangling
34                int k; // _X1ki_1
35        }
36        // revert to no name mangling
37}
38// revert to name mangling
39\end{cfa}
40Both forms of @extern@ affect all the declarations within their nested lexical
41scope and transition back to the previous mangling state when the lexical scope
42ends.
43
44\section{Reference Type}
45\CFA adds a reference type to C as an auto-dereferencing pointer.
46They work very similarly to pointers.
47Reference-types are written the same way as a pointer-type but each
48asterisk (@*@) is replaced with a ampersand (@&@);
49this includes cv-qualifiers and multiple levels of reference.
50
51Generally, references act like pointers with an implicate dereferencing
52operation added to each use of the variable.
53These automatic dereferences may be disabled with the address-of operator
54(@&@).
55
56% Check to see if these are generating errors.
57\begin{minipage}{0,5\textwidth}
58With references:
59\begin{cfa}
60int i, j;
61int & ri = i;
62int && rri = ri;
63rri = 3;
64&ri = &j;
65ri = 5;
66\end{cfa}
67\end{minipage}
68\begin{minipage}{0,5\textwidth}
69With pointers:
70\begin{cfa}
71int i, j;
72int * pi = &i
73int ** ppi = &pi;
74**ppi = 3;
75pi = &j;
76*pi = 5;
77\end{cfa}
78\end{minipage}
79
80References are intended to be used when the indirection of a pointer is
81required, but the address is not as important as the value and dereferencing
82is the common usage.
83Mutable references may be assigned to by converting them to a pointer
84with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above.
85% ???
86
87\section{Operators}
88
89\CFA implements operator overloading by providing special names, where
90operator expressions are translated into function calls using these names.
91An operator name is created by taking the operator symbols and joining them with
92@?@s to show where the arguments go.
93For example,
94infixed multiplication is @?*?@, while prefix dereference is @*?@.
95This syntax make it easy to tell the difference between prefix operations
96(such as @++?@) and post-fix operations (@?++@).
97
98As an example, here are the addition and equality operators for a point type.
99\begin{cfa}
100point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
101int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
102{
103        assert(point{1, 2} + point{3, 4} == point{4, 6});
104}
105\end{cfa}
106Note that this syntax works effectively but a textual transformation,
107the compiler converts all operators into functions and then resolves them
108normally. This means any combination of types may be used,
109although nonsensical ones (like @double ?==?(point, int);@) are discouraged.
110This feature is also used for all builtin operators as well,
111although those are implicitly provided by the language.
112
113%\subsection{Constructors and Destructors}
114In \CFA, constructors and destructors are operators, which means they are
115functions with special operator names rather than type names in \Cpp.
116Both constructors and destructors can be implicity called by the compiler,
117however the operator names allow explicit calls.
118% Placement new means that this is actually equivant to C++.
119
120The special name for a constructor is @?{}@, which comes from the
121initialization syntax in C, \eg @Example e = { ... }@.
122\CFA generates a constructor call each time a variable is declared,
123passing the initialization arguments to the constructor.
124\begin{cfa}
125struct Example { ... };
126void ?{}(Example & this) { ... }
127{
128        Example a;
129        Example b = {};
130}
131void ?{}(Example & this, char first, int num) { ... }
132{
133        Example c = {'a', 2};
134}
135\end{cfa}
136Both @a@ and @b@ will be initalized with the first constructor,
137@b@ because of the explicit call and @a@ implicitly.
138@c@ will be initalized with the second constructor.
139Currently, there is no general way to skip initialation.
140% I don't use @= anywhere in the thesis.
141
142% I don't like the \^{} symbol but $^\wedge$ isn't better.
143Similarly, destructors use the special name @^?{}@ (the @^@ has no special
144meaning).
145\begin{cfa}
146void ^?{}(Example & this) { ... }
147{
148        Example d;
149        ^?{}(d);
150
151        Example e;
152} // Implicit call of ^?{}(e);
153\end{cfa}
154
155Whenever a type is defined, \CFA creates a default zero-argument
156constructor, a copy constructor, a series of argument-per-field constructors
157and a destructor. All user constructors are defined after this.
158
159\section{Polymorphism}
160\CFA uses parametric polymorphism to create functions and types that are
161defined over multiple types. \CFA polymorphic declarations serve the same role
162as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
163accomplished by passing argument operations to associate \emph{parameters} at
164the call site, and these parameters are used in the function to differentiate
165among the types the function operates on.
166
167Polymorphic declarations start with a universal @forall@ clause that goes
168before the standard (monomorphic) declaration. These declarations have the same
169syntax except they may use the universal type names introduced by the @forall@
170clause.  For example, the following is a polymorphic identity function that
171works on any type @T@:
172\begin{cfa}
173forall( T ) T identity( T val ) { return val; }
174int forty_two = identity( 42 );
175char capital_a = identity( 'A' );
176\end{cfa}
177Each use of a polymorphic declaration resolves its polymorphic parameters
178(in this case, just @T@) to concrete types (@int@ in the first use and @char@
179in the second).
180
181To allow a polymorphic function to be separately compiled, the type @T@ must be
182constrained by the operations used on @T@ in the function body. The @forall@
183clause is augmented with a list of polymorphic variables (local type names)
184and assertions (constraints), which represent the required operations on those
185types used in a function, \eg:
186\begin{cfa}
187forall( T | { void do_once(T); } )
188void do_twice(T value) {
189        do_once(value);
190        do_once(value);
191}
192\end{cfa}
193
194A polymorphic function can be used in the same way as a normal function.  The
195polymorphic variables are filled in with concrete types and the assertions are
196checked. An assertion is checked by verifying each assertion operation (with
197all the variables replaced with the concrete types from the arguments) is
198defined at a call site.
199\begin{cfa}
200void do_once(int i) { ... }
201int i;
202do_twice(i);
203\end{cfa}
204Any object with a type fulfilling the assertion may be passed as an argument to
205a @do_twice@ call.
206
207Note, a function named @do_once@ is not required in the scope of @do_twice@ to
208compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
209allows local replacement of the specific parametric functions needs for a
210call.
211\begin{cfa}
212void do_once(double y) { ... }
213int quadruple(int x) {
214        void do_once(int & y) { y = y * 2; }
215        do_twice(x);
216        return x;
217}
218\end{cfa}
219Specifically, the complier deduces that @do_twice@'s T is an integer from the
220argument @x@. It then looks for the most specific definition matching the
221assertion, which is the nested integral @do_once@ defined within the
222function. The matched assertion function is then passed as a function pointer
223to @do_twice@ and called within it.
224The global definition of @do_once@ is ignored, however if quadruple took a
225@double@ argument, then the global definition would be used instead as it
226would then be a better match.\cite{Moss19}
227
228To avoid typing long lists of assertions, constraints can be collected into
229convenient a package called a @trait@, which can then be used in an assertion
230instead of the individual constraints.
231\begin{cfa}
232trait done_once(T) {
233        void do_once(T);
234}
235\end{cfa}
236and the @forall@ list in the previous example is replaced with the trait.
237\begin{cfa}
238forall(dtype T | done_once(T))
239\end{cfa}
240In general, a trait can contain an arbitrary number of assertions, both
241functions and variables, and are usually used to create a shorthand for, and
242give descriptive names to, common groupings of assertions describing a certain
243functionality, like @sumable@, @listable@, \etc.
244
245Polymorphic structures and unions are defined by qualifying an aggregate type
246with @forall@. The type variables work the same except they are used in field
247declarations instead of parameters, returns, and local variable declarations.
248\begin{cfa}
249forall(dtype T)
250struct node {
251        node(T) * next;
252        T * data;
253};
254node(int) inode;
255\end{cfa}
256The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
257template usage, a polymorphic type usage must specify a type parameter.
258
259There are many other polymorphism features in \CFA but these are the ones used
260by the exception system.
261
262\section{Control Flow}
263\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
264The two features that interact with
265the exception system are @coroutine@ and @thread@; they and their supporting
266constructs are described here.
267
268\subsection{Coroutine}
269A coroutine is a type with associated functions, where the functions are not
270required to finish execution when control is handed back to the caller. Instead
271they may suspend execution at any time and be resumed later at the point of
272last suspension. (Generators are stackless and coroutines are stackful.) These
273types are not concurrent but share some similarities along with common
274underpinnings, so they are combined with the \CFA threading library. Further
275discussion in this section only refers to the coroutine because generators are
276similar.
277
278In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
279aggregate type like @struct,@ except the structure is implicitly modified by
280the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
281restricted by the type system to types that provide this special trait.  The
282coroutine structure acts as the interface between callers and the coroutine,
283and its fields are used to pass information in and out of coroutine interface
284functions.
285
286Here is a simple example where a single field is used to pass (communicate) the
287next number in a sequence.
288\begin{cfa}
289coroutine CountUp {
290        unsigned int next;
291};
292CountUp countup;
293\end{cfa}
294Each coroutine has a @main@ function, which takes a reference to a coroutine
295object and returns @void@.
296%[numbers=left] Why numbers on this one?
297\begin{cfa}
298void main(CountUp & this) {
299        for (unsigned int next = 0 ; true ; ++next) {
300                this.next = next;
301                suspend;$\label{suspend}$
302        }
303}
304\end{cfa}
305In this function, or functions called by this function (helper functions), the
306@suspend@ statement is used to return execution to the coroutine's caller
307without terminating the coroutine's function.
308
309A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
310The first resume calls the @main@ function at the top. Thereafter, resume calls
311continue a coroutine in the last suspended function after the @suspend@
312statement. In this case there is only one and, hence, the difference between
313subsequent calls is the state of variables inside the function and the
314coroutine object.
315The return value of @resume@ is a reference to the coroutine, to make it
316convent to access fields of the coroutine in the same expression.
317Here is a simple example in a helper function:
318\begin{cfa}
319unsigned int get_next(CountUp & this) {
320        return resume(this).next;
321}
322\end{cfa}
323
324When the main function returns the coroutine halts and can no longer be
325resumed.
326
327\subsection{Monitor and Mutex Parameter}
328Concurrency does not guarantee ordering; without ordering results are
329non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
330(mutual exclusion) parameters. A monitor is another kind of aggregate, where
331the compiler implicitly inserts a lock and instances are compatible with
332@mutex@ parameters.
333
334A function that requires deterministic (ordered) execution, acquires mutual
335exclusion on a monitor object by qualifying an object reference parameter with
336@mutex@.
337\begin{cfa}
338void example(MonitorA & mutex argA, MonitorB & mutex argB);
339\end{cfa}
340When the function is called, it implicitly acquires the monitor lock for all of
341the mutex parameters without deadlock.  This semantics means all functions with
342the same mutex type(s) are part of a critical section for objects of that type
343and only one runs at a time.
344
345\subsection{Thread}
346Functions, generators, and coroutines are sequential so there is only a single
347(but potentially sophisticated) execution path in a program. Threads introduce
348multiple execution paths that continue independently.
349
350For threads to work safely with objects requires mutual exclusion using
351monitors and mutex parameters. For threads to work safely with other threads,
352also requires mutual exclusion in the form of a communication rendezvous, which
353also supports internal synchronization as for mutex objects. For exceptions,
354only two basic thread operations are important: fork and join.
355
356Threads are created like coroutines with an associated @main@ function:
357\begin{cfa}
358thread StringWorker {
359        const char * input;
360        int result;
361};
362void main(StringWorker & this) {
363        const char * localCopy = this.input;
364        // ... do some work, perhaps hashing the string ...
365        this.result = result;
366}
367{
368        StringWorker stringworker; // fork thread running in "main"
369} // Implicit call to join(stringworker), waits for completion.
370\end{cfa}
371The thread main is where a new thread starts execution after a fork operation
372and then the thread continues executing until it is finished. If another thread
373joins with an executing thread, it waits until the executing main completes
374execution. In other words, everything a thread does is between a fork and join.
375
376From the outside, this behaviour is accomplished through creation and
377destruction of a thread object.  Implicitly, fork happens after a thread
378object's constructor is run and join happens before the destructor runs. Join
379can also be specified explicitly using the @join@ function to wait for a
380thread's completion independently from its deallocation (\ie destructor
381call). If @join@ is called explicitly, the destructor does not implicitly join.
Note: See TracBrowser for help on using the repository browser.