source: doc/theses/andrew_beach_MMath/existing.tex @ 432bffe

Last change on this file since 432bffe was 9cdfa5fb, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: Used (most of) Gregor's feedback to update the thesis. There are still a few \todo items as well as a general request for examples.

\chapter{\CFA{} Existing Features}
\label{c:existing}

\CFA is an open-source project extending ISO C with
modern safety and productivity features, while still ensuring backwards
compatibility with C and its programmers.  \CFA is designed to have an
orthogonal feature-set based closely on the C programming paradigm
(non-object-oriented), and these features can be added incrementally to an
existing C code-base,
allowing programmers to learn \CFA on an as-needed basis.

Only those \CFA features pertaining to this thesis are discussed.
A familiarity with
C or C-like languages is assumed.
\section{Overloading and \lstinline{extern}}
\CFA has extensive overloading, allowing multiple definitions of the same name
to coexist~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;
int f(); double f();
void g( int ); void g( double );
\end{cfa}
This feature requires name mangling so the assembly symbols are unique for
different overloads. For compatibility with names in C, there is also a syntax
to disable name mangling. These unmangled names cannot be overloaded but act as
the interface between C and \CFA code.  The syntax for disabling/enabling
mangling is:
\begin{cfa}
// name mangling on by default
int i; // _X1ii_1
extern "C" {  // disables name mangling
	int j; // j
	extern "Cforall" {  // enables name mangling
		int k; // _X1ki_1
	}
	// revert to no name mangling
}
// revert to name mangling
\end{cfa}
Both forms of @extern@ affect all the declarations within their nested lexical
scope and transition back to the previous mangling state when the lexical scope
ends.
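As a sketch of the common use case, a C library function can be declared inside an @extern "C"@ block so \CFA code can link against and call it by its unmangled name; the function names here are hypothetical, not from any particular library.
\begin{cfa}
extern "C" {
	// unmangled, links against the C implementation
	int legacy_hash(const char *);
}
// mangled and overloadable $\CFA$ wrapper around the C routine
int hash(const char * s) { return legacy_hash(s); }
\end{cfa}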

\section{Reference Type}
\CFA adds a reference type to C as an auto-dereferencing pointer.
References work very similarly to pointers.
Reference types are written the same way as pointer types, but each
asterisk (@*@) is replaced with an ampersand (@&@);
this includes cv-qualifiers (\snake{const} and \snake{volatile})
%\todo{Should I go into even more detail on cv-qualifiers.}
and multiple levels of reference.

Generally, references act like pointers with an implicit dereferencing
operation added to each use of the variable.
These automatic dereferences may be disabled with the address-of operator
(@&@).

% Check to see if these are generating errors.
\begin{minipage}{0.5\textwidth}
With references:
\begin{cfa}
int i, j;
int & ri = i;
int && rri = ri;
rri = 3;
&ri = &j;
ri = 5;
\end{cfa}
\end{minipage}
\begin{minipage}{0.5\textwidth}
With pointers:
\begin{cfa}
int i, j;
int * pi = &i;
int ** ppi = &pi;
**ppi = 3;
pi = &j;
*pi = 5;
\end{cfa}
\end{minipage}

References are intended to be used when the indirection of a pointer is
required, but the address is not as important as the value and dereferencing
is the common usage.
Mutable references may be assigned to by converting them to a pointer
with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above.
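The pointer-like behaviour shows up naturally in functions; this @swap@ sketch (illustrative, not the \CFA standard-library version) uses references where C would use pointers:
\begin{cfa}
// swap two integers through references
void swap(int & a, int & b) {
	int tmp = a;  // implicit dereference on each use
	a = b;
	b = tmp;
}
int x = 1, y = 2;
swap(x, y);  // no explicit & needed at the call site
\end{cfa}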

\section{Operators}

\CFA implements operator overloading by providing special names, where
operator expressions are translated into function calls using these names.
An operator name is created by taking the operator symbols and joining them with
@?@s to show where the arguments go.
For example,
infix multiplication is @?*?@, while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations
(such as @++?@) and postfix operations (@?++@).

As an example, here are the addition and equality operators for a point type.
\begin{cfa}
point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
{
	assert(point{1, 2} + point{3, 4} == point{4, 6});
}
\end{cfa}
Note that this syntax works effectively as a textual transformation;
the compiler converts all operators into functions and then resolves them
normally. This means any combination of types may be used,
although nonsensical ones (like @double ?==?(point, int);@) are discouraged.
This feature is used for all builtin operators as well,
although those are implicitly provided by the language.
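Following the same naming scheme, prefix and postfix increment are distinct operators and can be overloaded separately. This sketch for a hypothetical counter type assumes the usual convention that the operand is passed by reference so it can be updated; the exact return conventions may differ from the builtin operators.
\begin{cfa}
struct counter { int value; };
// prefix increment: ++c, returns the updated value
counter ++?(counter & c) { c.value += 1; return c; }
// postfix increment: c++, returns the old value
counter ?++(counter & c) { counter old = c; c.value += 1; return old; }
\end{cfa}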

%\subsection{Constructors and Destructors}
In \CFA, constructors and destructors are operators, which means they are
functions with special operator names, rather than type names as in \Cpp.
Both constructors and destructors can be implicitly called by the compiler;
however, the operator names allow explicit calls.
% Placement new means that this is actually equivalent to C++.

The special name for a constructor is @?{}@, which comes from the
initialization syntax in C, \eg @Example e = { ... }@.
\CFA generates a constructor call each time a variable is declared,
passing the initialization arguments to the constructor.
\begin{cfa}
struct Example { ... };
void ?{}(Example & this) { ... }
{
	Example a;
	Example b = {};
}
void ?{}(Example & this, char first, int num) { ... }
{
	Example c = {'a', 2};
}
\end{cfa}
Both @a@ and @b@ will be initialized with the first constructor,
@b@ because of the explicit call and @a@ implicitly.
@c@ will be initialized with the second constructor.
Currently, there is no general way to skip initialization.
% I don't use @= anywhere in the thesis.

% I don't like the \^{} symbol but $^\wedge$ isn't better.
Similarly, destructors use the special name @^?{}@ (the @^@ has no special
meaning).
\begin{cfa}
void ^?{}(Example & this) { ... }
{
	Example d;
	^?{}(d);

	Example e;
} // Implicit call of ^?{}(e);
\end{cfa}

Whenever a type is defined, \CFA creates a default zero-argument
constructor, a copy constructor, a series of argument-per-field constructors
and a destructor. All user constructors are defined after this.

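These implicitly created operators can be sketched as declarations. For a small structure, the compiler effectively provides the following (a sketch; the exact signatures generated may differ):
\begin{cfa}
struct pair { int x; int y; };
// implicitly generated (sketch):
void ?{}(pair & this);                // default constructor
void ?{}(pair & this, pair other);    // copy constructor
void ?{}(pair & this, int x);         // per-field constructor
void ?{}(pair & this, int x, int y);  // per-field constructor
void ^?{}(pair & this);               // destructor
\end{cfa}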
\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are
defined over multiple types. \CFA polymorphic declarations serve the same role
as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
accomplished by passing argument operations to associate \emph{parameters} at
the call site, and these parameters are used in the function to differentiate
among the types the function operates on.

Polymorphic declarations start with a universal @forall@ clause that goes
before the standard (monomorphic) declaration. These declarations have the same
syntax except they may use the universal type names introduced by the @forall@
clause.  For example, the following is a polymorphic identity function that
works on any type @T@:
\begin{cfa}
forall( T ) T identity( T val ) { return val; }
int forty_two = identity( 42 );
char capital_a = identity( 'A' );
\end{cfa}
Each use of a polymorphic declaration resolves its polymorphic parameters
(in this case, just @T@) to concrete types (@int@ in the first use and @char@
in the second).

To allow a polymorphic function to be separately compiled, the type @T@ must be
constrained by the operations used on @T@ in the function body. The @forall@
clause is augmented with a list of polymorphic variables (local type names)
and assertions (constraints), which represent the required operations on those
types used in a function, \eg:
\begin{cfa}
forall( T | { void do_once(T); } )
void do_twice(T value) {
	do_once(value);
	do_once(value);
}
\end{cfa}

A polymorphic function can be used in the same way as a normal function.  The
polymorphic variables are filled in with concrete types and the assertions are
checked. An assertion is checked by verifying that each assertion operation (with
all the variables replaced with the concrete types from the arguments) is
defined at a call site.
\begin{cfa}
void do_once(int i) { ... }
int i;
do_twice(i);
\end{cfa}
Any value with a type fulfilling the assertion may be passed as an argument to
a @do_twice@ call.

Note, a function named @do_once@ is not required in the scope of @do_twice@ to
compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
allows local replacement of the specific parametric functions needed for a
call.
\begin{cfa}
void do_once(double y) { ... }
int quadruple(int x) {
	void do_once(int & y) { y = y * 2; }
	do_twice(x);
	return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the
argument @x@. It then looks for the most specific definition matching the
assertion, which is the nested integral @do_once@ defined within the
function. The matched assertion function is then passed as a function pointer
to @do_twice@ and called within it.
The global definition of @do_once@ is ignored; however, if @quadruple@ took a
@double@ argument, then the global definition would be used instead as it
would then be a better match~\cite{Moss19}.

To avoid typing long lists of assertions, constraints can be collected into
a convenient package called a @trait@, which can then be used in an assertion
instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
	void do_once(T);
}
\end{cfa}
and the @forall@ list in the previous example is replaced with the trait.
\begin{cfa}
forall(dtype T | done_once(T))
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both
functions and variables, and is usually used to create a shorthand for, and
give descriptive names to, common groupings of assertions describing a certain
functionality, like @summable@, @listable@, \etc.
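For instance, a @summable@ trait might be sketched as follows; the assertion names and the use of @zero_t@ (the type of the @0@ literal) are illustrative, not the \CFA standard library's exact definitions.
\begin{cfa}
trait summable(T) {
	void ?{}(T &, zero_t);  // constructible from the zero literal
	T ?+?(T, T);            // addition
};
forall(T | summable(T))
T sum(int n, T * values) {
	T total = 0;            // uses the zero_t constructor
	for (int i = 0; i < n; i += 1)
		total = total + values[i];  // uses ?+?
	return total;
}
\end{cfa}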

Polymorphic structures and unions are defined by qualifying an aggregate type
with @forall@. The type variables work the same except they are used in field
declarations instead of parameters, returns and local variable declarations.
\begin{cfa}
forall(dtype T)
struct node {
	node(T) * next;
	T * data;
};
node(int) inode;
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
template usage, a polymorphic type usage must specify a type parameter.

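A polymorphic function can then traverse any instantiation of the generic type; this helper is a sketch building on the @node@ example above, not part of the original code.
\begin{cfa}
forall(dtype T)
T * get(node(T) & list, unsigned int index) {
	node(T) * current = &list;
	for (unsigned int i = 0; i < index; i += 1)
		current = current->next;  // walk the links
	return current->data;
}
\end{cfa}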
There are many other polymorphism features in \CFA but these are the ones used
by the exception system.

\section{Control Flow}
\CFA has a number of advanced control-flow features: @generator@, @coroutine@,
@monitor@, @mutex@ parameters, and @thread@.
The two features that interact with
the exception system are @coroutine@ and @thread@; they and their supporting
constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not
required to finish execution when control is handed back to the caller.
Instead,
they may suspend execution at any time and be resumed later at the point of
last suspension.
Coroutine
types are not concurrent but share some similarities along with common
underpinnings, so they are combined with the \CFA threading library.
% I had mention of generators, but they don't actually matter here.

In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
aggregate type like @struct@, except the structure is implicitly modified by
the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
restricted by the type system to types that provide this special trait.  The
coroutine structure acts as the interface between callers and the coroutine,
and its fields are used to pass information in and out of coroutine interface
functions.

Here is a simple example where a single field is used to pass (communicate) the
next number in a sequence.
\begin{cfa}
coroutine CountUp {
	unsigned int next;
};
CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine
object and returns @void@.
%[numbers=left] Why numbers on this one?
\begin{cfa}
void main(CountUp & this) {
	for (unsigned int next = 0 ; true ; ++next) {
		this.next = next;
		suspend;$\label{suspend}$
	}
}
\end{cfa}
In this function, or functions called by this function (helper functions), the
@suspend@ statement is used to return execution to the coroutine's caller
without terminating the coroutine's function.

A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
The first resume calls the @main@ function at the top. Thereafter, resume calls
continue a coroutine in the last suspended function after the @suspend@
statement. In this case there is only one and, hence, the difference between
subsequent calls is the state of variables inside the function and the
coroutine object.
The return value of @resume@ is a reference to the coroutine, to make it
convenient to access fields of the coroutine in the same expression.
Here is a simple example in a helper function:
\begin{cfa}
unsigned int get_next(CountUp & this) {
	return resume(this).next;
}
\end{cfa}

When the main function returns, the coroutine halts and can no longer be
resumed.
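Putting the pieces together, a caller can drive the coroutine through the helper; each call resumes @main@ at the @suspend@ and receives the next value. This usage sketch assumes C's @printf@ is available, as \CFA retains the C standard library.
\begin{cfa}
CountUp counter;
for (unsigned int i = 0; i < 3; i += 1) {
	// first resume starts main; later resumes continue after suspend
	printf("%u\n", get_next(counter));  // prints 0, 1 and 2
}
\end{cfa}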

\subsection{Monitor and Mutex Parameter}
Concurrency does not guarantee ordering; without ordering, results are
non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
(mutual exclusion) parameters. A monitor is another kind of aggregate, where
the compiler implicitly inserts a lock and instances are compatible with
@mutex@ parameters.

A function that requires deterministic (ordered) execution acquires mutual
exclusion on a monitor object by qualifying an object reference parameter with
the @mutex@ qualifier.
\begin{cfa}
void example(MonitorA & mutex argA, MonitorB & mutex argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of
the mutex parameters without deadlock.  This semantics means all functions with
the same mutex type(s) are part of a critical section for objects of that type
and only one runs at a time.
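As a concrete sketch (the type and function names here are illustrative), a monitor protecting a shared counter might look like:
\begin{cfa}
monitor Counter {
	int count;
};
// the implicit lock is acquired on entry and released on exit,
// so concurrent calls on the same Counter are serialized
void increment(Counter & mutex this) {
	this.count += 1;
}
int get_count(Counter & mutex this) {
	return this.count;
}
\end{cfa}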

\subsection{Thread}
Functions, generators and coroutines are sequential, so there is only a single
(but potentially sophisticated) execution path in a program. Threads introduce
multiple execution paths that continue independently.

For threads to work safely with objects requires mutual exclusion using
monitors and mutex parameters. For threads to work safely with other threads
also requires mutual exclusion in the form of a communication rendezvous, which
also supports internal synchronization as for mutex objects. For exceptions,
only two basic thread operations are important: fork and join.

Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
	const char * input;
	int result;
};
void main(StringWorker & this) {
	const char * localCopy = this.input;
	int result = 0;
	// ... do some work, perhaps hashing the string ...
	this.result = result;
}
{
	StringWorker stringworker; // fork thread running in "main"
} // Implicit call to join(stringworker), waits for completion.
\end{cfa}
The thread main is where a new thread starts execution after a fork operation
and then the thread continues executing until it is finished. If another thread
joins with an executing thread, it waits until the executing main completes
execution. In other words, everything a thread does is between a fork and join.

From the outside, this behaviour is accomplished through creation and
destruction of a thread object.  Implicitly, fork happens after a thread
object's constructor is run and join happens before the destructor runs. Join
can also be specified explicitly using the @join@ function to wait for a
thread's completion independently from its deallocation (\ie destructor
call). If @join@ is called explicitly, the destructor does not implicitly join.
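The explicit form can be sketched as follows, continuing the @StringWorker@ example; reading the field after @join@ is an assumption about the intended usage, and @printf@ is assumed available from the C standard library.
\begin{cfa}
{
	StringWorker worker;            // implicit fork after construction
	// ... caller does other work concurrently ...
	join(worker);                   // wait for the thread main to finish
	printf("%d\n", worker.result);  // safe to read after the join
}   // destructor runs; no second join occurs
\end{cfa}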