source: doc/theses/andrew_beach_MMath/existing.tex @ b405039

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since b405039 was 03c0e44, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: Rewrote the existing features/references piece.

  • Property mode set to 100644
File size: 13.9 KB
Line 
1\chapter{\CFA Existing Features}
2
3\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with
4modern safety and productivity features, while still ensuring backwards
5compatibility with C and its programmers.  \CFA is designed to have an
6orthogonal feature-set based closely on the C programming paradigm
7(non-object-oriented) and these features can be added incrementally to an
8existing C code-base allowing programmers to learn \CFA on an as-needed basis.
9
10Only those \CFA features pertinent to this thesis are discussed.  Many of the
11\CFA syntactic and semantic features used in the thesis should be fairly
12obvious to the reader.
13
14\section{Overloading and \lstinline{extern}}
15\CFA has extensive overloading, allowing multiple definitions of the same name
16to be defined~\cite{Moss18}.
17\begin{cfa}
18char i; int i; double i;
19int f(); double f();
20void g( int ); void g( double );
21\end{cfa}
22This feature requires name mangling so the assembly symbols are unique for
23different overloads. For compatibility with names in C, there is also a syntax
24to disable name mangling. These unmangled names cannot be overloaded but act as
25the interface between C and \CFA code.  The syntax for disabling/enabling
26mangling is:
27\begin{cfa}
28// name mangling on by default
29int i; // _X1ii_1
30extern "C" {  // disables name mangling
31        int j; // j
32        extern "Cforall" {  // enables name mangling
33                int k; // _X1ki_1
34        }
35        // revert to no name mangling
36}
37// revert to name mangling
38\end{cfa}
39Both forms of @extern@ affect all the declarations within their nested lexical
40scope and transition back to the previous mangling state when the lexical scope
41ends.
42
43\section{Reference Type}
44\CFA adds a reference type to C as an auto-dereferencing pointer.
45They work very similarly to pointers.
46Reference-types are written the same way as a pointer-type is but each
47asterisk (@*@) is replaced with a ampersand (@&@);
48this includes cv-qualifiers and multiple levels of reference.
49
50They are intended for cases where you would want to use pointers but would
51be dereferencing them (almost) every usage.
52In most cases a reference can just be thought of as a pointer that
53automatically puts a dereference infront of each of its uses (per-level of
54reference).
55The address-of operator (@&@) acts as an escape and removes one of the
56automatic dereference operations.
57Mutable references may be assigned to by converting them to a pointer
58with a @&@ and then assigning a pointer too them.
59
60\begin{minipage}{0,5\textwidth}
61With references:
62\begin{cfa}
63int i, j;
64int & ri = i;
65int && rri = ri;
66rri = 3;
67&ri = &j;
68ri = 5;
69\end{cfa}
70\end{minipage}
71\begin{minipage}{0,5\textwidth}
72With pointers:
73\begin{cfa}
74int i, j;
75int * pi = &i
76int ** ppi = &pi;
77**ppi = 3;
78pi = &j;
79*pi = 5;
80\end{cfa}
81\end{minipage}
82
83\section{Constructors and Destructors}
84
85Both constructors and destructors are operators, which means they are
86functions with special operator names rather than type names in \Cpp. The
87special operator names may be used to call the functions explicitly (not
88allowed in \Cpp for constructors).
89
90In general, operator names in \CFA are constructed by bracketing an operator
91token with @?@, which indicates the position of the arguments. For example,
92infixed multiplication is @?*?@ while prefix dereference is @*?@.
93This syntax make it easy to tell the difference between prefix operations
94(such as @++?@) and post-fix operations (@?++@).
95
96The special name for a constructor is @?{}@, which comes from the
97initialization syntax in C. That initialation syntax is also the operator
98form. \CFA will generate a constructor call each time a variable is declared,
99passing the initialization arguments to the constructort.
100\begin{cfa}
101struct Example { ... };
102void ?{}(Example & this) { ... }
103{
104        Example a;
105        Example b = {};
106}
107void ?{}(Example & this, char first, int num) { ... }
108{
109        Example c = {'a', 2};
110}
111\end{cfa}
112Both @a@ and @b@ will be initalized with the first constructor (there is no
113general way to skip initialation) while @c@ will be initalized with the
114second.
115
116% I don't like the \^{} symbol but $^\wedge$ isn't better.
117Similarly destructors use the special name @^?{}@ (the @^@ has no special
118meaning). They can be called explicatly as well but normally they are
119implicitly called on a variable when it goes out of scope.
120\begin{cfa}
121void ^?{}(Example & this) { ... }
122{
123    Example d;
124} // <- implicit destructor call
125\end{cfa}
126No operator name is restricted in what function signatures they may be bound
127to although most of the forms cannot be called in operator form. Some
128``near-misses" will generate warnings.
129
130Whenever a type is defined, \CFA will create a default zero-argument
131constructor, a copy constructor, a series of argument-per-field constructors
132and a destructor. All user constructors are defined after this.
133Because operators are never part of the type definition they may be added
134at any time, including on built-in types.
135
136\section{Polymorphism}
137\CFA uses parametric polymorphism to create functions and types that are
138defined over multiple types. \CFA polymorphic declarations serve the same role
139as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
140accomplished by passing argument operations to associate \emph{parameters} at
141the call site, and these parameters are used in the function to differentiate
142among the types the function operates on.
143
144Polymorphic declarations start with a universal @forall@ clause that goes
145before the standard (monomorphic) declaration. These declarations have the same
146syntax except they may use the universal type names introduced by the @forall@
147clause.  For example, the following is a polymorphic identity function that
148works on any type @T@:
149\begin{cfa}
150forall( T ) T identity( T val ) { return val; }
151int forty_two = identity( 42 );
152char capital_a = identity( 'A' );
153\end{cfa}
154Each use of a polymorphic declaration will resolve its polymorphic parameters
155(in this case, just @T@) to concrete types (@int@ in the first use and @char@
156in the second).
157
158To allow a polymorphic function to be separately compiled, the type @T@ must be
159constrained by the operations used on @T@ in the function body. The @forall@
160clauses is augmented with a list of polymorphic variables (local type names)
161and assertions (constraints), which represent the required operations on those
162types used in a function, \eg:
163\begin{cfa}
164forall( T | { void do_once(T); })
165void do_twice(T value) {
166        do_once(value);
167        do_once(value);
168}
169\end{cfa}
170
171A polymorphic function can be used in the same way as a normal function.  The
172polymorphic variables are filled in with concrete types and the assertions are
173checked. An assertion is checked by verifying each assertion operation (with
174all the variables replaced with the concrete types from the arguments) is
175defined at a call site.
176\begin{cfa}
177void do_once(int i) { ... }
178int i;
179do_twice(i);
180\end{cfa}
181Any object with a type fulfilling the assertion may be passed as an argument to
182a @do_twice@ call.
183
184Note, a function named @do_once@ is not required in the scope of @do_twice@ to
185compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
186allows local replacement of the most specific parametric functions needs for a
187call.
188\begin{cfa}
189void do_once(double y) { ... }
190int quadruple(int x) {
191        void do_once(int y) { y = y * 2; }
192        do_twice(x);
193        return x;
194}
195\end{cfa}
196Specifically, the complier deduces that @do_twice@'s T is an integer from the
197argument @x@. It then looks for the most specific definition matching the
198assertion, which is the nested integral @do_once@ defined within the
199function. The matched assertion function is then passed as a function pointer
200to @do_twice@ and called within it.
201The global definition of @do_once@ is ignored.
202
203To avoid typing long lists of assertions, constraints can be collect into
204convenient packages called a @trait@, which can then be used in an assertion
205instead of the individual constraints.
206\begin{cfa}
207trait done_once(T) {
208        void do_once(T);
209}
210\end{cfa}
211and the @forall@ list in the previous example is replaced with the trait.
212\begin{cfa}
213forall(dtype T | done_once(T))
214\end{cfa}
215In general, a trait can contain an arbitrary number of assertions, both
216functions and variables, and are usually used to create a shorthand for, and
217give descriptive names to, common groupings of assertions describing a certain
218functionality, like @sumable@, @listable@, \etc.
219
220Polymorphic structures and unions are defined by qualifying the aggregate type
221with @forall@. The type variables work the same except they are used in field
222declarations instead of parameters, returns, and local variable declarations.
223\begin{cfa}
224forall(dtype T)
225struct node {
226        node(T) * next;
227        T * data;
228}
229node(int) inode;
230\end{cfa}
231The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
232template usage, a polymorphic type usage must specify a type parameter.
233
234There are many other polymorphism features in \CFA but these are the ones used
235by the exception system.
236
237\section{Control Flow}
238\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
239The two features that interact with
240the exception system are @coroutine@ and @thread@; they and their supporting
241constructs are described here.
242
243\subsection{Coroutine}
244A coroutine is a type with associated functions, where the functions are not
245required to finish execution when control is handed back to the caller. Instead
246they may suspend execution at any time and be resumed later at the point of
247last suspension. (Generators are stackless and coroutines are stackful.) These
248types are not concurrent but share some similarities along with common
249underpinnings, so they are combined with the \CFA threading library. Further
250discussion in this section only refers to the coroutine because generators are
251similar.
252
253In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
254aggregate type like @struct,@ except the structure is implicitly modified by
255the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
256restricted by the type system to types that provide this special trait.  The
257coroutine structure acts as the interface between callers and the coroutine,
258and its fields are used to pass information in and out of coroutine interface
259functions.
260
261Here is a simple example where a single field is used to pass (communicate) the
262next number in a sequence.
263\begin{cfa}
264coroutine CountUp {
265        unsigned int next;
266}
267CountUp countup;
268\end{cfa}
269Each coroutine has a @main@ function, which takes a reference to a coroutine
270object and returns @void@.
271\begin{cfa}
272void main(CountUp & this) {
273    for (unsigned int next = 0 ; true ; ++next) {
274                next = up;
275                suspend;$\label{suspend}$
276        }
277}
278\end{cfa}
279In this function, or functions called by this function (helper functions), the
280@suspend@ statement is used to return execution to the coroutine's caller
281without terminating the coroutine's function.
282
283A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
284The first resume calls the @main@ function at the top. Thereafter, resume calls
285continue a coroutine in the last suspended function after the @suspend@
286statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
287a reference to the coroutine structure and returns the same reference. The
288return value allows easy access to communication variables defined in the
289coroutine object. For example, the @next@ value for coroutine object @countup@
290is both generated and collected in the single expression:
291@resume(countup).next@.
292
293\subsection{Monitor and Mutex Parameter}
294Concurrency does not guarantee ordering; without ordering results are
295non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
296(mutual exclusion) parameters. A monitor is another kind of aggregate, where
297the compiler implicitly inserts a lock and instances are compatible with
298@mutex@ parameters.
299
300A function that requires deterministic (ordered) execution, acquires mutual
301exclusion on a monitor object by qualifying an object reference parameter with
302@mutex@.
303\begin{cfa}
304void example(MonitorA & mutex argA, MonitorB & mutex argB);
305\end{cfa}
306When the function is called, it implicitly acquires the monitor lock for all of
307the mutex parameters without deadlock.  This semantics means all functions with
308the same mutex type(s) are part of a critical section for objects of that type
309and only one runs at a time.
310
311\subsection{Thread}
312Functions, generators, and coroutines are sequential so there is only a single
313(but potentially sophisticated) execution path in a program. Threads introduce
314multiple execution paths that continue independently.
315
316For threads to work safely with objects requires mutual exclusion using
317monitors and mutex parameters. For threads to work safely with other threads,
318also requires mutual exclusion in the form of a communication rendezvous, which
319also supports internal synchronization as for mutex objects. For exceptions,
320only two basic thread operations are important: fork and join.
321
322Threads are created like coroutines with an associated @main@ function:
323\begin{cfa}
324thread StringWorker {
325        const char * input;
326        int result;
327};
328void main(StringWorker & this) {
329        const char * localCopy = this.input;
330        // ... do some work, perhaps hashing the string ...
331        this.result = result;
332}
333{
334        StringWorker stringworker; // fork thread running in "main"
335} // <- implicitly join with thread / wait for completion
336\end{cfa}
337The thread main is where a new thread starts execution after a fork operation
338and then the thread continues executing until it is finished. If another thread
339joins with an executing thread, it waits until the executing main completes
340execution. In other words, everything a thread does is between a fork and join.
341
342From the outside, this behaviour is accomplished through creation and
343destruction of a thread object.  Implicitly, fork happens after a thread
344object's constructor is run and join happens before the destructor runs. Join
345can also be specified explicitly using the @join@ function to wait for a
346thread's completion independently from its deallocation (\ie destructor
347call). If @join@ is called explicitly, the destructor does not implicitly join.
Note: See TracBrowser for help on using the repository browser.