source: doc/theses/andrew_beach_MMath/existing.tex @ 6071efc

jacob/cs343-translationnew-ast-unique-expr
Last change on this file since 6071efc was 6071efc, checked in by Andrew Beach <ajbeach@…>, 6 months ago

Andrew MMath: Update the first three chapters using Colby's comments.

  • Property mode set to 100644
File size: 14.6 KB
Line 
1\chapter{\CFA{} Existing Features}
2\label{c:existing}
3
4\CFA is an open-source project extending ISO C with
5modern safety and productivity features, while still ensuring backwards
6compatibility with C and its programmers.  \CFA is designed to have an
7orthogonal feature-set based closely on the C programming paradigm
8(non-object-oriented) and these features can be added incrementally to an
9existing C code-base allowing programmers to learn \CFA on an as-needed basis.
10
11Only those \CFA features pertaining to this thesis are discussed.
12Also, only new features of \CFA will be discussed, a familiarity with
13C or C-like languages is assumed.
14
15\section{Overloading and \lstinline{extern}}
16\CFA has extensive overloading, allowing multiple definitions of the same name
17to be defined~\cite{Moss18}.
18\begin{cfa}
19char i; int i; double i;
20int f(); double f();
21void g( int ); void g( double );
22\end{cfa}
23This feature requires name mangling so the assembly symbols are unique for
24different overloads. For compatibility with names in C, there is also a syntax
25to disable name mangling. These unmangled names cannot be overloaded but act as
26the interface between C and \CFA code.  The syntax for disabling/enabling
27mangling is:
28\begin{cfa}
29// name mangling on by default
30int i; // _X1ii_1
31extern "C" {  // disables name mangling
32        int j; // j
33        extern "Cforall" {  // enables name mangling
34                int k; // _X1ki_1
35        }
36        // revert to no name mangling
37}
38// revert to name mangling
39\end{cfa}
40Both forms of @extern@ affect all the declarations within their nested lexical
41scope and transition back to the previous mangling state when the lexical scope
42ends.
43
44\section{Reference Type}
45\CFA adds a reference type to C as an auto-dereferencing pointer.
46They work very similarly to pointers.
47Reference-types are written the same way as a pointer-type but each
48asterisk (@*@) is replaced with a ampersand (@&@);
49this includes cv-qualifiers and multiple levels of reference.
50
51Generally, references act like pointers with an implicate dereferencing
52operation added to each use of the variable.
53These automatic dereferences may be disabled with the address-of operator
54(@&@).
55
56% Check to see if these are generating errors.
57\begin{minipage}{0,5\textwidth}
58With references:
59\begin{cfa}
60int i, j;
61int & ri = i;
62int && rri = ri;
63rri = 3;
64&ri = &j;
65ri = 5;
66\end{cfa}
67\end{minipage}
68\begin{minipage}{0,5\textwidth}
69With pointers:
70\begin{cfa}
71int i, j;
72int * pi = &i
73int ** ppi = &pi;
74**ppi = 3;
75pi = &j;
76*pi = 5;
77\end{cfa}
78\end{minipage}
79
80References are intended to be used when you would use pointers but would
81be dereferencing them (almost) every usage.
82Mutable references may be assigned to by converting them to a pointer
83with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above
84
85\section{Operators}
86
87\CFA implements operator overloading by providing special names.
88Operator uses are translated into function calls using these names.
89These names are created by taking the operator symbols and joining them with
90@?@s to show where the arguments go.
91For example,
92infixed multiplication is @?*?@ while prefix dereference is @*?@.
93This syntax make it easy to tell the difference between prefix operations
94(such as @++?@) and post-fix operations (@?++@).
95
96\begin{cfa}
97point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
98bool ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
99{
100        assert(point{1, 2} + point{3, 4} == point{4, 6});
101}
102\end{cfa}
103Note that these special names are not limited to just being used for these
104operator functions, and may be used name other declarations.
105Some ``near misses", that will not match an operator form but looks like
106it may have been supposed to, will generate wantings but otherwise they are
107left alone.
108
109%\subsection{Constructors and Destructors}
110
111Both constructors and destructors are operators, which means they are
112functions with special operator names rather than type names in \Cpp. The
113special operator names may be used to call the functions explicitly.
114% Placement new means that this is actually equivant to C++.
115
116The special name for a constructor is @?{}@, which comes from the
117initialization syntax in C, \eg @Example e = { ... }@.
118\CFA will generate a constructor call each time a variable is declared,
119passing the initialization arguments to the constructort.
120\begin{cfa}
121struct Example { ... };
122void ?{}(Example & this) { ... }
123{
124        Example a;
125        Example b = {};
126}
127void ?{}(Example & this, char first, int num) { ... }
128{
129        Example c = {'a', 2};
130}
131\end{cfa}
132Both @a@ and @b@ will be initalized with the first constructor,
133while @c@ will be initalized with the second.
134Currently, there is no general way to skip initialation.
135
136% I don't like the \^{} symbol but $^\wedge$ isn't better.
137Similarly destructors use the special name @^?{}@ (the @^@ has no special
138meaning).
139These are a normally called implicitly called on a variable when it goes out
140of scope. They can be called explicitly as well.
141\begin{cfa}
142void ^?{}(Example & this) { ... }
143{
144        Example d;
145} // <- implicit destructor call
146\end{cfa}
147
148Whenever a type is defined, \CFA will create a default zero-argument
149constructor, a copy constructor, a series of argument-per-field constructors
150and a destructor. All user constructors are defined after this.
151Because operators are never part of the type definition they may be added
152at any time, including on built-in types.
153
154\section{Polymorphism}
155\CFA uses parametric polymorphism to create functions and types that are
156defined over multiple types. \CFA polymorphic declarations serve the same role
157as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
158accomplished by passing argument operations to associate \emph{parameters} at
159the call site, and these parameters are used in the function to differentiate
160among the types the function operates on.
161
162Polymorphic declarations start with a universal @forall@ clause that goes
163before the standard (monomorphic) declaration. These declarations have the same
164syntax except they may use the universal type names introduced by the @forall@
165clause.  For example, the following is a polymorphic identity function that
166works on any type @T@:
167\begin{cfa}
168forall( T ) T identity( T val ) { return val; }
169int forty_two = identity( 42 );
170char capital_a = identity( 'A' );
171\end{cfa}
172Each use of a polymorphic declaration resolves its polymorphic parameters
173(in this case, just @T@) to concrete types (@int@ in the first use and @char@
174in the second).
175
176To allow a polymorphic function to be separately compiled, the type @T@ must be
177constrained by the operations used on @T@ in the function body. The @forall@
178clause is augmented with a list of polymorphic variables (local type names)
179and assertions (constraints), which represent the required operations on those
180types used in a function, \eg:
181\begin{cfa}
182forall( T | { void do_once(T); } )
183void do_twice(T value) {
184        do_once(value);
185        do_once(value);
186}
187\end{cfa}
188
189A polymorphic function can be used in the same way as a normal function.  The
190polymorphic variables are filled in with concrete types and the assertions are
191checked. An assertion is checked by verifying each assertion operation (with
192all the variables replaced with the concrete types from the arguments) is
193defined at a call site.
194\begin{cfa}
195void do_once(int i) { ... }
196int i;
197do_twice(i);
198\end{cfa}
199Any object with a type fulfilling the assertion may be passed as an argument to
200a @do_twice@ call.
201
202Note, a function named @do_once@ is not required in the scope of @do_twice@ to
203compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
204allows local replacement of the most specific parametric functions needs for a
205call.
206\begin{cfa}
207void do_once(double y) { ... }
208int quadruple(int x) {
209        void do_once(int & y) { y = y * 2; }
210        do_twice(x);
211        return x;
212}
213\end{cfa}
214Specifically, the complier deduces that @do_twice@'s T is an integer from the
215argument @x@. It then looks for the most specific definition matching the
216assertion, which is the nested integral @do_once@ defined within the
217function. The matched assertion function is then passed as a function pointer
218to @do_twice@ and called within it.
219The global definition of @do_once@ is ignored, however if quadruple took a
220@double@ argument then the global definition would be used instead as it
221would be a better match.
222% Aaron's thesis might be a good reference here.
223
224To avoid typing long lists of assertions, constraints can be collect into
225convenient packages called a @trait@, which can then be used in an assertion
226instead of the individual constraints.
227\begin{cfa}
228trait done_once(T) {
229        void do_once(T);
230}
231\end{cfa}
232and the @forall@ list in the previous example is replaced with the trait.
233\begin{cfa}
234forall(dtype T | done_once(T))
235\end{cfa}
236In general, a trait can contain an arbitrary number of assertions, both
237functions and variables, and are usually used to create a shorthand for, and
238give descriptive names to, common groupings of assertions describing a certain
239functionality, like @sumable@, @listable@, \etc.
240
241Polymorphic structures and unions are defined by qualifying the aggregate type
242with @forall@. The type variables work the same except they are used in field
243declarations instead of parameters, returns, and local variable declarations.
244\begin{cfa}
245forall(dtype T)
246struct node {
247        node(T) * next;
248        T * data;
249}
250node(int) inode;
251\end{cfa}
252The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
253template usage, a polymorphic type usage must specify a type parameter.
254
255There are many other polymorphism features in \CFA but these are the ones used
256by the exception system.
257
258\section{Control Flow}
259\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
260The two features that interact with
261the exception system are @coroutine@ and @thread@; they and their supporting
262constructs are described here.
263
264\subsection{Coroutine}
265A coroutine is a type with associated functions, where the functions are not
266required to finish execution when control is handed back to the caller. Instead
267they may suspend execution at any time and be resumed later at the point of
268last suspension. (Generators are stackless and coroutines are stackful.) These
269types are not concurrent but share some similarities along with common
270underpinnings, so they are combined with the \CFA threading library. Further
271discussion in this section only refers to the coroutine because generators are
272similar.
273
274In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
275aggregate type like @struct,@ except the structure is implicitly modified by
276the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
277restricted by the type system to types that provide this special trait.  The
278coroutine structure acts as the interface between callers and the coroutine,
279and its fields are used to pass information in and out of coroutine interface
280functions.
281
282Here is a simple example where a single field is used to pass (communicate) the
283next number in a sequence.
284\begin{cfa}
285coroutine CountUp {
286        unsigned int next;
287}
288CountUp countup;
289\end{cfa}
290Each coroutine has a @main@ function, which takes a reference to a coroutine
291object and returns @void@.
292%[numbers=left] Why numbers on this one?
293\begin{cfa}
294void main(CountUp & this) {
295        for (unsigned int next = 0 ; true ; ++next) {
296                next = up;
297                suspend;$\label{suspend}$
298        }
299}
300\end{cfa}
301In this function, or functions called by this function (helper functions), the
302@suspend@ statement is used to return execution to the coroutine's caller
303without terminating the coroutine's function.
304
305A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
306The first resume calls the @main@ function at the top. Thereafter, resume calls
307continue a coroutine in the last suspended function after the @suspend@
308statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
309a reference to the coroutine structure and returns the same reference. The
310return value allows easy access to communication variables defined in the
311coroutine object. For example, the @next@ value for coroutine object @countup@
312is both generated and collected in the single expression:
313@resume(countup).next@.
314
315\subsection{Monitor and Mutex Parameter}
316Concurrency does not guarantee ordering; without ordering results are
317non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
318(mutual exclusion) parameters. A monitor is another kind of aggregate, where
319the compiler implicitly inserts a lock and instances are compatible with
320@mutex@ parameters.
321
322A function that requires deterministic (ordered) execution, acquires mutual
323exclusion on a monitor object by qualifying an object reference parameter with
324@mutex@.
325\begin{cfa}
326void example(MonitorA & mutex argA, MonitorB & mutex argB);
327\end{cfa}
328When the function is called, it implicitly acquires the monitor lock for all of
329the mutex parameters without deadlock.  This semantics means all functions with
330the same mutex type(s) are part of a critical section for objects of that type
331and only one runs at a time.
332
333\subsection{Thread}
334Functions, generators, and coroutines are sequential so there is only a single
335(but potentially sophisticated) execution path in a program. Threads introduce
336multiple execution paths that continue independently.
337
338For threads to work safely with objects requires mutual exclusion using
339monitors and mutex parameters. For threads to work safely with other threads,
340also requires mutual exclusion in the form of a communication rendezvous, which
341also supports internal synchronization as for mutex objects. For exceptions,
342only two basic thread operations are important: fork and join.
343
344Threads are created like coroutines with an associated @main@ function:
345\begin{cfa}
346thread StringWorker {
347        const char * input;
348        int result;
349};
350void main(StringWorker & this) {
351        const char * localCopy = this.input;
352        // ... do some work, perhaps hashing the string ...
353        this.result = result;
354}
355{
356        StringWorker stringworker; // fork thread running in "main"
357} // <- implicitly join with thread / wait for completion
358\end{cfa}
359The thread main is where a new thread starts execution after a fork operation
360and then the thread continues executing until it is finished. If another thread
361joins with an executing thread, it waits until the executing main completes
362execution. In other words, everything a thread does is between a fork and join.
363
364From the outside, this behaviour is accomplished through creation and
365destruction of a thread object.  Implicitly, fork happens after a thread
366object's constructor is run and join happens before the destructor runs. Join
367can also be specified explicitly using the @join@ function to wait for a
368thread's completion independently from its deallocation (\ie destructor
369call). If @join@ is called explicitly, the destructor does not implicitly join.
Note: See TracBrowser for help on using the repository browser.