\chapter{\CFA Existing Features}
\label{c:existing}

\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with
modern safety and productivity features, while still ensuring backwards
compatibility with C and its programmers.  \CFA is designed to have an
orthogonal feature-set based closely on the C programming paradigm
(non-object-oriented), and these features can be added incrementally to an
existing C code-base, allowing programmers to learn \CFA on an as-needed basis.

Only those \CFA features pertinent to this thesis are discussed.  Many of the
\CFA syntactic and semantic features used in the thesis should be fairly
obvious to the reader.

\section{Overloading and \lstinline{extern}}
\CFA has extensive overloading, allowing multiple definitions of the same name,
differentiated by type~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;
int f(); double f();
void g( int ); void g( double );
\end{cfa}
This feature requires name mangling so the assembly symbols are unique for
different overloads. For compatibility with names in C, there is also a syntax
to disable name mangling. These unmangled names cannot be overloaded but act as
the interface between C and \CFA code.  The syntax for disabling/enabling
mangling is:
\begin{cfa}
// name mangling on by default
int i; // _X1ii_1
extern "C" {  // disables name mangling
        int j; // j
        extern "Cforall" {  // enables name mangling
                int k; // _X1ki_1
        }
        // revert to no name mangling
}
// revert to name mangling
\end{cfa}
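For example, the usual way to access an existing C library routine from \CFA
code is to declare it inside an @extern "C"@ block, so the call binds to the
unmangled C symbol; the following is a minimal sketch using the standard C
routine @perror@:
\begin{cfa}
extern "C" {  // unmangled: the symbol is "perror", as the C library expects
        void perror( const char * );
}
perror( "sketch" );  // calls the C library routine
\end{cfa}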
Both forms of @extern@ affect all the declarations within their nested lexical
scope and transition back to the previous mangling state when the lexical scope
ends.

\section{Reference Type}
\CFA adds a reference type to C as an auto-dereferencing pointer.
References work very similarly to pointers.
Reference types are written the same way as pointer types, but each
asterisk (@*@) is replaced with an ampersand (@&@);
this includes cv-qualifiers and multiple levels of reference.

References are intended for cases where a pointer would be used but would
be dereferenced on (almost) every use.
In most cases, a reference can be thought of as a pointer that
automatically inserts a dereference in front of each of its uses (one per
level of reference).
The address-of operator (@&@) acts as an escape and removes one of the
automatic dereference operations.
A mutable reference may be reassigned (made to refer to another object) by
converting it to a pointer with @&@ and then assigning a pointer to it.

\begin{minipage}{0.45\textwidth}
With references:
\begin{cfa}
int i, j;
int & ri = i;
int && rri = ri;
rri = 3;
&ri = &j;
ri = 5;
\end{cfa}
\end{minipage}
\begin{minipage}{0.45\textwidth}
With pointers:
\begin{cfa}
int i, j;
int * pi = &i;
int ** ppi = &pi;
**ppi = 3;
pi = &j;
*pi = 5;
\end{cfa}
\end{minipage}
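
References are particularly convenient for parameters, where the automatic
dereferencing removes the explicit @*@ from each use of the parameter; the
following sketch (the function @swap@ is only for illustration) updates its
arguments through references:
\begin{cfa}
void swap( int & x, int & y ) {
        int temp = x;  // implicit dereference on every use of x and y
        x = y;
        y = temp;
}
int a = 1, b = 2;
swap( a, b );  // a == 2, b == 1
\end{cfa}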

\section{Constructors and Destructors}

Both constructors and destructors are operators, which means they are
functions with special operator names rather than type names as in \Cpp. The
special operator names may be used to call the functions explicitly (not
allowed in \Cpp for constructors).

In general, operator names in \CFA are constructed by bracketing an operator
token with @?@, which indicates the position of the arguments. For example,
infixed multiplication is @?*?@ while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations
(such as @++?@) and postfix operations (@?++@).

The special name for a constructor is @?{}@, which comes from the
initialization syntax in C. That initialization syntax is also the operator
form. \CFA will generate a constructor call each time a variable is declared,
passing the initialization arguments to the constructor.
\begin{cfa}
struct Example { ... };
void ?{}(Example & this) { ... }
{
        Example a;
        Example b = {};
}
void ?{}(Example & this, char first, int num) { ... }
{
        Example c = {'a', 2};
}
\end{cfa}
Both @a@ and @b@ will be initialized with the first constructor (there is no
general way to skip initialization), while @c@ will be initialized with the
second.

% I don't like the \^{} symbol but $^\wedge$ isn't better.
Similarly, destructors use the special name @^?{}@ (the @^@ has no special
meaning). They can be called explicitly as well, but normally they are
implicitly called on a variable when it goes out of scope.
\begin{cfa}
void ^?{}(Example & this) { ... }
{
        Example d;
} // <- implicit destructor call
\end{cfa}
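Because the special names are ordinary function names, construction and
destruction can also be requested explicitly; a brief sketch reusing the
@Example@ type and its constructors from above:
\begin{cfa}
{
        Example e = {'b', 3};  // implicit constructor call
        ^?{}(e);               // explicit destructor call
        ?{}(e, 'c', 7);        // explicit constructor call, reinitializing e
} // implicit destructor call
\end{cfa}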
No operator name is restricted in what function signatures it may be bound
to, although most of the forms cannot be called in operator form. Some
``near-misses'' will generate warnings.

Whenever a type is defined, \CFA will create a default zero-argument
constructor, a copy constructor, a series of argument-per-field constructors,
and a destructor. All user constructors are defined after this.
Because operators are never part of the type definition, they may be added
at any time, including on built-in types.
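For instance, given a simple aggregate, the generated constructors already
cover the common initialization patterns; a short sketch (the type @Pair@ is
purely illustrative):
\begin{cfa}
struct Pair { int x, y; };
Pair p;             // generated zero-argument constructor
Pair q = { 3, 4 };  // generated argument-per-field constructor
Pair r = q;         // generated copy constructor
\end{cfa}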

\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are
defined over multiple types. \CFA polymorphic declarations serve the same role
as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
accomplished by passing argument operations to associate \emph{parameters} at
the call site, and these parameters are used in the function to differentiate
among the types the function operates on.

Polymorphic declarations start with a universal @forall@ clause that goes
before the standard (monomorphic) declaration. These declarations have the same
syntax except they may use the universal type names introduced by the @forall@
clause.  For example, the following is a polymorphic identity function that
works on any type @T@:
\begin{cfa}
forall( T ) T identity( T val ) { return val; }
int forty_two = identity( 42 );
char capital_a = identity( 'A' );
\end{cfa}
Each use of a polymorphic declaration will resolve its polymorphic parameters
(in this case, just @T@) to concrete types (@int@ in the first use and @char@
in the second).

To allow a polymorphic function to be separately compiled, the type @T@ must be
constrained by the operations used on @T@ in the function body. The @forall@
clause is augmented with a list of polymorphic variables (local type names)
and assertions (constraints), which represent the required operations on those
types used in a function, \eg:
\begin{cfa}
forall( T | { void do_once(T &); })
void do_twice(T & value) {
        do_once(value);
        do_once(value);
}
\end{cfa}

A polymorphic function can be used in the same way as a normal function.  The
polymorphic variables are filled in with concrete types and the assertions are
checked. An assertion is checked by verifying that each assertion operation
(with all the variables replaced with the concrete types from the arguments) is
defined at a call site.
\begin{cfa}
void do_once(int & i) { ... }
int i;
do_twice(i);
\end{cfa}
Any object with a type fulfilling the assertion may be passed as an argument to
a @do_twice@ call.
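For example, a user-defined type satisfies the assertion as soon as a matching
@do_once@ overload is visible; a small sketch (the type @Counter@ is
hypothetical):
\begin{cfa}
struct Counter { int hits; };
void do_once(Counter & c) { ... }
Counter counter;
do_twice(counter);  // assertion satisfied by the Counter overload of do_once
\end{cfa}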

Note, a function named @do_once@ is not required in the scope of @do_twice@ to
compile it, unlike \Cpp template expansion. Furthermore, call-site inference
selects the most specific function satisfying an assertion for each call,
allowing a local definition to replace a global one.
\begin{cfa}
void do_once(double & y) { ... }
int quadruple(int x) {
        void do_once(int & y) { y = y * 2; }
        do_twice(x);
        return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the
argument @x@. It then looks for the most specific definition matching the
assertion, which is the nested integral @do_once@ defined within the
function. The matched assertion function is then passed as a function pointer
to @do_twice@ and called within it.
The global definition of @do_once@ is ignored.

To avoid typing long lists of assertions, constraints can be collected into
a convenient package called a @trait@, which can then be used in an assertion
instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
        void do_once(T &);
};
\end{cfa}
and the @forall@ list in the previous example is replaced with the trait.
\begin{cfa}
forall(dtype T | done_once(T))
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both
functions and variables, and is usually used to create a shorthand for, and
give a descriptive name to, a common grouping of assertions describing a
certain functionality, like @sumable@, @listable@, \etc.
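For example, a @sumable@-style trait groups the operations needed to sum values
of a type; the assertion list shown here is only a sketch:
\begin{cfa}
trait sumable( T ) {
        void ?{}( T &, zero_t );  // construct from the zero literal
        T ?+?( T, T );            // addition operator
        T ?+=?( T &, T );         // compound addition
};
\end{cfa}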

Polymorphic structures and unions are defined by qualifying the aggregate type
with @forall@. The type variables work the same way except they are used in
field declarations instead of parameters, returns, and local variable
declarations.
\begin{cfa}
forall(dtype T)
struct node {
        node(T) * next;
        T * data;
};
node(int) inode;
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
template usage, a polymorphic type usage must specify a type parameter.
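Functions can also be written over such generic types; a small sketch (the
helper @front@ is hypothetical):
\begin{cfa}
forall(dtype T)
T * front( node(T) & list ) {
        return list.data;  // data stored in the first node
}
\end{cfa}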

There are many other polymorphism features in \CFA but these are the ones used
by the exception system.

\section{Control Flow}
\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
The two features that interact with
the exception system are @coroutine@ and @thread@; they and their supporting
constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not
required to finish execution when control is handed back to the caller. Instead
they may suspend execution at any time and be resumed later at the point of
last suspension. (Generators are stackless and coroutines are stackful.) These
types are not concurrent but share some similarities along with common
underpinnings, so they are combined with the \CFA threading library. Further
discussion in this section only refers to the coroutine because generators are
similar.

In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
aggregate type like @struct@, except the structure is implicitly modified by
the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
restricted by the type system to types that provide this special trait.  The
coroutine structure acts as the interface between callers and the coroutine,
and its fields are used to pass information in and out of coroutine interface
functions.

Here is a simple example where a single field is used to pass (communicate) the
next number in a sequence.
\begin{cfa}
coroutine CountUp {
        unsigned int next;
};
CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine
object and returns @void@.
\begin{cfa}
void main(CountUp & this) {
        for (unsigned int up = 0 ; true ; ++up) {
                this.next = up;
                suspend;$\label{suspend}$
        }
}
\end{cfa}
In this function, or functions called by this function (helper functions), the
@suspend@ statement is used to return execution to the coroutine's caller
without terminating the coroutine's function.

A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
The first resume calls the @main@ function at the top. Thereafter, resume calls
continue a coroutine in the last suspended function after the @suspend@
statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
a reference to the coroutine structure and returns the same reference. The
return value allows easy access to communication variables defined in the
coroutine object. For example, the @next@ value for coroutine object @countup@
is both generated and collected in the single expression:
@resume(countup).next@.
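Putting the pieces together, the earlier @countup@ object can be driven in a
loop; the following is only a sketch of typical usage:
\begin{cfa}
for (unsigned int i = 0 ; i < 5 ; ++i) {
        printf("%u\n", resume(countup).next);  // prints 0 1 2 3 4
}
\end{cfa}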

\subsection{Monitor and Mutex Parameter}
Concurrency does not guarantee ordering; without ordering, results are
non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
(mutual exclusion) parameters. A monitor is another kind of aggregate, where
the compiler implicitly inserts a lock and instances are compatible with
@mutex@ parameters.

A function that requires deterministic (ordered) execution acquires mutual
exclusion on a monitor object by qualifying an object reference parameter with
@mutex@.
\begin{cfa}
void example(MonitorA & mutex argA, MonitorB & mutex argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of
the mutex parameters without deadlock.  This semantics means all functions with
the same mutex type(s) are part of a critical section for objects of that type
and only one runs at a time.
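As a brief sketch of the pattern (the monitor @Total@ and its function are
hypothetical), a monitor bundles its data with the implicit lock, and its
@mutex@ functions form the critical sections:
\begin{cfa}
monitor Total {
        int amount;
};
void add(Total & mutex this, int value) {
        this.amount += value;  // runs with mutual exclusion on this Total
}
\end{cfa}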

\subsection{Thread}
Functions, generators, and coroutines are sequential, so there is only a single
(but potentially sophisticated) execution path in a program. Threads introduce
multiple execution paths that continue independently.

For threads to work safely with objects requires mutual exclusion using
monitors and mutex parameters. For threads to work safely with other threads
also requires mutual exclusion in the form of a communication rendezvous, which
also supports internal synchronization as for mutex objects. For exceptions,
only two basic thread operations are important: fork and join.

Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
        const char * input;
        int result;
};
void main(StringWorker & this) {
        const char * localCopy = this.input;
        int result = 0;
        // ... do some work, perhaps hashing the string ...
        this.result = result;
}
{
        StringWorker stringworker; // fork thread running in "main"
} // <- implicitly join with thread / wait for completion
\end{cfa}
The thread main is where a new thread starts execution after a fork operation
and then the thread continues executing until it is finished. If another thread
joins with an executing thread, it waits until the executing main completes
execution. In other words, everything a thread does is between a fork and join.

From the outside, this behaviour is accomplished through creation and
destruction of a thread object.  Implicitly, fork happens after a thread
object's constructor is run and join happens before the destructor runs. Join
can also be specified explicitly using the @join@ function to wait for a
thread's completion independently from its deallocation (\ie destructor
call). If @join@ is called explicitly, the destructor does not implicitly join.
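For instance, an explicit @join@ allows results to be collected before the
worker objects are deallocated; a minimal sketch reusing @StringWorker@:
\begin{cfa}
{
        StringWorker w1, w2;    // fork: both threads start running
        join(w1);               // wait for w1 only; w2 may still be running
        int first = w1.result;  // safe to read after the join
        join(w2);
} // destructors run; no implicit joins because join was already called
\end{cfa}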