source: doc/theses/andrew_beach_MMath/existing.tex @ 887fc79

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since 887fc79 was 9b0bb79, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: Converted all the ASCII diagrams to xfig diagrams.

  • Property mode set to 100644
File size: 13.4 KB
Line 
1\chapter{\CFA Existing Features}
2
3\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with
4modern safety and productivity features, while still ensuring backwards
5compatibility with C and its programmers.  \CFA is designed to have an
6orthogonal feature-set based closely on the C programming paradigm
7(non-object-oriented) and these features can be added incrementally to an
8existing C code-base allowing programmers to learn \CFA on an as-needed basis.
9
10Only those \CFA features pertinent to this thesis are discussed.  Many of the
11\CFA syntactic and semantic features used in the thesis should be fairly
12obvious to the reader.
13
14\section{Overloading and \lstinline{extern}}
15\CFA has extensive overloading, allowing multiple definitions of the same name
16to be defined~\cite{Moss18}.
17\begin{cfa}
18char i; int i; double i;
19int f(); double f();
20void g( int ); void g( double );
21\end{cfa}
22This feature requires name mangling so the assembly symbols are unique for
23different overloads. For compatibility with names in C, there is also a syntax
24to disable name mangling. These unmangled names cannot be overloaded but act as
25the interface between C and \CFA code.  The syntax for disabling/enabling
26mangling is:
27\begin{cfa}
28// name mangling on by default
29int i; // _X1ii_1
30extern "C" {  // disables name mangling
31        int j; // j
32        extern "Cforall" {  // enables name mangling
33                int k; // _X1ki_1
34        }
35        // revert to no name mangling
36}
37// revert to name mangling
38\end{cfa}
39Both forms of @extern@ affect all the declarations within their nested lexical
40scope and transition back to the previous mangling state when the lexical scope
41ends.
42
43\section{Reference Type}
44\CFA adds a rebindable reference type to C, but more expressive than the \Cpp
45reference.  Multi-level references are allowed and act like auto-dereferenced
46pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA
47references may also be mutable or non-mutable. If mutable, a reference variable
48may be assigned using the address-of operator (@&@), which converts the
49reference to a pointer.
50\begin{cfa}
51int i, j;
52int & ri = i, && rri = ri;
53rri = 3;  // auto-dereference assign to i
54&ri = &j; // rebindable
55ri = 5;   // assign to j
56\end{cfa}
57
58\section{Constructors and Destructors}
59
60Both constructors and destructors are operators, which means they are
61functions with special operator names rather than type names in \Cpp. The
62special operator names may be used to call the functions explicitly (not
63allowed in \Cpp for constructors).
64
65In general, operator names in \CFA are constructed by bracketing an operator
66token with @?@, which indicates the position of the arguments. For example,
67infixed multiplication is @?*?@ while prefix dereference is @*?@.
68This syntax make it easy to tell the difference between prefix operations
69(such as @++?@) and post-fix operations (@?++@).
70
71The special name for a constructor is @?{}@, which comes from the
72initialization syntax in C. That initialation syntax is also the operator
73form. \CFA will generate a constructor call each time a variable is declared,
74passing the initialization arguments to the constructort.
75\begin{cfa}
76struct Example { ... };
77void ?{}(Example & this) { ... }
78{
79        Example a;
80        Example b = {};
81}
82void ?{}(Example & this, char first, int num) { ... }
83{
84        Example c = {'a', 2};
85}
86\end{cfa}
87Both @a@ and @b@ will be initalized with the first constructor (there is no
88general way to skip initialation) while @c@ will be initalized with the
89second.
90
91% I don't like the \^{} symbol but $^\wedge$ isn't better.
92Similarly destructors use the special name @^?{}@ (the @^@ has no special
93meaning). They can be called explicatly as well but normally they are
94implicitly called on a variable when it goes out of scope.
95\begin{cfa}
96void ^?{}(Example & this) { ... }
97{
98    Example d;
99} // <- implicit destructor call
100\end{cfa}
101No operator name is restricted in what function signatures they may be bound
102to although most of the forms cannot be called in operator form. Some
103``near-misses" will generate warnings.
104
105Whenever a type is defined, \CFA will create a default zero-argument
106constructor, a copy constructor, a series of argument-per-field constructors
107and a destructor. All user constructors are defined after this.
108Because operators are never part of the type definition they may be added
109at any time, including on built-in types.
110
111\section{Polymorphism}
112\CFA uses parametric polymorphism to create functions and types that are
113defined over multiple types. \CFA polymorphic declarations serve the same role
114as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
115accomplished by passing argument operations to associate \emph{parameters} at
116the call site, and these parameters are used in the function to differentiate
117among the types the function operates on.
118
119Polymorphic declarations start with a universal @forall@ clause that goes
120before the standard (monomorphic) declaration. These declarations have the same
121syntax except they may use the universal type names introduced by the @forall@
122clause.  For example, the following is a polymorphic identity function that
123works on any type @T@:
124\begin{cfa}
125forall( T ) T identity( T val ) { return val; }
126int forty_two = identity( 42 );
127char capital_a = identity( 'A' );
128\end{cfa}
129Each use of a polymorphic declaration will resolve its polymorphic parameters
130(in this case, just @T@) to concrete types (@int@ in the first use and @char@
131in the second).
132
133To allow a polymorphic function to be separately compiled, the type @T@ must be
134constrained by the operations used on @T@ in the function body. The @forall@
135clauses is augmented with a list of polymorphic variables (local type names)
136and assertions (constraints), which represent the required operations on those
137types used in a function, \eg:
138\begin{cfa}
139forall( T | { void do_once(T); })
140void do_twice(T value) {
141        do_once(value);
142        do_once(value);
143}
144\end{cfa}
145
146A polymorphic function can be used in the same way as a normal function.  The
147polymorphic variables are filled in with concrete types and the assertions are
148checked. An assertion is checked by verifying each assertion operation (with
149all the variables replaced with the concrete types from the arguments) is
150defined at a call site.
151\begin{cfa}
152void do_once(int i) { ... }
153int i;
154do_twice(i);
155\end{cfa}
156Any object with a type fulfilling the assertion may be passed as an argument to
157a @do_twice@ call.
158
159Note, a function named @do_once@ is not required in the scope of @do_twice@ to
160compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
161allows local replacement of the most specific parametric functions needs for a
162call.
163\begin{cfa}
164void do_once(double y) { ... }
165int quadruple(int x) {
166        void do_once(int y) { y = y * 2; }
167        do_twice(x);
168        return x;
169}
170\end{cfa}
171Specifically, the complier deduces that @do_twice@'s T is an integer from the
172argument @x@. It then looks for the most specific definition matching the
173assertion, which is the nested integral @do_once@ defined within the
174function. The matched assertion function is then passed as a function pointer
175to @do_twice@ and called within it.
176The global definition of @do_once@ is ignored.
177
178To avoid typing long lists of assertions, constraints can be collect into
179convenient packages called a @trait@, which can then be used in an assertion
180instead of the individual constraints.
181\begin{cfa}
182trait done_once(T) {
183        void do_once(T);
184}
185\end{cfa}
186and the @forall@ list in the previous example is replaced with the trait.
187\begin{cfa}
188forall(dtype T | done_once(T))
189\end{cfa}
190In general, a trait can contain an arbitrary number of assertions, both
191functions and variables, and are usually used to create a shorthand for, and
192give descriptive names to, common groupings of assertions describing a certain
193functionality, like @sumable@, @listable@, \etc.
194
195Polymorphic structures and unions are defined by qualifying the aggregate type
196with @forall@. The type variables work the same except they are used in field
197declarations instead of parameters, returns, and local variable declarations.
198\begin{cfa}
199forall(dtype T)
200struct node {
201        node(T) * next;
202        T * data;
203}
204node(int) inode;
205\end{cfa}
206The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp
207template usage, a polymorphic type usage must specify a type parameter.
208
209There are many other polymorphism features in \CFA but these are the ones used
210by the exception system.
211
212\section{Control Flow}
213\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
214The two features that interact with
215the exception system are @coroutine@ and @thread@; they and their supporting
216constructs are described here.
217
218\subsection{Coroutine}
219A coroutine is a type with associated functions, where the functions are not
220required to finish execution when control is handed back to the caller. Instead
221they may suspend execution at any time and be resumed later at the point of
222last suspension. (Generators are stackless and coroutines are stackful.) These
223types are not concurrent but share some similarities along with common
224underpinnings, so they are combined with the \CFA threading library. Further
225discussion in this section only refers to the coroutine because generators are
226similar.
227
228In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
229aggregate type like @struct,@ except the structure is implicitly modified by
230the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
231restricted by the type system to types that provide this special trait.  The
232coroutine structure acts as the interface between callers and the coroutine,
233and its fields are used to pass information in and out of coroutine interface
234functions.
235
236Here is a simple example where a single field is used to pass (communicate) the
237next number in a sequence.
238\begin{cfa}
239coroutine CountUp {
240        unsigned int next;
241}
242CountUp countup;
243\end{cfa}
244Each coroutine has a @main@ function, which takes a reference to a coroutine
245object and returns @void@.
246\begin{cfa}
247void main(CountUp & this) {
248    for (unsigned int next = 0 ; true ; ++next) {
249                next = up;
250                suspend;$\label{suspend}$
251        }
252}
253\end{cfa}
254In this function, or functions called by this function (helper functions), the
255@suspend@ statement is used to return execution to the coroutine's caller
256without terminating the coroutine's function.
257
258A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
259The first resume calls the @main@ function at the top. Thereafter, resume calls
260continue a coroutine in the last suspended function after the @suspend@
261statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
262a reference to the coroutine structure and returns the same reference. The
263return value allows easy access to communication variables defined in the
264coroutine object. For example, the @next@ value for coroutine object @countup@
265is both generated and collected in the single expression:
266@resume(countup).next@.
267
268\subsection{Monitor and Mutex Parameter}
269Concurrency does not guarantee ordering; without ordering results are
270non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
271(mutual exclusion) parameters. A monitor is another kind of aggregate, where
272the compiler implicitly inserts a lock and instances are compatible with
273@mutex@ parameters.
274
275A function that requires deterministic (ordered) execution, acquires mutual
276exclusion on a monitor object by qualifying an object reference parameter with
277@mutex@.
278\begin{cfa}
279void example(MonitorA & mutex argA, MonitorB & mutex argB);
280\end{cfa}
281When the function is called, it implicitly acquires the monitor lock for all of
282the mutex parameters without deadlock.  This semantics means all functions with
283the same mutex type(s) are part of a critical section for objects of that type
284and only one runs at a time.
285
286\subsection{Thread}
287Functions, generators, and coroutines are sequential so there is only a single
288(but potentially sophisticated) execution path in a program. Threads introduce
289multiple execution paths that continue independently.
290
291For threads to work safely with objects requires mutual exclusion using
292monitors and mutex parameters. For threads to work safely with other threads,
293also requires mutual exclusion in the form of a communication rendezvous, which
294also supports internal synchronization as for mutex objects. For exceptions,
295only two basic thread operations are important: fork and join.
296
297Threads are created like coroutines with an associated @main@ function:
298\begin{cfa}
299thread StringWorker {
300        const char * input;
301        int result;
302};
303void main(StringWorker & this) {
304        const char * localCopy = this.input;
305        // ... do some work, perhaps hashing the string ...
306        this.result = result;
307}
308{
309        StringWorker stringworker; // fork thread running in "main"
310} // <- implicitly join with thread / wait for completion
311\end{cfa}
312The thread main is where a new thread starts execution after a fork operation
313and then the thread continues executing until it is finished. If another thread
314joins with an executing thread, it waits until the executing main completes
315execution. In other words, everything a thread does is between a fork and join.
316
317From the outside, this behaviour is accomplished through creation and
318destruction of a thread object.  Implicitly, fork happens after a thread
319object's constructor is run and join happens before the destructor runs. Join
320can also be specified explicitly using the @join@ function to wait for a
321thread's completion independently from its deallocation (\ie destructor
322call). If @join@ is called explicitly, the destructor does not implicitly join.
Note: See TracBrowser for help on using the repository browser.