\chapter{\CFA{} Existing Features}
\label{c:existing}

\CFA is an open-source project extending ISO C with
modern safety and productivity features, while still ensuring backwards
compatibility with C and its programmers. \CFA is designed to have an
orthogonal feature-set based closely on the C programming paradigm
(non-object-oriented), and these features can be added incrementally to an
existing C code-base, allowing programmers to learn \CFA on an as-needed basis.

Only those \CFA features pertaining to this thesis are discussed.
Also, only new features of \CFA are discussed; familiarity with
C or C-like languages is assumed.

\section{Overloading and \lstinline{extern}}
\CFA has extensive overloading, allowing multiple definitions of the same name
to coexist~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;
int f(); double f();
void g( int ); void g( double );
\end{cfa}
This feature requires name mangling so the assembly symbols are unique for
different overloads. For compatibility with names in C, there is also a syntax
to disable name mangling. These unmangled names cannot be overloaded but act as
the interface between C and \CFA code. The syntax for disabling/enabling
mangling is:
\begin{cfa}
// name mangling on by default
int i; // _X1ii_1
extern "C" { // disables name mangling
	int j; // j
	extern "Cforall" { // enables name mangling
		int k; // _X1ki_1
	}
	// revert to no name mangling
}
// revert to name mangling
\end{cfa}
Both forms of @extern@ affect all the declarations within their nested lexical
scope and transition back to the previous mangling state when the lexical scope
ends.
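For example, a C library routine can be declared inside an @extern "C"@ block
so the unmangled symbol links against the existing C object code (a minimal
sketch, using the standard C @sqrt@ routine):
\begin{cfa}
extern "C" { // unmangled, matches the symbol in the C math library
	double sqrt( double );
}
double root2 = sqrt( 2.0 ); // calls the C routine from CFA code
\end{cfa}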

\section{Reference Type}
\CFA adds a reference type to C as an auto-dereferencing pointer.
References work very similarly to pointers.
Reference types are written the same way as pointer types, but each
asterisk (@*@) is replaced with an ampersand (@&@);
this includes cv-qualifiers and multiple levels of reference.

Generally, references act like pointers with an implicit dereferencing
operation added to each use of the variable.
These automatic dereferences may be disabled with the address-of operator
(@&@).

% Check to see if these are generating errors.
\begin{minipage}{0.5\textwidth}
With references:
\begin{cfa}
int i, j;
int & ri = i;
int && rri = ri;
rri = 3;
&ri = &j;
ri = 5;
\end{cfa}
\end{minipage}
\begin{minipage}{0.5\textwidth}
With pointers:
\begin{cfa}
int i, j;
int * pi = &i;
int ** ppi = &pi;
**ppi = 3;
pi = &j;
*pi = 5;
\end{cfa}
\end{minipage}

References are intended for cases where pointers would otherwise be
dereferenced on (almost) every use.
Mutable references may be assigned a new target by converting them to a
pointer with @&@ and then assigning a pointer to them, as in @&ri = &j;@
above.
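As an illustration, a small hypothetical @swap@ routine written with reference
parameters reads like ordinary value code while still updating the caller's
variables:
\begin{cfa}
void swap( int & a, int & b ) {
	int tmp = a;	// each use of a reference is implicitly dereferenced
	a = b;
	b = tmp;
}
int x = 1, y = 2;
swap( x, y );	// arguments are implicitly passed by reference
\end{cfa}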

\section{Operators}

\CFA implements operator overloading by providing special names.
Operator uses are translated into function calls using these names.
These names are created by taking the operator symbols and joining them with
@?@s to show where the arguments go.
For example,
infix multiplication is @?*?@ while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations
(such as @++?@) and postfix operations (@?++@).

\begin{cfa}
struct point { int x, y; };
point ?+?(point a, point b) { return (point){a.x + b.x, a.y + b.y}; }
bool ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
{
	assert((point){1, 2} + (point){3, 4} == (point){4, 6});
}
\end{cfa}
Note that these special names are not limited to just being used for these
operator functions, and may be used to name other declarations.
Some ``near misses'', which do not match an operator form but look like
they may have been intended to, generate warnings but are otherwise
left alone.
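As a sketch of the prefix/postfix naming, both increment forms can be
overloaded separately for a hypothetical @counter@ type:
\begin{cfa}
struct counter { int count; };
counter & ++?( counter & c ) { c.count += 1; return c; }	// prefix increment
counter ?++( counter & c ) {	// postfix increment returns the old value
	counter old = c;
	c.count += 1;
	return old;
}
\end{cfa}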

%\subsection{Constructors and Destructors}

Both constructors and destructors are operators, which means they are
functions with special operator names rather than functions named after the
type, as in \Cpp. The special operator names may be used to call the
functions explicitly.
% Placement new means that this is actually equivalent to C++.

The special name for a constructor is @?{}@, which comes from the
initialization syntax in C, \eg @Example e = { ... }@.
\CFA will generate a constructor call each time a variable is declared,
passing the initialization arguments to the constructor.
\begin{cfa}
struct Example { ... };
void ?{}(Example & this) { ... }
{
	Example a;
	Example b = {};
}
void ?{}(Example & this, char first, int num) { ... }
{
	Example c = {'a', 2};
}
\end{cfa}
Both @a@ and @b@ will be initialized with the first constructor,
while @c@ will be initialized with the second.
Currently, there is no general way to skip initialization.

% I don't like the \^{} symbol but $^\wedge$ isn't better.
Similarly, destructors use the special name @^?{}@ (the @^@ has no special
meaning).
They are normally called implicitly on a variable when it goes out of scope,
but they can also be called explicitly.
\begin{cfa}
void ^?{}(Example & this) { ... }
{
	Example d;
} // <- implicit destructor call
\end{cfa}
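As a sketch (reusing the @Example@ type above), the constructor and destructor
can also be invoked directly through their operator names:
\begin{cfa}
{
	Example e = {'a', 2};	// implicit call: ?{}(e, 'a', 2)
	^?{}(e);	// explicit destructor call
	?{}(e);	// explicit re-construction with the zero-argument constructor
}	// implicit destructor call at the end of the scope
\end{cfa}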

Whenever a type is defined, \CFA will create a default zero-argument
constructor, a copy constructor, a series of argument-per-field constructors
and a destructor. All user-defined constructors are defined after these.
Because operators are never part of the type definition, they may be added
at any time, including on built-in types.
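For instance, for a hypothetical @pair@ type, the generated operations alone
allow the following declarations without any user-written constructors:
\begin{cfa}
struct pair { char first; int second; };
{
	pair a;	// generated zero-argument constructor
	pair b = {'x', 7};	// generated argument-per-field constructor
	pair c = b;	// generated copy constructor
}	// generated destructor runs for a, b and c
\end{cfa}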

\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are
defined over multiple types. \CFA polymorphic declarations serve the same role
as \Cpp templates or Java generics. The term ``parametric'' means the
polymorphism is accomplished by passing argument operations to associated
\emph{parameters} at the call site, and these parameters are used in the
function to differentiate among the types the function operates on.

Polymorphic declarations start with a universal @forall@ clause that goes
before the standard (monomorphic) declaration. These declarations have the same
syntax except they may use the universal type names introduced by the @forall@
clause. For example, the following is a polymorphic identity function that
works on any type @T@:
\begin{cfa}
forall( T ) T identity( T val ) { return val; }
int forty_two = identity( 42 );
char capital_a = identity( 'A' );
\end{cfa}
Each use of a polymorphic declaration resolves its polymorphic parameters
(in this case, just @T@) to concrete types (@int@ in the first use and @char@
in the second).

To allow a polymorphic function to be separately compiled, the type @T@ must be
constrained by the operations used on @T@ in the function body. The @forall@
clause is augmented with a list of polymorphic variables (local type names)
and assertions (constraints), which represent the required operations on those
types used in a function, \eg:
\begin{cfa}
forall( T | { void do_once(T &); } )
void do_twice(T & value) {
	do_once(value);
	do_once(value);
}
\end{cfa}

A polymorphic function can be used in the same way as a normal function. The
polymorphic variables are filled in with concrete types and the assertions are
checked. An assertion is checked by verifying that each assertion operation
(with all the variables replaced with the concrete types from the arguments)
is defined at a call site.
\begin{cfa}
void do_once(int & i) { ... }
int i;
do_twice(i);
\end{cfa}
Any object with a type fulfilling the assertion may be passed as an argument to
a @do_twice@ call.

Note, a function named @do_once@ is not required in the scope of @do_twice@ to
compile it, unlike \Cpp template expansion. Furthermore, call-site inference
selects the most specific functions that satisfy a call's assertions.
\begin{cfa}
void do_once(double & y) { ... }
int quadruple(int x) {
	void do_once(int & y) { y = y * 2; }
	do_twice(x);
	return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the
argument @x@. It then looks for the most specific definition matching the
assertion, which is the nested integral @do_once@ defined within the
function. The matched assertion function is then passed as a function pointer
to @do_twice@ and called within it.
The global definition of @do_once@ is ignored; however, if @quadruple@ took a
@double@ argument, then the global definition would be used instead, as it
would be a better match.
% Aaron's thesis might be a good reference here.

To avoid typing long lists of assertions, constraints can be collected into
a convenient package called a @trait@, which can then be used in an assertion
instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
	void do_once(T &);
}
\end{cfa}
The @forall@ clause in the previous example can then be replaced with the
trait:
\begin{cfa}
forall(T | done_once(T))
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both
functions and variables, and is usually used to create a shorthand for, and
give descriptive names to, common groupings of assertions describing a certain
functionality, like @sumable@, @listable@, \etc.
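For example, a @sumable@-like trait might be sketched as follows (an
illustration only, not the library's definition):
\begin{cfa}
trait sumable(T) {
	T zero;	// variable assertion: a zero value must be in scope
	T ?+?(T, T);	// function assertion: addition must be defined
}
forall(T | sumable(T))
T sum(unsigned int count, T * values) {
	T total = zero;
	for (unsigned int i = 0; i < count; ++i)
		total = total + values[i];
	return total;
}
\end{cfa}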

Polymorphic structures and unions are defined by qualifying the aggregate type
with @forall@. The type variables work the same way, except they are used in
field declarations instead of parameters, returns, and local variable
declarations.
\begin{cfa}
forall(dtype T)
struct node {
	node(T) * next;
	T * data;
};
node(int) inode;
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic type usage. Like \Cpp
template usage, a polymorphic type usage must specify a type parameter.
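As a sketch (reusing @node@ from above, with a hypothetical @push_front@
helper), a polymorphic function can manipulate any instantiation of the
generic type:
\begin{cfa}
forall(dtype T)
node(T) * push_front(node(T) * head, node(T) & new_node) {
	new_node.next = head;	// link the new node in front of the old head
	return &new_node;	// the new node becomes the head of the list
}
\end{cfa}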

There are many other polymorphism features in \CFA but these are the ones used
by the exception system.

\section{Control Flow}
\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
The two features that interact with
the exception system are @coroutine@ and @thread@; they and their supporting
constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not
required to finish execution when control is handed back to the caller. Instead
they may suspend execution at any time and be resumed later at the point of
last suspension. (Generators are stackless and coroutines are stackful.) These
types are not concurrent but share some similarities along with common
underpinnings, so they are bundled with the \CFA threading library. Further
discussion in this section only refers to the coroutine because generators are
similar.

In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
aggregate type like @struct@, except the structure is implicitly modified by
the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
restricted by the type system to types that provide this special trait. The
coroutine structure acts as the interface between callers and the coroutine,
and its fields are used to pass information in and out of coroutine interface
functions.

Here is a simple example where a single field is used to pass (communicate) the
next number in a sequence.
\begin{cfa}
coroutine CountUp {
	unsigned int next;
};
CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine
object and returns @void@.
%[numbers=left] Why numbers on this one?
\begin{cfa}
void main(CountUp & this) {
	for (unsigned int next = 0 ; true ; ++next) {
		this.next = next;
		suspend;$\label{suspend}$
	}
}
\end{cfa}
In this function, or functions called by this function (helper functions), the
@suspend@ statement is used to return execution to the coroutine's caller
without terminating the coroutine's function.

A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
The first resume calls the @main@ function at the top. Thereafter, resume calls
continue a coroutine in the last suspended function after the @suspend@
statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes
a reference to the coroutine structure and returns the same reference. The
return value allows easy access to communication variables defined in the
coroutine object. For example, the @next@ value for coroutine object @countup@
is both generated and collected in the single expression:
@resume(countup).next@.
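For instance, a sketch of a caller collecting the first few values from
@countup@:
\begin{cfa}
unsigned int values[5];
for (unsigned int i = 0; i < 5; ++i) {
	values[i] = resume(countup).next;	// resumes main, reads the communication field
}
\end{cfa}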

\subsection{Monitor and Mutex Parameter}
Concurrency does not guarantee ordering; without ordering, results are
non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
(mutual exclusion) parameters. A monitor is another kind of aggregate, where
the compiler implicitly inserts a lock and instances are compatible with
@mutex@ parameters.

A function that requires deterministic (ordered) execution acquires mutual
exclusion on a monitor object by qualifying an object reference parameter with
@mutex@.
\begin{cfa}
void example(MonitorA & mutex argA, MonitorB & mutex argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of
the mutex parameters without deadlock. This semantics means all functions with
the same mutex type(s) are part of a critical section for objects of that type
and only one runs at a time.
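A minimal sketch of a monitor, using a hypothetical @Counter@ type, with two
functions forming its critical section:
\begin{cfa}
monitor Counter {
	int count;
};
void increment(Counter & mutex this) {
	this.count += 1;	// runs with the monitor lock held
}
int get(Counter & mutex this) {
	return this.count;	// also mutually exclusive with increment
}
\end{cfa}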

\subsection{Thread}
Functions, generators, and coroutines are sequential so there is only a single
(but potentially sophisticated) execution path in a program. Threads introduce
multiple execution paths that continue independently.

For threads to work safely with objects, mutual exclusion is required, using
monitors and mutex parameters. For threads to work safely with other threads,
mutual exclusion is also required, in the form of a communication rendezvous,
which also supports internal synchronization as for mutex objects. For
exceptions, only two basic thread operations are important: fork and join.

Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
	const char * input;
	int result;
};
void main(StringWorker & this) {
	const char * localCopy = this.input;
	// ... do some work, perhaps hashing the string ...
	this.result = result;
}
{
	StringWorker stringworker; // fork thread running in "main"
} // <- implicitly join with thread / wait for completion
\end{cfa}
The thread main is where a new thread starts execution after a fork operation
and then the thread continues executing until it is finished. If another thread
joins with an executing thread, it waits until the executing main completes
execution. In other words, everything a thread does is between a fork and join.

From the outside, this behaviour is accomplished through creation and
destruction of a thread object. Implicitly, fork happens after a thread
object's constructor is run and join happens before the destructor runs. Join
can also be specified explicitly using the @join@ function to wait for a
thread's completion independently from its deallocation (\ie destructor
call). If @join@ is called explicitly, the destructor does not implicitly join.
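A sketch of an explicit join, decoupling waiting for the thread from
deallocating it:
\begin{cfa}
{
	StringWorker worker;	// fork: the thread starts running its main
	// ... concurrent work in the current thread ...
	join(worker);	// wait here for the worker's main to finish
	int r = worker.result;	// safe to read after the join
}	// the destructor deallocates the thread without joining again
\end{cfa}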