1 | \chapter{\CFA{} Existing Features} |
---|
2 | \label{c:existing} |
---|
3 | |
---|
4 | \CFA is an open-source project extending ISO C with |
---|
5 | modern safety and productivity features, while still ensuring backwards |
---|
6 | compatibility with C and its programmers. \CFA is designed to have an |
---|
7 | orthogonal feature-set based closely on the C programming paradigm |
---|
8 | (non-object-oriented) and these features can be added incrementally to an |
---|
9 | existing C code-base allowing programmers to learn \CFA on an as-needed basis. |
---|
10 | |
---|
11 | Only those \CFA features pertaining to this thesis are discussed. |
---|
12 | % Also, only new features of \CFA will be discussed, |
---|
13 | A familiarity with |
---|
14 | C or C-like languages is assumed. |
---|
15 | |
---|
16 | \section{Overloading and \lstinline{extern}} |
---|
17 | \CFA has extensive overloading, allowing multiple definitions of the same name |
---|
18 | to be defined~\cite{Moss18}. |
---|
19 | \begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}] |
---|
20 | char @i@; int @i@; double @i@; |
---|
21 | int @f@(); double @f@(); |
---|
22 | void @g@( int ); void @g@( double ); |
---|
23 | \end{lstlisting} |
---|
24 | This feature requires name mangling so the assembly symbols are unique for |
---|
25 | different overloads. For compatibility with names in C, there is also a syntax |
---|
26 | to disable name mangling. These unmangled names cannot be overloaded but act as |
---|
27 | the interface between C and \CFA code. The syntax for disabling/enabling |
---|
28 | mangling is: |
---|
29 | \begin{cfa} |
---|
30 | // name mangling on by default |
---|
31 | int i; // _X1ii_1 |
---|
32 | extern "C" { // disables name mangling |
---|
33 | int j; // j |
---|
34 | extern "Cforall" { // enables name mangling |
---|
35 | int k; // _X1ki_1 |
---|
36 | } |
---|
37 | // revert to no name mangling |
---|
38 | } |
---|
39 | // revert to name mangling |
---|
40 | \end{cfa} |
---|
41 | Both forms of @extern@ affect all the declarations within their nested lexical |
---|
42 | scope and transition back to the previous mangling state when the lexical scope |
---|
43 | ends. |
---|
44 | |
---|
45 | \section{Reference Type} |
---|
46 | \CFA adds a reference type to C as an auto-dereferencing pointer. |
---|
47 | They work very similarly to pointers. |
---|
48 | Reference-types are written the same way as a pointer-type but each |
---|
49 | asterisk (@*@) is replaced with a ampersand (@&@); |
---|
50 | this includes cv-qualifiers and multiple levels of reference. |
---|
51 | |
---|
52 | Generally, references act like pointers with an implicate dereferencing |
---|
53 | operation added to each use of the variable. |
---|
54 | These automatic dereferences may be disabled with the address-of operator |
---|
55 | (@&@). |
---|
56 | |
---|
57 | % Check to see if these are generating errors. |
---|
58 | \begin{minipage}{0,5\textwidth} |
---|
59 | With references: |
---|
60 | \begin{cfa} |
---|
61 | int i, j; |
---|
62 | int & ri = i; |
---|
63 | int && rri = ri; |
---|
64 | rri = 3; |
---|
65 | &ri = &j; // rebindable |
---|
66 | ri = 5; |
---|
67 | \end{cfa} |
---|
68 | \end{minipage} |
---|
69 | \begin{minipage}{0,5\textwidth} |
---|
70 | With pointers: |
---|
71 | \begin{cfa} |
---|
72 | int i, j; |
---|
73 | int * pi = &i |
---|
74 | int ** ppi = π |
---|
75 | **ppi = 3; |
---|
76 | pi = &j; |
---|
77 | *pi = 5; |
---|
78 | \end{cfa} |
---|
79 | \end{minipage} |
---|
80 | |
---|
81 | References are intended for pointer situations where dereferencing is the common usage, |
---|
82 | \ie the value is more important than the pointer. |
---|
83 | Mutable references may be assigned to by converting them to a pointer |
---|
84 | with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above |
---|
85 | |
---|
86 | \section{Operators} |
---|
87 | |
---|
88 | \CFA implements operator overloading by providing special names, where |
---|
89 | operator usages are translated into function calls using these names. |
---|
90 | An operator name is created by taking the operator symbols and joining them with |
---|
91 | @?@s to show where the parameters go. |
---|
92 | For example, |
---|
93 | infixed multiplication is @?*?@, while prefix dereference is @*?@. |
---|
94 | This syntax make it easy to tell the difference between prefix operations |
---|
95 | (such as @++?@) and post-fix operations (@?++@). |
---|
96 | For example, plus and equality operators are defined for a point type. |
---|
97 | \begin{cfa} |
---|
98 | point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; } |
---|
99 | int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; } |
---|
100 | { |
---|
101 | assert(point{1, 2} + point{3, 4} == point{4, 6}); |
---|
102 | } |
---|
103 | \end{cfa} |
---|
104 | Note these special names are not limited to builtin |
---|
105 | operators, and hence, may be used with arbitrary types. |
---|
106 | \begin{cfa} |
---|
107 | double ?+?( int x, point y ); // arbitrary types |
---|
108 | \end{cfa} |
---|
109 | % Some ``near misses", that are that do not match an operator form but looks like |
---|
110 | % it may have been supposed to, will generate warning but otherwise they are |
---|
111 | % left alone. |
---|
112 | Because operators are never part of the type definition they may be added |
---|
113 | at any time, including on built-in types. |
---|
114 | |
---|
115 | %\subsection{Constructors and Destructors} |
---|
116 | |
---|
117 | \CFA also provides constructors and destructors as operators, which means they |
---|
118 | are functions with special operator names rather than type names in \Cpp. |
---|
119 | While constructors and destructions are normally called implicitly by the compiler, |
---|
120 | the special operator names, allow explicit calls. |
---|
121 | |
---|
122 | % Placement new means that this is actually equivalent to C++. |
---|
123 | |
---|
124 | The special name for a constructor is @?{}@, which comes from the |
---|
125 | initialization syntax in C, \eg @Example e = { ... }@. |
---|
126 | \CFA generates a constructor call each time a variable is declared, |
---|
127 | passing the initialization arguments to the constructor. |
---|
128 | \begin{cfa} |
---|
129 | struct Example { ... }; |
---|
130 | void ?{}(Example & this) { ... } |
---|
131 | void ?{}(Example & this, char first, int num) { ... } |
---|
132 | Example a; // implicit constructor calls |
---|
133 | Example b = {}; |
---|
134 | Example c = {'a', 2}; |
---|
135 | \end{cfa} |
---|
136 | Both @a@ and @b@ are initialized with the first constructor, |
---|
137 | while @c@ is initialized with the second. |
---|
138 | Constructor calls can be replaced with C initialization using special operator \lstinline{@=}. |
---|
139 | \begin{cfa} |
---|
140 | Example d @= {42}; |
---|
141 | \end{cfa} |
---|
142 | % I don't like the \^{} symbol but $^\wedge$ isn't better. |
---|
143 | Similarly, destructors use the special name @^?{}@ (the @^@ has no special |
---|
144 | meaning). |
---|
145 | % These are a normally called implicitly called on a variable when it goes out |
---|
146 | % of scope. They can be called explicitly as well. |
---|
147 | \begin{cfa} |
---|
148 | void ^?{}(Example & this) { ... } |
---|
149 | { |
---|
150 | Example e; // implicit constructor call |
---|
151 | ^?{}(e); // explicit destructor call |
---|
152 | ?{}(e); // explicit constructor call |
---|
153 | } // implicit destructor call |
---|
154 | \end{cfa} |
---|
155 | |
---|
156 | Whenever a type is defined, \CFA creates a default zero-argument |
---|
157 | constructor, a copy constructor, a series of argument-per-field constructors |
---|
158 | and a destructor. All user constructors are defined after this. |
---|
159 | |
---|
160 | \section{Polymorphism} |
---|
161 | \CFA uses parametric polymorphism to create functions and types that are |
---|
162 | defined over multiple types. \CFA polymorphic declarations serve the same role |
---|
163 | as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is |
---|
164 | accomplished by passing argument operations to associate \emph{parameters} at |
---|
165 | the call site, and these parameters are used in the function to differentiate |
---|
166 | among the types the function operates on. |
---|
167 | |
---|
168 | Polymorphic declarations start with a universal @forall@ clause that goes |
---|
169 | before the standard (monomorphic) declaration. These declarations have the same |
---|
170 | syntax except they may use the universal type names introduced by the @forall@ |
---|
171 | clause. For example, the following is a polymorphic identity function that |
---|
172 | works on any type @T@: |
---|
173 | \begin{cfa} |
---|
174 | forall( T ) T identity( T val ) { return val; } |
---|
175 | int forty_two = identity( 42 ); |
---|
176 | char capital_a = identity( 'A' ); |
---|
177 | \end{cfa} |
---|
178 | Each use of a polymorphic declaration resolves its polymorphic parameters |
---|
179 | (in this case, just @T@) to concrete types (@int@ in the first use and @char@ |
---|
180 | in the second). |
---|
181 | |
---|
182 | To allow a polymorphic function to be separately compiled, the type @T@ must be |
---|
183 | constrained by the operations used on @T@ in the function body. The @forall@ |
---|
184 | clause is augmented with a list of polymorphic variables (local type names) |
---|
185 | and assertions (constraints), which represent the required operations on those |
---|
186 | types used in a function, \eg: |
---|
187 | \begin{cfa} |
---|
188 | forall( T | { void do_once(T); } ) |
---|
189 | void do_twice(T value) { |
---|
190 | do_once(value); |
---|
191 | do_once(value); |
---|
192 | } |
---|
193 | \end{cfa} |
---|
194 | |
---|
195 | A polymorphic function can be used in the same way as a normal function. The |
---|
196 | polymorphic variables are filled in with concrete types and the assertions are |
---|
197 | checked. An assertion is checked by verifying each assertion operation (with |
---|
198 | all the variables replaced with the concrete types from the arguments) is |
---|
199 | defined at a call site. |
---|
200 | \begin{cfa} |
---|
201 | void do_once(int i) { ... } |
---|
202 | int i; |
---|
203 | do_twice(i); |
---|
204 | \end{cfa} |
---|
205 | Any object with a type fulfilling the assertion may be passed as an argument to |
---|
206 | a @do_twice@ call. |
---|
207 | |
---|
208 | Note, a function named @do_once@ is not required in the scope of @do_twice@ to |
---|
209 | compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing |
---|
210 | allows local replacement of the specific parametric functions needs for a |
---|
211 | call. |
---|
212 | \begin{cfa} |
---|
213 | void do_once(double y) { ... } |
---|
214 | int quadruple(int x) { |
---|
215 | void do_once(int & y) { y = y * 2; } |
---|
216 | do_twice(x); |
---|
217 | return x; |
---|
218 | } |
---|
219 | \end{cfa} |
---|
220 | Specifically, the complier deduces that @do_twice@'s T is an integer from the |
---|
221 | argument @x@. It then looks for the most specific definition matching the |
---|
222 | assertion, which is the nested integral @do_once@ defined within the |
---|
223 | function. The matched assertion function is then passed as a function pointer |
---|
224 | to @do_twice@ and called within it. |
---|
225 | The global definition of @do_once@ is ignored, however if quadruple took a |
---|
226 | @double@ argument, then the global definition would be used instead as it |
---|
227 | is a better match. |
---|
228 | % Aaron's thesis might be a good reference here. |
---|
229 | |
---|
230 | To avoid typing long lists of assertions, constraints can be collect into |
---|
231 | convenient package called a @trait@, which can then be used in an assertion |
---|
232 | instead of the individual constraints. |
---|
233 | \begin{cfa} |
---|
234 | trait done_once(T) { |
---|
235 | void do_once(T); |
---|
236 | } |
---|
237 | \end{cfa} |
---|
238 | and the @forall@ list in the previous example is replaced with the trait. |
---|
239 | \begin{cfa} |
---|
240 | forall(dtype T | done_once(T)) |
---|
241 | \end{cfa} |
---|
242 | In general, a trait can contain an arbitrary number of assertions, both |
---|
243 | functions and variables, and are usually used to create a shorthand for, and |
---|
244 | give descriptive names to, common groupings of assertions describing a certain |
---|
245 | functionality, like @sumable@, @listable@, \etc. |
---|
246 | |
---|
247 | Polymorphic structures and unions are defined by qualifying an aggregate type |
---|
248 | with @forall@. The type variables work the same except they are used in field |
---|
249 | declarations instead of parameters, returns, and local variable declarations. |
---|
250 | \begin{cfa} |
---|
251 | forall(dtype T) |
---|
252 | struct node { |
---|
253 | node(T) * next; |
---|
254 | T * data; |
---|
255 | } |
---|
256 | node(int) inode; |
---|
257 | \end{cfa} |
---|
258 | The generic type @node(T)@ is an example of a polymorphic type usage. Like \Cpp |
---|
259 | template usage, a polymorphic type usage must specify a type parameter. |
---|
260 | |
---|
261 | There are many other polymorphism features in \CFA but these are the ones used |
---|
262 | by the exception system. |
---|
263 | |
---|
264 | \section{Control Flow} |
---|
265 | \CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@. |
---|
266 | The two features that interact with |
---|
267 | the exception system are @coroutine@ and @thread@; they and their supporting |
---|
268 | constructs are described here. |
---|
269 | |
---|
270 | \subsection{Coroutine} |
---|
271 | A coroutine is a type with associated functions, where the functions are not |
---|
272 | required to finish execution when control is handed back to the caller. Instead |
---|
273 | they may suspend execution at any time and be resumed later at the point of |
---|
274 | last suspension. (Generators are stackless and coroutines are stackful.) These |
---|
275 | types are not concurrent but share some similarities along with common |
---|
276 | underpinnings, so they are combined with the \CFA threading library. Further |
---|
277 | discussion in this section only refers to the coroutine because generators are |
---|
278 | similar. |
---|
279 | |
---|
280 | In \CFA, a coroutine is created using the @coroutine@ keyword, which is an |
---|
281 | aggregate type like @struct,@ except the structure is implicitly modified by |
---|
282 | the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is |
---|
283 | restricted by the type system to types that provide this special trait. The |
---|
284 | coroutine structure acts as the interface between callers and the coroutine, |
---|
285 | and its fields are used to pass information in and out of coroutine interface |
---|
286 | functions. |
---|
287 | |
---|
288 | Here is a simple example where a single field is used to pass (communicate) the |
---|
289 | next number in a sequence. |
---|
290 | \begin{cfa} |
---|
291 | coroutine CountUp { |
---|
292 | unsigned int next; |
---|
293 | }; |
---|
294 | CountUp countup; |
---|
295 | for (10) sout | resume(countup).next; // print 10 values |
---|
296 | \end{cfa} |
---|
297 | Each coroutine has a @main@ function, which takes a reference to a coroutine |
---|
298 | object and returns @void@. |
---|
299 | %[numbers=left] Why numbers on this one? |
---|
300 | \begin{cfa}[numbers=left,numberstyle=\scriptsize\sf] |
---|
301 | void main(CountUp & this) { |
---|
302 | for (unsigned int up = 0;; ++up) { |
---|
303 | this.next = up; |
---|
304 | suspend;$\label{suspend}$ |
---|
305 | } |
---|
306 | } |
---|
307 | \end{cfa} |
---|
308 | In this function, or functions called by this function (helper functions), the |
---|
309 | @suspend@ statement is used to return execution to the coroutine's resumer |
---|
310 | without terminating the coroutine's function(s). |
---|
311 | |
---|
312 | A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@. |
---|
313 | The first resume calls the @main@ function at the top. Thereafter, resume calls |
---|
314 | continue a coroutine in the last suspended function after the @suspend@ |
---|
315 | statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes |
---|
316 | a reference to the coroutine structure and returns the same reference. The |
---|
317 | return value allows easy access to communication variables defined in the |
---|
318 | coroutine object. For example, the @next@ value for coroutine object @countup@ |
---|
319 | is both generated and collected in the single expression: |
---|
320 | @resume(countup).next@. |
---|
321 | |
---|
322 | \subsection{Monitor and Mutex Parameter} |
---|
323 | Concurrency does not guarantee ordering; without ordering results are |
---|
324 | non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@ |
---|
325 | (mutual exclusion) parameters. A monitor is another kind of aggregate, where |
---|
326 | the compiler implicitly inserts a lock and instances are compatible with |
---|
327 | @mutex@ parameters. |
---|
328 | |
---|
329 | A function that requires deterministic (ordered) execution, acquires mutual |
---|
330 | exclusion on a monitor object by qualifying an object reference parameter with |
---|
331 | @mutex@. |
---|
332 | \begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}] |
---|
333 | void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB); |
---|
334 | \end{lstlisting} |
---|
335 | When the function is called, it implicitly acquires the monitor lock for all of |
---|
336 | the mutex parameters without deadlock. This semantics means all functions with |
---|
337 | the same mutex type(s) are part of a critical section for objects of that type |
---|
338 | and only one runs at a time. |
---|
339 | |
---|
340 | \subsection{Thread} |
---|
341 | Functions, generators, and coroutines are sequential so there is only a single |
---|
342 | (but potentially sophisticated) execution path in a program. Threads introduce |
---|
343 | multiple execution paths that continue independently. |
---|
344 | |
---|
345 | For threads to work safely with objects requires mutual exclusion using |
---|
346 | monitors and mutex parameters. For threads to work safely with other threads, |
---|
347 | also requires mutual exclusion in the form of a communication rendezvous, which |
---|
348 | also supports internal synchronization as for mutex objects. For exceptions, |
---|
349 | only two basic thread operations are important: fork and join. |
---|
350 | |
---|
351 | Threads are created like coroutines with an associated @main@ function: |
---|
352 | \begin{cfa} |
---|
353 | thread StringWorker { |
---|
354 | const char * input; |
---|
355 | int result; |
---|
356 | }; |
---|
357 | void main(StringWorker & this) { |
---|
358 | const char * localCopy = this.input; |
---|
359 | // ... do some work, perhaps hashing the string ... |
---|
360 | this.result = result; |
---|
361 | } |
---|
362 | { |
---|
363 | StringWorker stringworker; // fork thread running in "main" |
---|
364 | } // implicitly join with thread / wait for completion |
---|
365 | \end{cfa} |
---|
366 | The thread main is where a new thread starts execution after a fork operation |
---|
367 | and then the thread continues executing until it is finished. If another thread |
---|
368 | joins with an executing thread, it waits until the executing main completes |
---|
369 | execution. In other words, everything a thread does is between a fork and join. |
---|
370 | |
---|
371 | From the outside, this behaviour is accomplished through creation and |
---|
372 | destruction of a thread object. Implicitly, fork happens after a thread |
---|
373 | object's constructor is run and join happens before the destructor runs. Join |
---|
374 | can also be specified explicitly using the @join@ function to wait for a |
---|
375 | thread's completion independently from its deallocation (\ie destructor |
---|
376 | call). If @join@ is called explicitly, the destructor does not implicitly join. |
---|