| 1 | \chapter{\CFA{} Existing Features} | 
|---|
| 2 | \label{c:existing} | 
|---|
| 3 |  | 
|---|
| 4 | \CFA is an open-source project extending ISO C with | 
|---|
| 5 | modern safety and productivity features, while still ensuring backwards | 
|---|
| 6 | compatibility with C and its programmers.  \CFA is designed to have an | 
|---|
| 7 | orthogonal feature-set based closely on the C programming paradigm | 
|---|
| 8 | (non-object-oriented), and these features can be added incrementally to an | 
|---|
| 9 | existing C code-base, | 
|---|
| 10 | allowing programmers to learn \CFA on an as-needed basis. | 
|---|
| 11 |  | 
|---|
| 12 | Only those \CFA features pertaining to this thesis are discussed. | 
|---|
| 13 | A familiarity with | 
|---|
| 14 | C or C-like languages is assumed. | 
|---|
| 15 |  | 
|---|
| 16 | \section{Overloading and \lstinline{extern}} | 
|---|
| 17 | \CFA has extensive overloading, allowing multiple definitions of the same name | 
|---|
| 18 | to be defined~\cite{Moss18}. | 
|---|
| 19 | \begin{cfa} | 
|---|
| 20 | char i; int i; double i; | 
|---|
| 21 | int f(); double f(); | 
|---|
| 22 | void g( int ); void g( double ); | 
|---|
| 23 | \end{cfa} | 
|---|
| 24 | This feature requires name mangling so the assembly symbols are unique for | 
|---|
| 25 | different overloads. For compatibility with names in C, there is also a syntax | 
|---|
| 26 | to disable name mangling. These unmangled names cannot be overloaded but act as | 
|---|
| 27 | the interface between C and \CFA code.  The syntax for disabling/enabling | 
|---|
| 28 | mangling is: | 
|---|
| 29 | \begin{cfa} | 
|---|
| 30 | // name mangling on by default | 
|---|
| 31 | int i; // _X1ii_1 | 
|---|
| 32 | extern "C" {  // disables name mangling | 
|---|
| 33 | int j; // j | 
|---|
| 34 | extern "Cforall" {  // enables name mangling | 
|---|
| 35 | int k; // _X1ki_1 | 
|---|
| 36 | } | 
|---|
| 37 | // revert to no name mangling | 
|---|
| 38 | } | 
|---|
| 39 | // revert to name mangling | 
|---|
| 40 | \end{cfa} | 
|---|
| 41 | Both forms of @extern@ affect all the declarations within their nested lexical | 
|---|
| 42 | scope and transition back to the previous mangling state when the lexical scope | 
|---|
| 43 | ends. | 
|---|
| 44 |  | 
|---|
| 45 | \section{Reference Type} | 
|---|
| 46 | \CFA adds a reference type to C as an auto-dereferencing pointer. | 
|---|
| 47 | They work very similarly to pointers. | 
|---|
| 48 | Reference-types are written the same way as pointer-types, but each | 
|---|
| 49 | asterisk (@*@) is replaced with a ampersand (@&@); | 
|---|
| 50 | this includes cv-qualifiers (\snake{const} and \snake{volatile}) | 
|---|
| 51 | and multiple levels of reference. | 
|---|
| 52 |  | 
|---|
| 53 | Generally, references act like pointers with an implicit dereferencing | 
|---|
| 54 | operation added to each use of the variable. | 
|---|
| 55 | These automatic dereferences may be disabled with the address-of operator | 
|---|
| 56 | (@&@). | 
|---|
| 57 |  | 
|---|
| 58 | % Check to see if these are generating errors. | 
|---|
| 59 | \begin{minipage}{0,5\textwidth} | 
|---|
| 60 | With references: | 
|---|
| 61 | \begin{cfa} | 
|---|
| 62 | int i, j; | 
|---|
| 63 | int & ri = i; | 
|---|
| 64 | int && rri = ri; | 
|---|
| 65 | rri = 3; | 
|---|
| 66 | &ri = &j; | 
|---|
| 67 | ri = 5; | 
|---|
| 68 | \end{cfa} | 
|---|
| 69 | \end{minipage} | 
|---|
| 70 | \begin{minipage}{0,5\textwidth} | 
|---|
| 71 | With pointers: | 
|---|
| 72 | \begin{cfa} | 
|---|
| 73 | int i, j; | 
|---|
| 74 | int * pi = &i | 
|---|
| 75 | int ** ppi = π | 
|---|
| 76 | **ppi = 3; | 
|---|
| 77 | pi = &j; | 
|---|
| 78 | *pi = 5; | 
|---|
| 79 | \end{cfa} | 
|---|
| 80 | \end{minipage} | 
|---|
| 81 |  | 
|---|
| 82 | References are intended to be used when the indirection of a pointer is | 
|---|
| 83 | required, but the address is not as important as the value and dereferencing | 
|---|
| 84 | is the common usage. | 
|---|
| 85 | Mutable references may be assigned to by converting them to a pointer | 
|---|
| 86 | with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above. | 
|---|
| 87 |  | 
|---|
| 88 | \section{Operators} | 
|---|
| 89 |  | 
|---|
| 90 | \CFA implements operator overloading by providing special names, where | 
|---|
| 91 | operator expressions are translated into function calls using these names. | 
|---|
| 92 | An operator name is created by taking the operator symbols and joining them with | 
|---|
| 93 | @?@s to show where the arguments go. | 
|---|
| 94 | For example, | 
|---|
| 95 | infixed multiplication is @?*?@, while prefix dereference is @*?@. | 
|---|
| 96 | This syntax makes it easy to tell the difference between prefix operations | 
|---|
| 97 | (such as @++?@) and postfix operations (@?++@). | 
|---|
| 98 |  | 
|---|
| 99 | As an example, here are the addition and equality operators for a point type. | 
|---|
| 100 | \begin{cfa} | 
|---|
| 101 | point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; } | 
|---|
| 102 | int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; } | 
|---|
| 103 | { | 
|---|
| 104 | assert(point{1, 2} + point{3, 4} == point{4, 6}); | 
|---|
| 105 | } | 
|---|
| 106 | \end{cfa} | 
|---|
| 107 | Note that this syntax works effectively as a textual transformation; | 
|---|
| 108 | the compiler converts all operators into functions and then resolves them | 
|---|
| 109 | normally. This means any combination of types may be used, | 
|---|
| 110 | although nonsensical ones (like @double ?==?(point, int);@) are discouraged. | 
|---|
| 111 | This feature is also used for all builtin operators as well, | 
|---|
| 112 | although those are implicitly provided by the language. | 
|---|
| 113 |  | 
|---|
| 114 | %\subsection{Constructors and Destructors} | 
|---|
| 115 | In \CFA, constructors and destructors are operators, which means they are | 
|---|
| 116 | functions with special operator names, rather than type names as in \Cpp. | 
|---|
| 117 | Both constructors and destructors can be implicity called by the compiler, | 
|---|
| 118 | however the operator names allow explicit calls. | 
|---|
| 119 | % Placement new means that this is actually equivant to C++. | 
|---|
| 120 |  | 
|---|
| 121 | The special name for a constructor is @?{}@, which comes from the | 
|---|
| 122 | initialization syntax in C, \eg @Example e = { ... }@. | 
|---|
| 123 | \CFA generates a constructor call each time a variable is declared, | 
|---|
| 124 | passing the initialization arguments to the constructor. | 
|---|
| 125 | \begin{cfa} | 
|---|
| 126 | struct Example { ... }; | 
|---|
| 127 | void ?{}(Example & this) { ... } | 
|---|
| 128 | { | 
|---|
| 129 | Example a; | 
|---|
| 130 | Example b = {}; | 
|---|
| 131 | } | 
|---|
| 132 | void ?{}(Example & this, char first, int num) { ... } | 
|---|
| 133 | { | 
|---|
| 134 | Example c = {'a', 2}; | 
|---|
| 135 | } | 
|---|
| 136 | \end{cfa} | 
|---|
| 137 | Both @a@ and @b@ will be initalized with the first constructor, | 
|---|
| 138 | @b@ because of the explicit call and @a@ implicitly. | 
|---|
| 139 | @c@ will be initalized with the second constructor. | 
|---|
| 140 | Currently, there is no general way to skip initialization. | 
|---|
| 141 | % I don't use @= anywhere in the thesis. | 
|---|
| 142 |  | 
|---|
| 143 | % I don't like the \^{} symbol but $^\wedge$ isn't better. | 
|---|
| 144 | Similarly, destructors use the special name @^?{}@ (the @^@ has no special | 
|---|
| 145 | meaning). | 
|---|
| 146 | \begin{cfa} | 
|---|
| 147 | void ^?{}(Example & this) { ... } | 
|---|
| 148 | { | 
|---|
| 149 | Example d; | 
|---|
| 150 | ^?{}(d); | 
|---|
| 151 |  | 
|---|
| 152 | Example e; | 
|---|
| 153 | } // Implicit call of ^?{}(e); | 
|---|
| 154 | \end{cfa} | 
|---|
| 155 |  | 
|---|
| 156 | Whenever a type is defined, \CFA creates a default zero-argument | 
|---|
| 157 | constructor, a copy constructor, a series of argument-per-field constructors | 
|---|
| 158 | and a destructor. All user constructors are defined after this. | 
|---|
| 159 |  | 
|---|
| 160 | \section{Polymorphism} | 
|---|
| 161 | \CFA uses parametric polymorphism to create functions and types that are | 
|---|
| 162 | defined over multiple types. \CFA polymorphic declarations serve the same role | 
|---|
| 163 | as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is | 
|---|
| 164 | accomplished by passing argument operations to associate \emph{parameters} at | 
|---|
| 165 | the call site, and these parameters are used in the function to differentiate | 
|---|
| 166 | among the types the function operates on. | 
|---|
| 167 |  | 
|---|
| 168 | Polymorphic declarations start with a universal @forall@ clause that goes | 
|---|
| 169 | before the standard (monomorphic) declaration. These declarations have the same | 
|---|
| 170 | syntax except they may use the universal type names introduced by the @forall@ | 
|---|
| 171 | clause.  For example, the following is a polymorphic identity function that | 
|---|
| 172 | works on any type @T@: | 
|---|
| 173 | \begin{cfa} | 
|---|
| 174 | forall( T ) T identity( T val ) { return val; } | 
|---|
| 175 | int forty_two = identity( 42 ); | 
|---|
| 176 | char capital_a = identity( 'A' ); | 
|---|
| 177 | \end{cfa} | 
|---|
| 178 | Each use of a polymorphic declaration resolves its polymorphic parameters | 
|---|
| 179 | (in this case, just @T@) to concrete types (@int@ in the first use and @char@ | 
|---|
| 180 | in the second). | 
|---|
| 181 |  | 
|---|
| 182 | To allow a polymorphic function to be separately compiled, the type @T@ must be | 
|---|
| 183 | constrained by the operations used on @T@ in the function body. The @forall@ | 
|---|
| 184 | clause is augmented with a list of polymorphic variables (local type names) | 
|---|
| 185 | and assertions (constraints), which represent the required operations on those | 
|---|
| 186 | types used in a function, \eg: | 
|---|
| 187 | \begin{cfa} | 
|---|
| 188 | forall( T | { void do_once(T); } ) | 
|---|
| 189 | void do_twice(T value) { | 
|---|
| 190 | do_once(value); | 
|---|
| 191 | do_once(value); | 
|---|
| 192 | } | 
|---|
| 193 | \end{cfa} | 
|---|
| 194 |  | 
|---|
| 195 | A polymorphic function can be used in the same way as a normal function.  The | 
|---|
| 196 | polymorphic variables are filled in with concrete types and the assertions are | 
|---|
| 197 | checked. An assertion is checked by verifying each assertion operation (with | 
|---|
| 198 | all the variables replaced with the concrete types from the arguments) is | 
|---|
| 199 | defined at a call site. | 
|---|
| 200 | \begin{cfa} | 
|---|
| 201 | void do_once(int i) { ... } | 
|---|
| 202 | int i; | 
|---|
| 203 | do_twice(i); | 
|---|
| 204 | \end{cfa} | 
|---|
| 205 | Any value with a type fulfilling the assertion may be passed as an argument to | 
|---|
| 206 | a @do_twice@ call. | 
|---|
| 207 |  | 
|---|
| 208 | Note, a function named @do_once@ is not required in the scope of @do_twice@ to | 
|---|
| 209 | compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing | 
|---|
| 210 | allows local replacement of the specific parametric functions needs for a | 
|---|
| 211 | call. | 
|---|
| 212 | \begin{cfa} | 
|---|
| 213 | void do_once(double y) { ... } | 
|---|
| 214 | int quadruple(int x) { | 
|---|
| 215 | void do_once(int & y) { y = y * 2; } | 
|---|
| 216 | do_twice(x); | 
|---|
| 217 | return x; | 
|---|
| 218 | } | 
|---|
| 219 | \end{cfa} | 
|---|
| 220 | Specifically, the complier deduces that @do_twice@'s T is an integer from the | 
|---|
| 221 | argument @x@. It then looks for the most specific definition matching the | 
|---|
| 222 | assertion, which is the nested integral @do_once@ defined within the | 
|---|
| 223 | function. The matched assertion function is then passed as a function pointer | 
|---|
| 224 | to @do_twice@ and called within it. | 
|---|
| 225 | The global definition of @do_once@ is ignored, however if @quadruple@ took a | 
|---|
| 226 | @double@ argument, then the global definition would be used instead as it | 
|---|
| 227 | would then be a better match.\cite{Moss19} | 
|---|
| 228 |  | 
|---|
| 229 | To avoid typing long lists of assertions, constraints can be collected into | 
|---|
| 230 | a convenient package called a @trait@, which can then be used in an assertion | 
|---|
| 231 | instead of the individual constraints. | 
|---|
| 232 | \begin{cfa} | 
|---|
| 233 | trait done_once(T) { | 
|---|
| 234 | void do_once(T); | 
|---|
| 235 | } | 
|---|
| 236 | \end{cfa} | 
|---|
| 237 | and the @forall@ list in the previous example is replaced with the trait. | 
|---|
| 238 | \begin{cfa} | 
|---|
| 239 | forall(dtype T | done_once(T)) | 
|---|
| 240 | \end{cfa} | 
|---|
| 241 | In general, a trait can contain an arbitrary number of assertions, both | 
|---|
| 242 | functions and variables, and are usually used to create a shorthand for, and | 
|---|
| 243 | give descriptive names to, common groupings of assertions describing a certain | 
|---|
| 244 | functionality, like @summable@, @listable@, \etc. | 
|---|
| 245 |  | 
|---|
| 246 | Polymorphic structures and unions are defined by qualifying an aggregate type | 
|---|
| 247 | with @forall@. The type variables work the same except they are used in field | 
|---|
| 248 | declarations instead of parameters, returns and local variable declarations. | 
|---|
| 249 | \begin{cfa} | 
|---|
| 250 | forall(dtype T) | 
|---|
| 251 | struct node { | 
|---|
| 252 | node(T) * next; | 
|---|
| 253 | T * data; | 
|---|
| 254 | }; | 
|---|
| 255 | node(int) inode; | 
|---|
| 256 | \end{cfa} | 
|---|
| 257 | The generic type @node(T)@ is an example of a polymorphic type usage.  Like \Cpp | 
|---|
| 258 | template usage, a polymorphic type usage must specify a type parameter. | 
|---|
| 259 |  | 
|---|
| 260 | There are many other polymorphism features in \CFA but these are the ones used | 
|---|
| 261 | by the exception system. | 
|---|
| 262 |  | 
|---|
| 263 | \section{Control Flow} | 
|---|
| 264 | \CFA has a number of advanced control-flow features: @generator@, @coroutine@, | 
|---|
| 265 | @monitor@, @mutex@ parameters, and @thread@. | 
|---|
| 266 | The two features that interact with | 
|---|
| 267 | the exception system are @coroutine@ and @thread@; they and their supporting | 
|---|
| 268 | constructs are described here. | 
|---|
| 269 |  | 
|---|
| 270 | \subsection{Coroutine} | 
|---|
| 271 | A coroutine is a type with associated functions, where the functions are not | 
|---|
| 272 | required to finish execution when control is handed back to the caller. | 
|---|
| 273 | Instead, | 
|---|
| 274 | they may suspend execution at any time and be resumed later at the point of | 
|---|
| 275 | last suspension. | 
|---|
| 276 | Coroutine | 
|---|
| 277 | types are not concurrent but share some similarities along with common | 
|---|
| 278 | underpinnings, so they are combined with the \CFA threading library. | 
|---|
| 279 | % I had mention of generators, but they don't actually matter here. | 
|---|
| 280 |  | 
|---|
| 281 | In \CFA, a coroutine is created using the @coroutine@ keyword, which is an | 
|---|
| 282 | aggregate type like @struct,@ except the structure is implicitly modified by | 
|---|
| 283 | the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is | 
|---|
| 284 | restricted by the type system to types that provide this special trait.  The | 
|---|
| 285 | coroutine structure acts as the interface between callers and the coroutine, | 
|---|
| 286 | and its fields are used to pass information in and out of coroutine interface | 
|---|
| 287 | functions. | 
|---|
| 288 |  | 
|---|
| 289 | Here is a simple example where a single field is used to pass (communicate) the | 
|---|
| 290 | next number in a sequence. | 
|---|
| 291 | \begin{cfa} | 
|---|
| 292 | coroutine CountUp { | 
|---|
| 293 | unsigned int next; | 
|---|
| 294 | }; | 
|---|
| 295 | CountUp countup; | 
|---|
| 296 | \end{cfa} | 
|---|
| 297 | Each coroutine has a @main@ function, which takes a reference to a coroutine | 
|---|
| 298 | object and returns @void@. | 
|---|
| 299 | %[numbers=left] Why numbers on this one? | 
|---|
| 300 | \begin{cfa} | 
|---|
| 301 | void main(CountUp & this) { | 
|---|
| 302 | for (unsigned int next = 0 ; true ; ++next) { | 
|---|
| 303 | this.next = next; | 
|---|
| 304 | suspend;$\label{suspend}$ | 
|---|
| 305 | } | 
|---|
| 306 | } | 
|---|
| 307 | \end{cfa} | 
|---|
| 308 | In this function, or functions called by this function (helper functions), the | 
|---|
| 309 | @suspend@ statement is used to return execution to the coroutine's caller | 
|---|
| 310 | without terminating the coroutine's function. | 
|---|
| 311 |  | 
|---|
| 312 | A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@. | 
|---|
| 313 | The first resume calls the @main@ function at the top. Thereafter, resume calls | 
|---|
| 314 | continue a coroutine in the last suspended function after the @suspend@ | 
|---|
| 315 | statement. In this case there is only one and, hence, the difference between | 
|---|
| 316 | subsequent calls is the state of variables inside the function and the | 
|---|
| 317 | coroutine object. | 
|---|
| 318 | The return value of @resume@ is a reference to the coroutine, to make it | 
|---|
| 319 | convent to access fields of the coroutine in the same expression. | 
|---|
| 320 | Here is a simple example in a helper function: | 
|---|
| 321 | \begin{cfa} | 
|---|
| 322 | unsigned int get_next(CountUp & this) { | 
|---|
| 323 | return resume(this).next; | 
|---|
| 324 | } | 
|---|
| 325 | \end{cfa} | 
|---|
| 326 |  | 
|---|
| 327 | When the main function returns, the coroutine halts and can no longer be | 
|---|
| 328 | resumed. | 
|---|
| 329 |  | 
|---|
| 330 | \subsection{Monitor and Mutex Parameter} | 
|---|
| 331 | Concurrency does not guarantee ordering; without ordering, results are | 
|---|
| 332 | non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@ | 
|---|
| 333 | (mutual exclusion) parameters. A monitor is another kind of aggregate, where | 
|---|
| 334 | the compiler implicitly inserts a lock and instances are compatible with | 
|---|
| 335 | @mutex@ parameters. | 
|---|
| 336 |  | 
|---|
| 337 | A function that requires deterministic (ordered) execution acquires mutual | 
|---|
| 338 | exclusion on a monitor object by qualifying an object reference parameter with | 
|---|
| 339 | the @mutex@ qualifier. | 
|---|
| 340 | \begin{cfa} | 
|---|
| 341 | void example(MonitorA & mutex argA, MonitorB & mutex argB); | 
|---|
| 342 | \end{cfa} | 
|---|
| 343 | When the function is called, it implicitly acquires the monitor lock for all of | 
|---|
| 344 | the mutex parameters without deadlock.  This semantics means all functions with | 
|---|
| 345 | the same mutex type(s) are part of a critical section for objects of that type | 
|---|
| 346 | and only one runs at a time. | 
|---|
| 347 |  | 
|---|
| 348 | \subsection{Thread} | 
|---|
| 349 | Functions, generators and coroutines are sequential, so there is only a single | 
|---|
| 350 | (but potentially sophisticated) execution path in a program. Threads introduce | 
|---|
| 351 | multiple execution paths that continue independently. | 
|---|
| 352 |  | 
|---|
| 353 | For threads to work safely with objects requires mutual exclusion using | 
|---|
| 354 | monitors and mutex parameters. For threads to work safely with other threads | 
|---|
| 355 | also requires mutual exclusion in the form of a communication rendezvous, which | 
|---|
| 356 | also supports internal synchronization as for mutex objects. For exceptions, | 
|---|
| 357 | only two basic thread operations are important: fork and join. | 
|---|
| 358 |  | 
|---|
| 359 | Threads are created like coroutines with an associated @main@ function: | 
|---|
| 360 | \begin{cfa} | 
|---|
| 361 | thread StringWorker { | 
|---|
| 362 | const char * input; | 
|---|
| 363 | int result; | 
|---|
| 364 | }; | 
|---|
| 365 | void main(StringWorker & this) { | 
|---|
| 366 | const char * localCopy = this.input; | 
|---|
| 367 | // ... do some work, perhaps hashing the string ... | 
|---|
| 368 | this.result = result; | 
|---|
| 369 | } | 
|---|
| 370 | { | 
|---|
| 371 | StringWorker stringworker; // fork thread running in "main" | 
|---|
| 372 | } // Implicit call to join(stringworker), waits for completion. | 
|---|
| 373 | \end{cfa} | 
|---|
| 374 | The thread main is where a new thread starts execution after a fork operation | 
|---|
| 375 | and then the thread continues executing until it is finished. If another thread | 
|---|
| 376 | joins with an executing thread, it waits until the executing main completes | 
|---|
| 377 | execution. In other words, everything a thread does is between a fork and join. | 
|---|
| 378 |  | 
|---|
| 379 | From the outside, this behaviour is accomplished through creation and | 
|---|
| 380 | destruction of a thread object.  Implicitly, fork happens after a thread | 
|---|
| 381 | object's constructor is run and join happens before the destructor runs. Join | 
|---|
| 382 | can also be specified explicitly using the @join@ function to wait for a | 
|---|
| 383 | thread's completion independently from its deallocation (\ie destructor | 
|---|
| 384 | call). If @join@ is called explicitly, the destructor does not implicitly join. | 
|---|