[29c9b23] | 1 | \chapter{\CFA Existing Features} |
---|
[f28fdee] | 2 | |
---|
[6c79bef] | 3 | \CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with |
---|
| 4 | modern safety and productivity features, while still ensuring backwards |
---|
| 5 | compatibility with C and its programmers. \CFA is designed to have an |
---|
| 6 | orthogonal feature-set based closely on the C programming paradigm |
---|
| 7 | (non-object-oriented) and these features can be added incrementally to an |
---|
| 8 | existing C code-base allowing programmers to learn \CFA on an as-needed basis. |
---|
| 9 | |
---|
| 10 | Only those \CFA features pertinent to this thesis are discussed. Many of the |
---|
| 11 | \CFA syntactic and semantic features used in the thesis should be fairly |
---|
| 12 | obvious to the reader. |
---|
| 13 | |
---|
[9af0fe2d] | 14 | \section{Overloading and \lstinline{extern}} |
---|
[6c79bef] | 15 | \CFA has extensive overloading, allowing multiple definitions of the same name |
---|
| 16 | to be defined.~\cite{Moss18} |
---|
[f28fdee] | 17 | \begin{cfa} |
---|
[6c79bef] | 18 | char i; int i; double i; $\C[3.75in]{// variable overload}$ |
---|
| 19 | int f(); double f(); $\C{// return overload}$ |
---|
| 20 | void g( int ); void g( double ); $\C{// parameter overload}\CRT$ |
---|
[f28fdee] | 21 | \end{cfa} |
---|
[6c79bef] | 22 | This feature requires name mangling so the assembly symbols are unique for |
---|
| 23 | different overloads. For compatibility with names in C, there is also a syntax |
---|
| 24 | to disable name mangling. These unmangled names cannot be overloaded but act as |
---|
| 25 | the interface between C and \CFA code. The syntax for disabling/enabling |
---|
| 26 | mangling is: |
---|
[f28fdee] | 27 | \begin{cfa} |
---|
[6c79bef] | 28 | // name mangling |
---|
| 29 | int i; // _X1ii_1 |
---|
| 30 | @extern "C"@ { // no name mangling |
---|
| 31 | int j; // j |
---|
| 32 | @extern "Cforall"@ { // name mangling |
---|
| 33 | int k; // _X1ki_1 |
---|
| 34 | } |
---|
| 35 | // no name mangling |
---|
[6e7b969] | 36 | } |
---|
[6c79bef] | 37 | // name mangling |
---|
| 38 | \end{cfa} |
---|
| 39 | Both forms of @extern@ affect all the declarations within their nested lexical |
---|
| 40 | scope and transition back to the previous mangling state when the lexical scope |
---|
| 41 | ends. |
---|
| 42 | |
---|
| 43 | \section{Reference Type} |
---|
[29c9b23] | 44 | \CFA adds a rebindable reference type to C, but more expressive than the \Cpp |
---|
[6c79bef] | 45 | reference. Multi-level references are allowed and act like auto-dereferenced |
---|
| 46 | pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA |
---|
| 47 | references may also be mutable or non-mutable. If mutable, a reference variable |
---|
| 48 | may be assigned to using the address-of operator (@&@), which converts the |
---|
| 49 | reference to a pointer. |
---|
| 50 | \begin{cfa} |
---|
| 51 | int i, j; |
---|
| 52 | int @&@ ri = i, @&&@ rri = ri; |
---|
| 53 | rri = 3; // auto-dereference assign to i |
---|
| 54 | @&@ri = @&@j; // rebindable |
---|
| 55 | ri = 5; // assign to j |
---|
[f28fdee] | 56 | \end{cfa} |
---|
[6e7b969] | 57 | |
---|
| 58 | \section{Constructors and Destructors} |
---|
| 59 | |
---|
| 60 | Both constructors and destructors are operators, which means they are just |
---|
[29c9b23] | 61 | functions with special operator names rather than type names in \Cpp. The |
---|
[6c79bef] | 62 | special operator names may be used to call the functions explicitly (not |
---|
[29c9b23] | 63 | allowed in \Cpp for constructors). |
---|
[6c79bef] | 64 | |
---|
| 65 | In general, operator names in \CFA are constructed by bracketing an operator |
---|
| 66 | token with @?@, which indicates where the arguments. For example, infixed |
---|
| 67 | multiplication is @?*?@ while prefix dereference is @*?@. This syntax make it |
---|
| 68 | easy to tell the difference between prefix operations (such as @++?@) and |
---|
| 69 | post-fix operations (@?++@). |
---|
| 70 | |
---|
| 71 | The special name for a constructor is @?{}@, which comes from the |
---|
| 72 | initialization syntax in C. The special name for a destructor is @^{}@, where |
---|
| 73 | the @^@ has no special meaning. |
---|
| 74 | % I don't like the \^{} symbol but $^\wedge$ isn't better. |
---|
| 75 | \begin{cfa} |
---|
| 76 | struct T { ... }; |
---|
| 77 | void ?@{}@(@T &@ this, ...) { ... } // constructor |
---|
| 78 | void ?@^{}@(@T &@ this, ...) { ... } // destructor |
---|
| 79 | { |
---|
| 80 | T s = @{@ ... @}@; // same constructor/initialization braces |
---|
| 81 | } // destructor call automatically generated |
---|
| 82 | \end{cfa} |
---|
| 83 | The first parameter is a reference parameter to the type for the |
---|
| 84 | constructor/destructor. Destructors may have multiple parameters. The compiler |
---|
| 85 | implicitly matches an overloaded constructor @void ^?{}(T &, ...);@ to an |
---|
| 86 | object declaration with associated initialization, and generates a construction |
---|
| 87 | call after the object is allocated. When an object goes out of scope, the |
---|
| 88 | matching overloaded destructor @void ^?{}(T &);@ is called. Without explicit |
---|
| 89 | definition, \CFA creates a default and copy constructor, destructor and |
---|
[29c9b23] | 90 | assignment (like \Cpp). It is possible to define constructors/destructors for |
---|
[6c79bef] | 91 | basic and existing types. |
---|
[6e7b969] | 92 | |
---|
| 93 | \section{Polymorphism} |
---|
[6c79bef] | 94 | \CFA uses parametric polymorphism to create functions and types that are |
---|
| 95 | defined over multiple types. \CFA polymorphic declarations serve the same role |
---|
[29c9b23] | 96 | as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is |
---|
[6c79bef] | 97 | accomplished by passing argument operations to associate \emph{parameters} at |
---|
| 98 | the call site, and these parameters are used in the function to differentiate |
---|
| 99 | among the types the function operates on. |
---|
| 100 | |
---|
| 101 | Polymorphic declarations start with a universal @forall@ clause that goes |
---|
| 102 | before the standard (monomorphic) declaration. These declarations have the same |
---|
| 103 | syntax except they may use the universal type names introduced by the @forall@ |
---|
| 104 | clause. For example, the following is a polymorphic identity function that |
---|
| 105 | works on any type @T@: |
---|
| 106 | \begin{cfa} |
---|
| 107 | @forall( T )@ @T@ identity( @T@ val ) { return val; } |
---|
| 108 | int forty_two = identity( 42 ); // T bound to int, forty_two == 42 |
---|
| 109 | \end{cfa} |
---|
[6e7b969] | 110 | |
---|
[6c79bef] | 111 | To allow a polymorphic function to be separately compiled, the type @T@ must be |
---|
| 112 | constrained by the operations used on @T@ in the function body. The @forall@ |
---|
| 113 | clauses is augmented with a list of polymorphic variables (local type names) |
---|
| 114 | and assertions (constraints), which represent the required operations on those |
---|
| 115 | types used in a function, \eg: |
---|
[f28fdee] | 116 | \begin{cfa} |
---|
[6c79bef] | 117 | forall( T @| { void do_once(T); }@) // assertion |
---|
| 118 | void do_twice(T value) { |
---|
| 119 | do_once(value); |
---|
| 120 | do_once(value); |
---|
[6e7b969] | 121 | } |
---|
[6c79bef] | 122 | void do_once(int i) { ... } // provide assertion |
---|
| 123 | int i; |
---|
| 124 | do_twice(i); // implicitly pass assertion do_once to do_twice |
---|
[f28fdee] | 125 | \end{cfa} |
---|
[6c79bef] | 126 | Any object with a type fulfilling the assertion may be passed as an argument to |
---|
| 127 | a @do_twice@ call. |
---|
| 128 | |
---|
| 129 | A polymorphic function can be used in the same way as a normal function. The |
---|
| 130 | polymorphic variables are filled in with concrete types and the assertions are |
---|
| 131 | checked. An assertion is checked by verifying each assertion operation (with |
---|
| 132 | all the variables replaced with the concrete types from the arguments) is |
---|
| 133 | defined at a call site. |
---|
| 134 | |
---|
| 135 | Note, a function named @do_once@ is not required in the scope of @do_twice@ to |
---|
[29c9b23] | 136 | compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing |
---|
[6c79bef] | 137 | allows local replacement of the most specific parametric functions needs for a |
---|
| 138 | call. |
---|
[f28fdee] | 139 | \begin{cfa} |
---|
[6c79bef] | 140 | void do_once(double y) { ... } // global |
---|
[6e7b969] | 141 | int quadruple(int x) { |
---|
[6c79bef] | 142 | void do_once(int y) { y = y * 2; } // local |
---|
| 143 | do_twice(x); // using local "do_once" |
---|
| 144 | return x; |
---|
[6e7b969] | 145 | } |
---|
[f28fdee] | 146 | \end{cfa} |
---|
[6c79bef] | 147 | Specifically, the complier deduces that @do_twice@'s T is an integer from the |
---|
| 148 | argument @x@. It then looks for the most specific definition matching the |
---|
| 149 | assertion, which is the nested integral @do_once@ defined within the |
---|
| 150 | function. The matched assertion function is then passed as a function pointer |
---|
| 151 | to @do_twice@ and called within it. |
---|
| 152 | |
---|
| 153 | To avoid typing long lists of assertions, constraints can be collect into |
---|
| 154 | convenient packages called a @trait@, which can then be used in an assertion |
---|
| 155 | instead of the individual constraints. |
---|
[f28fdee] | 156 | \begin{cfa} |
---|
[6c79bef] | 157 | trait done_once(T) { |
---|
| 158 | void do_once(T); |
---|
[6e7b969] | 159 | } |
---|
[f28fdee] | 160 | \end{cfa} |
---|
[6c79bef] | 161 | and the @forall@ list in the previous example is replaced with the trait. |
---|
[f28fdee] | 162 | \begin{cfa} |
---|
[6c79bef] | 163 | forall(dtype T | @done_once(T)@) |
---|
[f28fdee] | 164 | \end{cfa} |
---|
[6c79bef] | 165 | In general, a trait can contain an arbitrary number of assertions, both |
---|
| 166 | functions and variables, and are usually used to create a shorthand for, and |
---|
| 167 | give descriptive names to, common groupings of assertions describing a certain |
---|
| 168 | functionality, like @sumable@, @listable@, \etc. |
---|
| 169 | |
---|
| 170 | Polymorphic structures and unions are defined by qualifying the aggregate type |
---|
| 171 | with @forall@. The type variables work the same except they are used in field |
---|
| 172 | declarations instead of parameters, returns, and local variable declarations. |
---|
[f28fdee] | 173 | \begin{cfa} |
---|
[6e7b969] | 174 | forall(dtype T) |
---|
| 175 | struct node { |
---|
[6c79bef] | 176 | node(T) * next; // generic linked node |
---|
| 177 | T * data; |
---|
[6e7b969] | 178 | } |
---|
[f28fdee] | 179 | \end{cfa} |
---|
[29c9b23] | 180 | The generic type @node(T)@ is an example of a polymorphic-type usage. Like \Cpp |
---|
[6c79bef] | 181 | templates usage, a polymorphic-type usage must specify a type parameter. |
---|
[6e7b969] | 182 | |
---|
[6c79bef] | 183 | There are many other polymorphism features in \CFA but these are the ones used |
---|
| 184 | by the exception system. |
---|
[6e7b969] | 185 | |
---|
| 186 | \section{Concurrency} |
---|
[6c79bef] | 187 | \CFA has a number of concurrency features: @thread@, @monitor@, @mutex@ |
---|
| 188 | parameters, @coroutine@ and @generator@. The two features that interact with |
---|
| 189 | the exception system are @thread@ and @coroutine@; they and their supporting |
---|
| 190 | constructs are described here. |
---|
| 191 | |
---|
| 192 | \subsection{Coroutine} |
---|
| 193 | A coroutine is a type with associated functions, where the functions are not |
---|
| 194 | required to finish execution when control is handed back to the caller. Instead |
---|
| 195 | they may suspend execution at any time and be resumed later at the point of |
---|
| 196 | last suspension. (Generators are stackless and coroutines are stackful.) These |
---|
| 197 | types are not concurrent but share some similarities along with common |
---|
| 198 | underpinnings, so they are combined with the \CFA threading library. Further |
---|
| 199 | discussion in this section only refers to the coroutine because generators are |
---|
| 200 | similar. |
---|
| 201 | |
---|
| 202 | In \CFA, a coroutine is created using the @coroutine@ keyword, which is an |
---|
| 203 | aggregate type like @struct,@ except the structure is implicitly modified by |
---|
| 204 | the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is |
---|
| 205 | restricted by the type system to types that provide this special trait. The |
---|
| 206 | coroutine structure acts as the interface between callers and the coroutine, |
---|
| 207 | and its fields are used to pass information in and out of coroutine interface |
---|
| 208 | functions. |
---|
| 209 | |
---|
| 210 | Here is a simple example where a single field is used to pass (communicate) the |
---|
| 211 | next number in a sequence. |
---|
[f28fdee] | 212 | \begin{cfa} |
---|
[6e7b969] | 213 | coroutine CountUp { |
---|
[6c79bef] | 214 | unsigned int next; // communication variable |
---|
[6e7b969] | 215 | } |
---|
[6c79bef] | 216 | CountUp countup; |
---|
[f28fdee] | 217 | \end{cfa} |
---|
[6c79bef] | 218 | Each coroutine has @main@ function, which takes a reference to a coroutine |
---|
| 219 | object and returns @void@. |
---|
| 220 | \begin{cfa}[numbers=left] |
---|
| 221 | void main(@CountUp & this@) { // argument matches trait is_coroutine |
---|
| 222 | unsigned int up = 0; // retained between calls |
---|
| 223 | while (true) { |
---|
| 224 | next = up; // make "up" available outside function |
---|
| 225 | @suspend;@$\label{suspend}$ |
---|
| 226 | up += 1; |
---|
| 227 | } |
---|
[6e7b969] | 228 | } |
---|
[f28fdee] | 229 | \end{cfa} |
---|
[6c79bef] | 230 | In this function, or functions called by this function (helper functions), the |
---|
| 231 | @suspend@ statement is used to return execution to the coroutine's caller |
---|
| 232 | without terminating the coroutine. |
---|
| 233 | |
---|
| 234 | A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@. |
---|
| 235 | The first resume calls the @main@ function at the top. Thereafter, resume calls |
---|
| 236 | continue a coroutine in the last suspended function after the @suspend@ |
---|
| 237 | statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes |
---|
| 238 | a reference to the coroutine structure and returns the same reference. The |
---|
| 239 | return value allows easy access to communication variables defined in the |
---|
| 240 | coroutine object. For example, the @next@ value for coroutine object @countup@ |
---|
| 241 | is both generated and collected in the single expression: |
---|
| 242 | @resume(countup).next@. |
---|
[6e7b969] | 243 | |
---|
| 244 | \subsection{Monitors and Mutex} |
---|
[6c79bef] | 245 | Concurrency does not guarantee ordering; without ordering results are |
---|
| 246 | non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@ |
---|
| 247 | (mutual exclusion) parameters. A monitor is another kind of aggregate, where |
---|
| 248 | the compiler implicitly inserts a lock and instances are compatible with |
---|
| 249 | @mutex@ parameters. |
---|
| 250 | |
---|
| 251 | A function that requires deterministic (ordered) execution, acquires mutual |
---|
| 252 | exclusion on a monitor object by qualifying an object reference parameter with |
---|
| 253 | @mutex@. |
---|
| 254 | \begin{cfa} |
---|
| 255 | void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB); |
---|
| 256 | \end{cfa} |
---|
| 257 | When the function is called, it implicitly acquires the monitor lock for all of |
---|
| 258 | the mutex parameters without deadlock. This semantics means all functions with |
---|
| 259 | the same mutex type(s) are part of a critical section for objects of that type |
---|
| 260 | and only one runs at a time. |
---|
[6e7b969] | 261 | |
---|
[6c79bef] | 262 | \subsection{Threads} |
---|
| 263 | Functions, generators, and coroutines are sequential so there is only a single |
---|
| 264 | (but potentially sophisticated) execution path in a program. Threads introduce |
---|
| 265 | multiple execution paths that continue independently. |
---|
[6e7b969] | 266 | |
---|
[6c79bef] | 267 | For threads to work safely with objects requires mutual exclusion using |
---|
| 268 | monitors and mutex parameters. For threads to work safely with other threads, |
---|
| 269 | also requires mutual exclusion in the form of a communication rendezvous, which |
---|
| 270 | also supports internal synchronization as for mutex objects. For exceptions |
---|
| 271 | only the basic two basic operations are important: thread fork and join. |
---|
[6e7b969] | 272 | |
---|
[6c79bef] | 273 | Threads are created like coroutines with an associated @main@ function: |
---|
[f28fdee] | 274 | \begin{cfa} |
---|
[6e7b969] | 275 | thread StringWorker { |
---|
[6c79bef] | 276 | const char * input; |
---|
| 277 | int result; |
---|
[6e7b969] | 278 | }; |
---|
| 279 | void main(StringWorker & this) { |
---|
[6c79bef] | 280 | const char * localCopy = this.input; |
---|
| 281 | // ... do some work, perhaps hashing the string ... |
---|
| 282 | this.result = result; |
---|
[6e7b969] | 283 | } |
---|
[6c79bef] | 284 | { |
---|
| 285 | StringWorker stringworker; // fork thread running in "main" |
---|
| 286 | } // implicitly join with thread $\(\Rightarrow\)$ wait for completion |
---|
[f28fdee] | 287 | \end{cfa} |
---|
[6c79bef] | 288 | The thread main is where a new thread starts execution after a fork operation |
---|
| 289 | and then the thread continues executing until it is finished. If another thread |
---|
| 290 | joins with an executing thread, it waits until the executing main completes |
---|
| 291 | execution. In other words, everything a thread does is between a fork and join. |
---|
| 292 | |
---|
| 293 | From the outside, this behaviour is accomplished through creation and |
---|
| 294 | destruction of a thread object. Implicitly, fork happens after a thread |
---|
| 295 | object's constructor is run and join happens before the destructor runs. Join |
---|
| 296 | can also be specified explicitly using the @join@ function to wait for a |
---|
| 297 | thread's completion independently from its deallocation (\ie destructor |
---|
| 298 | call). If @join@ is called explicitly, the destructor does not implicitly join. |
---|