\chapter{\CFA{} Existing Features}
\label{c:existing}
\CFA is an open-source project extending ISO C with modern safety and productivity features, while still ensuring backwards compatibility with C and its programmers.
\CFA is designed to have an orthogonal feature-set based closely on the C programming paradigm (non-object-oriented), and these features can be added incrementally to an existing C code-base, allowing programmers to learn \CFA on an as-needed basis.
Only those \CFA features pertaining to this thesis are discussed.
A familiarity with C or C-like languages is assumed.

\section{Overloading and \lstinline{extern}}
\CFA has extensive overloading, allowing multiple definitions of the same name to be defined~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;
int f(); double f();
void g( int ); void g( double );
\end{cfa}
This feature requires name mangling so the assembly symbols are unique for different overloads.
For compatibility with names in C, there is also a syntax to disable name mangling.
These unmangled names cannot be overloaded but act as the interface between C and \CFA code.
The syntax for disabling/enabling mangling is:
\begin{cfa}
// name mangling on by default
int i; // _X1ii_1
extern "C" { // disables name mangling
	int j; // j
	extern "Cforall" { // enables name mangling
		int k; // _X1ki_1
	} // revert to no name mangling
} // revert to name mangling
\end{cfa}
Both forms of @extern@ affect all the declarations within their nested lexical scope and transition back to the previous mangling state when the lexical scope ends.

\section{Reference Type}
\CFA adds a reference type to C as an auto-dereferencing pointer.
References work very similarly to pointers: reference types are written the same way as pointer types, but each asterisk (@*@) is replaced with an ampersand (@&@); this includes cv-qualifiers (\snake{const} and \snake{volatile}) and multiple levels of reference.
Generally, references act like pointers with an implicit dereferencing operation added to each use of the variable.
These automatic dereferences may be disabled with the address-of operator (@&@).
% Check to see if these are generating errors.
\begin{minipage}{0.5\textwidth}
With references:
\begin{cfa}
int i, j;
int & ri = i;
int && rri = ri;
rri = 3;
&ri = &j;
ri = 5;
\end{cfa}
\end{minipage}
\begin{minipage}{0.5\textwidth}
With pointers:
\begin{cfa}
int i, j;
int * pi = &i;
int ** ppi = &pi;
**ppi = 3;
pi = &j;
*pi = 5;
\end{cfa}
\end{minipage}
References are intended to be used when the indirection of a pointer is required, but the address is not as important as the value and dereferencing is the common usage.
Mutable references may be assigned to by converting them to a pointer with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above.

\section{Operators}
\CFA implements operator overloading by providing special names, where operator expressions are translated into function calls using these names.
An operator name is created by taking the operator symbols and joining them with @?@s to show where the arguments go.
For example, infix multiplication is @?*?@, while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations (such as @++?@) and postfix operations (@?++@).
As an example, here are the addition and equality operators for a point type.
\begin{cfa}
point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
{
	assert(point{1, 2} + point{3, 4} == point{4, 6});
}
\end{cfa}
Note that this syntax works effectively as a textual transformation; the compiler converts all operators into functions and then resolves them normally.
This means any combination of types may be used, although nonsensical ones (like @double ?==?(point, int);@) are discouraged.
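For instance, a prefix negation operator for the same @point@ type might be written as follows (a sketch; like the operators above, it assumes @point@ has @x@ and @y@ fields):
\begin{cfa}
point -?(point a) { return point{-a.x, -a.y}; }
{
	point p = -point{1, 2}; // resolves to -?(point{1, 2})
	assert(p.x == -1 && p.y == -2);
}
\end{cfa}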
This feature is also used for all builtin operators, although those are implicitly provided by the language.
%\subsection{Constructors and Destructors}
In \CFA, constructors and destructors are operators, which means they are functions with special operator names, rather than type names as in \Cpp.
Both constructors and destructors can be implicitly called by the compiler; however, the operator names allow explicit calls.
% Placement new means that this is actually equivalent to C++.
The special name for a constructor is @?{}@, which comes from the initialization syntax in C, \eg @Example e = { ... }@.
\CFA generates a constructor call each time a variable is declared, passing the initialization arguments to the constructor.
\begin{cfa}
struct Example { ... };
void ?{}(Example & this) { ... }
{
	Example a;
	Example b = {};
}
void ?{}(Example & this, char first, int num) { ... }
{
	Example c = {'a', 2};
}
\end{cfa}
Both @a@ and @b@ are initialized with the first constructor, @b@ because of the explicit call and @a@ implicitly.
@c@ is initialized with the second constructor.
Currently, there is no general way to skip initialization.
% I don't use @= anywhere in the thesis.
% I don't like the \^{} symbol but $^\wedge$ isn't better.
Similarly, destructors use the special name @^?{}@ (the @^@ has no special meaning).
\begin{cfa}
void ^?{}(Example & this) { ... }
{
	Example d;
	^?{}(d);
	Example e;
} // Implicit call of ^?{}(e);
\end{cfa}
Whenever a type is defined, \CFA creates a default zero-argument constructor, a copy constructor, a series of argument-per-field constructors and a destructor.
All user constructors are defined after this.

\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are defined over multiple types.
\CFA polymorphic declarations serve the same role as \Cpp templates or Java generics.
The ``parametric'' means the polymorphism is accomplished by passing the required operations as \emph{parameters} at the call site, and these parameters are used in the function to differentiate among the types the function operates on.
Polymorphic declarations start with a universal @forall@ clause that goes before the standard (monomorphic) declaration.
These declarations have the same syntax except they may use the universal type names introduced by the @forall@ clause.
For example, the following is a polymorphic identity function that works on any type @T@:
\begin{cfa}
forall( T )
T identity( T val ) { return val; }

int forty_two = identity( 42 );
char capital_a = identity( 'A' );
\end{cfa}
Each use of a polymorphic declaration resolves its polymorphic parameters (in this case, just @T@) to concrete types (@int@ in the first use and @char@ in the second).
To allow a polymorphic function to be separately compiled, the type @T@ must be constrained by the operations used on @T@ in the function body.
The @forall@ clause is augmented with a list of polymorphic variables (local type names) and assertions (constraints), which represent the required operations on those types used in a function, \eg:
\begin{cfa}
forall( T | { void do_once(T); } )
void do_twice(T value) {
	do_once(value);
	do_once(value);
}
\end{cfa}
A polymorphic function can be used in the same way as a normal function.
The polymorphic variables are filled in with concrete types and the assertions are checked.
An assertion is checked by verifying that each assertion operation (with all the variables replaced with the concrete types from the arguments) is defined at the call site.
\begin{cfa}
void do_once(int i) { ... }

int i;
do_twice(i);
\end{cfa}
Any value with a type fulfilling the assertion may be passed as an argument to a @do_twice@ call.
Note, a function named @do_once@ is not required in the scope of @do_twice@ to compile it, unlike \Cpp template expansion.
Furthermore, call-site inferencing allows local replacement of the specific parametric functions needed for a call.
\begin{cfa}
void do_once(double y) { ... }

int quadruple(int x) {
	void do_once(int & y) { y = y * 2; }
	do_twice(x);
	return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the argument @x@.
It then looks for the most specific definition matching the assertion, which is the nested integer @do_once@ defined within the function.
The matched assertion function is then passed as a function pointer to @do_twice@ and called within it.
The global definition of @do_once@ is ignored; however, if @quadruple@ took a @double@ argument, then the global definition would be used instead, as it would then be a better match~\cite{Moss19}.
To avoid typing long lists of assertions, constraints can be collected into a convenient package called a @trait@, which can then be used in an assertion instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
	void do_once(T);
}
\end{cfa}
and the @forall@ list in the previous example is replaced with the trait.
\begin{cfa}
forall(dtype T | done_once(T))
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both functions and variables; traits are usually used to create a shorthand for, and give descriptive names to, common groupings of assertions describing a certain functionality, like @summable@, @listable@, \etc.
Polymorphic structures and unions are defined by qualifying an aggregate type with @forall@.
The type variables work the same way, except they are used in field declarations instead of parameters, returns and local variable declarations.
\begin{cfa}
forall(dtype T)
struct node {
	node(T) * next;
	T * data;
};

node(int) inode;
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic type usage.
Like \Cpp template usage, a polymorphic type usage must specify a type parameter.
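As a sketch of how the generic @node@ type might be used (a hypothetical example; the values are purely illustrative):
\begin{cfa}
int value = 42;
node(int) tail = { 0p, &value }; // 0p is the null-pointer constant
node(int) head = { &tail, 0p };
assert(*head.next->data == 42);
\end{cfa}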
There are many other polymorphism features in \CFA but these are the ones used by the exception system.

\section{Control Flow}
\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
The two features that interact with the exception system are @coroutine@ and @thread@; they and their supporting constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not required to finish execution when control is handed back to the caller.
Instead, they may suspend execution at any time and be resumed later at the point of last suspension.
Coroutines are not concurrent, but they share some similarities and common underpinnings with threads, so they are combined with the \CFA threading library.
% I had mention of generators, but they don't actually matter here.
In \CFA, a coroutine is created using the @coroutine@ keyword, which is an aggregate type like @struct@, except the structure is implicitly modified by the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is restricted by the type system to types that provide this special trait.
The coroutine structure acts as the interface between callers and the coroutine, and its fields are used to pass information in and out of coroutine interface functions.
Here is a simple example where a single field is used to pass (communicate) the next number in a sequence.
\begin{cfa}
coroutine CountUp {
	unsigned int next;
};

CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine object and returns @void@.
%[numbers=left] Why numbers on this one?
\begin{cfa}
void main(CountUp & this) {
	for (unsigned int next = 0 ; true ; ++next) {
		this.next = next;
		suspend;$\label{suspend}$
	}
}
\end{cfa}
In this function, or functions called by this function (helper functions), the @suspend@ statement is used to return execution to the coroutine's caller without terminating the coroutine's function.
A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
The first resume calls the @main@ function at the top.
Thereafter, resume calls continue a coroutine in the last suspended function after the @suspend@ statement.
In this case there is only one @suspend@ and, hence, the difference between subsequent calls is the state of variables inside the function and the coroutine object.
The return value of @resume@ is a reference to the coroutine, to make it convenient to access fields of the coroutine in the same expression.
Here is a simple example in a helper function:
\begin{cfa}
unsigned int get_next(CountUp & this) {
	return resume(this).next;
}
\end{cfa}
When the main function returns, the coroutine halts and can no longer be resumed.

\subsection{Monitor and Mutex Parameter}
Concurrency does not guarantee ordering; without ordering, results are non-deterministic.
To claw back ordering, \CFA uses monitors and @mutex@ (mutual exclusion) parameters.
A monitor is another kind of aggregate, where the compiler implicitly inserts a lock and instances are compatible with @mutex@ parameters.
A function that requires deterministic (ordered) execution acquires mutual exclusion on a monitor object by qualifying an object reference parameter with the @mutex@ qualifier.
\begin{cfa}
void example(MonitorA & mutex argA, MonitorB & mutex argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of the mutex parameters without deadlock.
This semantics means all functions with the same mutex type(s) are part of a critical section for objects of that type and only one runs at a time.
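For instance, a simple counter monitor might be sketched as follows (a hypothetical example; @Counter@ and its functions are not part of the \CFA library):
\begin{cfa}
monitor Counter { int count; };
void ?{}(Counter & this) { this.count = 0; }
void increment(Counter & mutex this) { this.count += 1; }
int get(Counter & mutex this) { return this.count; }
\end{cfa}
Because @increment@ and @get@ both take the @Counter@ as a @mutex@ parameter, they form a critical section for a given @Counter@ object: concurrent calls on the same object run one at a time.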
\subsection{Thread}
Functions, generators and coroutines are sequential, so there is only a single (but potentially sophisticated) execution path in a program.
Threads introduce multiple execution paths that continue independently.
For threads to work safely with objects requires mutual exclusion using monitors and mutex parameters.
For threads to work safely with other threads requires mutual exclusion in the form of a communication rendezvous, which also supports internal synchronization as for mutex objects.
For exceptions, only two basic thread operations are important: fork and join.
Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
	const char * input;
	int result;
};

void main(StringWorker & this) {
	const char * localCopy = this.input;
	int result = 0;
	// ... do some work, perhaps hashing the string ...
	this.result = result;
}

{
	StringWorker stringworker; // fork thread running in "main"
} // Implicit call to join(stringworker), waits for completion.
\end{cfa}
The thread main is where a new thread starts execution after a fork operation and then the thread continues executing until it is finished.
If another thread joins with an executing thread, it waits until the executing main completes execution.
In other words, everything a thread does is between a fork and join.
From the outside, this behaviour is accomplished through creation and destruction of a thread object.
Implicitly, fork happens after a thread object's constructor is run and join happens before the destructor runs.
Join can also be specified explicitly using the @join@ function to wait for a thread's completion independently from its deallocation (\ie destructor call).
If @join@ is called explicitly, the destructor does not implicitly join.
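For example, an explicit join might be used when the result is needed before the thread object goes out of scope (a sketch building on the @StringWorker@ type above; the constructor shown is hypothetical):
\begin{cfa}
void ?{}(StringWorker & this, const char * input) { this.input = input; }
{
	StringWorker worker = {"some input"}; // constructor runs, then implicit fork
	// ... concurrent work in this thread ...
	join(worker);                         // wait for the worker's main to finish
	int r = worker.result;                // safe to read after the join
}                                         // destructor does not join again
\end{cfa}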