\chapter{Implementation} \label{c:implement} % Local Helpers: \newcommand\transformline[1][becomes...]{ \hrulefill#1\hrulefill \medskip } The implementation work for this thesis covers the two components: virtual system and exceptions. Each component is discussed in detail. \section{Virtual System} \label{s:VirtualSystem} % Virtual table rules. Virtual tables, the pointer to them and the cast. While the \CFA virtual system currently has only two public features, virtual cast and virtual tables, substantial structure is required to support them, and provide features for exception handling and the standard library. \subsection{Virtual Type} A virtual type~(see \autoref{s:virtuals}) has a pointer to a virtual table, called the \emph{virtual-table pointer}, which binds each instance of a virtual type to a virtual table. Internally, the field is called \snake{virtual_table} and is fixed after construction. This pointer is also the table's id and how the system accesses the virtual table and the virtual members there. It is always the first field in the structure so that its location is always known. % We have no special rules for these constructors. Virtual table pointers are passed to the constructors of virtual types as part of field-by-field construction. \subsection{Type ID} Every virtual type has a unique ID. These are used in type equality, to check if the representation of two values are the same, and to access the type's type information. This uniqueness means across a program composed of multiple translation units (TU), not uniqueness across all programs or even across multiple processes on the same machine. Our approach for program uniqueness is using a static declaration for each type ID, where the run-time storage address of that variable is guaranteed to be unique during program execution. The type ID storage can also be used for other purposes, and is used for type information. The problem is that a type ID may appear in multiple TUs that compose a program (see \autoref{ss:VirtualTable}), so the initial solution would seem to be make it external in each translation unit. However, the type ID must have a declaration in (exactly) one of the TUs to create the storage. No other declaration related to the virtual type has this property, so doing this through standard C declarations would require the user to do it manually. Instead, the linker is used to handle this problem. % I did not base anything off of C++17; they are solving the same problem. A new feature has been added to \CFA for this purpose, the special attribute \snake{cfa_linkonce}, which uses the special section @.gnu.linkonce@. When used as a prefix (\eg @.gnu.linkonce.example@), the linker does not combine these sections, but instead discards all but one with the same full name. So, each type ID must be given a unique section name with the \snake{linkonce} prefix. Luckily, \CFA already has a way to get unique names, the name mangler. For example, this could be written directly in \CFA: \begin{cfa} __attribute__((cfa_linkonce)) void f() {} \end{cfa} This is translated to: \begin{cfa} __attribute__((section(".gnu.linkonce._X1fFv___1"))) void _X1fFv___1() {} \end{cfa} This is done internally to access the name mangler. This attribute is useful for other purposes, any other place a unique instance required, and should eventually be made part of a public and stable feature in \CFA. \subsection{Type Information} There is data stored at the type ID's declaration, the type information. The type information currently is only the parent's type ID or, if the type has no parent, the null pointer. The ancestors of a virtual type are found by traversing type IDs through the type information. An example using helper macros looks like: \begin{cfa} struct INFO_TYPE(TYPE) { INFO_TYPE(PARENT) const * parent; }; __attribute__((cfa_linkonce)) INFO_TYPE(TYPE) const INFO_NAME(TYPE) = { &INFO_NAME(PARENT), }; \end{cfa} Type information is constructed as follows: \begin{enumerate}[nosep] \item Use the type's name to generate a name for the type information structure, which is saved so it can be reused. \item Generate a new structure definition to store the type information. The layout is the same in each case, just the parent's type ID, but the types used change from instance to instance. The generated name is used for both this structure and, if relevant, the parent pointer. If the virtual type is polymorphic then the type information structure is polymorphic as well, with the same polymorphic arguments. \item A separate name for instances is generated from the type's name. \item The definition is generated and initialized. The parent ID is set to the null pointer or to the address of the parent's type information instance. Name resolution handles the rest. \item \CFA's name mangler does its regular name mangling encoding the type of the declaration into the instance name. This process gives a completely unique name including different instances of the same polymorphic type. \end{enumerate} Writing that code manually, with helper macros for the early name mangling, would look like this: \begin{cfa} struct INFO_TYPE(TYPE) { INFO_TYPE(PARENT) const * parent; }; __attribute__((cfa_linkonce)) INFO_TYPE(TYPE) const INFO_NAME(TYPE) = { &INFO_NAME(PARENT), }; \end{cfa} \begin{comment} \subsubsection{\lstinline{cfa\_linkonce} Attribute} % I just realized: This is an extension of the inline keyword. % An extension of C's at least, it is very similar to C++'s. Another feature added to \CFA is a new attribute: \texttt{cfa\_linkonce}. This attribute is attached to an object or function definition (any global declaration with a name and a type) allowing it to be defined multiple times. All matching definitions mush have the link-once attribute and their implementations should be identical as well. A single definition with the attribute can be included in a header file as if it was a forward declaration, except no definition is required. This technique is used for type ID instances. A link-once definition is generated each time the structure is seen. This will result in multiple copies but the link-once attribute ensures all but one are removed for a unique instance. Internally, @cfa_linkonce@ is replaced with @section(".gnu.linkonce.NAME")@ where \texttt{NAME} is replaced by the mangled name of the object. Any other @section@ attributes are removed from the declaration. The prefix \texttt{.gnu.linkonce} in section names is recognized by the linker. If two of these sections appear with the same name, including everything that comes after the special prefix, then only one is used and the other is discarded. \end{comment} \subsection{Virtual Table} \label{ss:VirtualTable} Each virtual type has a virtual table type that stores its type ID and virtual members. An instance of a virtual type is bound to a virtual table instance, which have the values of the virtual members. Both the layout of the fields (in the virtual table type) and their value (in the virtual table instance) are decided by the rules given below. The layout always comes in three parts (see \autoref{f:VirtualTableLayout}). The first section is just the type ID at the head of the table. It is always there to ensure that it can be found even when the accessing code does not know which virtual type it has. The second section is all the virtual members of the parent, in the same order as they appear in the parent's virtual table. Note that the type may change slightly as references to the ``this" change. This is limited to inside pointers/references and via function pointers so that the size (and hence the offsets) are the same. The third section is similar to the second except that it is the new virtual members introduced at this level in the hierarchy. \begin{figure} \begin{center} \input{vtable-layout} \end{center} \caption{Virtual Table Layout} \label{f:VirtualTableLayout} \end{figure} The first and second sections together mean that every virtual table has a prefix that has the same layout and types as its parent virtual table. This, combined with the fixed offset to the virtual table pointer, means that for any virtual type, it is always safe to access its virtual table and, from there, it is safe to check the type ID to identify the exact type of the underlying object, access any of the virtual members and pass the object to any of the method-like virtual members. When a virtual table is declared, the user decides where to declare it and its name. The initialization of the virtual table is entirely automatic based on the context of the declaration. The type ID is always fixed, with each virtual table type having exactly one possible type ID. The virtual members are usually filled in by type resolution. The best match for a given name and type at the declaration site is used. There are two exceptions to that rule: the @size@ field, the type's size, is set using a @sizeof@ expression, and the @align@ field, the type's alignment, is set using an @alignof@ expression. Most of these tools are already inside the compiler. Using simple code transformations early on in compilation allows most of that work to be handed off to the existing tools. \autoref{f:VirtualTableTransformation} shows an example transformation; this example shows an exception virtual table. It also shows the transformation on the full declaration. For a forward declaration, the @extern@ keyword is preserved and the initializer is not added. \begin{figure}[htb] \begin{cfa} vtable(example_type) example_name; \end{cfa} \transformline % Check mangling. \begin{cfa} const struct example_type_vtable example_name = { .__cfavir_typeid : &__cfatid_example_type, .size : sizeof(example_type), .copy : copy, .^?{} : ^?{}, .msg : msg, }; \end{cfa} \caption{Virtual Table Transformation} \label{f:VirtualTableTransformation} \end{figure} \subsection{Concurrency Integration} Coroutines and threads need instances of @CoroutineCancelled@ and @ThreadCancelled@ respectively to use all of their functionality. When a new data type is declared with @coroutine@ or @thread@, a forward declaration for the instance is created as well. The definition of the virtual table is created at the definition of the main function. These transformations are shown through code re-writing in \autoref{f:CoroutineTypeTransformation} and \autoref{f:CoroutineMainTransformation}. Threads use the same pattern, with some names and types changed. In both cases, the original declaration is not modified, only new ones are added. \begin{figure}[htb] \begin{cfa} coroutine Example { // fields }; \end{cfa} \transformline[appends...] \begin{cfa} __attribute__((cfa_linkonce)) struct __cfatid_struct_CoroutineCancelled(Example) __cfatid_CoroutineCancelled = { &EXCEPTION_TYPE_ID, }; extern CoroutineCancelled_vtable _default_vtable_object_declaration; extern CoroutineCancelled_vtable & _default_vtable; \end{cfa} \caption{Coroutine Type Transformation} \label{f:CoroutineTypeTransformation} \end{figure} \begin{figure}[htb] \begin{cfa} void main(Example & this) { // body } \end{cfa} \transformline[appends...] \begin{cfa} CoroutineCancelled_vtable _default_vtable_object_declaration = { __cfatid_CoroutineCancelled, // Virtual member initialization. }; CoroutineCancelled_vtable & _default_vtable = &_default_vtable_object_declaration; \end{cfa} \caption{Coroutine Main Transformation} \label{f:CoroutineMainTransformation} \end{figure} \subsection{Virtual Cast} Virtual casts are implemented as a function call that does the subtype check and a C coercion-cast to do the type conversion. % The C-cast is just to make sure the generated code is correct so the rest of % the section is about that function. The function is implemented in the standard library and has the following signature: \begin{cfa} void * __cfa__virtual_cast( struct __cfavir_type_id * parent, struct __cfavir_type_id * const * child ); \end{cfa} The type ID for the target type of the virtual cast is passed in as @parent@ and the cast target is passed in as @child@. The generated C code wraps both arguments and the result with type casts. There is also an internal check inside the compiler to make sure that the target type is a virtual type. % It also checks for conflicting definitions. The virtual cast either returns the original pointer or the null pointer as the new type. The function does the parent check and returns the appropriate value. The parent check is a simple linear search of the child's ancestors using the type information. \section{Exceptions} % The implementation of exception types. Creating exceptions can be roughly divided into two parts: the exceptions themselves and the virtual system interactions. Creating an exception type is just a matter of prepending the field with the virtual table pointer to the list of the fields (see \autoref{f:ExceptionTypeTransformation}). \begin{figure}[htb] \begin{cfa} exception new_exception { // EXISTING FIELDS }; \end{cfa} \transformline \begin{cfa} struct new_exception { struct new_exception_vtable const * virtual_table; // EXISTING FIELDS }; \end{cfa} \caption{Exception Type Transformation} \label{f:ExceptionTypeTransformation} \end{figure} The integration between exceptions and the virtual system is a bit more complex simply because of the nature of the virtual system prototype. The primary issue is that the virtual system has no way to detect when it should generate any of its internal types and data. This is handled by the exception code, which tells the virtual system when to generate its components. All types associated with a virtual type, the types of the virtual table and the type ID, are generated when the virtual type (the exception) is first found. The type ID (the instance) is generated with the exception, if it is a monomorphic type. However, if the exception is polymorphic, then a different type ID has to be generated for every instance. In this case, generation is delayed until a virtual table is created. % There are actually some problems with this, which is why it is not used % for monomorphic types. When a virtual table is created and initialized, two functions are created to fill in the list of virtual members. The first is the @copy@ function that adapts the exception's copy constructor to work with pointers, avoiding some issues with the current copy constructor interface. Second is the @msg@ function that returns a C-string with the type's name, including any polymorphic parameters. \section{Unwinding} % Adapt the unwind chapter, just describe the sections of libunwind used. % Mention that termination and cancellation use it. Maybe go into why % resumption doesn't as well. % Many modern languages work with an internal stack that function push and pop % their local data to. Stack unwinding removes large sections of the stack, % often across functions. Stack unwinding is the process of removing stack frames (activations) from the stack. On function entry and return, unwinding is handled directly by the call/return code embedded in the function. % Discussing normal stack unwinding: Usually, the stack-frame size is known statically based on parameter and local variable declarations. Even for a dynamic stack-size, the information to determine how much of the stack has to be removed is still contained within the function. Allocating/deallocating stack space is usually an $O(1)$ operation achieved by bumping the hardware stack-pointer up or down as needed. Constructing/destructing values within a stack frame has a similar complexity but larger constants. % Discussing multiple frame stack unwinding: Unwinding across multiple stack frames is more complex, because that information is no longer contained within the current function. With separate compilation, a function does not know its callers nor their frame layout. Even using the return address, that information is encoded in terms of actions in code, intermixed with the actions required to finish the function. Without changing the main code path it is impossible to select one of those two groups of actions at the return site. The traditional unwinding mechanism for C is implemented by saving a snapshot of a function's state with @setjmp@ and restoring that snapshot with @longjmp@. This approach bypasses the need to know stack details by simply resetting to a snapshot of an arbitrary but existing function frame on the stack. It is up to the programmer to ensure the snapshot is valid when it is reset and that all required cleanup from the unwound stacks is performed. Because it does not automate or check any of this cleanup, it can be easy to make mistakes and always must be handled manually. With respect to the extra work in the surrounding code, many languages define cleanup actions that must be taken when certain sections of the stack are removed, such as when the storage for a variable is removed from the stack, possibly requiring a destructor call, or when a try statement with a finally clause is (conceptually) popped from the stack. None of these cases should be handled by the user -- that would contradict the intention of these features -- so they need to be handled automatically. To safely remove sections of the stack, the language must be able to find and run these cleanup actions even when removing multiple functions unknown at the beginning of the unwinding. One of the most popular tools for stack management is libunwind, a low-level library that provides tools for stack walking, handler execution, and unwinding. What follows is an overview of all the relevant features of libunwind needed for this work. Following that is the description of the \CFA code that uses libunwind to implement termination. \subsection{libunwind Usage} Libunwind, accessed through @unwind.h@ on most platforms, is a C library that provides \Cpp-style stack-unwinding. Its operation is divided into two phases: search and cleanup. The dynamic target search -- phase 1 -- is used to scan the stack and decide where unwinding should stop (but no unwinding occurs). The cleanup -- phase 2 -- does the unwinding and also runs any cleanup code. To use libunwind, each function must have a personality function and a Language Specific Data Area (LSDA). The LSDA has the unique information for each function to tell the personality function where a function is executing, its current stack frame, and what handlers should be checked. Theoretically, the LSDA can contain any information but conventionally it is a table with entries representing regions of a function and what has to be done there during unwinding. These regions are bracketed by instruction addresses. If the instruction pointer is within a region's start/end, then execution is currently executing in that region. Regions are used to mark out the scopes of objects with destructors and try blocks. % Libunwind actually does very little, it simply moves down the stack from % function to function. Most of the actions are implemented by the personality % function which libunwind calls on every function. Since this is shared across % many functions or even every function in a language it will need a bit more % information. The GCC compilation flag @-fexceptions@ causes the generation of an LSDA and attaches a personality function to each function. In plain C (which \CFA currently compiles down to) this flag only handles the cleanup attribute: %\label{code:cleanup} \begin{cfa} void clean_up( int * var ) { ... } int avar __attribute__(( cleanup(clean_up) )); \end{cfa} The attribute is used on a variable and specifies a function, in this case @clean_up@, run when the variable goes out of scope. This feature is enough to mimic destructors, but not try statements that affect the unwinding. To get full unwinding support, all of these features must be handled directly in assembly and assembler directives; particularly the cfi directives \snake{.cfi_lsda} and \snake{.cfi_personality}. \subsection{Personality Functions} Personality functions have a complex interface specified by libunwind. This section covers some of the important parts of the interface. A personality function can perform different actions depending on how it is called. \begin{lstlisting} typedef _Unwind_Reason_Code (*_Unwind_Personality_Fn) ( _Unwind_Action action, _Unwind_Exception_Class exception_class, _Unwind_Exception * exception, struct _Unwind_Context * context); \end{lstlisting} The @action@ argument is a bitmask of possible actions: \begin{enumerate}[topsep=5pt] \item @_UA_SEARCH_PHASE@ specifies a search phase and tells the personality function to check for handlers. If there is a handler in a stack frame, as defined by the language, the personality function returns @_URC_HANDLER_FOUND@; otherwise it return @_URC_CONTINUE_UNWIND@. \item @_UA_CLEANUP_PHASE@ specifies a cleanup phase, where the entire frame is unwound and all cleanup code is run. The personality function does whatever cleanup the language defines (such as running destructors/finalizers) and then generally returns @_URC_CONTINUE_UNWIND@. \item \begin{sloppypar} @_UA_HANDLER_FRAME@ specifies a cleanup phase on a function frame that found a handler. The personality function must prepare to return to normal code execution and return @_URC_INSTALL_CONTEXT@. \end{sloppypar} \item @_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs the cleanup phase and uses a different means to decide when to stop (see \autoref{s:ForcedUnwind}). \end{enumerate} The @exception_class@ argument is a copy of the \code{C}{exception}'s @exception_class@ field, which is a number that identifies the EHM that created the exception. The \code{C}{exception} argument is a pointer to a user provided storage object. It has two public fields: the @exception_class@, which is described above, and the @exception_cleanup@ function. The cleanup function is used by the EHM to clean up the exception. If it should need to be freed at an unusual time, it takes an argument that says why it had to be cleaned up. The @context@ argument is a pointer to an opaque type passed to helper functions called inside the personality function. The return value, @_Unwind_Reason_Code@, is an enumeration of possible messages that can be passed several places in libunwind. It includes a number of messages for special cases (some of which should never be used by the personality function) and error codes. However, unless otherwise noted, the personality function always returns @_URC_CONTINUE_UNWIND@. \subsection{Raise Exception} Raising an exception is the central function of libunwind and it performs two-staged unwinding. \begin{cfa} _Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *); \end{cfa} First, the function begins the search phase, calling the personality function of the most recent stack frame. It continues to call personality functions traversing the stack from newest to oldest until a function finds a handler or the end of the stack is reached. In the latter case, @_Unwind_RaiseException@ returns @_URC_END_OF_STACK@. Second, when a handler is matched, @_Unwind_RaiseException@ moves to the cleanup phase and walks the stack a second time. Once again, it calls the personality functions of each stack frame from newest to oldest. This pass stops at the stack frame containing the matching handler. If that personality function has not installed a handler, it is an error. If an error is encountered, @_Unwind_RaiseException@ returns either @_URC_FATAL_PHASE1_ERROR@ or @_URC_FATAL_PHASE2_ERROR@ depending on when the error occurred. \subsection{Forced Unwind} \label{s:ForcedUnwind} Forced Unwind is the other central function in libunwind. \begin{cfa} _Unwind_Reason_Code _Unwind_ForcedUnwind(_Unwind_Exception *, _Unwind_Stop_Fn, void *); \end{cfa} It also unwinds the stack but it does not use the search phase. Instead, another function, the stop function, is used to stop searching. The exception is the same as the one passed to @_Unwind_RaiseException@. The extra arguments are the stop function and the stop parameter. The stop function has a similar interface as a personality function, except it is also passed the stop parameter. \begin{lstlisting} typedef _Unwind_Reason_Code (*_Unwind_Stop_Fn)( _Unwind_Action action, _Unwind_Exception_Class exception_class, _Unwind_Exception * exception, struct _Unwind_Context * context, void * stop_parameter); \end{lstlisting} The stop function is called at every stack frame before the personality function is called and then once more after all frames of the stack are unwound. Each time it is called, the stop function should return @_URC_NO_REASON@ or transfer control directly to other code outside of libunwind. The framework does not provide any assistance here. \begin{sloppypar} Its arguments are the same as the paired personality function. The actions \snake{_UA_CLEANUP_PHASE} and \snake{_UA_FORCE_UNWIND} are always set when it is called. Beyond the libunwind standard, both GCC and Clang add an extra action on the last call at the end of the stack: \snake{_UA_END_OF_STACK}. \end{sloppypar} \section{Exception Context} % Should I have another independent section? % There are only two things in it, top_resume and current_exception. How it is % stored changes depending on whether or not the thread-library is linked. The exception context is global storage used to maintain data across different exception operations and to communicate among different components. Each stack must have its own exception context. In a sequential \CFA program, there is only one stack with a single global exception-context. However, when the library @libcfathread@ is linked, there are multiple stacks and each needs its own exception context. The current exception context should be retrieved by calling the function \snake{this_exception_context}. For sequential execution, this function is defined as a weak symbol in the \CFA system-library, @libcfa@. When a \CFA program is concurrent, it links with @libcfathread@, where this function is defined with a strong symbol replacing the sequential version. The sequential @this_exception_context@ returns a hard-coded pointer to the global exception context. The concurrent version adds the exception context to the data stored at the base of each stack. When @this_exception_context@ is called, it retrieves the active stack and returns the address of the context saved there. \section{Termination} % Memory management & extra information, the custom function used to implement % catches. Talk about GCC nested functions. \CFA termination exceptions use libunwind heavily because they match \Cpp exceptions closely. The main complication for \CFA is that the compiler generates C code, making it very difficult to generate the assembly to form the LSDA for try blocks or destructors. \subsection{Memory Management} The first step of a termination raise is to copy the exception into memory managed by the exception system. Currently, the system uses @malloc@, rather than reserved memory or the stack top. The EHM manages memory for the exception as well as memory for libunwind and the system's own per-exception storage. \begin{figure} \centering \input{exception-layout} \caption{Exception Layout} \label{f:ExceptionLayout} \end{figure} Exceptions are stored in variable-sized blocks (see \autoref{f:ExceptionLayout}). The first component is a fixed-sized data structure that contains the information for libunwind and the exception system. The second component is an area of memory big enough to store the exception. Macros with pointer arthritic and type cast are used to move between the components or go from the embedded @_Unwind_Exception@ to the entire node. Multiple exceptions can exist at the same time because exceptions can be raised inside handlers, destructors and finally blocks. Figure~\vref{f:MultipleExceptions} shows a program that has multiple exceptions active at one time. Each time an exception is thrown and caught the stack unwinds and the finally clause runs. This handler throws another exception (until @num_exceptions@ gets high enough), which must be allocated. The previous exceptions may not be freed because the handler/catch clause has not been run. Therefore, the EHM must keep all unhandled exceptions alive while it allocates exceptions for new throws. \begin{figure} \centering \newsavebox{\codeBox} \newsavebox{\stackBox} \begin{lrbox}{\codeBox} \begin{cfa} unsigned num_exceptions = 0; void throws() { try { try { ++num_exceptions; throw (Example){table}; } finally { if (num_exceptions < 3) { throws(); } } } catch (exception_t *) { --num_exceptions; } } int main() { throws(); } \end{cfa} \end{lrbox} \begin{lrbox}{\stackBox} \begin{lstlisting} | finally block (Example) | try block throws() | finally block (Example) | try block throws() | finally block (Example) | try block throws() main() \end{lstlisting} \end{lrbox} {\usebox\codeBox} \hspace{25pt} {\usebox\stackBox} \caption{Multiple Exceptions} \label{f:MultipleExceptions} \end{figure} All exceptions are stored in nodes, which are then linked together in lists one list per stack, with the list head stored in the exception context. Within each linked list, the most recently thrown exception is at the head, followed by older thrown exceptions. This format allows exceptions to be thrown, while a different exception is being handled. The exception at the head of the list is currently being handled, while other exceptions wait for the exceptions before them to be handled and removed. The virtual members in the exception's virtual table provide the size of the exception, the copy function, and the free function, so they are specific to an exception type. The size and copy function are used immediately to copy an exception into managed memory. After the exception is handled, the free function is used to clean up the exception and then the entire node is passed to @free@, returning the memory back to the heap. \subsection{Try Statements and Catch Clauses} The try statement with termination handlers is complex because it must compensate for the C code-generation versus proper assembly-code generated from \CFA. Libunwind requires an LSDA and personality function for control to unwind across a function. The LSDA in particular is hard to mimic in generated C code. The workaround is a function called \snake{__cfaehm_try_terminate} in the standard \CFA library. The contents of a try block and the termination handlers are converted into nested functions. These are then passed to the try terminate function and it calls them, appropriately. Because this function is known and fixed (and not an arbitrary function that happens to contain a try statement), its LSDA can be generated ahead of time. Both the LSDA and the personality function for \snake{__cfaehm_try_terminate} are set ahead of time using embedded assembly. This assembly code is handcrafted using C @asm@ statements and contains enough information for the single try statement the function represents. The three functions passed to try terminate are: \begin{description} \item[try function:] This function is the try block. It is where all the code from inside the try block is placed. It takes no parameters and has no return value. This function is called during regular execution to run the try block. \item[match function:] This function is called during the search phase and decides if a catch clause matches the termination exception. It is constructed from the conditional part of each handler and runs each check, top to bottom, in turn, to see if the exception matches this handler. The match is performed in two steps: first, a virtual cast is used to check if the raised exception is an instance of the declared exception type or one of its descendant types, and then the condition is evaluated, if present. The match function takes a pointer to the exception and returns 0 if the exception is not handled here. Otherwise, the return value is the ID of the handler that matches the exception. \item[handler function:] This function handles the exception, and contains all the code from the handlers in the try statement, joined with a switch statement on the handler's id. It takes a pointer to the exception and the handler's id and returns nothing. It is called after the cleanup phase. \end{description} All three functions are created with GCC nested functions. GCC nested functions can be used to create closures; in other words, functions that can refer to variables in their lexical scope even though those variables are part of a different function. This approach allows the functions to refer to all the variables in scope for the function containing the @try@ statement. These nested functions and all other functions besides @__cfaehm_try_terminate@ in \CFA use the GCC personality function and the @-fexceptions@ flag to generate the LSDA. Using this pattern, \CFA implements destructors with the cleanup attribute. \autoref{f:TerminationTransformation} shows the pattern used to transform a \CFA try statement with catch clauses into the appropriate C functions. \begin{figure} \begin{cfa} try { // TRY BLOCK } catch (Exception1 * name1 ; check(name1)) { // CATCH BLOCK 1 } catch (Exception2 * name2) { // CATCH BLOCK 2 } \end{cfa} \transformline \begin{cfa} void try(void) { // TRY BLOCK } int match(exception_t * __exception_inst) { { Exception1 * name1; if (name1 = (virtual Exception1 *)__exception_inst && check(name1)) { return 1; } } { Exception2 * name2; if (name2 = (virtual Exception2 *)__exception_inst) { return 2; } } return 0; } void catch(exception_t * __exception_inst, int __handler_index) { switch (__handler_index) { case 1: { Exception1 * name1 = (virtual Exception1 *)__exception_inst; // CATCH BLOCK 1 } return; case 2: { Exception2 * name2 = (virtual Exception2 *)__exception_inst; // CATCH BLOCK 2 } return; } } { __cfaehm_try_terminate(try, catch, match); } \end{cfa} \caption{Termination Transformation} \label{f:TerminationTransformation} \end{figure} \section{Resumption} % The stack-local data, the linked list of nodes. Resumption is simpler to implement than termination because there is no stack unwinding. Instead of storing the data in a special area using assembly, there is just a linked list of possible handlers for each stack, with each node on the list representing a try statement on the stack. The head of the list is stored in the exception context. The nodes are stored in order, with the more recent try statements closer to the head of the list. Instead of traversing the stack, resumption handling traverses the list. At each node, the EHM checks to see if the try statement the node represents can handle the exception. If it can, then the exception is handled and the operation finishes; otherwise, the search continues to the next node. If the search reaches the end of the list without finding a try statement with a handler clause that can handle the exception, the default handler is executed. If the default handler returns, control continues after the raise statement. Each node has a handler function that does most of the work. The handler function is passed the raised exception and returns true if the exception is handled and false otherwise. The handler function checks each of its internal handlers in order, top-to-bottom, until it finds a match. If a match is found that handler is run, after which the function returns true, ignoring all remaining handlers. If no match is found the function returns false. The match is performed in two steps. First a virtual cast is used to see if the raised exception is an instance of the declared exception type or one of its descendant types, if so, then the second step is to see if the exception passes the custom predicate if one is defined. % You need to make sure the type is correct before running the predicate % because the predicate can depend on that. \autoref{f:ResumptionTransformation} shows the pattern used to transform a \CFA try statement with catchResume clauses into the appropriate C functions. \begin{figure} \begin{cfa} try { // TRY BLOCK } catchResume (Exception1 * name1 ; check(name1)) { // CATCH BLOCK 1 } catchResume (Exception2 * name2) { // CATCH BLOCK 2 } \end{cfa} \transformline \begin{cfa} bool handle(exception_t * __exception_inst) { { Exception1 * name1; if (name1 = (virtual Exception1 *)__exception_inst && check(name1)) { // CATCH BLOCK 1 return 1; } } { Exception2 * name2; if (name2 = (virtual Exception2 *)__exception_inst) { // CATCH BLOCK 2 return 2; } } return false; } struct __try_resume_node __resume_node __attribute__((cleanup( __cfaehm_try_resume_cleanup ))); __cfaehm_try_resume_setup( &__resume_node, handler ); \end{cfa} \caption{Resumption Transformation} \label{f:ResumptionTransformation} \end{figure} % Recursive Resumption Stuff: \autoref{f:ResumptionMarking} shows search skipping (see \autoref{s:ResumptionMarking}), which ignores parts of the stack already examined, and is accomplished by updating the front of the list as the search continues. Before the handler is called at a matching node, the head of the list is updated to the next node of the current node. After the search is complete, successful or not, the head of the list is reset. % No paragraph? This mechanism means the current handler and every handler that has already been checked are not on the list while a handler is run. If a resumption is thrown during the handling of another resumption, the active handlers and all the other handlers checked up to this point are not checked again. % No paragraph? This structure also supports new handlers added while the resumption is being handled. These are added to the front of the list, pointing back along the stack -- the first one points over all the checked handlers -- and the ordering is maintained. \begin{figure} \centering \input{resumption-marking} \caption{Resumption Marking} \label{f:ResumptionMarking} \end{figure} \label{p:zero-cost} Finally, the resumption implementation has a cost for entering/exiting a try statement with @catchResume@ clauses, whereas a try statement with @catch@ clauses has zero-cost entry/exit. While resumption does not need the stack unwinding and cleanup provided by libunwind, it could use the search phase to providing zero-cost enter/exit using the LSDA. Unfortunately, there is no way to return from a libunwind search without installing a handler or raising an error. Although workarounds might be possible, they are beyond the scope of this thesis. The current resumption implementation has simplicity in its favour. % Seriously, just compare the size of the two chapters and then consider % that unwind is required knowledge for that chapter. \section{Finally} % Uses destructors and GCC nested functions. %\autoref{code:cleanup} A finally clause is handled by converting it into a once-off destructor. The code inside the clause is placed into a GCC nested-function with a unique name, and no arguments or return values. This nested function is then set as the cleanup function of an empty object that is declared at the beginning of a block placed around the context of the associated try statement, as shown in \autoref{f:FinallyTransformation}. \begin{figure} \begin{cfa} try { // TRY BLOCK } finally { // FINALLY BLOCK } \end{cfa} \transformline \begin{cfa} { void finally(void *__hook){ // FINALLY BLOCK } __attribute__ ((cleanup(finally))) struct __cfaehm_cleanup_hook __finally_hook; { // TRY BLOCK } } \end{cfa} \caption{Finally Transformation} \label{f:FinallyTransformation} \end{figure} The rest is handled by GCC. The TRY BLOCK contains the try block itself as well as all code generated for handlers. Once that code has completed, control exits the block and the empty object is cleaned up, which runs the function that contains the finally code. \section{Cancellation} % Stack selections, the three internal unwind functions. Cancellation also uses libunwind to do its stack traversal and unwinding. However, it uses a different primary function: @_Unwind_ForcedUnwind@. Details of its interface can be found in Section~\vref{s:ForcedUnwind}. The first step of cancellation is to find the cancelled stack and its type: coroutine, thread or main thread. In \CFA, a thread (the construct the user works with) is a user-level thread (point of execution) paired with a coroutine, the thread's main coroutine. The thread library also stores pointers to the main thread and the current thread. If the current thread's main and current coroutines are the same then the current stack is a thread stack, otherwise it is a coroutine stack. If the current stack is a thread stack, it is also the main thread stack if and only if the main and current threads are the same. However, if the threading library is not linked, the sequential execution is on the main stack. Hence, the entire check is skipped because the weak-symbol function is loaded. Therefore, main thread cancellation is unconditionally performed. Regardless of how the stack is chosen, the stop function and parameter are passed to the forced-unwind function. The general pattern of all three stop functions is the same: continue unwinding until the end of stack and then perform the appropriate transfer. For main stack cancellation, the transfer is just a program abort. For coroutine cancellation, the exception is stored on the coroutine's stack, and the coroutine context switches to its last resumer. The rest is handled on the backside of the resume, which checks if the resumed coroutine is cancelled. If cancelled, the exception is retrieved from the resumed coroutine, and a @CoroutineCancelled@ exception is constructed and loaded with the cancelled exception. It is then resumed as a regular exception with the default handler coming from the context of the resumption call. For thread cancellation, the exception is stored on the thread's main stack and then context switched to the scheduler. The rest is handled by the thread joiner. When the join is complete, the joiner checks if the joined thread is cancelled. If cancelled, the exception is retrieved and the joined thread, and a @ThreadCancelled@ exception is constructed and loaded with the cancelled exception. The default handler is passed in as a function pointer. If it is null (as it is for the auto-generated joins on destructor call), the default is used, which is a program abort. %; which gives the required handling on implicate join.