\chapter{Exception Features}

This chapter covers the design and user interface of the \CFA
exception-handling mechanism.

\section{Virtuals}
Virtual types and casts are not part of the exception system nor are they
required for an exception system. But an object-oriented style hierarchy is a
great way of organizing exceptions so a minimal virtual system has been added
to \CFA.

The pattern of a simple hierarchy, borrowed from object-oriented
programming, was chosen for several reasons.
The first is that it allows new exceptions to be added in user code and in
libraries independently of each other. Another is that it allows for different
levels of exception grouping (all exceptions, all IO exceptions or a
particular IO exception). It also provides a simple way of passing data
back and forth across the throw.

Virtual types and casts are not required for a basic exception-system but are
useful for advanced exception features. However, \CFA is not object-oriented so
there is no obvious concept of virtuals. Hence, to create advanced exception
features for this work, I needed to design and implement a virtual-like
system for \CFA.

% NOTE: Maybe we should put less of the rationale here.
Object-oriented languages often organize exceptions into a simple hierarchy,
\eg Java.
\begin{center}
\setlength{\unitlength}{4000sp}%
\begin{picture}(1605,612)(2011,-1951)
\put(2100,-1411){\vector(1, 0){225}}
\put(3450,-1411){\vector(1, 0){225}}
\put(3550,-1411){\line(0,-1){225}}
\put(3550,-1636){\vector(1, 0){150}}
\put(3550,-1636){\line(0,-1){225}}
\put(3550,-1861){\vector(1, 0){150}}
\put(2025,-1490){\makebox(0,0)[rb]{\LstBasicStyle{exception}}}
\put(2400,-1460){\makebox(0,0)[lb]{\LstBasicStyle{arithmetic}}}
\put(3750,-1460){\makebox(0,0)[lb]{\LstBasicStyle{underflow}}}
\put(3750,-1690){\makebox(0,0)[lb]{\LstBasicStyle{overflow}}}
\put(3750,-1920){\makebox(0,0)[lb]{\LstBasicStyle{zerodivide}}}
\end{picture}%
\end{center}
The hierarchy provides the ability to handle an exception at different degrees
of specificity (left to right). Hence, it is possible to catch a more general
exception-type in higher-level code where the implementation details are
unknown, which reduces tight coupling to the lower-level implementation.
Otherwise, low-level code changes require higher-level code changes, \eg,
changing from raising @underflow@ to @overflow@ at the low level means changing
the matching catch at the high level versus catching the general @arithmetic@
exception. In detail, each virtual type may have a parent and can have any
number of children. A type's descendants are its children and its children's
descendants. A type may not be its own descendant.

The exception hierarchy allows a handler (@catch@ clause) to match multiple
exceptions, \eg a base-type handler catches both base and derived
exception-types.
\begin{cfa}
try {
	...
} catch(arithmetic &) {
	... // handle arithmetic, underflow, overflow, zerodivide
}
\end{cfa}
Most exception mechanisms perform a linear search of the handlers and select
the first matching handler, so the order of handlers is now important because
matching is many to one.

Each virtual type needs an associated virtual table. A virtual table is a
structure with fields for all the virtual members of a type.
A virtual type has all the virtual members of its parent and can add more.
It may also update the values of the virtual members and often does.

While much of the virtual infrastructure is created, it is currently only used
internally for exception handling. The only user-level feature is the virtual
cast, which is the same as the \Cpp \lstinline[language=C++]|dynamic_cast|.
\label{p:VirtualCast}
\begin{cfa}
(virtual TYPE)EXPRESSION
\end{cfa}
Note, the syntax and semantics match a C-cast, rather than the function-like
\Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be
pointers to a virtual type.
The cast dynamically checks if the @EXPRESSION@ type is the same as or a subtype
of @TYPE@, and if true, returns a pointer to the
@EXPRESSION@ object, otherwise it returns @0p@ (null pointer).

\section{Exception}
% Leaving until later, hopefully it can talk about actual syntax instead
% of my many strange macros. Syntax aside I will also have to talk about the
% features all exceptions support.

Exceptions are defined by the trait system; there are a series of traits, and
if a type satisfies them, then it can be used as an exception. The following
is the base trait all exceptions need to match.
\begin{cfa}
trait is_exception(exceptT &, virtualT &) {
	virtualT const & get_exception_vtable(exceptT *);
};
\end{cfa}
The trait is defined over two types, the exception type and the virtual table
type. This should be one-to-one: each exception type has only one virtual
table type and vice versa. The only assertion in the trait is
@get_exception_vtable@, which takes a pointer of the exception type and
returns a reference to the virtual table type instance.

The function @get_exception_vtable@ is actually a constant function.
Regardless of the value passed in (including the null pointer) it should
return a reference to the virtual table instance for that type.
The reason it is a function instead of a constant is that it makes type
annotations easier to write as you can use the exception type instead of the
virtual table type, which usually has a mangled name.
% Also \CFA's trait system handles functions better than constants and doing
% it this way

% I did have a note about how it is the programmer's responsibility to make
% sure the function is implemented correctly. But this is true of every
% similar system I know of (except Agda's I guess) so I took it out.

\section{Raise}
\CFA provides two kinds of exception raise: termination
\see{\VRef{s:Termination}} and resumption \see{\VRef{s:Resumption}}, which are
specified with the following traits.
\begin{cfa}
trait is_termination_exception(
		exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
	void defaultTerminationHandler(exceptT &);
};
\end{cfa}
The function is required to allow a termination raise, but is only called if a
termination raise does not find an appropriate handler.

Allowing a resumption raise is similar.
\begin{cfa}
trait is_resumption_exception(
		exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
	void defaultResumptionHandler(exceptT &);
};
\end{cfa}
The function is required to allow a resumption raise, but is only called if a
resumption raise does not find an appropriate handler.

Finally, there are three convenience macros for referring to these traits:
@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.
All three traits are hard to use while naming the virtual table as it has an
internal mangled name. These macros take the exception name as their first
argument and do the mangling. They all take a second argument for polymorphic
types which is the parenthesized list of polymorphic arguments. These
arguments are passed to both the exception type and the virtual table type as
the arguments do have to match.
For example, consider a function that is polymorphic over types that have a
defined arithmetic exception:
\begin{cfa}
forall(Num | IS_EXCEPTION(Arithmetic, (Num)))
void some_math_function(Num & left, Num & right);
\end{cfa}

\subsection{Termination}
\label{s:Termination}

Termination raise, called ``throw'', is familiar and used in most programming
languages with exception handling. The semantics of termination is: search the
stack for a matching handler, unwind the stack frames to the matching handler,
execute the handler, and continue execution after the handler. Termination is
used when execution \emph{cannot} return to the throw. To continue execution,
the program must \emph{recover} in the handler from the failed (unwound)
execution at the raise to safely proceed after the handler.

A termination raise is started with the @throw@ statement:
\begin{cfa}
throw EXPRESSION;
\end{cfa}
The expression must return a reference to a termination exception, where the
termination exception is any type that satisfies @is_termination_exception@
at the call site.
Through \CFA's trait system the functions in the traits are passed into the
throw code. A new @defaultTerminationHandler@ can be defined in any scope to
change the throw's behavior (see below).

At runtime, the exception returned by the expression is copied into managed
memory (heap) to ensure it remains in scope during unwinding. It is the user's
responsibility to ensure the original exception object at the throw is freed
when it goes out of scope. Being allocated on the stack is sufficient for this.

Then the exception system searches the stack starting from the throw and
proceeding towards the base of the stack, from callee to caller. At each stack
frame, a check is made for termination handlers defined by the @catch@ clauses
of a @try@ statement.
\begin{cfa}
try {
	GUARDED_BLOCK
} catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) { // termination handler 1
	HANDLER_BLOCK$\(_1\)$
} catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) { // termination handler 2
	HANDLER_BLOCK$\(_2\)$
}
\end{cfa}
The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
functions invoked from those statements, throw an exception, and the exception
is not handled by a try statement further up the stack, the termination
handlers are searched for a matching exception type from top to bottom.

Exception matching checks whether the representation of the thrown
exception-type is the same as, or a descendant type of, the exception types in
the handler clauses. If it is the same as, or a descendant of,
@EXCEPTION_TYPE@$_i$ then @NAME@$_i$ is
bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$
are executed. If control reaches the end of the handler, the exception is
freed and control continues after the try statement.

The default handler visible at the throw statement is used if no matching
termination handler is found after the entire stack is searched. At that point,
the default handler is called with a reference to the exception object
generated at the throw. If the default handler returns, control continues
from after the throw statement. This feature allows
each exception type to define its own action, such as printing an informative
error message, when an exception is not handled in the program.
However, the default handler for all exception types triggers a cancellation
using the exception.

\subsection{Resumption}
\label{s:Resumption}

Resumption raise, called ``resume'', is as old as termination
raise~\cite{Goodenough75} but is less popular. In many ways, resumption is
simpler and easier to understand, as it is simply a dynamic call.
The semantics of resumption is: search the stack for a matching handler,
execute the handler, and continue execution after the resume.
Notice, the stack cannot be unwound because execution returns to the raise
point. Resumption is used when execution \emph{can} return to the resume. To
continue execution, the program must \emph{correct} in the handler for the
failed execution at the raise so execution can safely continue after the
resume.

A resumption raise is started with the @throwResume@ statement:
\begin{cfa}
throwResume EXPRESSION;
\end{cfa}
The semantics of the @throwResume@ statement are like the @throw@, but the
expression has to return a reference to a type that satisfies the trait
@is_resumption_exception@. As with termination, the exception system can
use these assertions while raising and handling the exception.

At runtime, no copies are made. As the stack is not unwound, the exception and
any values on the stack remain in scope while the resumption is handled.

Then the exception system searches the stack starting from the resume and
proceeding to the base of the stack, from callee to caller. At each stack
frame, a check is made for resumption handlers defined by the @catchResume@
clauses of a @try@ statement.
\begin{cfa}
try {
	GUARDED_BLOCK
} catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
	HANDLER_BLOCK$\(_1\)$
} catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
	HANDLER_BLOCK$\(_2\)$
}
\end{cfa}
The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
functions invoked from those statements, resume an exception, and the
exception is not handled by a try statement further up the stack, the
resumption handlers are searched for a matching exception type from top to
bottom. (Note, termination and resumption handlers may be intermixed in a @try@
statement but the kind of raise (throw/resume) only matches with the
corresponding kind of handler clause.)

The exception search and matching for resumption is the same as for
termination, including exception inheritance.
The difference is when control reaches the end of the handler: the resumption
handler returns after the resume rather than after the try statement. The
resume point assumes the handler has corrected the problem so execution can
safely continue.

Like termination, if no resumption handler is found, the default handler
visible at the resume statement is called, and the system default action is
executed.

For resumption, the exception system uses stack marking to partition the
resumption search. If another resumption exception is raised in a resumption
handler, the second exception search does not start at the point of the
original raise. (Remember the stack is not unwound and the current handler is
at the top of the stack.) The search for the second resumption starts at the
current point on the stack because new try statements may have been pushed by
the handler or functions called from the handler. If there is no match back to
the point of the current handler, the search skips\label{p:searchskip} the
stack frames already searched by the first resume and continues after
the try statement. If the search finds no handler, the default handler used is
the one associated with the point where the exception is created.

% This might need a diagram. But it is an important part of the justification
% of the design of the traversal order.
\begin{verbatim}
       throwResume2 ----------.
            |                 |
 generated from handler       |
            |                 |
         handler              |
            |                 |
       throwResume1 -----.    :
            |            |    :
           try           |    : search skip
            |            |    :
       catchResume  <----'    :
            |                 |
\end{verbatim}

This resumption search pattern reflects the one for termination, and so
should come naturally to most programmers.
It also avoids the \emph{recursive resumption} problem.
If parts of the stack are searched multiple times, loops
can easily form resulting in infinite recursion.
Consider the trivial case:
\begin{cfa}
try {
	throwResume (E &){}; // first
} catchResume(E *) {
	throwResume (E &){}; // second
}
\end{cfa}
If this handler is ever used it will be placed on top of the stack above the
try statement. If the stack were not masked, then the @throwResume@ in the
handler would always be caught by the handler, leading to an infinite loop.
Masking avoids this problem and other more complex versions of it involving
multiple handlers and exception types.

Other masking strategies could be used, such as masking the handlers that
have caught an exception. This one was chosen because it creates a symmetry
with termination (masked sections of the stack would be unwound with
termination) and having only one pattern to learn is easier.

\section{Conditional Catch}
Both termination and resumption handler clauses can be given an additional
condition to further control which exceptions they handle:
\begin{cfa}
catch (EXCEPTION_TYPE * NAME ; @CONDITION@)
\end{cfa}
First, the same semantics is used to match the exception type. Second, if the
exception matches, @CONDITION@ is executed. The condition expression may
reference all names in scope at the beginning of the try block and @NAME@
introduced in the handler clause. If the condition is true, then the handler
matches. Otherwise, the exception search continues at the next appropriate kind
of handler clause in the try block.
\begin{cfa}
try {
	f1 = open( ... );
	f2 = open( ... );
	...
} catch( IOFailure * f ; fd( f ) == f1 ) {
	// only handle IO failure for f1
}
\end{cfa}
Note, catching @IOFailure@, checking for @f1@ in the handler, and reraising the
exception if not @f1@ is different because the reraise does not examine any of
the remaining handlers in the current try statement.
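
The difference between a conditional catch and a catch-check-reraise can be
illustrated in \Cpp, which has no conditional catch and so must use the
reraise idiom. The following is a minimal sketch in \Cpp, not \CFA code; the
type @IOFailure@ (with its @fd@ field) and the function @run@ are hypothetical
names introduced only for this illustration.
\begin{lstlisting}[language=C++]
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an IO exception that records a file descriptor.
struct IOFailure : std::runtime_error {
    int fd;
    explicit IOFailure(int fd) : std::runtime_error("io failure"), fd(fd) {}
};

// Throws an IOFailure for failing_fd and records which handlers ran.
std::string run(int failing_fd, int f1) {
    std::string trace;
    try {
        try {
            throw IOFailure(failing_fd);
        } catch (IOFailure & e) {
            // Check inside the handler, then rethrow if it is not f1.
            if (e.fd != f1) throw;
            trace += "first;";
        } catch (std::runtime_error &) {
            // A sibling handler: the rethrow above never reaches it.
            trace += "second;";
        }
    } catch (std::runtime_error &) {
        // The rethrown exception is only seen by an enclosing try.
        trace += "outer;";
    }
    return trace;
}
\end{lstlisting}
Here @run(1, 1)@ yields @"first;"@, while @run(2, 1)@ yields @"outer;"@: the
rethrow leaves the inner try statement without considering its second
handler. With a conditional catch, a failed condition would instead let the
search continue at the next handler clause of the same try statement.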
\section{Reraise}
\color{red}{From Andrew: I recommend we talk about why the language doesn't
have rethrows/reraises instead.}

\label{s:Reraise}
Within the handler block or functions called from the handler block, it is
possible to reraise the most recently caught exception with @throw@ or
@throwResume@, respectively.
\begin{cfa}
try {
	...
} catch( ... ) {
	... throw; // rethrow
} catchResume( ... ) {
	... throwResume; // reresume
}
\end{cfa}
The only difference between a raise and a reraise is that reraise does not
create a new exception; instead it continues using the current exception, \ie
no allocation and copy. However, the default handler is still set to the one
visible at the raise point, and hence, for termination could refer to data that
is part of an unwound stack frame. To prevent this problem, a new default
handler is generated that does a program-level abort.

\section{Finally Clauses}
A @finally@ clause may be placed at the end of a @try@ statement.
\begin{cfa}
try {
	GUARDED_BLOCK
} ... // any number or kind of handler clauses
... finally {
	FINALLY_BLOCK
}
\end{cfa}
The @FINALLY_BLOCK@ is executed when the try statement is removed from the
stack, including when the @GUARDED_BLOCK@ or any handler clause finishes or
during an unwind.
The only time the block is not executed is if the program is exited before
that happens.

Execution of the finally block should always finish, meaning control runs off
the end of the block. This requirement ensures control always continues as if
the finally clause is not present, \ie finally is for cleanup not changing
control flow. Because of this requirement, local control flow out of the
finally block is forbidden. The compiler precludes any @break@, @continue@,
@fallthru@ or @return@ that causes control to leave the finally block. Other
ways to leave the finally block, such as a long jump or termination, are much
harder to check, and at best require additional run-time overhead, and so are
discouraged.
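
The rule that the finally block runs whenever the try statement leaves the
stack can be approximated in \Cpp, which has no @finally@ clause. The sketch
below swaps in a different technique, a destructor-based scope guard placed
around a try statement, so it is only an analogy to the \CFA semantics and not
how \CFA implements it. The names @ScopeGuard@ and @finally_trace@ are
hypothetical.
\begin{lstlisting}[language=C++]
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Runs the stored cleanup action when the enclosing scope is left,
// whether by normal completion or by stack unwinding.
class ScopeGuard {
    std::function<void()> cleanup_;
public:
    explicit ScopeGuard(std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {}
    ~ScopeGuard() { cleanup_(); }  // must finish; it must not throw
    ScopeGuard(const ScopeGuard &) = delete;
    ScopeGuard & operator=(const ScopeGuard &) = delete;
};

// Records the order in which the guarded block, handler and cleanup run.
std::string finally_trace(bool fail) {
    std::string trace;
    {
        // The guard encloses the whole try statement, so the cleanup runs
        // after the guarded block or after the handler, whichever is last.
        ScopeGuard guard([&trace] { trace += "finally;"; });
        try {
            trace += "guarded;";
            if (fail) throw std::runtime_error("failure");
        } catch (std::runtime_error &) {
            trace += "handler;";
        }
    }
    return trace;
}
\end{lstlisting}
Here @finally_trace(false)@ yields @"guarded;finally;"@ and
@finally_trace(true)@ yields @"guarded;handler;finally;"@. As with the finally
block, the cleanup action is expected to run to completion and not alter
control flow.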
\section{Cancellation}
Cancellation is a stack-level abort, which can be thought of as an
uncatchable termination. It unwinds the entirety of the current stack, and if
possible forwards the cancellation exception to a different stack.

Cancellation is not an exception operation like termination or resumption.
There is no special statement for starting a cancellation; instead the standard
library function @cancel_stack@ is called passing an exception. Unlike a
raise, this exception is not used in matching, only to pass information about
the cause of the cancellation.

Handling of a cancellation depends on which stack is being cancelled.
\begin{description}
\item[Main Stack:]
The main stack is the one used by the program main at the start of execution,
and is the only stack in a sequential program. Even in a concurrent program
the main stack is only dependent on the environment that started the program.
Hence, when the main stack is cancelled there is nowhere else in the program
to notify. After the stack is unwound, there is a program-level abort.

\item[Thread Stack:]
A thread stack is created for a @thread@ object or object that satisfies the
@is_thread@ trait. A thread only has two points of communication that must
happen: start and join. As the thread must be running to perform a
cancellation, it must occur after start and before join, so join is used
for communication here.
After the stack is unwound, the thread halts and waits for
another thread to join with it. The joining thread checks for a cancellation,
and if present, resumes exception @ThreadCancelled@.

There is a subtle difference between the explicit join (@join@ function) and
implicit join (from a destructor call). The explicit join takes the default
handler (@defaultResumptionHandler@) from its calling context, which is used if
the exception is not caught. The implicit join does a program abort instead.

This semantics is for safety.
If an unwind is triggered while another unwind is underway, only one of them
can proceed as they both want to ``consume'' the stack. Letting both try to
proceed leads to undefined behaviour.
Both termination and cancellation involve unwinding and, since the default
@defaultResumptionHandler@ performs a termination, this could more easily
happen in an implicit join inside a destructor. So there is an error message
and an abort instead.

The recommended way to avoid the abort is to handle the initial resumption
from the implicit join. If required, you may put an explicit join inside a
finally clause to disable the check and use the local
@defaultResumptionHandler@ instead.

\item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object
or object that satisfies the @is_coroutine@ trait. A coroutine only knows of
two other coroutines, its starter and its last resumer. The last resumer has
the tightest coupling to the coroutine it activated. Hence, cancellation of
the active coroutine is forwarded to the last resumer after the stack is
unwound, as the last resumer has the most precise knowledge about the current
execution. When the resumer restarts, it resumes exception
@CoroutineCancelled@, which is polymorphic over the coroutine type and has a
pointer to the cancelled coroutine.

The resume function also has an assertion for the
@defaultResumptionHandler@ of the exception, so it uses the default handler
like a regular raise.
\end{description}