\chapter{Exception Features}

This chapter covers the design and user interface of the \CFA
exception-handling mechanism.

\section{Virtuals}
Virtual types and casts are not required for a basic exception-system but are
useful for advanced exception features. However, \CFA is not object-oriented so
there is no obvious concept of virtuals.  Hence, to create advanced exception
features for this work, I needed to designed and implemented a virtual-like
system for \CFA.

Object-oriented languages often organized exceptions into a simple hierarchy,
\eg Java.
\begin{center}
\setlength{\unitlength}{4000sp}%
\begin{picture}(1605,612)(2011,-1951)
\put(2100,-1411){\vector(1, 0){225}}
\put(3450,-1411){\vector(1, 0){225}}
\put(3550,-1411){\line(0,-1){225}}
\put(3550,-1636){\vector(1, 0){150}}
\put(3550,-1636){\line(0,-1){225}}
\put(3550,-1861){\vector(1, 0){150}}
\put(2025,-1490){\makebox(0,0)[rb]{\LstBasicStyle{exception}}}
\put(2400,-1460){\makebox(0,0)[lb]{\LstBasicStyle{arithmetic}}}
\put(3750,-1460){\makebox(0,0)[lb]{\LstBasicStyle{underflow}}}
\put(3750,-1690){\makebox(0,0)[lb]{\LstBasicStyle{overflow}}}
\put(3750,-1920){\makebox(0,0)[lb]{\LstBasicStyle{zerodivide}}}
\end{picture}%
\end{center}
The hierarchy provides the ability to handle an exception at different degrees
of specificity (left to right).  Hence, it is possible to catch a more general
exception-type in higher-level code where the implementation details are
unknown, which reduces tight coupling to the lower-level implementation.
Otherwise, low-level code changes require higher-level code changes, \eg,
changing from raising @underflow@ to @overflow@ at the low level means changing
the matching catch at the high level versus catching the general @arithmetic@
exception. In detail, each virtual type may have a parent and can have any
number of children. A type's descendants are its children and its children's
descendants. A type may not be its own descendant.

The exception hierarchy allows a handler (@catch@ clause) to match multiple
exceptions, \eg a base-type handler catches both base and derived
exception-types.
\begin{cfa}
try {
	...
} catch(arithmetic &) {
	... // handle arithmetic, underflow, overflow, zerodivide
}
\end{cfa}
Most exception mechanisms perform a linear search of the handlers and select
the first matching handler, so the order of handers is now important because
matching is many to one.

Each virtual type needs an associated virtual table. A virtual table is a
structure with fields for all the virtual members of a type. A virtual type has
all the virtual members of its parent and can add more. It may also update the
values of the virtual members and often does.

While much of the virtual infrastructure is created, it is currently only used
internally for exception handling. The only user-level feature is the virtual
cast, which is the same as the \CC \lstinline[language=C++]|dynamic_cast|.
\label{p:VirtualCast}
\begin{cfa}
(virtual TYPE)EXPRESSION
\end{cfa}
Note, the syntax and semantics matches a C-cast, rather than the unusual \CC
syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be a
pointer to a virtual type. The cast dynamically checks if the @EXPRESSION@ type
is the same or a subtype of @TYPE@, and if true, returns a pointer to the
@EXPRESSION@ object, otherwise it returns @0p@ (null pointer).

\section{Exception}
% Leaving until later, hopefully it can talk about actual syntax instead
% of my many strange macros. Syntax aside I will also have to talk about the
% features all exceptions support.

Exceptions are defined by the trait system; there are a series of traits, and
if a type satisfies them, then it can be used as an exception.  The following
is the base trait all exceptions need to match.
\begin{cfa}
trait is_exception(exceptT &, virtualT &) {
	virtualT const & @get_exception_vtable@(exceptT *);
};
\end{cfa}
The function takes any pointer, including the null pointer, and returns a
reference to the virtual-table object. Defining this function also establishes
the virtual type and a virtual-table pair to the \CFA type-resolver and
promises @exceptT@ is a virtual type and a child of the base exception-type.

\PAB{I do not understand this paragraph.}
One odd thing about @get_exception_vtable@ is that it should always be a
constant function, returning the same value regardless of its argument.  A
pointer or reference to the virtual table instance could be used instead,
however using a function has some ease of implementation advantages and allows
for easier disambiguation because the virtual type name (or the address of an
instance that is in scope) can be used instead of the mangled virtual table
name.  Also note the use of the word ``promise'' in the trait
description. Currently, \CFA cannot check to see if either @exceptT@ or
@virtualT@ match the layout requirements. This is considered part of
@get_exception_vtable@'s correct implementation.

\section{Raise}
\CFA provides two kinds of exception raise: termination
\see{\VRef{s:Termination}} and resumption \see{\VRef{s:Resumption}}, which are
specified with the following traits.
\begin{cfa}
trait is_termination_exception(
		exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
	void @defaultTerminationHandler@(exceptT &);
};
\end{cfa}
The function is required to allow a termination raise, but is only called if a
termination raise does not find an appropriate handler.

Allowing a resumption raise is similar.
\begin{cfa}
trait is_resumption_exception(
		exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
	void @defaultResumptionHandler@(exceptT &);
};
\end{cfa}
The function is required to allow a resumption raise, but is only called if a
resumption raise does not find an appropriate handler.

Finally there are three convenience macros for referring to the these traits:
@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.  Each
takes the virtual type's name, and for polymorphic types only, the
parenthesized list of polymorphic arguments. These macros do the name mangling
to get the virtual-table name and provide the arguments to both sides
\PAB{What's a ``side''?}

\subsection{Termination}
\label{s:Termination}

Termination raise, called ``throw'', is familiar and used in most programming
languages with exception handling. The semantics of termination is: search the
stack for a matching handler, unwind the stack frames to the matching handler,
execute the handler, and continue execution after the handler. Termination is
used when execution \emph{cannot} return to the throw. To continue execution,
the program must \emph{recover} in the handler from the failed (unwound)
execution at the raise to safely proceed after the handler.

A termination raise is started with the @throw@ statement:
\begin{cfa}
throw EXPRESSION;
\end{cfa}
The expression must return a termination-exception reference, where the
termination exception has a type with a @void defaultTerminationHandler(T &)@
(default handler) defined. The handler is found at the call site using \CFA's
trait system and passed into the exception system along with the exception
itself.

At runtime, a representation of the exception type and an instance of the
exception type is copied into managed memory (heap) to ensure it remains in
scope during unwinding. It is the user's responsibility to ensure the original
exception object at the throw is freed when it goes out of scope. Being
allocated on the stack is sufficient for this.

Then the exception system searches the stack starting from the throw and
proceeding towards the base of the stack, from callee to caller. At each stack
frame, a check is made for termination handlers defined by the @catch@ clauses
of a @try@ statement.
\begin{cfa}
try {
	GUARDED_BLOCK
} @catch (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // termination handler 1
	HANDLER_BLOCK$\(_1\)$
} @catch (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // termination handler 2
	HANDLER_BLOCK$\(_2\)$
}
\end{cfa}
The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
functions invoked from those statements, throws an exception, and the exception
is not handled by a try statement further up the stack, the termination
handlers are searched for a matching exception type from top to bottom.

Exception matching checks the representation of the thrown exception-type is
the same or a descendant type of the exception types in the handler clauses. If
there is a match, a pointer to the exception object created at the throw is
bound to @NAME@ and the statements in the associated @HANDLER_BLOCK@ are
executed. If control reaches the end of the handler, the exception is freed,
and control continues after the try statement.

The default handler visible at the throw statement is used if no matching
termination handler is found after the entire stack is searched. At that point,
the default handler is called with a reference to the exception object
generated at the throw. If the default handler returns, the system default
action is executed, which often terminates the program. This feature allows
each exception type to define its own action, such as printing an informative
error message, when an exception is not handled in the program.

\subsection{Resumption}
\label{s:Resumption}

Resumption raise, called ``resume'', is as old as termination
raise~\cite{Goodenough75} but is less popular. In many ways, resumption is
simpler and easier to understand, as it is simply a dynamic call (as in
Lisp). The semantics of resumption is: search the stack for a matching handler,
execute the handler, and continue execution after the resume. Notice, the stack
cannot be unwound because execution returns to the raise point. Resumption is
used used when execution \emph{can} return to the resume. To continue
execution, the program must \emph{correct} in the handler for the failed
execution at the raise so execution can safely continue after the resume.

A resumption raise is started with the @throwResume@ statement:
\begin{cfa}
throwResume EXPRESSION;
\end{cfa}
The semantics of the @throwResume@ statement are like the @throw@, but the
expression has a type with a @void defaultResumptionHandler(T &)@ (default
handler) defined, where the handler is found at the call site by the type
system.  At runtime, a representation of the exception type and an instance of
the exception type is \emph{not} copied because the stack is maintained during
the handler search.

Then the exception system searches the stack starting from the resume and
proceeding towards the base of the stack, from callee to caller. At each stack
frame, a check is made for resumption handlers defined by the @catchResume@
clauses of a @try@ statement.
\begin{cfa}
try {
	GUARDED_BLOCK
} @catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // resumption handler 1
	HANDLER_BLOCK$\(_1\)$
} @catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // resumption handler 2
	HANDLER_BLOCK$\(_2\)$
}
\end{cfa}
The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
functions invoked from those statements, resumes an exception, and the
exception is not handled by a try statement further up the stack, the
resumption handlers are searched for a matching exception type from top to
bottom. (Note, termination and resumption handlers may be intermixed in a @try@
statement but the kind of raise (throw/resume) only matches with the
corresponding kind of handler clause.)

The exception search and matching for resumption is the same as for
termination, including exception inheritance. The difference is when control
reaches the end of the handler: the resumption handler returns after the resume
rather than after the try statement. The resume point assumes the handler has
corrected the problem so execution can safely continue.

Like termination, if no resumption handler is found, the default handler
visible at the resume statement is called, and the system default action is
executed.

For resumption, the exception system uses stack marking to partition the
resumption search. If another resumption exception is raised in a resumption
handler, the second exception search does not start at the point of the
original raise. (Remember the stack is not unwound and the current handler is
at the top of the stack.) The search for the second resumption starts at the
current point on the stack because new try statements may have been pushed by
the handler or functions called from the handler. If there is no match back to
the point of the current handler, the search skips\label{p:searchskip} the stack frames already
searched by the first resume and continues after the try statement. The default
handler always continues from default handler associated with the point where
the exception is created.

% This might need a diagram. But it is an important part of the justification
% of the design of the traversal order.
\begin{verbatim}
       throwResume2 ----------.
            |                 |
 generated from handler       |
            |                 |
         handler              |
            |                 |
        throwResume1 -----.   :
            |             |   :
           try            |   : search skip
            |             |   :
        catchResume  <----'   :
            |                 |
\end{verbatim}

This resumption search-pattern reflect the one for termination, which matches
with programmer expectations. However, it avoids the \emph{recursive
resumption} problem. If parts of the stack are searched multiple times, loops
can easily form resulting in infinite recursion.

Consider the trivial case:
\begin{cfa}
try {
	throwResume$\(_1\)$ (E &){};
} catch( E * ) {
	throwResume;
}
\end{cfa}
Based on termination semantics, programmer expectation is for the re-resume to
continue searching the stack frames after the try statement. However, the
current try statement is still on the stack below the handler issuing the
reresume \see{\VRef{s:Reraise}}. Hence, the try statement catches the re-raise
again and does another re-raise \emph{ad infinitum}, which is confusing and
difficult to debug. The \CFA resumption search-pattern skips the try statement
so the reresume search continues after the try, mathcing programmer
expectation.

\section{Conditional Catch}
Both termination and resumption handler-clauses may perform conditional matching:
\begin{cfa}
catch (EXCEPTION_TYPE * NAME ; @CONDITION@)
\end{cfa}
First, the same semantics is used to match the exception type. Second, if the
exception matches, @CONDITION@ is executed. The condition expression may
reference all names in scope at the beginning of the try block and @NAME@
introduced in the handler clause.  If the condition is true, then the handler
matches. Otherwise, the exception search continues at the next appropriate kind
of handler clause in the try block.
\begin{cfa}
try {
	f1 = open( ... );
	f2 = open( ... );
	...
} catch( IOFailure * f ; fd( f ) == f1 ) {
	// only handle IO failure for f1
}
\end{cfa}
Note, catching @IOFailure@, checking for @f1@ in the handler, and reraising the
exception if not @f1@ is different because the reraise does not examine any of
remaining handlers in the current try statement.

\section{Reraise}
\label{s:Reraise}
Within the handler block or functions called from the handler block, it is
possible to reraise the most recently caught exception with @throw@ or
@throwResume@, respective.
\begin{cfa}
catch( ... ) {
	... throw; // rethrow
} catchResume( ... ) {
	... throwResume; // reresume
}
\end{cfa}
The only difference between a raise and a reraise is that reraise does not
create a new exception; instead it continues using the current exception, \ie
no allocation and copy. However the default handler is still set to the one
visible at the raise point, and hence, for termination could refer to data that
is part of an unwound stack frame. To prevent this problem, a new default
handler is generated that does a program-level abort.


\section{Finally Clauses}
A @finally@ clause may be placed at the end of a @try@ statement.
\begin{cfa}
try {
	GUARDED_BLOCK
} ...	// any number or kind of handler clauses
} finally {
	FINALLY_BLOCK
}
\end{cfa}
The @FINALLY_BLOCK@ is executed when the try statement is unwound from the
stack, \ie when the @GUARDED_BLOCK@ or any handler clause finishes. Hence, the
finally block is always executed.

Execution of the finally block should always finish, meaning control runs off
the end of the block. This requirement ensures always continues as if the
finally clause is not present, \ie finally is for cleanup not changing control
flow.  Because of this requirement, local control flow out of the finally block
is forbidden.  The compiler precludes any @break@, @continue@, @fallthru@ or
@return@ that causes control to leave the finally block. Other ways to leave
the finally block, such as a long jump or termination are much harder to check,
and at best requiring additional run-time overhead, and so are discouraged.

\section{Cancellation}
Cancellation is a stack-level abort, which can be thought of as as an
uncatchable termination. It unwinds the entirety of the current stack, and if
possible forwards the cancellation exception to a different stack.

There is no special statement for starting a cancellation; instead the standard
library function @cancel_stack@ is called passing an exception.  Unlike a
raise, this exception is not used in matching only to pass information about
the cause of the cancellation.

Handling of a cancellation depends on which stack is being cancelled.
\begin{description}
\item[Main Stack:]
The main stack is the one used by the program main at the start of execution,
and is the only stack in a sequential program.  Hence, when cancellation is
forwarded to the main stack, there is no other forwarding stack, so after the
stack is unwound, there is a program-level abort.

\item[Thread Stack:]
A thread stack is created for a @thread@ object or object that satisfies the
@is_thread@ trait.  A thread only has two points of communication that must
happen: start and join. As the thread must be running to perform a
cancellation, it must occur after start and before join, so join is a
cancellation point.  After the stack is unwound, the thread halts and waits for
another thread to join with it. The joining thread, checks for a cancellation,
and if present, resumes exception @ThreadCancelled@.

There is a subtle difference between the explicit join (@join@ function) and
implicit join (from a destructor call). The explicit join takes the default
handler (@defaultResumptionHandler@) from its calling context, which is used if
the exception is not caught. The implicit join does a program abort instead.

This semantics is for safety. One difficult problem for any exception system is
defining semantics when an exception is raised during an exception search:
which exception has priority, the original or new exception? No matter which
exception is selected, it is possible for the selected one to disrupt or
destroy the context required for the other. \PAB{I do not understand the
following sentences.} This loss of information can happen with join but as the
thread destructor is always run when the stack is being unwound and one
termination/cancellation is already active. Also since they are implicit they
are easier to forget about.

\item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object
or object that satisfies the @is_coroutine@ trait.  A coroutine only knows of
two other coroutines, its starter and its last resumer.  The last resumer has
the tightest coupling to the coroutine it activated.  Hence, cancellation of
the active coroutine is forwarded to the last resumer after the stack is
unwound, as the last resumer has the most precise knowledge about the current
execution. When the resumer restarts, it resumes exception
@CoroutineCancelled@, which is polymorphic over the coroutine type and has a
pointer to the cancelled coroutine.

The resume function also has an assertion that the @defaultResumptionHandler@
for the exception. So it will use the default handler like a regular throw.
\end{description}