% Source: doc/theses/andrew_beach_MMath/features.tex @ f8a7fed
% Last change: 08e75215, checked in by Andrew Beach.
% Andrew MMath: Expanded a todo about open/closed types.

\chapter{Exception Features}

This chapter covers the design and user interface of the \CFA
exception-handling mechanism (EHM). % or exception system.

% We should cover what is an exception handling mechanism and what is an
% exception before this. Probably in the introduction. Some of this could
% move there.
\paragraph{Raise / Handle}
An exception operation has two main parts: raise and handle.
These are the two parts that the user will write themselves and so
might be the only two pieces of the EHM that have any syntax.
These terms are sometimes also known as throw and catch but this work uses
throw/catch as a particular kind of raise/handle.

The raise is the starting point for exception handling.
Some well-known examples include the throw statements of \Cpp and Java and
the raise statement from Python.

For this overview a raise does nothing more than kick off the handling of an
exception, which is called raising the exception. This is inexact but close
enough for the broad strokes of the overview.

The purpose of most exception operations is to run some sort of handler that
contains user code.
The try statement of \Cpp illustrates the common features of handlers.
Handlers have three common features: a region of code they apply to, an
exception label that describes what exceptions they handle and code to run
when they handle an exception.
Each handler can handle exceptions raised in that region that match their
exception label. Different EHMs have different rules to pick a handler
if multiple handlers could be used, such as ``best match'' or ``first found''.

After an exception is raised comes what is usually the biggest step for the
EHM: finding and setting up the handler. This can be broken up into three
different tasks: searching for a handler, matching against the handler and
installing the handler.

First, the EHM must search for possible handlers that could be used to handle
the exception. Searching is usually independent of the exception that was
thrown and instead depends on the call stack: the current function, its caller
and so on down the stack.

Second, it must match the exception with each handler to see which one is the
best match and hence which one should be used to handle the exception.
In languages where the best match is the first match these two are often
intertwined: a match check is performed immediately after the search finds
a possible handler.

Third, after a handler is chosen it must be made ready to run.
What this actually involves can vary widely to fit with the rest of the
design of the EHM. The installation step might be trivial or it could be
the most expensive step in handling an exception. The latter tends to be the
case when stack unwinding is involved.

As an alternate third step, if no appropriate handler is found then some sort
of recovery has to be performed. This is only required with unchecked
exceptions as checked exceptions can promise that a handler is found. It
is also installing a handler but it is a special default that may be
installed differently.

In \CFA the EHM uses a hierarchical system to organise its exceptions.
This strategy is borrowed from object-oriented languages where the
exception hierarchy is a natural extension of the object hierarchy.

Consider the following hierarchy of exceptions:
\begin{center}
\emph{[Figure: tree diagram of an exception hierarchy rooted at \texttt{exception}.]}
\end{center}

A handler labelled with any given exception can handle exceptions of that
type or any child type of that exception. The root of the exception hierarchy
(here \texttt{exception}) acts as a catch-all, leaf types catch single types
and the exceptions in the middle can be used to catch different groups of
related exceptions.

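
As an illustration, the same grouping can be sketched in \Cpp, where the
class hierarchy plays the role of the exception tree. The type names and the
helper functions below are hypothetical and are not part of \CFA or its
library; the sketch only shows a middle node catching a group, a leaf catching
a single type and the root acting as a catch-all.
\begin{lstlisting}[language=C++]
#include <string>

// Hypothetical hierarchy used only for illustration.
struct Exception { virtual ~Exception() = default; };
struct ArithmeticError : Exception {};
struct DivideByZero : ArithmeticError {};
struct IoError : Exception {};

enum class Kind { DivideByZero, Io, Other };

// Raises an exception of the requested kind.
void raise_kind(Kind k) {
    switch (k) {
    case Kind::DivideByZero: throw DivideByZero{};
    case Kind::Io:           throw IoError{};
    default:                 throw Exception{};
    }
}

// Reports which handler label matched the raised exception.
std::string classify(Kind k) {
    try {
        raise_kind(k);
    } catch (const ArithmeticError &) {   // middle of the tree: a group
        return "arithmetic";
    } catch (const IoError &) {           // leaf: a single type
        return "io";
    } catch (const Exception &) {         // root: catch-all
        return "other";
    }
    return "none";                        // not reached: raise_kind always throws
}
\end{lstlisting}
Here a \lstinline[language=C++]|DivideByZero| is handled by the
\lstinline[language=C++]|ArithmeticError| clause because it is a child of
that type.
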
This system has some notable advantages, such as multiple levels of grouping,
the ability for libraries to add new exception types and the isolation
between different sub-hierarchies. So the design was adapted for a
non-object-oriented language.

% Could I cite the rationale for the Python IO exception rework?

After the handler has finished the entire exception operation has to complete
and continue executing somewhere else. This step is usually very simple
both logically and in its implementation as the installation of the handler
usually does the heavy lifting.

The EHM can return control to many different places.
However, the most common is after the handler definition and the next most
common is after the raise.

For effective exception handling, additional information is usually required
as this base model only communicates the exception's identity. Common
additional methods of communication are putting fields on an exception and
allowing a handler to access the lexical scope it is defined in (usually
a function's local variables).

\paragraph{Other Features}
Any given exception handling mechanism is free to add other features on top
of this. This is an overview of the base that all EHMs use but it is not an
exhaustive list of everything an EHM can do.

Virtual types and casts are not part of the exception system nor are they
required for an exception system. But an object-oriented style hierarchy is a
great way of organizing exceptions so a minimal virtual system has been added
to \CFA.

The virtual system supports multiple ``trees'' of types. Each tree is
a simple hierarchy with a single root type. Each type in a tree has exactly
one parent, except for the root type which has zero parents, and any
number of children.
Any type that belongs to any of these trees is called a virtual type.

% A type's ancestors are its parent and its parent's ancestors.
% The root type has no ancestors.
% A type's descendants are its children and its children's descendants.

Every virtual type also has a list of virtual members. Children inherit
their parent's list of virtual members but may add new members to it.
It is important to note that these are virtual members, not virtual methods.
However, as function pointers are allowed, they can be used to mimic virtual
methods as well.

The unique id for the virtual type and all the virtual members are combined
into a virtual table type. Each virtual type has a pointer to a virtual table
as a hidden field.

Up until this point the virtual system is a lot like ones found in
object-oriented languages but this is where they diverge. Objects encapsulate a
single set of behaviours in each type, universally across the entire program,
and indeed all programs that use that type definition. In this sense the
types are ``closed'' and cannot be altered.

However, in \CFA types do not encapsulate any behaviour. Traits are local and
types can begin to satisfy a trait, stop satisfying a trait or satisfy the same
trait in a different way with each new definition. In this sense they are
``open'' as they can change at any time. This means it is impossible to pick
a single set of functions that represent the type.

So we don't try to have a single value. The user can define virtual tables
which are filled in at their declaration and given a name. Anywhere you can
see that name you can use that virtual table, even if it is defined locally
inside a function, although in that case you must respect its lifetime.

An object of a virtual type is ``bound'' to a virtual table instance which
sets the virtual members for that object. The virtual members can be accessed
through the object.

While much of the virtual infrastructure is created, it is currently only used
internally for exception handling. The only user-level feature is the virtual
cast, which is the same as the \Cpp \lstinline[language=C++]|dynamic_cast|.
\begin{cfa}
(virtual TYPE)EXPRESSION
\end{cfa}
Note, the syntax and semantics match a C-cast, rather than the function-like
\Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be
a pointer to a virtual type.
The cast dynamically checks if the @EXPRESSION@ type is the same or a sub-type
of @TYPE@, and if true, returns a pointer to the
@EXPRESSION@ object, otherwise it returns @0p@ (null pointer).

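
For comparison, the \Cpp counterpart of this check can be sketched as follows.
The types and the helper @is_arithmetic@ are hypothetical and exist only for
this example; the point is that the pointer form of
\lstinline[language=C++]|dynamic_cast| yields a null pointer when the dynamic
type is neither the target type nor one of its descendants.
\begin{lstlisting}[language=C++]
// Hypothetical polymorphic types; a virtual member makes them
// usable with dynamic_cast.
struct Exception { virtual ~Exception() = default; };
struct ArithmeticError : Exception {};
struct IoError : Exception {};

// Returns true when e points at an ArithmeticError or a descendant.
bool is_arithmetic(Exception * e) {
    // A failed pointer cast produces nullptr rather than an error.
    ArithmeticError * a = dynamic_cast<ArithmeticError *>(e);
    return a != nullptr;
}
\end{lstlisting}
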
% Leaving until later, hopefully it can talk about actual syntax instead
% of my many strange macros. Syntax aside I will also have to talk about the
% features all exceptions support.

Exceptions are defined by the trait system; there are a series of traits, and
if a type satisfies them, then it can be used as an exception. The following
is the base trait all exceptions need to match.
\begin{cfa}
trait is_exception(exceptT &, virtualT &) {
        virtualT const & get_exception_vtable(exceptT *);
};
\end{cfa}
The trait is defined over two types, the exception type and the virtual table
type. This should be one-to-one: each exception type has only one virtual
table type and vice versa. The only assertion in the trait is
@get_exception_vtable@, which takes a pointer of the exception type and
returns a reference to the virtual table type instance.

The function @get_exception_vtable@ is actually a constant function.
Regardless of the value passed in (including the null pointer) it should
return a reference to the virtual table instance for that type.
The reason it is a function instead of a constant is that it makes type
annotations easier to write as you can use the exception type instead of the
virtual table type, which usually has a mangled name.
% Also \CFA's trait system handles functions better than constants and doing
% it this way reduces the amount of boilerplate we need.

% I did have a note about how it is the programmer's responsibility to make
% sure the function is implemented correctly. But this is true of every
% similar system I know of (except Agda's I guess) so I took it out.

There are two more traits for exceptions: @is_termination_exception@ and
@is_resumption_exception@. They are defined as follows:
\begin{cfa}
trait is_termination_exception(
                exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
        void defaultTerminationHandler(exceptT &);
};

trait is_resumption_exception(
                exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
        void defaultResumptionHandler(exceptT &);
};
\end{cfa}
In other words, they make sure that a given type and virtual type are an
exception and define one of the two default handlers. These default handlers
are used in the main exception handling operations \see{Exception Handling}
and their use will be detailed there.

However, all three of these traits can be tricky to use directly.
There is a bit of repetition required but
the largest issue is that the virtual table type has a mangled name that is
not user-facing. So there are three macros that can be used to wrap these
traits when you need to refer to the names:
@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.

All take one or two arguments. The first argument is the name of the
exception type. Its unmangled and mangled form are passed to the trait.
The second (optional) argument is a parenthesized list of polymorphic
arguments. This argument should only be used with polymorphic exceptions and
the list will be passed to both types.
In the current set-up the base name and the polymorphic arguments have to
match so these macros can be used without losing flexibility.

For example, consider a function that is polymorphic over types that have a
defined arithmetic exception:
\begin{cfa}
forall(Num | IS_EXCEPTION(Arithmetic, (Num)))
void some_math_function(Num & left, Num & right);
\end{cfa}

\section{Exception Handling}
\CFA provides two kinds of exception handling, termination and resumption.
These twin operations are the core of the exception handling mechanism and
are the reason for the features of exceptions.
This section will cover the general patterns shared by the two operations and
then go on to cover the details of each individual operation.

Both operations follow the same set of steps. They both
start with the user performing a throw on an exception.
Then there is the search for a handler; if one is found then the exception
is caught and the handler is run. After that control returns to normal
execution.

If the search fails a default handler is run and then control
returns to normal execution immediately. That is where the default handlers
@defaultTerminationHandler@ and @defaultResumptionHandler@ are used.

\subsection{Termination}
Termination handling is the more familiar kind and is used in most programming
languages with exception handling.
It is a dynamic, non-local goto. If a throw is successful then the stack is
unwound and control (usually) continues in a different function on
the call stack. It is commonly used when an error has occurred and recovery
is impossible in the current function.

% (usually) Control can continue in the current function but then a different
% control flow construct should be used.

A termination throw is started with the @throw@ statement:
\begin{cfa}
throw EXPRESSION;
\end{cfa}
The expression must return a reference to a termination exception, where the
termination exception is any type that satisfies @is_termination_exception@
at the call site.
Through \CFA's trait system the functions in the traits are passed into the
throw code. A new @defaultTerminationHandler@ can be defined in any scope to
change the throw's behavior (see below).

The throw will copy the provided exception into managed memory. It is the
user's responsibility to ensure the original exception is cleaned up if the
stack is unwound (allocating it on the stack should be sufficient).

Then the exception system searches the stack using the copied exception.
It starts from the throw and proceeds to the base of the stack,
from callee to caller.
At each stack frame, a check is made for termination handlers defined by the
@catch@ clauses of a @try@ statement.
\begin{cfa}
try {
        GUARDED_BLOCK
} catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
        HANDLER_BLOCK$\(_1\)$
} catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
        HANDLER_BLOCK$\(_2\)$
}
\end{cfa}
When viewed on its own a try statement will simply execute the statements in
@GUARDED_BLOCK@ and when those are finished the try statement finishes.

However, while the guarded statements are being executed, including any
functions they invoke, all the handlers following the try block are now
active. If one of the guarded statements, or any function invoked from those
statements, throws an exception, and the exception
is not handled by a try statement further up the stack, the termination
handlers are searched for a matching exception type from top to bottom.

Exception matching checks that the thrown exception type is
the same as, or a descendant of, the exception types in the handler clauses. If
it is the same as or a descendant of @EXCEPTION_TYPE@$_i$ then @NAME@$_i$ is
bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$
are executed. If control reaches the end of the handler, the exception is
freed and control continues after the try statement.

If no handler is found during the search then the default handler is run.
Through \CFA's trait system the best match at the throw site is used.
This function is run and is passed the copied exception. After the default
handler is run control continues after the throw statement.

There is a global @defaultTerminationHandler@ that cancels the current stack
with the copied exception. However, it is generic over all exception types so
new default handlers can be defined for different exception types and so
different exception types can have different default handlers.

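
The unwinding behaviour is the same as in \Cpp, so a small \Cpp sketch can
make the order of events concrete. All names below (@Failure@, @Tracer@,
@inner@, @outer@, @run@) are hypothetical and serve only this example. Each
frame between the throw and the handler is unwound, the handler runs, and
control continues after the try statement rather than after the throw.
\begin{lstlisting}[language=C++]
#include <string>
#include <vector>

struct Failure { int code; };

// Records its destruction so the unwinding order is observable.
struct Tracer {
    std::vector<std::string> & log;
    std::string name;
    ~Tracer() { log.push_back("unwind " + name); }
};

void inner(std::vector<std::string> & log) {
    Tracer t{log, "inner"};
    throw Failure{42};               // termination raise
}

void outer(std::vector<std::string> & log) {
    Tracer t{log, "outer"};
    inner(log);
    log.push_back("outer resumed");  // never runs: this frame is unwound
}

std::vector<std::string> run() {
    std::vector<std::string> log;
    try {
        outer(log);
    } catch (const Failure & f) {
        log.push_back("handled " + std::to_string(f.code));
    }
    log.push_back("after try");      // control continues here
    return log;
}
\end{lstlisting}
Calling @run@ records the two unwind steps first, then the handler, then the
statement after the try statement.
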
\subsection{Resumption}
Resumption exception handling is a less common form than termination but is
just as old~\cite{Goodenough75} and is in some sense simpler.
It is a dynamic, non-local function call. If the throw is successful a
closure will be taken from up the stack and executed, after which the throwing
function will continue executing.
It is most often used when an error occurred and, if the error is repaired,
the function can continue.

A resumption raise is started with the @throwResume@ statement:
\begin{cfa}
throwResume EXPRESSION;
\end{cfa}
The semantics of the @throwResume@ statement are like the @throw@, but the
expression has to return a reference to a type that satisfies the trait
@is_resumption_exception@. The assertions from this trait are available to
the exception system while handling the exception.

At run-time, no copies are made. As the stack is not unwound the exception and
any values on the stack will remain in scope while the resumption is handled.

Then the exception system searches the stack using the provided exception.
It starts from the throw and proceeds to the base of the stack,
from callee to caller.
At each stack frame, a check is made for resumption handlers defined by the
@catchResume@ clauses of a @try@ statement.
\begin{cfa}
try {
        GUARDED_BLOCK
} catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
        HANDLER_BLOCK$\(_1\)$
} catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
        HANDLER_BLOCK$\(_2\)$
}
\end{cfa}
If the handlers are not involved in a search this will simply execute the
@GUARDED_BLOCK@ and then continue to the next statement.
Its purpose is to add handlers onto the stack.
(Note, termination and resumption handlers may be intermixed in a @try@
statement but the kind of throw must be the same as the handler for it to be
considered as a possible match.)

If a search for a resumption handler reaches a try block it will check each
@catchResume@ clause, top-to-bottom.
At each handler if the thrown exception is or is a child type of
@EXCEPTION_TYPE@$_i$ then a pointer to the exception is bound to
@NAME@$_i$ and then @HANDLER_BLOCK@$_i$ is executed. After the block is
finished control will return to the @throwResume@ statement.

Like termination, if no resumption handler is found, the default handler
visible at the throw statement is called. It will use the best match at the
call site according to \CFA's overloading rules. The default handler is
passed the exception given to the throw. When the default handler finishes
execution continues after the throw statement.

There is a global @defaultResumptionHandler@ that is polymorphic over all
termination exceptions and performs a termination throw on the exception.
The @defaultTerminationHandler@ for that throw is matched at the original
throw statement (the resumption @throwResume@) and it can be customized by
introducing a new or better match as well.

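
\Cpp has no resumption, but its control flow can be approximated with an
explicit stack of handler functions. The following is a toy model under
stated assumptions, not a description of how \CFA implements resumption:
every name in it is hypothetical, there is a single exception type, and the
masking discussed later is ignored. It shows the two defining properties:
the handler runs as an ordinary call on top of the raising frame, and control
returns to the raise point afterwards. When no handler accepts the exception
the model falls back to a termination throw, mirroring the global default
resumption handler.
\begin{lstlisting}[language=C++]
#include <functional>
#include <utility>
#include <vector>

// Hypothetical exception carrying data the handler may repair.
struct OutOfRange { int value; int limit; };

using Handler = std::function<bool(OutOfRange &)>;  // true means handled

std::vector<Handler> handlers;   // innermost handler is at the back

// Installs a handler for the lifetime of a scope, like a try statement.
struct Guard {
    explicit Guard(Handler h) { handlers.push_back(std::move(h)); }
    ~Guard() { handlers.pop_back(); }
};

// Searches from the innermost handler outward without unwinding.
void raise_resume(OutOfRange & e) {
    for (auto it = handlers.rbegin(); it != handlers.rend(); ++it) {
        if ((*it)(e)) return;    // handled: control returns to the raise
    }
    throw e;                     // default: fall back to termination
}

// The handler repairs the exception in place and execution continues.
int checked(int value, int limit) {
    if (value > limit) {
        OutOfRange e{value, limit};
        raise_resume(e);
        value = e.value;         // continue with the repaired value
    }
    return value;
}

int demo() {
    Guard g([](OutOfRange & e) { e.value = e.limit; return true; });
    return checked(150, 100);
}
\end{lstlisting}
In this model @demo@ yields the clamped value because the handler repaired
the exception before @checked@ continued.
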
% \subsubsection?

A key difference between resumption and termination is that resumption does
not unwind the stack. A side effect is that when a handler is matched
and run, its try block (the guarded statements) and every try statement
searched before it are still on the stack. This can lead to the recursive
resumption problem.

The recursive resumption problem is any situation where a resumption handler
ends up being called while it is running.
Consider a trivial case:
\begin{cfa}
try {
        throwResume (E &){};
} catchResume(E *) {
        throwResume (E &){};
}
\end{cfa}
When this code is executed the guarded @throwResume@ will throw, start a
search and match the handler in the @catchResume@ clause. The handler is
called and placed on the stack on top of the try block. The second throw then
searches the same try block and calls another instance of the
same handler, leading to an infinite loop.

This situation is trivial and easy to avoid, but much more complex cycles
can form with multiple handlers and different exception types.

To prevent all of these cases we mask sections of the stack, or equivalently
the try statements on the stack, so that the resumption search skips over
them and continues with the next unmasked section of the stack.

A section of the stack is marked when it is searched to see if it contains
a handler for an exception and unmarked when that exception has been handled
or the search was completed without finding a handler.

% This might need a diagram. But it is an important part of the justification
% of the design of the traversal order.
\begin{verbatim}
       throwResume2 ----------.
            |                 |
 generated from handler       |
            |                 |
         handler              |
            |                 |
        throwResume1 -----.   :
            |             |   :
           try            |   : search skip
            |             |   :
        catchResume  <----'   :
            |                 |
\end{verbatim}

The rules can be remembered by thinking about what would be searched in
termination. So when a throw happens in a handler, a termination handler
skips everything from the original throw to the original catch because that
part of the stack has been unwound, and a resumption handler skips the same
section of stack because it has been masked.
A throw in a default handler will perform the same search as the original
throw because, for termination, nothing has been unwound and, for resumption,
the mask will be the same.

The symmetry with termination is why this pattern was picked. Other patterns,
such as marking just the handlers that caught, also work but lack the
symmetry which means there is more to remember.

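
The marking rule can be sketched in \Cpp with an explicit list of try
statements. This is a toy model only: every name in it is hypothetical,
exceptions are reduced to an integer code and each try statement has a single
handler. It reproduces the trivial case above, except that a second try
statement encloses the first, so the throw made inside the running handler
skips the masked inner try statement and is handled by the outer one.
\begin{lstlisting}[language=C++]
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One entry per try statement currently on the stack.
struct TryEntry {
    std::function<bool(int)> handler;  // returns true when it handles the code
    bool masked;
    explicit TryEntry(std::function<bool(int)> h)
        : handler(std::move(h)), masked(false) {}
};

std::vector<TryEntry> tries;       // innermost try statement is at the back
std::vector<std::string> trace;    // records which handlers ran, in order

// Marks each try statement as it is searched and unmarks them once the
// exception is handled or the search finds nothing.
bool raise_resume(int code) {
    std::vector<std::size_t> marked;
    bool handled = false;
    for (std::size_t i = tries.size(); i > 0 && !handled; --i) {
        std::size_t at = i - 1;
        if (tries[at].masked) continue;       // skip a masked section
        tries[at].masked = true;              // mark before running the handler
        marked.push_back(at);
        std::function<bool(int)> handler = tries[at].handler;
        handled = handler(code);              // the handler may raise again
    }
    for (std::size_t at : marked) tries[at].masked = false;
    return handled;
}

// The inner handler raises while it is running; masking sends that second
// search past the inner try statement to the outer one.
bool demo() {
    tries.clear();
    trace.clear();
    tries.emplace_back([](int) { trace.push_back("outer"); return true; });
    tries.emplace_back([](int code) {
        trace.push_back("inner");
        return raise_resume(code);
    });
    bool handled = raise_resume(1);
    tries.clear();
    return handled;
}
\end{lstlisting}
Without the @masked@ check the inner handler would be selected again by its
own throw and the model would recurse without end.
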
\section{Conditional Catch}
Both termination and resumption handler clauses can be given an additional
condition to further control which exceptions they handle:
\begin{cfa}
catch (EXCEPTION_TYPE * NAME ; CONDITION)
\end{cfa}
First, the same semantics is used to match the exception type. Second, if the
exception matches, @CONDITION@ is executed. The condition expression may
reference all names in scope at the beginning of the try block and @NAME@
introduced in the handler clause. If the condition is true, then the handler
matches. Otherwise, the exception search continues as if the exception type
did not match.
\begin{cfa}
try {
        f1 = open( ... );
        f2 = open( ... );
        ...
} catch( IOFailure * f ; fd( f ) == f1 ) {
        // only handle IO failure for f1
}
\end{cfa}
Note, catching @IOFailure@, checking for @f1@ in the handler, and re-raising the
exception if not @f1@ is different because the re-raise does not examine any of
the remaining handlers in the current try statement.

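
The difference can be observed in \Cpp, which has no conditional catch and so
must test inside the handler and rethrow. The names @IoFailure@ and
@rethrow_version@ are hypothetical and exist only for this sketch. When the
test fails, the rethrow leaves the whole try statement, so the second clause
is never considered and the exception is handled by the enclosing try
statement instead.
\begin{lstlisting}[language=C++]
#include <string>

struct IoFailure { int fd; };

// Emulates a conditional catch by testing in the handler and rethrowing.
std::string rethrow_version(int failing_fd, int f1) {
    try {
        try {
            throw IoFailure{failing_fd};
        } catch (const IoFailure & f) {
            if (f.fd != f1) throw;       // rethrow leaves this try statement
            return "first handler";
        } catch (...) {
            return "second handler";     // not considered after the rethrow
        }
    } catch (const IoFailure &) {
        return "enclosing handler";
    }
}
\end{lstlisting}
With a conditional catch, a false condition would instead let the search
continue with the next clause of the same try statement.
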
\colour{red}{From Andrew: I recommend we talk about why the language doesn't
have rethrows/reraises instead.}

Within the handler block or functions called from the handler block, it is
possible to reraise the most recently caught exception with @throw@ or
@throwResume@, respectively.
\begin{cfa}
try {
        ...
} catch( ... ) {
        ... throw;
} catchResume( ... ) {
        ... throwResume;
}
\end{cfa}
The only difference between a raise and a reraise is that reraise does not
create a new exception; instead it continues using the current exception, \ie
no allocation and copy. However, the default handler is still set to the one
visible at the raise point, and hence, for termination could refer to data that
is part of an unwound stack frame. To prevent this problem, a new default
handler is generated that does a program-level abort.

\section{Finally Clauses}
Finally clauses are used to perform unconditional clean-up when leaving a
scope. They are placed at the end of a try statement:
\begin{cfa}
try {
        GUARDED_BLOCK
} ... // any number or kind of handler clauses
... finally {
        FINALLY_BLOCK
}
\end{cfa}
The @FINALLY_BLOCK@ is executed when the try statement is removed from the
stack, including when the @GUARDED_BLOCK@ finishes, any termination handler
finishes or during an unwind.
The only time the block is not executed is if the program is exited before
the stack is unwound.

Execution of the finally block should always finish, meaning control runs off
the end of the block. This requirement ensures control always continues as if
the finally clause is not present, \ie finally is for cleanup not changing
control flow. Because of this requirement, local control flow out of the
finally block is forbidden. The compiler precludes any @break@, @continue@,
@fallthru@ or @return@ that causes control to leave the finally block. Other
ways to leave the finally block, such as a long jump or termination, are much
harder to check, and at best require additional run-time overhead, and so are
merely discouraged.

Not all languages with exceptions have finally clauses. Notably \Cpp does
without them as destructors serve a similar role. Although destructors and
finally clauses can be used in many of the same areas they have their own
use cases like top-level functions and lambda functions with closures.
Destructors take a bit more work to set up but are much easier to reuse while
finally clauses are good for one-offs and can include local information.

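
To illustrate the destructor approach, a finally clause can be approximated
in \Cpp with a small scope guard. The class @Finally@ and the function @run@
are hypothetical names for this sketch and are not part of the \Cpp standard
library. The guard is declared around the try statement so its action runs
after any handler, matching the point at which the try statement is removed
from the stack.
\begin{lstlisting}[language=C++]
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs a stored action when the enclosing scope is left, by any route.
class Finally {
    std::function<void()> action;
public:
    explicit Finally(std::function<void()> a) : action(std::move(a)) {}
    ~Finally() { action(); }
    Finally(const Finally &) = delete;
    Finally & operator=(const Finally &) = delete;
};

std::vector<std::string> run(bool fail) {
    std::vector<std::string> log;
    {
        Finally cleanup([&log] { log.push_back("finally"); });
        try {
            log.push_back("guarded");
            if (fail) throw 1;
        } catch (int) {
            log.push_back("handler");
        }
    }   // cleanup runs here, after the handler, on every path
    return log;
}
\end{lstlisting}
The guard class is the extra set-up mentioned above; once written it can be
reused, while the lambda keeps access to local information such as @log@.
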
\section{Cancellation}
Cancellation is a stack-level abort, which can be thought of as an
uncatchable termination. It unwinds the entirety of the current stack, and if
possible forwards the cancellation exception to a different stack.

Cancellation is not an exception operation like termination or resumption.
There is no special statement for starting a cancellation; instead the standard
library function @cancel_stack@ is called passing an exception. Unlike a
throw, this exception is not used in matching, only to pass information about
the cause of the cancellation.
(This also means matching cannot fail so there is no default handler either.)

After @cancel_stack@ is called the exception is copied into the exception
handling mechanism's memory. Then the entirety of the current stack is
unwound. After that it depends on which stack is being cancelled.
\begin{description}
\item[Main Stack:]
The main stack is the one used by the program main at the start of execution,
and is the only stack in a sequential program. Even in a concurrent program
the main stack is only dependent on the environment that started the program.
Hence, when the main stack is cancelled there is nowhere else in the program
to notify. After the stack is unwound, there is a program-level abort.

\item[Thread Stack:]
A thread stack is created for a @thread@ object or object that satisfies the
@is_thread@ trait. A thread only has two points of communication that must
happen: start and join. As the thread must be running to perform a
cancellation, it must occur after start and before join, so join is used
for communication here.
After the stack is unwound, the thread halts and waits for
another thread to join with it. The joining thread checks for a cancellation,
and if present, raises the resumption exception @ThreadCancelled@.

There is a subtle difference between the explicit join (@join@ function) and
implicit join (from a destructor call). The explicit join takes the default
handler (@defaultResumptionHandler@) from its calling context, which is used if
the exception is not caught. The implicit join does a program abort instead.

This semantics is for safety. If an unwind is triggered while another unwind
is underway only one of them can proceed as they both want to ``consume'' the
stack. Letting both try to proceed leads to very undefined behaviour.
Both termination and cancellation involve unwinding and, since the default
@defaultResumptionHandler@ performs a termination, that could more easily
happen in an implicit join inside a destructor. So there is an error message
and an abort instead.
\todo{Perhaps have a more general discussion of unwind collisions before
this point.}

The recommended way to avoid the abort is to handle the initial resumption
from the implicit join. If required you may put an explicit join inside a
finally clause to disable the check and use the local
@defaultResumptionHandler@ instead.

\item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object
or object that satisfies the @is_coroutine@ trait. A coroutine only knows of
two other coroutines, its starter and its last resumer. Of the two the last
resumer has the tightest coupling to the coroutine it activated and the most
up-to-date information.

Hence, cancellation of the active coroutine is forwarded to the last resumer
after the stack is unwound. When the resumer restarts, it raises the resumption
exception @CoroutineCancelled@, which is polymorphic over the coroutine type
and has a pointer to the cancelled coroutine.

The resume function also has an assertion for the @defaultResumptionHandler@
for the exception, so it will use the default handler like a regular throw.
\end{description}