
source: doc/theses/andrew_beach_MMath/features.tex @ d987881

Last change on this file since d987881 was 166b384, checked in by Andrew Beach <ajbeach@…>, 3 years ago
Andrew MMath: Added the missing front matter and corrected a few spelling/grammar mistakes.

1	\chapter{Exception Features}
2	\label{c:features}
3
4	This chapter covers the design and user interface of the \CFA EHM
5	and begins with a general overview of EHMs. It is not a strict
6	definition of all EHMs nor an exhaustive list of all possible features.
7	However, it does cover the most common structure and features found in them.
8
9	\section{Overview of EHMs}
10	% We should cover what is an exception handling mechanism and what is an
11	% exception before this. Probably in the introduction. Some of this could
12	% move there.
13	\subsection{Raise / Handle}
14	An exception operation has two main parts: raise and handle.
15	These terms are sometimes known as throw and catch but this work uses
16	throw/catch as a particular kind of raise/handle.
17	These are the two parts that the user writes and may
18	be the only two pieces of the EHM that have any syntax in a language.
19
20	\paragraph{Raise}
21	The raise is the starting point for exception handling:
22	raising an exception passes it to
23	the EHM.
24
25	Some well known examples include the @throw@ statements of \Cpp and Java and
26	the \code{Python}{raise} statement of Python. In real systems, a raise may
27	perform some other work (such as memory management) but for the
28	purposes of this overview that can be ignored.
29
30	\paragraph{Handle}
31	The primary purpose of an EHM is to run some user code to handle a raised
32	exception. This code is given, along with some other information,
33	in a handler.
34
35	A handler has three common features: the previously mentioned user code, a
36	region of code it guards and an exception label/condition that matches
37	against the raised exception.
38	A given handler can only handle raises that occur inside its guarded region
39	and whose exceptions match its label.
40	If multiple handlers could handle an exception,
41	EHMs define a rule to pick one, such as ``best match" or ``first found".
42
43	The @try@ statements of \Cpp, Java and Python are common examples. All three
44	also show another common feature of handlers: they are grouped by the guarded
45	region.
46
47	\subsection{Propagation}
48	After an exception is raised comes what is usually the biggest step for the
49	EHM: finding and setting up the handler for execution.
50	The propagation from raise to
51	handler can be broken up into three different tasks: searching for a handler,
52	matching against the handler and installing the handler.
53
54	\paragraph{Searching}
55	The EHM begins by searching for handlers that might be used to handle
56	the exception.
57	The search will find handlers that have the raise site in their guarded
58	region.
59	The search includes handlers in the current function, as well as any in
60	callers on the stack that have the function call in their guarded region.
61
62	\paragraph{Matching}
63	Each handler found is matched against the raised exception. The exception
64	label defines a condition that is used with the exception and decides if
65	there is a match or not.
66	%
67	In languages where the first match is used, this step is intertwined with
68	searching; a match check is performed immediately after the search finds
69	a handler.
70
71	\paragraph{Installing}
72	After a handler is chosen, it must be made ready to run.
73	The implementation can vary widely to fit with the rest of the
74	design of the EHM. The installation step might be trivial or it could be
75	the most expensive step in handling an exception. The latter tends to be the
76	case when stack unwinding is involved.
77
78	If a matching handler is not guaranteed to be found, the EHM needs a
79	different course of action for this case.
80	This situation only occurs with unchecked exceptions as checked exceptions
81	(such as in Java) can make the guarantee.
82	The unhandled action is usually very general, such as aborting the program.
83
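The three propagation tasks can be illustrated with a small model.
The following Python sketch is not how \CFA or any real EHM is implemented.
It assumes a hypothetical @Frame@ class that lists the handlers active in each
stack frame as (label, body) pairs, and it uses the ``first found" rule.
Installing the handler is left to the caller, since that step depends on the
rest of the EHM design.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One stack frame and the handlers whose guarded region is active in it."""
    name: str
    # (exception label, handler body) pairs, in the order they were written.
    handlers: list = field(default_factory=list)


def find_handler(stack, exc):
    """Search from the raise site (end of the list) towards the base of the stack.

    Returns (frame, body) for the first handler whose label matches the
    exception, or None so the caller can take the unhandled action.
    """
    for frame in reversed(stack):            # searching: callee to caller
        for label, body in frame.handlers:
            if isinstance(exc, label):       # matching: test the label
                return frame, body           # installing is left to the caller
    return None
```

Because the match check runs as soon as the search finds a handler, searching
and matching are intertwined here, as they are in first-match languages.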
84	\paragraph{Hierarchy}
85	A common way to organize exceptions is in a hierarchical structure.
86	This pattern comes from object-oriented languages where the
87	exception hierarchy is a natural extension of the object hierarchy.
88
89	Consider the following exception hierarchy:
90	\begin{center}
91	\input{exception-hierarchy}
92	\end{center}
93	A handler labeled with any given exception can handle exceptions of that
94	type or any child type of that exception. The root of the exception hierarchy
95	(here \code{C}{exception}) acts as a catch-all, leaf types catch single types
96	and the exceptions in the middle can be used to catch different groups of
97	related exceptions.
98
99	This system has some notable advantages, such as multiple levels of grouping,
100	the ability for libraries to add new exception types and the isolation
101	between different sub-hierarchies.
102	This design is used in \CFA even though it is not an object-oriented
103	language, so different tools are used to create the hierarchy.
104
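As an illustration of hierarchical matching in a language that already has it,
the following Python sketch declares a small hierarchy and a function with a
leaf handler, a group handler and a catch-all.
The type names (@AppError@, @IOFailure@ and so on) are hypothetical and are not
the types in the figure above.

```python
class AppError(Exception):
    """Root of a hypothetical exception hierarchy (the catch-all)."""


class IOFailure(AppError):
    """Middle of the hierarchy: groups related exceptions."""


class FileNotFound(IOFailure):
    """Leaf type."""


class Timeout(IOFailure):
    """Leaf type."""


class ArithmeticFailure(AppError):
    """A separate sub-hierarchy, isolated from IOFailure."""


def classify(exc):
    """Raise exc and report which level of the hierarchy handled it."""
    try:
        raise exc
    except FileNotFound:      # leaf: catches a single type
        return "leaf"
    except IOFailure:         # middle: catches a group of related types
        return "group"
    except AppError:          # root: catches everything in the tree
        return "catch-all"
```

A library could add a new child of @IOFailure@ and the existing group handler
would catch it without modification.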
105	% Could I cite the rational for the Python IO exception rework?
106
107	\subsection{Completion}
108	After the handler has finished, the entire exception operation has to complete
109	and continue executing somewhere else. This step is usually simple,
110	both logically and in its implementation, as the installation of the handler
111	is usually set up to do most of the work.
112
113	The EHM can return control to many different places, where
114	the most common are after the handler definition (termination)
115	and after the raise (resumption).
116
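The difference between the two completion points can be made concrete with a
trace of execution order.
Python only supports termination, so in this sketch resumption is approximated
by passing the handler to the raising function and calling it directly.
That substitution, and the names @run_termination@ and @run_resumption@, are
assumptions made for illustration.

```python
def run_termination(trace):
    """Control continues after the handler; the rest of worker is skipped."""
    def worker():
        trace.append("before raise")
        raise RuntimeError("stop")
        # Nothing after the raise runs: the stack is unwound.

    try:
        worker()
    except RuntimeError:
        trace.append("handler")
    trace.append("after handler")


def run_resumption(trace):
    """Control continues after the raise; worker finishes normally."""
    def handler(exc):
        trace.append("handler")

    def worker(on_raise):
        trace.append("before raise")
        on_raise(RuntimeError("fix me"))   # the "raise" is an ordinary call
        trace.append("after raise")        # resumption continues here

    worker(handler)
    trace.append("after worker")
```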
117	\subsection{Communication}
118	For effective exception handling, additional information is often passed
119	from the raise to the handler and back again.
120	So far, only communication of the exception's identity is covered.
121	A common communication method for adding information to an exception
122	is putting fields into the exception instance
123	and giving the handler access to them.
124	% You can either have pointers/references in the exception, or have p/rs to
125	% the exception when it doesn't have to be copied.
126	Passing references or pointers allows data at the raise location to be
127	updated, passing information in both directions.
128
129	\section{Virtuals}
130	\label{s:virtuals}
131	A common feature in many programming languages is a tool to pair code
132	(behaviour) with data.
133	In \CFA, this is done with the virtual system,
134	which allows type information to be abstracted away and recovered, and allows
135	operations to be performed on the abstract objects.
136
137	Virtual types and casts are not part of \CFA's EHM nor are they required for
138	an EHM.
139	However, one of the best ways to support an exception hierarchy
140	is via a virtual hierarchy and dispatch system.
141	Ideally, the virtual system would have been part of \CFA before the work
142	on exception handling began, but unfortunately it was not.
143	Hence, only the features and framework needed for the EHM were
144	designed and implemented for this thesis.
145	Other features were considered to ensure that
146	the structure could accommodate other desirable features in the future
147	but are not implemented.
148	The rest of this section only discusses the implemented subset of the
149	virtual system design.
150
151	The virtual system supports multiple ``trees" of types. Each tree is
152	a simple hierarchy with a single root type. Each type in a tree has exactly
153	one parent -- except for the root type which has zero parents -- and any
154	number of children.
155	Any type that belongs to any of these trees is called a virtual type.
156	% A type's ancestors are its parent and its parent's ancestors.
157	% The root type has no ancestors.
158	% A type's descendants are its children and its children's descendants.
159
160	For the purposes of illustration, a proposed, but unimplemented, syntax
161	will be used. Each virtual type is represented by a trait with an annotation
162	that makes it a virtual type. This annotation is empty for a root type, which
163	creates a new tree:
164	\begin{cfa}
165	trait root_type(T) virtual() {}
166	\end{cfa}
167	The annotation may also refer to any existing virtual type to make this new
168	type a child of that type and part of the same tree. The parent may itself
169	be a child or a root type and may have any number of existing children.
170
171	% OK, for some reason the b and t positioning options are reversed here.
172	\begin{minipage}[b]{0.6\textwidth}
173	\begin{cfa}
174	trait child_a(T) virtual(root_type) {}
175	trait grandchild(T) virtual(child_a) {}
176	trait child_b(T) virtual(root_type) {}
177	\end{cfa}
178	\end{minipage}
179	\begin{minipage}{0.4\textwidth}
180	\begin{center}
181	\input{virtual-tree}
182	\end{center}
183	\end{minipage}
184
185	Every virtual type also has a list of virtual members and a unique id.
186	Both are stored in a virtual table.
187	Every instance of a virtual type also has a pointer to a virtual table stored
188	in it, although there is no per-type virtual table as in many other languages.
189
190	The list of virtual members is accumulated from the root type down the tree.
191	Every virtual type
192	inherits the list of virtual members from its parent and may add more
193	virtual members to the end of the list which are passed on to its children.
194	Again, using the unimplemented syntax this might look like:
195	\begin{cfa}
196	trait root_type(T) virtual() {
197	const char * to_string(T const & this);
198	unsigned int size;
199	}
200
201	trait child_type(T) virtual(root_type) {
202	char * irrelevant_function(int, char);
203	}
204	\end{cfa}
205	% Consider adding a diagram, but we might be good with the explanation.
206
207	As @child_type@ is a child of @root_type@, it has the virtual members of
208	@root_type@ (@to_string@ and @size@) as well as the one it declared
209	(@irrelevant_function@).
210
211	It is important to note that these are virtual members, and may contain
212	arbitrary fields, functions or otherwise.
213	The names ``size" and ``align" are reserved for the size and alignment of the
214	virtual type, and are always automatically initialized as such.
215	The other special case is uses of the trait's polymorphic argument
216	(@T@ in the example), which are always updated to refer to the current
217	virtual type. This allows functions that refer to the polymorphic argument
218	to act as traditional virtual methods (@to_string@ in the example), as the
219	object can always be passed to a virtual method in its virtual table.
220
221	Up until this point, the virtual system is similar to ones found in
222	object-oriented languages, but this is where \CFA diverges.
223	Objects encapsulate a single set of methods in each type,
224	universally across the entire program,
225	and indeed all programs that use that type definition.
226	The only way to change any method is to inherit and define a new type with
227	its own universal implementation. In this sense,
228	these object-oriented types are ``closed" and cannot be altered.
229	% Because really they are class oriented.
230
231	In \CFA, types do not encapsulate any code.
232	Whether or not a type satisfies any given assertion, and hence any trait, is
233	context sensitive. Types can begin to satisfy a trait, stop satisfying it or
234	satisfy the same trait in a different way at any lexical location in the program.
235	In this sense, a type's implementation in the set of functions and variables
236	that allow it to satisfy a trait is ``open" and can change
237	throughout the program.
238	This capability means it is impossible to pick a single set of functions
239	that represent a type's implementation across a program.
240
241	\CFA side-steps this issue by not having a single virtual table for each
242	type. A user can define virtual tables that are filled in at their
243	declaration and given a name. Anywhere that name is visible, even if it is
244	defined locally inside a function (although in this case the user must ensure
245	it outlives any objects that use it), it can be used.
246	Specifically, an object of a virtual type is ``bound" to a virtual table that
247	sets the virtual members for that object. The virtual members can be accessed
248	through the object.
249
250	This means virtual tables are declared and named in \CFA.
251	They are declared as variables, using the type
252	@vtable(VIRTUAL_TYPE)@ and any valid name. For example:
253	\begin{cfa}
254	vtable(virtual_type_name) table_name;
255	\end{cfa}
256
257	Like any variable, they may be forward declared with the @extern@ keyword.
258	Forward declaring virtual tables is relatively common.
259	Many virtual types have an ``obvious" implementation that works in most
260	cases.
261	A pattern that has appeared in the early work using virtuals is to
262	implement a virtual table with the obvious definition and place a forward
263	declaration of it in the header beside the definition of the virtual type.
264
265	Even on the full declaration, no initializer should be used.
266	Initialization is automatic.
267	The type id and special virtual members ``size" and ``align" only depend on
268	the virtual type, which is fixed given the type of the virtual table, and
269	so the compiler fills in a fixed value.
270	The other virtual members are resolved using the best match to the member's
271	name and type, in the same context as the virtual table is declared using
272	\CFA's normal resolution rules.
273
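The idea of named virtual tables that are bound per object, rather than one
table per type, can be modelled outside \CFA.
The following Python sketch is only an analogy: the @VTable@ and @Point@
classes are hypothetical, and the table members are filled in by hand, whereas
\CFA resolves them automatically in the context of the table's declaration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VTable:
    """Hypothetical stand-in for a named virtual table."""
    type_id: str                   # fixed by the virtual type
    to_string: Callable[..., str]  # a virtual member chosen per table


@dataclass
class Point:
    vtable: VTable   # every instance stores a pointer to some table
    x: int
    y: int


def plain(p):
    return f"({p.x}, {p.y})"


def verbose(p):
    return f"Point(x={p.x}, y={p.y})"


# Two named tables for the same virtual type, each with its own members.
point_plain_vt = VTable("Point", plain)
point_verbose_vt = VTable("Point", verbose)


def to_string(p):
    """Dispatch through the table bound to this particular object."""
    return p.vtable.to_string(p)
```

Two @Point@ objects with equal fields can therefore behave differently,
depending on which table each one was bound to at construction.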
274	While much of the virtual infrastructure has been created,
275	it is currently only used
276	internally for exception handling. The only user-level feature is the virtual
277	cast, which is the same as the \Cpp \code{C++}{dynamic_cast}.
278	\label{p:VirtualCast}
279	\begin{cfa}
280	(virtual TYPE)EXPRESSION
281	\end{cfa}
282	Note, the syntax and semantics match a C-cast, rather than the function-like
283	\Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be
284	pointers to virtual types.
285	The cast dynamically checks if the @EXPRESSION@ type is the same or a sub-type
286	of @TYPE@, and if true, returns a pointer to the
287	@EXPRESSION@ object, otherwise it returns @0p@ (null pointer).
288	This allows the expression to be used as both a cast and a type check.
289
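The check performed by the virtual cast amounts to walking up the type tree
from the object's type.
The following Python sketch models that check using the example tree from
earlier in this section.
The @VType@ and @Obj@ classes and the @virtual_cast@ function are hypothetical,
and @None@ stands in for @0p@; the real cast uses the type id stored in the
virtual table.

```python
class VType:
    """A node in a hypothetical tree of virtual types."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # None for a root type

    def is_same_or_descendant_of(self, other):
        node = self
        while node is not None:    # walk from this type up to the root
            if node is other:
                return True
            node = node.parent
        return False


class Obj:
    """An instance that records its virtual type."""

    def __init__(self, vtype):
        self.vtype = vtype


def virtual_cast(target, obj):
    """Return obj if its type is target or a sub-type of it, otherwise None."""
    return obj if obj.vtype.is_same_or_descendant_of(target) else None


# The example tree: root_type has children child_a and child_b,
# and child_a has the child grandchild.
root_type = VType("root_type")
child_a = VType("child_a", root_type)
grandchild = VType("grandchild", child_a)
child_b = VType("child_b", root_type)
```

Testing the result against @None@ gives the type check, and using the result
gives the cast.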
290	\section{Exceptions}
291
292	The syntax for declaring an exception is the same as declaring a structure
293	except the keyword:
294	\begin{cfa}
295	exception TYPE_NAME {
296	FIELDS
297	};
298	\end{cfa}
299
300	Fields are filled in the same way as a structure as well. However, an extra
301	field is added that contains the pointer to the virtual table.
302	It must be explicitly initialized by the user when the exception is
303	constructed.
304
305	Here is an example of declaring an exception type along with a virtual table,
306	assuming the exception has an ``obvious" implementation and a default
307	virtual table makes sense.
308
309	\begin{minipage}[t]{0.4\textwidth}
310	Header (.hfa):
311	\begin{cfa}
312	exception Example {
313	int data;
314	};
315
316	extern vtable(Example)
317	example_base_vtable;
318	\end{cfa}
319	\end{minipage}
320	\begin{minipage}[t]{0.6\textwidth}
321	Implementation (.cfa):
322	\begin{cfa}
323	vtable(Example) example_base_vtable;
324	\end{cfa}
325	\vfil
326	\end{minipage}
327
328	%\subsection{Exception Details}
329	This is the only interface needed when raising and handling exceptions.
330	However, it is actually a shorthand for a more complex
331	trait-based interface.
332
333	The language views exceptions through a series of traits.
334	If a type satisfies them, then it can be used as an exception. The following
335	is the base trait all exceptions need to match.
336	\begin{cfa}
337	trait is_exception(exceptT &, virtualT &) {
338	// Numerous imaginary assertions.
339	};
340	\end{cfa}
341	The trait is defined over two types: the exception type and the virtual table
342	type. Each exception type should have a single virtual table type.
343	There are no actual assertions in this trait because the trait system
344	cannot express them yet (adding such assertions would be part of
345	completing the virtual system). The imaginary assertions would probably come
346	from a trait defined by the virtual system, and state that the exception type
347	is a virtual type,
348	that the type is a descendant of @exception_t@ (the base exception type)
349	and allow the user to find the virtual table type.
350
351	% I did have a note about how it is the programmer's responsibility to make
352	% sure the function is implemented correctly. But this is true of every
353	% similar system I know of (except Agda's I guess) so I took it out.
354
355	There are two more traits for exceptions defined as follows:
356	\begin{cfa}
357	trait is_termination_exception(
358	exceptT &, virtualT & \| is_exception(exceptT, virtualT)) {
359	void defaultTerminationHandler(exceptT &);
360	};
361
362	trait is_resumption_exception(
363	exceptT &, virtualT & \| is_exception(exceptT, virtualT)) {
364	void defaultResumptionHandler(exceptT &);
365	};
366	\end{cfa}
367	Both traits ensure a pair of types is an exception type and
368	its virtual table type,
369	and define one of the two default handlers. The default handlers are used
370	as fallbacks and are discussed in detail in \autoref{s:ExceptionHandling}.
371
372	However, all three of these traits can be tricky to use directly.
373	While there is a bit of repetition required,
374	the largest issue is that the name of the virtual table type is mangled and is
375	not user-facing. So, these three macros are provided to wrap these traits to
376	simplify referring to the names:
377	@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.
378
379	All three take one or two arguments. The first argument is the name of the
380	exception type. The macro passes its unmangled and mangled form to the trait.
381	The second (optional) argument is a parenthesized list of polymorphic
382	arguments. This argument is only used with polymorphic exceptions and the
383	list is passed to both types.
384	In the current set-up, the two types always have the same polymorphic
385	arguments, so these macros can be used without losing flexibility.
386
387	For example, consider a function that is polymorphic over types that have a
388	defined arithmetic exception:
389	\begin{cfa}
390	forall(Num \| IS_EXCEPTION(Arithmetic, (Num)))
391	void some_math_function(Num & left, Num & right);
392	\end{cfa}
393
394	\section{Exception Handling}
395	\label{s:ExceptionHandling}
396	As stated,
397	\CFA provides two kinds of exception handling: termination and resumption.
398	These twin operations are the core of \CFA's exception handling mechanism.
399	This section covers the general patterns shared by the two operations and
400	then goes on to cover the details of each individual operation.
401
402	Both operations follow the same set of steps.
403	First, a user raises an exception.
404	Second, the exception propagates up the stack, searching for a handler.
405	Third, if a handler is found, the exception is caught and the handler is run.
406	After that, control continues at a raise-dependent location.
407	As an alternative to the third step,
408	if a handler is not found, a default handler is run and, if it returns,
409	then control
410	continues after the raise.
411
412	The differences between the two operations include how propagation is
413	performed, where execution continues after an exception is handled
414	and which default handler is run.
415
416	\subsection{Termination}
417	\label{s:Termination}
418	Termination handling is the familiar kind of handling
419	used in most programming
420	languages with exception handling.
421	It is a dynamic, non-local goto. If the raised exception is matched and
422	handled, the stack is unwound and control (usually) continues in the function
423	on the call stack that defined the handler.
424	Termination is commonly used when an error has occurred and recovery is
425	impossible locally.
426
427	% (usually) Control can continue in the current function but then a different
428	% control flow construct should be used.
429
430	A termination raise is started with the @throw@ statement:
431	\begin{cfa}
432	throw EXPRESSION;
433	\end{cfa}
434	The expression must return a reference to a termination exception, where the
435	termination exception is any type that satisfies the trait
436	@is_termination_exception@ at the call site.
437	Through \CFA's trait system, the trait functions are implicitly passed into the
438	throw code for use by the EHM.
439	A new @defaultTerminationHandler@ can be defined in any scope to
440	change the throw's behaviour when a handler is not found (see below).
441
442	The throw copies the provided exception into managed memory to ensure
443	the exception is not destroyed if the stack is unwound.
444	It is the user's responsibility to ensure the original exception is cleaned
445	up whether the stack is unwound or not. Allocating it on the stack is
446	usually sufficient.
447
448	% How to say propagation starts, its first sub-step is the search.
449	Then propagation starts with the search. \CFA uses a ``first match" rule so
450	matching is performed with the copied exception as the search key.
451	It starts from the raise site and proceeds towards the base of the stack,
452	from callee to caller.
453	At each stack frame, a check is made for termination handlers defined by the
454	@catch@ clauses of a @try@ statement.
455	\begin{cfa}
456	try {
457	GUARDED_BLOCK
458	} catch (EXCEPTION_TYPE$\(_1\)$ * [NAME$\(_1\)$]) {
459	HANDLER_BLOCK$\(_1\)$
460	} catch (EXCEPTION_TYPE$\(_2\)$ * [NAME$\(_2\)$]) {
461	HANDLER_BLOCK$\(_2\)$
462	}
463	\end{cfa}
464	When viewed on its own, a try statement simply executes the statements
465	in the \snake{GUARDED_BLOCK} and when those are finished,
466	the try statement finishes.
467
468	However, while the guarded statements are being executed, including any
469	invoked functions, all the handlers in these statements are included in the
470	search path.
471	Hence, if a termination exception is raised, these handlers may be matched
472	against the exception and may handle it.
473
474	Exception matching checks the handler in each catch clause in the order
475	they appear, top to bottom. If the representation of the raised exception type
476	is the same or a descendant of @EXCEPTION_TYPE@$_i$, then @NAME@$_i$
477	(if provided) is
478	bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$
479	are executed. If control reaches the end of the handler, the exception is
480	freed and control continues after the try statement.
481
482	If no termination handler is found during the search, then the default handler
483	(\defaultTerminationHandler) visible at the raise statement is called.
484	Through \CFA's trait system the best match at the raise statement is used.
485	This function is run and is passed the copied exception.
486	If the default handler finishes, control continues after the raise statement.
487
488	There is a global @defaultTerminationHandler@ that is polymorphic over all
489	termination exception types.
490	The global default termination handler performs a cancellation
491	(as described in \vref{s:Cancellation})
492	on the current stack with the copied exception.
493	Since it is so general, a more specific handler can be defined,
494	overriding the default behaviour for the specific exception types.
495
496	For example, consider an error reading a configuration file.
497	This is most likely a problem with the configuration file (@config_error@),
498	but the function could have been passed the wrong file name (@arg_error@).
499	In this case the function could raise one exception and then, if it is
500	unhandled, raise the other.
501	This is not usual behaviour for either exception so changing the
502	default handler will be done locally:
503	\begin{cfa}
504	{
505	void defaultTerminationHandler(config_error &) {
506	throw (arg_error){arg_vt};
507	}
508	throw (config_error){config_vt};
509	}
510	\end{cfa}
511
512	\subsection{Resumption}
513	\label{s:Resumption}
514
515	Resumption exception handling is a less familiar form of exception handling,
516	but is
517	just as old~\cite{Goodenough75} and is simpler in many ways.
518	It is a dynamic, non-local function call. If the raised exception is
519	matched, a closure is taken from up the stack and executed,
520	after which the raising function continues executing.
521	The common uses for resumption exceptions include
522	potentially repairable errors, where execution can continue in the same
523	function once the error is corrected, and
524	ignorable events, such as logging where nothing needs to happen and control
525	should always continue from the raise site.
526
527	Except for the changes to fit into that pattern, resumption exception
528	handling is symmetric with termination exception handling, by design
529	(see \autoref{s:Termination}).
530
531	A resumption raise is started with the @throwResume@ statement:
532	\begin{cfa}
533	throwResume EXPRESSION;
534	\end{cfa}
535	% The new keywords are currently ``experimental" and not used in this work.
536	It works much the same way as the termination raise, except the
537	type must satisfy the \snake{is_resumption_exception} trait, which uses the
538	default handler: \defaultResumptionHandler.
539	This can be specialized for particular exception types.
540
541	At run-time, no exception copy is made. Since
542	resumption does not unwind the stack nor otherwise remove values from the
543	current scope, there is no need to manage memory to keep the exception
544	allocated.
545
546	Then propagation starts with the search,
547	following the same search path as termination,
548	from the raise site to the base of the stack and top of try statement to bottom.
549	However, the handlers on try statements are defined by @catchResume@ clauses.
550	\begin{cfa}
551	try {
552	GUARDED_BLOCK
553	} catchResume (EXCEPTION_TYPE$\(_1\)$ * [NAME$\(_1\)$]) {
554	HANDLER_BLOCK$\(_1\)$
555	} catchResume (EXCEPTION_TYPE$\(_2\)$ * [NAME$\(_2\)$]) {
556	HANDLER_BLOCK$\(_2\)$
557	}
558	\end{cfa}
559	Note that termination handlers and resumption handlers may be used together
560	in a single try statement, intermixing @catch@ and @catchResume@ freely.
561	Each type of handler only interacts with exceptions from the matching
562	kind of raise.
563	Like @catch@ clauses, @catchResume@ clauses have no effect if an exception
564	is not raised.
565
566	The matching rules are exactly the same as well.
567	The first major difference here is that after
568	@EXCEPTION_TYPE@$_i$ is matched and @NAME@$_i$ is bound to the exception,
569	@HANDLER_BLOCK@$_i$ is executed right away without first unwinding the stack.
570	After the block has finished running, control jumps to the raise site, where
571	the just handled exception came from, and continues executing after it,
572	not after the try statement.
573
574	For instance, a resumption used to send messages to the logger may not
575	need to be handled at all. Putting the following default handler
576	at the global scope can make handling that exception optional by default.
577	\begin{cfa}
578	void defaultResumptionHandler(log_message &) {
579	// Nothing, it is fine not to handle logging.
580	}
581	// ... No change at raise sites. ...
582	throwResume (log_message){strlit_log, "Begin event processing."};
583	\end{cfa}
584
585	\subsubsection{Resumption Marking}
586	\label{s:ResumptionMarking}
587	A key difference between resumption and termination is that resumption does
588	not unwind the stack. A side effect is that, when a handler is matched
589	and run, its try block (the guarded statements) and every try statement
590	searched before it are still on the stack. Their presence can lead to
591	the recursive resumption problem~\cite{Buhr00a}.
592	% Other possible citation is MacLaren77, but the form is different.
593
594	The recursive resumption problem is any situation where a resumption handler
595	ends up being called while it is running.
596	Consider a trivial case:
597	\begin{cfa}
598	try {
599	throwResume (E &){};
600	} catchResume(E *) {
601	throwResume (E &){};
602	}
603	\end{cfa}
604	When this code is executed, the guarded @throwResume@ starts a
605	search and matches the handler in the @catchResume@ clause. This
606	call is placed on the stack above the try-block.
607	Now the second raise in the handler searches the same try block,
608	matches again and then puts another instance of the
609	same handler on the stack leading to infinite recursion.
610
611	While this situation is trivial and easy to avoid, much more complex cycles
612	can form with multiple handlers and different exception types.
613	To prevent all of these cases, each try statement is ``marked" from the
614	time the exception search reaches it to either when a handler completes
615	handling that exception or when the search reaches the base
616	of the stack.
617	While a try statement is marked, its handlers are never matched, effectively
618	skipping over it to the next try statement.
619
620	\begin{center}
621	\input{stack-marking}
622	\end{center}
623
624	There are other sets of marking rules that could be used.
625	For instance, marking just the handlers that caught the exception
626	would also prevent recursive resumption.
627	However, the rules selected mirror what happens with termination,
628	so this reduces the number of rules and patterns a programmer has to know.
629
630	The marked try statements are the ones that would be removed from
631	the stack for a termination exception, \ie those on the stack
632	between the handler and the raise statement.
633	This symmetry applies to the default handler as well, as both kinds of
634	default handlers are run at the raise statement, rather than (physically
635	or logically) at the bottom of the stack.
636	% In early development having the default handler happen after
637	% unmarking was just more useful. We assume that will continue.
638
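The marking rule can be checked against the trivial recursive case with a small
model.
The following Python sketch is an illustration only and is not the \CFA
run-time: @TryStatement@ and @ResumptionStack@ are hypothetical classes, the
stack of try statements is an explicit list, and the default handler is
reduced to recording the unhandled exception.

```python
class TryStatement:
    def __init__(self, name, handlers):
        self.name = name
        self.handlers = handlers   # list of (exception type, body) pairs
        self.marked = False


class ResumptionStack:
    def __init__(self):
        self.try_stack = []        # base of the stack first
        self.unhandled = []        # stands in for the default handler

    def throw_resume(self, exc):
        newly_marked = []
        try:
            for stmt in reversed(self.try_stack):
                if stmt.marked:
                    continue                   # skip over marked statements
                stmt.marked = True             # mark when the search reaches it
                newly_marked.append(stmt)
                for exc_type, body in stmt.handlers:
                    if isinstance(exc, exc_type):
                        body(self, exc)        # runs without unwinding
                        return                 # continue after the raise
        finally:
            # Unmark when the handler completes or the search reaches the base.
            for stmt in newly_marked:
                stmt.marked = False
        # No handler matched: run the default action after unmarking.
        self.unhandled.append(exc)
```

With a single try statement whose handler raises the same exception again, the
second search skips the marked statement and falls through to the default
action, so the handler runs once instead of recursing.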
639	\section{Conditional Catch}
640	Both termination and resumption handler clauses can be given an additional
641	condition to further control which exceptions they handle:
642	\begin{cfa}
643	catch (EXCEPTION_TYPE * [NAME] ; CONDITION)
644	\end{cfa}
645	First, the same semantics is used to match the exception type. Second, if the
646	exception matches, @CONDITION@ is executed. The condition expression may
647	reference all names in scope at the beginning of the try block and @NAME@
648	introduced in the handler clause. If the condition is true, then the handler
649	matches. Otherwise, the exception search continues as if the exception type
650	did not match.
651
652	The condition matching allows finer matching by checking
653	more kinds of information than just the exception type.
654	\begin{cfa}
655	try {
656	handle1 = open( f1, ... );
657	handle2 = open( f2, ... );
658	handle3 = open( f3, ... );
659	...
660	} catch( IOFailure * f ; fd( f ) == f1 ) {
661	// Only handle IO failure for f1.
662	} catch( IOFailure * f ; fd( f ) == f3 ) {
663	// Only handle IO failure for f3.
664	}
665	// Handle a failure relating to f2 further down the stack.
666	\end{cfa}
667	In this example, the file that experienced the IO error is used to decide
668	which handler should be run, if any at all.
669
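The two-step match, type first and then condition, can be written as a short
selection loop.
The following Python sketch models the file example above.
The @IOFailure@ class, its @fd@ attribute and the @select_handler@ function are
hypothetical, and a result of @None@ stands for the search continuing further
down the stack.

```python
class IOFailure(Exception):
    def __init__(self, fd):
        super().__init__(f"IO failure on {fd}")
        self.fd = fd   # which file experienced the failure


def select_handler(clauses, exc):
    """Return the body of the first clause whose type and condition both match."""
    for exc_type, condition, body in clauses:
        # First the exception type is matched, then the condition is run.
        if isinstance(exc, exc_type) and condition(exc):
            return body
    return None   # treated as if no exception type matched


# Handlers for f1 and f3 only; a failure on f2 is left for an outer handler.
clauses = [
    (IOFailure, lambda e: e.fd == "f1", lambda e: "handled f1"),
    (IOFailure, lambda e: e.fd == "f3", lambda e: "handled f3"),
]
```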
670	\begin{comment}
671	% I know I actually haven't got rid of them yet, but I'm going to try
672	% to write it as if I had and see if that makes sense:
673	\section{Reraising}
674	\label{s:Reraising}
675	Within the handler block or functions called from the handler block, it is
676	possible to reraise the most recently caught exception with @throw@ or
677	@throwResume@, respectively.
678	\begin{cfa}
679	try {
680	...
681	} catch( ... ) {
682	... throw;
683	} catchResume( ... ) {
684	... throwResume;
685	}
686	\end{cfa}
687	The only difference between a raise and a reraise is that reraise does not
688	create a new exception; instead it continues using the current exception, \ie
689	no allocation and copy. However the default handler is still set to the one
690	visible at the raise point, and hence, for termination could refer to data that
691	is part of an unwound stack frame. To prevent this problem, a new default
692	handler is generated that does a program-level abort.
693	\end{comment}
694
695	\subsection{Comparison with Reraising}
696	In languages without conditional catch -- that is, no ability to match an
697	exception based on something other than its type -- it can be mimicked
698	by matching all exceptions of the right type, checking any additional
699	conditions inside the handler and re-raising the exception if it does not
700	match those.
701
702	Here is a minimal example comparing both patterns, using @throw;@
703	(no operand) to start a re-raise.
704	\begin{center}
705	\begin{tabular}{l r}
706	\begin{cfa}
707	try {
708		do_work_may_throw();
709	} catch(exception_t * exc ;
710			can_handle(exc)) {
711		handle(exc);
712	}
713
714
715
716	\end{cfa}
717	&
718	\begin{cfa}
719	try {
720		do_work_may_throw();
721	} catch(exception_t * exc) {
722		if (can_handle(exc)) {
723			handle(exc);
724		} else {
725			throw;
726		}
727	}
728	\end{cfa}
729	\end{tabular}
730	\end{center}
731	At first glance, conditional catch may appear to be just a quality-of-life
732	feature over catch-and-reraise, but there are some significant differences
733	between the two strategies.
734
735	A simple difference that is more important for \CFA than many other languages
736	is that the raise site changes with a re-raise, but does not with a
737	conditional catch.
738	This is important in \CFA because control returns to the raise site to run
739	the per-site default handler. Because of this, only a conditional catch can
740	allow the original raise to continue.
741
742	The more complex issue comes from the difference in how conditional
743	catches and re-raises handle multiple handlers attached to a single try
744	statement. A conditional catch will continue checking later handlers while
745	a re-raise will skip them.
746	If the different handlers could handle some of the same exceptions,
747	translating a try statement that uses one to use the other can quickly
748	become non-trivial:
749
750	\noindent
751	Original, with conditional catch:
752	\begin{cfa}
753	...
754	} catch (an_exception * e ; check_a(e)) {
755		handle_a(e);
756	} catch (exception_t * e ; check_b(e)) {
757		handle_b(e);
758	}
759	\end{cfa}
760	Translated, with re-raise:
761	\begin{cfa}
762	...
763	} catch (exception_t * e) {
764		an_exception * an_e = (virtual an_exception *)e;
765		if (an_e && check_a(an_e)) {
766			handle_a(an_e);
767		} else if (check_b(e)) {
768			handle_b(e);
769		} else {
770			throw;
771		}
772	}
773	\end{cfa}
774	(There is a simpler solution if @handle_a@ never raises exceptions,
775	using nested try statements.)
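
As a hedged sketch of that nested alternative, which is only equivalent under
the stated assumption that @handle_a@ never raises, the type-specific handler
moves to an inner try statement so that its re-raise is still seen by the
general handler on the outer one:
\begin{cfa}
try {
	try {
		...
	} catch (an_exception * e) {
		if (check_a(e)) {
			handle_a(e);
		} else {
			throw;		// re-raise; the outer handler is still checked
		}
	}
} catch (exception_t * e) {
	if (check_b(e)) {
		handle_b(e);
	} else {
		throw;
	}
}
\end{cfa}
If @handle_a@ did raise, the outer handler would be checked against that new
exception, which never happens in the conditional catch version.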
776
777	% } catch (an_exception * e ; check_a(e)) {
778	% handle_a(e);
779	% } catch (exception_t * e ; !(virtual an_exception *)e && check_b(e)) {
780	% handle_b(e);
781	% }
782	%
783	% } catch (an_exception * e)
784	% if (check_a(e)) {
785	% handle_a(e);
786	% } else throw;
787	% } catch (exception_t * e)
788	% if (check_b(e)) {
789	% handle_b(e);
790	% } else throw;
791	% }
792	In similar simple examples, translating from re-raise to conditional catch
793	takes less code but it does not have a general, trivial solution either.
794
795	So, given that the two patterns do not trivially translate into each other,
796	it becomes a matter of which one should be encouraged and made the default.
797	From the premise that if a handler could handle an exception then it
798	should, it follows that checking as many handlers as possible is preferred.
799	So, conditional catch and checking later handlers is a good default.
800
801	\section{Finally Clauses}
802	\label{s:FinallyClauses}
803	Finally clauses are used to perform unconditional cleanup when leaving a
804	scope and are placed at the end of a try statement after any handler clauses:
805	\begin{cfa}
806	try {
807		GUARDED_BLOCK
808	} ... // any number or kind of handler clauses
809	... finally {
810		FINALLY_BLOCK
811	}
812	\end{cfa}
813	The @FINALLY_BLOCK@ is executed when the try statement is removed from the
814	stack, including when the @GUARDED_BLOCK@ finishes, any termination handler
815	finishes or during an unwind.
816	The only time the block is not executed is if the program is exited before
817	the stack is unwound.
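
A minimal sketch of this behaviour follows. The helpers @open_log@,
@write_records@, @report@ and @close_log@ are hypothetical names used only
for illustration, and @IOFailure@ is reused from the earlier example:
\begin{cfa}
int fd = open_log();		// hypothetical: acquire before the try statement
try {
	write_records( fd );	// may raise IOFailure
} catch( IOFailure * e ) {
	report( e );
} finally {
	close_log( fd );	// runs after normal completion, after the handler,
				// or while an unhandled exception unwinds the stack
}
\end{cfa}
Whichever way control leaves the try statement, @close_log@ is run as the
statement is removed from the stack.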
818
819	Execution of the finally block should always finish, meaning control runs off
820	the end of the block. This requirement ensures control always continues as if
821	the finally clause is not present, \ie finally is for cleanup, not changing
822	control flow.
823	Because of this requirement, local control flow out of the finally block
824	is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or
825	@return@ that causes control to leave the finally block. Other ways to leave
826	the finally block, such as a @longjmp@ or termination, are much harder to check,
827	and at best require additional run-time overhead, and so are only
828	discouraged.
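
To illustrate the restriction, here is a sketch using the hypothetical
functions @more_work@, @step@, @cleanup@ and @flush_one@. The commented-out
@break@ would transfer control out of the finally block, so it is the kind of
statement that is rejected. The @break@ in the inner loop stays within the
finally block and, by the rule above, should be acceptable:
\begin{cfa}
while ( more_work() ) {
	try {
		step();
	} finally {
		cleanup();
		// break;	// rejected: would leave the finally block
		for ( ;; ) {
			if ( ! flush_one() ) break;	// stays inside the finally block
		}
	}
}
\end{cfa}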
829
830	Not all languages with unwinding have finally clauses. Notably, \Cpp does
831	without them, as destructors and the RAII design pattern serve a similar role.
832	Although destructors and finally clauses can be used for the same cases,
833	they have their own strengths, similar to top-level functions and lambda
834	functions with closures.
835	Destructors take more work to create, but if there is clean-up code
836	that needs to be run every time a type is used, they are much easier
837	to set up for each use. % It's automatic.
838	On the other hand, finally clauses capture the local context, so are easy to
839	use when the cleanup is not dependent on the type of a variable or requires
840	information from multiple variables.
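
The trade-off can be sketched as follows. The @Log@ type and the helpers
@open_log@, @close_log@, @work@ and @release_pair@ are hypothetical; the
constructor and destructor are written with \CFA's @?{}@ and @^?{}@ operator
names:
\begin{cfa}
struct Log { int fd; };
void ?{}( Log & this, const char * name ) { this.fd = open_log( name ); }
void ^?{}( Log & this ) { close_log( this.fd ); }	// written once, for the type

void with_destructor() {
	Log log = { "run.log" };	// cleanup is automatic wherever a Log is used
	work();
}

void with_finally( int fd_a, int fd_b ) {
	try {
		work();
	} finally {
		release_pair( fd_a, fd_b );	// uses two locals, no dedicated type needed
	}
}
\end{cfa}
The destructor costs more to write up front but then applies at every use of
@Log@, while the finally clause is a one-off that reads its local context
directly.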
841
842	\section{Cancellation}
843	\label{s:Cancellation}
844	Cancellation is a stack-level abort, which can be thought of as an
845	uncatchable termination. It unwinds the entire current stack, and if
846	possible, forwards the cancellation exception to a different stack.
847
848	Cancellation is not an exception operation like termination or resumption.
849	There is no special statement for starting a cancellation; instead the standard
850	library function @cancel_stack@ is called, passing an exception. Unlike a
851	raise, this exception is not used in matching, only to pass information about
852	the cause of the cancellation.
853	Finally, as no handler search is performed, there is no default handler.
854
855	After @cancel_stack@ is called, the exception is copied into the EHM's memory
856	and the current stack is unwound.
857	The behaviour after that depends on the kind of stack being cancelled.
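
As a rough sketch of how a cancellation is started, assume a hypothetical
exception type @fatal_error@ and a hypothetical predicate @state_is_corrupt@.
The construction of the exception value is elided because it depends on
declarations not shown here:
\begin{cfa}
void check_state() {
	if ( state_is_corrupt() ) {
		fatal_error err = ...;	// hypothetical exception describing the cause
		cancel_stack( err );	// unwinds the entire current stack;
					// no handler is matched against err
	}
}
\end{cfa}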
858
859	\paragraph{Main Stack}
860	The main stack is the one used by
861	the program's main function at the start of execution,
862	and is the only stack in a sequential program.
863	After the main stack is unwound, there is a program-level abort.
864
865	The first reason for this behaviour is for sequential programs where there
866	is only one stack, and hence no stack to pass information to.
867	Second, even in concurrent programs, the main stack has no dependency
868	on another stack and no reliable way to find another living stack.
869	Finally, keeping the same behaviour in both sequential and concurrent
870	programs is simple and easy to understand.
871
872	\paragraph{Thread Stack}
873	A thread stack is created for a \CFA @thread@ object or an object that satisfies
874	the @is_thread@ trait.
875	After a thread stack is unwound, the exception is stored until another
876	thread attempts to join with it. Then the exception @ThreadCancelled@,
877	which stores a reference to the thread and to the exception passed to the
878	cancellation, is reported from the join to the joining thread.
879	There is one difference between an explicit join (with the @join@ function)
880	and an implicit join (from a destructor call). The explicit join takes the
881	default handler (@defaultResumptionHandler@) from its calling context while
882	the implicit join provides its own, which does a program abort if the
883	@ThreadCancelled@ exception cannot be handled.
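
A sketch of the explicit case follows, assuming a hypothetical thread type
@Worker@ and a hypothetical function @log_cancelled@. @ThreadCancelled@ is
written without any further parameters, as in the text. It also assumes, from
the use of the resumption default handler, that the report can be caught with
a @catchResume@ clause around the @join@ call:
\begin{cfa}
Worker worker;			// hypothetical thread object
...
try {
	join( worker );		// explicit join: may report ThreadCancelled
} catchResume( ThreadCancelled * e ) {
	log_cancelled( e );	// hypothetical; inspect the thread and the cause
}
\end{cfa}
Without such a handler, the default handler visible at the call to @join@
decides what happens next.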
884
885	The communication and synchronization are done here because threads only have
886	two structural points (not dependent on user-code) where
887	communication/synchronization happens: start and join.
888	Since a thread must be running to perform a cancellation (and cannot be
889	cancelled from another stack), the cancellation must be after start and
890	before the join, so join is used.
891
892	% TODO: Find somewhere to discuss unwind collisions.
893	The difference between the explicit and implicit join is for safety and
894	debugging. It helps prevent unwinding collisions by avoiding throwing from
895	a destructor and prevents cascading the error across multiple threads if
896	the user is not equipped to deal with it.
897	It is always possible to add an explicit join if that is the desired behaviour.
898
899	With explicit join and a default handler that triggers a cancellation, it is
900	possible to cascade an error across any number of threads,
901	alternating between the resumption (possibly termination) and cancellation,
902	cleaning up each
903	in turn, until the error is handled or the main thread is reached.
904
905	\paragraph{Coroutine Stack}
906	A coroutine stack is created for a @coroutine@ object or an object that
907	satisfies the @is_coroutine@ trait.
908	After a coroutine stack is unwound, control returns to the @resume@ function
909	that most recently resumed it. @resume@ reports a
910	@CoroutineCancelled@ exception, which contains a reference to the cancelled
911	coroutine and the exception used to cancel it.
912	The @resume@ function also takes the \defaultResumptionHandler{} from the
913	caller's context and passes it to the internal report.
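
The coroutine case can be sketched in the same way, where @gen@ is a
hypothetical coroutine object and the handler body is left open. As with
threads, treating the report as catchable by @catchResume@ is an assumption
drawn from the use of the resumption default handler:
\begin{cfa}
try {
	resume( gen );		// may report CoroutineCancelled to this resumer
} catchResume( CoroutineCancelled * e ) {
	...			// inspect the cancelled coroutine and the cause
}
\end{cfa}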
914
915	A coroutine only knows of two other coroutines,
916	its starter and its last resumer.
917	The starter has a much more distant connection, while the last resumer just
918	(in terms of coroutine state) called resume on this coroutine, so the message
919	is passed to the latter.
920
921	With a default handler that triggers a cancellation, it is possible to
922	cascade an error across any number of coroutines,
923	alternating between the resumption (possibly termination) and cancellation,
924	cleaning up each in turn,
925	until the error is handled or a thread stack is reached.
