source: doc/theses/andrew_beach_MMath/features.tex @ f79ee0d

Last change on this file since f79ee0d was 93d0ed3, checked in by Peter A. Buhr <pabuhr@…>, 3 years ago
fix problem in virtual type examples and figure

Rev	Line
[4706098c]	1	\chapter{Exception Features}
[553f8abe]	2	\label{c:features}
[4706098c]	3
[4aba055]	4	This chapter covers the design and user interface of the \CFA EHM
	5	and begins with a general overview of EHMs. It is not a strict
	6	definition of all EHMs nor an exhaustive list of all possible features.
[21f2e92]	7	However, it does cover the most common structure and features found in them.
[f6106a6]	8
[4aba055]	9	\section{Overview of EHMs}
[4260566]	10	% We should cover what is an exception handling mechanism and what is an
	11	% exception before this. Probably in the introduction. Some of this could
	12	% move there.
[4aba055]	13	\subsection{Raise / Handle}
[4260566]	14	An exception operation has two main parts: raise and handle.
[6071efc]	15	These terms are sometimes known as throw and catch but this work uses
[4260566]	16	throw/catch as a particular kind of raise/handle.
[4aba055]	17	These are the two parts that the user writes and may
[aa173d8]	18	be the only two pieces of the EHM that have any syntax in a language.
[4260566]	19
[4aba055]	20	\paragraph{Raise}
[aa173d8]	21	The raise is the starting point for exception handling
	22	by raising an exception, which passes it to
[f6106a6]	23	the EHM.
[4260566]	24
[f6106a6]	25	Some well known examples include the @throw@ statements of \Cpp and Java and
[aa173d8]	26	the \code{Python}{raise} statement of Python. In real systems, a raise may
	27	perform some other work (such as memory management) but for the
[299b8b2]	28	purposes of this overview that can be ignored.
[4260566]	29
[4aba055]	30	\paragraph{Handle}
[aa173d8]	31	The primary purpose of an EHM is to run some user code to handle a raised
	32	exception. This code is given, with some other information, in a handler.
[f6106a6]	33
	34	A handler has three common features: the previously mentioned user code, a
[aa173d8]	35	region of code it guards, and an exception label/condition that matches
	36	the raised exception.
[4aba055]	37	Only raises inside the guarded region and raising exceptions that match the
[f6106a6]	38	label can be handled by a given handler.
[6071efc]	39	If multiple handlers can handle an exception,
[aa173d8]	40	EHMs define a rule to pick one, such as ``best match" or ``first found".
[4260566]	41
[f6106a6]	42	The @try@ statements of \Cpp, Java and Python are common examples. All three
[aa173d8]	43	show the common features of guarded region, raise, matching and handler.
	44	\begin{cfa}
	45	try { // guarded region
	46	...
	47	throw exception; // raise
	48	...
	49	} catch( exception ) { // matching condition, with exception label
	50	... // handler code
	51	}
	52	\end{cfa}
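The same pieces can be seen in a compilable \Cpp sketch, given here only as an illustration and not as part of the \CFA design.
The function names @callee@ and @run_guarded@ are hypothetical; the exception types come from the \Cpp standard library.
The raise happens in a callee, so the search leaves that function and finds the guarded region in the caller, where the ``first found" rule selects the first clause that matches.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical callee: it raises but defines no handler of its own.
void callee() {
    throw std::out_of_range("index 7");    // raise
}

// Returns a label naming the handler that ran.
std::string run_guarded() {
    try {                                   // guarded region
        callee();                           // the call lies inside the region
        return "no exception";
    } catch (const std::logic_error &) {    // matches: out_of_range is a logic_error
        return "logic_error handler";
    } catch (const std::exception &) {      // also matches, but is checked second
        return "exception handler";
    }
}
```

Both clauses could handle the exception, and the first one in program order is chosen.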
[f6106a6]	53
[4aba055]	54	\subsection{Propagation}
[de47a9d]	55	After an exception is raised comes what is usually the biggest step for the
[aa173d8]	56	EHM: finding and setting up the handler for execution. The propagation from raise to
[f6106a6]	57	handler can be broken up into three different tasks: searching for a handler,
[21f2e92]	58	matching against the handler and installing the handler.
[de47a9d]	59
[4aba055]	60	\paragraph{Searching}
[f6106a6]	61	The EHM begins by searching for handlers that might be used to handle
[aa173d8]	62	the exception. The search is restricted to
	63	handlers that have the raise site in their guarded
[f6106a6]	64	region.
[4aba055]	65	The search includes handlers in the current function, as well as any in
	66	callers on the stack that have the function call in their guarded region.
[f6106a6]	67
[4aba055]	68	\paragraph{Matching}
[aa173d8]	69	Each handler found is matched with the raised exception. The exception
	70	label defines a condition that is used with the exception and decides if
[f6106a6]	71	there is a match or not.
[4aba055]	72	In languages where the first match is used, this step is intertwined with
[aa173d8]	73	searching; a match check is performed immediately after the search finds
	74	a handler.
[4260566]	75
[4aba055]	76	\paragraph{Installing}
[aa173d8]	77	After a handler is chosen, it must be made ready to run.
[f6106a6]	78	The implementation can vary widely to fit with the rest of the
[de47a9d]	79	design of the EHM. The installation step might be trivial or it could be
[4260566]	80	the most expensive step in handling an exception. The latter tends to be the
	81	case when stack unwinding is involved.
[de47a9d]	82
[6071efc]	83	If a matching handler is not guaranteed to be found, the EHM needs a
[aa173d8]	84	different course of action for this case.
[4aba055]	85	This situation only occurs with unchecked exceptions as checked exceptions
[aa173d8]	86	(such as in Java) are guaranteed to find a matching handler.
	87	The unhandled action is usually very general, such as aborting the program.
[4260566]	88
[4aba055]	89	\paragraph{Hierarchy}
[f6106a6]	90	A common way to organize exceptions is in a hierarchical structure.
[4aba055]	91	This pattern comes from object-orientated languages where the
[4260566]	92	exception hierarchy is a natural extension of the object hierarchy.
	93
[aa173d8]	94	Consider the following exception hierarchy:
[4706098c]	95	\begin{center}
[6a8208cb]	96	\input{exception-hierarchy}
[4706098c]	97	\end{center}
[4aba055]	98	A handler labeled with any given exception can handle exceptions of that
[4260566]	99	type or any child type of that exception. The root of the exception hierarchy
[aa173d8]	100	(here \code{C}{exception}) acts as a catch-all, leaf types catch single types,
[4260566]	101	and the exceptions in the middle can be used to catch different groups of
	102	related exceptions.
	103
	104	This system has some notable advantages, such as multiple levels of grouping,
[aa173d8]	105	the ability for libraries to add new exception types, and the isolation
[f6106a6]	106	between different sub-hierarchies.
	107	This design is used in \CFA even though it is not an object-orientated
[a6c45c6]	108	language, so different tools are used to create the hierarchy.
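As a point of comparison, the following \Cpp sketch builds a small hierarchy with classes and shows the three kinds of handler described above.
All type and function names (@Exception@, @IOError@, @FileNotFound@, @ArithmeticError@, @classify@) are hypothetical and are not taken from the figure or from \CFA.

```cpp
#include <string>

struct Exception { virtual ~Exception() = default; };   // root: catch-all
struct IOError : Exception {};                           // middle: a group
struct FileNotFound : IOError {};                        // leaf: a single type
struct ArithmeticError : Exception {};                   // separate sub-hierarchy

enum class Kind { Leaf, Middle, Other };

// Raises one of the exceptions and reports which handler caught it.
std::string classify(Kind which) {
    try {
        switch (which) {
            case Kind::Leaf:   throw FileNotFound{};
            case Kind::Middle: throw IOError{};
            case Kind::Other:  throw ArithmeticError{};
        }
        return "no raise";
    } catch (const FileNotFound &) {   // leaf handler: exactly one type
        return "leaf handler";
    } catch (const IOError &) {        // group handler: IOError and its children
        return "group handler";
    } catch (const Exception &) {      // root handler: everything else
        return "root handler";
    }
}
```

Here @ArithmeticError@ is isolated from the I/O sub-hierarchy and is only caught by the root handler.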
[4260566]	109
	110	% Could I cite the rational for the Python IO exception rework?
	111
[4aba055]	112	\subsection{Completion}
[6071efc]	113	After the handler has finished, the entire exception operation has to complete
[f6106a6]	114	and continue executing somewhere else. This step is usually simple,
	115	both logically and in its implementation, as the installation of the handler
	116	is usually set up to do most of the work.
[de47a9d]	117
[aa173d8]	118	The EHM can return control to many different places, where
[4aba055]	119	the most common are after the handler definition (termination)
	120	and after the raise (resumption).
[4260566]	121
[4aba055]	122	\subsection{Communication}
[887fc79]	123	For effective exception handling, additional information is often passed
[4aba055]	124	from the raise to the handler and back again.
[aa173d8]	125	So far, only communication of the exception's identity is covered.
	126	A common communication method for passing more information is putting fields into the exception instance
[4aba055]	127	and giving the handler access to them.
[aa173d8]	128	Using reference fields pointing to data at the raise location allows data to be
[4aba055]	129	passed in both directions.
[4260566]	130
	131	\section{Virtuals}
[bfd7b30]	132	\label{s:Virtuals}
[f6106a6]	133	Virtual types and casts are not part of \CFA's EHM nor are they required for
[aa173d8]	134	an EHM.
	135	However, one of the best ways to support an exception hierarchy
[4aba055]	136	is via a virtual hierarchy and dispatch system.
[aa173d8]	137	Ideally, the virtual system should have been part of \CFA before the work
[a6c45c6]	138	on exception handling began, but unfortunately it was not.
[4aba055]	139	Hence, only the features and framework needed for the EHM were
[aa173d8]	140	designed and implemented for this thesis. Other features were considered to ensure that
[4aba055]	141	the structure could accommodate other desirable features in the future
[aa173d8]	142	but are not implemented.
	143	The rest of this section only discusses the implemented subset of the
	144	virtual-system design.
[4260566]	145
	146	The virtual system supports multiple ``trees" of types. Each tree is
	147	a simple hierarchy with a single root type. Each type in a tree has exactly
[f6106a6]	148	one parent -- except for the root type which has zero parents -- and any
[4260566]	149	number of children.
	150	Any type that belongs to any of these trees is called a virtual type.
[bfd7b30]	151	For example, the following hypothetical syntax creates two virtual-type trees.
	152	\begin{flushleft}
	153	\lstDeleteShortInline@
	154	\begin{tabular}{@{\hspace{20pt}}l@{\hspace{20pt}}l}
	155	\begin{cfa}
	156	vtype V0, V1(V0), V2(V0);
	157	vtype W0, W1(W0), W2(W1);
	158	\end{cfa}
	159	&
	160	\raisebox{-0.6\totalheight}{\input{vtable}}
	161	\end{tabular}
	162	\lstMakeShortInline@
	163	\end{flushleft}
[4260566]	164	% A type's ancestors are its parent and its parent's ancestors.
	165	% The root type has no ancestors.
[4aba055]	166	% A type's descendants are its children and its children's descendants.
[93d0ed3]	167	Every virtual type (tree node) has a pointer to a virtual table with a unique
	168	@Id@ and a list of virtual members (see \autoref{s:VirtualSystem} for
	169	details). Children inherit their parent's list of virtual members but may add
	170	and/or replace members. For example,
[bfd7b30]	171	\begin{cfa}
	172	vtable W0 | { int ?<?( int, int ); int ?+?( int, int ); }
[93d0ed3]	173	vtable W1 | { int ?+?( int, int ); int w; int ?-?( int, int ); }
[bfd7b30]	174	\end{cfa}
	175	creates a virtual table for @W0@ initialized with the matching @<@ and @+@
[93d0ed3]	176	operations visible at this declaration context. Similarly, @W1@ is initialized
	177	with @<@ from inheritance with @W0@, @+@ is replaced, and @-@ is added, where
	178	both operations are matched at this declaration context. It is important to
	179	note that these are virtual members, not virtual methods of object-orientated
	180	programming, and can be of any type. Finally, trait names can be used to
	181	specify the list of virtual members.
[4aba055]	182
[aa173d8]	183	\PAB{Need to look at these when done.
	184
[c21f5a9]	185	\CFA still supports virtual methods as a special case of virtual members.
[4aba055]	186	Function pointers that take a pointer to the virtual type are modified
[c21f5a9]	187	with each level of inheritance so that it refers to the new type.
	188	This means an object can always be passed to a function in its virtual table
[21f2e92]	189	as if it were a method.
[4aba055]	190	\todo{Clarify (with an example) virtual methods.}
[aa173d8]	191	}%
[4260566]	192
[f6106a6]	193	Up until this point the virtual system is similar to ones found in
[aa173d8]	194	object-orientated languages but this is where \CFA diverges. Objects encapsulate a
	195	single set of methods in each type, universally across the entire program,
	196	and indeed all programs that use that type definition. Even if a type inherits and adds methods, it still encapsulates a
	197	single set of methods. In this sense,
	198	object-oriented types are ``closed" and cannot be altered.
	199
	200	In \CFA, types do not encapsulate any code. Traits are local for each function and
	201	types can satisfy a local trait, stop satisfying it, or satisfy the same
	202	trait in a different way at any lexical location in the program where a function is called.
	203	In this sense, the set of functions/variables that satisfy a trait for a type is ``open" as the set can change at every call site.
[4aba055]	204	This capability means it is impossible to pick a single set of functions
[aa173d8]	205	that represent a type's implementation across a program.
[f6106a6]	206
	207	\CFA side-steps this issue by not having a single virtual table for each
[4aba055]	208	type. A user can define virtual tables that are filled in at their
	209	declaration and given a name. Anywhere that name is visible, even if it is
[aa173d8]	210	defined locally inside a function \PAB{What does this mean? (although that means it does not have a
	211	static lifetime)}, it can be used.
[4aba055]	212	Specifically, a virtual type is ``bound" to a virtual table that
[08e75215]	213	sets the virtual members for that object. The virtual members can be accessed
	214	through the object.
[4706098c]	215
	216	While much of the virtual infrastructure is created, it is currently only used
	217	internally for exception handling. The only user-level feature is the virtual
[21f2e92]	218	cast, which is the same as the \Cpp \code{C++}{dynamic_cast}.
[7eb6eb5]	219	\label{p:VirtualCast}
[4706098c]	220	\begin{cfa}
[4a36b344]	221	(virtual TYPE)EXPRESSION
[4706098c]	222	\end{cfa}
[29c9b23]	223	Note, the syntax and semantics match a C-cast, rather than the function-like
	224	\Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be
	225	pointers to virtual types.
[de47a9d]	226	The cast dynamically checks if the @EXPRESSION@ type is the same or a sub-type
[29c9b23]	227	of @TYPE@, and if true, returns a pointer to the
[4706098c]	228	@EXPRESSION@ object, otherwise it returns @0p@ (null pointer).
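For readers more familiar with \Cpp, the equivalent check with \code{C++}{dynamic_cast} is sketched below.
The classes reuse the names @W0@, @W1@ and @W2@ from the hypothetical tree above; the helper @is_w1@ is an assumption made for this illustration.

```cpp
struct W0 { virtual ~W0() = default; };   // root of the tree
struct W1 : W0 {};                         // child of W0
struct W2 : W1 {};                         // child of W1

// True when the object behind p is a W1 or a descendant of W1.
// dynamic_cast yields a null pointer when the check fails.
bool is_w1(W0 *p) {
    return dynamic_cast<W1 *>(p) != nullptr;
}
```

A @W0@ object fails the check, while @W1@ and @W2@ objects pass it, which mirrors the same-or-sub-type rule of the virtual cast.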
	229
	230	\section{Exception}
[4a36b344]	231	% Leaving until later, hopefully it can talk about actual syntax instead
	232	% of my many strange macros. Syntax aside I will also have to talk about the
	233	% features all exceptions support.
	234
[4706098c]	235	Exceptions are defined by the trait system; there are a series of traits, and
[1c1c180]	236	if a type satisfies them, then it can be used as an exception. The following
[4706098c]	237	is the base trait all exceptions need to match.
	238	\begin{cfa}
	239	trait is_exception(exceptT &, virtualT &) {
[a6c45c6]	240	// Numerous imaginary assertions.
[02b73ea]	241	};
[4706098c]	242	\end{cfa}
[29c9b23]	243	The trait is defined over two types, the exception type and the virtual table
[4aba055]	244	type. Each exception type should have a single virtual table type.
	245	There are no actual assertions in this trait because the trait system
	246	cannot express them yet (adding such assertions would be part of
[a6c45c6]	247	completing the virtual system). The imaginary assertions would probably come
	248	from a trait defined by the virtual system, and state that the exception type
[aa173d8]	249	is a virtual type, is a descendant of @exception_t@ (the base exception type),
[a6c45c6]	250	and note its virtual table type.
[29c9b23]	251
	252	% I did have a note about how it is the programmer's responsibility to make
	253	% sure the function is implemented correctly. But this is true of every
[de47a9d]	254	% similar system I know of (except Agda's I guess) so I took it out.
	255
[f6106a6]	256	There are two more traits for exceptions defined as follows:
[4706098c]	257	\begin{cfa}
[02b73ea]	258	trait is_termination_exception(
[4706098c]	259	exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
[29c9b23]	260	void defaultTerminationHandler(exceptT &);
[02b73ea]	261	};
	262
	263	trait is_resumption_exception(
[4706098c]	264	exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
[29c9b23]	265	void defaultResumptionHandler(exceptT &);
[02b73ea]	266	};
[4706098c]	267	\end{cfa}
[4aba055]	268	Both traits ensure that a pair of types is an exception type and its virtual table
[aa173d8]	269	type,
[f6106a6]	270	and each defines one of the two default handlers. The default handlers are used
[df24d37]	271	as fallbacks and are discussed in detail in \vref{s:ExceptionHandling}.
[de47a9d]	272
[f6106a6]	273	However, all three of these traits can be tricky to use directly.
	274	While there is a bit of repetition required,
[de47a9d]	275	the largest issue is that the virtual table type's name is mangled and is not
[f6106a6]	276	user-facing. So these three macros are provided to wrap these traits to
	277	simplify referring to the names:
[aa173d8]	278	@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@, and @IS_RESUMPTION_EXCEPTION@.
[1830a86]	279
[f6106a6]	280	All three take one or two arguments. The first argument is the name of the
	281	exception type. The macro passes its unmangled and mangled form to the trait.
[1830a86]	282	The second (optional) argument is a parenthesized list of polymorphic
[f6106a6]	283	arguments. This argument is only used with polymorphic exceptions and the
	284	list is passed to both types.
	285	In the current set-up, the two types always have the same polymorphic
	286	arguments so these macros can be used without losing flexibility.
[29c9b23]	287
	288	For example, consider a function that is polymorphic over types that have a
	289	defined arithmetic exception:
	290	\begin{cfa}
[de47a9d]	291	forall(Num | IS_EXCEPTION(Arithmetic, (Num)))
[29c9b23]	292	void some_math_function(Num & left, Num & right);
	293	\end{cfa}
[4706098c]	294
[1830a86]	295	\section{Exception Handling}
[f6106a6]	296	\label{s:ExceptionHandling}
[4aba055]	297	As stated,
[21f2e92]	298	\CFA provides two kinds of exception handling: termination and resumption.
[f6106a6]	299	These twin operations are the core of \CFA's exception handling mechanism.
[aa173d8]	300	This section covers the general patterns shared by the two operations and
	301	then goes on to cover the details of each individual operation.
[de47a9d]	302
[f6106a6]	303	Both operations follow the same set of steps.
[aa173d8]	304	First, a user raises an exception.
	305	Second, the exception propagates up the stack.
	306	Third, if a handler is found, the exception is caught and the handler is run.
[4aba055]	307	After that, control continues at a raise-dependent location.
[aa173d8]	308	Fourth, if a handler is not found, a default handler is run and, if it returns, then control
[4aba055]	309	continues after the raise.
[f6106a6]	310
[aa173d8]	311	%This general description covers what the two kinds have in common.
	312	The differences in the two operations include how propagation is performed, where execution continues
	313	after an exception is caught and handled, and which default handler is run.
[1830a86]	314
[4706098c]	315	\subsection{Termination}
	316	\label{s:Termination}
[aa173d8]	317	Termination handling is the familiar EHM and is used in most programming
[1830a86]	318	languages with exception handling.
[4aba055]	319	It is a dynamic, non-local goto. If the raised exception is matched and
	320	handled, the stack is unwound and control (usually) continues in the function
[f6106a6]	321	on the call stack that defined the handler.
	322	Termination is commonly used when an error has occurred and recovery is
	323	impossible locally.
[1830a86]	324
	325	% (usually) Control can continue in the current function but then a different
	326	% control flow construct should be used.
[4706098c]	327
[f6106a6]	328	A termination raise is started with the @throw@ statement:
[4706098c]	329	\begin{cfa}
[4a36b344]	330	throw EXPRESSION;
[4706098c]	331	\end{cfa}
[29c9b23]	332	The expression must return a reference to a termination exception, where the
[f6106a6]	333	termination exception is any type that satisfies the trait
	334	@is_termination_exception@ at the call site.
[4aba055]	335	Through \CFA's trait system, the trait functions are implicitly passed into the
[aa173d8]	336	throw code for use by the EHM.
[f6106a6]	337	A new @defaultTerminationHandler@ can be defined in any scope to
[aa173d8]	338	change the throw's behaviour when a handler is not found (see below).
[de47a9d]	339
[4aba055]	340	The throw copies the provided exception into managed memory to ensure
[21f2e92]	341	the exception is not destroyed if the stack is unwound.
[f6106a6]	342	It is the user's responsibility to ensure the original exception is cleaned
[4aba055]	343	up whether the stack is unwound or not. Allocating it on the stack is
[f6106a6]	344	usually sufficient.
[de47a9d]	345
[4aba055]	346	% How to say propagation starts, its first sub-step is the search.
	347	Then propagation starts with the search. \CFA uses a ``first match" rule so
[aa173d8]	348	matching is performed with the copied exception as the search key.
	349	It starts from the raise in the throwing function and proceeds towards the base of the stack,
[1830a86]	350	from callee to caller.
[aa173d8]	351	At each stack frame, a check is made for termination handlers defined by the
[1830a86]	352	@catch@ clauses of a @try@ statement.
[4706098c]	353	\begin{cfa}
[4a36b344]	354	try {
[4706098c]	355	GUARDED_BLOCK
[f6106a6]	356	} catch (EXCEPTION_TYPE$\(_1\)$ * [NAME$\(_1\)$]) {
[4706098c]	357	HANDLER_BLOCK$\(_1\)$
[f6106a6]	358	} catch (EXCEPTION_TYPE$\(_2\)$ * [NAME$\(_2\)$]) {
[4706098c]	359	HANDLER_BLOCK$\(_2\)$
[4a36b344]	360	}
[4706098c]	361	\end{cfa}
[4aba055]	362	When viewed on its own, a try statement simply executes the statements
[aa173d8]	363	in the \snake{GUARDED_BLOCK}, and when those are finished,
[4aba055]	364	the try statement finishes.
[de47a9d]	365
	366	However, while the guarded statements are being executed, including any
[4aba055]	367	invoked functions, all the handlers in these statements are included in the
	368	search path.
[aa173d8]	369	Hence, if a termination exception is raised, these handlers may be matched
[4aba055]	370	against the exception and may handle it.
[f6106a6]	371
	372	Exception matching checks the handler in each catch clause in the order
[4aba055]	373	they appear, top to bottom. If the representation of the raised exception type
[aa173d8]	374	is the same or a descendant of @EXCEPTION_TYPE@$_i$, then @NAME@$_i$
[21f2e92]	375	(if provided) is
	376	bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$
	377	are executed. If control reaches the end of the handler, the exception is
[de47a9d]	378	freed and control continues after the try statement.
[4706098c]	379
[aa173d8]	380	If no termination handler is found during the search, then the default handler
	381	(\defaultTerminationHandler) visible at the raise statement is called.
	382	Through \CFA's trait system the best match at the raise statement is used.
[4aba055]	383	This function is run and is passed the copied exception.
[aa173d8]	384	If the default handler finishes, control continues after the raise statement.
[1830a86]	385
[f6106a6]	386	There is a global @defaultTerminationHandler@ that is polymorphic over all
[4aba055]	387	termination exception types.
[f6106a6]	388	The global default termination handler performs a cancellation
[aa173d8]	389	(see \vref{s:Cancellation} for the justification) on the current stack with the copied exception.
	390	Since it is so general, a more specific handler is usually
	391	defined, possibly with a detailed message, and used for a specific exception type, effectively overriding the default handler.
[4706098c]	392
	393	\subsection{Resumption}
	394	\label{s:Resumption}
	395
[aa173d8]	396	Resumption exception handling is the less familiar EHM, but is
[f6106a6]	397	just as old~\cite{Goodenough75} and is simpler in many ways.
	398	It is a dynamic, non-local function call. If the raised exception is
[aa173d8]	399	matched, a closure is taken from up the stack and executed,
[4aba055]	400	after which the raising function continues executing.
	401	The common uses for resumption exceptions include
	402	potentially repairable errors, where execution can continue in the same
	403	function once the error is corrected, and
	404	ignorable events, such as logging where nothing needs to happen and control
[aa173d8]	405	should always continue from the raise point.
[8483c39a]	406
[4706098c]	407	A resumption raise is started with the @throwResume@ statement:
	408	\begin{cfa}
[4a36b344]	409	throwResume EXPRESSION;
[4706098c]	410	\end{cfa}
[4aba055]	411	\todo{Decide on a final set of keywords and use them everywhere.}
[f6106a6]	412	It works much the same way as the termination throw.
	413	The expression must return a reference to a resumption exception,
	414	where the resumption exception is any type that satisfies the trait
	415	@is_resumption_exception@ at the call site.
	416	The assertions from this trait are available to
[21f2e92]	417	the exception system while handling the exception.
[29c9b23]	418
[aa173d8]	419	At run-time, no exception copy is made, since
	420	resumption does not unwind the stack nor otherwise remove values from the
	421	current scope, so there is no need to manage memory to keep the exception in scope.
[4706098c]	422
[aa173d8]	423	Then propagation starts with the search. It starts from the raise in the
[4aba055]	424	resuming function and proceeds towards the base of the stack,
	425	from callee to caller.
[1830a86]	426	At each stack frame, a check is made for resumption handlers defined by the
	427	@catchResume@ clauses of a @try@ statement.
[4706098c]	428	\begin{cfa}
[4a36b344]	429	try {
[4706098c]	430	GUARDED_BLOCK
[f6106a6]	431	} catchResume (EXCEPTION_TYPE$\(_1\)$ * [NAME$\(_1\)$]) {
[4706098c]	432	HANDLER_BLOCK$\(_1\)$
[f6106a6]	433	} catchResume (EXCEPTION_TYPE$\(_2\)$ * [NAME$\(_2\)$]) {
[4706098c]	434	HANDLER_BLOCK$\(_2\)$
[4a36b344]	435	}
[4706098c]	436	\end{cfa}
[aa173d8]	437	% PAB, you say this above.
	438	% When a try statement is executed, it simply executes the statements in the
	439	% @GUARDED_BLOCK@ and then finishes.
	440	%
	441	% However, while the guarded statements are being executed, including any
	442	% invoked functions, all the handlers in these statements are included in the
	443	% search path.
	444	% Hence, if a resumption exception is raised, these handlers may be matched
	445	% against the exception and may handle it.
	446	%
	447	% Exception matching checks the handler in each catch clause in the order
	448	% they appear, top to bottom. If the representation of the raised exception type
	449	% is the same or a descendant of @EXCEPTION_TYPE@$_i$, then @NAME@$_i$
	450	% (if provided) is bound to a pointer to the exception and the statements in
	451	% @HANDLER_BLOCK@$_i$ are executed.
	452	% If control reaches the end of the handler, execution continues after the
	453	% the raise statement that raised the handled exception.
	454	%
	455	% Like termination, if no resumption handler is found during the search,
	456	% then the default handler (\defaultResumptionHandler) visible at the raise
	457	% statement is called. It will use the best match at the raise sight according
	458	% to \CFA's overloading rules. The default handler is
	459	% passed the exception given to the raise. When the default handler finishes
	460	% execution continues after the raise statement.
	461	%
	462	% There is a global @defaultResumptionHandler{} is polymorphic over all
	463	% resumption exceptions and performs a termination throw on the exception.
	464	% The \defaultTerminationHandler{} can be overridden by providing a new
	465	% function that is a better match.
	466
	467	The @GUARDED_BLOCK@ and its associated nested guarded statements work the same
	468	for resumption as for termination, as does exception matching at each
	469	@catchResume@. Similarly, if no resumption handler is found during the search,
	470	then the currently visible default handler (\defaultResumptionHandler) is
	471	called and control continues after the raise statement if it returns. Finally,
	472	there is also a global @defaultResumptionHandler@, which can be overridden,
	473	that is polymorphic over all resumption exceptions but performs a termination
	474	throw on the exception rather than a cancellation.
	475
	476	Throwing the exception in @defaultResumptionHandler@ has the positive effect of
	477	walking the stack a second time for a recovery handler. Hence, a programmer has
	478	two chances for help with a problem, fixup or recovery, should either kind of
	479	handler appear on the stack. However, this dual stack walk leads to the following
	480	apparent anomaly:
	481	\begin{cfa}
	482	try {
	483	throwResume E;
	484	} catch (E) {
	485	// this handler runs
	486	}
	487	\end{cfa}
	488	because the @catch@ appears to handle a @throwResume@, but a @throwResume@ only
	489	matches with @catchResume@. The anomaly results because the @throwResume@,
	490	finding no matching @catchResume@, calls @defaultResumptionHandler@, which in turn throws @E@.
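The dual stack walk can be imitated in \Cpp, which has no resumption, by keeping an explicit list of handler functions.
The sketch below is only an emulation under that assumption and does not reflect how \CFA implements resumption; the names @resume_handlers@, @throw_resume@, @with_resume_handler@ and @without_resume_handler@ are hypothetical.
A resumption raise calls handlers without unwinding and, if none accepts the exception, falls back to a termination throw, as the global default resumption handler is described to do.

```cpp
#include <functional>
#include <vector>

struct E { int value; };   // hypothetical exception type

// Installed resumption handlers; each returns true if it handled the exception.
static std::vector<std::function<bool(E &)>> resume_handlers;

// Emulates throwResume: search from the newest handler to the oldest
// without unwinding the stack.
void throw_resume(E &e) {
    for (auto it = resume_handlers.rbegin(); it != resume_handlers.rend(); ++it)
        if ((*it)(e)) return;   // handled: control continues after the raise
    throw e;                    // no handler: default action is a termination throw
}

// A fixup handler repairs the exception data and execution resumes.
int with_resume_handler() {
    resume_handlers.push_back([](E &e) { e.value = 42; return true; });
    E e{0};
    throw_resume(e);            // handler runs, then execution continues here
    resume_handlers.pop_back();
    return e.value;
}

// No resumption handler is installed, so the termination handler runs.
int without_resume_handler() {
    try {
        E e{7};
        throw_resume(e);
        return -1;              // not reached
    } catch (const E &caught) { // the apparent anomaly: catch handles a resume raise
        return caught.value;
    }
}
```

The first function also shows communication back to the raise site through the exception's fields, since the handler runs while the raising frame is still live.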
	491
[f6106a6]	492	% I wonder if there would be some good central place for this.
[aa173d8]	493	Note, termination and resumption handlers may be used together
[f6106a6]	494	in a single try statement, intermixing @catch@ and @catchResume@ freely.
[4aba055]	495	Each type of handler only interacts with exceptions from the matching
	496	kind of raise.
[1830a86]	497
[f6106a6]	498	\subsubsection{Resumption Marking}
[df24d37]	499	\label{s:ResumptionMarking}
[1830a86]	500	A key difference between resumption and termination is that resumption does
[aa173d8]	501	not unwind the stack. A side effect is that, when a handler is matched
	502	and run, its try block (the guarded statements) and every try statement
[4aba055]	503	searched before it are still on the stack. Their presence can lead to
[aa173d8]	504	the \emph{recursive resumption problem}.
[1830a86]	505
	506	The recursive resumption problem is any situation where a resumption handler
	507	ends up being called while it is running.
	508	Consider a trivial case:
	509	\begin{cfa}
	510	try {
	511	throwResume (E &){};
	512	} catchResume(E *) {
	513	throwResume (E &){};
	514	}
	515	\end{cfa}
[4aba055]	516	When this code is executed, the guarded @throwResume@ starts a
	517	search and matches the handler in the @catchResume@ clause. This
[aa173d8]	518	call is placed on the stack above the try-block. Now the second raise in the handler
	519	searches the same try block, matches, and puts another instance of the
[4aba055]	520	same handler on the stack leading to infinite recursion.
[1830a86]	521
[aa173d8]	522	While this situation is trivial and easy to avoid, much more complex cycles can
	523	form with multiple handlers and different exception types. The key point is
	524	that the programmer's intuition expects every raise in a handler to start
	525	searching \emph{below} the @try@ statement, making it difficult to understand
	526	and fix the problem.
[1830a86]	527
[aa173d8]	528	To prevent all of these cases, each try statement is ``marked" from the
	529	time the exception search reaches it to either when a matching handler
	530	completes or when the search reaches the base
[4aba055]	531	of the stack.
	532	While a try statement is marked, its handlers are never matched, effectively
[21f2e92]	533	skipping over it to the next try statement.
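The marking rule can be sketched on top of an explicit handler list in \Cpp.
This is a minimal emulation, assuming each try statement is represented by one entry in a vector; the names @handlers@, @search_top@, @throw_resume@ and @recursive_case@ are hypothetical and the code is not the \CFA implementation.
Entries at or above @search_top@ are treated as marked and are skipped by any raise made while a handler is running.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct E {};   // hypothetical exception type

using Handler = std::function<bool(E &)>;

static std::vector<Handler> handlers;   // index 0 is nearest the base of the stack
static std::size_t search_top = 0;      // handlers[0 .. search_top) are unmarked

// Emulates a resumption raise that honours marking.
bool throw_resume(E &e) {
    const std::size_t saved = search_top;
    for (std::size_t i = saved; i-- > 0; ) {
        search_top = i;                  // mark entry i and everything above it
        if (handlers[i](e)) {            // match and run the handler
            search_top = saved;          // handler completed: unmark
            return true;
        }
    }
    search_top = saved;                  // search reached the base: unmark
    return false;                        // a default handler would run here
}

// The inner handler raises again; marking sends that raise to the outer handler.
std::pair<int, int> recursive_case() {
    int inner_runs = 0, outer_runs = 0;
    handlers.clear();
    handlers.push_back([&](E &) { ++outer_runs; return true; });   // outer try
    handlers.push_back([&](E &e) {                                  // inner try
        ++inner_runs;
        throw_resume(e);                 // searches only below the inner try
        return true;
    });
    search_top = handlers.size();
    E e;
    throw_resume(e);
    handlers.clear();
    search_top = 0;
    return {inner_runs, outer_runs};
}
```

Without the @search_top@ bookkeeping, the inner handler would match its own raise and recurse without bound; with it, each handler runs once.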
[4a36b344]	534
[6a8208cb]	535	\begin{center}
	536	\input{stack-marking}
	537	\end{center}
[de47a9d]	538
[4aba055]	539	There are other sets of marking rules that could be used;
	540	for instance, marking just the handlers that caught the exception
	541	would also prevent recursive resumption.
[aa173d8]	542	However, the rule selected mirrors what happens with termination,
	543	and hence, matches programmer intuition that a raise searches below a try.
[4706098c]	544
[aa173d8]	545	In detail, the marked try statements are the ones that would be removed from
	546	the stack for a termination exception, \ie those on the stack
[4aba055]	547	between the handler and the raise statement.
	548	This symmetry applies to the default handler as well, as both kinds of
	549	default handlers are run at the raise statement, rather than (physically
	550	or logically) at the bottom of the stack.
	551	% In early development having the default handler happen after
	552	% unmarking was just more useful. We assume that will continue.
[4706098c]	553
	554	\section{Conditional Catch}
[de47a9d]	555	Both termination and resumption handler clauses can be given an additional
	556	condition to further control which exceptions they handle:
[4706098c]	557	\begin{cfa}
[f6106a6]	558	catch (EXCEPTION_TYPE * [NAME] ; CONDITION)
[4706098c]	559	\end{cfa}
	560	First, the same semantics is used to match the exception type. Second, if the
	561	exception matches, @CONDITION@ is executed. The condition expression may
[de47a9d]	562	reference all names in scope at the beginning of the try block and @NAME@
[1c1c180]	563	introduced in the handler clause. If the condition is true, then the handler
[1830a86]	564	matches. Otherwise, the exception search continues as if the exception type
	565	did not match.
[f6106a6]	566
[4aba055]	567	The condition matching allows finer matching by checking
[f6106a6]	568	more kinds of information than just the exception type.
[4706098c]	569	\begin{cfa}
	570	try {
[f6106a6]	571	handle1 = open( f1, ... );
	572	handle2 = open( f2, ... );
	573	handle3 = open( f3, ... );
[4706098c]	574	...
[de47a9d]	575	} catch( IOFailure * f ; fd( f ) == f1 ) {
[f6106a6]	576	// Only handle IO failure for f1.
	577	} catch( IOFailure * f ; fd( f ) == f3 ) {
	578	// Only handle IO failure for f3.
[4706098c]	579	}
[aa173d8]	580	// Handle a failure relating to f2 further down the stack.
[4706098c]	581	\end{cfa}
[4aba055]	582	In this example the file that experienced the IO error is used to decide
[f6106a6]	583	which handler should be run, if any at all.
	584
	585	\begin{comment}
	586	% I know I actually haven't got rid of them yet, but I'm going to try
	587	% to write it as if I had and see if that makes sense:
	588	\section{Reraising}
	589	\label{s:Reraising}
[4706098c]	590	Within the handler block or functions called from the handler block, it is
	591	possible to reraise the most recently caught exception with @throw@ or
[1830a86]	592	@throwResume@, respectively.
[4706098c]	593	\begin{cfa}
[29c9b23]	594	try {
	595	...
	596	} catch( ... ) {
[1830a86]	597	... throw;
[4706098c]	598	} catchResume( ... ) {
[1830a86]	599	... throwResume;
[4706098c]	600	}
	601	\end{cfa}
	602	The only difference between a raise and a reraise is that reraise does not
	603	create a new exception; instead it continues using the current exception, \ie
	604	no allocation and copy. However the default handler is still set to the one
	605	visible at the raise point, and hence, for termination could refer to data that
	606	is part of an unwound stack frame. To prevent this problem, a new default
	607	handler is generated that does a program-level abort.
[f6106a6]	608	\end{comment}
	609
	610	\subsection{Comparison with Reraising}
[aa173d8]	611	Without conditional catch, the only approach to match in more detail is to reraise
	612	the exception after it has been caught, if it could not be handled.
	613	\begin{center}
	614	\begin{tabular}{l|l}
	615	\begin{cfa}
	616	try {
	617	do_work_may_throw();
	618	} catch(excep_t * ex; can_handle(ex)) {
	619
	620	handle(ex);
	621
	622
[f6106a6]	623
[aa173d8]	624	}
	625	\end{cfa}
	626	&
[f6106a6]	627	\begin{cfa}
	628	try {
[aa173d8]	629	do_work_may_throw();
	630	} catch(excep_t * ex) {
	631	if (can_handle(ex)) {
	632	handle(ex);
	633	} else {
	634	throw;
	635	}
[f6106a6]	636	}
	637	\end{cfa}
[aa173d8]	638	\end{tabular}
	639	\end{center}
	640	Notice catch-and-reraise increases complexity by adding data and
	641	code to the exception process. Nevertheless, catch-and-reraise can simulate
	642	conditional catch straightforwardly, when exceptions are disjoint, \ie no
	643	inheritance.
	644
	645	However, catch-and-reraise simulation becomes unusable for exception inheritance.
	646	\begin{flushleft}
	647	\begin{cfa}[xleftmargin=6pt]
	648	exception E1;
	649	exception E2(E1); // inheritance
	650	\end{cfa}
	651	\begin{tabular}{l|l}
	652	\begin{cfa}
	653	try {
	654	... foo(); ... // raise E1/E2
	655	... bar(); ... // raise E1/E2
	656	} catch( E2 e; e.rtn == foo ) {
	657	...
	658	} catch( E1 e; e.rtn == foo ) {
	659	...
	660	} catch( E1 e; e.rtn == bar ) {
	661	...
	662	}
[f6106a6]	663
[aa173d8]	664	\end{cfa}
	665	&
[f6106a6]	666	\begin{cfa}
	667	try {
[aa173d8]	668	... foo(); ...
	669	... bar(); ...
	670	} catch( E2 e ) {
	671	if ( e.rtn == foo ) { ...
	672	} else throw; // reraise
	673	} catch( E1 e ) {
	674	if (e.rtn == foo) { ...
	675	} else if (e.rtn == bar) { ...
	676	} else throw; // reraise
[f6106a6]	677	}
	678	\end{cfa}
[aa173d8]	679	\end{tabular}
	680	\end{flushleft}
	681	The derived exception @E2@ must be ordered first in the catch list, otherwise
	682	the base exception @E1@ catches both exceptions. In the catch-and-reraise code
	683	(right), the @E2@ handler catches exceptions from both @foo@ and
	684	@bar@. However, the reraise misses the following catch clause. To fix this
	685	problem, an enclosing @try@ statement is needed to catch @E2@ for @bar@ from the
	686	reraise, and its handler must duplicate the inner handler code for @bar@.
	687	Generalizing this fix to any amount of inheritance and complexity of try
	688	statement requires a technique called \emph{try-block
	689	splitting}~\cite{Krischer02}, which is not discussed in this thesis. It is
	690	sufficient to state that conditional catch is more expressive than
	691	catch-and-reraise in terms of complexity.
	692
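For this particular example, the enclosing @try@ statement could look like
the following sketch. It reuses the notation of the example above and is only
one possible arrangement, not a general implementation of try-block splitting.
\begin{cfa}
try {
	try {
		... foo(); ...
		... bar(); ...
	} catch( E2 e ) {
		if ( e.rtn == foo ) { ...
		} else throw; // reraise, skips the E1 clause below
	} catch( E1 e ) {
		if ( e.rtn == foo ) { ...
		} else if ( e.rtn == bar ) { ... // handler code for bar
		} else throw; // reraise
	}
} catch( E1 e ) { // also matches an E2 reraised by the inner handler
	if ( e.rtn == bar ) { ... // duplicate of the handler code for bar
	} else throw; // reraise
}
\end{cfa}
The handler code for @bar@ now appears twice, and each additional level of
inheritance or additional handler clause would add further nesting and
duplication.
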
	693	\begin{comment}
	694	That is, they have the same behaviour in isolation.
[4aba055]	695	Two things can expose differences between these cases.
	696
[6071efc]	697	One is the existence of multiple handlers on a single try statement.
[aa173d8]	698	A reraise skips all later handlers for a try statement but a conditional
[4aba055]	699	catch does not.
[aa173d8]	700	% Hence, if an earlier handler contains a reraise later handlers are
	701	% implicitly skipped, with a conditional catch they are not.
[4aba055]	702	Still, they are equivalently powerful,
[6071efc]	703	both can be used to mimic the behaviour of the other,
[4aba055]	704	as reraise can pack arbitrary code in the handler and conditional catches
	705	can put arbitrary code in the predicate.
[6071efc]	706	% I was struggling with a long explanation about some simple solutions,
[4aba055]	707	% like repeating a condition on later handlers, and the general solution of
	708	% merging everything together. I don't think it is useful though unless its
	709	% for a proof.
	710	% https://en.cppreference.com/w/cpp/language/throw
	711
	712	The question then becomes ``Which is a better default?"
	713	We believe that not skipping possibly useful handlers is a better default.
	714	If a handler can handle an exception it should and if the handler can not
	715	handle the exception then it is probably safer to have that explicitly
	716	described in the handler itself instead of implicitly described by its
	717	ordering with other handlers.
	718	% Or you could just alter the semantics of the throw statement. The handler
	719	% index is in the exception so you could use it to know where to start
	720	% searching from in the current try statement.
	721	% No place for the `goto else;` metaphor.
	722
	723	The other issue is all of the discussion above assumes that the only
	724	way to tell apart two raises is the exception being raised and the remaining
	725	search path.
	726	This is not true generally, the current state of the stack can matter in
	727	a number of cases, even only for a stack trace after a program abort.
	728	But \CFA has a much more significant need of the rest of the stack, the
	729	default handlers for both termination and resumption.
	730
	731	% For resumption it turns out it is possible continue a raise after the
	732	% exception has been caught, as if it hadn't been caught in the first place.
	733	This becomes a problem combined with the stack unwinding used in termination
	734	exception handling.
	735	The stack is unwound before the handler is installed, and hence before any
	736	reraises can run. So if a reraise happens the previous stack is gone,
	737	the place on the stack where the default handler was supposed to run is gone,
	738	if the default handler was a local function it may have been unwound too.
	739	There is no reasonable way to restore that information, so the reraise has
	740	to be considered as a new raise.
	741	This is the strongest advantage conditional catches have over reraising,
	742	they happen before stack unwinding and avoid this problem.
	743
	744	% The one possible disadvantage of conditional catch is that it runs user
	745	% code during the exception search. While this is a new place that user code
	746	% can be run destructors and finally clauses are already run during the stack
	747	% unwinding.
	748	%
	749	% https://www.cplusplus.com/reference/exception/current_exception/
	750	% `exception_ptr current_exception() noexcept;`
	751	% https://www.python.org/dev/peps/pep-0343/
[aa173d8]	752	\end{comment}
[4a36b344]	753
	754	\section{Finally Clauses}
[f6106a6]	755	\label{s:FinallyClauses}
[de47a9d]	756	Finally clauses are used to perform unconditional clean-up when leaving a
[f6106a6]	757	scope and are placed at the end of a try statement after any handler clauses:
[4706098c]	758	\begin{cfa}
[4a36b344]	759	try {
[4706098c]	760	GUARDED_BLOCK
[29c9b23]	761	} ... // any number or kind of handler clauses
	762	... finally {
[4706098c]	763	FINALLY_BLOCK
[4a36b344]	764	}
[4706098c]	765	\end{cfa}
[29c9b23]	766	The @FINALLY_BLOCK@ is executed when the try statement is removed from the
[1830a86]	767	stack, including when the @GUARDED_BLOCK@ finishes, any termination handler
[aa173d8]	768	finishes, or during an unwind.
[29c9b23]	769	The only time the block is not executed is if the program is exited before
[1830a86]	770	the stack is unwound.
[4706098c]	771
	772	Execution of the finally block should always finish, meaning control runs off
[f6106a6]	773	the end of the block. This requirement ensures control always continues as if
	774	the finally clause is not present, \ie finally is for cleanup not changing
	775	control flow.
	776	Because of this requirement, local control flow out of the finally block
[1c1c180]	777	is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or
[4706098c]	778	@return@ that causes control to leave the finally block. Other ways to leave
	779	the finally block, such as a long jump or termination, are much harder to check,
[f6106a6]	780	at best requiring additional run-time overhead, and so are only
[1830a86]	781	discouraged.
	782
[f6106a6]	783	Not all languages with unwinding have finally clauses. Notably \Cpp does
[aa173d8]	784	without it as destructors, and the RAII design pattern, serve a similar role.
	785	Although destructors and finally clauses can be used for the same cases,
[4aba055]	786	they have their own strengths, similar to top-level functions and lambda
	787	functions with closures.
[aa173d8]	788	Destructors take more work for their creation, but if there is clean-up code
	789	that needs to be run every time a type is used, they are much easier
[4aba055]	790	to set-up.
	791	On the other hand, finally clauses capture the local context, so they are easy to
	792	use when the clean-up is not dependent on the type of a variable or requires
	793	information from multiple variables.
[4a36b344]	794
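For example, a finally clause can release several local resources in one
place. The following is a minimal sketch, assuming the C standard I/O
functions from @<stdio.h>@; the function @copy_file@ and its parameters are
hypothetical.
\begin{cfa}
#include <stdio.h>

void copy_file( const char * from, const char * to ) {
	FILE * in = fopen( from, "r" );
	FILE * out = fopen( to, "w" );
	try {
		... // copy from in to out, may raise an exception
	} finally {
		// Uses two local variables; control runs off the end of the block.
		if ( in ) fclose( in );
		if ( out ) fclose( out );
	}
}
\end{cfa}
Doing the same clean-up with destructors would require wrapping @FILE *@ in a
new type, which is only worthwhile if that type is used in many places.
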
	795	\section{Cancellation}
[f6106a6]	796	\label{s:Cancellation}
[de47a9d]	797	Cancellation is a stack-level abort, which can be thought of as an
[f6106a6]	798	uncatchable termination. It unwinds the entire current stack, and if
[de47a9d]	799	possible forwards the cancellation exception to a different stack.
[4706098c]	800
[29c9b23]	801	Cancellation is not an exception operation like termination or resumption.
[4706098c]	802	There is no special statement for starting a cancellation; instead the standard
[1c1c180]	803	library function @cancel_stack@ is called passing an exception. Unlike a
[f6106a6]	804	raise, this exception is not used in matching, only to pass information about
[4706098c]	805	the cause of the cancellation.
[aa173d8]	806	Finally, since a cancellation only unwinds and forwards, there is no default handler.
[4706098c]	807
[f6106a6]	808	After @cancel_stack@ is called the exception is copied into the EHM's memory
[4aba055]	809	and the current stack is unwound.
	810	The behaviour after that depends on the kind of stack being cancelled.
[a6c45c6]	811
	812	\paragraph{Main Stack}
[4706098c]	813	The main stack is the one used by the program main at the start of execution,
[f6106a6]	814	and is the only stack in a sequential program.
	815	After the main stack is unwound there is a program-level abort.
	816
[aa173d8]	817	The reason for this semantics in a sequential program is that there is no more code to execute.
	818	This semantics also applies to concurrent programs, even if threads are running.
	819	That is, if any thread starts a cancellation, it implies all threads terminate.
	820	Keeping the same behaviour in sequential and concurrent programs is simple.
[4aba055]	821	Also, even in concurrent programs there may not currently be any other stacks
	822	and even if other stacks do exist, main has no way to know where they are.
[4706098c]	823
[a6c45c6]	824	\paragraph{Thread Stack}
[f6106a6]	825	A thread stack is created for a \CFA @thread@ object or object that satisfies
	826	the @is_thread@ trait.
[4aba055]	827	After a thread stack is unwound, the exception is stored until another
[f6106a6]	828	thread attempts to join with it. Then the exception @ThreadCancelled@,
	829	which stores a reference to the thread and to the exception passed to the
[4aba055]	830	cancellation, is reported from the join to the joining thread.
[f6106a6]	831	There is one difference between an explicit join (with the @join@ function)
	832	and an implicit join (from a destructor call). The explicit join takes the
	833	default handler (@defaultResumptionHandler@) from its calling context while
[4aba055]	834	the implicit join provides its own, which does a program abort if the
[f6106a6]	835	@ThreadCancelled@ exception cannot be handled.
	836
[4aba055]	837	The communication and synchronization are done here because threads only have
	838	two structural points (not dependent on user-code) where
	839	communication/synchronization happens: start and join.
[f6106a6]	840	Since a thread must be running to perform a cancellation (and cannot be
	841	cancelled from another stack), the cancellation must be after start and
[4aba055]	842	before the join, so join is used.
[f6106a6]	843
	844	% TODO: Find somewhere to discuss unwind collisions.
	845	The difference between the explicit and implicit join is for safety and
	846	debugging. It helps prevent unwinding collisions by avoiding throwing from
	847	a destructor and prevents cascading the error across multiple threads if
	848	the user is not equipped to deal with it.
[33e1c91]	849	It is always possible to add an explicit join if that is the desired behaviour.
	850
	851	With explicit join and a default handler that triggers a cancellation, it is
	852	possible to cascade an error across any number of threads, cleaning up each
	853	in turn, until the error is handled or the main thread is reached.
[f6106a6]	854
[a6c45c6]	855	\paragraph{Coroutine Stack}
[f6106a6]	856	A coroutine stack is created for a @coroutine@ object or object that
	857	satisfies the @is_coroutine@ trait.
[4aba055]	858	After a coroutine stack is unwound, control returns to the @resume@ function
	859	that most recently resumed it. @resume@ reports a
[21f2e92]	860	@CoroutineCancelled@ exception, which contains references to the cancelled
[f6106a6]	861	coroutine and the exception used to cancel it.
[4aba055]	862	The @resume@ function also takes the \defaultResumptionHandler{} from the
[21f2e92]	863	caller's context and passes it to the internal report.
[f6106a6]	864
[aa173d8]	865	A coroutine only knows of two other coroutines, its starter and its last resumer.
[4aba055]	866	The starter has a much more distant connection, while the last resumer just
[f6106a6]	867	(in terms of coroutine state) called resume on this coroutine, so the message
	868	is passed to the latter.
[33e1c91]	869
	870	With a default handler that triggers a cancellation, it is possible to
	871	cascade an error across any number of coroutines, cleaning up each in turn,
	872	until the error is handled or a thread stack is reached.
[aa173d8]	873
	874	\PAB{Part of this I do not understand. A cancellation cannot be caught. But you
	875	talk about handling a cancellation in the last sentence. Which is correct?}
