% Source: doc/theses/andrew_beach_MMath/implement.tex @ 77f1265
% (last changed in 9d7e5cb by Andrew Beach)
[26ca815]	1	\chapter{Implementation}
	2	% Goes over how all the features are implemented.
	3
[7eb6eb5]	4	The implementation work for this thesis covers two components: the virtual
	5	system and exceptions. Each component is discussed in detail.
	6
[26ca815]	7	\section{Virtual System}
[7eb6eb5]	8	\label{s:VirtualSystem}
[26ca815]	9	% Virtual table rules. Virtual tables, the pointer to them and the cast.
While the \CFA virtual system currently has only one public feature, virtual
cast (see the virtual cast feature \vpageref{p:VirtualCast}),
substantial structure is required to support it
and to provide features for exception handling and the standard library.
[7eb6eb5]	14
[830299f]	15	\subsection{Virtual Type}
[9d7e5cb]	16	Virtual types only have one change to their structure: the addition of a
	17	pointer to the virtual table, which is called the \emph{virtual-table pointer}.
	18	Internally, the field is called @virtual_table@.
	19	The field is fixed after construction. It is always the first field in the
	20	structure so that its location is always known.
	21	\todo{Talk about constructors for virtual types (after they are working).}
	22
The virtual-table pointer is what binds an instance of a virtual type to a
virtual table. The pointer can be used as an identity check. It can also be
used to access the virtual table and the virtual members there.
	26
	27	\subsection{Type Id}
Every virtual type has a unique id.
Type ids can be compared for equality (the types represented are the same)
or used to access the type's type information.
The type information currently is only the parent's type id or, if the
type has no parent, zero.

The ids are implemented as pointers to the type's type information instance.
Dereferencing the pointer gets the type information.
By going back and forth between the type id and
the type information, one can find every ancestor of a virtual type.
This representation also reduces the problem of creating a unique value (for
the type id) to the problem of creating a unique instance (for the type
information), which the linker can solve.
	41
Advanced linker support is required because there is no place that appears
only once to attach the type information to. There should be one structure
definition but it is included in multiple translation units. Each virtual
table definition should be unique but there are an arbitrary number of those.
So the special section prefix \texttt{.gnu.linkonce} is used.
With a unique suffix (making the entire section name unique) the linker
removes duplicate definitions, making sure only one version exists after
linking.
Then it is just a matter of making sure there is a unique name for each type.
	50
	51	This is done in three phases.
	52	The first phase is to generate a new structure definition to store the type
	53	information. The layout is the same in each case, just the parent's type id,
	54	but the types are changed.
The structure's name is changed (it is based on the virtual type's name), as
is the type of the parent's type id.
	57	If the virtual type is polymorphic then the type information structure is
	58	polymorphic as well, with the same polymorphic arguments.
	59
The second phase is to generate an instance of the type information with an
almost unique name, generated by mangling the virtual type name.
	62
	63	The third phase is implicit with \CFA's overloading scheme. \CFA mangles
	64	names with type information so that all of the symbols exported to the linker
	65	are unique even if in \CFA code they are the same. Having two declarations
	66	with the same name and same type is forbidden because it is impossible for
	67	overload resolution to pick between them. This is why a unique type is
	68	generated for each virtual type.
Polymorphic information is included in this mangling so polymorphic
types have separate instances for each set of polymorphic arguments.
[0c4df43]	71
\begin{cfa}
struct /* type name */ {
	/* parent type name */ const * parent;
};

__attribute__((section(".gnu.linkonce./* instance name */")))
/* type name */ const /* instance name */ = {
	&/* parent instance name */,
};
\end{cfa}
[830299f]	82
[7eb6eb5]	83	\subsection{Virtual Table}
[9d7e5cb]	84	Each virtual type has a virtual table type that stores its type id and
	85	virtual members.
	86	Each virtual type instance is bound to a table instance that is filled with
	87	the values of virtual members.
Both the layout of the fields and their values are decided by the rules given
below.
	90
	91	The layout always comes in three parts.
The first section is just the type id at the head of the table. It is always
there to ensure that the type id can be found at a known location, whatever the
exact type of the table.
The second section contains all the virtual members of the parent, in the same
order as they appear in the parent's virtual table. Note that the types may
change slightly, as references to ``this'' change. These changes are limited
to the targets of pointers/references and to function pointers, so that the
sizes (and hence the offsets) are the same.
The third section is similar to the second except that it contains the new
virtual members introduced at this level in the hierarchy.
	101
	102	\begin{figure}
	103	\begin{cfa}
	104	type_id
	105	parent_field0
	106	...
	107	parent_fieldN
[0c4df43]	108	child_field0
[830299f]	109	...
	110	child_fieldN
[9d7e5cb]	111	\end{cfa}
	112	\caption{Virtual Table Layout}
	113	\label{f:VirtualTableLayout}
	114	\todo*{Improve the Virtual Table Layout diagram.}
	115	\end{figure}
	116
	117	The first and second sections together mean that every virtual table has a
	118	prefix that has the same layout and types as its parent virtual table.
	119	This, combined with the fixed offset to the virtual table pointer, means that
	120	for any virtual type it doesn't matter if we have it or any of its
	121	descendants, it is still always safe to access the virtual table through
	122	the virtual table pointer.
	123	From there it is safe to check the type id to identify the exact type of the
	124	underlying object, access any of the virtual members and pass the object to
	125	any of the method-like virtual members.
	126	\todo{Introduce method-like virtual members.}
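
The prefix property can be illustrated with a minimal C sketch. The type and
member names (@shape_vtable@, @circle_vtable@, @area@, @name@) are
hypothetical and are not what the \CFA translator generates; only the
placement of the type id, the parent's members and the virtual-table pointer
follows the rules above. Like the layout rules themselves, the sketch relies
on matching layouts rather than on strict ISO C typing.
\begin{lstlisting}[language=C]
#include <stddef.h>

/* Hypothetical type information: only the parent's type id (null at the root). */
struct type_info { const struct type_info * parent; };

struct shape;   /* hypothetical parent virtual type */
struct circle;  /* hypothetical child virtual type */

/* Parent table: type id first, then the parent's virtual members. */
struct shape_vtable {
    const struct type_info * type_id;
    size_t size;
    double (*area)(const struct shape *);
};

/* Child table: the same prefix, then the members new at this level. */
struct circle_vtable {
    const struct type_info * type_id;
    size_t size;
    double (*area)(const struct circle *);  /* only the "this" type changes */
    const char * (*name)(const struct circle *);
};

/* The virtual-table pointer is always the first field. */
struct shape  { const struct shape_vtable * virtual_table; };
struct circle { const struct circle_vtable * virtual_table; double radius; };

static const struct type_info shape_info  = { NULL };
static const struct type_info circle_info = { &shape_info };

static double circle_area(const struct circle * c) {
    return 3.0 * c->radius * c->radius;  /* crude approximation, exact in tests */
}
static const char * circle_name(const struct circle * c) {
    (void)c;
    return "circle";
}

static const struct circle_vtable circle_table = {
    &circle_info, sizeof(struct circle), circle_area, circle_name,
};

/* Works for a shape or any descendant: only the shared prefix is touched. */
double area_of(const struct shape * s) {
    return s->virtual_table->area(s);
}
\end{lstlisting}
Passing a @circle@ through a pointer to @shape@ works here because both the
object and its table begin with the fields the parent expects.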
	127
	128	When a virtual table is declared the user decides where to declare it and its
	129	name. The initialization of the virtual table is entirely automatic based on
	130	the context of the declaration.
	131
The type id is always fixed; each virtual table type has
exactly one possible type id.
The virtual members are usually filled in by resolution. The best match for
a given name and type at the declaration site is filled in.
There are two exceptions to that rule: the @size@ field is the type's size
and is set to the result of a @sizeof@ expression, and the @align@ field is
the type's alignment and similarly uses an @alignof@ expression.
	139
	140	\subsubsection{Concurrency Integration}
[f28fdee]	141	Coroutines and threads need instances of @CoroutineCancelled@ and
[830299f]	142	@ThreadCancelled@ respectively to use all of their functionality. When a new
[0c4df43]	143	data type is declared with @coroutine@ or @thread@ the forward declaration for
[7eb6eb5]	144	the instance is created as well. The definition of the virtual table is created
	145	at the definition of the main function.
\todo{Add an example with code snippets.}
[26ca815]	147
	148	\subsection{Virtual Cast}
[7eb6eb5]	149	Virtual casts are implemented as a function call that does the subtype check
	150	and a C coercion-cast to do the type conversion.
	151	% The C-cast is just to make sure the generated code is correct so the rest of
	152	% the section is about that function.
[9d7e5cb]	153	The function is implemented in the standard library and has the following
	154	signature:
[7eb6eb5]	155	\begin{cfa}
[0c4df43]	156	void * __cfa__virtual_cast(
	157	struct __cfa__parent_vtable const * parent,
[7eb6eb5]	158	struct __cfa__parent_vtable const * const * child );
	159	\end{cfa}
[9d7e5cb]	160	\todo{Get rid of \_\_cfa\_\_parent\_vtable in the standard library and then
	161	the document.}
The type id of the target type of the virtual cast is passed in as @parent@
and the pointer being cast is passed in as @child@.
	164
	165	For C generation both arguments and the result are wrapped with type casts.
	166	There is also an internal store inside the compiler to make sure that the
	167	target type is a virtual type.
	168	% It also checks for conflicting definitions.
	169
The virtual cast either returns the original pointer as a new type or null.
So the function just does the parent check and returns the appropriate value.
The parent check is a simple linear search of the child's ancestors using the
type information.
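
The parent check can be sketched in a few lines of C. This is an
illustration under stated assumptions, not the library's code: the names
@type_info@, @vtable_head@, @object_head@ and @virtual_cast_sketch@ are
hypothetical, and the only facts taken from the text are that the
virtual-table pointer is the first field of the object, the type id is the
first field of the table, and each type's information holds its parent's id.
\begin{lstlisting}[language=C]
#include <stddef.h>

/* Hypothetical type information: just the parent's type id (null at the root). */
struct type_info { const struct type_info * parent; };

/* Every virtual table starts with the type id. */
struct vtable_head { const struct type_info * type_id; };

/* Every virtual object starts with the virtual-table pointer. */
struct object_head { const struct vtable_head * virtual_table; };

/* Return the object if its type is the target or a descendant, else null. */
void * virtual_cast_sketch(const struct type_info * target, void * object) {
    const struct object_head * head = object;
    /* Linear search: the object's own type id, then each ancestor in turn. */
    for (const struct type_info * id = head->virtual_table->type_id;
            id != NULL; id = id->parent) {
        if (id == target) return object;
    }
    return NULL;
}
\end{lstlisting}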
[26ca815]	174
	175	\section{Exceptions}
	176	% Anything about exception construction.
	177
	178	\section{Unwinding}
	179	% Adapt the unwind chapter, just describe the sections of libunwind used.
	180	% Mention that termination and cancellation use it. Maybe go into why
	181	% resumption doesn't as well.
	182
[0c4df43]	183	% Many modern languages work with an interal stack that function push and pop
[7eb6eb5]	184	% their local data to. Stack unwinding removes large sections of the stack,
	185	% often across functions.
	186
	187	Stack unwinding is the process of removing stack frames (activations) from the
[9d7e5cb]	188	stack. On function entry and return, unwinding is handled directly by the
	189	call/return code embedded in the function.
	190	In many cases the position of the instruction pointer (relative to parameter
	191	and local declarations) is enough to know the current size of the stack
	192	frame.
	193
Usually, the stack-frame size is known statically based on parameter and
local variable declarations. Even with dynamic stack-size, the information
to determine how much of the stack has to be removed is still contained
within the function.
Allocating/deallocating stack space is usually an $O(1)$ operation achieved by
bumping the hardware stack-pointer up or down as needed.
Constructing/destructing values on the stack takes longer, but in terms of
figuring out what needs to be done it is of similar complexity.
[7eb6eb5]	202
[9d7e5cb]	203	Unwinding across multiple stack frames is more complex because that
	204	information is no longer contained within the current function.
With separate compilation a function has no way of knowing what its callers
are so it can't know how large those frames are.
	207	Without altering the main code path it is also hard to pass that work off
	208	to the caller.
[7eb6eb5]	209
The traditional unwinding mechanism for C is implemented by saving a snap-shot
of a function's state with @setjmp@ and restoring that snap-shot with
@longjmp@. This approach bypasses the need to know stack details by simply
resetting to a snap-shot of an arbitrary but existing function frame on the
stack. It is up to the programmer to ensure the snap-shot is valid when it is
reset and that all required clean-up from the unwound stack frames is
performed.
This approach is fragile and forces extra work onto the surrounding code.
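
The fragility is easy to reproduce. In the following sketch (all names are
illustrative), @intermediate@ acquires a resource and expects to release it
after the call returns, but @longjmp@ discards its frame without running the
release, so the clean-up is silently lost.
\begin{lstlisting}[language=C]
#include <setjmp.h>

static jmp_buf recovery_point;  /* snapshot of run_with_recovery's state */
static int resources_held = 0;  /* stands in for memory, locks, files, ... */

static void fail(void) {
    longjmp(recovery_point, 1);  /* discards every frame above the snapshot */
}

static void intermediate(void) {
    ++resources_held;   /* acquire */
    fail();
    --resources_held;   /* release: skipped, control never returns here */
}

/* Returns 0 if intermediate() returned normally, 1 if it was unwound. */
int run_with_recovery(void) {
    if (setjmp(recovery_point) == 0) {
        intermediate();
        return 0;
    }
    return 1;  /* arrived here through longjmp */
}
\end{lstlisting}
After @run_with_recovery@ returns, @resources_held@ is still one: the only
way to avoid the leak is for @intermediate@ to anticipate the jump and
release the resource itself.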
	217
With respect to that work forced onto the surrounding code,
many languages define clean-up actions that must be taken when certain
sections of the stack are removed, such as when the storage for a variable
is removed from the stack or when a try statement with a finally clause is
(conceptually) popped from the stack.
None of these should be handled by the user; that would contradict the
intention of these features, so they need to be handled automatically.
	225
	226	To safely remove sections of the stack the language must be able to find and
	227	run these clean-up actions even when removing multiple functions unknown at
	228	the beginning of the unwinding.
[7eb6eb5]	229
	230	One of the most popular tools for stack management is libunwind, a low-level
	231	library that provides tools for stack walking, handler execution, and
	232	unwinding. What follows is an overview of all the relevant features of
	233	libunwind needed for this work, and how \CFA uses them to implement exception
	234	handling.
	235
	236	\subsection{libunwind Usage}
	237	Libunwind, accessed through @unwind.h@ on most platforms, is a C library that
[df24d37]	238	provides \Cpp-style stack-unwinding. Its operation is divided into two phases:
[7eb6eb5]	239	search and cleanup. The dynamic target search -- phase 1 -- is used to scan the
	240	stack and decide where unwinding should stop (but no unwinding occurs). The
	241	cleanup -- phase 2 -- does the unwinding and also runs any cleanup code.
	242
	243	To use libunwind, each function must have a personality function and a Language
[830299f]	244	Specific Data Area (LSDA). The LSDA has the unique information for each
[7eb6eb5]	245	function to tell the personality function where a function is executing, its
[830299f]	246	current stack frame, and what handlers should be checked. Theoretically, the
[7eb6eb5]	247	LSDA can contain any information but conventionally it is a table with entries
	248	representing regions of the function and what has to be done there during
[9d7e5cb]	249	unwinding. These regions are bracketed by instruction addresses. If the
instruction pointer is within a region's start/end, then execution is currently
in that region. Regions are used to mark out the scopes of objects
	252	with destructors and try blocks.
	253
	254	% Libunwind actually does very little, it simply moves down the stack from
	255	% function to function. Most of the actions are implemented by the personality
	256	% function which libunwind calls on every function. Since this is shared across
	257	% many functions or even every function in a language it will need a bit more
	258	% information.
	259
	260	The GCC compilation flag @-fexceptions@ causes the generation of an LSDA and
[9d7e5cb]	261	attaches a personality function to each function.
	262	In plain C (which \CFA currently compiles down to) this
[830299f]	263	flag only handles the cleanup attribute:
[7eb6eb5]	264	\begin{cfa}
	265	void clean_up( int * var ) { ... }
[830299f]	266	int avar __attribute__(( cleanup(clean_up) ));
[7eb6eb5]	267	\end{cfa}
The attribute is used on a variable and specifies a function,
in this case @clean_up@, that is run when the variable goes out of scope.
This is enough to mimic destructors, but not try statements, which can affect
the unwinding.
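
A small, self-contained example of the attribute, as supported by GCC and
Clang, shows the destructor-like ordering. The recording array and the
function @scopes@ are only scaffolding for the illustration.
\begin{lstlisting}[language=C]
static int cleaned[4];
static int cleaned_count = 0;

/* Called with the address of the variable that is leaving scope. */
static void clean_up(int * var) {
    cleaned[cleaned_count++] = *var;
}

void scopes(void) {
    int outer __attribute__(( cleanup(clean_up) )) = 1;
    {
        int inner __attribute__(( cleanup(clean_up) )) = 2;
    }   /* clean_up(&inner) runs here */
}       /* clean_up(&outer) runs here */
\end{lstlisting}
The inner variable is cleaned up first, then the outer one, which is the
reverse of declaration order, as with destructors.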
	272
To get full unwinding support, all of this has to be done directly with
assembly and assembler directives, particularly the CFI directives
\texttt{.cfi\_lsda} and \texttt{.cfi\_personality}.
[7eb6eb5]	276
	277	\subsection{Personality Functions}
[830299f]	278	Personality functions have a complex interface specified by libunwind. This
[7eb6eb5]	279	section covers some of the important parts of the interface.
	280
A personality function can perform different actions depending on how it is
called.
\begin{lstlisting}[language=C,{moredelim=**[is][\color{red}]{@}{@}}]
typedef _Unwind_Reason_Code (*@_Unwind_Personality_Fn@) (
	int @version@,
	_Unwind_Action @action@,
	_Unwind_Exception_Class @exception_class@,
	_Unwind_Exception * @exception@,
	struct _Unwind_Context * @context@
);
\end{lstlisting}
The @version@ argument is the version number of the unwinding interface.
The @action@ argument is a bitmask of possible actions:
[9d7e5cb]	292	\begin{enumerate}[topsep=5pt]
[7eb6eb5]	293	\item
	294	@_UA_SEARCH_PHASE@ specifies a search phase and tells the personality function
[830299f]	295	to check for handlers. If there is a handler in a stack frame, as defined by
[7eb6eb5]	296	the language, the personality function returns @_URC_HANDLER_FOUND@; otherwise
it returns @_URC_CONTINUE_UNWIND@.
	298
	299	\item
	300	@_UA_CLEANUP_PHASE@ specifies a cleanup phase, where the entire frame is
	301	unwound and all cleanup code is run. The personality function does whatever
	302	cleanup the language defines (such as running destructors/finalizers) and then
	303	generally returns @_URC_CONTINUE_UNWIND@.
	304
	305	\item
	306	\begin{sloppypar}
	307	@_UA_HANDLER_FRAME@ specifies a cleanup phase on a function frame that found a
	308	handler. The personality function must prepare to return to normal code
	309	execution and return @_URC_INSTALL_CONTEXT@.
	310	\end{sloppypar}
	311
	312	\item
	313	@_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs
	314	the cleanup phase and uses a different means to decide when to stop
[0c4df43]	315	(see \vref{s:ForcedUnwind}).
[7eb6eb5]	316	\end{enumerate}
	317
The @exception_class@ argument is a copy of the
\code{C}{exception}'s @exception_class@ field.
This is a number that identifies the exception handling mechanism that created
the exception.
[7eb6eb5]	322
The \code{C}{exception} argument is a pointer to the user
provided storage object. It has two public fields: the @exception_class@,
which is described above, and the @exception_cleanup@ function.
The clean-up function is used by the EHM to clean up the exception if it
should need to be freed at an unusual time; it takes an argument that says
why it had to be cleaned up.
[7eb6eb5]	329
	330	The @context@ argument is a pointer to an opaque type passed to helper
	331	functions called inside the personality function.
	332
The return value, @_Unwind_Reason_Code@, is an enumeration of possible messages
that can be passed to several places in libunwind. It includes a number of
messages for special cases (some of which should never be used by the
personality function) and error codes. However, unless otherwise noted, the
personality function should always return @_URC_CONTINUE_UNWIND@.
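
The action handling described in this section can be summarized as a
skeleton. It is a sketch rather than the \CFA personality function: the
parameter list follows the declaration in GCC's @unwind.h@ (which takes a
leading @version@ argument), the flag @frame_has_handler@ is a hypothetical
stand-in for decoding the LSDA, and the real work of running clean-up code
and installing a landing pad is reduced to comments.
\begin{lstlisting}[language=C]
#include <unwind.h>

/* Hypothetical stand-in for the language-specific LSDA lookup. */
static int frame_has_handler = 0;

_Unwind_Reason_Code personality_sketch(
        int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception * exception,
        struct _Unwind_Context * context) {
    (void)version; (void)exception_class; (void)exception; (void)context;

    if (actions & _UA_SEARCH_PHASE) {
        /* Phase 1: only report whether this frame has a matching handler. */
        return frame_has_handler ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
    }
    if (actions & _UA_CLEANUP_PHASE) {
        if (actions & _UA_HANDLER_FRAME) {
            /* Phase 2 on the frame chosen in phase 1: a real personality
               function prepares the return to normal execution here. */
            return _URC_INSTALL_CONTEXT;
        }
        /* Phase 2 on any other frame (forced unwind included): run the
           frame's clean-up code, then let the unwinder move on. */
        return _URC_CONTINUE_UNWIND;
    }
    /* Neither phase flag set: not a call this sketch knows how to handle. */
    return _URC_FATAL_PHASE1_ERROR;
}
\end{lstlisting}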
[26ca815]	338
	339	\subsection{Raise Exception}
[7eb6eb5]	340	Raising an exception is the central function of libunwind and it performs a
	341	two-staged unwinding.
	342	\begin{cfa}
[26ca815]	343	_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *);
[7eb6eb5]	344	\end{cfa}
	345	First, the function begins the search phase, calling the personality function
	346	of the most recent stack frame. It continues to call personality functions
	347	traversing the stack from newest to oldest until a function finds a handler or
	348	the end of the stack is reached. In the latter case, raise exception returns
	349	@_URC_END_OF_STACK@.
	350
[9d7e5cb]	351	Second, when a handler is matched, raise exception moves to the clean-up
	352	phase and walks the stack a second time.
[7eb6eb5]	353	Once again, it calls the personality functions of each stack frame from newest
	354	to oldest. This pass stops at the stack frame containing the matching handler.
If that personality function has not installed a handler, it is an error.
	356
	357	If an error is encountered, raise exception returns either
	358	@_URC_FATAL_PHASE1_ERROR@ or @_URC_FATAL_PHASE2_ERROR@ depending on when the
	359	error occurred.
[26ca815]	360
	361	\subsection{Forced Unwind}
[7eb6eb5]	362	\label{s:ForcedUnwind}
	363	Forced Unwind is the other central function in libunwind.
	364	\begin{cfa}
[9d7e5cb]	365	_Unwind_Reason_Code _Unwind_ForcedUnwind(_Unwind_Exception *,
[7eb6eb5]	366	_Unwind_Stop_Fn, void *);
	367	\end{cfa}
It also unwinds the stack but it does not use the search phase. Instead another
function, the stop function, is used to decide where to stop. The exception is
the same as the one passed to raise exception. The extra arguments are the stop
function and the stop parameter. The stop function has a similar interface to a
personality function, except it is also passed the stop parameter.
\begin{lstlisting}[language=C,{moredelim=**[is][\color{red}]{@}{@}}]
typedef _Unwind_Reason_Code (*@_Unwind_Stop_Fn@)(
	int @version@,
	_Unwind_Action @action@,
	_Unwind_Exception_Class @exception_class@,
	_Unwind_Exception * @exception@,
	struct _Unwind_Context * @context@,
	void * @stop_parameter@);
\end{lstlisting}
	381
	382	The stop function is called at every stack frame before the personality
[7eb6eb5]	383	function is called and then once more after all frames of the stack are
	384	unwound.
[26ca815]	385
[7eb6eb5]	386	Each time it is called, the stop function should return @_URC_NO_REASON@ or
	387	transfer control directly to other code outside of libunwind. The framework
	388	does not provide any assistance here.
[26ca815]	389
[7eb6eb5]	390	\begin{sloppypar}
Its arguments are the same as those of the paired personality function. The actions
[7eb6eb5]	392	@_UA_CLEANUP_PHASE@ and @_UA_FORCE_UNWIND@ are always set when it is
	393	called. Beyond the libunwind standard, both GCC and Clang add an extra action
	394	on the last call at the end of the stack: @_UA_END_OF_STACK@.
	395	\end{sloppypar}
[26ca815]	396
	397	\section{Exception Context}
	398	% Should I have another independent section?
	399	% There are only two things in it, top_resume and current_exception. How it is
[7eb6eb5]	400	% stored changes depending on whether or not the thread-library is linked.
	401
	402	The exception context is global storage used to maintain data across different
	403	exception operations and to communicate among different components.
	404
	405	Each stack must have its own exception context. In a sequential \CFA program,
	406	there is only one stack with a single global exception-context. However, when
[9d7e5cb]	407	the library @libcfathread@ is linked, there are multiple stacks and each
[7eb6eb5]	408	needs its own exception context.
	409
[9d7e5cb]	410	The exception context should be retrieved by calling the function
[0c4df43]	411	@this_exception_context@. For sequential execution, this function is defined as
[7eb6eb5]	412	a weak symbol in the \CFA system-library, @libcfa@. When a \CFA program is
	413	concurrent, it links with @libcfathread@, where this function is defined with a
	414	strong symbol replacing the sequential version.
	415
[830299f]	416	The sequential @this_exception_context@ returns a hard-coded pointer to the
[9d7e5cb]	417	global exception context.
[830299f]	418	The concurrent version adds the exception context to the data stored at the
[9d7e5cb]	419	base of each stack. When @this_exception_context@ is called, it retrieves the
[830299f]	420	active stack and returns the address of the context saved there.
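
The weak-symbol arrangement can be sketched in C as follows. This is an
assumption-laden illustration: the structure name, the field types and the
@_sketch@ suffix are invented here, and only the two field names come from
the notes at the top of this section. The attribute is the GCC/Clang
spelling of a weak definition.
\begin{lstlisting}[language=C]
#include <stddef.h>

/* Hypothetical context layout; the field types are illustrative only. */
struct exception_context_sketch {
    void * top_resume;
    void * current_exception;
};

static struct exception_context_sketch global_context = { NULL, NULL };

/* Sequential default. A strong definition of the same function in another
   object file or library replaces this one at link time. */
__attribute__((weak))
struct exception_context_sketch * this_exception_context_sketch(void) {
    return &global_context;
}
\end{lstlisting}
Because callers always go through the function, they need not know which
definition was linked in; a concurrent library only has to supply a strong
definition that returns the context stored with the active stack.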
[26ca815]	421
	422	\section{Termination}
	423	% Memory management & extra information, the custom function used to implement
	424	% catches. Talk about GCC nested functions.
	425
\CFA termination exceptions use libunwind heavily because they match
\Cpp exceptions closely. The main complication for \CFA is that the
[7eb6eb5]	428	compiler generates C code, making it very difficult to generate the assembly to
	429	form the LSDA for try blocks or destructors.
[26ca815]	430
	431	\subsection{Memory Management}
[7eb6eb5]	432	The first step of a termination raise is to copy the exception into memory
	433	managed by the exception system. Currently, the system uses @malloc@, rather
[0c4df43]	434	than reserved memory or the stack top. The exception handling mechanism manages
[7eb6eb5]	435	memory for the exception as well as memory for libunwind and the system's own
	436	per-exception storage.
	437
[9d7e5cb]	438	\begin{figure}
\begin{verbatim}
Fixed Header  | _Unwind_Exception   <- pointer target
              |
              | Cforall storage
              |
Variable Body | the exception       <- fixed offset
              V ...
\end{verbatim}
[9d7e5cb]	447	\caption{Exception Layout}
	448	\label{f:ExceptionLayout}
	449	\end{figure}
	450	\todo*{Convert the exception layout to an actual diagram.}
[830299f]	451
Exceptions are stored in variable-sized blocks (see \vref{f:ExceptionLayout}).
The first component is a fixed-sized data structure that contains the
information for libunwind and the exception system. The second component is an
area of memory big enough to store the exception. Macros with pointer arithmetic
and type casts are used to move between the components or go from the embedded
@_Unwind_Exception@ to the entire node.
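
The conversions between the components might look like the following
sketch. Everything named here is hypothetical: @unwind_header_sketch@ stands
in for @_Unwind_Exception@ so that the example needs no unwinder headers,
the @next@ field stands in for the system's own storage, and the macro names
are invented for the illustration.
\begin{lstlisting}[language=C]
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for _Unwind_Exception. */
struct unwind_header_sketch {
    unsigned long long exception_class;
    void * cleanup;
};

/* Fixed header followed by a variable body holding the exception. */
struct exception_node_sketch {
    struct unwind_header_sketch unwind;   /* what the unwinder is given */
    struct exception_node_sketch * next;  /* hypothetical EHM storage */
    char body[];                          /* the exception itself */
};

/* Fixed header to variable body: a fixed offset. */
#define NODE_TO_EXCEPT(node) ((void *)(node)->body)
/* Variable body back to the node. */
#define EXCEPT_TO_NODE(except) \
    ((struct exception_node_sketch *)((char *)(except) \
        - offsetof(struct exception_node_sketch, body)))
/* Embedded unwinder header back to the node. */
#define UNWIND_TO_NODE(ptr) \
    ((struct exception_node_sketch *)((char *)(ptr) \
        - offsetof(struct exception_node_sketch, unwind)))

/* Allocate a node large enough for the exception and copy it in. */
struct exception_node_sketch * store_exception(const void * except, size_t size) {
    struct exception_node_sketch * node = malloc(sizeof *node + size);
    if (node == NULL) return NULL;
    memset(node, 0, sizeof *node);               /* clear the fixed header */
    memcpy(NODE_TO_EXCEPT(node), except, size);  /* fill the variable body */
    return node;
}
\end{lstlisting}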
[26ca815]	458
Multiple exceptions can exist at the same time because exceptions can be
	460	raised inside handlers, destructors and finally blocks.
	461	Figure~\vref{f:MultipleExceptions} shows a program that has multiple
	462	exceptions active at one time.
	463	Each time an exception is thrown and caught the stack unwinds and the finally
	464	clause runs. This will throw another exception (until @num_exceptions@ gets
	465	high enough) which must be allocated. The previous exceptions may not be
	466	freed because the handler/catch clause has not been run.
	467	So the EHM must keep them alive while it allocates exceptions for new throws.
	468
	469	\begin{figure}
	470	\centering
	471	% Andrew: Figure out what these do and give them better names.
	472	\newsavebox{\myboxA}
	473	\newsavebox{\myboxB}
	474	\begin{lrbox}{\myboxA}
	475	\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
unsigned num_exceptions = 0;
void throws() {
    try {
        try {
            ++num_exceptions;
            throw (Example){table};
        } finally {
            if (num_exceptions < 3) {
                throws();
            }
        }
    } catch (exception_t *) {
        --num_exceptions;
    }
}
int main() {
    throws();
}
	494	\end{lstlisting}
	495	\end{lrbox}
	496
	497	\begin{lrbox}{\myboxB}
	498	\begin{lstlisting}
	499	\end{lstlisting}
	500	\end{lrbox}
	501
	502	{\usebox\myboxA}
	503	\hspace{25pt}
	504	{\usebox\myboxB}
	505
	506	\caption{Multiple Exceptions}
	507	\label{f:MultipleExceptions}
	508	\end{figure}
	509	\todo*{Work on multiple exceptions code sample.}
	510
	511	All exceptions are stored in nodes which are then linked together in lists,
	512	one list per stack, with the
[7eb6eb5]	513	list head stored in the exception context. Within each linked list, the most
	514	recently thrown exception is at the head followed by older thrown
	515	exceptions. This format allows exceptions to be thrown, while a different
	516	exception is being handled. The exception at the head of the list is currently
	517	being handled, while other exceptions wait for the exceptions before them to be
	518	removed.
	519
	520	The virtual members in the exception's virtual table provide the size of the
	521	exception, the copy function, and the free function, so they are specific to an
	522	exception type. The size and copy function are used immediately to copy an
[9d7e5cb]	523	exception into managed memory. After the exception is handled, the free
	524	function is used to clean up the exception and then the entire node is
	525	passed to free so the memory can be given back to the heap.
[7eb6eb5]	526
	527	\subsection{Try Statements and Catch Clauses}
	528	The try statement with termination handlers is complex because it must
[0c4df43]	529	compensate for the lack of assembly-code generated from \CFA. Libunwind
[7eb6eb5]	530	requires an LSDA and personality function for control to unwind across a
	531	function. The LSDA in particular is hard to mimic in generated C code.
	532
	533	The workaround is a function called @__cfaehm_try_terminate@ in the standard
	534	library. The contents of a try block and the termination handlers are converted
	535	into functions. These are then passed to the try terminate function and it
[830299f]	536	calls them.
	537	Because this function is known and fixed (and not an arbitrary function that
[9d7e5cb]	538	happens to contain a try statement), the LSDA can be generated ahead
[830299f]	539	of time.
	540
Both the LSDA and the personality function are set ahead of time using
embedded assembly. This assembly code is handcrafted using C @asm@ statements
and contains
enough information for the single try statement the function represents.
[26ca815]	545
	546	The three functions passed to try terminate are:
[7eb6eb5]	547	\begin{description}
\item[try function:] This function is the try block; all the code inside the
try block is placed inside the try function. It takes no parameters and has no
return value. This function is called during regular execution to run the try
block.
	552
	553	\item[match function:] This function is called during the search phase and
[830299f]	554	decides if a catch clause matches the termination exception. It is constructed
[7eb6eb5]	555	from the conditional part of each handler and runs each check, top to bottom,
	556	in turn, first checking to see if the exception type matches and then if the
	557	condition is true. It takes a pointer to the exception and returns 0 if the
	558	exception is not handled here. Otherwise the return value is the id of the
	559	handler that matches the exception.
	560
	561	\item[handler function:] This function handles the exception. It takes a
	562	pointer to the exception and the handler's id and returns nothing. It is called
[830299f]	563	after the cleanup phase. It is constructed by stitching together the bodies of
[7eb6eb5]	564	each handler and dispatches to the selected handler.
	565	\end{description}
	566	All three functions are created with GCC nested functions. GCC nested functions
	567	can be used to create closures, functions that can refer to the state of other
	568	functions on the stack. This approach allows the functions to refer to all the
[830299f]	569	variables in scope for the function containing the @try@ statement. These
[7eb6eb5]	570	nested functions and all other functions besides @__cfaehm_try_terminate@ in
	571	\CFA use the GCC personality function and the @-fexceptions@ flag to generate
[9d7e5cb]	572	the LSDA.
	573	Using this pattern, \CFA implements destructors with the cleanup attribute.
	574	\todo{Add an example of the conversion from try statement to functions.}
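
The division of a try statement into three functions can be sketched in
portable C. This is a model of the calling pattern only, under several
assumptions: @setjmp@/@longjmp@ stand in for libunwind and the hand-written
LSDA, file-scope functions stand in for the GCC nested functions, and every
name (@try_terminate_sketch@, @raise_sketch@, the @example_@ functions) is
hypothetical rather than taken from the \CFA library.
\begin{lstlisting}[language=C]
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

struct exception_sketch { int type_code; };  /* hypothetical exception */

struct try_frame {
    struct try_frame * outer;                 /* enclosing try, if any */
    jmp_buf landing;                          /* where unwinding stops */
    int (*match)(struct exception_sketch *);  /* 0, or a handler id */
};

static struct try_frame * current_try = NULL;
static struct exception_sketch * pending_exception = NULL;
static int pending_handler = 0;

/* Stand-in for a termination raise: search first, then transfer control. */
void raise_sketch(struct exception_sketch * except) {
    for (struct try_frame * f = current_try; f != NULL; f = f->outer) {
        int id = f->match(except);  /* search phase: nothing unwound yet */
        if (id != 0) {
            pending_exception = except;
            pending_handler = id;
            current_try = f->outer;
            longjmp(f->landing, 1); /* crude replacement for the cleanup phase */
        }
    }
    abort();  /* no handler found anywhere */
}

void try_terminate_sketch(void (*try_block)(void),
        int (*match)(struct exception_sketch *),
        void (*handler)(struct exception_sketch *, int)) {
    struct try_frame frame;
    frame.outer = current_try;
    frame.match = match;
    if (setjmp(frame.landing) == 0) {
        current_try = &frame;
        try_block();
        current_try = frame.outer;  /* normal exit from the try block */
    } else {
        handler(pending_exception, pending_handler);
    }
}

/* A try statement with two catch clauses, split into three functions. */
static int result = 0;
static void example_try(void) {
    static struct exception_sketch except = { 2 };
    raise_sketch(&except);
    result = -1;  /* skipped: the raise does not return */
}
static int example_match(struct exception_sketch * except) {
    if (except->type_code == 1) return 1;  /* first catch clause */
    if (except->type_code == 2) return 2;  /* second catch clause */
    return 0;                              /* not handled here */
}
static void example_handler(struct exception_sketch * except, int id) {
    (void)except;
    switch (id) {
    case 1: result = 10; break;  /* body of the first catch clause */
    case 2: result = 20; break;  /* body of the second catch clause */
    }
}

int run_example(void) {
    try_terminate_sketch(example_try, example_match, example_handler);
    return result;
}
\end{lstlisting}
In the sketch the match function runs before any frame is discarded and the
handler runs only after control is back in the driver, mirroring the order
of the search and cleanup phases.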
[26ca815]	575
	576	\section{Resumption}
	577	% The stack-local data, the linked list of nodes.
	578
Resumption is simpler to implement than termination
because there is no stack unwinding.
Instead of storing the data in a special area using assembly,
there is just a linked list of possible handlers for each stack,
with each node on the list representing a try statement on the stack.
	584
	585	The head of the list is stored in the exception context.
	586	The nodes are stored in order, with the more recent try statements closer
	587	to the head of the list.
Instead of traversing the stack, resumption handling traverses the list.
At each node, the EHM checks to see if the try statement the node represents
can handle the exception. If it can, then the exception is handled and
the operation finishes; otherwise the search continues to the next node.
	592	If the search reaches the end of the list without finding a try statement
	593	that can handle the exception the default handler is executed and the
	594	operation finishes.
	595
	596	In each node is a handler function which does most of the work there.
	597	The handler function is passed the raised the exception and returns true
	598	if the exception is handled and false if it cannot be handled here.
	599
For each @catchResume@ clause, the handler function checks whether the raised
exception is a descendant type of the declared exception type and, if the
clause has a conditional expression, evaluates it. If both checks pass, the
handling code for the clause is run and the function returns true; otherwise,
the function moves on to the next clause. If no clauses remain, the function
returns false, as no handler could be found.
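
The search and the handler functions can be sketched in plain C as follows.
All names here (@resume_node@, @exception_context@, @throw_resume@ and the
example handlers) are hypothetical, and the descendant-type check is reduced to
an integer comparison to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { IO_ERROR = 1, MATH_ERROR = 2 };

/* Hypothetical exception: handled_by records which handler ran. */
struct exception { int type_id; int code; int handled_by; };

/* One node per try statement with catchResume clauses on the stack. */
struct resume_node {
    struct resume_node *next;                 /* next older try statement */
    bool (*handler)(struct exception *);      /* true if the exception was handled */
};

struct exception_context { struct resume_node *top_resume; };

/* Walk the list from the most recent try statement outwards. */
bool throw_resume(struct exception_context *ctx, struct exception *ex,
                  void (*default_handler)(struct exception *)) {
    assert(ctx != NULL && ex != NULL);
    for (struct resume_node *n = ctx->top_resume; n != NULL; n = n->next) {
        if (n->handler(ex)) return true;
    }
    default_handler(ex);                      /* no try statement could handle it */
    return false;
}

/* Handler for a try statement with one clause: catchResume (math error). */
static bool inner_handler(struct exception *ex) {
    if (ex->type_id == MATH_ERROR) { ex->handled_by = 1; return true; }
    return false;                             /* last clause, nothing matched */
}

/* Handler with a conditional clause: catchResume (io error ; code > 0). */
static bool outer_handler(struct exception *ex) {
    if (ex->type_id == IO_ERROR && ex->code > 0) { ex->handled_by = 2; return true; }
    return false;
}

static void example_default(struct exception *ex) { ex->handled_by = -1; }

/* Builds a two-node list (inner before outer) and raises one exception. */
int demo_resume_search(int type_id, int code) {
    struct resume_node outer = { NULL, outer_handler };
    struct resume_node inner = { &outer, inner_handler };
    struct exception_context ctx = { &inner };
    struct exception ex = { type_id, code, 0 };
    throw_resume(&ctx, &ex, example_default);
    return ex.handled_by;
}
```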
	607
	608	\todo{Diagram showing a try statement being converted into resumption handlers.}
[26ca815]	609
[12b4ab4]	610	% Recursive Resumption Stuff:
[df24d37]	611	Search skipping (see \vpageref{s:ResumptionMarking}), which ignores parts of
	612	the stack
[7eb6eb5]	613	already examined, is accomplished by updating the front of the list as the
[9d7e5cb]	614	search continues. Before the handler at a node is called, the head of the list
[7eb6eb5]	615	is updated to the next node of the current node. After the search is complete,
	616	successful or not, the head of the list is reset.
[12b4ab4]	617
[7eb6eb5]	618	This mechanism means the current handler and every handler that has already
been checked are not on the list while a handler is run. If a resumption is
thrown during the handling of another resumption, the active handlers and all
the other handlers checked up to this point are not checked again.
[12b4ab4]	622
This structure also supports new handlers added while the resumption is being
[12b4ab4]	624	handled. These are added to the front of the list, pointing back along the
[7eb6eb5]	625	stack -- the first one points over all the checked handlers -- and the ordering
	626	is maintained.
[9d7e5cb]	627	\todo{Add a diagram for resumption marking.}
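
A minimal C sketch of the head update, under the assumption that the head of
the list is a single variable (here the hypothetical global @top_resume@) and
with the default handler left out. In the example, the inner handler raises the
same exception again while it is running; because the head already points past
it, the nested search starts at the outer handler and does not recurse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct resume_node {
    struct resume_node *next;
    bool (*handler)(int except_id);
};

/* Head of the handler list; normally stored in the exception context. */
static struct resume_node *top_resume = NULL;

bool throw_resume(int except_id) {
    struct resume_node *original_head = top_resume;
    bool handled = false;
    for (struct resume_node *n = original_head; n != NULL && !handled; n = n->next) {
        assert(n->handler != NULL);
        top_resume = n->next;    /* hide this node and every node already checked */
        handled = n->handler(except_id);
    }
    top_resume = original_head;  /* reset, whether or not the search succeeded */
    return handled;
}

static int calls_a, calls_b;

/* Outer handler: handles exception 1 directly. */
static bool handler_b(int except_id) {
    if (except_id != 1) return false;
    calls_b++;
    return true;
}

/* Inner handler: raises exception 1 again while handling it. */
static bool handler_a(int except_id) {
    if (except_id != 1) return false;
    calls_a++;
    return throw_resume(1);      /* the nested search sees only handler_b */
}

/* Returns 1 if each handler ran exactly once and the head was restored. */
int demo_search_skipping(void) {
    struct resume_node b = { NULL, handler_b };
    struct resume_node a = { &b, handler_a };
    calls_a = calls_b = 0;
    top_resume = &a;
    bool handled = throw_resume(1);
    bool head_restored = (top_resume == &a);
    top_resume = NULL;
    return handled && head_restored && calls_a == 1 && calls_b == 1;
}
```

A try statement entered inside a handler would push its node with @next@ set
to the current head, which is how the new node ends up pointing over the
handlers that have already been checked.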
[7eb6eb5]	628
	629	\label{p:zero-cost}
	630	Note, the resumption implementation has a cost for entering/exiting a @try@
	631	statement with @catchResume@ clauses, whereas a @try@ statement with @catch@
	632	clauses has zero-cost entry/exit. While resumption does not need the stack
	633	unwinding and cleanup provided by libunwind, it could use the search phase to
provide zero-cost entry/exit using the LSDA. Unfortunately, there is no way
	635	to return from a libunwind search without installing a handler or raising an
[830299f]	636	error. Although workarounds might be possible, they are beyond the scope of
[7eb6eb5]	637	this thesis. The current resumption implementation has simplicity in its
	638	favour.
[26ca815]	639	% Seriously, just compare the size of the two chapters and then consider
	640	% that unwind is required knowledge for that chapter.
	641
	642	\section{Finally}
	643	% Uses destructors and GCC nested functions.
A finally clause is placed into a GCC nested function with a unique name,
	645	and no arguments or return values.
	646	This nested function is then set as the cleanup
[7eb6eb5]	647	function of an empty object that is declared at the beginning of a block placed
[0c4df43]	648	around the context of the associated @try@ statement.
[26ca815]	649
[9d7e5cb]	650	The rest is handled by GCC. The try block and all handlers are inside this
[7eb6eb5]	651	block. At completion, control exits the block and the empty object is cleaned
	652	up, which runs the function that contains the finally code.
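
The following GNU C sketch shows the shape of this transformation; the names
@cleanup_hook@, @finally_block@ and @run_with_finally@ are hypothetical. One
detail differs from the description above: as documented by GCC, a function
named in the @cleanup@ attribute receives a pointer to the guarded variable, so
the nested function here takes one unused parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Empty object whose only purpose is to carry the cleanup function
   (an empty struct is a GNU C extension). */
struct cleanup_hook {};

/* Records 1 and 2 for the try body and 9 for the finally code. */
void run_with_finally(int exit_early, int *trace, int *count) {
    assert(trace != NULL && count != NULL);
    *count = 0;
    {
        /* Nested function holding the finally code; it can refer to the
           variables of the enclosing function. */
        void finally_block(struct cleanup_hook *hook) {
            (void)hook;
            trace[(*count)++] = 9;
        }
        /* Declared first in the block so it is cleaned up on every exit. */
        struct cleanup_hook guard __attribute__((cleanup(finally_block)));

        /* The try block (and, in the real translation, its handlers). */
        trace[(*count)++] = 1;
        if (exit_early) return;      /* leaving the block still runs the cleanup */
        trace[(*count)++] = 2;
    }
}
```

The early @return@ is included to show that the finally code runs on any exit
from the block, not only when control reaches its end.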
[26ca815]	653
	654	\section{Cancellation}
	655	% Stack selections, the three internal unwind functions.
	656
Cancellation also uses libunwind to do its stack traversal and unwinding;
however, it uses a different primary function: @_Unwind_ForcedUnwind@. Details
of its interface can be found in Section~\vref{s:ForcedUnwind}.
[26ca815]	660
[7eb6eb5]	661	The first step of cancellation is to find the cancelled stack and its type:
[0c4df43]	662	coroutine or thread. Fortunately, the thread library stores the main thread
	663	pointer and the current thread pointer, and every thread stores a pointer to
	664	its main coroutine and the coroutine it is currently executing.
[9d7e5cb]	665	\todo*{Consider adding a description of how threads are coroutines.}
[0c4df43]	666
If the current thread's main and current coroutines are the same, then the
current stack is a thread stack. In that case, comparing the current thread to
the main thread shows whether it is the main stack. If the two coroutines
differ, the current stack must be a coroutine stack.
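
The check can be sketched in C as follows. The @thread@ and @coroutine@
structures are hypothetical and hold only the pointers mentioned above; the
real thread library stores considerably more.

```c
#include <assert.h>
#include <stddef.h>

enum stack_kind { MAIN_STACK, THREAD_STACK, COROUTINE_STACK };

struct coroutine { int id; };

struct thread {
    struct coroutine *main_cor;      /* the thread's own (main) coroutine */
    struct coroutine *current_cor;   /* the coroutine it is currently executing */
};

/* Decide which kind of stack is being cancelled. */
enum stack_kind classify_stack(const struct thread *main_thread,
                               const struct thread *current) {
    assert(main_thread != NULL && current != NULL);
    if (current->current_cor != current->main_cor) {
        return COROUTINE_STACK;      /* running on some other coroutine's stack */
    }
    return current == main_thread ? MAIN_STACK : THREAD_STACK;
}
```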
[0c4df43]	671
[7eb6eb5]	672	However, if the threading library is not linked, the sequential execution is on
	673	the main stack. Hence, the entire check is skipped because the weak-symbol
	674	function is loaded. Therefore, a main thread cancellation is unconditionally
	675	performed.
	676
	677	Regardless of how the stack is chosen, the stop function and parameter are
	678	passed to the forced-unwind function. The general pattern of all three stop
functions is the same: they continue unwinding until the end of stack and
then perform their transfer.
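
A sketch of that common pattern, written against the stop-function signature
declared in @unwind.h@. The @stop_param@ structure, @cancel_stop@ and
@record_transfer@ are hypothetical names. In the runtime the transfer (an abort
or a context switch) does not return; here it is an ordinary callback so the
control flow can be followed.

```c
#include <assert.h>
#include <stddef.h>
#include <unwind.h>

/* Hypothetical parameter block: the transfer to perform at the end of stack. */
struct stop_param {
    void (*transfer)(void *data);
    void *data;
};

/* Stop function: keep unwinding until the end of the stack, then transfer. */
_Unwind_Reason_Code cancel_stop(
        int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context, void *stop_parameter) {
    (void)version; (void)exception_class; (void)exception; (void)context;
    if (actions & _UA_END_OF_STACK) {
        struct stop_param *param = stop_parameter;
        assert(param != NULL && param->transfer != NULL);
        param->transfer(param->data);   /* does not return in the real runtime */
        return _URC_END_OF_STACK;
    }
    return _URC_NO_REASON;              /* continue unwinding the next frame */
}

/* Example transfer for the sketch: record that the transfer happened. */
void record_transfer(void *data) {
    *(int *)data = 1;
}
```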
[0c4df43]	681
[7eb6eb5]	682	For main stack cancellation, the transfer is just a program abort.
	683
[0c4df43]	684	For coroutine cancellation, the exception is stored on the coroutine's stack,
[7eb6eb5]	685	and the coroutine context switches to its last resumer. The rest is handled on
the backside of the resume, which checks if the resumed coroutine is
	687	cancelled. If cancelled, the exception is retrieved from the resumed coroutine,
	688	and a @CoroutineCancelled@ exception is constructed and loaded with the
	689	cancelled exception. It is then resumed as a regular exception with the default
	690	handler coming from the context of the resumption call.
	691
For thread cancellation, the exception is stored on the thread's main stack and
the thread then context switches to the scheduler. The rest is handled by the
thread joiner. When the join is complete, the joiner checks if the joined thread
is cancelled. If cancelled, the exception is retrieved from the joined thread, and
	696	a @ThreadCancelled@ exception is constructed and loaded with the cancelled
	697	exception. The default handler is passed in as a function pointer. If it is
	698	null (as it is for the auto-generated joins on destructor call), the default is
	699	used, which is a program abort.
	700	%; which gives the required handling on implicate join.
