source: doc/theses/andrew_beach_MMath/implement.tex @ 52f6250

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since 52f6250 was 1c1c180, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Fixed line length and trailing whitespace on modified tex files.

  • Property mode set to 100644
File size: 28.0 KB
Line 
1\chapter{Implementation}
2% Goes over how all the features are implemented.
3
4The implementation work for this thesis covers two components: the virtual
5system and exceptions. Each component is discussed in detail.
6
7\section{Virtual System}
8\label{s:VirtualSystem}
9% Virtual table rules. Virtual tables, the pointer to them and the cast.
10While the \CFA virtual system currently has only one public feature, virtual
11cast \see{\VPageref{p:VirtualCast}}, substantial structure is required to
12support it, and provide features for exception handling and the standard
13library.
14
15\subsection{Virtual Table}
16The virtual system is accessed through a private constant field inserted at the
17beginning of every virtual type, called the virtual-table pointer. This field
18points at a type's virtual table and is assigned during the object's
19construction.  The address of a virtual table acts as the unique identifier for
20the virtual type, and the first field of a virtual table is a pointer to the
21parent virtual-table or @0p@.  The remaining fields are duplicated from the
22parent tables in this type's inheritance chain, followed by any fields this type
23introduces. Parent fields are duplicated so they can be changed (\CC
24\lstinline[language=c++]|override|), so that references to the dispatched type
25are replaced with the current virtual type.
26\PAB{Can you create a simple diagram of the layout?}
27% These are always taken by pointer or reference.
28
29% For each virtual type, a virtual table is constructed. This is both a new type
30% and an instance of that type. Other instances of the type could be created
31% but the system doesn't use them. So this section will go over the creation of
32% the type and the instance.
33
34A virtual table is created when the virtual type is created. The name of the
35type is created by mangling the name of the base type. The name of the instance
36is also generated by name mangling.  The fields are initialized automatically.
37The parent field is initialized by getting the type of the parent field and
38using that to calculate the mangled name of the parent's virtual table type.
39There are two special fields that are included like normal fields but have
40special initialization rules: the @size@ field is the type's size and is
41initialized with a @sizeof@ expression, the @align@ field is the type's
42alignment and uses an @alignof@ expression. The remaining fields are resolved
43to a name matching the field's name and type using the normal visibility and
44overload resolution rules of the type system.
45
46These operations are split up into several groups depending on where they take
47place which varies for monomorphic and polymorphic types. The first devision is
48between the declarations and the definitions. Declarations, such as a function
49signature or a aggregate's name, must always be visible but may be repeated in
50the form of forward declarations in headers. Definitions, such as function
51bodies and a aggregate's layout, can be separately compiled but must occur
52exactly once in a source file.
53
54\begin{sloppypar}
55The declarations include the virtual type definition and forward declarations
56of the virtual table instance, constructor, message function and
57@get_exception_vtable@. The definition includes the storage and initialization
58of the virtual table instance and the bodies of the three functions.
59\end{sloppypar}
60
61Monomorphic instances put all of these two groups in one place each.
62Polymorphic instances also split out the core declarations and definitions from
63the per-instance information. The virtual table type and most of the functions
64are polymorphic so they are all part of the core. The virtual table instance
65and the @get_exception_vtable@ function.
66
67\begin{sloppypar}
68Coroutines and threads need instances of @CoroutineCancelled@ and
69@ThreadCancelled@ respectively to use all of their functionality.  When a new
70data type is declared with @coroutine@ or @thread@ the forward declaration for
71the instance is created as well. The definition of the virtual table is created
72at the definition of the main function.
73\end{sloppypar}
74
75\subsection{Virtual Cast}
76Virtual casts are implemented as a function call that does the subtype check
77and a C coercion-cast to do the type conversion.
78% The C-cast is just to make sure the generated code is correct so the rest of
79% the section is about that function.
80The function is
81\begin{cfa}
82void * __cfa__virtual_cast( struct __cfa__parent_vtable const * parent,
83        struct __cfa__parent_vtable const * const * child );
84}
85\end{cfa}
86and it is implemented in the standard library. It takes a pointer to the target
87type's virtual table and the object pointer being cast. The function performs a
88linear search starting at the object's virtual-table and walking through the
89the parent pointers, checking to if it or any of its ancestors are the same as
90the target-type virtual table-pointer.
91
92For the generated code, a forward declaration of the virtual works as follows.
93There is a forward declaration of @__cfa__virtual_cast@ in every \CFA file so
94it can just be used. The object argument is the expression being cast so that
95is just placed in the argument list.
96
97To build the target type parameter, the compiler creates a mapping from
98concrete type-name -- so for polymorphic types the parameters are filled in --
99to virtual table address. Every virtual table declaration is added to the this
100table; repeats are ignored unless they have conflicting definitions.  Note,
101these declarations do not have to be in scope, but they should usually be
102introduced as part of the type definition.
103
104\PAB{I do not understood all of \VRef{s:VirtualSystem}. I think you need to
105write more to make it clear.}
106
107
108\section{Exceptions}
109% Anything about exception construction.
110
111\section{Unwinding}
112% Adapt the unwind chapter, just describe the sections of libunwind used.
113% Mention that termination and cancellation use it. Maybe go into why
114% resumption doesn't as well.
115
116% Many modern languages work with an interal stack that function push and pop
117% their local data to. Stack unwinding removes large sections of the stack,
118% often across functions.
119
120Stack unwinding is the process of removing stack frames (activations) from the
121stack. On function entry and return, unwinding is handled directly by the code
122embedded in the function. Usually, the stack-frame size is known statically
123based on parameter and local variable declarations.  For dynamically-sized
124local variables, a runtime computation is necessary to know the frame
125size. Finally, a function's frame-size may change during execution as local
126variables (static or dynamic sized) go in and out of scope.
127Allocating/deallocating stack space is usually an $O(1)$ operation achieved by
128bumping the hardware stack-pointer up or down as needed.
129
130Unwinding across multiple stack frames is more complex because individual stack
131management code associated with each frame is bypassed. That is, the location
132of a function's frame-management code is largely unknown and dispersed
133throughout the function, hence the current frame size managed by that code is
134also unknown. Hence, code unwinding across frames does not have direct
135knowledge about what is on the stack, and hence, how much of the stack needs to
136be removed.
137
138% At a very basic level this can be done with @setjmp@ \& @longjmp@ which simply
139% move the top of the stack, discarding everything on the stack above a certain
140% point. However this ignores all the cleanup code that should be run when
141% certain sections of the stack are removed (for \CFA these are from destructors
142% and finally clauses) and also requires that the point to which the stack is
143% being unwound is known ahead of time. libunwind is used to address both of
144% these problems.
145
146The traditional unwinding mechanism for C is implemented by saving a snap-shot
147of a function's state with @setjmp@ and restoring that snap-shot with
148@longjmp@. This approach bypasses the need to know stack details by simply
149reseting to a snap-shot of an arbitrary but existing function frame on the
150stack. It is up to the programmer to ensure the snap-shot is valid when it is
151reset, making this unwinding approach fragile with potential errors that are
152difficult to debug because the stack becomes corrupted.
153
154However, many languages define cleanup actions that must be taken when objects
155are deallocated from the stack or blocks end, such as running a variable's
156destructor or a @try@ statement's @finally@ clause. Handling these mechanisms
157requires walking the stack and checking each stack frame for these potential
158actions.
159
160For exceptions, it must be possible to walk the stack frames in search of @try@
161statements to match and execute a handler. For termination exceptions, it must
162also be possible to unwind all stack frames from the throw to the matching
163catch, and each of these frames must be checked for cleanup actions. Stack
164walking is where most of the complexity and expense of exception handling
165appears.
166
167One of the most popular tools for stack management is libunwind, a low-level
168library that provides tools for stack walking, handler execution, and
169unwinding. What follows is an overview of all the relevant features of
170libunwind needed for this work, and how \CFA uses them to implement exception
171handling.
172
173\subsection{libunwind Usage}
174Libunwind, accessed through @unwind.h@ on most platforms, is a C library that
175provides \CC-style stack-unwinding. Its operation is divided into two phases:
176search and cleanup. The dynamic target search -- phase 1 -- is used to scan the
177stack and decide where unwinding should stop (but no unwinding occurs). The
178cleanup -- phase 2 -- does the unwinding and also runs any cleanup code.
179
180To use libunwind, each function must have a personality function and a Language
181Specific Data Area (LSDA).  The LSDA has the unique information for each
182function to tell the personality function where a function is executing, its
183current stack frame, and what handlers should be checked.  Theoretically, the
184LSDA can contain any information but conventionally it is a table with entries
185representing regions of the function and what has to be done there during
186unwinding. These regions are bracketed by the instruction pointer. If the
187instruction pointer is within a region's start/end, then execution is currently
188executing in that region. Regions are used to mark out the scopes of objects
189with destructors and try blocks.
190
191% Libunwind actually does very little, it simply moves down the stack from
192% function to function. Most of the actions are implemented by the personality
193% function which libunwind calls on every function. Since this is shared across
194% many functions or even every function in a language it will need a bit more
195% information.
196
197The GCC compilation flag @-fexceptions@ causes the generation of an LSDA and
198attaches its personality function. \PAB{to what is it attached?}  However, this
199flag only handles the cleanup attribute
200\begin{cfa}
201void clean_up( int * var ) { ... }
202int avar __attribute__(( __cleanup(clean_up) ));
203\end{cfa}
204which is used on a variable and specifies a function, \eg @clean_up@, run when
205the variable goes out of scope. The function is passed a pointer to the object
206so it can be used to mimic destructors. However, this feature cannot be used to
207mimic @try@ statements.
208
209\subsection{Personality Functions}
210Personality functions have a complex interface specified by libunwind.  This
211section covers some of the important parts of the interface.
212
213A personality function performs four tasks, although not all have to be
214present.
215\begin{lstlisting}[language=C,{moredelim=**[is][\color{red}]{@}{@}}]
216typedef _Unwind_Reason_Code (*@_Unwind_Personality_Fn@) (
217        _Unwind_Action @action@,
218        _Unwind_Exception_Class @exception_class@,
219        _Unwind_Exception * @exception@,
220        struct _Unwind_Context * @context@
221);
222\end{lstlisting}
223The @action@ argument is a bitmask of possible actions:
224\begin{enumerate}
225\item
226@_UA_SEARCH_PHASE@ specifies a search phase and tells the personality function
227to check for handlers.  If there is a handler in a stack frame, as defined by
228the language, the personality function returns @_URC_HANDLER_FOUND@; otherwise
229it return @_URC_CONTINUE_UNWIND@.
230
231\item
232@_UA_CLEANUP_PHASE@ specifies a cleanup phase, where the entire frame is
233unwound and all cleanup code is run. The personality function does whatever
234cleanup the language defines (such as running destructors/finalizers) and then
235generally returns @_URC_CONTINUE_UNWIND@.
236
237\item
238\begin{sloppypar}
239@_UA_HANDLER_FRAME@ specifies a cleanup phase on a function frame that found a
240handler. The personality function must prepare to return to normal code
241execution and return @_URC_INSTALL_CONTEXT@.
242\end{sloppypar}
243
244\item
245@_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs
246the cleanup phase and uses a different means to decide when to stop
247\see{\VRef{s:ForcedUnwind}}.
248\end{enumerate}
249
250The @exception_class@ argument is a copy of the
251\lstinline[language=C]|exception|'s @exception_class@ field.
252
253The \lstinline[language=C]|exception| argument is a pointer to the user
254provided storage object. It has two public fields, the exception class, which
255is actually just a number, identifying the exception handling mechanism that
256created it, and the cleanup function. The cleanup function is called if
257required by the exception.
258
259The @context@ argument is a pointer to an opaque type passed to helper
260functions called inside the personality function.
261
262The return value, @_Unwind_Reason_Code@, is an enumeration of possible messages
263that can be passed several places in libunwind. It includes a number of
264messages for special cases (some of which should never be used by the
265personality function) and error codes but unless otherwise noted the
266personality function should always return @_URC_CONTINUE_UNWIND@.
267
268\subsection{Raise Exception}
269Raising an exception is the central function of libunwind and it performs a
270two-staged unwinding.
271\begin{cfa}
272_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *);
273\end{cfa}
274First, the function begins the search phase, calling the personality function
275of the most recent stack frame. It continues to call personality functions
276traversing the stack from newest to oldest until a function finds a handler or
277the end of the stack is reached. In the latter case, raise exception returns
278@_URC_END_OF_STACK@.
279
280Second, when a handler is matched, raise exception continues onto the cleanup
281phase.
282Once again, it calls the personality functions of each stack frame from newest
283to oldest. This pass stops at the stack frame containing the matching handler.
284If that personality function has not install a handler, it is an error.
285
286If an error is encountered, raise exception returns either
287@_URC_FATAL_PHASE1_ERROR@ or @_URC_FATAL_PHASE2_ERROR@ depending on when the
288error occurred.
289
290\subsection{Forced Unwind}
291\label{s:ForcedUnwind}
292Forced Unwind is the other central function in libunwind.
293\begin{cfa}
294_Unwind_Reason_Code _Unwind_ForcedUnwind( _Unwind_Exception *,
295        _Unwind_Stop_Fn, void *);
296\end{cfa}
297It also unwinds the stack but it does not use the search phase. Instead another
298function, the stop function, is used to stop searching.  The exception is the
299same as the one passed to raise exception. The extra arguments are the stop
300function and the stop parameter. The stop function has a similar interface as a
301personality function, except it is also passed the stop parameter.
302\begin{lstlisting}[language=C,{moredelim=**[is][\color{red}]{@}{@}}]
303typedef _Unwind_Reason_Code (*@_Unwind_Stop_Fn@)(
304        _Unwind_Action @action@,
305        _Unwind_Exception_Class @exception_class@,
306        _Unwind_Exception * @exception@,
307        struct _Unwind_Context * @context@,
308        void * @stop_parameter@);
309\end{lstlisting}
310
311The stop function is called at every stack frame before the personality
312function is called and then once more after all frames of the stack are
313unwound.
314
315Each time it is called, the stop function should return @_URC_NO_REASON@ or
316transfer control directly to other code outside of libunwind. The framework
317does not provide any assistance here.
318
319\begin{sloppypar}
320Its arguments are the same as the paired personality function.  The actions
321@_UA_CLEANUP_PHASE@ and @_UA_FORCE_UNWIND@ are always set when it is
322called. Beyond the libunwind standard, both GCC and Clang add an extra action
323on the last call at the end of the stack: @_UA_END_OF_STACK@.
324\end{sloppypar}
325
326\section{Exception Context}
327% Should I have another independent section?
328% There are only two things in it, top_resume and current_exception. How it is
329% stored changes depending on whether or not the thread-library is linked.
330
331The exception context is global storage used to maintain data across different
332exception operations and to communicate among different components.
333
334Each stack must have its own exception context. In a sequential \CFA program,
335there is only one stack with a single global exception-context. However, when
336the library @libcfathread@ is linked, there are multiple stacks where each
337needs its own exception context.
338
339General access to the exception context is provided by function
340@this_exception_context@. For sequential execution, this function is defined as
341a weak symbol in the \CFA system-library, @libcfa@. When a \CFA program is
342concurrent, it links with @libcfathread@, where this function is defined with a
343strong symbol replacing the sequential version.
344
345% The version of the function defined in @libcfa@ is very simple. It returns a
346% pointer to a global static variable. With only one stack this global instance
347% is associated with the only stack.
348
349For coroutines, @this_exception_context@ accesses the exception context stored
350at the base of the stack. For threads, @this_exception_context@ uses the
351concurrency library to access the current stack of the thread or coroutine
352being executed by the thread, and then accesses the exception context stored at
353the base of this stack.
354
355\section{Termination}
356% Memory management & extra information, the custom function used to implement
357% catches. Talk about GCC nested functions.
358
359Termination exceptions use libunwind heavily because it matches the intended
360use from \CC exceptions closely. The main complication for \CFA is that the
361compiler generates C code, making it very difficult to generate the assembly to
362form the LSDA for try blocks or destructors.
363
364\subsection{Memory Management}
365The first step of a termination raise is to copy the exception into memory
366managed by the exception system. Currently, the system uses @malloc@, rather
367than reserved memory or the stack top. The exception handling mechanism manages
368memory for the exception as well as memory for libunwind and the system's own
369per-exception storage.
370
371Exceptions are stored in variable-sized blocks. \PAB{Show a memory layout
372figure.} The first component is a fixed sized data structure that contains the
373information for libunwind and the exception system. The second component is an
374area of memory big enough to store the exception. Macros with pointer arthritic
375and type cast are used to move between the components or go from the embedded
376@_Unwind_Exception@ to the entire node.
377
378All of these nodes are linked together in a list, one list per stack, with the
379list head stored in the exception context. Within each linked list, the most
380recently thrown exception is at the head followed by older thrown
381exceptions. This format allows exceptions to be thrown, while a different
382exception is being handled. The exception at the head of the list is currently
383being handled, while other exceptions wait for the exceptions before them to be
384removed.
385
386The virtual members in the exception's virtual table provide the size of the
387exception, the copy function, and the free function, so they are specific to an
388exception type. The size and copy function are used immediately to copy an
389exception into managed memory. After the exception is handled the free function
390is used to clean up the exception and then the entire node is passed to free.
391
392\subsection{Try Statements and Catch Clauses}
393The try statement with termination handlers is complex because it must
394compensate for the lack of assembly-code generated from \CFA. Libunwind
395requires an LSDA and personality function for control to unwind across a
396function. The LSDA in particular is hard to mimic in generated C code.
397
398The workaround is a function called @__cfaehm_try_terminate@ in the standard
399library. The contents of a try block and the termination handlers are converted
400into functions. These are then passed to the try terminate function and it
401calls them. This approach puts a try statement in its own functions so that no
402function has to deal with both termination handlers and destructors. \PAB{I do
403not understand the previous sentence.}
404
405This function has some custom embedded assembly that defines \emph{its}
406personality function and LSDA. The assembly is created with handcrafted C @asm@
407statements, which is why there is only one version of it. The personality
408function is structured so that it can be expanded, but currently it only
409handles this one function.  Notably, it does not handle any destructors so the
410function is constructed so that it does need to run it. \PAB{I do not
411understand the previous sentence.}
412
413The three functions passed to try terminate are:
414\begin{description}
415\item[try function:] This function is the try block, all the code inside the
416try block is placed inside the try function. It takes no parameters and has no
417return value. This function is called during regular execution to run the try
418block.
419
420\item[match function:] This function is called during the search phase and
421decides if a catch clause matches the termination exception.  It is constructed
422from the conditional part of each handler and runs each check, top to bottom,
423in turn, first checking to see if the exception type matches and then if the
424condition is true. It takes a pointer to the exception and returns 0 if the
425exception is not handled here. Otherwise the return value is the id of the
426handler that matches the exception.
427
428\item[handler function:] This function handles the exception. It takes a
429pointer to the exception and the handler's id and returns nothing. It is called
430after the cleanup phase.  It is constructed by stitching together the bodies of
431each handler and dispatches to the selected handler.
432\end{description}
433All three functions are created with GCC nested functions. GCC nested functions
434can be used to create closures, functions that can refer to the state of other
435functions on the stack. This approach allows the functions to refer to all the
436variables in scope for the function containing the @try@ statement.  These
437nested functions and all other functions besides @__cfaehm_try_terminate@ in
438\CFA use the GCC personality function and the @-fexceptions@ flag to generate
439the LSDA. This allows destructors to be implemented with the cleanup attribute.
440
441\section{Resumption}
442% The stack-local data, the linked list of nodes.
443
444Resumption simple to implement because there is no stack unwinding. The
445resumption raise uses a list of nodes for its stack traversal. The head of the
446list is stored in the exception context. The nodes in the list have a pointer
447to the next node and a pointer to the handler function.
448
449A resumption raise traverses this list. At each node the handler function is
450called, passing the exception by pointer. It returns true if the exception is
451handled and false otherwise.
452
453The handler function does both the matching and handling. It computes the
454condition of each @catchResume@ in top-to-bottom order, until it finds a
455handler that matches. If no handler matches then the function returns
456false. Otherwise the matching handler is run; if it completes successfully, the
457function returns true. Reresume, through the @throwResume;@ statement, cause
458the function to return true.
459
460% Recursive Resumption Stuff:
461Search skipping \see{\VPageref{p:searchskip}}, which ignores parts of the stack
462already examined, is accomplished by updating the front of the list as the
463search continues. Before the handler at a node is called the head of the list
464is updated to the next node of the current node. After the search is complete,
465successful or not, the head of the list is reset.
466
467This mechanism means the current handler and every handler that has already
468been checked are not on the list while a handler is run. If a resumption is
469thrown during the handling of another resumption the active handlers and all
470the other handler checked up to this point are not checked again.
471
472This structure also supports new handler added while the resumption is being
473handled. These are added to the front of the list, pointing back along the
474stack -- the first one points over all the checked handlers -- and the ordering
475is maintained.
476
477\label{p:zero-cost}
478Note, the resumption implementation has a cost for entering/exiting a @try@
479statement with @catchResume@ clauses, whereas a @try@ statement with @catch@
480clauses has zero-cost entry/exit. While resumption does not need the stack
481unwinding and cleanup provided by libunwind, it could use the search phase to
482providing zero-cost enter/exit using the LSDA. Unfortunately, there is no way
483to return from a libunwind search without installing a handler or raising an
484error.  Although workarounds might be possible, they are beyond the scope of
485this thesis. The current resumption implementation has simplicity in its
486favour.
487% Seriously, just compare the size of the two chapters and then consider
488% that unwind is required knowledge for that chapter.
489
490\section{Finally}
491% Uses destructors and GCC nested functions.
492Finally clauses is placed into a GCC nested-function with a unique name, and no
493arguments or return values. This nested function is then set as the cleanup
494function of an empty object that is declared at the beginning of a block placed
495around the context of the associated @try@ statement.
496
497The rest is handled by GCC. The try block and all handlers are inside the
498block. At completion, control exits the block and the empty object is cleaned
499up, which runs the function that contains the finally code.
500
501\section{Cancellation}
502% Stack selections, the three internal unwind functions.
503
504Cancellation also uses libunwind to do its stack traversal and unwinding,
505however it uses a different primary function @_Unwind_ForcedUnwind@.  Details
506of its interface can be found in the \VRef{s:ForcedUnwind}.
507
508The first step of cancellation is to find the cancelled stack and its type:
509coroutine or thread. Fortunately, the thread library stores the main thread
510pointer and the current thread pointer, and every thread stores a pointer to
511its main coroutine and the coroutine it is currently executing.
512
513The first check is if the current thread's main and current coroutine do not
514match, implying a coroutine cancellation; otherwise, it is a thread
515cancellation. Otherwise it is a main thread cancellation. \PAB{Previous
516sentence does not make sense.}
517
518However, if the threading library is not linked, the sequential execution is on
519the main stack. Hence, the entire check is skipped because the weak-symbol
520function is loaded. Therefore, a main thread cancellation is unconditionally
521performed.
522
523Regardless of how the stack is chosen, the stop function and parameter are
524passed to the forced-unwind function. The general pattern of all three stop
525functions is the same: they continue unwinding until the end of stack when they
526do there primary work.
527
528For main stack cancellation, the transfer is just a program abort.
529
530For coroutine cancellation, the exception is stored on the coroutine's stack,
531and the coroutine context switches to its last resumer. The rest is handled on
532the backside of the resume, which check if the resumed coroutine is
533cancelled. If cancelled, the exception is retrieved from the resumed coroutine,
534and a @CoroutineCancelled@ exception is constructed and loaded with the
535cancelled exception. It is then resumed as a regular exception with the default
536handler coming from the context of the resumption call.
537
538For thread cancellation, the exception is stored on the thread's main stack and
539then context switched to the scheduler. The rest is handled by the thread
540joiner. When the join is complete, the joiner checks if the joined thread is
541cancelled. If cancelled, the exception is retrieved and the joined thread, and
542a @ThreadCancelled@ exception is constructed and loaded with the cancelled
543exception. The default handler is passed in as a function pointer. If it is
544null (as it is for the auto-generated joins on destructor call), the default is
545used, which is a program abort.
546%; which gives the required handling on implicate join.
Note: See TracBrowser for help on using the repository browser.