source: doc/theses/andrew_beach_MMath/implement.tex @ 830299f

Last change on this file since 830299f was 830299f, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: Second draft of the implement chapter.

  • Property mode set to 100644
File size: 29.6 KB
2% Goes over how all the features are implemented.
4The implementation work for this thesis covers two components: the virtual
5system and exceptions. Each component is discussed in detail.
7\section{Virtual System}
9% Virtual table rules. Virtual tables, the pointer to them and the cast.
10While the \CFA virtual system currently has only one public feature, virtual
11cast \see{\VPageref{p:VirtualCast}}, substantial structure is required to
12support it, and provide features for exception handling and the standard
15\subsection{Virtual Type}
16Virtual types only have one change to their structure, the addition of a
17pointer to the virtual table. This is always the first field so that
18if it is cast to a supertype the field's location is still known.
20This field is set as part of all new generated constructors.
21\todo{They only come as part exceptions and don't work.}
22After the object is created the field is constant.
24However it can be read from, internally it is just a regular field called
25@virtual_table@. Dereferencing it gives the virtual table and access to the
26type's virtual members.
28\subsection{Virtual Table}
29Every time a virtual type is defined the new virtual table type must also be
32The unique instance is important because the address of the virtual table
33instance is used as the identifier for the virtual type. So a pointer to the
34virtual table and the ID for the virtual type are interchangable.
35\todo{Unique instances might be going so we will have to talk about the new
36system instead.}
38The first step in putting it all together is to create the virtual table type.
39The virtual table type is just a structure and can be described in terms of
40its fields. The first field is always the parent type ID (or a pointer to
41the parent virtual table) or 0 (the null pointer).
42Next are other fields on the parent virtual table are repeated.
43Finally are the fields used to store any new virtual members of the new
44The virtual type
46The virtual system is accessed through a private constant field inserted at the
47beginning of every virtual type, called the virtual-table pointer. This field
48points at a type's virtual table and is assigned during the object's
49construction. The address of a virtual table acts as the unique identifier for
50the virtual type, and the first field of a virtual table is a pointer to the
51parent virtual-table or @0p@. The remaining fields are duplicated from the
52parent tables in this type's inheritance chain, followed by any fields this type
53introduces. Parent fields are duplicated so they can be changed (all virtual
54members are overridable), so that references to the dispatched type
55are replaced with the current virtual type.
56% These are always taken by pointer or reference.
58% Simple ascii diragram:
60parent_pointer  \
61parent_field0   |
62...             | Same layout as parent.
63parent_fieldN   /
68\todo{Refine the diagram}
70% For each virtual type, a virtual table is constructed. This is both a new type
71% and an instance of that type. Other instances of the type could be created
72% but the system doesn't use them. So this section will go over the creation of
73% the type and the instance.
75A virtual table is created when the virtual type is created. The name of the
76type is created by mangling the name of the base type. The name of the instance
77is also generated by name mangling. The fields are initialized automatically.
78The parent field is initialized by getting the type of the parent field and
79using that to calculate the mangled name of the parent's virtual table type.
80There are two special fields that are included like normal fields but have
81special initialization rules: the @size@ field is the type's size and is
82initialized with a @sizeof@ expression, the @align@ field is the type's
83alignment and uses an @alignof@ expression. The remaining fields are resolved
84to a name matching the field's name and type using the normal visibility and
85overload resolution rules of the type system.
87These operations are split up into several groups depending on where they take
88place which varies for monomorphic and polymorphic types. The first devision is
89between the declarations and the definitions. Declarations, such as a function
90signature or a aggregate's name, must always be visible but may be repeated in
91the form of forward declarations in headers. Definitions, such as function
92bodies and a aggregate's layout, can be separately compiled but must occur
93exactly once in a source file.
96The declarations include the virtual type definition and forward declarations
97of the virtual table instance, constructor, message function and
98@get_exception_vtable@. The definition includes the storage and initialization
99of the virtual table instance and the bodies of the three functions.
102Monomorphic instances put all of these two groups in one place each.
103Polymorphic instances also split out the core declarations and definitions from
104the per-instance information. The virtual table type and most of the functions
105are polymorphic so they are all part of the core. The virtual table instance
106and the @get_exception_vtable@ function.
109Coroutines and threads need instances of @CoroutineCancelled@ and
110@ThreadCancelled@ respectively to use all of their functionality. When a new
111data type is declared with @coroutine@ or @thread@ the forward declaration for
112the instance is created as well. The definition of the virtual table is created
113at the definition of the main function.
116\subsection{Virtual Cast}
117Virtual casts are implemented as a function call that does the subtype check
118and a C coercion-cast to do the type conversion.
119% The C-cast is just to make sure the generated code is correct so the rest of
120% the section is about that function.
121The function is
123void * __cfa__virtual_cast(
124        struct __cfa__parent_vtable const * parent,
125        struct __cfa__parent_vtable const * const * child );
127and it is implemented in the standard library. The structure reperents the
128head of a vtable which is the pointer to the parent virtual table. The
129@parent@ points directly at the parent type virtual table while the @child@
130points at the object of the (possibe) child type.
132In terms of the virtual cast expression, @parent@ comes from looking up the
133type being cast to and @child@ is the result of the expression being cast.
134Because the complier outputs C code, some type C type casts are also used.
135The last bit of glue is an map that saves every virtual type the compiler
136sees. This is used to check the type used in a virtual cast is a virtual
137type and to get its virtual table.
138(It also checks for conflicting definitions.)
140Inside the function it is a simple conditional. If the type repersented by
141@parent@ is or is an ancestor of the type repersented by @*child@ (it
142requires one more level of derefence to pass through the object) then @child@
143is returned, otherwise the null pointer is returned.
145The check itself is preformed is a simple linear search. If the child
146virtual table or any of its ancestors (which are retreved through the first
147field of every virtual table) are the same as the parent virtual table then
148the cast succeeds.
151% Anything about exception construction.
154% Adapt the unwind chapter, just describe the sections of libunwind used.
155% Mention that termination and cancellation use it. Maybe go into why
156% resumption doesn't as well.
158% Many modern languages work with an interal stack that function push and pop
159% their local data to. Stack unwinding removes large sections of the stack,
160% often across functions.
162Stack unwinding is the process of removing stack frames (activations) from the
163stack. On function entry and return, unwinding is handled directly by the code
164embedded in the function. Usually, the stack-frame size is known statically
165based on parameter and local variable declarations. For dynamically-sized
166local variables, a runtime computation is necessary to know the frame
167size. Finally, a function's frame-size may change during execution as local
168variables (static or dynamic sized) go in and out of scope.
169Allocating/deallocating stack space is usually an $O(1)$ operation achieved by
170bumping the hardware stack-pointer up or down as needed.
172Unwinding across multiple stack frames is more complex because individual stack
173management code associated with each frame is bypassed. That is, the location
174of a function's frame-management code is largely unknown and dispersed
175throughout the function, hence the current frame size managed by that code is
176also unknown. Hence, code unwinding across frames does not have direct
177knowledge about what is on the stack, and hence, how much of the stack needs to
178be removed.
180% At a very basic level this can be done with @setjmp@ \& @longjmp@ which simply
181% move the top of the stack, discarding everything on the stack above a certain
182% point. However this ignores all the cleanup code that should be run when
183% certain sections of the stack are removed (for \CFA these are from destructors
184% and finally clauses) and also requires that the point to which the stack is
185% being unwound is known ahead of time. libunwind is used to address both of
186% these problems.
188The traditional unwinding mechanism for C is implemented by saving a snap-shot
189of a function's state with @setjmp@ and restoring that snap-shot with
190@longjmp@. This approach bypasses the need to know stack details by simply
191reseting to a snap-shot of an arbitrary but existing function frame on the
192stack. It is up to the programmer to ensure the snap-shot is valid when it is
193reset, making this unwinding approach fragile with potential errors that are
194difficult to debug because the stack becomes corrupted.
196However, many languages define cleanup actions that must be taken when objects
197are deallocated from the stack or blocks end, such as running a variable's
198destructor or a @try@ statement's @finally@ clause. Handling these mechanisms
199requires walking the stack and checking each stack frame for these potential
202For exceptions, it must be possible to walk the stack frames in search of @try@
203statements to match and execute a handler. For termination exceptions, it must
204also be possible to unwind all stack frames from the throw to the matching
205catch, and each of these frames must be checked for cleanup actions. Stack
206walking is where most of the complexity and expense of exception handling
209One of the most popular tools for stack management is libunwind, a low-level
210library that provides tools for stack walking, handler execution, and
211unwinding. What follows is an overview of all the relevant features of
212libunwind needed for this work, and how \CFA uses them to implement exception
215\subsection{libunwind Usage}
216Libunwind, accessed through @unwind.h@ on most platforms, is a C library that
217provides \CC-style stack-unwinding. Its operation is divided into two phases:
218search and cleanup. The dynamic target search -- phase 1 -- is used to scan the
219stack and decide where unwinding should stop (but no unwinding occurs). The
220cleanup -- phase 2 -- does the unwinding and also runs any cleanup code.
222To use libunwind, each function must have a personality function and a Language
223Specific Data Area (LSDA). The LSDA has the unique information for each
224function to tell the personality function where a function is executing, its
225current stack frame, and what handlers should be checked. Theoretically, the
226LSDA can contain any information but conventionally it is a table with entries
227representing regions of the function and what has to be done there during
228unwinding. These regions are bracketed by the instruction pointer. If the
229instruction pointer is within a region's start/end, then execution is currently
230executing in that region. Regions are used to mark out the scopes of objects
231with destructors and try blocks.
233% Libunwind actually does very little, it simply moves down the stack from
234% function to function. Most of the actions are implemented by the personality
235% function which libunwind calls on every function. Since this is shared across
236% many functions or even every function in a language it will need a bit more
237% information.
239The GCC compilation flag @-fexceptions@ causes the generation of an LSDA and
240attaches its personality function. However, this
241flag only handles the cleanup attribute:
242\todo{Peter: What is attached? Andrew: It uses the .cfi\_personality directive
243and that's all I know.}
245void clean_up( int * var ) { ... }
246int avar __attribute__(( cleanup(clean_up) ));
248which is used on a variable and specifies a function, in this case @clean_up@,
249run when the variable goes out of scope.
250The function is passed a pointer to the object being removed from the stack
251so it can be used to mimic destructors.
252However, this feature cannot be used to mimic @try@ statements as it cannot
253control the unwinding.
255\subsection{Personality Functions}
256Personality functions have a complex interface specified by libunwind. This
257section covers some of the important parts of the interface.
259A personality function can preform different actions depending on how it is
262typedef _Unwind_Reason_Code (*@_Unwind_Personality_Fn@) (
263        _Unwind_Action @action@,
264        _Unwind_Exception_Class @exception_class@,
265        _Unwind_Exception * @exception@,
266        struct _Unwind_Context * @context@
269The @action@ argument is a bitmask of possible actions:
272@_UA_SEARCH_PHASE@ specifies a search phase and tells the personality function
273to check for handlers. If there is a handler in a stack frame, as defined by
274the language, the personality function returns @_URC_HANDLER_FOUND@; otherwise
275it return @_URC_CONTINUE_UNWIND@.
278@_UA_CLEANUP_PHASE@ specifies a cleanup phase, where the entire frame is
279unwound and all cleanup code is run. The personality function does whatever
280cleanup the language defines (such as running destructors/finalizers) and then
281generally returns @_URC_CONTINUE_UNWIND@.
285@_UA_HANDLER_FRAME@ specifies a cleanup phase on a function frame that found a
286handler. The personality function must prepare to return to normal code
287execution and return @_URC_INSTALL_CONTEXT@.
291@_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs
292the cleanup phase and uses a different means to decide when to stop
296The @exception_class@ argument is a copy of the
297\lstinline[language=C]|exception|'s @exception_class@ field.
299The \lstinline[language=C]|exception| argument is a pointer to the user
300provided storage object. It has two public fields, the exception class, which
301is actually just a number, identifying the exception handling mechanism that
302created it, and the cleanup function. The cleanup function is called if
303required by the exception.
305The @context@ argument is a pointer to an opaque type passed to helper
306functions called inside the personality function.
308The return value, @_Unwind_Reason_Code@, is an enumeration of possible messages
309that can be passed several places in libunwind. It includes a number of
310messages for special cases (some of which should never be used by the
311personality function) and error codes but unless otherwise noted the
312personality function should always return @_URC_CONTINUE_UNWIND@.
314\subsection{Raise Exception}
315Raising an exception is the central function of libunwind and it performs a
316two-staged unwinding.
318_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *);
320First, the function begins the search phase, calling the personality function
321of the most recent stack frame. It continues to call personality functions
322traversing the stack from newest to oldest until a function finds a handler or
323the end of the stack is reached. In the latter case, raise exception returns
326Second, when a handler is matched, raise exception continues onto the cleanup
328Once again, it calls the personality functions of each stack frame from newest
329to oldest. This pass stops at the stack frame containing the matching handler.
330If that personality function has not install a handler, it is an error.
332If an error is encountered, raise exception returns either
333@_URC_FATAL_PHASE1_ERROR@ or @_URC_FATAL_PHASE2_ERROR@ depending on when the
334error occurred.
336\subsection{Forced Unwind}
338Forced Unwind is the other central function in libunwind.
340_Unwind_Reason_Code _Unwind_ForcedUnwind( _Unwind_Exception *,
341        _Unwind_Stop_Fn, void *);
343It also unwinds the stack but it does not use the search phase. Instead another
344function, the stop function, is used to stop searching. The exception is the
345same as the one passed to raise exception. The extra arguments are the stop
346function and the stop parameter. The stop function has a similar interface as a
347personality function, except it is also passed the stop parameter.
349typedef _Unwind_Reason_Code (*@_Unwind_Stop_Fn@)(
350        _Unwind_Action @action@,
351        _Unwind_Exception_Class @exception_class@,
352        _Unwind_Exception * @exception@,
353        struct _Unwind_Context * @context@,
354        void * @stop_parameter@);
357The stop function is called at every stack frame before the personality
358function is called and then once more after all frames of the stack are
361Each time it is called, the stop function should return @_URC_NO_REASON@ or
362transfer control directly to other code outside of libunwind. The framework
363does not provide any assistance here.
366Its arguments are the same as the paired personality function. The actions
367@_UA_CLEANUP_PHASE@ and @_UA_FORCE_UNWIND@ are always set when it is
368called. Beyond the libunwind standard, both GCC and Clang add an extra action
369on the last call at the end of the stack: @_UA_END_OF_STACK@.
372\section{Exception Context}
373% Should I have another independent section?
374% There are only two things in it, top_resume and current_exception. How it is
375% stored changes depending on whether or not the thread-library is linked.
377The exception context is global storage used to maintain data across different
378exception operations and to communicate among different components.
380Each stack must have its own exception context. In a sequential \CFA program,
381there is only one stack with a single global exception-context. However, when
382the library @libcfathread@ is linked, there are multiple stacks where each
383needs its own exception context.
385General access to the exception context is provided by function
386@this_exception_context@. For sequential execution, this function is defined as
387a weak symbol in the \CFA system-library, @libcfa@. When a \CFA program is
388concurrent, it links with @libcfathread@, where this function is defined with a
389strong symbol replacing the sequential version.
391The sequential @this_exception_context@ returns a hard-coded pointer to the
392global execption context.
393The concurrent version adds the exception context to the data stored at the
394base of each stack. When @this_exception_context@ is called it retrieves the
395active stack and returns the address of the context saved there.
398% Memory management & extra information, the custom function used to implement
399% catches. Talk about GCC nested functions.
401Termination exceptions use libunwind heavily because it matches the intended
402use from \CC exceptions closely. The main complication for \CFA is that the
403compiler generates C code, making it very difficult to generate the assembly to
404form the LSDA for try blocks or destructors.
406\subsection{Memory Management}
407The first step of a termination raise is to copy the exception into memory
408managed by the exception system. Currently, the system uses @malloc@, rather
409than reserved memory or the stack top. The exception handling mechanism manages
410memory for the exception as well as memory for libunwind and the system's own
411per-exception storage.
413[Quick ASCII diagram to get started.]
415Fixed Header  | _Unwind_Exception   <- pointer target
416              |
417              | Cforall storage
418              |
419Variable Body | the exception       <- fixed offset
420              V ...
423Exceptions are stored in variable-sized blocks.
424The first component is a fixed sized data structure that contains the
425information for libunwind and the exception system. The second component is an
426area of memory big enough to store the exception. Macros with pointer arthritic
427and type cast are used to move between the components or go from the embedded
428@_Unwind_Exception@ to the entire node.
430All of these nodes are linked together in a list, one list per stack, with the
431list head stored in the exception context. Within each linked list, the most
432recently thrown exception is at the head followed by older thrown
433exceptions. This format allows exceptions to be thrown, while a different
434exception is being handled. The exception at the head of the list is currently
435being handled, while other exceptions wait for the exceptions before them to be
438The virtual members in the exception's virtual table provide the size of the
439exception, the copy function, and the free function, so they are specific to an
440exception type. The size and copy function are used immediately to copy an
441exception into managed memory. After the exception is handled the free function
442is used to clean up the exception and then the entire node is passed to free
443so the memory can be given back to the heap.
445\subsection{Try Statements and Catch Clauses}
446The try statement with termination handlers is complex because it must
447compensate for the lack of assembly-code generated from \CFA. Libunwind
448requires an LSDA and personality function for control to unwind across a
449function. The LSDA in particular is hard to mimic in generated C code.
451The workaround is a function called @__cfaehm_try_terminate@ in the standard
452library. The contents of a try block and the termination handlers are converted
453into functions. These are then passed to the try terminate function and it
454calls them.
455Because this function is known and fixed (and not an arbitrary function that
456happens to contain a try statement) this means the LSDA can be generated ahead
457of time.
459Both the LSDA and the personality function are set ahead of time using
460embedded assembly. This is handcrafted using C @asm@ statements and contains
461enough information for the single try statement the function repersents.
463The three functions passed to try terminate are:
465\item[try function:] This function is the try block, all the code inside the
466try block is placed inside the try function. It takes no parameters and has no
467return value. This function is called during regular execution to run the try
470\item[match function:] This function is called during the search phase and
471decides if a catch clause matches the termination exception. It is constructed
472from the conditional part of each handler and runs each check, top to bottom,
473in turn, first checking to see if the exception type matches and then if the
474condition is true. It takes a pointer to the exception and returns 0 if the
475exception is not handled here. Otherwise the return value is the id of the
476handler that matches the exception.
478\item[handler function:] This function handles the exception. It takes a
479pointer to the exception and the handler's id and returns nothing. It is called
480after the cleanup phase. It is constructed by stitching together the bodies of
481each handler and dispatches to the selected handler.
483All three functions are created with GCC nested functions. GCC nested functions
484can be used to create closures, functions that can refer to the state of other
485functions on the stack. This approach allows the functions to refer to all the
486variables in scope for the function containing the @try@ statement. These
487nested functions and all other functions besides @__cfaehm_try_terminate@ in
488\CFA use the GCC personality function and the @-fexceptions@ flag to generate
489the LSDA. This allows destructors to be implemented with the cleanup attribute.
492% The stack-local data, the linked list of nodes.
494Resumption simple to implement because there is no stack unwinding. The
495resumption raise uses a list of nodes for its stack traversal. The head of the
496list is stored in the exception context. The nodes in the list have a pointer
497to the next node and a pointer to the handler function.
499A resumption raise traverses this list. At each node the handler function is
500called, passing the exception by pointer. It returns true if the exception is
501handled and false otherwise.
503The handler function does both the matching and handling. It computes the
504condition of each @catchResume@ in top-to-bottom order, until it finds a
505handler that matches. If no handler matches then the function returns
506false. Otherwise the matching handler is run; if it completes successfully, the
507function returns true. Rethrowing, through the @throwResume;@ statement,
508causes the function to return true.
510% Recursive Resumption Stuff:
511Search skipping \see{\VPageref{p:searchskip}}, which ignores parts of the stack
512already examined, is accomplished by updating the front of the list as the
513search continues. Before the handler at a node is called the head of the list
514is updated to the next node of the current node. After the search is complete,
515successful or not, the head of the list is reset.
517This mechanism means the current handler and every handler that has already
518been checked are not on the list while a handler is run. If a resumption is
519thrown during the handling of another resumption the active handlers and all
520the other handler checked up to this point are not checked again.
522This structure also supports new handler added while the resumption is being
523handled. These are added to the front of the list, pointing back along the
524stack -- the first one points over all the checked handlers -- and the ordering
525is maintained.
528Note, the resumption implementation has a cost for entering/exiting a @try@
529statement with @catchResume@ clauses, whereas a @try@ statement with @catch@
530clauses has zero-cost entry/exit. While resumption does not need the stack
531unwinding and cleanup provided by libunwind, it could use the search phase to
532providing zero-cost enter/exit using the LSDA. Unfortunately, there is no way
533to return from a libunwind search without installing a handler or raising an
534error. Although workarounds might be possible, they are beyond the scope of
535this thesis. The current resumption implementation has simplicity in its
537% Seriously, just compare the size of the two chapters and then consider
538% that unwind is required knowledge for that chapter.
541% Uses destructors and GCC nested functions.
542Finally clauses is placed into a GCC nested-function with a unique name, and no
543arguments or return values. This nested function is then set as the cleanup
544function of an empty object that is declared at the beginning of a block placed
545around the context of the associated @try@ statement.
547The rest is handled by GCC. The try block and all handlers are inside the
548block. At completion, control exits the block and the empty object is cleaned
549up, which runs the function that contains the finally code.
552% Stack selections, the three internal unwind functions.
554Cancellation also uses libunwind to do its stack traversal and unwinding,
555however it uses a different primary function @_Unwind_ForcedUnwind@. Details
556of its interface can be found in the \VRef{s:ForcedUnwind}.
558The first step of cancellation is to find the cancelled stack and its type:
559coroutine or thread. Fortunately, the thread library stores the main thread
560pointer and the current thread pointer, and every thread stores a pointer to
561its main coroutine and the coroutine it is currently executing.
563So if the active thread's main and current coroutine are the same. If they
564are then the current stack is a thread stack, otherwise it is a coroutine
565stack. If it is a thread stack then an equality check with the stored main
566thread pointer and current thread pointer is enough to tell if the current
567thread is the main thread or not.
569However, if the threading library is not linked, the sequential execution is on
570the main stack. Hence, the entire check is skipped because the weak-symbol
571function is loaded. Therefore, a main thread cancellation is unconditionally
574Regardless of how the stack is chosen, the stop function and parameter are
575passed to the forced-unwind function. The general pattern of all three stop
576functions is the same: they continue unwinding until the end of stack when they
577do there primary work.
579For main stack cancellation, the transfer is just a program abort.
581For coroutine cancellation, the exception is stored on the coroutine's stack,
582and the coroutine context switches to its last resumer. The rest is handled on
583the backside of the resume, which check if the resumed coroutine is
584cancelled. If cancelled, the exception is retrieved from the resumed coroutine,
585and a @CoroutineCancelled@ exception is constructed and loaded with the
586cancelled exception. It is then resumed as a regular exception with the default
587handler coming from the context of the resumption call.
589For thread cancellation, the exception is stored on the thread's main stack and
590then context switched to the scheduler. The rest is handled by the thread
591joiner. When the join is complete, the joiner checks if the joined thread is
592cancelled. If cancelled, the exception is retrieved and the joined thread, and
593a @ThreadCancelled@ exception is constructed and loaded with the cancelled
594exception. The default handler is passed in as a function pointer. If it is
595null (as it is for the auto-generated joins on destructor call), the default is
596used, which is a program abort.
597%; which gives the required handling on implicate join.
Note: See TracBrowser for help on using the repository browser.