Context Navigation

-              r029cbc0
+              r5a4f1a8
 \label{c:implement}
 The implementation work for this thesis covers two components: the virtual
+The implementation work for this thesis covers the two components: virtual
 system and exceptions. Each component is discussed in detail.
 …
 \todo{Talk about constructors for virtual types (after they are working).}
+This is what binds an instance of a virtual type to a virtual table. This
+pointer can be used as an identity check. It can also be used to access the
+The virtual table pointer binds an instance of a virtual type
+to a virtual table.
+The pointer is also the table's id and how the system accesses the
 virtual table and the virtual members there.
 \subsection{Type Id}
 Every virtual type has a unique id.
+Type ids can be compared for equality (the types reperented are the same)
+Type ids can be compared for equality,
+which checks if the types reperented are the same,
 or used to access the type's type information.
 The type information currently is only the parent's type id or, if the
 type has no parent, zero.
+type has no parent, the null pointer.
 The id's are implemented as pointers to the type's type information instance.
 Derefencing the pointer gets the type information.
+By going back-and-forth between the type id and
 the type info one can find every ancestor of a virtual type.
 It also pushes the issue of creating a unique value (for
+Dereferencing the pointer gets the type information.
+The ancestors of a virtual type are found by traversing type ids through
+the type information.
+The information pushes the issue of creating a unique value (for
 the type id) to the problem of creating a unique instance (for type
+information) which the linker can solve.
+Advanced linker support is required because there is no place that appears
+only once to attach the type information to. There should be one structure
+definition but it is included in multiple translation units. Each virtual
+table definition should be unique but there are an arbitrary number of thoses.
+So the special section prefix \texttt{.gnu.linkonce} is used.
+With a unique suffix (making the entire section name unique) the linker will
+remove multiple definition making sure only one version exists after linking.
+Then it is just a matter of making sure there is a unique name for each type.
+This is done in three phases.
+The first phase is to generate a new structure definition to store the type
+information), which the linker can solve.
+The advanced linker support is used here to avoid having to create
+a new declaration to attach this data to.
+With C/\CFA's header/implementation file divide for something to appear
+exactly once it must come from a declaration that appears in exactly one
+implementation file; the declarations in header files may exist only once
+they can be included in many different translation units.
+Therefore, structure's declaration will not work.
+Neither will attaching the type information to the virtual table -- although
+a vtable declarations are in implemention files they are not unique, see
+\autoref{ss:VirtualTable}.
+Instead the same type information is generated multiple times and then
+the new attribute \snake{cfa_linkone} is used to removed duplicates.
+Type information is constructed as follows:
+\begin{enumerate}
+\item
+Use the type's name to generate a name for the type information structure.
+This is saved so it may be reused.
+\item
+Generate a new structure definition to store the type
 information. The layout is the same in each case, just the parent's type id,
 but the types are changed.
 The structure's name is change, it is based off the virtual type's name, and
 the type of the parent's type id.
+but the types used change from instance to instance.
+The generated name is used for both this structure and, if relivant, the
+parent pointer.
 If the virtual type is polymorphic then the type information structure is
 polymorphic as well, with the same polymorphic arguments.
+The second phase is to generate an instance of the type information with a
+almost unique name, generated by mangling the virtual type name.
+The third phase is implicit with \CFA's overloading scheme. \CFA mangles
+names with type information so that all of the symbols exported to the linker
+are unique even if in \CFA code they are the same. Having two declarations
+with the same name and same type is forbidden because it is impossible for
+overload resolution to pick between them. This is why a unique type is
+generated for each virtual type.
+Polymorphic information is included in this mangling so polymorphic
+types will have seperate instances for each set of polymorphic arguments.
+\begin{cfa}
+struct TYPE_ID_TYPE {
+        PARENT_ID_TYPE const * parent;
+\item
+A seperate name for instances is generated from the type's name.
+\item
+The definition is generated and initialised.
+The parent id is set to the null pointer or to the address of the parent's
+type information instance. Name resolution handles the rest.
+\item
+\CFA's name mangler does its regular name mangling encoding the type of
+the declaration into the instance name. This gives a completely unique name
+including different instances of the same polymorphic type.
+\end{enumerate}
+\todo{The list is making me realise, some of this isn't ordered.}
+Writing that code manually, with helper macros for the early name mangling,
+would look like this:
+\begin{cfa}
+struct INFO_TYPE(TYPE) {
+        INFO_TYPE(PARENT) const * parent;
 };
 __attribute__((cfa_linkonce))
 TYPE_ID_TYPE const TYPE_ID_NAME = {
         &PARENT_ID_NAME,
+INFO_TYPE(TYPE) const INFO_NAME(TYPE) = {
+        &INFO_NAME(PARENT),
 };
 \end{cfa}
+\subsubsection{cfa\_linkonce Attribute}
+\subsubsection{\lstinline{cfa\_linkonce} Attribute}
+% I just realised: This is an extension of the inline keyword.
+% An extension of C's at least, it is very similar to C++'s.
 Another feature added to \CFA is a new attribute: \texttt{cfa\_linkonce}.
+This attribute can be put on an object or function definition
+(any global declaration with a name and a type).
+This allows you to define that object or function multiple times.
+All definitions should have the link-once attribute on them and all should
+be identical.
+The simplist way to use it is to put a definition in a header where the
+forward declaration would usually go.
+This is how it is used for type-id instances. There was is no unique location
+associated with a type except for the type definition which is in a header.
+This allows the unique type-id object to be generated there.
+Internally @cfa_linkonce@ removes all @section@ attributes
+from the declaration (as well as itself) and replaces them with
+This attribute is attached to an object or function definition
+(any global declaration with a name and a type)
+allowing it to be defined multiple times.
+All matching definitions mush have the link-once attribute
+and their implementations should be identical as well.
+A single definition with the attribute can be included in a header
+file as if it was a forward declaration, except no definition is required.
+This technique is used for type-id instances. A link-once definition is
+generated each time the structure is seen. This will result in multiple
+copies but the link-once attribute ensures all but one are removed for a
+unique instance.
+Internally, @cfa_linkonce@ is replaced with
 @section(".gnu.linkonce.NAME")@ where \texttt{NAME} is replaced by the
 mangled name of the object.
+Any other @section@ attributes are removed from the declaration.
 The prefix \texttt{.gnu.linkonce} in section names is recognized by the
 linker. If two of these sections with the same name, including everything
+that comes after the special prefix, then only one will be used and the other
 will be discarded.
+linker. If two of these sections appear with the same name, including
+everything that comes after the special prefix, then only one is used
+and the other is discarded.
 \subsection{Virtual Table}
+\label{ss:VirtualTable}
 Each virtual type has a virtual table type that stores its type id and
 virtual members.
 …
 The layout always comes in three parts.
+\todo{Add labels to the virtual table layout figure.}
 The first section is just the type id at the head of the table. It is always
+there to ensure that
+there to ensure that it can be found even when the accessing code does not
+know which virtual type it has.
 The second section are all the virtual members of the parent, in the same
 order as they appear in the parent's virtual table. Note that the type may
 …
 prefix that has the same layout and types as its parent virtual table.
 This, combined with the fixed offset to the virtual table pointer, means that
+for any virtual type it doesn't matter if we have it or any of its
+descendants, it is still always safe to access the virtual table through
+the virtual table pointer.
+From there it is safe to check the type id to identify the exact type of the
+for any virtual type, it is always safe to access its virtual table and,
+from there, it is safe to check the type id to identify the exact type of the
 underlying object, access any of the virtual members and pass the object to
 any of the method-like virtual members.
 When a virtual table is declared the user decides where to declare it and its
+When a virtual table is declared, the user decides where to declare it and its
 name. The initialization of the virtual table is entirely automatic based on
 the context of the declaration.
 The type id is always fixed, each virtual table type will always have one
+The type id is always fixed; with each virtual table type having
 exactly one possible type id.
 The virtual members are usually filled in by resolution. The best match for
 a given name and type at the declaration site is filled in.
 There are two exceptions to that rule: the @size@ field is the type's size
 and is set to the result of a @sizeof@ expression, the @align@ field is the
 type's alignment and similarly uses an @alignof@ expression.
+The virtual members are usually filled in by type resolution.
+The best match for a given name and type at the declaration site is used.
+There are two exceptions to that rule: the @size@ field, the type's size,
+is set using a @sizeof@ expression and the @align@ field, the
+type's alignment, is set using an @alignof@ expression.
 \subsubsection{Concurrency Integration}
 Coroutines and threads need instances of @CoroutineCancelled@ and
 @ThreadCancelled@ respectively to use all of their functionality. When a new
 data type is declared with @coroutine@ or @thread@ the forward declaration for
+data type is declared with @coroutine@ or @thread@, a forward declaration for
 the instance is created as well. The definition of the virtual table is created
 at the definition of the main function.
+This is showned through code re-writing in
+\autoref{f:ConcurrencyTransformations}.
 \begin{figure}
 …
 the cast target is passed in as @child@.
 For C generation both arguments and the result are wrapped with type casts.
 There is also an internal store inside the compiler to make sure that the
+For generated C code wraps both arguments and the result with type casts.
+There is also an internal check inside the compiler to make sure that the
 target type is a virtual type.
 % It also checks for conflicting definitions.
+The virtual cast either returns the original pointer as a new type or null.
+So the function just does the parent check and returns the approprate value.
+The virtual cast either returns the original pointer or the null pointer
+as the new type.
+So the function does the parent check and returns the appropriate value.
 The parent check is a simple linear search of child's ancestors using the
 type information.
 …
 % resumption doesn't as well.
 % Many modern languages work with an interal stack that function push and pop
+% Many modern languages work with an internal stack that function push and pop
 % their local data to. Stack unwinding removes large sections of the stack,
 % often across functions.
 …
 stack. On function entry and return, unwinding is handled directly by the
 call/return code embedded in the function.
 In many cases the position of the instruction pointer (relative to parameter
+In many cases, the position of the instruction pointer (relative to parameter
 and local declarations) is enough to know the current size of the stack
 frame.
 Usually, the stack-frame size is known statically based on parameter and
 local variable declarations. Even with dynamic stack-size the information
 to determain how much of the stack has to be removed is still contained
+local variable declarations. Even with dynamic stack-size, the information
+to determine how much of the stack has to be removed is still contained
 within the function.
 Allocating/deallocating stack space is usually an $O(1)$ operation achieved by
 bumping the hardware stack-pointer up or down as needed.
 Constructing/destructing values on the stack takes longer put in terms of
 figuring out what needs to be done is of similar complexity.
+Constructing/destructing values within a stack frame has
+a similar complexity but can add additional work and take longer.
 Unwinding across multiple stack frames is more complex because that
 …
 reseting to a snap-shot of an arbitrary but existing function frame on the
 stack. It is up to the programmer to ensure the snap-shot is valid when it is
 reset and that all required clean-up from the unwound stacks is preformed.
 This approach is fragile and forces a work onto the surounding code.
 With respect to that work forced onto the surounding code,
+reset and that all required clean-up from the unwound stacks is performed.
+This approach is fragile and requires extra work in the surrounding code.
+With respect to the extra work in the surounding code,
 many languages define clean-up actions that must be taken when certain
 sections of the stack are removed. Such as when the storage for a variable
 is removed from the stack or when a try statement with a finally clause is
 (conceptually) popped from the stack.
 None of these should be handled by the user, that would contradict the
 intention of these features, so they need to be handled automatically.
 To safely remove sections of the stack the language must be able to find and
+None of these should be handled by the user --- that would contradict the
+intention of these features --- so they need to be handled automatically.
+To safely remove sections of the stack, the language must be able to find and
 run these clean-up actions even when removing multiple functions unknown at
 the beginning of the unwinding.
 …
 current stack frame, and what handlers should be checked. Theoretically, the
 LSDA can contain any information but conventionally it is a table with entries
 representing regions of the function and what has to be done there during
+representing regions of a function and what has to be done there during
 unwinding. These regions are bracketed by instruction addresses. If the
 instruction pointer is within a region's start/end, then execution is currently
 …
 int avar __attribute__(( cleanup(clean_up) ));
 \end{cfa}
 The attribue is used on a variable and specifies a function,
+The attribute is used on a variable and specifies a function,
 in this case @clean_up@, run when the variable goes out of scope.
+This is enough to mimic destructors, but not try statements which can effect
+This feature is enough to mimic destructors,
+but not try statements which can effect
 the unwinding.
 To get full unwinding support all of this has to be done directly with
 assembly and assembler directives. Partiularly the cfi directives
+To get full unwinding support, all of these features must be handled directly
+in assembly and assembler directives; partiularly the cfi directives
 \snake{.cfi_lsda} and \snake{.cfi_personality}.
 …
 section covers some of the important parts of the interface.
 A personality function can preform different actions depending on how it is
+A personality function can perform different actions depending on how it is
 called.
 \begin{lstlisting}
 …
 The @exception_class@ argument is a copy of the
 \code{C}{exception}'s @exception_class@ field.
+This a number that identifies the exception handling mechanism that created
 the
 The \code{C}{exception} argument is a pointer to the user
+\code{C}{exception}'s @exception_class@ field,
+which is a number that identifies the exception handling mechanism
+that created the exception.
+The \code{C}{exception} argument is a pointer to a user
 provided storage object. It has two public fields: the @exception_class@,
 which is described above, and the @exception_cleanup@ function.
 The clean-up function is used by the EHM to clean-up the exception if it
+The clean-up function is used by the EHM to clean-up the exception, if it
 should need to be freed at an unusual time, it takes an argument that says
 why it had to be cleaned up.
 …
 messages for special cases (some of which should never be used by the
 personality function) and error codes. However, unless otherwise noted, the
 personality function should always return @_URC_CONTINUE_UNWIND@.
+personality function always returns @_URC_CONTINUE_UNWIND@.
 \subsection{Raise Exception}
 Raising an exception is the central function of libunwind and it performs a
+Raising an exception is the central function of libunwind and it performs
 two-staged unwinding.
 \begin{cfa}
 …
 % catches. Talk about GCC nested functions.
 \CFA termination exceptions use libunwind heavily because they match \Cpp
+\CFA termination exceptions use libunwind heavily because they match
 \Cpp exceptions closely. The main complication for \CFA is that the
 compiler generates C code, making it very difficult to generate the assembly to
 …
 \begin{figure}
+\centering
 \input{exception-layout}
 \caption{Exception Layout}
 \label{f:ExceptionLayout}
 \end{figure}
+\todo*{Convert the exception layout to an actual diagram.}
 Exceptions are stored in variable-sized blocks (see \vref{f:ExceptionLayout}).
+Exceptions are stored in variable-sized blocks
+(see \autoref{f:ExceptionLayout}).
 The first component is a fixed-sized data structure that contains the
 information for libunwind and the exception system. The second component is an
 …
 @_Unwind_Exception@ to the entire node.
 Multipe exceptions can exist at the same time because exceptions can be
+Multiple exceptions can exist at the same time because exceptions can be
 raised inside handlers, destructors and finally blocks.
 Figure~\vref{f:MultipleExceptions} shows a program that has multiple
 exceptions active at one time.
 Each time an exception is thrown and caught the stack unwinds and the finally
 clause runs. This will throw another exception (until @num_exceptions@ gets
 high enough) which must be allocated. The previous exceptions may not be
+clause runs. This handler throws another exception (until @num_exceptions@ gets
+high enough), which must be allocated. The previous exceptions may not be
 freed because the handler/catch clause has not been run.
+So the EHM must keep them alive while it allocates exceptions for new throws.
+Therefore, the EHM must keep all unhandled exceptions alive
+while it allocates exceptions for new throws.
 \begin{figure}
 …
 \todo*{Work on multiple exceptions code sample.}
 All exceptions are stored in nodes which are then linked together in lists,
+All exceptions are stored in nodes, which are then linked together in lists
 one list per stack, with the
 list head stored in the exception context. Within each linked list, the most
 …
 exception is being handled. The exception at the head of the list is currently
 being handled, while other exceptions wait for the exceptions before them to be
 removed.
+handled and removed.
 The virtual members in the exception's virtual table provide the size of the
 …
 exception into managed memory. After the exception is handled, the free
 function is used to clean up the exception and then the entire node is
 passed to free so the memory can be given back to the heap.
+passed to free, returning the memory back to the heap.
 \subsection{Try Statements and Catch Clauses}
 The try statement with termination handlers is complex because it must
+compensate for the lack of assembly-code generated from \CFA. Libunwind
+compensate for the C code-generation versus
+assembly-code generated from \CFA. Libunwind
 requires an LSDA and personality function for control to unwind across a
 function. The LSDA in particular is hard to mimic in generated C code.
 …
 embedded assembly. This assembly code is handcrafted using C @asm@ statements
 and contains
 enough information for the single try statement the function repersents.
+enough information for a single try statement the function repersents.
 The three functions passed to try terminate are:
 \begin{description}
 \item[try function:] This function is the try block, all the code inside the
 try block is placed inside the try function. It takes no parameters and has no
+\item[try function:] This function is the try block, it is where all the code
+from inside the try block is placed. It takes no parameters and has no
 return value. This function is called during regular execution to run the try
 block.
 …
 handler that matches the exception.
+\item[handler function:] This function handles the exception. It takes a
+\item[handler function:] This function handles the exception, and contains
+all the code from the handlers in the try statement, joined with a switch
+statement on the handler's id.
+It takes a
 pointer to the exception and the handler's id and returns nothing. It is called
+after the cleanup phase. It is constructed by stitching together the bodies of
+each handler and dispatches to the selected handler.
+after the cleanup phase.
 \end{description}
 All three functions are created with GCC nested functions. GCC nested functions
+can be used to create closures, functions that can refer to the state of other
+can be used to create closures,
+in other words functions that can refer to the state of other
 functions on the stack. This approach allows the functions to refer to all the
 variables in scope for the function containing the @try@ statement. These
 …
 Using this pattern, \CFA implements destructors with the cleanup attribute.
+\autoref{f:TerminationTransformation} shows the pattern used to transform
+a \CFA try statement with catch clauses into the approprate C functions.
+\todo{Explain the Termination Transformation figure.}
 \begin{figure}
 \begin{cfa}
 …
+}
 \end{cfa}
+\medskip
+\hrule
+\medskip
+\todo*{Termination Transformation divider feels too strong.}
 \begin{cfa}
 …
 % The stack-local data, the linked list of nodes.
 Resumption simpler to implement than termination
+Resumption is simpler to implement than termination
 because there is no stack unwinding.
 Instead of storing the data in a special area using assembly,
 …
 The nodes are stored in order, with the more recent try statements closer
 to the head of the list.
 Instead of traversing the stack resumption handling traverses the list.
 At each node the EHM checks to see if the try statement the node repersents
+Instead of traversing the stack, resumption handling traverses the list.
+At each node, the EHM checks to see if the try statement the node repersents
 can handle the exception. If it can, then the exception is handled and
 the operation finishes, otherwise the search continues to the next node.
 If the search reaches the end of the list without finding a try statement
 that can handle the exception the default handler is executed and the
+that can handle the exception, the default handler is executed and the
 operation finishes.
+In each node is a handler function which does most of the work there.
+The handler function is passed the raised the exception and returns true
+if the exception is handled and false if it cannot be handled here.
+For each @catchResume@ clause the handler function will:
+check to see if the raised exception is a descendant type of the declared
+exception type, if it is and there is a conditional expression then it will
+run the test, if both checks pass the handling code for the clause is run
+and the function returns true, otherwise it moves onto the next clause.
+If this is the last @catchResume@ clause then instead of moving onto
+the next clause the function returns false as no handler could be found.
+Each node has a handler function that does most of the work.
+The handler function is passed the raised exception and returns true
+if the exception is handled and false otherwise.
+The handler function checks each of its internal handlers in order,
+top-to-bottom, until it funds a match. If a match is found that handler is
+run, after which the function returns true, ignoring all remaining handlers.
+If no match is found the function returns false.
+The match is performed in two steps, first a virtual cast is used to see
+if the thrown exception is an instance of the declared exception or one of
+its descendant type, then check to see if passes the custom predicate if one
+is defined. This ordering gives the type guarantee used in the predicate.
+\autoref{f:ResumptionTransformation} shows the pattern used to transform
+a \CFA try statement with catch clauses into the approprate C functions.
+\todo{Explain the Resumption Transformation figure.}
 \begin{figure}
 …
+}
 \end{cfa}
+\medskip
+\hrule
+\medskip
+\todo*{Resumption Transformation divider feels too strong.}
 \begin{cfa}
 …
 % Recursive Resumption Stuff:
+Search skipping (see \vpageref{s:ResumptionMarking}), which ignores parts of
+\autoref{f:ResumptionMarking} shows search skipping
+(see \vpageref{s:ResumptionMarking}), which ignores parts of
 the stack
 already examined, is accomplished by updating the front of the list as the
 …
 is updated to the next node of the current node. After the search is complete,
 successful or not, the head of the list is reset.
+% No paragraph?
 This mechanism means the current handler and every handler that has already
 been checked are not on the list while a handler is run. If a resumption is
 thrown during the handling of another resumption the active handlers and all
+thrown during the handling of another resumption, the active handlers and all
 the other handler checked up to this point are not checked again.
 This structure also supports new handler added while the resumption is being
+% No paragraph?
+This structure also supports new handlers added while the resumption is being
 handled. These are added to the front of the list, pointing back along the
 stack -- the first one points over all the checked handlers -- and the ordering
 is maintained.
+stack --- the first one points over all the checked handlers ---
+and the ordering is maintained.
 \begin{figure}
 …
 \caption{Resumption Marking}
 \label{f:ResumptionMarking}
 \todo*{Convert Resumption Marking into a line figure.}
+\todo*{Label Resumption Marking to aid clarity.}
 \end{figure}
 \label{p:zero-cost}
+Note, the resumption implementation has a cost for entering/exiting a @try@
 statement with @catchResume@ clauses, whereas a @try@ statement with @catch@
+Finally, the resumption implementation has a cost for entering/exiting a try
+statement with @catchResume@ clauses, whereas a try statement with @catch@
 clauses has zero-cost entry/exit. While resumption does not need the stack
 unwinding and cleanup provided by libunwind, it could use the search phase to
 …
 The first step of cancellation is to find the cancelled stack and its type:
 coroutine or thread. Fortunately, the thread library stores the main thread
+pointer and the current thread pointer, and every thread stores a pointer to
 its main coroutine and the coroutine it is currently executing.
+\todo*{Consider adding a description of how threads are coroutines.}
 If a the current thread's main and current coroutines are the same then the
 current stack is a thread stack. Furthermore it is easy to compare the
+current thread to the main thread to see if they are the same. And if this
 is not a thread stack then it must be a coroutine stack.
+coroutine, thread or main thread.
+In \CFA, a thread (the construct the user works with) is a user-level thread
+(point of execution) paired with a coroutine, the thread's main coroutine.
+The thread library also stores pointers to the main thread and the current
+thread.
+If the current thread's main and current coroutines are the same then the
+current stack is a thread stack, otherwise it is a coroutine stack.
+If the current stack is a thread stack, it is also the main thread stack
+if and only if the main and current threads are the same.
 However, if the threading library is not linked, the sequential execution is on
 the main stack. Hence, the entire check is skipped because the weak-symbol
 function is loaded. Therefore, a main thread cancellation is unconditionally
+function is loaded. Therefore, main thread cancellation is unconditionally
 performed.
 Regardless of how the stack is chosen, the stop function and parameter are
 passed to the forced-unwind function. The general pattern of all three stop
 functions is the same: they continue unwinding until the end of stack and
 then preform their transfer.
+functions is the same: continue unwinding until the end of stack and
+then preform the appropriate transfer.
 For main stack cancellation, the transfer is just a program abort.
 …
 For coroutine cancellation, the exception is stored on the coroutine's stack,
 and the coroutine context switches to its last resumer. The rest is handled on
 the backside of the resume, which check if the resumed coroutine is
+the backside of the resume, which checks if the resumed coroutine is
 cancelled. If cancelled, the exception is retrieved from the resumed coroutine,
 and a @CoroutineCancelled@ exception is constructed and loaded with the

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 5a4f1a8

Legend:

doc/theses/andrew_beach_MMath/implement.tex

Download in other formats: