Changeset fc1347d0


Ignore:
Timestamp:
May 17, 2021, 9:44:58 AM (3 years ago)
Author:
Andrew Beach <ajbeach@…>
Branches:
ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
8f910430
Parents:
58d471f (diff), 299b8b2 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'andrew-mmath' into 'master', implement chapter updates.

Location:
doc/theses/andrew_beach_MMath
Files:
1 added
4 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/andrew_beach_MMath/cfalab.sty

    r58d471f rfc1347d0  
    11% Package for CFA Research Lab.
     2% (Now more a personal collection and testing grounds for common.sty.)
    23%
    34% This is a collection of commands everyone working on CFA related documents
     
    2324% space and '{}<whatever-follows>' to force remove one.
    2425%
     26% \CFA
    2527% Cforall with the forall symbol.
    2628\newrobustcmd\CFA{\textsf{C\raisebox{\depth}{\rotatebox{180}{A}}}\xspace}
    27 % C++ with kerning. You may optionally append a standard number.
     29% \Cpp[<std>]
     30% C++ symbol name. You may optionally provide <std> to specify a standard.
    2831\newrobustcmd\Cpp[1][\xspace]{C++#1}
    2932
     
    4548\newcommand*\colour[2]{{\color{#1}#2}}
    4649
    47 % \code*{<code>}
    48 % Use the listings package to format a snipit of <code>.
    49 \newrobustcmd*\codeCFA[1]{\lstinline[language=CFA]{#1}}
    50 \newrobustcmd*\codeC[1]{\lstinline[language=C]{#1}}
    51 \newrobustcmd*\codeCpp[1]{\lstinline[language=C++]{#1}}
    52 \newrobustcmd*\codePy[1]{\lstinline[language=Python]{#1}}
     50% \code{<language>}{<code>}
     51% Use the listings package to format the snipit of <code> in <language>.
     52\newrobustcmd*\code[2]{\lstinline[language=#1]{#2}}
    5353
     54% \begin{cfa}[<options>]
     55% \end{cfa}
    5456% Use the listings package to format a block of CFA code.
    5557% Extra listings options can be passed in as an optional argument.
  • doc/theses/andrew_beach_MMath/features.tex

    r58d471f rfc1347d0  
    2424
    2525Some well known examples include the @throw@ statements of \Cpp and Java and
    26 the \codePy{raise} statement from Python. In real systems a raise may preform
    27 some other work (such as memory management) but for the purposes of this
    28 overview that can be ignored.
     26the \code{Python}{raise} statement from Python. In real systems a raise may
     27preform some other work (such as memory management) but for the
     28purposes of this overview that can be ignored.
    2929
    3030\subparagraph{Handle}
     
    9393A handler labelled with any given exception can handle exceptions of that
    9494type or any child type of that exception. The root of the exception hierarchy
    95 (here \codeC{exception}) acts as a catch-all, leaf types catch single types
     95(here \code{C}{exception}) acts as a catch-all, leaf types catch single types
    9696and the exceptions in the middle can be used to catch different groups of
    9797related exceptions.
     
    182182While much of the virtual infrastructure is created, it is currently only used
    183183internally for exception handling. The only user-level feature is the virtual
    184 cast, which is the same as the \Cpp \codeCpp{dynamic_cast}.
     184cast, which is the same as the \Cpp \code{C++}{dynamic_cast}.
    185185\label{p:VirtualCast}
    186186\begin{cfa}
  • doc/theses/andrew_beach_MMath/implement.tex

    r58d471f rfc1347d0  
    1515\subsection{Virtual Type}
    1616Virtual types only have one change to their structure: the addition of a
    17 pointer to the virtual table, called the \emph{virtual-table pointer}.
    18 Internally, the field is called
    19 @virtual_table@.
    20 This constant pointer is always the first field of the table so when
    21 casting to a supertype, the field's location is always known.
    22 The field is initialized as part of all generated constructors.
    23 \todo{They only come as part exceptions and don't work.}
    24 %After the object is created the field is constant.
    25 Dereferencing it gives the virtual table and access to the
    26 type's virtual members.
     17pointer to the virtual table, which is called the \emph{virtual-table pointer}.
     18Internally, the field is called @virtual_table@.
     19The field is fixed after construction. It is always the first field in the
     20structure so that its location is always known.
     21\todo{Talk about constructors for virtual types (after they are working).}
     22
     23This is what binds an instance of a virtual type to a virtual table. This
     24pointer can be used as an identity check. It can also be used to access the
     25virtual table and the virtual members there.
     26
     27\subsection{Type Id}
     28Every virtual type has a unique id.
     29Type ids can be compared for equality (the types reperented are the same)
     30or used to access the type's type information.
     31The type information currently is only the parent's type id or, if the
     32type has no parent, zero.
     33
     34The id's are implemented as pointers to the type's type information instance.
     35Derefencing the pointer gets the type information.
     36By going back-and-forth between the type id and
     37the type info one can find every ancestor of a virtual type.
     38It also pushes the issue of creating a unique value (for
     39the type id) to the problem of creating a unique instance (for type
     40information) which the linker can solve.
     41
     42Advanced linker support is required because there is no place that appears
     43only once to attach the type information to. There should be one structure
     44definition but it is included in multiple translation units. Each virtual
     45table definition should be unique but there are an arbitrary number of thoses.
     46So the special section prefix \texttt{.gnu.linkonce} is used.
     47With a unique suffix (making the entire section name unique) the linker will
     48remove multiple definition making sure only one version exists after linking.
     49Then it is just a matter of making sure there is a unique name for each type.
     50
     51This is done in three phases.
     52The first phase is to generate a new structure definition to store the type
     53information. The layout is the same in each case, just the parent's type id,
     54but the types are changed.
     55The structure's name is change, it is based off the virtual type's name, and
     56the type of the parent's type id.
     57If the virtual type is polymorphic then the type information structure is
     58polymorphic as well, with the same polymorphic arguments.
     59
     60The second phase is to generate an instance of the type information with a
     61almost unique name, generated by mangling the virtual type name.
     62
     63The third phase is implicit with \CFA's overloading scheme. \CFA mangles
     64names with type information so that all of the symbols exported to the linker
     65are unique even if in \CFA code they are the same. Having two declarations
     66with the same name and same type is forbidden because it is impossible for
     67overload resolution to pick between them. This is why a unique type is
     68generated for each virtual type.
     69Polymorphic information is included in this mangling so polymorphic
     70types will have seperate instances for each set of polymorphic arguments.
     71
     72\begin{cfa}
     73struct /* type name */ {
     74        /* parent type name */ const * parent;
     75};
     76
     77__attribute__((section(".gnu.linkonce./* instance name */")))
     78/* type name */ const /* instance name */ = {
     79        &/* parent instance name */,
     80};
     81\end{cfa}
    2782
    2883\subsection{Virtual Table}
    29 % PAB: These 2 paragraphs are repeated below, and maybe some of the paragraph above, too.
    30 \begin{comment}
    31 Every time a virtual type is defined, a new virtual table-type is
    32 instantiated.
    33 The uniqueness of the virtual-table
    34 instance is important because its address
    35 is used as the identifier for the virtual type. Hence, a pointer to the
    36 virtual table and the ID for the virtual type are interchangeable.
    37 \todo{Unique instances might be going so we will have to talk about the new
    38 system instead.}
    39 
    40 The first step is creating the virtual-table type.
    41 The virtual-table type is a structure and is described in terms of
    42 its fields. The first field contains the parent-type ID (or a pointer to
    43 the parent virtual-table) or 0 (null pointer).
    44 Next are repeated fields from on the parent virtual-table.
    45 Finally, the fields used to store any new virtual members of the new
    46 the virtual type.
    47 \end{comment}
    48 
    49 %The virtual system is accessed through a private constant field inserted at the
    50 %beginning of every virtual type. This field
    51 The virtual-table pointer
    52 points at a type's virtual table (see Figure~\vref{f:VirtualTableLayout}).
    53 %and is assigned during the object's
    54 %construction.
    55 The address of a virtual table acts as the unique identifier for
    56 the virtual type, and the first field of a virtual table is a pointer to the
    57 parent virtual-table or @0p@ (null pointer). The remaining fields are duplicated from the
    58 parent tables in this type's inheritance chain, followed by any fields this type
    59 introduces. Parent fields are duplicated so they can be changed, \ie all virtual
    60 members are overridable, while the parent pointer allows access to the original values.
    61 Hence, references to the dispatched type
    62 are replaced with the current virtual type.
    63 % These are always taken by pointer or reference.
     84Each virtual type has a virtual table type that stores its type id and
     85virtual members.
     86Each virtual type instance is bound to a table instance that is filled with
     87the values of virtual members.
     88Both the layout of the fields and their value are decided by the rules given
     89below.
     90
     91The layout always comes in three parts.
     92The first section is just the type id at the head of the table. It is always
     93there to ensure that
     94The second section are all the virtual members of the parent, in the same
     95order as they appear in the parent's virtual table. Note that the type may
     96change slightly as references to the ``this" will change. This is limited to
     97inside pointers/references and via function pointers so that the size (and
     98hence the offsets) are the same.
     99The third section is similar to the second except that it is the new virtual
     100members introduced at this level in the hierarchy.
    64101
    65102\begin{figure}
    66 % Simple ascii diragram:
    67103\begin{cfa}
    68 parent_pointer  // \C{parent pointer to access its fields}
    69 parent_field0   // \C{same layout as parent to allow replacement}
     104type_id
     105parent_field0
    70106...
    71107parent_fieldN
    72 child_field0    // \C{new types for this virtual table}
     108child_field0
    73109...
    74110child_fieldN
    75 size
    76 alignment
    77111\end{cfa}
    78 %\todo{Refine the diagram}
    79112\caption{Virtual Table Layout}
    80113\label{f:VirtualTableLayout}
     114\todo*{Improve the Virtual Table Layout diagram.}
    81115\end{figure}
    82116
    83 % For each virtual type, a virtual table is constructed. This is both a new type
    84 % and an instance of that type. Other instances of the type could be created
    85 % but the system doesn't use them. So this section will go over the creation of
    86 % the type and the instance.
    87 
    88 \begin{comment}
    89 PAB: seems to be said already.
    90 A virtual table is created when a virtual type is created. The name of the
    91 type is created by mangling the name of the base type. The name of the instance
    92 is also generated by name mangling. The fields are initialized automatically.
    93 The parent field is initialized by getting the type of the parent field and
    94 using that to calculate the mangled name of the parent's virtual-table type.
    95 \end{comment}
    96 There are two special fields that are included like normal fields but have
    97 special initialization rules: the @size@ field is the type's size and is
    98 initialized with a @sizeof@ expression, the @align@ field is the type's
    99 alignment and uses an @alignof@ expression. The remaining fields are resolved
    100 to a name matching the field's name and type using the normal visibility and
    101 overload resolution rules of the type system.
    102 
    103 These operations are split up into several groups depending on where they take
    104 place, which varies for monomorphic and polymorphic types. The first devision is
    105 between the declarations and the definitions. Declarations, such as a function
    106 signature or an aggregate's name, must always be visible but may be repeated in
    107 the form of forward declarations in headers. Definitions, such as function
    108 bodies and a aggregate's layout, can be separately compiled but must occur
    109 exactly once in a source file.
    110 
    111 The declarations include the virtual-type definition and forward declarations
    112 of the virtual-table instance, constructor, message function and
    113 @get_exception_vtable@. The definition includes the storage and initialization
    114 of the virtual table instance and the bodies of the three functions.
    115 
    116 Monomorphic instances put all of these two groups in one place.
    117 Polymorphic instances split out the core declarations and definitions from
    118 the per-instance information. The virtual-table type and most of the functions
    119 are polymorphic so they are all part of the core. The virtual-table instance
    120 and the @get_exception_vtable@ function \PAB{ are ...}.
    121 
     117The first and second sections together mean that every virtual table has a
     118prefix that has the same layout and types as its parent virtual table.
     119This, combined with the fixed offset to the virtual table pointer, means that
     120for any virtual type it doesn't matter if we have it or any of its
     121descendants, it is still always safe to access the virtual table through
     122the virtual table pointer.
     123From there it is safe to check the type id to identify the exact type of the
     124underlying object, access any of the virtual members and pass the object to
     125any of the method-like virtual members.
     126\todo{Introduce method-like virtual members.}
     127
     128When a virtual table is declared the user decides where to declare it and its
     129name. The initialization of the virtual table is entirely automatic based on
     130the context of the declaration.
     131
     132The type id is always fixed, each virtual table type will always have one
     133exactly one possible type id.
     134The virtual members are usually filled in by resolution. The best match for
     135a given name and type at the declaration site is filled in.
     136There are two exceptions to that rule: the @size@ field is the type's size
     137and is set to the result of a @sizeof@ expression, the @align@ field is the
     138type's alignment and similarly uses an @alignof@ expression.
     139
     140\subsubsection{Concurrency Integration}
    122141Coroutines and threads need instances of @CoroutineCancelled@ and
    123142@ThreadCancelled@ respectively to use all of their functionality. When a new
    124 data type is declared with @coroutine@ or @thread@, the forward declaration for
     143data type is declared with @coroutine@ or @thread@ the forward declaration for
    125144the instance is created as well. The definition of the virtual table is created
    126145at the definition of the main function.
    127 
    128 \PAB{You need an example here to show what happens for this case.}
    129 
     146\todo{Add an example with code snipits.}
    130147
    131148\subsection{Virtual Cast}
     
    134151% The C-cast is just to make sure the generated code is correct so the rest of
    135152% the section is about that function.
    136 The function is
     153The function is implemented in the standard library and has the following
     154signature:
    137155\begin{cfa}
    138 void * __cfa__virtual_cast( struct __cfa__parent_vtable const * parent,
     156void * __cfa__virtual_cast(
     157        struct __cfa__parent_vtable const * parent,
    139158        struct __cfa__parent_vtable const * const * child );
    140159\end{cfa}
    141 and it is implemented in the standard library. The structure represents the
    142 head of a virtual table, which is the pointer to the parent virtual table. The
    143 @parent@ points directly at the parent-type virtual-table, while the @child@
    144 points at the object of the (possible) child type.
    145 
    146 \PAB{Need a figure to show this relationship.}
    147 
    148 In terms of the virtual-cast expression, @parent@ comes from looking up the
    149 type being cast to and @child@ is the result of the expression being cast.
    150 Because the complier outputs C code, some C-type casts are also used.
    151 The last bit of glue is a map that saves every virtual type the compiler
    152 sees. This table is used to check the type used in a virtual cast is a virtual
    153 type and to get its virtual table.
    154 (It also checks for conflicting definitions.)
    155 
    156 \PAB{Can this be rolled into the figure above?}
    157 
    158 Inside the function is a simple conditional. If the type represented by
    159 @parent@ is an ancestor of the type represented by @*child@ (it
    160 requires one more level of dereference to pass through the object) then @child@
    161 is returned, otherwise the null pointer is returned.
    162 
    163 The check is a simple linear search (like \Cpp RTTI). If the child
    164 virtual table or any of its ancestors (which are retrieved through the first
    165 field of every virtual table) are the same as the parent virtual-table then
    166 the cast succeeds.
     160\todo{Get rid of \_\_cfa\_\_parent\_vtable in the standard library and then
     161the document.}
     162The type id of target type of the virtual cast is passed in as @parent@ and
     163the cast target is passed in as @child@.
     164
     165For C generation both arguments and the result are wrapped with type casts.
     166There is also an internal store inside the compiler to make sure that the
     167target type is a virtual type.
     168% It also checks for conflicting definitions.
     169
     170The virtual cast either returns the original pointer as a new type or null.
     171So the function just does the parent check and returns the approprate value.
     172The parent check is a simple linear search of child's ancestors using the
     173type information.
    167174
    168175\section{Exceptions}
     
    174181% resumption doesn't as well.
    175182
    176 % Many modern languages work with an internal stack that function push and pop
     183% Many modern languages work with an interal stack that function push and pop
    177184% their local data to. Stack unwinding removes large sections of the stack,
    178185% often across functions.
    179186
    180187Stack unwinding is the process of removing stack frames (activations) from the
    181 stack. On function entry and return, unwinding is handled directly by the call/return code
    182 embedded in a function. Usually, the stack-frame size is known statically
    183 based on parameter and local variable declarations. For dynamically-sized
    184 local variables.
    185 (Often called a variable-length array or VLA, even when the variable type is an aggregate.)
    186 For VLAs, a runtime computation is necessary to know the frame
    187 size. Finally, a function's frame-size may change during execution as local
    188 variables (static or dynamic sized) go in and out of scope, which is a form of VLA.
     188stack. On function entry and return, unwinding is handled directly by the
     189call/return code embedded in the function.
     190In many cases the position of the instruction pointer (relative to parameter
     191and local declarations) is enough to know the current size of the stack
     192frame.
     193
     194Usually, the stack-frame size is known statically based on parameter and
     195local variable declarations. Even with dynamic stack-size the information
     196to determain how much of the stack has to be removed is still contained
     197within the function.
    189198Allocating/deallocating stack space is usually an $O(1)$ operation achieved by
    190199bumping the hardware stack-pointer up or down as needed.
    191 
    192 Unwinding across multiple stack frames is more complex because individual stack-management
    193 code associated with each frame can be bypassed. That is, the location
    194 of a function's frame-management code is largely unknown and dispersed
    195 throughout the function, hence the current frame size managed by that code is
    196 also unknown. Hence, code unwinding across frames does not have direct
    197 knowledge about what is on the stack, and hence, how much of the stack needs to
    198 be removed.
    199 
    200 % At a very basic level this can be done with @setjmp@ \& @longjmp@ which simply
    201 % move the top of the stack, discarding everything on the stack above a certain
    202 % point. However this ignores all the cleanup code that should be run when
    203 % certain sections of the stack are removed (for \CFA these are from destructors
    204 % and finally clauses) and also requires that the point to which the stack is
    205 % being unwound is known ahead of time. libunwind is used to address both of
    206 % these problems.
     200Constructing/destructing values on the stack takes longer put in terms of
     201figuring out what needs to be done is of similar complexity.
     202
     203Unwinding across multiple stack frames is more complex because that
     204information is no longer contained within the current function.
     205With seperate compilation a function has no way of knowing what its callers
     206are so it can't know how large those frames are.
     207Without altering the main code path it is also hard to pass that work off
     208to the caller.
    207209
    208210The traditional unwinding mechanism for C is implemented by saving a snap-shot
     
    211213reseting to a snap-shot of an arbitrary but existing function frame on the
    212214stack. It is up to the programmer to ensure the snap-shot is valid when it is
    213 reset and that unwound frames do not have side-effects.
    214 Hence, this unwinding approach is fragile with potential errors that are
    215 difficult to debug because the stack becomes corrupted.
    216 
    217 With respect to stack side-effects, many languages define cleanup actions that must be taken when objects
    218 are deallocated from the stack, when the function of blocks within the function end, such as running a variable's
    219 destructor or a @try@ statement's @finally@ clause.
    220 The purpose of these side-effects is to reestablish the global state of the program, such as dynamic memory-allocation or file access.
    221 Handling these side-effect mechanisms
    222 requires walking the stack and checking each stack frame for these potential
    223 actions, where a frame can be any block with declarations.
    224 
    225 In languages like \Cpp and Java, it must be possible to walk the stack frames in search of @try@
    226 statements to match and execute a handler. For termination exceptions, it must
    227 also be possible to unwind all stack frames from the throw to the matching
    228 catch (including the @try@ block), and each of these frames must be checked for cleanup actions. Stack
    229 walking is where most of the complexity and expense of exception handling
    230 appears.
     215reset and that all required clean-up from the unwound stacks is preformed.
     216This approach is fragile and forces a work onto the surounding code.
     217
     218With respect to that work forced onto the surounding code,
     219many languages define clean-up actions that must be taken when certain
     220sections of the stack are removed. Such as when the storage for a variable
     221is removed from the stack or when a try statement with a finally clause is
     222(conceptually) popped from the stack.
     223None of these should be handled by the user, that would contradict the
     224intention of these features, so they need to be handled automatically.
     225
     226To safely remove sections of the stack the language must be able to find and
     227run these clean-up actions even when removing multiple functions unknown at
     228the beginning of the unwinding.
    231229
    232230One of the most popular tools for stack management is libunwind, a low-level
     
    261259
    262260The GCC compilation flag @-fexceptions@ causes the generation of an LSDA and
    263 attaches its personality function.
    264 It attaches a series of opaque directives (@.cfi_personality@ directive)
    265 used internally and not part of this work.
    266 However, this
     261attaches a personality function to each function.
     262In plain C (which \CFA currently compiles down to) this
    267263flag only handles the cleanup attribute:
    268264\begin{cfa}
     
    270266int avar __attribute__(( cleanup(clean_up) ));
    271267\end{cfa}
    272 that is used on a variable and specifies a function, in this case @clean_up@,
    273 run when the variable goes out of scope, which is used to mimic destructors.
    274 However, this feature cannot be used to mimic @try@ statements as it cannot
    275 control the unwinding.
     268The attribue is used on a variable and specifies a function,
     269in this case @clean_up@, run when the variable goes out of scope.
     270This is enough to mimic destructors, but not try statements which can effect
     271the unwinding.
     272
     273To get full unwinding support all of this has to be done directly with
     274assembly and assembler directives. Partiularly the cfi directives
     275\texttt{.cfi\_lsda} and \texttt{.cfi\_personality}.
    276276
    277277\subsection{Personality Functions}
     
    279279section covers some of the important parts of the interface.
    280280
    281 A personality function can perform different actions depending on how it is
     281A personality function can preform different actions depending on how it is
    282282called.
    283283\begin{lstlisting}[language=C,{moredelim=**[is][\color{red}]{@}{@}}]
     
    313313@_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs
    314314the cleanup phase and uses a different means to decide when to stop
    315 (see Section~\vref{s:ForcedUnwind}).
     315(see \vref{s:ForcedUnwind}).
    316316\end{enumerate}
    317317
    318318The @exception_class@ argument is a copy of the
    319 \lstinline[language=C]|exception|'s @exception_class@ field.
    320 \PAB{Say more.}
    321 
    322 The \lstinline[language=C]|exception| argument is a pointer to the user
    323 provided storage object. It has two public fields, the exception class, which
    324 is just a number, identifying the exception handling mechanism that
    325 created it, and the cleanup function. The cleanup function is called if
    326 required by the exception.
     319\code{C}{exception}'s @exception_class@ field.
     320This a number that identifies the exception handling mechanism that created
     321the
     322
     323The \code{C}{exception} argument is a pointer to the user
     324provided storage object. It has two public fields: the @exception_class@,
     325which is described above, and the @exception_cleanup@ function.
     326The clean-up function is used by the EHM to clean-up the exception if it
     327should need to be freed at an unusual time, it takes an argument that says
     328why it had to be cleaned up.
    327329
    328330The @context@ argument is a pointer to an opaque type passed to helper
     
    347349@_URC_END_OF_STACK@.
    348350
    349 Second, when a handler is matched, raise exception walks the stack again performing the cleanup
    350 phase.
     351Second, when a handler is matched, raise exception moves to the clean-up
     352phase and walks the stack a second time.
    351353Once again, it calls the personality functions of each stack frame from newest
    352354to oldest. This pass stops at the stack frame containing the matching handler.
     
    403405Each stack must have its own exception context. In a sequential \CFA program,
    404406there is only one stack with a single global exception-context. However, when
    405 the library @libcfathread@ is linked, there are multiple stacks, where each
     407the library @libcfathread@ is linked, there are multiple stacks and each
    406408needs its own exception context.
    407409
    408 The function @this_exception_context@ provides general access to the exception context.
    409 For sequential execution, this function is defined as
     410The exception context should be retrieved by calling the function
     411@this_exception_context@. For sequential execution, this function is defined as
    410412a weak symbol in the \CFA system-library, @libcfa@. When a \CFA program is
    411413concurrent, it links with @libcfathread@, where this function is defined with a
     
    422424% catches. Talk about GCC nested functions.
    423425
    424 Termination exceptions use libunwind heavily because \CFA termination exceptions match
     426\CFA termination exceptions use libunwind heavily because they match \Cpp
    425427\Cpp exceptions closely. The main complication for \CFA is that the
    426428compiler generates C code, making it very difficult to generate the assembly to
     
    430432The first step of a termination raise is to copy the exception into memory
    431433managed by the exception system. Currently, the system uses @malloc@, rather
    432 than reserved memory or the stack top. The exception-handling mechanism manages
     434than reserved memory or the stack top. The exception handling mechanism manages
    433435memory for the exception as well as memory for libunwind and the system's own
    434436per-exception storage.
     
    446448\label{f:ExceptionLayout}
    447449\end{figure}
    448 
    449 Exceptions are stored in variable-sized blocks (see Figure~\vref{f:ExceptionLayout}).
    450 The first component is a fixed-sized data-structure that contains the
     450\todo*{Convert the exception layout to an actual diagram.}
     451
     452Exceptions are stored in variable-sized blocks (see \vref{f:ExceptionLayout}).
     453The first component is a fixed-sized data structure that contains the
    451454information for libunwind and the exception system. The second component is an
    452455area of memory big enough to store the exception. Macros with pointer arthritic
     
    454457@_Unwind_Exception@ to the entire node.
    455458
    456 Multiple exceptions can exist because handlers can call functions that raise
    457 exceptions.  Figure~\vref{f:MultipleExceptions} shows a \Cpp program where
    458 exceptions are handled, and then a function is called from the handler that
    459 raises a new exception. The previous exception must persist because it is
    460 unhandled, and hence, control can return to the handler and that exception is
    461 reraised.
     459Multipe exceptions can exist at the same time because exceptions can be
     460raised inside handlers, destructors and finally blocks.
     461Figure~\vref{f:MultipleExceptions} shows a program that has multiple
     462exceptions active at one time.
     463Each time an exception is thrown and caught the stack unwinds and the finally
     464clause runs. This will throw another exception (until @num_exceptions@ gets
     465high enough) which must be allocated. The previous exceptions may not be
     466freed because the handler/catch clause has not been run.
     467So the EHM must keep them alive while it allocates exceptions for new throws.
    462468
    463469\begin{figure}
    464470\centering
     471% Andrew: Figure out what these do and give them better names.
    465472\newsavebox{\myboxA}
    466473\newsavebox{\myboxB}
    467474\begin{lrbox}{\myboxA}
    468 \begin{lstlisting}[language=C++,{moredelim=**[is][\color{red}]{@}{@}}]
    469 struct E {};
    470 int cnt = 3;
    471 void f( int i ) {
    472         if ( i == 0 ) @throw E();@
    473         try {
    474                 @f( i - 1 );@
    475         } catch( E ) { // handler h
    476                 cnt -= 1;
    477                 if ( cnt > 0 ) @f( 2 );@
    478         }
     475\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
     476unsigned num_exceptions = 0;
     477void throws() {
     478    try {
     479        try {
     480            ++num_exceptions;
     481            throw (Example){table};
     482        } finally {
     483            if (num_exceptions < 3) {
     484                throws();
     485            }
     486        }
     487    } catch (exception_t *) {
     488        --num_exceptions;
     489    }
    479490}
    480 int main() { @f( 2 );@ }
     491int main() {
     492    throws();
     493}
    481494\end{lstlisting}
    482495\end{lrbox}
     
    484497\begin{lrbox}{\myboxB}
    485498\begin{lstlisting}
    486 h  $\makebox[0pt][l]{\textbackslash}f$
    487    f
    488    f
    489 h  $\makebox[0pt][l]{\textbackslash}f$  throw E$\(_2\)$
    490    f
    491    f
    492 h  $\makebox[0pt][l]{\textbackslash}f$  throw E$\(_1\)$
    493    f
    494    f
    495499\end{lstlisting}
    496500\end{lrbox}
     
    503507\label{f:MultipleExceptions}
    504508\end{figure}
    505 
    506 In this case, the exception nodes are linked together in a list, one list per stack, with the
     509\todo*{Work on multiple exceptions code sample.}
     510
     511All exceptions are stored in nodes which are then linked together in lists,
     512one list per stack, with the
    507513list head stored in the exception context. Within each linked list, the most
    508514recently thrown exception is at the head followed by older thrown
     
    515521exception, the copy function, and the free function, so they are specific to an
    516522exception type. The size and copy function are used immediately to copy an
    517 exception into managed memory. After the exception is handled, the free function
    518 is used to clean up the exception and then the entire node is passed to free
    519 so the memory can be given back to the heap.
     523exception into managed memory. After the exception is handled, the free
     524function is used to clean up the exception and then the entire node is
     525passed to free so the memory can be given back to the heap.
    520526
    521527\subsection{Try Statements and Catch Clauses}
    522528The try statement with termination handlers is complex because it must
    523 compensate for the lack of assembly code generated from \CFA. Libunwind
     529compensate for the lack of assembly-code generated from \CFA. Libunwind
    524530requires an LSDA and personality function for control to unwind across a
    525531function. The LSDA in particular is hard to mimic in generated C code.
     
    530536calls them.
    531537Because this function is known and fixed (and not an arbitrary function that
    532 happens to contain a try statement), this means the LSDA can be generated ahead
     538happens to contain a try statement), the LSDA can be generated ahead
    533539of time.
    534540
    535541Both the LSDA and the personality function are set ahead of time using
    536 embedded assembly. This assembly code is handcrafted using C @asm@ statements and contains
    537 enough information for the single try statement the function represents.
     542embedded assembly. This assembly code is handcrafted using C @asm@ statements
     543and contains
     544enough information for the single try statement the function repersents.
    538545
    539546The three functions passed to try terminate are:
     
    563570nested functions and all other functions besides @__cfaehm_try_terminate@ in
    564571\CFA use the GCC personality function and the @-fexceptions@ flag to generate
    565 the LSDA. Through this mechanism, \CFA destructors are implemented via the cleanup attribute.
    566 
    567 \PAB{Try to put together an example try statement illustrating these components.}
     572the LSDA.
     573Using this pattern, \CFA implements destructors with the cleanup attribute.
     574\todo{Add an example of the conversion from try statement to functions.}
    568575
    569576\section{Resumption}
    570577% The stack-local data, the linked list of nodes.
    571578
    572 Resumption is simpler to implement than termination because there is no stack
    573 unwinding.  \PAB{You need to explain how the \lstinline{catchResume} clauses are
    574 handled. Do you use the personality mechanism in libunwind or do you roll your
    575 own mechanism?}
    576 
    577 The
    578 resumption raise uses a list of nodes for its stack traversal. The head of the
    579 list is stored in the exception context. The nodes in the list have a pointer
    580 to the next node and a pointer to the handler function.
    581 A resumption raise traverses this list. At each node the handler function is
    582 called, passing the exception by pointer. It returns true if the exception is
    583 handled and false otherwise.
    584 
    585 The handler function does both the matching and handling. It computes the
    586 condition of each @catchResume@ in top-to-bottom order, until it finds a
    587 handler that matches. If no handler matches then the function returns
    588 false. Otherwise the matching handler is run; if it completes successfully, the
    589 function returns true. Rethrowing, through the @throwResume;@ statement,
    590 causes the function to return true.
     579Resumption simpler to implement than termination
     580because there is no stack unwinding.
     581Instead of storing the data in a special area using assembly,
     582there is just a linked list of possible handlers for each stack,
     583with each node on the list reperenting a try statement on the stack.
     584
     585The head of the list is stored in the exception context.
     586The nodes are stored in order, with the more recent try statements closer
     587to the head of the list.
     588Instead of traversing the stack resumption handling traverses the list.
     589At each node the EHM checks to see if the try statement the node repersents
     590can handle the exception. If it can, then the exception is handled and
     591the operation finishes, otherwise the search continues to the next node.
     592If the search reaches the end of the list without finding a try statement
     593that can handle the exception the default handler is executed and the
     594operation finishes.
     595
     596In each node is a handler function which does most of the work there.
     597The handler function is passed the raised the exception and returns true
     598if the exception is handled and false if it cannot be handled here.
     599
     600For each @catchResume@ clause the handler function will:
     601check to see if the raised exception is a descendant type of the declared
     602exception type, if it is and there is a conditional expression then it will
     603run the test, if both checks pass the handling code for the clause is run
     604and the function returns true, otherwise it moves onto the next clause.
     605If this is the last @catchResume@ clause then instead of moving onto
     606the next clause the function returns false as no handler could be found.
     607
     608\todo{Diagram showing a try statement being converted into resumption handlers.}
    591609
    592610% Recursive Resumption Stuff:
     
    603621the other handler checked up to this point are not checked again.
    604622
    605 This structure also supports new handlers added while the resumption is being
     623This structure also supports new handler added while the resumption is being
    606624handled. These are added to the front of the list, pointing back along the
    607625stack -- the first one points over all the checked handlers -- and the ordering
    608626is maintained.
    609 
    610 \PAB{Again, a figure to show how this works would be helpful.}
     627\todo{Add a diagram for resumption marking.}
    611628
    612629\label{p:zero-cost}
     
    623640% that unwind is required knowledge for that chapter.
    624641
    625 \PAB{This paragraph needs to be moved to the start of this Section, where I have have my other comment.}
    626 
    627642\section{Finally}
    628643% Uses destructors and GCC nested functions.
    629 A finally clause is placed into a GCC nested-function with a unique mangled name, and no
    630 arguments or return values. This nested function is then set as the cleanup
     644A finally clause is placed into a GCC nested-function with a unique name,
     645and no arguments or return values.
     646This nested function is then set as the cleanup
    631647function of an empty object that is declared at the beginning of a block placed
    632 around the context of an associated @try@ statement.
     648around the context of the associated @try@ statement.
    633649
    634650The rest is handled by GCC. The try block and all handlers are inside this
     
    640656
    641657Cancellation also uses libunwind to do its stack traversal and unwinding,
    642 however it uses a different primary function, @_Unwind_ForcedUnwind@. Details
    643 of its interface can be found in Section~\vref{s:ForcedUnwind}.
     658however it uses a different primary function: @_Unwind_ForcedUnwind@. Details
     659of its interface can be found in the Section~\vref{s:ForcedUnwind}.
    644660
    645661The first step of cancellation is to find the cancelled stack and its type:
    646 coroutine or thread. Fortunately, the thread library stores the program-main thread
    647 pointer and the current-thread pointer, and every thread stores a pointer to
    648 the current coroutine it is executing.
    649 
    650 \PAB{I don't know if my corrections in the previous paragraph are correct.}
    651 
    652 When the active thread and coroutine are the same, the current stack is the thread stack, otherwise it is a coroutine
    653 stack.
    654 % PAB: repeated?
    655 % If it is a thread stack, then an equality check with the stored main
    656 % thread pointer and current thread pointer is enough to tell if the current
    657 % thread is the main thread or not.
     662coroutine or thread. Fortunately, the thread library stores the main thread
     663pointer and the current thread pointer, and every thread stores a pointer to
     664its main coroutine and the coroutine it is currently executing.
     665\todo*{Consider adding a description of how threads are coroutines.}
     666
     667If a the current thread's main and current coroutines are the same then the
     668current stack is a thread stack. Furthermore it is easy to compare the
     669current thread to the main thread to see if they are the same. And if this
     670is not a thread stack then it must be a coroutine stack.
     671
    658672However, if the threading library is not linked, the sequential execution is on
    659673the main stack. Hence, the entire check is skipped because the weak-symbol
     
    663677Regardless of how the stack is chosen, the stop function and parameter are
    664678passed to the forced-unwind function. The general pattern of all three stop
    665 functions is the same: continue unwinding until the end of stack.
    666 %when they
    667 %do there primary work.
     679functions is the same: they continue unwinding until the end of stack and
     680then preform their transfer.
     681
    668682For main stack cancellation, the transfer is just a program abort.
    669683
    670 For coroutine cancellation, the exception is stored in the coroutine's stack,
     684For coroutine cancellation, the exception is stored on the coroutine's stack,
    671685and the coroutine context switches to its last resumer. The rest is handled on
    672686the backside of the resume, which check if the resumed coroutine is
  • doc/theses/andrew_beach_MMath/uw-ethesis.tex

    r58d471f rfc1347d0  
    9393% Removes large sections of the document.
    9494\usepackage{comment}
    95 % Adds todos (Must be included after comment.)
    96 \usepackage{todonotes}
     95% Adds todo commands.
     96\usepackage{todo}
    9797% cfa macros used in the document
    9898\usepackage{cfalab}
     
    145145
    146146% Exception to the rule of hyperref being the last add-on package
    147 \usepackage[automake,toc,abbreviations]{glossaries-extra}
     147\usepackage[toc,abbreviations]{glossaries-extra}
    148148% If glossaries-extra is not in your LaTeX distribution, get it from CTAN
    149149% (http://ctan.org/pkg/glossaries-extra), although it's supposed to be in
     
    235235% Tip: Putting each sentence on a new line is a way to simplify later editing.
    236236%----------------------------------------------------------------------
     237\input{intro}
    237238\input{existing}
    238239\input{features}
    239240\input{implement}
    240 %\input{unwinding}
    241241\input{future}
    242242
     
    298298\phantomsection         % allows hyperref to link to the correct page
    299299
     300\todos
     301
    300302%----------------------------------------------------------------------
    301303\end{document} % end of logical document
Note: See TracChangeset for help on using the changeset viewer.