Changeset ba2e8f0


Ignore:
Timestamp:
Aug 24, 2021, 4:41:53 PM (5 months ago)
Author:
Andrew Beach <ajbeach@…>
Branches:
jacob/cs343-translation, master
Children:
e8bad5c8
Parents:
9cd37d9
Message:

Andrew MMath: Folded in Peter's updates to the implement chapter.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/andrew_beach_MMath/implement.tex

    r9cd37d9 rba2e8f0  
    2020
    2121\subsection{Virtual Type}
    22 Virtual types only have one change to their structure: the addition of a
    23 pointer to the virtual table, which is called the \emph{virtual-table pointer}.
    24 Internally, the field is called \snake{virtual_table}.
    25 The field is fixed after construction. It is always the first field in the
     22A virtual type~(see \autoref{s:virtuals}) has a pointer to a virtual table,
     23called the \emph{virtual-table pointer},
     24which binds each instance of a virtual type to a virtual table.
     25Internally, the field is called \snake{virtual_table}
     26and is fixed after construction.
     27This pointer is also the table's id and how the system accesses the
     28virtual table and the virtual members there.
     29It is always the first field in the
    2630structure so that its location is always known.
    27 \todo{Talk about constructors for virtual types (after they are working).}
    28 
    29 The virtual table pointer binds an instance of a virtual type
    30 to a virtual table.
    31 The pointer is also the table's id and how the system accesses the
    32 virtual table and the virtual members there.
     31
     32% We have no special rules for these constructors.
     33Virtual table pointers are passed to the constructors of virtual types
     34as part of field-by-field construction.
    3335
    3436\subsection{Type Id}
    3537Every virtual type has a unique id.
    36 Type ids can be compared for equality,
    37 which checks if the types reperented are the same,
    38 or used to access the type's type information.
     38These are used in type equality, to check if the representation of two values
     39are the same, and to access the type's type information.
     40This uniqueness means across a program composed of multiple translation
     41units (TU), not uniqueness across all programs or even across multiple
     42processes on the same machine.
     43
     44Our approach for program uniqueness is using a static declaration for each
     45type id, where the run-time storage address of that variable is guaranteed to
     46be unique during program execution.
     47The type id storage can also be used for other purposes,
     48and is used for type information.
     49
     50The problem is that a type id may appear in multiple TUs that compose a
     51program (see \autoref{ss:VirtualTable}); so the initial solution would seem
     52to be make it external in each translation unit. Honever, the type id must
     53have a declaration in (exactly) one of the TUs to create the storage.
     54No other declaration related to the virtual type has this property, so doing
     55this through standard C declarations would require the user to do it manually.
     56
     57Instead the linker is used to handle this problem.
     58% I did not base anything off of C++17; they are solving the same problem.
     59A new feature has been added to \CFA for this purpose, the special attribute
     60\snake{cfa_linkonce}, which uses the special section @.gnu.linkonce@.
     61When used as a prefix (\eg @.gnu.linkonce.example@) the linker does
     62not combine these sections, but instead discards all but one with the same
     63full name.
     64
     65So each type id must be given a unique section name with the linkonce
     66prefix. Luckily \CFA already has a way to get unique names, the name mangler.
     67For example, this could be written directly in \CFA:
     68\begin{cfa}
     69__attribute__((cfa_linkonce)) void f() {}
     70\end{cfa}
     71This is translated to:
     72\begin{cfa}
     73__attribute__((section(".gnu.linkonce._X1fFv___1"))) void _X1fFv___1() {}
     74\end{cfa}
     75This is done internally to access the name manglers.
     76This attribute is useful for other purposes, any other place a unique
     77instance required, and should eventually be made part of a public and
     78stable feature in \CFA.
     79
     80\subsection{Type Information}
     81
     82There is data stored at the type id's declaration, the type information.
    3983The type information currently is only the parent's type id or, if the
    4084type has no parent, the null pointer.
    41 
    42 The id's are implemented as pointers to the type's type information instance.
    43 Dereferencing the pointer gets the type information.
    4485The ancestors of a virtual type are found by traversing type ids through
    4586the type information.
    46 The information pushes the issue of creating a unique value (for
    47 the type id) to the problem of creating a unique instance (for type
    48 information), which the linker can solve.
    49 
    50 The advanced linker support is used here to avoid having to create
    51 a new declaration to attach this data to.
    52 With C/\CFA's header/implementation file divide for something to appear
    53 exactly once it must come from a declaration that appears in exactly one
    54 implementation file; the declarations in header files may exist only once
    55 they can be included in many different translation units.
    56 Therefore, structure's declaration will not work.
    57 Neither will attaching the type information to the virtual table -- although
    58 a vtable declarations are in implemention files they are not unique, see
    59 \autoref{ss:VirtualTable}.
    60 Instead the same type information is generated multiple times and then
    61 the new attribute \snake{cfa_linkone} is used to removed duplicates.
     87An example using helper macros looks like:
     88\begin{cfa}
     89struct INFO_TYPE(TYPE) {
     90        INFO_TYPE(PARENT) const * parent;
     91};
     92
     93__attribute__((cfa_linkonce))
     94INFO_TYPE(TYPE) const INFO_NAME(TYPE) = {
     95        &INFO_NAME(PARENT),
     96};
     97\end{cfa}
    6298
    6399Type information is constructed as follows:
    64100\begin{enumerate}
    65101\item
    66 Use the type's name to generate a name for the type information structure.
    67 This is saved so it may be reused.
     102Use the type's name to generate a name for the type information structure,
     103which is saved so it can be reused.
    68104\item
    69105Generate a new structure definition to store the type
    70106information. The layout is the same in each case, just the parent's type id,
    71107but the types used change from instance to instance.
    72 The generated name is used for both this structure and, if relivant, the
     108The generated name is used for both this structure and, if relevant, the
    73109parent pointer.
    74110If the virtual type is polymorphic then the type information structure is
    75111polymorphic as well, with the same polymorphic arguments.
    76112\item
    77 A seperate name for instances is generated from the type's name.
     113A separate name for instances is generated from the type's name.
    78114\item
    79 The definition is generated and initialised.
     115The definition is generated and initialized.
    80116The parent id is set to the null pointer or to the address of the parent's
    81117type information instance. Name resolution handles the rest.
    82118\item
    83119\CFA's name mangler does its regular name mangling encoding the type of
    84 the declaration into the instance name. This gives a completely unique name
     120the declaration into the instance name.
     121This process gives a completely unique name
    85122including different instances of the same polymorphic type.
    86123\end{enumerate}
    87 \todo{The list is making me realise, some of this isn't ordered.}
     124\todo{The list is making me realize, some of this isn't ordered.}
    88125
    89126Writing that code manually, with helper macros for the early name mangling,
     
    100137\end{cfa}
    101138
     139\begin{comment}
    102140\subsubsection{\lstinline{cfa\_linkonce} Attribute}
    103 % I just realised: This is an extension of the inline keyword.
     141% I just realized: This is an extension of the inline keyword.
    104142% An extension of C's at least, it is very similar to C++'s.
    105143Another feature added to \CFA is a new attribute: \texttt{cfa\_linkonce}.
     
    126164everything that comes after the special prefix, then only one is used
    127165and the other is discarded.
     166\end{comment}
    128167
    129168\subsection{Virtual Table}
     
    176215type's alignment, is set using an @alignof@ expression.
    177216
    178 \subsubsection{Concurrency Integration}
     217\subsection{Concurrency Integration}
    179218Coroutines and threads need instances of @CoroutineCancelled@ and
    180219@ThreadCancelled@ respectively to use all of their functionality. When a new
     
    183222at the definition of the main function.
    184223
    185 This is showned through code re-writing in
    186 \autoref{f:ConcurrencyTypeTransformation} and
    187 \autoref{f:ConcurrencyMainTransformation}.
    188 In both cases the original declaration is not modified,
     224These transformations are shown through code re-writing in
     225\autoref{f:CoroutineTypeTransformation} and
     226\autoref{f:CoroutineMainTransformation}.
     227Threads use the same pattern, with some names and types changed.
     228In both cases, the original declaration is not modified,
    189229only new ones are added.
    190230
    191 \begin{figure}
     231\begin{figure}[htb]
    192232\begin{cfa}
    193233coroutine Example {
     
    207247extern CoroutineCancelled_vtable & _default_vtable;
    208248\end{cfa}
    209 \caption{Concurrency Type Transformation}
    210 \label{f:ConcurrencyTypeTransformation}
     249\caption{Coroutine Type Transformation}
     250\label{f:CoroutineTypeTransformation}
    211251\end{figure}
    212252
     
    229269        &_default_vtable_object_declaration;
    230270\end{cfa}
    231 \caption{Concurrency Main Transformation}
    232 \label{f:ConcurrencyMainTransformation}
     271\caption{Coroutine Main Transformation}
     272\label{f:CoroutineMainTransformation}
    233273\end{figure}
    234274
     
    242282\begin{cfa}
    243283void * __cfa__virtual_cast(
    244         struct __cfavir_type_td parent,
    245         struct __cfavir_type_id const * child );
    246 \end{cfa}
    247 The type id of target type of the virtual cast is passed in as @parent@ and
     284        struct __cfavir_type_id * parent,
     285        struct __cfavir_type_id * const * child );
     286\end{cfa}
     287The type id for the target type of the virtual cast is passed in as
     288@parent@ and
    248289the cast target is passed in as @child@.
    249 
    250 For generated C code wraps both arguments and the result with type casts.
     290The generated C code wraps both arguments and the result with type casts.
    251291There is also an internal check inside the compiler to make sure that the
    252292target type is a virtual type.
     
    260300
    261301\section{Exceptions}
    262 % Anything about exception construction.
     302\todo{The entire exceptions section.}
    263303
    264304\section{Unwinding}
     
    274314stack. On function entry and return, unwinding is handled directly by the
    275315call/return code embedded in the function.
    276 In many cases, the position of the instruction pointer (relative to parameter
    277 and local declarations) is enough to know the current size of the stack
    278 frame.
    279 
     316
     317% Discussing normal stack unwinding:
    280318Usually, the stack-frame size is known statically based on parameter and
    281 local variable declarations. Even with dynamic stack-size, the information
     319local variable declarations. Even for a dynamic stack-size, the information
    282320to determine how much of the stack has to be removed is still contained
    283321within the function.
     
    285323bumping the hardware stack-pointer up or down as needed.
    286324Constructing/destructing values within a stack frame has
    287 a similar complexity but can add additional work and take longer.
    288 
     325a similar complexity but larger constants.
     326
     327% Discussing multiple frame stack unwinding:
    289328Unwinding across multiple stack frames is more complex because that
    290329information is no longer contained within the current function.
    291 With seperate compilation a function has no way of knowing what its callers
    292 are so it can't know how large those frames are.
    293 Without altering the main code path it is also hard to pass that work off
    294 to the caller.
     330With seperate compilation,
     331a function does not know its callers nor their frame layout.
     332Even using the return address, that information is encoded in terms of
     333actions in code, intermixed with the actions required finish the function.
     334Without changing the main code path it is impossible to select one of those
     335two groups of actions at the return site.
    295336
    296337The traditional unwinding mechanism for C is implemented by saving a snap-shot
     
    302343This approach is fragile and requires extra work in the surrounding code.
    303344
    304 With respect to the extra work in the surounding code,
     345With respect to the extra work in the surrounding code,
    305346many languages define clean-up actions that must be taken when certain
    306347sections of the stack are removed. Such as when the storage for a variable
    307 is removed from the stack or when a try statement with a finally clause is
     348is removed from the stack, possibly requiring a destructor call,
     349or when a try statement with a finally clause is
    308350(conceptually) popped from the stack.
    309 None of these should be handled by the user --- that would contradict the
     351None of these cases should be handled by the user --- that would contradict the
    310352intention of these features --- so they need to be handled automatically.
    311353
     
    348390In plain C (which \CFA currently compiles down to) this
    349391flag only handles the cleanup attribute:
     392%\label{code:cleanup}
    350393\begin{cfa}
    351394void clean_up( int * var ) { ... }
     
    355398in this case @clean_up@, run when the variable goes out of scope.
    356399This feature is enough to mimic destructors,
    357 but not try statements which can effect
     400but not try statements that affect
    358401the unwinding.
    359402
     
    399442@_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs
    400443the cleanup phase and uses a different means to decide when to stop
    401 (see \vref{s:ForcedUnwind}).
     444(see \autoref{s:ForcedUnwind}).
    402445\end{enumerate}
    403446
    404447The @exception_class@ argument is a copy of the
    405448\code{C}{exception}'s @exception_class@ field,
    406 which is a number that identifies the exception handling mechanism
     449which is a number that identifies the EHM
    407450that created the exception.
    408451
     
    494537needs its own exception context.
    495538
    496 The exception context should be retrieved by calling the function
     539The current exception context should be retrieved by calling the function
    497540\snake{this_exception_context}.
    498541For sequential execution, this function is defined as
     
    519562The first step of a termination raise is to copy the exception into memory
    520563managed by the exception system. Currently, the system uses @malloc@, rather
    521 than reserved memory or the stack top. The exception handling mechanism manages
     564than reserved memory or the stack top. The EHM manages
    522565memory for the exception as well as memory for libunwind and the system's own
    523566per-exception storage.
     
    618661\subsection{Try Statements and Catch Clauses}
    619662The try statement with termination handlers is complex because it must
    620 compensate for the C code-generation versus
     663compensate for the C code-generation versus proper
    621664assembly-code generated from \CFA. Libunwind
    622665requires an LSDA and personality function for control to unwind across a
    623666function. The LSDA in particular is hard to mimic in generated C code.
    624667
    625 The workaround is a function called @__cfaehm_try_terminate@ in the standard
    626 library. The contents of a try block and the termination handlers are converted
    627 into functions. These are then passed to the try terminate function and it
    628 calls them.
     668The workaround is a function called \snake{__cfaehm_try_terminate} in the
     669standard \CFA library. The contents of a try block and the termination
     670handlers are converted into nested functions. These are then passed to the
     671try terminate function and it calls them, appropriately.
    629672Because this function is known and fixed (and not an arbitrary function that
    630 happens to contain a try statement), the LSDA can be generated ahead
     673happens to contain a try statement), its LSDA can be generated ahead
    631674of time.
    632675
    633 Both the LSDA and the personality function are set ahead of time using
     676Both the LSDA and the personality function for \snake{__cfaehm_try_terminate}
     677are set ahead of time using
    634678embedded assembly. This assembly code is handcrafted using C @asm@ statements
    635679and contains
    636 enough information for a single try statement the function repersents.
     680enough information for the single try statement the function represents.
    637681
    638682The three functions passed to try terminate are:
     
    646690decides if a catch clause matches the termination exception. It is constructed
    647691from the conditional part of each handler and runs each check, top to bottom,
    648 in turn, first checking to see if the exception type matches and then if the
    649 condition is true. It takes a pointer to the exception and returns 0 if the
     692in turn, to see if the exception matches this handler.
     693The match is performed in two steps, first a virtual cast is used to check
     694if the raised exception is an instance of the declared exception type or
     695one of its descendant types, and then the condition is evaluated, if
     696present.
     697The match function takes a pointer to the exception and returns 0 if the
    650698exception is not handled here. Otherwise the return value is the id of the
    651699handler that matches the exception.
     
    660708All three functions are created with GCC nested functions. GCC nested functions
    661709can be used to create closures,
    662 in other words functions that can refer to the state of other
    663 functions on the stack. This approach allows the functions to refer to all the
     710in other words,
     711functions that can refer to variables in their lexical scope even
     712those variables are part of a different function.
     713This approach allows the functions to refer to all the
    664714variables in scope for the function containing the @try@ statement. These
    665715nested functions and all other functions besides @__cfaehm_try_terminate@ in
     
    669719
    670720\autoref{f:TerminationTransformation} shows the pattern used to transform
    671 a \CFA try statement with catch clauses into the approprate C functions.
     721a \CFA try statement with catch clauses into the appropriate C functions.
    672722\todo{Explain the Termination Transformation figure.}
    673723
     
    738788Instead of storing the data in a special area using assembly,
    739789there is just a linked list of possible handlers for each stack,
    740 with each node on the list reperenting a try statement on the stack.
     790with each node on the list representing a try statement on the stack.
    741791
    742792The head of the list is stored in the exception context.
     
    744794to the head of the list.
    745795Instead of traversing the stack, resumption handling traverses the list.
    746 At each node, the EHM checks to see if the try statement the node repersents
     796At each node, the EHM checks to see if the try statement the node represents
    747797can handle the exception. If it can, then the exception is handled and
    748798the operation finishes, otherwise the search continues to the next node.
    749799If the search reaches the end of the list without finding a try statement
    750 that can handle the exception, the default handler is executed and the
    751 operation finishes.
     800with a handler clause
     801that can handle the exception, the default handler is executed.
     802If the default handler returns, control continues after the raise statement.
    752803
    753804Each node has a handler function that does most of the work.
    754805The handler function is passed the raised exception and returns true
    755806if the exception is handled and false otherwise.
    756 
    757807The handler function checks each of its internal handlers in order,
    758808top-to-bottom, until it funds a match. If a match is found that handler is
     
    760810If no match is found the function returns false.
    761811The match is performed in two steps, first a virtual cast is used to see
    762 if the thrown exception is an instance of the declared exception or one of
    763 its descendant type, then check to see if passes the custom predicate if one
    764 is defined. This ordering gives the type guarantee used in the predicate.
     812if the raised exception is an instance of the declared exception type or one
     813of its descendant types, if so then it is passed to the custom predicate
     814if one is defined.
     815% You need to make sure the type is correct before running the predicate
     816% because the predicate can depend on that.
    765817
    766818\autoref{f:ResumptionTransformation} shows the pattern used to transform
     
    814866(see \vpageref{s:ResumptionMarking}), which ignores parts of
    815867the stack
    816 already examined, is accomplished by updating the front of the list as the
    817 search continues. Before the handler at a node is called, the head of the list
     868already examined, and is accomplished by updating the front of the list as
     869the search continues.
     870Before the handler is called at a matching node, the head of the list
    818871is updated to the next node of the current node. After the search is complete,
    819872successful or not, the head of the list is reset.
     
    822875been checked are not on the list while a handler is run. If a resumption is
    823876thrown during the handling of another resumption, the active handlers and all
    824 the other handler checked up to this point are not checked again.
     877the other handlers checked up to this point are not checked again.
    825878% No paragraph?
    826879This structure also supports new handlers added while the resumption is being
     
    830883
    831884\begin{figure}
     885\centering
    832886\input{resumption-marking}
    833887\caption{Resumption Marking}
     
    851905\section{Finally}
    852906% Uses destructors and GCC nested functions.
    853 A finally clause is placed into a GCC nested-function with a unique name,
    854 and no arguments or return values.
    855 This nested function is then set as the cleanup
    856 function of an empty object that is declared at the beginning of a block placed
    857 around the context of the associated @try@ statement.
    858 
    859 The rest is handled by GCC. The try block and all handlers are inside this
    860 block. At completion, control exits the block and the empty object is cleaned
     907
     908%\autoref{code:cleanup}
     909A finally clause is handled by converting it into a once-off destructor.
     910The code inside the clause is placed into GCC nested-function
     911with a unique name, and no arguments or return values.
     912This nested function is
     913then set as the cleanup function of an empty object that is declared at the
     914beginning of a block placed around the context of the associated try
     915statement (see \autoref{f:FinallyTransformation}).
     916
     917\begin{figure}
     918\begin{cfa}
     919try {
     920        // TRY BLOCK
     921} finally {
     922        // FINALLY BLOCK
     923}
     924\end{cfa}
     925
     926\transformline
     927
     928\begin{cfa}
     929{
     930        void finally(void *__hook){
     931                // FINALLY BLOCK
     932        }
     933        __attribute__ ((cleanup(finally)))
     934        struct __cfaehm_cleanup_hook __finally_hook;
     935        {
     936                // TRY BLOCK
     937        }
     938}
     939\end{cfa}
     940
     941\caption{Finally Transformation}
     942\label{f:FinallyTransformation}
     943\end{figure}
     944
     945The rest is handled by GCC.
     946The TRY BLOCK
     947contains the try block itself as well as all code generated for handlers.
     948Once that code has completed,
     949control exits the block and the empty object is cleaned
    861950up, which runs the function that contains the finally code.
    862951
     
    887976passed to the forced-unwind function. The general pattern of all three stop
    888977functions is the same: continue unwinding until the end of stack and
    889 then preform the appropriate transfer.
     978then perform the appropriate transfer.
    890979
    891980For main stack cancellation, the transfer is just a program abort.
Note: See TracChangeset for help on using the changeset viewer.