Ignore:
Timestamp:
Aug 21, 2021, 5:39:14 PM (3 years ago)
Author:
Andrew Beach <ajbeach@…>
Branches:
ADT, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, pthread-emulation, qualifiedEnum
Children:
7372065
Parents:
1a6a6f2 (diff), e56eabac (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' into 'andrew-mmath', collecting Peter's updates.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/andrew_beach_MMath/implement.tex

    r1a6a6f2 rd8f8d08  
    2020
    2121\subsection{Virtual Type}
    22 Virtual types only have one change to their structure: the addition of a
    23 pointer to the virtual table, which is called the \emph{virtual-table pointer}.
    24 Internally, the field is called \snake{virtual_table}.
    25 The field is fixed after construction. It is always the first field in the
    26 structure so that its location is always known.
     22A virtual type~(see \autoref{s:Virtuals}) has a pointer to a virtual table,
     23called the \emph{virtual-table pointer}, which binds an instance of a virtual
     24type to a virtual table.  Internally, the field is called \snake{virtual_table}
     25and is fixed after construction.  This pointer is also the table's id and how
     26the system accesses the virtual table and the virtual members there. It is
     27always the first field in the structure so that its location is always known.
    2728\todo{Talk about constructors for virtual types (after they are working).}
    2829
    29 The virtual table pointer binds an instance of a virtual type
    30 to a virtual table.
    31 The pointer is also the table's id and how the system accesses the
    32 virtual table and the virtual members there.
    33 
    3430\subsection{Type Id}
    35 Every virtual type has a unique id.
    36 Type ids can be compared for equality,
    37 which checks if the types reperented are the same,
    38 or used to access the type's type information.
     31Every virtual type needs a unique id, so that type ids can be compared for
     32equality, which checks if the types representation are the same, or used to
     33access the type's type information.  Here, uniqueness means within a program
     34composed of multiple translation units (TU), not uniqueness across all
     35programs.
     36
     37One approach for program uniqueness is declaring a static declaration for each
     38type id, where the runtime storage address of that variable is guaranteed to be
     39unique during program execution. The type id storage can also be used for other
     40purposes.
     41
     42The problem is that a type id may appear in multiple TUs that compose a
     43program, see \autoref{ss:VirtualTable}; hence in each TU, it must be declared
     44as external to prevent multiple definitions. However, the type id must actually
     45be declared in one of the TUs so the linker creates the storage.  Hence, the
     46problem becomes designating one TU to insert an actual type-id declaration. But
     47the \CFA compiler does not know the set of the translation units that compose a
     48program, because TUs can be compile separately, followed by a separate link
     49step.
     50
     51The solution is to mimic a \CFA feature in \Cpp{17}, @inline@ variables and
     52function:
     53\begin{quote}
     54There may be more than one definition of an inline function or variable (since
     55\Cpp{17} in the program as long as each definition appears in a different
     56translation unit and (for non-static inline functions and variables (since
     57\Cpp{17})) all definitions are identical. For example, an inline function or an
     58inline variable (since \Cpp{17}) may be defined in a header file that is
     59@#include@'d in multiple source files.~\cite{C++17}
     60\end{quote}
     61The underlying mechanism to provide this capability is attribute
     62\begin{cfa}
     63section(".gnu.linkonce.NAME")
     64\end{cfa}
     65where @NAME@ is the variable/function name duplicated in each TU.  The linker than
     66provides the service of generating a single declaration (instance) across all
     67TUs, even if a program is linked incrementally.
     68
     69C does not support this feature for @inline@, and hence, neither does \CFA.
     70Again, rather than implement a new @inline@ extension for \CFA, a temporary
     71solution for the exception handling is to add the following in \CFA.
     72\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
     73@__attribute__((cfa_linkonce))@ void f() {}
     74\end{lstlisting}
     75which becomes
     76\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
     77__attribute__((section(".gnu.linkonce._X1fFv___1"))) void @_X1fFv___1@(){}
     78\end{lstlisting}
     79where @NAME@ from above is the \CFA mangled variable/function name.  Note,
     80adding this feature is necessary because, when using macros, the mangled name
     81is unavailable.  This attribute is useful for purposes other than exception
     82handling, and should eventually be rolled into @inline@ processing in \CFA.
     83
     84Finally, a type id's data implements a pointers to the type's type information
     85instance.  Dereferencing the pointer gets the type information.
     86
     87\subsection{Implementation}
     88
    3989The type information currently is only the parent's type id or, if the
    4090type has no parent, the null pointer.
    41 
    42 The id's are implemented as pointers to the type's type information instance.
    43 Dereferencing the pointer gets the type information.
    4491The ancestors of a virtual type are found by traversing type ids through
    4592the type information.
    46 The information pushes the issue of creating a unique value (for
    47 the type id) to the problem of creating a unique instance (for type
    48 information), which the linker can solve.
    49 
    50 The advanced linker support is used here to avoid having to create
    51 a new declaration to attach this data to.
    52 With C/\CFA's header/implementation file divide for something to appear
    53 exactly once it must come from a declaration that appears in exactly one
    54 implementation file; the declarations in header files may exist only once
    55 they can be included in many different translation units.
    56 Therefore, structure's declaration will not work.
    57 Neither will attaching the type information to the virtual table -- although
    58 a vtable declarations are in implemention files they are not unique, see
    59 \autoref{ss:VirtualTable}.
    60 Instead the same type information is generated multiple times and then
    61 the new attribute \snake{cfa_linkone} is used to removed duplicates.
    62 
    63 Type information is constructed as follows:
     93An example using helper macros looks like:
     94\begin{cfa}
     95struct INFO_TYPE(TYPE) {
     96        INFO_TYPE(PARENT) const * parent;
     97};
     98
     99__attribute__((cfa_linkonce))
     100INFO_TYPE(TYPE) const INFO_NAME(TYPE) = {
     101        &INFO_NAME(PARENT),
     102};
     103\end{cfa}
     104
     105The type information is constructed as follows:
    64106\begin{enumerate}
    65107\item
    66 Use the type's name to generate a name for the type information structure.
    67 This is saved so it may be reused.
     108Use the type's name to generate a name for the type information structure,
     109which is saved so it can be reused.
    68110\item
    69111Generate a new structure definition to store the type
    70112information. The layout is the same in each case, just the parent's type id,
    71113but the types used change from instance to instance.
    72 The generated name is used for both this structure and, if relivant, the
     114The generated name is used for both this structure and, if relevant, the
    73115parent pointer.
    74116If the virtual type is polymorphic then the type information structure is
    75117polymorphic as well, with the same polymorphic arguments.
    76118\item
    77 A seperate name for instances is generated from the type's name.
     119A separate name for instances is generated from the type's name.
    78120\item
    79 The definition is generated and initialised.
     121The definition is generated and initialized.
    80122The parent id is set to the null pointer or to the address of the parent's
    81123type information instance. Name resolution handles the rest.
    82124\item
    83125\CFA's name mangler does its regular name mangling encoding the type of
    84 the declaration into the instance name. This gives a completely unique name
     126the declaration into the instance name. This process gives a program unique name
    85127including different instances of the same polymorphic type.
    86128\end{enumerate}
    87 \todo{The list is making me realise, some of this isn't ordered.}
    88 
    89 Writing that code manually, with helper macros for the early name mangling,
    90 would look like this:
    91 \begin{cfa}
    92 struct INFO_TYPE(TYPE) {
    93         INFO_TYPE(PARENT) const * parent;
    94 };
    95 
    96 __attribute__((cfa_linkonce))
    97 INFO_TYPE(TYPE) const INFO_NAME(TYPE) = {
    98         &INFO_NAME(PARENT),
    99 };
    100 \end{cfa}
    101 
     129\todo{The list is making me realize, some of this isn't ordered.}
     130
     131
     132\begin{comment}
    102133\subsubsection{\lstinline{cfa\_linkonce} Attribute}
    103 % I just realised: This is an extension of the inline keyword.
     134% I just realized: This is an extension of the inline keyword.
    104135% An extension of C's at least, it is very similar to C++'s.
    105136Another feature added to \CFA is a new attribute: \texttt{cfa\_linkonce}.
     
    126157everything that comes after the special prefix, then only one is used
    127158and the other is discarded.
     159\end{comment}
     160
    128161
    129162\subsection{Virtual Table}
     
    158191The first and second sections together mean that every virtual table has a
    159192prefix that has the same layout and types as its parent virtual table.
    160 This, combined with the fixed offset to the virtual table pointer, means that
     193This, combined with the fixed offset to the virtual-table pointer, means that
    161194for any virtual type, it is always safe to access its virtual table and,
    162195from there, it is safe to check the type id to identify the exact type of the
     
    176209type's alignment, is set using an @alignof@ expression.
    177210
    178 \subsubsection{Concurrency Integration}
     211\subsection{Concurrency Integration}
    179212Coroutines and threads need instances of @CoroutineCancelled@ and
    180213@ThreadCancelled@ respectively to use all of their functionality. When a new
     
    183216at the definition of the main function.
    184217
    185 This is showned through code re-writing in
    186 \autoref{f:ConcurrencyTypeTransformation} and
    187 \autoref{f:ConcurrencyMainTransformation}.
    188 In both cases the original declaration is not modified,
    189 only new ones are added.
     218These transformations are shown through code re-writing in
     219\autoref{f:CoroutineTypeTransformation} and
     220\autoref{f:CoroutineMainTransformation} for a coroutine and a thread is similar.
     221In both cases, the original declaration is not modified, only new ones are
     222added.
    190223
    191224\begin{figure}
     
    207240extern CoroutineCancelled_vtable & _default_vtable;
    208241\end{cfa}
    209 \caption{Concurrency Type Transformation}
    210 \label{f:ConcurrencyTypeTransformation}
    211 \end{figure}
    212 
    213 \begin{figure}
     242\caption{Coroutine Type Transformation}
     243\label{f:CoroutineTypeTransformation}
     244%\end{figure}
     245
     246\bigskip
     247
     248%\begin{figure}
    214249\begin{cfa}
    215250void main(Example & this) {
     
    229264        &_default_vtable_object_declaration;
    230265\end{cfa}
    231 \caption{Concurrency Main Transformation}
    232 \label{f:ConcurrencyMainTransformation}
     266\caption{Coroutine Main Transformation}
     267\label{f:CoroutineMainTransformation}
    233268\end{figure}
    234269
     
    245280        struct __cfavir_type_id const * child );
    246281\end{cfa}
    247 The type id of target type of the virtual cast is passed in as @parent@ and
     282The type id for the target type of the virtual cast is passed in as @parent@ and
    248283the cast target is passed in as @child@.
    249 
    250 For generated C code wraps both arguments and the result with type casts.
     284The generated C code wraps both arguments and the result with type casts.
    251285There is also an internal check inside the compiler to make sure that the
    252286target type is a virtual type.
     
    260294
    261295\section{Exceptions}
    262 % Anything about exception construction.
     296\todo{Anything about exception construction.}
    263297
    264298\section{Unwinding}
     
    274308stack. On function entry and return, unwinding is handled directly by the
    275309call/return code embedded in the function.
    276 In many cases, the position of the instruction pointer (relative to parameter
     310\PAB{Meaning: In many cases, the position of the instruction pointer (relative to parameter
    277311and local declarations) is enough to know the current size of the stack
    278 frame.
     312frame.}
    279313
    280314Usually, the stack-frame size is known statically based on parameter and
    281 local variable declarations. Even with dynamic stack-size, the information
     315local variable declarations. Even for a dynamic stack-size, the information
    282316to determine how much of the stack has to be removed is still contained
    283317within the function.
     
    285319bumping the hardware stack-pointer up or down as needed.
    286320Constructing/destructing values within a stack frame has
    287 a similar complexity but can add additional work and take longer.
     321a similar complexity but larger constants, which takes longer.
    288322
    289323Unwinding across multiple stack frames is more complex because that
    290324information is no longer contained within the current function.
    291 With seperate compilation a function has no way of knowing what its callers
    292 are so it can't know how large those frames are.
     325With separate compilation a function does not know its callers nor their frame size.
     326In general, the caller's frame size is embedded only at the functions entry (push
     327stack) and exit (pop stack).
    293328Without altering the main code path it is also hard to pass that work off
    294329to the caller.
     
    302337This approach is fragile and requires extra work in the surrounding code.
    303338
    304 With respect to the extra work in the surounding code,
     339With respect to the extra work in the surrounding code,
    305340many languages define clean-up actions that must be taken when certain
    306341sections of the stack are removed. Such as when the storage for a variable
    307 is removed from the stack or when a try statement with a finally clause is
     342is removed from the stack (destructor call) or when a try statement with a finally clause is
    308343(conceptually) popped from the stack.
    309 None of these should be handled by the user --- that would contradict the
     344None of these cases should be handled by the user --- that would contradict the
    310345intention of these features --- so they need to be handled automatically.
    311346
     
    355390in this case @clean_up@, run when the variable goes out of scope.
    356391This feature is enough to mimic destructors,
    357 but not try statements which can effect
     392but not try statements that affect
    358393the unwinding.
    359394
    360395To get full unwinding support, all of these features must be handled directly
    361 in assembly and assembler directives; partiularly the cfi directives
     396in assembly and assembler directives; particularly the cfi directives
    362397\snake{.cfi_lsda} and \snake{.cfi_personality}.
    363398
     
    399434@_UA_FORCE_UNWIND@ specifies a forced unwind call. Forced unwind only performs
    400435the cleanup phase and uses a different means to decide when to stop
    401 (see \vref{s:ForcedUnwind}).
     436(see \autoref{s:ForcedUnwind}).
    402437\end{enumerate}
    403438
    404439The @exception_class@ argument is a copy of the
    405440\code{C}{exception}'s @exception_class@ field,
    406 which is a number that identifies the exception handling mechanism
     441which is a number that identifies the EHM
    407442that created the exception.
    408443
     
    494529needs its own exception context.
    495530
    496 The exception context should be retrieved by calling the function
     531An exception context is retrieved by calling the function
    497532\snake{this_exception_context}.
    498533For sequential execution, this function is defined as
     
    519554The first step of a termination raise is to copy the exception into memory
    520555managed by the exception system. Currently, the system uses @malloc@, rather
    521 than reserved memory or the stack top. The exception handling mechanism manages
     556than reserved memory or the stack top. The EHM manages
    522557memory for the exception as well as memory for libunwind and the system's own
    523558per-exception storage.
     
    618653\subsection{Try Statements and Catch Clauses}
    619654The try statement with termination handlers is complex because it must
    620 compensate for the C code-generation versus
     655compensate for the C code-generation versus proper
    621656assembly-code generated from \CFA. Libunwind
    622657requires an LSDA and personality function for control to unwind across a
     
    624659
    625660The workaround is a function called @__cfaehm_try_terminate@ in the standard
    626 library. The contents of a try block and the termination handlers are converted
    627 into functions. These are then passed to the try terminate function and it
    628 calls them.
     661\CFA library. The contents of a try block and the termination handlers are converted
     662into nested functions. These are then passed to the try terminate function and it
     663calls them, appropriately.
    629664Because this function is known and fixed (and not an arbitrary function that
    630 happens to contain a try statement), the LSDA can be generated ahead
     665happens to contain a try statement), its LSDA can be generated ahead
    631666of time.
    632667
    633 Both the LSDA and the personality function are set ahead of time using
     668Both the LSDA and the personality function for @__cfaehm_try_terminate@ are set ahead of time using
    634669embedded assembly. This assembly code is handcrafted using C @asm@ statements
    635670and contains
    636 enough information for a single try statement the function repersents.
     671enough information for a single try statement the function represents.
    637672
    638673The three functions passed to try terminate are:
     
    646681decides if a catch clause matches the termination exception. It is constructed
    647682from the conditional part of each handler and runs each check, top to bottom,
    648 in turn, first checking to see if the exception type matches and then if the
    649 condition is true. It takes a pointer to the exception and returns 0 if the
     683in turn, first checking to see if the exception type matches.
     684The match is performed in two steps, first a virtual cast is used to see
     685if the raised exception is an instance of the declared exception or one of
     686its descendant type, and then is the condition true, if present.
     687It takes a pointer to the exception and returns 0 if the
    650688exception is not handled here. Otherwise the return value is the id of the
    651689handler that matches the exception.
     
    660698All three functions are created with GCC nested functions. GCC nested functions
    661699can be used to create closures,
    662 in other words functions that can refer to the state of other
    663 functions on the stack. This approach allows the functions to refer to all the
     700in other words, functions that can refer to their lexical scope in other
     701functions on the stack when called. This approach allows the functions to refer to all the
    664702variables in scope for the function containing the @try@ statement. These
    665703nested functions and all other functions besides @__cfaehm_try_terminate@ in
     
    669707
    670708\autoref{f:TerminationTransformation} shows the pattern used to transform
    671 a \CFA try statement with catch clauses into the approprate C functions.
     709a \CFA try statement with catch clauses into the appropriate C functions.
    672710\todo{Explain the Termination Transformation figure.}
    673711
     
    738776Instead of storing the data in a special area using assembly,
    739777there is just a linked list of possible handlers for each stack,
    740 with each node on the list reperenting a try statement on the stack.
     778with each node on the list representing a try statement on the stack.
    741779
    742780The head of the list is stored in the exception context.
     
    744782to the head of the list.
    745783Instead of traversing the stack, resumption handling traverses the list.
    746 At each node, the EHM checks to see if the try statement the node repersents
     784At each node, the EHM checks to see if the try statement the node represents
    747785can handle the exception. If it can, then the exception is handled and
    748786the operation finishes, otherwise the search continues to the next node.
    749787If the search reaches the end of the list without finding a try statement
    750788that can handle the exception, the default handler is executed and the
    751 operation finishes.
     789operation finishes, unless it throws an exception.
    752790
    753791Each node has a handler function that does most of the work.
    754792The handler function is passed the raised exception and returns true
    755793if the exception is handled and false otherwise.
    756 
    757794The handler function checks each of its internal handlers in order,
    758795top-to-bottom, until it funds a match. If a match is found that handler is
     
    760797If no match is found the function returns false.
    761798The match is performed in two steps, first a virtual cast is used to see
    762 if the thrown exception is an instance of the declared exception or one of
    763 its descendant type, then check to see if passes the custom predicate if one
    764 is defined. This ordering gives the type guarantee used in the predicate.
     799if the raised exception is an instance of the declared exception or one of
     800its descendant type, and then is the condition true, if present.
     801\PAB{I don't understand this sentence.
     802This ordering gives the type guarantee used in the predicate.}
    765803
    766804\autoref{f:ResumptionTransformation} shows the pattern used to transform
    767 a \CFA try statement with catch clauses into the approprate C functions.
     805a \CFA try statement with catch clauses into the appropriate C functions.
    768806\todo{Explain the Resumption Transformation figure.}
    769807
     
    814852(see \vpageref{s:ResumptionMarking}), which ignores parts of
    815853the stack
    816 already examined, is accomplished by updating the front of the list as the
    817 search continues. Before the handler at a node is called, the head of the list
     854already examined, and is accomplished by updating the front of the list as the
     855search continues. Before the handler is called at a matching node, the head of the list
    818856is updated to the next node of the current node. After the search is complete,
    819857successful or not, the head of the list is reset.
     
    822860been checked are not on the list while a handler is run. If a resumption is
    823861thrown during the handling of another resumption, the active handlers and all
    824 the other handler checked up to this point are not checked again.
     862the other handlers checked up to this point are not checked again.
    825863% No paragraph?
    826864This structure also supports new handlers added while the resumption is being
     
    830868
    831869\begin{figure}
     870\centering
    832871\input{resumption-marking}
    833872\caption{Resumption Marking}
     
    851890\section{Finally}
    852891% Uses destructors and GCC nested functions.
    853 A finally clause is placed into a GCC nested-function with a unique name,
    854 and no arguments or return values.
    855 This nested function is then set as the cleanup
    856 function of an empty object that is declared at the beginning of a block placed
    857 around the context of the associated @try@ statement.
     892\autoref{f:FinallyTransformation} shows the pattern used to transform a \CFA
     893try statement with finally clause into the appropriate C functions.
     894The finally clause is placed into a GCC nested-function
     895with a unique name, and no arguments or return values.  This nested function is
     896then set as the cleanup function of an empty object that is declared at the
     897beginning of a block placed around the context of the associated @try@
     898statement.
     899
     900\begin{figure}
     901\begin{cfa}
     902try {
     903        // TRY BLOCK
     904} finally {
     905        // FINALLY BLOCK
     906}
     907\end{cfa}
     908
     909\transformline
     910
     911\begin{cfa}
     912{
     913        void finally(void *__hook){
     914                // FINALLY BLOCK
     915        }
     916        __attribute__ ((cleanup(finally)))
     917        struct __cfaehm_cleanup_hook __finally_hook;
     918        {
     919                // TRY BLOCK
     920        }
     921
     922}
     923\end{cfa}
     924
     925\caption{Finally Transformation}
     926\label{f:FinallyTransformation}
     927\end{figure}
    858928
    859929The rest is handled by GCC. The try block and all handlers are inside this
     
    869939
    870940The first step of cancellation is to find the cancelled stack and its type:
    871 coroutine, thread or main thread.
     941coroutine, thread, or main thread.
    872942In \CFA, a thread (the construct the user works with) is a user-level thread
    873943(point of execution) paired with a coroutine, the thread's main coroutine.
    874944The thread library also stores pointers to the main thread and the current
    875 thread.
     945coroutine.
    876946If the current thread's main and current coroutines are the same then the
    877947current stack is a thread stack, otherwise it is a coroutine stack.
     
    887957passed to the forced-unwind function. The general pattern of all three stop
    888958functions is the same: continue unwinding until the end of stack and
    889 then preform the appropriate transfer.
     959then perform the appropriate transfer.
    890960
    891961For main stack cancellation, the transfer is just a program abort.
Note: See TracChangeset for help on using the changeset viewer.