# Changeset b5ec090

Timestamp:
Sep 13, 2021, 1:42:07 PM (8 months ago)
Branches:
enum, forall-pointer-decay, master
Children:
445f984
Parents:
56e5b24 (diff), 63b3279 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Files:
10 edited

• ## doc/theses/andrew_beach_MMath/conclusion.tex

 r56e5b24 % Just a little knot to tie the paper together. In the previous chapters this thesis presents the design and implementation In the previous chapters, this thesis presents the design and implementation of \CFA's exception handling mechanism (EHM). Both the design and implementation are based off of tools and techniques developed for other programming languages but they were adapted to better fit \CFA's feature set and add a few features that do not exist in other EHMs; other EHMs, including conditional matching, default handlers for unhandled exceptions and cancellation though coroutines and threads back to the program main stack.

• ## doc/theses/andrew_beach_MMath/features.tex

 r56e5b24 and begins with a general overview of EHMs. It is not a strict definition of all EHMs nor an exhaustive list of all possible features. However it does cover the most common structure and features found in them. However, it does cover the most common structure and features found in them. \section{Overview of EHMs} The @try@ statements of \Cpp, Java and Python are common examples. All three also show another common feature of handlers, they are grouped by the guarded also show another common feature of handlers: they are grouped by the guarded region. between different sub-hierarchies. This design is used in \CFA even though it is not a object-orientated language; so different tools are used to create the hierarchy. language, so different tools are used to create the hierarchy. % Could I cite the rational for the Python IO exception rework? For effective exception handling, additional information is often passed from the raise to the handler and back again. So far, only communication of the exceptions' identity is covered. So far, only communication of the exception's identity is covered. A common communication method for adding information to an exception is putting fields into the exception instance \section{Virtuals} \label{s:virtuals} %\todo{Maybe explain what "virtual" actually means.} Virtual types and casts are not part of \CFA's EHM nor are they required for an EHM. % A type's descendants are its children and its children's descendants. For the purposes of illustration, a proposed -- but unimplemented syntax -- For the purposes of illustration, a proposed, but unimplemented, syntax will be used. Each virtual type is represented by a trait with an annotation that makes it a virtual type. This annotation is empty for a root type, which \end{minipage} Every virtual type also has a list of virtual members and a unique id, both are stored in a virtual table. Every virtual type also has a list of virtual members and a unique id. 
Both are stored in a virtual table. Every instance of a virtual type also has a pointer to a virtual table stored in it, although there is no per-type virtual table as in many other languages. The list of virtual members is built up down the tree. Every virtual type The list of virtual members is accumulated from the root type down the tree. Every virtual type inherits the list of virtual members from its parent and may add more virtual members to the end of the list which are passed on to its children. % Consider adding a diagram, but we might be good with the explanation. As @child_type@ is a child of @root_type@ it has the virtual members of As @child_type@ is a child of @root_type@, it has the virtual members of @root_type@ (@to_string@ and @size@) as well as the one it declared (@irrelevant_function@). The names size" and align" are reserved for the size and alignment of the virtual type, and are always automatically initialized as such. The other special case are uses of the trait's polymorphic argument The other special case is uses of the trait's polymorphic argument (@T@ in the example), which are always updated to refer to the current virtual type. This allows functions that refer to to polymorphic argument virtual type. This allows functions that refer to the polymorphic argument to act as traditional virtual methods (@to_string@ in the example), as the object can always be passed to a virtual method in its virtual table. Up until this point the virtual system is similar to ones found in object-oriented languages but this is where \CFA diverges. Up until this point, the virtual system is similar to ones found in object-oriented languages, but this is where \CFA diverges. Objects encapsulate a single set of methods in each type, universally across the entire program, In \CFA, types do not encapsulate any code. 
Whether or not satisfies any given assertion, and hence any trait, is Whether or not a type satisfies any given assertion, and hence any trait, is context sensitive. Types can begin to satisfy a trait, stop satisfying it or satisfy the same trait at any lexical location in the program. In this sense, an type's implementation in the set of functions and variables In this sense, a type's implementation in the set of functions and variables that allow it to satisfy a trait is open" and can change throughout the program. \end{cfa} Like any variable they may be forward declared with the @extern@ keyword. Like any variable, they may be forward declared with the @extern@ keyword. Forward declaring virtual tables is relatively common. Many virtual types have an obvious" implementation that works in most Initialization is automatic. The type id and special virtual members size" and align" only depend on the virtual type, which is fixed given the type of the virtual table and the virtual type, which is fixed given the type of the virtual table, and so the compiler fills in a fixed value. The other virtual members are resolved, using the best match to the member's The other virtual members are resolved using the best match to the member's name and type, in the same context as the virtual table is declared using \CFA's normal resolution rules. While much of the virtual infrastructure is created, it is currently only used While much of the virtual infrastructure has been created, it is currently only used internally for exception handling. The only user-level feature is the virtual cast, which is the same as the \Cpp \code{C++}{dynamic_cast}. Note, the syntax and semantics matches a C-cast, rather than the function-like \Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be a pointer to a virtual type. pointers to virtual types. 
The cast dynamically checks if the @EXPRESSION@ type is the same or a sub-type of @TYPE@, and if true, returns a pointer to the @EXPRESSION@ object, otherwise it returns @0p@ (null pointer). This allows the expression to be used as both a cast and a type check. \section{Exceptions} The syntax for declaring an exception is the same as declaring a structure except the keyword that is swapped out: except the keyword: \begin{cfa} exception TYPE_NAME { \end{cfa} Fields are filled in the same way as a structure as well. However an extra Fields are filled in the same way as a structure as well. However, an extra field is added that contains the pointer to the virtual table. It must be explicitly initialized by the user when the exception is \begin{minipage}[t]{0.4\textwidth} Header: Header (.hfa): \begin{cfa} exception Example { \end{minipage} \begin{minipage}[t]{0.6\textwidth} Source: Implementation (.cfa): \begin{cfa} vtable(Example) example_base_vtable %\subsection{Exception Details} This is the only interface needed when raising and handling exceptions. However it is actually a short hand for a more complex trait based interface. However, it is actually a shorthand for a more complex trait-based interface. The language views exceptions through a series of traits. completing the virtual system). The imaginary assertions would probably come from a trait defined by the virtual system, and state that the exception type is a virtual type, is a descendant of @exception_t@ (the base exception type) is a virtual type, that that the type is a descendant of @exception_t@ (the base exception type) and allow the user to find the virtual table type. }; \end{cfa} Both traits ensure a pair of types is an exception type, its virtual table type Both traits ensure a pair of types is an exception type and its virtual table type, and defines one of the two default handlers. The default handlers are used as fallbacks and are discussed in detail in \vref{s:ExceptionHandling}. 
as fallbacks and are discussed in detail in \autoref{s:ExceptionHandling}. However, all three of these traits can be tricky to use directly. While there is a bit of repetition required, the largest issue is that the virtual table type is mangled and not in a user facing way. So these three macros are provided to wrap these traits to facing way. So, these three macros are provided to wrap these traits to simplify referring to the names: @IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@. The second (optional) argument is a parenthesized list of polymorphic arguments. This argument is only used with polymorphic exceptions and the list is be passed to both types. list is passed to both types. In the current set-up, the two types always have the same polymorphic arguments so these macros can be used without losing flexibility. For example consider a function that is polymorphic over types that have a arguments, so these macros can be used without losing flexibility. For example, consider a function that is polymorphic over types that have a defined arithmetic exception: \begin{cfa} These twin operations are the core of \CFA's exception handling mechanism. This section covers the general patterns shared by the two operations and then goes on to cover the details each individual operation. then goes on to cover the details of each individual operation. Both operations follow the same set of steps. \label{s:Termination} Termination handling is the familiar kind of handling and used in most programming used in most programming languages with exception handling. It is a dynamic, non-local goto. If the raised exception is matched and Since it is so general, a more specific handler can be defined, overriding the default behaviour for the specific exception types. %\todo{Examples?} \subsection{Resumption} @EXCEPTION_TYPE@$_i$ is matched and @NAME@$_i$ is bound to the exception, @HANDLER_BLOCK@$_i$ is executed right away without first unwinding the stack. 
After the block has finished running control jumps to the raise site, where After the block has finished running, control jumps to the raise site, where the just handled exception came from, and continues executing after it, not after the try statement. %\todo{Examples?} \subsubsection{Resumption Marking} not unwind the stack. A side effect is that, when a handler is matched and run, its try block (the guarded statements) and every try statement searched before it are still on the stack. There presence can lead to searched before it are still on the stack. Their presence can lead to the recursive resumption problem.\cite{Buhr00a} % Other possible citation is MacLaren77, but the form is different. \end{center} There are other sets of marking rules that could be used, for instance, marking just the handlers that caught the exception, There are other sets of marking rules that could be used. For instance, marking just the handlers that caught the exception would also prevent recursive resumption. However, the rules selected mirrors what happens with termination, However, the rules selected mirror what happens with termination, so this reduces the amount of rules and patterns a programmer has to know. // Handle a failure relating to f2 further down the stack. \end{cfa} In this example the file that experienced the IO error is used to decide In this example, the file that experienced the IO error is used to decide which handler should be run, if any at all. 
\subsection{Comparison with Reraising} In languages without conditional catch, that is no ability to match an exception based on something other than its type, it can be mimicked In languages without conditional catch -- that is, no ability to match an exception based on something other than its type -- it can be mimicked by matching all exceptions of the right type, checking any additional conditions inside the handler and re-raising the exception if it does not Here is a minimal example comparing both patterns, using @throw;@ (no argument) to start a re-raise. (no operand) to start a re-raise. \begin{center} \begin{tabular}{l r} \end{tabular} \end{center} At first glance catch-and-reraise may appear to just be a quality of life At first glance, catch-and-reraise may appear to just be a quality-of-life feature, but there are some significant differences between the two stratagies. strategies. A simple difference that is more important for \CFA than many other languages is that the raise site changes, with a re-raise but does not with a is that the raise site changes with a re-raise, but does not with a conditional catch. This is important in \CFA because control returns to the raise site to run the per-site default handler. Because of this only a conditional catch can the per-site default handler. Because of this, only a conditional catch can allow the original raise to continue. %   } else throw; % } In similar simple examples translating from re-raise to conditional catch takes less code but it does not have a general trivial solution either. In similar simple examples, translating from re-raise to conditional catch takes less code but it does not have a general, trivial solution either. So, given that the two patterns do not trivially translate into each other, it becomes a matter of which on should be encouraged and made the default. 
From the premise that if a handler that could handle an exception then it From the premise that if a handler could handle an exception then it should, it follows that checking as many handlers as possible is preferred. So conditional catch and checking later handlers is a good default. So, conditional catch and checking later handlers is a good default. \section{Finally Clauses} \label{s:FinallyClauses} Finally clauses are used to preform unconditional clean-up when leaving a Finally clauses are used to perform unconditional cleanup when leaving a scope and are placed at the end of a try statement after any handler clauses: \begin{cfa} Execution of the finally block should always finish, meaning control runs off the end of the block. This requirement ensures control always continues as if the finally clause is not present, \ie finally is for cleanup not changing the finally clause is not present, \ie finally is for cleanup, not changing control flow. Because of this requirement, local control flow out of the finally block is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or @return@ that causes control to leave the finally block. Other ways to leave the finally block, such as a long jump or termination are much harder to check, and at best requiring additional run-time overhead, and so are only the finally block, such as a @longjmp@ or termination are much harder to check, and at best require additional run-time overhead, and so are only discouraged. Not all languages with unwinding have finally clauses. Notably \Cpp does Not all languages with unwinding have finally clauses. Notably, \Cpp does without it as destructors, and the RAII design pattern, serve a similar role. Although destructors and finally clauses can be used for the same cases, Destructors take more work to create, but if there is clean-up code that needs to be run every time a type is used, they are much easier to set-up for each use. % It's automatic. 
On the other hand finally clauses capture the local context, so is easy to use when the clean-up is not dependent on the type of a variable or requires to set up for each use. % It's automatic. On the other hand, finally clauses capture the local context, so are easy to use when the cleanup is not dependent on the type of a variable or requires information from multiple variables. Cancellation is a stack-level abort, which can be thought of as as an uncatchable termination. It unwinds the entire current stack, and if possible forwards the cancellation exception to a different stack. possible, forwards the cancellation exception to a different stack. Cancellation is not an exception operation like termination or resumption. There is no special statement for starting a cancellation; instead the standard library function @cancel_stack@ is called passing an exception. Unlike a raise, this exception is not used in matching only to pass information about library function @cancel_stack@ is called, passing an exception. Unlike a raise, this exception is not used in matching, only to pass information about the cause of the cancellation. Finally, as no handler is provided, there is no default handler. After @cancel_stack@ is called the exception is copied into the EHM's memory After @cancel_stack@ is called, the exception is copied into the EHM's memory and the current stack is unwound. The behaviour after that depends on the kind of stack being cancelled. \paragraph{Main Stack} The main stack is the one used by the program main at the start of execution, The main stack is the one used by the program's main function at the start of execution, and is the only stack in a sequential program. After the main stack is unwound there is a program-level abort. After the main stack is unwound, there is a program-level abort. The first reason for this behaviour is for sequential programs where there is only one stack, and hence to stack to pass information to. 
is only one stack, and hence no stack to pass information to. Second, even in concurrent programs, the main stack has no dependency on another stack and no reliable way to find another living stack. and an implicit join (from a destructor call). The explicit join takes the default handler (@defaultResumptionHandler@) from its calling context while the implicit join provides its own; which does a program abort if the the implicit join provides its own, which does a program abort if the @ThreadCancelled@ exception cannot be handled.

• ## doc/theses/andrew_beach_MMath/implement.tex

 r56e5b24 \label{s:VirtualSystem} % Virtual table rules. Virtual tables, the pointer to them and the cast. While the \CFA virtual system currently has only one public features, virtual While the \CFA virtual system currently has only two public features, virtual cast and virtual tables, % ??? refs (see the virtual cast feature \vpageref{p:VirtualCast}), substantial structure is required to support them, and provide features for exception handling and the standard library. as part of field-by-field construction. \subsection{Type Id} Every virtual type has a unique id. \subsection{Type ID} Every virtual type has a unique ID. These are used in type equality, to check if the representation of two values are the same, and to access the type's type information. Our approach for program uniqueness is using a static declaration for each type id, where the run-time storage address of that variable is guaranteed to type ID, where the run-time storage address of that variable is guaranteed to be unique during program execution. The type id storage can also be used for other purposes, The type ID storage can also be used for other purposes, and is used for type information. The problem is that a type id may appear in multiple TUs that compose a program (see \autoref{ss:VirtualTable}); so the initial solution would seem to be make it external in each translation unit. Hovever, the type id must The problem is that a type ID may appear in multiple TUs that compose a program (see \autoref{ss:VirtualTable}), so the initial solution would seem to be make it external in each translation unit. Hovever, the type ID must have a declaration in (exactly) one of the TUs to create the storage. No other declaration related to the virtual type has this property, so doing this through standard C declarations would require the user to do it manually. Instead the linker is used to handle this problem. Instead, the linker is used to handle this problem. 
% I did not base anything off of C++17; they are solving the same problem. A new feature has been added to \CFA for this purpose, the special attribute \snake{cfa_linkonce}, which uses the special section @.gnu.linkonce@. When used as a prefix (\eg @.gnu.linkonce.example@) the linker does When used as a prefix (\eg @.gnu.linkonce.example@), the linker does not combine these sections, but instead discards all but one with the same full name. So each type id must be given a unique section name with the linkonce prefix. Luckily \CFA already has a way to get unique names, the name mangler. So, each type ID must be given a unique section name with the \snake{linkonce} prefix. Luckily, \CFA already has a way to get unique names, the name mangler. For example, this could be written directly in \CFA: \begin{cfa} __attribute__((section(".gnu.linkonce._X1fFv___1"))) void _X1fFv___1() {} \end{cfa} This is done internally to access the name manglers. This is done internally to access the name mangler. This attribute is useful for other purposes, any other place a unique instance required, and should eventually be made part of a public and \subsection{Type Information} There is data stored at the type id's declaration, the type information. The type information currently is only the parent's type id or, if the There is data stored at the type ID's declaration, the type information. The type information currently is only the parent's type ID or, if the type has no parent, the null pointer. The ancestors of a virtual type are found by traversing type ids through The ancestors of a virtual type are found by traversing type IDs through the type information. An example using helper macros looks like: \item Generate a new structure definition to store the type information. The layout is the same in each case, just the parent's type id, information. The layout is the same in each case, just the parent's type ID, but the types used change from instance to instance. 
The generated name is used for both this structure and, if relevant, the \item The definition is generated and initialized. The parent id is set to the null pointer or to the address of the parent's The parent ID is set to the null pointer or to the address of the parent's type information instance. Name resolution handles the rest. \item file as if it was a forward declaration, except no definition is required. This technique is used for type-id instances. A link-once definition is This technique is used for type ID instances. A link-once definition is generated each time the structure is seen. This will result in multiple copies but the link-once attribute ensures all but one are removed for a \subsection{Virtual Table} \label{ss:VirtualTable} Each virtual type has a virtual table type that stores its type id and %\todo{Clarify virtual table type vs. virtual table instance.} Each virtual type has a virtual table type that stores its type ID and virtual members. Each virtual type instance is bound to a table instance that is filled with The layout always comes in three parts (see \autoref{f:VirtualTableLayout}). The first section is just the type id at the head of the table. It is always The first section is just the type ID at the head of the table. It is always there to ensure that it can be found even when the accessing code does not know which virtual type it has. The second section are all the virtual members of the parent, in the same The second section is all the virtual members of the parent, in the same order as they appear in the parent's virtual table. Note that the type may change slightly as references to the this" change. 
This is limited to This, combined with the fixed offset to the virtual table pointer, means that for any virtual type, it is always safe to access its virtual table and, from there, it is safe to check the type id to identify the exact type of the from there, it is safe to check the type ID to identify the exact type of the underlying object, access any of the virtual members and pass the object to any of the method-like virtual members. the context of the declaration. The type id is always fixed; with each virtual table type having exactly one possible type id. The type ID is always fixed, with each virtual table type having exactly one possible type ID. The virtual members are usually filled in by type resolution. The best match for a given name and type at the declaration site is used. There are two exceptions to that rule: the @size@ field, the type's size, is set using a @sizeof@ expression and the @align@ field, the is set using a @sizeof@ expression, and the @align@ field, the type's alignment, is set using an @alignof@ expression. Most of these tools are already inside the compiler. Using simple code transformations early on in compilation, allows most of that work to be code transformations early on in compilation allows most of that work to be handed off to the existing tools. \autoref{f:VirtualTableTransformation} shows an example transformation, this example shows an exception virtual table. shows an example transformation; this example shows an exception virtual table. It also shows the transformation on the full declaration. For a forward declaration, the @extern@ keyword is preserved and the struct __cfavir_type_id * const * child ); \end{cfa} The type id for the target type of the virtual cast is passed in as The type ID for the target type of the virtual cast is passed in as @parent@ and the cast target is passed in as @child@. The virtual cast either returns the original pointer or the null pointer as the new type. 
So the function does the parent check and returns the appropriate value. The parent check is a simple linear search of child's ancestors using the The function does the parent check and returns the appropriate value. The parent check is a simple linear search of the child's ancestors using the type information. % The implementation of exception types. Creating exceptions can roughly divided into two parts, Creating exceptions can be roughly divided into two parts: the exceptions themselves and the virtual system interactions. All types associated with a virtual type, the types of the virtual table and the type id, the types of the virtual table and the type ID, are generated when the virtual type (the exception) is first found. The type id (the instance) is generated with the exception, if it is The type ID (the instance) is generated with the exception, if it is a monomorphic type. However, if the exception is polymorphic, then a different type id has to However, if the exception is polymorphic, then a different type ID has to be generated for every instance. In this case, generation is delayed until a virtual table is created. When a virtual table is created and initialized, two functions are created to fill in the list of virtual members. The first is a copy function that adapts the exception's copy constructor The first is the @copy@ function that adapts the exception's copy constructor to work with pointers, avoiding some issues with the current copy constructor interface. Second is the msg function that returns a C-string with the type's name, Second is the @msg@ function that returns a C-string with the type's name, including any polymorphic parameters. % Discussing multiple frame stack unwinding: Unwinding across multiple stack frames is more complex because that Unwinding across multiple stack frames is more complex, because that information is no longer contained within the current function. 
With separate compilation, a function does not know its callers nor their frame layout. Even using the return address, that information is encoded in terms of actions in code, intermixed with the actions required finish the function. actions in code, intermixed with the actions required to finish the function. Without changing the main code path it is impossible to select one of those two groups of actions at the return site. The traditional unwinding mechanism for C is implemented by saving a snap-shot of a function's state with @setjmp@ and restoring that snap-shot with The traditional unwinding mechanism for C is implemented by saving a snapshot of a function's state with @setjmp@ and restoring that snapshot with @longjmp@. This approach bypasses the need to know stack details by simply reseting to a snap-shot of an arbitrary but existing function frame on the stack. It is up to the programmer to ensure the snap-shot is valid when it is reset and that all required clean-up from the unwound stacks is performed. reseting to a snapshot of an arbitrary but existing function frame on the stack. It is up to the programmer to ensure the snapshot is valid when it is reset and that all required cleanup from the unwound stacks is performed. This approach is fragile and requires extra work in the surrounding code. With respect to the extra work in the surrounding code, many languages define clean-up actions that must be taken when certain sections of the stack are removed. Such as when the storage for a variable many languages define cleanup actions that must be taken when certain sections of the stack are removed, such as when the storage for a variable is removed from the stack, possibly requiring a destructor call, or when a try statement with a finally clause is (conceptually) popped from the stack. None of these cases should be handled by the user --- that would contradict the intention of these features --- so they need to be handled automatically. 
None of these cases should be handled by the user -- that would contradict the intention of these features -- so they need to be handled automatically. To safely remove sections of the stack, the language must be able to find and run these clean-up actions even when removing multiple functions unknown at run these cleanup actions even when removing multiple functions unknown at the beginning of the unwinding. provided storage object. It has two public fields: the @exception_class@, which is described above, and the @exception_cleanup@ function. The clean-up function is used by the EHM to clean-up the exception, if it The cleanup function is used by the EHM to clean up the exception. If it should need to be freed at an unusual time, it takes an argument that says why it had to be cleaned up. of the most recent stack frame. It continues to call personality functions traversing the stack from newest to oldest until a function finds a handler or the end of the stack is reached. In the latter case, raise exception returns @_URC_END_OF_STACK@. Second, when a handler is matched, raise exception moves to the clean-up phase and walks the stack a second time. the end of the stack is reached. In the latter case, @_Unwind_RaiseException@ returns @_URC_END_OF_STACK@. Second, when a handler is matched, @_Unwind_RaiseException@ moves to the cleanup phase and walks the stack a second time. Once again, it calls the personality functions of each stack frame from newest to oldest. This pass stops at the stack frame containing the matching handler. If that personality function has not install a handler, it is an error. If an error is encountered, raise exception returns either If that personality function has not installed a handler, it is an error. If an error is encountered, @_Unwind_RaiseException@ returns either @_URC_FATAL_PHASE1_ERROR@ or @_URC_FATAL_PHASE2_ERROR@ depending on when the error occurred. 
_Unwind_Stop_Fn, void *); \end{cfa} It also unwinds the stack but it does not use the search phase. Instead another It also unwinds the stack but it does not use the search phase. Instead, another function, the stop function, is used to stop searching. The exception is the same as the one passed to raise exception. The extra arguments are the stop same as the one passed to @_Unwind_RaiseException@. The extra arguments are the stop function and the stop parameter. The stop function has a similar interface as a personality function, except it is also passed the stop parameter. one list per stack, with the list head stored in the exception context. Within each linked list, the most recently thrown exception is at the head followed by older thrown recently thrown exception is at the head, followed by older thrown exceptions. This format allows exceptions to be thrown, while a different exception is being handled. The exception at the head of the list is currently exception into managed memory. After the exception is handled, the free function is used to clean up the exception and then the entire node is passed to free, returning the memory back to the heap. passed to @free@, returning the memory back to the heap. \subsection{Try Statements and Catch Clauses} The three functions passed to try terminate are: \begin{description} \item[try function:] This function is the try block, it is where all the code \item[try function:] This function is the try block. It is where all the code from inside the try block is placed. It takes no parameters and has no return value. This function is called during regular execution to run the try from the conditional part of each handler and runs each check, top to bottom, in turn, to see if the exception matches this handler. 
The match is performed in two steps, first a virtual cast is used to check The match is performed in two steps: first, a virtual cast is used to check if the raised exception is an instance of the declared exception type or one of its descendant types, and then the condition is evaluated, if present. The match function takes a pointer to the exception and returns 0 if the exception is not handled here. Otherwise the return value is the id of the exception is not handled here. Otherwise, the return value is the ID of the handler that matches the exception. \end{description} All three functions are created with GCC nested functions. GCC nested functions can be used to create closures, can be used to create closures; in other words, functions that can refer to variables in their lexical scope even functions that can refer to variables in their lexical scope even though those variables are part of a different function. This approach allows the functions to refer to all the At each node, the EHM checks to see if the try statement the node represents can handle the exception. If it can, then the exception is handled and the operation finishes, otherwise the search continues to the next node. the operation finishes; otherwise, the search continues to the next node. If the search reaches the end of the list without finding a try statement with a handler clause if the exception is handled and false otherwise. The handler function checks each of its internal handlers in order, top-to-bottom, until it funds a match. If a match is found that handler is top-to-bottom, until it finds a match. If a match is found that handler is run, after which the function returns true, ignoring all remaining handlers. If no match is found the function returns false. The match is performed in two steps, first a virtual cast is used to see The match is performed in two steps. 
First a virtual cast is used to see if the raised exception is an instance of the declared exception type or one of its descendant types, if so then it is passed to the custom predicate of its descendant types, if so, then the second step is to see if the exception passes the custom predicate if one is defined. % You need to make sure the type is correct before running the predicate % Recursive Resumption Stuff: \autoref{f:ResumptionMarking} shows search skipping (see \vpageref{s:ResumptionMarking}), which ignores parts of (see \autoref{s:ResumptionMarking}), which ignores parts of the stack already examined, and is accomplished by updating the front of the list as This structure also supports new handlers added while the resumption is being handled. These are added to the front of the list, pointing back along the stack --- the first one points over all the checked handlers --- stack -- the first one points over all the checked handlers -- and the ordering is maintained. %\autoref{code:cleanup} A finally clause is handled by converting it into a once-off destructor. The code inside the clause is placed into GCC nested-function The code inside the clause is placed into a GCC nested-function with a unique name, and no arguments or return values. This nested function is then set as the cleanup function of an empty object that is declared at the beginning of a block placed around the context of the associated try statement (see \autoref{f:FinallyTransformation}). statement, as shown in \autoref{f:FinallyTransformation}. \begin{figure} % Stack selections, the three internal unwind functions. Cancellation also uses libunwind to do its stack traversal and unwinding, however it uses a different primary function: @_Unwind_ForcedUnwind@. Details of its interface can be found in the Section~\vref{s:ForcedUnwind}. Cancellation also uses libunwind to do its stack traversal and unwinding. However, it uses a different primary function: @_Unwind_ForcedUnwind@. 
Details of its interface can be found in Section~\vref{s:ForcedUnwind}. The first step of cancellation is to find the cancelled stack and its type:
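As a rough illustration of the finally-clause transformation described above (a once-off destructor attached to an empty object), the following C sketch uses GCC's `cleanup` variable attribute. It is not the code the \CFA translator generates: the thesis places the clause body in a GCC nested function, whereas this sketch assumes a file-scope function so that it also builds with Clang. The names `finally_body`, `finally_hook`, `finally_runs` and `guarded_block` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the stand-in "finally" body has run (illustration only). */
static int finally_runs = 0;

/* Stand-in for the body of a finally clause. GCC calls a cleanup function
   with a pointer to the variable that is going out of scope. */
static void finally_body(char *hook) {
    assert(hook != NULL);
    finally_runs += 1;
}

/* Hypothetical guarded block: returns -1 early when `fail` is nonzero and
   0 otherwise. The cleanup runs on both exit paths. */
int guarded_block(int fail) {
    /* Empty object whose only purpose is to carry the cleanup function. */
    __attribute__((cleanup(finally_body))) char finally_hook = 0;
    (void)finally_hook;
    if (fail) {
        return -1;  /* early exit: finally_body still runs */
    }
    return 0;       /* normal exit: finally_body runs here too */
}
```

Calling `guarded_block(1)` leaves through the early return, yet `finally_runs` is still incremented, which is the behaviour a finally clause requires.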
• ## doc/theses/andrew_beach_MMath/intro.tex

 r56e5b24 All types of exception handling link a raise with a handler. Both operations are usually language primitives, although raises can be treated as a primitive function that takes an exception argument. Handlers are more complex as they are added to and removed from the stack during execution, must specify what they can handle and give the code to treated as a function that takes an exception argument. Handlers are more complex, as they are added to and removed from the stack during execution, must specify what they can handle and must give the code to handle the exception. \input{termination} \end{center} %\todo{What does the right half of termination.fig mean?} Resumption exception handling searches the stack for a handler and then calls The handler is run on top of the existing stack, often as a new function or closure capturing the context in which the handler was defined. After the handler has finished running it returns control to the function After the handler has finished running, it returns control to the function that preformed the raise, usually starting after the raise. \begin{center} Although a powerful feature, exception handling tends to be complex to set up and expensive to use and expensive to use, so it is often limited to unusual or exceptional" cases. The classic example is error handling, exceptions can be used to The classic example is error handling; exceptions can be used to remove error handling logic from the main execution path, and pay most of the cost only when the error actually occurs. The \CFA EHM implements all of the common exception features (or an equivalent) found in most other EHMs and adds some features of its own. The design of all the features had to be adapted to \CFA's feature set as The design of all the features had to be adapted to \CFA's feature set, as some of the underlying tools used to implement and express exception handling in other languages are absent in \CFA. 
Still the resulting syntax resembles that of other languages: Still, the resulting syntax resembles that of other languages: \begin{cfa} try { covering both changes to the compiler and the run-time. In addition, a suite of test cases and performance benchmarks were created along side the implementation. alongside the implementation. The implementation techniques are generally applicable in other programming languages and much of the design is as well. \item Implementing stack unwinding and the \CFA EHM, including updating the \CFA compiler and the run-time environment. \item Designed and implemented a prototype virtual system. \item Designing and implementing a prototype virtual system. % I think the virtual system and per-call site default handlers are the only % "new" features, everything else is a matter of implementation. \item Creating tests to check the behaviour of the EHM. \item Creating benchmarks to check the performances of the EHM, \item Creating benchmarks to check the performance of the EHM, as compared to other languages. \end{enumerate} The rest of this thesis is organized as follows. The current state of exceptions is covered in \autoref{s:background}. The existing state of \CFA is also covered in \autoref{c:existing}. The existing state of \CFA is covered in \autoref{c:existing}. New EHM features are introduced in \autoref{c:features}, covering their usage and design. inheriting from \code{C++}{std::exception}. Although there is a special catch-all syntax (@catch(...)@) there are no Although there is a special catch-all syntax (@catch(...)@), there are no operations that can be performed on the caught value, not even type inspection. 
Instead the base exception-type \code{C++}{std::exception} defines common Instead, the base exception-type \code{C++}{std::exception} defines common functionality (such as the ability to describe the reason the exception was raised) and all Java was the next popular language to use exceptions.\cite{Java8} Its exception system largely reflects that of \Cpp, except that requires Its exception system largely reflects that of \Cpp, except that it requires you throw a child type of \code{Java}{java.lang.Throwable} and it uses checked exceptions. Checked exceptions are part of a function's interface, the exception signature of the function. Every function that could be raised from a function, either directly or Every exception that could be raised from a function, either directly or because it is not handled from a called function, is given. Using this information, it is possible to statically verify if any given exception is handled and guarantee that no exception will go unhandled. exception is handled, and guarantee that no exception will go unhandled. Making exception information explicit improves clarity and safety, but can slow down or restrict programming. recovery or repair. In theory that could be good enough to properly handle the exception, but more often is used to ignore an exception that the programmer does not feel is worth the effort of handling it, for instance if programmer does not feel is worth the effort of handling, for instance if they do not believe it will ever be raised. If they are incorrect the exception will be silenced, while in a similar If they are incorrect, the exception will be silenced, while in a similar situation with unchecked exceptions the exception would at least activate the language's unhandled exception code (usually program abort with an the language's unhandled exception code (usually, a program abort with an error message). 
%\subsection Resumption exceptions are less popular, although resumption is as old as termination; hence, few although resumption is as old as termination; that is, few programming languages have implemented them. % http://bitsavers.informatik.uni-stuttgart.de/pdf/xerox/parc/techReports/ included in the \Cpp standard. % https://en.wikipedia.org/wiki/Exception_handling Since then resumptions have been ignored in main-stream programming languages. Since then, resumptions have been ignored in main-stream programming languages. However, resumption is being revisited in the context of decades of other developments in programming languages. While rejecting resumption may have been the right decision in the past, the situation has changed since then. Some developments, such as the function programming equivalent to resumptions, Some developments, such as the functional programming equivalent to resumptions, algebraic effects\cite{Zhang19}, are enjoying success. A complete reexamination of resumptions is beyond this thesis, but there reemergence is enough to try them in \CFA. A complete reexamination of resumption is beyond this thesis, but their reemergence is enough reason to try them in \CFA. % Especially considering how much easier they are to implement than % termination exceptions and how much Peter likes them. %\subsection More recently exceptions seem to be vanishing from newer programming More recently exceptions, seem to be vanishing from newer programming languages, replaced by panic". In Rust, a panic is just a program level abort that may be implemented by %\subsection While exception handling's most common use cases are in error handling, As exception handling's most common use cases are in error handling, here are some other ways to handle errors with comparisons with exceptions. \begin{itemize} is discarded to avoid this problem. 
Checking error codes also bloats the main execution path, especially if the error is not handled immediately hand has to be passed especially if the error is not handled immediately and has to be passed through multiple functions before it is addressed. \item\emph{Special Return with Global Store}: Similar to the error codes pattern but the function itself only returns that there was an error and store the reason for the error in a fixed global location. For example many routines in the C standard library will only return some that there was an error, and stores the reason for the error in a fixed global location. For example, many routines in the C standard library will only return some error value (such as -1 or a null pointer) and the error code is written into the standard variable @errno@. Success is one tag and the errors are another. It is also possible to make each possible error its own tag and carry its own additional information, but the two branch format is easy to make generic additional information, but the two-branch format is easy to make generic so that one type can be used everywhere in error handling code. % Rust's \code{rust}{Result} The main advantage is that an arbitrary object can be used to represent an error so it can include a lot more information than a simple error code. error, so it can include a lot more information than a simple error code. The disadvantages include that the it does have to be checked along the main execution and if there aren't primitive tagged unions proper usage can be execution, and if there aren't primitive tagged unions proper, usage can be hard to enforce. variable). C++ uses this approach as its fallback system if exception handling fails, such as \snake{std::terminate_handler} and, for a time, \snake{std::unexpected_handler}. 
such as \snake{std::terminate} and, for a time, \snake{std::unexpected}.\footnote{\snake{std::unexpected} was part of the Dynamic Exception Specification, which has been removed from the standard as of C++20.\cite{CppExceptSpec}} Handler functions work a lot like resumption exceptions, happily making them expensive to use in exchange. This difference is less important in higher-level scripting languages, where using exception for other tasks is more common. where using exceptions for other tasks is more common. An iconic example is Python's \code{Python}{StopIteration}\cite{PythonExceptions} exception that \code{Python}{StopIteration}\cite{PythonExceptions} exception, that is thrown by an iterator to indicate that it is exhausted. When paired with Python's iterator-based for-loop this will be thrown every When paired with Python's iterator-based for-loop, this will be thrown every time the end of the loop is reached.\cite{PythonForLoop}
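The Special Return with Global Store pattern listed above can be sketched in C. This is a minimal example, assuming a hypothetical helper `parse_long` that is neither part of the thesis nor of the C library; `strtol`, `errno` and `ERANGE` are standard C, and `EINVAL` comes from POSIX.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal long from `text`.
   Returns 0 on success and stores the value in `*out`.
   Returns -1 on failure and leaves the reason in errno. */
int parse_long(const char *text, long *out) {
    assert(text != NULL && out != NULL);
    char *end = NULL;
    errno = 0;                        /* clear any stale error state */
    long value = strtol(text, &end, 10);
    if (errno == ERANGE) {
        return -1;                    /* out of range: strtol set errno */
    }
    if (end == text || *end != '\0') {
        errno = EINVAL;               /* no digits, or trailing characters */
        return -1;
    }
    *out = value;
    return 0;
}
```

The caller learns only that something went wrong from the return value and has to consult `errno` for the reason, and it must do so before any other library call overwrites the global store.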
• ## doc/theses/andrew_beach_MMath/performance.tex

 r56e5b24 Instead, the focus was to get the features working. The only performance requirement is to ensure the tests for correctness run in a reasonable amount of time. Hence, a few basic performance tests were performed to amount of time. Hence, only a few basic performance tests were performed to check this requirement. one with termination and one with resumption. C++ is the most comparable language because both it and \CFA use the same GCC C++ is the most comparable language because both it and \CFA use the same framework, libunwind. In fact, the comparison is almost entirely in quality of implementation. the number used in the timing runs is given with the results per test. The Java tests run the main loop 1000 times before beginning the actual test to warm-up" the JVM. beginning the actual test to warm up" the JVM. % All other languages are precompiled or interpreted. unhandled exceptions in \Cpp and Java as that would cause the process to terminate. Luckily, performance on the give-up and kill the process" path is not Luckily, performance on the give up and kill the process" path is not critical. using gcc-10 10.3.0 as a backend. g++-10 10.3.0 is used for \Cpp. Java tests are complied and run with version 11.0.11. Python used version 3.8.10. Java tests are complied and run with Oracle OpenJDK version 11.0.11. Python used CPython version 3.8.10. The machines used to run the tests are: \begin{itemize}[nosep] \lstinline{@} 2.5 GHz running Linux v5.11.0-25 \end{itemize} Representing the two major families of hardware architecture. These represent the two major families of hardware architecture. \section{Tests} \paragraph{Stack Traversal} This group measures the cost of traversing the stack, This group of tests measures the cost of traversing the stack (and in termination, unwinding it). Inside the main loop is a call to a recursive function. 
This group of tests measures the cost for setting up exception handling, if it is not used (because the exceptional case did not occur). not used because the exceptional case did not occur. Tests repeatedly cross (enter, execute and leave) a try statement but never perform a raise. for that language and the result is marked N/A. There are also cases where the feature is supported but measuring its cost is impossible. This happened with Java, which uses a JIT that optimize away the tests and it cannot be stopped.\cite{Dice21} cost is impossible. This happened with Java, which uses a JIT that optimizes away the tests and cannot be stopped.\cite{Dice21} These tests are marked N/C. To get results in a consistent range (1 second to 1 minute is ideal, results and has a value in the millions. An anomaly in some results came from \CFA's use of gcc nested functions. An anomaly in some results came from \CFA's use of GCC nested functions. These nested functions are used to create closures that can access stack variables in their lexical scope. However, if they do so, then they can cause the benchmark's run-time to However, if they do so, then they can cause the benchmark's run time to increase by an order of magnitude. The simplest solution is to make those values global variables instead of function local variables. of function-local variables. % Do we know if editing a global inside nested function is a problem? Tests that had to be modified to avoid this problem have been marked \CFA, \Cpp and Java. % To be exact, the Match All and Match None cases. %\todo{Not true in Python.} The most likely explanation is that, since exceptions are rarely considered to be the common case, the more optimized languages Performance is similar to Empty Traversal in all languages that support finally clauses. Only Python seems to have a larger than random noise change in its run-time and it is still not large. its run time and it is still not large. 
Despite the similarity between finally clauses and destructors, finally clauses seem to avoid the spike that run-time destructors have. finally clauses seem to avoid the spike that run time destructors have. Possibly some optimization removes the cost of changing contexts. This results in a significant jump. Other languages experience a small increase in run-time. Other languages experience a small increase in run time. The small increase likely comes from running the checks, but they could avoid the spike by not having the same kind of overhead for \item[Cross Handler] Here \CFA falls behind \Cpp by a much more significant margin. This is likely due to the fact \CFA has to insert two extra function calls, while \Cpp does not have to do execute any other instructions. Here, \CFA falls behind \Cpp by a much more significant margin. This is likely due to the fact that \CFA has to insert two extra function calls, while \Cpp does not have to execute any other instructions. Python is much further behind. \item[Conditional Match] Both of the conditional matching tests can be considered on their own. However for evaluating the value of conditional matching itself, the However, for evaluating the value of conditional matching itself, the comparison of the two sets of results is useful. Consider the massive jump in run-time for \Cpp going from match all to match Consider the massive jump in run time for \Cpp going from match all to match none, which none of the other languages have. Some strange interaction is causing run-time to more than double for doing Some strange interaction is causing run time to more than double for doing twice as many raises. Java and Python avoid this problem and have similar run-time for both tests, Java and Python avoid this problem and have similar run time for both tests, possibly through resource reuse or their program representation. 
However \CFA is built like \Cpp and avoids the problem as well, this matches However, \CFA is built like \Cpp, and avoids the problem as well. This matches the pattern of the conditional match, which makes the two execution paths very similar. \subsection{Resumption \texorpdfstring{(\autoref{t:PerformanceResumption})}{}} Moving on to resumption, there is one general note, Moving on to resumption, there is one general note: resumption is \textit{fast}. The only test where it fell behind termination is Cross Handler. In every other case, the number of iterations had to be increased by a factor of 10 to get the run-time in an appropriate range factor of 10 to get the run time in an appropriate range and in some cases resumption still took less time. \item[D'tor Traversal] Resumption does have the same spike in run-time that termination has. The run-time is actually very similar to Finally Traversal. Resumption does have the same spike in run time that termination has. The run time is actually very similar to Finally Traversal. As resumption does not unwind the stack, both destructors and finally clauses are run while walking down the stack during the recursive returns. \item[Finally Traversal] Same as D'tor Traversal, except termination did not have a spike in run-time on this test case. except termination did not have a spike in run time on this test case. \item[Other Traversal] The only test case where resumption could not keep up with termination, although the difference is not as significant as many other cases. It is simply a matter of where the costs come from, both termination and resumption have some work to set-up or tear-down a It is simply a matter of where the costs come from: both termination and resumption have some work to set up or tear down a handler. It just so happens that resumption's work is slightly slower. 
Resumption shows a slight slowdown if the exception is not matched by the first handler, which follows from the fact the second handler now has to be checked. However the difference is not large. to be checked. However, the difference is not large. \end{description} More experiments could try to tease out the exact trade-offs, but the prototype's only performance goal is to be reasonable. It has already in that range, and \CFA's fixup routine simulation is It is already in that range, and \CFA's fixup routine simulation is one of the faster simulations as well. Plus exceptions add features and remove syntactic overhead, so even at similar performance resumptions have advantages Plus, exceptions add features and remove syntactic overhead, so even at similar performance, resumptions have advantages over fixup routines.
• ## doc/theses/andrew_beach_MMath/uw-ethesis-frontpgs.tex

 r56e5b24 This thesis covers the design and implementation of the \CFA EHM, along with a review of the other required \CFA features. The EHM includes common features of termination exception handling and similar support for resumption exception handling. The EHM includes common features of termination exception handling, which abandons and recovers from an operation, and similar support for resumption exception handling, which repairs and continues with an operation. The design of both has been adapted to utilize other tools \CFA provides, as well as fit with the assertion based interfaces of the language. The EHM has been implemented into the \CFA compiler and run-time environment. Although it has not yet been optimized, performance testing has shown it has comparable performance to other EHM's, comparable performance to other EHMs, which is sufficient for use in current \CFA programs.
• ## doc/theses/andrew_beach_MMath/uw-ethesis.bib

 r56e5b24 } @misc{CppExceptSpec, author={C++ Community}, key={Cpp Reference Exception Specification}, howpublished={\href{https://en.cppreference.com/w/cpp/language/except_spec}{https://\-en.cppreference.com/\-w/\-cpp/\-language/\-except\_spec}}, addendum={Accessed 2021-09-08}, } @misc{RustPanicMacro, author={The Rust Team},
• ## src/Parser/parser.yy

 r56e5b24 // Created On       : Sat Sep  1 20:22:55 2001 // Last Modified By : Peter A. Buhr // Last Modified On : Sun Aug  8 09:14:44 2021 // Update Count     : 5038 // Last Modified On : Sat Sep 11 08:20:44 2021 // Update Count     : 5040 // | simple_assignment_operator initializer        { $$ = $1 == OperKinds::Assign ? $2 : $2->set_maybeConstructed( false ); } | '=' VOID { $$ = new InitializerNode( true ); } | '{' initializer_list_opt comma_opt '}'        { $$ = new InitializerNode( $2, true ); } ; | designation initializer { $$ = $2->set_designators( $1 ); } | initializer_list_opt ',' initializer          { $$ = (InitializerNode *)( $1->set_last( $3 ) ); } | initializer_list_opt ',' designation initializer { $$ = (InitializerNode *)($1->set_last($4->set_designators( $3 ) )); } | initializer_list_opt ',' designation initializer { $$ = (InitializerNode *)($1->set_last( $4->set_designators($3 ) )); } ;