Changeset 5407cdc for doc/theses

doc/theses/andrew_beach_MMath/Makefile

-              rfeacef9
+              r5407cdc
 BUILD=out
 TEXSRC=$(wildcard *.tex)
+FIGSRC=$(wildcard *.fig)
 BIBSRC=$(wildcard *.bib)
 STYSRC=$(wildcard *.sty)
 …
 BASE= ${DOC:%.pdf=%}
+RAWSRC=${TEXSRC} ${BIBSRC} ${STYSRC} ${CLSSRC}
+FIGTEX=${FIGSRC:%.fig=${BUILD}/%.tex}
 ### Special Rules:
 …
 ### Commands:
 LATEX=TEXINPUTS=${TEXLIB} pdflatex -halt-on-error -output-directory=${BUILD}
+LATEX=TEXINPUTS=${TEXLIB} latex -halt-on-error -output-directory=${BUILD}
 BIBTEX=BIBINPUTS=${BIBLIB} bibtex
 GLOSSARY=INDEXSTYLE=${BUILD} makeglossaries-lite
 …
 all: ${DOC}
+${BUILD}/${DOC}: ${TEXSRC} ${BIBSRC} ${STYSRC} ${CLSSRC} Makefile | ${BUILD}
+# The main rule, it does all the tex/latex processing.
+${BUILD}/${BASE}.dvi: ${RAWSRC} ${FIGTEX} Makefile | ${BUILD}
         ${LATEX} ${BASE}
         ${BIBTEX} ${BUILD}/${BASE}
 …
         ${LATEX} ${BASE}
+${DOC}: ${BUILD}/${DOC}
+        cp $< $@
+# Convert xfig output to tex. (Generates \special declarations.)
+${FIGTEX}: ${BUILD}/%.tex: %.fig | ${BUILD}
+        fig2dev -L eepic $< > $@
+# Step through dvi & postscript to handle xfig specials.
+%.pdf : ${BUILD}/%.dvi
+        dvipdf $^ $@
 ${BUILD}:

doc/theses/andrew_beach_MMath/existing.tex

-              rfeacef9
+              r5407cdc
 \section{Overloading and \lstinline{extern}}
 \CFA has extensive overloading, allowing multiple definitions of the same name
 to be defined.~\cite{Moss18}
+to be defined~\cite{Moss18}.
 \begin{cfa}
 char i; int i; double i;                        $\C[3.75in]{// variable overload}$
 …
 pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA
 references may also be mutable or non-mutable. If mutable, a reference variable
 may be assigned to using the address-of operator (@&@), which converts the
+may be assigned using the address-of operator (@&@), which converts the
 reference to a pointer.
 \begin{cfa}
 …
 \section{Constructors and Destructors}
 Both constructors and destructors are operators, which means they are just
+Both constructors and destructors are operators, which means they are
 functions with special operator names rather than type names in \Cpp. The
 special operator names may be used to call the functions explicitly (not
 …
 In general, operator names in \CFA are constructed by bracketing an operator
 token with @?@, which indicates where the arguments. For example, infixed
+token with @?@, which indicates the position of the arguments. For example, infixed
 multiplication is @?*?@ while prefix dereference is @*?@. This syntax makes it
 easy to tell the difference between prefix operations (such as @++?@) and
 …
 definition, \CFA creates a default and copy constructor, destructor and
 assignment (like \Cpp). It is possible to define constructors/destructors for
 basic and existing types.
+basic and existing types (unlike \Cpp).
 \section{Polymorphism}
 …
         do_once(value);
+}
 void do_once(int i) { ... }  // provide assertion
 int i;
+void do_once(@int@ i) { ... }  // provide assertion
+@int@ i;
 do_twice(i); // implicitly pass assertion do_once to do_twice
 \end{cfa}
 …
 declarations instead of parameters, returns, and local variable declarations.
 \begin{cfa}
 forall(dtype T)
+forall(dtype @T@)
 struct node {
+        node(T) * next;  // generic linked node
+        T * data;
+}
+        node(@T@) * next;  // generic linked node
+        @T@ * data;
+}
+node(@int@) inode;
 \end{cfa}
 The generic type @node(T)@ is an example of a polymorphic-type usage.  Like \Cpp
 templates usage, a polymorphic-type usage must specify a type parameter.
+template usage, a polymorphic-type usage must specify a type parameter.
 There are many other polymorphism features in \CFA but these are the ones used
 by the exception system.
 \section{Concurrency}
 \CFA has a number of concurrency features: @thread@, @monitor@, @mutex@
 parameters, @coroutine@ and @generator@. The two features that interact with
 the exception system are @thread@ and @coroutine@; they and their supporting
+\section{Control Flow}
+\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
+The two features that interact with
+the exception system are @coroutine@ and @thread@; they and their supporting
 constructs are described here.
 …
 CountUp countup;
 \end{cfa}
 Each coroutine has @main@ function, which takes a reference to a coroutine
+Each coroutine has a @main@ function, which takes a reference to a coroutine
 object and returns @void@.
 \begin{cfa}[numbers=left]
 …
 In this function, or functions called by this function (helper functions), the
 @suspend@ statement is used to return execution to the coroutine's caller
 without terminating the coroutine.
+without terminating the coroutine's function.
 A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
 …
 @resume(countup).next@.
 \subsection{Monitors and Mutex}
+\subsection{Monitor and Mutex Parameter}
 Concurrency does not guarantee ordering; without ordering results are
 non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
 …
 and only one runs at a time.
 \subsection{Threads}
+\subsection{Thread}
 Functions, generators, and coroutines are sequential so there is only a single
 (but potentially sophisticated) execution path in a program. Threads introduce
 …
 monitors and mutex parameters. For threads to work safely with other threads,
 also requires mutual exclusion in the form of a communication rendezvous, which
 also supports internal synchronization as for mutex objects. For exceptions
 only the basic two basic operations are important: thread fork and join.
+also supports internal synchronization as for mutex objects. For exceptions,
+only two basic thread operations are important: fork and join.
 Threads are created like coroutines with an associated @main@ function:

doc/theses/andrew_beach_MMath/features.tex

-              rfeacef9
+              r5407cdc
 This chapter covers the design and user interface of the \CFA
+exception-handling mechanism.
+exception-handling mechanism (EHM). % or exception system.
+We will begin with an overview of EHMs in general. It is not a strict
+definition of all EHMs nor an exhaustive list of all possible features.
+However it does cover the most common structure and features found in them.
+% We should cover what is an exception handling mechanism and what is an
+% exception before this. Probably in the introduction. Some of this could
+% move there.
+\paragraph{Raise / Handle}
+An exception operation has two main parts: raise and handle.
+These terms are sometimes also known as throw and catch but this work uses
+throw/catch as a particular kind of raise/handle.
+These are the two parts that the user will write themselves and may
+be the only two pieces of the EHM that have any syntax in the language.
+\subparagraph{Raise}
+The raise is the starting point for exception handling. It marks the beginning
+of exception handling by \newterm{raising} an exception, which passes it to
+the EHM.
+Some well-known examples include the @throw@ statements of \Cpp and Java and
+the \codePy{raise} statement from Python. In real systems a raise may perform
+some other work (such as memory management) but for the purposes of this
+overview that can be ignored.
+\subparagraph{Handle}
+The purpose of most exception operations is to run some user code to handle
+that exception. This code is given, with some other information, in a handler.
+A handler has three common features: the previously mentioned user code, a
+region of code it covers and an exception label/condition that matches
+certain exceptions.
+Only raises inside the covered region and raising exceptions that match the
+label can be handled by a given handler.
+Different EHMs will have different rules to pick a handler
+if multiple handlers could be used, such as ``best match" or ``first found".
+The @try@ statements of \Cpp, Java and Python are common examples. All three
+also show another common feature of handlers: they are grouped by the covered
+region.
+\paragraph{Propagation}
+After an exception is raised comes what is usually the biggest step for the
+EHM: finding and setting up the handler. The propagation from raise to
+handler can be broken up into three different tasks: searching for a handler,
+matching against the handler and installing the handler.
+\subparagraph{Searching}
+The EHM begins by searching for handlers that might be used to handle
+the exception. Searching is usually independent of the exception that was
+thrown as it looks for handlers that have the raise site in their covered
+region.
+This includes handlers in the current function, as well as any in callers
+on the stack that have the function call in their covered region.
+\subparagraph{Matching}
+Each handler found has to be matched with the raised exception. The exception
+label defines a condition that is used with the exception to decide if
+there is a match or not.
+In languages where the first match is used this step is intertwined with
+searching: a match check is performed immediately after the search finds
+a possible handler.
+\subparagraph{Installing}
+After a handler is chosen it must be made ready to run.
+The implementation can vary widely to fit with the rest of the
+design of the EHM. The installation step might be trivial or it could be
+the most expensive step in handling an exception. The latter tends to be the
+case when stack unwinding is involved.
+If a matching handler is not guaranteed to be found the EHM will need a
+different course of action here in the cases where no handler matches.
+This is only required with unchecked exceptions as checked exceptions
+(such as in Java) can make that guarantee.
+This different action can also be installing a handler but it is usually an
+implicit and much more general one.
+\subparagraph{Hierarchy}
+A common way to organize exceptions is in a hierarchical structure.
+This is especially true in object-orientated languages where the
+exception hierarchy is a natural extension of the object hierarchy.
+Consider the following hierarchy of exceptions:
+\begin{center}
+\input{exception-hierarchy}
+\end{center}
+A handler labelled with any given exception can handle exceptions of that
+type or any child type of that exception. The root of the exception hierarchy
+(here \codeC{exception}) acts as a catch-all, leaf types catch single types
+and the exceptions in the middle can be used to catch different groups of
+related exceptions.
+This system has some notable advantages, such as multiple levels of grouping,
+the ability for libraries to add new exception types and the isolation
+between different sub-hierarchies.
+This design is used in \CFA even though it is not an object-orientated
+language; it uses different tools to create the hierarchy.
+% Could I cite the rationale for the Python IO exception rework?
+\paragraph{Completion}
+After the handler has finished the entire exception operation has to complete
+and continue executing somewhere else. This step is usually simple,
+both logically and in its implementation, as the installation of the handler
+is usually set up to do most of the work.
+The EHM can return control to many different places;
+the most common are after the handler definition and after the raise.
+\paragraph{Communication}
+For effective exception handling, additional information is usually passed
+from the raise to the handler.
+So far only communication of the exception's identity has been covered.
+A common method is putting fields into the exception instance and giving the
+handler access to them.
 \section{Virtuals}
+Virtual types and casts are not part of the exception system nor are they
+required for an exception system. But an object-oriented style hierarchy is a
+great way of organizing exceptions so a minimal virtual system has been added
+to \CFA.
+The pattern of a simple hierarchy, borrowed from object-oriented
+programming, was chosen for several reasons.
+The first is that it allows new exceptions to be added in user code
+and in libraries independently of each other. Another is it allows for
+different levels of exception grouping (all exceptions, all IO exceptions or
+a particular IO exception). It also provides a simple way of passing
+data back and forth across the throw.
+Virtual types and casts are not required for a basic exception-system but are
+useful for advanced exception features. However, \CFA is not object-oriented so
+there is no obvious concept of virtuals. Hence, to create advanced exception
+features for this work, I needed to design and implement a virtual-like
+system for \CFA.
+% NOTE: Maybe we should put less of the rationale here.
+Object-oriented languages often organize exceptions into a simple hierarchy,
+\eg Java.
+\begin{center}
+\setlength{\unitlength}{4000sp}%
+\begin{picture}(1605,612)(2011,-1951)
+\put(2100,-1411){\vector(1, 0){225}}
+\put(3450,-1411){\vector(1, 0){225}}
+\put(3550,-1411){\line(0,-1){225}}
+\put(3550,-1636){\vector(1, 0){150}}
+\put(3550,-1636){\line(0,-1){225}}
+\put(3550,-1861){\vector(1, 0){150}}
+\put(2025,-1490){\makebox(0,0)[rb]{\LstBasicStyle{exception}}}
+\put(2400,-1460){\makebox(0,0)[lb]{\LstBasicStyle{arithmetic}}}
+\put(3750,-1460){\makebox(0,0)[lb]{\LstBasicStyle{underflow}}}
+\put(3750,-1690){\makebox(0,0)[lb]{\LstBasicStyle{overflow}}}
+\put(3750,-1920){\makebox(0,0)[lb]{\LstBasicStyle{zerodivide}}}
+\end{picture}%
+\end{center}
+The hierarchy provides the ability to handle an exception at different degrees
+of specificity (left to right). Hence, it is possible to catch a more general
+exception-type in higher-level code where the implementation details are
+unknown, which reduces tight coupling to the lower-level implementation.
+Otherwise, low-level code changes require higher-level code changes, \eg,
+changing from raising @underflow@ to @overflow@ at the low level means changing
+the matching catch at the high level versus catching the general @arithmetic@
+exception. In detail, each virtual type may have a parent and can have any
+number of children. A type's descendants are its children and its children's
+descendants. A type may not be its own descendant.
+The exception hierarchy allows a handler (@catch@ clause) to match multiple
+exceptions, \eg a base-type handler catches both base and derived
+exception-types.
+\begin{cfa}
+try {
+        ...
+} catch(arithmetic &) {
+        ... // handle arithmetic, underflow, overflow, zerodivide
+}
+\end{cfa}
+Most exception mechanisms perform a linear search of the handlers and select
+the first matching handler, so the order of handlers is now important because
+matching is many to one.
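The effect of handler order under a first-match rule can be sketched in Python, whose @except@ clauses are also checked top to bottom. This is only an illustration: the class names mirror the hierarchy pictured above and are hypothetical stand-ins, not part of \CFA or its library.

```python
# Hypothetical exception hierarchy mirroring the figure:
# Arithmetic is the parent of Underflow, Overflow and ZeroDivide.
class Arithmetic(Exception): pass
class Underflow(Arithmetic): pass
class Overflow(Arithmetic): pass
class ZeroDivide(Arithmetic): pass

def general_first(exc):
    """General handler listed first: it shadows the specific handler."""
    try:
        raise exc
    except Arithmetic:
        return "arithmetic"
    except Overflow:          # never reached: Arithmetic already matched
        return "overflow"

def specific_first(exc):
    """Specific handler listed first: it is chosen for Overflow only."""
    try:
        raise exc
    except Overflow:
        return "overflow"
    except Arithmetic:
        return "arithmetic"
```

With the general handler first, an @Overflow@ is handled as plain @Arithmetic@. Listing the specific handler first lets it take the one case it names while the general handler still covers the siblings.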
+Each virtual type needs an associated virtual table. A virtual table is a
+structure with fields for all the virtual members of a type. A virtual type has
+all the virtual members of its parent and can add more. It may also update the
+values of the virtual members and often does.
+Virtual types and casts are not part of \CFA's EHM nor are they required for
+any EHM. But \CFA uses a hierarchical system of exceptions and this feature
+is leveraged to create that.
+% Maybe talk about why the virtual system is so minimal.
+% Created for but not a part of the exception system.
+The virtual system supports multiple ``trees" of types. Each tree is
+a simple hierarchy with a single root type. Each type in a tree has exactly
+one parent -- except for the root type which has zero parents -- and any
+number of children.
+Any type that belongs to any of these trees is called a virtual type.
+% A type's ancestors are its parent and its parent's ancestors.
+% The root type has no ancestors.
+% A type's descendants are its children and its children's descendants.
+Every virtual type also has a list of virtual members. Children inherit
+their parent's list of virtual members but may add new members to it.
+It is important to note that these are virtual members, not virtual methods
+of object-orientated programming, and can be of any type.
+However, since \CFA has function pointers and they are allowed, virtual
+members can be used to mimic virtual methods.
+Each virtual type has a unique id.
+This unique id and all the virtual members are combined
+into a virtual table type. Each virtual type has a pointer to a virtual table
+as a hidden field.
+Up until this point the virtual system is similar to ones found in
+object-orientated languages but this is where \CFA diverges. Objects encapsulate a
+single set of behaviours in each type, universally across the entire program,
+and indeed all programs that use that type definition. In this sense the
+types are ``closed" and cannot be altered.
+In \CFA types do not encapsulate any behaviour. Traits are local and
+types can begin to satisfy a trait, stop satisfying a trait or satisfy the same
+trait in a different way at any lexical location in the program.
+In this sense they are ``open" as they can change at any time. This means it
+is impossible to pick a single set of functions that represent the type's
+implementation across the program.
+\CFA side-steps this issue by not having a single virtual table for each
+type. A user can define virtual tables which are filled in at their
+declaration and given a name. Anywhere that name is visible, even if it was
+defined locally inside a function (although that means it will not have a
+static lifetime), it can be used.
+Specifically, a virtual type is ``bound" to a virtual table which
+sets the virtual members for that object. The virtual members can be accessed
+through the object.
 While much of the virtual infrastructure is created, it is currently only used
 …
 \Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be
 a pointer to a virtual type.
 The cast dynamically checks if the @EXPRESSION@ type is the same or a subtype
+The cast dynamically checks if the @EXPRESSION@ type is the same or a sub-type
 of @TYPE@, and if true, returns a pointer to the
 @EXPRESSION@ object, otherwise it returns @0p@ (null pointer).
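The check performed by the virtual cast can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the \CFA implementation: it assumes each type id records its parent and that the cast walks that chain, and the names @TypeId@, @VTable@, @Instance@ and @virtual_cast@ are invented for the illustration.

```python
# Sketch only: the real layout of \CFA virtual tables may differ.
class TypeId:
    """Unique id of a virtual type; parent is None for the root of a tree."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

class VTable:
    """A type id combined with the virtual members (values of any type)."""
    def __init__(self, type_id, **members):
        self.type_id = type_id
        self.members = members

class Instance:
    """An object of a virtual type; vtable models the hidden field."""
    def __init__(self, vtable):
        self.vtable = vtable

def virtual_cast(target, obj):
    """Return obj if its type is target or a descendant, else None."""
    node = obj.vtable.type_id
    while node is not None:
        if node is target:
            return obj
        node = node.parent
    return None

# A small tree: exception <- arithmetic <- overflow
exception_id = TypeId("exception")
arithmetic_id = TypeId("arithmetic", exception_id)
overflow_id = TypeId("overflow", arithmetic_id)

overflow_obj = Instance(VTable(overflow_id, msg=lambda: "overflow"))
arithmetic_obj = Instance(VTable(arithmetic_id, msg=lambda: "arithmetic"))
```

Here @None@ plays the role of @0p@: casting @overflow_obj@ to the arithmetic type succeeds, while casting @arithmetic_obj@ down to the overflow type does not.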
 …
 \end{cfa}
 The trait is defined over two types, the exception type and the virtual table
 type. This should be one-to-one, each exception type has only one virtual
+type. This should be one-to-one: each exception type has only one virtual
 table type and vice versa. The only assertion in the trait is
 @get_exception_vtable@, which takes a pointer of the exception type and
 returns a reference to the virtual table type instance.
+% TODO: This section, and all references to get_exception_vtable, are
+% out-of-date. Perhaps wait until the update is finished before rewriting it.
 The function @get_exception_vtable@ is actually a constant function.
 Recardless of the value passed in (including the null pointer) it should
+Regardless of the value passed in (including the null pointer) it should
 return a reference to the virtual table instance for that type.
 The reason it is a function instead of a constant is that it makes type
 …
 % similar system I know of (except Agda's I guess) so I took it out.
+There are two more traits for exceptions @is_termination_exception@ and
+@is_resumption_exception@. They are defined as follows:
+There are two more traits for exceptions defined as follows:
 \begin{cfa}
 trait is_termination_exception(
 …
 };
 \end{cfa}
+In other words they make sure that a given type and virtual type is an
+exception and defines one of the two default handlers. These default handlers
+are used in the main exception handling operations \see{Exception Handling}
+and their use will be detailed there.
+However all three of these traits can be trickly to use directly.
+There is a bit of repetition required but
+Both traits ensure a pair of types are an exception type and its virtual table
+and define one of the two default handlers. The default handlers are used
+as fallbacks and are discussed in detail in \VRef{s:ExceptionHandling}.
+However, all three of these traits can be tricky to use directly.
+While there is a bit of repetition required,
 the largest issue is that the virtual table type is mangled and not in a user
 facing way. So there are three macros that can be used to wrap these traits
 when you need to refer to the names:
+facing way. So these three macros are provided to wrap these traits to
+simplify referring to the names:
 @IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.
 All take one or two arguments. The first argument is the name of the
 exception type. Its unmangled and mangled form are passed to the trait.
+All three take one or two arguments. The first argument is the name of the
+exception type. The macro passes its unmangled and mangled form to the trait.
 The second (optional) argument is a parenthesized list of polymorphic
 arguments. This argument should only with polymorphic exceptions and the
 list will be passed to both types.
 In the current set-up the base name and the polymorphic arguments have to
 match so these macros can be used without losing flexability.
+arguments. This argument is only used with polymorphic exceptions and the
+list is passed to both types.
+In the current set-up, the two types always have the same polymorphic
+arguments so these macros can be used without losing flexibility.
 For example consider a function that is polymorphic over types that have a
 …
 \section{Exception Handling}
 \CFA provides two kinds of exception handling, termination and resumption.
+These twin operations are the core of the exception handling mechanism and
 are the reason for the features of exceptions.
+\label{s:ExceptionHandling}
+\CFA provides two kinds of exception handling: termination and resumption.
+These twin operations are the core of \CFA's exception handling mechanism.
 This section will cover the general patterns shared by the two operations and
 then go on to cover the details of each individual operation.
+Both operations follow the same set of steps to do their operation. They both
+start with the user preforming a throw on an exception.
+Then there is the search for a handler, if one is found than the exception
+is caught and the handler is run. After that control returns to normal
+execution.
+Both operations follow the same set of steps.
+Both start with the user performing a raise on an exception.
+Then the exception propagates up the stack.
+If a handler is found the exception is caught and the handler is run.
+After that control returns to normal execution.
 If the search fails a default handler is run and then control
+returns to normal execution immediately. That is where the default handlers
+@defaultTermiationHandler@ and @defaultResumptionHandler@ are used.
+returns to normal execution after the raise.
+This general description covers what the two kinds have in common.
+Differences include how propagation is performed, where execution continues
+after an exception is caught and handled, and which default handler is run.
 \subsection{Termination}
 \label{s:Termination}
+Termination handling is more familiar kind and used in most programming
+Termination handling is the familiar kind and used in most programming
 languages with exception handling.
+It is dynamic, non-local goto. If a throw is successful then the stack will
+be unwound and control will (usually) continue in a different function on
+the call stack. They are commonly used when an error has occured and recovery
+is impossible in the current function.
+It is a dynamic, non-local goto. If the raised exception is matched and
+handled the stack is unwound and control will (usually) continue in the function
+on the call stack that defined the handler.
+Termination is commonly used when an error has occurred and recovery is
+impossible locally.
 % (usually) Control can continue in the current function but then a different
 % control flow construct should be used.
 A termination throw is started with the @throw@ statement:
+A termination raise is started with the @throw@ statement:
 \begin{cfa}
 throw EXPRESSION;
 \end{cfa}
 The expression must return a reference to a termination exception, where the
+termination exception is any type that satifies @is_termination_exception@
+at the call site.
+Through \CFA's trait system the functions in the traits are passed into the
+throw code. A new @defaultTerminationHandler@ can be defined in any scope to
+termination exception is any type that satisfies the trait
+@is_termination_exception@ at the call site.
+Through \CFA's trait system the trait functions are implicitly passed into the
+throw code and the EHM.
+A new @defaultTerminationHandler@ can be defined in any scope to
 change the throw's behavior (see below).
+The throw will copy the provided exception into managed memory. It is the
+user's responcibility to ensure the original exception is cleaned up if the
+stack is unwound (allocating it on the stack should be sufficient).
+Then the exception system searches the stack using the copied exception.
+It starts starts from the throw and proceeds to the base of the stack,
+The throw will copy the provided exception into managed memory to ensure
+the exception is not destroyed if the stack is unwound.
+It is the user's responsibility to ensure the original exception is cleaned
+up whether the stack is unwound or not. Allocating it on the stack is
+usually sufficient.
+Then propagation starts with the search. \CFA uses a ``first match" rule so
+matching is performed with the copied exception as the search continues.
+It starts from the throwing function and proceeds to the base of the stack,
 from callee to caller.
 At each stack frame, a check is made for resumption handlers defined by the
 …
 try {
         GUARDED_BLOCK
 } catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
+} catch (EXCEPTION_TYPE$\(_1\)$ * [NAME$\(_1\)$]) {
         HANDLER_BLOCK$\(_1\)$
 } catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
+} catch (EXCEPTION_TYPE$\(_2\)$ * [NAME$\(_2\)$]) {
         HANDLER_BLOCK$\(_2\)$
+}
 \end{cfa}
 When viewed on its own a try statement will simply exceute the statements in
 @GUARDED_BLOCK@ and when those are finished the try statement finishes.
+When viewed on its own, a try statement will simply execute the statements
+in @GUARDED_BLOCK@ and when those are finished the try statement finishes.
 However, while the guarded statements are being executed, including any
+functions they invoke, all the handlers following the try block are now
+or any functions invoked from those
+statements, throws an exception, and the exception
+is not handled by a try statement further up the stack, the termination
+handlers are searched for a matching exception type from top to bottom.
+Exception matching checks the representation of the thrown exception-type is
+the same or a descendant type of the exception types in the handler clauses. If
+it is the same of a descendent of @EXCEPTION_TYPE@$_i$ then @NAME@$_i$ is
+invoked functions, all the handlers in the statement are now on the search
+path. If a termination exception is thrown and not handled further up the
+stack they will be matched against the exception.
+Exception matching checks the handler in each catch clause in the order
+they appear, top to bottom. If the representation of the thrown exception type
+is the same or a descendant of @EXCEPTION_TYPE@$_i$ then @NAME@$_i$
+(if provided) is
 bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$
 are executed. If control reaches the end of the handler, the exception is
 freed and control continues after the try statement.
+If no handler is found during the search then the default handler is run.
+If no termination handler is found during the search then the default handler
+(@defaultTerminationHandler@) is run.
 Through \CFA's trait system the best match at the throw site will be used.
 This function is run and is passed the copied exception. After the default
 handler is run control continues after the throw statement.
+There is a global @defaultTerminationHandler@ that cancels the current stack
+with the copied exception. However it is generic over all exception types so
+new default handlers can be defined for different exception types and so
+different exception types can have different default handlers.
+There is a global @defaultTerminationHandler@ that is polymorphic over all
+exception types. Since it is so general a more specific handler can be
+defined and will be used for those types, effectively overriding the handler
+for a particular exception type.
+The global default termination handler performs a cancellation
+\see{\VRef{s:Cancellation}} on the current stack with the copied exception.
 \subsection{Resumption}
 \label{s:Resumption}
 Resumption exception handling is a less common form than termination but is
 just as old~\cite{Goodenough75} and is in some sense simpler.
 It is a dynamic, non-local function call. If the throw is successful a
+closure will be taken from up the stack and executed, after which the throwing
 function will continue executing.
 These are most often used when an error occured and if the error is repaired
+Resumption exception handling is less common than termination but is
+just as old~\cite{Goodenough75} and is simpler in many ways.
+It is a dynamic, non-local function call. If the raised exception is
+matched a closure will be taken from up the stack and executed,
+after which the raising function will continue executing.
+These are most often used when an error occurred and if the error is repaired
 then the function can continue.
 …
 throwResume EXPRESSION;
 \end{cfa}
+The semantics of the @throwResume@ statement are like the @throw@, but the
+expression has return a reference a type that satifies the trait
+@is_resumption_exception@. The assertions from this trait are available to
+It works much the same way as the termination throw.
+The expression must return a reference to a resumption exception,
+where the resumption exception is any type that satisfies the trait
+@is_resumption_exception@ at the call site.
+The assertions from this trait are available to
 the exception system while handling the exception.
+At runtime, no copies are made. As the stack is not unwound the exception and
+At run-time, no exception copy is made.
+As the stack is not unwound the exception and
 any values on the stack will remain in scope while the resumption is handled.
+Then the exception system searches the stack using the provided exception.
+It starts starts from the throw and proceeds to the base of the stack,
+from callee to caller.
+The EHM then begins propagation. The search starts from the raise in the
+resuming function and proceeds to the base of the stack, from callee to caller.
 At each stack frame, a check is made for resumption handlers defined by the
 @catchResume@ clauses of a @try@ statement.
 …
 try {
         GUARDED_BLOCK
 } catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
+} catchResume (EXCEPTION_TYPE$\(_1\)$ * [NAME$\(_1\)$]) {
         HANDLER_BLOCK$\(_1\)$
 } catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
+} catchResume (EXCEPTION_TYPE$\(_2\)$ * [NAME$\(_2\)$]) {
         HANDLER_BLOCK$\(_2\)$
+}
 \end{cfa}
+If the handlers are not involved in a search this will simply execute the
+@GUARDED_BLOCK@ and then continue to the next statement.
+Its purpose is to add handlers onto the stack.
+(Note, termination and resumption handlers may be intermixed in a @try@
+statement but the kind of throw must be the same as the handler for it to be
+considered as a possible match.)
+If a search for a resumption handler reaches a try block it will check each
+@catchResume@ clause, top-to-bottom.
+At each handler if the thrown exception is or is a child type of
+@EXCEPTION_TYPE@$_i$ then the a pointer to the exception is bound to
+@NAME@$_i$ and then @HANDLER_BLOCK@$_i$ is executed. After the block is
+finished control will return to the @throwResume@ statement.
+% I wonder if there would be some good central place for this.
+Note that termination handlers and resumption handlers may be used together
+in a single try statement, intermixing @catch@ and @catchResume@ freely.
+Each type of handler will only interact with exceptions from the matching
+type of raise.
+When a try statement is executed it simply executes the statements in the
+@GUARDED_BLOCK@ and then finishes.
+However, while the guarded statements are being executed, including any
+invoked functions, all the handlers in the statement are now on the search
+path. If a resumption exception is reported and not handled further up the
+stack they will be matched against the exception.
+Exception matching checks the handler in each catch clause in the order
+they appear, top to bottom. If the representation of the thrown exception type
+is the same or a descendant of @EXCEPTION_TYPE@$_i$ then @NAME@$_i$
+(if provided) is bound to a pointer to the exception and the statements in
+@HANDLER_BLOCK@$_i$ are executed.
+If control reaches the end of the handler, execution continues after
+the raise statement that raised the handled exception.
 Like termination, if no resumption handler is found, the default handler
 …
 call site according to \CFA's overloading rules. The default handler is
 passed the exception given to the throw. When the default handler finishes,
 execution continues after the throw statement.
+execution continues after the raise statement.
 There is a global @defaultResumptionHandler@ that is polymorphic over all
 termination exceptions and performs a termination throw on the exception.
 The @defaultTerminationHandler@ for that throw is matched at the original
 throw statement (the resumption @throwResume@) and it can be customized by
+The @defaultTerminationHandler@ for that raise is matched at the original
+raise statement (the resumption @throwResume@) and it can be customized by
 introducing a new or better match as well.
+% \subsubsection?
+\subsubsection{Resumption Marking}
 A key difference between resumption and termination is that resumption does
 not unwind the stack. A side effect is that when a handler is matched
 …
 search and match the handler in the @catchResume@ clause. This will be
 called and placed on the stack on top of the try-block. The second throw then
 throws and will seach the same try block and put call another instance of the
+throws and will search the same try block and call another instance of the
 same handler leading to an infinite loop.
 …
 can form with multiple handlers and different exception types.
+To prevent all of these cases we mask sections of the stack, or equivalently
+the try statements on the stack, so that the resumption search skips over
+them and continues with the next unmasked section of the stack.
+A section of the stack is marked when it is searched to see if it contains
+a handler for an exception and unmarked when that exception has been handled
+or the search was completed without finding a handler.
+% This might need a diagram. But it is an important part of the justification
+% of the design of the traversal order.
+\begin{verbatim}
+       throwResume2 ----------.
+            |                 |
+ generated from handler       |
+            |                 |
+         handler              |
+            |                 |
+        throwResume1 -----.   :
+            |             |   :
+           try            |   : search skip
+            |             |   :
+        catchResume  <----'   :
+            |                 |
+\end{verbatim}
+The rules can be remembered by thinking about what would be searched in
+termination. When a throw happens in a handler, a termination handler
+skips everything from the original throw to the original catch because that
+part of the stack has been unwound, while a resumption handler skips the same
+section of stack because it has been masked.
+A throw in a default handler will perform the same search as the original
+throw because, for termination, nothing has been unwound and, for resumption,
+the mask will be the same.
+The symmetry with termination is why this pattern was picked. Other patterns,
+such as marking just the handlers that caught, also work but lack the
+symmetry, which means there is more to remember.
+To prevent all of these cases we mark try statements on the stack.
+A try statement is marked when a match check is performed with it and an
+exception. The statement will be unmarked when the handling of that exception
+is completed or the search completes without finding a handler.
+While a try statement is marked its handlers are never matched, effectively
+skipping over it to the next try statement.
+\begin{center}
+\input{stack-marking}
+\end{center}
+These rules mirror what happens with termination.
+When a termination throw happens in a handler the search will not look at
+any handlers from the original throw to the original catch because that
+part of the stack has been unwound.
+A resumption raise in the same situation wants to search the entire stack,
+but it will not try to match the exception with try statements in the section
+that would have been unwound as they are marked.
+The symmetry between resumption and termination is why this pattern was picked.
+Other patterns, such as marking just the handlers that caught, also work but
+lack the symmetry, which means there are more rules to remember.
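The marking rule can be illustrated with a small stand-alone C model. This is only a sketch under stated assumptions, not the \CFA runtime: try statements are modelled as entries in an array, exceptions as plain integers, and every name (@try_frame@, @resume_raise@, @reraising_handler@, @run_demo@) is hypothetical. The unhandled case only bumps a counter, whereas the real default handler performs a termination throw.

```c
#include <stddef.h>

#define MAX_TRY 8

/* Hypothetical model of one try statement with a single catchResume clause. */
struct try_frame {
    int handles;                /* exception id the clause matches */
    void (*handler)(int exc);   /* body of the handler */
    int marked;                 /* non-zero while hidden from searches */
};

struct try_frame try_stack[MAX_TRY];
int try_depth = 0;
int unhandled = 0;              /* raises that found no handler */
int handler_runs = 0;           /* how often the demo handler ran */

/* Search from the newest try statement to the base of the stack.
 * Every try statement that is checked gets marked, and the marks set by
 * this raise are cleared once the exception is handled or the search fails. */
void resume_raise(int exc) {
    int newly_marked[MAX_TRY];
    int count = 0;
    void (*found)(int) = NULL;

    for (int i = try_depth - 1; i >= 0 && found == NULL; --i) {
        struct try_frame *frame = &try_stack[i];
        if (frame->marked) {
            continue;           /* masked: skip to the next try statement */
        }
        frame->marked = 1;
        newly_marked[count++] = i;
        if (frame->handles == exc) {
            found = frame->handler;
        }
    }

    if (found != NULL) {
        found(exc);             /* runs on top of the stack, nothing unwound */
    } else {
        unhandled += 1;         /* stand-in for the default handler */
    }

    for (int k = 0; k < count; ++k) {
        try_stack[newly_marked[k]].marked = 0;
    }
}

/* A handler that raises the exception it is handling. */
void reraising_handler(int exc) {
    handler_runs += 1;
    resume_raise(exc);
}

/* One try statement whose handler raises the same exception again. */
int run_demo(void) {
    try_depth = 0;
    handler_runs = 0;
    unhandled = 0;
    try_stack[try_depth++] = (struct try_frame){ 1, reraising_handler, 0 };
    resume_raise(1);
    try_depth--;
    return handler_runs;
}
```

In this model the second raise skips the marked try statement, so the handler runs once and the inner raise ends up unhandled instead of looping forever.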
 \section{Conditional Catch}
 …
 condition to further control which exceptions they handle:
 \begin{cfa}
 catch (EXCEPTION_TYPE * NAME ; CONDITION)
+catch (EXCEPTION_TYPE * [NAME] ; CONDITION)
 \end{cfa}
 First, the same semantics is used to match the exception type. Second, if the
 …
 matches. Otherwise, the exception search continues as if the exception type
 did not match.
+\begin{cfa}
+try {
+        f1 = open( ... );
+        f2 = open( ... );
+The condition matching allows finer matching by allowing the match to check
+more kinds of information than just the exception type.
+\begin{cfa}
+try {
+        handle1 = open( f1, ... );
+        handle2 = open( f2, ... );
+        handle3 = open( f3, ... );
         ...
 } catch( IOFailure * f ; fd( f ) == f1 ) {
+        // only handle IO failure for f1
+}
+\end{cfa}
+Note, catching @IOFailure@, checking for @f1@ in the handler, and reraising the
+exception if not @f1@ is different because the reraise does not examine any of
+the remaining handlers in the current try statement.
+\section{Rethrowing}
+\colour{red}{From Andrew: I recommend we talk about why the language doesn't
+have rethrows/reraises instead.}
+\label{s:Rethrowing}
+        // Only handle IO failure for f1.
+} catch( IOFailure * f ; fd( f ) == f3 ) {
+        // Only handle IO failure for f3.
+}
+// Can't handle a failure relating to f2 here.
+\end{cfa}
+In this example the file that experienced the IO error is used to decide
+which handler should be run, if any at all.
+\begin{comment}
+% I know I actually haven't got rid of them yet, but I'm going to try
+% to write it as if I had and see if that makes sense:
+\section{Reraising}
+\label{s:Reraising}
 Within the handler block or functions called from the handler block, it is
 possible to reraise the most recently caught exception with @throw@ or
 …
 is part of an unwound stack frame. To prevent this problem, a new default
 handler is generated that does a program-level abort.
+\end{comment}
+\subsection{Comparison with Reraising}
+A more popular way to allow handlers to match in more detail is to reraise
+the exception after it has been caught if it could not be handled here.
+On the surface these two features seem interchangeable.
+If we used @throw;@ to start a termination reraise then these two statements
+would have the same behaviour:
+\begin{cfa}
+try {
+    do_work_may_throw();
+} catch(exception_t * exc ; can_handle(exc)) {
+    handle(exc);
+}
+\end{cfa}
+\begin{cfa}
+try {
+    do_work_may_throw();
+} catch(exception_t * exc) {
+    if (can_handle(exc)) {
+        handle(exc);
+    } else {
+        throw;
+    }
+}
+\end{cfa}
+If there are further handlers after this handler only the first version will
+check them. If multiple handlers on a single try block could handle the same
+exception the translations get more complex but they are equivalently
+powerful.
+That holds until stack unwinding comes into the picture. In termination
+handling, a
+conditional catch happens before the stack is unwound, but a reraise happens
+afterwards. Normally this might only cause you to lose some debug
+information you could get from a stack trace (and that can be sidestepped
+entirely by collecting information during the unwind). But for \CFA there is
+another issue: if the exception isn't handled the default handler should be
+run at the site of the original raise.
+There are two problems with this: the site of the original raise doesn't
+exist anymore and the default handler might not exist anymore. The site will
+always be removed as part of the unwinding, often with the entirety of the
+function it was in. The default handler could be a stack allocated nested
+function removed during the unwind.
+This means actually trying to pretend the catch didn't happen, continuing
+the original raise instead of starting a new one, is infeasible.
+That is the expected behaviour for most languages and we can't replicate
+that behaviour.
 \section{Finally Clauses}
+\label{s:FinallyClauses}
 Finally clauses are used to perform unconditional clean-up when leaving a
 scope. They are placed at the end of a try statement:
+scope and are placed at the end of a try statement after any handler clauses:
 \begin{cfa}
 try {
 …
 Execution of the finally block should always finish, meaning control runs off
+the end of the block. This requirement ensures always continues as if the
+finally clause is not present, \ie finally is for cleanup not changing control
+flow. Because of this requirement, local control flow out of the finally block
+the end of the block. This requirement ensures control always continues as if
+the finally clause is not present, \ie finally is for cleanup not changing
+control flow.
+Because of this requirement, local control flow out of the finally block
 is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or
 @return@ that causes control to leave the finally block. Other ways to leave
 the finally block, such as a long jump or termination are much harder to check,
 and at best requiring additional run-time overhead, and so are mearly
+and at best require additional run-time overhead, and so are only
 discouraged.
 Not all languages with exceptions have finally clauses. Notably \Cpp does
+Not all languages with unwinding have finally clauses. Notably \Cpp does
 without it as destructors serve a similar role. Although destructors and
 finally clauses can be used in many of the same areas they have their own
 use cases like top-level functions and lambda functions with closures.
 Destructors take a bit more work to set up but are much easier to reuse while
+finally clauses are good for once offs and can include local information.
+finally clauses are good for one-off uses and
+can easily include local information.
 \section{Cancellation}
+\label{s:Cancellation}
 Cancellation is a stack-level abort, which can be thought of as an
 uncatchable termination. It unwinds the entirety of the current stack, and if
+uncatchable termination. It unwinds the entire current stack, and if
 possible forwards the cancellation exception to a different stack.
 …
 There is no special statement for starting a cancellation; instead the standard
 library function @cancel_stack@ is called passing an exception. Unlike a
 throw, this exception is not used in matching only to pass information about
+raise, this exception is not used in matching, only to pass information about
 the cause of the cancellation.
 (This also means matching cannot fail so there is no default handler either.)
 After @cancel_stack@ is called the exception is copied into the exception
 handling mechanism's memory. Then the entirety of the current stack is
+(This also means matching cannot fail so there is no default handler.)
+After @cancel_stack@ is called the exception is copied into the EHM's memory
+and the current stack is
 unwound. After that it depends on which stack is being cancelled.
 \begin{description}
 \item[Main Stack:]
 The main stack is the one used by the program main at the start of execution,
+and is the only stack in a sequential program. Even in a concurrent program
+the main stack is only dependent on the environment that started the program.
+Hence, when the main stack is cancelled there is nowhere else in the program
+to notify. After the stack is unwound, there is a program-level abort.
+and is the only stack in a sequential program.
+After the main stack is unwound there is a program-level abort.
+There are two reasons for this. The first is that it obviously had to do this
+in a sequential program as there is nothing else to notify and the simplicity
+of keeping the same behaviour in sequential and concurrent programs is good.
+Also, even in concurrent programs there is no stack that the main stack has
+an innate connection to, so it would have to be explicitly managed.
 \item[Thread Stack:]
+A thread stack is created for a @thread@ object or object that satisfies the
+@is_thread@ trait. A thread only has two points of communication that must
+happen: start and join. As the thread must be running to perform a
+cancellation, it must occur after start and before join, so join is used
+for communication here.
+After the stack is unwound, the thread halts and waits for
+another thread to join with it. The joining thread checks for a cancellation,
+and if present, resumes exception @ThreadCancelled@.
+There is a subtle difference between the explicit join (@join@ function) and
+implicit join (from a destructor call). The explicit join takes the default
+handler (@defaultResumptionHandler@) from its calling context, which is used if
+the exception is not caught. The implicit join does a program abort instead.
+This semantics is for safety. If an unwind is triggered while another unwind
+is underway only one of them can proceed as they both want to ``consume'' the
+stack. Letting both try to proceed leads to very undefined behaviour.
+Both termination and cancellation involve unwinding and, since the default
+@defaultResumptionHandler@ performs a termination, that could more easily
+happen in an implicit join inside a destructor. So there is an error message
+and an abort instead.
+\todo{Perhaps have a more general discussion of unwind collisions before
+this point.}
+The recommended way to avoid the abort is to handle the initial resumption
+from the implicit join. If required you may put an explicit join inside a
+finally clause to disable the check and use the local
+@defaultResumptionHandler@ instead.
+\item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object
+or object that satisfies the @is_coroutine@ trait. A coroutine only knows of
+two other coroutines, its starter and its last resumer. Of the two the last
+resumer has the tightest coupling to the coroutine it activated and the most
+up-to-date information.
+Hence, cancellation of the active coroutine is forwarded to the last resumer
+after the stack is unwound. When the resumer restarts, it resumes exception
+@CoroutineCancelled@, which is polymorphic over the coroutine type and has a
+pointer to the cancelled coroutine.
+The resume function also has an assertion for the @defaultResumptionHandler@
+for the exception, so it will use the default handler like a regular throw.
+A thread stack is created for a \CFA @thread@ object or object that satisfies
+the @is_thread@ trait.
+After a thread stack is unwound the exception is stored until another
+thread attempts to join with it. Then the exception @ThreadCancelled@,
+which stores a reference to the thread and to the exception passed to the
+cancellation, is reported from the join.
+There is one difference between an explicit join (with the @join@ function)
+and an implicit join (from a destructor call). The explicit join takes the
+default handler (@defaultResumptionHandler@) from its calling context while
+the implicit join provides its own which does a program abort if the
+@ThreadCancelled@ exception cannot be handled.
+Communication is done at join because a thread only has to have two points of
+communication with other threads: start and join.
+Since a thread must be running to perform a cancellation (and cannot be
+cancelled from another stack), the cancellation must be after start and
+before the join. So join is the one that we will use.
+% TODO: Find somewhere to discuss unwind collisions.
+The difference between the explicit and implicit join is for safety and
+debugging. It helps prevent unwinding collisions by avoiding throwing from
+a destructor and prevents cascading the error across multiple threads if
+the user is not equipped to deal with it.
+Also you can always add an explicit join if that is the desired behaviour.
+\item[Coroutine Stack:]
+A coroutine stack is created for a @coroutine@ object or object that
+satisfies the @is_coroutine@ trait.
+After a coroutine stack is unwound control returns to the resume function
+that most recently resumed it. The resume statement reports a
+@CoroutineCancelled@ exception, which contains references to the cancelled
+coroutine and the exception used to cancel it.
+The resume function also takes the @defaultResumptionHandler@ from the
+caller's context and passes it to the internal report.
+A coroutine knows of two other coroutines, its starter and its last resumer.
+The starter has a much more distant connection while the last resumer just
+(in terms of coroutine state) called resume on this coroutine, so the message
+is passed to the latter.
 \end{description}

doc/theses/andrew_beach_MMath/future.tex

-              rfeacef9
+              r5407cdc
 patterns to find the handler.
+\section{Checked Exceptions}
+Checked exceptions make exceptions part of a function's type by adding the
+exception signature. An exception signature must declare all checked
+exceptions that could propagate from the function (either because they were
+raised inside the function or came from a sub-function). This improves safety
+by making sure every checked exception is either handled or consciously
+passed on.
+However checked exceptions were never seriously considered for this project
+for two reasons. The first is due to time constraints, even copying an
+existing checked exception system would be pushing the remaining time and
+trying to address the second problem would take even longer. The second
+problem is that checked exceptions have some real usability trade-offs in
+exchange for the increased safety.
+These trade-offs are most problematic when trying to pass exceptions through
+higher-order functions from the functions the user passed into the
+higher-order function. There are no well known solutions to this problem
+that were satisfactory for \CFA (which carries some of C's
+flexibility-over-safety design) so one would have to be researched and developed.
+Follow-up work might add checked exceptions to \CFA, possibly using
+polymorphic exception signatures, a form of tunneling~\cite{Zhang19} or
+checked and unchecked raises.
 \section{Zero-Cost Try}
 \CFA does not have zero-cost try-statements because the compiler generates C

doc/theses/andrew_beach_MMath/implement.tex

-              rfeacef9
+              r5407cdc
 library.
+\subsection{Virtual Type}
+Virtual types only have one change to their structure, the addition of a
+pointer to the virtual table. This is always the first field so that
+if it is cast to a supertype the field's location is still known.
+This field is set as part of all new generated constructors.
+\todo{They only come as part exceptions and don't work.}
+After the object is created the field is constant.
+However it can be read from, internally it is just a regular field called
+@virtual_table@. Dereferencing it gives the virtual table and access to the
+type's virtual members.
 \subsection{Virtual Table}
+Every time a virtual type is defined the new virtual table type must also be
+defined.
+The unique instance is important because the address of the virtual table
+instance is used as the identifier for the virtual type. So a pointer to the
+virtual table and the ID for the virtual type are interchangeable.
+\todo{Unique instances might be going so we will have to talk about the new
+system instead.}
+The first step in putting it all together is to create the virtual table type.
+The virtual table type is just a structure and can be described in terms of
+its fields. The first field is always the parent type ID (or a pointer to
+the parent virtual table) or 0 (the null pointer).
+Next, the other fields of the parent virtual table are repeated.
+Finally are the fields used to store any new virtual members of the new
+The virtual type
 The virtual system is accessed through a private constant field inserted at the
 beginning of every virtual type, called the virtual-table pointer. This field
 points at a type's virtual table and is assigned during the object's
 construction.  The address of a virtual table acts as the unique identifier for
+construction. The address of a virtual table acts as the unique identifier for
 the virtual type, and the first field of a virtual table is a pointer to the
 parent virtual-table or @0p@.  The remaining fields are duplicated from the
+parent virtual-table or @0p@. The remaining fields are duplicated from the
 parent tables in this type's inheritance chain, followed by any fields this type
 introduces. Parent fields are duplicated so they can be changed (\CC
 \lstinline[language=c++]|override|), so that references to the dispatched type
+introduces. Parent fields are duplicated so they can be changed (all virtual
+members are overridable), so that references to the dispatched type
 are replaced with the current virtual type.
-\PAB{Can you create a simple diagram of the layout?}
 % These are always taken by pointer or reference.
+% Simple ascii diagram:
+\begin{verbatim}
+parent_pointer  \
+parent_field0   |
+...             | Same layout as parent.
+parent_fieldN   /
+child_field0
+...
+child_fieldN
+\end{verbatim}
+\todo{Refine the diagram}
 % For each virtual type, a virtual table is constructed. This is both a new type
 …
 A virtual table is created when the virtual type is created. The name of the
 type is created by mangling the name of the base type. The name of the instance
 is also generated by name mangling.  The fields are initialized automatically.
+is also generated by name mangling. The fields are initialized automatically.
 The parent field is initialized by getting the type of the parent field and
 using that to calculate the mangled name of the parent's virtual table type.
 …
 \begin{sloppypar}
 Coroutines and threads need instances of @CoroutineCancelled@ and
 @ThreadCancelled@ respectively to use all of their functionality.  When a new
+@ThreadCancelled@ respectively to use all of their functionality. When a new
 data type is declared with @coroutine@ or @thread@ the forward declaration for
 the instance is created as well. The definition of the virtual table is created
 …
 The function is
 \begin{cfa}
+void * __cfa__virtual_cast( struct __cfa__parent_vtable const * parent,
+void * __cfa__virtual_cast(
+        struct __cfa__parent_vtable const * parent,
         struct __cfa__parent_vtable const * const * child );
+}
 \end{cfa}
+and it is implemented in the standard library. It takes a pointer to the target
+type's virtual table and the object pointer being cast. The function performs a
+linear search starting at the object's virtual-table and walking through
+the parent pointers, checking to see if it or any of its ancestors are the
+same as the target-type virtual table-pointer.
+For the generated code, a forward declaration of the virtual works as follows.
+There is a forward declaration of @__cfa__virtual_cast@ in every \CFA file so
+it can just be used. The object argument is the expression being cast so that
+is just placed in the argument list.
+To build the target type parameter, the compiler creates a mapping from
+concrete type-name -- so for polymorphic types the parameters are filled in --
+to virtual table address. Every virtual table declaration is added to this
+table; repeats are ignored unless they have conflicting definitions.  Note,
+these declarations do not have to be in scope, but they should usually be
+introduced as part of the type definition.
+\PAB{I do not understood all of \VRef{s:VirtualSystem}. I think you need to
+write more to make it clear.}
+and it is implemented in the standard library. The structure represents the
+head of a vtable which is the pointer to the parent virtual table. The
+@parent@ points directly at the parent type virtual table while the @child@
+points at the object of the (possible) child type.
+In terms of the virtual cast expression, @parent@ comes from looking up the
+type being cast to and @child@ is the result of the expression being cast.
+Because the compiler outputs C code, some C type casts are also used.
+The last bit of glue is a map that saves every virtual type the compiler
+sees. This is used to check the type used in a virtual cast is a virtual
+type and to get its virtual table.
+(It also checks for conflicting definitions.)
+Inside the function it is a simple conditional. If the type represented by
+@parent@ is or is an ancestor of the type represented by @*child@ (it
+requires one more level of dereference to pass through the object) then @child@
+is returned, otherwise the null pointer is returned.
+The check itself is performed as a simple linear search. If the child
+virtual table or any of its ancestors (which are retrieved through the first
+field of every virtual table) are the same as the parent virtual table then
+the cast succeeds.
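The linear search described above can be sketched in plain C. This is a minimal illustration, not the library's implementation: @vtable_head@ stands in for @__cfa__parent_vtable@ and models only the parent pointer in the first field, and @virtual_cast_sketch@, the three example tables and @derived_obj@ are hypothetical names introduced for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the head of a virtual table: only the
 * parent pointer stored in the first field is modelled. */
struct vtable_head {
    const struct vtable_head *parent;   /* NULL for a root type */
};

/* Walk from the object's table towards the root. The cast succeeds if the
 * target table is met on the way, otherwise the null pointer is returned. */
void *virtual_cast_sketch(const struct vtable_head *parent,
                          const struct vtable_head *const *child) {
    for (const struct vtable_head *cur = *child; cur != NULL;
         cur = cur->parent) {
        if (cur == parent) {
            return (void *)child;       /* same type or a descendant */
        }
    }
    return NULL;                        /* unrelated type */
}

/* Example tables: base <- derived, plus an unrelated root type. */
const struct vtable_head base_vt = { NULL };
const struct vtable_head derived_vt = { &base_vt };
const struct vtable_head other_vt = { NULL };

/* An object whose first field is its virtual-table pointer. */
struct derived_obj {
    const struct vtable_head *virtual_table;
    int payload;
};
struct derived_obj obj = { &derived_vt, 42 };
```

Passing @&obj.virtual_table@ as the child works because the virtual-table pointer is the first field, so the address of that field is also the address of the object. With these tables a cast of @obj@ to the base or derived type returns the object, and a cast to the unrelated type returns the null pointer.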
 \section{Exceptions}
 …
 stack. On function entry and return, unwinding is handled directly by the code
 embedded in the function. Usually, the stack-frame size is known statically
 based on parameter and local variable declarations.  For dynamically-sized
+based on parameter and local variable declarations. For dynamically-sized
 local variables, a runtime computation is necessary to know the frame
 size. Finally, a function's frame-size may change during execution as local
 …
 To use libunwind, each function must have a personality function and a Language
 Specific Data Area (LSDA).  The LSDA has the unique information for each
+Specific Data Area (LSDA). The LSDA has the unique information for each
 function to tell the personality function where a function is executing, its
 current stack frame, and what handlers should be checked.  Theoretically, the
+current stack frame, and what handlers should be checked. Theoretically, the
 LSDA can contain any information but conventionally it is a table with entries
 representing regions of the function and what has to be done there during
 …
 The GCC compilation flag @-fexceptions@ causes the generation of an LSDA and
+attaches its personality function. \PAB{to what is it attached?}  However, this
+flag only handles the cleanup attribute
+attaches its personality function. However, this
+flag only handles the cleanup attribute:
+\todo{Peter: What is attached? Andrew: It uses the .cfi\_personality directive
+and that's all I know.}
 \begin{cfa}
 void clean_up( int * var ) { ... }
 int avar __attribute__(( __cleanup(clean_up) ));
+int avar __attribute__(( cleanup(clean_up) ));
 \end{cfa}
+which is used on a variable and specifies a function, \eg @clean_up@, run when
+the variable goes out of scope. The function is passed a pointer to the object
+so it can be used to mimic destructors. However, this feature cannot be used to
+mimic @try@ statements.
+which is used on a variable and specifies a function, in this case @clean_up@,
+run when the variable goes out of scope.
+The function is passed a pointer to the object being removed from the stack
+so it can be used to mimic destructors.
+However, this feature cannot be used to mimic @try@ statements as it cannot
+control the unwinding.
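As a small illustration of the attribute, the following C sketch (for GCC or Clang, which both provide @cleanup@) shows the function running on every exit from the scope, including an early return. The counter @cleanups_run@ and the function @guarded@ are hypothetical and exist only to make the effect observable.

```c
#include <stdio.h>

int cleanups_run = 0;   /* counts calls, only for the demonstration */

/* Called automatically with a pointer to the variable leaving scope. */
void clean_up(int *var) {
    printf("cleaning up %d\n", *var);
    cleanups_run += 1;
}

/* Declares one guarded variable; clean_up runs on every path out of the
 * scope, including the early return. */
int guarded(int early) {
    int avar __attribute__((cleanup(clean_up))) = 7;
    if (early) {
        return 1;
    }
    avar += 1;
    return 0;
}
```

Each call to @guarded@ runs @clean_up@ exactly once, which is what makes the attribute usable for destructor-like behaviour. It gives no way to stop or redirect an unwind, which is the part a @try@ statement needs.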
 \subsection{Personality Functions}
 Personality functions have a complex interface specified by libunwind.  This
+Personality functions have a complex interface specified by libunwind. This
 section covers some of the important parts of the interface.
 A personality function performs four tasks, although not all have to be
 present.
+A personality function can perform different actions depending on how it is
+called.
 \begin{lstlisting}[language=C,{moredelim=**[is][\color{red}]{@}{@}}]
 typedef _Unwind_Reason_Code (*@_Unwind_Personality_Fn@) (
 …
 \item
 @_UA_SEARCH_PHASE@ specifies a search phase and tells the personality function
 to check for handlers.  If there is a handler in a stack frame, as defined by
+to check for handlers. If there is a handler in a stack frame, as defined by
 the language, the personality function returns @_URC_HANDLER_FOUND@; otherwise
 it returns @_URC_CONTINUE_UNWIND@.
 …
 \end{cfa}
 It also unwinds the stack but it does not use the search phase. Instead another
 function, the stop function, is used to stop searching.  The exception is the
+function, the stop function, is used to stop searching. The exception is the
 same as the one passed to raise exception. The extra arguments are the stop
 function and the stop parameter. The stop function has a similar interface as a
 …
 \begin{sloppypar}
 Its arguments are the same as the paired personality function.  The actions
+Its arguments are the same as the paired personality function. The actions
 @_UA_CLEANUP_PHASE@ and @_UA_FORCE_UNWIND@ are always set when it is
 called. Beyond the libunwind standard, both GCC and Clang add an extra action
 …
 strong symbol replacing the sequential version.
+% The version of the function defined in @libcfa@ is very simple. It returns a
+% pointer to a global static variable. With only one stack this global instance
+% is associated with the only stack.
+For coroutines, @this_exception_context@ accesses the exception context stored
+at the base of the stack. For threads, @this_exception_context@ uses the
+concurrency library to access the current stack of the thread or coroutine
+being executed by the thread, and then accesses the exception context stored at
+the base of this stack.
+The sequential @this_exception_context@ returns a hard-coded pointer to the
+global exception context.
+The concurrent version adds the exception context to the data stored at the
+base of each stack. When @this_exception_context@ is called it retrieves the
+active stack and returns the address of the context saved there.
 \section{Termination}
 …
 per-exception storage.
+Exceptions are stored in variable-sized blocks. \PAB{Show a memory layout
+figure.} The first component is a fixed sized data structure that contains the
+[Quick ASCII diagram to get started.]
+\begin{verbatim}
+Fixed Header  | _Unwind_Exception   <- pointer target
+              |
+              | Cforall storage
+              |
+Variable Body | the exception       <- fixed offset
+              V ...
+\end{verbatim}
+Exceptions are stored in variable-sized blocks.
+The first component is a fixed sized data structure that contains the
 information for libunwind and the exception system. The second component is an
 area of memory big enough to store the exception. Macros with pointer arithmetic
 …
 exception type. The size and copy function are used immediately to copy an
 exception into managed memory. After the exception is handled the free function
+is used to clean up the exception and then the entire node is passed to free.
+is used to clean up the exception and then the entire node is passed to free
+so the memory can be given back to the heap.
 \subsection{Try Statements and Catch Clauses}
 …
 library. The contents of a try block and the termination handlers are converted
 into functions. These are then passed to the try terminate function and it
+calls them. This approach puts a try statement in its own functions so that no
+function has to deal with both termination handlers and destructors. \PAB{I do
+not understand the previous sentence.}
+This function has some custom embedded assembly that defines \emph{its}
+personality function and LSDA. The assembly is created with handcrafted C @asm@
+statements, which is why there is only one version of it. The personality
+function is structured so that it can be expanded, but currently it only
+handles this one function.  Notably, it does not handle any destructors so the
+function is constructed so that it does need to run it. \PAB{I do not
+understand the previous sentence.}
+calls them.
+Because this function is known and fixed (and not an arbitrary function that
+happens to contain a try statement), the LSDA can be generated ahead of time.
+Both the LSDA and the personality function are set ahead of time using
+embedded assembly. This is handcrafted using C @asm@ statements and contains
+enough information for the single try statement the function represents.
 The three functions passed to try terminate are:
 …
 \item[match function:] This function is called during the search phase and
 decides if a catch clause matches the termination exception.  It is constructed
+decides if a catch clause matches the termination exception. It is constructed
 from the conditional part of each handler and runs each check, top to bottom,
 in turn, first checking to see if the exception type matches and then if the
 …
 \item[handler function:] This function handles the exception. It takes a
 pointer to the exception and the handler's id and returns nothing. It is called
 after the cleanup phase.  It is constructed by stitching together the bodies of
+after the cleanup phase. It is constructed by stitching together the bodies of
 each handler and dispatches to the selected handler.
 \end{description}
 …
 can be used to create closures, functions that can refer to the state of other
 functions on the stack. This approach allows the functions to refer to all the
 variables in scope for the function containing the @try@ statement.  These
+variables in scope for the function containing the @try@ statement. These
 nested functions and all other functions besides @__cfaehm_try_terminate@ in
 \CFA use the GCC personality function and the @-fexceptions@ flag to generate
 …
 handler that matches. If no handler matches then the function returns
 false. Otherwise the matching handler is run; if it completes successfully, the
 function returns true. Reresume, through the @throwResume;@ statement, cause
 the function to return true.
+function returns true. Rethrowing, through the @throwResume;@ statement,
+causes the function to return true.
 % Recursive Resumption Stuff:
 …
 providing zero-cost enter/exit using the LSDA. Unfortunately, there is no way
 to return from a libunwind search without installing a handler or raising an
 error.  Although workarounds might be possible, they are beyond the scope of
+error. Although workarounds might be possible, they are beyond the scope of
 this thesis. The current resumption implementation has simplicity in its
 favour.
 …
 Cancellation also uses libunwind to do its stack traversal and unwinding,
 however it uses a different primary function @_Unwind_ForcedUnwind@.  Details
+however it uses a different primary function @_Unwind_ForcedUnwind@. Details
 of its interface can be found in the \VRef{s:ForcedUnwind}.
 …
 its main coroutine and the coroutine it is currently executing.
+The first check is if the current thread's main and current coroutine do not
+match, implying a coroutine cancellation; otherwise, it is a thread
+cancellation. Otherwise it is a main thread cancellation. \PAB{Previous
+sentence does not make sense.}
+The first check is whether the active thread's main and current coroutine are
+the same. If they are, the current stack is a thread stack; otherwise it is a
+coroutine stack. For a thread stack, comparing the stored main thread pointer
+with the current thread pointer is enough to tell whether the current thread
+is the main thread.
 However, if the threading library is not linked, the sequential execution is on
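The stack classification described in this hunk can be sketched as below. This is only an illustration of the two comparisons, written in C++ to match the prototype code later in the changeset; `Coroutine`, `Thread`, `CancelKind`, and `classify` are hypothetical names and are not the libcfa types or functions.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's descriptors (not the libcfa types).
struct Coroutine {};

struct Thread {
    Coroutine * main_coroutine;     // coroutine that owns the thread's own stack
    Coroutine * current_coroutine;  // coroutine the thread is executing right now
};

enum class CancelKind { Coroutine, Thread, MainThread };

// Decide which kind of stack is being cancelled.
// 1. If the thread is running a coroutine other than its main one,
//    the current stack is a coroutine stack.
// 2. Otherwise it is a thread stack; comparing against the stored
//    main-thread pointer separates the main thread from other threads.
inline CancelKind classify(const Thread & active, const Thread * main_thread) {
    if (active.main_coroutine != active.current_coroutine) {
        return CancelKind::Coroutine;
    }
    if (&active == main_thread) {
        return CancelKind::MainThread;
    }
    return CancelKind::Thread;
}
```

The order of the checks matters: the main-thread comparison is only meaningful once the stack is known to be a thread stack.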

doc/theses/andrew_beach_MMath/uw-ethesis.tex

-              rfeacef9
+              r5407cdc
 % ======================================================================
 %   D O C U M E N T   P R E A M B L E
+% Specify the document class, default style attributes, page dimensions, etc.
+% For hyperlinked PDF, suitable for viewing on a computer, use this:
+\documentclass[letterpaper,12pt,titlepage,oneside,final]{book}
+% For PDF, suitable for double-sided printing, change the PrintVersion
+% variable below to "true" and use this \documentclass line instead of the
+% one above:
+%\documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book}
+\usepackage{etoolbox}
+\RequirePackage{etoolbox}
+% Control whether this is for print (set true) or will stay digital (default).
+% Print is two sided, digital uses more colours.
+\newtoggle{printversion}
+%\toggletrue{printversion}
+\iftoggle{printversion}{%
+  \documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book}
+}{%
+  \documentclass[letterpaper,12pt,titlepage,oneside,final]{book}
+}
 % Some LaTeX commands I define for my own nomenclature.
 …
 % Anything defined here may be redefined by packages added below...
+% This package allows if-then-else control structures.
+\usepackage{ifthen}
+\newboolean{PrintVersion}
+\setboolean{PrintVersion}{false}
+% CHANGE THIS VALUE TO "true" as necessary, to improve printed results for
+% hard copies by overriding some options of the hyperref package, called below.
+%\usepackage{nomencl} % For a nomenclature (optional; available from ctan.org)
+% For a nomenclature (optional; available from ctan.org)
+%\usepackage{nomencl}
 % Lots of math symbols and environments
 \usepackage{amsmath,amssymb,amstext}
+% For including graphics N.B. pdftex graphics driver
+\usepackage[pdftex]{graphicx}
+% For including graphics (must match graphics driver)
+\usepackage{epic,eepic}
+\usepackage{graphicx}
 % Removes large sections of the document.
 \usepackage{comment}
 % Adds todos (Must be included after comment.)
 \usepackage{todonotes}
 % Hyperlinks make it very easy to navigate an electronic document.
 …
 % Use the "hyperref" package
 % N.B. HYPERREF MUST BE THE LAST PACKAGE LOADED; ADD ADDITIONAL PKGS ABOVE
+\usepackage[pdftex,pagebackref=true]{hyperref} % with basic options
+%\usepackage[pdftex,pagebackref=true]{hyperref}
+\usepackage[pagebackref=true]{hyperref}
 % N.B. pagebackref=true provides links back from the References to the body
 % text. This can cause trouble for printing.
 …
     pdffitwindow=false,     % window fit to page when opened
     pdfstartview={FitH},    % fits the width of the page to the window
-%    pdftitle={uWaterloo\ LaTeX\ Thesis\ Template}, % title: CHANGE THIS TEXT!
-%    pdfauthor={Author},    % author: CHANGE THIS TEXT! and uncomment this line
-%    pdfsubject={Subject},  % subject: CHANGE THIS TEXT! and uncomment this line
-%    pdfkeywords={keyword1} {key2} {key3}, % optional list of keywords
     pdfnewwindow=true,      % links in new window
     colorlinks=true,        % false: boxed links; true: colored links
-    linkcolor=blue,         % color of internal links
-    citecolor=green,        % color of links to bibliography
-    filecolor=magenta,      % color of file links
-    urlcolor=cyan           % color of external links
+}
+% for improved print quality, change some hyperref options
+\ifthenelse{\boolean{PrintVersion}}{
+\hypersetup{    % override some previously defined hyperref options
+%    colorlinks,%
+    citecolor=black,%
+    filecolor=black,%
+    linkcolor=black,%
+    urlcolor=black}
+}{} % end of ifthenelse (no else)
+\iftoggle{printversion}{
+  \hypersetup{
+    citecolor=black,        % colour of links to bibliography
+    filecolor=black,        % colour of file links
+    linkcolor=black,        % colour of internal links
+    urlcolor=black,         % colour of external links
+  }
+}{ % Digital Version
+  \hypersetup{
+    citecolor=green,
+    filecolor=magenta,
+    linkcolor=blue,
+    urlcolor=cyan,
+  }
+}
+\hypersetup{
+  pdftitle={Exception Handling in Cforall},
+  pdfauthor={Andrew James Beach},
+  pdfsubject={Computer Science},
+  pdfkeywords={programming languages} {exceptions}
+      {language design} {language implementation},
+}
 % Exception to the rule of hyperref being the last add-on package
 …
 \pdfstringdefDisableCommands{\def\Cpp{C++}}
+% Wrappers for inline code snippets.
+\newrobustcmd*\codeCFA[1]{\lstinline[language=CFA]{#1}}
+\newrobustcmd*\codeC[1]{\lstinline[language=C]{#1}}
+\newrobustcmd*\codeCpp[1]{\lstinline[language=C++]{#1}}
+\newrobustcmd*\codePy[1]{\lstinline[language=Python]{#1}}
 % Colour text, formatted in LaTeX style instead of TeX style.
 \newcommand*\colour[2]{{\color{#1}#2}}

doc/theses/mubeen_zulfiqar_MMath/uw-ethesis.bib

-              rfeacef9
+              r5407cdc
                          Alexander Samarin",
         title =         "The \LaTeX\ Companion",
         year =          "1994",
+        year =          "1994",
         publisher =     "Addison-Wesley",
         address =       "Reading, Massachusetts"
+        address =       "Reading, Massachusetts"
+}
 …
         title =         "\LaTeX\ --- A Document Preparation System",
         edition =       "Second",
         year =          "1994",
         publisher =     "Addison-Wesley",
+        year =          "1994",
+        publisher =     "Addison-Wesley",
         address =       "Reading, Massachusetts"
+}

doc/theses/thierry_delisle_PhD/code/readyQ_proto/links.hpp

rfeacef9	r5407cdc
117	117	}
118	118
119		long long ts() const {
	119	unsigned long long ts() const {
120	120	return before._links.ts;
121	121	}

doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list.hpp

-              rfeacef9
+              r5407cdc
                 while( __builtin_expect(ll.exchange(true),false) ) {
                         while(ll.load(std::memory_order_relaxed))
                                 asm volatile("pause");
+                                Pause();
+                }
                 /* paranoid */ assert(ll);
 …
                          && ready.compare_exchange_weak(copy, n + 1) )
                                 break;
                         asm volatile("pause");
+                        Pause();
+                }
 …
                 // Step 1 : make sure no writers are in the middle of the critical section
                 while(lock.load(std::memory_order_relaxed))
                         asm volatile("pause");
+                        Pause();
                 // Fence needed because we don't want to start trying to acquire the lock
 …
                 //   to simply lock their own lock and enter.
                 while(lock.load(std::memory_order_relaxed))
                         asm volatile("pause");
+                        Pause();
                 // Step 2 : lock per-proc lock
 …
                 for(uint_fast32_t i = 0; i < s; i++) {
                         while(data[i].lock.load(std::memory_order_relaxed))
                                 asm volatile("pause");
+                                Pause();
+                }

doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list_good.cpp

rfeacef9	r5407cdc
21	21	target = (target - (target % total)) + total;
22	22	while(waiting < target)
23		asm volatile("pause");
	23	Pause();
24	24
25	25	assert(waiting < (1ul << 60));

doc/theses/thierry_delisle_PhD/code/readyQ_proto/randbit.cpp

rfeacef9	r5407cdc
123	123	target = (target - (target % total)) + total;
124	124	while(waiting < target)
125		asm volatile("pause");
	125	Pause();
126	126
127	127	assert(waiting < (1ul << 60));

doc/theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list.cpp

-              rfeacef9
+              r5407cdc
         std::cout << "Total ops     : " << ops << "(" << global.in << "i, " << global.out << "o, " << global.empty << "e)\n";
         #ifndef NO_STATS
                 LIST_VARIANT<Node>::stats_print(std::cout);
+                LIST_VARIANT<Node>::stats_print(std::cout, duration);
         #endif
+}
 …
                 for(Node * & node : nodes) {
+                        node = list.pop();
+                        assert(node);
+                        node = nullptr;
+                        while(!node) {
+                                node = list.pop();
+                        }
                         local.crc_out += node->value;
                         local.out++;
 …
                                 for(const auto & n : nodes) {
                                         local.valmax = max(local.valmax, size_t(n.value));
                                         local.valmin = min(local.valmin, size_t(n.value));
+                                        local.valmax = std::max(local.valmax, size_t(n.value));
+                                        local.valmin = std::min(local.valmin, size_t(n.value));
+                                }
 …
                                                 try {
                                                         arg = optarg = argv[optind];
                                                         nnodes = stoul(optarg, &len);
+                                                        nnodes = std::stoul(optarg, &len);
                                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                                 } catch(std::invalid_argument &) {
 …
                                                 try {
                                                         arg = optarg = argv[optind];
                                                         nnodes = stoul(optarg, &len);
+                                                        nnodes = std::stoul(optarg, &len);
                                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                                 } catch(std::invalid_argument &) {
 …
                                                 try {
                                                         arg = optarg = argv[optind];
                                                         nnodes = stoul(optarg, &len);
+                                                        nnodes = std::stoul(optarg, &len);
                                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                                         nslots = nnodes;
 …
                                                 try {
                                                         arg = optarg = argv[optind];
                                                         nnodes = stoul(optarg, &len);
+                                                        nnodes = std::stoul(optarg, &len);
                                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                                 } catch(std::invalid_argument &) {
 …
                                                 try {
                                                         arg = optarg = argv[optind + 1];
                                                         nslots = stoul(optarg, &len);
+                                                        nslots = std::stoul(optarg, &len);
                                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                                 } catch(std::invalid_argument &) {
 …
                         case 'd':
                                 try {
                                         duration = stod(optarg, &len);
+                                        duration = std::stod(optarg, &len);
                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                 } catch(std::invalid_argument &) {
 …
                         case 't':
                                 try {
                                         nthreads = stoul(optarg, &len);
+                                        nthreads = std::stoul(optarg, &len);
                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                 } catch(std::invalid_argument &) {
 …
                         case 'q':
                                 try {
                                         nqueues = stoul(optarg, &len);
+                                        nqueues = std::stoul(optarg, &len);
                                         if(len != arg.size()) { throw std::invalid_argument(""); }
                                 } catch(std::invalid_argument &) {
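The option parsing in this file repeats one pattern: convert with `std::stoul`, then reject the argument unless the whole string was consumed. A minimal sketch of that pattern as a helper follows; `parse_count` and `try_parse_count` are hypothetical names that do not appear in the changeset, and catching `std::out_of_range` is an addition the hunks above do not show.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parse an unsigned count, requiring that every character is consumed.
// Throws std::invalid_argument on malformed input, and lets
// std::out_of_range from std::stoul propagate on overflow.
inline unsigned long parse_count(const std::string & arg) {
    std::size_t len = 0;
    unsigned long value = std::stoul(arg, &len);  // len = characters consumed
    if (len != arg.size()) {
        throw std::invalid_argument("trailing characters in '" + arg + "'");
    }
    return value;
}

// Non-throwing wrapper: returns false and leaves `out` untouched on failure.
inline bool try_parse_count(const std::string & arg, unsigned long & out) {
    try {
        out = parse_count(arg);
        return true;
    } catch (const std::invalid_argument &) {
        return false;
    } catch (const std::out_of_range &) {
        return false;
    }
}
```

Qualifying the call as `std::stoul`, as the changeset does, keeps the code valid if `using namespace std;` is later removed.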

doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzi-packed.hpp

-              rfeacef9
+              r5407cdc
         for(int i = 0; i < width; i++) {
                 int idx = i % hwdith;
-                std::cout << i << " -> " << idx + width << std::endl;
                 leafs[i].parent = &nodes[ idx ];
+        }
 …
         for(int i = 0; i < root; i++) {
                 int idx = (i / 2) + hwdith;
-                std::cout << i + width << " -> " << idx + width << std::endl;
                 nodes[i].parent = &nodes[ idx ];
+        }

doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzi.hpp

rfeacef9	r5407cdc
159	159	std::cout << "SNZI: " << depth << "x" << width << "(" << mask - 1 << ") " << (sizeof(snzi_t::node) * (root + 1)) << " bytes" << std::endl;
160	160	for(int i = 0; i < root; i++) {
161		std::cout << i << " -> " << (i / base) + width << std::endl;
162	161	nodes[i].parent = &nodes[(i / base) + width];
163	162	}

doc/theses/thierry_delisle_PhD/code/readyQ_proto/utils.hpp

-              rfeacef9
+              r5407cdc
 #include <sys/sysinfo.h>
+#include <x86intrin.h>
+// Barrier from
+class barrier_t {
+public:
+        barrier_t(size_t total)
+                : waiting(0)
+                , total(total)
+        {}
+        void wait(unsigned) {
+                size_t target = waiting++;
+                target = (target - (target % total)) + total;
+                while(waiting < target)
+                        asm volatile("pause");
+                assert(waiting < (1ul << 60));
+        }
+private:
+        std::atomic<size_t> waiting;
+        size_t total;
+};
+// #include <x86intrin.h>
 // class Random {
 …
 };
+static inline long long rdtscl(void) {
+    unsigned int lo, hi;
+    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
+    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
+}
+static inline long long int rdtscl(void) {
+        #if defined( __i386 ) || defined( __x86_64 )
+                unsigned int lo, hi;
+                __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
+                return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
+        #elif defined( __aarch64__ ) || defined( __arm__ )
+                // https://github.com/google/benchmark/blob/v1.1.0/src/cycleclock.h#L116
+                long long int virtual_timer_value;
+                asm volatile("mrs %0, cntvct_el0" : "=r"(virtual_timer_value));
+                return virtual_timer_value;
+        #else
+                #error unsupported hardware architecture
+        #endif
+}
+#if defined( __i386 ) || defined( __x86_64 )
+        #define Pause() __asm__ __volatile__ ( "pause" : : : )
+#elif defined( __ARM_ARCH )
+        #define Pause() __asm__ __volatile__ ( "YIELD" : : : )
+#else
+        #error unsupported architecture
+#endif
 static inline void affinity(int tid) {
 …
+}
+// Barrier from
+class barrier_t {
+public:
+        barrier_t(size_t total)
+                : waiting(0)
+                , total(total)
+        {}
+        void wait(unsigned) {
+                size_t target = waiting++;
+                target = (target - (target % total)) + total;
+                while(waiting < target)
+                        Pause();
+                assert(waiting < (1ul << 60));
+        }
+private:
+        std::atomic<size_t> waiting;
+        size_t total;
+};
 struct spinlock_t {
         std::atomic_bool ll = { false };
 …
                 while( __builtin_expect(ll.exchange(true),false) ) {
                         while(ll.load(std::memory_order_relaxed))
                                 asm volatile("pause");
+                                Pause();
+                }
+        }

doc/theses/thierry_delisle_PhD/code/readyQ_proto/work_stealing.hpp

-              rfeacef9
+              r5407cdc
 #include <memory>
 #include <mutex>
+#include <thread>
 #include <type_traits>
 …
 #include "utils.hpp"
 #include "links.hpp"
+#include "links2.hpp"
 #include "snzi.hpp"
+// #include <x86intrin.h>
 using namespace std;
+static const long long lim = 2000;
+static const unsigned nqueues = 2;
+struct __attribute__((aligned(128))) timestamp_t {
+        volatile unsigned long long val = 0;
+};
+template<typename node_t>
+struct __attribute__((aligned(128))) localQ_t {
+        #ifdef NO_MPSC
+                intrusive_queue_t<node_t> list;
+                inline auto ts() { return list.ts(); }
+                inline auto lock() { return list.lock.lock(); }
+                inline auto try_lock() { return list.lock.try_lock(); }
+                inline auto unlock() { return list.lock.unlock(); }
+                inline auto push( node_t * node ) { return list.push( node ); }
+                inline auto pop() { return list.pop(); }
+        #else
+                mpsc_queue<node_t> queue = {};
+                spinlock_t _lock = {};
+                inline auto ts() { auto h = queue.head(); return h ? h->_links.ts : 0ull; }
+                inline auto lock() { return _lock.lock(); }
+                inline auto try_lock() { return _lock.try_lock(); }
+                inline auto unlock() { return _lock.unlock(); }
+                inline auto push( node_t * node ) { return queue.push( node ); }
+                inline auto pop() { return queue.pop(); }
+        #endif
+};
 template<typename node_t>
 …
         work_stealing(unsigned _numThreads, unsigned)
+                : numThreads(_numThreads)
+                , lists(new intrusive_queue_t<node_t>[numThreads])
+                , snzi( std::log2( numThreads / 2 ), 2 )
+                : numThreads(_numThreads * nqueues)
+                , lists(new localQ_t<node_t>[numThreads])
+                // , lists(new intrusive_queue_t<node_t>[numThreads])
+                , times(new timestamp_t[numThreads])
+                // , snzi( std::log2( numThreads / 2 ), 2 )
+        {
 …
         __attribute__((noinline, hot)) void push(node_t * node) {
                 node->_links.ts = rdtscl();
+                if( node->_links.hint > numThreads ) {
+                        node->_links.hint = tls.rng.next() % numThreads;
+                        tls.stat.push.nhint++;
+                // node->_links.ts = 1;
+                auto & list = *({
+                        unsigned i;
+                        #ifdef NO_MPSC
+                                do {
+                        #endif
+                                tls.stats.push.attempt++;
+                                // unsigned r = tls.rng1.next();
+                                unsigned r = tls.it++;
+                                if(tls.my_queue == outside) {
+                                        i = r % numThreads;
+                                } else {
+                                        i = tls.my_queue + (r % nqueues);
+                                }
+                        #ifdef NO_MPSC
+                                } while(!lists[i].try_lock());
+                        #endif
+                        &lists[i];
+                });
+                list.push( node );
+                #ifdef NO_MPSC
+                        list.unlock();
+                #endif
+                // tls.rng2.set_raw_state( tls.rng1.get_raw_state());
+                // count++;
+                tls.stats.push.success++;
+        }
+        __attribute__((noinline, hot)) node_t * pop() {
+                if(tls.my_queue != outside) {
+                        // if( tls.myfriend == outside ) {
+                        //      auto r  = tls.rng1.next();
+                        //      tls.myfriend = r % numThreads;
+                        //      // assert(lists[(tls.it % nqueues) + tls.my_queue].ts() >= lists[((tls.it + 1) % nqueues) + tls.my_queue].ts());
+                        //      tls.mytime = std::min(lists[(tls.it % nqueues) + tls.my_queue].ts(), lists[((tls.it + 1) % nqueues) + tls.my_queue].ts());
+                        //      // times[tls.myfriend].val = 0;
+                        //      // lists[tls.myfriend].val = 0;
+                        // }
+                        // // else if(times[tls.myfriend].val == 0) {
+                        // // else if(lists[tls.myfriend].val == 0) {
+                        // else if(times[tls.myfriend].val < tls.mytime) {
+                        // // else if(times[tls.myfriend].val < lists[(tls.it % nqueues) + tls.my_queue].ts()) {
+                        //      node_t * n = try_pop(tls.myfriend, tls.stats.pop.help);
+                        //      tls.stats.help++;
+                        //      tls.myfriend = outside;
+                        //      if(n) return n;
+                        // }
+                        // if( tls.myfriend == outside ) {
+                        //      auto r  = tls.rng1.next();
+                        //      tls.myfriend = r % numThreads;
+                        //      tls.mytime = lists[((tls.it + 1) % nqueues) + tls.my_queue].ts();
+                        // }
+                        // else {
+                        //      if(times[tls.myfriend].val + 1000 < tls.mytime) {
+                        //              node_t * n = try_pop(tls.myfriend, tls.stats.pop.help);
+                        //              tls.stats.help++;
+                        //              if(n) return n;
+                        //      }
+                        //      tls.myfriend = outside;
+                        // }
+                        node_t * n = local();
+                        if(n) return n;
+                }
+                unsigned i = node->_links.hint;
+                auto & list = lists[i];
+                list.lock.lock();
+                if(list.push( node )) {
+                        snzi.arrive(i);
+                // try steal
+                for(int i = 0; i < 25; i++) {
+                        node_t * n = steal();
+                        if(n) return n;
+                }
+                list.lock.unlock();
+        }
+        __attribute__((noinline, hot)) node_t * pop() {
+                node_t * node;
+                while(true) {
+                        if(!snzi.query()) {
+                                return nullptr;
+                        }
+                        {
+                                unsigned i = tls.my_queue;
+                                auto & list = lists[i];
+                                if( list.ts() != 0 ) {
+                                        list.lock.lock();
+                                        if((node = try_pop(i))) {
+                                                tls.stat.pop.local.success++;
+                                                break;
+                                        }
+                                        else {
+                                                tls.stat.pop.local.elock++;
+                                        }
+                                }
+                                else {
+                                        tls.stat.pop.local.espec++;
+                                }
+                        }
+                        tls.stat.pop.steal.tried++;
+                        int i = tls.rng.next() % numThreads;
+                        auto & list = lists[i];
+                        if( list.ts() == 0 ) {
+                                tls.stat.pop.steal.empty++;
+                                continue;
+                        }
+                        if( !list.lock.try_lock() ) {
+                                tls.stat.pop.steal.locked++;
+                                continue;
+                        }
+                        if((node = try_pop(i))) {
+                                tls.stat.pop.steal.success++;
+                                break;
+                return search();
+        }
+private:
+        inline node_t * local() {
+                unsigned i = (--tls.it % nqueues) + tls.my_queue;
+                node_t * n = try_pop(i, tls.stats.pop.local);
+                if(n) return n;
+                i = (--tls.it % nqueues) + tls.my_queue;
+                return try_pop(i, tls.stats.pop.local);
+        }
+        inline node_t * steal() {
+                unsigned i = tls.rng2.prev() % numThreads;
+                return try_pop(i, tls.stats.pop.steal);
+        }
+        inline node_t * search() {
+                unsigned offset = tls.rng2.prev();
+                for(unsigned i = 0; i < numThreads; i++) {
+                        unsigned idx = (offset + i) % numThreads;
+                        node_t * thrd = try_pop(idx, tls.stats.pop.search);
+                        if(thrd) {
+                                return thrd;
+                        }
+                }
+                #if defined(READ)
+                        const unsigned f = READ;
+                        if(0 == (tls.it % f)) {
+                                unsigned i = tls.it / f;
+                                lists[i % numThreads].ts();
+                        }
+                        // lists[tls.it].ts();
+                        tls.it++;
+                #endif
+                return node;
+        }
+private:
+        node_t * try_pop(unsigned i) {
+                return nullptr;
+        }
+private:
+        struct attempt_stat_t {
+                std::size_t attempt = { 0 };
+                std::size_t elock   = { 0 };
+                std::size_t eempty  = { 0 };
+                std::size_t espec   = { 0 };
+                std::size_t success = { 0 };
+        };
+        node_t * try_pop(unsigned i, attempt_stat_t & stat) {
+                assert(i < numThreads);
                 auto & list = lists[i];
+                stat.attempt++;
+                // If the list is empty, don't try
+                if(list.ts() == 0) { stat.espec++; return nullptr; }
+                // If we can't get the lock, move on
+                if( !list.try_lock() ) { stat.elock++; return nullptr; }
                 // If list is empty, unlock and retry
                 if( list.ts() == 0 ) {
+                        list.lock.unlock();
+                        list.unlock();
+                        stat.eempty++;
                         return nullptr;
+                }
+                        // Actually pop the list
+                node_t * node;
+                bool emptied;
+                std::tie(node, emptied) = list.pop();
+                assert(node);
+                if(emptied) {
+                        snzi.depart(i);
+                }
+                // Unlock and return
+                list.lock.unlock();
+                return node;
+                auto node = list.pop();
+                list.unlock();
+                stat.success++;
+                #ifdef NO_MPSC
+                        // times[i].val = 1;
+                        times[i].val = node.first->_links.ts;
+                        // lists[i].val = node.first->_links.ts;
+                        return node.first;
+                #else
+                        times[i].val = node->_links.ts;
+                        return node;
+                #endif
+        }
 …
         static std::atomic_uint32_t ticket;
+        static const unsigned outside = 0xFFFFFFFF;
+        static inline unsigned calc_preferred() {
+                unsigned t = ticket++;
+                if(t == 0) return outside;
+                unsigned i = (t - 1) * nqueues;
+                return i;
+        }
         static __attribute__((aligned(128))) thread_local struct TLS {
+                Random     rng = { int(rdtscl()) };
+                unsigned   my_queue = ticket++;
+                Random     rng1 = { unsigned(std::hash<std::thread::id>{}(std::this_thread::get_id()) ^ rdtscl()) };
+                Random     rng2 = { unsigned(std::hash<std::thread::id>{}(std::this_thread::get_id()) ^ rdtscl()) };
+                unsigned   it   = 0;
+                unsigned   my_queue = calc_preferred();
+                unsigned   myfriend = outside;
+                unsigned long long int mytime = 0;
                 #if defined(READ)
                         unsigned it = 0;
 …
                 struct {
                         struct {
+                                std::size_t nhint = { 0 };
+                                std::size_t attempt = { 0 };
+                                std::size_t success = { 0 };
                         } push;
                         struct {
+                                struct {
+                                        std::size_t success = { 0 };
+                                        std::size_t espec = { 0 };
+                                        std::size_t elock = { 0 };
+                                } local;
+                                struct {
+                                        std::size_t tried   = { 0 };
+                                        std::size_t locked  = { 0 };
+                                        std::size_t empty   = { 0 };
+                                        std::size_t success = { 0 };
+                                } steal;
+                                attempt_stat_t help;
+                                attempt_stat_t local;
+                                attempt_stat_t steal;
+                                attempt_stat_t search;
                         } pop;
+                } stat;
+                        std::size_t help = { 0 };
+                } stats;
         } tls;
 private:
         const unsigned numThreads;
+        std::unique_ptr<intrusive_queue_t<node_t> []> lists;
+        __attribute__((aligned(64))) snzi_t snzi;
+        std::unique_ptr<localQ_t<node_t> []> lists;
+        // std::unique_ptr<intrusive_queue_t<node_t> []> lists;
+        std::unique_ptr<timestamp_t []> times;
+        __attribute__((aligned(128))) std::atomic_size_t count;
 #ifndef NO_STATS
 …
         static struct GlobalStats {
                 struct {
+                        std::atomic_size_t nhint = { 0 };
+                        std::atomic_size_t attempt = { 0 };
+                        std::atomic_size_t success = { 0 };
                 } push;
                 struct {
                         struct {
+                                std::atomic_size_t attempt = { 0 };
+                                std::atomic_size_t elock   = { 0 };
+                                std::atomic_size_t eempty  = { 0 };
+                                std::atomic_size_t espec   = { 0 };
                                 std::atomic_size_t success = { 0 };
+                                std::atomic_size_t espec = { 0 };
+                                std::atomic_size_t elock = { 0 };
+                        } help;
+                        struct {
+                                std::atomic_size_t attempt = { 0 };
+                                std::atomic_size_t elock   = { 0 };
+                                std::atomic_size_t eempty  = { 0 };
+                                std::atomic_size_t espec   = { 0 };
+                                std::atomic_size_t success = { 0 };
                         } local;
                         struct {
+                                std::atomic_size_t tried   = { 0 };
+                                std::atomic_size_t locked  = { 0 };
+                                std::atomic_size_t empty   = { 0 };
+                                std::atomic_size_t attempt = { 0 };
+                                std::atomic_size_t elock   = { 0 };
+                                std::atomic_size_t eempty  = { 0 };
+                                std::atomic_size_t espec   = { 0 };
                                 std::atomic_size_t success = { 0 };
                         } steal;
+                        struct {
+                                std::atomic_size_t attempt = { 0 };
+                                std::atomic_size_t elock   = { 0 };
+                                std::atomic_size_t eempty  = { 0 };
+                                std::atomic_size_t espec   = { 0 };
+                                std::atomic_size_t success = { 0 };
+                        } search;
                 } pop;
+                std::atomic_size_t help = { 0 };
         } global_stats;
 public:
         static void stats_tls_tally() {
+                global_stats.push.nhint += tls.stat.push.nhint;
+                global_stats.pop.local.success += tls.stat.pop.local.success;
+                global_stats.pop.local.espec   += tls.stat.pop.local.espec  ;
+                global_stats.pop.local.elock   += tls.stat.pop.local.elock  ;
+                global_stats.pop.steal.tried   += tls.stat.pop.steal.tried  ;
+                global_stats.pop.steal.locked  += tls.stat.pop.steal.locked ;
+                global_stats.pop.steal.empty   += tls.stat.pop.steal.empty  ;
+                global_stats.pop.steal.success += tls.stat.pop.steal.success;
+        }
+        static void stats_print(std::ostream & os ) {
+                global_stats.push.attempt += tls.stats.push.attempt;
+                global_stats.push.success += tls.stats.push.success;
+                global_stats.pop.help  .attempt += tls.stats.pop.help  .attempt;
+                global_stats.pop.help  .elock   += tls.stats.pop.help  .elock  ;
+                global_stats.pop.help  .eempty  += tls.stats.pop.help  .eempty ;
+                global_stats.pop.help  .espec   += tls.stats.pop.help  .espec  ;
+                global_stats.pop.help  .success += tls.stats.pop.help  .success;
+                global_stats.pop.local .attempt += tls.stats.pop.local .attempt;
+                global_stats.pop.local .elock   += tls.stats.pop.local .elock  ;
+                global_stats.pop.local .eempty  += tls.stats.pop.local .eempty ;
+                global_stats.pop.local .espec   += tls.stats.pop.local .espec  ;
+                global_stats.pop.local .success += tls.stats.pop.local .success;
+                global_stats.pop.steal .attempt += tls.stats.pop.steal .attempt;
+                global_stats.pop.steal .elock   += tls.stats.pop.steal .elock  ;
+                global_stats.pop.steal .eempty  += tls.stats.pop.steal .eempty ;
+                global_stats.pop.steal .espec   += tls.stats.pop.steal .espec  ;
+                global_stats.pop.steal .success += tls.stats.pop.steal .success;
+                global_stats.pop.search.attempt += tls.stats.pop.search.attempt;
+                global_stats.pop.search.elock   += tls.stats.pop.search.elock  ;
+                global_stats.pop.search.eempty  += tls.stats.pop.search.eempty ;
+                global_stats.pop.search.espec   += tls.stats.pop.search.espec  ;
+                global_stats.pop.search.success += tls.stats.pop.search.success;
+                global_stats.help += tls.stats.help;
+        }
+        static void stats_print(std::ostream & os, double duration ) {
                 std::cout << "----- Work Stealing Stats -----" << std::endl;
+                double stealSucc = double(global_stats.pop.steal.success) / global_stats.pop.steal.tried;
+                os << "Push to new Q : " << std::setw(15) << global_stats.push.nhint << "\n";
+                os << "Local Pop     : " << std::setw(15) << global_stats.pop.local.success << "\n";
+                os << "Steal Pop     : " << std::setw(15) << global_stats.pop.steal.success << "(" << global_stats.pop.local.espec << "s, " << global_stats.pop.local.elock << "l)\n";
+                os << "Steal Success : " << std::setw(15) << stealSucc << "(" << global_stats.pop.steal.tried << " tries)\n";
+                os << "Steal Fails   : " << std::setw(15) << global_stats.pop.steal.empty << "e, " << global_stats.pop.steal.locked << "l\n";
+                double push_suc = (100.0 * double(global_stats.push.success) / global_stats.push.attempt);
+                double push_len = double(global_stats.push.attempt     ) / global_stats.push.success;
+                os << "Push   Pick : " << push_suc << " %, len " << push_len << " (" << global_stats.push.attempt      << " / " << global_stats.push.success << ")\n";
+                double hlp_suc = (100.0 * double(global_stats.pop.help.success) / global_stats.pop.help.attempt);
+                double hlp_len = double(global_stats.pop.help.attempt     ) / global_stats.pop.help.success;
+                os << "Help        : " << hlp_suc << " %, len " << hlp_len << " (" << global_stats.pop.help.attempt      << " / " << global_stats.pop.help.success << ")\n";
+                os << "Help Fail   : " << global_stats.pop.help.espec << "s, " << global_stats.pop.help.eempty << "e, " << global_stats.pop.help.elock << "l\n";
+                double pop_suc = (100.0 * double(global_stats.pop.local.success) / global_stats.pop.local.attempt);
+                double pop_len = double(global_stats.pop.local.attempt     ) / global_stats.pop.local.success;
+                os << "Local       : " << pop_suc << " %, len " << pop_len << " (" << global_stats.pop.local.attempt      << " / " << global_stats.pop.local.success << ")\n";
+                os << "Local Fail  : " << global_stats.pop.local.espec << "s, " << global_stats.pop.local.eempty << "e, " << global_stats.pop.local.elock << "l\n";
+                double stl_suc = (100.0 * double(global_stats.pop.steal.success) / global_stats.pop.steal.attempt);
+                double stl_len = double(global_stats.pop.steal.attempt     ) / global_stats.pop.steal.success;
+                os << "Steal       : " << stl_suc << " %, len " << stl_len << " (" << global_stats.pop.steal.attempt      << " / " << global_stats.pop.steal.success << ")\n";
+                os << "Steal Fail  : " << global_stats.pop.steal.espec << "s, " << global_stats.pop.steal.eempty << "e, " << global_stats.pop.steal.elock << "l\n";
+                double srh_suc = (100.0 * double(global_stats.pop.search.success) / global_stats.pop.search.attempt);
+                double srh_len = double(global_stats.pop.search.attempt     ) / global_stats.pop.search.success;
+                os << "Search      : " << srh_suc << " %, len " << srh_len << " (" << global_stats.pop.search.attempt      << " / " << global_stats.pop.search.success << ")\n";
+                os << "Search Fail : " << global_stats.pop.search.espec << "s, " << global_stats.pop.search.eempty << "e, " << global_stats.pop.search.elock << "l\n";
+                os << "Helps       : " << std::setw(15) << std::scientific << global_stats.help / duration << "/sec (" << global_stats.help  << ")\n";
+        }
 private:
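
Note on the `try_pop` change above: the new version counts each failure mode separately (`espec` for an unlocked emptiness check, `elock` for a failed `try_lock`, `eempty` for a queue found empty once locked) and reports success as a percentage of attempts. The following is a minimal sketch of that pattern, not the prototype's code. `SubQueue`, `AttemptStat`, `hint`, and `success_percent` are hypothetical names introduced for illustration; a `std::deque` guarded by a `std::mutex` stands in for the intrusive list, and an atomic element count stands in for the timestamp read by `list.ts()`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical counters, named after the fields of attempt_stat_t in the diff.
struct AttemptStat {
    std::size_t attempt = 0;  // every call to try_pop
    std::size_t espec   = 0;  // gave up: unlocked check saw an empty queue
    std::size_t elock   = 0;  // gave up: the lock could not be acquired
    std::size_t eempty  = 0;  // gave up: queue was empty after locking
    std::size_t success = 0;  // an element was popped
};

// Hypothetical stand-in for one sub-queue (assumption: not the prototype's localQ_t).
struct SubQueue {
    std::mutex lock;
    std::deque<int> items;             // guarded by lock
    std::atomic<std::size_t> hint{0};  // element count, readable without the lock

    void push(int v) {
        std::lock_guard<std::mutex> guard(lock);
        items.push_back(v);
        hint.store(items.size(), std::memory_order_release);
    }
};

// Returns true and writes the element to out; false means "try another queue".
bool try_pop(SubQueue & q, AttemptStat & stat, int & out) {
    stat.attempt++;
    // Cheap speculative check: skip queues that look empty without locking.
    if (q.hint.load(std::memory_order_acquire) == 0) { stat.espec++; return false; }
    // Never block: if the lock is not available, move on.
    if (!q.lock.try_lock()) { stat.elock++; return false; }
    // Re-check under the lock, since the hint may be stale.
    if (q.items.empty()) { q.lock.unlock(); stat.eempty++; return false; }
    out = q.items.front();
    q.items.pop_front();
    q.hint.store(q.items.size(), std::memory_order_release);
    q.lock.unlock();
    stat.success++;
    return true;
}

// Success rate in percent; returns 0 when nothing was attempted,
// which avoids dividing by zero.
double success_percent(const AttemptStat & s) {
    return s.attempt == 0 ? 0.0 : 100.0 * double(s.success) / double(s.attempt);
}
```

The zero-attempt guard in `success_percent` is an addition of this sketch; the ratios printed in the diff are computed without one.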

doc/theses/andrew_beach_MMath/Makefile

doc/theses/andrew_beach_MMath/existing.tex

doc/theses/andrew_beach_MMath/features.tex

doc/theses/andrew_beach_MMath/future.tex

doc/theses/andrew_beach_MMath/implement.tex

doc/theses/andrew_beach_MMath/uw-ethesis.tex

doc/theses/mubeen_zulfiqar_MMath/uw-ethesis.bib

doc/theses/thierry_delisle_PhD/code/readyQ_proto/links.hpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list.hpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/processor_list_good.cpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/randbit.cpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list.cpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzi-packed.hpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/snzi.hpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/utils.hpp

doc/theses/thierry_delisle_PhD/code/readyQ_proto/work_stealing.hpp
