Index: doc/theses/andrew_beach_MMath/features.tex
===================================================================
--- doc/theses/andrew_beach_MMath/features.tex	(revision 0edd11a6077a0ddf81da70d5c62e8928e3d3e1b8)
+++ doc/theses/andrew_beach_MMath/features.tex	(revision 35ea4f33811f564594a8a11b8c003adbeca7973b)
@@ -10,5 +10,5 @@
 of they virtual system was designed and implemented.
 
-Virtual types are organizied in simple hierarchies. Each virtual type may have
+Virtual types are organized in simple hierarchies. Each virtual type may have
 a parent and can have any number of children. A type's descendants are its
 children and its children's descendants. A type may not be its own descendant.
@@ -17,5 +17,5 @@
 structure that has fields for all the virtual members of a type. A virtual
 type has all the virtual members of its parent and can add more. It may also
-update the values of the virtual members.
+update the values of the virtual members and should in many cases.
 
 Except for virtual casts, this is only used internally in the exception
@@ -27,5 +27,5 @@
 \end{lstlisting}
 
-This has the same precidence as a traditional C-cast and can be used in the
+This has the same precedence as a traditional C-cast and can be used in the
 same places. This will convert the result of EXPRESSION to the type TYPE. Both
 the type of EXPRESSION and TYPE must be pointers to virtual types.
@@ -41,7 +41,64 @@
 % features all exceptions support.
 
+\subsection{Exception Traits}
+Exceptions are defined by the trait system; there are a series of traits and
+if a type satisfies them then they can be used as exceptions.
+
+\begin{lstlisting}
+trait is_exception(dtype exceptT, dtype virtualT) {
+    virtualT const & get_exception_vtable(exceptT *);
+};
+\end{lstlisting}
+This is the base trait that all exceptions need to match.
+The single function takes any pointer (including the null pointer) and
+returns a reference to the virtual table instance. Defining this function
+also establishes the virtual type and virtual table pair to the resolver
+and promises that \codeCFA{exceptT} is a virtual type and a child of the
+base exception type.
+
+One odd thing about \codeCFA{get_exception_vtable} is that it should always
+be a constant function, returning the same value regardless of its argument.
+A pointer or reference to the virtual table instance could be used instead,
+however using a function has some ease of implementation advantages and
+allows for easier disambiguation because the virtual type name (or the
+address of an instance that is in scope) can be used instead of the mangled
+virtual table name.
+
+Also note the use of the word ``promise" in the trait description. \CFA
+cannot currently check to see if either \codeCFA{exceptT} or
+\codeCFA{virtualT} match the layout requirements. Currently this is
+considered part of \codeCFA{get_exception_vtable}'s correct implementation.
+
+\begin{lstlisting}
+trait is_termination_exception(
+        dtype exceptT, dtype virtualT | is_exception(exceptT, virtualT)) {
+    void defaultTerminationHandler(exceptT &);
+};
+\end{lstlisting}
+The only additional function required to make the exception usable with
+termination is a default handler. This function is called whenever a
+termination throw on an exception of this type is preformed and no handler
+is found.
+
+\begin{lstlisting}
+trait is_resumption_exception(
+        dtype exceptT, dtype virtualT | is_exception(exceptT, virtualT)) {
+    void defaultResumptionHandler(exceptT &);
+};
+\end{lstlisting}
+Creating a resumption exception is exactly the same except for resumption.
+The name change reflects that and the function is called when a resumption
+throw on an exception of this type is preformed and no handler is found.
+
+Finally there are three additional macros that can be used to refer to the
+these traits. They are \codeCFA{IS_EXCEPTION},
+\codeCFA{IS_TERMINATION_EXCEPTION} and \codeCFA{IS_RESUMPTION_EXCEPTION}.
+Each takes the virtual type's name and, for polymorphic types only, the
+parenthesized list of polymorphic arguments. These do the name mangling to
+get the virtual table name and provide the arguments to both sides.
+
 \section{Termination}
 
-Termination exception throws are likely the most framilar kind, as they are
+Termination exception throws are likely the most familiar kind, as they are
 used in several popular programming languages. A termination will throw an
 exception, search the stack for a handler, unwind the stack to where the
@@ -66,5 +123,5 @@
 
 Then the exception system will search the stack starting from the throw and
-proceding towards the base of the stack, from callee to caller. As it goes
+proceeding towards the base of the stack, from callee to caller. As it goes
 it will check any termination handlers it finds:
 
@@ -111,5 +168,5 @@
 \paragraph{Re-Throwing}
 
-You can also rethrow the most recent termination exception with
+You can also re-throw the most recent termination exception with
 \codeCFA{throw;}. % This is terrible and you should never do it.
 This can be done in a handler or any function that could be called from a
@@ -135,5 +192,5 @@
 \end{lstlisting}
 The result of EXPRESSION must be a resumption exception type. A resumption
-exception type is any type that satifies the assertion
+exception type is any type that satisfies the assertion
 \codeCFA{void defaultResumptionHandler(T &);} (the default handler). When the
 statement is executed the expression is evaluated and the result is thrown.
@@ -159,5 +216,5 @@
 continues from the throw statement.
 
-If no approprate handler is found then the default handler is called. The
+If no appropriate handler is found then the default handler is called. The
 throw statement acts as a regular function call passing the exception to
 the default handler and after the handler finishes executing control continues
@@ -174,5 +231,5 @@
 which is what most users expect.
 
-% This might need a diagram. But it is an important part of the justifaction
+% This might need a diagram. But it is an important part of the justification
 % of the design of the traversal order.
 It also avoids the recursive resumption problem. If the entire stack is
@@ -184,5 +241,5 @@
 system an A resumed from the top of the stack will be handled by the first
 handler. A B resumed from the top or from the first handler it will be handled
-by the second hander. The only difference is when A is thrown from the second
+by the second handler. The only difference is when A is thrown from the second
 handler. The entire stack search will call the first handler again, creating a
 loop. Starting from the position in the stack though will break this loop.
@@ -198,5 +255,5 @@
 It also has the same behaviour, after the exception type has been matched
 with the EXCEPTION\_TYPE the CONDITION is evaluated with NAME in scope. If
-the result is true then the hander is run, otherwise the search continues
+the result is true then the handler is run, otherwise the search continues
 just as if there had been a type mismatch.
 
@@ -241,5 +298,5 @@
 the finally block. Other ways to leave the finally block - such as a long
 jump or termination - are much harder to check, at best requiring additional
-runtime overhead, and so are merely discouraged.
+run-time overhead, and so are merely discouraged.
 
 \section{Cancellation}
@@ -250,9 +307,9 @@
 
 There is no special statement for starting a cancellation, instead you call
-the standard libary function \codeCFA{cancel\_stack} which takes an exception.
+the standard library function \codeCFA{cancel\_stack} which takes an exception.
 Unlike in a throw this exception is not used in control flow but is just there
 to pass information about why the cancellation happened.
 
-The handler is decided entirely by which stack is being cancelled. There are
+The handler is decided entirely by which stack is being canceled. There are
 three handlers that apply to three different groups of stacks:
 \begin{itemize}
@@ -266,5 +323,5 @@
 
 \item Thread Stack:
-Thread stacks are those created \codeCFA{thread} or otherwise satify the
+Thread stacks are those created \codeCFA{thread} or otherwise satisfy the
 \codeCFA{is\_thread} trait.
 
@@ -288,10 +345,10 @@
 context required for the other. This can happen with join but as the
 destructors will always be run when the stack is being unwound and one
-termination/cancellation is already active. Also since they are implicite they
+termination/cancellation is already active. Also since they are implicit they
 are easier to forget about.
 
 \item Coroutine Stack:
 Coroutine stacks are those created with \codeCFA{coroutine} or otherwise
-satify the \codeCFA{is\_coroutine} trait.
+satisfy the \codeCFA{is\_coroutine} trait.
 
 A coroutine knows of two other coroutines, its starter and its last resumer.
@@ -301,5 +358,5 @@
 Resume will resume throw a \codeCFA{CoroutineCancelled} exception, which is
 polymorphic over the coroutine type and has a pointer to the coroutine being
-cancelled and the cancelling exception. The resume function also has an
+canceled and the canceling exception. The resume function also has an
 assertion that the \codeCFA{defaultResumptionHandler} for the exception. So it
 will use the default handler like a regular throw.
Index: doc/theses/andrew_beach_MMath/future.tex
===================================================================
--- doc/theses/andrew_beach_MMath/future.tex	(revision 0edd11a6077a0ddf81da70d5c62e8928e3d3e1b8)
+++ doc/theses/andrew_beach_MMath/future.tex	(revision 35ea4f33811f564594a8a11b8c003adbeca7973b)
@@ -8,9 +8,14 @@
 parts of the exception system that use the current version.
 
-For instance a full virtual system would probably allow for several
-improvements to the exception traits. Although they do currently work they
-could be made easier to use by making the virtual table type implitate in the
-trait (which would remove the need for those wrapper marcos) or allowing
-for assertions that give the layout of a virtual table for safety.
+There are several improvements to the virtual system that would improve
+the exception traits. The biggest one is an assertion that checks that one
+virtual type is a child of another virtual type. This would capture many of
+the requirements much more precisely.
+
+The full virtual system might also include other improvement like associated
+types. This is a proposed feature that would allow traits to refer to types
+not listed in their header. This would allow the exception traits to not
+refer to the virtual table type explicatly which would remove the need for
+the interface macros.
 
 \section{Additional Throws}
@@ -65,17 +70,44 @@
 
 Also new techniques to skip previously searched parts of the stack will have
-to be developed.
+to be developed. The recursive resume problem still remains and ideally the
+same pattern of ignoring sections of the stack.
 
-\section{Support for More Platforms}
-Termination is not portable because it is implemented with inline assembly.
-Those sections will have to be rewritten to support different architectures
+\section{Signal Exceptions}
+Exception Handling: Issues and a Proposed Notation suggests there are three
+types of exceptions: escape, notify and signal.
+Escape exceptions are our termination exceptions, notify exceptions are
+resumption exceptions and that leaves signal exception unimplemented.
 
-\section{Quality-of-Life Improvements}
-Finally come various improvements to the usability of \CFA. Most of these
-would just require time. Time that would not lead to interesting research so
-it has been left aside for now. A few examples are included here but there
-are more:
+Signal exceptions allow either behaviour, that is after the exception is
+handled control can either return to the throw or from where the handler is
+defined.
+
+The design should be rexamined and be updated for \CFA. A very direct
+translation would perhaps have a new throw and catch pair and a statement
+(or statements) could be used to decide if the handler returns to the throw
+or continues where it is, but there are other options.
+
+For instance resumption could be extended to cover this use by allowing
+local control flow out of it. This would require an unwind as part of the
+transition as there are stack frames that have to be removed.
+This would mean there is no notify like throw but because \CFA does not have
+exception signatures a termination can be thrown from any resumption handler
+already so there are already ways one could try to do this in existing \CFA.
+
+% Maybe talk about the escape; and escape CONTROL_STMT; statements or how
+% if we could choose if _Unwind_Resume proceeded to the clean-up stage this
+% would be much easier to implement.
+
+\section{Language Improvements}
+There is also a lot of work that are not follow ups to this work in terms of
+research, some have no interesting research to be done at all, but would
+improve \CFA as a programming language. The full list of these would
+naturally be quite extensive but here are a few examples that involve
+exceptions:
 
 \begin{itemize}
+\item The implementation of termination is not portable because it includes
+some assembly statements. These sections will have to be re-written to so
+\CFA has full support on more machines.
 \item Allowing exception handler to bind the exception to a reference instead
 of a pointer. This should actually result in no change in behaviour so there
@@ -88,13 +120,3 @@
 much easier. (To do the same for try blocks would probably wait for zero-cost
 exceptions, which would allow the try block to be inlined as well.)
-\item Enabling local control flow out of a resumption handler. This would be
-a weighty operation, causing a stack unwind like a termination, so there might
-be a different statement or a statement modifier to make sure the user does
-this purposefully.
-
-However this would require the more complex system as they cannot be inlined
-into the original function as they can be run at a different place on the
-stack. So instead the unwinding will have to carry with it information on
-which one of these points to continue at and possibly also the return value
-for the function if a \codeCFA{return} statement was used.
 \end{itemize}
Index: doc/theses/fangren_yu_COOP_F20/Report.tex
===================================================================
--- doc/theses/fangren_yu_COOP_F20/Report.tex	(revision 0edd11a6077a0ddf81da70d5c62e8928e3d3e1b8)
+++ doc/theses/fangren_yu_COOP_F20/Report.tex	(revision 35ea4f33811f564594a8a11b8c003adbeca7973b)
@@ -87,7 +87,18 @@
 
 \begin{abstract}
+
+\CFA is an evolutionary extension to the C programming language, featuring a parametric type system, and is currently under active development. The reference compiler for \CFA language, @cfa-cc@, has some of its major components dated back to early 2000s, and is based on inefficient data structures and algorithms. Some improvements targeting the expression resolution algorithm, suggested by a recent prototype experiment on a simplified model, are implemented in @cfa-cc@ to support the full \CFA language. These optimizations speed up the compiler significantly by a factor of 20 across the existing \CFA codebase, bringing the compilation time of a mid-sized \CFA source file down to 10-second level. A few cases derived from realistic code examples that causes trouble to the compiler are analyzed in detail, with proposed solutions. This step of \CFA project development is critical to its eventual goal to be used alongside C for large software systems.
+
 \end{abstract}
 
 \section{Introduction}
+
+\CFA language, developed by the Programming Language Group at University of Waterloo, has a long history, with the first proof-of-concept compiler built in 2003 by Richard Bilson~\cite{Bilson03}. Many new features are added to the language over time, but the core of \CFA, parametric functions introduced by the @forall@ clause (hence the name of the language), with the type system supporting parametric overloading, remains mostly unchanged. 
+
+The current \CFA reference compiler @cfa-cc@ still includes many parts taken directly from the original Bilson's implementation, and serves as a starting point for the enhancement work to the type system. Unfortunately, it does not provide the efficiency required for the language to be used practically: a \CFA source file of approximately 1000 lines of code can take a few minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved a lot of copying and redundant work. 
+
+This paper presents a series of optimizations to the performance-critical parts of the resolver, with a major rework of the data structure used by the compiler, using a functional programming approach to reduce memory complexity. Subsequent improvements are mostly suggested by running the compiler builds with a performance profiler against the \CFA standard library source code and a test suite to find the most underperforming components in the compiler algorithm.
+
+The \CFA team endorses a pragmatic philosophy in work that mostly focuses on practical implications of language design and implementation, rather than the theoretical limits. In particular, the compiler is designed to work on production \CFA code efficiently and keep type safety, while sometimes making compromises to expressiveness in extreme corner cases. However, when these corner cases do appear in actual usage, they need to be thoroughly investigated. Analysis presented in this paper, therefore, are conducted on a case-by-case basis. Some of them eventually point to certain weaknesses in the language design and solutions are proposed based on experimental results.
 
 \section{Completed work}
@@ -444,5 +455,5 @@
 \section{Timing results}
 
-For the timing results presented here, the \CFA compiler is built with gcc 9.3.0, and tested on a server machine running Ubuntu 20.04, 64GB RAM and 32-core 2.2 GHz CPU, results reported by the time command, and using only 8 cores in parallel such that the time is close to the case with 100% CPU utilization on a single thread.
+For the timing results presented here, the \CFA compiler is built with gcc 9.3.0, and tested on a server machine running Ubuntu 20.04, 64GB RAM and 32-core 2.2 GHz CPU, results reported by the time command, and using only 8 cores in parallel such that the time is close to the case with 100\% CPU utilization on a single thread.
 
 On the most recent build, the \CFA standard library (~1.3 MB of source code) compiles in 4 minutes 47 seconds total processor time (single thread equivalent), with the slowest file taking 13 seconds. The test suite (178 test cases, ~2.2MB of source code) completes within 25 minutes total processor time,\footnote{Including a few runtime tests; total time spent in compilation is approximately 21 minutes.} with the slowest file taking 23 seconds. In contrast, the library build on old compiler takes 85 minutes total, 5 minutes for the slowest file. Full test suite takes too long with old compiler build and is therefore not run, but the slowest test cases take approximately 5 minutes. Overall, the most recent build compared to old build in April 2020, before the project started, is consistently faster by a factor of 20.
Index: libcfa/src/stdlib.hfa
===================================================================
--- libcfa/src/stdlib.hfa	(revision 0edd11a6077a0ddf81da70d5c62e8928e3d3e1b8)
+++ libcfa/src/stdlib.hfa	(revision 35ea4f33811f564594a8a11b8c003adbeca7973b)
@@ -191,5 +191,11 @@
 		} else if(Fill.tag == 't') {
 			for ( int i = copy_end; i < Dim * size; i += size ) {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Warray-bounds"
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wstringop-overflow="
 				memcpy( (char *)ptr + i, &Fill.t, size );
+#pragma GCC diagnostic pop
+#pragma GCC diagnostic pop
 			}
 		} else if(Fill.tag == 'a') {
