\chapter{Introduction} % The highest level overview of Cforall and EHMs. Get this done right away. This thesis goes over the design and implementation of the exception handling mechanism (EHM) of \CFA (pronounced sea-for-all and may be written Cforall or CFA). \CFA is a new programming language that extends C, that maintains backwards-compatibility while introducing modern programming features. Adding exception handling to \CFA gives it new ways to handle errors and make other large control-flow jumps. % Now take a step back and explain what exceptions are generally. Exception handling provides dynamic inter-function control flow. There are two forms of exception handling covered in this thesis: termination, which acts as a multi-level return, and resumption, which is a dynamic function call. Termination handling is much more common, to the extent that it is often seen This seperation is uncommon because termination exception handling is so much more common that it is often assumed. % WHY: Mention other forms of continuation and \cite{CommonLisp} here? A language's EHM is the combination of language syntax and run-time components that are used to construct, raise and handle exceptions, including all control flow. Termination exception handling allows control to return to any previous function on the stack directly, skipping any functions between it and the current function. \begin{center} \input{callreturn} \end{center} Resumption exception handling seaches the stack for a handler and then calls it without adding or removing any other stack frames. \todo{Add a diagram showing control flow for resumption.} Although a powerful feature, exception handling tends to be complex to set up and expensive to use so they are often limited to unusual or ``exceptional" cases. The classic example of this is error handling, exceptions can be used to remove error handling logic from the main execution path and while paying most of the cost only when the error actually occurs. \section{Thesis Overview} This work describes the design and implementation of the \CFA EHM. The \CFA EHM implements all of the common exception features (or an equivalent) found in most other EHMs and adds some features of its own. The design of all the features had to be adapted to \CFA's feature set as some of the underlying tools used to implement and express exception handling in other languages are absent in \CFA. Still the resulting syntax resembles that of other languages: \begin{cfa} try { ... T * object = malloc(request_size); if (!object) { throw OutOfMemory{fixed_allocation, request_size}; } ... } catch (OutOfMemory * error) { ... } \end{cfa} % A note that yes, that was a very fast overview. The design and implementation of all of \CFA's EHM's features are described in detail throughout this thesis, whether they are a common feature or one unique to \CFA. % The current state of the project and what it contributes. All of these features have been implemented in \CFA, along with a suite of test cases as part of this project. The implementation techniques are generally applicable in other programming languages and much of the design is as well. Some parts of the EHM use other features unique to \CFA and these would be harder to replicate in other programming languages. % Talk about other programming languages. Some existing programming languages that include EHMs/exception handling include C++, Java and Python. All three examples focus on termination exceptions which unwind the stack as part of the Exceptions also can replace return codes and return unions. The contributions of this work are: \begin{enumerate} \item Designing \CFA's exception handling mechanism, adapting designs from other programming languages and the creation of new features. \item Implementing stack unwinding and the EHM in \CFA, including updating the compiler and the run-time environment. \item Designed and implemented a prototype virtual system. % I think the virtual system and per-call site default handlers are the only % "new" features, everything else is a matter of implementation. \end{enumerate} \todo{I can't figure out a good lead-in to the roadmap.} The next section covers the existing state of exceptions. The existing state of \CFA is also covered in \autoref{c:existing}. The new features are introduced in \autoref{c:features}, which explains their usage and design. That is followed by the implementation of those features in \autoref{c:implement}. The performance results are examined in \autoref{c:performance}. Possibilities to extend this project are discussed in \autoref{c:future}. \section{Background} \label{s:background} Exception handling is not a new concept, with papers on the subject dating back 70s. Their were popularised by \Cpp, which added them in its first major wave of non-object-orientated features in 1990. % https://en.cppreference.com/w/cpp/language/history Java was the next popular language to use exceptions. It is also the most popular language with checked exceptions. Checked exceptions are part of the function interface they are raised from. This includes functions they propogate through, until a handler for that type of exception is found. This makes exception information explicit, which can improve clarity and safety, but can slow down programming. Some of these, such as dealing with high-order methods or an overly specified throws clause, are technical. However some of the issues are much more human, in that writing/updating all the exception signatures can be enough of a burden people will hack the system to avoid them. Including the ``catch-and-ignore" pattern where a catch block is used without anything to repair or recover from the exception. %\subsection Resumption exceptions have been much less popular. Although resumption has a history as old as termination's, very few programming languages have implement them. % http://bitsavers.informatik.uni-stuttgart.de/pdf/xerox/parc/techReports/ % CSL-79-3_Mesa_Language_Manual_Version_5.0.pdf Mesa is one programming languages that did and experiance with that languages is quoted as being one of the reasons resumptions were not included in the \Cpp standard. % https://en.wikipedia.org/wiki/Exception_handling \todo{A comment about why we did include them when they are so unpopular might be approprate.} %\subsection Functional languages, tend to use solutions like the return union, but some exception-like constructs still appear. For instance Haskell's built in error mechanism can make the result of any expression, including function calls. Any expression that examines an error value will in-turn produce an error. This continues until the main function produces an error or until it is handled by one of the catch functions. %\subsection More recently exceptions seem to be vanishing from newer programming languages. Rust and Go reduce this feature to panics. Panicing is somewhere between a termination exception and a program abort. Notably in Rust a panic can trigger either, a panic may unwind the stack or simply kill the process. % https://doc.rust-lang.org/std/panic/fn.catch_unwind.html Go's panic is much more similar to a termination exception but there is only a catch-all function with \code{Go}{recover()}. So exceptions still are appearing, just in reduced forms. %\subsection Exception handling's most common use cases are in error handling. Here are some other ways to handle errors and comparisons with exceptions. \begin{itemize} \item\emph{Error Codes}: This pattern uses an enumeration (or just a set of fixed values) to indicate that an error has occured and which error it was. There are some issues if a function wants to return an error code and another value. The main issue is that it can be easy to forget checking the error code, which can lead to an error being quitely and implicitly ignored. Some new languages have tools that raise warnings if the return value is discarded to avoid this. It also puts more code on the main execution path. \item\emph{Special Return with Global Store}: A function that encounters an error returns some value indicating that it encountered a value but store which error occured in a fixed global location. Perhaps the C standard @errno@ is the most famous example of this, where some standard library functions will return some non-value (often a NULL pointer) and set @errno@. This avoids the multiple results issue encountered with straight error codes but otherwise many of the same advantages and disadvantages. It does however introduce one other major disadvantage: Everything that uses that global location must agree on all possible errors. \item\emph{Return Union}: Replaces error codes with a tagged union. Success is one tag and the errors are another. It is also possible to make each possible error its own tag and carry its own additional information, but the two branch format is easy to make generic so that one type can be used everywhere in error handling code. This pattern is very popular in functional or semi-functional language, anything with primitive support for tagged unions (or algebraic data types). % We need listing Rust/rust to format code snipits from it. % Rust's \code{rust}{Result} The main disadvantage is again it puts code on the main execution path. This is also the first technique that allows for more information about an error, other than one of a fix-set of ids, to be sent. They can be missed but some languages can force that they are checked. It is also implicitly forced in any languages with checked union access. \item\emph{Handler Functions}: On error the function that produced the error calls another function to handle it. The handler function can be provided locally (passed in as an argument, either directly as as a field of a structure/object) or globally (a global variable). C++ uses this as its fallback system if exception handling fails. \snake{std::terminate_handler} and for a time \snake{std::unexpected_handler} Handler functions work a lot like resumption exceptions. The difference is they are more expencive to set up but cheaper to use, and so are more suited to more fequent errors. The exception being global handlers if they are rarely change as the time in both cases strinks towards zero. \end{itemize} %\subsection Because of their cost exceptions are rarely used for hot paths of execution. There is an element of self-fulfilling prophocy here as implementation techniques have been designed to make exceptions cheap to set-up at the cost of making them expencive to use. Still, use of exceptions for other tasks is more common in higher-level scripting languages. An iconic example is Python's StopIteration exception which is thrown by an iterator to indicate that it is exausted. Combined with Python's heavy use of the iterator based for-loop. % https://docs.python.org/3/library/exceptions.html#StopIteration