# Changeset cb6b8cb

Ignore:
Timestamp:
Aug 12, 2021, 3:13:22 PM (16 months ago)
Branches:
enum, forall-pointer-decay, jacob/cs343-translation, master, pthread-emulation, qualifiedEnum
Children:
be497c6
Parents:
f42a6b8
Message:

Andrew MMath: Fixes in the conclusion and main body. Used Peter's infroduction feedback; which was never wrong in the first place.

Location:
doc/theses/andrew_beach_MMath
Files:
3 edited

### Legend:

Unmodified
 rf42a6b8 % The highest level overview of Cforall and EHMs. Get this done right away. This thesis goes over the design and implementation of the exception handling This thesis covers the design and implementation of the exception handling mechanism (EHM) of \CFA (pronounced sea-for-all and may be written Cforall or CFA). \CFA is a new programming language that extends C, that maintains \CFA is a new programming language that extends C, which maintains backwards-compatibility while introducing modern programming features. Adding exception handling to \CFA gives it new ways to handle errors and make other large control-flow jumps. make large control-flow jumps. % Now take a step back and explain what exceptions are generally. Exception handling provides dynamic inter-function control flow. A language's EHM is a combination of language syntax and run-time components that construct, raise, propagate and handle exceptions, to provide all of that control flow. There are two forms of exception handling covered in this thesis: termination, which acts as a multi-level return, and resumption, which is a dynamic function call. Termination handling is much more common, to the extent that it is often seen This seperation is uncommon because termination exception handling is so much more common that it is often assumed. % WHY: Mention other forms of continuation and \cite{CommonLisp} here? A language's EHM is the combination of language syntax and run-time components that are used to construct, raise and handle exceptions, including all control flow. Termination exception handling allows control to return to any previous function on the stack directly, skipping any functions between it and the current function. % About other works: Often, when this separation is not made, termination exceptions are assumed as they are more common and may be the only form of handling provided in a language. All types of exception handling link a raise with a handler. Both operations are usually language primitives, although raises can be treated as a primitive function that takes an exception argument. Handlers are more complex as they are added to and removed from the stack during execution, must specify what they can handle and give the code to handle the exception. Exceptions work with different execution models but for the descriptions that follow a simple call stack, with functions added and removed in a first-in-last-out order, is assumed. Termination exception handling searches the stack for the handler, then unwinds the stack to where the handler was found before calling it. The handler is run inside the function that defined it and when it finishes it returns control to that function. \begin{center} \input{callreturn} \end{center} Resumption exception handling seaches the stack for a handler and then calls it without adding or removing any other stack frames. \todo{Add a diagram showing control flow for resumption.} Resumption exception handling searches the stack for a handler and then calls it without removing any other stack frames. The handler is run on top of the existing stack, often as a new function or closure capturing the context in which the handler was defined. After the handler has finished running it returns control to the function that preformed the raise, usually starting after the raise. \begin{center} \input{resumption} \end{center} Although a powerful feature, exception handling tends to be complex to set up and expensive to use so they are often limited to unusual or exceptional" cases. The classic example of this is error handling, exceptions can be used to remove error handling logic from the main execution path and while paying so it is often limited to unusual or exceptional" cases. The classic example is error handling, exceptions can be used to remove error handling logic from the main execution path, and pay most of the cost only when the error actually occurs. } \end{cfa} % A note that yes, that was a very fast overview. The design and implementation of all of \CFA's EHM's features are % The current state of the project and what it contributes. All of these features have been implemented in \CFA, along with a suite of test cases as part of this project. All of these features have been implemented in \CFA, covering both changes to the compiler and the run-time. In addition, a suite of test cases and performance benchmarks were created along side the implementation. The implementation techniques are generally applicable in other programming languages and much of the design is as well. Some parts of the EHM use other features unique to \CFA and these would be Some parts of the EHM use other features unique to \CFA and would be harder to replicate in other programming languages. % Talk about other programming languages. Some existing programming languages that include EHMs/exception handling include C++, Java and Python. All three examples focus on termination exceptions which unwind the stack as part of the Exceptions also can replace return codes and return unions. The contributions of this work are: \begin{enumerate} \item Designing \CFA's exception handling mechanism, adapting designs from other programming languages and the creation of new features. \item Implementing stack unwinding and the EHM in \CFA, including updating the compiler and the run-time environment. other programming languages and creating new features. \item Implementing stack unwinding and the \CFA EHM, including updating the \CFA compiler and the run-time environment. \item Designed and implemented a prototype virtual system. % I think the virtual system and per-call site default handlers are the only % "new" features, everything else is a matter of implementation. \item Creating tests to check the behaviour of the EHM. \item Creating benchmarks to check the performances of the EHM, as compared to other languages. \end{enumerate} \todo{I can't figure out a good lead-in to the roadmap.} The next section covers the existing state of exceptions. The rest of this thesis is organized as follows. The current state of exceptions is covered in \autoref{s:background}. The existing state of \CFA is also covered in \autoref{c:existing}. The new features are introduced in \autoref{c:features}, which explains their usage and design. That is followed by the implementation of those features in New EHM features are introduced in \autoref{c:features}, covering their usage and design. That is followed by the implementation of these features in \autoref{c:implement}. The performance results are examined in \autoref{c:performance}. Performance results are examined in \autoref{c:performance}. Possibilities to extend this project are discussed in \autoref{c:future}. Finally, the project is summarized in \autoref{c:conclusion}. \section{Background} \label{s:background} Exception handling is not a new concept, with papers on the subject dating back 70s.\cite{Goodenough} Early exceptions were often treated as signals. They carried no information except their identity. Ada still uses this system. Exception handling has been examined before in programming languages, with papers on the subject dating back 70s.\cite{Goodenough75} Early exceptions were often treated as signals, which carried no information except their identity. Ada still uses this system.\todo{cite Ada} The modern flag-ship for termination exceptions is \Cpp, which added them in its first major wave of non-object-orientated features in 1990. % https://en.cppreference.com/w/cpp/language/history \Cpp has the ability to use any value of any type as an exception. However that seems to immediately pushed aside for classes inherited from \todo{cite https://en.cppreference.com/w/cpp/language/history} Many EHMs have special exception types, however \Cpp has the ability to use any type as an exception. These were found to be not very useful and have been pushed aside for classes inheriting from \code{C++}{std::exception}. Although there is a special catch-all syntax it does not allow anything to be done with the caught value becuase nothing is known about it. So instead a base type is defined with some common functionality (such as Although there is a special catch-all syntax (@catch(...)@) there are no operations that can be performed on the caught value, not even type inspection. Instead the base exception-type \code{C++}{std::exception} defines common functionality (such as the ability to describe the reason the exception was raised) and all exceptions have that functionality. This seems to be the standard now, as the garentied functionality is worth any lost flexibility from limiting it to a single type. Java was the next popular language to use exceptions. exceptions have this functionality. That trade-off, restricting usable types to gain guaranteed functionality, is almost universal now, as without some common functionality it is almost impossible to actually handle any errors. Java was the next popular language to use exceptions. \todo{cite Java} Its exception system largely reflects that of \Cpp, except that requires you throw a child type of \code{Java}{java.lang.Throwable} and it uses checked exceptions. Checked exceptions are part of the function interface they are raised from. This includes functions they propogate through, until a handler for that type of exception is found. This makes exception information explicit, which can improve clarity and safety, but can slow down programming. Some of these, such as dealing with high-order methods or an overly specified throws clause, are technical. However some of the issues are much more human, in that writing/updating all the exception signatures can be enough of a burden people will hack the system to avoid them. Including the catch-and-ignore" pattern where a catch block is used without anything to repair or recover from the exception. %\subsection Resumption exceptions have been much less popular. Although resumption has a history as old as termination's, very few Checked exceptions are part of a function's interface, the exception signature of the function. Every function that could be raised from a function, either directly or because it is not handled from a called function, is given. Using this information, it is possible to statically verify if any given exception is handled and guarantee that no exception will go unhandled. Making exception information explicit improves clarity and safety, but can slow down or restrict programming. For example, programming high-order functions becomes much more complex if the argument functions could raise exceptions. However, as odd it may seem, the worst problems are rooted in the simple inconvenience of writing and updating exception signatures. This has caused Java programmers to develop multiple programming hacks'' to circumvent checked exceptions, negating their advantages. One particularly problematic example is the catch-and-ignore'' pattern, where an empty handler is used to handle an exception without doing any recovery or repair. In theory that could be good enough to properly handle the exception, but more often is used to ignore an exception that the programmer does not feel is worth the effort of handling it, for instance if they do not believe it will ever be raised. If they are incorrect the exception will be silenced, while in a similar situation with unchecked exceptions the exception would at least activate the language's unhandled exception code (usually program abort with an error message). %\subsection Resumption exceptions are less popular, although resumption is as old as termination; hence, few programming languages have implemented them. % http://bitsavers.informatik.uni-stuttgart.de/pdf/xerox/parc/techReports/ %   CSL-79-3_Mesa_Language_Manual_Version_5.0.pdf Mesa is one programming languages that did. Experiance with Mesa Mesa is one programming language that did.\todo{cite Mesa} Experience with Mesa is quoted as being one of the reasons resumptions were not included in the \Cpp standard. % https://en.wikipedia.org/wiki/Exception_handling Since then resumptions have been ignored in the main-stream. All of this does call into question the use of resumptions, is something largely rejected decades ago worth revisiting now? Yes, even if it was the right call at the time there have been decades of other developments in computer science that have changed the situation since then. Some of these developments, such as in functional programming's resumption equivalent: algebraic effects\cite{Zhang19}, are directly related to resumptions as well. A complete rexamination of resumptions is beyond a single paper, but it is enough to try them again in \CFA. Since then resumptions have been ignored in main-stream programming languages. However, resumption is being revisited in the context of decades of other developments in programming languages. While rejecting resumption may have been the right decision in the past, the situation has changed since then. Some developments, such as the function programming equivalent to resumptions, algebraic effects\cite{Zhang19}, are enjoying success. A complete reexamination of resumptions is beyond this thesis, but there reemergence is enough to try them in \CFA. % Especially considering how much easier they are to implement than % termination exceptions. % termination exceptions and how much Peter likes them. %\subsection Functional languages tend to use other solutions for their primary error handling mechanism, exception-like constructs still appear. Termination appears in error construct, which marks the result of an expression as an error, the result of any expression that tries to use it as an error, and so on until an approprate handler is reached. Resumption appears in algebric effects, where a function dispatches its handling mechanism, but exception-like constructs still appear. Termination appears in the error construct, which marks the result of an expression as an error; then the result of any expression that tries to use it also results in an error, and so on until an appropriate handler is reached. Resumption appears in algebraic effects, where a function dispatches its side-effects to its caller for handling. More recently exceptions seem to be vanishing from newer programming languages, replaced by panic". In Rust a panic is just a program level abort that may be implemented by unwinding the stack like in termination exception handling. In Rust, a panic is just a program level abort that may be implemented by unwinding the stack like in termination exception handling.\todo{cite Rust} % https://doc.rust-lang.org/std/panic/fn.catch_unwind.html Go's panic through is very similar to a termination except it only supports Go's panic through is very similar to a termination, except it only supports a catch-all by calling \code{Go}{recover()}, simplifying the interface at the cost of flexability. %\subsection Exception handling's most common use cases are in error handling. Here are some other ways to handle errors and comparisons with exceptions. the cost of flexibility.\todo{cite Go} %\subsection While exception handling's most common use cases are in error handling, here are some other ways to handle errors with comparisons with exceptions. \begin{itemize} \item\emph{Error Codes}: This pattern uses an enumeration (or just a set of fixed values) to indicate that an error has occured and which error it was. There are some issues if a function wants to return an error code and another value. The main issue is that it can be easy to forget checking the error code, which can lead to an error being quitely and implicitly ignored. Some new languages have tools that raise warnings if the return value is discarded to avoid this. It also puts more code on the main execution path. This pattern has a function return an enumeration (or just a set of fixed values) to indicate if an error has occurred and possibly which error it was. Error codes mix exceptional/error and normal values, enlarging the range of possible return values. This can be addressed with multiple return values (or a tuple) or a tagged union. However, the main issue with error codes is forgetting to check them, which leads to an error being quietly and implicitly ignored. Some new languages and tools will try to issue warnings when an error code is discarded to avoid this problem. Checking error codes also bloats the main execution path, especially if the error is not handled immediately hand has to be passed through multiple functions before it is addressed. \item\emph{Special Return with Global Store}: A function that encounters an error returns some value indicating that it encountered a value but store which error occured in a fixed global location. Perhaps the C standard @errno@ is the most famous example of this, where some standard library functions will return some non-value (often a NULL pointer) and set @errno@. This avoids the multiple results issue encountered with straight error codes but otherwise many of the same advantages and disadvantages. It does however introduce one other major disadvantage: Everything that uses that global location must agree on all possible errors. Similar to the error codes pattern but the function itself only returns that there was an error and store the reason for the error in a fixed global location. For example many routines in the C standard library will only return some error value (such as -1 or a null pointer) and the error code is written into the standard variable @errno@. This approach avoids the multiple results issue encountered with straight error codes but otherwise has the same disadvantages and more. Every function that reads or writes to the global store must agree on all possible errors and managing it becomes more complex with concurrency. \item\emph{Return Union}: Replaces error codes with a tagged union. This pattern replaces error codes with a tagged union. Success is one tag and the errors are another. It is also possible to make each possible error its own tag and carry its own so that one type can be used everywhere in error handling code. This pattern is very popular in functional or semi-functional language, anything with primitive support for tagged unions (or algebraic data types). % We need listing Rust/rust to format code snipits from it. This pattern is very popular in any functional or semi-functional language with primitive support for tagged unions (or algebraic data types). % We need listing Rust/rust to format code snippets from it. % Rust's \code{rust}{Result} The main disadvantage is again it puts code on the main execution path. This is also the first technique that allows for more information about an error, other than one of a fix-set of ids, to be sent. They can be missed but some languages can force that they are checked. It is also implicitly forced in any languages with checked union access. The main advantage is that an arbitrary object can be used to represent an error so it can include a lot more information than a simple error code. The disadvantages include that the it does have to be checked along the main execution and if there aren't primitive tagged unions proper usage can be hard to enforce. \item\emph{Handler Functions}: On error the function that produced the error calls another function to This pattern associates errors with functions. On error, the function that produced the error calls another function to handle it. The handler function can be provided locally (passed in as an argument, either directly as as a field of a structure/object) or globally (a global variable). C++ uses this as its fallback system if exception handling fails. \snake{std::terminate_handler} and for a time \snake{std::unexpected_handler} Handler functions work a lot like resumption exceptions. The difference is they are more expencive to set up but cheaper to use, and so are more suited to more fequent errors. The exception being global handlers if they are rarely change as the time in both cases strinks towards zero. C++ uses this approach as its fallback system if exception handling fails, such as \snake{std::terminate_handler} and, for a time, \snake{std::unexpected_handler}. Handler functions work a lot like resumption exceptions, but without the dynamic search for a handler. Since setting up the handler can be more complex/expensive, especially when the handler has to be passed through multiple layers of function calls, but cheaper (constant time) to call, they are more suited to more frequent (less exceptional) situations. \end{itemize} %\subsection Because of their cost exceptions are rarely used for hot paths of execution. There is an element of self-fulfilling prophocy here as implementation techniques have been designed to make exceptions cheap to set-up at the cost of making them expencive to use. Still, use of exceptions for other tasks is more common in higher-level scripting languages. An iconic example is Python's StopIteration exception which is thrown by an iterator to indicate that it is exausted. Combined with Python's heavy use of the iterator based for-loop. Because of their cost, exceptions are rarely used for hot paths of execution. Hence, there is an element of self-fulfilling prophecy as implementation techniques have been focused on making them cheap to set-up, happily making them expensive to use in exchange. This difference is less important in higher-level scripting languages, where using exception for other tasks is more common. An iconic example is Python's \code{Python}{StopIteration} exception that is thrown by an iterator to indicate that it is exhausted. When paired with Python's iterator-based for-loop this will be thrown every time the end of the loop is reached. \todo{Cite Python StopIteration and for-each loop.} % https://docs.python.org/3/library/exceptions.html#StopIteration