source: doc/theses/andrew_beach_MMath/intro.tex @ b041f11

ADTast-experimentalenumforall-pointer-decayjacob/cs343-translationpthread-emulationqualifiedEnum
Last change on this file since b041f11 was 75f8e04, checked in by Andrew Beach <ajbeach@…>, 3 years ago

Andrew MMath: Fix \input to point at the new file name (file renamed a while ago).

  • Property mode set to 100644
File size: 14.5 KB
Line 
1\chapter{Introduction}
2
3% The highest level overview of Cforall and EHMs. Get this done right away.
4This thesis covers the design and implementation of the exception handling
5mechanism (EHM) of
6\CFA (pronounced sea-for-all and may be written Cforall or CFA).
7\CFA is a new programming language that extends C, which maintains
8backwards-compatibility while introducing modern programming features.
9Adding exception handling to \CFA gives it new ways to handle errors and
10make large control-flow jumps.
11
12% Now take a step back and explain what exceptions are generally.
13Exception handling provides dynamic inter-function control flow.
14A language's EHM is a combination of language syntax and run-time
15components that construct, raise, propagate and handle exceptions,
16to provide all of that control flow.
17There are two forms of exception handling covered in this thesis:
18termination, which acts as a multi-level return,
19and resumption, which is a dynamic function call.
20% About other works:
21Often, when this separation is not made, termination exceptions are assumed
22as they are more common and may be the only form of handling provided in
23a language.
24
25All types of exception handling link a raise with a handler.
26Both operations are usually language primitives, although raises can be
27treated as a primitive function that takes an exception argument.
28Handlers are more complex as they are added to and removed from the stack
29during execution, must specify what they can handle and give the code to
30handle the exception.
31
32Exceptions work with different execution models but for the descriptions
33that follow a simple call stack, with functions added and removed in a
34first-in-last-out order, is assumed.
35
36Termination exception handling searches the stack for the handler, then
37unwinds the stack to where the handler was found before calling it.
38The handler is run inside the function that defined it and when it finishes
39it returns control to that function.
40\begin{center}
41\input{termination}
42\end{center}
43
44Resumption exception handling searches the stack for a handler and then calls
45it without removing any other stack frames.
46The handler is run on top of the existing stack, often as a new function or
47closure capturing the context in which the handler was defined.
48After the handler has finished running it returns control to the function
49that preformed the raise, usually starting after the raise.
50\begin{center}
51\input{resumption}
52\end{center}
53
54Although a powerful feature, exception handling tends to be complex to set up
55and expensive to use
56so it is often limited to unusual or ``exceptional" cases.
57The classic example is error handling, exceptions can be used to
58remove error handling logic from the main execution path, and pay
59most of the cost only when the error actually occurs.
60
61\section{Thesis Overview}
62This work describes the design and implementation of the \CFA EHM.
63The \CFA EHM implements all of the common exception features (or an
64equivalent) found in most other EHMs and adds some features of its own.
65The design of all the features had to be adapted to \CFA's feature set as
66some of the underlying tools used to implement and express exception handling
67in other languages are absent in \CFA.
68Still the resulting syntax resembles that of other languages:
69\begin{cfa}
70try {
71        ...
72        T * object = malloc(request_size);
73        if (!object) {
74                throw OutOfMemory{fixed_allocation, request_size};
75        }
76        ...
77} catch (OutOfMemory * error) {
78        ...
79}
80\end{cfa}
81% A note that yes, that was a very fast overview.
82The design and implementation of all of \CFA's EHM's features are
83described in detail throughout this thesis, whether they are a common feature
84or one unique to \CFA.
85
86% The current state of the project and what it contributes.
87All of these features have been implemented in \CFA,
88covering both changes to the compiler and the run-time.
89In addition, a suite of test cases and performance benchmarks were created
90along side the implementation.
91The implementation techniques are generally applicable in other programming
92languages and much of the design is as well.
93Some parts of the EHM use other features unique to \CFA and would be
94harder to replicate in other programming languages.
95
96The contributions of this work are:
97\begin{enumerate}
98\item Designing \CFA's exception handling mechanism, adapting designs from
99other programming languages and creating new features.
100\item Implementing stack unwinding and the \CFA EHM, including updating
101the \CFA compiler and the run-time environment.
102\item Designed and implemented a prototype virtual system.
103% I think the virtual system and per-call site default handlers are the only
104% "new" features, everything else is a matter of implementation.
105\item Creating tests to check the behaviour of the EHM.
106\item Creating benchmarks to check the performances of the EHM,
107as compared to other languages.
108\end{enumerate}
109
110The rest of this thesis is organized as follows.
111The current state of exceptions is covered in \autoref{s:background}.
112The existing state of \CFA is also covered in \autoref{c:existing}.
113New EHM features are introduced in \autoref{c:features},
114covering their usage and design.
115That is followed by the implementation of these features in
116\autoref{c:implement}.
117Performance results are examined in \autoref{c:performance}.
118Possibilities to extend this project are discussed in \autoref{c:future}.
119Finally, the project is summarized in \autoref{c:conclusion}.
120
121\section{Background}
122\label{s:background}
123
124Exception handling has been examined before in programming languages,
125with papers on the subject dating back 70s.\cite{Goodenough75}
126Early exceptions were often treated as signals, which carried no information
127except their identity. Ada still uses this system.\todo{cite Ada}
128
129The modern flag-ship for termination exceptions is \Cpp,
130which added them in its first major wave of non-object-orientated features
131in 1990.
132\todo{cite https://en.cppreference.com/w/cpp/language/history}
133Many EHMs have special exception types,
134however \Cpp has the ability to use any type as an exception.
135These were found to be not very useful and have been pushed aside for classes
136inheriting from
137\code{C++}{std::exception}.
138Although there is a special catch-all syntax (@catch(...)@) there are no
139operations that can be performed on the caught value, not even type inspection.
140Instead the base exception-type \code{C++}{std::exception} defines common
141functionality (such as
142the ability to describe the reason the exception was raised) and all
143exceptions have this functionality.
144That trade-off, restricting usable types to gain guaranteed functionality,
145is almost universal now, as without some common functionality it is almost
146impossible to actually handle any errors.
147
148Java was the next popular language to use exceptions. \todo{cite Java}
149Its exception system largely reflects that of \Cpp, except that requires
150you throw a child type of \code{Java}{java.lang.Throwable}
151and it uses checked exceptions.
152Checked exceptions are part of a function's interface,
153the exception signature of the function.
154Every function that could be raised from a function, either directly or
155because it is not handled from a called function, is given.
156Using this information, it is possible to statically verify if any given
157exception is handled and guarantee that no exception will go unhandled.
158Making exception information explicit improves clarity and safety,
159but can slow down or restrict programming.
160For example, programming high-order functions becomes much more complex
161if the argument functions could raise exceptions.
162However, as odd it may seem, the worst problems are rooted in the simple
163inconvenience of writing and updating exception signatures.
164This has caused Java programmers to develop multiple programming ``hacks''
165to circumvent checked exceptions, negating their advantages.
166One particularly problematic example is the ``catch-and-ignore'' pattern,
167where an empty handler is used to handle an exception without doing any
168recovery or repair. In theory that could be good enough to properly handle
169the exception, but more often is used to ignore an exception that the       
170programmer does not feel is worth the effort of handling it, for instance if
171they do not believe it will ever be raised.
172If they are incorrect the exception will be silenced, while in a similar
173situation with unchecked exceptions the exception would at least activate   
174the language's unhandled exception code (usually program abort with an 
175error message).
176
177%\subsection
178Resumption exceptions are less popular,
179although resumption is as old as termination; hence, few
180programming languages have implemented them.
181% http://bitsavers.informatik.uni-stuttgart.de/pdf/xerox/parc/techReports/
182%   CSL-79-3_Mesa_Language_Manual_Version_5.0.pdf
183Mesa is one programming language that did.\todo{cite Mesa} Experience with Mesa
184is quoted as being one of the reasons resumptions were not
185included in the \Cpp standard.
186% https://en.wikipedia.org/wiki/Exception_handling
187Since then resumptions have been ignored in main-stream programming languages.
188However, resumption is being revisited in the context of decades of other
189developments in programming languages.
190While rejecting resumption may have been the right decision in the past,
191the situation has changed since then.
192Some developments, such as the function programming equivalent to resumptions,
193algebraic effects\cite{Zhang19}, are enjoying success.
194A complete reexamination of resumptions is beyond this thesis,
195but there reemergence is enough to try them in \CFA.
196% Especially considering how much easier they are to implement than
197% termination exceptions and how much Peter likes them.
198
199%\subsection
200Functional languages tend to use other solutions for their primary error
201handling mechanism, but exception-like constructs still appear.
202Termination appears in the error construct, which marks the result of an
203expression as an error; then the result of any expression that tries to use
204it also results in an error, and so on until an appropriate handler is reached.
205Resumption appears in algebraic effects, where a function dispatches its
206side-effects to its caller for handling.
207
208%\subsection
209More recently exceptions seem to be vanishing from newer programming
210languages, replaced by ``panic".
211In Rust, a panic is just a program level abort that may be implemented by
212unwinding the stack like in termination exception handling.\todo{cite Rust}
213% https://doc.rust-lang.org/std/panic/fn.catch_unwind.html
214Go's panic through is very similar to a termination, except it only supports
215a catch-all by calling \code{Go}{recover()}, simplifying the interface at
216the cost of flexibility.\todo{cite Go}
217
218%\subsection
219While exception handling's most common use cases are in error handling,
220here are some other ways to handle errors with comparisons with exceptions.
221\begin{itemize}
222\item\emph{Error Codes}:
223This pattern has a function return an enumeration (or just a set of fixed
224values) to indicate if an error has occurred and possibly which error it was.
225
226Error codes mix exceptional/error and normal values, enlarging the range of
227possible return values. This can be addressed with multiple return values
228(or a tuple) or a tagged union.
229However, the main issue with error codes is forgetting to check them,
230which leads to an error being quietly and implicitly ignored.
231Some new languages and tools will try to issue warnings when an error code
232is discarded to avoid this problem.
233Checking error codes also bloats the main execution path,
234especially if the error is not handled immediately hand has to be passed
235through multiple functions before it is addressed.
236
237\item\emph{Special Return with Global Store}:
238Similar to the error codes pattern but the function itself only returns
239that there was an error
240and store the reason for the error in a fixed global location.
241For example many routines in the C standard library will only return some
242error value (such as -1 or a null pointer) and the error code is written into
243the standard variable @errno@.
244
245This approach avoids the multiple results issue encountered with straight
246error codes but otherwise has the same disadvantages and more.
247Every function that reads or writes to the global store must agree on all
248possible errors and managing it becomes more complex with concurrency.
249
250\item\emph{Return Union}:
251This pattern replaces error codes with a tagged union.
252Success is one tag and the errors are another.
253It is also possible to make each possible error its own tag and carry its own
254additional information, but the two branch format is easy to make generic
255so that one type can be used everywhere in error handling code.
256
257This pattern is very popular in any functional or semi-functional language
258with primitive support for tagged unions (or algebraic data types).
259% We need listing Rust/rust to format code snippets from it.
260% Rust's \code{rust}{Result<T, E>}
261The main advantage is that an arbitrary object can be used to represent an
262error so it can include a lot more information than a simple error code.
263The disadvantages include that the it does have to be checked along the main
264execution and if there aren't primitive tagged unions proper usage can be
265hard to enforce.
266
267\item\emph{Handler Functions}:
268This pattern associates errors with functions.
269On error, the function that produced the error calls another function to
270handle it.
271The handler function can be provided locally (passed in as an argument,
272either directly as as a field of a structure/object) or globally (a global
273variable).
274C++ uses this approach as its fallback system if exception handling fails,
275such as \snake{std::terminate_handler} and, for a time,
276\snake{std::unexpected_handler}.
277
278Handler functions work a lot like resumption exceptions,
279but without the dynamic search for a handler.
280Since setting up the handler can be more complex/expensive,
281especially when the handler has to be passed through multiple layers of
282function calls, but cheaper (constant time) to call,
283they are more suited to more frequent (less exceptional) situations.
284\end{itemize}
285
286%\subsection
287Because of their cost, exceptions are rarely used for hot paths of execution.
288Hence, there is an element of self-fulfilling prophecy as implementation
289techniques have been focused on making them cheap to set-up,
290happily making them expensive to use in exchange.
291This difference is less important in higher-level scripting languages,
292where using exception for other tasks is more common.
293An iconic example is Python's \code{Python}{StopIteration} exception that
294is thrown by an iterator to indicate that it is exhausted.
295When paired with Python's iterator-based for-loop this will be thrown every
296time the end of the loop is reached.
297\todo{Cite Python StopIteration and for-each loop.}
298% https://docs.python.org/3/library/exceptions.html#StopIteration
Note: See TracBrowser for help on using the repository browser.