-                      rae66348
+                      rc72ea7a
 \chapter{Unwinding in \CFA}
+When a function returns, a \emph{single} stack frame is unwound, removing the
 function's parameters and local variables, and control continues in the
 function's caller using the caller's stack frame.  When an exception is raised,
+\emph{multiple} stack frames are unwound, removing the function parameters and
+local variables for called functions from the exception raise-frame to the
 exception catch-frame.
+Stack unwinding is the process of removing things from the stack. Within
+functions and on function return this is handled directly by the code in the
+function itself as it knows exactly what is on the stack just from the
+current location in the function. When unwinding across stack frames, the
+unwinding code no longer knows exactly what is on the stack or even how much of
+the stack needs to be removed.
+Unwinding multiple levels is simple for a programming language without object
+destructors or block finalizers because a direct transfer is possible from the
+current stack frame to a prior stack frame, where control continues at a
+location within the prior caller's function. For example, C provides non-local
+transfer using $longjmp$, which stores a function's state including its
+frame pointer and program counter, and simply reloads this information to
 continue at this prior location on the stack.
+Even this is fairly simple if nothing needs to happen when the stack unwinds.
+Traditional C can unwind the stack by saving and restoring state (with
+\codeC{setjmp} \& \codeC{longjmp}). However many languages define actions that
+have to be taken when something is removed from the stack, such as running
+a variable's destructor or a \codeCFA{try} statement's \codeCFA{finally}
+clause. Handling this requires walking the stack going through each stack
+frame.
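The non-local transfer described above can be sketched in plain C as follows. This is only an illustration: the names \codeC{recovery_point}, \codeC{raise_code}, \codeC{frames_entered}, \codeC{inner} and \codeC{run_with_recovery} are hypothetical and are not part of \CFA or libunwind. Several nested frames are discarded in one step and nothing is run for the frames that are skipped.

```c
#include <setjmp.h>

jmp_buf recovery_point;   /* saved state of the frame that "catches" */
int raise_code;           /* value handed from the raise site to the catch site */
int frames_entered;       /* how many nested frames were active at the raise */

/* Recurse to the requested depth, then transfer directly to recovery_point. */
static void inner(int depth) {
    frames_entered++;
    if (depth == 0) {
        raise_code = 42;
        longjmp(recovery_point, 1);   /* does not return */
    }
    inner(depth - 1);
}

/* Save the current state, run the nested calls, and report what was raised. */
int run_with_recovery(int depth) {
    frames_entered = 0;
    raise_code = 0;
    if (setjmp(recovery_point) == 0) {
        inner(depth);                 /* control leaves via longjmp */
        return 0;
    }
    /* Reached only through longjmp; the frames of inner() are simply gone. */
    return raise_code;
}
```

Calling \codeC{run_with_recovery(3)} enters four frames of \codeC{inner} before the transfer. No clean-up code runs for any of them, which is why this approach is not sufficient once destructors or \codeCFA{finally} clauses exist.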
+For programming languages with object destructors or block finalizers it is
+necessary to walk the stack frames from raise to catch, checking for code that
+must be executed as part of terminating each frame. Walking the stack has a
+higher cost, and necessary information must be available to detect
+destructors/finalizers and call them.
+For exceptions, this means everything from the point the exception is raised
+to the point it is caught, while checking each frame for handlers during the
+stack walk to find out where it should be caught. This is where most of
+the expense and complexity of exception handling comes from.
+A powerful package to provide stack-walking capabilities is $libunwind$,
+which is used in this work to provide exception handling in \CFA. The following
+explains how $libunwind$ works and how it is used.
+% Stack unwinding is the process of removing things from the stack from outside
+% the functions there. In languages that don't provide a way to guarantee that
+% code will run when the program leaves a scope or finishes a function, this
+% can be relatively trivial. C does this with $longjmp$ by setting the
+% stack pointer and a few other registers.
+To do all of this we use libunwind, a low level library that provides tools
+for stack walking and stack unwinding. What follows is an overview of all the
+relevant features of libunwind and then how \CFA uses them to implement its
+exception handling.
 \section{libunwind Usage}
+\CFA uses two primary functions in $libunwind$ to create most of its
+exceptional control-flow: $_Unwind_RaiseException$ and $_Unwind_ForcedUnwind$.
+\CFA uses two primary functions in libunwind to create most of its
+exceptional control-flow: \codeC{_Unwind_RaiseException} and
+\codeC{_Unwind_ForcedUnwind}.
 Their operation is divided into two phases: search and clean-up. The search
 phase -- phase 1 -- scans the stack but does not unwind it. The
 clean-up phase -- phase 2 -- is used for unwinding.
-% Somewhere around here I need to talk about the control structures.
-% $_Unwind_Exception$ is used to carry the API's universal data. Some
-% of this is internal, other fields are used to communicate between different
-% exception handling mechanisms in different runtimes.
-% $_Unwind_Context$ is an opaque data structure that is used to pass
-% information to helper functions.
 The raise-exception function uses both phases. It starts by searching for a
 handler, and if found, performs a clean-up phase to unwind the stack to the
 handler. If a handler is not found, control returns allowing the
 exception-handling policy for unhandled exceptions to be executed.  During both
+exception-handling policy for unhandled exceptions to be executed. During both
 phases, the raise-exception function searches down the stack, calling each
 function's \emph{personality function}.
 …
 A personality function performs three tasks, although not all have to be
 present. The tasks performed are decided by the actions provided.
+% Something argument something bitmask.
+\codeC{_Unwind_Action} is a bitmask of possible actions and an argument of
+this type is passed into the personality function.
 \begin{itemize}
+\item$_UA_SEARCH_PHASE$ is passed in during the search phase and means search
+for handlers. If a handler is found, the personality function should return
+$_URC_HANDLER_FOUND$, otherwise it returns $_URC_CONTINUE_UNWIND$.
+{\color{red}What is the connection between finding the handler and the
+personality function?}
+\item$_UA_CLEANUP_PHASE$ is passed in during the clean-up phase and means part
+or all of the stack frame is removed. The personality function should do
+whatever clean-up the language defines (such as running destructors/finalizers)
+and then generally returns $_URC_CONTINUE_UNWIND$.
+\item$_UA_HANDLER_FRAME$ means the personality function must install a
+handler. It is also passed in during the clean-up phase and is in addition to
+the clean-up action. $libunwind$ provides several helpers for the personality
+\item\codeC{_UA_SEARCH_PHASE} is passed in during the search phase and tells the
+personality function to check for handlers. If there is a handler in this
+stack frame, as defined by the language, the personality function should
+return \codeC{_URC_HANDLER_FOUND}. Otherwise it should return
+\codeC{_URC_CONTINUE_UNWIND}.
+\item\codeC{_UA_CLEANUP_PHASE} is passed in during the clean-up phase and
+means part or all of the stack frame is removed. The personality function
+should do whatever clean-up the language defines
+(such as running destructors/finalizers) and then generally returns
+\codeC{_URC_CONTINUE_UNWIND}.
+\item\codeC{_UA_HANDLER_FRAME} means the personality function must install
+a handler. It is also passed in during the clean-up phase and is in addition
+to the clean-up action. libunwind provides several helpers for the personality
 function here. Once it is done, the personality function must return
 $_URC_INSTALL_CONTEXT$.
+\codeC{_URC_INSTALL_CONTEXT}.
 \end{itemize}
+The personality function is given a number of other arguments. Some are for
+compatibility and there is the \codeC{struct _Unwind_Context} pointer which
+is passed to many helpers to get information about the current stack frame.
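The dispatch on actions described in the list above can be sketched as a C function with the personality-function signature declared in \codeC{<unwind.h>}. This is a minimal sketch, not the \CFA personality function: \codeC{demo_personality} and \codeC{frame_has_handler} are hypothetical names, and the flag stands in for whatever a real implementation would learn about the current frame. A real implementation would also use the context helpers before returning \codeC{_URC_INSTALL_CONTEXT}; that step is only marked by a comment here.

```c
#include <stddef.h>
#include <unwind.h>

/* Hypothetical stand-in for "this frame contains a matching handler". */
int frame_has_handler = 0;

_Unwind_Reason_Code demo_personality(int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context) {
    /* Unused in this sketch. */
    (void)version; (void)exception_class; (void)exception; (void)context;

    if (actions & _UA_SEARCH_PHASE) {
        /* Phase 1: only report whether a handler exists; remove nothing. */
        return frame_has_handler ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
    }
    if (actions & _UA_CLEANUP_PHASE) {
        if (actions & _UA_HANDLER_FRAME) {
            /* Phase 2 in the handler frame: a real personality function
               would set up the landing site through the context here. */
            return _URC_INSTALL_CONTEXT;
        }
        /* Phase 2 in any other frame: language clean-up would run here. */
        return _URC_CONTINUE_UNWIND;
    }
    /* No recognised action was supplied. */
    return _URC_FATAL_PHASE1_ERROR;
}
```

Because the actions argument is a bitmask, the handler-frame case is tested inside the clean-up case: as the list states, \codeC{_UA_HANDLER_FRAME} is passed in addition to \codeC{_UA_CLEANUP_PHASE}.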
+Forced unwind only performs the clean-up phase. It is similar to the phase 2
+section of raise exception with a few changes. A simple difference is that it
+passes in an extra action to the personality function $_UA_FORCE_UNWIND$, which
+means a handler cannot be installed. The most significant difference is the
+addition of the $stop$ function, which is passed in as an argument to forced
+unwind.
+Forced-unwind only performs the clean-up phase. It takes three arguments:
+a pointer to the exception, a pointer to the stop function and a pointer to
+the stop parameter. It does most of the same things as phase two of
+raise-exception but with some extras.
+First, it passes in an extra action to the personality function on each
+stack frame, \codeC{_UA_FORCE_UNWIND}, which means a handler cannot be
+installed.
+The $stop$ function is similar to a personality function. It takes an extra
+argument: a $void$ pointer passed into force unwind. It may return
+$_URC_NO_REASON$ to continue unwinding or it can transfer control out of the
+unwind code using its own mechanism.
+The big change is that forced-unwind calls the stop function. Each time it
+steps into a frame, before calling the personality function, it calls the
+stop function. The stop function receives all the same arguments as the
+personality function will and the stop parameter supplied to forced-unwind.
+The stop function is called one more time at the end of the stack after all
+stack frames have been removed. By the standard API this is marked by setting
+the stack pointer inside the context passed to the stop function. However both
+GCC and Clang add an extra action for this case \codeC{_UA_END_OF_STACK}.
+Each time the stop function is called it can do one of two things.
+When it is not the end of the stack it can return \codeC{_URC_NO_REASON} to
+continue unwinding.
 % Is there a reason that NO_REASON is used instead of CONTINUE_UNWIND?
+The $stop$ function is called for each stack frame and at the end of the
+stack. In a stack frame, it is called before the personality routine with the
+same arguments (except for the extra $void$ pointer). At the end of the stack,
+the arguments are mostly the same, except the stack pointer stored in the
+context is set to null. Because of this change, both GCC and Clang add an extra
+action in this case $_UA_END_OF_STACK$.  The $stop$ function may not return at
+the end of the stack.
+{\color{red}This needs work as I do not understand all of it.}
+Its only other option is to use its own means to transfer control elsewhere
+and never return to its caller. It may always do this and no additional tools
+are provided to do it.
 \section{\CFA Implementation}
 To use $libunwind$, \CFA provides several wrappers, its own storage,
 personality functions, and a $stop$ function.
+To use libunwind, \CFA provides several wrappers, its own storage,
+personality functions, and a stop function.
 The wrappers perform three tasks: set-up, clean-up and controlling the
 unwinding. The set-up allocates a copy of the \CFA exception into a handler to
 control its lifetime, and stores it in the exception context.  Clean-up -- run
+control its lifetime, and stores it in the exception context. Clean-up -- run
 when control exits a catch clause and returns to normal code -- frees the
 exception copy.
 …
 % runtime/language features. Also the exception context is global.
+The control code in the middle {\color{red}(In the middle of what?)} is run
+every time a throw or re-throw is called. It uses raise exception to search for
+a handler and to run it, if one is found. Otherwise, it uses forced unwind to
+unwind the stack, running all destructors, before terminating the process.
+The core control code is called every time a throw -- after set-up -- or
+re-throw is run. It uses raise-exception to search for a handler and to run it
+if one is found. If no handler is found and raise-exception returns then
+forced-unwind is called to run all destructors on the stack before terminating
+the process.
 The $stop$ function is very simple. It checks the end of stack flag to see if
 it is finished unwinding. If so, it calls $exit$ to end the process, otherwise
 it tells the system {\color{red}(What system?)} to continue unwinding.
+The stop function is very simple. It checks the end of stack flag to see if
+it is finished unwinding. If so, it calls \codeC{exit} to end the process,
+otherwise it returns with no-reason to continue unwinding.
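A stop function of the shape just described might look like the following sketch. It assumes the \codeC{_UA_END_OF_STACK} action that GCC and Clang provide, as mentioned earlier. The names \codeC{demo_stop} and \codeC{struct demo_stop_state} are hypothetical, and the frame counter is added purely so the behaviour can be observed; this is not the \CFA runtime's actual stop function.

```c
#include <stdlib.h>
#include <unwind.h>

/* Hypothetical stop parameter: passed to forced-unwind and handed back to
   the stop function on every call. */
struct demo_stop_state {
    int frames_seen;   /* number of frames visited so far */
    int exit_status;   /* status used when the end of the stack is reached */
};

_Unwind_Reason_Code demo_stop(int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context,
        void *stop_parameter) {
    /* Unused in this sketch. */
    (void)version; (void)exception_class; (void)exception; (void)context;

    struct demo_stop_state *state = stop_parameter;
    if (actions & _UA_END_OF_STACK) {
        /* Every frame has been unwound: end the process instead of returning. */
        exit(state->exit_status);
    }
    /* Still inside the stack: record the frame and let unwinding continue. */
    state->frames_seen++;
    return _URC_NO_REASON;
}
```

A function with this signature, together with a pointer to a \codeC{struct demo_stop_state}, would be supplied as the second and third arguments of \codeC{_Unwind_ForcedUnwind}.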
 % Yeah, this is going to have to change.
 …
 about the function by scanning the LSDA (Language Specific Data Area). This
 step allows a single personality function to be used for multiple functions and
+it accounts for multiple regions{\color{red}(What's a region?)} and possible
+handlers in a single function.
+lets that personality function figure out exactly where in the function
+execution was, what is currently in the stack frame and what handlers should
+be checked.
 % Not that we do that yet.
 However, generating the LSDA is difficult. It requires knowledge about the
+location of the instruction pointer and stack layout, which varies by
+optimization levels. So for frames where there are only destructors, GCC's
+attribute cleanup with the $-fexceptions$ flag is sufficient to handle unwinding.
+location of the instruction pointer and stack layout, which varies with
+compiler and optimization levels. So for frames where there are only
+destructors, GCC's attribute cleanup with the \texttt{-fexceptions} flag is
+sufficient to handle unwinding.
+For functions with handlers (defined in the $try$ statement) the function is
+split into several functions. Everything outside the $try$ statement is the
+first function, which only has destructors to be run during unwinding. The
+catch clauses of the $try$ block are then converted into GCC inner functions,
+which are passed via function pointers while still having access to the outer
+function's scope. $catchResume$ and $finally$ clauses are handled separately
+and not discussed here.
+The only functions that require more than that are those that contain
+\codeCFA{try} statements. A \codeCFA{try} statement has a \codeCFA{try}
+clause, some number of \codeCFA{catch} clauses and \codeCFA{catchResume}
+clauses and may have a \codeCFA{finally} clause. Of these only \codeCFA{try}
+statements with \codeCFA{catch} clauses need to be transformed and only they
+and the \codeCFA{try} clause are involved.
+The $try$ clause {\color{red}(You have $try$ statement, $try$ block, and $try$
+clause, which need clarification.)} is converted to a function directly. The
+$catch$ clauses are combined into two functions. The first is the match
+function, which is used during the search phase to find a handler. The second
+is the catch function, which is a large switch-case for the different
+handlers. These functions do not interact with unwinding except for running
+destructors and so can be handled by GCC.
+The \codeCFA{try} statement is converted into a series of closures which can
+access other parts of the function according to scoping rules but can be
+passed around. The \codeCFA{try} clause is converted into the try functions,
+almost entirely unchanged. The \codeCFA{catch} clauses are converted into two
+functions; the match function and the catch function.
+These three functions are passed into $try_terminate$, an internal function
+that represents the $try$ statement. This function uses the generated
+personality functions as well as assembly statements to create the LSDA.  In
+normal execution, this function only calls the $try$ block closure. However,
+using $libunwind$, its personality function now handles exception matching and
+catching. {\color{red}(I don't understand the last sentence.)}
+Together the match function and the catch function form the code that runs
+when an exception passes out of a try block. The match function is used during
+the search phase: it is passed an exception and checks each handler to see if
+it will handle the exception. It returns an index that represents which
+handler matched or that none of them did. The catch function is used during
+the clean-up phase: it is passed an exception and the index of a handler. It
+casts the exception to the exception type declared in that handler and then
+runs the handler's body.
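The division of labour between the match function and the catch function can be sketched in plain C. Everything in this sketch is hypothetical: the exception layout (\codeC{struct demo_exception} and the two types that embed it), the names \codeC{demo_match} and \codeC{demo_catch}, and the \codeC{last_handled} variable that stands in for a handler body. Real \CFA exceptions and the generated closures look different. The sketch follows the convention that 0 means no match and a positive index identifies the handler.

```c
#include <stddef.h>

/* Hypothetical exception representation: a tag plus per-type data. */
enum demo_kind { DEMO_IO = 1, DEMO_PARSE = 2, DEMO_OTHER = 3 };
struct demo_exception  { enum demo_kind kind; };
struct io_exception    { struct demo_exception base; int error_code; };
struct parse_exception { struct demo_exception base; int line; };

/* Stand-in for the visible effect of running a handler body. */
int last_handled = 0;

/* Search phase: return the index of the first handler that accepts the
   exception, or 0 if none of the handlers match. */
int demo_match(const struct demo_exception *exc) {
    if (exc->kind == DEMO_IO)    return 1;   /* first catch clause */
    if (exc->kind == DEMO_PARSE) return 2;   /* second catch clause */
    return 0;
}

/* Clean-up phase: cast to the type the chosen handler declared and run
   that handler's body. */
void demo_catch(struct demo_exception *exc, int handler_index) {
    switch (handler_index) {
    case 1: {
        struct io_exception *e = (struct io_exception *)exc;
        last_handled = e->error_code;
        break;
    }
    case 2: {
        struct parse_exception *e = (struct parse_exception *)exc;
        last_handled = e->line;
        break;
    }
    default:
        break;   /* index 0 is never passed to the catch function */
    }
}
```

Keeping the two functions separate means the search phase can decide whether a frame handles the exception without running any handler code; the index returned by the match function is what is later handed to the catch function.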
+During the search phase, the personality function retrieves the match function
+from the stack using the saved stack pointer. The function is called, either
+returning 0 for no match or the index (a positive integer) of the handler for a
+match. If a handler is found, the personality function reports it after saving
+the index to the exception context.
+These three functions are passed to \codeC{try_terminate}. This is an
+% Maybe I shouldn't quote that, it isn't its actual name.
+internal hand-written function that has its own personality function and
+custom assembly LSDA to do the exception handling in \CFA. During normal
+execution all this function does is call the try function and then return.
+It is only when exceptions are thrown that anything interesting happens.
+During the clean-up phase there is nothing for the personality function to
+clean up in $try_terminate$. So if this is not the handler frame, unwinding
+continues. If this is the handler frame, control is transferred to the catch
+function, giving it the exception and the handler index.
+During the search phase the personality function gets the pointer to the match
+function and calls it. If the match function returns a handler index the
+personality function saves it and reports that the handler has been found,
+otherwise unwinding continues.
+During the clean-up phase the personality function only does anything if the
+handler was found in this frame. If it was then the personality function
+installs the handler, which means setting the instruction pointer in
+\codeC{try_terminate} to an otherwise unused section that calls the catch
+function, passing it the current exception and handler index.
+\codeC{try_terminate} returns as soon as the catch function returns.
+{\color{red}This needs work as I do not understand all of it.}
+At this point control has returned to normal control flow.


Changeset c72ea7a for doc/theses/andrew_beach_MMath/unwinding.tex
