Mar 24, 2020, 4:37:09 PM (2 years ago)
Andrew Beach <ajbeach@…>
arm-eh, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast, new-ast-unique-expr

ABMM thesis: Lots of context reworking. Took out Peter's formating and put back parts that I understood and used.

1 edited


  • doc/theses/andrew_beach_MMath/unwinding.tex

    rae66348 rc72ea7a  
    11\chapter{Unwinding in \CFA}
    3 When a function returns, a \emph{single} stack frame is unwound, removing the
    4 function's parameters and local variables, and control continues in the
    5 function's caller using the caller's stack frame.  When an exception is raised,
    6 \emph{multiple} stack frames are unwound, removing the function parameters and
    7 local variables for called functions from the exception raise-frame to the
    8 exception catch-frame.
     3Stack unwinding is the process of removing things from the stack. Within
     4functions and on function return this is handled directly by the code in the
     5function itself as it knows exactly what is on the stack just from the
     6current location in the function. Unwinding across stack frames means that the
     7code no longer knows exactly what is on the stack or even how much of the stack
     8needs to be removed.
    10 Unwinding multiple levels is simple for programming languages without object
    11 destructors or block finalizers because a direct transfer is possible from the
    12 current stack frame to a prior stack frame, where control continues at a
    13 location within the prior caller's function. For example, C provides non-local
    14 transfer using $longjmp$, which stores a function's state including its
    15 frame pointer and program counter, and simply reloads this information to
    16 continue at this prior location on the stack.
     10Even this is fairly simple if nothing needs to happen when the stack unwinds.
     11Traditional C can unwind the stack by saving and restoring state (with
     12\codeC{setjmp} \& \codeC{longjmp}). However many languages define actions that
     13have to be taken when something is removed from the stack, such as running
     14a variable's destructor or a \codeCFA{try} statement's \codeCFA{finally}
     15clause. Handling this requires walking the stack going through each stack
    18 For programming languages with object destructors or block finalizers it is
    19 necessary to walk the stack frames from raise to catch, checking for code that
    20 must be executed as part of terminating each frame. Walking the stack has a
    21 higher cost, and necessary information must be available to detect
    22 destructors/finalizers and call them.
     18For exceptions, this means everything from the point the exception is raised
     19to the point it is caught, while checking each frame for handlers during the
     20stack walk to find out where it should be caught. This is where most of
     21the expense and complexity of exception handling come from.
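The simple case, where nothing has to happen as frames are discarded, can be illustrated with \codeC{setjmp} and \codeC{longjmp}. The following is a minimal sketch, not code from \CFA; all function and variable names are hypothetical. It shows control jumping from an inner call directly back to a saved point, skipping the intermediate frame without giving it any chance to clean up.

```c
#include <setjmp.h>

/* Hypothetical names, for illustration only. */
static jmp_buf catch_point;

static void inner(int fail) {
    if (fail) {
        /* Jump straight back to the setjmp call, discarding the frames
         * of inner() and middle() without running any code in them. */
        longjmp(catch_point, 1);
    }
}

static void middle(int fail) {
    inner(fail);
    /* Not reached when inner() jumps: any clean-up placed here is skipped. */
}

/* Returns 0 after a normal return and 1 if control came back via longjmp. */
int run_with_nonlocal_exit(int fail) {
    if (setjmp(catch_point) != 0) {
        return 1;
    }
    middle(fail);
    return 0;
}
```

Calling \codeC{run_with_nonlocal_exit(1)} takes the \codeC{longjmp} path. This is exactly the situation that breaks down once frames own destructors or \codeCFA{finally} clauses, which is why a stack walk is needed.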
    24 A powerful package to provide stack-walking capabilities is $libunwind$,
    25 which is used in this work to provide exception handling in \CFA. The following
    26 explains how $libunwind$ works and how it is used.
    28 % Stack unwinding is the process of removing things from the stack from outside
    29 % the functions there. In languages that don't provide a way to guaranty that
    30 % code will run when the program leaves a scope or finishes a function, this
    31 % can be relatively trivial. C does this with $longjmp$ by setting the
    32 % stack pointer and a few other registers.
     23To do all of this we use libunwind, a low level library that provides tools
     24for stack walking and stack unwinding. What follows is an overview of all the
     25relevant features of libunwind and then how \CFA uses them to implement its
     26exception handling.
    3428\section{libunwind Usage}
    36 \CFA uses two primary functions in $libunwind$ to create most of its
    37 exceptional control-flow: $_Unwind_RaiseException$ and $_Unwind_ForcedUnwind$.
     30\CFA uses two primary functions in libunwind to create most of its
     31exceptional control-flow: \codeC{_Unwind_RaiseException} and
    3833Their operation is divided into two phases: search and clean-up. The search
    3934phase -- phase 1 -- scans the stack without unwinding it. The
    4035clean-up phase -- phase 2 -- is used for unwinding.
    42 % Somewhere around here I need to talk about the control structures.
    43 % $_Unwind_Exception$ is used to carry the API's universal data. Some
    44 % of this is internal, other fields are used to communicate between different
    45 % exception handling mechanisms in different runtimes.
    46 % $_Unwind_Context$ is an opaque data structure that is used to pass
    47 % information to helper functions.
    4937The raise-exception function uses both phases. It starts by searching for a
    5038handler, and if found, performs a clean-up phase to unwind the stack to the
    5139handler. If a handler is not found, control returns allowing the
    52 exception-handling policy for unhandled exceptions to be executed.  During both
     40exception-handling policy for unhandled exceptions to be executed. During both
    5341phases, the raise-exception function searches down the stack, calling each
    5442function's \emph{personality function}.
    5644A personality function performs three tasks, although not all have to be
    5745present. The tasks performed are decided by the actions provided.
    58 % Something argument something bitmask.
     46\codeC{_Unwind_Action} is a bitmask of possible actions and an argument of
     47this type is passed into the personality function.
    60 \item$_UA_SEARCH_PHASE$ is called during the clean-up phase and means search
    61 for handlers. If a handler is found, the personality function should return
    62 $_URC_HANDLER_FOUND$, otherwise it returns $_URC_CONTINUE_UNWIND$.
    63 {\color{red}What is the connection between finding the handler and the
    64 personality function?}
    65 \item$_UA_CLEANUP_PHASE$ is passed in during the clean-up phase and means part
    66 or all of the stack frame is removed. The personality function should do
    67 whatever clean-up the language defines (such as running destructors/finalizers)
    68 and then generally returns $_URC_CONTINUE_UNWIND$.
    69 \item$_UA_HANDLER_FRAME$ means the personality function must install a
    70 handler. It is also passed in during the clean-up phase and is in addition to
    71 the clean-up action. $libunwind$ provides several helpers for the personality
     49\item\codeC{_UA_SEARCH_PHASE} is passed in during the search phase and tells the
     50personality function to check for handlers. If there is a handler in this
     51stack frame, as defined by the language, the personality function should
     52return \codeC{_URC_HANDLER_FOUND}. Otherwise it should return
     54\item\codeC{_UA_CLEANUP_PHASE} is passed in during the clean-up phase and
     55means part or all of the stack frame is removed. The personality function
     56should do whatever clean-up the language defines
     57(such as running destructors/finalizers) and then generally returns
     59\item\codeC{_UA_HANDLER_FRAME} means the personality function must install
     60a handler. It is also passed in during the clean-up phase and is in addition
     61to the clean-up action. libunwind provides several helpers for the personality
    7262function here. Once it is done, the personality function must return
     65The personality function is given a number of other arguments. Some are for
     66compatibility and there is the \codeC{struct _Unwind_Context} pointer, which
     67is passed to many helpers to get information about the current stack frame.
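The way a personality function dispatches on the action bitmask can be sketched as follows. This is a skeleton only, assuming the \codeC{<unwind.h>} header shipped with GCC or Clang on a non-ARM-EHABI target. The name \codeC{sketch_personality} and the flag \codeC{frame_has_handler} are hypothetical stand-ins for a real lookup of language-specific data. The \codeC{_URC_INSTALL_CONTEXT} result for the handler frame comes from the Itanium C++ ABI rather than from the text above.

```c
#include <stddef.h>
#include <unwind.h>

/* Hypothetical stand-in for "the language says this frame has a handler". */
int frame_has_handler = 0;

/* Skeleton personality function: it only shows how the action bits are
 * interpreted and does not inspect the frame or install anything. */
_Unwind_Reason_Code sketch_personality(
        int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context) {
    (void)version; (void)exception_class; (void)exception; (void)context;

    if (actions & _UA_SEARCH_PHASE) {
        /* Phase 1: report whether this frame would handle the exception. */
        return frame_has_handler ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
    }
    if (actions & _UA_CLEANUP_PHASE) {
        if (actions & _UA_HANDLER_FRAME) {
            /* Phase 2, handler frame: a real implementation would use the
             * context helpers to set registers and the instruction pointer
             * before asking the unwinder to install the context. */
            return _URC_INSTALL_CONTEXT;
        }
        /* Phase 2, ordinary frame: run language clean-up, then continue. */
        return _URC_CONTINUE_UNWIND;
    }
    /* No recognised action bit was set. */
    return _URC_FATAL_PHASE1_ERROR;
}
```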
    76 Forced unwind only performs the clean-up phase. It is similar to the phase 2
    77 section of raise exception with a few changes. A simple difference is that it
    78 passes in an extra action to the personality function $_UA_FORCE_UNWIND$, which
    79 means a handler cannot be installed. The most significant difference is the
    80 addition of the $stop$ function, which is passed in as an argument to forced
    81 unwind.
     69Forced-unwind only performs the clean-up phase. It takes three arguments:
     70a pointer to the exception, a pointer to the stop function and a pointer to
     71the stop parameter. It does most of the same things as phase two of
     72raise-exception but with some extras.
     73First, it passes in an extra action to the personality function on each
     74stack frame, \codeC{_UA_FORCE_UNWIND}, which means a handler cannot be
    83 The $stop$ function is similar to a personality function. It takes an extra
    84 argument: a $void$ pointer passed into force unwind. It may return
    85 $_URC_NO_REASON$ to continue unwinding or it can transfer control out of the
    86 unwind code using its own mechanism.
     77The big change is that forced-unwind calls the stop function. Each time it
     78steps into a frame, before calling the personality function, it calls the
     79stop function. The stop function receives all the same arguments as the
     80personality function will and the stop parameter supplied to forced-unwind.
     82The stop function is called one more time at the end of the stack after all
     83stack frames have been removed. By the standard API this is marked by setting
     84the stack pointer inside the context passed to the stop function. However both
     85GCC and Clang add an extra action for this case \codeC{_UA_END_OF_STACK}.
     87Each time the stop function is called it can do one of two things.
     88When it is not the end of the stack it can return \codeC{_URC_NO_REASON} to
     89continue unwinding.
    8790% Is there a reason that NO_REASON is used instead of CONTINUE_UNWIND?
    88 The $stop$ function is called for each stack frame and at the end of the
    89 stack. In a stack frame, it is called before the personality routine with the
    90 same arguments (except for the extra $void$ pointer). At the end of the stack,
    91 the arguments are mostly the same, except the stack pointer stored in the
    92 context is set to null. Because of this change, both GCC and Clang add an extra
    93 action in this case $_UA_END_OF_STACK$.  The $stop$ function may not return at
    94 the end of the stack.
    96 {\color{red}This needs work as I do not understand all of it.}
     91Its only other option is to use its own means to transfer control elsewhere
     92and never return to its caller. It may always do this and no additional tools
     93are provided to do it.
    9995\section{\CFA Implementation}
    101 To use $libunwind$, \CFA provides several wrappers, its own storage,
    102 personality functions, and a $stop$ function.
     97To use libunwind, \CFA provides several wrappers, its own storage,
     98personality functions, and a stop function.
    104100The wrappers perform three tasks: set-up, clean-up and controlling the
    105101unwinding. The set-up allocates a copy of the \CFA exception into a handler to
    106 control its lifetime, and stores it in the exception context.  Clean-up -- run
     102control its lifetime, and stores it in the exception context. Clean-up -- run
    107103when control exits a catch clause and returns to normal code -- frees the
    108104exception copy.
    110106% runtime/language features. Also the exception context is global.
    112 The control code in the middle {\color{red}(In the middle of what?)} is run
    113 every time a throw or re-throw is called. It uses raise exception to search for
    114 a handler and to run it, if one is found. Otherwise, it uses forced unwind to
    115 unwind the stack, running all destructors, before terminating the process.
     108The core control code is called every time a throw -- after set-up -- or
     109re-throw is run. It uses raise-exception to search for a handler and to run it
     110if one is found. If no handler is found and raise-exception returns then
     111forced-unwind is called to run all destructors on the stack before terminating
     112the process.
    117 The $stop$ function is very simple. It checks the end of stack flag to see if
    118 it is finished unwinding. If so, it calls $exit$ to end the process, otherwise
    119 it tells the system {\color{red}(What system?)} to continue unwinding.
     114The stop function is very simple. It checks the end of stack flag to see if
     115it is finished unwinding. If so, it calls \codeC{exit} to end the process,
     116otherwise it returns with no-reason to continue unwinding.
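A stop function of the shape just described might look like the sketch below. It assumes the \codeC{<unwind.h>} header from GCC or Clang, including their \codeC{_UA_END_OF_STACK} extension. The name \codeC{sketch_stop} is hypothetical, and the frame counter passed through the stop parameter is added purely for illustration; it is not part of the \CFA runtime.

```c
#include <stdlib.h>
#include <unwind.h>

/* Sketch of a stop function: end the process once the whole stack has been
 * unwound, otherwise tell the unwinder to keep going. */
_Unwind_Reason_Code sketch_stop(
        int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context,
        void *stop_param) {
    (void)version; (void)exception_class; (void)exception; (void)context;

    if (actions & _UA_END_OF_STACK) {
        /* Every frame has been removed, so there is nothing to return to. */
        exit(EXIT_FAILURE);
    }

    /* Illustration only: count the frames visited through the stop parameter. */
    int *frames_seen = stop_param;
    if (frames_seen != NULL) {
        ++*frames_seen;
    }
    return _URC_NO_REASON;  /* continue unwinding */
}
```

In a real unwind this would be supplied as the second argument of \codeC{_Unwind_ForcedUnwind}, with the address of the counter as the third argument.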
    120117% Yeah, this is going to have to change.
    123120about the function by scanning the LSDA (Language Specific Data Area). This
    124121step allows a single personality function to be used for multiple functions and
    125 it accounts for multiple regions{\color{red}(What's a region?)} and possible
    126 handlers in a single function.
     122let that personality function figure out exactly where in the function
     123execution was, what is currently in the stack frame and what handlers should
     124be checked.
    127125% Not that we do that yet.
    129127However, generating the LSDA is difficult. It requires knowledge about the
    130 location of the instruction pointer and stack layout, which varies by
    131 optimization levels. So for frames where there are only destructors, GCC's
    132 attribute cleanup with the $-fexception$ flag is sufficient to handle unwinding.
     128location of the instruction pointer and stack layout, which varies with
     129compiler and optimization levels. So for frames where there are only
     130destructors, GCC's attribute cleanup with the \texttt{-fexceptions} flag is
     131sufficient to handle unwinding.
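As a small illustration of the attribute mentioned above, the sketch below attaches a clean-up function to a local variable using GCC's \codeC{cleanup} attribute. The names \codeC{release}, \codeC{use_resource} and \codeC{cleanups_run} are hypothetical. The example only exercises the normal scope-exit path; running the same clean-up during unwinding additionally relies on compiling with \texttt{-fexceptions}.

```c
/* Counts how many times the clean-up function has run (illustration only). */
int cleanups_run = 0;

/* Clean-up functions receive a pointer to the variable leaving scope. */
static void release(int *resource) {
    (void)resource;
    ++cleanups_run;
}

int use_resource(void) {
    /* GCC arranges for release(&resource) to run when resource goes out of
     * scope, which plays the role of a destructor for this variable. */
    __attribute__((cleanup(release))) int resource = 42;
    return resource;  /* the return value is read before release() runs */
}
```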
    134 For functions with handlers (defined in the $try$ statement) the function is
    135 split into several functions. Everything outside the $try$ statement is the
    136 first function, which only has destructors to be run during unwinding. The
    137 catch clauses of the $try$ block are then converted into GCC inner functions,
    138 which are passed via function pointers while still having access to the outer
    139 function's scope. $catchResume$ and $finally$ clauses are handled separately
    140 and not discussed here.
     133The only functions that require more than that are those that contain
     134\codeCFA{try} statements. A \codeCFA{try} statement has a \codeCFA{try}
     135clause, some number of \codeCFA{catch} clauses and \codeCFA{catchResume}
     136clauses and may have a \codeCFA{finally} clause. Of these only \codeCFA{try}
     137statements with \codeCFA{catch} clauses need to be transformed and only they
     138and the \codeCFA{try} clause are involved.
    142 The $try$ clause {\color{red}You have $try$ statement, $try$ block, and $try$
    143 clause, which need clarification.)} is converted to a function directly. The
    144 $catch$ clauses are combined into two functions. The first is the match
    145 function, which is used during the search phase to find a handler. The second
    146 is the catch function, which is a large switch-case for the different
    147 handlers. These functions do not interact with unwinding except for running
    148 destructors and so can be handled by GCC.
     140The \codeCFA{try} statement is converted into a series of closures which can
     141access other parts of the function according to scoping rules but can be
     142passed around. The \codeCFA{try} clause is converted into the try function,
     143almost entirely unchanged. The \codeCFA{catch} clauses are converted into two
     144functions: the match function and the catch function.
    150 These three functions are passed into $try_terminate$, an internal function
    151 that represents the $try$ statement. This function uses the generated
    152 personality functions as well as assembly statements to create the LSDA.  In
    153 normal execution, this function only calls the $try$ block closure. However,
    154 using $libunwind$, its personality function now handles exception matching and
    155 catching. {\color{red}(I don't understand the last sentence.)}
     146Together the match function and the catch function form the code that runs
     147when an exception passes out of a try block. The match function is used during
     148the search phase; it is passed an exception and checks each handler to see if
     149it will handle the exception. It returns an index that represents which
     150handler matched or that none of them did. The catch function is used during
     151the clean-up phase; it is passed an exception and the index of a handler. It
     152casts the exception to the exception type declared in that handler and then
     153runs the handler's body.
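The division of labour between the two generated functions can be sketched in plain C. This is not the code the \CFA translator emits: the exception representation, the two handler kinds and every name below are hypothetical. The sketch follows the earlier convention of returning 0 for no match and a positive handler index otherwise.

```c
#include <stddef.h>

/* Hypothetical exception representation, for illustration only. */
enum sketch_kind { SKETCH_IO, SKETCH_MATH, SKETCH_OTHER };
struct sketch_exception { enum sketch_kind kind; int payload; };

/* Records what the selected handler did, so the effect is observable. */
int last_handled = 0;

/* Match function (search phase): returns the index of the first handler
 * that accepts the exception, or 0 if none of them does. Models a try
 * statement with two catch clauses, one for IO and one for MATH. */
int sketch_match(const struct sketch_exception *e) {
    if (e->kind == SKETCH_IO) return 1;
    if (e->kind == SKETCH_MATH) return 2;
    return 0;
}

/* Catch function (clean-up phase): one switch case per handler body. */
void sketch_catch(struct sketch_exception *e, int index) {
    switch (index) {
    case 1: last_handled = 100 + e->payload; break;  /* body of catch (IO) */
    case 2: last_handled = 200 + e->payload; break;  /* body of catch (MATH) */
    default: break;                                  /* no handler selected */
    }
}
```

Keeping matching separate from handling lets the search phase decide where the exception will be caught without running any handler code or disturbing the stack.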
    157 During the search phase, the personality function retrieves the match function
    158 from the stack using the saved stack pointer. The function is called, either
    159 returning 0 for no match or the index (a positive integer) of the handler for a
    160 match. If a handler is found, the personality function reports it after saving
    161 the index to the exception context.
     155These three functions are passed to \codeC{try_terminate}. This is an
     156% Maybe I shouldn't quote that, it isn't its actual name.
     157internal hand-written function that has its own personality function and
     158custom assembly LSDA that does the exception handling in \CFA. During normal
     159execution all this function does is call the try function and then return.
     160It is only when exceptions are thrown that anything interesting happens.
    163 During the clean-up phase there is nothing for the personality function to
    164 clean-up in $try_terminate$. So if this is not the handler frame, unwinding
    165 continues. If this is the handler frame, control is transferred to the catch
    166 function, giving it the exception and the handler index.
     162During the search phase the personality function gets the pointer to the match
     163function and calls it. If the match function returns a handler index the
     164personality function saves it and reports that the handler has been found,
     165otherwise unwinding continues.
     166During the clean-up phase the personality function only does anything if the
     167handler was found in this frame. If it was then the personality function
     168installs the handler, which is setting the instruction pointer in
     169\codeC{try_terminate} to an otherwise unused section that calls the catch
     170function, passing it the current exception and handler index.
     171\codeC{try_terminate} returns as soon as the catch function returns.
    168 {\color{red}This needs work as I do not understand all of it.}
     173At this point control has returned to normal control flow.