Changeset 44a88528 for doc/theses/andrew_beach_MMath/unwinding.tex
- Timestamp:
- Mar 25, 2020, 2:08:43 PM (5 years ago)
- Branches:
- ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- 2a3b019, 63863f8
- Parents:
- 6c6e36c (diff), c72ea7a (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
- File:
- 1 edited
doc/theses/andrew_beach_MMath/unwinding.tex
--- doc/theses/andrew_beach_MMath/unwinding.tex (r6c6e36c)
+++ doc/theses/andrew_beach_MMath/unwinding.tex (r44a88528)
@@ -1,54 +1,42 @@
 \chapter{Unwinding in \CFA}
 
-When a function returns, a \emph{single} stack frame is unwound, removing the
-function's parameters and local variables, and control continues in the
-function's caller using the caller's stack frame. When an exception is raised,
-\emph{multiple} stack frames are unwound, removing the function parameters and
-local variables for called functions from the exception raise-frame to the
-exception catch-frame.
+Stack unwinding is the process of removing things from the stack. Within
+functions and on function return this is handled directly by the code in the
+function itself as it knows exactly what is on the stack just from the
+current location in the function. Unwinding across stack frames means that it
+is no longer knows exactly what is on the stack or even how much of the stack
+needs to be removed.
 
-Unwinding multiple levels is simple for a programming languages without object
-destructors or block finalizers because a direct transfer is possible from the
-current stack frame to a prior stack frame, where control continues at a
-location within the prior caller's function. For example, C provides non-local
-transfer using $longjmp$, which stores a function's state including its
-frame pointer and program counter, and simply reloads this information to
-continue at this prior location on the stack.
+Even this is fairly simple if nothing needs to happen when the stack unwinds.
+Traditional C can unwind the stack by saving and restoring state (with
+\codeC{setjmp} \& \codeC{longjmp}). However many languages define actions that
+have to be taken when something is removed from the stack, such as running
+a variable's destructor or a \codeCFA{try} statement's \codeCFA{finally}
+clause. Handling this requires walking the stack going through each stack
+frame.
 
-For programming languages with object destructors or block finalizers it is
-necessary to walk the stack frames from raise to catch, checking for code that
-must be executed as part of terminating each frame. Walking the stack has a
-higher cost, and necessary information must be available to detect
-destructors/finalizers and call them.
+For exceptions, this means everything from the point the exception is raised
+to the point it is caught, while checking each frame for handlers during the
+stack walk to find out where it should be caught. This is where the most of
+the expense and complexity of exception handling comes from.
 
-A powerful package to provide stack-walking capabilities is $libunwind$,
-which is used in this work to provide exception handling in \CFA. The following
-explains how $libunwind$ works and how it is used.
-
-% Stack unwinding is the process of removing things from the stack from outside
-% the functions there. In languages that don't provide a way to guaranty that
-% code will run when the program leaves a scope or finishes a function, this
-% can be relatively trivial. C does this with $longjmp$ by setting the
-% stack pointer and a few other registers.
+To do all of this we use libunwind, a low level library that provides tools
+for stack walking and stack unwinding. What follows is an overview of all the
+relivant features of libunwind and then how \CFA uses them to implement its
+exception handling.
 
 \section{libunwind Usage}
 
-\CFA uses two primary functions in $libunwind$ to create most of its
-exceptional control-flow: $_Unwind_RaiseException$ and $_Unwind_ForcedUnwind$.
+\CFA uses two primary functions in libunwind to create most of its
+exceptional control-flow: \codeC{_Unwind_RaiseException} and
+\codeC{_Unwind_ForcedUnwind}.
 Their operation is divided into two phases: search and clean-up. The search
 phase -- phase 1 -- is used to scan the stack but not unwinding it. The
 clean-up phase -- phase 2 -- is used for unwinding.
 
-% Somewhere around here I need to talk about the control structures.
-% $_Unwind_Exception$ is used to carry the API's universal data. Some
-% of this is internal, other fields are used to communicate between different
-% exception handling mechanisms in different runtimes.
-% $_Unwind_Context$ is an opaque data structure that is used to pass
-% information to helper functions.
-
 The raise-exception function uses both phases. It starts by searching for a
 handler, and if found, performs a clean-up phase to unwind the stack to the
 handler. If a handler is not found, control returns allowing the
 exception-handling policy for unhandled exception to be executed. During both
 phases, the raise-exception function searches down the stack, calling each
 function's \emph{personality function}.
@@ -56,53 +44,61 @@
 A personality function performs three tasks, although not all have to be
 present. The tasks performed are decided by the actions provided.
-% Something argument something bitmask.
+\codeC{_Unwind_Action} is a bitmask of possible actions and an argument of
+this type is passed into the personality function.
 \begin{itemize}
-\item$_UA_SEARCH_PHASE$ is called during the clean-up phase and means search
-for handlers. If a hander is found, the personality function should return
-$_URC_HANDLER_FOUND$, otherwise it returns $_URC_CONTINUE_UNWIND$.
-{\color{red}What is the connection between finding the handler and the
-personality function?}
-\item$_UA_CLEANUP_PHASE$ is passed in during the clean-up phase and means part
-or all of the stack frame is removed. The personality function should do
-whatever clean-up the language defines (such as running destructors/finalizers)
-and then generally returns $_URC_CONTINUE_UNWIND$.
-\item$_UA_HANDLER_FRAME$ means the personality function must install a
-handler. It is also passed in during the clean-up phase and is in addition to
-the clean-up action. $libunwind$ provides several helpers for the personality
+\item\codeC{_UA_SEARCH_PHASE} is passed in search phase and tells the
+personality function to check for handlers. If there is a handler in this
+stack frame, as defined by the language, the personality function should
+return \codeC{_URC_HANDLER_FOUND}. Otherwise it should return
+\codeC{_URC_CONTINUE_UNWIND}.
+\item\codeC{_UA_CLEANUP_PHASE} is passed in during the clean-up phase and
+means part or all of the stack frame is removed. The personality function
+should do whatever clean-up the language defines
+(such as running destructors/finalizers) and then generally returns
+\codeC{_URC_CONTINUE_UNWIND}.
+\item\codeC{_UA_HANDLER_FRAME} means the personality function must install
+a handler. It is also passed in during the clean-up phase and is in addition
+to the clean-up action. libunwind provides several helpers for the personality
 function here. Once it is done, the personality function must return
-$_URC_INSTALL_CONTEXT$.
+\codeC{_URC_INSTALL_CONTEXT}.
 \end{itemize}
+The personality function is given a number of other arguments. Some are for
+compatability and there is the \codeC{struct _Unwind_Context} pointer which
+passed to many helpers to get information about the current stack frame.
 
-Forced unwind only performs the clean-up phase. It is similar to the phase 2
-section of raise exception with a few changes. A simple difference is that it
-passes in an extra action to the personality function $_UA_FORCE_UNWIND$, which
-means a handler cannot be installed. The most difference significant is the
-addition of the $stop$ function, which is passed in as an argument to forced
-unwind.
+Forced-unwind only performs the clean-up phase. It takes three arguments:
+a pointer to the exception, a pointer to the stop function and a pointer to
+the stop parameter. It does most of the same things as phase two of
+raise-exception but with some extras.
+The first it passes in an extra action to the personality function on each
+stack frame, \codeC{_UA_FORCE_UNWIND}, which means a handler cannot be
+installed.
 
-The $stop$ function is similar to a personality function. It takes an extra
-argument: a $void$ pointer passed into force unwind. It may return
-$_URC_NO_REASON$ to continue unwinding or it can transfer control out of the
-unwind code using its own mechanism.
+The big change is that forced-unwind calls the stop function. Each time it
+steps into a frame, before calling the personality function, it calls the
+stop function. The stop function receives all the same arguments as the
+personality function will and the stop parameter supplied to forced-unwind.
+
+The stop function is called one more time at the end of the stack after all
+stack frames have been removed. By the standard API this is marked by setting
+the stack pointer inside the context passed to the stop function. However both
+GCC and Clang add an extra action for this case \codeC{_UA_END_OF_STACK}.
+
+Each time function the stop function is called it can do one or two things.
+When it is not the end of the stack it can return \codeC{_URC_NO_REASON} to
+continue unwinding.
 % Is there a reason that NO_REASON is used instead of CONTINUE_UNWIND?
-The $stop$ function is called for each stack frame and at the end of the
-stack. In a stack frame, it is called before the personality routine with the
-same arguments (except for the extra $void$ pointer). At the end of the stack,
-the arguments are mostly the same, except the stack pointer stored in the
-context is set to null. Because of this change, both GCC and Clang add an extra
-action in this case $_UA_END_OF_STACK$. The $stop$ function may not return at
-the end of the stack.
-
-{\color{red}This needs work as I do not understand all of it.}
-
+Its only other option is to use its own means to transfer control elsewhere
+and never return to its caller. It may always do this and no additional tools
+are provided to do it.
 
 \section{\CFA Implementation}
 
-To use $libunwind$, \CFA provides several wrappers, its own storage,
-personality functions, and a $stop$ function.
+To use libunwind, \CFA provides several wrappers, its own storage,
+personality functions, and a stop function.
 
 The wrappers perform three tasks: set-up, clean-up and controlling the
 unwinding. The set-up allocates a copy of the \CFA exception into a handler to
 control its lifetime, and stores it in the exception context. Clean-up -- run
 when control exits a catch clause and returns to normal code -- frees the
 exception copy.
@@ -110,12 +106,13 @@
 % runtime/language features. Also the exception context is global.
 
-The control code in the middle {\color{red}(In the middle of what?)} is run
-every time a throw or re-throw is called. It uses raise exception to search for
-a handler and to run it, if one is found. Otherwise, it uses forced unwind to
-unwind the stack, running all destructors, before terminating the process.
+The core control code is called every time a throw -- after set-up -- or
+re-throw is run. It uses raise-exception to search for a handler and to run it
+if one is found. If no handler is found and raise-exception returns then
+forced-unwind is called to run all destructors on the stack before terminating
+the process.
 
-The $stop$ function is very simple. It checks the end of stack flag to see if
-it is finished unwinding. If so, it calls $exit$ to end the process, otherwise
-it tells the system {\color{red}(What system?)} to continue unwinding.
+The stop function is very simple. It checks the end of stack flag to see if
+it is finished unwinding. If so, it calls \codeC{exit} to end the process,
+otherwise it returns with no-reason to continue unwinding.
 % Yeah, this is going to have to change.
 
@@ -123,46 +120,54 @@
 about the function by scanning the LSDA (Language Specific Data Area). This
 step allows a single personality function to be used for multiple functions and
-it accounts for multiple regions {\color{red}(What's a region?)} and possible
-handlers in a single function.
+let that personaliity function figure out exactly where in the function
+execution was, what is currently in the stack frame and what handlers should
+be checked.
 % Not that we do that yet.
 
 However, generating the LSDA is difficult. It requires knowledge about the
-location of the instruction pointer and stack layout, which varies by
-optimization levels. So for frames where there are only destructors, GCC's
-attribute cleanup with the $-fexception$ flag is sufficient to handle unwinding.
+location of the instruction pointer and stack layout, which varies with
+compiler and optimization levels. So for frames where there are only
+destructors, GCC's attribute cleanup with the \texttt{-fexception} flag is
+sufficient to handle unwinding.
 
-For functions with handlers (defined in the $try$ statement) the function is
-split into several functions. Everything outside the $try$ statement is the
-first function, which only has destructors to be run during unwinding. The
-catch clauses of the $try$ block are then converted into GCC inner functions,
-which are passed via function pointers while still having access to the outer
-function's scope. $catchResume$ and $finally$ clauses are handled separately
-and not discussed here.
+The only functions that require more than that are those that contain
+\codeCFA{try} statements. A \codeCFA{try} statement has a \codeCFA{try}
+clause, some number of \codeCFA{catch} clauses and \codeCFA{catchResume}
+clauses and may have a \codeCFA{finally} clause. Of these only \codeCFA{try}
+statements with \codeCFA{catch} clauses need to be transformed and only they
+and the \codeCFA{try} clause are involved.
 
-The $try$ clause {\color{red}You have $try$ statement, $try$ block, and $try$
-clause, which need clarification.)} is converted to a function directly. The
-$catch$ clauses are combined into two functions. The first is the match
-function, which is used during the search phase to find a handler. The second
-it the catch function, which is a large switch-case for the different
-handlers. These functions do not interact with unwinding except for running
-destructors and so can be handled by GCC.
+The \codeCFA{try} statement is converted into a series of closures which can
+access other parts of the function according to scoping rules but can be
+passed around. The \codeCFA{try} clause is converted into the try functions,
+almost entirely unchanged. The \codeCFA{catch} clauses are converted into two
+functions; the match function and the catch function.
 
-These three functions are passed into $try_terminate$, an internal function
-that represents the $try$ statement. This function uses the generated
-personality functions as well as assembly statements to create the LSDA. In
-normal execution, this function only calls the $try$ block closure. However,
-using $libunwind$, its personality function now handles exception matching and
-catching. {\color{red}(I don't understand the last sentence.)}
+Together the match function and the catch function form the code that runs
+when an exception passes out of a try block. The match function is used during
+the search phase, it is passed an exception and checks each handler to see if
+it will handle the exception. It returns an index that repersents which
+handler matched or that none of them did. The catch function is used during
+the clean-up phase, it is passed an exception and the index of a handler. It
+casts the exception to the exception type declared in that handler and then
+runs the handler's body.
 
-During the search phase, the personality function retrieves the match function
-from the stack using the saved stack pointer. The function is called, either
-returning 0 for no match or the index (a positive integer) of the handler for a
-match. If a handler is found, the personality function reports it after saving
-the index to the exception context.
+These three functions are passed to \codeC{try_terminate}. This is an
+% Maybe I shouldn't quote that, it isn't its actual name.
+internal hand-written function that has its own personality function and
+custom assembly LSD does the exception handling in \CFA. During normal
+execution all this function does is call the try function and then return.
+It is only when exceptions are thrown that anything interesting happens.
 
-During the clean-up phase there is nothing for the personality function to
-clean-up in $try_terminate$. So if this is not the handler frame, unwinding
-continues. If this is the handler frame, control is transferred to the catch
-function, giving it the exception and the handler index.
+During the search phase the personality function gets the pointer to the match
+function and calls it. If the match function returns a handler index the
+personality function saves it and reports that the handler has been found,
+otherwise unwinding continues.
+During the clean-up phase the personality function only does anything if the
+handler was found in this frame. If it was then the personality function
+installs the handler, which is setting the instruction pointer in
+\codeC{try_terminate} to an otherwise unused section that calls the catch
+function, passing it the current exception and handler index.
+\codeC{try_terminate} returns as soon as the catch function returns.
 
-{\color{red}This needs work as I do not understand all of it.}
+At this point control has returned to normal control flow.
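Reviewer's sketch (not part of the changeset): the two-phase scheme the new text describes, a search phase that only looks for a handler followed by a clean-up phase that unwinds up to the handler frame, may be easier to follow with a toy model. The C code below is only an illustration under stated assumptions. It does not call libunwind, and every name in it (`toy_action`, `toy_result`, `toy_frame`, `toy_personality`, `toy_raise`) is hypothetical. The "stack" is a plain array, with index 0 as the oldest frame.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names imitate the roles described in the text,
   they are NOT the libunwind API. */
enum toy_action { TOY_SEARCH = 1, TOY_CLEANUP = 2, TOY_HANDLER_FRAME = 4 };
enum toy_result { TOY_CONTINUE, TOY_HANDLER_FOUND, TOY_INSTALL };

struct toy_frame {
    int handles;    /* exception code this frame catches, 0 for none */
    int cleaned_up; /* set to 1 when the clean-up phase removes the frame */
};

/* A single personality function shared by every frame in the toy stack.
   The action bitmask decides which task it performs. */
static enum toy_result toy_personality(int actions, int exc,
                                       struct toy_frame *frame) {
    if (actions & TOY_SEARCH)
        return frame->handles == exc ? TOY_HANDLER_FOUND : TOY_CONTINUE;
    if (actions & TOY_HANDLER_FRAME)
        return TOY_INSTALL;   /* handler frame: unwinding stops here */
    frame->cleaned_up = 1;    /* stands in for running destructors */
    return TOY_CONTINUE;
}

/* Returns the index of the frame whose handler is installed, or -1 when
   the search phase finds no handler; in that case nothing is unwound and
   the caller decides what to do, as with an unhandled exception. */
int toy_raise(struct toy_frame *stack, size_t depth, int exc) {
    size_t handler = depth;   /* depth means "no handler found" */

    /* Phase 1 (search): walk from the newest frame towards the oldest
       without modifying anything. */
    for (size_t i = depth; i-- > 0; ) {
        if (toy_personality(TOY_SEARCH, exc, &stack[i]) == TOY_HANDLER_FOUND) {
            handler = i;
            break;
        }
    }
    if (handler == depth)
        return -1;

    /* Phase 2 (clean-up): walk again, cleaning each frame above the
       handler frame, then ask the handler frame to install its handler. */
    for (size_t i = depth; i-- > handler; ) {
        int actions = TOY_CLEANUP | (i == handler ? TOY_HANDLER_FRAME : 0);
        enum toy_result result = toy_personality(actions, exc, &stack[i]);
        assert((result == TOY_INSTALL) == (i == handler));
    }
    return (int)handler;
}
```

In this model, raising exception code 7 on a four-frame stack whose frame 1 handles 7 cleans up frames 3 and 2, leaves frames 1 and 0 untouched, and returns 1. When no frame handles the code, the call returns -1 and no frame is modified. That mirrors the text's point that raise-exception returns to its caller when the search phase fails.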