| 1 | \chapter{Unwinding in \CFA} | 
|---|
| 2 |  | 
|---|
| 3 | Stack unwinding is the process of removing things from the stack. Within | 
|---|
| 4 | functions and on function return this is handled directly by the code in the | 
|---|
| 5 | function itself as it knows exactly what is on the stack just from the | 
|---|
| 6 | current location in the function. Unwinding across stack frames means that | 
|---|
| 7 | the code no longer knows exactly what is on the stack or even how much of the stack | 
|---|
| 8 | needs to be removed. | 
|---|
| 9 |  | 
|---|
| 10 | Even this is fairly simple if nothing needs to happen when the stack unwinds. | 
|---|
| 11 | Traditional C can unwind the stack by saving and restoring state (with | 
|---|
| 12 | \codeC{setjmp} and \codeC{longjmp}). However, many languages define actions that | 
|---|
| 13 | have to be taken when something is removed from the stack, such as running | 
|---|
| 14 | a variable's destructor or a \codeCFA{try} statement's \codeCFA{finally} | 
|---|
| 15 | clause. Handling this requires walking the stack, going through each stack | 
|---|
| 16 | frame. | 
|---|
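As a minimal sketch of the simple case described above, the following C fragment unwinds with \codeC{setjmp} and \codeC{longjmp}. The names \codeC{run}, \codeC{outer}, \codeC{inner} and \codeC{cleanups_run} are hypothetical and exist only for illustration. It shows that the jump discards the intermediate frames without running any code in them, which is exactly why this approach is not enough once destructors or \codeCFA{finally} clauses exist.

```c
#include <setjmp.h>

/* Illustrative names only; none of these come from the CFA runtime. */
static jmp_buf recovery_point;
int cleanups_run = 0;

static void inner(int fail) {
    if (fail) {
        /* Jumps straight back to the setjmp call in run(), discarding
           the frames of inner() and outer() without running their code. */
        longjmp(recovery_point, 1);
    }
}

static void outer(int fail) {
    inner(fail);
    /* Stand-in for a destructor: only reached on a normal return. */
    cleanups_run++;
}

/* Returns 0 on normal completion and 1 if control came back via longjmp. */
int run(int fail) {
    if (setjmp(recovery_point) != 0) {
        return 1;
    }
    outer(fail);
    return 0;
}
```

Calling \codeC{run(0)} increments \codeC{cleanups_run}, while \codeC{run(1)} leaves it untouched because the clean-up in \codeC{outer} is skipped.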
| 17 |  | 
|---|
| 18 | For exceptions, this means everything from the point the exception is raised | 
|---|
| 19 | to the point it is caught, while checking each frame for handlers during the | 
|---|
| 20 | stack walk to find out where it should be caught. This is where most of | 
|---|
| 21 | the expense and complexity of exception handling comes from. | 
|---|
| 22 |  | 
|---|
| 23 | To do all of this we use libunwind, a low level library that provides tools | 
|---|
| 24 | for stack walking and stack unwinding. What follows is an overview of all the | 
|---|
| 25 | relevant features of libunwind and then how \CFA uses them to implement its | 
|---|
| 26 | exception handling. | 
|---|
| 27 |  | 
|---|
| 28 | \section{libunwind Usage} | 
|---|
| 29 |  | 
|---|
| 30 | \CFA uses two primary functions in libunwind to create most of its | 
|---|
| 31 | exceptional control-flow: \codeC{_Unwind_RaiseException} and | 
|---|
| 32 | \codeC{_Unwind_ForcedUnwind}. | 
|---|
| 33 | Their operation is divided into two phases: search and clean-up. The search | 
|---|
| 34 | phase -- phase 1 -- scans the stack but does not unwind it. The | 
|---|
| 35 | clean-up phase -- phase 2 -- is used for unwinding. | 
|---|
| 36 |  | 
|---|
| 37 | The raise-exception function uses both phases. It starts by searching for a | 
|---|
| 38 | handler, and if found, performs a clean-up phase to unwind the stack to the | 
|---|
| 39 | handler. If a handler is not found, control returns, allowing the | 
|---|
| 40 | exception-handling policy for unhandled exceptions to be executed. During both | 
|---|
| 41 | phases, the raise-exception function searches down the stack, calling each | 
|---|
| 42 | function's \emph{personality function}. | 
|---|
| 43 |  | 
|---|
| 44 | A personality function performs three tasks, although not all have to be | 
|---|
| 45 | present. The tasks performed are decided by the actions provided. | 
|---|
| 46 | \codeC{_Unwind_Action} is a bitmask of possible actions and an argument of | 
|---|
| 47 | this type is passed into the personality function. | 
|---|
| 48 | \begin{itemize} | 
|---|
| 49 | \item\codeC{_UA_SEARCH_PHASE} is passed in during the search phase and tells the | 
|---|
| 50 | personality function to check for handlers. If there is a handler in this | 
|---|
| 51 | stack frame, as defined by the language, the personality function should | 
|---|
| 52 | return \codeC{_URC_HANDLER_FOUND}. Otherwise it should return | 
|---|
| 53 | \codeC{_URC_CONTINUE_UNWIND}. | 
|---|
| 54 | \item\codeC{_UA_CLEANUP_PHASE} is passed in during the clean-up phase and | 
|---|
| 55 | means part or all of the stack frame is removed. The personality function | 
|---|
| 56 | should do whatever clean-up the language defines | 
|---|
| 57 | (such as running destructors/finalizers) and then generally returns | 
|---|
| 58 | \codeC{_URC_CONTINUE_UNWIND}. | 
|---|
| 59 | \item\codeC{_UA_HANDLER_FRAME} means the personality function must install | 
|---|
| 60 | a handler. It is also passed in during the clean-up phase and is in addition | 
|---|
| 61 | to the clean-up action. libunwind provides several helpers for the personality | 
|---|
| 62 | function here. Once it is done, the personality function must return | 
|---|
| 63 | \codeC{_URC_INSTALL_CONTEXT}. | 
|---|
| 64 | \end{itemize} | 
|---|
| 65 | The personality function is given a number of other arguments. Some are for | 
|---|
| 66 | compatibility; there is also the \codeC{struct _Unwind_Context} pointer, which is | 
|---|
| 67 | passed to many helpers to get information about the current stack frame. | 
|---|
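The three actions above can be sketched as a skeleton personality function. This is an illustration under stated assumptions, not the \CFA implementation: it assumes the \codeC{<unwind.h>} header shipped with GCC or Clang, and \codeC{struct frame_info} and \codeC{current_frame_info} are hypothetical stand-ins for information a real personality function would read from the frame through the \codeC{struct _Unwind_Context} helpers.

```c
#include <stddef.h>
#include <unwind.h>

/* Hypothetical per-frame description. A real personality function would
   derive this from the frame, e.g. via _Unwind_GetLanguageSpecificData. */
struct frame_info {
    int has_handler;   /* nonzero if this frame can catch the exception */
    int cleanups_run;  /* counts how often the clean-up path was taken */
};

/* Illustrative global standing in for the per-frame lookup. */
struct frame_info *current_frame_info;

_Unwind_Reason_Code sketch_personality(
        int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context)
{
    struct frame_info *info = current_frame_info;
    (void)version; (void)exception_class; (void)exception; (void)context;

    if (actions & _UA_SEARCH_PHASE) {
        /* Phase 1: only report whether a handler exists in this frame. */
        return info->has_handler ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
    }
    if (actions & _UA_CLEANUP_PHASE) {
        /* Phase 2: run whatever clean-up the language defines. */
        info->cleanups_run++;
        if (actions & _UA_HANDLER_FRAME) {
            /* A real implementation would redirect execution here,
               for example with _Unwind_SetIP, before returning. */
            return _URC_INSTALL_CONTEXT;
        }
        return _URC_CONTINUE_UNWIND;
    }
    /* Neither phase flag set: treat as an error in this sketch. */
    return _URC_FATAL_PHASE1_ERROR;
}
```

Because the action argument is a bitmask, the handler-frame check is nested inside the clean-up branch rather than tested on its own.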
| 68 |  | 
|---|
| 69 | Forced-unwind only performs the clean-up phase. It takes three arguments: | 
|---|
| 70 | a pointer to the exception, a pointer to the stop function and a pointer to | 
|---|
| 71 | the stop parameter. It does most of the same things as phase two of | 
|---|
| 72 | raise-exception but with some extras. | 
|---|
| 73 | First, it passes in an extra action to the personality function on each | 
|---|
| 74 | stack frame, \codeC{_UA_FORCE_UNWIND}, which means a handler cannot be | 
|---|
| 75 | installed. | 
|---|
| 76 |  | 
|---|
| 77 | The big change is that forced-unwind calls the stop function. Each time it | 
|---|
| 78 | steps into a frame, before calling the personality function, it calls the | 
|---|
| 79 | stop function. The stop function receives the same arguments as the | 
|---|
| 80 | personality function, plus the stop parameter supplied to forced-unwind. | 
|---|
| 81 |  | 
|---|
| 82 | The stop function is called one more time at the end of the stack after all | 
|---|
| 83 | stack frames have been removed. By the standard API this is marked by setting | 
|---|
| 84 | the stack pointer inside the context passed to the stop function. However, both | 
|---|
| 85 | GCC and Clang add an extra action for this case: \codeC{_UA_END_OF_STACK}. | 
|---|
| 86 |  | 
|---|
| 87 | Each time the stop function is called, it can do one of two things. | 
|---|
| 88 | When it is not the end of the stack it can return \codeC{_URC_NO_REASON} to | 
|---|
| 89 | continue unwinding. | 
|---|
| 90 | % Is there a reason that NO_REASON is used instead of CONTINUE_UNWIND? | 
|---|
| 91 | Its only other option is to use its own means to transfer control elsewhere | 
|---|
| 92 | and never return to its caller. It may always do this and no additional tools | 
|---|
| 93 | are provided to do it. | 
|---|
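The two options open to a stop function can be sketched as follows. This is only an illustration, assuming the \codeC{<unwind.h>} header from GCC or Clang (including the \codeC{_UA_END_OF_STACK} extension mentioned above). The type \codeC{struct stop_state} and the helper \codeC{call_stop_once} are hypothetical, and \codeC{longjmp} is merely one possible way to transfer control elsewhere. The helper invokes the stop function by hand so that the behaviour can be observed without running the unwinder.

```c
#include <setjmp.h>
#include <stddef.h>
#include <unwind.h>

/* Hypothetical stop parameter: a frame counter plus a jump target. */
struct stop_state {
    int frames_seen;
    jmp_buf on_end;
};

/* Has the shape of an _Unwind_Stop_Fn. */
_Unwind_Reason_Code sketch_stop(
        int version, _Unwind_Action actions,
        _Unwind_Exception_Class exception_class,
        struct _Unwind_Exception *exception,
        struct _Unwind_Context *context,
        void *stop_parameter)
{
    struct stop_state *state = stop_parameter;
    (void)version; (void)exception_class; (void)exception; (void)context;

    if (actions & _UA_END_OF_STACK) {
        /* Option two: leave by our own means and never return. */
        longjmp(state->on_end, 1);
    }
    /* Option one: ask the unwinder to continue with the next frame. */
    state->frames_seen++;
    return _URC_NO_REASON;
}

/* Calls the stop function once by hand. Returns 1 if it asked to
   continue unwinding and 0 if it transferred control away instead. */
int call_stop_once(struct stop_state *state, _Unwind_Action actions) {
    if (setjmp(state->on_end) != 0) {
        return 0;
    }
    return sketch_stop(1, actions, 0, NULL, NULL, state) == _URC_NO_REASON;
}
```

In real use the function would be passed to \codeC{_Unwind_ForcedUnwind} together with a pointer to the state, which the unwinder then hands back as the stop parameter on every call.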
| 94 |  | 
|---|
| 95 | \section{\CFA Implementation} | 
|---|
| 96 |  | 
|---|
| 97 | To use libunwind, \CFA provides several wrappers, its own storage, | 
|---|
| 98 | personality functions, and a stop function. | 
|---|
| 99 |  | 
|---|
| 100 | The wrappers perform three tasks: set-up, clean-up and controlling the | 
|---|
| 101 | unwinding. The set-up allocates a copy of the \CFA exception into a handler to | 
|---|
| 102 | control its lifetime, and stores it in the exception context. Clean-up -- run | 
|---|
| 103 | when control exits a catch clause and returns to normal code -- frees the | 
|---|
| 104 | exception copy. | 
|---|
| 105 | % It however does not set up the unwind exception so we can't use any inter- | 
|---|
| 106 | % runtime/language features. Also the exception context is global. | 
|---|
| 107 |  | 
|---|
| 108 | The core control code is called every time a throw -- after set-up -- or | 
|---|
| 109 | re-throw is run. It uses raise-exception to search for a handler and to run it | 
|---|
| 110 | if one is found. If no handler is found and raise-exception returns, then | 
|---|
| 111 | forced-unwind is called to run all destructors on the stack before terminating | 
|---|
| 112 | the process. | 
|---|
| 113 |  | 
|---|
| 114 | The stop function is very simple. It checks the end-of-stack flag to see if | 
|---|
| 115 | it is finished unwinding. If so, it calls \codeC{exit} to end the process, | 
|---|
| 116 | otherwise it returns with no-reason to continue unwinding. | 
|---|
| 117 | % Yeah, this is going to have to change. | 
|---|
| 118 |  | 
|---|
| 119 | The personality routine is more complex because it has to obtain information | 
|---|
| 120 | about the function by scanning the LSDA (Language Specific Data Area). This | 
|---|
| 121 | step allows a single personality function to be used for multiple functions and | 
|---|
| 122 | lets that personality function figure out exactly where in the function | 
|---|
| 123 | execution was, what is currently in the stack frame and what handlers should | 
|---|
| 124 | be checked. | 
|---|
| 125 | % Not that we do that yet. | 
|---|
| 126 |  | 
|---|
| 127 | However, generating the LSDA is difficult. It requires knowledge about the | 
|---|
| 128 | location of the instruction pointer and stack layout, which varies with | 
|---|
| 129 | compiler and optimization levels. So for frames where there are only | 
|---|
| 130 | destructors, GCC's cleanup attribute with the \texttt{-fexceptions} flag is | 
|---|
| 131 | sufficient to handle unwinding. | 
|---|
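As a brief illustration of the cleanup attribute, the sketch below attaches a destructor-like function to a local variable. The names \codeC{destroy_counter}, \codeC{use_resource} and \codeC{destructors_run} are hypothetical. The example only exercises ordinary scope exit; running the same function while an exception unwinds through the frame additionally requires compiling with \texttt{-fexceptions}.

```c
/* Illustrative names only; not taken from the CFA runtime. */
int destructors_run = 0;

/* The cleanup function receives a pointer to the variable leaving scope. */
static void destroy_counter(int *resource) {
    (void)resource;
    destructors_run++;
}

int use_resource(int early_exit) {
    /* destroy_counter(&resource) runs whenever resource goes out of scope,
       whichever return statement is taken. */
    __attribute__((cleanup(destroy_counter))) int resource = 42;

    if (early_exit) {
        return -1;
    }
    return resource;
}
```

Both return paths run the cleanup function, so each call to \codeC{use_resource} increments \codeC{destructors_run} exactly once.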
| 132 |  | 
|---|
| 133 | The only functions that require more than that are those that contain | 
|---|
| 134 | \codeCFA{try} statements. A \codeCFA{try} statement has a \codeCFA{try} | 
|---|
| 135 | clause, some number of \codeCFA{catch} clauses and \codeCFA{catchResume} | 
|---|
| 136 | clauses and may have a \codeCFA{finally} clause. Of these, only \codeCFA{try} | 
|---|
| 137 | statements with \codeCFA{catch} clauses need to be transformed, and only the | 
|---|
| 138 | \codeCFA{catch} clauses and the \codeCFA{try} clause are involved. | 
|---|
| 139 |  | 
|---|
| 140 | The \codeCFA{try} statement is converted into a series of closures which can | 
|---|
| 141 | access other parts of the function according to scoping rules but can be | 
|---|
| 142 | passed around. The \codeCFA{try} clause is converted into the try function, | 
|---|
| 143 | almost entirely unchanged. The \codeCFA{catch} clauses are converted into two | 
|---|
| 144 | functions: the match function and the catch function. | 
|---|
| 145 |  | 
|---|
| 146 | Together the match function and the catch function form the code that runs | 
|---|
| 147 | when an exception passes out of a try block. The match function is used during | 
|---|
| 148 | the search phase; it is passed an exception and checks each handler to see if | 
|---|
| 149 | it will handle the exception. It returns an index that represents which | 
|---|
| 150 | handler matched or that none of them did. The catch function is used during | 
|---|
| 151 | the clean-up phase; it is passed an exception and the index of a handler. It | 
|---|
| 152 | casts the exception to the exception type declared in that handler and then | 
|---|
| 153 | runs the handler's body. | 
|---|
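To make the division of labour concrete, here is a hedged sketch of what a match function and a catch function might look like for a \codeCFA{try} statement with two \codeCFA{catch} clauses. Everything in it is hypothetical: the exception types, the names \codeC{sketch_match} and \codeC{sketch_catch}, and the convention that index 0 means no handler matched are assumptions made for this illustration, not the code that the \CFA translator generates.

```c
#include <stddef.h>

/* Hypothetical exception hierarchy: a common base plus two derived types. */
enum exception_kind { KIND_IO, KIND_PARSE, KIND_OTHER };

struct base_exception { enum exception_kind kind; };
struct io_error    { struct base_exception base; int error_code; };
struct parse_error { struct base_exception base; int line; };

/* Records which handler body ran and with what value. */
int last_handled = 0;

/* Search phase: check each handler in order. Returns the 1-based index
   of the first matching handler, or 0 if none matches (assumed convention). */
int sketch_match(struct base_exception *e) {
    if (e->kind == KIND_IO) return 1;     /* first catch clause */
    if (e->kind == KIND_PARSE) return 2;  /* second catch clause */
    return 0;
}

/* Clean-up phase: cast to the type declared by the chosen handler,
   then run that handler's body. */
void sketch_catch(struct base_exception *e, int handler_index) {
    switch (handler_index) {
    case 1: {
        struct io_error *io = (struct io_error *)e;
        last_handled = 1000 + io->error_code;
        break;
    }
    case 2: {
        struct parse_error *pe = (struct parse_error *)e;
        last_handled = 2000 + pe->line;
        break;
    }
    default:
        break;
    }
}
```

Keeping the match step free of side effects matters because it runs during the search phase, before any frame has been unwound.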
| 154 |  | 
|---|
| 155 | These three functions are passed to \codeC{try_terminate}. This is an | 
|---|
| 156 | % Maybe I shouldn't quote that, it isn't its actual name. | 
|---|
| 157 | internal hand-written function that has its own personality function and | 
|---|
| 158 | custom assembly LSDA, and that does the exception handling in \CFA. During normal | 
|---|
| 159 | execution all this function does is call the try function and then return. | 
|---|
| 160 | It is only when exceptions are thrown that anything interesting happens. | 
|---|
| 161 |  | 
|---|
| 162 | During the search phase the personality function gets the pointer to the match | 
|---|
| 163 | function and calls it. If the match function returns a handler index, the | 
|---|
| 164 | personality function saves it and reports that the handler has been found, | 
|---|
| 165 | otherwise unwinding continues. | 
|---|
| 166 | During the clean-up phase the personality function only does anything if the | 
|---|
| 167 | handler was found in this frame. If it was, then the personality function | 
|---|
| 168 | installs the handler, which means setting the instruction pointer in | 
|---|
| 169 | \codeC{try_terminate} to an otherwise unused section that calls the catch | 
|---|
| 170 | function, passing it the current exception and handler index. | 
|---|
| 171 | \codeC{try_terminate} returns as soon as the catch function returns. | 
|---|
| 172 |  | 
|---|
| 173 | At this point, execution has returned to normal control flow. | 
|---|