| 1 | \chapter{Performance}
|
|---|
| 2 | \label{c:performance}
|
|---|
| 3 |
|
|---|
| 4 | Performance has been of secondary importance for most of this project.
|
|---|
| 5 | Instead, the focus has been to get the features working. The only performance
|
|---|
| 6 | requirements is to ensure the tests for correctness run in a reasonable
|
|---|
| 7 | amount of time.
|
|---|
| 8 |
|
|---|
| 9 | \section{Test Set-Up}
|
|---|
| 10 | Tests will be run in \CFA, C++, Java and Python.
|
|---|
| 11 | In addition there are two sets of tests for \CFA,
|
|---|
| 12 | one for termination exceptions and once with resumption exceptions.
|
|---|
| 13 |
|
|---|
| 14 | C++ is the most comparable language because both it and \CFA use the same
|
|---|
| 15 | framework, libunwind.
|
|---|
| 16 | In fact, the comparison is almost entirely a quality of implementation
|
|---|
| 17 | comparison. \CFA's EHM has had significantly less time to be optimized and
|
|---|
| 18 | does not generate its own assembly. It does have a slight advantage in that
|
|---|
| 19 | there are some features it does not handle, through utility functions,
|
|---|
| 20 | but otherwise \Cpp has a significant advantage.
|
|---|
| 21 |
|
|---|
| 22 | Java is another very popular language with similar termination semantics.
|
|---|
| 23 | It is implemented in a very different environment, a virtual machine with
|
|---|
| 24 | garbage collection.
|
|---|
| 25 | It also implements the finally clause on try blocks allowing for a direct
|
|---|
| 26 | feature-to-feature comparison.
|
|---|
| 27 | As with \Cpp, Java's implementation is more mature, has more optimizations
|
|---|
| 28 | and more extra features.
|
|---|
| 29 |
|
|---|
| 30 | Python was used as a point of comparison because of the \CFA EHM's
|
|---|
| 31 | current performance goals, which is not be prohibitively slow while the
|
|---|
| 32 | features are designed and examined. Python has similar performance goals for
|
|---|
| 33 | creating quick scripts and its wide use suggests it has achieved those goals.
|
|---|
| 34 |
|
|---|
| 35 | Unfortunately there are no notable modern programming languages with
|
|---|
| 36 | resumption exceptions. Even the older programming languages with resumptions
|
|---|
| 37 | seem to be notable only for having resumptions.
|
|---|
| 38 | So instead resumptions are compared to a less similar but much more familiar
|
|---|
| 39 | feature, termination exceptions.
|
|---|
| 40 |
|
|---|
| 41 | All tests are run inside a main loop which will perform the test
|
|---|
| 42 | repeatedly. This is to avoids start-up or tear-down time from
|
|---|
| 43 | affecting the timing results.
|
|---|
| 44 | Tests ran their main loop a million times.
|
|---|
| 45 | The Java versions of the test also run this loop an extra 1000 times before
|
|---|
| 46 | beginning to time the results to ``warm-up" the JVM.
|
|---|
| 47 |
|
|---|
| 48 | Timing is done internally, with time measured immediately before and
|
|---|
| 49 | immediately after the test loop. The difference is calculated and printed.
|
|---|
| 50 |
|
|---|
| 51 | The loop structure and internal timing means it is impossible to test
|
|---|
| 52 | unhandled exceptions in \Cpp and Java as that would cause the process to
|
|---|
| 53 | terminate.
|
|---|
| 54 | Luckily, performance on the ``give-up and kill the process" path is not
|
|---|
| 55 | critical.
|
|---|
| 56 |
|
|---|
| 57 | The exceptions used in these tests will always be a exception based off of
|
|---|
| 58 | the base exception. This requirement minimizes performance differences based
|
|---|
| 59 | on the object model used to repersent the exception.
|
|---|
| 60 |
|
|---|
| 61 | All tests were designed to be as minimal as possible while still preventing
|
|---|
| 62 | exessive optimizations.
|
|---|
| 63 | For example, empty inline assembly blocks are used in \CFA and \Cpp to
|
|---|
| 64 | prevent excessive optimizations while adding no actual work.
|
|---|
| 65 |
|
|---|
| 66 | % We don't use catch-alls but if we did:
|
|---|
| 67 | % Catch-alls are done by catching the root exception type (not using \Cpp's
|
|---|
| 68 | % \code{C++}{catch(...)}).
|
|---|
| 69 |
|
|---|
| 70 | \section{Tests}
|
|---|
| 71 | The following tests were selected to test the performance of different
|
|---|
| 72 | components of the exception system.
|
|---|
| 73 | The should provide a guide as to where the EHM's costs can be found.
|
|---|
| 74 |
|
|---|
| 75 | \paragraph{Raise and Handle}
|
|---|
| 76 | The first group of tests involve setting up
|
|---|
| 77 | So there is three layers to the test. The first is set up and a loop, which
|
|---|
| 78 | configures the test and then runs it repeatedly to reduce the impact of
|
|---|
| 79 | start-up and shutdown on the results.
|
|---|
| 80 | Each iteration of the main loop
|
|---|
| 81 | \begin{itemize}[nosep]
|
|---|
| 82 | \item Empty Function:
|
|---|
| 83 | The repeating function is empty except for the necessary control code.
|
|---|
| 84 | \item Destructor:
|
|---|
| 85 | The repeating function creates an object with a destructor before calling
|
|---|
| 86 | itself.
|
|---|
| 87 | \item Finally:
|
|---|
| 88 | The repeating function calls itself inside a try block with a finally clause
|
|---|
| 89 | attached.
|
|---|
| 90 | \item Other Handler:
|
|---|
| 91 | The repeating function calls itself inside a try block with a handler that
|
|---|
| 92 | will not match the raised exception. (But is of the same kind of handler.)
|
|---|
| 93 | \end{itemize}
|
|---|
| 94 |
|
|---|
| 95 | \paragraph{Cross Try Statement}
|
|---|
| 96 | The next group measures the cost of a try statement when no exceptions are
|
|---|
| 97 | raised. The test is set-up, then there is a loop to reduce the impact of
|
|---|
| 98 | start-up and shutdown on the results.
|
|---|
| 99 | In each iteration, a try statement is executed. Entering and leaving a loop
|
|---|
| 100 | is all the test wants to do.
|
|---|
| 101 | \begin{itemize}[nosep]
|
|---|
| 102 | \item Handler:
|
|---|
| 103 | The try statement has a handler (of the matching kind).
|
|---|
| 104 | \item Finally:
|
|---|
| 105 | The try statement has a finally clause.
|
|---|
| 106 | \end{itemize}
|
|---|
| 107 |
|
|---|
| 108 | \paragraph{Conditional Matching}
|
|---|
| 109 | This group of tests checks the cost of conditional matching.
|
|---|
| 110 | Only \CFA implements the language level conditional match,
|
|---|
| 111 | the other languages must mimic with an ``unconditional" match (it still
|
|---|
| 112 | checks the exception's type) and conditional re-raise if it was not supposed
|
|---|
| 113 | to handle that exception.
|
|---|
| 114 | \begin{itemize}[nosep]
|
|---|
| 115 | \item Match All:
|
|---|
| 116 | The condition is always true. (Always matches or never re-raises.)
|
|---|
| 117 | \item Match None:
|
|---|
| 118 | The condition is always false. (Never matches or always re-raises.)
|
|---|
| 119 | \end{itemize}
|
|---|
| 120 |
|
|---|
| 121 | %\section{Cost in Size}
|
|---|
| 122 | %Using exceptions also has a cost in the size of the executable.
|
|---|
| 123 | %Although it is sometimes ignored
|
|---|
| 124 | %
|
|---|
| 125 | %There is a size cost to defining a personality function but the later problem
|
|---|
| 126 | %is the LSDA which will be generated for every function.
|
|---|
| 127 | %
|
|---|
| 128 | %(I haven't actually figured out how to compare this, probably using something
|
|---|
| 129 | %related to -fexceptions.)
|
|---|
| 130 |
|
|---|
| 131 | \section{Results}
|
|---|
| 132 | Each test was run eleven times. The top three and bottom three results were
|
|---|
| 133 | discarded and the remaining five values are averaged.
|
|---|
| 134 |
|
|---|
| 135 | In cases where a feature is not supported by a language the test is skipped
|
|---|
| 136 | for that language. Similarly, if a test is does not change between resumption
|
|---|
| 137 | and termination in \CFA, then only one test is written and the result
|
|---|
| 138 | was put into the termination column.
|
|---|
| 139 |
|
|---|
| 140 | % Raw Data:
|
|---|
| 141 | % run-algol-a.sat
|
|---|
| 142 | % ---------------
|
|---|
| 143 | % Raise Empty & 82687046678 & 291616256 & 3252824847 & 15422937623 & 14736271114 \\
|
|---|
| 144 | % Raise D'tor & 219933199603 & 297897792 & 223602799362 & N/A & N/A \\
|
|---|
| 145 | % Raise Finally & 219703078448 & 298391745 & N/A & ... & 18923060958 \\
|
|---|
| 146 | % Raise Other & 296744104920 & 2854342084 & 112981255103 & 15475924808 & 21293137454 \\
|
|---|
| 147 | % Cross Handler & 9256648 & 13518430 & 769328 & 3486252 & 31790804 \\
|
|---|
| 148 | % Cross Finally & 769319 & N/A & N/A & 2272831 & 37491962 \\
|
|---|
| 149 | % Match All & 3654278402 & 47518560 & 3218907794 & 1296748192 & 624071886 \\
|
|---|
| 150 | % Match None & 4788861754 & 58418952 & 9458936430 & 1318065020 & 625200906 \\
|
|---|
| 151 | %
|
|---|
| 152 | % run-algol-thr-c
|
|---|
| 153 | % ---------------
|
|---|
| 154 | % Raise Empty & 3757606400 & 36472972 & 3257803337 & 15439375452 & 14717808642 \\
|
|---|
| 155 | % Raise D'tor & 64546302019 & 102148375 & 223648121635 & N/A & N/A \\
|
|---|
| 156 | % Raise Finally & 64671359172 & 103285005 & N/A & 15442729458 & 18927008844 \\
|
|---|
| 157 | % Raise Other & 294143497130 & 2630130385 & 112969055576 & 15448220154 & 21279953424 \\
|
|---|
| 158 | % Cross Handler & 9646462 & 11955668 & 769328 & 3453707 & 31864074 \\
|
|---|
| 159 | % Cross Finally & 773412 & N/A & N/A & 2253825 & 37266476 \\
|
|---|
| 160 | % Match All & 3719462155 & 43294042 & 3223004977 & 1286054154 & 623887874 \\
|
|---|
| 161 | % Match None & 4971630929 & 55311709 & 9481225467 & 1310251289 & 623752624 \\
|
|---|
| 162 | %
|
|---|
| 163 | % run-algol-04-a
|
|---|
| 164 | % --------------
|
|---|
| 165 | % Raise Empty & 0.0 & 0.0 & 3250260945 & 0.0 & 0.0 \\
|
|---|
| 166 | % Raise D'tor & 0.0 & 0.0 & 29017675113 & N/A & N/A \\
|
|---|
| 167 | % Raise Finally & 0.0 & 0.0 & N/A & 0.0 & 0.0 \\
|
|---|
| 168 | % Raise Other & 0.0 & 0.0 & 24411823773 & 0.0 & 0.0 \\
|
|---|
| 169 | % Cross Handler & 0.0 & 0.0 & 769334 & 0.0 & 0.0 \\
|
|---|
| 170 | % Cross Finally & 0.0 & N/A & N/A & 0.0 & 0.0 \\
|
|---|
| 171 | % Match All & 0.0 & 0.0 & 3254283504 & 0.0 & 0.0 \\
|
|---|
| 172 | % Match None & 0.0 & 0.0 & 9476060146 & 0.0 & 0.0 \\
|
|---|
| 173 |
|
|---|
| 174 | \begin{tabular}{|l|c c c c c|}
|
|---|
| 175 | \hline
|
|---|
| 176 | & \CFA (Terminate) & \CFA (Resume) & \Cpp & Java & Python \\
|
|---|
| 177 | \hline
|
|---|
| 178 | Raise Empty & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 179 | Raise D'tor & 0.0 & 0.0 & 0.0 & N/A & N/A \\
|
|---|
| 180 | Raise Finally & 0.0 & 0.0 & N/A & 0.0 & 0.0 \\
|
|---|
| 181 | Raise Other & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 182 | Cross Handler & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 183 | Cross Finally & 0.0 & N/A & N/A & 0.0 & 0.0 \\
|
|---|
| 184 | Match All & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 185 | Match None & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 186 | \hline
|
|---|
| 187 | \end{tabular}
|
|---|
| 188 |
|
|---|
| 189 | % run-plg7a-a.sat
|
|---|
| 190 | % ---------------
|
|---|
| 191 | % Raise Empty & 57169011329 & 296612564 & 2788557155 & 17511466039 & 23324548496 \\
|
|---|
| 192 | % Raise D'tor & 150599858014 & 318443709 & 149651693682 & N/A & N/A \\
|
|---|
| 193 | % Raise Finally & 148223145000 & 373325807 & N/A & ... & 29074552998 \\
|
|---|
| 194 | % Raise Other & 189463708732 & 3017109322 & 85819281694 & 17584295487 & 32602686679 \\
|
|---|
| 195 | % Cross Handler & 8001654 & 13584858 & 1555995 & 6626775 & 41927358 \\
|
|---|
| 196 | % Cross Finally & 1002473 & N/A & N/A & 4554344 & 51114381 \\
|
|---|
| 197 | % Match All & 3162460860 & 37315018 & 2649464591 & 1523205769 & 742374509 \\
|
|---|
| 198 | % Match None & 4054773797 & 47052659 & 7759229131 & 1555373654 & 744656403 \\
|
|---|
| 199 | %
|
|---|
| 200 | % run-plg7a-thr-a
|
|---|
| 201 | % ---------------
|
|---|
| 202 | % Raise Empty & 3604235388 & 29829965 & 2786931833 & 17576506385 & 23352975105 \\
|
|---|
| 203 | % Raise D'tor & 46552380948 & 178709605 & 149834207219 & N/A & N/A \\
|
|---|
| 204 | % Raise Finally & 46265157775 & 177906320 & N/A & 17493045092 & 29170962959 \\
|
|---|
| 205 | % Raise Other & 195659245764 & 2376968982 & 86070431924 & 17552979675 & 32501882918 \\
|
|---|
| 206 | % Cross Handler & 397031776 & 12503552 & 1451225 & 6658628 & 42304965 \\
|
|---|
| 207 | % Cross Finally & 1136746 & N/A & N/A & 4468799 & 46155817 \\
|
|---|
| 208 | % Match All & 3189512499 & 39124453 & 2667795989 & 1525889031 & 733785613 \\
|
|---|
| 209 | % Match None & 4094675477 & 48749857 & 7850618572 & 1566713577 & 733478963 \\
|
|---|
| 210 | %
|
|---|
| 211 | % run-plg7a-04-a
|
|---|
| 212 | % --------------
|
|---|
| 213 | % 0.0 are unfilled.
|
|---|
| 214 | % Raise Empty & 0.0 & 0.0 & 2770781479 & 0.0 & 0.0 \\
|
|---|
| 215 | % Raise D'tor & 0.0 & 0.0 & 23530084907 & N/A & N/A \\
|
|---|
| 216 | % Raise Finally & 0.0 & 0.0 & N/A & 0.0 & 0.0 \\
|
|---|
| 217 | % Raise Other & 0.0 & 0.0 & 23816827982 & 0.0 & 0.0 \\
|
|---|
| 218 | % Cross Handler & 0.0 & 0.0 & 1422188 & 0.0 & 0.0 \\
|
|---|
| 219 | % Cross Finally & 0.0 & N/A & N/A & 0.0 & 0.0 \\
|
|---|
| 220 | % Match All & 0.0 & 0.0 & 2671989778 & 0.0 & 0.0 \\
|
|---|
| 221 | % Match None & 0.0 & 0.0 & 7829059869 & 0.0 & 0.0 \\
|
|---|
| 222 |
|
|---|
| 223 | % PLG7A (in seconds)
|
|---|
| 224 | \begin{tabular}{|l|c c c c c|}
|
|---|
| 225 | \hline
|
|---|
| 226 | & \CFA (Terminate) & \CFA (Resume) & \Cpp & Java & Python \\
|
|---|
| 227 | \hline
|
|---|
| 228 | Raise Empty & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 229 | Raise D'tor & 0.0 & 0.0 & 0.0 & N/A & N/A \\
|
|---|
| 230 | Raise Finally & 0.0 & 0.0 & N/A & 0.0 & 0.0 \\
|
|---|
| 231 | Raise Other & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 232 | Cross Handler & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 233 | Cross Finally & 0.0 & N/A & N/A & 0.0 & 0.0 \\
|
|---|
| 234 | Match All & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 235 | Match None & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
|
|---|
| 236 | \hline
|
|---|
| 237 | \end{tabular}
|
|---|
| 238 |
|
|---|
| 239 | One result that is not directly related to \CFA but is important to keep in
|
|---|
| 240 | mind is that in exceptions the standard intuitions about which languages
|
|---|
| 241 | should go faster often do not hold. There are cases where Python out-preforms
|
|---|
| 242 | \Cpp and Java. The most likely explination is that, since exceptions are
|
|---|
| 243 | rarely considered to be the common case, the more optimized langages have
|
|---|
| 244 | optimized at their expence. In addition languages with high level
|
|---|
| 245 | repersentations have a much easier time scanning the stack as there is less
|
|---|
| 246 | to decode.
|
|---|
| 247 |
|
|---|
| 248 | This means that while \CFA does not actually keep up with Python in every
|
|---|
| 249 | case it is usually no worse than roughly half the speed of \Cpp. This is good
|
|---|
| 250 | enough for the prototyping purposes of the project.
|
|---|
| 251 |
|
|---|
| 252 | The test case where \CFA falls short is Raise Other, the case where the
|
|---|
| 253 | stack is unwound including a bunch of non-matching handlers.
|
|---|
| 254 | This slowdown seems to come from missing optimizations,
|
|---|
| 255 | the results above came from gcc/g++ 10 (gcc as \CFA backend or g++ for \Cpp)
|
|---|
| 256 | but the results change if they are run in gcc/g++ 9 instead.
|
|---|
| 257 | Importantly, there is a huge slowdown in \Cpp's results bringing that brings
|
|---|
| 258 | \CFA's performace back in that roughly half speed area. However many other
|
|---|
| 259 | \CFA benchmarks increase their run-time by a similar amount falling far
|
|---|
| 260 | behind their \Cpp counter-parts.
|
|---|
| 261 |
|
|---|
| 262 | This suggests that the performance issue in Raise Other is just an
|
|---|
| 263 | optimization not being applied. Later versions of gcc may be able to
|
|---|
| 264 | optimize this case further, at least down to the half of \Cpp mark.
|
|---|
| 265 | A \CFA compiler that directly produced assembly could do even better as it
|
|---|
| 266 | would not have to work across some of \CFA's current abstractions, like
|
|---|
| 267 | the try terminate function.
|
|---|
| 268 |
|
|---|
| 269 | Resumption exception handling is also incredibly fast. Often an order of
|
|---|
| 270 | magnitude or two better than the best termination speed.
|
|---|
| 271 | There is a simple explination for this; traversing a linked list is much
|
|---|
| 272 | faster than examining and unwinding the stack. When resumption does not do as
|
|---|
| 273 | well its when more try statements are used per raise. Updating the interal
|
|---|
| 274 | linked list is not very expencive but it does add up.
|
|---|
| 275 |
|
|---|
| 276 | The relative speed of the Match All and Match None tests (within each
|
|---|
| 277 | language) can also show the effectiveness conditional matching as compared
|
|---|
| 278 | to catch and rethrow.
|
|---|
| 279 | \begin{itemize}[nosep]
|
|---|
| 280 | \item
|
|---|
| 281 | Java and Python get similar values in both tests.
|
|---|
| 282 | Between the interperated code, a higher level repersentation of the call
|
|---|
| 283 | stack and exception reuse it it is possible the cost for a second
|
|---|
| 284 | throw can be folded into the first.
|
|---|
| 285 | % Is this due to optimization?
|
|---|
| 286 | \item
|
|---|
| 287 | Both types of \CFA are slighly slower if there is not a match.
|
|---|
| 288 | For termination this likely comes from unwinding a bit more stack through
|
|---|
| 289 | libunwind instead of executing the code normally.
|
|---|
| 290 | For resumption there is extra work in traversing more of the list and running
|
|---|
| 291 | more checks for a matching exceptions.
|
|---|
| 292 | % Resumption is a bit high for that but this is my best theory.
|
|---|
| 293 | \item
|
|---|
| 294 | Then there is \Cpp, which takes 2--3 times longer to catch and rethrow vs.
|
|---|
| 295 | just the catch. This is very high, but it does have to repeat the same
|
|---|
| 296 | process of unwinding the stack and may have to parse the LSDA of the function
|
|---|
| 297 | with the catch and rethrow twice, once before the catch and once after the
|
|---|
| 298 | rethrow.
|
|---|
| 299 | % I spent a long time thinking of what could push it over twice, this is all
|
|---|
| 300 | % I have to explain it.
|
|---|
| 301 | \end{itemize}
|
|---|
| 302 | The difference in relative performance does show that there are savings to
|
|---|
| 303 | be made by performing the check without catching the exception.
|
|---|