Changeset 0b67a19
- Timestamp:
- Aug 5, 2021, 10:52:59 AM (4 years ago)
- Branches:
- ADT, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- b2525d9
- Parents:
- 199894e
- File:
- 1 edited
doc/theses/andrew_beach_MMath/performance.tex
r199894e r0b67a19
 42  42  repeatedly. This is to avoid start-up or tear-down time from
 43  43  affecting the timing results.
 44      Most test were run 1 000 000 (a million)times.
     44  Tests ran their main loop a million times.
 45  45  The Java versions of the test also run this loop an extra 1000 times before
 46  46  beginning to time the results to ``warm-up'' the JVM.
  …   …
130 130
131 131  \section{Results}
132      Each test is was run five times, the best and worst result were discarded and
133      the remaining values were averaged.
    132  Each test was run eleven times. The top three and bottom three results were
    133  discarded and the remaining five values were averaged.
134 134
135 135  In cases where a feature is not supported by a language the test is skipped
  …   …
138 138  was put into the termination column.
139 139
    140  % Raw Data:
    141  % run-algol-a.sat
    142  % ---------------
    143  % Raise Empty   &  82687046678 &  291616256 &   3252824847 & 15422937623 & 14736271114 \\
    144  % Raise D'tor   & 219933199603 &  297897792 & 223602799362 & N/A         & N/A         \\
    145  % Raise Finally & 219703078448 &  298391745 & N/A          & ...         & 18923060958 \\
    146  % Raise Other   & 296744104920 & 2854342084 & 112981255103 & 15475924808 & 21293137454 \\
    147  % Cross Handler &      9256648 &   13518430 &       769328 &     3486252 &    31790804 \\
    148  % Cross Finally &       769319 & N/A        & N/A          &     2272831 &    37491962 \\
    149  % Match All     &   3654278402 &   47518560 &   3218907794 &  1296748192 &   624071886 \\
    150  % Match None    &   4788861754 &   58418952 &   9458936430 &  1318065020 &   625200906 \\
    151  %
    152  % run-algol-thr-c
    153  % ---------------
    154  % Raise Empty   &   3757606400 &   36472972 &   3257803337 & 15439375452 & 14717808642 \\
    155  % Raise D'tor   &  64546302019 &  102148375 & 223648121635 & N/A         & N/A         \\
    156  % Raise Finally &  64671359172 &  103285005 & N/A          & 15442729458 & 18927008844 \\
    157  % Raise Other   & 294143497130 & 2630130385 & 112969055576 & 15448220154 & 21279953424 \\
    158  % Cross Handler &      9646462 &   11955668 &       769328 &     3453707 &    31864074 \\
    159  % Cross Finally &       773412 & N/A        & N/A          &     2253825 &    37266476 \\
    160  % Match All     &   3719462155 &   43294042 &   3223004977 &  1286054154 &   623887874 \\
    161  % Match None    &   4971630929 &   55311709 &   9481225467 &  1310251289 &   623752624 \\
140 162  \begin{tabular}{|l|c c c c c|}
141 163  \hline
  …   …
152 174  \hline
153 175  \end{tabular}
    176
    177  % run-plg7a-a.sat
    178  % ---------------
    179  % Raise Empty   &  57169011329 &  296612564 &   2788557155 & 17511466039 & 23324548496 \\
    180  % Raise D'tor   & 150599858014 &  318443709 & 149651693682 & N/A         & N/A         \\
    181  % Raise Finally & 148223145000 &  373325807 & N/A          & ...         & 29074552998 \\
    182  % Raise Other   & 189463708732 & 3017109322 &  85819281694 & 17584295487 & 32602686679 \\
    183  % Cross Handler &      8001654 &   13584858 &      1555995 &     6626775 &    41927358 \\
    184  % Cross Finally &      1002473 & N/A        & N/A          &     4554344 &    51114381 \\
    185  % Match All     &   3162460860 &   37315018 &   2649464591 &  1523205769 &   742374509 \\
    186  % Match None    &   4054773797 &   47052659 &   7759229131 &  1555373654 &   744656403 \\
    187  %
    188  % run-plg7a-thr-a
    189  % ---------------
    190  % Raise Empty   &   3604235388 &   29829965 &   2786931833 & 17576506385 & 23352975105 \\
    191  % Raise D'tor   &  46552380948 &  178709605 & 149834207219 & N/A         & N/A         \\
    192  % Raise Finally &  46265157775 &  177906320 & N/A          & 17493045092 & 29170962959 \\
    193  % Raise Other   & 195659245764 & 2376968982 &  86070431924 & 17552979675 & 32501882918 \\
    194  % Cross Handler &    397031776 &   12503552 &      1451225 &     6658628 &    42304965 \\
    195  % Cross Finally &      1136746 & N/A        & N/A          &     4468799 &    46155817 \\
    196  % Match All     &   3189512499 &   39124453 &   2667795989 &  1525889031 &   733785613 \\
    197  % Match None    &   4094675477 &   48749857 &   7850618572 &  1566713577 &   733478963 \\
    198
    199  % PLG7A (in seconds)
    200  \begin{tabular}{|l|c c c c c|}
    201  \hline
    202  & \CFA (Terminate) & \CFA (Resume) & \Cpp & Java & Python \\
    203  \hline
    204  % Raise Empty   & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    205  % Raise D'tor   & 0.0 & 0.0 & 0.0 & N/A & N/A \\
    206  % Raise Finally & 0.0 & 0.0 & N/A & 0.0 & 0.0 \\
    207  % Raise Other   & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    208  % Cross Handler & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    209  % Cross Finally & 0.0 & N/A & N/A & 0.0 & 0.0 \\
    210  % Match All     & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    211  % Match None    & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    212  Raise Empty   & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    213  Raise D'tor   & 0.0 & 0.0 & 0.0 & N/A & N/A \\
    214  Raise Finally & 0.0 & 0.0 & N/A & 0.0 & 0.0 \\
    215  Raise Other   & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    216  Cross Handler & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    217  Cross Finally & 0.0 & N/A & N/A & 0.0 & 0.0 \\
    218  Match All     & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    219  Match None    & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
    220  \hline
    221  \end{tabular}
    222
    223  One result that is not directly related to \CFA but is important to keep in
    224  mind is that, for exceptions, the standard intuitions about which languages
    225  should go faster often do not hold. There are cases where Python out-performs
    226  \Cpp and Java. The most likely explanation is that, since exceptions are
    227  rarely considered to be the common case, the more optimized languages have
    228  optimized at their expense. In addition, languages with high-level
    229  representations have a much easier time scanning the stack as there is less
    230  to decode.
    231
    232  This means that while \CFA does not actually keep up with Python in every
    233  case, it is no worse than roughly half the speed of \Cpp. This is good
    234  enough for the prototyping purposes of the project.
    235
    236  One difference not shown is that optimizations in \CFA are very fragile.
    237  The \CFA compiler uses gcc as part of its compilation process and the version
    238  of gcc could change the speed of some of the benchmarks by 10 times or more.
    239  Similar changes to g++ for the \Cpp benchmarks had no significant effect.
    240  Because of the connection between gcc and g++, this suggests it is not the
    241  optimizations that are changing but how the optimizer is detecting if the
    242  optimizations can be applied. So the optimizations are always applied in
    243  g++, but only newer versions of gcc can detect that they can be applied in
    244  the more complex \CFA code.
    245
    246  Resumption exception handling is also incredibly fast, often an order of
    247  magnitude or two better than the best termination speed.
    248  There is a simple explanation for this: traversing a linked list is much
    249  faster than examining and unwinding the stack. When resumption does not do as
    250  well, it is when more try statements are used per raise. Updating the internal
    251  linked list is not very expensive but it does add up.
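The resumption paragraph above attributes the speed to walking a linked list of handlers rather than unwinding the stack. The sketch below is not part of the changeset and is not the \CFA runtime; it is a minimal Python model of that idea, in which every name (`HandlerNode`, `ResumptionList`, `enter_try`, `exit_try`, `raise_resume`, `OutOfFuel`) is hypothetical. It also shows where the per-try cost mentioned in the text would come from: each try statement pushes and pops a node whether or not anything is raised.

```python
class HandlerNode:
    """One entry in a hypothetical list of active resumption handlers."""

    def __init__(self, exc_type, handler, next_node):
        self.exc_type = exc_type    # exception class this handler accepts
        self.handler = handler      # callable run at the raise site
        self.next_node = next_node  # next older handler, or None


class ResumptionList:
    """Singly linked list of handlers, most recently entered try first."""

    def __init__(self):
        self.head = None

    def enter_try(self, exc_type, handler):
        # Cost paid on every try statement: push one node.
        self.head = HandlerNode(exc_type, handler, self.head)

    def exit_try(self):
        # Cost paid on every try statement: pop one node.
        self.head = self.head.next_node

    def raise_resume(self, exc):
        # Walk the list; no stack frames are unwound, so control
        # returns to the caller of raise_resume with the handler's result.
        node = self.head
        while node is not None:
            if isinstance(exc, node.exc_type):
                return node.handler(exc)
            node = node.next_node
        raise RuntimeError("no resumption handler matched")


class OutOfFuel(Exception):
    """Hypothetical exception type used only in this sketch."""


def demo():
    handlers = ResumptionList()
    handlers.enter_try(KeyError, lambda exc: "outer")
    handlers.enter_try(OutOfFuel, lambda exc: "refuelled")
    result = handlers.raise_resume(OutOfFuel())
    handlers.exit_try()
    handlers.exit_try()
    return result
```

Calling `demo()` raises `OutOfFuel` with two handlers installed; the walk stops at the first node because it is the most recent try and it matches.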
    252
    253  The relative speed of the Match All and Match None tests (within each
    254  language) can also show the effectiveness of conditional matching as compared
    255  to catch and rethrow.
    256  \begin{itemize}[nosep]
    257  \item
    258  Java and Python get similar values in both tests.
    259  Between the interpreted code, a higher-level representation of the call
    260  stack and exception reuse, it is possible the cost for a second
    261  throw can be folded into the first.
    262  % Is this due to optimization?
    263  \item
    264  Both types of \CFA are slightly slower if there is not a match.
    265  For termination this likely comes from unwinding a bit more stack through
    266  libunwind instead of executing the code normally.
    267  For resumption there is extra work in traversing more of the list and running
    268  more checks for a matching exception.
    269  % Resumption is a bit high for that but this is my best theory.
    270  \item
    271  Then there is \Cpp, which takes 2--3 times longer to catch and rethrow vs.
    272  just the catch. This is very high, but it does have to repeat the same
    273  process of unwinding the stack and may have to parse the LSDA of the function
    274  with the catch and rethrow twice, once before the catch and once after the
    275  rethrow.
    276  % I spent a long time thinking of what could push it over twice, this is all
    277  % I have to explain it.
    278  \end{itemize}
    279  The difference in relative performance does show that there are savings to
    280  be made by performing the check without catching the exception.
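The Match All and Match None comparison above depends on how a language without conditional matching stands in for it. The sketch below is not part of the changeset and is not the benchmark source; it is a minimal Python illustration of the catch-and-rethrow pattern, assuming the emulation is "catch, test a predicate, rethrow on failure". The names `EmptyException`, `always`, `never`, and `raise_and_match` are hypothetical.

```python
class EmptyException(Exception):
    """Hypothetical exception type used only in this sketch."""


def always(exc):
    # Predicate for the "Match All" shape: the inner handler keeps it.
    return True


def never(exc):
    # Predicate for the "Match None" shape: the inner handler rejects it.
    return False


def raise_and_match(predicate):
    """Raise once and report which handler finally dealt with it."""
    try:
        try:
            raise EmptyException()
        except EmptyException as exc:
            # Emulated conditional match: the exception is already
            # caught before the condition can be tested.
            if not predicate(exc):
                raise  # rethrow, so the handler search starts again
            return "inner"
    except EmptyException:
        return "outer"
```

With `always` the inner handler completes the work in one search; with `never` the exception is caught, rethrown, and caught again by the outer handler, which is the extra cost a true conditional catch avoids.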