Changeset f93d7fc
- Timestamp:
- Aug 30, 2021, 9:29:26 PM (3 years ago)
- Branches:
- ADT, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, pthread-emulation, qualifiedEnum
- Children:
- b041f11
- Parents:
- 3548ddb (diff), 0477127 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
- File:
- 1 edited
--- doc/theses/andrew_beach_MMath/performance.tex (r3548ddb)
+++ doc/theses/andrew_beach_MMath/performance.tex (rf93d7fc)
@@ -21,6 +21,6 @@
 but otherwise \Cpp should have a significant advantage.
 
-Java, a popular language with similar termination semantics, but
-it is implemented in a very different environment, a virtual machine with
+Java, a popular language with similar termination semantics,
+is implemented in a very different environment, a virtual machine with
 garbage collection.
 It also implements the finally clause on try blocks allowing for a direct
@@ -45,5 +45,4 @@
 The number of times the loop is run is configurable from the command line;
 the number used in the timing runs is given with the results per test.
-% Tests ran their main loop a million times.
 The Java tests run the main loop 1000 times before
 beginning the actual test to ``warm-up" the JVM.
@@ -140,5 +139,5 @@
 The repeating function calls itself inside a try block with a handler that
 does not match the raised exception, but is of the same kind of handler.
-This means that the EHM has to check each handler, but continue
+This means that the EHM has to check each handler, and continue
 over all of them until it reaches the base of the stack.
 Comparing this to the empty test gives the time to traverse over and
@@ -147,7 +146,8 @@
 
 \paragraph{Cross Try Statement}
-This group of tests measures the cost for setting up exception handling, if it is
+This group of tests measures the cost for setting up exception handling,
+if it is
 not used (because the exceptional case did not occur).
-Tests repeatedly cross (enter, execute, and leave) a try statement but never
+Tests repeatedly cross (enter, execute and leave) a try statement but never
 perform a raise.
 \begin{itemize}[nosep]
@@ -301,6 +301,5 @@
 \multicolumn{1}{c}{Raise} & \multicolumn{1}{c}{\CFA} & \multicolumn{1}{c}{\Cpp} & \multicolumn{1}{c}{Java} & \multicolumn{1}{c|}{Python} \\
 \hline
-Resume Empty (10M) & 3.8 & 3.5 & 14.7 & 2.3 & 176.1 & 0.3 & 0.1 & 8.9 & 1.2 & 119.9 \\
-%Resume Other (10M) & 4.0* & 0.1* & 21.9 & 6.2 & 381.0 & 0.3* & 0.1* & 13.2 & 5.0 & 290.7 \\
+Resume Empty (10M) & 1.5 & 1.5 & 14.7 & 2.3 & 176.1 & 1.0 & 1.4 & 8.9 & 1.2 & 119.9 \\
 \hline
 \end{tabular}
@@ -313,4 +312,5 @@
 For example, there are a few cases where Python out-performs
 \CFA, \Cpp and Java.
+\todo{Make sure there are still cases where Python wins.}
 The most likely explanation is that, since exceptions
 are rarely considered to be the common case, the more optimized languages
@@ -328,5 +328,5 @@
 Details on the different test cases follow.
 
-\subsection{Termination, \autoref{t:PerformanceTermination}}
+\subsection{Termination \texorpdfstring{(\autoref{t:PerformanceTermination})}{}}
 
 \begin{description}
@@ -334,11 +334,13 @@
 \CFA is slower than \Cpp, but is still faster than the other languages
 and closer to \Cpp than other languages.
-This result is to be expected as \CFA is closer to \Cpp than the other languages.
+This result is to be expected,
+as \CFA is closer to \Cpp than the other languages.
 
 \item[D'tor Traversal]
 Running destructors causes a huge slowdown in the two languages that support
 them. \CFA has a higher proportionate slowdown but it is similar to \Cpp's.
-Considering the amount of work done in destructors is virtually zero (asm instruction), the cost
-must come from the change of context required to trigger the destructor.
+Considering the amount of work done in destructors is effectively zero
+(an assembly comment), the cost
+must come from the change of context required to run the destructor.
 
 \item[Finally Traversal]
@@ -360,5 +362,4 @@
 but they could avoid the spike by not having the same kind of overhead for
 switching to the check's context.
-
 \todo{Could revisit Other Traversal, after Finally Traversal.}
 
@@ -366,5 +367,5 @@
 Here \CFA falls behind \Cpp by a much more significant margin.
 This is likely due to the fact \CFA has to insert two extra function
-calls, while \Cpp does not have to execute any other instructions.
+calls, while \Cpp does not have to do execute any other instructions.
 Python is much further behind.
 
@@ -391,5 +392,5 @@
 \end{description}
 
-\subsection{Resumption, \autoref{t:PerformanceResumption}}
+\subsection{Resumption \texorpdfstring{(\autoref{t:PerformanceResumption})}{}}
 
 Moving on to resumption, there is one general note,
@@ -413,12 +414,10 @@
 The run-time is actually very similar to Finally Traversal.
 As resumption does not unwind the stack, both destructors and finally
-clauses are run while walking down the stack during the recursion returns.
+clauses are run while walking down the stack during the recursive returns.
 So it follows their performance is similar.
 
 \item[Finally Traversal]
-% The increase in run-time from Empty Traversal (once adjusted for
-% the number of iterations) is roughly the same as for termination.
-% This suggests that the
-See D'tor Traversal discussion.
+Same as D'tor Traversal,
+except termination did not have a spike in run-time on this test case.
 
 \item[Other Traversal]
@@ -431,8 +430,7 @@
 The only test case where resumption could not keep up with termination,
 although the difference is not as significant as many other cases.
-It is simply a matter of where the costs come from. \PAB{What does this mean?
-Even if \CFA termination
-is not ``zero-cost", passing through an empty function still seems to be
-cheaper than updating global values.}
+It is simply a matter of where the costs come from,
+both termination and resumption have some work to set-up or tear-down a
+handler. It just so happens that resumption's work is slightly slower.
 
 \item[Conditional Match]
@@ -443,5 +441,5 @@
 \end{description}
 
-\subsection{Resumption/Fixup, \autoref{t:PerformanceFixupRoutines}}
+\subsection{Resumption/Fixup \texorpdfstring{(\autoref{t:PerformanceFixupRoutines})}{}}
 
 Finally are the results of the resumption/fixup routine comparison.
@@ -449,8 +447,11 @@
 has more to do with performance than passing the argument through layers of
 calls.
-Even with 100 stack frames though, resumption is only about as fast as
-manually passing a fixup routine.
-However, as the number of fixup routines is increased, the cost of passing them
-should make the resumption dynamic-search cheaper.
-So there is a cost for the additional power and flexibility exceptions
-provide.
+At 100 stack frames, resumption and manual fixup routines have similar
+performance in \CFA.
+More experiments could try to tease out the exact trade-offs,
+but the prototype's only performance goal is to be reasonable.
+It has already in that range, and \CFA's fixup routine simulation is
+one of the faster simulations as well.
+Plus exceptions add features and remove syntactic overhead,
+so even at similar performance resumptions have advantages
+over fixup routines.
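Reviewer note: the changed text around line 140 describes the Other Traversal test, where a function calls itself inside a try block whose handler does not match the raised exception, so every handler is checked and skipped until the base of the stack. As a reading aid only, a minimal Python sketch of such a test might look like the following. The names EmptyException, OtherException, unwind_other and run_other_traversal, and the parameters times and frames, are assumptions made for illustration and are not taken from the thesis's benchmark sources.

```python
import time


class EmptyException(Exception):
    """Hypothetical exception raised at the bottom of the recursion."""


class OtherException(Exception):
    """Hypothetical exception the intermediate handlers look for; never raised."""


def unwind_other(frames):
    """Recurse `frames` levels deep; each level installs a non-matching handler."""
    if frames <= 0:
        raise EmptyException()
    try:
        unwind_other(frames - 1)
    except OtherException:
        # Never taken: the raised exception is EmptyException, so the
        # search checks this handler and continues up the stack.
        pass


def run_other_traversal(times, frames):
    """Run the raise/catch loop `times` times and return (catch count, seconds)."""
    caught = 0
    start = time.perf_counter()
    for _ in range(times):
        try:
            unwind_other(frames)
        except EmptyException:
            # The only matching handler sits at the base of the stack.
            caught += 1
    elapsed = time.perf_counter() - start
    return caught, elapsed


if __name__ == "__main__":
    count, seconds = run_other_traversal(times=1000, frames=100)
    print(f"caught {count} exceptions in {seconds:.4f} s")
```

Subtracting the time of an otherwise identical run without the non-matching handlers would, under these assumptions, approximate the per-handler check cost that the text compares against the empty test.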