Changes in / [f93d7fc:3548ddb]


Ignore:
File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/andrew_beach_MMath/performance.tex

    rf93d7fc r3548ddb  
    2121but otherwise \Cpp should have a significant advantage.
    2222
    23 Java, a popular language with similar termination semantics,
    24 is implemented in a very different environment, a virtual machine with
     23Java, a popular language with similar termination semantics, but
     24it is implemented in a very different environment, a virtual machine with
    2525garbage collection.
    2626It also implements the finally clause on try blocks allowing for a direct
     
    4545The number of times the loop is run is configurable from the command line;
    4646the number used in the timing runs is given with the results per test.
     47% Tests ran their main loop a million times.
    4748The Java tests run the main loop 1000 times before
    4849beginning the actual test to ``warm-up" the JVM.
     
    139140The repeating function calls itself inside a try block with a handler that
    140141does not match the raised exception, but is of the same kind of handler.
    141 This means that the EHM has to check each handler, and continue
     142This means that the EHM has to check each handler, but continue
    142143over all of them until it reaches the base of the stack.
    143144Comparing this to the empty test gives the time to traverse over and
     
    146147
    147148\paragraph{Cross Try Statement}
    148 This group of tests measures the cost for setting up exception handling,
    149 if it is
     149This group of tests measures the cost for setting up exception handling, if it is
    150150not used (because the exceptional case did not occur).
    151 Tests repeatedly cross (enter, execute and leave) a try statement but never
     151Tests repeatedly cross (enter, execute, and leave) a try statement but never
    152152perform a raise.
    153153\begin{itemize}[nosep]
     
    301301              \multicolumn{1}{c}{Raise} & \multicolumn{1}{c}{\CFA} & \multicolumn{1}{c}{\Cpp} & \multicolumn{1}{c}{Java} & \multicolumn{1}{c|}{Python} \\
    302302\hline
    303 Resume Empty (10M)  & 1.5 & 1.5 & 14.7 & 2.3 & 176.1  & 1.0 & 1.4 & 8.9 & 1.2 & 119.9 \\
     303Resume Empty (10M)  & 3.8  & 3.5  & 14.7  & 2.3   & 176.1 & 0.3  & 0.1  & 8.9   & 1.2   & 119.9 \\
     304%Resume Other (10M)  & 4.0* & 0.1* & 21.9  & 6.2   & 381.0 & 0.3* & 0.1* & 13.2  & 5.0   & 290.7 \\
    304305\hline
    305306\end{tabular}
     
    312313For example, there are a few cases where Python out-performs
    313314\CFA, \Cpp and Java.
    314 \todo{Make sure there are still cases where Python wins.}
    315315The most likely explanation is that, since exceptions
    316316are rarely considered to be the common case, the more optimized languages
     
    328328Details on the different test cases follow.
    329329
    330 \subsection{Termination \texorpdfstring{(\autoref{t:PerformanceTermination})}{}}
     330\subsection{Termination, \autoref{t:PerformanceTermination}}
    331331
    332332\begin{description}
     
    334334\CFA is slower than \Cpp, but is still faster than the other languages
    335335and closer to \Cpp than other languages.
    336 This result is to be expected,
    337 as \CFA is closer to \Cpp than the other languages.
     336This result is to be expected as \CFA is closer to \Cpp than the other languages.
    338337
    339338\item[D'tor Traversal]
    340339Running destructors causes a huge slowdown in the two languages that support
    341340them. \CFA has a higher proportionate slowdown but it is similar to \Cpp's.
    342 Considering the amount of work done in destructors is effectively zero
    343 (an assembly comment), the cost
    344 must come from the change of context required to run the destructor.
     341Considering the amount of work done in destructors is virtually zero (asm instruction), the cost
     342must come from the change of context required to trigger the destructor.
    345343
    346344\item[Finally Traversal]
     
    362360but they could avoid the spike by not having the same kind of overhead for
    363361switching to the check's context.
     362
    364363\todo{Could revisit Other Traversal, after Finally Traversal.}
    365364
     
    367366Here \CFA falls behind \Cpp by a much more significant margin.
    368367This is likely due to the fact \CFA has to insert two extra function
    369 calls, while \Cpp does not have to do execute any other instructions.
     368calls, while \Cpp does not have to execute any other instructions.
    370369Python is much further behind.
    371370
     
    392391\end{description}
    393392
    394 \subsection{Resumption \texorpdfstring{(\autoref{t:PerformanceResumption})}{}}
     393\subsection{Resumption, \autoref{t:PerformanceResumption}}
    395394
    396395Moving on to resumption, there is one general note,
     
    414413The run-time is actually very similar to Finally Traversal.
    415414As resumption does not unwind the stack, both destructors and finally
    416 clauses are run while walking down the stack during the recursive returns.
     415clauses are run while walking down the stack during the recursion returns.
    417416So it follows their performance is similar.
    418417
    419418\item[Finally Traversal]
    420 Same as D'tor Traversal,
    421 except termination did not have a spike in run-time on this test case.
     419% The increase in run-time from Empty Traversal (once adjusted for
     420% the number of iterations) is roughly the same as for termination.
     421% This suggests that the
     422See D'tor Traversal discussion.
    422423
    423424\item[Other Traversal]
     
    430431The only test case where resumption could not keep up with termination,
    431432although the difference is not as significant as many other cases.
    432 It is simply a matter of where the costs come from,
    433 both termination and resumption have some work to set-up or tear-down a
    434 handler. It just so happens that resumption's work is slightly slower.
     433It is simply a matter of where the costs come from. \PAB{What does this mean?
     434Even if \CFA termination
     435is not ``zero-cost", passing through an empty function still seems to be
     436cheaper than updating global values.}
    435437
    436438\item[Conditional Match]
     
    441443\end{description}
    442444
    443 \subsection{Resumption/Fixup \texorpdfstring{(\autoref{t:PerformanceFixupRoutines})}{}}
     445\subsection{Resumption/Fixup, \autoref{t:PerformanceFixupRoutines}}
    444446
    445447Finally are the results of the resumption/fixup routine comparison.
     
    447449has more to do with performance than passing the argument through layers of
    448450calls.
    449 At 100 stack frames, resumption and manual fixup routines have similar
    450 performance in \CFA.
    451 More experiments could try to tease out the exact trade-offs,
    452 but the prototype's only performance goal is to be reasonable.
    453 It has already in that range, and \CFA's fixup routine simulation is
    454 one of the faster simulations as well.
    455 Plus exceptions add features and remove syntactic overhead,
    456 so even at similar performance resumptions have advantages
    457 over fixup routines.
     451Even with 100 stack frames though, resumption is only about as fast as
     452manually passing a fixup routine.
     453However, as the number of fixup routines is increased, the cost of passing them
     454should make the resumption dynamic-search cheaper.
     455So there is a cost for the additional power and flexibility exceptions
     456provide.
Note: See TracChangeset for help on using the changeset viewer.