Ignore:
Timestamp:
Aug 30, 2021, 9:29:26 PM (3 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
ADT, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, pthread-emulation, qualifiedEnum
Children:
b041f11
Parents:
3548ddb (diff), 0477127 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/andrew_beach_MMath/performance.tex

    r3548ddb rf93d7fc  
    2121but otherwise \Cpp should have a significant advantage.
    2222
    23 Java, a popular language with similar termination semantics, but
    24 it is implemented in a very different environment, a virtual machine with
     23Java, a popular language with similar termination semantics,
     24is implemented in a very different environment, a virtual machine with
    2525garbage collection.
    2626It also implements the finally clause on try blocks allowing for a direct
     
    4545The number of times the loop is run is configurable from the command line;
    4646the number used in the timing runs is given with the results per test.
    47 % Tests ran their main loop a million times.
    4847The Java tests run the main loop 1000 times before
    4948beginning the actual test to ``warm-up" the JVM.
     
    140139The repeating function calls itself inside a try block with a handler that
    141140does not match the raised exception, but is of the same kind of handler.
    142 This means that the EHM has to check each handler, but continue
     141This means that the EHM has to check each handler, and continue
    143142over all of them until it reaches the base of the stack.
    144143Comparing this to the empty test gives the time to traverse over and
     
    147146
    148147\paragraph{Cross Try Statement}
    149 This group of tests measures the cost for setting up exception handling, if it is
     148This group of tests measures the cost for setting up exception handling,
     149if it is
    150150not used (because the exceptional case did not occur).
    151 Tests repeatedly cross (enter, execute, and leave) a try statement but never
     151Tests repeatedly cross (enter, execute and leave) a try statement but never
    152152perform a raise.
    153153\begin{itemize}[nosep]
     
    301301              \multicolumn{1}{c}{Raise} & \multicolumn{1}{c}{\CFA} & \multicolumn{1}{c}{\Cpp} & \multicolumn{1}{c}{Java} & \multicolumn{1}{c|}{Python} \\
    302302\hline
    303 Resume Empty (10M)  & 3.8  & 3.5  & 14.7  & 2.3   & 176.1 & 0.3  & 0.1  & 8.9   & 1.2   & 119.9 \\
    304 %Resume Other (10M)  & 4.0* & 0.1* & 21.9  & 6.2   & 381.0 & 0.3* & 0.1* & 13.2  & 5.0   & 290.7 \\
     303Resume Empty (10M)  & 1.5 & 1.5 & 14.7 & 2.3 & 176.1  & 1.0 & 1.4 & 8.9 & 1.2 & 119.9 \\
    305304\hline
    306305\end{tabular}
     
    313312For example, there are a few cases where Python out-performs
    314313\CFA, \Cpp and Java.
     314\todo{Make sure there are still cases where Python wins.}
    315315The most likely explanation is that, since exceptions
    316316are rarely considered to be the common case, the more optimized languages
     
    328328Details on the different test cases follow.
    329329
    330 \subsection{Termination, \autoref{t:PerformanceTermination}}
     330\subsection{Termination \texorpdfstring{(\autoref{t:PerformanceTermination})}{}}
    331331
    332332\begin{description}
     
    334334\CFA is slower than \Cpp, but is still faster than the other languages
    335335and closer to \Cpp than other languages.
    336 This result is to be expected as \CFA is closer to \Cpp than the other languages.
     336This result is to be expected,
     337as \CFA is closer to \Cpp than the other languages.
    337338
    338339\item[D'tor Traversal]
    339340Running destructors causes a huge slowdown in the two languages that support
    340341them. \CFA has a higher proportionate slowdown but it is similar to \Cpp's.
    341 Considering the amount of work done in destructors is virtually zero (asm instruction), the cost
    342 must come from the change of context required to trigger the destructor.
     342Considering the amount of work done in destructors is effectively zero
     343(an assembly comment), the cost
     344must come from the change of context required to run the destructor.
    343345
    344346\item[Finally Traversal]
     
    360362but they could avoid the spike by not having the same kind of overhead for
    361363switching to the check's context.
    362 
    363364\todo{Could revisit Other Traversal, after Finally Traversal.}
    364365
     
    366367Here \CFA falls behind \Cpp by a much more significant margin.
    367368This is likely due to the fact \CFA has to insert two extra function
    368 calls, while \Cpp does not have to execute any other instructions.
     369calls, while \Cpp does not have to do execute any other instructions.
    369370Python is much further behind.
    370371
     
    391392\end{description}
    392393
    393 \subsection{Resumption, \autoref{t:PerformanceResumption}}
     394\subsection{Resumption \texorpdfstring{(\autoref{t:PerformanceResumption})}{}}
    394395
    395396Moving on to resumption, there is one general note,
     
    413414The run-time is actually very similar to Finally Traversal.
    414415As resumption does not unwind the stack, both destructors and finally
    415 clauses are run while walking down the stack during the recursion returns.
     416clauses are run while walking down the stack during the recursive returns.
    416417So it follows their performance is similar.
    417418
    418419\item[Finally Traversal]
    419 % The increase in run-time from Empty Traversal (once adjusted for
    420 % the number of iterations) is roughly the same as for termination.
    421 % This suggests that the
    422 See D'tor Traversal discussion.
     420Same as D'tor Traversal,
     421except termination did not have a spike in run-time on this test case.
    423422
    424423\item[Other Traversal]
     
    431430The only test case where resumption could not keep up with termination,
    432431although the difference is not as significant as many other cases.
    433 It is simply a matter of where the costs come from. \PAB{What does this mean?
    434 Even if \CFA termination
    435 is not ``zero-cost", passing through an empty function still seems to be
    436 cheaper than updating global values.}
     432It is simply a matter of where the costs come from,
     433both termination and resumption have some work to set-up or tear-down a
     434handler. It just so happens that resumption's work is slightly slower.
    437435
    438436\item[Conditional Match]
     
    443441\end{description}
    444442
    445 \subsection{Resumption/Fixup, \autoref{t:PerformanceFixupRoutines}}
     443\subsection{Resumption/Fixup \texorpdfstring{(\autoref{t:PerformanceFixupRoutines})}{}}
    446444
    447445Finally are the results of the resumption/fixup routine comparison.
     
    449447has more to do with performance than passing the argument through layers of
    450448calls.
    451 Even with 100 stack frames though, resumption is only about as fast as
    452 manually passing a fixup routine.
    453 However, as the number of fixup routines is increased, the cost of passing them
    454 should make the resumption dynamic-search cheaper.
    455 So there is a cost for the additional power and flexibility exceptions
    456 provide.
     449At 100 stack frames, resumption and manual fixup routines have similar
     450performance in \CFA.
     451More experiments could try to tease out the exact trade-offs,
     452but the prototype's only performance goal is to be reasonable.
     453It has already in that range, and \CFA's fixup routine simulation is
     454one of the faster simulations as well.
     455Plus exceptions add features and remove syntactic overhead,
     456so even at similar performance resumptions have advantages
     457over fixup routines.
Note: See TracChangeset for help on using the changeset viewer.