Changeset f93d7fc

Timestamp:

Aug 30, 2021, 9:29:26 PM

Author:

Peter A. Buhr <pabuhr@…>

Branches:

ADT, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, pthread-emulation, qualifiedEnum, stuck-waitfor-destruct

Children:

b041f11

Parents:

3548ddb (diff), 0477127 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.

Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

File:

1 edited

doc/theses/andrew_beach_MMath/performance.tex (modified) (15 diffs)

doc/theses/andrew_beach_MMath/performance.tex

-              r3548ddb
+              rf93d7fc
 but otherwise \Cpp should have a significant advantage.
 Java, a popular language with similar termination semantics, but
 it is implemented in a very different environment, a virtual machine with
+Java, a popular language with similar termination semantics,
+is implemented in a very different environment, a virtual machine with
 garbage collection.
 It also implements the finally clause on try blocks allowing for a direct
 …
 The number of times the loop is run is configurable from the command line;
 the number used in the timing runs is given with the results per test.
-% Tests ran their main loop a million times.
 The Java tests run the main loop 1000 times before
 beginning the actual test to ``warm-up" the JVM.
 …
 The repeating function calls itself inside a try block with a handler that
 does not match the raised exception, but is of the same kind of handler.
 This means that the EHM has to check each handler, but continue
+This means that the EHM has to check each handler, and continue
 over all of them until it reaches the base of the stack.
 Comparing this to the empty test gives the time to traverse over and
 …
 \paragraph{Cross Try Statement}
+This group of tests measures the cost for setting up exception handling, if it is
+This group of tests measures the cost for setting up exception handling,
+if it is
 not used (because the exceptional case did not occur).
 Tests repeatedly cross (enter, execute, and leave) a try statement but never
+Tests repeatedly cross (enter, execute and leave) a try statement but never
 perform a raise.
 \begin{itemize}[nosep]
 …
               \multicolumn{1}{c}{Raise} & \multicolumn{1}{c}{\CFA} & \multicolumn{1}{c}{\Cpp} & \multicolumn{1}{c}{Java} & \multicolumn{1}{c|}{Python} \\
 \hline
+Resume Empty (10M)  & 3.8  & 3.5  & 14.7  & 2.3   & 176.1 & 0.3  & 0.1  & 8.9   & 1.2   & 119.9 \\
+%Resume Other (10M)  & 4.0* & 0.1* & 21.9  & 6.2   & 381.0 & 0.3* & 0.1* & 13.2  & 5.0   & 290.7 \\
+Resume Empty (10M)  & 1.5 & 1.5 & 14.7 & 2.3 & 176.1  & 1.0 & 1.4 & 8.9 & 1.2 & 119.9 \\
 \hline
 \end{tabular}
 …
 For example, there are a few cases where Python out-performs
 \CFA, \Cpp and Java.
+\todo{Make sure there are still cases where Python wins.}
 The most likely explanation is that, since exceptions
 are rarely considered to be the common case, the more optimized languages
 …
 Details on the different test cases follow.
 \subsection{Termination, \autoref{t:PerformanceTermination}}
+\subsection{Termination \texorpdfstring{(\autoref{t:PerformanceTermination})}{}}
 \begin{description}
 …
 \CFA is slower than \Cpp, but is still faster than the other languages
 and closer to \Cpp than other languages.
+This result is to be expected as \CFA is closer to \Cpp than the other languages.
+This result is to be expected,
+as \CFA is closer to \Cpp than the other languages.
 \item[D'tor Traversal]
 Running destructors causes a huge slowdown in the two languages that support
 them. \CFA has a higher proportionate slowdown but it is similar to \Cpp's.
+Considering the amount of work done in destructors is virtually zero (asm instruction), the cost
+must come from the change of context required to trigger the destructor.
+Considering the amount of work done in destructors is effectively zero
+(an assembly comment), the cost
+must come from the change of context required to run the destructor.
 \item[Finally Traversal]
 …
 but they could avoid the spike by not having the same kind of overhead for
 switching to the check's context.
 \todo{Could revisit Other Traversal, after Finally Traversal.}
 …
 Here \CFA falls behind \Cpp by a much more significant margin.
 This is likely due to the fact \CFA has to insert two extra function
 calls, while \Cpp does not have to execute any other instructions.
+calls, while \Cpp does not have to do execute any other instructions.
 Python is much further behind.
 …
 \end{description}
 \subsection{Resumption, \autoref{t:PerformanceResumption}}
+\subsection{Resumption \texorpdfstring{(\autoref{t:PerformanceResumption})}{}}
 Moving on to resumption, there is one general note,
 …
 The run-time is actually very similar to Finally Traversal.
 As resumption does not unwind the stack, both destructors and finally
 clauses are run while walking down the stack during the recursion returns.
+clauses are run while walking down the stack during the recursive returns.
 So it follows their performance is similar.
 \item[Finally Traversal]
+% The increase in run-time from Empty Traversal (once adjusted for
+% the number of iterations) is roughly the same as for termination.
+% This suggests that the
+See D'tor Traversal discussion.
+Same as D'tor Traversal,
+except termination did not have a spike in run-time on this test case.
 \item[Other Traversal]
 …
 The only test case where resumption could not keep up with termination,
 although the difference is not as significant as many other cases.
+It is simply a matter of where the costs come from. \PAB{What does this mean?
+Even if \CFA termination
+is not ``zero-cost", passing through an empty function still seems to be
+cheaper than updating global values.}
+It is simply a matter of where the costs come from,
+both termination and resumption have some work to set-up or tear-down a
+handler. It just so happens that resumption's work is slightly slower.
 \item[Conditional Match]
 …
 \end{description}
 \subsection{Resumption/Fixup, \autoref{t:PerformanceFixupRoutines}}
+\subsection{Resumption/Fixup \texorpdfstring{(\autoref{t:PerformanceFixupRoutines})}{}}
 Finally are the results of the resumption/fixup routine comparison.
 …
 has more to do with performance than passing the argument through layers of
 calls.
+Even with 100 stack frames though, resumption is only about as fast as
+manually passing a fixup routine.
+However, as the number of fixup routines is increased, the cost of passing them
+should make the resumption dynamic-search cheaper.
+So there is a cost for the additional power and flexibility exceptions
+provide.
+At 100 stack frames, resumption and manual fixup routines have similar
+performance in \CFA.
+More experiments could try to tease out the exact trade-offs,
+but the prototype's only performance goal is to be reasonable.
+It has already in that range, and \CFA's fixup routine simulation is
+one of the faster simulations as well.
+Plus exceptions add features and remove syntactic overhead,
+so even at similar performance resumptions have advantages
+over fixup routines.
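
Reviewer sketch (not part of the changeset): the timing method the diff describes, a warm-up pass followed by a timed loop that crosses a try statement without ever raising, can be illustrated roughly as below. The thesis's actual test programs are not shown in this changeset, so the names `cross_try` and `timed` and the warm-up parameter are assumptions made for illustration only. The thesis applies the 1000-iteration warm-up to the Java tests; Python is used here purely to keep the sketch short.

```python
import time


def cross_try(times):
    """Enter, execute and leave a try statement `times` times, never raising."""
    crossed = 0
    for _ in range(times):
        try:
            crossed += 1
        except ValueError:
            # Hypothetical handler: it exists so the try statement has
            # something to set up, but it is never reached.
            crossed -= 1
    return crossed


def timed(test, times, warmup=0):
    """Run `test(times)` and return (result, elapsed seconds).

    If `warmup` is non-zero, the test first runs that many iterations
    untimed, mirroring the warm-up pass described for the Java tests.
    """
    if warmup:
        test(warmup)
    start = time.perf_counter()
    result = test(times)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

A call such as `timed(cross_try, 1_000_000, warmup=1000)` returns the loop's result together with the elapsed time, which could then be compared against a loop without the try statement to estimate the set-up cost.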
