Changeset a7d93cb for doc/theses/mike_brooks_MMath/string.tex
- Timestamp: Mar 19, 2025, 10:15:18 AM
- Branches: master
- Children: 048dde4
- Parents: bb85f76
- File: 1 edited
--- doc/theses/mike_brooks_MMath/string.tex (rbb85f76)
+++ doc/theses/mike_brooks_MMath/string.tex (ra7d93cb)
@@ -20,6 +20,5 @@
 @strcat@, @strncat@ & @+@ & @+@ & @+@ \\
 @strcmp@, @strncmp@ & @==@, @!=@, @<@, @<=@, @>@, @>=@
-& @equals@, @compareTo@
-& @==@, @!=@, @<@, @<=@, @>@, @>=@ \\
+& @equals@, @compareTo@ & @==@, @!=@, @<@, @<=@, @>@, @>=@ \\
 @strlen@ & @length@, @size@ & @length@ & @size@ \\
 @[ ]@ & @[ ]@ & @charAt@ & @[ ]@ \\
@@ -369,5 +368,5 @@
 
 \begin{figure}
-\includegraphics{memmgr-basic}
+\includegraphics{memmgr-basic.pdf}
 \caption{String memory-management data structures}
 \label{f:memmgr-basic}
@@ -578,36 +577,41 @@
 The \emph{append} tests use the varying-from-1 corpus construction, \ie they do not assume away the STL's advantage for small-string optimization.
 \PAB{To discuss: any other case variables introduced in the performance intro}
-Figure \ref{fig:string-graph-peq-cppemu} shows this behaviour, by the STL and by \CFA in STL emulation mode.
+\VRef[Figure]{fig:string-graph-peq-cppemu} shows this behaviour, by the STL and by \CFA in STL emulation mode.
 \CFA reproduces STL's performance, up to a 15\% penalty averaged over the cases shown, diminishing with larger strings, and 50\% in the worst case.
 This penalty characterizes the amount of implementation fine tuning done with STL and not done with \CFA in present state.
 The larger inherent penalty, for a user mismanaging reuse, is 40\% averaged over the cases shown, is minimally 24\%, shows up consistently between the STL and \CFA implementations, and increases with larger strings.
 
-\PAB{Most of your graphs are unreadable. gnuplot is a good tool for generating high quality graphs.}
-
 \begin{figure}
-\includegraphics[width=\textwidth]{string-graph-peq-cppemu.png}
-\caption{Average time per iteration with one \lstinline{x += y} invocation, comparing \CFA with STL implementations (given \CFA running in STL emulation mode), and comparing the ``fresh'' with ``reused'' reset styles, at various string sizes.}
+\centering
+\includegraphics{string-graph-peq-cppemu.pdf}
+% \includegraphics[width=\textwidth]{string-graph-peq-cppemu.png}
+\caption{Average time per iteration (lower is better) with one \lstinline{x += y} invocation, comparing \CFA with STL implementations (given \CFA running in STL emulation mode), and comparing the ``fresh'' with ``reused'' reset styles, at various string sizes.}
 \label{fig:string-graph-peq-cppemu}
 \end{figure}
 
 \begin{figure}
-\includegraphics[width=\textwidth]{string-graph-peq-sharing.png}
-\caption{Average time per iteration with one \lstinline{x += y} invocation, comparing \CFA (having implicit sharing activated) with STL, and comparing the ``fresh'' with ``reused'' reset styles, at various string sizes.}
+\centering
+\includegraphics{string-graph-peq-sharing.pdf}
+% \includegraphics[width=\textwidth]{string-graph-peq-sharing.png}
+\caption{Average time per iteration (lower is better) with one \lstinline{x += y} invocation, comparing \CFA (having implicit sharing activated) with STL, and comparing the ``fresh'' with ``reused'' reset styles, at various string sizes.}
 \label{fig:string-graph-peq-sharing}
 \end{figure}
 
-In sharing mode, \CFA makes the fresh/reuse difference disappear, as shown in Figure \ref{fig:string-graph-peq-sharing}.
+In sharing mode, \CFA makes the fresh/reuse difference disappear, as shown in \VRef[Figure]{fig:string-graph-peq-sharing}.
 At append lengths 5 and above, \CFA not only splits the two baseline STL cases, but its slowdown of 16\% over (STL with user-managed reuse) is close to the \CFA-v-STL implementation difference seen with \CFA in STL-emulation mode.
 
 \begin{figure}
-\includegraphics[width=\textwidth]{string-graph-pta-sharing.png}
-\caption{Average time per iteration with one \lstinline{x = x + y} invocation (new, purple bands), comparing \CFA (having implicit sharing activated) with STL.
-For context, the results from Figure \ref{fig:string-graph-peq-sharing} are repeated as the bottom bands.
+\centering
+\includegraphics{string-graph-pta-sharing.pdf}
+% \includegraphics[width=\textwidth]{string-graph-pta-sharing.png}
+\caption{Average time per iteration (lower is better) with one \lstinline{x = x + y} invocation (new, purple bands), comparing \CFA (having implicit sharing activated) with STL.
+For context, the results from \VRef[Figure]{fig:string-graph-peq-sharing} are repeated as the bottom bands.
 While not a design goal, and not graphed out, \CFA in STL-emulation mode outperformed STL in this case; user-managed allocation reuse did not affect any of the implementations in this case.}
 \label{fig:string-graph-pta-sharing}
 \end{figure}
 
-When the user takes a further step beyond the STL's optimal zone, by running @x = x + y@, as in Figure \ref{fig:string-graph-pta-sharing}, the STL's penalty is above $15 \times$ while \CFA's (with sharing) is under $2 \times$, averaged across the cases shown here.
+When the user takes a further step beyond the STL's optimal zone, by running @x = x + y@, as in \VRef[Figure]{fig:string-graph-pta-sharing}, the STL's penalty is above $15 \times$ while \CFA's (with sharing) is under $2 \times$, averaged across the cases shown here.
 Moreover, the STL's gap increases with string size, while \CFA's converges.
+
 
 \subsubsection{Test: Pass argument}
@@ -621,6 +625,8 @@
 
 \begin{figure}
-\includegraphics[width=\textwidth]{string-graph-pbv.png}
-\caption{Average time per iteration with one call to a function that takes a by-value string argument, comparing \CFA (having implicit sharing activated) with STL.
+\centering
+\includegraphics{string-graph-pbv.pdf}
+% \includegraphics[width=\textwidth]{string-graph-pbv.png}
+\caption{Average time per iteration (lower is better) with one call to a function that takes a by-value string argument, comparing \CFA (having implicit sharing activated) with STL.
 (a) With \emph{Varying-from-1} corpus construction, in which the STL-only benefit of small-string optimization occurs, in varying degrees, at all string sizes.
 (b) With \emph{Fixed-size} corpus construction, in which this benefit applies exactly to strings with length below 16.
@@ -629,5 +635,5 @@
 \end{figure}
 
-Figure \ref{fig:string-graph-pbv} shows the costs for calling a function that receives a string argument by value.
+\VRef[Figure]{fig:string-graph-pbv} shows the costs for calling a function that receives a string argument by value.
 STL's performance worsens as string length increases, while \CFA has the same performance at all sizes.
 
@@ -674,4 +680,5 @@
 
 \begin{figure}
+\centering
 \includegraphics[width=\textwidth]{string-graph-allocn.png}
 \caption{Space and time performance, under varying fraction-live targets, for the five string lengths shown, at (\emph{Fixed-size} corpus construction.
@@ -682,5 +689,5 @@
 \end{figure}
 
-Figure \ref{fig:string-graph-allocn} shows the results of this experiment.
+\VRef[Figure]{fig:string-graph-allocn} shows the results of this experiment.
 At all string sizes, varying the liveness threshold gives offers speed-for-space tradeoffs relative to STL.
 At the default liveness threshold, all measured string sizes see a ??\%--??\% speedup for a ??\%--??\% increase in memory footprint.