Ignore:
Timestamp:
Mar 27, 2025, 3:48:24 PM (6 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
185cd94
Parents:
379b6ea
Message:

preliminary updates to string chapter

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/string.tex

    r379b6ea rb1c220a  
    459459
    460460There are tradeoffs associated with the copy-on-write mechanism.
    461 Several qualitative matters are detailed in the \VRef{s:PerformanceAssessment} and the qualitative issue of multi-threaded support is introduced here.
    462 The \CFA sting library provides a switch to disable the sharing mechanism for situations where it is inappropriate.
    463 
    464 Because of the inter-linked string handles, any participant managing one string is also managing, directly, the neighbouring strings, and from there, a data structure of the ``set of all strings.''  This data structure is intended for sequential access.
    465 A negative consequence of this decision is that multiple threads using strings need to be set up so that they avoid attempting to modify (concurrently) an instance of this structure.
    466 A positive consequence is that a single-threaded program, or a program with several independent threads, can use the sharing context without locking overhead.
     461Several qualitative matters are detailed in \VRef{s:PerformanceAssessment} and the qualitative issue of multi-threaded support is introduced here.
     462The \CFA string library provides a switch to disable threads allocating from the string buffer, when string sharing is unsafe.
     463When toggled, string management is moved to the storage allocator, specifically @malloc@/@free@, where the storage allocator is assumed to be thread-safe.
     464
     465In detail, string sharing has inter-linked string handles, so any participant managing one string is also managing, directly, the neighbouring strings, and from there, a data structure of the ``set of all strings.''
     466This string structure is intended for sequential access.
     467Hence, multiple threads using shared strings need to avoid modifying (concurrently) an instance of this structure (like Java immutable strings).
     468A positive consequence of this approach is that independent threads can use the sharing buffer without locking overhead.
    467469
    468470When the string library is running with sharing disabled, it runs without implicit thread-safety challenges, which is the same as the \CC STL, and with performance goals similar to the STL.
     
    470472Hence, concurrent users of string objects must still bring their own mutual exclusion, but the string library does not add any cross thread uses that are not apparent in a user's code.
    471473
    472 The \CFA string library provides the type @string_sharectx@to control an ambient sharing context for the current thread.
     474The \CFA string library provides the type @string_sharectx@ to control an ambient sharing context for a current thread.
    473475It allows two adjustments: to opt out of sharing entirely or to begin sharing within a private context.
    474476Either way, the chosen mode applies only to the current thread, for the duration of the lifetime of the created  @string_sharectx@ object, up to being suspended by child lifetimes of different contexts.
     
    492494        \lstinputlisting[language=CFA, firstline=10, lastline=55]{sharectx.run.cfa}
    493495        &
    494         \includegraphics{string-sharectx.pdf}
     496        \raisebox{-0.17\totalheight}{\includegraphics{string-sharectx.pdf}} % lower
    495497    \end{tabular}
    496498        \caption{Controlling copying vs sharing of strings using \lstinline{string_sharectx}.}
     
    519521
    520522To discuss: general goal of ...
    521 while STL makes you think about memory management, all the time, and if you do your performance can be great ...
     523while STL makes you think about memory management, all the time, and if you do, your performance can be great ...
    522524\CFA sacrifices this advantage modestly in exchange for big wins when you're not thinking about memory management.
    523525[Does this position cover all of it?]
     
    530532\subsection{Methodology}
    531533
    532 These tests use randomly generated text fragments of varying lengths.
     534These tests use randomly generated varying-length strings (string content is immaterial).
    533535A collection of such fragments is a \emph{corpus}.
    534536The mean length of a fragment from a corpus is a typical explanatory variable.
     
    540542\end{description}
    541543The geometric distribution implies that lengths much longer than the mean occur frequently.
    542 The special treatment of length 16 deals with comparison to STL, given that STL has short-string optimization (see [TODO: write and cross-ref future-work SSO]), currently not implemented in \CFA.
    543 When success, notwithstanding SSO, is illustrated, a fixed-size or from-16 distribution ensures that extra-optimized cases are not part of the mix on the STL side.
     544The special treatment of length 16 deals with comparison to STL, given that STL has a short-string optimization (see [TODO: write and cross-ref future-work SSO]), currently not implemented in \CFA.
     545When success is illustrated, notwithstanding SSO, a fixed-size or from-16 distribution ensures that extra-optimized cases are not part of the mix on the STL side.
    544546In all experiments that use a corpus, its text is generated and loaded into the SUT before the timed phase begins.
    545547
     
    564566
    565567Another experimental variable is whether the user's logical allocation is fresh \vs reused.
    566 Here, \emph{reusing a logical allocation}, means that the program variable, into which the user is concatenating, previously held a long string:\\
     568Here, \emph{reusing a logical allocation}, means that the program variable, into which the user is concatenating, previously held a long string:
    567569\begin{cquote}
    568570\setlength{\tabcolsep}{20pt}
    569571\begin{tabular}{@{}ll@{}}
     572\multicolumn{1}{c}{\textbf{fresh}} & \multicolumn{1}{c}{\textbf{reuse}} \\
    570573\begin{cfa}
    571574
    572575for ( ... ) {
    573         @string x;@   // fresh
     576        @string x;@
    574577        for ( ... )
    575578                x += ...
     
    580583string x;
    581584for ( ... ) {
    582         @x = "";@  // reused
     585        @x = "";@
    583586        for ( ... )
    584587                x += ...
     
    587590\end{tabular}
    588591\end{cquote}
    589 These benchmark drivers have an outer loop for ``until a sample-worthy amount of execution has happened'' and an inner loop for ``build up the desired-length string.''
     592
     593All benchmark drivers have an outer loop for ``until a sample-worthy amount of execution has happened'' and an inner loop for ``building up the desired-length string.''
    590594In general, a user should not have to care about this difference, yet the STL performs differently in these cases.
    591595Furthermore, if a routine takes a string by reference, if cannot use the fresh approach.
     
    593597For the STL, this cost includes obtaining a fresh buffer from the memory allocator and copying older characters into the new buffer, while \CFA-sharing hides such a cost entirely.
    594598The fresh \vs reuse distinction is only relevant in the \emph{append} tests.
    595 
    596 The \emph{append} tests use the varying-from-1 corpus construction, \ie they do not assume away the STL's advantage for small-string optimization.
    597 \PAB{To discuss: any other case variables introduced in the performance intro}
    598 \VRef[Figure]{fig:string-graph-peq-cppemu} shows this behaviour, by the STL and by \CFA in STL emulation mode.
    599 \CFA reproduces STL's performance, up to a 15\% penalty averaged over the cases shown, diminishing with larger strings, and 50\% in the worst case.
    600 This penalty characterizes the amount of implementation fine tuning done with STL and not done with \CFA in present state.
    601 The larger inherent penalty, for a user mismanaging reuse, is 40\% averaged over the cases shown, is minimally 24\%, shows up consistently between the STL and \CFA implementations, and increases with larger strings.
    602599
    603600\begin{figure}
     
    609606\end{figure}
    610607
     608The \emph{append} tests use the varying-from-1 corpus construction, \ie they assume the STL's advantage of small-string optimization.
     609\PAB{To discuss: any other case variables introduced in the performance intro}
     610\VRef[Figure]{fig:string-graph-peq-cppemu} shows this behaviour, by the STL and by \CFA in STL emulation mode.
     611\CFA reproduces STL's performance, up to a 15\% penalty averaged over the cases shown, diminishing with larger strings, and 50\% in the worst case.
     612This penalty characterizes the amount of implementation fine tuning done with STL and not done with \CFA in present state.
     613\PAB{The larger inherent penalty, for a user mismanaging reuse, is 40\% averaged over the cases shown, is minimally 24\%, shows up consistently between the STL and \CFA implementations, and increases with larger strings.}
     614
    611615\begin{figure}
    612616\centering
     
    624628        \includegraphics{string-graph-pta-sharing.pdf}
    625629%       \includegraphics[width=\textwidth]{string-graph-pta-sharing.png}
    626         \caption{Average time per iteration (lower is better) with one \lstinline{x = x + y} invocation (new, purple bands), comparing \CFA (having implicit sharing activated) with STL.
     630        \caption{Average time per iteration (lower is better) with one \lstinline{x = x + y} invocation, comparing \CFA (having implicit sharing activated) with STL.
    627631For context, the results from \VRef[Figure]{fig:string-graph-peq-sharing} are repeated as the bottom bands.
    628632While not a design goal, and not graphed out, \CFA in STL-emulation mode outperformed STL in this case; user-managed allocation reuse did not affect any of the implementations in this case.}
     
    637641
    638642To have introduced:  STL string library forces users to think about memory management when communicating values across a function call
    639 
     643\begin{cfa}
     644void foo( string s );
     645string s = "abc";
     646foo( s );
     647\end{cfa}
    640648STL charges a prohibitive penalty for passing a string by value.
    641649With implicit sharing active, \CFA treats this operation as normal and supported.
     
    657665STL's performance worsens as string length increases, while \CFA has the same performance at all sizes.
    658666
    659 The \CFA cost to pass is nontrivial.
     667The \CFA cost to pass a string is nontrivial.
    660668The contributor is adding and removing the callee's string handle from the global list.
    661669This cost is $1.5 \times$ to $2 \times$ over STL's when small-string optimization applies, though this cost should be avoidable in the same case, given a \CFA realization of this optimization.
Note: See TracChangeset for help on using the changeset viewer.