Changeset 7184906


Timestamp:
Apr 25, 2026, 7:27:28 PM (3 days ago)
Author:
Michael Brooks <mlbrooks@…>
Branches:
master
Children:
eeefc0c
Parents:
408f954
Message:

address string pending item, to elaborate on the cfa "text import" slowdown

File:
1 edited

  • doc/theses/mike_brooks_MMath/string.tex

    r408f954 r7184906  
    22072207Under a \CFA-generous separate-compilation stance, \CFA is equal or ahead in every important category so far.
    22082208
    2209 The remaining attribution, \emph{text-import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed with long strings.  \PAB{From Mike: need to finish this point.}
     2209The remaining attribution, \emph{text import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed at large sizes.
     2210\CFA's loss to STL on \emph{text import} begins to appear at length 50, and is pronounced at length 200.
     2211If the STL monolithic compilation advantage is removed from consideration, the \emph{text-import} difference is the only reason that \CFA is not beating STL on speed, by about 10\%, across the board.
     2212
     2213An investigation\footnote{
     2214        \MLB{Peter, you need to be okay with this.}
     2215        The description of this investigation that appears in the current draft is my best recollection concerning work done previously.
     2216        But, so far, I have been unable to find this actual work.
     2217        90\% case: I find it or reproduce it and save the details properly; this footnote disappears.
     2218        10\% case: I can't do so; I retract the explanation above.
     2219} into the \emph{text-import} difference revealed an interesting optimization opportunity.
     2220Both implementations use a @memcpy@ operation, sourcing from the program's @argv@ representation, targeting the string library's working space.
     2221The @memcpy@ action is inlined into its call site successfully, in both implementations.
     2222But STL's, which runs faster, does the data movement with vector instructions, while \CFA's does not.
     2223This STL-only instruction sequence appears to be correct only when the source and destination have their starting byte at the same offset within a vector chunk.
     2224The \CFA implementation has made no provision for this quality, so it is good for correctness that \CFA does not receive the vector version.
     2225Presumably, the optimizer (or a check affecting the instruction stream) has noticed STL arranging for the destination to line up with the source.
     2226It could do so either by matching a known alignment (statically) or choosing to match the source's unaligned chunk offset (dynamically).
     2227Either possibility would be a choice to incur further fragmentation, when allocating working space (the copy's destination), in exchange for a faster copy.
     2228The \CFA implementation may benefit from attempting such a scheme.
     2229At present, incorporating the necessary fragmentation into the working heap management is too disruptive.
     2230So, this discovery is left as a potential improvement.
     2231
    22102232
    22112233% \subsection{Test: Normalize}
     
    22242246% Using the STL string, the most natural ways to write the helper module's function, given its requirements in isolation, slow down when it is driven in the adapted context.
    22252247
    2226 \begin{lstlisting}
    2227 void processItem( string & item ) {
    2228          // find issues in item and fix them
    2229 }
    2230 \end{lstlisting}
     2248% \begin{lstlisting}
     2249% void processItem( string & item ) {
     2250%        // find issues in item and fix them
     2251% }
     2252% \end{lstlisting}
    22312253
    22322254
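
Reviewer sketch for the paragraph added at new lines 2220-2230: a minimal C illustration of the "dynamic" option, where the copy's destination is chosen so that its starting byte sits at the same offset within a vector chunk as the source. This is not the \CFA or STL implementation. The 16-byte chunk width and the names CHUNK, chunk_offset, match_offset and import_text are assumptions made for illustration only, and nothing here guarantees that a compiler will emit vector instructions for the copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed vector chunk width in bytes; hypothetical, chosen for illustration. */
enum { CHUNK = 16 };

/* Offset of p within its enclosing CHUNK-byte block. */
static size_t chunk_offset( const void * p ) {
	return (size_t)( (uintptr_t)p % CHUNK );
}

/* First address at or after base whose chunk offset equals that of src.
   At most CHUNK-1 bytes are skipped; those skipped bytes are the
   fragmentation paid in exchange for a lined-up copy. */
static char * match_offset( char * base, const void * src ) {
	size_t pad = ( chunk_offset( src ) + CHUNK - chunk_offset( base ) ) % CHUNK;
	return base + pad;
}

/* Copy len bytes from src into working space, at a destination lined up with src.
   The caller must provide at least len + CHUNK - 1 bytes of space. */
static char * import_text( char * space, const char * src, size_t len ) {
	char * dst = match_offset( space, src );
	memcpy( dst, src, len );
	return dst;
}
```

A caller reserves CHUNK - 1 bytes of slack beyond the text length and uses the returned pointer, not the start of the working space, as the string's location. The "static" option from the same paragraph would instead round the destination up to a known alignment, which pays a similar padding cost but only lines up with sources that share that alignment.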