
-              r408f954
+              r7184906
 Under a \CFA-generous separate-compilation stance, \CFA is equal or ahead in every important category so far.
+The remaining attribution, \emph{text-import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed with long strings.  \PAB{From Mike: need to finish this point.}
+The remaining attribution, \emph{text import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed at large sizes.
+\CFA's loss to STL on \emph{text import} is modest at length 50 and pronounced at length 200.
+If the STL monolithic compilation advantage is removed from consideration, the \emph{text-import} difference is the only reason that \CFA is not beating STL on speed, by about 10\%, across the board.
+An investigation\footnote{
+        \MLB{Peter, you need to be okay with this.}
+        The description of this investigation that appears in the current draft is my best recollection concerning work done previously.
+        But, so far, I have been unable to find this actual work.
+\% case: I find it or reproduce it and save the details properly; this footnote disappears.
+\% case: I can't do so; I retract the explanation above.
+} into the \emph{text-import} difference revealed an interesting optimization opportunity.
+Both implementations use a @memcpy@ operation, sourcing from the program's @argv@ representation, targeting the string library's working space.
+The @memcpy@ action is inlined into its call site successfully, in both implementations.
+But STL's, which runs faster, does the data movement with vector instructions, while \CFA's does not.
+This STL-only instruction sequence appears to be correct only when the source and destination have their starting byte at the same offset within a vector chunk.
+The \CFA implementation has made no provision for this property, so it is good for correctness that \CFA does not receive the vector version.
+Presumably, the optimizer (or a check affecting the instruction stream) has noticed STL arranging for the destination to line up with the source.
+It could do so either by matching a known alignment (statically) or choosing to match the source's unaligned chunk offset (dynamically).
+Either possibility would be a choice to incur further fragmentation, when allocating working space (the copy's destination), in exchange for a faster copy.
+The \CFA implementation may benefit from attempting such a scheme.
+At present, incorporating the necessary fragmentation into the working heap management is too disruptive.
+So, this discovery is left as a potential improvement.
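The dynamic variant described above (choosing a destination that matches the source's offset within a vector chunk) can be sketched in plain C. This is only an illustration under stated assumptions, not the \CFA or STL implementation: the 16-byte chunk size and the names `CHUNK`, `match_chunk_offset`, and `import_text` are hypothetical, and nothing here guarantees that a given compiler will actually emit vector instructions for the copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CHUNK = 16 };  // assumed vector chunk size in bytes (hypothetical)

// Return the first address at or after base whose offset within a
// CHUNK-byte block equals that of src. Skips at most CHUNK-1 bytes.
static char * match_chunk_offset( char * base, const char * src ) {
    size_t want = (uintptr_t)src % CHUNK;
    size_t have = (uintptr_t)base % CHUNK;
    size_t pad = (want + CHUNK - have) % CHUNK;
    return base + pad;
}

// Copy len bytes from src into working space, starting at a destination
// whose chunk offset matches the source's. *used reports the bytes
// consumed from space, including the padding spent on the match.
static char * import_text( char * space, const char * src, size_t len, size_t * used ) {
    char * dst = match_chunk_offset( space, src );
    memcpy( dst, src, len );
    *used = (size_t)(dst - space) + len;
    return dst;
}
```

The padding skipped before `dst` (up to `CHUNK - 1` bytes per import in this sketch) is the fragmentation cost that the text weighs against a faster copy.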
 % \subsection{Test: Normalize}
 …
 % Using the STL string, the most natural ways to write the helper module's function, given its requirements in isolation, slow down when it is driven in the adapted context.
 \begin{lstlisting}
 void processItem( string & item ) {
          // find issues in item and fix them
+}
 \end{lstlisting}
+% \begin{lstlisting}
+% void processItem( string & item ) {
+%        // find issues in item and fix them
+% }
+% \end{lstlisting}


Changeset 7184906

doc/theses/mike_brooks_MMath/string.tex
