Index: doc/theses/mike_brooks_MMath/string.tex
===================================================================
--- doc/theses/mike_brooks_MMath/string.tex	(revision 810c2c596a73f36f250f13be31cf07734799f98a)
+++ doc/theses/mike_brooks_MMath/string.tex	(revision eeefc0ce7b34d6158108a35960cd5f58cbed09fa)
@@ -2207,5 +2207,27 @@
 Under a \CFA-generous separate-compilation stance, \CFA is equal or ahead in every important category so far.
 
-The remaining attribution, \emph{text-import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed with long strings.  \PAB{From Mike: need to finish this point.}
+The remaining attribution, \emph{text import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed at large sizes.
+\CFA's loss to STL on \emph{text import} is slight at length 50 and pronounced at length 200.
+If the STL monolithic compilation advantage is removed from consideration, the \emph{text-import} difference is the only reason that \CFA is not beating STL on speed, by about 10\%, across the board.
+
+An investigation\footnote{
+	\MLB{Peter, you need to be okay with this.}
+	The description of this investigation that appears in the current draft is my best recollection concerning work done previously.
+	But, so far, I have been unable to find this actual work.
+	90\% case: I find it or reproduce it and save the details properly; this footnote disappears.
+	10\% case: I can't do so; I retract the explanation above. 
+} into the \emph{text-import} difference revealed an interesting optimization opportunity.
+Both implementations use a @memcpy@ operation, sourcing from the program's @argv@ representation, targeting the string library's working space.
+The @memcpy@ action is inlined into its call site successfully, in both implementations.
+But STL's, which runs faster, does the data movement with vector instructions, while \CFA's does not.
+This STL-only instruction sequence appears to be correct only when the source and destination have their starting byte at the same offset within a vector chunk.
+The \CFA implementation makes no provision for this property, so it is good for correctness that \CFA does not receive the vector version.
+Presumably, the optimizer (or a check affecting the instruction stream) has noticed STL arranging for the destination to line up with the source.
+STL could do so either by matching a known alignment (statically) or by choosing to match the source's unaligned chunk offset (dynamically).
+Either possibility would be a choice to incur further fragmentation, when allocating working space (the copy's destination), in exchange for a faster copy.
+The \CFA implementation may benefit from attempting such a scheme.
+At present, incorporating the necessary fragmentation into the working heap management is too disruptive.
+So, this discovery is left as a potential improvement.
+
 
 % \subsection{Test: Normalize}
@@ -2224,9 +2246,9 @@
 % Using the STL string, the most natural ways to write the helper module's function, given its requirements in isolation, slow down when it is driven in the adapted context.
 
-\begin{lstlisting}
-void processItem( string & item ) {
-	 // find issues in item and fix them
-}
-\end{lstlisting}
+% \begin{lstlisting}
+% void processItem( string & item ) {
+% 	 // find issues in item and fix them
+% }
+% \end{lstlisting}
 
 
