Changeset 7184906
- Timestamp:
- Apr 25, 2026, 7:27:28 PM (3 days ago)
- Branches:
- master
- Children:
- eeefc0c
- Parents:
- 408f954
- File:
- 1 edited
doc/theses/mike_brooks_MMath/string.tex (modified) (2 diffs)
doc/theses/mike_brooks_MMath/string.tex
--- doc/theses/mike_brooks_MMath/string.tex (r408f954)
+++ doc/theses/mike_brooks_MMath/string.tex (r7184906)
@@ -2207,5 +2207,27 @@
 Under a \CFA-generous separate-compilation stance, \CFA is equal or ahead in every important category so far.
 
-The remaining attribution, \emph{text-import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed with long strings. \PAB{From Mike: need to finish this point.}
+The remaining attribution, \emph{text import}, is thus a major reason that \CFA is currently unsuccessful at delivering improved speed at large sizes.
+\CFA's loss to STL on \emph{text import} occurs somewhat at length 50, and is pronounced at length 200.
+If the STL monolithic compilation advantage is removed from consideration, the \emph{text-import} difference is the only reason that \CFA is not beating STL on speed, by about 10\%, across the board.
+
+An investigation\footnote{
+\MLB{Peter, you need to be okay with this.}
+The description of this investigation that appears in the current draft is my best recollection concerning work done previously.
+But, so far, I have been unable to find this actual work.
+90\% case: I find it or reproduce it and save the details properly; this footnote disappears.
+10\% case: I can't do so; I retract the explanation above.
+} into the \emph{text-import} difference revealed an interesting optimization opportunity.
+Both implementations use a @memcpy@ operation, sourcing from the program's @argv@ representation, targeting the string library's working space.
+The @memcpy@ action is inlined into its call site successfully, in both implementations.
+But STL's, which runs faster, does the data movement with vector instructions, while \CFA's does not.
+This STL-only instruction sequence appears to be correct only when the source and destination have their starting byte at the same offset within a vector chunk.
+The \CFA implementation has made no provision for this quality, so it is good for correctness that \CFA does not receive the vector version.
+Presumably, the optimizer (or check affecting the instruction stream) has noticed STL arranging for the destination to line up with the source.
+It could do so either by matching a known alignment (statically) or choosing to match the source's unaligned chunk offset (dynamically).
+Either possibility would be a choice to incur further fragmentation, when allocating working space (the copy's destination), in exchange for a faster copy.
+The \CFA implementation may benefit from attempting such a scheme.
+At present, incorporating the necessary fragementation into the working heap management is too disruptive.
+So, this discovery is left as a potential improvement.
+
 
 % \subsection{Test: Normalize}
@@ -2224,9 +2246,9 @@
 % Using the STL string, the most natural ways to write the helper module's function, given its requirements in isolation, slow down when it is driven in the adapted context.
 
-\begin{lstlisting}
-void processItem( string & item ) {
-// find issues in item and fix them
-}
-\end{lstlisting}
+% \begin{lstlisting}
+% void processItem( string & item ) {
+% // find issues in item and fix them
+% }
+% \end{lstlisting}
 
 
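
The "dynamic" option described in the added text (placing the copy's destination at the same offset within a vector chunk as the source) can be sketched in C as below. This is only an illustration of the idea, not the STL or \CFA implementation: the 16-byte chunk width, the bump-style cursor, and the names `chunk_offset` and `place_matching` are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed vector chunk width in bytes (hypothetical; depends on the target). */
enum { CHUNK = 16 };

/* Offset of an address within its vector chunk, in the range 0 .. CHUNK-1. */
static size_t chunk_offset(const void *p) {
    return (size_t)((uintptr_t)p % CHUNK);
}

/* Hypothetical bump-allocation step: skip just enough bytes after `cursor`
 * that the returned destination has the same chunk offset as `src`.
 * The skipped bytes (at most CHUNK-1) are the extra fragmentation paid
 * in exchange for a copy whose two ends line up. */
static char *place_matching(char *cursor, const void *src, size_t *padding) {
    size_t want = chunk_offset(src);
    size_t have = chunk_offset(cursor);
    *padding = (want + CHUNK - have) % CHUNK;
    return cursor + *padding;
}

/* Copy `len` bytes from `src` into working space starting at `cursor`,
 * with the destination placed to match the source's chunk offset.
 * The caller must ensure the working space has room for len + CHUNK - 1 bytes.
 * Returns the start of the copied text. */
static char *import_text(char *cursor, const char *src, size_t len) {
    size_t padding;
    char *dst = place_matching(cursor, src, &padding);
    memcpy(dst, src, len);
    return dst;
}
```

Whether a given compiler then selects a vectorized copy for such a call site is not something this sketch can guarantee; it only shows the placement arithmetic and where the fragmentation cost comes from.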