Ignore:
Timestamp:
Feb 4, 2026, 12:43:09 PM (6 days ago)
Author:
Michael Brooks <mlbrooks@…>
Branches:
master, stuck-waitfor-destruct
Children:
f648875
Parents:
df72682
Message:

Add data for Intel host 'java', alongside incumbent AMD host 'swift'.

Revise analysis--presentation to show both side by side. Remove presentation of CFA Attribution, which is now seen to be chasing a red herring.

Data continue to be from harness of commit 78bc398830.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/list.tex

    rdf72682 r8eb85de  
    715715
    716716
    717 \subsection{Experimental Environment}
     717\subsubsection{Execution Environment}
    718718\label{s:ExperimentalEnvironment}
    719719
     
    726726\item[AMD]
    727727Supermicro AS--1125HS--TNR EPYC 9754 128--core socket, hyper-threading $\times$ 2 sockets (512 processing units) 2.25 GHz, TSO memory model, with cache structure 32KB L1i/L1d, 1024KB L2, 16MB L3, where each L3 cache covers 16 processors.
    728 %\item[Intel]
    729 %Supermicro SYS-121H-TNR Xeon Gold 6530 32--core, hyper-threading $\times$ 2 sockets (128 processing units) 2.1 GHz, TSO memory model
     728\item[Intel]
     729Supermicro SYS-121H-TNR Xeon Gold 6530 32--core, hyper-threading $\times$ 2 sockets (128 processing units) 2.1 GHz, TSO memory model
    730730\end{description}
    731731The experiments are single threaded and pinned to single core to prevent any OS movement, which might cause cache or NUMA effects perturbing the experiment.
     
    743743
    744744
    745 \subsubsection{Result: Coarse comparison of styles}
     745\subsection{Result: Coarse comparison of styles}
    746746
    747747This comparison establishes how an intrusive list performs compared with a wrapped-reference list.
     
    757757\begin{figure}
    758758  \centering
    759   \subfloat[Linear List Nodes]{\label{f:Linear}
    760     \includegraphics{plot-list-zoomout-noshuf.pdf}
     759  \setlength{\tabcolsep}{0pt}
     760  \begin{tabular}{p{0.75in}p{2.75in}p{3in}}
     761  &
     762  \subfloat[Linear List Nodes, AMD]{\label{f:Linear-swift}
     763    \hspace*{-0.75in}
     764    \includegraphics{plot-list-zoomout-noshuf-swift.pdf}
     765  } % subfigure
     766  &
     767  \subfloat[Linear List Nodes, Intel]{\label{f:Linear-java}
     768    \includegraphics{plot-list-zoomout-noshuf-java.pdf}
    761769  } % subfigure
    762770  \\
    763   \subfloat[Shuffled List Nodes]{\label{f:Shuffled}
    764     \includegraphics{plot-list-zoomout-shuf.pdf}
     771  &
     772  \subfloat[Shuffled List Nodes, AMD]{\label{f:Shuffled-swift}
     773    \hspace*{-0.75in}
     774    \includegraphics{plot-list-zoomout-shuf-swift.pdf}
    765775  } % subfigure
     776  &
     777  \subfloat[Shuffled List Nodes, Intel]{\label{f:Shuffled-java}
     778    \includegraphics{plot-list-zoomout-shuf-java.pdf}
     779  } % subfigure
     780  \end{tabular}
    766781  \caption{Insert/remove duration \vs list length.
    767782  Lengths go as large possible without error.
     
    779794Hence, dynamic allocation cost is fundamental for wrapped lists.
    780795
    781 In detail, \VRef[Figure]{f:Linear} shows linear insertion of all the nodes and then linear removal, both in the same direction.
     796In detail, \VRef[Figure]{f:Linear-swift}--\subref*{f:Linear-java} shows linear insertion of all the nodes and then linear removal, both in the same direction.
    782797For intrusive lists, the nodes are adjacent because they are preallocated in an array.
    783798For wrapped lists, the wrapped nodes are still adjacent because the memory allocator happens to use bump allocation for the small fixed-sized wrapped nodes.
     
    786801Hence, performance is largely constant for both kinds of lists, until L3 cache and NUMA boundaries are crossed for longer lists and the costs increase consistently for both kinds of lists.
    787802
    788 In detail, \VRef[Figure]{f:Shuffled} shows shuffle insertion and removal of the nodes.
     803In detail, \VRef[Figure]{f:Shuffled-swift}--\subref*{f:Shuffled-java} shows shuffled insertion and removal of the nodes.
    789804As for linear, there are issues with the wrapped list and memory allocation.
    790805For intrusive lists, it is possible to link the nodes randomly, so consecutive nodes in memory seldom point at adjacent nodes.
     
    840855
    841856
    842 \section{Result: Comparing intrusive implementations}
     857\subsection{Result: Comparing intrusive implementations}
    843858\label{s:ComparingIntrusiveImplementations}
    844859
     
    849864\begin{figure}
    850865\centering
    851   \subfloat[Absolute Time]{\label{f:AbsoluteTime}
    852   \includegraphics{plot-list-zoomin-abs.pdf}
     866  \setlength{\tabcolsep}{0pt}
     867  \begin{tabular}{p{0.75in}p{2.75in}p{3in}}
     868  &
     869  \subfloat[Absolute Time, AMD]{\label{f:AbsoluteTime-swift}
     870    \hspace*{-0.75in}
     871    \includegraphics{plot-list-zoomin-abs-swift.pdf}
     872  } % subfigure
     873  &
     874  \subfloat[Absolute Time, Intel]{\label{f:AbsoluteTime-java}
     875    \includegraphics{plot-list-zoomin-abs-java.pdf}
    853876  } % subfigure
    854877  \\
    855   \subfloat[Relative Time]{\label{f:RelativeTime}
    856   \includegraphics{plot-list-zoomin-rel.pdf}
     878  &
     879  \subfloat[Relative Time, AMD]{\label{f:RelativeTime-swift}
     880    \hspace*{-0.75in}
     881    \includegraphics{plot-list-zoomin-rel-swift.pdf}
    857882  } % subfigure
     883  &
     884  \subfloat[Relative Time, Intel]{\label{f:RelativeTime-java}
     885    \includegraphics{plot-list-zoomin-rel-java.pdf}
     886  } % subfigure
     887  \end{tabular}
    858888  \caption{Operation duration \vs list length at small-medium lengths.  One example operation is shown: stack movement, insert-first polarity and head-mediated access.  (a) has absolute times.  (b) has times relative to those of LQ-\lstinline{tailq}.}
    859889  \label{fig:plot-list-zoomin}
     
    880910\begin{figure}
    881911\centering
    882   \subfloat[Supersets]{\label{f:Superset}
    883   \includegraphics{plot-list-cmp-exout.pdf}
     912  \setlength{\tabcolsep}{0pt}
     913  \begin{tabular}{p{0.75in}p{2.75in}p{3in}}
     914  &
     915  \subfloat[Supersets, AMD]{\label{f:Superset-swift}
     916    \hspace*{-0.75in}
     917    \includegraphics{plot-list-cmp-exout-swift.pdf}
     918  } % subfigure
     919  &
     920  \subfloat[Supersets, Intel]{\label{f:Superset-java}
     921    \includegraphics{plot-list-cmp-exout-java.pdf}
    884922  } % subfigure
    885923  \\
    886   \subfloat[1st Level Slice]{\label{f:1stLevelSlice}
    887   \includegraphics{plot-list-cmp-survey.pdf}
     924  &
     925  \subfloat[1st Level Slice, AMD]{\label{f:1stLevelSlice-swift}
     926    \hspace*{-0.75in}
     927    \includegraphics{plot-list-cmp-survey-swift.pdf}
    888928  } % subfigure
     929  &
     930  \subfloat[1st Level Slice, Intel]{\label{f:1stLevelSlice-java}
     931    \includegraphics{plot-list-cmp-survey-java.pdf}
     932  } % subfigure
     933  \end{tabular}
    889934  \caption{Operation duration ranges across operational scenarios.  (a) has the supersets of the running example operation.  (b) has the first-level slices of the full space of operations.}
    890935  \label{fig:plot-list-cmp-overall}
     
    892937
    893938\VRef[Figure]{fig:plot-list-cmp-overall} introduces alternative views of the data.
    894 Part (a)'s first column summarizes all the data of \VRef{fig:plot-list-zoomin}-(b).
     939Part \ref*{f:Superset-swift}--\ref*{f:Superset-java}'s first column summarizes all the data of \VRef{f:RelativeTime}.
    895940Its x-axis label, ``stack/insfirst/allhead,'' names the concrete scenario that has been discussed until now.
    896941Moving across the columns, the next three each stretch to include more scenarios on each of the operation dimensions, one at a time.
     
    900945The \CFA bar in the last column is summarizing 840 test-program runs: 14 list lengths, 2 movements, 2 polarities, 3 accessors and 5 repetitions.
    901946
    902 In the earlier plots of one scenario broken down by length, each data point, with its error bars, represents just 5 repetitions.
     947In the earlier plots of a single scenario broken down by length, each data point, with its error bars, represents just 5 repetitions.
    903948With a couple exceptions, this reproducibility error was small.
    904 Now, for a \CFA bar, summarizing 70 (first column) to 840 (last column) runs, a bar's height is dominated by the different behaviours of the scenarios and list length that it summarizes.
     949Now, for a \CFA bar, summarizing 70 (first column) to 840 (last column) runs, a bar's height is dominated by the different behaviours of the scenarios and list lengths that it summarizes.
    905950Accordingly, the first column's bars are short and last one's are tall.
    906951A box represents the inner 68\% of the durations, while its lines extend to cover 95\%.
     
    935980% \end{figure}
    936981
    937 
    938 \section{Result: CFA cost attribution}
     982\begin{comment}
     983\subsection{Result: CFA cost attribution}
    939984
    940985This comparison loosely itemizes the reasons that the \CFA implementation runs 15--20\% slower than LQ.
     
    9761021\centering
    9771022  \begin{tabular}{c}
    978   \includegraphics{plot-list-cfa-attrib.pdf} \\
     1023  \includegraphics{plot-list-cfa-attrib-swift.pdf} \\
    9791024  (a) \\
    980   \includegraphics{plot-list-cfa-attrib-remelem.pdf} \\
     1025  \includegraphics{plot-list-cfa-attrib-remelem-swift.pdf} \\
    9811026  (b) \\
    9821027  \end{tabular}
     
    10271072Ultimately, this analysis provides options for a future effort that needs to get the most speed out of the \CFA list.
    10281073
    1029 
     1074\end{comment}
    10301075
    10311076
Note: See TracChangeset for help on using the changeset viewer.