Index: doc/theses/mike_brooks_MMath/list.tex
===================================================================
--- doc/theses/mike_brooks_MMath/list.tex	(revision 0e991df4ae4c1af42734f50ff23af8b3530886f3)
+++ doc/theses/mike_brooks_MMath/list.tex	(revision fa8d17f5096421d5a025e942bd505a6add5cde16)
@@ -88,5 +88,5 @@
 \subsection{Core Design Issues}
 
-The doubly-linked list attaches links intrusively, supports multiple link directions, integrates with user code via the type system, treats its ends uniformly, and identifies a list using an explicit head.
+The doubly-linked list attaches links intrusively, supports multiple link axes, integrates with user code via the type system, treats its ends uniformly, and identifies a list using an explicit head.
 This design covers system and data management issues stated in \VRef{toc:lst:issue}.
 
@@ -101,5 +101,5 @@
 \begin{figure}
     \lstinput{20-30}{lst-features-intro.run.cfa}
-    \caption[Multiple link directions in \CFA list library]{
+    \caption[Multiple link axes in \CFA list library]{
         Demonstration of the running \lstinline{req} example, done using the \CFA list library.
         This example is equivalent to the three approaches in \VRef[Figure]{fig:lst-issues-attach}.
@@ -128,5 +128,5 @@
 \end{figure}
 
-\VRef[Figure]{fig:lst-features-multidir} shows how the \CFA library supports multi-inline links, so a node can be on one or more lists simultaneously.
+\VRef[Figure]{fig:lst-features-multidir} shows how the \CFA library supports multi-inline links, so a node can be on one or more lists simultaneously (axes).
 The declaration of @req@ has two inline-inheriting @dlink@ occurrences.
 The first of these gives a type named @req.by_pri@, @req@ inherits from it, and it inherits from @dlink@.
@@ -150,5 +150,5 @@
 
 \caption{
-        Demonstration of multiple static link directions done in the \CFA list library.
+        Demonstration of multiple static link axes done in the \CFA list library.
         The right example is from \VRef[Figure]{fig:lst-issues-multi-static}.
 		The left \CFA example does the same job.
@@ -158,9 +158,9 @@
 
 The list library also supports the common case of single directionality more naturally than LQ.  
-Returning to \VRef[Figure]{fig:lst-features-intro}, the single-direction list has no contrived name for the link direction as it uses the default type in the definition of @dlist@;
+Returning to \VRef[Figure]{fig:lst-features-intro}, the single-axis list has no contrived name for the link axis as it uses the default type in the definition of @dlist@;
 in contrast, the LQ list in \VRef[Figure]{f:Intrusive} adds the unnecessary field name @d@.
-In \CFA, a single direction list sets up a single inheritance with @dlink@, and the default list axis is to itself.
-
-When operating on a list with several directions and operations that do not take the list head, the list axis can be ambiguous.
+In \CFA, a single axis list sets up a single inheritance with @dlink@, and the default list axis is to itself.
+
+When operating on a list with several axes and operations that do not take the list head, the list axis can be ambiguous.
 For example, a call like @insert_after( r1, r2 )@ does not have enough information to know which axes to select implicitly.
 Is @r2@ supposed to be the next-priority request after @r1@, or is @r2@ supposed to join the same-requester list of @r1@?
@@ -171,8 +171,8 @@
 To mitigate this issue, the list library provides a hook for applying the \CFA language's scoping and priority rules.
 \begin{cfa}
-with ( DLINK_VIA(  req, req.pri ) ) insert_after( r1, r2 );
+with ( DLINK_VIA( req, req.pri ) ) insert_after( r1, r2 );
 \end{cfa}
 Here, the @with@ statement opens the scope of the object type for the expression;
-hence, the @DLINK_VIA@ result causes one of the list directions to become a more attractive candidate to \CFA's overload resolution.
+hence, the @DLINK_VIA@ result causes one of the list axes to become a more attractive candidate to \CFA's overload resolution.
 This boost can be applied across multiple statements in a block or an entire function body.
 \begin{cquote}
@@ -192,10 +192,10 @@
 \end{tabular}
 \end{cquote}
-Within the @with@, the code acts as if there is only one list direction.
+Within the @with@, the code acts as if there is only one list axis, without explicit casting.
 
 Unlike the \CC template container-types, the \CFA library works completely within the type system;
 both @dlink@ and @dlist@ are ordinary types, not language macros.
 There is no textual expansion other than header-included static-inline function for performance.
-Hence, errors in user code are reported only with mention of the library's declarations.
+Hence, errors in user code are reported only with mention of the library's declarations, versus long template names in error messages.
 Finally, the library is separately compiled from the usage code, modulo inlining.
 
@@ -361,5 +361,5 @@
 \end{tabular}
 \end{cquote}
-Iterating forward and reverse through the entire list using the shorthand start at the list head and pick a direction.
+Iterating forward and reverse through the entire list using the shorthand start at the list head and pick a axis.
 In this case, @advance@ and @recede@ return a boolean, like \CC @while ( cin >> i )@.
 \begin{cquote}
@@ -629,4 +629,5 @@
 After each round, a counter is incremented by $n$ (for throughput).
 Time is measured outside the loop because a large $n$ can overrun the time duration before the @CONTINUE@ flag is tested.
+Hence, there is minimum of one outer (@CONTINUE@) loop iteration for large lists.
 The loop duration is divided by the counter and this throughput is reported.
 In a scatter-plot, each dot is one throughput, which means insert + remove + harness overhead.
@@ -636,6 +637,8 @@
 To test list operations, the experiment performs the inserts/removes in different patterns, \eg insert and remove from front, insert from front and remove from back, random insert and remove, \etc.
 Unfortunately, the @std::list@ does \emph{not} support direct insert/remove from a node without an iterator, \ie no @erase( node )@, even though the list is doubly-linked.
-To eliminate this additional cost in the harness, a trick is used for random insertions without replacement.
-The @i@ fields in each node are initialized from @0..n-1@, these @i@ values are then shuffled in the nodes, and the @i@ value is used to represent an indirection to that node for insertion, so the nodes are inserted in random order, and hence, removed in th same random order.
+To eliminate the iterator, a trick is used for random insertions without replacement.
+The @i@ fields in each node are initialized from @0..n-1@.
+These @i@ values are then shuffled in the nodes, and the @i@ value is used to represent an indirection to that node for insertion.
+hence, the nodes are inserted in random order and removed in the same random order.
 $\label{p:Shuffle}$
 \begin{c++}
@@ -644,10 +647,21 @@
 
 	while ( CONTINUE ) {
-		for ( i; n ) insert_first( lst, nodes[ @nodes[i].i@ ] ); $\C{// build up}$
+		for ( i; n ) {
+			node_t & temp = nodes[ nodes[i].i ];
+			@temp.j = 0;@				$\C{// touch random node for wrapped nodes}$
+			insert_first( lst, temp );
+		}
+		insert_first( lst, nodes[ @nodes[i].i@ ] ); $\C{// build up}$
 		for ( i; n ) pass( &remove_first( lst )	);	$\C{// tear down}$
 		totalOpsDone += n;
 	}
 \end{c++}
-This approach works across intrusive and wrapped lists.
+Note, insertion is traversing the list of nodes linearly, @node[i]@.
+For intrusive lists, the inserted (random) node is always touched because its link fields are read/written for insertion into the list.
+Hence, the array of nodes is being accessed both linearly and randomly during the traversal.
+For wrapped lists, the wrapped nodes are traversed linearly but the random node is not accessed, only a pointer to it is inserted into the linearly accessed wrapped node.
+Hence, the traversal is the same as the non-random traversal above.
+To level the experiments, an explicit access to the random node is inserted after the insertion, @temp.j = 0@, for the wrapped experiment.
+Furthermore, it is rare to insert/remove nodes and not access them.
 
 % \emph{Interleaving} allows for movements other than pure stack and queue.
@@ -770,7 +784,7 @@
 As a result, these memory layouts result in high spatial and temporal locality for both kinds of lists during the linear array traversal.
 With address look-ahead, the hardware does an excellent job of managing the multi-level cache.
-Hence, performance is largely constant for both kinds of lists, until cache and NUMA boundaries are crossed for longer lists and the costs increase consistently for both kinds of lists.
-
-In detail, \VRef[Figure]{f:Shuffled} shows shuffle insertion of all the nodes and then linear removal, both in the same direction.
+Hence, performance is largely constant for both kinds of lists, until L3 cache and NUMA boundaries are crossed for longer lists and the costs increase consistently for both kinds of lists.
+
+In detail, \VRef[Figure]{f:Shuffled} shows shuffle insertion and removal of the nodes.
 As for linear, there are issues with the wrapped list and memory allocation.
 For intrusive lists, it is possible to link the nodes randomly, so consecutive nodes in memory seldom point at adjacent nodes.