Index: doc/theses/colby_parsons_MMAth/glossary.tex
===================================================================
--- doc/theses/colby_parsons_MMAth/glossary.tex	(revision e6e1a1206cdd5defb28fcd401966d2d7b93b358d)
+++ doc/theses/colby_parsons_MMAth/glossary.tex	(revision 04c31f4c9533583c26f64923016af702d88ebf40)
@@ -41,5 +41,5 @@
 \newabbreviation{dwcas}{DWCAS}{\Newterm{double-wide (width) compare-and-set (swap)}}
 \newabbreviation{dcas}{DCAS}{\Newterm{double compare-and-set (swap)}}
-\newabbreviation{das}{DAS}{\Newterm{double swap}}
+\newabbreviation{dcasw}{DCASW}{\Newterm{weak double compare-and-set (swap)}}
 \newabbreviation{ll}{LL}{\Newterm{load linked}}
 \newabbreviation{sc}{SC}{\Newterm{store conditional}}
Index: doc/theses/colby_parsons_MMAth/text/actors.tex
===================================================================
--- doc/theses/colby_parsons_MMAth/text/actors.tex	(revision e6e1a1206cdd5defb28fcd401966d2d7b93b358d)
+++ doc/theses/colby_parsons_MMAth/text/actors.tex	(revision 04c31f4c9533583c26f64923016af702d88ebf40)
@@ -740,17 +740,19 @@
 In this case, there is a race between loading the register and performing the swap (discussed shortly).
 
-Hence, a novel swap is constructed, called \gls{das}, special cased in two ways:
+Either a true memory/memory swap instruction or a \gls{dcas} would provide the ability to atomically swap two memory locations, but unfortunately neither instruction is supported on the architectures used in this work, so either would have to be simulated.
+Hence, a novel swap for this use case is constructed, called \gls{dcasw}.
+The \gls{dcasw} is effectively a \gls{dcas} special cased in two ways:
 \begin{enumerate}
 \item
 It works on two separate memory locations, and hence, is logically the same as.
 \begin{cfa}
-bool DAS( T * assn1, T * assn2 ) {
-	return DCAS( assn1, assn2, *assn1, *assn2, *assn2, *assn1 );
+bool DCASW( T * dst, T * src ) {
+	return DCAS( dst, src, *dst, *src, *src, *dst );
 }
 \end{cfa}
 \item
-The values swapped are never null pointers, so a null pointer can be used as an intermediate values during the swap.
+The values swapped are never null pointers, so a null pointer can be used as an intermediate value during the swap.
 \end{enumerate}
-Figure~\ref{c:swap} shows the \CFA pseudocode for the \gls{das}.
+Figure~\ref{c:swap} shows the \CFA pseudocode for the \gls{dcasw}.
 In detail, a thief performs the following steps to swap two pointers:
 \begin{enumerate}[start=0]
@@ -764,4 +766,7 @@
 At no other point is a queue pointer set to null.
 Since each worker owns a disjoint range of the queue array, it is impossible for @my_queue@ to be null.
+Note, this algorithm is simplified because each worker owns a disjoint range, so only @vic_queue@ needs to be checked for null.
+The disjoint-range requirement is not listed as a special case of the algorithm, since it can be avoided by modifying Step 1 of Figure~\ref{c:swap} to also check @my_queue@ for null.
+Further discussion of this generalization is omitted since it is not needed for the presented application.
 \item
 attempts to atomically set the thief's queue pointer to null.
@@ -807,12 +812,12 @@
 }
 \end{cfa}
-\caption{DAS Concurrent}
+\caption{DCASW Concurrent}
 \label{c:swap}
 \end{figure}
 
 \begin{theorem}
-\gls{das} is correct in both the success and failure cases.
+\gls{dcasw} is correct in both the success and failure cases.
 \end{theorem}
-To verify sequential correctness, Figure~\ref{s:swap} shows a simplified \gls{das}.
+To verify sequential correctness, Figure~\ref{s:swap} shows a simplified \gls{dcasw}.
 Step 1 is missing in the sequential example since it only matters in the concurrent context.
 By inspection, the sequential swap copies each pointer being swapped, and then the original values of each pointer are reset using the copy of the other pointer.
@@ -832,9 +837,9 @@
 }
 \end{cfa}
-\caption{DAS Sequential}
+\caption{DCASW Sequential}
 \label{s:swap}
 \end{figure}
 
-To verify concurrent correctness, it is necessary to show \gls{das} is wait-free, \ie all thieves fail or succeed in swapping the queues in a finite number of steps.
+To verify concurrent correctness, it is necessary to show \gls{dcasw} is wait-free, \ie all thieves fail or succeed in swapping the queues in a finite number of steps.
 This property is straightforward, because there are no locks or looping.
 As well, there is no retry mechanism in the case of a failed swap, since a failed swap either means the work is already stolen or that work is stolen from the thief.
@@ -864,8 +869,9 @@
 Once a thief atomically sets their queue pointer to be @0p@ in step 2, the invariant guarantees that that pointer does not change.
 In the success case of step 3, it is known the value of the victim's queue-pointer, which is not overwritten, must be @vic_queue@ due to the use of @CAS@.
-Given that pointers all have unique memory locations, this first write of the successful swap is correct since it can only occur when the pointer has not changed.
+Given that the pointers all have unique memory locations, this first write of the successful swap is correct since it can only occur when the pointer has not changed.
 By the invariant, the write back in the successful case is correct since no other worker can write to the @0p@ pointer.
 In the failed case of step 3, the outcome is correct in steps 1 and 2 since no writes have occurred so the program state is unchanged.
 Therefore, the program state is safely restored to the state it had prior to the @0p@ write in step 2, because the invariant makes the write back to the @0p@ pointer safe.
+Note that the assumption of the pointers having unique memory locations prevents the ABA problem in this usage of \gls{dcasw}, but it is not needed for correctness of the general \gls{dcasw} operation.
 
 \begin{comment}
@@ -987,16 +993,20 @@
 
 The longest-victim heuristic maintains a timestamp per executor thread that is updated every time a worker attempts to steal work.
-\PAB{Explain the timestamp, \ie how is it formed?}
+The timestamps are generated using @rdtsc@~\cite{} and are stored in a shared array, with one entry per worker.
 Thieves then attempt to steal from the worker with the oldest timestamp.
+The intuition behind this heuristic is that the slowest worker will receive help via work stealing until it becomes a thief, which indicates that it has caught up to the pace of the rest of the workers.
 This heuristic means that if two thieves look to steal at the same time, they likely attempt to steal from the same victim.
-\PAB{This idea seems counter intuitive so what is the intuition?}
-This consequence does increase the chance at contention among thieves;
+Consequently, this approach increases the chance of contention among thieves;
 however, given that workers have multiple queues, often in the tens or hundreds of queues, it is rare for two thieves to attempt stealing from the same queue.
-\PAB{Both of these theorems are commented out.}
-Furthermore, in the case they attempt to steal the same queue, at least one of them is guaranteed to successfully steal the queue as shown in Theorem~\ref{t:one_vic}.
-Additionally, the longest victim heuristic makes it very improbable that the no swap scenario presented in Theorem~\ref{t:vic_cycle} manifests.
-Given the longest victim heuristic, for a cycle to manifest it requires all workers to attempt to steal in a short timeframe.
-This scenario is the only way that more than one thief could choose another thief as a victim, since timestamps are only updated upon attempts to steal.
-In this case, the probability of an unsuccessful swap is rare, since it is likely these steals are not important when all workers are trying to steal.
+This approach may seem counter-intuitive, but when there is not enough work to steal, contention among thieves causes some swaps to fail, resulting in less stealing.
+This reduction can be beneficial when there is too little work for all the stealing to be productive.
+This heuristic does not outperform randomized victim selection, but its performance is comparable.
+However, it constitutes an interesting contribution as it shows that adding some complexity to the heuristic of the stealing fast path does not impact mainline performance, paving the way for more involved victim selection heuristics.
+% \PAB{Both of these theorems are commented out.}
+% Furthermore, in the case they attempt to steal the same queue, at least one of them is guaranteed to successfully steal the queue as shown in Theorem~\ref{t:one_vic}.
+% Additionally, the longest victim heuristic makes it very improbable that the no swap scenario presented in Theorem~\ref{t:vic_cycle} manifests.
+% Given the longest victim heuristic, for a cycle to manifest it requires all workers to attempt to steal in a short timeframe.
+% This scenario is the only way that more than one thief could choose another thief as a victim, since timestamps are only updated upon attempts to steal.
+% In this case, the probability of an unsuccessful swap is rare, since it is likely these steals are not important when all workers are trying to steal.
 
 \section{Safety and Productivity}\label{s:SafetyProductivity}
