Changeset 04c31f4


Ignore:
Timestamp:
Jul 10, 2023, 5:10:11 PM (11 months ago)
Author:
caparsons <caparson@…>
Branches:
master
Children:
0aa77e6
Parents:
c3f7dd9
Message:

some changes to the queue swap chapter

Location:
doc/theses/colby_parsons_MMAth
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/colby_parsons_MMAth/glossary.tex

    rc3f7dd9 r04c31f4  
    4141\newabbreviation{dwcas}{DWCAS}{\Newterm{double-wide (width) compare-and-set (swap)}}
    4242\newabbreviation{dcas}{DCAS}{\Newterm{double compare-and-set (swap)}}
    43 \newabbreviation{das}{DAS}{\Newterm{double swap}}
     43\newabbreviation{dcasw}{DCASW}{\Newterm{weak double compare-and-set (swap)}}
    4444\newabbreviation{ll}{LL}{\Newterm{load linked}}
    4545\newabbreviation{sc}{SC}{\Newterm{store conditional}}
  • doc/theses/colby_parsons_MMAth/text/actors.tex

    rc3f7dd9 r04c31f4  
    740740In this case, there is a race between loading the register and performing the swap (discussed shortly).
    741741
    742 Hence, a novel swap is constructed, called \gls{das}, special cased in two ways:
     742Either a true memory/memory swap instruction or a \gls{dcas} would provide the ability to atomically swap two memory locations, but unfortunately neither of these instructions are supported on the architectures used in this work, and would require simulation.
     743Hence, a novel swap for this use case is constructed, called \gls{dcasw}.
     744The \gls{dcasw} is effectively a \gls{dcas} special cased in two ways:
    743745\begin{enumerate}
    744746\item
    745747It works on two separate memory locations, and hence, is logically the same as.
    746748\begin{cfa}
    747 bool DAS( T * assn1, T * assn2 ) {
    748         return DCAS( assn1, assn2, *assn1, *assn2, *assn2, *assn1 );
     749bool DCASW( T * dst, T * src ) {
     750        return DCAS( dest, src, *dest, *src, *src, *dest );
    749751}
    750752\end{cfa}
    751753\item
    752 The values swapped are never null pointers, so a null pointer can be used as an intermediate values during the swap.
     754The values swapped are never null pointers, so a null pointer can be used as an intermediate value during the swap.
    753755\end{enumerate}
    754 Figure~\ref{c:swap} shows the \CFA pseudocode for the \gls{das}.
     756Figure~\ref{c:swap} shows the \CFA pseudocode for the \gls{dcasw}.
    755757In detail, a thief performs the following steps to swap two pointers:
    756758\begin{enumerate}[start=0]
     
    764766At no other point is a queue pointer set to null.
    765767Since each worker owns a disjoint range of the queue array, it is impossible for @my_queue@ to be null.
     768Note, this algorithm is simplified due to each worker owning a disjoint range, allowing only the @vic_queue@ to be checked for null.
     769This was not listed as a special case of this algorithm, since this requirement can be avoided by modifying Step 1 of Figure~\ref{c:swap} to also check @my_queue@ for null.
     770Further discussion of this generalization is omitted since it is not needed for the presented application.
    766771\item
    767772attempts to atomically set the thief's queue pointer to null.
     
    807812}
    808813\end{cfa}
    809 \caption{DAS Concurrent}
     814\caption{DCASW Concurrent}
    810815\label{c:swap}
    811816\end{figure}
    812817
    813818\begin{theorem}
    814 \gls{das} is correct in both the success and failure cases.
     819\gls{dcasw} is correct in both the success and failure cases.
    815820\end{theorem}
    816 To verify sequential correctness, Figure~\ref{s:swap} shows a simplified \gls{das}.
     821To verify sequential correctness, Figure~\ref{s:swap} shows a simplified \gls{dcasw}.
    817822Step 1 is missing in the sequential example since it only matters in the concurrent context.
    818823By inspection, the sequential swap copies each pointer being swapped, and then the original values of each pointer are reset using the copy of the other pointer.
     
    832837}
    833838\end{cfa}
    834 \caption{DAS Sequential}
     839\caption{DCASW Sequential}
    835840\label{s:swap}
    836841\end{figure}
    837842
    838 To verify concurrent correctness, it is necessary to show \gls{das} is wait-free, \ie all thieves fail or succeed in swapping the queues in a finite number of steps.
     843To verify concurrent correctness, it is necessary to show \gls{dcasw} is wait-free, \ie all thieves fail or succeed in swapping the queues in a finite number of steps.
    839844This property is straightforward, because there are no locks or looping.
    840845As well, there is no retry mechanism in the case of a failed swap, since a failed swap either means the work is already stolen or that work is stolen from the thief.
     
    864869Once a thief atomically sets their queue pointer to be @0p@ in step 2, the invariant guarantees that that pointer does not change.
    865870In the success case of step 3, it is known the value of the victim's queue-pointer, which is not overwritten, must be @vic_queue@ due to the use of @CAS@.
    866 Given that pointers all have unique memory locations, this first write of the successful swap is correct since it can only occur when the pointer has not changed.
     871Given that the pointers all have unique memory locations, this first write of the successful swap is correct since it can only occur when the pointer has not changed.
    867872By the invariant, the write back in the successful case is correct since no other worker can write to the @0p@ pointer.
    868873In the failed case of step 3, the outcome is correct in steps 1 and 2 since no writes have occurred so the program state is unchanged.
    869874Therefore, the program state is safely restored to the state it had prior to the @0p@ write in step 2, because the invariant makes the write back to the @0p@ pointer safe.
     875Note that the assumption of the pointers having unique memory locations prevents the ABA problem in this usage of \gls{dcasw}, but it is not needed for correctness of the general \gls{dcasw} operation.
    870876
    871877\begin{comment}
     
    987993
    988994The longest-victim heuristic maintains a timestamp per executor thread that is updated every time a worker attempts to steal work.
    989 \PAB{Explain the timestamp, \ie how is it formed?}
     995The timestamps are generated using @rdtsc@~\cite{} and are stored in a shared array, with one index per worker.
    990996Thieves then attempt to steal from the worker with the oldest timestamp.
     997The intuition behind this heuristic is that the slowest worker will receive help via work stealing until it becomes a thief, which indicates that it has caught up to the pace of the rest of the workers.
    991998This heuristic means that if two thieves look to steal at the same time, they likely attempt to steal from the same victim.
    992 \PAB{This idea seems counter intuitive so what is the intuition?}
    993 This consequence does increase the chance at contention among thieves;
     999This approach consequently does increase the chance at contention among thieves;
    9941000however, given that workers have multiple queues, often in the tens or hundreds of queues, it is rare for two thieves to attempt stealing from the same queue.
    995 \PAB{Both of these theorems are commented out.}
    996 Furthermore, in the case they attempt to steal the same queue, at least one of them is guaranteed to successfully steal the queue as shown in Theorem~\ref{t:one_vic}.
    997 Additionally, the longest victim heuristic makes it very improbable that the no swap scenario presented in Theorem~\ref{t:vic_cycle} manifests.
    998 Given the longest victim heuristic, for a cycle to manifest it requires all workers to attempt to steal in a short timeframe.
    999 This scenario is the only way that more than one thief could choose another thief as a victim, since timestamps are only updated upon attempts to steal.
    1000 In this case, the probability of an unsuccessful swap is rare, since it is likely these steals are not important when all workers are trying to steal.
     1001This approach may seem counter-intuitive, but in cases with not enough work to steal, the contention among thieves can result in less stealing, due to failed swaps.
     1002This can be beneficial when there is not enough work for all the stealing to be productive.
     1003This heuristic does not boast better performance than randomized victim selection, but it is comparable.
     1004However, it constitutes an interesting contribution as it shows that adding some complexity to the heuristic of the stealing fast path does not impact mainline performance, paving the way for more involved victim selection heuristics.
     1005% \PAB{Both of these theorems are commented out.}
     1006% Furthermore, in the case they attempt to steal the same queue, at least one of them is guaranteed to successfully steal the queue as shown in Theorem~\ref{t:one_vic}.
     1007% Additionally, the longest victim heuristic makes it very improbable that the no swap scenario presented in Theorem~\ref{t:vic_cycle} manifests.
     1008% Given the longest victim heuristic, for a cycle to manifest it requires all workers to attempt to steal in a short timeframe.
     1009% This scenario is the only way that more than one thief could choose another thief as a victim, since timestamps are only updated upon attempts to steal.
     1010% In this case, the probability of an unsuccessful swap is rare, since it is likely these steals are not important when all workers are trying to steal.
    10011011
    10021012\section{Safety and Productivity}\label{s:SafetyProductivity}
Note: See TracChangeset for help on using the changeset viewer.