Context Navigation

-                      rf519bd8
+                      r3397eed
 % Through these behaviours, a worker inserts messages onto its own and other worker ready-queues.
 To steal, a thief takes work from a victim's ready queue, so work stealing always results in a potential increase in contention on ready queues between the victim gulping from a queue and the thief stealing the queue.
 This contention can slows down the victim's throughput.
+This contention can reduce the victim's throughput.
 Note, the data structure used for the ready queue is not discussed since the largest cost is the mutual exclusion and its duration for safely performing the queue operations.
 The stealing mechanism in this work differs from most work-stealing actor-systems because of the message-centric (inverted) actor-system.
 Actor systems, such as Akka~\cite{Akka} and CAF~\cite{CAF} using actor-centric systems, steal by dequeuing from a non-empty actor ready-queue and enqueue\-ing to an empty ready-queue.
+Actor systems, such as Akka~\cite{Akka} and CAF~\cite{CAF} using actor-centric systems, steal by dequeuing an actor from a non-empty actor ready-queue and enqueue\-ing to an empty ready-queue.
 % As an example, in CAF, the sharded actor ready queue is a set of double-ended queues (dequeues).
 In \CFA, the actor work-stealing implementation is unique because of the message-centric system.
 With this approach, it is impractical to steal actors because an actor's messages are distributed in temporal order along the message queue.
 To ensure sequential actor execution and \gls{fifo} message delivery in a message-centric system,  stealing requires finding and removing \emph{all} of an actor's messages, and inserting them consecutively in another message queue.
+To ensure sequential actor execution and \gls{fifo} message delivery in a message-centric system, stealing requires finding and removing \emph{all} of an actor's messages, and inserting them consecutively in another message queue.
 This operation is $O(N)$ with a non-trivial constant.
 The only way for work stealing to become practical is to shard each worker's message queue, which also reduces contention, and to steal queues to eliminate queue searching.
+\begin{figure}
+\begin{center}
+\input{diagrams/steal.tikz}
+\end{center}
+\caption{Queue Stealing Mechanism}
+\label{f:steal}
+\end{figure}
+Given queue stealing, the goal of the presented work stealing implementation is to have an essentially zero-contention-cost stealing mechanism.
+Given queue stealing, the goal of the presented stealing implementation is to have an essentially zero-contention-cost stealing mechanism.
 Achieving this goal requires work stealing to have minimal (practically no) effect on the performance of the victim.
 The implication is that thieves cannot contend with a victim, and that a victim should perform no stealing related work unless it becomes a thief.
 In theory, this goal is not achievable, but practical results show the goal is virtually achieved.
 One important lesson learned while working on \uC actors~\cite{} and through discussions with fellow student Thierry Delisle, who examined work-stealing for user-threads in his Ph.D., is \emph{not} to aggressively steal.
+One important lesson learned while working on \uC actors~\cite{} and through discussions with fellow student Thierry Delisle, who examined work-stealing for user-threads in his Ph.D.~\cite{Delisle22}, is \emph{not} to aggressively steal.
 With reasonable workloads, being a thief should be a temporary state, \ie eventually work appears on the thief's ready-queues and it returns to normal operation.
 Furthermore, the act of \emph{looking} to find work is invasive (Heisenberg uncertainty principle), possibly disrupting multiple victims.
 …
 Note, the cost of stealing is not crucial for the thief because it does not have anything else to do except poll or block.
 As such a lazy-stealing approach is used; a thief select a victim, scan its queues once, and return immediately if a queue is stolen.
+The outline for lazy-stealing by a thief is: select a victim, scan its queues once, and return immediately if a queue is stolen.
 The thief then returns to normal operation and conducts a regular scan over its own queues looking for work, where stolen work is placed at the end of the scan.
 Hence, only one victim is affected and there is a reasonable delay between stealing events by scanning all the thief's ready queue looking for its own work.
 …
 Stealing a queue is done wait-free (\ie no busy waiting) with a few atomic instructions that only create contention with other stealing workers, not the victim.
 The complexity in the implementation is that victim gulping does not take the mailbox queue;
 rather it atomically transfers the mailbox queue to another list leaving the mailbox empty.
 Hence, the original list is available for new message deliveries.
+rather it atomically transfers the mailbox nodes to another list leaving the mailbox empty, \ie it unlinks the queue nodes from the top pointer leaving the original list empty.
+Hence, the original mailbox is always available for new message deliveries.
 However, this transfer logically subdivides the mailbox, and during this period, the mailbox cannot be stolen;
 otherwise there are two threads simultaneously running messages on actors in the two parts of the mailbox queue.
 …
 Hence, a thief checks if a queue is eligible and non-empty before attempting an atomic steal of a queue.
+Internally, the mailbox queues are accessed through two arrays of size $N$, that are shared among all workers.
+There is an array of mailbox queue objects @mailboxes@, and an array of pointers to the mailbox objects, @worker_queues@:
+\begin{cfa}
+work_queue * mailboxes;                   // master array of work request queues
+work_queue ** worker_queues;                // secondary array of work queues to allow for swapping
+\end{cfa}
+A diagram of the queue architecture and stealing mechanism is shown in Figure~\ref{f:steal}.
+When a send happens, it inserts a request directly into one of the @mailboxes@, but when a steal happens, two pointers in @worker_queues@ are swapped.
+Figure~\ref{f:steal} shows the queue architecture and stealing mechanism.
+Internally, the mailbox queues are accessed through two arrays of size $N$, which are shared among all workers.
+There is an array of mailbox queues, @mailboxes@, and an array of pointers to the mailboxes, @worker_queues@:
+\begin{cfa}
+struct work_queue {
+        spinlock_t mutex_lock;                          $\C[2.75in]{// atomicity for queue operations}$
+        copy_queue * owned_queue;                       $\C{// copy queue}$
+        copy_queue * c_queue;                           $\C{// current queue}$
+        volatile bool being_processed;          $\C{// flag to prevent concurrent processing}$
+};
+work_queue * mailboxes;                                 $\C{// master array of work request queues}$
+work_queue ** worker_queues;                    $\C{// secondary array of work queues to allow for swapping}\CRT$
+\end{cfa}
+A send inserts a request at the end of a @mailboxes@.
+A steal swaps two pointers in @worker_queues@.
 Conceptually, @worker_queues@ represents the ownership relation between mailboxes and workers.
 Given $M$ workers and $N$ mailboxes, each worker owns a contiguous $M$/$N$ block of pointers in @worker_queues@.
 …
 As such, operations can happen on the queues independent of stealing, which avoids almost all contention between thief threads and victim threads.
+\begin{figure}
+\begin{center}
+\input{diagrams/steal.tikz}
+\end{center}
+\caption{Queue Stealing Mechanism}
+\label{f:steal}
+\end{figure}
 To steal a queue, a thief does the following:
 \begin{enumerate}[topsep=5pt,itemsep=3pt,parsep=0pt]
 \item
+chooses a victim. Victim selection heuristics are independent of the stealing mechanism and will be discussed in Section~\ref{s:victimSelect}.
+chooses a victim.
+Victim selection heuristics are independent of the stealing mechanism and discussed in Section~\ref{s:victimSelect}.
 \item
 scan the victim's $M$/$N$ range of @worker_queues@ and non-atomically checks each mail queue to see if it is eligible and non-empty.
+If one such mail queue is found, the thief attempts to steal the queue by swapping two pointers in @worker_queues@.
+The swap races to interchange the non-empty mail-queue pointer from the victim's @worker_queues@ range with an empty mail-queue pointer in the thief's @worker_queues@ range.
+This swap can fail if another thief steals the queue, which will be discussed further in Section~\ref{s:swap}.
+Note, a thief never exceeds its $M$/$N$ worker range because it is always exchanging queues with other workers.
+If no such mail queue is found, no swap is attempted.
+\item
+stops searching after a successful queue steal, a failed queue steal or all queues in the victim's range are examined.
+The thief then resumes normal execution and ceases being a thief; it returns to its own queues and iterates over them, because new messages may have arrived during stealing, including ones in the potentially stolen queue.
+If a match is found, the thief attempts to steal the queue by swapping the appropriate victim worker-queue pointer with an empty thief's pointer, where the pointers come from the victim's and thief's ranges, respectively.
+% The swap races to interchange the appropriate victim's mail-queue pointer with an empty mail-queue pointer in the thief's @worker_queues@ range.
+This swap can fail if another thief steals the queue, which is discussed further in Section~\ref{s:swap}.
+% Note, a thief never exceeds its $M$/$N$ worker range because it is always exchanging queues with other workers.
+If no appropriate victim mailbox is found, no swap is attempted.
+\item
+stops searching after a successful queue steal, a failed queue steal, or all queues in the victim's range are examined.
+The thief then resumes normal execution and ceases being a thief.
+Hence, it iterates over its own worker queues because new messages may have arrived during stealing, including ones in the potentially stolen queue.
 Stealing is only repeated after the worker completes two consecutive iterations over its owned queues without finding work.
 \end{enumerate}
+This approach largely eliminates contention among thieves and victims except for the rare moment when a thief steals a queue and the victim and thief concurrently attempt to gulp the same queue.
+When a victim operates on a queue, it first copies the queue pointer from @worker_queues@ to a local variable.
+It then uses that local variable for all queue operations until it moves to the next index of its range of the queue array.
+This ensures that any swaps do not interrupt gulping operations, however this introduces a correctness issue.
+There is a window for a race condition between the victim and a thief.
+Once a victim copies the queue pointer from @worker_queues@, a thief could steal that pointer and both may try to gulp from the same queue.
+These two gulps cannot be allowed to happen concurrently.
+If any behaviours from a queue are run by two workers at a time it violates both mutual exclusion and the actor ordering guarantees.
+To avoid concurrent gulps, each queue stores a @being_processed@ flag that is atomically set to @true@ when a queue is gulped.
+As mentioned, there is a race between a victim gulping and a thief stealing because gulping partitions a mailbox queue making it ineligible for stealing.
+Furthermore, after a thief steals, there is moment when victim gulps but the queue no longer
+% This algorithm largely eliminates contention among thieves and victims except for the rare moment when a victim/thief concurrently attempt to gulp/steal the same queue.
+% Restating, when a victim operates on a queue, it first copies the queue pointer from @worker_queues@ to a local variable (gulp).
+% It then uses that local variable for all queue operations until it moves to the next index of its range.
+% This approach ensures any swaps do not interrupt gulping operations, however this introduces a correctness issue.
+% There is a window for a race condition between the victim and a thief.
+% Once a victim copies the queue pointer from @worker_queues@, a thief could steal that pointer and both may try to gulp from the same queue.
+% These two gulps cannot be allowed to happen concurrently.
+% If any behaviours from a queue are run by two workers at a time it violates both mutual exclusion and the actor ordering guarantees.
+To avoid this race, each mailbox header stores a @being_processed@ flag that is atomically set to @true@ when a queue is gulped.
 The flag indicates that a queue is being processed by a worker and is set back to @false@ once the local processing is finished.
 If a worker, either victim or thief, attempts to gulp a queue and finds that the @being_processed@ flag is @true@, it does not gulp the queue and moves on to the next queue in its range.

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 3397eed

Legend:

doc/theses/colby_parsons_MMAth/text/actors.tex

Download in other formats: