# Changeset e6662f5

Timestamp:
Jul 20, 2022, 8:56:30 PM
Branches:
Children:
b9e2b87
Parents:
6726a3a
Message:

Merge Peter's changes and added some details to idle sleep tracking

File:
1 edited

Therefore, dynamically changing the number of \procs is an appropriate moment to allocate or free resources to match the new state.
As such, all internal scheduling arrays that are sized based on the number of \procs need to be @realloc@ed.
This requirement also means any references into these arrays, \eg pointers or indexes, may need to be updated if elements are moved for compaction or for any other reason.
% \footnote{Indexes may still need fixing when shrinking because some indexes are expected to refer to dense contiguous resources and there is no guarantee the resource being removed has the highest index.}
There are no performance requirements, within reason, for resizing since it is expected to be rare.

\subsection{Read-Copy-Update}
One solution is to use the Read-Copy-Update pattern~\cite{wiki:rcu}.
In this pattern, resizing is done by creating a copy of the internal data structures, \eg see Figure~\ref{fig:base-ts2}, updating the copy with the desired changes, and then attempting an Indiana Jones Switch to replace the original with the copy.
This approach has the advantage that it may not need any synchronization to do the switch.
However, there is a race where \procs still use the original data structure after the copy is switched in.

\section{Idle-Sleep}
While manual resizing of \procs is expected to be rare, the number of \ats can vary significantly over an application's lifetime, which means there are times when there are too few or too many \procs.
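The copy-update-swap idea can be sketched as follows. This is a minimal illustration, not the thesis implementation: the names @procs@, @current@, and @resize@ are made up for the example, and the grace period needed before freeing the old copy is deliberately elided, which is exactly the race described above.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

struct procs {
	size_t count;
	int ids[];              // flexible array member, sized to count
};

static _Atomic(struct procs *) current = NULL;

// Grow or shrink by copying, updating the copy, then atomically
// swapping it in (the "Indiana Jones Switch").  Safely freeing the
// old copy requires waiting out concurrent readers, elided here.
struct procs * resize(size_t new_count) {
	struct procs * old = atomic_load(&current);
	struct procs * new = malloc(sizeof(struct procs) + new_count * sizeof(int));
	new->count = new_count;
	size_t keep = old ? (old->count < new_count ? old->count : new_count) : 0;
	if (old) memcpy(new->ids, old->ids, keep * sizeof(int));
	for (size_t i = keep; i < new_count; i++) new->ids[i] = (int)i;
	atomic_store(&current, new);   // the race: readers may still hold 'old'
	return old;                    // caller reclaims after a grace period
}
```

No synchronization is needed for the swap itself, but the returned old array cannot be freed until every \proc that loaded it has moved on.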
For this work, it is the programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.
This leaves too many \procs when there are not enough \ats for all the \procs to be useful.
These idle \procs cannot be removed because their lifetime is controlled by the application, and only the application knows when the number of \ats may increase or decrease.
While idle \procs can spin until work appears, this approach wastes energy, unnecessarily produces heat and prevents other applications from using the processor.
Therefore, idle \procs are put into an idle state, called \newterm{Idle-Sleep}, where the \gls{kthrd} is blocked until the scheduler deems it is needed.
This generally takes the form of creating a file descriptor, \eg, dummy file, pipe, or event fd, and using that file descriptor when \procs need to wake each other.
This leads to additional complexity because there can be a race between these artificial \io and genuine \io operations.
If not handled correctly, this can lead to artificial files getting delayed too long behind genuine files, resulting in longer latency.

\subsection{Event FDs}
As a result, improper handling of this race leads to all \procs going to sleep when there are ready \ats, and the system deadlocks.
The handshake closing the race is done with both the notifier and the idle \proc executing two ordered steps.
The notifier first makes sure the newly ready \at is visible to \procs searching for \ats, and then attempts to notify an idle \proc.
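The ordering of the two steps on each side is what closes the race. A minimal sketch, assuming simple counters stand in for the ready queue and the idle list (the names @ready@, @idlers@, @notify_one@, and @try_sleep@ are illustrative, not the runtime's API):

```c
#include <stdatomic.h>
#include <stdbool.h>

static _Atomic int ready;   // stand-in for the visible ready ats
static _Atomic int idlers;  // stand-in for the visible idle procs

// Notifier: step 1, publish the at; step 2, look for an idle proc.
bool notify_one(void) {
	atomic_fetch_add(&ready, 1);      // make the at visible first
	return atomic_load(&idlers) > 0;  // then try to wake someone
}

// Idle proc: step 1, make itself visible; step 2, exhaustive search.
bool try_sleep(void) {
	atomic_fetch_add(&idlers, 1);     // register as idle first
	if (atomic_load(&ready) > 0) {    // then search exhaustively
		atomic_fetch_sub(&idlers, 1);
		return false;                 // found work, do not sleep
	}
	return true;                      // safe to block on the event fd
}
```

Because each side publishes before it reads, any interleaving either lets the searching \proc find the \at or lets the notifier see the idle \proc; sequentially-consistent atomics (the C11 default used here) provide the required ordering.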
On the other side, \procs make themselves visible as idle \procs and then search for any \ats they may have missed.
Unlike regular work-stealing, this search must be exhaustive to make sure no pre-existing \at is missed.
These steps from both sides guarantee that if the search misses a newly ready \at, then the notifier is guaranteed to see at least one idle \proc.
Conversely, if the notifier does not see any idle \proc, then a \proc is guaranteed to find the new \at in its exhaustive search.
Furthermore, the ``Race-to-Idle'' approach means that there may be contention on the data structure tracking sleepers.
Contention can be tolerated for \procs attempting to sleep or wake up because these \procs are not doing useful work, and therefore, not contributing to overall performance.

\subsection{Sleepers List}
Each cluster maintains a list of idle \procs, organized as a stack.
This ordering allows \procs at the tail to stay in idle sleep for extended periods of time while those at the head of the list wake up for bursts of activity.
Because of unbalanced performance requirements, the algorithm tracking sleepers is designed to have idle \procs handle as much of the work as possible.
The idle \procs maintain the stack of sleepers among themselves, and notifying a sleeping \proc takes as little work as possible.
The contention from the \procs attempting to go to sleep can be mitigated slightly by using @try_acquire@, so the \procs simply busy wait again, searching for \ats, if the lock is held.
This trick cannot be used when waking \procs since the waker needs to return immediately to what it was doing.
Interestingly, general notification, \ie waking any idle processor versus a specific one, does not strictly require modifying the list.
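The asymmetry between sleepers and wakers can be sketched with a try-lock. This is an illustration of the @try_acquire@ idea using POSIX primitives, not the runtime's actual list code; @idle_list@, @try_register_idle@, and @wake_one@ are names invented for the example:

```c
#include <pthread.h>
#include <stdbool.h>

struct idle_list {
	pthread_mutex_t lock;
	int count;              // stand-in for the stack of sleeping procs
};

// A proc with no work: if the lock is busy, give up and go search for
// ats again instead of blocking -- it has nothing better to do anyway.
bool try_register_idle(struct idle_list * l) {
	if (pthread_mutex_trylock(&l->lock) != 0) return false;
	l->count++;             // push self on the sleeper stack
	pthread_mutex_unlock(&l->lock);
	return true;
}

// A waker cannot retry indefinitely; it must take the lock and return
// promptly to its own work.
void wake_one(struct idle_list * l) {
	pthread_mutex_lock(&l->lock);
	if (l->count > 0) l->count--;  // pop a sleeper to notify
	pthread_mutex_unlock(&l->lock);
}
```

Shifting the retries onto the idle side keeps the waker's critical path short, matching the unbalanced performance requirements described above.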
Here, contention can be reduced notably by having notifiers avoid the lock entirely by adding a pointer to the event @fd@ of the first idle \proc, as in Figure~\ref{fig:idle2}.
To avoid contention among notifiers, notifiers atomically exchange the pointer to @NULL@ so only one notifier contends on the system call.
\todo{Expand explanation of how a notification works.}
The first notifier succeeds the atomic exchange and obtains the @fd@ of an idle \proc.
The notifier then normally writes to the @fd@, which wakes a \proc.
The woken \proc can then update the atomic pointer while it is updating the head of the list.
Notifiers that obtain a @NULL@ in the exchange simply move on, knowing that another notifier is already waking a \proc.
This behaviour is equivalent to having multiple notifiers write to the @fd@ since reads consume all previous writes.

The next optimization is to avoid the latency of the event @fd@, which can be done by adding what is effectively a binary benaphore\cit{benaphore} in front of the event @fd@.
A simple three-state flag is added beside the event @fd@ to avoid unnecessary system calls, as shown in Figure~\ref{fig:idle:state}.
A \proc begins its idle sleep by adding itself to the idle list before searching for an \at.
In the process of adding itself to the idle list, it sets the state flag to @SEARCH@.
If no \ats can be found during the search, the \proc then confirms it is going to sleep by atomically swapping the state to @SLEEP@.
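The notifier side of this scheme can be sketched as follows. This is an illustrative reduction, not the runtime's code: @head_fd@ and @notify@ are made-up names, and a pipe stands in for a Linux event @fd@, whose reads likewise consume all previous 8-byte writes.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

// Pointer to the event fd of the first idle proc, or NULL when no proc
// is idle or another notifier has already claimed the wake-up.
static _Atomic(int *) head_fd = NULL;

// Only the notifier that wins the exchange performs the system call;
// losers move on, knowing a proc is already being woken.
bool notify(void) {
	int * fd = atomic_exchange(&head_fd, NULL);  // claim exclusively
	if (fd == NULL) return false;                // someone else is waking a proc
	uint64_t buf = 1;
	write(*fd, &buf, sizeof(buf));               // wake the blocked reader
	return true;
}
```

The woken \proc is then responsible for republishing a pointer (its successor's @fd@, or its own once it sleeps again) as part of updating the head of the list.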
If the previous state is still @SEARCH@, then the \proc reads the event @fd@.
Meanwhile, notifiers atomically exchange the state to the @AWAKE@ state.
If the previous state is @SLEEP@, then the notifier must write to the event @fd@.
However, if the notify arrives almost immediately after the \proc marks itself idle, then both reads and writes on the event @fd@ can be omitted, which reduces latency notably.
These extensions lead to the final data structure shown in Figure~\ref{fig:idle}.
\todo{You never talk about the Benaphore. What is its purpose and when is it used?}
\begin{figure}