- Timestamp:
- Sep 19, 2022, 11:22:32 AM (2 years ago)
- Branches:
- ADT, ast-experimental, master, pthread-emulation
- Children:
- 4520b77e, aa144c5a
- Parents:
- 4407b7e
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/thierry_delisle_PhD/thesis/text/core.tex
r4407b7e rd489da8 13 13 To match expectations, the design must offer the programmer sufficient guarantees so that, as long as they respect the execution mental model, the system also respects this model. 14 14 15 For threading, a simple and common execution mental model is the ``ideal multitasking CPU'' 15 For threading, a simple and common execution mental model is the ``ideal multitasking CPU'': 16 16 17 17 \begin{displayquote}[Linux CFS\cite{MAN:linux/cfs}] … … 120 120 On the other hand, work-stealing schedulers only attempt to do load-balancing when a \gls{proc} runs out of work. 121 121 This means that the scheduler never balances unfair loads unless they result in a \gls{proc} running out of work. 122 Chapter~\ref{microbench} shows that pathological caseswork stealing can lead to indefinite starvation.122 Chapter~\ref{microbench} shows that, in pathological cases, work stealing can lead to indefinite starvation. 123 123 124 124 Based on these observations, the conclusion is that a \emph{perfect} scheduler should behave similarly to work-stealing in the steady-state case, but load balance proactively when the need arises. … … 126 126 \subsection{Relaxed-FIFO} 127 127 A different scheduling approach is to create a ``relaxed-FIFO'' queue, as in \cite{alistarh2018relaxed}. 128 This approach forgoes any ownership between \gls{proc} and sub-queue, and simply creates a pool of readyqueues from which \glspl{proc} pick.128 This approach forgoes any ownership between \gls{proc} and sub-queue, and simply creates a pool of sub-queues from which \glspl{proc} pick. 129 129 Scheduling is performed as follows: 130 130 \begin{itemize} … … 134 134 Timestamps are added to each element of a sub-queue. 135 135 \item 136 A \gls{proc} randomly tests readyqueues until it has acquired one or two queues.136 A \gls{proc} randomly tests sub-queues until it has acquired one or two queues. 137 137 \item 138 138 If two queues are acquired, the older of the two \ats is dequeued from the front of the acquired queues. … … 254 254 255 255 With these additions to work stealing, scheduling can be made as fair as the relaxed-FIFO approach, avoiding the majority of unnecessary migrations. 256 Unfortunately, the work to achieve fairness has a performance cost, especially when the workload is inherently fair, and hence, there is only short-term or no starvation.256 Unfortunately, the work to achieve fairness has a performance cost, especially when the workload is inherently fair, and hence, there is only short-term unfairness or no starvation. 257 257 The problem is that the constant polling, \ie reads, of remote sub-queues generally entails cache misses because the TSs are constantly being updated, \ie, writes. 258 258 To make things worst, remote sub-queues that are very active, \ie \ats are frequently enqueued and dequeued from them, lead to higher chances that polling will incur a cache-miss. … … 342 342 \subsection{Topological Work Stealing} 343 343 \label{s:TopologicalWorkStealing} 344 The refore, theapproach used in the \CFA scheduler is to have per-\proc sub-queues, but have an explicit data structure to track which cache substructure each sub-queue is tied to.344 The approach used in the \CFA scheduler is to have per-\proc sub-queues, but have an explicit data structure to track which cache substructure each sub-queue is tied to. 345 345 This tracking requires some finesse because reading this data structure must lead to fewer cache misses than not having the data structure in the first place. 346 346 A key element, however, is that, like the timestamps for helping, reading the cache instance mapping only needs to give the correct result \emph{often enough}.
Note: See TracChangeset
for help on using the changeset viewer.