Changeset d489da8 for doc

Timestamp: Sep 19, 2022, 11:22:32 AM (20 months ago)
Author: Thierry Delisle <tdelisle@…>
Branches: ADT, ast-experimental, master, pthread-emulation
Children: 4520b77e, aa144c5
Parents: 4407b7e
Message: core final read

File: 1 edited
  • doc/theses/thierry_delisle_PhD/thesis/text/core.tex

--- doc/theses/thierry_delisle_PhD/thesis/text/core.tex (r4407b7e)
+++ doc/theses/thierry_delisle_PhD/thesis/text/core.tex (rd489da8)

@@ -13,5 +13,5 @@
 To match expectations, the design must offer the programmer sufficient guarantees so that, as long as they respect the execution mental model, the system also respects this model.
 
-For threading, a simple and common execution mental model is the ``ideal multitasking CPU'' :
+For threading, a simple and common execution mental model is the ``ideal multitasking CPU'':
 
 \begin{displayquote}[Linux CFS\cite{MAN:linux/cfs}]
     
@@ -120,5 +120,5 @@
 On the other hand, work-stealing schedulers only attempt to do load-balancing when a \gls{proc} runs out of work.
 This means that the scheduler never balances unfair loads unless they result in a \gls{proc} running out of work.
-Chapter~\ref{microbench} shows that pathological cases work stealing can lead to indefinite starvation.
+Chapter~\ref{microbench} shows that, in pathological cases, work stealing can lead to indefinite starvation.
 
 Based on these observations, the conclusion is that a \emph{perfect} scheduler should behave similarly to work-stealing in the steady-state case, but load balance proactively when the need arises.
     
@@ -126,5 +126,5 @@
 \subsection{Relaxed-FIFO}
 A different scheduling approach is to create a ``relaxed-FIFO'' queue, as in \cite{alistarh2018relaxed}.
-This approach forgoes any ownership between \gls{proc} and sub-queue, and simply creates a pool of ready queues from which \glspl{proc} pick.
+This approach forgoes any ownership between \gls{proc} and sub-queue, and simply creates a pool of sub-queues from which \glspl{proc} pick.
 Scheduling is performed as follows:
 \begin{itemize}
     
@@ -134,5 +134,5 @@
 Timestamps are added to each element of a sub-queue.
 \item
-A \gls{proc} randomly tests ready queues until it has acquired one or two queues.
+A \gls{proc} randomly tests sub-queues until it has acquired one or two queues.
 \item
 If two queues are acquired, the older of the two \ats is dequeued from the front of the acquired queues.
     
@@ -254,5 +254,5 @@
 
 With these additions to work stealing, scheduling can be made as fair as the relaxed-FIFO approach, avoiding the majority of unnecessary migrations.
-Unfortunately, the work to achieve fairness has a performance cost, especially when the workload is inherently fair, and hence, there is only short-term or no starvation.
+Unfortunately, the work to achieve fairness has a performance cost, especially when the workload is inherently fair, and hence, there is only short-term unfairness or no starvation.
 The problem is that the constant polling, \ie reads, of remote sub-queues generally entails cache misses because the TSs are constantly being updated, \ie writes.
 To make things worse, remote sub-queues that are very active, \ie \ats are frequently enqueued and dequeued from them, lead to higher chances that polling incurs a cache miss.
     
@@ -342,5 +342,5 @@
 \subsection{Topological Work Stealing}
 \label{s:TopologicalWorkStealing}
-Therefore, the approach used in the \CFA scheduler is to have per-\proc sub-queues, but have an explicit data structure to track which cache substructure each sub-queue is tied to.
+The approach used in the \CFA scheduler is to have per-\proc sub-queues, but have an explicit data structure to track which cache substructure each sub-queue is tied to.
 This tracking requires some finesse because reading this data structure must lead to fewer cache misses than not having the data structure in the first place.
 A key element, however, is that, like the timestamps for helping, reading the cache instance mapping only needs to give the correct result \emph{often enough}.