Changeset 13088f1 for doc/theses/thierry_delisle_PhD

- Timestamp: Aug 13, 2022, 1:37:19 PM
- Branches: ADT, ast-experimental, master, pthread-emulation
- Children: 08e7590d
- Parents: 4d85e47
- Files: 1 edited
doc/theses/thierry_delisle_PhD/thesis/text/intro.tex (r4d85e47 → r13088f1)

 After examining, testing and selecting specific approaches to these scheduling issues, a completely new scheduler was created and tested in the \CFA (C-for-all) user-threading runtime-system.
-The goal of the new scheduler to offer increased safety and productivity without sacrificing performance.
+The goal of the new scheduler is to offer increased safety and productivity without sacrificing performance.
 The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers with the aim of showing equivalent or better performance while offering better fairness.
…
 Chapter~\ref{core} analyses different scheduler approaches, while looking for scheduler mechanisms that provide both performance and fairness.
 Chapter~\ref{s:UserLevelIO} covers the complex mechanisms that must be used to achieve nonblocking I/O to prevent the blocking of \glspl{kthrd}.
+Chapter~\ref{practice} presents the mechanisms needed to adjust the amount of parallelism, both manually and automatically.
 Chapters~\ref{microbench} and~\ref{macrobench} present micro and macro benchmarks used to evaluate and compare the new scheduler with similar schedulers.
…
 In general, work units without threads, like routines and coroutines, are self-scheduling, while work units with threads, like tasks and programs, are scheduled.
 For scheduled work-units, a scheduler takes a sequence of threads and attempts to run them to completion, subject to shared resource restrictions and utilization.
-A general-purpose dynamic-scheduler for an open system cannot anticipate (future) work requests, so its performance is rarely optimal.
+A general-purpose dynamic-scheduler for an open system cannot anticipate work requests, so its performance is rarely optimal.
 Even with complete knowledge of arrive order and work, creating an optimal solution is a bin packing problem~\cite{wiki:binpak}.
 However, optimal solutions are often not required: schedulers often produce excellent solutions, without needing optimality, by taking advantage of regularities in work patterns.
…
 \end{tabular}
 \end{center}
-Beyond these two schedulers are a host of options, \ie adding an optional global, shared queue to MQMS.
+Beyond these two schedulers are a host of options, \eg adding an global shared queue to MQMS or adding multiple private queues with distinc characteristics.
…
 Once there are multiple resources and ready queues, a scheduler is faced with three major optimization criteria:
…
 \newterm{contention}: safe access of shared objects by multiple processors requires mutual exclusion in some form, generally locking.\footnote{
 Lock-free data-structures do not involve locking but incur similar costs to achieve mutual exclusion.}
-
-\noindent
 Mutual exclusion cost and latency increases significantly with the number of processors access\-ing a shared object.
 \end{enumerate}
…
 Significant research effort has looked at load balancing by stealing/sharing work units among queues: when a ready queue is too short or long, respectively, load stealing/sharing schedulers attempt to push/pull work units to/from other ready queues.
 These approaches attempt to perform better load-balancing at the cost of affinity and contention.
-However, \emph{all} approaches come at a cost (zero-sum game), but not all compromises are necessarily equivalent, especially across workloads.
+However, \emph{all} approaches come at a cost, but not all compromises are necessarily equivalent, especially across workloads.
 Hence, some schedulers perform very well for specific workloads, while others offer acceptable performance over a wider range of workloads.
…
 An algorithm for load-balancing and idle sleep of processors, including NUMA awareness.
 \item
-A mechanism for adding fairness on top of work-stealing through helping (used both for the I/O and ready-queue).
+A mechanism for adding fairness on top of MQMS algorithm through helping, used both for scalable scheduling algorithm and the user-level \glsxtrshort{io}.
 \item
-An optimization of the work-stealing helping-mechanism for load balancing to reduce scheduling costs.
+An optimization of the helping-mechanism for load balancing to reduce scheduling costs.
 \item
 An optimization for the alternative relaxed-list for load balancing to reduce scheduling costs in embarrassingly parallel cases.