Changeset ddcaff6 for doc/theses/thierry_delisle_PhD/thesis/text/intro.tex
- Timestamp: Nov 24, 2022, 3:41:44 PM
- Branches: ADT, ast-experimental, master
- Children: dacd8e6e
- Parents: 82a90d4
- Files: 1 edited
doc/theses/thierry_delisle_PhD/thesis/text/intro.tex
--- doc/theses/thierry_delisle_PhD/thesis/text/intro.tex (r82a90d4)
+++ doc/theses/thierry_delisle_PhD/thesis/text/intro.tex (rddcaff6)
 \chapter{Introduction}\label{intro}
 
-\Gls{uthrding} (M:N) is gaining popularity over kernel-level threading (1:1) in many programming languages.
+\Gls{uthrding} (M:N) is gaining popularity over kernel-level threading (1:1) in many programming languages, like Go~\cite{GITHUB:go}, Java's Project Loom~\cite{MAN:project-loom} and Kotlin~\cite{MAN:kotlin}.
 The user threading approach is often a better mechanism to express complex concurrent applications by efficiently running 10,000+ threads on multicore systems.
 Indeed, over-partitioning into small work units with user threading significantly eases load bal\-ancing, while simultaneously providing advanced synchronization and mutual exclusion capabilities.
 To manage these high levels of concurrency, the underlying runtime must efficiently schedule many user threads across a few kernel threads;
-which begs the question of how many kernel threads are needed and should the number be dynamically reevaluated.
+which raises the question of how many kernel threads are needed and whether that number should be dynamically reevaluated.
 Furthermore, scheduling must prevent kernel threads from blocking, otherwise user-thread parallelism drops.
-When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources.
+When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources?
 Finally, the scheduling system must provide fairness to prevent a user thread from monopolizing a kernel thread;
 otherwise, other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.
…
 Fairness is handled through preemption and/or ad hoc solutions, which leads to coarse-grained fairness with some pathological cases.
 
-After examining, testing and selecting specific approaches to these scheduling issues, a completely new scheduler was created and tested in the \CFA (C-for-all) user-threading runtime system.
+After examining, testing and selecting specific approaches to these scheduling issues, a new scheduler was created and tested in the \CFA (C-for-all) user-threading runtime system.
 The goal of the new scheduler is to offer increased safety and productivity without sacrificing performance.
-The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers with, the aim of showing equivalent or better performance while offering better fairness.
+The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers, with the aim of showing equivalent or better performance while offering better fairness.
 
 Chapter~\ref{intro} defines scheduling and its general goals.
…
 Computer systems share multiple resources across many threads of execution, even on single-user computers like laptops or smartphones.
 On a computer system with multiple processors and work units (routines, coroutines, threads, programs, \etc), there exists the problem of mapping many different kinds of work units onto many different kinds of processors efficiently, called \newterm{scheduling}.
-Scheduling systems are normally \newterm{open}, meaning new work arrives from an external source or is randomly spawned from an existing work unit.
+Scheduling systems are normally \newterm{open}, meaning new work arrives from an external source or is randomly spawned from an existing work unit\footnote{
+Open systems contrast with \newterm{closed} systems, where work never arrives from external sources.
+This definition is an extension of open/closed systems in the field of thermodynamics.
+}.
 
 In general, work units without threads, like routines and coroutines, are self-scheduling, while work units with threads, like tasks and programs, are scheduled.
 For scheduled work-units, a scheduler takes a sequence of threads and attempts to run them to completion, subject to shared resource restrictions and utilization.
…
 \end{enumerate}
 
-Scheduling is a zero-sum game as computer processors normally have a fixed, maximum number of cycles per unit time.\footnote{
+At a high level, scheduling is considered a zero-sum game as computer processors normally have a fixed, maximum number of cycles per unit time.\footnote{
 Frequency scaling and turbo-boost add a degree of complexity that can be ignored in this discussion without loss of generality.}
-Hence, schedulers are a series of compromises, occasionally with some static or dynamic tuning parameters to enhance specific workload patterns.
+This almost invariably leads to schedulers needing to pick some \ats over others, opening the door to fairness problems.
+However, at a lower level, schedulers can make inefficient or incorrect decisions leading to strictly worse outcomes than what the theoretical zero-sum game suggests.
+Since it can be difficult to avoid these poor decisions, schedulers are generally a series of compromises, occasionally with some static or dynamic tuning parameters to enhance specific workload patterns.
 For example, SQMS has perfect load-balancing but poor affinity and high contention by the processors, because of the single queue.
 While MQMS has poor load-balancing but perfect affinity and no contention, because each processor has its own queue.
…
 Specifically, this work concentrates on advanced thread and \glsxtrshort{io} scheduling.
 Prior to this work, the \CFA runtime used a strict SQMS \gls{rQ} and provided no nonblocking \glsxtrshort{io} capabilities at the user-thread level.\footnote{
-C/\CC only support \glsxtrshort{io} capabilities at the kernel level, which means many \io operations block \glspl{kthrd} reducing parallelism at the user level.}
+C/\CC only support \glsxtrshort{io} capabilities at the kernel level, which means many \io operations block \glspl{kthrd}, reducing parallelism at the user level.}
 
 Since \CFA attempts to improve the safety and productivity of C, the new scheduler presented in this thesis attempts to achieve the same goals.
-More specifically, safety and productivity for scheduling mean supporting a wide range of workloads so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
+More specifically, safety and productivity for scheduling mean supporting a wide range of workloads, so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
 The new scheduler also includes support for implicit nonblocking \io, allowing applications to have more user-threads blocking on \io operations than there are \glspl{kthrd}.
 To complete the scheduler, an idle sleep mechanism is implemented that significantly reduces wasted CPU cycles, which are then available outside the application.
 
-As a research project, this work builds exclusively on newer versions of the Linux operating system and gcc/clang compilers.
+As a research project, this work runs exclusively on newer versions of the Linux operating system and gcc/clang compilers.
 The new scheduler implementation uses several optimizations to successfully balance the cost of fairness against performance;
 some of these optimizations rely on interesting hardware optimizations only present on modern CPUs.
…
 An algorithm for load-balancing and idle sleep of processors, including NUMA awareness.
 \item
-A mechanism for adding fairness on top of MQMS algorithm through helping, used both for scalable scheduling algorithm and the user-level \glsxtrshort{io}.
+A mechanism for adding fairness on top of the MQMS algorithm through helping, used both for the scalable scheduling algorithm and the user-level \glsxtrshort{io}.
 \item
 An optimization of the helping mechanism for load balancing to reduce scheduling costs.