Timestamp:
Nov 24, 2022, 3:41:44 PM
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, ast-experimental, master
Children:
dacd8e6e
Parents:
82a90d4
Message:

Last corrections to my thesis... hopefully

File:
1 edited

  • doc/theses/thierry_delisle_PhD/thesis/text/intro.tex

    r82a90d4 rddcaff6  
     \chapter{Introduction}\label{intro}

    -\Gls{uthrding} (M:N) is gaining popularity over kernel-level threading (1:1) in many programming languages.
    +\Gls{uthrding} (M:N) is gaining popularity over kernel-level threading (1:1) in many programming languages, like Go~\cite{GITHUB:go}, Java's Project Loom~\cite{MAN:project-loom} and Kotlin~\cite{MAN:kotlin}.
     The user threading approach is often a better mechanism to express complex concurrent applications by efficiently running 10,000+ threads on multicore systems.
     Indeed, over-partitioning into small work units with user threading significantly eases load bal\-ancing, while simultaneously providing advanced synchronization and mutual exclusion capabilities.
     To manage these high levels of concurrency, the underlying runtime must efficiently schedule many user threads across a few kernel threads;
    -which begs the question of how many kernel threads are needed and should the number be dynamically reevaluated.
    +which raises the question of how many kernel threads are needed and whether the number should be dynamically reevaluated.
     Furthermore, scheduling must prevent kernel threads from blocking, otherwise user-thread parallelism drops.
    -When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources.
    +When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources?
     Finally, the scheduling system must provide fairness to prevent a user thread from monopolizing a kernel thread;
     otherwise, other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.
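As a concrete illustration of the M:N model discussed above, the following is a minimal Go sketch (Go being one of the user-threading runtimes the revision cites); the helper name `parallelSum` and the 10,000-goroutine over-partitioning count are our own illustrative choices, not code from the thesis:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// parallelSum over-partitions the work into n tiny user threads
// (goroutines), which Go's M:N runtime multiplexes onto a small,
// fixed set of kernel threads.
func parallelSum(n int) int64 {
	var sum int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			atomic.AddInt64(&sum, v) // trivially small work unit
		}(int64(i))
	}
	wg.Wait()
	return sum
}

func main() {
	// N kernel threads (GOMAXPROCS) service M = 10,000 user threads.
	fmt.Println("kernel threads:", runtime.GOMAXPROCS(0))
	fmt.Println("sum:", parallelSum(10000)) // 0+1+...+9999 = 49995000
}
```

Because each work unit is tiny and there are far more units than cores, the runtime can balance load by simply handing the next ready goroutine to whichever kernel thread is idle.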
     
     Fairness is handled through preemption and/or ad hoc solutions, which leads to coarse-grained fairness with some pathological cases.

    -After examining, testing and selecting specific approaches to these scheduling issues, a completely new scheduler was created and tested in the \CFA (C-for-all) user-threading runtime system.
    +After examining, testing and selecting specific approaches to these scheduling issues, a new scheduler was created and tested in the \CFA (C-for-all) user-threading runtime system.
     The goal of the new scheduler is to offer increased safety and productivity without sacrificing performance.
    -The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers with, the aim of showing equivalent or better performance while offering better fairness.
    +The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers, with the aim of showing equivalent or better performance while offering better fairness.

     Chapter~\ref{intro} defines scheduling and its general goals.
     
     Computer systems share multiple resources across many threads of execution, even on single-user computers like laptops or smartphones.
     On a computer system with multiple processors and work units (routines, coroutines, threads, programs, \etc), there exists the problem of mapping many different kinds of work units onto many different kinds of processors efficiently, called \newterm{scheduling}.
    -Scheduling systems are normally \newterm{open}, meaning new work arrives from an external source or is randomly spawned from an existing work unit.
    +Scheduling systems are normally \newterm{open}, meaning new work arrives from an external source or is randomly spawned from an existing work unit\footnote{
    +Open systems contrast with \newterm{closed} systems, where work never arrives from external sources.
    +This definition is an extension of open/closed systems in the field of thermodynamics.
    +}.
    +
     In general, work units without threads, like routines and coroutines, are self-scheduling, while work units with threads, like tasks and programs, are scheduled.
     For scheduled work-units, a scheduler takes a sequence of threads and attempts to run them to completion, subject to shared resource restrictions and utilization.
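The definition above, a scheduler taking a sequence of threads and attempting to run them to completion, can be sketched as a toy FIFO, run-to-completion loop. Everything here (the choice of Go, the `runAll` name, the string-returning work units) is an illustrative assumption, not code from the thesis:

```go
package main

import "fmt"

// runAll is a toy scheduler: it takes a sequence of scheduled work
// units and runs each to completion in FIFO order. Real schedulers
// add preemption, blocking, and resource constraints on top of this.
func runAll(ready []func() string) []string {
	var done []string
	for len(ready) > 0 {
		next := ready[0] // pick the head of the ready queue
		ready = ready[1:]
		done = append(done, next()) // run to completion, no preemption
	}
	return done
}

func main() {
	units := []func() string{
		func() string { return "A" },
		func() string { return "B" },
		func() string { return "C" },
	}
	fmt.Println(runAll(units)) // [A B C]
}
```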
     
     \end{enumerate}

    -Scheduling is a zero-sum game as computer processors normally have a fixed, maximum number of cycles per unit time.\footnote{
    +At a high level, scheduling is considered a zero-sum game as computer processors normally have a fixed, maximum number of cycles per unit time.\footnote{
     Frequency scaling and turbo-boost add a degree of complexity that can be ignored in this discussion without loss of generality.}
    -Hence, schedulers are a series of compromises, occasionally with some static or dynamic tuning parameters to enhance specific workload patterns.
    +This almost invariably leads to schedulers needing to pick some \ats over others, opening the door to fairness problems.
    +However, at a lower level, schedulers can make inefficient or incorrect decisions leading to strictly worse outcomes than what the theoretical zero-sum game suggests.
    +Since it can be difficult to avoid these poor decisions, schedulers are generally a series of compromises, occasionally with some static or dynamic tuning parameters to enhance specific workload patterns.
     For example, SQMS has perfect load-balancing but poor affinity and high contention by the processors, because of the single queue.
     In contrast, MQMS has poor load-balancing but perfect affinity and no contention, because each processor has its own queue.
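The SQMS/MQMS trade-off described above can be sketched in a few lines of Go. The type names are hypothetical, and the sketch deliberately omits helping or work stealing, so the MQMS variant exhibits exactly the load-balancing failure the text mentions:

```go
package main

import (
	"fmt"
	"sync"
)

// task is a placeholder for a user-thread descriptor.
type task struct{ id int }

// sqms: a Single Queue, Multi Server ready queue. Perfect load
// balancing (any processor can take any task), but every processor
// contends on the same lock, and affinity is poor.
type sqms struct {
	mu sync.Mutex
	q  []task
}

func (s *sqms) push(t task) { s.mu.Lock(); s.q = append(s.q, t); s.mu.Unlock() }
func (s *sqms) pop() (task, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.q) == 0 {
		return task{}, false
	}
	t := s.q[0]
	s.q = s.q[1:]
	return t, true
}

// mqms: a Multi Queue, Multi Server ready queue, one queue per
// processor. No contention and good affinity, but without helping or
// stealing a processor with an empty queue idles while work waits
// elsewhere.
type mqms struct{ queues [][]task }

func newMQMS(procs int) *mqms          { return &mqms{queues: make([][]task, procs)} }
func (m *mqms) push(proc int, t task)  { m.queues[proc] = append(m.queues[proc], t) }
func (m *mqms) pop(proc int) (task, bool) {
	q := m.queues[proc]
	if len(q) == 0 {
		return task{}, false // load imbalance: this processor starves
	}
	m.queues[proc] = q[1:]
	return q[0], true
}

func main() {
	s := &sqms{}
	s.push(task{id: 1})
	t, _ := s.pop()
	fmt.Println("SQMS popped", t.id)

	m := newMQMS(2)
	m.push(0, task{id: 2})
	_, ok := m.pop(1) // queue 1 is empty even though queue 0 has work
	fmt.Println("MQMS proc 1 found work:", ok)
}
```

The fairness mechanism contributed by the thesis can be read as adding a "helping" path to the MQMS shape so that the starving processor above eventually finds the waiting task.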
     
     Specifically, this work concentrates on advanced thread and \glsxtrshort{io} scheduling.
     Prior to this work, the \CFA runtime used a strict SQMS \gls{rQ} and provided no nonblocking \glsxtrshort{io} capabilities at the user-thread level.\footnote{
    -C/\CC only support \glsxtrshort{io} capabilities at the kernel level, which means many \io operations block \glspl{kthrd} reducing parallelism at the user level.}
    +C/\CC only support \glsxtrshort{io} capabilities at the kernel level, which means many \io operations block \glspl{kthrd}, reducing parallelism at the user level.}

     Since \CFA attempts to improve the safety and productivity of C, the new scheduler presented in this thesis attempts to achieve the same goals.
    -More specifically, safety and productivity for scheduling mean supporting a wide range of workloads so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
    +More specifically, safety and productivity for scheduling mean supporting a wide range of workloads, so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
     The new scheduler also includes support for implicit nonblocking \io, allowing applications to have more user-threads blocking on \io operations than there are \glspl{kthrd}.
     To complete the scheduler, an idle sleep mechanism is implemented that significantly reduces wasted CPU cycles, which are then available outside the application.

    -As a research project, this work builds exclusively on newer versions of the Linux operating system and gcc/clang compilers.
    +As a research project, this work runs exclusively on newer versions of the Linux operating system and gcc/clang compilers.
     The new scheduler implementation uses several optimizations to successfully balance the cost of fairness against performance;
     some of these optimizations rely on interesting hardware optimizations only present on modern CPUs.
     
     An algorithm for load-balancing and idle sleep of processors, including NUMA awareness.
     \item
    -A mechanism for adding fairness on top of MQMS algorithm through helping, used both for scalable scheduling algorithm and the user-level \glsxtrshort{io}.
    +A mechanism for adding fairness on top of the MQMS algorithm through helping, used both for the scalable scheduling algorithm and the user-level \glsxtrshort{io}.
     \item
     An optimization of the helping mechanism for load balancing to reduce scheduling costs.