Ignore:
Timestamp:
Sep 7, 2022, 4:12:00 PM (2 years ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, ast-experimental, master, pthread-emulation
Children:
e4855f6
Parents:
7a0f798b
Message:

A whole bunch of small changes:
trying to setup a version that I can pass through a spell checker.
Fixing a whole bunch of grammar errors

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/thierry_delisle_PhD/thesis/text/intro.tex

    r7a0f798b ra44514e  
    22
    33\Gls{uthrding} (M:N) is gaining popularity over kernel-level threading (1:1) in many programming languages.
    4 The user threading approach is often a better mechanism to express complex concurrent applications by efficiently running 10,000+ threads on multi-core systems.
     4The user threading approach is often a better mechanism to express complex concurrent applications by efficiently running 10,000+ threads on multicore systems.
    55Indeed, over-partitioning into small work-units with user threading significantly eases load bal\-ancing, while simultaneously providing advanced synchronization and mutual exclusion capabilities.
    66To manage these high levels of concurrency, the underlying runtime must efficiently schedule many user threads across a few kernel threads;
    7 which begs of the question of how many kernel threads are needed and should the number be dynamically reevaluated.
     7which begs the question of how many kernel threads are needed and should the number be dynamically reevaluated.
    88Furthermore, scheduling must prevent kernel threads from blocking, otherwise user-thread parallelism drops.
    99When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources.
     
    1313This thesis analyses multiple scheduler systems, where each system attempts to fulfill the necessary requirements for \gls{uthrding}.
    1414The predominant technique for managing high levels of concurrency is sharding the ready-queue with one queue per kernel-thread and using some form of work stealing/sharing to dynamically rebalance workload shifts.
    15 Preventing kernel blocking is accomplish by transforming kernel locks and I/O operations into user-level operations that do not block the kernel thread or spin up new kernel threads to manage the blocking.
     15Preventing kernel blocking is accomplished by transforming kernel locks and I/O operations into user-level operations that do not block the kernel thread or spin up new kernel threads to manage the blocking.
    1616Fairness is handled through preemption and/or ad-hoc solutions, which leads to coarse-grained fairness with some pathological cases.
    1717
    1818After examining, testing and selecting specific approaches to these scheduling issues, a completely new scheduler was created and tested in the \CFA (C-for-all) user-threading runtime-system.
    1919The goal of the new scheduler is to offer increased safety and productivity without sacrificing performance.
    20 The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers with the aim of showing equivalent or better performance while offering better fairness.
     20The quality of the new scheduler is demonstrated by comparing it with other user-threading work-stealing schedulers with, the aim of showing equivalent or better performance while offering better fairness.
    2121
    2222Chapter~\ref{intro} defines scheduling and its general goals.
    2323Chapter~\ref{existing} discusses how scheduler implementations attempt to achieve these goals, but all implementations optimize some workloads better than others.
    24 Chapter~\ref{cfaruntime} presents the relevant aspects of the \CFA runtime system that have a significant affect on the new scheduler design and implementation.
     24Chapter~\ref{cfaruntime} presents the relevant aspects of the \CFA runtime system that have a significant effect on the new scheduler design and implementation.
    2525Chapter~\ref{core} analyses different scheduler approaches, while looking for scheduler mechanisms that provide both performance and fairness.
    2626Chapter~\ref{userio} covers the complex mechanisms that must be used to achieve nonblocking I/O to prevent the blocking of \glspl{kthrd}.
     
    3131\section{Scheduling}\label{sched}
    3232Computer systems share multiple resources across many threads of execution, even on single-user computers like laptops or smartphones.
    33 On a computer system with multiple processors and work units (routines, coroutines, threads, programs, \etc), there exists the problem of mapping many different kinds of work units onto many different kinds of processors in an efficient manner, called \newterm{scheduling}.
     33On a computer system with multiple processors and work units (routines, coroutines, threads, programs, \etc), there exists the problem of mapping many different kinds of work units onto many different kinds of processors efficiently, called \newterm{scheduling}.
    3434Scheduling systems are normally \newterm{open}, meaning new work arrives from an external source or is randomly spawned from an existing work unit.
    3535In general, work units without threads, like routines and coroutines, are self-scheduling, while work units with threads, like tasks and programs, are scheduled.
     
    3939However, optimal solutions are often not required: schedulers often produce excellent solutions, without needing optimality, by taking advantage of regularities in work patterns.
    4040
    41 Scheduling occurs at discreet points when there are transitions in a system.
    42 For example, a thread cycles through the following transitions during its execution.
     41Scheduling occurs at discrete points when there are transitions in a system.
     42For example, a \at cycles through the following transitions during its execution.
    4343\begin{center}
    4444\input{executionStates.pstex_t}
     
    4949entering the system (new $\rightarrow$ ready)
    5050\item
    51 scheduler assigns a thread to a computing resource, \eg CPU (ready $\rightarrow$ running)
     51scheduler assigns a \at to a computing resource, \eg CPU (ready $\rightarrow$ running)
    5252\item
    5353timer alarm for preemption (running $\rightarrow$ ready)
     
    5959normal completion or error, \eg segment fault (running $\rightarrow$ halted)
    6060\end{itemize}
    61 Key to scheduling is that a thread cannot bypass the ``ready'' state during a transition so the scheduler maintains complete control of the system, \ie no self-scheduling among threads.
     61Key to scheduling is that a \at cannot bypass the ``ready'' state during a transition so the scheduler maintains complete control of the system, \ie no self-scheduling among threads.
    6262
    6363When the workload exceeds the capacity of the processors, \ie work cannot be executed immediately, it is placed on a queue for subsequent service, called a \newterm{ready queue}.
     
    7171\end{tabular}
    7272\end{center}
    73 Beyond these two schedulers are a host of options, \eg adding an global shared queue to MQMS or adding multiple private queues with distinc characteristics.
     73Beyond these two schedulers are a host of options, \eg adding a global shared queue to MQMS or adding multiple private queues with distinct characteristics.
    7474
    7575Once there are multiple resources and ready queues, a scheduler is faced with three major optimization criteria:
     
    8686Essentially, all multi-processor computers have non-uniform memory access (NUMA), with one or more quantized steps to access data at different levels in the memory hierarchy.
    8787When a system has a large number of independently executing threads, affinity becomes difficult because of \newterm{thread churn}.
    88 That is, threads must be scheduled on different processors to obtain high processors utilization because the number of threads $\ggg$ processors.
     88That is, threads must be scheduled on different processors to obtain high processor utilization because the number of threads $\ggg$ processors.
    8989
    9090\item
     
    118118More specifically, safety and productivity for scheduling means supporting a wide range of workloads so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
    119119The new scheduler also includes support for implicit nonblocking \io, allowing applications to have more user-threads blocking on \io operations than there are \glspl{kthrd}.
    120 To complete the scheduler, an idle sleep mechanism is implemented that significantly reduces wasted CPU cycles, which are then available outside of the application.
     120To complete the scheduler, an idle sleep mechanism is implemented that significantly reduces wasted CPU cycles, which are then available outside the application.
    121121
    122122As a research project, this work builds exclusively on newer versions of the Linux operating-system and gcc/clang compilers.
Note: See TracChangeset for help on using the changeset viewer.