Timestamp: Dec 14, 2022, 12:23:42 PM
Author: caparson <caparson@…>
Branches: ADT, ast-experimental, master
Children: 441a6a7
Parents: 7d9598d8 (diff), d8bdf13 (diff)
Note: this is a merge changeset; the changes displayed below correspond to the merge itself. Use the (diff) links above to see all the changes relative to each parent.
Message: Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc
File: 1 edited
  • doc/theses/thierry_delisle_PhD/thesis/text/existing.tex

--- r7d9598d8
+++ r2dcd80a

  As stated, scheduling is the process of assigning resources to incoming requests, where the common example is assigning available workers to work requests or vice versa.
  Common scheduling examples in Computer Science are: operating systems and hypervisors schedule available CPUs, NICs schedule available bandwidth, virtual memory and memory allocator schedule available storage, \etc.
- Scheduling is also common in most other fields, \eg in assembly lines, assigning parts to line workers is a form of scheduling.
+ Scheduling is also common in most other fields; \eg in assembly lines, assigning parts to line workers is a form of scheduling.

  In general, \emph{selecting} a scheduling algorithm depends on how much information is available to the scheduler.
  …
  A secondary aspect is how much information can be gathered versus how much information must be given as part of the scheduler input.
  This information adds to the spectrum of scheduling algorithms, going from static schedulers that are well informed from the start, to schedulers that gather most of the information needed, to schedulers that can only rely on very limited information.
- Note, this description includes both information about each request, \eg time to complete or resources needed, and information about the relationships among requests, \eg whether some requests must be completed before another request starts.
-
- Scheduling physical resources, \eg in an assembly line, is generally amenable to using well-informed scheduling since information can be gathered much faster than the physical resources can be assigned and workloads are likely to stay stable for long periods.
- When a faster pace is needed and changes are much more frequent gathering information on workloads, up-front or live, can become much more limiting and more general schedulers are needed.
+ This description includes both information about each request, \eg time to complete or resources needed, and information about the relationships among requests, \eg whether some requests must be completed before another request starts.
+
+ Scheduling physical resources, \eg in an assembly line, is generally amenable to using well-informed scheduling since information can be gathered much faster than the physical resources can be assigned, and workloads are likely to stay stable for long periods.
+ When a faster pace is needed and changes are much more frequent, then gathering information on workloads, up-front or live, can become much more limiting and more general schedulers are needed.

  \section{Naming Convention}
- Scheduling has been studied by various communities concentrating on different incarnations of the same problems.
+ Scheduling has been studied by various communities, concentrating on different incarnations of the same problems.
  As a result, there are no standard naming conventions for scheduling that are respected across these communities.
  This document uses the term \newterm{\Gls{at}} to refer to the abstract objects being scheduled and the term \newterm{\Gls{proc}} to refer to the concrete objects executing these \ats.

  \section{Static Scheduling}
- \newterm{Static schedulers} require \ats dependencies and costs to be explicitly and exhaustively specified prior to scheduling.
+ \newterm{Static schedulers} require \at dependencies and costs to be explicitly and exhaustively specified prior to scheduling.
  The scheduler then processes this input ahead of time and produces a \newterm{schedule} the system follows during execution.
  This approach is popular in real-time systems since the need for strong guarantees justifies the cost of determining and supplying this information.
     

  \section{Dynamic Scheduling}
- \newterm{Dynamic schedulers} determine \at dependencies and costs during scheduling, if at all.
- Hence, unlike static scheduling, \at dependencies are conditional and detected at runtime.
- This detection takes the form of observing new \ats in the system and determining dependencies from their behaviour, including suspending or halting a \at that dynamically detects unfulfilled dependencies.
+ \newterm{Dynamic schedulers} detect \at dependencies and costs during scheduling, if at all.
+ This detection takes the form of observing new \ats in the system and determining dependencies from their behaviour, where a \at suspends or halts dynamically when it detects unfulfilled dependencies.
  Furthermore, each \at has the responsibility of adding dependent \ats back into the system once dependencies are fulfilled.
  As a consequence, the scheduler often has an incomplete view of the system, seeing only \ats with no pending dependencies.
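The suspend-on-unfulfilled-dependency behaviour described above can be sketched with ordinary channel synchronization. The task names below are invented; this is only an illustration of the pattern, not any particular runtime's machinery.

```go
package main

import (
	"fmt"
	"sync"
)

// Dynamic-scheduling sketch: a dependency is discovered only at run time.
// The consumer suspends (blocks) when it detects the unfulfilled
// dependency; the producer, by fulfilling it, effectively adds the
// dependent task back into the system.
func runTasks() []string {
	var (
		order []string
		mu    sync.Mutex
		wg    sync.WaitGroup
	)
	dataReady := make(chan struct{}) // the dynamically detected dependency
	record := func(name string) {
		mu.Lock()
		order = append(order, name)
		mu.Unlock()
	}

	wg.Add(2)
	go func() { // consumer: suspends until the dependency is fulfilled
		defer wg.Done()
		<-dataReady // the scheduler sees this task again only once unblocked
		record("consumer")
	}()
	go func() { // producer: fulfilling the dependency resumes dependents
		defer wg.Done()
		record("producer")
		close(dataReady)
	}()
	wg.Wait()
	return order
}

func main() {
	fmt.Println(runTasks()) // [producer consumer]
}
```

While the consumer is blocked, the scheduler's view of the system contains only the producer — the "incomplete view" the text describes.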
     
  \subsection{Explicitly Informed Dynamic Schedulers}
  While dynamic schedulers may not have an exhaustive list of dependencies for a \at, some information may be available about each \at, \eg expected duration, required resources, relative importance, \etc.
- When available, a scheduler can then use this information to direct the scheduling decisions.
+ When available, a scheduler can then use this information to direct scheduling decisions.
  For example, when scheduling in a cloud computing context, \ats will commonly have extra information that was manually entered, \eg caps on compute time or \io usage.
  However, in the context of user-level threading, most programmers do not determine or even \emph{predict} this information;
     

  \subsubsection{Priority Scheduling}
- Common information used by schedulers to direct their algorithm is priorities.
+ A common approach to direct the scheduling algorithm is to add information about \at priority.
  Each \at is given a priority, and higher-priority \ats are preferred to lower-priority ones.
  The simplest priority scheduling algorithm is to require that every \at have a distinct pre-established priority and always run the available \ats with the highest priority.
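The simplest algorithm named above — distinct pre-established priorities, always run the highest-priority ready \at — amounts to one scan over the ready set. The sketch below uses invented task names and priorities and is not drawn from any particular scheduler's code.

```go
package main

import "fmt"

// Simplest priority scheduling: every task has a distinct,
// pre-established priority, and the scheduler always selects the
// ready task with the highest priority.
type at struct {
	name     string
	priority int // higher runs first; priorities are distinct
}

// next selects the highest-priority ready task.
func next(ready []at) at {
	best := ready[0]
	for _, t := range ready[1:] {
		if t.priority > best.priority {
			best = t
		}
	}
	return best
}

func main() {
	ready := []at{{"logger", 1}, {"ui", 5}, {"worker", 3}}
	fmt.Println(next(ready).name) // ui
}
```

A real implementation would keep the ready set in a priority queue rather than scanning, but the selection rule is the same.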
     
  \paragraph{Task Placement} Another aspect of work stealing that has been studied extensively is the mapping between \at and \proc.
  In its simplest form, work stealing assumes that all \procs are interchangeable and therefore the mapping between \at and \proc is not interesting.
- However, in real-life architectures there are contexts where different \procs can have different characteristics, which makes some mapping more interesting than others.
+ However, in real-life architectures, there are contexts where different \procs can have different characteristics, which makes some mappings more interesting than others.
  A common example where this is statically true is architectures with \glsxtrshort{numa}.
  In these cases, it can be relevant to change the scheduler to be cognizant of the topology~\cite{vikranth2013topology,min2011hierarchical}.
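One way a work-stealing scheduler can be made cognizant of \glsxtrshort{numa} topology is to order steal victims so that same-node \procs are tried first, keeping migrations cheap. The sketch below is a hypothetical illustration of that idea, not code from the cited papers; the node layout is invented.

```go
package main

import "fmt"

// NUMA-aware victim selection sketch: when stealing, prefer victims on
// the thief's own NUMA node, so stolen work stays close to its data.
type proc struct {
	id, node int
	queue    []string // run-queue contents (unused here; for illustration)
}

// victims orders candidate procs: same-node first, then the rest.
func victims(thief proc, all []proc) []proc {
	var near, far []proc
	for _, p := range all {
		if p.id == thief.id {
			continue
		}
		if p.node == thief.node {
			near = append(near, p)
		} else {
			far = append(far, p)
		}
	}
	return append(near, far...)
}

func main() {
	all := []proc{
		{0, 0, nil}, {1, 0, []string{"a"}}, {2, 1, []string{"b"}},
	}
	for _, v := range victims(all[0], all) {
		fmt.Println(v.id) // 1 (same node) before 2 (remote node)
	}
}
```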
     
  \paragraph{Complex Machine Architecture} Another aspect that has been examined is how applicable work stealing is to different machine architectures.
  This is arguably strongly related to Task Placement but extends into more heterogeneous architectures.
- As \CFA offers no particular support for heterogeneous architecture, this is also an area that is less relevant to this thesis.
- Although it could be an interesting avenue for future work.
+ As \CFA offers no particular support for heterogeneous architectures, this is also an area that is not examined in this thesis.
+ However, support for concurrency across heterogeneous architectures is an interesting avenue for future work, at which point the literature on this topic and how it relates to scheduling will become relevant.

  \subsection{Theoretical Results}
  There is also a large body of research on the theoretical aspects of work stealing. These evaluate, for example, the cost of \glslink{atmig}{migration}~\cite{DBLP:conf/sigmetrics/SquillanteN91,DBLP:journals/pe/EagerLZ86}, how affinity affects performance~\cite{DBLP:journals/tpds/SquillanteL93,DBLP:journals/mst/AcarBB02,DBLP:journals/ipl/SuksompongLS16} and theoretical models for heterogeneous systems~\cite{DBLP:journals/jpdc/MirchandaneyTS90,DBLP:journals/mst/BenderR02,DBLP:conf/sigmetrics/GastG10}.
- \cite{DBLP:journals/jacm/BlellochGM99} examines the space bounds of work stealing and \cite{DBLP:journals/siamcomp/BerenbrinkFG03} shows that for under-loaded systems, the scheduler completes its computations in finite time, \ie is \newterm{stable}.
+ Blelloch et al.~\cite{DBLP:journals/jacm/BlellochGM99} examine the space bounds of work stealing and \cite{DBLP:journals/siamcomp/BerenbrinkFG03} shows that for under-loaded systems, the scheduler completes its computations in finite time, \ie is \newterm{stable}.
  Others show that work stealing applies to various scheduling contexts~\cite{DBLP:journals/mst/AroraBP01,DBLP:journals/anor/TchiboukdjianGT13,DBLP:conf/isaac/TchiboukdjianGTRB10,DBLP:conf/ppopp/AgrawalLS10,DBLP:conf/spaa/AgrawalFLSSU14}.
  \cite{DBLP:conf/ipps/ColeR13} also studied how randomized work-stealing affects false sharing among \ats.
     
  Preemption is the idea of interrupting \ats that have been running too long, effectively injecting suspend points into the application.
  There are multiple techniques to achieve this effect, but they all aim to guarantee that the suspend points in a \at are never further apart than some fixed duration.
- While this helps schedulers guarantee that no \ats unfairly monopolize a worker, preemption can effectively be added to any scheduler.
+ This helps schedulers guarantee that no \ats unfairly monopolize a worker.
+ Preemption can effectively be added to any scheduler.
  Therefore, the only interesting aspect of preemption for the design of scheduling is whether to require it.
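The "suspend points never further apart than some fixed duration" guarantee can be sketched with a work budget standing in for the duration. This is only an illustration of the injection idea — real implementations use timers, signals, or checks inserted at code boundaries — and the budget value is arbitrary.

```go
package main

import (
	"fmt"
	"runtime"
)

// Preemption sketch: suspend points are injected into a long-running
// task so that yields are never further apart than a fixed budget of
// work (a stand-in for the fixed duration in the text).
const budget = 1000 // work units between injected suspend points

// run executes `work` units, yielding at every injected suspend point,
// and reports how many times it yielded.
func run(work int) (yields int) {
	credit := budget
	for i := 0; i < work; i++ {
		credit-- // one unit of uninterruptible work
		if credit == 0 { // injected suspend point
			runtime.Gosched() // give the scheduler a chance to run others
			credit = budget
			yields++
		}
	}
	return yields
}

func main() {
	fmt.Println(run(3500)) // 3: a yield after every 1000 units
}
```

Because the check is independent of what the task computes, the same mechanism can be bolted onto any scheduler, as the text notes.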

  …
  This section presents a quick overview of several current schedulers.
  While these schedulers do not necessarily represent the most recent advances in scheduling, they are what is generally accessible to programmers.
- As such, I believe these schedulers are at least as relevant as those presented in published work.
- Both Schedulers that operate in kernel space and user space are considered, as both can offer relevant insight for this project.
- However, real-time schedulers are not considered, as these have constraints that are much stricter than what is needed for this project.
+ As such, I believe these schedulers are as relevant as those presented in published work.
+ Both schedulers that operate in kernel space and user space are considered, as both can offer relevant insight for this project.
+ However, real-time schedulers aim to guarantee bounded compute time in order to meet deadlines.
+ These deadlines lead to constraints much stricter than the starvation freedom that is needed for this project.
+ As such, real-time schedulers are not considered for this work.
  \subsection{Operating System Schedulers}
- Operating System Schedulers tend to be fairly complex as they generally support some amount of real time, aim to balance interactive and non-interactive \ats and support multiple users sharing hardware without requiring these users to cooperate.
+ Operating System Schedulers tend to be fairly complex, as they generally support some amount of real time, aim to balance interactive and non-interactive \ats and support multiple users sharing hardware without requiring these users to cooperate.
  Here are more details on a few schedulers used in the common operating systems: Linux, FreeBSD, Microsoft Windows and Apple's OS X.
  The information is less complete for closed source operating systems.
     
  It also periodically balances the load of the system (according to a different heuristic) but uses a simpler work stealing approach.

- \paragraph{Windows(OS)}
+ \paragraph{Windows (OS)}
  Microsoft's Operating System's Scheduler~\cite{MAN:windows/scheduler} is a feedback scheduler with priorities.
  It supports 32 levels of priorities, some of which are reserved for real-time and privileged applications.
  …
  The scheduler may also temporarily adjust priorities after certain effects like the completion of I/O requests.

- In~\cite{russinovich2009windows}, Chapter 1 section 2.3 ``Processes, Threads, and Jobs'' discusses the scheduling policy more in-depth.
+ The scheduling policy is discussed in more depth in~\cite{russinovich2009windows}, Chapter 1 section 2.3 ``Processes, Threads, and Jobs''.
  Multicore scheduling is based on a combination of priorities and \proc preference.
  Each \at is assigned an initial processor using a round-robin policy, called the \at's \newterm{ideal} \proc.
     
  \paragraph{Go}\label{GoSafePoint}
  Go's scheduler uses a randomized work-stealing algorithm that has a global run-queue (\emph{GRQ}) and each processor (\emph{P}) has both a fixed-size run-queue (\emph{LRQ}) and a high-priority next ``chair'' holding a single element~\cite{GITHUB:go,YTUBE:go}.
- Preemption is present, but only at safe points,~\cite{go:safepoints} which are detection code inserted at various frequent access boundaries.
+ Preemption is present, but only at safe points~\cite{go:safepoints}, which are detection code inserted at various frequent access boundaries.

  The algorithm is as follows:
  \begin{enumerate}
  	\item Once out of 61 times, pick 1 element from the \emph{GRQ}.
- 	\item If there is an item in the ``chair'' pick it.
+ 	\item Otherwise, if there is an item in the ``chair'' pick it.
  	\item Else pick an item from the \emph{LRQ}.
  	\begin{itemize}
  	…
  	\end{itemize}
  \end{enumerate}
+
+ Chapter~\ref{microbench} uses Go as one of its comparison points in this thesis's performance evaluation.

  \paragraph{Erlang}
  …
  \paragraph{LibFibre}
  LibFibre~\cite{DBLP:journals/pomacs/KarstenB20} is a lightweight user-level threading framework developed at the University of Waterloo.
- Similarly to Go, it uses a variation of work stealing with a global queue that has a higher priority than stealing.
- Unlike Go, it does not have the high-priority next ``chair'' and does not use randomized work-stealing.
+ It bears a strong resemblance to Go: it uses a variation of work stealing with a global queue that has a higher priority than stealing.
+ Unlike Go, it does not have the high-priority next ``chair'' and its work-stealing is not randomized.
+
+ Chapter~\ref{microbench} uses LibFibre as one of its comparison points in this thesis's performance evaluation.