Changeset 918e0137


Ignore:
Timestamp:
Sep 8, 2022, 5:11:19 PM (20 months ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, ast-experimental, master, pthread-emulation
Children:
264f6c9
Parents:
e4855f6
Message:

First pass at spellchecking until chapter 2

Location:
doc/theses/thierry_delisle_PhD/thesis/text
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/thierry_delisle_PhD/thesis/text/conclusion.tex

    re4855f6 r918e0137  
    99However, I discovered two significant challenges.
    1010
    11 First, modern symmetric multiprocessing CPU have significant performance penalties for communication (often cache related).
     11First, modern symmetric multiprocessing CPU have significant performance penalties for communicationm, often cache related.
    1212A SQMS scheduler (see Section~\ref{sched}), with its \proc-shared ready-queue, has perfect load-balancing but poor affinity resulting in high communication across \procs.
    1313A MQMS scheduler, with its \proc-specific ready-queues, has poor load-balancing but perfect affinity often resulting in significantly reduced communication.
    1414However, implementing fairness for an MQMS scheduler is difficult, since fairness requires \procs to be aware of each other's ready-queue progress, \ie communicated knowledge.
    15 For balanced workloads with little or no data sharing (embarrassingly parallel), an MQMS scheduler, \eg a state-of-the-art work-stealing scheduler, is near optimal.
     15For balanced workloads with little or no data sharing, \ie embarrassingly parallel, an MQMS scheduler is near optimal, \eg a state-of-the-art work-stealing scheduler.
    1616For these kinds of fair workloads, adding fairness must be low-cost to hide the communication costs needed for global ready-queue progress or performance suffers.
    1717While I was aware of these realities, I underestimated how little performance margin there is for communication.
  • doc/theses/thierry_delisle_PhD/thesis/text/existing.tex

    re4855f6 r918e0137  
    88A secondary aspect is how much information can be gathered versus how much information must be given as part of the scheduler input.
    99This information adds to the spectrum of scheduling algorithms, going from static schedulers that are well-informed from the start, to schedulers that gather most of the information needed, to schedulers that can only rely on very limited information.
    10 Note, this description includes both information about each request, \eg time to complete or resources needed, and information about the relationships among request, \eg whether some request must be completed before another request starts.
    11 
    12 Scheduling physical resources, \eg in an assembly line, is generally amenable to using well-informed scheduling, since information can be gathered much faster than the physical resources can be assigned and workloads are likely to stay stable for long periods of time.
     10Note, this description includes both information about each request, \eg time to complete or resources needed, and information about the relationships among requests, \eg whether some requests must be completed before another request starts.
     11
     12Scheduling physical resources, \eg in an assembly line, is generally amenable to using well-informed scheduling since information can be gathered much faster than the physical resources can be assigned and workloads are likely to stay stable for long periods.
    1313When a faster pace is needed and changes are much more frequent gathering information on workloads, up-front or live, can become much more limiting and more general schedulers are needed.
    1414
    1515\section{Naming Convention}
    16 Scheduling has been studied by various communities concentrating on different incarnation of the same problems.
    17 As a result, there are no standard naming conventions for scheduling that is respected across these communities.
     16Scheduling has been studied by various communities concentrating on different incarnations of the same problems.
     17As a result, there are no standard naming conventions for scheduling that are respected across these communities.
    1818This document uses the term \newterm{\Gls{at}} to refer to the abstract objects being scheduled and the term \newterm{\Gls{proc}} to refer to the concrete objects executing these \ats.
    1919
    2020\section{Static Scheduling}
    21 \newterm{Static schedulers} require \ats dependencies and costs be explicitly and exhaustively specified prior to scheduling.
     21\newterm{Static schedulers} require \ats dependencies and costs to be explicitly and exhaustively specified prior to scheduling.
    2222The scheduler then processes this input ahead of time and produces a \newterm{schedule} the system follows during execution.
    2323This approach is popular in real-time systems since the need for strong guarantees justifies the cost of determining and supplying this information.
    24 In general, static schedulers are less relevant to this project because they require input from the programmers that the programming language does not have as part of its concurrency semantic.
     24In general, static schedulers are less relevant to this project because they require input from the programmers that the programming language does not have as part of its concurrency semantics.
    2525Specifying this information explicitly adds a significant burden to the programmer and reduces flexibility.
    2626For this reason, the \CFA scheduler does not require this information.
    2727
    2828\section{Dynamic Scheduling}
    29 \newterm{Dynamic schedulers} determine \ats dependencies and costs during scheduling, if at all.
    30 Hence, unlike static scheduling, \ats dependencies are conditional and detected at runtime.
     29\newterm{Dynamic schedulers} determine \at dependencies and costs during scheduling, if at all.
     30Hence, unlike static scheduling, \at dependencies are conditional and detected at runtime.
    3131This detection takes the form of observing new \ats in the system and determining dependencies from their behaviour, including suspending or halting a \at that dynamically detects unfulfilled dependencies.
    3232Furthermore, each \at has the responsibility of adding dependent \ats back into the system once dependencies are fulfilled.
     
    4040at best, the scheduler has only some imprecise information provided by the programmer, \eg, indicating a \at takes approximately 3--7 seconds to complete, rather than exactly 5 seconds.
    4141Providing this kind of information is a significant programmer burden, especially if the information does not scale with the number of \ats and their complexity.
    42 For example, providing an exhaustive list of files read by 5 \ats is an easier requirement then providing an exhaustive list of memory addresses accessed by 10,000 independent \ats.
     42For example, providing an exhaustive list of files read by 5 \ats is an easier requirement than providing an exhaustive list of memory addresses accessed by 10,000 independent \ats.
    4343
    4444Since the goal of this thesis is to provide a scheduler as a replacement for \CFA's existing \emph{uninformed} scheduler, explicitly informed schedulers are less relevant to this project. Nevertheless, some strategies are worth mentioning.
     
    4949The simplest priority scheduling algorithm is to require that every \at have a distinct pre-established priority and always run the available \ats with the highest priority.
    5050Asking programmers to provide an exhaustive set of unique priorities can be prohibitive when the system has a large number of \ats.
    51 It can therefore be desirable for schedulers to support \ats with identical priorities and/or automatically setting and adjusting priorities for \ats.
     51It can therefore be desirable for schedulers to support \ats with identical priorities and/or automatically set and adjust priorities for \ats.
    5252Most common operating systems use some variant on priorities with overlaps and dynamic priority adjustments.
    5353For example, Microsoft Windows uses a pair of priorities~\cite{win:priority}, one specified by users out of ten possible options and one adjusted by the system.
    5454
    5555\subsection{Uninformed and Self-Informed Dynamic Schedulers}
    56 Several scheduling algorithms do not require programmers to provide additional information on each \at, and instead make scheduling decisions based solely on internal state and/or information implicitly gathered by the scheduler.
     56Several scheduling algorithms do not require programmers to provide additional information on each \at, and instead, make scheduling decisions based solely on internal state and/or information implicitly gathered by the scheduler.
    5757
    5858
     
    6161This design effectively moves the scheduler into the realm of \newterm{Control Theory}~\cite{wiki:controltheory}.
    6262This information gathering does not generally involve programmers, and as such, does not increase programmer burden the same way explicitly provided information may.
    63 However, some feedback schedulers do allow programmers to offer additional information on certain \ats, in order to direct scheduling decisions.
    64 The important distinction being whether the scheduler can function without this additional information.
     63However, some feedback schedulers do allow programmers to offer additional information on certain \ats, to direct scheduling decisions.
     64The important distinction is whether the scheduler can function without this additional information.
    6565
    6666
    6767\section{Work Stealing}\label{existing:workstealing}
    68 One of the most popular scheduling algorithm in practice (see~\ref{existing:prod}) is work stealing.
    69 This idea, introduced by \cite{DBLP:conf/fpca/BurtonS81}, effectively has each worker process its local \ats first, but allows the possibility for other workers to steal local \ats if they run out of \ats.
     68One of the most popular scheduling algorithms in practice (see~\ref{existing:prod}) is work stealing.
     69This idea, introduced by \cite{DBLP:conf/fpca/BurtonS81}, effectively has each worker process its local \ats first but allows the possibility for other workers to steal local \ats if they run out of \ats.
    7070\cite{DBLP:conf/focs/Blumofe94} introduced the more familiar incarnation of this, where each worker has a queue of \ats and workers without \ats steal \ats from random workers\footnote{The Burton and Sleep algorithm has trees of \ats and steals only among neighbours.}.
    71 Blumofe and Leiserson also prove worst case space and time requirements for well-structured computations.
     71Blumofe and Leiserson also prove worst-case space and time requirements for well-structured computations.
    7272
    7373Many variations of this algorithm have been proposed over the years~\cite{DBLP:journals/ijpp/YangH18}, both optimizations of existing implementations and approaches that account for new metrics.
     
    8686Another example is energy usage, where the scheduler is modified to optimize for energy efficiency in addition/instead of performance~\cite{ribic2014energy,torng2016asymmetry}.
    8787
    88 \paragraph{Complex Machine Architecture} Another aspect that has been examined is how well work stealing is applicable to different machine architectures.
    89 This is arguably strongly related to Task Placement, but extends into more heterogeneous architectures.
     88\paragraph{Complex Machine Architecture} Another aspect that has been examined is how applicable work stealing is to different machine architectures.
     89This is arguably strongly related to Task Placement but extends into more heterogeneous architectures.
    9090As \CFA offers no particular support for heterogeneous architecture, this is also an area that is less relevant to this thesis.
    9191Although it could be an interesting avenue for future work.
     
    9494There is also a large body of research on the theoretical aspects of work stealing. These evaluate, for example, the cost of \glslink{atmig}{migration}~\cite{DBLP:conf/sigmetrics/SquillanteN91,DBLP:journals/pe/EagerLZ86}, how affinity affects performance~\cite{DBLP:journals/tpds/SquillanteL93,DBLP:journals/mst/AcarBB02,DBLP:journals/ipl/SuksompongLS16} and theoretical models for heterogeneous systems~\cite{DBLP:journals/jpdc/MirchandaneyTS90,DBLP:journals/mst/BenderR02,DBLP:conf/sigmetrics/GastG10}.
    9595\cite{DBLP:journals/jacm/BlellochGM99} examines the space bounds of work stealing and \cite{DBLP:journals/siamcomp/BerenbrinkFG03} shows that for under-loaded systems, the scheduler completes its computations in finite time, \ie is \newterm{stable}.
    96 Others show that work stealing is applicable to various scheduling contexts~\cite{DBLP:journals/mst/AroraBP01,DBLP:journals/anor/TchiboukdjianGT13,DBLP:conf/isaac/TchiboukdjianGTRB10,DBLP:conf/ppopp/AgrawalLS10,DBLP:conf/spaa/AgrawalFLSSU14}.
     96Others show that work stealing applies to various scheduling contexts~\cite{DBLP:journals/mst/AroraBP01,DBLP:journals/anor/TchiboukdjianGT13,DBLP:conf/isaac/TchiboukdjianGTRB10,DBLP:conf/ppopp/AgrawalLS10,DBLP:conf/spaa/AgrawalFLSSU14}.
    9797\cite{DBLP:conf/ipps/ColeR13} also studied how randomized work-stealing affects false sharing among \ats.
    9898
     
    101101
    102102\section{Preemption}
    103 One last aspect of scheduling is preemption, since many schedulers rely on it for some of their guarantees.
     103One last aspect of scheduling is preemption since many schedulers rely on it for some of their guarantees.
    104104Preemption is the idea of interrupting \ats that have been running too long, effectively injecting suspend points into the application.
    105 There are multiple techniques to achieve this effect, but they all aim to guarantee that the suspend points in a \ats are never further apart than some fixed duration.
    106 While this helps schedulers guarantee that no \ats unfairly monopolizes a worker, preemption can effectively be added to any scheduler.
     105There are multiple techniques to achieve this effect, but they all aim to guarantee that the suspend points in a \at are never further apart than some fixed duration.
     106While this helps schedulers guarantee that no \ats unfairly monopolize a worker, preemption can effectively be added to any scheduler.
    107107Therefore, the only interesting aspect of preemption for the design of scheduling is whether to require it.
    108108
     
    111111While these schedulers do not necessarily represent the most recent advances in scheduling, they are what is generally accessible to programmers.
    112112As such, I believe these schedulers are at least as relevant as those presented in published work.
    113 Schedulers that operate in kernel space and in user space are considered, as both can offer relevant insight for this project.
     113Both Schedulers that operate in kernel space and user space are considered, as both can offer relevant insight for this project.
    114114However, real-time schedulers are not considered, as these have constraints that are much stricter than what is needed for this project.
    115115
     
    117117Operating System Schedulers tend to be fairly complex as they generally support some amount of real-time, aim to balance interactive and non-interactive \ats and support multiple users sharing hardware without requiring these users to cooperate.
    118118Here are more details on a few schedulers used in the common operating systems: Linux, FreeBSD, Microsoft Windows and Apple's OS X.
    119 The information is less complete for operating systems with closed source.
     119The information is less complete for closed source operating systems.
    120120
    121121\paragraph{Linux's CFS}
     
    124124The \at that has used the least CPU time is scheduled.
    125125It also supports the concept of \newterm{Nice values}, which are effectively multiplicative factors on the CPU time used.
    126 The ordering of \ats is also affected by a group based notion of fairness, where \ats belonging to groups having used less CPU time are preferred to \ats belonging to groups having used more CPU time.
     126The ordering of \ats is also affected by a group-based notion of fairness, where \ats belonging to groups having used less CPU time are preferred to \ats belonging to groups having used more CPU time.
    127127Linux achieves load-balancing by regularly monitoring the system state~\cite{MAN:linux/cfs/balancing} and using some heuristic on the \gls{load}, currently CPU time used in the last millisecond plus a decayed version of the previous time slots~\cite{MAN:linux/cfs/pelt}.
    128128
    129129\cite{DBLP:conf/eurosys/LoziLFGQF16} shows that Linux's CFS also does work stealing to balance the workload of each \proc, but the paper argues this aspect can be improved significantly.
    130 The issues highlighted stem from Linux's need to support fairness across \ats \emph{and} across users\footnote{Enforcing fairness across users means that given two users, one with a single \at and the other with one thousand \ats, the user with a single \at does not receive one thousandth of the CPU time.}, increasing the complexity.
     130The issues highlighted stem from Linux's need to support fairness across \ats \emph{and} across users\footnote{Enforcing fairness across users means that given two users, one with a single \at and the other with one thousand \ats, the user with a single \at does not receive one-thousandth of the CPU time.}, increasing the complexity.
    131131
    132132Linux also offers a FIFO scheduler, a real-time scheduler, which runs the highest-priority \ats, and a round-robin scheduler, which is an extension of the FIFO-scheduler that adds fixed time slices. \cite{MAN:linux/sched}
     
    135135The ULE scheduler used in FreeBSD\cite{DBLP:conf/bsdcon/Roberson03} is a feedback scheduler similar to Linux's CFS.
    136136It uses different data structures and heuristics but also schedules according to some combination of CPU time used and niceness values.
    137 It also periodically balances the load of the system (according to a different heuristic), but uses a simpler work stealing approach.
     137It also periodically balances the load of the system (according to a different heuristic) but uses a simpler work stealing approach.
    138138
    139139\paragraph{Windows(OS)}
     
    143143The scheduler may also temporarily adjust priorities after certain effects like the completion of I/O requests.
    144144
    145 In~\cite{russinovich2009windows}, Chapter 1 section 2.3 ``Processes, Threads, and Jobs'' discusses the scheduling policy more in depth.
    146 Multicore scheduling is based on a combination of priorities and preferred \proc.
     145In~\cite{russinovich2009windows}, Chapter 1 section 2.3 ``Processes, Threads, and Jobs'' discusses the scheduling policy more in-depth.
     146Multicore scheduling is based on a combination of priorities and \proc preferance.
    147147Each \at is assigned an initial processor using a round-robin policy, called the \at's \newterm{ideal} \proc.
    148148\Glspl{at} are distributed among the \procs according to their priority, preferring to match \ats to their ideal \proc and then to the last \proc they ran on.
    149 This approach is a variation of work stealing, where the stealing \proc restore the \at to its original \proc after running it, but mixed with priorities.
     149This approach is a variation of work stealing, where the stealing \proc restores the \at to its original \proc after running it, but mixed with priorities.
    150150
    151151\paragraph{Apple OS X}
     
    160160
    161161There is very little documentation on the internals of this scheduler.
    162 However, the documentation do describe a feature set that is very similar to the Windows and Linux OS schedulers.
     162However, the documentation does describe a feature set that is very similar to the Windows and Linux OS schedulers.
    163163Presumably, this means that the internals are also fairly similar overall.
    164164
    165165\subsection{User-Level Schedulers}
    166 By comparison, user level schedulers tend to be simpler, gathering fewer metrics and avoid complex notions of fairness. Part of the simplicity is due to the fact that all \ats have the same user, and therefore cooperation is both feasible and probable.
     166By comparison, user-level schedulers tend to be simpler, gather fewer metrics and avoid complex notions of fairness. Part of the simplicity is due to the fact that all \ats have the same user, and therefore cooperation is both feasible and probable.
    167167
    168168\paragraph{Go}\label{GoSafePoint}
    169169Go's scheduler uses a randomized work-stealing algorithm that has a global run-queue (\emph{GRQ}) and each processor (\emph{P}) has both a fixed-size run-queue (\emph{LRQ}) and a high-priority next ``chair'' holding a single element~\cite{GITHUB:go,YTUBE:go}.
    170 Preemption is present, but only at safe-points,~\cite{go:safepoints} which are inserted detection code at various frequent access boundaries.
     170Preemption is present, but only at safe points,~\cite{go:safepoints} which are detection code inserted at various frequent access boundaries.
    171171
    172172The algorithm is as follows :
     
    184184Erlang is a functional language that supports concurrency in the form of processes: threads that share no data.
    185185It uses a kind of round-robin scheduler, with a mix of work sharing and stealing to achieve load balancing~\cite{:erlang}, where under-loaded workers steal from other workers, but overloaded workers also push work to other workers.
    186 This \glslink{atmig}{migration} logic is directed by monitoring logic that evaluates the load a few times per seconds.
     186This \glslink{atmig}{migration} logic is directed by monitoring logic that evaluates the load a few times per second.
    187187
    188188\paragraph{Intel\textregistered ~Threading Building Blocks}
     
    196196                \item The successor of t if \textit{t} was its last completed predecessor.
    197197                \item A task popped from the end of the thread's own deque.
    198                 \item A task with affinity for the thread.
     198                \item A task with an affinity for the thread.
    199199                \item A task popped from approximately the beginning of the shared queue.
    200200                \item A task popped from the beginning of another randomly chosen thread's deque.
     
    217217While the documentation only gives limited insight into the scheduling and load balancing approach, \cite{apple:gcd2} suggests a fairly classic approach.
    218218Each \proc has a queue of \ats to run, called \newterm{blocks}, which are drained in \glsxtrshort{fifo}.
    219 GCD also has secondary queues, called \newterm{Dispatch Queues} with clear ordering, where executing a block ends-up scheduling more blocks.
     219GCD also has secondary queues, called \newterm{Dispatch Queues} with clear ordering, where executing a block ends up scheduling more blocks.
    220220In terms of semantics, these Dispatch Queues seem to be very similar to Intel\textregistered ~TBB \lstinline{execute()} and predecessor semantics.
    221221
     
    223223
    224224\paragraph{LibFibre}
    225 LibFibre~\cite{DBLP:journals/pomacs/KarstenB20} is a light-weight user-level threading framework developed at the University of Waterloo.
    226 Similarly to Go, it uses a variation of work stealing with a global queue that is higher priority than stealing.
     225LibFibre~\cite{DBLP:journals/pomacs/KarstenB20} is a lightweight user-level threading framework developed at the University of Waterloo.
     226Similarly to Go, it uses a variation of work stealing with a global queue that has a higher priority than stealing.
    227227Unlike Go, it does not have the high-priority next ``chair'' and does not use randomized work-stealing.
  • doc/theses/thierry_delisle_PhD/thesis/text/front.tex

    re4855f6 r918e0137  
    127127Indeed, over-partitioning into small work-units with user threading significantly eases load bal\-ancing, while simultaneously providing advanced synchronization and mutual exclusion capabilities.
    128128To manage these high levels of concurrency, the underlying runtime must efficiently schedule many user threads across a few kernel threads;
    129 which begs of the question of how many kernel threads are needed and should the number be dynamically reevaluated.
     129which begs the question of how many kernel threads are needed and should the number be dynamically reevaluated.
    130130Furthermore, scheduling must prevent kernel threads from blocking, otherwise user-thread parallelism drops.
    131 When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources.
     131When user-threading parallelism does drop, how and when should idle \glspl{kthrd} be put to sleep to avoid wasting CPU resources.
    132132Finally, the scheduling system must provide fairness to prevent a user thread from monopolizing a kernel thread;
    133 otherwise other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.
    134 
    135 This thesis analyses multiple scheduler systems, where each system attempts to fulfill the necessary requirements for user-level threading.
    136 The predominant technique for managing high levels of concurrency is sharding the ready-queue with one queue per kernel-thread and using some form of work stealing/sharing to dynamically rebalance workload shifts.
     133otherwise, other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.
     134
     135This thesis analyses multiple scheduler systems, where each system attempts to fulfill the requirements for user-level threading.
     136The predominant technique for managing high levels of concurrency is sharding the ready-queue with one queue per \gls{kthrd} and using some form of work stealing/sharing to dynamically rebalance workload shifts.
    137137Preventing kernel blocking is accomplished by transforming kernel locks and I/O operations into user-level operations that do not block the kernel thread or spin up new kernel threads to manage the blocking.
    138138Fairness is handled through preemption and/or ad-hoc solutions, which leads to coarse-grained fairness with some pathological cases.
  • doc/theses/thierry_delisle_PhD/thesis/text/intro.tex

    re4855f6 r918e0137  
    99When user-threading parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources.
    1010Finally, the scheduling system must provide fairness to prevent a user thread from monopolizing a kernel thread;
    11 otherwise other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.
     11otherwise, other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.
    1212
    13 This thesis analyses multiple scheduler systems, where each system attempts to fulfill the necessary requirements for \gls{uthrding}.
     13This thesis analyses multiple scheduler systems, where each system attempts to fulfill the requirements for \gls{uthrding}.
    1414The predominant technique for managing high levels of concurrency is sharding the ready-queue with one queue per kernel-thread and using some form of work stealing/sharing to dynamically rebalance workload shifts.
    1515Preventing kernel blocking is accomplished by transforming kernel locks and I/O operations into user-level operations that do not block the kernel thread or spin up new kernel threads to manage the blocking.
     
    2323Chapter~\ref{existing} discusses how scheduler implementations attempt to achieve these goals, but all implementations optimize some workloads better than others.
    2424Chapter~\ref{cfaruntime} presents the relevant aspects of the \CFA runtime system that have a significant effect on the new scheduler design and implementation.
    25 Chapter~\ref{core} analyses different scheduler approaches, while looking for scheduler mechanisms that provide both performance and fairness.
     25Chapter~\ref{core} analyses different scheduler approaches while looking for scheduler mechanisms that provide both performance and fairness.
    2626Chapter~\ref{userio} covers the complex mechanisms that must be used to achieve nonblocking I/O to prevent the blocking of \glspl{kthrd}.
    2727Chapter~\ref{practice} presents the mechanisms needed to adjust the amount of parallelism, both manually and automatically.
     
    3636For scheduled work-units, a scheduler takes a sequence of threads and attempts to run them to completion, subject to shared resource restrictions and utilization.
    3737A general-purpose dynamic-scheduler for an open system cannot anticipate work requests, so its performance is rarely optimal.
    38 Even with complete knowledge of arrive order and work, creating an optimal solution is a bin packing problem~\cite{wiki:binpak}.
     38Even with complete knowledge of arrival order and work, creating an optimal solution is a bin packing problem~\cite{wiki:binpak}.
    3939However, optimal solutions are often not required: schedulers often produce excellent solutions, without needing optimality, by taking advantage of regularities in work patterns.
    4040
     
    9191\newterm{contention}: safe access of shared objects by multiple processors requires mutual exclusion in some form, generally locking.\footnote{
    9292Lock-free data-structures do not involve locking but incur similar costs to achieve mutual exclusion.}
    93 Mutual exclusion cost and latency increases significantly with the number of processors access\-ing a shared object.
     93Mutual exclusion cost and latency increase significantly with the number of processors access\-ing a shared object.
    9494\end{enumerate}
    9595
     
    116116
    117117Since \CFA attempts to improve the safety and productivity of C, the new scheduler presented in this thesis attempts to achieve the same goals.
    118 More specifically, safety and productivity for scheduling means supporting a wide range of workloads so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
     118More specifically, safety and productivity for scheduling mean supporting a wide range of workloads so that programmers can rely on progress guarantees (safety) and more easily achieve acceptable performance (productivity).
    119119The new scheduler also includes support for implicit nonblocking \io, allowing applications to have more user-threads blocking on \io operations than there are \glspl{kthrd}.
    120120To complete the scheduler, an idle sleep mechanism is implemented that significantly reduces wasted CPU cycles, which are then available outside the application.
Note: See TracChangeset for help on using the changeset viewer.