Timestamp: Nov 24, 2022, 3:41:44 PM (20 months ago)
Author: Thierry Delisle <tdelisle@…>
Branches: ADT, ast-experimental, master
Children: dacd8e6e
Parents: 82a90d4
Message: Last corrections to my thesis... hopefully

File: 1 edited

  • doc/theses/thierry_delisle_PhD/thesis/text/practice.tex

--- doc/theses/thierry_delisle_PhD/thesis/text/practice.tex (r82a90d4)
+++ doc/theses/thierry_delisle_PhD/thesis/text/practice.tex (rddcaff6)
@@ -2,8 +2,8 @@
 The scheduling algorithm described in Chapter~\ref{core} addresses scheduling in a stable state.
 This chapter addresses problems that occur when the system state changes.
-Indeed the \CFA runtime, supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically.
+Indeed, the \CFA runtime supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically.
 These changes affect the scheduling algorithm, which must dynamically alter its behaviour.

-In detail, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios.
+Specifically, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios.
 \begin{cfa}
 {
     
@@ -26,13 +26,14 @@
 This requirement also means any references into these arrays, \eg pointers or indexes, may need to be updated if elements are moved for compaction or any other reason.

-There are no performance requirements, within reason, for resizing since it is expected to be rare.
-However, this operation has strict correctness requirements since updating and idle sleep can easily lead to deadlocks.
-It should also avoid as much as possible any effect on performance when the number of \procs remains constant.
-This later requirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays.
+There are no performance requirements, within reason, for the act of resizing itself, since it is expected to be rare.
+However, this operation has strict correctness requirements, since updating and idle sleep can easily lead to deadlocks.
+The resizing mechanism should also avoid, as much as possible, any effect on performance when the number of \procs remains constant.
+This last requirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays.

 \subsection{Read-Copy-Update}
 One solution is to use the Read-Copy-Update pattern~\cite{wiki:rcu}.
+This is a very common pattern that avoids large critical sections, which is why it is worth mentioning.
 In this pattern, resizing is done by creating a copy of the internal data structures, \eg see Figure~\ref{fig:base-ts2}, updating the copy with the desired changes, and then attempting an Indiana Jones Switch to replace the original with the copy.
-This approach has the advantage that it may not need any synchronization to do the switch.
+This approach has the advantage that it may not need any synchronization to do the switch, depending on how reclamation of the original copy is handled.
 However, there is a race where \procs still use the original data structure after the copy is switched.
 This race not only requires adding a memory-reclamation scheme, but it also requires that operations made on the stale original version are eventually moved to the copy.
     
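To make the copy-and-swap step in the hunk above concrete, the following is a minimal C sketch of an RCU-style resize, not the \CFA implementation: the type rq_array, the variable ready_queues, and the function resize_rcu are illustrative assumptions, and the grace-period reclamation the text calls for is deliberately left as a comment.

    #include <stdatomic.h>
    #include <stdlib.h>
    #include <string.h>

    // Hypothetical ready-queue array; the real CFA structure differs.
    struct rq_array {
        size_t nprocs;
        void * queues[];                     // one sub-queue per processor
    };

    // Readers load this pointer and use the array without taking a lock.
    static _Atomic(struct rq_array *) ready_queues;

    // Writer side of the RCU pattern: copy, update the copy, then switch.
    static void resize_rcu( size_t new_nprocs ) {
        struct rq_array * old  = atomic_load( &ready_queues );
        struct rq_array * copy = malloc( sizeof(*copy) + new_nprocs * sizeof(void *) );
        copy->nprocs = new_nprocs;

        // Update the copy with the desired changes.
        size_t keep = old->nprocs < new_nprocs ? old->nprocs : new_nprocs;
        memcpy( copy->queues, old->queues, keep * sizeof(void *) );
        for ( size_t i = keep; i < new_nprocs; i += 1 ) copy->queues[i] = NULL;

        // The switch: publish the copy in place of the original.
        atomic_store( &ready_queues, copy );

        // Race described in the text: some processors may still be using 'old'.
        // A real implementation needs a memory-reclamation scheme (grace period)
        // before freeing, and must migrate any work left in the stale original.
        // free( old );                      // unsafe here, intentionally omitted
    }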
@@ -63,5 +64,5 @@
 Acquiring all the local read-locks guarantees mutual exclusion among the readers and the writer, while the wait on the read side prevents readers from continuously starving the writer.
 Figure~\ref{f:SpecializedReadersWriterLock} shows the outline for this specialized readers-writer lock.
-The lock in nonblocking, so both readers and writers spin while the lock is held.
+The lock is nonblocking, so both readers and writers spin while the lock is held.
 This very wide sharding strategy means that readers have very good locality since they only ever need to access two memory locations.

     
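The specialized readers-writer lock referred to above can be approximated with the C sketch below. It is a simplified stand-in for Figure~\ref{f:SpecializedReadersWriterLock}, under stated assumptions: MAX_PROCS is fixed here although the real array of local locks is resizable, and all names are invented. It matches the visible description in that readers touch only two memory locations, both sides spin, and a waiting writer is not starved.

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_PROCS 128                      // fixed here; resizable in reality

    // One local read-lock per processor, padded so each sits in its own cache line.
    struct local_lock {
        atomic_bool held;
        char pad[64 - sizeof(atomic_bool)];
    };

    static struct local_lock locals[MAX_PROCS];
    static atomic_bool       writer_waiting;   // global flag, the second location readers touch

    // Reader: only two memory locations are accessed, its own lock and the global flag.
    static void read_lock( size_t id ) {
        for ( ;; ) {
            while ( atomic_exchange( &locals[id].held, true ) ) {} // take own local lock
            if ( ! atomic_load( &writer_waiting ) ) return;        // no writer: proceed
            atomic_store( &locals[id].held, false );               // writer waiting: back off
            while ( atomic_load( &writer_waiting ) ) {}            // wait so the writer is not starved
        }
    }

    static void read_unlock( size_t id ) {
        atomic_store( &locals[id].held, false );
    }

    // Writer: announce itself, then acquire every local read-lock in turn.
    static void write_lock( void ) {
        while ( atomic_exchange( &writer_waiting, true ) ) {}      // one writer at a time
        for ( size_t i = 0; i < MAX_PROCS; i += 1 )
            while ( atomic_exchange( &locals[i].held, true ) ) {}  // lock out reader i
    }

    static void write_unlock( void ) {
        for ( size_t i = 0; i < MAX_PROCS; i += 1 )
            atomic_store( &locals[i].held, false );
        atomic_store( &writer_waiting, false );
    }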
@@ -98,5 +99,5 @@
 \section{Idle-Sleep}\label{idlesleep}
 While manual resizing of \procs is expected to be rare, the number of \ats can vary significantly over an application's lifetime, which means there are times when there are too few or too many \procs.
-For this work, it is the programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.
+For this work, it is the application programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.
 This leaves too many \procs when there are not enough \ats for all the \procs to be useful.
 These idle \procs cannot be removed because their lifetime is controlled by the application, and only the application knows when the number of \ats may increase or decrease.
     
@@ -108,5 +109,5 @@
 Because idle sleep is spurious, this data structure has strict performance requirements, in addition to strict correctness requirements.
 Next, some mechanism is needed to block \glspl{kthrd}, \eg @pthread_cond_wait@ or a pthread semaphore.
-The complexity here is to support \at \glslink{atblock}{parking} and \glslink{atsched}{unparking}, user-level locking, timers, \io operations, and all other \CFA features with minimal complexity.
+The challenge here is to support user-level locking, timers, \io operations, and all other \CFA features with minimal complexity.
 Finally, the scheduler needs a heuristic to determine when to block and unblock an appropriate number of \procs.
 However, this third challenge is outside the scope of this thesis because developing a general heuristic is complex enough to justify its own work.
     
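For reference, the guarded block/unblock pattern behind the @pthread_cond_wait@ option mentioned above looks like the following C sketch. The struct and function names are assumptions for illustration; as Figure~\ref{fig:idle1} later shows, the \CFA runtime blocks on a per-\proc event fd rather than a condition variable.

    #include <pthread.h>
    #include <stdbool.h>

    // Hypothetical per-processor sleep state for the pthread_cond_wait variant.
    struct idle_waiter {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        bool            notified;            // guards against lost and spurious wake-ups
    };

    static void waiter_init( struct idle_waiter * w ) {
        pthread_mutex_init( &w->lock, NULL );
        pthread_cond_init( &w->cond, NULL );
        w->notified = false;
    }

    // Called by a processor that found no ready work: block until notified.
    static void park( struct idle_waiter * w ) {
        pthread_mutex_lock( &w->lock );
        while ( ! w->notified )              // re-check handles spurious wake-ups
            pthread_cond_wait( &w->cond, &w->lock );
        w->notified = false;                 // consume the notification
        pthread_mutex_unlock( &w->lock );
    }

    // Called when new work arrives: wake the blocked processor (or pre-arm it).
    static void unpark( struct idle_waiter * w ) {
        pthread_mutex_lock( &w->lock );
        w->notified = true;                  // remembered even if not yet parked
        pthread_cond_signal( &w->cond );
        pthread_mutex_unlock( &w->lock );
    }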
@@ -115,8 +116,8 @@
 An interesting subpart of this heuristic is what to do with bursts of \ats that become ready.
 Since waking up a sleeping \proc can have notable latency, multiple \ats may become ready while a single \proc is waking up.
-This fact begs the question, if many \procs are available, how many should be woken?
+This fact raises the question: if many \procs are available, how many should be woken?
 If the ready \ats will run longer than the wake-up latency, waking one \proc per \at will offer maximum parallelization.
 If the ready \ats will run for a very short time, waking many \procs may be wasteful.
-As mentioned, a heuristic to handle these complex cases is outside the scope of this thesis, the behaviour of the scheduler in this particular case is left unspecified.
+As mentioned, since a heuristic to handle these complex cases is outside the scope of this thesis, the behaviour of the scheduler in this particular case is left unspecified.

 \section{Sleeping}
     
@@ -156,5 +157,5 @@
 The notifier first makes sure the newly ready \at is visible to \procs searching for \ats, and then attempts to notify an idle \proc.
 On the other side, \procs make themselves visible as idle \procs and then search for any \ats they may have missed.
-Unlike regular work-stealing, this search must be exhaustive to make sure that pre-existing \at is missed.
+Unlike regular work-stealing, this search must be exhaustive to make sure that no pre-existing \at is missed.
 These steps from both sides guarantee that if the search misses a newly ready \at, then the notifier is guaranteed to see at least one idle \proc.
 Conversely, if the notifier does not see any idle \proc, then a \proc is guaranteed to find the new \at in its exhaustive search.
     
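The publish-then-check ordering described in this hunk can be sketched in C as below. The helpers (publish_ready, exhaustive_search, wake_one_idle, sleep_on_eventfd), the placeholder types, and the idle_count counter are assumptions standing in for the runtime's real operations; only the order of steps on each side is the point.

    #include <stdatomic.h>
    #include <stddef.h>

    struct thread_desc;                  // a ready user-level thread (placeholder type)
    struct processor;                    // a kernel-thread backed processor (placeholder type)

    // Assumed helpers, standing in for the runtime's real operations.
    extern void publish_ready( struct thread_desc * );        // make the thread findable
    extern struct thread_desc * exhaustive_search( void );    // scan *all* ready queues
    extern void wake_one_idle( void );                        // write to one idle proc's event fd
    extern void sleep_on_eventfd( struct processor * );       // read() on this proc's event fd
    extern void run( struct thread_desc * );

    static atomic_size_t idle_count;     // how many processors have advertised idleness

    // Notifier: (1) make the new thread visible, (2) only then look for sleepers.
    void notify_ready( struct thread_desc * t ) {
        publish_ready( t );
        atomic_thread_fence( memory_order_seq_cst );          // order (1) before (2)
        if ( atomic_load( &idle_count ) > 0 )
            wake_one_idle();
    }

    // Idling processor: (1) advertise idleness, (2) only then search exhaustively.
    void go_idle( struct processor * p ) {
        atomic_fetch_add( &idle_count, 1 );
        atomic_thread_fence( memory_order_seq_cst );          // order (1) before (2)
        struct thread_desc * t = exhaustive_search();
        if ( t ) {
            atomic_fetch_sub( &idle_count, 1 );               // found work: no sleep needed
            run( t );
        } else {
            // Event-fd wake-ups are sticky, so a notification arriving between the
            // failed search and this call is not lost.
            sleep_on_eventfd( p );
            atomic_fetch_sub( &idle_count, 1 );
        }
    }

Either the exhaustive search finds the newly published \at, or the earlier increment of idle_count is visible to the notifier, which then wakes a \proc; this is the guarantee stated in the hunk above.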
@@ -188,5 +189,5 @@
    \centering
    \input{idle1.pstex_t}
-   \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put unto a doubly-linked stack protected by a lock.
+   \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put onto a doubly-linked stack protected by a lock.
    Each \proc has a private event \lstinline{fd}.}
    \label{fig:idle1}
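To make the caption of Figure~\ref{fig:idle1} concrete, the following C sketch renders the described data structure: a lock-protected doubly-linked stack of idle \procs, each with a private event fd created by eventfd. The names and layout are assumptions for illustration rather than the \CFA runtime's code, and the two functions give one plausible shape for the sleep and wake_one_idle helpers assumed in the previous sketch.

    #include <sys/eventfd.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <pthread.h>

    // One node per idle processor: intrusive links plus a private event fd.
    struct idle_proc {
        int                fd;               // created with eventfd(0, 0)
        struct idle_proc * next;
        struct idle_proc * prev;
    };

    static pthread_mutex_t    idle_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct idle_proc * idle_top  = NULL;   // top of the doubly-linked stack

    static void idle_proc_init( struct idle_proc * p ) {
        p->fd = eventfd( 0, 0 );
        p->next = p->prev = NULL;
    }

    // A processor with no work pushes itself onto the stack, then blocks on its fd.
    static void idle_sleep( struct idle_proc * p ) {
        pthread_mutex_lock( &idle_lock );
        p->prev = NULL;
        p->next = idle_top;
        if ( idle_top ) idle_top->prev = p;
        idle_top = p;
        pthread_mutex_unlock( &idle_lock );

        uint64_t val;
        read( p->fd, &val, sizeof(val) );    // blocks until someone writes to the fd
    }

    // A notifier pops one idle processor, if any, and wakes it through its fd.
    static void wake_one_idle( void ) {
        pthread_mutex_lock( &idle_lock );
        struct idle_proc * p = idle_top;
        if ( p ) {
            idle_top = p->next;
            if ( idle_top ) idle_top->prev = NULL;
        }
        pthread_mutex_unlock( &idle_lock );

        if ( p ) {
            uint64_t one = 1;
            write( p->fd, &one, sizeof(one) );   // makes the read() in idle_sleep() return
        }
    }

Because the event-fd counter is sticky, a write that lands between the push and the read, or even before it, is not lost; the read simply returns immediately.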