-                      r82a90d4
+                      rddcaff6
 The scheduling algorithm described in Chapter~\ref{core} addresses scheduling in a stable state.
 This chapter addresses problems that occur when the system state changes.
-Indeed the \CFA runtime, supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically.
+Indeed the \CFA runtime supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically.
 These changes affect the scheduling algorithm, which must dynamically alter its behaviour.
-In detail, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios.
+Specifically, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios.
 \begin{cfa}
+{
 …
 This requirement also means any references into these arrays, \eg pointers or indexes, may need to be updated if elements are moved for compaction or any other reason.
-There are no performance requirements, within reason, for resizing since it is expected to be rare.
-However, this operation has strict correctness requirements since updating and idle sleep can easily lead to deadlocks.
-It should also avoid as much as possible any effect on performance when the number of \procs remains constant.
-This later requirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays.
+There are no performance requirements, within reason, for the act of resizing itself, since it is expected to be rare.
+However, this operation has strict correctness requirements, since updating and idle sleep can easily lead to deadlocks.
+The resizing mechanism should also avoid, as much as possible, any effect on performance when the number of \procs remains constant.
+This last requirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays.
 \subsection{Read-Copy-Update}
 One solution is to use the Read-Copy-Update pattern~\cite{wiki:rcu}.
+This is a very common pattern that avoids large critical sections, which is why it is worth mentioning.
 In this pattern, resizing is done by creating a copy of the internal data structures, \eg see Figure~\ref{fig:base-ts2}, updating the copy with the desired changes, and then attempting an Indiana Jones Switch to replace the original with the copy.
-This approach has the advantage that it may not need any synchronization to do the switch.
+This approach has the advantage that it may not need any synchronization to do the switch, depending on how reclamation of the original copy is handled.
 However, there is a race where \procs still use the original data structure after the copy is switched.
 This race not only requires adding a memory-reclamation scheme, but it also requires that operations made on the stale original version are eventually moved to the copy.
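The copy-and-switch step described above can be sketched in C11 as follows. This is a minimal illustration under stated assumptions, not the \CFA runtime's code: the type queue_array, the global pointer current, and the functions copy_resized and resize are hypothetical names, and the payload is a placeholder. The sketch performs the switch with a single compare-and-swap and deliberately omits both the memory-reclamation scheme and the migration of operations made on the stale original.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for the ready-queue array (not the runtime's type).
typedef struct {
    size_t count;   // number of sub-queues
    int queues[];   // placeholder payload, one slot per sub-queue
} queue_array;

// The single shared pointer that every reader loads.
_Atomic(queue_array *) current = NULL;

// Allocate an array with n slots and copy over as many old slots as fit.
queue_array * copy_resized(const queue_array * old, size_t n) {
    queue_array * copy = calloc(1, sizeof(queue_array) + n * sizeof(int));
    if (!copy) return NULL;
    copy->count = n;
    if (old) {
        size_t keep = old->count < n ? old->count : n;
        memcpy(copy->queues, old->queues, keep * sizeof(int));
    }
    return copy;
}

// Read-Copy-Update resize: copy, update the copy, then publish it with one CAS.
// On success, *stale receives the original, which readers may still be using;
// deciding when it can be freed is the reclamation problem left out here.
bool resize(size_t n, queue_array ** stale) {
    for (;;) {
        queue_array * old = atomic_load(&current);
        queue_array * copy = copy_resized(old, n);
        if (!copy) return false;
        if (atomic_compare_exchange_strong(&current, &old, copy)) {
            *stale = old;
            return true;
        }
        free(copy);  // another writer switched first; retry against its version
    }
}
```

Any update a reader makes to the stale array after the switch is lost in this sketch, which is exactly the race the text identifies.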
 …
 Acquiring all the local read-locks guarantees mutual exclusion among the readers and the writer, while the wait on the read side prevents readers from continuously starving the writer.
 Figure~\ref{f:SpecializedReadersWriterLock} shows the outline for this specialized readers-writer lock.
-The lock in nonblocking, so both readers and writers spin while the lock is held.
+The lock is nonblocking, so both readers and writers spin while the lock is held.
 This very wide sharding strategy means that readers have very good locality since they only ever need to access two memory locations.
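As a rough illustration of this sharded lock, the following C11 sketch gives each \proc its own local lock plus one global writer flag. It is a sketch under assumptions rather than the thesis's implementation: the names rw_lock, local_lock, write_flag and MAX_PROCS, the fixed bound of 8, and the 64-byte cache-line padding are all hypothetical, and the default sequentially consistent atomics are used for simplicity.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_PROCS 8  // hypothetical fixed bound, for illustration only

// One flag per proc, padded to its own cache line (64 bytes assumed).
typedef struct {
    _Alignas(64) atomic_bool held;
} local_lock;

typedef struct {
    atomic_bool write_flag;         // set while a writer is waiting or active
    local_lock readers[MAX_PROCS];  // sharded read side, one entry per proc
} rw_lock;

void rw_init(rw_lock * l) {
    atomic_init(&l->write_flag, false);
    for (int i = 0; i < MAX_PROCS; i++) atomic_init(&l->readers[i].held, false);
}

// Reader: touches only two locations, the global flag and its own local lock.
void read_lock(rw_lock * l, int id) {
    // Wait while a writer is present, so readers cannot starve the writer.
    while (atomic_load(&l->write_flag)) { /* spin */ }
    // Only a writer can be holding this proc's local lock.
    while (atomic_exchange(&l->readers[id].held, true)) { /* spin */ }
}

void read_unlock(rw_lock * l, int id) {
    atomic_store(&l->readers[id].held, false);
}

// Writer: take the global flag, then acquire every local lock in turn.
void write_lock(rw_lock * l) {
    while (atomic_exchange(&l->write_flag, true)) { /* spin */ }
    for (int i = 0; i < MAX_PROCS; i++)
        while (atomic_exchange(&l->readers[i].held, true)) { /* spin */ }
}

void write_unlock(rw_lock * l) {
    for (int i = 0; i < MAX_PROCS; i++)
        atomic_store(&l->readers[i].held, false);
    atomic_store(&l->write_flag, false);
}
```

The cost asymmetry is intentional: a reader pays for two uncontended accesses, while the rare writer pays for one access per \proc.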
 …
 \section{Idle-Sleep}\label{idlesleep}
 While manual resizing of \procs is expected to be rare, the number of \ats can vary significantly over an application's lifetime, which means there are times when there are too few or too many \procs.
-For this work, it is the programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.
+For this work, it is the application programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.
 This leaves too many \procs when there are not enough \ats for all the \procs to be useful.
 These idle \procs cannot be removed because their lifetime is controlled by the application, and only the application knows when the number of \ats may increase or decrease.
 …
 Because idle sleep is spurious, this data structure has strict performance requirements, in addition to strict correctness requirements.
 Next, some mechanism is needed to block \glspl{kthrd}, \eg @pthread_cond_wait@ or a pthread semaphore.
-The complexity here is to support \at \glslink{atblock}{parking} and \glslink{atsched}{unparking}, user-level locking, timers, \io operations, and all other \CFA features with minimal complexity.
+The challenge here is to support user-level locking, timers, \io operations, and all other \CFA features with minimal complexity.
 Finally, the scheduler needs a heuristic to determine when to block and unblock an appropriate number of \procs.
 However, this third challenge is outside the scope of this thesis because developing a general heuristic is complex enough to justify its own work.
 …
 An interesting subpart of this heuristic is what to do with bursts of \ats that become ready.
 Since waking up a sleeping \proc can have notable latency, multiple \ats may become ready while a single \proc is waking up.
-This fact begs the question, if many \procs are available, how many should be woken?
+This fact raises the question: if many \procs are available, how many should be woken?
 If the ready \ats will run longer than the wake-up latency, waking one \proc per \at will offer maximum parallelization.
 If the ready \ats will run for a very short time, waking many \procs may be wasteful.
-As mentioned, a heuristic to handle these complex cases is outside the scope of this thesis, the behaviour of the scheduler in this particular case is left unspecified.
+As mentioned, since a heuristic to handle these complex cases is outside the scope of this thesis, the behaviour of the scheduler in this particular case is left unspecified.
 \section{Sleeping}
 …
 The notifier first makes sure the newly ready \at is visible to \procs searching for \ats, and then attempts to notify an idle \proc.
 On the other side, \procs make themselves visible as idle \procs and then search for any \ats they may have missed.
-Unlike regular work-stealing, this search must be exhaustive to make sure that pre-existing \at is missed.
+Unlike regular work-stealing, this search must be exhaustive to make sure that no pre-existing \at is missed.
 These steps from both sides ensure that if the search misses a newly ready \at, then the notifier is guaranteed to see at least one idle \proc.
 Conversely, if the notifier does not see any idle \proc, then a \proc is guaranteed to find the new \at in its exhaustive search.
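The two-sided ordering above can be sketched with C11 sequentially consistent atomics. This is only an illustration of the publish-then-check ordering: the counters ready_threads and idle_procs are hypothetical stand-ins for the ready queue and the idle list, and the functions notify_ready and prepare_sleep are assumed names, not part of the \CFA runtime.

```c
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical shared state: counters stand in for the ready queue and idle list.
atomic_int ready_threads = 0;
atomic_int idle_procs = 0;

// Notifier: publish the ready thread first, then look for an idle proc.
// Returns true when an idle proc was seen and therefore must be woken.
bool notify_ready(void) {
    atomic_fetch_add(&ready_threads, 1);    // step 1: make the thread visible
    return atomic_load(&idle_procs) > 0;    // step 2: check for idle procs
}

// Proc: publish itself as idle first, then search for missed threads.
// Returns true when it is safe to sleep, false when work was found.
bool prepare_sleep(void) {
    atomic_fetch_add(&idle_procs, 1);       // step 1: become visible as idle
    if (atomic_load(&ready_threads) > 0) {  // step 2: stands in for the exhaustive search
        atomic_fetch_sub(&idle_procs, 1);   // work found: withdraw from the idle set
        return false;
    }
    return true;
}
```

Because each side writes before it reads and the operations are sequentially consistent, the two checks cannot both miss: either the \proc sees the new thread or the notifier sees the idle \proc.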
 …
         \centering
         \input{idle1.pstex_t}
-        \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put unto a doubly-linked stack protected by a lock.
+        \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put onto a doubly-linked stack protected by a lock.
         Each \proc has a private event \lstinline{fd}.}
         \label{fig:idle1}

Changeset ddcaff6 for doc/theses/thierry_delisle_PhD/thesis/text/practice.tex