- Timestamp:
- Nov 24, 2022, 3:41:44 PM (20 months ago)
- Branches:
- ADT, ast-experimental, master
- Children:
- dacd8e6e
- Parents:
- 82a90d4
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/thierry_delisle_PhD/thesis/text/practice.tex
r82a90d4 rddcaff6 2 2 The scheduling algorithm described in Chapter~\ref{core} addresses scheduling in a stable state. 3 3 This chapter addresses problems that occur when the system state changes. 4 Indeed the \CFA runtime ,supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically.4 Indeed the \CFA runtime supports expanding and shrinking the number of \procs, both manually and, to some extent, automatically. 5 5 These changes affect the scheduling algorithm, which must dynamically alter its behaviour. 6 6 7 In detail, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios.7 Specifically, \CFA supports adding \procs using the type @processor@, in both RAII and heap coding scenarios. 8 8 \begin{cfa} 9 9 { … … 26 26 This requirement also means any references into these arrays, \eg pointers or indexes, may need to be updated if elements are moved for compaction or any other reason. 27 27 28 There are no performance requirements, within reason, for resizingsince it is expected to be rare.29 However, this operation has strict correctness requirements since updating and idle sleep can easily lead to deadlocks.30 It should also avoidas much as possible any effect on performance when the number of \procs remains constant.31 This la terrequirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays.28 There are no performance requirements, within reason, for act of resizing itself, since it is expected to be rare. 29 However, this operation has strict correctness requirements, since updating and idle sleep can easily lead to deadlocks. 30 The resizing mechanism should also avoid, as much as possible any effect on performance when the number of \procs remains constant. 31 This last requirement prohibits naive solutions, like simply adding a global lock to the ready-queue arrays. 32 32 33 33 \subsection{Read-Copy-Update} 34 34 One solution is to use the Read-Copy-Update pattern~\cite{wiki:rcu}. 35 This is a very common pattern that avoids large critical sections, which is why it is worth mentioning. 35 36 In this pattern, resizing is done by creating a copy of the internal data structures, \eg see Figure~\ref{fig:base-ts2}, updating the copy with the desired changes, and then attempting an Indiana Jones Switch to replace the original with the copy. 36 This approach has the advantage that it may not need any synchronization to do the switch .37 This approach has the advantage that it may not need any synchronization to do the switch, depending on how reclamation of the original copy is handled. 37 38 However, there is a race where \procs still use the original data structure after the copy is switched. 38 39 This race not only requires adding a memory-reclamation scheme, but it also requires that operations made on the stale original version are eventually moved to the copy. … … 63 64 Acquiring all the local read-locks guarantees mutual exclusion among the readers and the writer, while the wait on the read side prevents readers from continuously starving the writer. 64 65 Figure~\ref{f:SpecializedReadersWriterLock} shows the outline for this specialized readers-writer lock. 65 The lock i nnonblocking, so both readers and writers spin while the lock is held.66 The lock is nonblocking, so both readers and writers spin while the lock is held. 66 67 This very wide sharding strategy means that readers have very good locality since they only ever need to access two memory locations. 67 68 … … 98 99 \section{Idle-Sleep}\label{idlesleep} 99 100 While manual resizing of \procs is expected to be rare, the number of \ats can vary significantly over an application's lifetime, which means there are times when there are too few or too many \procs. 100 For this work, it is the programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue.101 For this work, it is the application programmer's responsibility to manually create \procs, so if there are too few \procs, the application must address this issue. 101 102 This leaves too many \procs when there are not enough \ats for all the \procs to be useful. 102 103 These idle \procs cannot be removed because their lifetime is controlled by the application, and only the application knows when the number of \ats may increase or decrease. … … 108 109 Because idle sleep is spurious, this data structure has strict performance requirements, in addition to strict correctness requirements. 109 110 Next, some mechanism is needed to block \glspl{kthrd}, \eg @pthread_cond_wait@ or a pthread semaphore. 110 The complexity here is to support \at \glslink{atblock}{parking} and \glslink{atsched}{unparking},user-level locking, timers, \io operations, and all other \CFA features with minimal complexity.111 The complexity here is to support user-level locking, timers, \io operations, and all other \CFA features with minimal complexity. 111 112 Finally, the scheduler needs a heuristic to determine when to block and unblock an appropriate number of \procs. 112 113 However, this third challenge is outside the scope of this thesis because developing a general heuristic is complex enough to justify its own work. … … 115 116 An interesting subpart of this heuristic is what to do with bursts of \ats that become ready. 116 117 Since waking up a sleeping \proc can have notable latency, multiple \ats may become ready while a single \proc is waking up. 117 This fact begs the question,if many \procs are available, how many should be woken?118 This fact raises the question: if many \procs are available, how many should be woken? 118 119 If the ready \ats will run longer than the wake-up latency, waking one \proc per \at will offer maximum parallelization. 119 120 If the ready \ats will run for a very short time, waking many \procs may be wasteful. 120 As mentioned, a heuristic to handle these complex cases is outside the scope of this thesis,the behaviour of the scheduler in this particular case is left unspecified.121 As mentioned, since a heuristic to handle these complex cases is outside the scope of this thesis, so the behaviour of the scheduler in this particular case is left unspecified. 121 122 122 123 \section{Sleeping} … … 156 157 The notifier first makes sure the newly ready \at is visible to \procs searching for \ats, and then attempts to notify an idle \proc. 157 158 On the other side, \procs make themselves visible as idle \procs and then search for any \ats they may have missed. 158 Unlike regular work-stealing, this search must be exhaustive to make sure that pre-existing \at is missed.159 Unlike regular work-stealing, this search must be exhaustive to make sure that no pre-existing \at is missed. 159 160 These steps from both sides guarantee that if the search misses a newly ready \at, then the notifier is guaranteed to see at least one idle \proc. 160 161 Conversely, if the notifier does not see any idle \proc, then a \proc is guaranteed to find the new \at in its exhaustive search. … … 188 189 \centering 189 190 \input{idle1.pstex_t} 190 \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put unto a doubly-linked stack protected by a lock.191 \caption[Basic Idle Sleep Data Structure]{Basic Idle Sleep Data Structure \smallskip\newline Each idle \proc is put onto a doubly-linked stack protected by a lock. 191 192 Each \proc has a private event \lstinline{fd}.} 192 193 \label{fig:idle1}
Note: See TracChangeset
for help on using the changeset viewer.