Changeset 0a945fd for doc/theses/thierry_delisle_PhD
- Timestamp:
- Sep 22, 2020, 1:02:45 PM (4 years ago)
- Branches:
- ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- 305cd5c
- Parents:
- 7a80113
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/thierry_delisle_PhD/comp_II/comp_II.tex
r7a80113 r0a945fd 138 138 139 139 \paragraph{Efficiency} Finally, efficient usage of CPU resources is also an important requirement and is discussed in depth towards the end of the proposal. 140 \newterm{Efficiency} means avoiding using CPU cycles when there are no threads to run ( conserve energy/heat), and conversely, using as many available CPU cycles when the workload can benefit from it.140 \newterm{Efficiency} means avoiding using CPU cycles when there are no threads to run (to conserve energy), and conversely, using as many available CPU cycles when the workload can benefit from it. 141 141 Balancing these two states is where the complexity lies. 142 142 The \CFA scheduler should be efficient with respect to the underlying (shared) computer. … … 150 150 \begin{enumerate} 151 151 \item Threads live long enough for useful feedback information to be gathered. 152 \item Threads belong to multiple users so fairness across users is largely invisible.152 \item Threads belong to multiple users so fairness across users is important. 153 153 \end{enumerate} 154 154 … … 162 162 Security concerns mean more precise and robust fairness metrics must be used to guarantee fairness across processes created by users as well as threads created within a process. 163 163 In the case of the \CFA scheduler, every thread runs in the same user space and is controlled by the same user. 164 Fairness across threads is therefore a given and it is then possible to safely ignore the possibility that threads are malevolent.164 Fairness across users is therefore a given and it is then possible to safely ignore the possibility that threads are malevolent. 165 165 This approach allows for a much simpler fairness metric, and in this proposal, \emph{fairness} is defined as: 166 166 \begin{quote} … … 193 193 \subsection{Schedulers without feedback or priorities} 194 194 This proposal conjectures that it is possible to construct a default scheduler for the \CFA runtime that offers good scalability and a simple fairness guarantee that is easy for programmers to reason about. 195 The simplest fairness guarantee is FIFO ordering, \ie threads scheduled first comefirst.195 The simplest fairness guarantee is FIFO ordering, \ie threads scheduled first run first. 196 196 However, enforcing FIFO ordering generally conflicts with scalability across multiple processors because of the additional synchronization. 197 197 Thankfully, strict FIFO is not needed for sufficient fairness. … … 236 236 \input{empty.pstex_t} 237 237 \end{center} 238 \caption{Un loaded relaxed FIFO list where the array contains many empty cells.}238 \caption{Underloaded relaxed FIFO list where the array contains many empty cells.} 239 239 \label{fig:empty} 240 240 \end{figure} 241 241 242 In an un loaded system, several of the queues are empty, so selecting a random queue for popping is less likely to yield a successful selection and more attempts are needed, resulting in a performance degradation.242 In an underloaded system, several of the queues are empty, so selecting a random queue for popping is less likely to yield a successful selection and more attempts are needed, resulting in a performance degradation. 243 243 Figure~\ref{fig:empty} shows an example with fewer elements, where the chances of getting an empty queue is 5/7 per pick, meaning two random picks yield an item only half the time. 244 244 Since the ready queue is not empty, the pop operation \emph{must} find an element before returning and therefore must retry. … … 288 288 \end{center} 289 289 \vspace*{-5pt} 290 \caption{Un loaded queue with added bitmask to indicate which array cells have items.}290 \caption{Underloaded queue with added bitmask to indicate which array cells have items.} 291 291 \label{fig:emptybit} 292 292 \begin{center} … … 294 294 \end{center} 295 295 \vspace*{-5pt} 296 \caption{Un loaded queue with added binary search tree indicate which array cells have items.}296 \caption{Underloaded queue with added binary search tree indicate which array cells have items.} 297 297 \label{fig:emptytree} 298 298 \begin{center} … … 300 300 \end{center} 301 301 \vspace*{-5pt} 302 \caption{Un loaded queue with added per processor bitmask to indicate which array cells have items.}302 \caption{Underloaded queue with added per processor bitmask to indicate which array cells have items.} 303 303 \label{fig:emptytls} 304 304 \end{figure} 305 305 306 Figure~\ref{fig:emptytree} shows an approach using a hierarchical tree data-structure to reduce contention and has been shown to work in similar cases~\cite{ellen2007snzi}\footnote{This particular paper seems to be patented in the US. 307 How does that affect \CFA? Can I use it in my work?}. 306 Figure~\ref{fig:emptytree} shows an approach using a hierarchical tree data-structure to reduce contention and has been shown to work in similar cases~\cite{ellen2007snzi}. 308 307 However, this approach may lead to poorer performance in Table~\ref{tab:perfcases} case~B due to the inherent pointer chasing cost and already low contention cost in that case. 309 308 310 309 Figure~\ref{fig:emptytls} shows an approach using dense information, similar to the bitmap, but have each thread keep its own independent copy of it. 311 310 While this approach can offer good scalability \emph{and} low latency, the liveliness of the information can become a problem. 312 In the simple cases, local copies with empty underlying queuescan become stale and end-up not being useful for the pop operation.311 In the simple cases, local copies can become stale and end-up not being useful for the pop operation. 313 312 A more serious problem is that reliable information is necessary for some parts of this algorithm to be correct. 314 313 As mentioned in this section, processors must know \emph{reliably} whether the list is empty or not to decide if they can return \texttt{NULL} or if they must keep looking during a pop operation. … … 458 457 A very recent alternative that I am investigating is $io_uring$~\cite{io_uring}. 459 458 It claims to address some of the issues with $epoll$ and my early investigating suggests that the claim is accurate. 460 $io_uring$ uses a much more general approach where system calls are registered to a queue and later executed by the kernel, rather than relying on system calls to su bsequently wait for changes on file descriptors or return an error.459 $io_uring$ uses a much more general approach where system calls are registered to a queue and later executed by the kernel, rather than relying on system calls to support returning an error instead of blocking. 461 460 I believe this approach allows for fewer problems, \eg the manpage for $open$~\cite{open} states: 462 461 \begin{quote}
Note: See TracChangeset
for help on using the changeset viewer.