Context Navigation

← Previous Change
Next Change →

runtime.tex

Timestamp:

Sep 19, 2022, 8:11:02 PM (3 years ago)

Author:

Peter A. Buhr <pabuhr@…>

Branches:

ADT, ast-experimental, master, pthread-emulation, stuck-waitfor-destruct

Children:

aa9f215

Parents:

ebf8ca5 (diff), ae1d151 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.

Message:

fix merge conflict

File:

: 1 edited

doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex (modified) (6 diffs)

Legend:

: Unmodified
: Added
: Removed

doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex

-              rebf8ca5
+              r23a08aa0
 \section{C Threading}
 \Celeven introduced threading features, such the @_Thread_local@ storage class, and libraries @stdatomic.h@ and @threads.h@.
+\Celeven introduced threading features, such as the @_Thread_local@ storage class, and libraries @stdatomic.h@ and @threads.h@.
 Interestingly, almost a decade after the \Celeven standard, the most recent versions of gcc, clang, and msvc do not support the \Celeven include @threads.h@, indicating no interest in the C11 concurrency approach (possibly because of the recent effort to add concurrency to \CC).
 While the \Celeven standard does not state a threading model, the historical association with pthreads suggests implementations would adopt kernel-level threading (1:1)~\cite{ThreadModel}, as for \CC.
 …
 \section{M:N Threading}\label{prev:model}
 Threading in \CFA is based on \Gls{uthrding}, where \glspl{thrd} are the representation of a unit of work. As such, \CFA programmers should expect these units to be fairly inexpensive, \ie programmers should be able to create a large number of \glspl{thrd} and switch among \glspl{thrd} liberally without many concerns for performance.
+Threading in \CFA is based on \Gls{uthrding}, where \ats are the representation of a unit of work. As such, \CFA programmers should expect these units to be fairly inexpensive, \ie programmers should be able to create a large number of \ats and switch among \ats liberally without many performance concerns.
 The \CFA M:N threading models is implemented using many user-level threads mapped onto fewer \glspl{kthrd}.
+The \CFA M:N threading model is implemented using many user-level threads mapped onto fewer \glspl{kthrd}.
 The user-level threads have the same semantic meaning as a \glspl{kthrd} in the 1:1 model: they represent an independent thread of execution with its own stack.
 The difference is that user-level threads do not have a corresponding object in the kernel; they are handled by the runtime in user space and scheduled onto \glspl{kthrd}, referred to as \glspl{proc} in this document. \Glspl{proc} run a \gls{thrd} until it context switches out, it then chooses a different \gls{thrd} to run.
+The difference is that user-level threads do not have a corresponding object in the kernel; they are handled by the runtime in user space and scheduled onto \glspl{kthrd}, referred to as \glspl{proc} in this document. \Glspl{proc} run a \at until it context switches out, it then chooses a different \at to run.
 \section{Clusters}
 \CFA allows the option to group user-level threading, in the form of clusters.
 Both \glspl{thrd} and \glspl{proc} belong to a specific cluster.
 \Glspl{thrd} are only scheduled onto \glspl{proc} in the same cluster and scheduling is done independently of other clusters.
+Both \ats and \glspl{proc} belong to a specific cluster.
+\Glspl{at} are only scheduled onto \glspl{proc} in the same cluster and scheduling is done independently of other clusters.
 Figure~\ref{fig:system} shows an overview of the \CFA runtime, which allows programmers to tightly control parallelism.
 It also opens the door to handling effects like NUMA, by pinning clusters to a specific NUMA node\footnote{This capability is not currently implemented in \CFA, but the only hurdle left is creating a generic interface for CPU masks.}.
 …
                 \input{system.pstex_t}
         \end{center}
         \caption[Overview of the \CFA runtime]{Overview of the \CFA runtime \newline \Glspl{thrd} are scheduled inside a particular cluster and run on the \glspl{proc} that belong to the cluster. The discrete-event manager, which handles preemption and timeout, is a \gls{proc} that lives outside any cluster and does not run \glspl{thrd}.}
+        \caption[Overview of the \CFA runtime]{Overview of the \CFA runtime \newline \Glspl{at} are scheduled inside a particular cluster and run on the \glspl{proc} that belong to the cluster. The discrete-event manager, which handles preemption and timeout, is a \gls{proc} that lives outside any cluster and does not run \ats.}
         \label{fig:system}
 \end{figure}
 …
 \section{\glsxtrshort{io}}\label{prev:io}
 Prior to this work, the \CFA runtime did not add any particular support for \glsxtrshort{io} operations. While all \glsxtrshort{io} operations available in C are available in \CFA, \glsxtrshort{io} operations are designed for the POSIX threading model~\cite{pthreads}. Using these 1:1 threading operations in an M:N threading model means \glsxtrshort{io} operations block \glspl{proc} instead of \glspl{thrd}. While this can work in certain cases, it limits the number of concurrent operations to the number of \glspl{proc} rather than \glspl{thrd}. It also means deadlock can occur because all \glspl{proc} are blocked even if at least one \gls{thrd} is ready to run. A simple example of this type of deadlock would be as follows:
+Prior to this work, the \CFA runtime did not add any particular support for \glsxtrshort{io} operations. While all \glsxtrshort{io} operations available in C are available in \CFA, \glsxtrshort{io} operations are designed for the POSIX threading model~\cite{pthreads}. Using these 1:1 threading operations in an M:N threading model means \glsxtrshort{io} operations block \glspl{proc} instead of \ats. While this can work in certain cases, it limits the number of concurrent operations to the number of \glspl{proc} rather than \ats. It also means deadlock can occur because all \glspl{proc} are blocked even if at least one \at is ready to run. A simple example of this type of deadlock would be as follows:
 \begin{quote}
 Given a simple network program with 2 \glspl{thrd} and a single \gls{proc}, one \gls{thrd} sends network requests to a server and the other \gls{thrd} waits for a response from the server.
 If the second \gls{thrd} races ahead, it may wait for responses to requests that have not been sent yet.
 In theory, this should not be a problem, even if the second \gls{thrd} waits, because the first \gls{thrd} is still ready to run and should be able to get CPU time to send the request.
 With M:N threading, while the first \gls{thrd} is ready, the lone \gls{proc} \emph{cannot} run the first \gls{thrd} if it is blocked in the \glsxtrshort{io} operation of the second \gls{thrd}.
 If this happen, the system is in a synchronization deadlock\footnote{In this example, the deadlock could be resolved if the server sends unprompted messages to the client.
+Given a simple network program with 2 \ats and a single \gls{proc}, one \at sends network requests to a server and the other \at waits for a response from the server.
+If the second \at races ahead, it may wait for responses to requests that have not been sent yet.
+In theory, this should not be a problem, even if the second \at waits, because the first \at is still ready to run and should be able to get CPU time to send the request.
+With M:N threading, while the first \at is ready, the lone \gls{proc} \emph{cannot} run the first \at if it is blocked in the \glsxtrshort{io} operation of the second \at.
+If this happens, the system is in a synchronization deadlock\footnote{In this example, the deadlock could be resolved if the server sends unprompted messages to the client.
 However, this solution is neither general nor appropriate even in this simple case.}.
 \end{quote}
 Therefore, one of the objective of this work is to introduce \emph{User-Level \glsxtrshort{io}}, which like \glslink{uthrding}{User-Level \emph{Threading}}, blocks \glspl{thrd} rather than \glspl{proc} when doing \glsxtrshort{io} ope      rations.
 This feature entails multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc}.
+Therefore, one of the objectives of this work is to introduce \emph{User-Level \glsxtrshort{io}}, which like \glslink{uthrding}{User-Level \emph{Threading}}, blocks \ats rather than \glspl{proc} when doing \glsxtrshort{io} operations.
+This feature entails multiplexing the \glsxtrshort{io} operations of many \ats onto fewer \glspl{proc}.
 The multiplexing requires a single \gls{proc} to execute multiple \glsxtrshort{io} operations in parallel.
 This requirement cannot be done with operations that block \glspl{proc}, \ie \glspl{kthrd}, since the first operation would prevent starting new operations for its blocking duration.
 …
 All functions defined by this volume of POSIX.1-2017 shall be thread-safe, except that the following functions need not be thread-safe. ... (list of 70+ excluded functions)
 \end{quote}
 Only UNIX @man@ pages identify whether or not a library function is thread safe, and hence, may block on a pthreads lock or system call; hence interoperability with UNIX library functions is a challenge for an M:N threading model.
+Only UNIX @man@ pages identify whether a library function is thread-safe, and hence, may block on a pthreads lock or system call; hence interoperability with UNIX library functions is a challenge for an M:N threading model.
 Languages like Go and Java, which have strict interoperability with C\cite{wiki:jni,go:cgo}, can control operations in C by ``sandboxing'' them, \eg a blocking function may be delegated to a \gls{kthrd}. Sandboxing may help towards guaranteeing that the kind of deadlock mentioned above does not occur.
 …
 Therefore, it is possible calls to an unknown library function can block a \gls{kthrd} leading to deadlocks in \CFA's M:N threading model, which would not occur in a traditional 1:1 threading model.
 Currently, all M:N thread systems interacting with UNIX without sandboxing suffer from this problem but manage to work very well in the majority of applications.
 Therefore, a complete solution to this problem is outside the scope of this thesis.\footnote{\CFA does provide a pthreads emulation, so any library function using embedded pthreads locks are redirected to \CFA user-level locks. This capability further reduces the chances of blocking a \gls{kthrd}.}
+Therefore, a complete solution to this problem is outside the scope of this thesis.\footnote{\CFA does provide a pthreads emulation, so any library function using embedded pthreads locks is redirected to \CFA user-level locks. This capability further reduces the chances of blocking a \gls{kthrd}.}

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 23a08aa0 for doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex

Legend:

doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex

Download in other formats: