Changeset d1fbaa78 for doc/proposals
- Timestamp:
- Oct 20, 2016, 12:19:16 PM (8 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
- Children:
- fe7b281
- Parents:
- b1bdc7d6
- Location:
- doc/proposals/concurrency
- Files:
-
- 2 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/proposals/concurrency/concurrency.tex
rb1bdc7d6 rd1fbaa78 635 635 TO BE CONTINUED... 636 636 637 \newpage 637 638 \section{Parallelism} 638 Historically, computer performance was about processor speeds and instructions count. However, with heat dissipa ction being an ever growing challenge, parallelism has become the new source of greatest performance \cite{Sutter05, Sutter05b}. In this decade, it is not longer reasonnable create high-performance application without caring about parallelism. Indeed, parallelism an important aspect of performance and more specifically throughput and hardware utilization. The lowest level approach parallelism is to use \glspl{kthread}. However since these have significant costs and limitations,\glspl{kthread} are now mostly used as an implementation tool rather than a user oriented one. There are several alternatives to solve these issues which all have strengths and weaknesses.639 Historically, computer performance was about processor speeds and instructions count. However, with heat dissipation being an ever growing challenge, parallelism has become the new source of greatest performance \cite{Sutter05, Sutter05b}. In this decade, it is not longer reasonnable to create high-performance application without caring about parallelism. Indeed, parallelism is an important aspect of performance and more specifically throughput and hardware utilization. The lowest level approach of parallelism is to use \glspl{kthread}. However since these have significant costs and limitations \glspl{kthread} are now mostly used as an implementation tool rather than a user oriented one. There are several alternatives to solve these issues which all have strengths and weaknesses. 639 640 640 641 \subsection{User-level threads} … … 642 643 643 644 Examples of languages that support are Java\cite{Java}, Haskell\cite{Haskell} and \uC\cite{uC++book}. 645 644 646 \subsection{Jobs and thread pools} 645 The opposite approach is to base parallelism on \glspl{job}. Indeed, \glspl{job} offer limited flexibility but at the benefit of a simpler user interface. In \gls{job} based systems users express parallelism as units of work and the dependency graph (either explicit or implicit) that tie them together. This means users need not to worry about concurrency but significantly limits the interaction that can occur between different jobs. Indeed, any \gls{job} that blocks also blocks the underlying \gls{kthread}, this effectively mean the CPU utilization, and therefore throughput, will suffer noticeably. The golden standard of this implementation is Intel's TBB library\cite{TBB}. 647 The approach on the opposite end of the spectrum is to base parallelism on \glspl{job}. Indeed, \glspl{job} offer limited flexibility but at the benefit of a simpler user interface. In \gls{job} based systems users express parallelism as units of work and the dependency graph (either explicit or implicit) that tie them together. This means users need not to worry about concurrency but significantly limits the interaction that can occur between different jobs. Indeed, any \gls{job} that blocks also blocks the underlying \gls{kthread}, this effectively mean the CPU utilization, and therefore throughput, will suffer noticeably. 648 The golden standard of this implementation is Intel's TBB library\cite{TBB}. 646 649 647 650 \subsection{Fibers : user-level threads without preemption} 648 651 Finally, in the middle of the flexibility versus complexity spectrum lay \glspl{fiber} which offer \glspl{uthread} without the complexity of preemption. This means users don't have to worry about other \glspl{fiber} suddenly executing between two instructions which signficantly reduces complexity. However, any call to IO or other concurrency primitives can lead to context switches. Furthermore, users can also block \glspl{fiber} in the middle of their execution without blocking a full processor core. This means users still have to worry about mutual exclusion, deadlocks and race conditions in their code, raising the complexity significantly. 649 \cite{Go}652 An example of a language that uses fibers is Go\cite{Go} 650 653 651 654 \subsection{Paradigm performance} 652 While the choice between the three paradigms listed above may have significant performance implication, it is difficult to pin the performance implications of chosing a model at the language level. Indeed, in many situations o wn of these paradigms will show better performance but it all strongly depends on the usage. Having mostly indepent units of work to execute almost guarantess that the \gls{job} based system will have the best performance. However, add interactions between jobs and the processor utilisation might suffer. User-level threads may allow maximum ressource utilisation but context switches will be more expansive and it is also harder for users to get perfect tunning. As with every example, fibers sit somewhat in the middle of the spectrum. Furthermore, if the units of uninterrupted work are large enough the paradigm choice will be fully armoticised by the actual work done.655 While the choice between the three paradigms listed above may have significant performance implication, it is difficult to pin the performance implications of chosing a model at the language level. Indeed, in many situations one of these paradigms will show better performance but it all strongly depends on the usage. Having mostly indepent units of work to execute almost guarantess that the \gls{job} based system will have the best performance. However, add interactions between jobs and the processor utilisation might suffer. User-level threads may allow maximum ressource utilisation but context switches will be more expansive and it is also harder for users to get perfect tunning. As with every example, fibers sit somewhat in the middle of the spectrum. Furthermore, if the units of uninterrupted work are large enough the paradigm choice will be largely amorticised by the actual work done. 653 656 654 657 \section{\CFA 's Thread Building Blocks} … … 705 708 \end{lstlisting} 706 709 707 In this example \code{func} is a function pointer stored in \acrfull{tls}, which is \CFA is both easy to use and completly typesafe.708 709 Of course for threads to be useful, it must be possible to start and stop threads and wait for them to complete execution. While using \acrshort{api} such as \code{fork} and \code{join} is relatively common in the literature, such an interface is not needed. Indeed, the simplest approach is to use \acrshort{raii} principles and have threads \code{fork} once the constructor has completed and \code{join} before the destructor runs.710 % In this example \code{func} is a function pointer stored in \acrfull{tls}, which is \CFA is both easy to use and completly typesafe. 711 712 Of course for threads to be useful, it must be possible to start and stop threads and wait for them to complete execution. While using an \acrshort{api} such as \code{fork} and \code{join} is relatively common in the literature, such an interface is not needed. Indeed, the simplest approach is to use \acrshort{raii} principles and have threads \code{fork} once the constructor has completed and \code{join} before the destructor runs. 710 713 \begin{lstlisting} 711 714 thread struct FuncRunner; //FuncRunner declared above … … 751 754 \end{lstlisting} 752 755 756 \newpage 757 \large{\textbf{WORK IN PROGRESS}} 753 758 \subsection{The \CFA Kernel : Processors, Clusters and Threads}\label{kernel} 754 759 -
doc/proposals/concurrency/version
rb1bdc7d6 rd1fbaa78 1 0.4. 881 0.4.95
Note: See TracChangeset
for help on using the changeset viewer.