Changeset b512454 for doc/proposals
- Timestamp:
- Oct 19, 2016, 3:32:19 PM (8 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
- Children:
- ab84e8a
- Parents:
- a3eaa29
- Location:
- doc/proposals/concurrency
- Files:
-
- 2 added
- 4 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/proposals/concurrency/.gitignore
ra3eaa29 rb512454 18 18 concurrency.ps 19 19 version.aux 20 monitor.tex 21 ext_monitor.tex -
doc/proposals/concurrency/Makefile
ra3eaa29 rb512454 12 12 13 13 FIGURES = ${addsuffix .tex, \ 14 monitor \ 15 ext_monitor \ 14 16 } 15 17 … … 32 34 33 35 clean : 34 rm -f *.bbl *.aux *.dvi *.idx *.ilg *.ind *.brf *.out *.log *.toc *.blg *.pstex_t *.cf *.glg *.glo *.gls *.ist \36 rm -f *.bbl *.aux *.dvi *.idx *.ilg *.ind *.brf *.out *.log *.toc *.blg *.pstex_t *.cf *.glg *.glo *.gls *.ist *.acn *.acr *.alg \ 35 37 ${FIGURES} ${PICTURES} ${PROGRAMS} ${GRAPHS} ${basename ${DOCUMENT}}.ps ${DOCUMENT} 36 38 -
doc/proposals/concurrency/concurrency.tex
ra3eaa29 rb512454 1 1 % requires tex packages: texlive-base texlive-latex-base tex-common texlive-humanities texlive-latex-extra texlive-fonts-recommended 2 2 3 % inline code �...�(copyright symbol) emacs: C-q M-)4 % red highlighting �...�(registered trademark symbol) emacs: C-q M-.5 % blue highlighting �...�(sharp s symbol) emacs: C-q M-_6 % green highlighting �...�(cent symbol) emacs: C-q M-"7 % LaTex escape �...�(section symbol) emacs: C-q M-'8 % keyword escape �...�(pilcrow symbol) emacs: C-q M-^3 % inline code ©...© (copyright symbol) emacs: C-q M-) 4 % red highlighting ®...® (registered trademark symbol) emacs: C-q M-. 5 % blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_ 6 % green highlighting ¢...¢ (cent symbol) emacs: C-q M-" 7 % LaTex escape §...§ (section symbol) emacs: C-q M-' 8 % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^ 9 9 % math escape $...$ (dollar symbol) 10 10 … … 101 101 Finally, an approach that is worth mentionning because it is gaining in popularity is transactionnal memory\cite{Dice10}. However, the performance and feature set is currently too restrictive to be possible to add such a paradigm to a language like C or \CC\cit, which is why it was rejected as the core paradigm for concurrency in \CFA. 102 102 103 \s ection{Monitors}103 \subsection{Monitors} 104 104 A monitor is a set of routines that ensure mutual exclusion when accessing shared state. This concept is generally associated with Object-Oriented Languages like Java\cite{Java} or \uC\cite{uC++book} but does not strictly require OOP semantics. The only requirements is the ability to declare a handle to a shared object and a set of routines that act on it : 105 105 \begin{lstlisting} … … 113 113 \end{lstlisting} 114 114 115 \subs ection{Call semantics} \label{call}115 \subsubsection{Call semantics} \label{call} 116 116 The above example of monitors already displays some of their intrinsic caracteristics. Indeed, it is necessary to use pass-by-reference over pass-by-value for monitor routines. This semantics is important because at their core, monitors are implicit mutual exclusion objects (locks), and these objects cannot be copied. Therefore, monitors are implicitly non-copyable. 117 117 \\ … … 147 147 The problem is to indentify which object(s) should be acquired. Furthermore we also need to acquire each objects only once. In case of simple routines like \code{f1} and \code{f2} it is easy to identify an exhaustive list of objects to acquire on entering. Adding indirections (\code{f3}) still allows the compiler and programmer to indentify which object will be acquired. However, adding in arrays (\code{f4}) makes it much harder. Array lengths aren't necessarily known in C and even then making sure we only acquire objects once becomes also none trivial. This can be extended to absurd limits like \code{f5} which uses a custom graph of monitors. To keep everyone as sane as possible\cite{Chicken}, this projects imposes the requirement that a routine may only acquire one monitor per parameter and it must be the type of the parameter (ignoring potential qualifiers and indirections). 148 148 149 \subs ection{Data semantics} \label{data}149 \subsubsection{Data semantics} \label{data} 150 150 Once the call semantics are established, the next step is to establish data semantics. Indeed, until now a monitor is used simply as a generic handle but in most cases monitors contian shared data. This data should be intrinsic to the monitor declaration to prevent any accidental use of data without its appripriate protection. For example here is a more fleshed-out version of the counter showed in \ref{call}: 151 151 \begin{lstlisting} … … 217 217 218 218 Recursive mutex routine calls are allowed in \CFA but if not done carefully it can lead to nested monitor call problems\cite{Lister77}. These problems which are a specific implementation of the lock acquiring order problem. In the example above, the user uses implicit ordering in the case of function \code{bar} but explicit ordering in the case of \code{baz}. This subtle mistake can mean that calling these two functions concurrently will lead to deadlocks, depending on the implicit ordering matching the explicit ordering. As shown on several occasion\cit, there isn't really any solutions to this problem, users simply need to be carefull when acquiring multiple monitors at the same time. 219 220 \subsubsection{Implementation Details: Interaction with polymorphism} 221 At first glance, interaction between monitors and \CFA's concept of polymorphism seem complexe to support. However, it can be reasoned that entry-point locking can solve most of the issues that could be present with polymorphism. 222 223 First of all, interaction between \code{otype} polymorphism and monitors is impossible since monitors do not support copying. Therefore the main question is how to support \code{dtype} polymorphism. We must remember that monitors' main purpose is to ensure mutual exclusion when accessing shared data. This implies that mutual exclusion is only required for routines that do in fact access shared data. However, since \code{dtype} polymorphism always handle incomplete types (by definition) no \code{dtype} polymorphic routine can access shared data since the data would require knowledge about the type. Therefore the only concern when combining \code{dtype} polymorphism and monitors is to protect access to routines. With callsite-locking, this would require significant amount of work since any \code{dtype} routine could have to obtain some lock before calling a routine. However, with entry-point-locking calling a monitor routine becomes exactly the same as calling it from anywhere else. 219 224 220 225 \subsection{Internal scheduling} \label{insched} … … 462 467 463 468 \subsection{External scheduling} \label{extsched} 464 As one might expect, the alternative to Internal scheduling is to use External scheduling instead. This method is somewhat more robust to deadlocks since one of the threads keeps a relatively tight control on scheduling. Indeed, as the following examples will demon trate, external scheduling allows users to wait for events from other threads without the concern of unrelated events occuring. External scheduling can generally be done either in terms of control flow (see \uC) or in terms of data (see Go). Of course, both of these paradigms have their own strenghts and weaknesses but for this project control flow semantics where chosen to stay consistent with the reset of the languages semantics. Two challenges specific to \CFA arise when trying to add external scheduling whith loose object definitions and multi-monitor routines. The following example shows what a simple use \code{accept} versus \code{wait}/\code{signal} and its advantages.469 As one might expect, the alternative to Internal scheduling is to use External scheduling instead. This method is somewhat more robust to deadlocks since one of the threads keeps a relatively tight control on scheduling. Indeed, as the following examples will demonstrate, external scheduling allows users to wait for events from other threads without the concern of unrelated events occuring. External scheduling can generally be done either in terms of control flow (ex: \uC) or in terms of data (ex: Go). Of course, both of these paradigms have their own strenghts and weaknesses but for this project control flow semantics where chosen to stay consistent with the rest of the languages semantics. Two challenges specific to \CFA arise when trying to add external scheduling with loose object definitions and multi-monitor routines. The following example shows what a simple use \code{accept} versus \code{wait}/\code{signal} and its advantages. 465 470 466 471 \begin{center} … … 472 477 condition c; 473 478 public: 474 void f(); 475 void g() { signal} 476 void h() { wait(c); } 479 void f() { signal(c)} 480 void g() { wait(c); } 477 481 private: 478 482 } … … 482 486 public: 483 487 void f(); 484 void g(); 485 void h() { _Accept(g); } 488 void g() { _Accept(f); } 486 489 private: 487 490 } … … 500 503 501 504 void f(A & mutex a); 502 void g(A & mutex a); 503 void h(A & mutex a) { accept(g); } 504 \end{lstlisting} 505 506 While this is the direct translation of the \uC code, at the time of compiling routine \code{f} the \CFA does not already have a declaration of \code{g} while the \uC compiler does. This means that either the compiler has to dynamically find which routines are "acceptable" or the language needs a way of statically listing "acceptable" routines. Since \CFA has no existing concept that resemble dynamic routine definitions or pattern matching, the static approach seems the more consistent with the current language paradigms. This approach leads to the \uC example being translated to : 505 void g(A & mutex a) { accept(f); } 506 \end{lstlisting} 507 508 However, external scheduling is an example where implementation constraints become visible from the interface. Indeed, ince there is no hard limit to the number of threads trying to acquire a monitor concurrently, performance is a significant concern. Here is the pseudo code for the entering phase of a monitor : 509 510 \begin{center} 511 \begin{tabular}{l} 512 \begin{lstlisting} 513 ¶if¶ critical section is free : 514 enter 515 elif critical section accepts me : 516 enter 517 ¶else¶ : 518 block 519 \end{lstlisting} 520 \end{tabular} 521 \end{center} 522 523 For the \code{critical section is free} condition it is easy to implement a check that can evaluate the condition in a few instruction. However, a fast check for \code{critical section accepts me} is much harder to implement depending on the constraints put on the monitors. Indeed, monitors are often expressed as an entry queue and some acceptor queue as in the following figure : 524 525 \begin{center} 526 {\resizebox{0.5\textwidth}{!}{\input{monitor}}} 527 \end{center} 528 529 There are other alternatives to these pictures but in the case of this picture implementing a fast accept check is relatively easy. Indeed simply updating a bitmask when the acceptor queue changes is enough to have a check that executes in a single instruction, even with a fairly large number of acceptor. However, this requires all the acceptable routines to be declared with the monitor declaration. For OO languages this doesn't compromise much since monitors already have an exhaustive list of member routines. However, for \CFA this isn't the case, routines can be added to a type anywhere after its declaration. A more flexible 530 531 532 At this point we must make a decision between flexibility and performance. Many design decisions in \CFA achieve both flexibility and performance, for example polymorphic routines add significant flexibility but inlining them means the optimizer can easily remove any runtime cost. 533 534 This approach leads to the \uC example being translated to : 507 535 \begin{lstlisting} 508 536 accept( void g(mutex struct A & mutex a) ) … … 584 612 Note that the set of monitors passed to the \code{accept} statement must be entirely contained in the set of monitor already acquired in the routine. \code{accept} used in any other context is Undefined Behaviour. 585 613 586 \subsection{Implementation Details} 587 \textbf{\large{Work in progress...}} 588 \subsubsection{Interaction with polymorphism} 589 At first glance, interaction between monitors and \CFA's concept of polymorphism seem complexe to support. However, it can be reasoned that entry-point locking can solve most of the issues that could be present with polymorphism. 590 591 First of all, interaction between \code{otype} polymorphism and monitors is impossible since monitors do not support copying. Therefore the main question is how to support \code{dtype} polymorphism. We must remember that monitors' main purpose is to ensure mutual exclusion when accessing shared data. This implies that mutual exclusion is only required for routines that do in fact access shared data. However, since \code{dtype} polymorphism always handle incomplete types (by definition) no \code{dtype} polymorphic routine can access shared data since the data would require knowledge about the type. Therefore the only concern when combining \code{dtype} polymorphism and monitors is to protect access to routines. With callsite-locking, this would require significant amount of work since any \code{dtype} routine could have to obtain some lock before calling a routine. However, with entry-point-locking calling a monitor routine becomes exactly the same as calling it from anywhere else. 592 593 \subsubsection{External scheduling queues} 614 \subsubsection{Implementation Details: External scheduling queues} 594 615 To support multi-monitor external scheduling means that some kind of entry-queues must be used that is aware of both monitors. However, acceptable routines must be aware of the entry queues which means they most be stored inside at least one of the monitors that will be acquired. This in turn adds the requirement a systematic algorithm of disambiguating which queue is relavant regardless of user ordering. The proposed algorithm is to fall back on monitors lock ordering and specify that the monitor that is acquired first is the lock with the relevant entry queue. This assumes that the lock acquiring order is static for the lifetime of all concerned objects gut that is a reasonnable contraint. This algorithm choice has two consequences, the ofthe highest priority monitor is no longer a true FIFO queue and the queue of the lowest priority monitor is both required and probably unused. The queue can no longer be a FIFO queue because instead of simply containing the waiting threads in order arrival, they also contain the second mutex. Therefore, another thread with the same highest priority monitor but a different lowest priority monitor may arrive first but enter the critical section after a thread with the correct pairing. Secondly, since it may not be known at compile time which monitor will be the lowest priority monitor, every monitor needs to have the correct queues even though it is probably that half the multi-monitor queues will go unused for the entire duration of the program. 595 616 -
doc/proposals/concurrency/version
ra3eaa29 rb512454 1 0.4. 221 0.4.61
Note: See TracChangeset
for help on using the changeset viewer.