Changeset e04aec4


Ignore:
Timestamp:
Jun 20, 2018, 8:41:48 AM (3 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, with_gc
Children:
db4062d
Parents:
f184ca3
Message:

more updates

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/concurrency/Paper.tex

    rf184ca3 re04aec4  
    271271Hence, there are two problems to be solved: concurrency and parallelism.
    272272While these two concepts are often combined, they are distinct, requiring different tools~\cite[\S~2]{Buhr05a}.
    273 Concurrency tools handle synchronization and mutual exclusion, while parallelism tools handle performance, cost and resource utilization.
     273Concurrency tools handle mutual exclusion and synchronization, while parallelism tools handle performance, cost, and resource utilization.
    274274
    275275The proposed concurrency API is implemented in a dialect of C, called \CFA.
     
    282282Extended versions and explanation of the following code examples are available at the \CFA website~\cite{Cforall} or in Moss~\etal~\cite{Moss18}.
    283283
    284 \CFA is an extension of ISO-C, and hence, supports all C paradigms.
     284\CFA is a non-object-oriented extension of ISO-C, and hence, supports all C paradigms.
    285285%It is a non-object-oriented system-language, meaning most of the major abstractions have either no runtime overhead or can be opted out easily.
    286 Like C, the basics of \CFA revolve around structures and routines.
     286Like C, the building blocks of \CFA are structures and routines.
    287287Virtually all of the code generated by the \CFA translator respects C memory layouts and calling conventions.
    288288While \CFA is not an object-oriented language, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C does have a notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}.
     
    296296int x = 1, y = 2, z = 3;
    297297int * p1 = &x, ** p2 = &p1,  *** p3 = &p2,      $\C{// pointers to x}$
    298         `&` r1 = x,  `&&` r2 = r1,  `&&&` r3 = r2;      $\C{// references to x}$
     298    `&` r1 = x,   `&&` r2 = r1,   `&&&` r3 = r2;        $\C{// references to x}$
    299299int * p4 = &z, `&` r4 = z;
    300300
     
    411411\end{cquote}
    412412Overloading is important for \CFA concurrency since the runtime system relies on creating different types to represent concurrency objects.
    413 Therefore, overloading is necessary to prevent the need for long prefixes and other naming conventions to prevent name clashes.
     413Therefore, overloading eliminates long prefixes and other naming conventions to prevent name clashes.
    414414As seen in Section~\ref{basics}, routine @main@ is heavily overloaded.
    415 
    416 Variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name:
     415For example, variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name:
    417416\begin{cfa}
    418417struct S { int `i`; int j; double m; } s;
     
    428427}
    429428\end{cfa}
    430 For parallel semantics, both @s.i@ and @t.i@ are visible the same type, so only @i@ is ambiguous without qualification.
     429For parallel semantics, both @s.i@ and @t.i@ are visible with the same type, so only @i@ is ambiguous without qualification.
    431430
    432431
     
    468467\end{cquote}
    469468While concurrency does not use operator overloading directly, it provides an introduction for the syntax of constructors.
    470 
    471 
    472 \subsection{Parametric Polymorphism}
    473 \label{s:ParametricPolymorphism}
    474 
    475 The signature feature of \CFA is parametric-polymorphic routines~\cite{} with routines generalized using a @forall@ clause (giving the language its name), which allow separately compiled routines to support generic usage over multiple types.
    476 For example, the following sum routine works for any type that supports construction from 0 and addition:
    477 \begin{cfa}
    478 forall( otype T | { void `?{}`( T *, zero_t ); T `?+?`( T, T ); } ) // constraint type, 0 and +
    479 T sum( T a[$\,$], size_t size ) {
    480         `T` total = { `0` };                                    $\C{// initialize by 0 constructor}$
    481         for ( size_t i = 0; i < size; i += 1 )
    482                 total = total `+` a[i];                         $\C{// select appropriate +}$
    483         return total;
    484 }
    485 S sa[5];
    486 int i = sum( sa, 5 );                                           $\C{// use S's 0 construction and +}$
    487 \end{cfa}
    488 
    489 \CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each routine declaration:
    490 \begin{cfa}
    491 trait `sumable`( otype T ) {
    492         void `?{}`( T &, zero_t );                              $\C{// 0 literal constructor}$
    493         T `?+?`( T, T );                                                $\C{// assortment of additions}$
    494         T ?+=?( T &, T );
    495         T ++?( T & );
    496         T ?++( T & );
    497 };
    498 forall( otype T `| sumable( T )` )                      $\C{// use trait}$
    499 T sum( T a[$\,$], size_t size );
    500 \end{cfa}
    501 
    502 Assertions can be @otype@ or @dtype@.
    503 @otype@ refers to a ``complete'' object, \ie an object has a size, default constructor, copy constructor, destructor and an assignment operator.
    504 @dtype@ only guarantees an object has a size and alignment.
    505 
    506 Using the return type for discrimination, it is possible to write a type-safe @alloc@ based on the C @malloc@:
    507 \begin{cfa}
    508 forall( dtype T | sized(T) ) T * alloc( void ) { return (T *)malloc( sizeof(T) ); }
    509 int * ip = alloc();                                                     $\C{// select type and size from left-hand side}$
    510 double * dp = alloc();
    511 struct S {...} * sp = alloc();
    512 \end{cfa}
    513 where the return type supplies the type/size of the allocation, which is impossible in most type systems.
    514469
    515470
     
    540495\CFA also provides @new@ and @delete@, which behave like @malloc@ and @free@, in addition to constructing and destructing objects:
    541496\begin{cfa}
    542 {       struct S s = {10};                                              $\C{// allocation, call constructor}$
    543         ...
     497{
     498        ... struct S s = {10}; ...                              $\C{// allocation, call constructor}$
    544499}                                                                                       $\C{// deallocation, call destructor}$
    545500struct S * s = new();                                           $\C{// allocation, call constructor}$
     
    547502delete( s );                                                            $\C{// deallocation, call destructor}$
    548503\end{cfa}
    549 \CFA concurrency uses object lifetime as a means of synchronization and/or mutual exclusion.
     504\CFA concurrency uses object lifetime as a means of mutual exclusion and/or synchronization.
     505
     506
     507\subsection{Parametric Polymorphism}
     508\label{s:ParametricPolymorphism}
     509
     510The signature feature of \CFA is parametric-polymorphic routines~\cite{} with routines generalized using a @forall@ clause (giving the language its name), which allow separately compiled routines to support generic usage over multiple types.
     511For example, the following sum routine works for any type that supports construction from 0 and addition:
     512\begin{cfa}
     513forall( otype T | { void `?{}`( T *, zero_t ); T `?+?`( T, T ); } ) // constraint type, 0 and +
     514T sum( T a[$\,$], size_t size ) {
     515        `T` total = { `0` };                                    $\C{// initialize by 0 constructor}$
     516        for ( size_t i = 0; i < size; i += 1 )
     517                total = total `+` a[i];                         $\C{// select appropriate +}$
     518        return total;
     519}
     520S sa[5];
     521int i = sum( sa, 5 );                                           $\C{// use S's 0 construction and +}$
     522\end{cfa}
     523The builtin type @zero_t@ (and @one_t@) overload constant 0 (and 1) for a new types, where both 0 and 1 have special meaning in C.
     524
     525\CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each routine declaration:
     526\begin{cfa}
     527trait `sumable`( otype T ) {
     528        void `?{}`( T &, zero_t );                              $\C{// 0 literal constructor}$
     529        T `?+?`( T, T );                                                $\C{// assortment of additions}$
     530        T ?+=?( T &, T );
     531        T ++?( T & );
     532        T ?++( T & );
     533};
     534forall( otype T `| sumable( T )` )                      $\C{// use trait}$
     535T sum( T a[$\,$], size_t size );
     536\end{cfa}
     537
     538Assertions can be @otype@ or @dtype@.
     539@otype@ refers to a ``complete'' object, \ie an object has a size, default constructor, copy constructor, destructor and an assignment operator.
     540@dtype@ only guarantees an object has a size and alignment.
     541
     542Using the return type for discrimination, it is possible to write a type-safe @alloc@ based on the C @malloc@:
     543\begin{cfa}
     544forall( dtype T | sized(T) ) T * alloc( void ) { return (T *)malloc( sizeof(T) ); }
     545int * ip = alloc();                                                     $\C{// select type and size from left-hand side}$
     546double * dp = alloc();
     547struct S {...} * sp = alloc();
     548\end{cfa}
     549where the return type supplies the type/size of the allocation, which is impossible in most type systems.
    550550
    551551
     
    727727
    728728Using a coroutine, it is possible to express the Fibonacci formula directly without any of the C problems.
    729 Figure~\ref{f:Coroutine3States} creates a @coroutine@ type:
    730 \begin{cfa}
    731 `coroutine` Fib { int fn; };
    732 \end{cfa}
    733 which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface routines @next@.
     729Figure~\ref{f:Coroutine3States} creates a @coroutine@ type, @`coroutine` Fib { int fn; }@, which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface routines, \eg @next@.
    734730Like the structure in Figure~\ref{f:ExternalState}, the coroutine type allows multiple instances, where instances of this type are passed to the (overloaded) coroutine main.
    735 The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's resume.
     731The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's @resume@.
    736732The interface routine @next@, takes a Fibonacci instance and context switches to it using @resume@;
    737733on restart, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned.
     
    843839\end{figure}
    844840
    845 The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming routine for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal routines
    846 However, there is no stack growth because @resume@/@suspend@ context switch to existing stack-frames rather than create new ones.
     841The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming routine for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal routines.
     842However,@resume@/@suspend@ context switch to existing stack-frames rather than create new ones so there is no stack growth.
    847843\newterm{Symmetric (full) coroutine}s have a coroutine call a resuming routine for another coroutine, which eventually forms a resuming-call cycle.
    848844(The trivial cycle is a coroutine resuming itself.)
     
    933929The producer call to @delivery@ transfers values into the consumer's communication variables, resumes the consumer, and returns the consumer status.
    934930For the first resume, @cons@'s stack is initialized, creating local variables retained between subsequent activations of the coroutine.
    935 The consumer iterates until the @done@ flag is set, prints, increments status, and calls back to the producer via @payment@, and on return from @payment@, prints the receipt from the producer and increments @money@ (inflation).
     931The consumer iterates until the @done@ flag is set, prints the values delivered by the producer, increments status, and calls back to the producer via @payment@, and on return from @payment@, prints the receipt from the producer and increments @money@ (inflation).
    936932The call from the consumer to the @payment@ introduces the cycle between producer and consumer.
    937933When @payment@ is called, the consumer copies values into the producer's communication variable and a resume is executed.
     
    963959\end{cfa}
    964960and the programming language (and possibly its tool set, \eg debugger) may need to understand @baseCoroutine@ because of the stack.
    965 Furthermore, the execution of constructs/destructors is in the wrong order for certain operations, \eg for threads;
    966 \eg, if the thread is implicitly started, it must start \emph{after} all constructors, because the thread relies on a completely initialized object, but the inherited constructor runs \emph{before} the derived.
     961Furthermore, the execution of constructs/destructors is in the wrong order for certain operations.
     962For example, for threads if the thread is implicitly started, it must start \emph{after} all constructors, because the thread relies on a completely initialized object, but the inherited constructor runs \emph{before} the derived.
    967963
    968964An alternatively is composition:
     
    984980symmetric_coroutine<>::yield_type
    985981\end{cfa}
    986 Similarly, the canonical threading paradigm is often based on routine pointers, \eg @pthread@~\cite{pthreads}, \Csharp~\cite{Csharp}, Go~\cite{Go}, and Scala~\cite{Scala}.
     982Similarly, the canonical threading paradigm is often based on routine pointers, \eg @pthreads@~\cite{pthreads}, \Csharp~\cite{Csharp}, Go~\cite{Go}, and Scala~\cite{Scala}.
    987983However, the generic thread-handle (identifier) is limited (few operations), unless it is wrapped in a custom type.
    988984\begin{cfa}
     
    1001997Note, the type @coroutine_t@ must be an abstract handle to the coroutine, because the coroutine descriptor and its stack are non-copyable.
    1002998Copying the coroutine descriptor results in copies being out of date with the current state of the stack.
    1003 Correspondingly, copying the stack results is copies being out of date with coroutine descriptor, and pointers in the stack being out of date to data on the stack.
     999Correspondingly, copying the stack results is copies being out of date with the coroutine descriptor, and pointers in the stack being out of date to data on the stack.
    10041000(There is no mechanism in C to find all stack-specific pointers and update them as part of a copy.)
    10051001
     
    10151011Furthermore, implementing coroutines without language supports also displays the power of a programming language.
    10161012While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed without using the language support.
    1017 The reserved keyword eases use for the common cases.
     1013The reserved keyword simply eases use for the common cases.
    10181014
    10191015Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait is used to restrict coroutine-manipulation routines:
     
    10301026The @main@ routine has no return value or additional parameters because the coroutine type allows an arbitrary number of interface routines with corresponding arbitrary typed input/output values versus fixed ones.
    10311027The generic routines @suspend@ and @resume@ can be redefined, but any object passed to them is a coroutine since it must satisfy the @is_coroutine@ trait to compile.
    1032 The advantage of this approach is that users can easily create different types of coroutines, for example, changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine, and possibly redefining @suspend@ and @resume@.
     1028The advantage of this approach is that users can easily create different types of coroutines, \eg changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine, and possibly redefining @suspend@ and @resume@.
    10331029The \CFA keyword @coroutine@ implicitly implements the getter and forward declarations required for implementing the coroutine main:
    10341030\begin{cquote}
     
    10981094The difference is that a coroutine borrows a thread from its caller, so the first thread resuming a coroutine creates an instance of @main@;
    10991095whereas, a user thread receives its own thread from the runtime system, which starts in @main@ as some point after the thread constructor is run.\footnote{
    1100 The \lstinline@main@ routine is already a special routine in C (where the program begins), so it is a natural extension of the semantics to use overloading to declare mains for different coroutines/threads (the normal main being the main of the initial thread).}
     1096The \lstinline@main@ routine is already a special routine in C, \ie where the program's initial thread begins, so it is a natural extension of this semantics to use overloading to declare \lstinline@main@s for user coroutines and threads.}
    11011097No return value or additional parameters are necessary for this routine because the task type allows an arbitrary number of interface routines with corresponding arbitrary typed input/output values.
    11021098
     
    11891185void main( Adder & adder ) with( adder ) {
    11901186    subtotal = 0;
    1191     for ( int c = 0; c < cols; c += 1 ) {
    1192                 subtotal += row[c];
    1193     }
     1187    for ( int c = 0; c < cols; c += 1 ) { subtotal += row[c]; }
    11941188}
    11951189int main() {
     
    12161210
    12171211Uncontrolled non-deterministic execution is meaningless.
    1218 To reestablish meaningful execution requires mechanisms to reintroduce determinism (\ie restrict non-determinism), called mutual exclusion and synchronization, where mutual exclusion is an access-control mechanism on data shared by threads, and synchronization is a timing relationship among threads~\cite[\S~4]{Buhr05a}.
     1212To reestablish meaningful execution requires mechanisms to reintroduce determinism, \ie restrict non-determinism, called mutual exclusion and synchronization, where mutual exclusion is an access-control mechanism on data shared by threads, and synchronization is a timing relationship among threads~\cite[\S~4]{Buhr05a}.
    12191213Since many deterministic challenges appear with the use of mutable shared state, some languages/libraries disallow it, \eg Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka~\cite{Akka} (Scala).
    1220 In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely relate to networking concepts (\eg channels~\cite{CSP,Go}).
    1221 However, in call/return-based languages, these approaches force a clear distinction (\ie introduce a new programming paradigm) between regular and concurrent computation (\ie routine call versus message passing).
     1214In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely relate to networking concepts, \eg channels~\cite{CSP,Go}.
     1215However, in call/return-based languages, these approaches force a clear distinction, \ie introduce a new programming paradigm, between regular and concurrent computation, \eg routine call versus message passing.
    12221216Hence, a programmer must learn and manipulate two sets of design patterns.
    12231217While this distinction can be hidden away in library code, effective use of the library still has to take both paradigms into account.
     
    12441238However, many solutions exist for mutual exclusion, which vary in terms of performance, flexibility and ease of use.
    12451239Methods range from low-level locks, which are fast and flexible but require significant attention for correctness, to higher-level concurrency techniques, which sacrifice some performance to improve ease of use.
    1246 Ease of use comes by either guaranteeing some problems cannot occur (\eg deadlock free), or by offering a more explicit coupling between shared data and critical section.
    1247 For example, the \CC @std::atomic<T>@ offers an easy way to express mutual-exclusion on a restricted set of operations (\eg reading/writing) for numerical types.
     1240Ease of use comes by either guaranteeing some problems cannot occur, \eg deadlock free, or by offering a more explicit coupling between shared data and critical section.
     1241For example, the \CC @std::atomic<T>@ offers an easy way to express mutual-exclusion on a restricted set of operations, \eg reading/writing, for numerical types.
    12481242However, a significant challenge with locks is composability because it takes careful organization for multiple locks to be used while preventing deadlock.
    12491243Easing composability is another feature higher-level mutual-exclusion mechanisms can offer.
     
    12541248Synchronization enforces relative ordering of execution, and synchronization tools provide numerous mechanisms to establish these timing relationships.
    12551249Low-level synchronization primitives offer good performance and flexibility at the cost of ease of use;
    1256 higher-level mechanisms often simplify usage by adding better coupling between synchronization and data (\eg message passing), or offering a simpler solution to otherwise involved challenges, \eg barrier lock.
     1250higher-level mechanisms often simplify usage by adding better coupling between synchronization and data, \eg message passing, or offering a simpler solution to otherwise involved challenges, \eg barrier lock.
    12571251Often synchronization is used to order access to a critical section, \eg ensuring a reader thread is the next kind of thread to enter a critical section.
    12581252If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, that reader has \newterm{barged}.
     
    12721266The strong association with the call/return paradigm eases programmability, readability and maintainability, at a slight cost in flexibility and efficiency.
    12731267
    1274 Note, like coroutines/threads, both locks and monitors require an abstract handle to reference them, because at their core, both mechanisms are manipulating non-copyable shared state.
     1268Note, like coroutines/threads, both locks and monitors require an abstract handle to reference them, because at their core, both mechanisms are manipulating non-copyable shared-state.
    12751269Copying a lock is insecure because it is possible to copy an open lock and then use the open copy when the original lock is closed to simultaneously access the shared data.
    12761270Copying a monitor is secure because both the lock and shared data are copies, but copying the shared data is meaningless because it no longer represents a unique entity.
     
    13751369\end{cfa}
    13761370(While object-oriented monitors can be extended with a mutex qualifier for multiple-monitor members, no prior example of this feature could be found.)
    1377 In practice, writing multi-locking routines that do not deadlocks is tricky.
     1371In practice, writing multi-locking routines that do not deadlock is tricky.
    13781372Having language support for such a feature is therefore a significant asset for \CFA.
    13791373
    13801374The capability to acquire multiple locks before entering a critical section is called \newterm{bulk acquire}.
    1381 In previous example, \CFA guarantees the order of acquisition is consistent across calls to different routines using the same monitors as arguments.
     1375In the previous example, \CFA guarantees the order of acquisition is consistent across calls to different routines using the same monitors as arguments.
    13821376This consistent ordering means acquiring multiple monitors is safe from deadlock.
    13831377However, users can force the acquiring order.
     
    13951389In the calls to @bar@ and @baz@, the monitors are acquired in opposite order.
    13961390
    1397 However, such use leads to lock acquiring order problems resulting in deadlock~\cite{Lister77}, where detecting it requires dynamically tracking of monitor calls, and dealing with it requires implement rollback semantics~\cite{Dice10}.
     1391However, such use leads to lock acquiring order problems resulting in deadlock~\cite{Lister77}, where detecting it requires dynamically tracking of monitor calls, and dealing with it requires rollback semantics~\cite{Dice10}.
    13981392In \CFA, safety is guaranteed by using bulk acquire of all monitors to shared objects, whereas other monitor systems provide no aid.
    13991393While \CFA provides only a partial solution, the \CFA partial solution handles many useful cases.
     
    14401434
    14411435
    1442 \section{Internal Scheduling}
    1443 \label{s:InternalScheduling}
     1436\section{Scheduling}
     1437\label{s:Scheduling}
    14441438
    14451439While monitor mutual-exclusion provides safe access to shared data, the monitor data may indicate that a thread accessing it cannot proceed.
     
    14541448The appropriate condition lock is signalled to unblock an opposite kind of thread after an element is inserted/removed from the buffer.
    14551449Signalling is unconditional, because signalling an empty condition lock does nothing.
     1450
    14561451Signalling semantics cannot have the signaller and signalled thread in the monitor simultaneously, which means:
    14571452\begin{enumerate}
     
    14631458The signalling thread blocks but is marked for urgrent unblocking at the next scheduling point and the signalled thread continues.
    14641459\end{enumerate}
    1465 The first approach is too restrictive, as it precludes solving a reasonable class of problems (\eg dating service).
     1460The first approach is too restrictive, as it precludes solving a reasonable class of problems, \eg dating service.
    14661461\CFA supports the next two semantics as both are useful.
    14671462Finally, while it is common to store a @condition@ as a field of the monitor, in \CFA, a @condition@ variable can be created/stored independently.
     
    15391534If the buffer is full, only calls to @remove@ can acquire the buffer, and if the buffer is empty, only calls to @insert@ can acquire the buffer.
    15401535Threads making calls to routines that are currently excluded block outside (external) of the monitor on a calling queue, versus blocking on condition queues inside (internal) of the monitor.
     1536% External scheduling is more constrained and explicit, which helps programmers reduce the non-deterministic nature of concurrency.
     1537External scheduling allows users to wait for events from other threads without concern of unrelated events occurring.
     1538The mechnaism can be done in terms of control flow, \eg Ada @accept@ or \uC @_Accept@, or in terms of data, \eg Go channels.
     1539Of course, both of these paradigms have their own strengths and weaknesses, but for this project, control-flow semantics was chosen to stay consistent with the rest of the languages semantics.
     1540Two challenges specific to \CFA arise when trying to add external scheduling with loose object definitions and multiple-monitor routines.
     1541The previous example shows a simple use @_Accept@ versus @wait@/@signal@ and its advantages.
     1542Note that while other languages often use @accept@/@select@ as the core external scheduling keyword, \CFA uses @waitfor@ to prevent name collisions with existing socket \textbf{api}s.
    15411543
    15421544For internal scheduling, non-blocking signalling (as in the producer/consumer example) is used when the signaller is providing the cooperation for a waiting thread;
    15431545the signaller enters the monitor and changes state, detects a waiting threads that can use the state, performs a non-blocking signal on the condition queue for the waiting thread, and exits the monitor to run concurrently.
    1544 The waiter unblocks next, takes the state, and exits the monitor.
     1546The waiter unblocks next, uses/takes the state, and exits the monitor.
    15451547Blocking signalling is the reverse, where the waiter is providing the cooperation for the signalling thread;
    15461548the signaller enters the monitor, detects a waiting thread providing the necessary state, performs a blocking signal to place it on the urgent queue and unblock the waiter.
    1547 The waiter changes state and exits the monitor, and the signaller unblocks next from the urgent queue to take the state.
     1549The waiter changes state and exits the monitor, and the signaller unblocks next from the urgent queue to use/take the state.
    15481550
    15491551Figure~\ref{f:DatingService} shows a dating service demonstrating the two forms of signalling: non-blocking and blocking.
    15501552The dating service matches girl and boy threads with matching compatibility codes so they can exchange phone numbers.
    15511553A thread blocks until an appropriate partner arrives.
    1552 The complexity is exchanging phone number in the monitor,
    1553 While the non-barging monitor prevents a caller from stealing a phone number, the monitor mutual-exclusion property
    1554 
    1555 The dating service is an example of a monitor that cannot be written using external scheduling because:
    1556 
    1557 The example in table \ref{tbl:datingservice} highlights the difference in behaviour.
    1558 As mentioned, @signal@ only transfers ownership once the current critical section exits; this behaviour requires additional synchronization when a two-way handshake is needed.
    1559 To avoid this explicit synchronization, the @condition@ type offers the @signal_block@ routine, which handles the two-way handshake as shown in the example.
    1560 This feature removes the need for a second condition variables and simplifies programming.
    1561 Like every other monitor semantic, @signal_block@ uses barging prevention, which means mutual-exclusion is baton-passed both on the front end and the back end of the call to @signal_block@, meaning no other thread can acquire the monitor either before or after the call.
     1554The complexity is exchanging phone number in the monitor because the monitor mutual-exclusion property prevents exchanging numbers.
     1555For internal scheduling, the @exchange@ condition is necessary to block the thread finding the match, while the matcher unblocks to take the oppose number, post its phone number, and unblock the partner.
     1556For external scheduling, the implicit urgent-condition replaces the explict @exchange@-condition and @signal_block@ puts the finding thread on the urgent condition and unblocks the matcher..
     1557
     1558The dating service is an example of a monitor that cannot be written using external scheduling because it requires knowledge of calling parameters to make scheduling decisions, and parameters of waiting threads are unavailable;
     1559as well, an arriving thread may not find a partner and must wait, which requires a condition variable, and condition variables imply internal scheduling.
    15621560
    15631561\begin{figure}
     
    16551653}
    16561654\end{cfa}
    1657 must have acquired monitor locks that are greater than or equal to the number of locks for the waiting thread signalled from the front of the condition queue.
    1658 In general, the signaller does not know the order of waiting threads, so in general, it must acquire the maximum number of mutex locks for the worst-case waiting thread.
     1655must have acquired monitor locks that are greater than or equal to the number of locks for the waiting thread signalled from the condition queue.
     1656{\color{red}In general, the signaller does not know the order of waiting threads, so in general, it must acquire the maximum number of mutex locks for the worst-case waiting thread.}
    16591657
    16601658Similarly, for @waitfor( rtn )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@.
     
    16671665void foo( M & mutex m1, M & mutex m2 ) {
    16681666        ... wait( `e, m1` ); ...                                $\C{// release m1, keeping m2 acquired )}$
    1669 void baz( M & mutex m1, M & mutex m2 ) {        $\C{// must acquire m1 and m2 )}$
     1667void bar( M & mutex m1, M & mutex m2 ) {        $\C{// must acquire m1 and m2 )}$
    16701668        ... signal( `e` ); ...
    16711669\end{cfa}
    1672 The @wait@ only releases @m1@ so the signalling thread cannot acquire both @m1@ and @m2@ to  enter @baz@ to get to the @signal@.
     1670The @wait@ only releases @m1@ so the signalling thread cannot acquire both @m1@ and @m2@ to  enter @bar@ to get to the @signal@.
    16731671While deadlock issues can occur with multiple/nesting acquisition, this issue results from the fact that locks, and by extension monitors, are not perfectly composable.
    16741672
     
    17551753However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting is choices when the signaller finishes the inner mutex-statement.
    17561754The singaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@.
    1757 In the latter case, waiter W2 must eventually pass @m2@ to waiter W1, which is complex because W2 may have waited before W1 so it is unaware of W1.
     1755In the latter case, waiter W2 must eventually pass @m2@ to waiter W1, which is complex because W1 may have waited before W2, so W2 is unaware of it.
    17581756Furthermore, there is an execution sequence where the signaller always finds waiter W2, and hence, waiter W1 starves.
    17591757
     
    18611859
    18621860
     1861\begin{comment}
    18631862\section{External scheduling} \label{extsched}
    18641863
    1865 An alternative to internal scheduling is external scheduling (see Table~\ref{tbl:sched}).
    1866 
    1867 \begin{comment}
    18681864\begin{table}
    18691865\begin{tabular}{|c|c|c|}
     
    19291925\label{tbl:sched}
    19301926\end{table}
    1931 \end{comment}
    1932 
    1933 This method is more constrained and explicit, which helps users reduce the non-deterministic nature of concurrency.
    1934 Indeed, as the following examples demonstrate, external scheduling allows users to wait for events from other threads without the concern of unrelated events occurring.
    1935 External scheduling can generally be done either in terms of control flow (\eg Ada with @accept@, \uC with @_Accept@) or in terms of data (\eg Go with channels).
    1936 Of course, both of these paradigms have their own strengths and weaknesses, but for this project, control-flow semantics was chosen to stay consistent with the rest of the languages semantics.
    1937 Two challenges specific to \CFA arise when trying to add external scheduling with loose object definitions and multiple-monitor routines.
    1938 The previous example shows a simple use @_Accept@ versus @wait@/@signal@ and its advantages.
    1939 Note that while other languages often use @accept@/@select@ as the core external scheduling keyword, \CFA uses @waitfor@ to prevent name collisions with existing socket \textbf{api}s.
    19401927
    19411928For the @P@ member above using internal scheduling, the call to @wait@ only guarantees that @V@ is the last routine to access the monitor, allowing a third routine, say @isInUse()@, acquire mutual exclusion several times while routine @P@ is waiting.
    19421929On the other hand, external scheduling guarantees that while routine @P@ is waiting, no other routine than @V@ can acquire the monitor.
    1943 
    1944 % ======================================================================
    1945 % ======================================================================
     1930\end{comment}
     1931
     1932
    19461933\subsection{Loose Object Definitions}
    1947 % ======================================================================
    1948 % ======================================================================
     1934
    19491935In \uC, a monitor class declaration includes an exhaustive list of monitor operations.
    19501936Since \CFA is not object oriented, monitors become both more difficult to implement and less clear for a user:
Note: See TracChangeset for help on using the changeset viewer.