Context Navigation

← Previous Change
Next Change →

Changeset c03c1ac for doc/theses/colby_parsons_MMAth

Timestamp:

Jul 20, 2023, 3:47:25 PM (13 months ago)

Author:

Peter A. Buhr <pabuhr@…>

Branches:

Children:

Parents:

Message:

more proofreading waituntil chapter

Location:

doc/theses/colby_parsons_MMAth/text

Files:

: 2 edited

channels.tex (modified) (4 diffs)
waituntil.tex (modified) (20 diffs)

Legend:

: Unmodified
: Added
: Removed

doc/theses/colby_parsons_MMAth/text/channels.tex

-                      r0e8f4c6
+                      rc03c1ac
 \begin{lrbox}{\myboxB}
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 channel( size_t ) chan{ 128 };
+channel( int ) chan{ 128 };
 thread Consumer {};
 thread Producer {};
 …
         try {
                 for ()
                         insert( chan, 5 );
+                        chan << 5;
         } catch( channel_closed * ) {
                 // unhandled resume or full
 …
+}
 void main( Consumer & this ) {
+        int i;
         try {
                 for () { int i = remove( chan ); }
+                for () { i << chan; }
         @} catchResume( channel_closed * ) {@
                 // handled resume => consume from chan
 …
         close( chan );
+}

doc/theses/colby_parsons_MMAth/text/waituntil.tex

-                      r0e8f4c6
+                      rc03c1ac
 Now the problem is extended.
 Some stalls are wheelchair accessible and some stalls have specific gender identification.
+Some stalls are wheelchair accessible and some stalls have gender identification.
 Each person (thread) may be limited to only one kind of stall or may choose among different kinds of stalls that match their criteria.
 Immediately, the problem becomes more difficult.
 A single queue no longer fully solves the problem.
+A single queue no longer solves the problem.
 What happens when there is a stall available that the person at the front of the queue cannot choose?
 The na\"ive solution has each thread spin indefinitely continually checking the every matching kind of stall(s) until a suitable one is free.
+The na\"ive solution has each thread spin indefinitely continually checking every matching kind of stall(s) until a suitable one is free.
 This approach is insufficient since it wastes cycles and results in unfairness among waiting threads as a thread can acquire the first matching stall without regard to the waiting time of other threads.
 Waiting for the first appropriate stall (resource) that becomes available without spinning is an example of \gls{synch_multiplex}: the ability to wait synchronously for one or more resources based on some selection criteria.
 …
 (@G1@(x) $\rightarrow$ P @|@ @G2@(y) $\rightarrow$ Q )
 \end{cfa}
 waits for either channel @x@ or @y@ to have a value, if and only guards @G1@ and @G2@ are true;
+waits for either channel @x@ or @y@ to have a value, if and only if guards @G1@ and @G2@ are true;
 if only one guard is true, only one channel receives, and if both guards are false, no receive occurs.
 % extended CSP with a \gls{synch_multiplex} construct @ALT@, which waits for one resource to be available and then executes a corresponding block of code.
 …
 but require $2^N-1$ @if@ statements, where $N$ is the number of guards.
 The exponential blowup comes from applying rule 2.4 repeatedly, since it works on one guard at a time.
 Figure~\ref{f:wu_if} shows an example of applying rule 2.4 for three guards.
+Figure~\ref{f:wu_if} shows in \CFA an example of applying rule 2.4 for three guards.
 Also, notice the additional code duplication for statements @S1@, @S2@, and @S3@.
 …
 When discussing \gls{synch_multiplex} implementations, the resource being multiplexed is important.
 While CSP wait on channels, the earliest known implementation of synch\-ronous multiplexing is Unix's @select@~\cite{linux:select}, multiplexing over file descriptors.
+While CSP waits on channels, the earliest known implementation of synch\-ronous multiplexing is Unix's @select@~\cite{linux:select}, multiplexing over file descriptors.
 The @select@ system-call is passed three sets of file descriptors (read, write, exceptional) to wait on and an optional timeout.
 @select@ blocks until either some subset of file descriptors are available or the timeout expires.
 …
 It is possible to achieve exclusive-or semantics with @select@ by arbitrarily operating on only one of the returned descriptors.
 @select@ passes the interest set of file descriptors between application and kernel in the form of a worst-case sized bit-mask, where the worst-case is the largest numbered file descriptor.
 @poll@ reduces the size of the interest sets changing from a bit mask to a linked data structures, independent of the file-descriptor values.
+@poll@ reduces the size of the interest sets changing from a bit mask to a linked data structure, independent of the file-descriptor values.
 @epoll@ further reduces the data passed per call by keeping the interest set in the kernel, rather than supplying it on every call.
 …
 Later, \gls{synch_multiplex} started to appear in applications, via programming languages, to support fast multiplexed concurrent communication among threads.
 An early example of \gls{synch_multiplex} is the @select@ statement in Ada~\cite[\S~9.7]{Ichbiah79}.
 The @select@ statement in Ada allows a task object, with their own threads, to multiplex over a subset of asynchronous calls its methods.
 The Ada @select@ statement has the same exclusive-or semantics and guards as Occam ALT;
+This @select@ allows a task object, with their own threads, to multiplex over a subset of asynchronous calls to its methods.
+The Ada @select@ has the same exclusive-or semantics and guards as Occam ALT;
 however, it multiplexes over methods rather than channels.
 …
 \end{figure}
 Figure~\ref{f:BB_Ada} shows the outline of a bounded buffer implemented with Ada task.
+Figure~\ref{f:BB_Ada} shows the outline of a bounded buffer implemented with an Ada task.
 Note, a task method is associated with the \lstinline[language=ada]{accept} clause of the \lstinline[language=ada]{select} statement, rather than being a separate routine.
 The thread executing the loop in the task body blocks at the \lstinline[language=ada]{select} until a call occurs to @insert@ or @remove@.
 Then the appropriate \lstinline[language=ada]{accept} method is run with the called arguments.
 Hence, the @select@ statement provides rendezvous points for threads, rather than providing channels with message passing.
+Hence, the \lstinline[language=ada]{select} statement provides rendezvous points for threads, rather than providing channels with message passing.
 The \lstinline[language=ada]{select} statement also provides a timeout and @else@ (nonblocking), which changes synchronous multiplexing to asynchronous.
 Now the thread polls rather than blocks.
 …
 While these approaches solve the \gls{synch_multiplex} problem, they introduce other issues.
 Consider the case where a thread has a single source of communication and it wants a set of @N@ resources.
 It sequentially requests the @N@ resources and waits for each response.
+It must sequentially request the @N@ resources and wait for each response.
 During the receives for the @N@ resources, it can receive other communication, and has to save and postpone these communications, or discard them.
 % If the requests for the other resources need to be retracted, the burden falls on the programmer to determine how to synchronize appropriately to ensure that only one resource is delivered.
 …
 The new \CFA \gls{synch_multiplex} utility introduced in this work is the @waituntil@ statement.
 There is a @waitfor@ statement in \CFA that supports Ada-style \gls{synch_multiplex} over monitor methods, so this @waituntil@ focuses on synchronizing over other resources.
+There already exists a @waitfor@ statement in \CFA that supports Ada-style \gls{synch_multiplex} over monitor methods~\cite{Delisle21}, so this @waituntil@ focuses on synchronizing over other resources.
 All of the \gls{synch_multiplex} features mentioned so far are monomorphic, only waiting on one kind of resource: Unix @select@ supports file descriptors, Go's @select@ supports channel operations, \uC's @select@ supports futures, and Ada's @select@ supports monitor method calls.
 The \CFA @waituntil@ is polymorphic and provides \gls{synch_multiplex} over any objects that satisfy the trait in Figure~\ref{f:wu_trait}.
 …
 \end{figure}
 Currently locks, channels, futures and timeouts are supported by the @waituntil@ statement, and can be expanded through the @is_selectable@ trait as other use-cases arise.
 The @waituntil@ statement supports guarded clauses, both @or@ and @and@ semantics, and provides an @else@ for asynchronous multiplexing.
+Currently locks, channels, futures and timeouts are supported by the @waituntil@ statement, and this set can be expanded through the @is_selectable@ trait as other use-cases arise.
+The @waituntil@ statement supports guard clauses, both @or@ and @and@ semantics, and timeout and @else@ for asynchronous multiplexing.
 Figure~\ref{f:wu_example} shows a \CFA @waituntil@ usage, which is waiting for either @Lock@ to be available \emph{or} for a value to be read from @Channel@ into @i@ \emph{and} for @Future@ to be fulfilled \emph{or} a timeout of one second.
+Note, the expression inside a @waituntil@ clause is evaluated once at the start of the @waituntil@ algorithm.
 \begin{figure}
 …
 \section{Waituntil Semantics}
 The @waituntil@ semantics has two parts: the semantics of the statement itself, \ie @and@, @or@, @when@ guards, and @else@ semantics, and the semantics of how the @waituntil@ interacts with types like channels, locks and futures.
+The @waituntil@ semantics has two parts: the semantics of the statement itself, \ie @and@, @or@, @when@ guards, and @else@ semantics, and the semantics of how the @waituntil@ interacts with types like locks, channels, and futures.
 \subsection{Statement Semantics}
 …
 \begin{cfa}
 future(int) bar, foo;
+waituntil( foo ) { ... }
+or waituntil( bar ) { ... }
+waituntil( foo ) { ... } or waituntil( bar ) { ... }
+\end{cfa}
+The reason for this semantics is that prioritizing resources can be useful in certain problems.
+In the rare case where there is a starvation problem with the ordering, it possible to follow a @waituntil@ with its reverse form:
+\begin{cfa}
+waituntil( foo ) { ... } or waituntil( bar ) { ... } // prioritize foo
+waituntil( bar ) { ... } or waituntil( foo ) { ... } // prioritize bar
 \end{cfa}
 …
 When an @and@ clause becomes available, the waiting thread unblocks and runs that clause's code-block, and then the thread waits again for the next available clause or the @waituntil@ statement is now true.
 This semantics allows work to be done in parallel while synchronizing over a set of resources, and furthermore, gives a good reason to use the @and@ operator.
 If the @and@ operator waited for all clauses to be available before running, it would be the same as just acquiring those resources consecutively by a sequence of @waituntil@ statements.
+If the @and@ operator waited for all clauses to be available before running, it is the same as just acquiring those resources consecutively by a sequence of @waituntil@ statements.
 As for normal C expressions, the @and@ operator binds more tightly than the @or@.
 …
 More detail on channels and their interaction with @waituntil@ appear in Section~\ref{s:wu_chans}.
+The trait is used by having a blocking object return a type that supports the @is_selectable@ trait.
+This feature leverages \CFA's ability to overload on return type to select the correct overloaded routine for the @waituntil@ context.
+A selectable type is needed for types that want to support multiple operations such as channels that allow both reading and writing.
 \section{\lstinline{waituntil} Implementation}
+The @waituntil@ statement is not inherently complex, and the pseudo code is presented in Figure~\ref{f:WU_Impl}.
+The @waituntil@ statement is not inherently complex, and Figure~\ref{f:WU_Impl} only shows the basic outline of the @waituntil@ algorithm.
 The complexity comes from the consideration of race conditions and synchronization needed when supporting various primitives.
+Figure~\ref{f:WU_Impl} aims to introduce the reader to the rudimentary idea and control flow of the @waituntil@.
+The following sections then use examples to fill in details that Figure~\ref{f:WU_Impl} does not provide.
+Finally, the full pseudocode of the waituntil is presented in Figure~\ref{f:WU_Full_Impl}.
+The basic steps of the @waituntil@ statement are:
+\begin{figure}
+\begin{cfa}
+select_nodes s[N];                                                               $\C[3.25in]{// declare N select nodes}$
+for ( node in s )                                                                $\C{// register nodes}$
+The following sections then use examples to fill in details missing in Figure~\ref{f:WU_Impl}.
+The full pseudocode for the @waituntil@ algorithm is presented in Figure~\ref{f:WU_Full_Impl}.
+\begin{figure}
+\begin{cfa}
+select_nodes s[N];                                                              $\C[3.25in]{// declare N select nodes}$
+for ( node in s )                                                               $\C{// register nodes}$
         register_select( resource, node );
 while ( statement predicate not satisfied ) {   $\C{// check predicate}$
         // block
+        // block until clause(s) satisfied
         for ( resource in waituntil statement ) {       $\C{// run true code blocks}$
-        if ( statement predicate is satisfied ) break;
                 if ( resource is avail ) run code block
+    }
+                if ( statement predicate is satisfied ) break;
+        }
+}
 for ( node in s )                                                               $\C{// deregister nodes}\CRT$
 …
 \end{figure}
+The basic steps of the algorithm are:
 \begin{enumerate}
 \item
 …
 \item
 The thread executing the @waituntil@ then loops until the statement's predicate is satisfied.
 In each iteration, if the predicate is unsatisfied, the thread blocks.
 If clauses becomes satisfied, the thread unblocks, and for each satisfied clause the block fails and the thread proceeds, otherwise the block succeeds (like a semaphore where a block is a @P()@ and a satisfied clause is a @V()@).
 After proceeding past the block all clauses are checked for completion and the completed clauses have their code blocks run.
+In each iteration, if the predicate is unsatisfied, the @waituntil@ thread blocks.
+When another thread satisfies a resource clause (\eg sends to a  channel), it unblocks the @waituntil@ thread.
+This thread checks all clauses for completion, and any completed clauses have their code blocks run.
 While checking clause completion, if enough clauses have been run such that the statement predicate is satisfied, the loop exits early.
-In the case where the block succeeds, the thread will be woken by the thread that marks one of the resources as available.
 \item
 Once the thread escapes the loop, the @select_nodes@ are unregistered from the resources.
 \end{enumerate}
 These steps give a basic overview of how the statement works.
 Digging into parts of the implementation will shed light on the specifics and provide more detail.
+The following sections shed light on the specific changes and provide more implementation detail.
 \subsection{Locks}\label{s:wu_locks}
 …
 \end{cfa}
 Implicitly, the @waituntil@ is calling the lock acquire for each of these locks to establish a position in the lock's queue of waiting threads.
+When the lock schedules this thread, it unblocks and performs the @waituntil@ code to determine if it can proceed.
+If it cannot proceed, it blocks again on the @waituntil@ lock, holding the acquired lock.
+When the lock schedules this thread, it unblocks and runs the code block associated with the lock and then releases the lock.
 In detail, when a thread waits on multiple locks via a @waituntil@, it enqueues a @select_node@ in each of the lock's waiting queues.
 When a @select_node@ reaches the front of the lock's queue and gains ownership, the thread blocked on the @waituntil@ is unblocked.
+Now, the lock is temporarily held by the @waituntil@ thread until the node is unregistered, versus the thread waiting on the lock.
+To prevent the waiting thread from holding many locks at once and potentially introducing a deadlock, the node is unregistered right after the corresponding code block is executed.
+This prevents deadlocks since the waiting thread will never hold a lock while waiting on another resource.
+As such the only nodes unregistered at the end are the ones that have not run.
+Now, the lock is held by the @waituntil@ thread until the code block is executed, and then the node is unregistered, during which the lock is released.
+Immediately releasing the lock prevents the waiting thread from holding multiple locks and potentially introducing a deadlock.
+As such, the only unregistered nodes associated with locks are the ones that have not run.
 \subsection{Timeouts}
+Timeouts in the @waituntil@ take the form of a duration being passed to a @sleep@ or @timeout@ call.
+An example is shown in the following code.
+\begin{cfa}
+waituntil( sleep( 1`ms ) ) {}
+waituntil( timeout( 1`s ) ) {} or waituntil( timeout( 2`s ) ) {}
+waituntil( timeout( 1`ns ) ) {} and waituntil( timeout( 2`s ) ) {}
+\end{cfa}
+The timeout implementation highlights a key part of the @waituntil@ semantics, the expression inside a @waituntil()@ is evaluated once at the start of the @waituntil@ algorithm.
+As such, calls to these @sleep@ and @timeout@ routines do not block, but instead return a type that supports the @is_selectable@ trait.
+This feature leverages \CFA's ability to overload on return type; a call to @sleep@ outside a @waituntil@ will call a different @sleep@ that does not return a type, which will block for the appropriate duration.
+This mechanism of returning a selectable type is needed for types that want to support multiple operations such as channels that allow both reading and writing.
+A timeout for the @waituntil@ statement is a duration passed to \lstinline[deletekeywords={timeout}]{timeout}, \eg:
+\begin{cquote}
+\begin{tabular}{@{}l|l@{}}
+\multicolumn{2}{@{}l@{}}{\lstinline{Duration D1\{ 1`ms \}, D2\{ 2`ms \}, D3\{ 3`ms \};}} \\
+\begin{cfa}[deletekeywords={timeout}]
+waituntil( i << C1 ) {}
+or waituntil( i << C2 ) {}
+or waituntil( i << C3 ) {}
+or waituntil( timeout( D1 ) ) {}
+or waituntil( timeout( D2 ) ) {}
+or waituntil( timeout( D3 ) ) {}
+\end{cfa}
+&
+\begin{cfa}[deletekeywords={timeout}]
+waituntil( i << C1 ) {}
+or waituntil( i << C2 ) {}
+or waituntil( i << C3 ) {}
+or waituntil( min( timeout( D1 ), timeout( D2 ),  timeout( D3 ) ) {}
+\end{cfa}
+\end{tabular}
+\end{cquote}
+These examples are basically equivalence.
+Here, the multiple timeouts are useful because the durations can change during execution and the separate clauses provide different code blocks if a timeout triggers.
+Multiple timeouts can also be used with @and@ to provide a minimal delay before proceeding.
+In following example, either channels @C1@ or @C2@ must be satisfied but nothing can be done for at least 1 or 3 seconds after the channel read.
+\begin{cfa}[deletekeywords={timeout}]
+waituntil( i << C1 ); and waituntil( timeout( 1`s ) );
+or waituntil( i << C2 ); and waituntil( timeout( 3`s ) );
+\end{cfa}
+If only @C2@ is satisfied, \emph{both} timeout code-blocks trigger.
+Note, the \CFA @waitfor@ statement only provides a single @timeout@ clause because it only supports @or@ semantics.
+The \lstinline[deletekeywords={timeout}]{timeout} routine is different from UNIX @sleep@, which blocks for the specified duration and returns the amount of time elapsed since the call started.
+Instead, \lstinline[deletekeywords={timeout}]{timeout} returns a type that supports the @is_selectable@ trait, allowing the type system to select the correct overloaded routine for this context.
+For the @waituntil@, it is more idiomatic for the \lstinline[deletekeywords={timeout}]{timeout} to use the same syntax as other blocking operations instead of having a special language clause.
 \subsection{Channels}\label{s:wu_chans}
+To support both waiting on both reading and writing to channels, the operators @?<<?@ and @?>>?@ are used read and write to a channel respectively, where the left-hand operand is the value being read into/written and the right-hand operand is the channel.
+Channels require significant complexity to synchronously multiplex on for a few reasons.
+First, reading or writing to a channel is a mutating operation;
+If a read or write to a channel occurs, the state of the channel has changed.
+In comparison, for standard locks and futures, if a lock is acquired then released or a future is ready but not accessed, the state of the lock and the future is not permanently modified.
+In this way, a @waituntil@ over locks or futures that completes with resources available but not consumed is not an issue.
+However, if a thread modifies a channel on behalf of a thread blocked on a @waituntil@ statement, it is important that the corresponding @waituntil@ code block is run, otherwise there is a potentially erroneous mismatch between the channel state and associated side effects.
+As such, the @unregister_select@ routine has a boolean return that is used by channels to indicate when the operation was completed but the block was not run yet.
+When the return is @true@, the corresponding code block is run after the unregister.
+Furthermore, if both @and@ and @or@ operators are used, the @or@ operators have to stop behaving like exclusive-or semantics due to the race between channel operations and unregisters.
+It was deemed important that exclusive-or semantics were maintained when only @or@ operators were used, so this situation has been special-cased, and is handled by having all clauses race to set a value \emph{before} operating on the channel.
+This approach is infeasible in the case where @and@ and @or@ operators are used.
+To show this consider the following @waituntil@ statement.
+\begin{cfa}
+waituntil( i >> A ) {} and waituntil( i >> B ) {}
+or waituntil( i >> C ) {} and waituntil( i >> D ) {}
+\end{cfa}
+If exclusive-or semantics were followed, this @waituntil@ would only run the code blocks for @A@ and @B@, or the code blocks for @C@ and @D@.
+However, to race before operation completion in this case introduces a race whose complexity increases with the size of the @waituntil@ statement.
+In the example above, for @i@ to be inserted into @C@, to ensure the exclusive-or it must be ensured that @i@ can also be inserted into @D@.
+Furthermore, the race for the @or@ would also need to be won.
+However, due to TOCTOU issues, one cannot know that all resources are available without acquiring all the internal locks of channels in the subtree.
+This is not a good solution for two reasons.
+It is possible that once all the locks are acquired the subtree is not satisfied and the locks must all be released.
+This would incur a high cost for signalling threads and heavily increase contention on internal channel locks.
+Channels require more complexity to allow synchronous multiplexing.
+For locks, when an outside thread releases a lock and unblocks the waituntil thread (WUT), the lock's MX property is passed to the WUT (no spinning locks).
+For futures, the outside thread deliveries a value to the future and unblocks any waiting threads, including WUTs.
+In either case, after the WUT unblocks it is safe to execute its the corresponding code block knowing access to the resource is protected by the lock or the read-only state of the future.
+Similarly, for channels, when an outside thread inserts a value into a channel, it must unblock the WUT.
+However, for channels, there is a race issue.
+If the outside thread inserts into the channel and unblocks the WUT, there is a race such that, another thread can remove the channel data, so after the WUT unblocks and attempts to remove from the buffer, it fails, and the WUT must reblock (busy waiting).
+This scenario is a \gls{toctou} race that needs to be consolidated.
+To close the race, the outside thread must detect this case and insert directly into the left-hand side of the channel expression (@i << chan@) rather than into the channel, and then unblock the WUT.
+Now the unblocked WUT is guaranteed to have a satisfied resource and its code block can safely executed.
+The insertion circumvents the channel buffer via the wait-morphing in the \CFA channel implementation \see{Section~\ref{s:chan_impl}}, allowing @waituntil@ channel unblocking to not be special-cased.
+Furthermore, if both @and@ and @or@ operators are used, the @or@ operations stop behaving like exclusive-or due to the race among channel operations, \eg:
+\begin{cfa}
+waituntil( i << A ) {} and waituntil( i << B ) {}
+or waituntil( i << C ) {} and waituntil( i << D ) {}
+\end{cfa}
+If exclusive-or semantics are followed, only the code blocks for @A@ and @B@ are run, or the code blocks for @C@ and @D@.
+However, four outside threads can simultaneously put values into @i@ and attempt to unblock the WUT to run the four code-blocks.
+This case introduces a race with complexity that increases with the size of the @waituntil@ statement.
+However, due to TOCTOU issues, it is impossible to know if all resources are available without acquiring all the internal locks of channels in the subtree of the @waituntil@ clauses.
+This approach is a poor solution for two reasons.
+It is possible that once all the locks are acquired the subtree is not satisfied and the locks must be released.
+This work incurs a high cost for signalling threads and heavily increase contention on internal channel locks.
 Furthermore, the @waituntil@ statement is polymorphic and can support resources that do not have internal locks, which also makes this approach infeasible.
+As such, the exclusive-or semantics are lost when using both @and@ and @or@ operators since they can not be supported without significant complexity and hits to @waituntil@ statement performance.
+As such, the exclusive-or semantics is lost when using both @and@ and @or@ operators since it cannot be supported without significant complexity and significantly affects @waituntil@ performance.
+It was deemed important that exclusive-or semantics are maintained when only @or@ operators are used, so this situation has been special-cased, and is handled by having all clauses race to set a value \emph{before} operating on the channel.
+\PAB{explain how this is handled.}
+\PAB{I'm lost here to the end of the section. You have no example of the case you want to discuss so a made something up.}
 Channels introduce another interesting consideration in their implementation.
+Supporting both reading and writing to a channel in A @waituntil@ means that one @waituntil@ clause may be the notifier for another @waituntil@ clause.
+Supporting both reading and writing to a channel in a @waituntil@ means that one @waituntil@ clause may be the notifier by another @waituntil@ clause.
+\begin{cfa}
+waituntil( i << A ) { @B << i;@ }
+or waituntil( i << B ) { @A << i;@ }
+\end{cfa}
 This poses a problem when dealing with the special-cased @or@ where the clauses need to win a race to operate on a channel.
 When both a special-case @or@ is inserting to a channel on one thread and another thread is blocked in a special-case @or@ consuming from the same channel there is not one but two races that need to be consolidated by the inserting thread.
 …
 Go solves this problem in their select statement by acquiring the internal locks of all channels before registering the select on the channels.
 This eliminates the race since no other threads can operate on the blocked channel since its lock will be held.
+This eliminates the race since no other threads can operate on the blocked channel since its lock is held.
 This approach is not used in \CFA since the @waituntil@ is polymorphic.
 Not all types in a @waituntil@ have an internal lock, and when using non-channel types acquiring all the locks incurs extra unneeded overhead.
 Instead this race is consolidated in \CFA in two phases by having an intermediate pending status value for the race.
 This race case is detectable, and if detected, producer will first race to set its own race flag to be pending.
 If it succeeds, it then attempts to set the consumer's race flag to its success value.
 If the producer successfully sets the consumer race flag, then the operation can proceed, if not the signalling thread will set its own race flag back to the initial value.
+Instead, this race is consolidated in \CFA in two phases by having an intermediate pending status value for the race.
+This race case is detectable, and if detected, the outside thread first races to set its own race flag to be pending.
+If it succeeds, it then attempts to set the WUT's race flag to its success value.
+If the outside thread successfully sets the WUT's race flag, then the operation can proceed, if not the signalling threads set its own race flag back to the initial value.
 If any other threads attempt to set the producer's flag and see a pending value, they will wait until the value changes before proceeding to ensure that in the case that the producer fails, the signal will not be lost.
 This protocol ensures that signals will not be lost and that the two races can be resolved in a safe manner.
 …
 \begin{cfa}
 bool when_conditions[N];
 for ( node in s )                                                                $\C{// evaluate guards}$
     if ( node has guard )
         when_conditions[node] = node_guard;
     else
         when_conditions[node] = true;
 select_nodes s[N];                                                               $\C[3.25in]{// declare N select nodes}$
+for ( node in s )                                                                       $\C[3.75in]{// evaluate guards}$
+        if ( node has guard )
+                when_conditions[node] = node_guard;
+        else
+                when_conditions[node] = true;
+select_nodes s[N];                                                                      $\C{// declare N select nodes}$
 try {
+    for ( node in s )                                                            $\C{// register nodes}$
+        if ( when_conditions[node] )
+            register_select( resource, node );
+    // ... set statuses for nodes with when_conditions[node] == false ...
+    while ( statement predicate not satisfied ) {       $\C{// check predicate}$
+        // block
+        for ( resource in waituntil statement ) {       $\C{// run true code blocks}$
+            if ( statement predicate is satisfied ) break;
+            if ( resource is avail ) {
+                try {
+                    if( on_selected( resource ) )              $\C{// conditionally run block}$
+                        run code block
+                } finally {                                    $\C{// for exception safety}$
+                    unregister_select( resource, node );       $\C{// immediate unregister}$
+                }
+            }
+        }
+    }
+} finally {                                                     $\C{// for exception safety}$
+    for ( registered nodes in s )                                                               $\C{// deregister nodes}$
+        if ( when_conditions[node] && unregister_select( resource, node ) && on_selected( resource ) )
+            run code block $\C{// conditionally run code block upon unregister}\CRT$
+        for ( node in s )                                                               $\C{// register nodes}$
+                if ( when_conditions[node] )
+                        register_select( resource, node );
+        // ... set statuses for nodes with when_conditions[node] == false ...
+        while ( statement predicate not satisfied ) {   $\C{// check predicate}$
+                // block
+                for ( resource in waituntil statement ) {       $\C{// run true code blocks}$
+                        if ( statement predicate is satisfied ) break;
+                        if ( resource is avail )
+                                try {
+                                        if( on_selected( resource ) )   $\C{// conditionally run block}$
+                                                run code block
+                                } finally                                                       $\C{// for exception safety}$
+                                        unregister_select( resource, node ); $\C{// immediate unregister}$
+                }
+        }
+} finally {                                                                                     $\C{// for exception safety}$
+        for ( registered nodes in s )                                   $\C{// deregister nodes}$
+                if ( when_conditions[node] && unregister_select( resource, node )
+                                && on_selected( resource ) )
+                        run code block                                                  $\C{// run code block upon unregister}\CRT$
+}
 \end{cfa}
 …
 Some things to note are as follows:
 The @finally@ blocks provide exception-safe RAII unregistering of nodes, and in particular, the @finally@ inside the innermost loop performs the immediate unregistering required for deadlock-freedom that was mentioned in Section~\ref{s:wu_locks}.
 The @when_conditions@ array is used to store the boolean result of evaulating each guard at the beginning of the @waituntil@, and it is used to conditionally omit operations on resources with @false@ guards.
+The @when_conditions@ array is used to store the boolean result of evaluating each guard at the beginning of the @waituntil@, and it is used to conditionally omit operations on resources with @false@ guards.
 As discussed in Section~\ref{s:wu_chans}, this pseudocode includes code blocks conditional on the result of both @on_selected@ and @unregister_select@, which allows the channel implementation to ensure that all available channel resources will have their corresponding code block run.
 \section{Waituntil Performance}
+Similar facilities to @waituntil@ are discussed at the start of this chapter in C, Ada, Rust, \CC, and OCaml.
+However, these facilities are either not meaningful or feasible to benchmark against.
+The UNIX @select@ and related utilities are not comparable since they are system calls that go into the kernel and operate on file descriptors, whereas the @waituntil@ exists solely in user space.
+Ada's @select@ only operates on methods, which is done in \CFA via the @waitfor@ utility so it is not meaningful to benchmark against the @waituntil@, which cannot wait on the same resource.
+Rust and \CC only offer a busy-wait based approach, which is not comparable to a blocking approach.
+OCaml's @select@ waits on channels that are not comparable with \CFA and Go channels, so OCaml @select@ is not benchmarked against Go's @select@ and \CFA's @waituntil@.
 The two \gls{synch_multiplex} utilities that are in the realm of comparability with the \CFA @waituntil@ statement are the Go @select@ statement and the \uC @_Select@ statement.
 As such, two microbenchmarks are presented, one for Go and one for \uC to contrast the systems.
-The similar utilities discussed at the start of this chapter in C, Ada, Rust, \CC, and OCaml are either not meaningful or feasible to benchmark against.
-The select(2) and related utilities in C are not comparable since they are system calls that go into the kernel and operate on file descriptors, whereas the @waituntil@ exists solely in user space.
-Ada's @select@ only operates on methods, which is done in \CFA via the @waitfor@ utility so it is not meaningful to benchmark against the @waituntil@, which cannot wait on the same resource.
-Rust and \CC only offer a busy-wait based approach which is not comparable to a blocking approach.
-OCaml's @select@ waits on channels that are not comparable with \CFA and Go channels, so OCaml @select@ is not benchmarked against Go's @select@ and \CFA's @waituntil@.
 Given the differences in features, polymorphism, and expressibility between @waituntil@ and @select@, and @_Select@, the aim of the microbenchmarking in this chapter is to show that these implementations lie in the same realm of performance, not to pick a winner.
 \subsection{Channel Benchmark}
+The channel multiplexing microbenchmarks compare \CFA's @waituntil@ and Go's select, where the resource being waited on is a set of channels.
+The basic structure of the microbenchmark has the number of cores split evenly between producer and consumer threads, \ie, with 8 cores there would be 4 producer threads and 4 consumer threads.
+The number of clauses @C@ is also varied, with results shown with 2, 4, and 8 clauses.
+Each clause has a respective channel that is operates on.
+Each producer and consumer repeatedly waits to either produce or consume from one of the @C@ clauses and respective channels.
+An example in \CFA syntax of the work loop in the consumer main with @C = 4@ clauses follows.
+\begin{cfa}
+        for (;;)
+                waituntil( val << chans[0] ) {} or waituntil( val << chans[1] ) {}
+                or waituntil( val << chans[2] ) {} or waituntil( val << chans[3] ) {}
+The channel multiplexing microbenchmarks compare \CFA's @waituntil@ and Go's \lstinline[language=Go]{select}, where the resource being waited on is a set of channels.
+The basic structure of the microbenchmark has the number of cores split evenly between producer and consumer threads, \ie, with 8 cores there are 4 producer and 4 consumer threads.
+The number of resource clauses $C$ is also varied across 2, 4, and 8 clauses, where each clause has different channel that is waits on.
+Each producer and consumer repeatedly waits to either produce or consume from one of the $C$ clauses and respective channels.
+For example, in \CFA syntax, the work loop in the consumer main with $C = 4$ clauses is:
+\begin{cfa}
+for ()
+        waituntil( val << chans[0] ); or waituntil( val << chans[1] );
+        or waituntil( val << chans[2] ); or waituntil( val << chans[3] );
 \end{cfa}
 A successful consumption is counted as a channel operation, and the throughput of these operations is measured over 10 seconds.

Note: See TracChangeset for help on using the changeset viewer.

Download in other formats: