Timestamp: Jul 17, 2023, 9:24:19 AM (12 months ago)
Author: Peter A. Buhr <pabuhr@…>
Branches: master
Children: dbf5e18
Parents: bcc56c9
Message: first proofread of waituntil chapter up to Section 7.5
File edited:
  • doc/theses/colby_parsons_MMAth/text/waituntil.tex


Consider the following motivating problem.
There are $N$ stalls (resources) in a bathroom and there are $M$ people (threads) using the bathroom.
Each stall has its own lock since only one person may occupy a stall at a time.
Humans solve this problem in the following way.
They check if all of the stalls are occupied.
If not, they enter and claim an available stall.
If they are all occupied, people queue and watch the stalls until one is free, and then enter and lock the stall.
This solution is easily implemented on a computer if all threads are waiting on all stalls and agree to queue.

Now the problem is extended.
Some stalls are wheelchair accessible and some stalls are gender-specific.
Each person (thread) may be limited to only one kind of stall or may choose among different kinds of stalls that match their criteria.
Immediately, the problem becomes more difficult.
A single queue no longer fully solves the problem.
What happens when there is a stall available that the person at the front of the queue cannot choose?
The na\"ive solution has each thread spin indefinitely, continually checking every matching kind of stall until a suitable one is free.
This approach is insufficient since it wastes cycles and results in unfairness among waiting threads, as a thread can acquire the first matching stall without regard to the waiting time of other threads.
Waiting for the first appropriate stall (resource) that becomes available without spinning is an example of \gls{synch_multiplex}: the ability to wait synchronously for one or more resources based on some selection criteria.

\section{History of Synchronous Multiplexing}
There is a history of tools that provide \gls{synch_multiplex}.
Some well-known \gls{synch_multiplex} tools include Unix system utilities: @select@~\cite{linux:select}, @poll@~\cite{linux:poll}, and @epoll@~\cite{linux:epoll}, and the @select@ statement provided by Go~\cite{go:selectref}, Ada~\cite[\S~9.7]{Ada16}, and \uC~\cite[\S~3.3.1]{uC++}.
The concept and theory surrounding \gls{synch_multiplex} were introduced by Hoare in his 1985 book, Communicating Sequential Processes (CSP)~\cite{Hoare85},
\begin{quote}
A communication is an event that is described by a pair $c.v$ where $c$ is the name of the channel on which the communication takes place and $v$ is the value of the message which passes.~\cite[p.~113]{Hoare85}
\end{quote}
The ideas in CSP were implemented by Roscoe and Hoare in the language Occam~\cite{Roscoe88}.

Both CSP and Occam include the ability to wait for a \Newterm{choice} among receiver channels and \Newterm{guards} to toggle which receives are valid.
For example,
\begin{cfa}[mathescape]
( @G1@(x) $\rightarrow$ P @|@ @G2@(y) $\rightarrow$ Q )
\end{cfa}
waits for either channel @x@ or @y@ to have a value, if and only if guards @G1@ and @G2@ are true;
if only one guard is true, only one channel receives, and if both guards are false, no receive occurs.
% extended CSP with a \gls{synch_multiplex} construct @ALT@, which waits for one resource to be available and then executes a corresponding block of code.
In detail, waiting for one resource out of a set of resources can be thought of as a logical exclusive-or over the set of resources.
Guards are a conditional operator similar to an @if@, except they apply to the resource being waited on.
If a guard is false, then the resource it guards is not in the set of resources being waited on.
If all guards are false, the ALT does nothing and the thread continues.
Guards can be simulated using @if@ statements, as shown in~\cite[rule~2.4, p.~183]{Roscoe88}
\begin{lstlisting}[basicstyle=\rm,mathescape]
ALT( $b$ & $g$ $P$, $G$ ) = IF ( $b$ ALT( $g$ $P$, $G$ ), $\neg\,b$ ALT( $G$ ) )                        (boolean guard elim).
\end{lstlisting}
but the simulation requires $2^N-1$ @if@ statements, where $N$ is the number of guards.
The exponential blowup comes from applying rule 2.4 repeatedly, since it works on one guard at a time.
Figure~\ref{f:wu_if} shows an example of applying rule 2.4 for three guards.
Also, notice the additional code duplication for statements @S1@, @S2@, and @S3@.

\begin{figure}
\centering
\begin{lrbox}{\myboxA}
\begin{cfa}
when( G1 )
	waituntil( R1 ) S1
or when( G2 )
	waituntil( R2 ) S2
or when( G3 )
	waituntil( R3 ) S3







\end{cfa}
\end{lrbox}

\begin{lrbox}{\myboxB}
\begin{cfa}
if ( G1 )
	if ( G2 )
		if ( G3 ) waituntil( R1 ) S1 or waituntil( R2 ) S2 or waituntil( R3 ) S3
		else waituntil( R1 ) S1 or waituntil( R2 ) S2
	else
		if ( G3 ) waituntil( R1 ) S1 or waituntil( R3 ) S3
		else waituntil( R1 ) S1
else
	if ( G2 )
		if ( G3 ) waituntil( R2 ) S2 or waituntil( R3 ) S3
		else waituntil( R2 ) S2
	else
		if ( G3 ) waituntil( R3 ) S3
\end{cfa}
\end{lrbox}

\subfloat[Guards]{\label{l:guards}\usebox\myboxA}
\hspace*{5pt}
\vrule
\hspace*{5pt}
\subfloat[Simulated Guards]{\label{l:simulated_guards}\usebox\myboxB}
\caption{\CFA guard simulated with \lstinline{if} statement.}
\label{f:wu_if}
\end{figure}

When discussing \gls{synch_multiplex} implementations, the resource being multiplexed is important.
While CSP waits on channels, the earliest known implementation of synch\-ronous multiplexing is Unix's @select@~\cite{linux:select}, multiplexing over file descriptors.
The @select@ system-call is passed three sets of file descriptors (read, write, exceptional) to wait on and an optional timeout.
@select@ blocks until either some subset of file descriptors are available or the timeout expires.
All file descriptors that are ready are returned by modifying the argument sets to only contain the ready descriptors.

This early implementation differs from the theory presented in CSP: when the call to @select@ returns, it may provide more than one ready file descriptor.
As such, @select@ has logical-or multiplexing semantics, whereas the theory described exclusive-or semantics.
It is possible to achieve exclusive-or semantics with @select@ by arbitrarily operating on only one of the returned descriptors.
@select@ passes the interest set of file descriptors between application and kernel in the form of a worst-case sized bit-mask, where the worst case is the largest numbered file descriptor.
@poll@ reduces the size of the interest sets by changing from a bit mask to a linked data structure, independent of the file-descriptor values.
@epoll@ further reduces the data passed per call by keeping the interest set in the kernel, rather than supplying it on every call.

These early \gls{synch_multiplex} tools interact directly with the operating system and are often used to communicate among processes.
Later, \gls{synch_multiplex} started to appear in applications, via programming languages, to support fast multiplexed concurrent communication among threads.
An early example of \gls{synch_multiplex} is the @select@ statement in Ada~\cite[\S~9.7]{Ichbiah79}.
The Ada @select@ statement allows a task object, which has its own thread, to multiplex over a subset of the asynchronous calls to its methods.
The Ada @select@ statement has the same exclusive-or semantics and guards as the Occam ALT;
however, it multiplexes over methods rather than channels.

\begin{figure}
\begin{lstlisting}[language=ada,literate=]
task type buffer is -- thread
	... -- buffer declarations
	count : integer := 0;
begin -- thread starts here
	loop
		select
			when count < Size => -- guard
			accept insert( elem : in ElemType ) do  -- method
				... -- add to buffer
				count := count + 1;
			end;
			-- executed if this accept called
		or
			when count > 0 => -- guard
			accept remove( elem : out ElemType ) do  -- method
				... -- remove and return from buffer via parameter
				count := count - 1;
			end;
			-- executed if this accept called
		or delay 10.0;  -- unblock after 10 seconds without call
		or else -- do not block, cannot appear with delay
		end select;
	end loop;
end buffer;
buf : buffer; -- create task object and start thread in task body
\end{lstlisting}
\caption{Ada Bounded Buffer}
\label{f:BB_Ada}
\end{figure}

Figure~\ref{f:BB_Ada} shows the outline of a bounded buffer implemented with an Ada task.
Note, a task method is associated with the \lstinline[language=ada]{accept} clause of the \lstinline[language=ada]{select} statement, rather than being a separate routine.
The thread executing the loop in the task body blocks at the \lstinline[language=ada]{select} until a call occurs to @insert@ or @remove@.
Then the appropriate \lstinline[language=ada]{accept} method is run with the called arguments.
Hence, the @select@ statement provides rendezvous points for threads, rather than providing channels with message passing.
The \lstinline[language=ada]{select} statement also provides timeout and @else@ (nonblocking) clauses, which change synchronous multiplexing to asynchronous.
Now the thread polls rather than blocks.

Another example of programming-language \gls{synch_multiplex} is Go using a @select@ statement with channels~\cite{go:selectref}.
Figure~\ref{l:BB_Go} shows the outline of a bounded buffer implemented with a Go routine.
Here, two channels are used for inserting and removing by client producers and consumers, respectively.
(The @term@ and @finish@ channels are used to synchronize with the program main.)
Go's @select@ has the same exclusive-or semantics as the ALT primitive from Occam and has associated code blocks for each clause like ALT and Ada.
However, unlike Ada and ALT, Go does not provide guards for the \lstinline[language=go]{case} clauses of the \lstinline[language=go]{select}.
Go also provides a timeout via a channel and a @default@ clause like Ada @else@ for asynchronous multiplexing.

\begin{figure}
\centering

\begin{lrbox}{\myboxA}
\begin{lstlisting}[language=go,literate=]
func main() {
	insert := make( chan int, Size )
	remove := make( chan int, Size )
	term := make( chan string )
	finish := make( chan string )

	buf := func() {
		var i int
		L: for {
			select { // wait for message
			  case i = <- insert:
			  case <- term: break L
			}
			remove <- i
		}
		finish <- "STOP" // completion
	}
	go buf() // start thread in buf
}




\end{lstlisting}
\end{lrbox}

\begin{lrbox}{\myboxB}
\begin{lstlisting}[language=uC++]
_Task BoundedBuffer {
	... // buffer declarations
	int count = 0;
  public:
	void insert( int elem ) {
		... // add to buffer
		count += 1;
	}
	int remove() {
		... // remove and return from buffer
		count -= 1;
	}
  private:
	void main() {
		for ( ;; ) {
			_Accept( ~BoundedBuffer ) break;
			or _When ( count < Size ) _Accept( insert );
			or _When ( count > 0 ) _Accept( remove );
		}
	}
};
BoundedBuffer buf; // start thread in main method
\end{lstlisting}
\end{lrbox}

\subfloat[Go]{\label{l:BB_Go}\usebox\myboxA}
\hspace*{5pt}
\vrule
\hspace*{5pt}
\subfloat[\uC]{\label{l:BB_uC++}\usebox\myboxB}

\caption{Bounded Buffer}
\label{f:AdaMultiplexing}
\end{figure}

Finally, \uC provides \gls{synch_multiplex} with Ada-style @select@ over monitor and task methods with the @_Accept@ statement~\cite[\S~2.9.2.1]{uC++}, and over futures with the @_Select@ statement~\cite[\S~3.3.1]{uC++}.
The @_Select@ statement extends the ALT/Go @select@ by offering both @and@ and @or@ semantics, which can be used together in the same statement.
Both the @_Accept@ and @_Select@ statements provide guards for multiplexing clauses, as well as timeout and @else@ clauses.

There are other languages that provide \gls{synch_multiplex}, including Rust's @select!@ over futures~\cite{rust:select}, OCaml's @select@ over channels~\cite{ocaml:channel}, and C++14's @when_any@ over futures~\cite{cpp:whenany}.
Note that while C++14 and Rust provide \gls{synch_multiplex}, the implementations leave much to be desired as both rely on polling to wait on multiple resources.

\section{Other Approaches to Synchronous Multiplexing}

To avoid the need for \gls{synch_multiplex}, all communication among threads/processes must come from a single source.
For example, in Erlang each process has a single heterogeneous mailbox that is the sole source of concurrent communication, removing the need for \gls{synch_multiplex} as there is only one place to wait on resources.
Similarly, actor systems circumvent the \gls{synch_multiplex} problem as actors only block when waiting for the next message, never in a behaviour.
While these approaches solve the \gls{synch_multiplex} problem, they introduce other issues.
Consider the case where a thread has a single source of communication and it wants a set of @N@ resources.
It sequentially requests the @N@ resources and waits for each response.
During the receives for the @N@ resources, it can receive other communication, and must either save and postpone these communications or discard them.
% If the requests for the other resources need to be retracted, the burden falls on the programmer to determine how to synchronize appropriately to ensure that only one resource is delivered.

\section{\CFA's Waituntil Statement}

The new \CFA \gls{synch_multiplex} utility introduced in this work is the @waituntil@ statement.
There is a @waitfor@ statement in \CFA that supports Ada-style \gls{synch_multiplex} over monitor methods, so this @waituntil@ focuses on synchronizing over other resources.
All of the \gls{synch_multiplex} features mentioned so far are monomorphic, only waiting on one kind of resource: Unix @select@ supports file descriptors, Go's @select@ supports channel operations, \uC's @select@ supports futures, and Ada's @select@ supports monitor method calls.
The \CFA @waituntil@ is polymorphic and provides \gls{synch_multiplex} over any objects that satisfy the trait in Figure~\ref{f:wu_trait}.
No other language provides a synchronous multiplexing tool polymorphic over resources like \CFA's @waituntil@.

\begin{figure}
\begin{cfa}
forall(T & | sized(T))
trait is_selectable {
	// For registering a waituntil stmt on a selectable type
	bool register_select( T &, select_node & );

	// For unregistering a waituntil stmt from a selectable type
	bool unregister_select( T &, select_node & );

	// on_selected is run on the selecting thread prior to executing
	// the statement associated with the select_node
	bool on_selected( T &, select_node & );
};
\end{cfa}
\caption{Trait for types that can be passed into \CFA's \lstinline{waituntil} statement.}
\label{f:wu_trait}
\end{figure}

Currently, locks, channels, futures, and timeouts are supported by the @waituntil@ statement, and the set can be expanded through the @is_selectable@ trait as other use cases arise.
The @waituntil@ statement supports guarded clauses, both @or@ and @and@ semantics, and provides an @else@ for asynchronous multiplexing.
Figure~\ref{f:wu_example} shows a \CFA @waituntil@ usage, which is waiting for either @Lock@ to be available \emph{or} for a value to be read from @Channel@ into @i@ \emph{and} for @Future@ to be fulfilled \emph{or} a timeout of one second.

\begin{figure}
\begin{cfa}
waituntil( Lock ) { ... }
or when( i == 0 ) waituntil( i << Channel ) { ... }
and waituntil( Future ) { ... }
or waituntil( timeout( 1`s ) ) { ... }
// else { ... }
\end{cfa}
\caption{Example of \CFA's \lstinline{waituntil} statement}
\label{f:wu_example}
\end{figure}

\section{Waituntil Semantics}

The @waituntil@ semantics have two parts: the semantics of the statement itself, \ie the @and@, @or@, @when@ guards, and @else@ semantics, and the semantics of how the @waituntil@ interacts with types like channels, locks, and futures.

\subsection{Statement Semantics}

The @or@ semantics are the most straightforward and nearly match those laid out in the ALT statement from Occam.
The clauses have an exclusive-or relationship where the first available one is run and only one clause is run.
\CFA's @or@ semantics differ from ALT semantics: instead of randomly picking a clause when multiple are available, the first clause in the @waituntil@ that is available is executed.
In the following example, if @foo@ and @bar@ are both available, @foo@ is always selected since it comes first in the order of @waituntil@ clauses.
\begin{cfa}
future(int) bar, foo;

waituntil( foo ) { ... }
or waituntil( bar ) { ... }
\end{cfa}

    161 The @and@ semantics match the @and@ semantics used by \uC.
    162 When multiple clauses are joined by @and@, the @waituntil@ will make a thread wait for all to be available, but will run the corresponding code blocks \emph{as they become available}.
    163 As @and@ clauses are made available, the thread will be woken to run those clauses' code blocks and then the thread will wait again until all clauses have been run.
    164 This allows work to be done in parallel while synchronizing over a set of resources, and furthermore gives a good reason to use the @and@ operator.
    165 If the @and@ operator waited for all clauses to be available before running, it would not provide much more use that just acquiring those resources one by one in subsequent lines of code.
    166 The @and@ operator binds more tightly than the @or@ operator.
    167 To give an @or@ operator higher precedence brackets can be used.
    168 \eg the following waituntil unconditionally waits for @C@ and one of either @A@ or @B@, since the @or@ is given higher precendence via brackets.
    169 \begin{cfa}
    170 (waituntil( A ) { ... }
    171 or waituntil( B ) { ... } )
     321The \CFA @and@ semantics match the @and@ semantics of \uC \lstinline[language=uC++]{_Select}.
     322When multiple clauses are joined by @and@, the @waituntil@ makes a thread wait for all to be available, but still runs the corresponding code blocks \emph{as they become available}.
      323When an @and@ clause becomes available, the waiting thread unblocks and runs that clause's code block, and then waits again until either another clause becomes available or the @waituntil@ predicate is satisfied.
     324This semantics allows work to be done in parallel while synchronizing over a set of resources, and furthermore, gives a good reason to use the @and@ operator.
     325If the @and@ operator waited for all clauses to be available before running, it would be the same as just acquiring those resources consecutively by a sequence of @waituntil@ statements.
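      For example, the following sketch illustrates these semantics: if @foo@ becomes available first, its code block runs immediately, and the thread then waits for @bar@ before the statement completes (the future names are illustrative).
      \begin{cfa}
      future(int) foo, bar;
      waituntil( foo ) { ... }                $\C{// runs as soon as foo is available}$
      and waituntil( bar ) { ... }    $\C{// statement completes once both have run}$
      \end{cfa}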
     326
     327As for normal C expressions, the @and@ operator binds more tightly than the @or@.
      328To give an @or@ operator higher precedence, parentheses are used.
      329For example, the following @waituntil@ unconditionally waits for @C@ and one of either @A@ or @B@, since the @or@ is given higher precedence via parentheses.
     330\begin{cfa}
     331@(@ waituntil( A ) { ... }              // bind tightly to or
     332or waituntil( B ) { ... } @)@
    172333and waituntil( C ) { ... }
    173334\end{cfa}
    174335
    175 The guards in the waituntil statement are called @when@ clauses.
    176 Each the boolean expression inside a @when@ is evaluated once before the waituntil statement is run.
    177 The guards in Occam's ALT effectively toggle clauses on and off, where a clause will only be evaluated and waited on if the corresponding guard is @true@.
    178 The guards in the waituntil statement operate the same way, but require some nuance since both @and@ and @or@ operators are supported.
    179 This will be discussed further in Section~\ref{s:wu_guards}.
    180 When a guard is false and a clause is removed, it can be thought of as removing that clause and its preceding operator from the statement.
    181 \eg in the following example the two waituntil statements are semantically the same.
    182 \begin{cfa}
    183 when(true) waituntil( A ) { ... }
    184 or when(false) waituntil( B ) { ... }
     336The guards in the @waituntil@ statement are called @when@ clauses.
     337Each boolean expression inside a @when@ is evaluated \emph{once} before the @waituntil@ statement is run.
     338Like Occam's ALT, the guards toggle clauses on and off, where a @waituntil@ clause is only evaluated and waited on if the corresponding guard is @true@.
     339In addition, the @waituntil@ guards require some nuance since both @and@ and @or@ operators are supported \see{Section~\ref{s:wu_guards}}.
      340When a guard is false and a clause is removed, it can be thought of as removing that clause and its preceding operator from the statement.
     341For example, in the following, the two @waituntil@ statements are semantically equivalent.
     342
     343\begin{lrbox}{\myboxA}
     344\begin{cfa}
     345when( true ) waituntil( A ) { ... }
     346or when( false ) waituntil( B ) { ... }
    185347and waituntil( C ) { ... }
    186 // ===
     348\end{cfa}
     349\end{lrbox}
     350
     351\begin{lrbox}{\myboxB}
     352\begin{cfa}
    187353waituntil( A ) { ... }
    188354and waituntil( C ) { ... }
    189 \end{cfa}
    190 
    191 The @else@ clause on the waituntil has identical semantics to the @else@ clause in Ada.
    192 If all resources are not immediately available and there is an @else@ clause, the @else@ clause is run and the thread will not block.
    193 
    194 \subsection{Waituntil Type Semantics}
    195 As described earlier, to support interaction with the waituntil statement a type must support the trait shown in Figure~\ref{f:wu_trait}.
    196 The waituntil statement expects types to register and unregister themselves via calls to @register_select@ and @unregister_select@ respectively.
     355
     356\end{cfa}
     357\end{lrbox}
     358
     359\begin{tabular}{@{}lcl@{}}
     360\usebox\myboxA & $\equiv$ & \usebox\myboxB
     361\end{tabular}
     362
     363The @else@ clause on the @waituntil@ has identical semantics to the @else@ clause in Ada.
     364If all resources are not immediately available and there is an @else@ clause, the @else@ clause is run and the thread continues.
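      For example, the following sketch polls two futures without blocking; the exact spelling and placement of the trailing @else@ clause follow the statement grammar, and are illustrative here.
      \begin{cfa}
      waituntil( foo ) { ... }
      or waituntil( bar ) { ... }
      or else { ... }         $\C{// neither foo nor bar immediately available}$
      \end{cfa}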
     365
     366\subsection{Type Semantics}
     367
      368As mentioned, to support interaction with the @waituntil@ statement, a type must support the trait in Figure~\ref{f:wu_trait}.
     369The @waituntil@ statement expects types to register and unregister themselves via calls to @register_select@ and @unregister_select@, respectively.
    197370When a resource becomes available, @on_selected@ is run.
    198 Many types do not need @on_selected@, but it is provided since some types may need to perform some work or checks before the resource can be accessed in the code block.
     371Many types do not need @on_selected@, but it is provided if a type needs to perform work or checks before the resource can be accessed in the code block.
    199372The register/unregister routines in the trait return booleans.
    200 The return value of @register_select@ is @true@ if the resource is immediately available, and @false@ otherwise.
    201 The return value of @unregister_select@ is @true@ if the corresponding code block should be run after unregistration and @false@ otherwise.
    202 The routine @on_selected@, and the return value of @unregister_select@ were needed to support channels as a resource.
    203 More detail on channels and their interaction with waituntil will be discussed in Section~\ref{s:wu_chans}.
    204 
    205 \section{Waituntil Implementation}
    206 The waituntil statement is not inherently complex, and can be described as a few steps.
    207 The complexity of the statement comes from the consideration of race conditions and synchronization needed when supporting various primitives.
    208 The basic steps of the waituntil statement are the following:
    209 
    210 \begin{enumerate}[topsep=5pt,itemsep=3pt,parsep=0pt]
    211 
      373The return value of @register_select@ is @true@ if the resource is immediately available, and @false@ otherwise.
      374The return value of @unregister_select@ is @true@ if the corresponding code block should be run after unregistration, and @false@ otherwise.
     375The routine @on_selected@ and the return value of @unregister_select@ are needed to support channels as a resource.
     376More detail on channels and their interaction with @waituntil@ appear in Section~\ref{s:wu_chans}.
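      The shape of this trait can be sketched as follows; the parameter lists and return type of @on_selected@ are illustrative, since the precise signatures appear in Figure~\ref{f:wu_trait}.
      \begin{cfa}
      trait is_selectable( T & ) {
      	bool register_select( T &, select_node & );     $\C{// true if immediately available}$
      	bool unregister_select( T &, select_node & );   $\C{// true if code block still must run}$
      	bool on_selected( T &, select_node & );                 $\C{// pre-code-block work/checks}$
      };
      \end{cfa}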
     377
     378\section{\lstinline{waituntil} Implementation}
      379The @waituntil@ statement is not inherently complex, and the pseudo-code is presented in Figure~\ref{f:WU_Impl}.
     380The complexity comes from the consideration of race conditions and synchronization needed when supporting various primitives.
     381The basic steps of the @waituntil@ statement are:
     382
     383\begin{figure}
     384\begin{cfa}
     385select_nodes s[N];                                                               $\C[3.25in]{// declare N select nodes}$
     386for ( node in s )                                                                $\C{// register nodes}$
     387        register_select( resource, node );
     388while ( statement predicate not satisfied ) {   $\C{// check predicate}$
     389        // block
     390        for ( resource in waituntil statement )         $\C{// run true code blocks}$
     391                if ( resource is avail ) run code block
     392}
     393for ( resource in waituntil statement ) {
     394        if ( statement predicate is run-satisfied ) break;
     395        if ( resource is avail ) run code block
     396}
     397for ( node in s )                                                               $\C{// deregister nodes}\CRT$
     398	if ( unregister_select( resource, node ) ) run code block
     399\end{cfa}
     400Each clause has one of three statuses: UNSAT when the resource is unavailable, SAT when the resource is available but the associated code block has not been run, and RUN when the resource is available and the associated code block has been run.
     401The @while ( statement predicate not satisfied )@ loop waits until the predicate is satisfied, where UNSAT maps to false, and SAT and RUN map to true.
     402The @if ( statement predicate is run-satisfied )@ check considers only RUN to be true and all other statuses to be false.
     403
     404\caption{\lstinline{waituntil} Implementation}
     405\label{f:WU_Impl}
     406\end{figure}
     407
     408\begin{enumerate}
    212409\item
    213 First the waituntil statement creates a @select_node@ per resource that is being waited on.
    214 The @select_node@ is an object that stores the waituntil data pertaining to one of the resources.
     410The @waituntil@ statement declares $N$ @select_node@s, one per resource that is being waited on, which stores any @waituntil@ data pertaining to that resource.
    215411
    216412\item
    217 Then, each @select_node@ is then registered with the corresponding resource.
     413Each @select_node@ is then registered with the corresponding resource.
    218414
    219415\item
    220 The thread executing the waituntil then enters a loop that will loop until the @waituntil@ statement's predicate is satisfied.
    221 In each iteration of the loop the thread attempts to block.
    222 If any clauses are satified the block will fail and the thread will proceed, otherwise the block succeeds.
      416The thread executing the @waituntil@ then loops until the statement's predicate is satisfied.
      417In each iteration, the thread attempts to block.
      418If any clause is already satisfied, the block fails and the thread proceeds; otherwise, the block succeeds.
223419After proceeding past the block, all clauses are checked for completion and the completed clauses have their code blocks run.
    224 In the case where the block suceeds, the thread will be woken by the thread that marks one of the resources as available.
     420In the case where the block succeeds, the thread will be woken by the thread that marks one of the resources as available.
    225421
    226422\item
     
    228424\end{enumerate}
229425Pseudocode detailing these steps is presented in Figure~\ref{f:WU_Impl}.
    230 \begin{cfa}
    231 select_nodes s[N]; // N select nodes
    232 for ( node in s )
    233     register_select( resource, node );
    234 while( statement predicate not satisfied ) {
    235     // try to block
    236     for ( resource in waituntil statement )
    237         if ( resource is avail ) run code block
    238 }
    239 for ( node in s )
    240     unregister_select( resource, node );
    241 \end{cfa}
     426
242427These steps give a basic overview of how the statement works.
243428The following sections dig into parts of the implementation to shed light on the specifics.
    244429
    245430\subsection{Locks}
    246 Locks are one of the resources supported by the @waituntil@ statement.
    247 When a thread waits on multiple locks via a waituntil, it enqueues a @select_node@ in each of the lock's waiting queues.
    248 When a @select_node@ reaches the front of the queue and gains ownership of a lock, the blocked thread is notified.
    249 The lock will be held until the node is unregistered.
     431
     432The \CFA runtime supports a number of spinning and blocking locks, \eg semaphore, MCS, futex, Go mutex, spinlock, owner, \etc.
     433Many of these locks satisfy the @is_selectable@ trait, and hence, are resources supported by the @waituntil@ statement.
     434For example, the following waits until the thread has acquired lock @l1@ or locks @l2@ and @l3@.
     435\begin{cfa}
     436owner_lock l1, l2, l3;
      437waituntil( l1 ) { ... }
     438or waituntil( l2 ) { ... }
     439and waituntil( l3 ) { ... }
     440\end{cfa}
      441Implicitly, the @waituntil@ calls the lock acquire for each of these locks to establish a position in each lock's queue of waiting threads.
     442When the lock schedules this thread, it unblocks and performs the @waituntil@ code to determine if it can proceed.
     443If it cannot proceed, it blocks again on the @waituntil@ lock, holding the acquired lock.
     444
     445In detail, when a thread waits on multiple locks via a @waituntil@, it enqueues a @select_node@ in each of the lock's waiting queues.
     446When a @select_node@ reaches the front of the lock's queue and gains ownership, the thread blocked on the @waituntil@ is unblocked.
      447Now, the lock is temporarily held by the @waituntil@ thread until the node is unregistered, versus being held by a thread that acquired the lock directly.
    250448To prevent the waiting thread from holding many locks at once and potentially introducing a deadlock, the node is unregistered right after the corresponding code block is executed.
    251449This prevents deadlocks since the waiting thread will never hold a lock while waiting on another resource.
     
    253451
    254452\subsection{Timeouts}
    255 Timeouts in the waituntil take the form of a duration being passed to a @sleep@ or @timeout@ call.
     453
     454Timeouts in the @waituntil@ take the form of a duration being passed to a @sleep@ or @timeout@ call.
    256455An example is shown in the following code.
    257456
     
    262461\end{cfa}
    263462
    264 The timeout implementation highlights a key part of the waituntil semantics, the expression inside a @waituntil()@ is evaluated once at the start of the @waituntil@ algorithm.
     463The timeout implementation highlights a key part of the @waituntil@ semantics, the expression inside a @waituntil()@ is evaluated once at the start of the @waituntil@ algorithm.
    265464As such, calls to these @sleep@ and @timeout@ routines do not block, but instead return a type that supports the @is_selectable@ trait.
    266 This feature leverages \CFA's ability to overload on return type; a call to @sleep@ outside a waituntil will call a different @sleep@ that does not return a type, which will block for the appropriate duration.
     465This feature leverages \CFA's ability to overload on return type; a call to @sleep@ outside a @waituntil@ will call a different @sleep@ that does not return a type, which will block for the appropriate duration.
    267466This mechanism of returning a selectable type is needed for types that want to support multiple operations such as channels that allow both reading and writing.
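      The overloading can be sketched as follows, where the name of the returned selectable type is hypothetical.
      \begin{cfa}
      void sleep( Duration );                          $\C{// outside waituntil: blocks for the duration}$
      select_timeout_node sleep( Duration );   $\C{// inside waituntil: returns a selectable type}$
      \end{cfa}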
    268467
    269468\subsection{Channels}\label{s:wu_chans}
    270 To support both waiting on both reading and writing to channels, the operators @?<<?@ and @?>>?@ are used read and write to a channel respectively, where the lefthand operand is the value being read into/written and the righthand operand is the channel.
      469To support waiting on both reading and writing to channels, the operators @?<<?@ and @?>>?@ are used to read and write to a channel, respectively, where the left-hand operand is the value being read into/written and the right-hand operand is the channel.
    271470Channels require significant complexity to synchronously multiplex on for a few reasons.
    272471First, reading or writing to a channel is a mutating operation;
     273472if a read or write to a channel occurs, the state of the channel has changed.
    274473In comparison, for standard locks and futures, if a lock is acquired then released or a future is ready but not accessed, the state of the lock and the future is not permanently modified.
    275 In this way, a waituntil over locks or futures that completes with resources available but not consumed is not an issue.
    276 However, if a thread modifies a channel on behalf of a thread blocked on a waituntil statement, it is important that the corresponding waituntil code block is run, otherwise there is a potentially erroneous mismatch between the channel state and associated side effects.
     474In this way, a @waituntil@ over locks or futures that completes with resources available but not consumed is not an issue.
     475However, if a thread modifies a channel on behalf of a thread blocked on a @waituntil@ statement, it is important that the corresponding @waituntil@ code block is run, otherwise there is a potentially erroneous mismatch between the channel state and associated side effects.
    277476As such, the @unregister_select@ routine has a boolean return that is used by channels to indicate when the operation was completed but the block was not run yet.
    278477When the return is @true@, the corresponding code block is run after the unregister.
     
    281480It was deemed important that exclusive-or semantics were maintained when only @or@ operators were used, so this situation has been special-cased, and is handled by having all clauses race to set a value \emph{before} operating on the channel.
    282481This approach is infeasible in the case where @and@ and @or@ operators are used.
    283 To show this consider the following waituntil statement.
     482To show this consider the following @waituntil@ statement.
    284483
    285484\begin{cfa}
     
    288487\end{cfa}
    289488
    290 If exclusive-or semantics were followed, this waituntil would only run the code blocks for @A@ and @B@, or the code blocks for @C@ and @D@.
    291 However, to race before operation completion in this case introduces a race whose complexity increases with the size of the waituntil statement.
     489If exclusive-or semantics were followed, this @waituntil@ would only run the code blocks for @A@ and @B@, or the code blocks for @C@ and @D@.
     490However, to race before operation completion in this case introduces a race whose complexity increases with the size of the @waituntil@ statement.
    292491In the example above, for @i@ to be inserted into @C@, to ensure the exclusive-or it must be ensured that @i@ can also be inserted into @D@.
    293492Furthermore, the race for the @or@ would also need to be won.
     
    297496This would incur a high cost for signalling threads and heavily increase contention on internal channel locks.
    298497Furthermore, the @waituntil@ statement is polymorphic and can support resources that do not have internal locks, which also makes this approach infeasible.
    299 As such, the exclusive-or semantics are lost when using both @and@ and @or@ operators since they can not be supported without significant complexity and hits to waituntil statement performance.
     498As such, the exclusive-or semantics are lost when using both @and@ and @or@ operators since they can not be supported without significant complexity and hits to @waituntil@ statement performance.
    300499
    301500Channels introduce another interesting consideration in their implementation.
     
     304503When a special-case @or@ is inserting into a channel on one thread, and another thread is blocked in a special-case @or@ consuming from the same channel, there are not one but two races that need to be consolidated by the inserting thread.
    305504(This race can also occur in the mirrored case with a blocked producer and signalling consumer.)
    306 For the producing thread to know that the insert succeeded, they need to win the race for their own waituntil and win the race for the other waituntil.
      505For the producing thread to know that the insert succeeded, it needs to win the race for its own @waituntil@ and the race for the other @waituntil@.
    307506
    308507Go solves this problem in their select statement by acquiring the internal locks of all channels before registering the select on the channels.
     309508This eliminates the race since no other thread can operate on the blocked channel while its lock is held.
    310 This approach is not used in \CFA since the waituntil is polymorphic.
    311 Not all types in a waituntil have an internal lock, and when using non-channel types acquiring all the locks incurs extra uneeded overhead.
     509This approach is not used in \CFA since the @waituntil@ is polymorphic.
     510Not all types in a @waituntil@ have an internal lock, and when using non-channel types acquiring all the locks incurs extra unneeded overhead.
     312511Instead, this race is consolidated in \CFA in two phases by having an intermediate pending status value for the race.
     313512This race case is detectable, and if detected, the thread attempting to signal first races to set the race flag to pending.
     
    317516This protocol ensures that signals will not be lost and that the two races can be resolved in a safe manner.
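      One plausible shape of this two-phase flag is the following sketch; the names and the use of compare-and-swap are illustrative, not the precise implementation.
      \begin{cfa}
      enum { UNSAT, PENDING, SAT };
      // signalling thread, having detected the dual-race case:
      if ( CAS( race_flag, UNSAT, PENDING ) ) {       $\C{// phase one: claim the race}$
      	... complete the channel operation ...
      	race_flag = SAT;                                                $\C{// phase two: publish the final result}$
      }
      \end{cfa}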
    318517
    319 Channels in \CFA have exception based shutdown mechanisms that the waituntil statement needs to support.
      518Channels in \CFA have exception-based shutdown mechanisms that the @waituntil@ statement needs to support.
     320519These exception mechanisms motivated the addition of the @on_selected@ routine.
    321 This routine is needed by channels to detect if they are closed upon waking from a waituntil statement, to ensure that the appropriate behaviour is taken.
     520This routine is needed by channels to detect if they are closed upon waking from a @waituntil@ statement, to ensure that the appropriate behaviour is taken.
    322521
    323522\subsection{Guards and Statement Predicate}\label{s:wu_guards}
     
    332531To support statement guards in \uC, the tree prunes a branch if the corresponding guard is false.
    333532
    334 The \CFA waituntil statement blocks a thread until a set of resources have become available that satisfy the underlying predicate.
    335 The waiting condition of the waituntil statement can be represented as a predicate over the resources, joined by the waituntil operators, where a resource is @true@ if it is available, and @false@ otherwise.
    336 In \CFA, this representation is used as the mechanism to check if a thread is done waiting on the waituntil.
    337 Leveraging the compiler, a predicate routine is generated per waituntil that when passed the statuses of the resources, returns @true@ when the waituntil is done, and false otherwise.
    338 To support guards on the \CFA waituntil statement, the status of a resource disabled by a guard is set to a boolean value that ensures that the predicate function behaves as if that resource is no longer part of the predicate.
     533The \CFA @waituntil@ statement blocks a thread until a set of resources have become available that satisfy the underlying predicate.
     534The waiting condition of the @waituntil@ statement can be represented as a predicate over the resources, joined by the @waituntil@ operators, where a resource is @true@ if it is available, and @false@ otherwise.
     535In \CFA, this representation is used as the mechanism to check if a thread is done waiting on the @waituntil@.
      536Leveraging the compiler, a predicate routine is generated per @waituntil@ that, when passed the statuses of the resources, returns @true@ when the @waituntil@ is done and @false@ otherwise.
     537To support guards on the \CFA @waituntil@ statement, the status of a resource disabled by a guard is set to a boolean value that ensures that the predicate function behaves as if that resource is no longer part of the predicate.
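      For example, for the statement @waituntil( A ) {} or waituntil( B ) {} and waituntil( C ) {}@, the generated predicate has the following shape, where the parameters are the availability statuses of the three resources and the routine name is illustrative.
      \begin{cfa}
      bool predicate( bool A, bool B, bool C ) { return A || ( B && C ); }
      \end{cfa}
      A clause disabled by a false guard has its status fixed to the identity of its joining operator, so it drops out of this expression: a clause joined by @or@ contributes @false@, and a clause joined by @and@ contributes @true@.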
    339538
     340539\uC's @_Select@ supports operators both inside and outside of the clauses.
     
    348547\end{cfa}
    349548
    350 This is more expressive that the waituntil statement in \CFA.
    351 In \CFA, since the waituntil statement supports more resources than just futures, implementing operators inside clauses was avoided for a few reasons.
      549This is more expressive than the @waituntil@ statement in \CFA.
     550In \CFA, since the @waituntil@ statement supports more resources than just futures, implementing operators inside clauses was avoided for a few reasons.
    352551As a motivating example, suppose \CFA supported operators inside clauses and consider the code snippet in Figure~\ref{f:wu_inside_op}.
    353552
     
    362561\end{figure}
    363562
    364 If the waituntil in Figure~\ref{f:wu_inside_op} works with the same semantics as described and acquires each lock as it becomes available, it opens itself up to possible deadlocks since it is now holding locks and waiting on other resources.
     563If the @waituntil@ in Figure~\ref{f:wu_inside_op} works with the same semantics as described and acquires each lock as it becomes available, it opens itself up to possible deadlocks since it is now holding locks and waiting on other resources.
    365564Other semantics would be needed to ensure that this operation is safe.
    366565One possibility is to use \CC's @scoped_lock@ approach that was described in Section~\ref{s:DeadlockAvoidance}, however the potential for livelock leaves much to be desired.
    367566Another possibility would be to use resource ordering similar to \CFA's @mutex@ statement, but that alone is not sufficient if the resource ordering is not used everywhere.
    368 Additionally, using resource ordering could conflict with other semantics of the waituntil statement.
     567Additionally, using resource ordering could conflict with other semantics of the @waituntil@ statement.
    369568To show this conflict, consider if the locks in Figure~\ref{f:wu_inside_op} were ordered @D@, @B@, @C@, @A@.
    370 If all the locks are available, it becomes complex to both respect the ordering of the waituntil in Figure~\ref{f:wu_inside_op} when choosing which code block to run and also respect the lock ordering of @D@, @B@, @C@, @A@ at the same time.
     569If all the locks are available, it becomes complex to both respect the ordering of the @waituntil@ in Figure~\ref{f:wu_inside_op} when choosing which code block to run and also respect the lock ordering of @D@, @B@, @C@, @A@ at the same time.
     371570One other way this could be implemented is to wait until all resources for a given clause are available before acquiring them, but this approach quickly breaks down.
     372571It suffers from TOCTOU issues; it is not possible to ensure that the full set of resources remains available without holding them all first.
     
    376575
    377576\section{Waituntil Performance}
    378 The two \gls{synch_multiplex} utilities that are in the realm of comparability with the \CFA waituntil statement are the Go @select@ statement and the \uC @_Select@ statement.
     577The two \gls{synch_multiplex} utilities that are in the realm of comparability with the \CFA @waituntil@ statement are the Go @select@ statement and the \uC @_Select@ statement.
    379578As such, two microbenchmarks are presented, one for Go and one for \uC to contrast the systems.
    380579The similar utilities discussed at the start of this chapter in C, Ada, Rust, \CC, and OCaml are either not meaningful or feasible to benchmark against.
    381 The select(2) and related utilities in C are not comparable since they are system calls that go into the kernel and operate on file descriptors, whereas the waituntil exists solely in userspace.
     580The select(2) and related utilities in C are not comparable since they are system calls that go into the kernel and operate on file descriptors, whereas the @waituntil@ exists solely in user space.
     382581Ada's @select@ only operates on methods, which is done in \CFA via the @waitfor@ utility, so it is not meaningful to benchmark against the @waituntil@, which cannot wait on the same resource.
    383582Rust and \CC only offer a busy-wait based approach which is not comparable to a blocking approach.
     
    386585
    387586\subsection{Channel Benchmark}
    388 The channel multiplexing microbenchmarks compare \CFA's waituntil and Go's select, where the resource being waited on is a set of channels.
      587The channel multiplexing microbenchmarks compare \CFA's @waituntil@ and Go's @select@, where the resource being waited on is a set of channels.
    389588The basic structure of the microbenchmark has the number of cores split evenly between producer and consumer threads, \ie, with 8 cores there would be 4 producer threads and 4 consumer threads.
    390589The number of clauses @C@ is also varied, with results shown with 2, 4, and 8 clauses.
     
    394593
    395594\begin{cfa}
    396     for (;;)
    397         waituntil( val << chans[0] ) {} or waituntil( val << chans[1] ) {}
    398         or waituntil( val << chans[2] ) {} or waituntil( val << chans[3] ) {}
     595        for (;;)
     596                waituntil( val << chans[0] ) {} or waituntil( val << chans[1] ) {}
     597                or waituntil( val << chans[2] ) {} or waituntil( val << chans[3] ) {}
    399598\end{cfa}
    400599A successful consumption is counted as a channel operation, and the throughput of these operations is measured over 10 seconds.
     
    404603\begin{figure}
    405604        \centering
    406     \captionsetup[subfloat]{labelfont=footnotesize,textfont=footnotesize}
     605        \captionsetup[subfloat]{labelfont=footnotesize,textfont=footnotesize}
    407606        \subfloat[AMD]{
    408607                \resizebox{0.5\textwidth}{!}{\input{figures/nasus_Contend_2.pgf}}
     
    411610                \resizebox{0.5\textwidth}{!}{\input{figures/pyke_Contend_2.pgf}}
    412611        }
    413     \bigskip
     612        \bigskip
    414613
    415614        \subfloat[AMD]{
     
    419618                \resizebox{0.5\textwidth}{!}{\input{figures/pyke_Contend_4.pgf}}
    420619        }
    421     \bigskip
     620        \bigskip
    422621
    423622        \subfloat[AMD]{
     
    427626                \resizebox{0.5\textwidth}{!}{\input{figures/pyke_Contend_8.pgf}}
    428627        }
    429         \caption{The channel synchronous multiplexing benchmark comparing Go select and \CFA waituntil statement throughput (higher is better).}
     628        \caption{The channel synchronous multiplexing benchmark comparing Go select and \CFA \lstinline{waituntil} statement throughput (higher is better).}
    430629        \label{f:select_contend_bench}
    431630\end{figure}
     
\begin{figure}
	\centering
	\captionsetup[subfloat]{labelfont=footnotesize,textfont=footnotesize}
	\subfloat[AMD]{
		\resizebox{0.5\textwidth}{!}{\input{figures/nasus_Spin_2.pgf}}
	}
	\subfloat[Intel]{
		\resizebox{0.5\textwidth}{!}{\input{figures/pyke_Spin_2.pgf}}
	}
	\bigskip

	\subfloat[AMD]{
		\resizebox{0.5\textwidth}{!}{\input{figures/nasus_Spin_4.pgf}}
	}
	\subfloat[Intel]{
		\resizebox{0.5\textwidth}{!}{\input{figures/pyke_Spin_4.pgf}}
	}
	\bigskip

	\subfloat[AMD]{
		\resizebox{0.5\textwidth}{!}{\input{figures/nasus_Spin_8.pgf}}
	}
	\subfloat[Intel]{
		\resizebox{0.5\textwidth}{!}{\input{figures/pyke_Spin_8.pgf}}
	}
	\caption{The asynchronous multiplexing channel benchmark comparing Go \lstinline{select} and \CFA \lstinline{waituntil} statement throughput (higher is better).}
	\label{f:select_spin_bench}
\end{figure}
     
The AMD machine has been observed to have higher caching-contention costs, which creates a bottleneck on the channel locks and results in similar scaling between \CFA and Go.
At low core counts, Go has significantly better performance, likely due to an optimization in its scheduler.
Go heavily optimizes thread handoffs on the local run-queue, which can result in very good performance for low numbers of threads that are parking/unparking each other~\cite{go:sched}.
In the Intel benchmarks, \CFA performs better than Go as both the number of cores and the number of clauses scale.
This is likely due to Go's implementation choice of acquiring all channel locks when registering and unregistering channels on a @select@.
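To make the cost concrete, the multiplexing consumer at the heart of this benchmark has roughly the following shape in Go (a minimal sketch for exposition, not the benchmark harness itself; channel counts and buffer sizes are hypothetical):

```go
package main

import "fmt"

// consume multiplexes over two channels with select, counting
// successful receives -- the operation whose throughput is measured.
// Each select iteration involves Go's runtime touching the lock of
// every channel listed in the statement.
func consume(c1, c2 <-chan int, n int) int {
	count := 0
	for i := 0; i < n; i++ {
		select {
		case <-c1:
			count++
		case <-c2:
			count++
		}
	}
	return count
}

func main() {
	c1 := make(chan int, 4)
	c2 := make(chan int, 4)
	go func() { // a producer feeding both channels
		for i := 0; i < 10; i++ {
			if i%2 == 0 {
				c1 <- i
			} else {
				c2 <- i
			}
		}
	}()
	fmt.Println(consume(c1, c2, 10)) // prints 10
}
```

Because registration and unregistration happen per @select@ execution, the per-operation cost grows with the number of clauses, which is consistent with the scaling gap observed above.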
     
\label{t:pathGo}
\begin{tabular}{*{5}{r|}r}
	& \multicolumn{1}{c|}{\CFA} & \multicolumn{1}{c@{}}{Go} \\
	\hline
	AMD		& \input{data/nasus_Order} \\
	\hline
	Intel	& \input{data/pyke_Order}
\end{tabular}
\end{table}
     

\subsection{Future Benchmark}
The future benchmark compares \CFA's @waituntil@ with \uC's @_Select@, with both utilities waiting on futures.
\CFA's @waituntil@ and \uC's @_Select@ have very similar semantics; however, @_Select@ can only wait on futures, whereas @waituntil@ is polymorphic.
Both support the @and@ and @or@ operators, but the underlying implementation of the operators differs between @waituntil@ and @_Select@.
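For readers unfamiliar with the two operators, their semantics can be mimicked in Go with one channel per future (a hedged sketch for exposition only; Go has no built-in future type, so a buffered channel of size one stands in for a future, and `fulfilled`, `waitAll`, and `waitAny` are hypothetical helper names):

```go
package main

import "fmt"

// A future is modelled as a channel that delivers exactly one value.
type future chan int

// fulfilled returns an already-completed future holding v.
func fulfilled(v int) future {
	f := make(future, 1)
	f <- v
	return f
}

// waitAll blocks until every future is fulfilled (the "and" operator).
func waitAll(fs ...future) []int {
	vals := make([]int, len(fs))
	for i, f := range fs {
		vals[i] = <-f // each receive blocks until that future completes
	}
	return vals
}

// waitAny blocks until at least one future is fulfilled (the "or" operator).
func waitAny(f1, f2 future) int {
	select {
	case v := <-f1:
		return v
	case v := <-f2:
		return v
	}
}

func main() {
	a, b, c := fulfilled(1), fulfilled(2), fulfilled(3)
	fmt.Println(waitAll(a, b, c)) // prints [1 2 3]
	// second future never fulfilled, so waitAny must take the first
	fmt.Println(waitAny(fulfilled(4), make(future, 1))) // prints 4
}
```

The "and" path must synchronize with every operand before the statement completes, whereas the "or" path completes on the first available operand, which foreshadows the relative costs reported below.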
     
		\label{f:futureIntel}
	}
	\caption{\CFA \lstinline{waituntil} and \uC \lstinline{_Select} statement throughput synchronizing on a set of futures with varying wait predicates (higher is better).}
	\label{f:futurePerf}
\end{figure}
     
Results of this benchmark are shown in Figure~\ref{f:futurePerf}.
Each set of columns is labelled with the name of its predicate.
The predicate names and the corresponding @waituntil@ statements are shown below:

\begin{cfa}

\end{cfa}

In Figure~\ref{f:futurePerf}, the @OR@ column for \CFA is more performant than the other \CFA predicates, likely due to the special-casing of @waituntil@ statements with only @or@ operators.
For both \uC and \CFA, the @AND@ column is the least performant, which is expected since all three futures need to be fulfilled for each statement completion, unlike with any of the other operators.
Interestingly, \CFA has lower variation across predicates on the AMD (excluding the special @OR@ case), whereas \uC has lower variation on the Intel.