Changeset efe39bb for doc/theses/colby_parsons_MMAth/text/waituntil.tex

- Timestamp: Sep 5, 2023, 6:08:28 PM
- Branches: master, stuck-waitfor-destruct
- Children: 1fc111c
- Parents: 737988b, aae9c17
- Files: 1 edited

Note: this is a merge changeset; the changes displayed below correspond to the merge itself.
doc/theses/colby_parsons_MMAth/text/waituntil.tex
--- r737988b
+++ efe39bb

@@ 103-107 @@
  When discussing \gls{synch_multiplex} implementations, the resource being multiplexed is important.
- While CSP waits on channels, the earliest known implementation of synch\-ronous multiplexing is Unix's @select@~\cite{linux:select}, multiplexing over file descriptors.
+ While CSP waits on channels, the earliest known implementation of synch\-ronous multiplexing is Unix's @select@~\cite{linux:select}, multiplexing over file descriptors, which conceptually differ from channels in name only.
  The @select@ system-call is passed three sets of file descriptors (read, write, exceptional) to wait on and an optional timeout.
  @select@ blocks until either some subset of file descriptors are available or the timeout expires.

@@ 155-159 @@
  Figure~\ref{f:BB_Ada} shows the outline of a bounded buffer implemented with an Ada task.
- Note, a task method is associated with the \lstinline[language=ada]{accept} clause of the \lstinline[language=ada]{select} statement, rather than being a separate routine.
+ Note that a task method is associated with the \lstinline[language=ada]{accept} clause of the \lstinline[language=ada]{select} statement, rather than being a separate routine.
  The thread executing the loop in the task body blocks at the \lstinline[language=ada]{select} until a call occurs to @insert@ or @remove@.
  Then the appropriate \lstinline[language=ada]{accept} method is run with the called arguments.

@@ 319-321 @@
  The @waituntil@ statement supports guard clauses, both @or@ and @and@ semantics, and timeout and @else@ for asynchronous multiplexing.
  Figure~\ref{f:wu_example} shows a \CFA @waituntil@ usage, which is waiting for either @Lock@ to be available \emph{or} for a value to be read from @Channel@ into @i@ \emph{and} for @Future@ to be fulfilled \emph{or} a timeout of one second.
- Note, the expression inside a @waituntil@ clause is evaluated once at the start of the @waituntil@ algorithm.
+ Note that the expression inside a @waituntil@ clause is evaluated once at the start of the @waituntil@ algorithm.

  \section{Waituntil Semantics}

@@ 366-370 @@
  If the @and@ operator waited for all clauses to be available before running, it is the same as just acquiring those resources consecutively by a sequence of @waituntil@ statements.
- As for normal C expressions, the @and@ operator binds more tightly than the @or@.
+ As with normal C expressions, the @and@ operator binds more tightly than the @or@.
  To give an @or@ operator higher precedence, parenthesis are used.
  For example, the following @waituntil@ unconditionally waits for @C@ and one of either @A@ or @B@, since the @or@ is given higher precedence via parenthesis.

@@ 523-527 @@
  \end{cfa}
  If only @C2@ is satisfied, \emph{both} timeout code-blocks trigger because 1 second occurs before 3 seconds.
- Note, the \CFA @waitfor@ statement only provides a single @timeout@ clause because it only supports @or@ semantics.
+ Note that the \CFA @waitfor@ statement only provides a single @timeout@ clause because it only supports @or@ semantics.

  The \lstinline[deletekeywords={timeout}]{timeout} routine is different from UNIX @sleep@, which blocks for the specified duration and returns the amount of time elapsed since the call started.

@@ 624-628 @@
  Correspondingly, if thread 2 wins the race and sees that @A@ has a waiting removal, then thread 1 must execute the first clause of its @waituntil@ and thread 2 execute its second.
  Any other execution scenario is incorrect for exclusive-or semantics.
- Note, that priority execution of multiple satisfied @waituntil@ clauses (\ie top to bottom) is not violated because, in this scenario, there is only one satisfied clause for either thread.
+ Note that priority execution of multiple satisfied @waituntil@ clauses (\ie top to bottom) is not violated because, in this scenario, there is only one satisfied clause for either thread.

  The Go @select@ solves this problem by acquiring all the internal locks of the channels before registering the @select@ on the channels.

@@ 715-722 @@
  The waiting condition of the @waituntil@ statement is implemented as the complete predicate over the resources, joined by the @waituntil@ operators, where a resource is @true@ if it is available, and @false@ otherwise.
  This complete predicate is used as the mechanism to check if a thread is done waiting on a @waituntil@.
- Leveraging the compiler, a predicate routine is generated per @waituntil@ that, when passed the statuses of the resources, returns @true@ when the @waituntil@ is done, and false otherwise.
+ Leveraging the compiler, a predicate routine is generated per @waituntil@.
+ This predicate routine accepts the statuses of the resources being waited on as arguments.
+ A resource status is an integer that indicates whether a resource is either not available, available, or has already run its associated code block.
+ The predicate routine returns @true@ when the @waituntil@ is done, \ie enough resources have run their associated code blocks to satisfy the @waituntil@'s predicate, and false otherwise.
  To support guards on the \CFA @waituntil@ statement, the status of a resource disabled by a guard is set to a boolean value that ensures that the predicate function behaves as if that resource is no longer part of the predicate.
  The generated code allows the predicate that is checked with each iteration to be simplified to not check guard values.
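The status-based predicate described in the hunk above can be sketched outside \CFA. The following Go sketch is a hand-written analogue of the compiler-generated routine for a statement shaped roughly like @A or (B and C)@; all names and the exact status encoding are hypothetical, not the \CFA runtime's:

```go
package main

import "fmt"

// Possible integer encodings for a clause's resource status;
// the names are invented for illustration.
const (
	Unavailable = iota // resource not yet available
	Available          // available, but its code block has not run
	HasRun             // its associated code block has executed
)

// predicate is a hand-written analogue of the compiler-generated
// routine for the waituntil predicate A || (B && C). The waituntil
// is "done" once the clauses that have RUN their code blocks
// satisfy the expression.
func predicate(a, b, c int) bool {
	return a == HasRun || (b == HasRun && c == HasRun)
}

func main() {
	fmt.Println(predicate(HasRun, Unavailable, Unavailable)) // prints: true
	fmt.Println(predicate(Unavailable, HasRun, Unavailable)) // prints: false
	fmt.Println(predicate(Unavailable, HasRun, HasRun))      // prints: true
}
```

Under this encoding, a guard that disables an @and@ operand would pin its status to @HasRun@ so the conjunction degenerates to the remaining operand, while a disabled @or@ operand would be pinned to a status that never satisfies the clause, matching the guard behaviour described above.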
@@ 760-767 @@
  It would require some way to ensure channels used with internal operators are modified, if and only if, the corresponding code block is run.
  However, that is not feasible due to reasons described in the exclusive-or portion of Section~\ref{s:wu_chans}.
- Ultimately, I did not deem this feature to be crucial enough when taking into account the complexity and runtime cost.
+ Ultimately, this feature was not considered crucial after taking into account the complexity and runtime cost.

  \subsection{The full \lstinline{waituntil} picture}

@@ 828-837 @@
  The channel multiplexing benchmarks compare \CFA's @waituntil@ and Go's \lstinline[language=Go]{select}, where the resource being waited on is a set of channels.
+ Although Unix's select, poll and epoll multiplex over file descriptors, which are effectively channels, these utilities are not included in the benchmarks.
+ Reading or writing to a file descriptor requires a system call, which is much more expensive than operating on a user-space channel.
+ As such, it is infeasible to compare Unix's select, poll and epoll with \CFA's @waituntil@ and Go's \lstinline[language=Go]{select}.
  The basic structure of the benchmark has the number of cores split evenly between producer and consumer threads, \ie, with 8 cores there are 4 producer and 4 consumer threads.
  The number of resource clauses $C$ is also varied across 2, 4, and 8 clauses, where each clause has a different channel that it waits on.

@@ 901-911 @@
  Both Figures~\ref{f:select_contend_bench} and~\ref{f:select_spin_bench} have similar results when comparing \lstinline[language=Go]{select} and @waituntil@.
  In the AMD benchmarks (left column), the performance is very similar as the number of cores scale.
- The AMD machine has a high-caching contention cost because of its \emph{chicklet} L3 cache (\ie many L3 caches servicing a small number of cores), which creates a bottleneck on the channel locks and dominates the shape of the performance curve for both \CFA and Go.
+ The AMD machine has a high-caching contention cost because of its \emph{chiplet} L3 cache (\ie many L3 caches servicing a small number of cores), which creates a bottleneck on the channel locks and dominates the shape of the performance curve for both \CFA and Go.
  Hence, it is difficult to detect differences in the \gls{synch_multiplex}, except at low cores, where Go has significantly better performance, due to an optimization in its scheduler.
  Go heavily optimizes thread handoffs on the local run-queue, which can result in very good performance for low numbers of threads parking/unparking each other~\cite{go:sched}.

@@ 939-949 @@
  \end{cquote}
  In Go, this setup results in significantly worse performance since @P2@ and @C2@ cannot operate in parallel with @P1@ and @C1@ due to all locks being acquired.
- Interesting, this case may not be as pathological as it seems.
+ Interestingly, this case may not be as pathological as it seems.
  If the set of channels belonging to a \lstinline[language=Go]{select} have channels that overlap with the set of another \lstinline[language=Go]{select}, these statements lose the ability to operate in parallel.
  The implementation in \CFA only holds a single lock at a time, resulting in better locking granularity, and hence, more parallelism.