Changeset d4c1d1f
- Timestamp:
- Jul 20, 2023, 10:46:41 PM (16 months ago)
- Branches:
- master
- Children:
- 1ae3f185
- Parents:
- 47b7142
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/colby_parsons_MMAth/text/waituntil.tex
r47b7142 rd4c1d1f 522 522 523 523 It was deemed important that exclusive-or semantics are maintained when only @or@ operators are used, so this situation has been special-cased, and is handled by having all clauses race to set a value \emph{before} operating on the channel. 524 Consider the following example with three threads running concurrently. 525 \begin{cfa} 526 channel A, B; // zero size channels 527 528 // Thread 1: 524 Consider the following example where thread 1 is reading and threads 2 and 3 are writing to channels @A@ and @B@ concurrently. 525 \begin{cquote} 526 \begin{tabular}{@{}l|l|l@{}} 527 \multicolumn{3}{@{}l@{}}{\lstinline{channel A, B; // zero size channels}} \\ 528 thread 1 & thread 2 & thread 3 \\ 529 \begin{cfa} 529 530 waituntil( i << A ) {} 530 531 or waituntil( i << B ) {} 531 532 // Thread 2: 533 waituntil( A << 1 ) {} 534 535 // Thread 3: 536 waituntil( B << 2 ) {} 537 \end{cfa} 538 539 For Thread 1 to have exclusive-or semantics, it must only consume from exactly one of @A@ or @B@. 540 As such, Thread 2 and 3 must race to establish the winning clause of the @waituntil@ in Thread 1. 541 This race is consolidated by Thread 2 and 3 each attempting to set a pointer to the winning clause's @select_node@ address using \gls{cas}. 542 The winner will successfully \gls{cas} the address, insert the value and then signal Thread 1, whereas the loser will fail their \gls{cas}. 543 If there is space in the channel, the losing thread may proceed with their channel operation and not signal, otherwise if there is no space they block as they would in a standard channel insert operation. 544 It is important that threads race \emph{before} operating on the channel, since if they lose the race they may need to take a different action with respect to their channel operation. 545 In the example case, the winner will handoff their value to Thread 1 and the loser will block. 546 If the race was consolidated after the operation, both Thread 2 and 3 could potentially write into @i@ concurrently, which would be erroneous behaviour. 547 548 Channels introduce another interesting consideration in their implementation. 532 \end{cfa} 533 & 534 \begin{cfa} 535 A << 1; 536 537 \end{cfa} 538 & 539 \begin{cfa} 540 B << 2; 541 542 \end{cfa} 543 \end{tabular} 544 \end{cquote} 545 For thread 1 to have exclusive-or semantics, it must only consume from exactly one of @A@ or @B@. 546 As such, thread 2 and 3 must race to establish the winning clause of the @waituntil@ in thread 1. 547 This race is consolidated by thread 2 and 3 each attempting to set a pointer to the winning clause's @select_node@ address using \gls{cas}. 548 The winner bypasses the channel and inserts into the WUT's left-hand, and signals thread 1. 549 The loser continues checking if there is space in the channel, and if so performs the channel insert operation with a possible signal of a waiting remove thread; 550 otherwise, if there is no space, the loser blocks. 551 It is important the race occurs \emph{before} operating on the channel, because channel actions are different with respect to each thread. 552 If the race was consolidated after the operation, both thread 2 and 3 could potentially write into @i@ concurrently. 553 554 Channels introduce another interesting implementation issue. 549 555 Supporting both reading and writing to a channel in a @waituntil@ means that one @waituntil@ clause may be the notifier of another @waituntil@ clause. 550 556 This poses a problem when dealing with the special-cased @or@ where the clauses need to win a race to operate on a channel. 551 552 557 Consider the following example, alongside a described ordering of events to highlight the race. 553 Assume Thread 1 executes first, registers with channel @A@ and proceeds since it is empty and then is interrupted before registering with @B@. 554 Thread 2 similarly registers with channel @B@, and proceeds since it doesn't have space to insert and then is interrupted before registering with @A@. 555 At this point Thread 1 and 2 resume execution. 558 \begin{cquote} 559 \begin{tabular}{@{}l|l@{}} 560 \multicolumn{2}{@{}l@{}}{\lstinline{channel A, B; // zero size channels}} \\ 561 thread 1 & thread 2 \\ 562 \begin{cfa} 563 waituntil( @i << A@ ) {} 564 or waituntil( @i << B@ ) {} 565 \end{cfa} 566 & 567 \begin{cfa} 568 waituntil( @B << 2}@ ) {} 569 or waituntil( @A << 1@ ) {} 570 \end{cfa} 571 \end{tabular} 572 \end{cquote} 573 Assume thread 1 executes first, registers with channel @A@ and proceeds, since it is empty, and then is interrupted before registering with @B@. 574 Thread 2 similarly registers with channel @B@, and proceeds, since it does not have space to insert, and then is interrupted before registering with @A@. 575 At this point, thread 1 and 2 resume execution. 556 576 There is now a race that must be dealt with on two fronts. 557 If Thread 1 and 2 only race to \gls{cas} a winning clause for their own @waituntil@, Thread 1 may think that it successfully removed from @B@ and Thread 2 may think it successfully inserted into @A@, when only one of these operations can occur. 558 \begin{cfa} 559 channel A, B; // zero size channels 560 561 // Thread 1: 562 waituntil( i << A ) {} 563 or waituntil( i << B ) {} 564 565 // Thread 2: 566 waituntil( B << 2 ) {} 567 or waituntil( A << 1 ) {} 568 \end{cfa} 569 570 Go solves this problem in their select statement by acquiring the internal locks of all channels before registering the select on the channels. 571 This eliminates the race shown above since both Thread 1 and 2 cannot both be registering at the same time. 572 This approach is not used in \CFA since the @waituntil@ is polymorphic. 577 If thread 1 and 2 only race to the \gls{cas}, \ie a clause in their own @waituntil@, thread 1 can think that it successfully removed from @B@, and thread 2 may think it successfully inserted into @A@, when only one of these operations occurs. 578 579 The Go @select@ solves this problem by acquiring all the internal locks of the channels before registering the @select@ on the channels. 580 This approach eliminates the race shown above since thread 1 and 2 cannot both be registering at the same time. 581 However, this approach cannot be used in \CFA, since the @waituntil@ is polymorphic. 573 582 Not all types in a @waituntil@ have an internal lock, and when using non-channel types acquiring all the locks incurs extra unneeded overhead. 574 583 Instead, this race is consolidated in \CFA in two phases by having an intermediate pending status value for the race. … … 577 586 If either thread successfully sets the the other thread's @waituntil@ race pointer, then the operation can proceed, if not the signalling threads set its own race pointer back to the initial value and repeats. 578 587 This retry mechanism can potentially introduce a livelock, but in practice a livelock here is highly unlikely. 579 Furthermore, the likelihood of a livelock here is zero unless the program is in the niche case of having two or more exclusive-or waituntils with 2 or more clauses in reverse order of priority. 580 This livelock case could be fully eliminated using locks like Go, or if a \gls{dcas} instruction is available. 581 If any other threads attempt to set a WUT's race pointer and see a pending value, they will wait until the value changes before proceeding to ensure that in the case that the WUT fails, the signal will not be lost. 582 This protocol ensures that signals will not be lost and that the two races can be resolved in a safe manner. 583 584 Channels in \CFA have exception based shutdown mechanisms that the @waituntil@ statement needs to support. 588 Furthermore, the likelihood of a livelock here is zero unless the program is in the niche case of having two or more exclusive-or @waituntil@s with two or more clauses in reverse order of priority. 589 This livelock case can be fully eliminated using locks like Go, or if a \gls{dcas} instruction is available. 590 If any other threads attempt to set a WUT's race pointer and see a pending value, they wait until the value changes before proceeding to ensure that, in the case the WUT fails, the signal is not lost. 591 This protocol ensures that signals cannot be lost and that the two races can be resolved in a safe manner. 592 \PAB{I bet one of the readers is going to ask you to write the pseudo code for this algorithm.} 593 594 Channels in \CFA have exception-based shutdown mechanisms that the @waituntil@ statement needs to support. 585 595 These exception mechanisms are supported through the @on_selected@ routine. 586 This routine is needed by channels to detect if they are closed upon waking from a @waituntil@ statement, to ensure thatthe appropriate behaviour is taken and an exception is thrown.596 This routine is needed by channels to detect if they are closed after unblocking in a @waituntil@ statement, to ensure the appropriate behaviour is taken and an exception is thrown. 587 597 588 598 \subsection{Guards and Statement Predicate}\label{s:wu_guards} 589 Checking for when a synchronous multiplexing utility is done is trivial when it has an or/xor relationship, since any resource becoming available means that the blocked thread can proceed. 590 In \uC and \CFA, their \gls{synch_multiplex} utilities involve both an @and@ and @or@ operator, which make the problem of checking for completion of the statement more difficult. 599 600 It is trivial to check when a synchronous multiplexing utility is done for the or/xor relationship, since any resource becoming available means that the blocked thread can proceed and the @waituntil@ statement is finished. 601 In \uC and \CFA, the \gls{synch_multiplex} mechanism have both an and/or relationship, which make the problem of checking for completion of the statement difficult. 602 \PAB{Show an example of why this is difficult.} 591 603 592 604 In the \uC @_Select@ statement, this problem is solved by constructing a tree of the resources, where the internal nodes are operators and the leaves are booleans storing the state of each resource. 593 605 The internal nodes also store the statuses of the two subtrees beneath them. 594 When resources become available, their corresponding leaf node status is modified and then percolates up into the internal nodesto update the state of the statement.606 When resources become available, their corresponding leaf node status is modified, which percolates up the tree to update the state of the statement. 595 607 Once the root of the tree has both subtrees marked as @true@ then the statement is complete. 596 As an optimization, when the internal nodes are updated, the ir subtrees marked as @true@ are pruned and are not touched again.608 As an optimization, when the internal nodes are updated, the subtrees marked as @true@ are pruned and not examined again. 597 609 To support statement guards in \uC, the tree prunes a branch if the corresponding guard is false. 610 \PAB{Show an example.} 598 611 599 612 The \CFA @waituntil@ statement blocks a thread until a set of resources have become available that satisfy the underlying predicate. 600 613 The waiting condition of the @waituntil@ statement can be represented as a predicate over the resources, joined by the @waituntil@ operators, where a resource is @true@ if it is available, and @false@ otherwise. 601 614 In \CFA, this representation is used as the mechanism to check if a thread is done waiting on the @waituntil@. 602 Leveraging the compiler, a predicate routine is generated per @waituntil@ that when passe dthe statuses of the resources, returns @true@ when the @waituntil@ is done, and false otherwise.615 Leveraging the compiler, a predicate routine is generated per @waituntil@ that when passes the statuses of the resources, returns @true@ when the @waituntil@ is done, and false otherwise. 603 616 To support guards on the \CFA @waituntil@ statement, the status of a resource disabled by a guard is set to a boolean value that ensures that the predicate function behaves as if that resource is no longer part of the predicate. 604 605 \uC's @_Select@, supports operators both inside and outside of the clauses. 606 \ eg in the following example the code blocks will run once their corresponding predicate inside the round braces is satisfied.607 617 \PAB{Show an example.} 618 619 \uC's @_Select@, supports operators both inside and outside of the \lstinline[language=uC++]{_Select} clauses. 620 In the following example, the code blocks run once their corresponding predicate inside the round braces is satisfied. 608 621 % C_TODO put this is uC++ code style not cfa-style 609 \begin{ cfa}622 \begin{lstlisting}[language=uC++] 610 623 Future_ISM<int> A, B, C, D; 611 624 _Select( A || B && C ) { ... } 612 625 and _Select( D && E ) { ... } 613 \end{cfa} 614 615 This is more expressive that the @waituntil@ statement in \CFA. 616 In \CFA, since the @waituntil@ statement supports more resources than just futures, implementing operators inside clauses was avoided for a few reasons. 617 As a motivating example, suppose \CFA supported operators inside clauses and consider the code snippet in Figure~\ref{f:wu_inside_op}. 618 619 \begin{figure} 626 \end{lstlisting} 627 This is more expressive that the @waituntil@ statement in \CFA, allowing the code block for @&&@ to only run after both resources are available. 628 629 In \CFA, since the @waituntil@ statement supports more resources than just futures, implementing operators inside clauses is avoided for a few reasons. 630 As a motivating example, suppose \CFA supported operators inside clauses as in: 620 631 \begin{cfa} 621 632 owner_lock A, B, C, D; … … 623 634 or waituntil( C && D ) { ... } 624 635 \end{cfa} 625 \caption{Example of unsupported operators inside clauses in \CFA.} 626 \label{f:wu_inside_op} 627 \end{figure} 628 629 If the @waituntil@ in Figure~\ref{f:wu_inside_op} works with the same semantics as described and acquires each lock as it becomes available, it opens itself up to possible deadlocks since it is now holding locks and waiting on other resources. 630 Other semantics would be needed to ensure that this operation is safe. 631 One possibility is to use \CC's @scoped_lock@ approach that was described in Section~\ref{s:DeadlockAvoidance}, however the potential for livelock leaves much to be desired. 632 Another possibility would be to use resource ordering similar to \CFA's @mutex@ statement, but that alone is not sufficient if the resource ordering is not used everywhere. 636 If the @waituntil@ acquires each lock as it becomes available, there is a possible deadlock since it is in a hold and wait situation. 637 Other semantics are needed to ensure this operation is safe. 638 One possibility is to use \CC's @scoped_lock@ approach described in Section~\ref{s:DeadlockAvoidance}; 639 however, that opens the potential for livelock. 640 Another possibility is to use resource ordering similar to \CFA's @mutex@ statement, but that alone is insufficient, if the resource ordering is not used universally. 633 641 Additionally, using resource ordering could conflict with other semantics of the @waituntil@ statement. 634 To show this conflict, consider if the locks in Figure~\ref{f:wu_inside_op} were ordered @D@, @B@, @C@, @A@.635 If all the locks are available, it becomes complex to both respect the ordering of the @waituntil@ in Figure~\ref{f:wu_inside_op} when choosing which code block to run and also respect the lock ordering of @D@, @B@, @C@, @A@ at the same time. 642 For example, consider if the locks in the example must be acquired in the order @D@, @B@, @C@, @A@ because of other @waituntil@ statements. 643 \PAB{I don't understand: If all the locks are available, it becomes complex to both respect the ordering of the \lstinline{waituntil} when choosing which code block to run and also respect the lock ordering of \lstinline{D}, \lstinline{B}, \lstinline{C}, \lstinline{A} at the same time.} 636 644 One other way this could be implemented is to wait until all resources for a given clause are available before proceeding to acquire them, but this also quickly becomes a poor approach. 637 This approach won't work due to TOCTOU issues; it is not possible to ensure that the full set resources are available without holding them all first. 638 Operators inside clauses in \CFA could potentially be implemented with careful circumvention of the problems involved, but it was not deemed an important feature when taking into account the runtime cost that would need to be paid to handle these situations. 645 This approach does not work due to \gls{toctou} issues; 646 it is impossible to ensure that the full set of resources are available without holding them all first. 647 Operators inside clauses in \CFA could potentially be implemented with careful circumvention of the problems involved, but it was not deemed an important feature when taking into account the runtime cost need paid to handle this situation. 639 648 The problem of operators inside clauses also becomes a difficult issue to handle when supporting channels. 640 I f internal operators were supported, it would require some way to ensure that channels used with internal operators are modified on if and only ifthe corresponding code block is run, but that is not feasible due to reasons described in the exclusive-or portion of Section~\ref{s:wu_chans}.649 It would require some way to ensure channels used with internal operators are modified, if and only if, the corresponding code block is run, but that is not feasible due to reasons described in the exclusive-or portion of Section~\ref{s:wu_chans}. 641 650 642 651 \subsection{The full \lstinline{waituntil} picture} 643 Now that the details have been discussed, the full pseudocode of the waituntil is presented in Figure~\ref{f:WU_Full_Impl}. 652 653 Now the details have been discussed, the full pseudocode of the @waituntil@ is presented in Figure~\ref{f:WU_Full_Impl}. 654 Some things to note are as follows. 655 The @finally@ blocks provide exception-safe \gls{raii} unregistering of nodes, and in particular, the @finally@ inside the innermost loop performs the immediate unregistering required for deadlock-freedom mentioned in Section~\ref{s:wu_locks}. 656 The @when_conditions@ array is used to store the boolean result of evaluating each guard at the beginning of the @waituntil@, and it is used to conditionally omit operations on resources with @false@ guards. 657 As discussed in Section~\ref{s:wu_chans}, this pseudocode includes conditional code-block execution based on the result of both @on_selected@ and @unregister_select@, which allows the channel implementation to ensure all available channel resources have their corresponding code block run. 644 658 645 659 \begin{figure} … … 682 696 \label{f:WU_Full_Impl} 683 697 \end{figure} 684 685 In comparison to Figure~\ref{f:WU_Impl}, this pseudocode now includes the specifics discussed in this chapter.686 Some things to note are as follows:687 The @finally@ blocks provide exception-safe RAII unregistering of nodes, and in particular, the @finally@ inside the innermost loop performs the immediate unregistering required for deadlock-freedom that was mentioned in Section~\ref{s:wu_locks}.688 The @when_conditions@ array is used to store the boolean result of evaluating each guard at the beginning of the @waituntil@, and it is used to conditionally omit operations on resources with @false@ guards.689 As discussed in Section~\ref{s:wu_chans}, this pseudocode includes code blocks conditional on the result of both @on_selected@ and @unregister_select@, which allows the channel implementation to ensure that all available channel resources will have their corresponding code block run.690 698 691 699 \section{Waituntil Performance}
Note: See TracChangeset
for help on using the changeset viewer.