Changes in / [3fc59bdb:7bdcac1]


Ignore:
Files:
3 deleted
4 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/concurrency/Paper.tex

    r3fc59bdb r7bdcac1  
    580580\subsection{\protect\CFA's Thread Building Blocks}
    581581
    582 An important missing feature in C is threading\footnote{While the C11 standard defines a \protect\lstinline@threads.h@ header, it is minimal and defined as optional.
     582An important missing feature in C is threading\footnote{While the C11 standard defines a ``threads.h'' header, it is minimal and defined as optional.
    583583As such, library support for threading is far from widespread.
    584 At the time of writing the paper, neither \protect\lstinline@gcc@ nor \protect\lstinline@clang@ support \protect\lstinline@threads.h@ in their standard libraries.}.
     584At the time of writing the paper, neither \protect\lstinline|gcc| nor \protect\lstinline|clang| support ``threads.h'' in their standard libraries.}.
    585585In modern programming languages, a lack of threading is unacceptable~\cite{Sutter05, Sutter05b}, and therefore existing and new programming languages must have tools for writing efficient concurrent programs to take advantage of parallelism.
    586586As an extension of C, \CFA needs to express these concepts in a way that is as natural as possible to programmers familiar with imperative languages.
     
    11401140}
    11411141\end{cfa}
    1142 A consequence of the strongly typed approach to main is that memory layout of parameters and return values to/from a thread are now explicitly specified in the \textbf{API}.
     1142A consequence of the strongly typed approach to main is that memory layout of parameters and return values to/from a thread are now explicitly specified in the \textbf{api}.
    11431143\end{comment}
    11441144
     
    14431443\label{s:InternalScheduling}
    14441444
    1445 While monitor mutual-exclusion provides safe access to shared data, the monitor data may indicate that a thread accessing it cannot proceed.
    1446 For example, Figure~\ref{f:GenericBoundedBuffer} shows a bounded buffer that may be full/empty so produce/consumer threads must block.
     1445While monitor mutual-exclusion provides safe access to shared data, the monitor data may indicate that a thread accessing it cannot proceed, \eg a bounded buffer, Figure~\ref{f:GenericBoundedBuffer}, may be full/empty so produce/consumer threads must block.
    14471446Leaving the monitor and trying again (busy waiting) is impractical for high-level programming.
    14481447Monitors eliminate busy waiting by providing internal synchronization to schedule threads needing access to the shared data, where the synchronization is blocking (threads are parked) versus spinning.
    1449 Synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} defines which thread acquires the critical section next.
     1448The synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} is defined as indicating which thread acquires the critical section next.
    14501449\newterm{Internal scheduling} is characterized by each thread entering the monitor and making an individual decision about proceeding or blocking, while \newterm{external scheduling} is characterized by an entering thread making a decision about proceeding for itself and on behalf of other threads attempting entry.
    14511450
     
    15381537External scheduling is controlled by the @waitfor@ statement, which atomically blocks the calling thread, releases the monitor lock, and restricts the routine calls that can next acquire mutual exclusion.
    15391538If the buffer is full, only calls to @remove@ can acquire the buffer, and if the buffer is empty, only calls to @insert@ can acquire the buffer.
    1540 Threads making calls to routines that are currently excluded block outside (external) of the monitor on a calling queue, versus blocking on condition queues inside (internal) of the monitor.
    1541 
    1542 For internal scheduling, non-blocking signalling (as in the producer/consumer example) is used when the signaller is providing the cooperation for a waiting thread;
    1543 the signaller enters the monitor and changes state, detects a waiting threads that can use the state, performs a non-blocking signal on the condition queue for the waiting thread, and exits the monitor to run concurrently.
    1544 The waiter unblocks next, takes the state, and exits the monitor.
    1545 Blocking signalling is the reverse, where the waiter is providing the cooperation for the signalling thread;
    1546 the signaller enters the monitor, detects a waiting thread providing the necessary state, performs a blocking signal to place it on the urgent queue and unblock the waiter.
    1547 The waiter changes state and exits the monitor, and the signaller unblocks next from the urgent queue to take the state.
    1548 
    1549 Figure~\ref{f:DatingService} shows a dating service demonstrating the two forms of signalling: non-blocking and blocking.
    1550 The dating service matches girl and boy threads with matching compatibility codes so they can exchange phone numbers.
    1551 A thread blocks until an appropriate partner arrives.
    1552 The complexity is exchanging phone number in the monitor,
    1553 While the non-barging monitor prevents a caller from stealing a phone number, the monitor mutual-exclusion property
    1554 
    1555 The dating service is an example of a monitor that cannot be written using external scheduling because:
    1556 
    1557 The example in table \ref{tbl:datingservice} highlights the difference in behaviour.
    1558 As mentioned, @signal@ only transfers ownership once the current critical section exits; this behaviour requires additional synchronization when a two-way handshake is needed.
    1559 To avoid this explicit synchronization, the @condition@ type offers the @signal_block@ routine, which handles the two-way handshake as shown in the example.
    1560 This feature removes the need for a second condition variables and simplifies programming.
    1561 Like every other monitor semantic, @signal_block@ uses barging prevention, which means mutual-exclusion is baton-passed both on the front end and the back end of the call to @signal_block@, meaning no other thread can acquire the monitor either before or after the call.
     1539Threads making calls to routines that are currently excluded block outside (externally) of the monitor on a calling queue, versus blocking on condition queues inside the monitor.
     1540
     1541Both internal and external scheduling extend to multiple monitors in a natural way.
     1542\begin{cfa}
     1543monitor M { `condition e`; ... };
     1544void foo( M & mutex m1, M & mutex m2 ) {
     1545        ... wait( `e` ); ...                                    $\C{// wait( e, m1, m2 )}$
     1546        ... wait( `e, m1` ); ...
     1547        ... wait( `e, m2` ); ...
     1548}
     1549
     1550void rtn$\(_1\)$( M & mutex m1, M & mutex m2 );
     1551void rtn$\(_2\)$( M & mutex m1 );
     1552void bar( M & mutex m1, M & mutex m2 ) {
     1553        ... waitfor( `rtn` ); ...                               $\C{// waitfor( rtn\(_1\), m1, m2 )}$
     1554        ... waitfor( `rtn, m1` ); ...                   $\C{// waitfor( rtn\(_2\), m1 )}$
     1555}
     1556\end{cfa}
     1557For @wait( e )@, the default semantics is to atomically block the signaller and release all acquired mutex types in the parameter list, \ie @wait( e, m1, m2 )@.
     1558To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @wait( e, m1 )@.
     1559Wait statically verifies the released monitors are the acquired mutex-parameters so unconditional release is safe.
     1560Similarly, for @waitfor( rtn, ... )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@.
     1561To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @waitfor( rtn, m1 )@.
     1562Waitfor statically verifies the released monitors are the same as the acquired mutex-parameters of the given routine or routine pointer.
     1563To statically verify the released monitors match with the accepted routine's mutex parameters, the routine (pointer) prototype must be accessible.
     1564
     1565Given the ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.
     1566\begin{cfa}
     1567void foo( M & mutex m1, M & mutex m2 ) {
     1568        ... wait( `e, m1` ); ...                                $\C{// release m1, keeping m2 acquired )}$
     1569void baz( M & mutex m1, M & mutex m2 ) {        $\C{// must acquire m1 and m2 )}$
     1570        ... signal( `e` ); ...
     1571\end{cfa}
     1572The @wait@ only releases @m1@ so the signalling thread cannot acquire both @m1@ and @m2@ to  enter @baz@ to get to the @signal@.
     1573While deadlock issues can occur with multiple/nesting acquisition, this issue results from the fact that locks, and by extension monitors, are not perfectly composable.
     1574
     1575Finally, an important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads?
     1576If barging is allowed, synchronization between a singller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
     1577\begin{quote}
     1578However, we decree that a signal operation be followed immediately by resumption of a waiting program, without possibility of an intervening procedure call from yet a third program.
     1579It is only in this way that a waiting program has an absolute guarantee that it can acquire the resource just released by the signalling program without any danger that a third program will interpose a monitor entry and seize the resource instead.~\cite[p.~550]{Hoare74}
     1580\end{quote}
     1581\CFA scheduling \emph{precludes} barging, which simplifies synchronization among threads in the monitor and increases correctness.
     1582For example, there are no loops in either bounded buffer solution in Figure~\ref{f:GenericBoundedBuffer}.
     1583Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
     1584
     1585
     1586\subsection{Barging Prevention}
     1587
     1588Figure~\ref{f:BargingPrevention} shows \CFA code where bulk acquire adds complexity to the internal-signalling semantics.
     1589The complexity begins at the end of the inner @mutex@ statement, where the semantics of internal scheduling need to be extended for multiple monitors.
     1590The problem is that bulk acquire is used in the inner @mutex@ statement where one of the monitors is already acquired.
     1591When the signalling thread reaches the end of the inner @mutex@ statement, it should transfer ownership of @m1@ and @m2@ to the waiting thread to prevent barging into the outer @mutex@ statement by another thread.
     1592However, both the signalling and signalled threads still need monitor @m1@.
     1593
     1594\begin{figure}
     1595\newbox\myboxA
     1596\begin{lrbox}{\myboxA}
     1597\begin{cfa}[aboveskip=0pt,belowskip=0pt]
     1598monitor M m1, m2;
     1599condition c;
     1600mutex( m1 ) {
     1601        ...
     1602        mutex( m1, m2 ) {
     1603                ... `wait( c )`; // block and release m1, m2
     1604                // m1, m2 acquired
     1605        } // $\LstCommentStyle{\color{red}release m2}$
     1606        // m1 acquired
     1607} // release m1
     1608\end{cfa}
     1609\end{lrbox}
     1610
     1611\newbox\myboxB
     1612\begin{lrbox}{\myboxB}
     1613\begin{cfa}[aboveskip=0pt,belowskip=0pt]
     1614
     1615
     1616mutex( m1 ) {
     1617        ...
     1618        mutex( m1, m2 ) {
     1619                ... `signal( c )`; ...
     1620                // m1, m2 acquired
     1621        } // $\LstCommentStyle{\color{red}release m2}$
     1622        // m1 acquired
     1623} // release m1
     1624\end{cfa}
     1625\end{lrbox}
     1626
     1627\newbox\myboxC
     1628\begin{lrbox}{\myboxC}
     1629\begin{cfa}[aboveskip=0pt,belowskip=0pt]
     1630
     1631
     1632mutex( m1 ) {
     1633        ... `wait( c )`; ...
     1634        // m1 acquired
     1635} // $\LstCommentStyle{\color{red}release m1}$
     1636
     1637
     1638
     1639
     1640\end{cfa}
     1641\end{lrbox}
     1642
     1643\begin{cquote}
     1644\subfloat[Waiting Thread]{\label{f:WaitingThread}\usebox\myboxA}
     1645\hspace{2\parindentlnth}
     1646\subfloat[Signalling Thread]{\label{f:SignallingThread}\usebox\myboxB}
     1647\hspace{2\parindentlnth}
     1648\subfloat[Other Waiting Thread]{\label{f:SignallingThread}\usebox\myboxC}
     1649\end{cquote}
     1650\caption{Barging Prevention}
     1651\label{f:BargingPrevention}
     1652\end{figure}
     1653
     1654The obvious solution to the problem of multi-monitor scheduling is to keep ownership of all locks until the last lock is ready to be transferred.
     1655It can be argued that that moment is when the last lock is no longer needed, because this semantics fits most closely to the behaviour of single-monitor scheduling.
     1656This solution has the main benefit of transferring ownership of groups of monitors, which simplifies the semantics from multiple objects to a single group of objects, effectively making the existing single-monitor semantic viable by simply changing monitors to monitor groups.
     1657This solution releases the monitors once every monitor in a group can be released.
     1658However, since some monitors are never released (\eg the monitor of a thread), this interpretation means a group might never be released.
     1659A more interesting interpretation is to transfer the group until all its monitors are released, which means the group is not passed further and a thread can retain its locks.
     1660
     1661However, listing \ref{f:int-secret} shows this solution can become much more complicated depending on what is executed while secretly holding B at line \ref{line:secret}, while avoiding the need to transfer ownership of a subset of the condition monitors.
     1662Figure~\ref{f:dependency} shows a slightly different example where a third thread is waiting on monitor @A@, using a different condition variable.
     1663Because the third thread is signalled when secretly holding @B@, the goal  becomes unreachable.
     1664Depending on the order of signals (listing \ref{f:dependency} line \ref{line:signal-ab} and \ref{line:signal-a}) two cases can happen:
     1665
     1666\begin{comment}
     1667\paragraph{Case 1: thread $\alpha$ goes first.} In this case, the problem is that monitor @A@ needs to be passed to thread $\beta$ when thread $\alpha$ is done with it.
     1668\paragraph{Case 2: thread $\beta$ goes first.} In this case, the problem is that monitor @B@ needs to be retained and passed to thread $\alpha$ along with monitor @A@, which can be done directly or possibly using thread $\beta$ as an intermediate.
     1669\\
     1670
     1671Note that ordering is not determined by a race condition but by whether signalled threads are enqueued in FIFO or FILO order.
     1672However, regardless of the answer, users can move line \ref{line:signal-a} before line \ref{line:signal-ab} and get the reverse effect for listing \ref{f:dependency}.
     1673
     1674In both cases, the threads need to be able to distinguish, on a per monitor basis, which ones need to be released and which ones need to be transferred, which means knowing when to release a group becomes complex and inefficient (see next section) and therefore effectively precludes this approach.
     1675
     1676
     1677\subsubsection{Dependency graphs}
     1678
     1679\begin{figure}
     1680\begin{multicols}{3}
     1681Thread $\alpha$
     1682\begin{cfa}[numbers=left, firstnumber=1]
     1683acquire A
     1684        acquire A & B
     1685                wait A & B
     1686        release A & B
     1687release A
     1688\end{cfa}
     1689\columnbreak
     1690Thread $\gamma$
     1691\begin{cfa}[numbers=left, firstnumber=6, escapechar=|]
     1692acquire A
     1693        acquire A & B
     1694                |\label{line:signal-ab}|signal A & B
     1695        |\label{line:release-ab}|release A & B
     1696        |\label{line:signal-a}|signal A
     1697|\label{line:release-a}|release A
     1698\end{cfa}
     1699\columnbreak
     1700Thread $\beta$
     1701\begin{cfa}[numbers=left, firstnumber=12, escapechar=|]
     1702acquire A
     1703        wait A
     1704|\label{line:release-aa}|release A
     1705\end{cfa}
     1706\end{multicols}
     1707\begin{cfa}[caption={Pseudo-code for the three thread example.},label={f:dependency}]
     1708\end{cfa}
     1709\begin{center}
     1710\input{dependency}
     1711\end{center}
     1712\caption{Dependency graph of the statements in listing \ref{f:dependency}}
     1713\label{fig:dependency}
     1714\end{figure}
     1715
     1716In listing \ref{f:int-bulk-cfa}, there is a solution that satisfies both barging prevention and mutual exclusion.
     1717If ownership of both monitors is transferred to the waiter when the signaller releases @A & B@ and then the waiter transfers back ownership of @A@ back to the signaller when it releases it, then the problem is solved (@B@ is no longer in use at this point).
     1718Dynamically finding the correct order is therefore the second possible solution.
     1719The problem is effectively resolving a dependency graph of ownership requirements.
     1720Here even the simplest of code snippets requires two transfers and has a super-linear complexity.
     1721This complexity can be seen in listing \ref{f:explosion}, which is just a direct extension to three monitors, requires at least three ownership transfer and has multiple solutions.
     1722Furthermore, the presence of multiple solutions for ownership transfer can cause deadlock problems if a specific solution is not consistently picked; In the same way that multiple lock acquiring order can cause deadlocks.
     1723\begin{figure}
     1724\begin{multicols}{2}
     1725\begin{cfa}
     1726acquire A
     1727        acquire B
     1728                acquire C
     1729                        wait A & B & C
     1730                release C
     1731        release B
     1732release A
     1733\end{cfa}
     1734
     1735\columnbreak
     1736
     1737\begin{cfa}
     1738acquire A
     1739        acquire B
     1740                acquire C
     1741                        signal A & B & C
     1742                release C
     1743        release B
     1744release A
     1745\end{cfa}
     1746\end{multicols}
     1747\begin{cfa}[caption={Extension to three monitors of listing \ref{f:int-bulk-cfa}},label={f:explosion}]
     1748\end{cfa}
     1749\end{figure}
     1750
     1751Given the three threads example in listing \ref{f:dependency}, figure \ref{fig:dependency} shows the corresponding dependency graph that results, where every node is a statement of one of the three threads, and the arrows the dependency of that statement (\eg $\alpha1$ must happen before $\alpha2$).
     1752The extra challenge is that this dependency graph is effectively post-mortem, but the runtime system needs to be able to build and solve these graphs as the dependencies unfold.
     1753Resolving dependency graphs being a complex and expensive endeavour, this solution is not the preferred one.
     1754
     1755\subsubsection{Partial Signalling} \label{partial-sig}
     1756\end{comment}
     1757
     1758Finally, the solution that is chosen for \CFA is to use partial signalling.
     1759Again using listing \ref{f:int-bulk-cfa}, the partial signalling solution transfers ownership of monitor @B@ at lines \ref{line:signal1} to the waiter but does not wake the waiting thread since it is still using monitor @A@.
     1760Only when it reaches line \ref{line:lastRelease} does it actually wake up the waiting thread.
     1761This solution has the benefit that complexity is encapsulated into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.
     1762This solution has a much simpler implementation than a dependency graph solving algorithms, which is why it was chosen.
     1763Furthermore, after being fully implemented, this solution does not appear to have any significant downsides.
     1764
     1765Using partial signalling, listing \ref{f:dependency} can be solved easily:
     1766\begin{itemize}
     1767        \item When thread $\gamma$ reaches line \ref{line:release-ab} it transfers monitor @B@ to thread $\alpha$ and continues to hold monitor @A@.
     1768        \item When thread $\gamma$ reaches line \ref{line:release-a}  it transfers monitor @A@ to thread $\beta$  and wakes it up.
     1769        \item When thread $\beta$  reaches line \ref{line:release-aa} it transfers monitor @A@ to thread $\alpha$ and wakes it up.
     1770\end{itemize}
     1771
     1772
     1773\subsection{Signalling: Now or Later}
    15621774
    15631775\begin{figure}
     
    16211833\subfloat[\lstinline@signal_block@]{\label{f:DatingSignalBlock}\usebox\myboxB}
    16221834\caption{Dating service. }
    1623 \label{f:DatingService}
     1835\label{f:Dating service}
    16241836\end{figure}
    16251837
    1626 Both internal and external scheduling extend to multiple monitors in a natural way.
    1627 \begin{cquote}
    1628 \begin{tabular}{@{}l@{\hspace{3\parindentlnth}}l@{}}
    1629 \begin{cfa}
    1630 monitor M { `condition e`; ... };
    1631 void foo( M & mutex m1, M & mutex m2 ) {
    1632         ... wait( `e` ); ...   // wait( e, m1, m2 )
    1633         ... wait( `e, m1` ); ...
    1634         ... wait( `e, m2` ); ...
    1635 }
    1636 \end{cfa}
    1637 &
    1638 \begin{cfa}
    1639 void rtn$\(_1\)$( M & mutex m1, M & mutex m2 );
    1640 void rtn$\(_2\)$( M & mutex m1 );
    1641 void bar( M & mutex m1, M & mutex m2 ) {
    1642         ... waitfor( `rtn` ); ...       // $\LstCommentStyle{waitfor( rtn\(_1\), m1, m2 )}$
    1643         ... waitfor( `rtn, m1` ); ... // $\LstCommentStyle{waitfor( rtn\(_2\), m1 )}$
    1644 }
    1645 \end{cfa}
    1646 \end{tabular}
    1647 \end{cquote}
    1648 For @wait( e )@, the default semantics is to atomically block the signaller and release all acquired mutex types in the parameter list, \ie @wait( e, m1, m2 )@.
    1649 To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @wait( e, m1 )@.
    1650 Wait statically verifies the released monitors are the acquired mutex-parameters so unconditional release is safe.
    1651 Finally, a signaller,
    1652 \begin{cfa}
    1653 void baz( M & mutex m1, M & mutex m2 ) {
    1654         ... signal( e ); ...
    1655 }
    1656 \end{cfa}
    1657 must have acquired monitor locks that are greater than or equal to the number of locks for the waiting thread signalled from the front of the condition queue.
    1658 In general, the signaller does not know the order of waiting threads, so in general, it must acquire the maximum number of mutex locks for the worst-case waiting thread.
    1659 
    1660 Similarly, for @waitfor( rtn )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@.
    1661 To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @waitfor( rtn, m1 )@.
    1662 Waitfor statically verifies the released monitors are the same as the acquired mutex-parameters of the given routine or routine pointer.
    1663 To statically verify the released monitors match with the accepted routine's mutex parameters, the routine (pointer) prototype must be accessible.
    1664 
    1665 Given the ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.
    1666 \begin{cfa}
    1667 void foo( M & mutex m1, M & mutex m2 ) {
    1668         ... wait( `e, m1` ); ...                                $\C{// release m1, keeping m2 acquired )}$
    1669 void baz( M & mutex m1, M & mutex m2 ) {        $\C{// must acquire m1 and m2 )}$
    1670         ... signal( `e` ); ...
    1671 \end{cfa}
    1672 The @wait@ only releases @m1@ so the signalling thread cannot acquire both @m1@ and @m2@ to  enter @baz@ to get to the @signal@.
    1673 While deadlock issues can occur with multiple/nesting acquisition, this issue results from the fact that locks, and by extension monitors, are not perfectly composable.
    1674 
    1675 Finally, an important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads?
    1676 If barging is allowed, synchronization between a singller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
    1677 \begin{quote}
    1678 However, we decree that a signal operation be followed immediately by resumption of a waiting program, without possibility of an intervening procedure call from yet a third program.
    1679 It is only in this way that a waiting program has an absolute guarantee that it can acquire the resource just released by the signalling program without any danger that a third program will interpose a monitor entry and seize the resource instead.~\cite[p.~550]{Hoare74}
    1680 \end{quote}
    1681 \CFA scheduling \emph{precludes} barging, which simplifies synchronization among threads in the monitor and increases correctness.
    1682 For example, there are no loops in either bounded buffer solution in Figure~\ref{f:GenericBoundedBuffer}.
    1683 Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
    1684 
    1685 
    1686 \subsection{Barging Prevention}
    1687 
    1688 Figure~\ref{f:BargingPrevention} shows \CFA code where bulk acquire adds complexity to the internal-signalling semantics.
    1689 The complexity begins at the end of the inner @mutex@ statement, where the semantics of internal scheduling need to be extended for multiple monitors.
    1690 The problem is that bulk acquire is used in the inner @mutex@ statement where one of the monitors is already acquired.
    1691 When the signalling thread reaches the end of the inner @mutex@ statement, it should transfer ownership of @m1@ and @m2@ to the waiting threads to prevent barging into the outer @mutex@ statement by another thread.
    1692 However, both the signalling and waiting thread W1 still need monitor @m1@.
    1693 
    1694 \begin{figure}
    1695 \newbox\myboxA
    1696 \begin{lrbox}{\myboxA}
    1697 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
    1698 monitor M m1, m2;
    1699 condition c;
    1700 mutex( m1 ) { // $\LstCommentStyle{\color{red}outer}$
    1701         ...
    1702         mutex( m1, m2 ) { // $\LstCommentStyle{\color{red}inner}$
    1703                 ... `signal( c )`; ...
    1704                 // m1, m2 acquired
    1705         } // $\LstCommentStyle{\color{red}release m2}$
    1706         // m1 acquired
    1707 } // release m1
    1708 \end{cfa}
    1709 \end{lrbox}
    1710 
    1711 \newbox\myboxB
    1712 \begin{lrbox}{\myboxB}
    1713 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
    1714 
    1715 
    1716 mutex( m1 ) {
    1717         ...
    1718         mutex( m1, m2 ) {
    1719                 ... `wait( c )`; // block and release m1, m2
    1720                 // m1, m2 acquired
    1721         } // $\LstCommentStyle{\color{red}release m2}$
    1722         // m1 acquired
    1723 } // release m1
    1724 \end{cfa}
    1725 \end{lrbox}
    1726 
    1727 \newbox\myboxC
    1728 \begin{lrbox}{\myboxC}
    1729 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
    1730 
    1731 
    1732 mutex( m2 ) {
    1733         ... `wait( c )`; ...
    1734         // m2 acquired
    1735 } // $\LstCommentStyle{\color{red}release m2}$
    1736 
    1737 
    1738 
    1739 
    1740 \end{cfa}
    1741 \end{lrbox}
    1742 
    1743 \begin{cquote}
    1744 \subfloat[Signalling Thread]{\label{f:SignallingThread}\usebox\myboxA}
    1745 \hspace{2\parindentlnth}
    1746 \subfloat[Waiting Thread (W1)]{\label{f:WaitingThread}\usebox\myboxB}
    1747 \hspace{2\parindentlnth}
    1748 \subfloat[Waiting Thread (W2)]{\label{f:OtherWaitingThread}\usebox\myboxC}
    1749 \end{cquote}
    1750 \caption{Barging Prevention}
    1751 \label{f:BargingPrevention}
    1752 \end{figure}
    1753 
    1754 One scheduling solution is for the signaller to keep ownership of all locks until the last lock is ready to be transferred, because this semantics fits most closely to the behaviour of single-monitor scheduling.
    1755 However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting is choices when the signaller finishes the inner mutex-statement.
    1756 The singaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@.
    1757 In the latter case, waiter W2 must eventually pass @m2@ to waiter W1, which is complex because W2 may have waited before W1 so it is unaware of W1.
    1758 Furthermore, there is an execution sequence where the signaller always finds waiter W2, and hence, waiter W1 starves.
    1759 
    1760 While a number of approaches were examined~\cite[\S~4.3]{Delisle18}, the solution chosen for \CFA is a novel techique called \newterm{partial signalling}.
    1761 Signalled threads are moved to an urgent queue and the waiter at the front defines the set of monitors necessary for it to unblock.
    1762 Partial signalling transfers ownership of monitors to the front waiter.
    1763 When the signaller thread exits or waits in the monitor the front waiter is unblocked if all its monitors are released.
    1764 This solution has the benefit that complexity is encapsulated into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.
    1765 
    1766 \begin{comment}
    1767 Figure~\ref{f:dependency} shows a slightly different example where a third thread is waiting on monitor @A@, using a different condition variable.
    1768 Because the third thread is signalled when secretly holding @B@, the goal  becomes unreachable.
    1769 Depending on the order of signals (listing \ref{f:dependency} line \ref{line:signal-ab} and \ref{line:signal-a}) two cases can happen:
    1770 
    1771 \paragraph{Case 1: thread $\alpha$ goes first.} In this case, the problem is that monitor @A@ needs to be passed to thread $\beta$ when thread $\alpha$ is done with it.
    1772 \paragraph{Case 2: thread $\beta$ goes first.} In this case, the problem is that monitor @B@ needs to be retained and passed to thread $\alpha$ along with monitor @A@, which can be done directly or possibly using thread $\beta$ as an intermediate.
    1773 \\
    1774 
    1775 Note that ordering is not determined by a race condition but by whether signalled threads are enqueued in FIFO or FILO order.
    1776 However, regardless of the answer, users can move line \ref{line:signal-a} before line \ref{line:signal-ab} and get the reverse effect for listing \ref{f:dependency}.
    1777 
    1778 In both cases, the threads need to be able to distinguish, on a per monitor basis, which ones need to be released and which ones need to be transferred, which means knowing when to release a group becomes complex and inefficient (see next section) and therefore effectively precludes this approach.
    1779 
    1780 
    1781 \subsubsection{Dependency graphs}
    1782 
    1783 \begin{figure}
    1784 \begin{multicols}{3}
    1785 Thread $\alpha$
    1786 \begin{cfa}[numbers=left, firstnumber=1]
    1787 acquire A
    1788         acquire A & B
    1789                 wait A & B
    1790         release A & B
    1791 release A
    1792 \end{cfa}
    1793 \columnbreak
    1794 Thread $\gamma$
    1795 \begin{cfa}[numbers=left, firstnumber=6, escapechar=|]
    1796 acquire A
    1797         acquire A & B
    1798                 |\label{line:signal-ab}|signal A & B
    1799         |\label{line:release-ab}|release A & B
    1800         |\label{line:signal-a}|signal A
    1801 |\label{line:release-a}|release A
    1802 \end{cfa}
    1803 \columnbreak
    1804 Thread $\beta$
    1805 \begin{cfa}[numbers=left, firstnumber=12, escapechar=|]
    1806 acquire A
    1807         wait A
    1808 |\label{line:release-aa}|release A
    1809 \end{cfa}
    1810 \end{multicols}
    1811 \begin{cfa}[caption={Pseudo-code for the three thread example.},label={f:dependency}]
    1812 \end{cfa}
    1813 \begin{center}
    1814 \input{dependency}
    1815 \end{center}
    1816 \caption{Dependency graph of the statements in listing \ref{f:dependency}}
    1817 \label{fig:dependency}
    1818 \end{figure}
    1819 
    1820 In listing \ref{f:int-bulk-cfa}, there is a solution that satisfies both barging prevention and mutual exclusion.
    1821 If ownership of both monitors is transferred to the waiter when the signaller releases @A & B@ and then the waiter transfers back ownership of @A@ back to the signaller when it releases it, then the problem is solved (@B@ is no longer in use at this point).
    1822 Dynamically finding the correct order is therefore the second possible solution.
    1823 The problem is effectively resolving a dependency graph of ownership requirements.
    1824 Here even the simplest of code snippets requires two transfers and has a super-linear complexity.
    1825 This complexity can be seen in listing \ref{f:explosion}, which is just a direct extension to three monitors, requires at least three ownership transfer and has multiple solutions.
    1826 Furthermore, the presence of multiple solutions for ownership transfer can cause deadlock problems if a specific solution is not consistently picked; In the same way that multiple lock acquiring order can cause deadlocks.
    1827 \begin{figure}
    1828 \begin{multicols}{2}
    1829 \begin{cfa}
    1830 acquire A
    1831         acquire B
    1832                 acquire C
    1833                         wait A & B & C
    1834                 release C
    1835         release B
    1836 release A
    1837 \end{cfa}
    1838 
    1839 \columnbreak
    1840 
    1841 \begin{cfa}
    1842 acquire A
    1843         acquire B
    1844                 acquire C
    1845                         signal A & B & C
    1846                 release C
    1847         release B
    1848 release A
    1849 \end{cfa}
    1850 \end{multicols}
    1851 \begin{cfa}[caption={Extension to three monitors of listing \ref{f:int-bulk-cfa}},label={f:explosion}]
    1852 \end{cfa}
    1853 \end{figure}
    1854 
    1855 Given the three threads example in listing \ref{f:dependency}, figure \ref{fig:dependency} shows the corresponding dependency graph that results, where every node is a statement of one of the three threads, and the arrows the dependency of that statement (\eg $\alpha1$ must happen before $\alpha2$).
    1856 The extra challenge is that this dependency graph is effectively post-mortem, but the runtime system needs to be able to build and solve these graphs as the dependencies unfold.
    1857 Resolving dependency graphs being a complex and expensive endeavour, this solution is not the preferred one.
    1858 
    1859 \subsubsection{Partial Signalling} \label{partial-sig}
    1860 \end{comment}
    1861 
    1862 
     1838An important note is that, until now, signalling a monitor was a delayed operation.
     1839The ownership of the monitor is transferred only when the monitor would have otherwise been released, not at the point of the @signal@ statement.
     1840However, in some cases, it may be more convenient for users to immediately transfer ownership to the thread that is waiting for cooperation, which is achieved using the @signal_block@ routine.
     1841
     1842The example in table \ref{tbl:datingservice} highlights the difference in behaviour.
     1843As mentioned, @signal@ only transfers ownership once the current critical section exits; this behaviour requires additional synchronization when a two-way handshake is needed.
     1844To avoid this explicit synchronization, the @condition@ type offers the @signal_block@ routine, which handles the two-way handshake as shown in the example.
     1845This feature removes the need for a second condition variables and simplifies programming.
     1846Like every other monitor semantic, @signal_block@ uses barging prevention, which means mutual-exclusion is baton-passed both on the front end and the back end of the call to @signal_block@, meaning no other thread can acquire the monitor either before or after the call.
     1847
     1848% ======================================================================
     1849% ======================================================================
    18631850\section{External scheduling} \label{extsched}
    1864 
     1851% ======================================================================
     1852% ======================================================================
    18651853An alternative to internal scheduling is external scheduling (see Table~\ref{tbl:sched}).
    18661854
  • doc/proposals/user_conversions.md

    r3fc59bdb r7bdcac1  
    55There is also a set of _explicit_ conversions that are only allowed through a
    66cast expression.
    7 I propose that safe, unsafe, and explicit (cast) conversions be expressed as
    8 constructor variants.
     7Based on Glen's notes on conversions [1], I propose that safe and unsafe
     8conversions be expressed as constructor variants, though I make explicit
     9(cast) conversions a constructor variant as well rather than a dedicated
     10operator.
    911Throughout this article, I will use the following operator names for
    1012constructors and conversion functions from `From` to `To`:
    1113
    12         void ?{} ( To&, To );            // copy constructor
    13         void ?{} ( To&, From );          // explicit constructor
    14         void ?{explicit} ( To&, From );  // explicit cast conversion
    15         void ?{safe} ( To&, From );      // implicit safe conversion
    16         void ?{unsafe} ( To&, From );    // implicit unsafe conversion
    17 
    18 It has been suggested that all constructors would define unsafe implicit
     14        void ?{} ( To*, To );            // copy constructor
     15        void ?{} ( To*, From );          // explicit constructor
     16        void ?{explicit} ( To*, From );  // explicit cast conversion
     17        void ?{safe} ( To*, From );      // implicit safe conversion
     18        void ?{unsafe} ( To*, From );    // implicit unsafe conversion
     19
     20[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
     21
     22Glen's design made no distinction between constructors and unsafe implicit
    1923conversions; this is elegant, but interacts poorly with tuples.
    2024Essentially, without making this distinction, a constructor like the following
     
    2226multiplying the space of possible interpretations of all functions:
    2327
    24         void ?{}( Coord& this, int x, int y );
     28        void ?{}( Coord *this, int x, int y );
    2529
    2630That said, it would certainly be possible to make a multiple-argument implicit
     
    2832used infrequently:
    2933
    30         void ?{unsafe}( Coord& this, int x, int y );
     34        void ?{unsafe}( Coord *this, int x, int y );
    3135
    3236An alternate possibility would be to only count two-arg constructors
    33 `void ?{} ( To&, From )` as unsafe conversions; under this semantics, safe and
     37`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and
    3438explicit conversions should also have a compiler-enforced restriction to
    3539ensure that they are two-arg functions (this restriction may be valuable
     
    3943is convertable to `To`.
    4044If user-defined conversions are not added to the language,
    41 `void ?{} ( To&, From )` may be a suitable representation, relying on
     45`void ?{} ( To*, From )` may be a suitable representation, relying on
    4246conversions on the argument types to account for transitivity.
    43 Since `To&` should be an exact match on `To`, this should put all the implicit
    44 conversions on the RHS.
    45 On the other hand, under some models (like [1]), implicit conversions are not
    46 allowed in assertion parameters, so another assertion syntax specific to
    47 conversions may be required, e.g. `From -> To`.
    48 It has also been suggested that, for programmer control, no implicit
    49 conversions (except, possibly, for polymorphic specialization) should be
    50 allowed in resolution of cast operators.
    51 
    52 [1] ../working/assertion_resolution.md
     47On the other hand, `To*` should perhaps match its target type exactly, so
     48another assertion syntax specific to conversions may be required, e.g.
     49`From -> To`.
    5350
    5451### Constructor Idiom ###
     
    5653that we can use the full range of Cforall features for conversions, including
    5754polymorphism.
    58 In an earlier version of this proposal, Glen Ditchfield defines a
    59 _constructor idiom_ that can be used to create chains of safe conversions
    60 without duplicating code; given a type `Safe` which members of another type
    61 `From` can be directly converted to, the constructor idiom allows us to write
    62 a conversion for any type `To` which `Safe` converts to:
    63 
    64         forall(otype To | { void ?{safe}( To&, Safe ) })
    65         void ?{safe}( To& this, From that ) {
     55Glen [1] defines a _constructor idiom_ that can be used to create chains of
     56safe conversions without duplicating code; given a type `Safe` which members
     57of another type `From` can be directly converted to, the constructor idiom
     58allows us to write a conversion for any type `To` which `Safe` converts to:
     59
     60        forall(otype To | { void ?{safe}( To*, Safe ) })
     61        void ?{safe}( To *this, From that ) {
    6662                Safe tmp = /* some expression involving that */;
    67                 this{ tmp }; // initialize from assertion parameter
     63                *this = tmp; // uses assertion parameter
    6864        }
    6965
     
    7167unsafe conversions.
    7268
    73 Glen's original suggestion said the copy constructor for `To` should also be
    74 accepted as a resolution for `void ?{safe}( To&, Safe )` (`Safe` == `To`),
    75 allowing this same code to be used for the single-step conversion as well.
    76 This proposal does come at the cost of an extra copy initialization of the
    77 target value, though.
    78 
    79 Contrariwise, if a monomorphic conversion from `From` to `Safe` is written,
    80 e.g:
    81 
    82         void ?{safe}( Safe& this, From that ) {
    83                 this{ /* some parameters involving that */ };
    84         }
    85 
    86 Then the code for a transitive conversion from `From` to any `To` type
    87 convertable from `Safe` is written:
    88 
    89         forall(otype To | { void ?{safe}( To&, Safe ) })
    90         void ?{safe}( To& this, From that ) {
    91                 Safe tmp = that;  // uses monomorphic conversion
    92                 this{ tmp };      // initialize from assertion parameter
    93         }
    94 
    95 Given the entirely-boilerplate nature of this code, but negative performance
    96 implications of the unmodified constructor idiom, it might be fruitful to have
    97 transitive and single step conversion operators, and let CFA build the
    98 transitive conversions; some possible names:
    99 
    100         void ?{safe}  (To&, From);    void ?{final safe} (To&, From);  // single-step
    101         void ?{safe*} (To&, From);    void ?{safe}       (To&, From);  // transitive
    102 
    10369What selective non-use of the constructor idiom gives us is the ability to
    10470define a conversion that may only be the *last* conversion in a chain of such.
    105 One use for this is to solve the problem that `explicit` conversions were
    106 added to C++ for, that of conversions to `bool` chaining to become conversions
    107 to any arithmetic type.
    108 Another use is to unambiguously represent the full hierarchy of implicit
    109 conversions in C by making sign conversions non-transitive, allowing the
    110 compiler to resolve e.g. `int -> unsigned long` as
    111 `int -> long -> unsigned long` over `int -> unsigned int -> unsigned long`.
    112 See [2] for more details.
    113 
    114 [2] ../working/glen_conversions/index.html#usual
     71Constructing a conversion graph able to unambiguously represent the full
     72hierarchy of implicit conversions in C is provably impossible using only
     73single-step conversions with no additional information (see Appendix A), but
     74this mechanism is sufficiently powerful (see [1], though the design there has
     75some minor bugs; the general idea is to use the constructor idiom to define
     76two chains of conversions, one among the signed integral types, another among
     77the unsigned, and to use monomorphic conversions to allow conversions between
     78signed and unsigned integer types).
    11579
    11680### Appendix A: Partial and Total Orders ###
     
    189153convert from `int` to `unsigned long`, so we just put in a direct conversion
    190154and make the compiler smart enough to figure out the costs" - this is the
    191 approach taken by the existing compiler, but given that in a user-defined
     155approach taken by the existing compipler, but given that in a user-defined
    192156conversion proposal the users can build an arbitrary graph of conversions,
    193157this case still needs to be handled.
     
    196160exists a chain of conversions from `a` to `b` (see Appendix A for description
    197161of preorders and related constructs).
    198 This preorder roughly corresponds to a more usual type-theoretic concept of
     162This preorder corresponds roughly to a more usual type-theoretic concept of
    199163subtyping ("if I can convert `a` to `b`, `a` is a more specific type than
    200164`b`"); however, since this graph is arbitrary, it may contain cycles, so if
     
    228192and so is considered to be the nearer type.
    229193By transitivity, then, the conversion from `X` to `Y2` should be cheaper than
    230 the conversion from `X` to `W`, but in this case the `Y2` and `W` are
     194the conversion from `X` to `W`, but in this case the `X` and `W` are
    231195incomparable by the conversion preorder, so the tie is broken by the shorter
    232196path from `X` to `W` in favour of `W`, contradicting the transitivity property
  • src/tests/preempt_longrun/Makefile.am

    r3fc59bdb r7bdcac1  
    4545
    4646clean-local:
    47         rm -f ${TESTS} core* out.log
     47        rm -f ${TESTS}
    4848
    4949% : %.c ${CC}
  • src/tests/preempt_longrun/Makefile.in

    r3fc59bdb r7bdcac1  
    889889
    890890clean-local:
    891         rm -f ${TESTS} core* out.log
     891        rm -f ${TESTS}
    892892
    893893% : %.c ${CC}
Note: See TracChangeset for help on using the changeset viewer.