Changeset 5a48d79


Ignore:
Timestamp:
Apr 11, 2017, 9:37:48 AM (4 years ago)
Author:
Rob Schluntz <rschlunt@…>
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
4b0f997, e6dceef
Parents:
2ccb93c (diff), b39e3dae (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:/u/cforall/software/cfa/cfa-cc

Location:
doc
Files:
10 edited

Legend:

Unmodified
Added
Removed
  • doc/bibliography/cfa.bib

    r2ccb93c r5a48d79  
    54635463    contributer = {pabuhr@plg},
    54645464    title       = {The Programming Language {Ada}: Reference Manual},
     5465    author      = {Ada},
    54655466    organization= {United States Department of Defense},
    54665467    edition     = {{ANSI/MIL-STD-1815A-1983}},
  • doc/generic_types/evaluation/Makefile

    r2ccb93c r5a48d79  
    5555
    5656run-c: c-bench
     57        @echo
    5758        @echo '## C ##'
    58         @./c-bench
    59         @printf 'source_size:\t%7d lines\n' `cat c-bench.c bench.h c-stack.h c-stack.c | wc -l`
    60         @printf 'binary_size:\t%7d bytes\n' `wc -c < c-bench`
     59        @/usr/bin/time -f 'max_memory:\t%M kilobytes' ./c-bench
     60        @printf 'source_size:\t%8d lines\n' `cat c-bench.c bench.h c-stack.h c-stack.c | wc -l`
     61        @printf 'binary_size:\t%8d bytes\n' `stat -c %s c-bench`
    6162
    6263run-cfa: cfa-bench
     64        @echo
    6365        @echo '## Cforall ##'
    64         @./cfa-bench
    65         @printf 'source_size:\t%7d lines\n' `cat cfa-bench.c bench.h cfa-stack.h cfa-stack.c | wc -l`
    66         @printf 'binary_size:\t%7d bytes\n' `wc -c < cfa-bench`
     66        @/usr/bin/time -f 'max_memory:\t %M kilobytes' ./cfa-bench
     67        @printf 'source_size:\t%8d lines\n' `cat cfa-bench.c bench.h cfa-stack.h cfa-stack.c | wc -l`
     68        @printf 'binary_size:\t%8d bytes\n' `stat -c %s cfa-bench`
    6769
    6870run-cpp: cpp-bench
     71        @echo
    6972        @echo '## C++ ##'
    70         @./cpp-bench
    71         @printf 'source_size:\t%7d lines\n' `cat cpp-bench.cpp bench.hpp cpp-stack.hpp | wc -l`
    72         @printf 'binary_size:\t%7d bytes\n' `wc -c < cpp-bench`
     73        @/usr/bin/time -f 'max_memory:\t %M kilobytes' ./cpp-bench
     74        @printf 'source_size:\t%8d lines\n' `cat cpp-bench.cpp bench.hpp cpp-stack.hpp | wc -l`
     75        @printf 'binary_size:\t%8d bytes\n' `stat -c %s cpp-bench`
    7376
    7477run-cppv: cpp-vbench
     78        @echo
    7579        @echo '## C++ virtual ##'
    76         @./cpp-vbench
    77         @printf 'source_size:\t%7d lines\n' `cat cpp-vbench.cpp bench.hpp object.hpp cpp-vstack.hpp cpp-vstack.cpp | wc -l`
    78         @printf 'binary_size:\t%7d bytes\n' `wc -c < cpp-vbench`
     80        @/usr/bin/time -f 'max_memory:\t%M kilobytes' ./cpp-vbench
     81        @printf 'source_size:\t%8d lines\n' `cat cpp-vbench.cpp bench.hpp object.hpp cpp-vstack.hpp cpp-vstack.cpp | wc -l`
     82        @printf 'binary_size:\t%8d bytes\n' `stat -c %s cpp-vbench`
    7983
    8084run: run-c run-cfa run-cpp run-cppv
  • doc/generic_types/evaluation/bench.h

    r2ccb93c r5a48d79  
    44#include <time.h>
    55
    6  #define N 100000000
    7  
     6#define N 100000000
    87
    98long ms_between(clock_t start, clock_t end) {
     
    1615        code \
    1716        _end = clock(); \
    18         printf("%s:\t%7ld ms\n", name, ms_between(_start, _end)); \
     17        printf("%s:\t%8ld ms\n", name, ms_between(_start, _end)); \
    1918}
    2019
  • doc/generic_types/evaluation/bench.hpp

    r2ccb93c r5a48d79  
    55#include <time.h>
    66
    7  #define N 100000000
    8  
     7static const int N = 100000000;
    98
    109long ms_between(clock_t start, clock_t end) {
     
    1716        code \
    1817        _end = clock(); \
    19         std::cout << name << ":\t" << std::setw(7) << ms_between(_start, _end) << std::setw(0) << " ms" << std::endl; \
     18        std::cout << name << ":\t" << std::setw(8) << ms_between(_start, _end) << std::setw(0) << " ms" << std::endl; \
    2019}
    2120
  • doc/generic_types/evaluation/cfa-stack.c

    r2ccb93c r5a48d79  
    5959        s->head = n->next;
    6060        T x = n->value;
    61         delete(n);
     61        ^n{};
     62        free(n);
    6263        return x;
    6364}
  • doc/generic_types/evaluation/cpp-vstack.cpp

    r2ccb93c r5a48d79  
    2525                delete crnt;
    2626        }
     27        head = nullptr;
    2728}
    2829
  • doc/generic_types/generic_types.tex

    r2ccb93c r5a48d79  
    154154(3) \CFA code must be at least as portable as standard C code;
    155155(4) Extensions introduced by \CFA must be translated in the most efficient way possible.
    156 These goals ensure existing C code-bases can be converted to \CFA incrementally with minimal effort, and C programmers can productively generate \CFA code without training beyond the features being used. In its current implementation, \CFA is compiled by translating it to the GCC-dialect of C~\citep{GCCExtensions}, allowing it to leverage the portability and code optimizations provided by GCC, meeting goals (1)-(3). Ultimately, a compiler is necessary for advanced features and optimal performance.
     156These goals ensure existing C code-bases can be converted to \CFA incrementally with minimal effort, and C programmers can productively generate \CFA code without training beyond the features being used.
     157We claim \CC is diverging from C, and hence, incremental additions of language features require significant effort and training, while suffering from historically poor design choices.
     158
     159\CFA is currently implemented as a source-to-source translator from \CFA to the GCC-dialect of C~\citep{GCCExtensions}, allowing it to leverage the portability and code optimizations provided by GCC, meeting goals (1)-(3). Ultimately, a compiler is necessary for advanced features and optimal performance.
    157160
    158161This paper identifies shortcomings in existing approaches to generic and variadic data types in C-like languages and presents a design for generic and variadic types avoiding those shortcomings. Specifically, the solution is both reusable and type-checked, as well as conforming to the design goals of \CFA with ergonomic use of existing C abstractions. The new constructs are empirically compared with both standard C and \CC; the results show the new design is comparable in performance.
     
    169172The @identity@ function above can be applied to any complete \emph{object type} (or @otype@). The type variable @T@ is transformed into a set of additional implicit parameters encoding sufficient information about @T@ to create and return a variable of that type. The \CFA implementation passes the size and alignment of the type represented by an @otype@ parameter, as well as an assignment operator, constructor, copy constructor and destructor. If this extra information is not needed, \eg for a pointer, the type parameter can be declared as a \emph{data type} (or @dtype@).
    170173
    171 Here, the runtime cost of polymorphism is spread over each polymorphic call, due to passing more arguments to polymorphic functions; preliminary experiments have shown this overhead is similar to \CC virtual function calls. An advantage of this design is that, unlike \CC template functions, \CFA polymorphic functions are compatible with C \emph{separate} compilation, preventing code bloat.
     174In \CFA, the polymorphism runtime-cost is spread over each polymorphic call, due to passing more arguments to polymorphic functions; preliminary experiments show this overhead is similar to \CC virtual-function calls. An advantage of this design is that, unlike \CC template-functions, \CFA polymorphic-functions are compatible with C \emph{separate compilation}, preventing compilation and code bloat.
    172175
    173176Since bare polymorphic-types provide only a narrow set of available operations, \CFA provides a \emph{type assertion} mechanism to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type-variable. For example, the function @twice@ can be defined using the \CFA syntax for operator overloading:
     
    176179int val = twice( twice( 3.7 ) );
    177180\end{lstlisting}
    178 which works for any type @T@ with a matching addition operator. The polymorphism is achieved by creating a wrapper function for calling @+@ with @T@ bound to @double@, then passing this function to the first call of @twice@. There is now the option of using the same @twice@ and converting the result to @int@ on assignment, or creating another @twice@ with type parameter @T@ bound to @int@ because \CFA uses the return type~\cite{Ada} in its type analysis. The first approach has a late conversion from @int@ to @double@ on the final assignment, while the second has an eager conversion to @int@. \CFA minimizes the number of conversions and their potential to lose information, so it selects the first approach, which corresponds with C-programmer intuition.
     181which works for any type @T@ with a matching addition operator. The polymorphism is achieved by creating a wrapper function for calling @+@ with @T@ bound to @double@, then passing this function to the first call of @twice@. There is now the option of using the same @twice@ and converting the result to @int@ on assignment, or creating another @twice@ with type parameter @T@ bound to @int@ because \CFA uses the return type (as in~\cite{Ada}) in its type analysis. The first approach has a late conversion from @int@ to @double@ on the final assignment, while the second has an eager conversion to @int@. \CFA minimizes the number of conversions and their potential to lose information, so it selects the first approach, which corresponds with C-programmer intuition.
    179182
    180183Crucial to the design of a new programming language are the libraries to access thousands of external software features.
    181 \CFA inherits a massive compatible library-base, where other programming languages must rewrite or provide fragile inter-language communication with C.
     184Like \CC, \CFA inherits a massive compatible library-base, where other programming languages must rewrite or provide fragile inter-language communication with C.
    182185A simple example is leveraging the existing type-unsafe (@void *@) C @bsearch@ to binary search a sorted floating-point array:
    183186\begin{lstlisting}
     
    201204int posn = bsearch( 5.0, vals, 10 );
    202205\end{lstlisting}
    203 The nested routine @comp@ provides the hidden interface from typed \CFA to untyped (@void *@) C, plus the cast of the result.
     206The nested routine @comp@ (impossible in \CC as lambdas do not use C calling conventions) provides the hidden interface from typed \CFA to untyped (@void *@) C, plus the cast of the result.
    204207As well, an alternate kind of return is made available: position versus pointer to found element.
    205208\CC's type-system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a templated @bsearch@.
     
    269272forall( otype T `| summable( T )` ) T sum( T a[$\,$], size_t size ) {  // use trait
    270273        `T` total = { `0` };                                    $\C{// instantiate T from 0 by calling its constructor}$
    271         for ( unsigned int i = 0; i < size; i += 1 )
    272                 total `+=` a[i];                                        $\C{// select appropriate +}$
     274        for ( unsigned int i = 0; i < size; i += 1 ) total `+=` a[i]; $\C{// select appropriate +}$
    273275        return total; }
    274276\end{lstlisting}
     277A trait name plays no part in type equivalence; it is solely a macro for a list of assertions.
     278Traits may overlap assertions without conflict, and therefore, do not form a hierarchy.
    275279
    276280In fact, the set of operators is incomplete, \eg no assignment, but @otype@ is syntactic sugar for the following implicit trait:
  • doc/proposals/concurrency/concurrency.tex

    r2ccb93c r5a48d79  
    350350
    351351\subsection{Internal scheduling} \label{insched}
    352 Monitors also need to schedule waiting threads internally as a mean of synchronization. Internal scheduling is one of the simple examples of such a feature. It allows users to declare condition variables and have threads wait and signaled from them. Here is a simple example of such a technique :
    353 
    354 \begin{lstlisting}
    355         mutex struct A {
    356                 condition e;
    357         }
    358 
    359         void foo(A & mutex a) {
    360                 //...
    361                 wait(a.e);
    362                 //...
    363         }
    364 
    365         void bar(A & mutex a) {
    366                 signal(a.e);
    367         }
    368 \end{lstlisting}
    369 
    370 Note that in \CFA, \code{condition} have no particular need to be stored inside a monitor, beyond any software engineering reasons. Here routine \code{foo} waits for the \code{signal} from \code{bar} before making further progress, effectively ensuring a basic ordering.
    371 
    372 As for simple mutual exclusion, these semantics must also be extended to include \gls{group-acquire} :
     352
    373353\begin{center}
    374354\begin{tabular}{ c @{\hskip 0.65in} c }
    375 Thread 1 & Thread 2 \\
    376 \begin{lstlisting}
    377 void foo(A & mutex a,
    378            A & mutex b) {
    379         //...
    380         wait(a.e);
    381         //...
    382 }
    383 
    384 foo(a, b);
    385 \end{lstlisting} &\begin{lstlisting}
    386 void bar(A & mutex a,
    387            A & mutex b) {
    388         signal(a.e);
    389 }
    390 
    391 
    392 
    393 bar(a, b);
     355\begin{lstlisting}[language=Pseudo]
     356acquire A
     357        wait A
     358release A
     359\end{lstlisting}&\begin{lstlisting}[language=Pseudo]
     360acquire A
     361        signal A
     362release A
    394363\end{lstlisting}
    395364\end{tabular}
    396365\end{center}
    397366
    398 To define the semantics of internal scheduling, it is important to look at nesting and \gls{group-acquire}. Indeed, beyond concerns about lock ordering, without scheduling the two following pseudo codes are mostly equivalent. In fact, if we assume monitors are ordered alphabetically, these two pseudo codes would probably lead to exactly the same implementation :
    399 
    400 \begin{table}[h!]
    401 \centering
    402 \begin{tabular}{c c}
    403 \begin{lstlisting}[language=pseudo]
    404 monitor A, B, C
    405 
     367Easy : like uC++
     368
     369\begin{center}
     370\begin{tabular}{ c @{\hskip 0.65in} c }
     371\begin{lstlisting}[language=Pseudo]
    406372acquire A
    407         acquire B & C
    408 
    409                         //Do stuff
    410 
    411         release B & C
     373        acquire B
     374                wait B
     375        release B
    412376release A
    413 \end{lstlisting} &\begin{lstlisting}[language=pseudo]
    414 monitor A, B, C
    415 
     377\end{lstlisting}&\begin{lstlisting}[language=Pseudo]
     378acquire A
     379        acquire B
     380                signal B
     381        release B
     382release A
     383\end{lstlisting}
     384\end{tabular}
     385\end{center}
     386
     387Also easy : like uC++
     388
     389\begin{center}
     390\begin{tabular}{ c @{\hskip 0.65in} c }
     391\begin{lstlisting}[language=Pseudo]
     392acquire A & B
     393        wait A & B
     394release A & B
     395\end{lstlisting}&\begin{lstlisting}[language=Pseudo]
     396acquire A & B
     397        signal A & B
     398release A & B
     399\end{lstlisting}
     400\end{tabular}
     401\end{center}
     402
     403Simplest extension : can be made like uC++ by tying B to A
     404
     405\begin{center}
     406\begin{tabular}{ c @{\hskip 0.65in} c }
     407\begin{lstlisting}[language=Pseudo]
     408acquire A
     409        // Code Section 1
     410        acquire B
     411                // Code Section 2
     412                wait A & B
     413                // Code Section 3
     414        release B
     415        // Code Section 4
     416release A
     417\end{lstlisting}&\begin{lstlisting}[language=Pseudo]
     418acquire A
     419        // Code Section 5
     420        acquire B
     421                // Code Section 6
     422                signal A & B
     423                // Code Section 7
     424        release B
     425        // Code Section 8
     426release A
     427\end{lstlisting}
     428\end{tabular}
     429\end{center}
     430
     431Hard extension :
     432
     433Incorrect options for the signal :
     434
     435\begin{description}
     436 \item[-] Release B and baton pass after Code Section 8 : Passing b without having it
     437 \item[-] Keep B during Code Section 8 : Can lead to deadlocks since we secretly keep a lock longer than specified by the user
     438 \item[-] Instead of release B transfer A and B to waiter then try to reacquire A before running Code Section 8 : This allows barging
     439\end{description}
     440
     441Since we don't want barging we need to pass A \& B and somehow block and get A back.
     442
     443\begin{center}
     444\begin{tabular}{ c @{\hskip 0.65in} c }
     445\begin{lstlisting}[language=Pseudo]
    416446acquire A
    417447        acquire B
    418448                acquire C
    419                         //Do stuff
    420                 release C
    421         release B
    422 release A
     449                        wait A & B & C
     450                1: release C
     451        2: release B
     4523: release A
     453\end{lstlisting}&\begin{lstlisting}[language=Pseudo]
     454acquire A
     455        acquire B
     456                acquire C
     457                        signal A & B & C
     458                4: release C
     459        5: release B
     4606: release A
    423461\end{lstlisting}
    424462\end{tabular}
    425 \end{table}
    426 
    427 Once internal scheduling is introduce however, semantics of \gls{group-acquire} become relevant. For example, let us look into the semantics of the following pseudo-code :
    428 
     463\end{center}
     464
     465To prevent barging :
     466
     467\begin{description}
     468 \item[-] When the signaller hits 4 : pass A, B, C to waiter
     469 \item[-] When the waiter hits 2 : pass A, B to signaller
     470 \item[-] When the signaller hits 5 : pass A to waiter
     471\end{description}
     472
     473
     474\begin{center}
     475\begin{tabular}{ c @{\hskip 0.65in} c }
    429476\begin{lstlisting}[language=Pseudo]
    430 1: monitor A, B, C
    431 2: condition c1
    432 3:
    433 4: acquire A
    434 5:              acquire A & B & C
    435 6:                              signal c1
    436 7:              release A & B & C
    437 8: release A
    438 \end{lstlisting}
    439 
    440 Without \gls{group-acquire} signal simply baton passes the monitor lock on the next release. In the case above, we therefore need to indentify the next release. If line 8 is picked at the release point, then the signal will attempt to pass A \& B \& C, without having ownership of B \& C. Since this violates mutual exclusion, we conclude that line 7 is the only valid location where signalling can occur. The traditionnal meaning of signalling is to transfer ownership of the monitor(s) and immediately schedule the longest waiting task. However, in the discussed case, the signalling thread expects to maintain ownership of monitor A. This can be expressed in two differents ways : 1) the thread transfers ownership of all locks and reacquires A when it gets schedulled again or 2) it transfers ownership of all three monitors and then expects the ownership of A to be transferred back.
    441 
    442 However, the question is does these behavior motivate supporting acquireing non-disjoint set of monitors. Indeed, if the previous example was modified to only acquire B \& C at line 5 (an release the accordingly) then in respects to scheduling, we could add the simplifying constraint that all monitors in a bulk will behave the same way, simplifying the problem back to a single monitor problem which has already been solved. For this constraint to be acceptble however, we need to demonstrate that in does not prevent any meaningful possibilities. And, indeed, we can look at the two previous interpretation of the above pseudo-code and conclude that supporting the acquiring of non-disjoint set of monitors does not add any expressiveness to the language.
    443 
    444 Option 1 reacquires the lock after the signal statement, this can be rewritten as follows without the need for non-disjoint sets :
    445 \begin{lstlisting}[language=Pseudo]
    446 monitor A, B, C
    447 condition c1
    448 
    449 acquire A & B & C
    450         signal c1
    451 release A & B & C
    452477acquire A
    453 
    454 release A
    455 \end{lstlisting}
    456 
    457 This pseudo code has almost exaclty the same semantics as the code acquiring intersecting sets of monitors.
    458 
    459 Option 2 uses two-way lock ownership transferring instead of reacquiring monitor A. Two-way monitor ownership transfer is normally done using signalBlock semantics, which immedietely transfers ownership of a monitor before getting the ownership back when the other thread no longer needs the monitor. While the example pseudo-code for Option 2 seems toe transfer ownership of A, B and C and only getting A back, this is not a requirement. Getting back all 3 monitors and releasing B and C differs only in performance. For this reason, the second option could arguably be rewritten as :
    460 
    461 \begin{lstlisting}[language=Pseudo]
    462 monitor A, B, C
    463 condition c1
    464 
    465 acquire A
    466         acquire B & C
    467                 signalBlock c1
    468         release B & C
    469 release A
    470 \end{lstlisting}
    471 
    472 Obviously, the difference between these two snippets of pseudo code is that the first one transfers ownership of A, B and C while the second one only transfers ownership of B and C. However, this limitation can be removed by allowing user to release extra monitors when using internal scheduling, referred to as extended internal scheduling (pattent pending) from this point on. Extended internal scheduling means the two following pseudo-codes are functionnaly equivalent :
    473 \begin{table}[h!]
    474 \centering
    475 \begin{tabular}{c @{\hskip 0.65in} c}
    476 \begin{lstlisting}[language=pseudo]
    477 monitor A, B, C
    478 condition c1
    479 
    480 acquire A
    481         acquire B & C
    482                 signalBlock c1 with A
    483         release B & C
    484 release A
    485 \end{lstlisting} &\begin{lstlisting}[language=pseudo]
    486 monitor A, B, C
    487 condition c1
    488 
    489 acquire A
    490         acquire A & B & C
    491                 signal c1
    492         release A & B & C
    493 release A
     478        acquire C
     479                acquire B
     480                        wait A & B & C
     481                1: release B
     482        2: release C
     4833: release A
     484\end{lstlisting}&\begin{lstlisting}[language=Pseudo]
     485acquire B
     486        acquire A
     487                acquire C
     488                        signal A & B & C
     489                4: release C
     490        5: release A
     4916: release B
    494492\end{lstlisting}
    495493\end{tabular}
    496 \end{table}
    497 
    498 It must be stated that the extended internal scheduling only makes sense when using wait and signalBlock, since they need to prevent barging, which cannot be done in the context of signal since the ownership transfer is strictly one-directionnal.
    499 
    500 One critic that could arise is that extended internal schedulling is not composable since signalBlock must be explicitly aware of which context it is in. However, this argument is not relevant since acquire A, B and C in a context where a subset of them is already acquired cannot be achieved without spurriously releasing some locks or having an oracle aware of all monitors. Therefore, composability of internal scheduling is no more an issue than composability of monitors in general.
    501 
    502 The main benefit of using extended internal scheduling is that it offers the same expressiveness as intersecting monitor set acquiring but greatly simplifies the selection of a leader (or representative) for a group of monitor. Indeed, when using intersecting sets, it is not obvious which set intersects with other sets which means finding a leader representing only the smallest scope is a hard problem. Where as when using disjoint sets, any monitor that would be intersecting must be specified in the extended set, the leader can be chosen as any monitor in the primary set.
     494\end{center}
     495
     496To prevent barging : When the signaller hits 4 : pass A, B, C to waiter. When the waiter hits 1 it must release B,
     497
     498\begin{description}
     499 \item[-]
     500 \item[-] When the waiter hits 1 : pass A, B to signaller
     501 \item[-] When the signaller hits 5 : pass A, B to waiter
     502 \item[-] When the waiter hits 2 : pass A to signaller
     503\end{description}
     504
     505% Monitors also need to schedule waiting threads internally as a mean of synchronization. Internal scheduling is one of the simple examples of such a feature. It allows users to declare condition variables and have threads wait and signaled from them. Here is a simple example of such a technique :
     506
     507% \begin{lstlisting}
     508%       mutex struct A {
     509%               condition e;
     510%       }
     511
     512%       void foo(A & mutex a) {
     513%               //...
     514%               wait(a.e);
     515%               //...
     516%       }
     517
     518%       void bar(A & mutex a) {
     519%               signal(a.e);
     520%       }
     521% \end{lstlisting}
     522
     523% Note that in \CFA, \code{condition} have no particular need to be stored inside a monitor, beyond any software engineering reasons. Here routine \code{foo} waits for the \code{signal} from \code{bar} before making further progress, effectively ensuring a basic ordering.
     524
     525% As for simple mutual exclusion, these semantics must also be extended to include \gls{group-acquire} :
     526% \begin{center}
     527% \begin{tabular}{ c @{\hskip 0.65in} c }
     528% Thread 1 & Thread 2 \\
     529% \begin{lstlisting}
     530% void foo(A & mutex a,
     531%            A & mutex b) {
     532%       //...
     533%       wait(a.e);
     534%       //...
     535% }
     536
     537% foo(a, b);
     538% \end{lstlisting} &\begin{lstlisting}
     539% void bar(A & mutex a,
     540%            A & mutex b) {
     541%       signal(a.e);
     542% }
     543
     544
     545
     546% bar(a, b);
     547% \end{lstlisting}
     548% \end{tabular}
     549% \end{center}
     550
     551% To define the semantics of internal scheduling, it is important to look at nesting and \gls{group-acquire}. Indeed, beyond concerns about lock ordering, without scheduling the two following pseudo codes are mostly equivalent. In fact, if we assume monitors are ordered alphabetically, these two pseudo codes would probably lead to exactly the same implementation :
     552
     553% \begin{table}[h!]
     554% \centering
     555% \begin{tabular}{c c}
     556% \begin{lstlisting}[language=pseudo]
     557% monitor A, B, C
     558
     559% acquire A
     560%       acquire B & C
     561
     562%                       //Do stuff
     563
     564%       release B & C
     565% release A
     566% \end{lstlisting} &\begin{lstlisting}[language=pseudo]
     567% monitor A, B, C
     568
     569% acquire A
     570%       acquire B
     571%               acquire C
     572%                       //Do stuff
     573%               release C
     574%       release B
     575% release A
     576% \end{lstlisting}
     577% \end{tabular}
     578% \end{table}
     579
     580% Once internal scheduling is introduce however, semantics of \gls{group-acquire} become relevant. For example, let us look into the semantics of the following pseudo-code :
     581
     582% \begin{lstlisting}[language=Pseudo]
     583% 1: monitor A, B, C
     584% 2: condition c1
     585% 3:
     586% 4: acquire A
     587% 5:            acquire A & B & C
     588% 6:                            signal c1
     589% 7:            release A & B & C
     590% 8: release A
     591% \end{lstlisting}
     592
     593% Without \gls{group-acquire} signal simply baton passes the monitor lock on the next release. In the case above, we therefore need to indentify the next release. If line 8 is picked at the release point, then the signal will attempt to pass A \& B \& C, without having ownership of B \& C. Since this violates mutual exclusion, we conclude that line 7 is the only valid location where signalling can occur. The traditionnal meaning of signalling is to transfer ownership of the monitor(s) and immediately schedule the longest waiting task. However, in the discussed case, the signalling thread expects to maintain ownership of monitor A. This can be expressed in two differents ways : 1) the thread transfers ownership of all locks and reacquires A when it gets schedulled again or 2) it transfers ownership of all three monitors and then expects the ownership of A to be transferred back.
     594
     595% However, the question is does these behavior motivate supporting acquireing non-disjoint set of monitors. Indeed, if the previous example was modified to only acquire B \& C at line 5 (an release the accordingly) then in respects to scheduling, we could add the simplifying constraint that all monitors in a bulk will behave the same way, simplifying the problem back to a single monitor problem which has already been solved. For this constraint to be acceptble however, we need to demonstrate that in does not prevent any meaningful possibilities. And, indeed, we can look at the two previous interpretation of the above pseudo-code and conclude that supporting the acquiring of non-disjoint set of monitors does not add any expressiveness to the language.
     596
     597% Option 1 reacquires the lock after the signal statement, this can be rewritten as follows without the need for non-disjoint sets :
     598% \begin{lstlisting}[language=Pseudo]
     599% monitor A, B, C
     600% condition c1
     601
     602% acquire A & B & C
     603%       signal c1
     604% release A & B & C
     605% acquire A
     606
     607% release A
     608% \end{lstlisting}
     609
     610% This pseudo code has almost exaclty the same semantics as the code acquiring intersecting sets of monitors.
     611
     612% Option 2 uses two-way lock ownership transferring instead of reacquiring monitor A. Two-way monitor ownership transfer is normally done using signalBlock semantics, which immedietely transfers ownership of a monitor before getting the ownership back when the other thread no longer needs the monitor. While the example pseudo-code for Option 2 seems toe transfer ownership of A, B and C and only getting A back, this is not a requirement. Getting back all 3 monitors and releasing B and C differs only in performance. For this reason, the second option could arguably be rewritten as :
     613
     614% \begin{lstlisting}[language=Pseudo]
     615% monitor A, B, C
     616% condition c1
     617
     618% acquire A
     619%       acquire B & C
     620%               signalBlock c1
     621%       release B & C
     622% release A
     623% \end{lstlisting}
     624
     625% Obviously, the difference between these two snippets of pseudo code is that the first one transfers ownership of A, B and C while the second one only transfers ownership of B and C. However, this limitation can be removed by allowing user to release extra monitors when using internal scheduling, referred to as extended internal scheduling (pattent pending) from this point on. Extended internal scheduling means the two following pseudo-codes are functionnaly equivalent :
     626% \begin{table}[h!]
     627% \centering
     628% \begin{tabular}{c @{\hskip 0.65in} c}
     629% \begin{lstlisting}[language=pseudo]
     630% monitor A, B, C
     631% condition c1
     632
     633% acquire A
     634%       acquire B & C
     635%               signalBlock c1 with A
     636%       release B & C
     637% release A
     638% \end{lstlisting} &\begin{lstlisting}[language=pseudo]
     639% monitor A, B, C
     640% condition c1
     641
     642% acquire A
     643%       acquire A & B & C
     644%               signal c1
     645%       release A & B & C
     646% release A
     647% \end{lstlisting}
     648% \end{tabular}
     649% \end{table}
     650
     651% It must be stated that the extended internal scheduling only makes sense when using wait and signalBlock, since they need to prevent barging, which cannot be done in the context of signal since the ownership transfer is strictly one-directionnal.
     652
     653% One critic that could arise is that extended internal schedulling is not composable since signalBlock must be explicitly aware of which context it is in. However, this argument is not relevant since acquire A, B and C in a context where a subset of them is already acquired cannot be achieved without spurriously releasing some locks or having an oracle aware of all monitors. Therefore, composability of internal scheduling is no more an issue than composability of monitors in general.
     654
     655% The main benefit of using extended internal scheduling is that it offers the same expressiveness as intersecting monitor set acquiring but greatly simplifies the selection of a leader (or representative) for a group of monitor. Indeed, when using intersecting sets, it is not obvious which set intersects with other sets which means finding a leader representing only the smallest scope is a hard problem. Where as when using disjoint sets, any monitor that would be intersecting must be specified in the extended set, the leader can be chosen as any monitor in the primary set.
    503656
    504657% We need to make sure the semantics for internally scheduling N monitors are a natural extension of the single monitor semantics. For this reason, we introduce the concept of \gls{mon-ctx}. In terms of context internal scheduling means "releasing a \gls{mon-ctx} and waiting for an other thread to acquire the same \gls{mon-ctx} and baton-pass it back to the initial thread". This definitions requires looking into what a \gls{mon-ctx} is and what the semantics of waiting and baton-passing are.
  • doc/proposals/concurrency/style.tex

    r2ccb93c r5a48d79  
    11\input{common}                                          % bespoke macros used in the document
    22
    3 \CFADefaultStyle
     3% \CFADefaultStyle
    44
    55\lstset{
  • doc/proposals/concurrency/version

    r2ccb93c r5a48d79  
    1 0.7.134
     10.7.141
Note: See TracChangeset for help on using the changeset viewer.