Changeset e3b2474


Ignore:
Timestamp:
Jul 3, 2018, 3:25:56 PM (3 years ago)
Author:
Rob Schluntz <rschlunt@…>
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, no_list, persistent-indexer
Children:
638ac26
Parents:
c653b37 (diff), bbe1a87 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:/u/cforall/software/cfa/cfa-cc

Files:
3 deleted
11 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/concurrency/Paper.tex

    rc653b37 re3b2474  
    33\articletype{RESEARCH ARTICLE}%
    44
    5 \received{26 April 2016}
    6 \revised{6 June 2016}
    7 \accepted{6 June 2016}
     5\received{XXXXX}
     6\revised{XXXXX}
     7\accepted{XXXXX}
    88
    99\raggedbottom
     
    3232\renewcommand{\linenumberfont}{\scriptsize\sffamily}
    3333
     34\renewcommand{\topfraction}{0.8}                        % float must be greater than X of the page before it is forced onto its own page
     35\renewcommand{\bottomfraction}{0.8}                     % float must be greater than X of the page before it is forced onto its own page
     36\renewcommand{\floatpagefraction}{0.8}          % float must be greater than X of the page before it is forced onto its own page
    3437\renewcommand{\textfraction}{0.0}                       % the entire page maybe devoted to floats with no text on the page at all
    3538
     
    132135\makeatother
    133136
    134 \newenvironment{cquote}{%
    135         \list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=3pt\parsep=0pt\leftmargin=\parindentlnth\rightmargin\leftmargin}%
    136         \item\relax
    137 }{%
    138         \endlist
    139 }% cquote
     137\newenvironment{cquote}
     138               {\list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=3pt\parsep=0pt\leftmargin=\parindentlnth\rightmargin\leftmargin}%
     139                \item\relax}
     140               {\endlist}
     141
     142%\newenvironment{cquote}{%
     143%\list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=3pt\parsep=0pt\leftmargin=\parindentlnth\rightmargin\leftmargin}%
     144%\item\relax%
     145%}{%
     146%\endlist%
     147%}% cquote
    140148
    141149% CFA programming language, based on ANSI C (with some gcc additions)
     
    234242\abstract[Summary]{
    235243\CFA is a modern, polymorphic, \emph{non-object-oriented} extension of the C programming language.
    236 This paper discusses the design of the concurrency and parallelism features in \CFA, and the concurrent runtime-system.
     244This paper discusses the design of the concurrency and parallelism features in \CFA, and its concurrent runtime-system.
    237245These features are created from scratch as ISO C lacks concurrency, relying largely on the pthreads library for concurrency.
    238 Coroutines and lightweight (user) threads are introduced into the language.
    239 In addition, monitors are added as a high-level mechanism for mutual exclusion and synchronization.
    240 A unique contribution is allowing multiple monitors to be safely acquired simultaneously.
     246Coroutines and lightweight (user) threads are introduced into \CFA;
     247as well, monitors are added as a high-level mechanism for mutual exclusion and synchronization.
     248A unique contribution of this work is allowing multiple monitors to be safely acquired \emph{simultaneously}.
    241249All features respect the expectations of C programmers, while being fully integrate with the \CFA polymorphic type-system and other language features.
    242 Finally, experimental results show comparable performance of the new features with similar mechanisms in other concurrent programming-languages.
     250Experimental results show comparable performance of the new features with similar mechanisms in other concurrent programming-languages.
    243251}%
    244252
     
    271279Concurrency tools handle mutual exclusion and synchronization, while parallelism tools handle performance, cost, and resource utilization.
    272280
    273 The proposed concurrency API is implemented in a dialect of C, called \CFA.
     281The proposed concurrency API is implemented in a dialect of C, called \CFA (pronounced C-for-all).
    274282The paper discusses how the language features are added to the \CFA translator with respect to parsing, semantics, and type checking, and the corresponding high-performance runtime-library to implement the concurrent features.
    275283
     
    281289
    282290\CFA is a non-object-oriented extension of ISO-C, and hence, supports all C paradigms.
    283 %It is a non-object-oriented system-language, meaning most of the major abstractions have either no runtime overhead or can be opted out easily.
    284291Like C, the building blocks of \CFA are structures and routines.
    285292Virtually all of the code generated by the \CFA translator respects C memory layouts and calling conventions.
    286 While \CFA is not object oriented, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C does have a notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}.
    287 While some \CFA features are common in object-oriented programming, they are independent capabilities allowing \CFA to adopt them while maintaining a procedural paradigm.
     293While \CFA is not object oriented, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C has a notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}.
     294While some object-oriented features appear in \CFA, they are independent capabilities, allowing \CFA to adopt them while maintaining a procedural paradigm.
    288295
    289296
     
    338345Object-oriented programming languages only provide implicit qualification for the receiver.
    339346
    340 In detail, the @with@ statement has the form:
     347In detail, the @with@-statement syntax is:
    341348\begin{cfa}
    342349$\emph{with-statement}$:
     
    348355(Enumerations are already opened.)
    349356The object is the implicit qualifier for the open structure-fields.
    350 All expressions in the expression list are open in parallel within the compound statement, which is different from Pascal, which nests the openings from left to right.
     357All expressions in the expression list are opened in parallel within the compound statement, which is different from Pascal, which nests the openings from left to right.
    351358
    352359
     
    354361
    355362\CFA maximizes the ability to reuse names via overloading to aggressively address the naming problem.
    356 Both variables and routines may be overloaded, where selection is based on types, and number of returns (as in Ada~\cite{Ada}) and arguments.
     363Both variables and routines may be overloaded, where selection is based on number and types of returns and arguments (as in Ada~\cite{Ada}).
     364\newpage
     365\vspace*{-2\baselineskip}%???
    357366\begin{cquote}
    358 \vspace*{-\baselineskip}%???
     367\begin{cfa}
     368// selection based on type
     369\end{cfa}
    359370\lstDeleteShortInline@%
    360 \begin{cfa}
    361 // selection based on type
    362 \end{cfa}
    363371\begin{tabular}{@{}l@{\hspace{2\parindentlnth}}|@{\hspace{2\parindentlnth}}l@{}}
    364372\begin{cfa}
     
    411419Therefore, overloading eliminates long prefixes and other naming conventions to prevent name clashes.
    412420As seen in Section~\ref{s:Concurrency}, routine @main@ is heavily overloaded.
    413 For example, variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name:
     421As another example, variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name:
    414422\begin{cfa}
    415423struct S { int `i`; int j; double m; } s;
     
    464472\lstMakeShortInline@%
    465473\end{cquote}
    466 While concurrency does not use operator overloading directly, it provides an introduction for the syntax of constructors.
    467474
    468475
     
    470477
    471478Object lifetime is a challenge in non-managed programming languages.
    472 \CFA responds with \CC-like constructors and destructors:
     479\CFA responds with \CC-like constructors and destructors, using a different operator-overloading syntax.
    473480\begin{cfa}
    474481struct VLA { int len, * data; };                        $\C{// variable length array of integers}$
     
    490497Like \CC, construction is implicit on allocation (stack/heap) and destruction is implicit on deallocation.
    491498The object and all their fields are constructed/destructed.
    492 \CFA also provides @new@ and @delete@, which behave like @malloc@ and @free@, in addition to constructing and destructing objects:
     499\CFA also provides @new@ and @delete@ as library routines, which behave like @malloc@ and @free@, in addition to constructing and destructing objects:
    493500\begin{cfa}
    494501{
     
    518525int i = sum( sa, 5 );                                           $\C{// use S's 0 construction and +}$
    519526\end{cfa}
    520 The builtin type @zero_t@ (and @one_t@) overload constant 0 (and 1) for a new types, where both 0 and 1 have special meaning in C.
     527Type variables can be @otype@ or @dtype@.
     528@otype@ refers to a \emph{complete type}, \ie, a type with size, alignment, default constructor, copy constructor, destructor, and assignment operator.
     529@dtype@ refers to an \emph{incomplete type}, \eg, void or a forward-declared type.
     530The builtin types @zero_t@ and @one_t@ overload constant 0 and 1 for a new types, where both 0 and 1 have special meaning in C.
    521531
    522532\CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each routine declaration:
     
    533543\end{cfa}
    534544
    535 Assertions can be @otype@ or @dtype@.
    536 @otype@ refers to a \emph{complete} object, \ie an object has a size, alignment, default constructor, copy constructor, destructor and an assignment operator.
    537 @dtype@ refers to an \emph{incomplete} object, \ie, an object only has a size and alignment.
    538 
    539545Using the return type for overload discrimination, it is possible to write a type-safe @alloc@ based on the C @malloc@:
    540546\begin{cfa}
     
    554560In coroutining, the single thread is self-scheduling across the stacks, so execution is deterministic, \ie the execution path from input to output is fixed and predictable.
    555561A \newterm{stackless} coroutine executes on the caller's stack~\cite{Python} but this approach is restrictive, \eg preventing modularization and supporting only iterator/generator-style programming;
    556 a \newterm{stackfull} coroutine executes on its own stack, allowing full generality.
    557 Only stackfull coroutines are a stepping stone to concurrency.
     562a \newterm{stackful} coroutine executes on its own stack, allowing full generality.
     563Only stackful coroutines are a stepping stone to concurrency.
    558564
    559565The transition to concurrency, even for execution with a single thread and multiple stacks, occurs when coroutines also context switch to a \newterm{scheduling oracle}, introducing non-determinism from the coroutine perspective~\cite[\S~3]{Buhr05a}.
     
    561567The resulting execution system now follows a cooperative threading-model, called \newterm{non-preemptive scheduling}.
    562568
    563 Because the scheduler is special, it can either be a stackless or stackfull coroutine.
     569Because the scheduler is special, it can either be a stackless or stackful coroutine.
    564570For stackless, the scheduler performs scheduling on the stack of the current coroutine and switches directly to the next coroutine, so there is one context switch.
    565 For stackfull, the current coroutine switches to the scheduler, which performs scheduling, and it then switches to the next coroutine, so there are two context switches.
    566 A stackfull scheduler is often used for simplicity and security, even through there is a slightly higher runtime-cost.
     571For stackful, the current coroutine switches to the scheduler, which performs scheduling, and it then switches to the next coroutine, so there are two context switches.
     572A stackful scheduler is often used for simplicity and security.
    567573
    568574Regardless of the approach used, a subset of concurrency related challenges start to appear.
    569575For the complete set of concurrency challenges to occur, the missing feature is \newterm{preemption}, where context switching occurs randomly between any two instructions, often based on a timer interrupt, called \newterm{preemptive scheduling}.
    570 While a scheduler introduces uncertainty in the order of execution, preemption introduces uncertainty where context switches occur.
     576While a scheduler introduces uncertainty in the order of execution, preemption introduces uncertainty about where context switches occur.
    571577Interestingly, uncertainty is necessary for the runtime (operating) system to give the illusion of parallelism on a single processor and increase performance on multiple processors.
    572578The reason is that only the runtime has complete knowledge about resources and how to best utilized them.
     
    574580otherwise, it is impossible to write meaningful programs.
    575581Optimal performance in concurrent applications is often obtained by having as much non-determinism as correctness allows.
    576 
    577 
    578 \subsection{\protect\CFA's Thread Building Blocks}
    579582
    580583An important missing feature in C is threading\footnote{While the C11 standard defines a \protect\lstinline@threads.h@ header, it is minimal and defined as optional.
     
    590593
    591594While the focus of this discussion is concurrency and parallelism, it is important to address coroutines, which are a significant building block of a concurrency system (but not concurrent among themselves).
    592 Coroutines are generalized routines allowing execution to be temporarily suspend and later resumed.
     595Coroutines are generalized routines allowing execution to be temporarily suspended and later resumed.
    593596Hence, unlike a normal routine, a coroutine may not terminate when it returns to its caller, allowing it to be restarted with the values and execution location present at the point of suspension.
    594 This capability is accomplish via the coroutine's stack, where suspend/resume context switch among stacks.
     597This capability is accomplished via the coroutine's stack, where suspend/resume context switch among stacks.
    595598Because threading design-challenges are present in coroutines, their design effort is relevant, and this effort can be easily exposed to programmers giving them a useful new programming paradigm because a coroutine handles the class of problems that need to retain state between calls, \eg plugins, device drivers, and finite-state machines.
    596 Therefore, the core \CFA coroutine-API for has two fundamental features: independent call-stacks and @suspend@/@resume@ operations.
     599Therefore, the two fundamental features of the core \CFA coroutine-API are independent call-stacks and @suspend@/@resume@ operations.
    597600
    598601For example, a problem made easier with coroutines is unbounded generators, \eg generating an infinite sequence of Fibonacci numbers
     
    722725Figure~\ref{f:Coroutine3States} creates a @coroutine@ type, @`coroutine` Fib { int fn; }@, which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface routines, \eg @next@.
    723726Like the structure in Figure~\ref{f:ExternalState}, the coroutine type allows multiple instances, where instances of this type are passed to the (overloaded) coroutine main.
    724 The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's @resume@.
     727The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code represents the three states in the Fibonacci formula via the three suspend points, to context switch back to the caller's @resume@.
    725728The interface routine @next@, takes a Fibonacci instance and context switches to it using @resume@;
    726729on restart, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned.
     
    747750\end{quote}
    748751The example takes advantage of resuming a coroutine in the constructor to prime the loops so the first character sent for formatting appears inside the nested loops.
    749 The destruction provides a newline, if formatted text ends with a full line.
    750 Figure~\ref{f:CFmt} shows the C equivalent formatter, where the loops of the coroutine are flatten (linearized) and rechecked on each call because execution location is not retained between calls.
     752The destructor provides a newline, if formatted text ends with a full line.
     753Figure~\ref{f:CFmt} shows the C equivalent formatter, where the loops of the coroutine are flattened (linearized) and rechecked on each call because execution location is not retained between calls.
    751754(Linearized code is the bane of device drivers.)
    752755
     
    761764};
    762765void main( Format & fmt ) with( fmt ) {
    763         for ( ;; ) {   
     766        for ( ;; ) {
    764767                for ( g = 0; g < 5; g += 1 ) {      // group
    765768                        for ( b = 0; b < 4; b += 1 ) { // block
     
    835838The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming routine for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal routines.
    836839However, @resume@ and @suspend@ context switch among existing stack-frames, rather than create new ones so there is no stack growth.
    837 \newterm{Symmetric (full) coroutine}s have a coroutine call to a resuming routine for another coroutine, and its coroutine main has a call to another resuming routine, which eventually forms a resuming-call cycle.
     840\newterm{Symmetric (full) coroutine}s have a coroutine call to a resuming routine for another coroutine, and its coroutine main calls another resuming routine, which eventually forms a resuming-call cycle.
    838841(The trivial cycle is a coroutine resuming itself.)
    839842This control flow is similar to recursion for normal routines, but again there is no stack growth from the context switch.
     
    926929The call from the consumer to @payment@ introduces the cycle between producer and consumer.
    927930When @payment@ is called, the consumer copies values into the producer's communication variable and a resume is executed.
    928 The context switch restarts the producer at the point where it was last context switched, so it continues in @delivery@ after the resume.
     931The context switch restarts the producer at the point where it last context switched, so it continues in @delivery@ after the resume.
    929932
    930933@delivery@ returns the status value in @prod@'s coroutine main, where the status is printed.
     
    945948\subsection{Coroutine Implementation}
    946949
    947 A significant implementation challenge for coroutines (and threads, see section \ref{threads}) is adding extra fields and executing code after/before the coroutine constructor/destructor and coroutine main to create/initialize/de-initialize/destroy extra fields and the stack.
     950A significant implementation challenge for coroutines (and threads, see Section~\ref{threads}) is adding extra fields and executing code after/before the coroutine constructor/destructor and coroutine main to create/initialize/de-initialize/destroy extra fields and the stack.
    948951There are several solutions to this problem and the chosen option forced the \CFA coroutine design.
    949952
     
    953956\end{cfa}
    954957and the programming language (and possibly its tool set, \eg debugger) may need to understand @baseCoroutine@ because of the stack.
    955 Furthermore, the execution of constructs/destructors is in the wrong order for certain operations.
     958Furthermore, the execution of constructors/destructors is in the wrong order for certain operations.
    956959For example, for threads if the thread is implicitly started, it must start \emph{after} all constructors, because the thread relies on a completely initialized object, but the inherited constructor runs \emph{before} the derived.
    957960
    958 An alternatively is composition:
     961An alternative is composition:
    959962\begin{cfa}
    960963struct mycoroutine {
     
    10041007Users wanting to extend coroutines or build their own for various reasons can only do so in ways offered by the language.
    10051008Furthermore, implementing coroutines without language supports also displays the power of a programming language.
    1006 While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed without using the language support.
    1007 The reserved keyword simply eases use for the common cases.
    1008 
    1009 Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait is used to restrict coroutine-manipulation routines:
     1009While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed without language support.
     1010The reserved keyword simply eases use for the common case.
     1011
     1012Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait restricts the available set of coroutine-manipulation routines:
    10101013\begin{cfa}
    10111014trait is_coroutine( `dtype` T ) {
     
    10161019forall( `dtype` T | is_coroutine(T) ) void resume( T & );
    10171020\end{cfa}
    1018 The @dtype@ property of the trait ensures the coroutine descriptor is non-copyable, so all coroutines must be passed by reference (pointer).
     1021The @dtype@ property provides no implicit copying operations and the @is_coroutine@ trait provides no explicit copying operations, so all coroutines must be passed by reference (pointer).
    10191022The routine definitions ensures there is a statically-typed @main@ routine that is the starting point (first stack frame) of a coroutine, and a mechanism to get (read) the currently executing coroutine handle.
    10201023The @main@ routine has no return value or additional parameters because the coroutine type allows an arbitrary number of interface routines with corresponding arbitrary typed input/output values versus fixed ones.
    1021 The generic routines @suspend@ and @resume@ can be redefined, but any object passed to them is a coroutine since it must satisfy the @is_coroutine@ trait to compile.
    10221024The advantage of this approach is that users can easily create different types of coroutines, \eg changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine, and possibly redefining @suspend@ and @resume@.
    10231025The \CFA keyword @coroutine@ implicitly implements the getter and forward declarations required for implementing the coroutine main:
     
    11501152\begin{cfa}
    11511153int main() {
    1152         MyThread * heapLived;
     1154        MyThread * heapLive;
    11531155        {
    1154                 MyThread blockLived;                            $\C{// fork block-based thread}$
    1155                 heapLived = `new`( MyThread );          $\C{// fork heap-based thread}$
     1156                MyThread blockLive;                                     $\C{// fork block-based thread}$
     1157                heapLive = `new`( MyThread );           $\C{// fork heap-based thread}$
    11561158                ...
    11571159        }                                                                               $\C{// join block-based thread}$
    11581160        ...
    1159         `delete`( heapLived );                                  $\C{// join heap-based thread}$
     1161        `delete`( heapLive );                                   $\C{// join heap-based thread}$
    11601162}
    11611163\end{cfa}
    11621164The heap-based approach allows arbitrary thread-creation topologies, with respect to fork/join-style concurrency.
    11631165
    1164 Figure~\ref{s:ConcurrentMatrixSummation} shows concurrently adding the rows of a matrix and then totalling the subtotals sequential, after all the row threads have terminated.
     1166Figure~\ref{s:ConcurrentMatrixSummation} shows concurrently adding the rows of a matrix and then totalling the subtotals sequentially, after all the row threads have terminated.
    11651167The program uses heap-based threads because each thread needs different constructor values.
    11661168(Python provides a simple iteration mechanism to initialize array elements to different values allowing stack allocation.)
    11671169The allocation/deallocation pattern appears unusual because allocated objects are immediately deallocated without any intervening code.
    11681170However, for threads, the deletion provides implicit synchronization, which is the intervening code.
    1169 While the subtotals are added in linear order rather than completion order, which slight inhibits concurrency, the computation is restricted by the critical-path thread (\ie the thread that takes the longest), and so any inhibited concurrency is very small as totalling the subtotals is trivial.
     1171While the subtotals are added in linear order rather than completion order, which slightly inhibits concurrency, the computation is restricted by the critical-path thread (\ie the thread that takes the longest), and so any inhibited concurrency is very small as totalling the subtotals is trivial.
    11701172
    11711173\begin{figure}
    11721174\begin{cfa}
    1173 thread Adder {
    1174     int * row, cols, & subtotal;                        $\C{// communication}$
    1175 };
     1175`thread` Adder { int * row, cols, & subtotal; } $\C{// communication variables}$
    11761176void ?{}( Adder & adder, int row[], int cols, int & subtotal ) {
    11771177    adder.[ row, cols, &subtotal ] = [ row, cols, &subtotal ];
     
    11871187    Adder * adders[rows];
    11881188    for ( int r = 0; r < rows; r += 1 ) {       $\C{// start threads to sum rows}$
    1189                 adders[r] = new( matrix[r], cols, &subtotals[r] );
     1189                adders[r] = `new( matrix[r], cols, &subtotals[r] );`
    11901190    }
    11911191    for ( int r = 0; r < rows; r += 1 ) {       $\C{// wait for threads to finish}$
    1192                 delete( adders[r] );                            $\C{// termination join}$
     1192                `delete( adders[r] );`                          $\C{// termination join}$
    11931193                total += subtotals[r];                          $\C{// total subtotal}$
    11941194    }
     
    12061206To reestablish meaningful execution requires mechanisms to reintroduce determinism, \ie restrict non-determinism, called mutual exclusion and synchronization, where mutual exclusion is an access-control mechanism on data shared by threads, and synchronization is a timing relationship among threads~\cite[\S~4]{Buhr05a}.
    12071207Since many deterministic challenges appear with the use of mutable shared state, some languages/libraries disallow it, \eg Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka~\cite{Akka} (Scala).
    1208 In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely relate to networking concepts, \eg channels~\cite{CSP,Go}.
     1208In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely related to networking concepts, \eg channels~\cite{CSP,Go}.
    12091209However, in call/return-based languages, these approaches force a clear distinction, \ie introduce a new programming paradigm between regular and concurrent computation, \eg routine call versus message passing.
    12101210Hence, a programmer must learn and manipulate two sets of design patterns.
    12111211While this distinction can be hidden away in library code, effective use of the library still has to take both paradigms into account.
    1212 In contrast, approaches based on statefull models more closely resemble the standard call/return programming-model, resulting in a single programming paradigm.
     1212In contrast, approaches based on stateful models more closely resemble the standard call/return programming-model, resulting in a single programming paradigm.
    12131213
    12141214At the lowest level, concurrent control is implemented by atomic operations, upon which different kinds of locking mechanisms are constructed, \eg semaphores~\cite{Dijkstra68b}, barriers, and path expressions~\cite{Campbell74}.
    12151215However, for productivity it is always desirable to use the highest-level construct that provides the necessary efficiency~\cite{Hochstein05}.
    12161216A newer approach for restricting non-determinism is transactional memory~\cite{Herlihy93}.
    1217 While this approach is pursued in hardware~\cite{Nakaike15} and system languages, like \CC~\cite{Cpp-Transactions}, the performance and feature set is still too restrictive to be the main concurrency paradigm for system languages, which is why it was rejected as the core paradigm for concurrency in \CFA.
     1217While this approach is pursued in hardware~\cite{Nakaike15} and system languages, like \CC~\cite{Cpp-Transactions}, the performance and feature set is still too restrictive to be the main concurrency paradigm for system languages, which is why it is rejected as the core paradigm for concurrency in \CFA.
    12181218
    12191219One of the most natural, elegant, and efficient mechanisms for mutual exclusion and synchronization for shared-memory systems is the \emph{monitor}.
     
    12441244higher-level mechanisms often simplify usage by adding better coupling between synchronization and data, \eg message passing, or offering a simpler solution to otherwise involved challenges, \eg barrier lock.
    12451245Often synchronization is used to order access to a critical section, \eg ensuring a reader thread is the next kind of thread to enter a critical section.
    1246 If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, that reader has \newterm{barged}.
     1246If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, that reader \newterm{barged}.
    12471247Barging can result in staleness/freshness problems, where a reader barges ahead of a writer and reads temporally stale data, or a writer barges ahead of another writer overwriting data with a fresh value preventing the previous value from ever being read (lost computation).
    12481248Preventing or detecting barging is an involved challenge with low-level locks, which can be made much easier by higher-level constructs.
     
    12501250Algorithms that allow a barger, but divert it until later using current synchronization state (flags), are avoiding the barger;
    12511251algorithms that preclude a barger from entering during synchronization in the critical section prevent barging completely.
    1252 Techniques like baton-pass locks~\cite{Andrews89} between threads instead of unconditionally releasing locks is an example of barging prevention.
     1252Techniques like baton-passing locks~\cite{Andrews89} between threads instead of unconditionally releasing locks is an example of barging prevention.
    12531253
    12541254
     
    12631263Copying a lock is insecure because it is possible to copy an open lock and then use the open copy when the original lock is closed to simultaneously access the shared data.
    12641264Copying a monitor is secure because both the lock and shared data are copies, but copying the shared data is meaningless because it no longer represents a unique entity.
    1265 As for coroutines/tasks, a non-copyable (@dtype@) trait is used to capture this requirement, so all locks/monitors must be passed by reference (pointer).
     1265As for coroutines/tasks, the @dtype@ property provides no implicit copying operations and the @is_monitor@ trait provides no explicit copying operations, so all locks/monitors must be passed by reference (pointer).
    12661266\begin{cfa}
    12671267trait is_monitor( `dtype` T ) {
     
    12751275\label{s:MutexAcquisition}
    12761276
    1277 While correctness implicitly implies a monitor's mutual exclusion is acquired and released, there are implementation options about when and where the locking/unlocking occurs.
     1277While correctness implies a monitor's mutual exclusion is acquired and released, there are implementation options about when and where the locking/unlocking occurs.
    12781278(Much of this discussion also applies to basic locks.)
    12791279For example, a monitor may need to be passed through multiple helper routines before it becomes necessary to acquire the monitor mutual-exclusion.
     
    13011301
    13021302For maximum usability, monitors have \newterm{multi-acquire} semantics allowing a thread to acquire it multiple times without deadlock.
    1303 For example, atomically printing the contents of a binary tree:
    1304 \begin{cfa}
    1305 monitor Tree {
    1306         Tree * left, right;
    1307         // value
    1308 };
    1309 void print( Tree & mutex tree ) {                       $\C{// prefix traversal}$
    1310         // write value
    1311         print( tree->left );                                    $\C{// multiply acquire monitor lock on each recursion}$
    1312         print( tree->right );
    1313 }
    1314 \end{cfa}
    1315 
    1316 Mandatory monitor qualifiers have the benefit of being self-documenting, but requiring both @mutex@ and \lstinline[morekeywords=nomutex]@nomutex@ for all monitor parameter is redundant.
    1317 Instead, one of qualifier semantics can be the default, and the other required.
    1318 For example, assume the safe @mutex@ qualifier for all monitor parameters because assuming \lstinline[morekeywords=nomutex]@nomutex@ may cause subtle errors.
    1319 On the other hand, assuming \lstinline[morekeywords=nomutex]@nomutex@ is the \emph{normal} parameter behaviour, stating explicitly \emph{this parameter is not special}.
     1303\begin{cfa}
     1304monitor M { ... } m;
     1305void foo( M & mutex m ) { ... }                         $\C{// acquire mutual exclusion}$
     1306void bar( M & mutex m ) {                                       $\C{// acquire mutual exclusion}$
     1307        ... `foo( m );` ...                                             $\C{// reacquire mutual exclusion}$
     1308}
     1309`bar( m );`                                                                     $\C{// nested monitor call}$
     1310\end{cfa}
     1311
     1312The benefit of mandatory monitor qualifiers is self-documentation, but requiring both @mutex@ and \lstinline[morekeywords=nomutex]@nomutex@ for all monitor parameters is redundant.
     1313Instead, the semantics have one qualifier as the default, and the other required.
     1314For example, make the safe @mutex@ qualifier the default because assuming \lstinline[morekeywords=nomutex]@nomutex@ may cause subtle errors.
     1315Alternatively, make the unsafe \lstinline[morekeywords=nomutex]@nomutex@ qualifier the default because it is the \emph{normal} parameter semantics while @mutex@ parameters are rare.
    13201316Providing a default qualifier implies knowing whether a parameter is a monitor.
    13211317Since \CFA relies heavily on traits as an abstraction mechanism, the distinction between a type that is a monitor and a type that looks like a monitor can become blurred.
     
    13401336
    13411337To make the issue tractable, \CFA only acquires one monitor per parameter with at most one level of indirection.
    1342 However, the C type-system has an ambiguity with respects to arrays.
     1338However, there is an ambiguity in the C type-system with respects to arrays.
    13431339Is the argument for @f2@ a single object or an array of objects?
    13441340If it is an array, only the first element of the array is acquired, which seems unsafe;
     
    13551351
    13561352For object-oriented monitors, calling a mutex member \emph{implicitly} acquires mutual exclusion of the receiver object, @`rec`.foo(...)@.
    1357 \CFA has no receiver, and hence, must use an explicit mechanism to specify which object has mutual exclusion acquired.
     1353\CFA has no receiver, and hence, must use an explicit mechanism to specify which object acquires mutual exclusion.
    13581354A positive consequence of this design decision is the ability to support multi-monitor routines.
    13591355\begin{cfa}
     
    13661362Having language support for such a feature is therefore a significant asset for \CFA.
    13671363
    1368 The capability to acquire multiple locks before entering a critical section is called \newterm{bulk acquire}.
     1364The capability to acquire multiple locks before entering a critical section is called \newterm{bulk acquire} (see Section~\ref{s:Implementation} for implementation details).
    13691365In the previous example, \CFA guarantees the order of acquisition is consistent across calls to different routines using the same monitors as arguments.
    13701366This consistent ordering means acquiring multiple monitors is safe from deadlock.
     
    13841380
    13851381However, such use leads to lock acquiring order problems resulting in deadlock~\cite{Lister77}, where detecting it requires dynamic tracking of monitor calls, and dealing with it requires rollback semantics~\cite{Dice10}.
    1386 In \CFA, safety is guaranteed by using bulk acquire of all monitors to shared objects, whereas other monitor systems provide no aid.
    1387 While \CFA provides only a partial solution, the \CFA partial solution handles many useful cases.
    1388 \begin{cfa}
    1389 monitor Bank { ... };
    1390 void deposit( Bank & `mutex` b, int deposit );
    1391 void transfer( Bank & `mutex` mybank, Bank & `mutex` yourbank, int me2you) {
    1392         deposit( mybank, `-`me2you );                   $\C{// debit}$
    1393         deposit( yourbank, me2you );                    $\C{// credit}$
     1382In \CFA, a safety aid is provided by using bulk acquire of all monitors to shared objects, whereas other monitor systems provide no aid.
     1383While \CFA provides only a partial solution, it handles many useful cases, \eg:
     1384\begin{cfa}
     1385monitor BankAccount { ... };
     1386void deposit( BankAccount & `mutex` b, int deposit );
     1387void transfer( BankAccount & `mutex` my, BankAccount & `mutex` your, int me2you ) {
     1388        deposit( my, `-`me2you );                               $\C{// debit}$
     1389        deposit( your, me2you );                                $\C{// credit}$
    13941390}
    13951391\end{cfa}
     
    13981394
    13991395
    1400 \subsection{\protect\lstinline|mutex| statement} \label{mutex-stmt}
     1396\subsection{\protect\lstinline@mutex@ statement}
     1397\label{mutex-stmt}
    14011398
    14021399The monitor call-semantics associate all locking semantics to routines.
     
    14051402\begin{tabular}{@{}l@{\hspace{3\parindentlnth}}l@{}}
    14061403\begin{cfa}
    1407 monitor M {};
     1404monitor M { ... };
    14081405void foo( M & mutex m1, M & mutex m2 ) {
    14091406        // critical section
     
    14351432For example, Figure~\ref{f:GenericBoundedBuffer} shows a bounded buffer that may be full/empty so produce/consumer threads must block.
    14361433Leaving the monitor and trying again (busy waiting) is impractical for high-level programming.
    1437 Monitors eliminate busy waiting by providing internal synchronization to schedule threads needing access to the shared data, where the synchronization is blocking (threads are parked) versus spinning.
     1434Monitors eliminate busy waiting by providing synchronization to schedule threads needing access to the shared data, where threads block versus spinning.
    14381435Synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} defines which thread acquires the critical section next.
    14391436\newterm{Internal scheduling} is characterized by each thread entering the monitor and making an individual decision about proceeding or blocking, while \newterm{external scheduling} is characterized by an entering thread making a decision about proceeding for itself and on behalf of other threads attempting entry.
     
    15251522\end{figure}
    15261523
    1527 Figure~\ref{f:BBExt} shows a \CFA generic bounded-buffer with external scheduling, where producers/consumers detecting a full/empty buffer block and prevent more producers/consumers from entering the monitor until the buffer has a free/empty slot.
     1524Figure~\ref{f:BBExt} shows a \CFA generic bounded-buffer with external scheduling, where producers/consumers detecting a full/empty buffer block and prevent more producers/consumers from entering the monitor until there is a free/empty slot in the buffer.
    15281525External scheduling is controlled by the @waitfor@ statement, which atomically blocks the calling thread, releases the monitor lock, and restricts the routine calls that can next acquire mutual exclusion.
    15291526If the buffer is full, only calls to @remove@ can acquire the buffer, and if the buffer is empty, only calls to @insert@ can acquire the buffer.
    1530 Threads making calls to routines that are currently excluded, block outside (external) of the monitor on a calling queue, versus blocking on condition queues inside (internal) of the monitor.
    1531 % External scheduling is more constrained and explicit, which helps programmers reduce the non-deterministic nature of concurrency.
     1527Threads making calls to routines that are currently excluded, block outside of (external to) the monitor on a calling queue, versus blocking on condition queues inside of (internal to) the monitor.
    15321528External scheduling allows users to wait for events from other threads without concern of unrelated events occurring.
    15331529The mechnaism can be done in terms of control flow, \eg Ada @accept@ or \uC @_Accept@, or in terms of data, \eg Go channels.
     
    15461542A thread blocks until an appropriate partner arrives.
    15471543The complexity is exchanging phone numbers in the monitor because of the mutual-exclusion property.
    1548 For signal scheduling, the @exchange@ condition is necessary to block the thread finding the match, while the matcher unblocks to take the oppose number, post its phone number, and unblock the partner.
     1544For signal scheduling, the @exchange@ condition is necessary to block the thread finding the match, while the matcher unblocks to take the opposite number, post its phone number, and unblock the partner.
    15491545For signal-block scheduling, the implicit urgent-queue replaces the explict @exchange@-condition and @signal_block@ puts the finding thread on the urgent condition and unblocks the matcher.
    15501546
     
    15671563                wait( Girls[ccode] );
    15681564                GirlPhNo = phNo;
    1569                 `exchange.signal();`
     1565                `signal( exchange );`
    15701566        } else {
    15711567                GirlPhNo = phNo;
    15721568                `signal( Boys[ccode] );`
    1573                 `exchange.wait();`
     1569                `wait( exchange );`
    15741570        } // if
    15751571        return BoyPhNo;
     
    16461642}
    16471643\end{cfa}
    1648 must have acquired monitor locks that are greater than or equal to the number of locks for the waiting thread signalled from the condition queue.
     1644must have acquired at least the same locks as the waiting thread signalled from the condition queue.
    16491645
    16501646Similarly, for @waitfor( rtn )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@.
     
    16521648@waitfor@ statically verifies the released monitors are the same as the acquired mutex-parameters of the given routine or routine pointer.
    16531649To statically verify the released monitors match with the accepted routine's mutex parameters, the routine (pointer) prototype must be accessible.
    1654 
    1655 When an overloaded routine appears in an @waitfor@ statement, calls to any routine with that name are accepted.
    1656 The rationale is that members with the same name should perform a similar function, and therefore, all should be eligible to accept a call.
    1657 As always, overloaded routines can be disambiguated using a cast:
     1650% When an overloaded routine appears in an @waitfor@ statement, calls to any routine with that name are accepted.
     1651% The rationale is that members with the same name should perform a similar function, and therefore, all should be eligible to accept a call.
     1652Overloaded routines can be disambiguated using a cast:
    16581653\begin{cfa}
    16591654void rtn( M & mutex m );
    16601655`int` rtn( M & mutex m );
    1661 waitfor( (`int` (*)( M & mutex ))rtn, m1, m2 );
    1662 \end{cfa}
    1663 
    1664 Given the ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.
     1656waitfor( (`int` (*)( M & mutex ))rtn, m );
     1657\end{cfa}
     1658
     1659The ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.
    16651660\begin{cfa}
    16661661void foo( M & mutex m1, M & mutex m2 ) {
     
    16731668
    16741669Finally, an important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads?
    1675 If barging is allowed, synchronization between a singller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
     1670If barging is allowed, synchronization between a signaller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
     1671In fact, signals-as-hints is completely opposite from that proposed by Hoare in the seminal paper on monitors:
    16761672\begin{quote}
    16771673However, we decree that a signal operation be followed immediately by resumption of a waiting program, without possibility of an intervening procedure call from yet a third program.
     
    16791675\end{quote}
    16801676\CFA scheduling \emph{precludes} barging, which simplifies synchronization among threads in the monitor and increases correctness.
     1677Furthermore, \CFA concurrency has no spurious wakeup~\cite[\S~9]{Buhr05a}, which eliminates an implict form of barging.
    16811678For example, there are no loops in either bounded buffer solution in Figure~\ref{f:GenericBoundedBuffer}.
    16821679Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
    1683 Furthermore, \CFA concurrency has no superious wakeup~\cite[\S~9]{Buhr05a}, which eliminates an implict form of barging.
    16841680
    16851681
     
    17531749
    17541750One scheduling solution is for the signaller to keep ownership of all locks until the last lock is ready to be transferred, because this semantics fits most closely to the behaviour of single-monitor scheduling.
    1755 However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting is choices when the signaller finishes the inner mutex-statement.
    1756 The singaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@.
     1751However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting in options when the signaller finishes the inner mutex-statement.
     1752The signaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@.
    17571753In the latter case, waiter W2 must eventually pass @m2@ to waiter W1, which is complex because W1 may have waited before W2, so W2 is unaware of it.
    17581754Furthermore, there is an execution sequence where the signaller always finds waiter W2, and hence, waiter W1 starves.
     
    17611757Signalled threads are moved to the urgent queue and the waiter at the front defines the set of monitors necessary for it to unblock.
    17621758Partial signalling transfers ownership of monitors to the front waiter.
    1763 When the signaller thread exits or waits in the monitor the front waiter is unblocked if all its monitors are released.
    1764 This solution has the benefit that complexity is encapsulated into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.
     1759When the signaller thread exits or waits in the monitor, the front waiter is unblocked if all its monitors are released.
     1760The benefit is encapsulating complexity into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.
    17651761
    17661762
     
    17691765
    17701766In an object-oriented programming-language, a class includes an exhaustive list of operations.
    1771 However, new members can be added via static inheritance or dynaic members, \eg JavaScript~\cite{JavaScript}.
     1767However, new members can be added via static inheritance or dynamic members, \eg JavaScript~\cite{JavaScript}.
    17721768Similarly, monitor routines can be added at any time in \CFA, making it less clear for programmers and more difficult to implement.
    17731769\begin{cfa}
    1774 monitor M {};
     1770monitor M { ... };
    17751771void `f`( M & mutex m );
    17761772void g( M & mutex m ) { waitfor( `f` ); }       $\C{// clear which f}$
     
    17781774void h( M & mutex m ) { waitfor( `f` ); }       $\C{// unclear which f}$
    17791775\end{cfa}
    1780 Hence, the cfa-code for the entering a monitor looks like:
     1776Hence, the cfa-code for entering a monitor looks like:
    17811777\begin{cfa}
    17821778if ( $\textrm{\textit{monitor is free}}$ ) $\LstCommentStyle{// \color{red}enter}$
     
    18181814External scheduling, like internal scheduling, becomes significantly more complex for multi-monitor semantics.
    18191815Even in the simplest case, new semantics needs to be established.
    1820 \begin{cfa}
    1821 monitor M {};
     1816\newpage
     1817\begin{cfa}
     1818monitor M { ... };
    18221819void f( M & mutex m1 );
    18231820void g( M & mutex m1, M & mutex m2 ) {
     
    18291826        waitfor( f, m2 );                                               $\C{\color{red}// wait for call to f with argument m2}$
    18301827\end{cfa}
    1831 Routine @g@ has acquired both locks, so when routine @f@ is called, the lock for monitor @m2@ is passed from @g@ to @f@, while @g@ still holds lock @m1@.
     1828Both locks are acquired by routine @g@, so when routine @f@ is called, the lock for monitor @m2@ is passed from @g@ to @f@, while @g@ still holds lock @m1@.
    18321829This behaviour can be extended to the multi-monitor @waitfor@ statement.
    18331830\begin{cfa}
    1834 monitor M {};
     1831monitor M { ... };
    18351832void f( M & mutex m1, M & mutex m2 );
    18361833void g( M & mutex m1, M & mutex m2 ) {
     
    18391836\end{cfa}
    18401837Again, the set of monitors passed to the @waitfor@ statement must be entirely contained in the set of monitors already acquired by the accepting routine.
    1841 
    1842 Note, for internal and external scheduling with multiple monitors, a signalling or accepting thread must match exactly, \ie partial matching results in waiting.
    1843 \begin{cquote}
     1838Also, the order of the monitors in a @waitfor@ statement is unimportant.
     1839
     1840Figure~\ref{f:UnmatchedMutexSets} shows an example where, for internal and external scheduling with multiple monitors, a signalling or accepting thread must match exactly, \ie partial matching results in waiting.
     1841For both examples, the set of monitors is disjoint so unblocking is impossible.
     1842
     1843\begin{figure}
    18441844\lstDeleteShortInline@%
    18451845\begin{tabular}{@{}l@{\hspace{\parindentlnth}}|@{\hspace{\parindentlnth}}l@{}}
     
    18541854        wait( c );
    18551855}
     1856g( `m11`, m2 ); // block on wait
     1857f( `m12`, m2 ); // cannot fulfil
     1858\end{cfa}
     1859&
     1860\begin{cfa}
     1861monitor M1 {} m11, m12;
     1862monitor M2 {} m2;
     1863
     1864void f( M1 & mutex m1, M2 & mutex m2 ) {
     1865
     1866}
     1867void g( M1 & mutex m1, M2 & mutex m2 ) {
     1868        waitfor( f, m1, m2 );
     1869}
    18561870g( `m11`, m2 ); // block on accept
    18571871f( `m12`, m2 ); // cannot fulfil
    18581872\end{cfa}
    1859 &
    1860 \begin{cfa}
    1861 monitor M1 {} m11, m12;
    1862 monitor M2 {} m2;
    1863 
    1864 void f( M1 & mutex m1, M2 & mutex m2 ) {
    1865 
    1866 }
    1867 void g( M1 & mutex m1, M2 & mutex m2 ) {
    1868         waitfor( f, m1, m2 );
    1869 }
    1870 g( `m11`, m2 ); // block on accept
    1871 f( `m12`, m2 ); // cannot fulfil
    1872 \end{cfa}
    18731873\end{tabular}
    18741874\lstMakeShortInline@%
    1875 \end{cquote}
    1876 For both internal and external scheduling, the set of monitors is disjoint so unblocking is impossible.
     1875\caption{Unmatched \protect\lstinline@mutex@ sets}
     1876\label{f:UnmatchedMutexSets}
     1877\end{figure}
    18771878
    18781879
    18791880\subsection{Extended \protect\lstinline@waitfor@}
    18801881
    1881 The extended form of the @waitfor@ statement conditionally accepts one of a group of mutex routines and allows a specific action to be performed \emph{after} the mutex routine finishes.
     1882Figure~\ref{f:ExtendedWaitfor} show the extended form of the @waitfor@ statement to conditionally accept one of a group of mutex routines, with a specific action to be performed \emph{after} the mutex routine finishes.
     1883For a @waitfor@ clause to be executed, its @when@ must be true and an outstanding call to its corresponding member(s) must exist.
     1884The \emph{conditional-expression} of a @when@ may call a routine, but the routine must not block or context switch.
     1885If there are multiple acceptable mutex calls, selection occurs top-to-bottom (prioritized) in the @waitfor@ clauses, whereas some programming languages with similar mechanisms accept non-deterministically for this case, \eg Go \lstinline[morekeywords=select]@select@.
     1886If some accept guards are true and there are no outstanding calls to these members, the acceptor is accept-blocked until a call to one of these members is made.
     1887If all the accept guards are false, the statement does nothing, unless there is a terminating @else@ clause with a true guard, which is executed instead.
     1888Hence, the terminating @else@ clause allows a conditional attempt to accept a call without blocking.
     1889If there is a @timeout@ clause, it provides an upper bound on waiting.
     1890If both a @timeout@ clause and an @else@ clause are present, the @else@ must be conditional, or the @timeout@ is never triggered.
     1891In all cases, the statement following is executed \emph{after} a clause is executed to know which of the clauses executed.
     1892
     1893\begin{figure}
    18821894\begin{cfa}
    18831895`when` ( $\emph{conditional-expression}$ )      $\C{// optional guard}$
     
    18951907                $\emph{statement}$                                      $\C{// action when no immediate calls}$
    18961908\end{cfa}
    1897 For a @waitfor@ clause to be executed, its @when@ must be true and an outstanding call to its corresponding member(s) must exist.
    1898 The \emph{conditional-expression} of a @when@ may call a routine, but the routine must not block or context switch.
    1899 If there are several mutex calls that can be accepted, selection occurs top-to-bottom in the @waitfor@ clauses versus non-deterministically.
    1900 If some accept guards are true and there are no outstanding calls to these members, the acceptor is accept-blocked until a call to one of these members is made.
    1901 If all the accept guards are false, the statement does nothing, unless there is a terminating @else@ clause with a true guard, which is executed instead.
    1902 Hence, the terminating @else@ clause allows a conditional attempt to accept a call without blocking.
    1903 If there is a @timeout@ clause, it provides an upper bound on waiting, and can only appear with a conditional @else@, otherwise the timeout cannot be triggered.
    1904 In all cases, the statement following is executed \emph{after} a clause is executed to know which of the clauses executed.
     1909\caption{Extended \protect\lstinline@waitfor@}
     1910\label{f:ExtendedWaitfor}
     1911\end{figure}
    19051912
    19061913Note, a group of conditional @waitfor@ clauses is \emph{not} the same as a group of @if@ statements, e.g.:
     
    19151922\begin{cfa}
    19161923void insert( Buffer(T) & mutex buffer, T elem ) with( buffer ) {
    1917         if ( count == BufferSize )
     1924        if ( count == 10 )
    19181925                waitfor( remove, buffer ) {
    1919                         elements[back] = elem;
    1920                         back = ( back + 1 ) % BufferSize;
    1921                         count += 1;
     1926                        // insert elem into buffer
    19221927                } or `waitfor( ^?{}, buffer )` throw insertFail;
    19231928}
     
    19251930When the buffer is deallocated, the current waiter is unblocked and informed, so it can perform an appropriate action.
    19261931However, the basic @waitfor@ semantics do not support this functionality, since using an object after its destructor is called is undefined.
    1927 Therefore, to make this useful capability work, the semantics for accepting the destructor is the same as @signal@, \ie the call to the destructor is placed on the urgent queue and the acceptor continues execution, which throws an exception to the acceptor and then the caller is unbloced from the urgent queue to deallocate the object.
     1932Therefore, to make this useful capability work, the semantics for accepting the destructor is the same as @signal@, \ie the call to the destructor is placed on the urgent queue and the acceptor continues execution, which throws an exception to the acceptor and then the caller is unblocked from the urgent queue to deallocate the object.
    19281933Accepting the destructor is an idiomatic way to terminate a thread in \CFA.
    19291934
     
    19331938Threads in \CFA are monitors to allow direct communication among threads, \ie threads can have mutex routines that are called by other threads.
    19341939Hence, all monitor features are available when using threads.
    1935 Figure~\ref{f:pingpong} shows an example of two threads directly calling each other and accepting calls from each other in a cycle.
    1936 Note, both ping/pong threads are globally declared, @pi@/@po@, and hence, start (and possibly complete) before the program main starts.
    1937 
    1938 \begin{figure}
    1939 \lstDeleteShortInline@%
    1940 \begin{cquote}
     1940The following shows an example of two threads directly calling each other and accepting calls from each other in a cycle.
    19411941\begin{cfa}
    19421942thread Ping {} pi;
     
    19461946int main() {}
    19471947\end{cfa}
     1948\vspace{-0.8\baselineskip}
     1949\begin{cquote}
    19481950\begin{tabular}{@{}l@{\hspace{3\parindentlnth}}l@{}}
    19491951\begin{cfa}
     
    19651967\end{cfa}
    19661968\end{tabular}
    1967 \lstMakeShortInline@%
    19681969\end{cquote}
    1969 \caption{Threads ping/pong using external scheduling}
    1970 \label{f:pingpong}
    1971 \end{figure}
     1970% \lstMakeShortInline@%
     1971% \caption{Threads ping/pong using external scheduling}
     1972% \label{f:pingpong}
     1973% \end{figure}
     1974Note, the ping/pong threads are globally declared, @pi@/@po@, and hence, start (and possibly complete) before the program main starts.
     1975
     1976
     1977\subsection{Low-level Locks}
     1978
     1979For completeness and efficiency, \CFA provides a standard set of low-level locks: recursive mutex, condition, semaphore, barrier, \etc, and atomic instructions: @fetchAssign@, @fetchAdd@, @testSet@, @compareSet@, \etc.
     1980
    19721981
    19731982\section{Parallelism}
    19741983
    19751984Historically, computer performance was about processor speeds.
    1976 However, with heat dissipation being a direct consequence of speed increase, parallelism has become the new source for increased performance~\cite{Sutter05, Sutter05b}.
     1985However, with heat dissipation being a direct consequence of speed increase, parallelism is the new source for increased performance~\cite{Sutter05, Sutter05b}.
    19771986Now, high-performance applications must care about parallelism, which requires concurrency.
    19781987The lowest-level approach of parallelism is to use \newterm{kernel threads} in combination with semantics like @fork@, @join@, \etc.
     
    19932002\label{s:fibers}
    19942003
    1995 A variant of user thread is \newterm{fibers}, which removes preemption, \eg Go~\cite{Go}.
    1996 Like functional programming, which removes mutation and its associated problems, removing preemption from concurrency reduces nondeterminism, hence race and deadlock errors are more difficult to generate.
     2004A variant of user thread is \newterm{fibers}, which removes preemption, \eg Go~\cite{Go} @goroutine@s.
     2005Like functional programming, which removes mutation and its associated problems, removing preemption from concurrency reduces nondeterminism, making race and deadlock errors more difficult to generate.
    19972006However, preemption is necessary for concurrency that relies on spinning, so there are a class of problems that cannot be programmed without preemption.
    19982007
     
    20002009\subsection{Thread Pools}
    20012010
    2002 In contrast to direct threading is indirect \newterm{thread pools}, where small jobs (work units) are insert into a work pool for execution.
     2011In contrast to direct threading is indirect \newterm{thread pools}, where small jobs (work units) are inserted into a work pool for execution.
    20032012If the jobs are dependent, \ie interact, there is an implicit/explicit dependency graph that ties them together.
    20042013While removing direct concurrency, and hence the amount of context switching, thread pools significantly limit the interaction that can occur among jobs.
    2005 Indeed, jobs should not block because that also block the underlying thread, which effectively means the CPU utilization, and therefore throughput, suffers.
     2014Indeed, jobs should not block because that also blocks the underlying thread, which effectively means the CPU utilization, and therefore throughput, suffers.
    20062015While it is possible to tune the thread pool with sufficient threads, it becomes difficult to obtain high throughput and good core utilization as job interaction increases.
    20072016As well, concurrency errors return, which threads pools are suppose to mitigate.
    2008 Intel's TBB library~\cite{TBB} is the gold standard for thread pools.
    20092017
    20102018
     
    20362044The user cluster is created to contain the application user-threads.
    20372045Having all threads execute on the one cluster often maximizes utilization of processors, which minimizes runtime.
    2038 However, because of limitations of the underlying operating system, heterogeneous hardware, or scheduling requirements (real-time), it is sometimes necessary to have multiple clusters.
     2046However, because of limitations of the underlying operating system, heterogeneous hardware, or scheduling requirements (real-time), multiple clusters are sometimes necessary.
    20392047
    20402048
     
    20612069
    20622070\section{Implementation}
     2071\label{s:Implementation}
    20632072
    20642073Currently, \CFA has fixed-sized stacks, where the stack size can be set at coroutine/thread creation but with no subsequent growth.
    20652074Schemes exist for dynamic stack-growth, such as stack copying and chained stacks.
    2066 However, stack copying requires pointer adjustment to items on the stack, which is impossible without some form of garage collection.
     2075However, stack copying requires pointer adjustment to items on the stack, which is impossible without some form of garbage collection.
    20672076As well, chained stacks require all modules be recompiled to use this feature, which breaks backward compatibility with existing C libraries.
    20682077In the long term, it is likely C libraries will migrate to stack chaining to support concurrency, at only a minimal cost to sequential programs.
     
    20742083This storage is allocated at the base of a thread's stack before blocking, which means programmers must add a small amount of extra space for stacks.
    20752084
    2076 In \CFA, ordering of monitor acquisition relies on memory ordering to prevent deadlock~\cite{Havender68}, because all objects are guaranteed to have distinct non-overlapping memory layouts, and mutual-exclusion for a monitor is only defined for its lifetime.
     2085In \CFA, ordering of monitor acquisition relies on memory ordering to prevent deadlock~\cite{Havender68}, because all objects have distinct non-overlapping memory layouts, and mutual-exclusion for a monitor is only defined for its lifetime.
    20772086When a mutex call is made, pointers to the concerned monitors are aggregated into a variable-length array and sorted.
    2078 This array persists for the entire duration of the mutual-exclusion and its ordering reused extensively.
    2079 
    2080 To improve performance and simplicity, context switching occur inside a routine call, so only callee-saved registers are copied onto the stack and then the stack register is switched;
     2087This array persists for the entire duration of the mutual exclusion and is used extensively for synchronization operations.
     2088
     2089To improve performance and simplicity, context switching occurs inside a routine call, so only callee-saved registers are copied onto the stack and then the stack register is switched;
    20812090the corresponding registers are then restored for the other context.
    20822091Note, the instruction pointer is untouched since the context switch is always inside the same routine.
     
    20852094This method is a 2-step context-switch and provides a clear distinction between user and kernel code, where scheduling and other system operations happen.
    20862095The alternative 1-step context-switch uses the \emph{from} thread's stack to schedule and then context-switches directly to the \emph{to} thread's stack.
    2087 Experimental results (not presented) show the performance difference between these two approaches is virtually equivalent, because the 1-step performance is dominated by a lock instruction to prevent a race condition.
     2096Experimental results (not presented) show the performance of these two approaches is virtually equivalent, because both approaches are dominated by locking to prevent a race condition.
    20882097
    20892098All kernel threads (@pthreads@) created a stack.
     
    20932102
    20942103Finally, an important aspect for a complete threading system is preemption, which introduces extra non-determinism via transparent interleaving, rather than cooperation among threads for proper scheduling and processor fairness from long-running threads.
    2095 Because preemption frequency is usually long, 1 millisecond, performance cost is negligible.
     2104Because preemption frequency is usually long (1 millisecond) performance cost is negligible.
    20962105Preemption is normally handled by setting a count-down timer on each virtual processor.
    20972106When the timer expires, an interrupt is delivered, and the interrupt handler resets the count-down timer, and if the virtual processor is executing in user code, the signal handler performs a user-level context-switch, or if executing in the language runtime-kernel, the preemption is ignored or rolled forward to the point where the runtime kernel context switches back to user code.
    20982107Multiple signal handlers may be pending.
    2099 When control eventually switches back to the signal handler, it returns normally, and execution continues in the interrupted user thread, even though the return from the signal handler may be on a different kernel thread than the one where the signal was delivered.
     2108When control eventually switches back to the signal handler, it returns normally, and execution continues in the interrupted user thread, even though the return from the signal handler may be on a different kernel thread than the one where the signal is delivered.
    21002109The only issue with this approach is that signal masks from one kernel thread may be restored on another as part of returning from the signal handler;
    2101 therefore, all virtual processors in a cluster need to have the same signal mask.
     2110therefore, the same signal mask is required for all virtual processors in a cluster.
    21022111
    21032112However, on current UNIX systems:
     
    21572166\end{cfa}
    21582167The method used to get time is @clock_gettime( CLOCK_REALTIME )@.
    2159 Each benchmark is performed @N@ times, where @N@ varies depending on the benchmark, the total time is divided by @N@ to obtain the average time for a benchmark.
     2168Each benchmark is performed @N@ times, where @N@ varies depending on the benchmark;
     2169the total time is divided by @N@ to obtain the average time for a benchmark.
    21602170All omitted tests for other languages are functionally identical to the shown \CFA test.
    21612171
     
    22152225\multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\
    22162226\hline
    2217 Kernel Thread   & 241.5         & 243.86        & 5.08 \\
    2218 \CFA Coroutine  & 38            & 38            & 0    \\
    2219 \CFA Thread             & 103           & 102.96        & 2.96 \\
    2220 \uC Coroutine   & 46            & 45.86         & 0.35 \\
    2221 \uC Thread              & 98            & 99.11         & 1.42 \\
    2222 Goroutine               & 150           & 149.96        & 3.16 \\
    2223 Java Thread             & 289           & 290.68        & 8.72 \\
     2227Kernel Thread   & 333.5 & 332.96        & 4.1 \\
     2228\CFA Coroutine  & 49            & 48.68 & 0.47    \\
     2229\CFA Thread             & 105           & 105.57        & 1.37 \\
     2230\uC Coroutine   & 44            & 44            & 0 \\
     2231\uC Thread              & 100           & 99.29 & 0.96 \\
     2232Goroutine               & 145           & 147.25        & 4.15 \\
     2233Java Thread             & 373.5 & 375.14        & 8.72 \\
    22242234\hline
    22252235\end{tabular}
     
    22302240\lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
    22312241\begin{cfa}
    2232 monitor M {} m1/*, m2, m3, m4*/;
     2242monitor M { ... } m1/*, m2, m3, m4*/;
    22332243void __attribute__((noinline)) do_call( M & mutex m/*, m2, m3, m4*/ ) {}
    22342244int main() {
     
    22502260\multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\
    22512261\hline
    2252 C routine                                                       & 2                     & 2                     & 0    \\
    2253 FetchAdd + FetchSub                                     & 26            & 26            & 0    \\
    2254 Pthreads Mutex Lock                                     & 31            & 31.86         & 0.99 \\
    2255 \uC @monitor@ member routine            & 30            & 30            & 0    \\
    2256 \CFA @mutex@ routine, 1 argument        & 41            & 41.57         & 0.9  \\
    2257 \CFA @mutex@ routine, 2 argument        & 76            & 76.96         & 1.57 \\
    2258 \CFA @mutex@ routine, 4 argument        & 145           & 146.68        & 3.85 \\
    2259 Java synchronized routine                       & 27            & 28.57         & 2.6  \\
     2262C routine                                       & 2             & 2             & 0    \\
     2263FetchAdd + FetchSub                     & 26            & 26            & 0    \\
     2264Pthreads Mutex Lock                     & 31            & 31.71 & 0.97 \\
     2265\uC @monitor@ member routine            & 31            & 31            & 0    \\
     2266\CFA @mutex@ routine, 1 argument        & 46            & 46.68 & 0.93  \\
     2267\CFA @mutex@ routine, 2 argument        & 84            & 85.36 & 1.99 \\
     2268\CFA @mutex@ routine, 4 argument        & 158           & 161           & 4.22 \\
     2269Java synchronized routine               & 27.5  & 29.79 & 2.93  \\
    22602270\hline
    22612271\end{tabular}
     
    22772287Figure~\ref{f:int-sched} shows the code for \CFA, with results in Table~\ref{tab:int-sched}.
    22782288Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects.
     2289Java scheduling is significantly greater because the benchmark explicitly creates multiple thread in order to prevent the JIT from making the program sequential, \ie removing all locking.
    22792290
    22802291\begin{figure}
     
    22832294volatile int go = 0;
    22842295condition c;
    2285 monitor M {} m;
     2296monitor M { ... } m;
    22862297void __attribute__((noinline)) do_call( M & mutex a1 ) { signal( c ); }
    22872298thread T {};
     
    23012312}
    23022313\end{cfa}
    2303 \captionof{figure}{\CFA Internal scheduling benchmark}
     2314\captionof{figure}{\CFA Internal-scheduling benchmark}
    23042315\label{f:int-sched}
    23052316
    23062317\centering
    2307 \captionof{table}{Internal scheduling comparison (nanoseconds)}
     2318\captionof{table}{Internal-scheduling comparison (nanoseconds)}
    23082319\label{tab:int-sched}
    23092320\bigskip
     
    23132324\multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\
    23142325\hline
    2315 Pthreads Condition Variable             & 5902.5        & 6093.29       & 714.78 \\
    2316 \uC @signal@                                    & 322           & 323           & 3.36   \\
    2317 \CFA @signal@, 1 @monitor@              & 352.5         & 353.11        & 3.66   \\
    2318 \CFA @signal@, 2 @monitor@              & 430           & 430.29        & 8.97   \\
    2319 \CFA @signal@, 4 @monitor@              & 594.5         & 606.57        & 18.33  \\
    2320 Java @notify@                                   & 13831.5       & 15698.21      & 4782.3 \\
     2326Pthreads Condition Variable             & 6005  & 5681.43       & 835.45 \\
     2327\uC @signal@                                    & 324           & 325.54        & 3,02   \\
     2328\CFA @signal@, 1 @monitor@              & 368.5         & 370.61        & 4.77   \\
     2329\CFA @signal@, 2 @monitor@              & 467           & 470.5 & 6.79   \\
     2330\CFA @signal@, 4 @monitor@              & 700.5         & 702.46        & 7.23  \\
     2331Java @notify@                                   & 15471 & 172511        & 5689 \\
    23212332\hline
    23222333\end{tabular}
     
    23342345\begin{cfa}
    23352346volatile int go = 0;
    2336 monitor M {} m;
     2347monitor M { ... } m;
    23372348thread T {};
    23382349void __attribute__((noinline)) do_call( M & mutex ) {}
     
    23522363}
    23532364\end{cfa}
    2354 \captionof{figure}{\CFA external scheduling benchmark}
     2365\captionof{figure}{\CFA external-scheduling benchmark}
    23552366\label{f:ext-sched}
    23562367
    23572368\centering
    23582369
    2359 \captionof{table}{External scheduling comparison (nanoseconds)}
     2370\captionof{table}{External-scheduling comparison (nanoseconds)}
    23602371\label{tab:ext-sched}
    23612372\bigskip
     
    23642375\multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\
    23652376\hline
    2366 \uC @_Accept@                           & 350           & 350.61        & 3.11  \\
    2367 \CFA @waitfor@, 1 @monitor@     & 358.5         & 358.36        & 3.82  \\
    2368 \CFA @waitfor@, 2 @monitor@     & 422           & 426.79        & 7.95  \\
    2369 \CFA @waitfor@, 4 @monitor@     & 579.5         & 585.46        & 11.25 \\
     2377\uC @_Accept@                           & 358           & 359.11        & 2.53  \\
     2378\CFA @waitfor@, 1 @monitor@     & 359           & 360.93        & 4.07  \\
     2379\CFA @waitfor@, 2 @monitor@     & 450           & 449.39        & 6.62  \\
     2380\CFA @waitfor@, 4 @monitor@     & 652           & 655.64        & 7.73 \\
    23702381\hline
    23712382\end{tabular}
     
    23832394}
    23842395\end{cfa}
    2385 \captionof{figure}{\CFA object creation benchmark}
     2396\captionof{figure}{\CFA object-creation benchmark}
    23862397\label{f:creation}
    23872398
     
    23962407\multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} & \multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\
    23972408\hline
    2398 Pthreads                                & 26996         & 26984.71      & 156.6  \\
    2399 \CFA Coroutine Lazy             & 6                     & 5.71          & 0.45   \\
    2400 \CFA Coroutine Eager    & 708           & 706.68        & 4.82   \\
    2401 \CFA Thread                             & 1173.5        & 1176.18       & 15.18  \\
    2402 \uC Coroutine                   & 109           & 107.46        & 1.74   \\
    2403 \uC Thread                              & 526           & 530.89        & 9.73   \\
    2404 Goroutine                               & 2520.5        & 2530.93       & 61.56  \\
    2405 Java Thread                             & 91114.5       & 92272.79      & 961.58 \\
     2409Pthreads                                & 28091         & 28073.39      & 163.1  \\
     2410\CFA Coroutine Lazy             & 6                     & 6.07          & 0.26   \\
     2411\CFA Coroutine Eager    & 520           & 520.61        & 2.04   \\
     2412\CFA Thread                             & 2032  & 2016.29       & 112.07  \\
     2413\uC Coroutine                   & 106           & 107.36        & 1.47   \\
     2414\uC Thread                              & 536.5 & 537.07        & 4.64   \\
     2415Goroutine                               & 3103  & 3086.29       & 90.25  \\
     2416Java Thread                             & 103416.5      & 103732.29     & 1137 \\
    24062417\hline
    24072418\end{tabular}
     
    24182429\section{Conclusion}
    24192430
    2420 This paper demonstrate a concurrency API that is simple, efficient, and able to build higher-level concurrency features.
     2431This paper demonstrates a concurrency API that is simple, efficient, and able to build higher-level concurrency features.
    24212432The approach provides concurrency based on a preemptive M:N user-level threading-system, executing in clusters, which encapsulate scheduling of work on multiple kernel threads providing parallelism.
    24222433The M:N model is judged to be efficient and provide greater flexibility than a 1:1 threading model.
     
    24392450However, no single scheduler is optimal for all workloads and therefore there is value in being able to change the scheduler for given programs.
    24402451One solution is to offer various tweaking options, allowing the scheduler to be adjusted to the requirements of the workload.
    2441 However, to be truly flexible, it is necessary to have a pluggable scheduler.
     2452However, to be truly flexible, a pluggable scheduler is necessary.
    24422453Currently, the \CFA pluggable scheduler is too simple to handle complex scheduling, \eg quality of service and real-time, where the scheduler must interact with mutex objects to deal with issues like priority inversion.
    24432454
     
    24492460At its core, non-blocking I/O is an operating-system level feature queuing IO operations, \eg network operations, and registering for notifications instead of waiting for requests to complete.
    24502461Current trends use asynchronous programming like callbacks, futures, and/or promises, \eg Node.js~\cite{NodeJs} for JavaScript, Spring MVC~\cite{SpringMVC} for Java, and Django~\cite{Django} for Python.
    2451 However, these solutions lead to code that is hard create, read and maintain.
     2462However, these solutions lead to code that is hard to create, read and maintain.
    24522463A better approach is to tie non-blocking I/O into the concurrency system to provide ease of use with low overhead, \eg thread-per-connection web-services.
    24532464A non-blocking I/O library is currently under development for \CFA.
     
    24562467\label{futur:tools}
    24572468
    2458 While monitors offer a flexible and powerful concurrent for \CFA, other concurrency tools are also necessary for a complete multi-paradigm concurrency package.
     2469While monitors offer flexible and powerful concurrency for \CFA, other concurrency tools are also necessary for a complete multi-paradigm concurrency package.
    24592470Examples of such tools can include futures and promises~\cite{promises}, executors and actors.
    24602471These additional features are useful when monitors offer a level of abstraction that is inadequate for certain tasks.
     2472As well, new \CFA extensions should make it possible to create a uniform interface for virtually all mutual exclusion, including monitors and low-level locks.
    24612473
    24622474\paragraph{Implicit Threading}
     
    24672479The canonical example of implicit concurrency is concurrent nested @for@ loops, which are amenable to divide and conquer algorithms~\cite{uC++book}.
    24682480The \CFA language features should make it possible to develop a reasonable number of implicit concurrency mechanism to solve basic HPC data-concurrency problems.
    2469 However, implicit concurrency is a restrictive solution and has its limitations, so it can never replace explicit concurrent programming.
     2481However, implicit concurrency is a restrictive solution with significant limitations, so it can never replace explicit concurrent programming.
    24702482
    24712483
  • doc/papers/concurrency/figures/ext_monitor.fig

    rc653b37 re3b2474  
    89894 1 -1 0 0 0 12 0.0000 2 135 525 3525 3675 shared\001
    90904 1 -1 0 0 0 12 0.0000 2 135 735 3525 3975 variables\001
    91 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 1875 Y\001
    92 4 0 4 50 -1 0 11 0.0000 2 120 105 4150 2175 Z\001
    93 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 2475 X\001
    94 4 0 4 50 -1 0 11 0.0000 2 120 165 4150 2775 W\001
     914 0 4 50 -1 0 11 0.0000 2 120 135 4150 1875 X\001
     924 0 4 50 -1 0 11 0.0000 2 120 135 4150 2175 Y\001
     934 0 4 50 -1 0 11 0.0000 2 120 135 4150 2475 Y\001
     944 0 4 50 -1 0 11 0.0000 2 120 135 4150 2775 X\001
    95954 0 -1 0 0 3 12 0.0000 2 150 540 5025 4275 urgent\001
    96964 1 0 50 -1 0 11 0.0000 2 165 600 3150 3150 accepted\001
  • src/Parser/LinkageSpec.h

    rc653b37 re3b2474  
    1010// Created On       : Sat May 16 13:24:28 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sat Jul 22 09:32:16 2017
    13 // Update Count     : 14
     12// Last Modified On : Mon Jul  2 07:46:49 2018
     13// Update Count     : 16
    1414//
    1515
     
    2222namespace LinkageSpec {
    2323        // All linkage specs are some combination of these flags:
    24         enum {
    25                 Mangle = 1 << 0,
    26                 Generate = 1 << 1,
    27                 Overrideable = 1 << 2,
    28                 Builtin = 1 << 3,
    29                 GccBuiltin = 1 << 4,
    30 
    31                 NoOfSpecs = 1 << 5,
    32         };
     24        enum { Mangle = 1 << 0, Generate = 1 << 1, Overrideable = 1 << 2, Builtin = 1 << 3, GccBuiltin = 1 << 4, NoOfSpecs = 1 << 5, };
    3325
    3426        union Spec {
     
    4234                };
    4335                constexpr Spec( unsigned int val ) : val( val ) {}
    44                 constexpr Spec( Spec const &other ) : val( other.val ) {}
     36                constexpr Spec( Spec const & other ) : val( other.val ) {}
    4537                // Operators may go here.
    4638                // Supports == and !=
    47                 constexpr operator unsigned int () const { return val; }
     39                constexpr operator unsigned int() const { return val; }
    4840        };
    4941
  • src/Parser/parser.yy

    rc653b37 re3b2474  
    1010// Created On       : Sat Sep  1 20:22:55 2001
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sun Jun 24 10:41:10 2018
    13 // Update Count     : 3587
     12// Last Modified On : Mon Jul  2 20:23:14 2018
     13// Update Count     : 3607
    1414//
    1515
     
    112112        for ( DeclarationNode *iter = declaration; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) {
    113113                iter->set_extension( true );
     114        } // for
     115} // distExt
     116
     117void distQual( DeclarationNode * declaration, DeclarationNode * qualifiers ) {
     118        // distribute qualifiers across all declarations in a distribution statemement
     119        for ( DeclarationNode * iter = declaration; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) {
     120                if ( isMangled( iter->linkage ) ) {             // ignore extern "C"
     121                        iter->addQualifiers( qualifiers->clone() );
     122                } // if
    114123        } // for
    115124} // distExt
     
    11361145
    11371146waitfor:
    1138         WAITFOR '(' identifier ')'
    1139                 {
    1140                         $$ = new ExpressionNode( new NameExpr( *$3 ) );
    1141                         delete $3;
    1142                 }
    1143         | WAITFOR '(' identifier ',' argument_expression_list ')'
    1144                 {
    1145                         $$ = new ExpressionNode( new NameExpr( *$3 ) );
    1146                         $$->set_last( $5 );
    1147                         delete $3;
    1148                 }
     1147        WAITFOR '(' cast_expression ')'
     1148                { $$ = $3; }
     1149        | WAITFOR '(' cast_expression ',' argument_expression_list ')'
     1150                { $$ = (ExpressionNode *)$3->set_last( $5 ); }
    11491151        ;
    11501152
     
    11631165                { $$ = build_waitfor_timeout( nullptr, $3, $1 ); }
    11641166                // "else" must be conditional after timeout or timeout is never triggered (i.e., it is meaningless)
     1167        | when_clause_opt timeout statement WOR ELSE statement
     1168                { SemanticError( yylloc, "else clause must be conditional after timeout or timeout never triggered." ); $$ = nullptr; }
    11651169        | when_clause_opt timeout statement WOR when_clause ELSE statement
    11661170                { $$ = build_waitfor_timeout( $2, $3, $1, $7, $5 ); }
     
    23802384          '{' up external_definition_list_opt down '}'          // CFA, namespace
    23812385                {
    2382                         for ( DeclarationNode * iter = $5; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) {
    2383                                 if ( isMangled( iter->linkage ) ) {             // ignore extern "C"
    2384                                         iter->addQualifiers( $1->clone() );
    2385                                 } // if
    2386                         } // for
     2386                        distQual( $5, $1 );
    23872387                        xxx = false;
    23882388                        delete $1;
     
    23912391        | declaration_qualifier_list
    23922392                {
    2393                         if ( $1->type->qualifiers.val ) { SemanticError( yylloc, "CV qualifiers cannot be distributed; only storage-class and forall qualifiers." ); }
    2394                         if ( $1->type->forall ) xxx = forall = true; // remember generic type
     2393                        if ( $1->type && $1->type->qualifiers.val ) { SemanticError( yylloc, "CV qualifiers cannot be distributed; only storage-class and forall qualifiers." ); }
     2394                        if ( $1->type && $1->type->forall ) xxx = forall = true; // remember generic type
    23952395                }
    23962396          '{' up external_definition_list_opt down '}'          // CFA, namespace
    23972397                {
    2398                         for ( DeclarationNode * iter = $5; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) {
    2399                                 if ( isMangled( iter->linkage ) ) {             // ignore extern "C"
    2400                                         iter->addQualifiers( $1->clone() );
    2401                                 } // if
    2402                         } // for
     2398                        distQual( $5, $1 );
    24032399                        xxx = false;
    24042400                        delete $1;
     
    24122408          '{' up external_definition_list_opt down '}'          // CFA, namespace
    24132409                {
    2414                         for ( DeclarationNode * iter = $6; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) {
    2415                                 if ( isMangled( iter->linkage ) && isMangled( $2->linkage ) ) { // ignore extern "C"
    2416                                         iter->addQualifiers( $1->clone() );
    2417                                         iter->addQualifiers( $2->clone() );
    2418                                 } // if
    2419                         } // for
     2410                        distQual( $6, $2 );
     2411                        distQual( $6, $1 );
    24202412                        xxx = false;
    24212413                        delete $1;
  • src/ResolvExpr/Resolver.cc

    rc653b37 re3b2474  
    583583                                                        // Make sure we don't widen any existing bindings
    584584                                                        resultEnv.forbidWidening();
    585                                                        
     585
    586586                                                        // Find any unbound type variables
    587587                                                        resultEnv.extractOpenVars( openVars );
     
    590590                                                        auto param_end = function->parameters.end();
    591591
    592                                                         int n_mutex_arg = 0;
     592                                                        int n_mutex_param = 0;
    593593
    594594                                                        // For every arguments of its set, check if it matches one of the parameter
     
    600600                                                                        // We ran out of parameters but still have arguments
    601601                                                                        // this function doesn't match
    602                                                                         SemanticError( function, toString("candidate function not viable: too many mutex arguments, expected ", n_mutex_arg, "\n" ));
     602                                                                        SemanticError( function, toString("candidate function not viable: too many mutex arguments, expected ", n_mutex_param, "\n" ));
    603603                                                                }
    604604
    605                                                                 n_mutex_arg++;
     605                                                                n_mutex_param++;
    606606
    607607                                                                // Check if the argument matches the parameter type in the current scope
     
    626626                                                        // Check if parameters are missing
    627627                                                        if( advance_to_mutex( param, param_end ) ) {
     628                                                                do {
     629                                                                        n_mutex_param++;
     630                                                                        param++;
     631                                                                } while( advance_to_mutex( param, param_end ) );
     632
    628633                                                                // We ran out of arguments but still have parameters left
    629634                                                                // this function doesn't match
    630                                                                 SemanticError( function, toString("candidate function not viable: too few mutex arguments, expected ", n_mutex_arg, "\n" ));
     635                                                                SemanticError( function, toString("candidate function not viable: too few mutex arguments, expected ", n_mutex_param, "\n" ));
    631636                                                        }
    632637
  • src/libcfa/clock

    rc653b37 re3b2474  
    1010// Created On       : Thu Apr 12 14:36:06 2018
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Thu Apr 12 16:53:31 2018
    13 // Update Count     : 3
     12// Last Modified On : Mon Jul  2 21:40:01 2018
     13// Update Count     : 7
    1414//
    1515
     
    3434};
    3535
    36 static inline void resetClock( Clock & clk ) with( clk ) {
    37         clocktype = CLOCK_REALTIME_COARSE;
    38 } // Clock::resetClock
     36static inline {
     37        void resetClock( Clock & clk ) with( clk ) {
     38                clocktype = CLOCK_REALTIME_COARSE;
     39        } // Clock::resetClock
    3940
    40 static inline void resetClock( Clock & clk, Duration adj ) with( clk ) {
    41         clocktype = -1;
    42         offset = adj + timezone`s;                                                      // timezone (global) is (UTC - local time) in seconds
    43 } // resetClock
     41        void resetClock( Clock & clk, Duration adj ) with( clk ) {
     42                clocktype = -1;
     43                offset = adj + __timezone`s;                                    // timezone (global) is (UTC - local time) in seconds
     44        } // resetClock
    4445
    45 static inline void ?{}( Clock & clk ) { resetClock( clk ); }
    46 static inline void ?{}( Clock & clk, Duration adj ) { resetClock( clk, adj ); }
     46        void ?{}( Clock & clk ) { resetClock( clk ); }
     47        void ?{}( Clock & clk, Duration adj ) { resetClock( clk, adj ); }
    4748
    48 static inline Duration getRes() {
    49         struct timespec res;
    50         clock_getres( CLOCK_REALTIME_COARSE, &res );
    51         return ((int64_t)res.tv_sec * TIMEGRAN + res.tv_nsec)`ns;
    52 } // getRes
     49        Duration getResNsec() {
     50                struct timespec res;
     51                clock_getres( CLOCK_REALTIME, &res );
     52                return ((int64_t)res.tv_sec * TIMEGRAN + res.tv_nsec)`ns;
     53        } // getRes
    5354
    54 static inline Time getTimeNsec() {                                              // with nanoseconds
    55         timespec curr;
    56         clock_gettime( CLOCK_REALTIME_COARSE, &curr );
    57         return (Time){ curr };
    58 } // getTime
     55        Duration getRes() {
     56                struct timespec res;
     57                clock_getres( CLOCK_REALTIME_COARSE, &res );
     58                return ((int64_t)res.tv_sec * TIMEGRAN + res.tv_nsec)`ns;
     59        } // getRes
    5960
    60 static inline Time getTime() {                                                  // without nanoseconds
    61         timespec curr;
    62         clock_gettime( CLOCK_REALTIME_COARSE, &curr );
    63         curr.tv_nsec = 0;
    64         return (Time){ curr };
    65 } // getTime
     61        Time getTimeNsec() {                                                            // with nanoseconds
     62                timespec curr;
     63                clock_gettime( CLOCK_REALTIME, &curr );
     64                return (Time){ curr };
     65        } // getTime
    6666
    67 static inline Time getTime( Clock & clk ) with( clk ) {
    68         return getTime() + offset;
    69 } // getTime
     67        Time getTime() {                                                                        // without nanoseconds
     68                timespec curr;
     69                clock_gettime( CLOCK_REALTIME_COARSE, &curr );
     70                curr.tv_nsec = 0;
     71                return (Time){ curr };
     72        } // getTime
    7073
    71 static inline Time ?()( Clock & clk ) with( clk ) {             // alternative syntax
    72         return getTime() + offset;
    73 } // getTime
     74        Time getTime( Clock & clk ) with( clk ) {
     75                return getTime() + offset;
     76        } // getTime
    7477
    75 static inline timeval getTime( Clock & clk ) {
    76         return (timeval){ clk() };
    77 } // getTime
     78        Time ?()( Clock & clk ) with( clk ) {                           // alternative syntax
     79                return getTime() + offset;
     80        } // getTime
    7881
    79 static inline tm getTime( Clock & clk ) with( clk ) {
    80         tm ret;
    81         localtime_r( getTime( clk ).tv_sec, &ret );
    82         return ret;
    83 } // getTime
     82        timeval getTime( Clock & clk ) {
     83                return (timeval){ clk() };
     84        } // getTime
     85
     86        tm getTime( Clock & clk ) with( clk ) {
     87                tm ret;
     88                localtime_r( getTime( clk ).tv_sec, &ret );
     89                return ret;
     90        } // getTime
     91} // distribution
    8492
    8593// Local Variables: //
  • src/libcfa/iostream

    rc653b37 re3b2474  
    1010// Created On       : Wed May 27 17:56:53 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sat Jun  2 08:07:55 2018
    13 // Update Count     : 153
     12// Last Modified On : Sun Jul  1 12:12:22 2018
     13// Update Count     : 155
    1414//
    1515
     
    4242        void open( ostype & os, const char * name, const char * mode );
    4343        void close( ostype & os );
    44         ostype & write( ostype &, const char *, unsigned long int );
     44        ostype & write( ostype &, const char *, size_t );
    4545        int fmt( ostype &, const char fmt[], ... );
    4646}; // ostream
     
    117117        void open( istype & is, const char * name );
    118118        void close( istype & is );
    119         istype & read( istype &, char *, unsigned long int );
     119        istype & read( istype &, char *, size_t );
    120120        istype & ungetc( istype &, char );
    121121        int fmt( istype &, const char fmt[], ... );
  • src/libcfa/stdlib

    rc653b37 re3b2474  
    1010// Created On       : Thu Jan 28 17:12:35 2016
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sat Jun  2 08:46:35 2018
    13 // Update Count     : 306
     12// Last Modified On : Tue Jul  3 08:17:28 2018
     13// Update Count     : 324
    1414//
    1515
     
    264264//---------------------------------------
    265265
    266 extern "C" { void srandom( unsigned int seed ); }               // override C version
    267 char random( void );
    268 char random( char u );
    269 char random( char l, char u );
    270 int random( void );
    271 int random( int u );
    272 int random( int l, int u );
    273 unsigned int random( void );
    274 unsigned int random( unsigned int u );
    275 unsigned int random( unsigned int l, unsigned int u );
    276 extern "C" { long int random( void ); }                                 // override C version
    277 long int random( long int u );
    278 long int random( long int l, long int u );
    279 unsigned long int random( void );
    280 unsigned long int random( unsigned long int u );
    281 unsigned long int random( unsigned long int l, unsigned long int u );
    282 float random( void );
    283 double random( void );
    284 float _Complex random( void );
    285 double _Complex random( void );
    286 long double _Complex random( void );
     266extern "C" {                                                                                    // override C version
     267        void srandom( unsigned int seed );
     268        long int random( void );
     269} // extern "C"
     270
     271static inline {
     272        long int random( long int l, long int u ) { if ( u < l ) [u, l] = [l, u]; return lrand48() % (u - l) + l; } // [l,u)
     273        long int random( long int u ) { if ( u < 0 ) return random( u, 0 ); else return random( 0, u ); } // [0,u)
     274        unsigned long int random( void ) { return lrand48(); }
     275        unsigned long int random( unsigned long int l, unsigned long int u ) { if ( u < l ) [u, l] = [l, u]; return lrand48() % (u - l) + l; } // [l,u)
     276        unsigned long int random( unsigned long int u ) { return lrand48() % u; } // [0,u)
     277
     278        char random( void ) { return (unsigned long int)random(); }
     279        char random( char u ) { return random( (unsigned long int)u ); } // [0,u)
     280        char random( char l, char u ) { return random( (unsigned long int)l, (unsigned long int)u ); } // [l,u)
     281        int random( void ) { return (long int)random(); }
     282        int random( int u ) { return random( (long int)u ); } // [0,u]
     283        int random( int l, int u ) { return random( (long int)l, (long int)u ); } // [l,u)
     284        unsigned int random( void ) { return (unsigned long int)random(); }
     285        unsigned int random( unsigned int u ) { return random( (unsigned long int)u ); } // [0,u]
     286        unsigned int random( unsigned int l, unsigned int u ) { return random( (unsigned long int)l, (unsigned long int)u ); } // [l,u)
     287} // distribution
     288
     289float random( void );                                                                   // [0.0, 1.0)
     290double random( void );                                                                  // [0.0, 1.0)
     291float _Complex random( void );                                                  // [0.0, 1.0)+[0.0, 1.0)i
     292double _Complex random( void );                                                 // [0.0, 1.0)+[0.0, 1.0)i
     293long double _Complex random( void );                                    // [0.0, 1.0)+[0.0, 1.0)i
    287294
    288295//---------------------------------------
  • src/libcfa/stdlib.c

    rc653b37 re3b2474  
    1010// Created On       : Thu Jan 28 17:10:29 2016
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sat Jun  2 06:15:05 2018
    13 // Update Count     : 448
     12// Last Modified On : Tue Jul  3 08:17:30 2018
     13// Update Count     : 457
    1414//
    1515
     
    249249//---------------------------------------
    250250
    251 extern "C" { void srandom( unsigned int seed ) { srand48( (long int)seed ); } } // override C version
    252 char random( void ) { return (unsigned long int)random(); }
    253 char random( char u ) { return random( (unsigned long int)u ); }
    254 char random( char l, char u ) { return random( (unsigned long int)l, (unsigned long int)u ); }
    255 int random( void ) { return (long int)random(); }
    256 int random( int u ) { return random( (long int)u ); }
    257 int random( int l, int u ) { return random( (long int)l, (long int)u ); }
    258 unsigned int random( void ) { return (unsigned long int)random(); }
    259 unsigned int random( unsigned int u ) { return random( (unsigned long int)u ); }
    260 unsigned int random( unsigned int l, unsigned int u ) { return random( (unsigned long int)l, (unsigned long int)u ); }
    261 extern "C" { long int random( void ) { return mrand48(); } } // override C version
    262 long int random( long int u ) { if ( u < 0 ) return random( u, 0 ); else return random( 0, u ); }
    263 long int random( long int l, long int u ) { assert( l < u ); return lrand48() % (u - l) + l; }
    264 unsigned long int random( void ) { return lrand48(); }
    265 unsigned long int random( unsigned long int u ) { return lrand48() % u; }
    266 unsigned long int random( unsigned long int l, unsigned long int u ) { assert( l < u ); return lrand48() % (u - l) + l; }
     251extern "C" {                                                                                    // override C version
     252        void srandom( unsigned int seed ) { srand48( (long int)seed ); }
     253        long int random( void ) { return mrand48(); }
     254} // extern "C"
     255
    267256float random( void ) { return (float)drand48(); }               // cast otherwise float uses lrand48
    268257double random( void ) { return drand48(); }
  • src/libcfa/time

    rc653b37 re3b2474  
    1010// Created On       : Wed Mar 14 23:18:57 2018
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sat Apr 14 17:48:23 2018
    13 // Update Count     : 636
     12// Last Modified On : Mon Jul  2 21:28:38 2018
     13// Update Count     : 641
    1414//
    1515
     
    2727enum { TIMEGRAN = 1_000_000_000LL };                                    // nanosecond granularity, except for timeval
    2828
    29 
    3029//######################### Duration #########################
    3130
    32 static inline Duration ?=?( Duration & dur, zero_t ) { return dur{ 0 }; }
    33 
    34 static inline Duration +?( Duration rhs ) with( rhs ) { return (Duration)@{ +tv }; }
    35 static inline Duration ?+?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv + rhs.tv }; }
    36 static inline Duration ?+=?( Duration & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
    37 
    38 static inline Duration -?( Duration rhs ) with( rhs ) { return (Duration)@{ -tv }; }
    39 static inline Duration ?-?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
    40 static inline Duration ?-=?( Duration & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
    41 
    42 static inline Duration ?*?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv * rhs }; }
    43 static inline Duration ?*?( int64_t lhs, Duration rhs ) { return (Duration)@{ lhs * rhs.tv }; }
    44 static inline Duration ?*=?( Duration & lhs, int64_t rhs ) { lhs = lhs * rhs; return lhs; }
    45 
    46 static inline int64_t ?/?( Duration lhs, Duration rhs ) { return lhs.tv / rhs.tv; }
    47 static inline Duration ?/?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv / rhs }; }
    48 static inline Duration ?/=?( Duration & lhs, int64_t rhs ) { lhs = lhs / rhs; return lhs; }
    49 static inline double div( Duration lhs, Duration rhs ) { return (double)lhs.tv / (double)rhs.tv; }
    50 
    51 static inline Duration ?%?( Duration lhs, Duration rhs ) { return (Duration)@{ lhs.tv % rhs.tv }; }
    52 static inline Duration ?%=?( Duration & lhs, Duration rhs ) { lhs = lhs % rhs; return lhs; }
    53 
    54 static inline _Bool ?==?( Duration lhs, Duration rhs ) { return lhs.tv == rhs.tv; }
    55 static inline _Bool ?!=?( Duration lhs, Duration rhs ) { return lhs.tv != rhs.tv; }
    56 static inline _Bool ?<? ( Duration lhs, Duration rhs ) { return lhs.tv <  rhs.tv; }
    57 static inline _Bool ?<=?( Duration lhs, Duration rhs ) { return lhs.tv <= rhs.tv; }
    58 static inline _Bool ?>? ( Duration lhs, Duration rhs ) { return lhs.tv >  rhs.tv; }
    59 static inline _Bool ?>=?( Duration lhs, Duration rhs ) { return lhs.tv >= rhs.tv; }
    60 
    61 static inline _Bool ?==?( Duration lhs, zero_t ) { return lhs.tv == 0; }
    62 static inline _Bool ?!=?( Duration lhs, zero_t ) { return lhs.tv != 0; }
    63 static inline _Bool ?<? ( Duration lhs, zero_t ) { return lhs.tv <  0; }
    64 static inline _Bool ?<=?( Duration lhs, zero_t ) { return lhs.tv <= 0; }
    65 static inline _Bool ?>? ( Duration lhs, zero_t ) { return lhs.tv >  0; }
    66 static inline _Bool ?>=?( Duration lhs, zero_t ) { return lhs.tv >= 0; }
    67 
    68 static inline Duration abs( Duration rhs ) { return rhs.tv >= 0 ? rhs : -rhs; }
    69 
    70 static inline Duration ?`ns( int64_t nsec ) { return (Duration)@{ nsec }; }
    71 static inline Duration ?`us( int64_t usec ) { return (Duration)@{ usec * (TIMEGRAN / 1_000_000LL) }; }
    72 static inline Duration ?`ms( int64_t msec ) { return (Duration)@{ msec * (TIMEGRAN / 1_000LL) }; }
    73 static inline Duration ?`s( int64_t sec ) { return (Duration)@{ sec * TIMEGRAN }; }
    74 static inline Duration ?`s( double sec ) { return (Duration)@{ sec * TIMEGRAN }; }
    75 static inline Duration ?`m( int64_t min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
    76 static inline Duration ?`m( double min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
    77 static inline Duration ?`h( int64_t hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
    78 static inline Duration ?`h( double hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
    79 static inline Duration ?`d( int64_t days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
    80 static inline Duration ?`d( double days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
    81 static inline Duration ?`w( int64_t weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
    82 static inline Duration ?`w( double weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
    83 
    84 static inline int64_t ?`ns( Duration dur ) { return dur.tv; }
    85 static inline int64_t ?`us( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000_000LL); }
    86 static inline int64_t ?`ms( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000LL); }
    87 static inline int64_t ?`s( Duration dur ) { return dur.tv / TIMEGRAN; }
    88 static inline int64_t ?`m( Duration dur ) { return dur.tv / (60LL * TIMEGRAN); }
    89 static inline int64_t ?`h( Duration dur ) { return dur.tv / (60LL * 60LL * TIMEGRAN); }
    90 static inline int64_t ?`d( Duration dur ) { return dur.tv / (24LL * 60LL * 60LL * TIMEGRAN); }
    91 static inline int64_t ?`w( Duration dur ) { return dur.tv / (7LL * 24LL * 60LL * 60LL * TIMEGRAN); }
    92 
    93 static inline Duration max( Duration lhs, Duration rhs ) { return  (lhs.tv < rhs.tv) ? rhs : lhs;}
    94 static inline Duration min( Duration lhs, Duration rhs ) { return !(rhs.tv < lhs.tv) ? lhs : rhs;}
    95 
     31static inline {
     32        Duration ?=?( Duration & dur, zero_t ) { return dur{ 0 }; }
     33
     34        Duration +?( Duration rhs ) with( rhs ) {       return (Duration)@{ +tv }; }
     35        Duration ?+?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv + rhs.tv }; }
     36        Duration ?+=?( Duration & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
     37
     38        Duration -?( Duration rhs ) with( rhs ) { return (Duration)@{ -tv }; }
     39        Duration ?-?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
     40        Duration ?-=?( Duration & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
     41
     42        Duration ?*?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv * rhs }; }
     43        Duration ?*?( int64_t lhs, Duration rhs ) { return (Duration)@{ lhs * rhs.tv }; }
     44        Duration ?*=?( Duration & lhs, int64_t rhs ) { lhs = lhs * rhs; return lhs; }
     45
     46        int64_t ?/?( Duration lhs, Duration rhs ) { return lhs.tv / rhs.tv; }
     47        Duration ?/?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv / rhs }; }
     48        Duration ?/=?( Duration & lhs, int64_t rhs ) { lhs = lhs / rhs; return lhs; }
     49        double div( Duration lhs, Duration rhs ) { return (double)lhs.tv / (double)rhs.tv; }
     50
     51        Duration ?%?( Duration lhs, Duration rhs ) { return (Duration)@{ lhs.tv % rhs.tv }; }
     52        Duration ?%=?( Duration & lhs, Duration rhs ) { lhs = lhs % rhs; return lhs; }
     53
     54        _Bool ?==?( Duration lhs, Duration rhs ) { return lhs.tv == rhs.tv; }
     55        _Bool ?!=?( Duration lhs, Duration rhs ) { return lhs.tv != rhs.tv; }
     56        _Bool ?<? ( Duration lhs, Duration rhs ) { return lhs.tv <  rhs.tv; }
     57        _Bool ?<=?( Duration lhs, Duration rhs ) { return lhs.tv <= rhs.tv; }
     58        _Bool ?>? ( Duration lhs, Duration rhs ) { return lhs.tv >  rhs.tv; }
     59        _Bool ?>=?( Duration lhs, Duration rhs ) { return lhs.tv >= rhs.tv; }
     60
     61        _Bool ?==?( Duration lhs, zero_t ) { return lhs.tv == 0; }
     62        _Bool ?!=?( Duration lhs, zero_t ) { return lhs.tv != 0; }
     63        _Bool ?<? ( Duration lhs, zero_t ) { return lhs.tv <  0; }
     64        _Bool ?<=?( Duration lhs, zero_t ) { return lhs.tv <= 0; }
     65        _Bool ?>? ( Duration lhs, zero_t ) { return lhs.tv >  0; }
     66        _Bool ?>=?( Duration lhs, zero_t ) { return lhs.tv >= 0; }
     67
     68        Duration abs( Duration rhs ) { return rhs.tv >= 0 ? rhs : -rhs; }
     69
     70        Duration ?`ns( int64_t nsec ) { return (Duration)@{ nsec }; }
     71        Duration ?`us( int64_t usec ) { return (Duration)@{ usec * (TIMEGRAN / 1_000_000LL) }; }
     72        Duration ?`ms( int64_t msec ) { return (Duration)@{ msec * (TIMEGRAN / 1_000LL) }; }
     73        Duration ?`s( int64_t sec ) { return (Duration)@{ sec * TIMEGRAN }; }
     74        Duration ?`s( double sec ) { return (Duration)@{ sec * TIMEGRAN }; }
     75        Duration ?`m( int64_t min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
     76        Duration ?`m( double min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
     77        Duration ?`h( int64_t hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
     78        Duration ?`h( double hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
     79        Duration ?`d( int64_t days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
     80        Duration ?`d( double days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
     81        Duration ?`w( int64_t weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
     82        Duration ?`w( double weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
     83
     84        int64_t ?`ns( Duration dur ) { return dur.tv; }
     85        int64_t ?`us( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000_000LL); }
     86        int64_t ?`ms( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000LL); }
     87        int64_t ?`s( Duration dur ) { return dur.tv / TIMEGRAN; }
     88        int64_t ?`m( Duration dur ) { return dur.tv / (60LL * TIMEGRAN); }
     89        int64_t ?`h( Duration dur ) { return dur.tv / (60LL * 60LL * TIMEGRAN); }
     90        int64_t ?`d( Duration dur ) { return dur.tv / (24LL * 60LL * 60LL * TIMEGRAN); }
     91        int64_t ?`w( Duration dur ) { return dur.tv / (7LL * 24LL * 60LL * 60LL * TIMEGRAN); }
     92
     93        Duration max( Duration lhs, Duration rhs ) { return  (lhs.tv < rhs.tv) ? rhs : lhs;}
     94        Duration min( Duration lhs, Duration rhs ) { return !(rhs.tv < lhs.tv) ? lhs : rhs;}
     95} // distribution
    9696
    9797//######################### C timeval #########################
    9898
    99 static inline void ?{}( timeval & t ) {}
    100 static inline void ?{}( timeval & t, time_t sec, suseconds_t usec ) { t.tv_sec = sec; t.tv_usec = usec; }
    101 static inline void ?{}( timeval & t, time_t sec ) { t{ sec, 0 }; }
    102 static inline void ?{}( timeval & t, zero_t ) { t{ 0, 0 }; }
    103 static inline timeval ?=?( timeval & t, zero_t ) { return t{ 0 }; }
    104 static inline timeval ?+?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_usec + rhs.tv_usec }; }
    105 static inline timeval ?-?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_usec - rhs.tv_usec }; }
    106 static inline _Bool ?==?( timeval lhs, timeval rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_usec == rhs.tv_usec; }
    107 static inline _Bool ?!=?( timeval lhs, timeval rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_usec != rhs.tv_usec; }
    108 
     99static inline {
     100        void ?{}( timeval & t ) {}
     101        void ?{}( timeval & t, time_t sec, suseconds_t usec ) { t.tv_sec = sec; t.tv_usec = usec; }
     102        void ?{}( timeval & t, time_t sec ) { t{ sec, 0 }; }
     103        void ?{}( timeval & t, zero_t ) { t{ 0, 0 }; }
     104
     105        timeval ?=?( timeval & t, zero_t ) { return t{ 0 }; }
     106        timeval ?+?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_usec + rhs.tv_usec }; }
     107        timeval ?-?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_usec - rhs.tv_usec }; }
     108        _Bool ?==?( timeval lhs, timeval rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_usec == rhs.tv_usec; }
     109        _Bool ?!=?( timeval lhs, timeval rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_usec != rhs.tv_usec; }
     110} // distribution
    109111
    110112//######################### C timespec #########################
    111113
    112 static inline void ?{}( timespec & t ) {}
    113 static inline void ?{}( timespec & t, time_t sec, __syscall_slong_t nsec ) { t.tv_sec = sec; t.tv_nsec = nsec; }
    114 static inline void ?{}( timespec & t, time_t sec ) { t{ sec, 0}; }
    115 static inline void ?{}( timespec & t, zero_t ) { t{ 0, 0 }; }
    116 static inline timespec ?=?( timespec & t, zero_t ) { return t{ 0 }; }
    117 static inline timespec ?+?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_nsec + rhs.tv_nsec }; }
    118 static inline timespec ?-?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_nsec - rhs.tv_nsec }; }
    119 static inline _Bool ?==?( timespec lhs, timespec rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_nsec == rhs.tv_nsec; }
    120 static inline _Bool ?!=?( timespec lhs, timespec rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_nsec != rhs.tv_nsec; }
    121 
     114static inline {
     115        void ?{}( timespec & t ) {}
     116        void ?{}( timespec & t, time_t sec, __syscall_slong_t nsec ) { t.tv_sec = sec; t.tv_nsec = nsec; }
     117        void ?{}( timespec & t, time_t sec ) { t{ sec, 0}; }
     118        void ?{}( timespec & t, zero_t ) { t{ 0, 0 }; }
     119
     120        timespec ?=?( timespec & t, zero_t ) { return t{ 0 }; }
     121        timespec ?+?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_nsec + rhs.tv_nsec }; }
     122        timespec ?-?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_nsec - rhs.tv_nsec }; }
     123        _Bool ?==?( timespec lhs, timespec rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_nsec == rhs.tv_nsec; }
     124        _Bool ?!=?( timespec lhs, timespec rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_nsec != rhs.tv_nsec; }
     125} // distribution
    122126
    123127//######################### C itimerval #########################
    124128
    125 static inline void ?{}( itimerval & itv, Duration alarm ) with( itv ) {
    126         // itimerval contains durations but but uses time data-structure timeval.
    127         it_value{ alarm`s, (alarm % 1`s)`us };                          // seconds, microseconds
    128         it_interval{ 0 };                                                                       // 0 seconds, 0 microseconds
    129 } // itimerval
    130 
    131 static inline void ?{}( itimerval & itv, Duration alarm, Duration interval ) with( itv ) {
    132         // itimerval contains durations but but uses time data-structure timeval.
    133         it_value{ alarm`s, (alarm % 1`s)`us };                          // seconds, microseconds
    134         it_interval{ interval`s, interval`us };                         // seconds, microseconds
    135 } // itimerval
    136 
     129static inline {
     130        void ?{}( itimerval & itv, Duration alarm ) with( itv ) {
     131                // itimerval contains durations but but uses time data-structure timeval.
     132                it_value{ alarm`s, (alarm % 1`s)`us };                  // seconds, microseconds
     133                it_interval{ 0 };                                                               // 0 seconds, 0 microseconds
     134        } // itimerval
     135
     136        void ?{}( itimerval & itv, Duration alarm, Duration interval ) with( itv ) {
     137                // itimerval contains durations but but uses time data-structure timeval.
     138                it_value{ alarm`s, (alarm % 1`s)`us };                  // seconds, microseconds
     139                it_interval{ interval`s, interval`us };                 // seconds, microseconds
     140        } // itimerval
     141} // distribution
    137142
    138143//######################### Time #########################
    139144
    140145void ?{}( Time & time, int year, int month = 0, int day = 0, int hour = 0, int min = 0, int sec = 0, int nsec = 0 );
    141 static inline Time ?=?( Time & time, zero_t ) { return time{ 0 }; }
    142 
    143 static inline void ?{}( Time & time, timeval t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * 1000; }
    144 static inline Time ?=?( Time & time, timeval t ) with( time ) {
    145         tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * (TIMEGRAN / 1_000_000LL);
    146         return time;
    147 } // ?=?
    148 
    149 static inline void ?{}( Time & time, timespec t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec; }
    150 static inline Time ?=?( Time & time, timespec t ) with( time ) {
    151         tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec;
    152         return time;
    153 } // ?=?
    154 
    155 static inline Time ?+?( Time & lhs, Duration rhs ) { return (Time)@{ lhs.tv + rhs.tv }; }
    156 static inline Time ?+?( Duration lhs, Time rhs ) { return rhs + lhs; }
    157 static inline Time ?+=?( Time & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
    158 
    159 static inline Duration ?-?( Time lhs, Time rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
    160 static inline Time ?-?( Time lhs, Duration rhs ) { return (Time)@{ lhs.tv - rhs.tv }; }
    161 static inline Time ?-=?( Time & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
    162 static inline _Bool ?==?( Time lhs, Time rhs ) { return lhs.tv == rhs.tv; }
    163 static inline _Bool ?!=?( Time lhs, Time rhs ) { return lhs.tv != rhs.tv; }
    164 static inline _Bool ?<?( Time lhs, Time rhs ) { return lhs.tv < rhs.tv; }
    165 static inline _Bool ?<=?( Time lhs, Time rhs ) { return lhs.tv <= rhs.tv; }
    166 static inline _Bool ?>?( Time lhs, Time rhs ) { return lhs.tv > rhs.tv; }
    167 static inline _Bool ?>=?( Time lhs, Time rhs ) { return lhs.tv >= rhs.tv; }
     146static inline {
     147        Time ?=?( Time & time, zero_t ) { return time{ 0 }; }
     148
     149        void ?{}( Time & time, timeval t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * 1000; }
     150        Time ?=?( Time & time, timeval t ) with( time ) {
     151                tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * (TIMEGRAN / 1_000_000LL);
     152                return time;
     153        } // ?=?
     154
     155        void ?{}( Time & time, timespec t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec; }
     156        Time ?=?( Time & time, timespec t ) with( time ) {
     157                tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec;
     158                return time;
     159        } // ?=?
     160
     161        Time ?+?( Time & lhs, Duration rhs ) { return (Time)@{ lhs.tv + rhs.tv }; }
     162        Time ?+?( Duration lhs, Time rhs ) { return rhs + lhs; }
     163        Time ?+=?( Time & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
     164
     165        Duration ?-?( Time lhs, Time rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
     166        Time ?-?( Time lhs, Duration rhs ) { return (Time)@{ lhs.tv - rhs.tv }; }
     167        Time ?-=?( Time & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
     168        _Bool ?==?( Time lhs, Time rhs ) { return lhs.tv == rhs.tv; }
     169        _Bool ?!=?( Time lhs, Time rhs ) { return lhs.tv != rhs.tv; }
     170        _Bool ?<?( Time lhs, Time rhs ) { return lhs.tv < rhs.tv; }
     171        _Bool ?<=?( Time lhs, Time rhs ) { return lhs.tv <= rhs.tv; }
     172        _Bool ?>?( Time lhs, Time rhs ) { return lhs.tv > rhs.tv; }
     173        _Bool ?>=?( Time lhs, Time rhs ) { return lhs.tv >= rhs.tv; }
     174} // distribution
    168175
    169176char * yy_mm_dd( Time time, char * buf );
  • src/tests/literals.c

    rc653b37 re3b2474  
    1010// Created On       : Sat Sep  9 16:34:38 2017
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sun Mar  4 10:41:31 2018
    13 // Update Count     : 134
     12// Last Modified On : Sun Jul  1 15:12:15 2018
     13// Update Count     : 137
    1414//
    1515
     
    155155
    156156        // binary
    157          0b01101011_l8;   0b01101011_l16;   0b01101011_l32;   0b01101011_l64;   0b01101011_l128;   0b01101011_l8u;   0b01101011_ul16;   0b01101011_l32u;   0b01101011_ul64;   0b01101011_ul128;
    158         +0b01101011_l8;  +0b01101011_l16;  +0b01101011_l32;  +0b01101011_l64;  +0b01101011_l128;  +0b01101011_l8u;  +0b01101011_ul16;  +0b01101011_l32u;  +0b01101011_ul64;  +0b01101011_ul128;
    159         -0b01101011_l8;  -0b01101011_l16;  -0b01101011_l32;  -0b01101011_l64;  -0b01101011_l128;  -0b01101011_l8u;  -0b01101011_ul16;  -0b01101011_l32u;  -0b01101011_ul64;  -0b01101011_ul128;
     157         0b01101011_l8;   0b01101011_l16;   0b01101011_l32;   0b01101011_l64;   0b01101011_l8u;   0b01101011_ul16;   0b01101011_l32u;   0b01101011_ul64;
     158        +0b01101011_l8;  +0b01101011_l16;  +0b01101011_l32;  +0b01101011_l64;  +0b01101011_l8u;  +0b01101011_ul16;  +0b01101011_l32u;  +0b01101011_ul64;
     159        -0b01101011_l8;  -0b01101011_l16;  -0b01101011_l32;  -0b01101011_l64;  -0b01101011_l8u;  -0b01101011_ul16;  -0b01101011_l32u;  -0b01101011_ul64;
     160
     161#ifdef __LP64__ // 64-bit processor
     162        0b01101011_l128;   0b01101011_ul128;
     163        +0b01101011_l128;  +0b01101011_ul128;
     164        -0b01101011_l128;  -0b01101011_ul128;
     165#endif // __LP64__
    160166
    161167        // octal
    162          01234567_l8;   01234567_l16;   01234567_l32;   01234567_l64;   01234567_l128;   01234567_l8u;   01234567_ul16;   01234567_l32u;   01234567_ul64;   01234567_ul128;
    163         +01234567_l8;  +01234567_l16;  +01234567_l32;  +01234567_l64;  +01234567_l128;  +01234567_l8u;  +01234567_ul16;  +01234567_l32u;  +01234567_ul64;  +01234567_ul128;
    164         -01234567_l8;  -01234567_l16;  -01234567_l32;  -01234567_l64;  -01234567_l128;  -01234567_l8u;  -01234567_ul16;  -01234567_l32u;  -01234567_ul64;  -01234567_ul128;
     168         01234567_l8;   01234567_l16;   01234567_l32;   01234567_l64;   01234567_l8u;   01234567_ul16;   01234567_l32u;   01234567_ul64;
     169        +01234567_l8;  +01234567_l16;  +01234567_l32;  +01234567_l64;  +01234567_l8u;  +01234567_ul16;  +01234567_l32u;  +01234567_ul64;
     170        -01234567_l8;  -01234567_l16;  -01234567_l32;  -01234567_l64;  -01234567_l8u;  -01234567_ul16;  -01234567_l32u;  -01234567_ul64;
     171
     172#ifdef __LP64__ // 64-bit processor
     173        01234567_l128;   01234567_ul128;
     174        +01234567_l128;  +01234567_ul128;
     175        -01234567_l128;  -01234567_ul128;
     176#endif // __LP64__
    165177
    166178        // decimal
    167          1234567890L8;   1234567890L16;   1234567890l32;   1234567890l64;   1234567890l128;   1234567890UL8;   1234567890L16U;   1234567890Ul32;   1234567890l64u;   1234567890l128u;
    168         +1234567890L8;  +1234567890L16;  +1234567890l32;  +1234567890l64;  +1234567890l128;  +1234567890UL8;  +1234567890L16U;  +1234567890Ul32;  +1234567890l64u;  +1234567890l128u;
    169         -1234567890L8;  -1234567890L16;  -1234567890l32;  -1234567890l64;  -1234567890l128;  -1234567890UL8;  -1234567890L16U;  -1234567890Ul32;  -1234567890l64u;  -1234567890l128u;
     179         1234567890L8;   1234567890L16;   1234567890l32;   1234567890l64;   1234567890UL8;   1234567890L16U;   1234567890Ul32;   1234567890l64u;
     180        +1234567890L8;  +1234567890L16;  +1234567890l32;  +1234567890l64;  +1234567890UL8;  +1234567890L16U;  +1234567890Ul32;  +1234567890l64u;
     181        -1234567890L8;  -1234567890L16;  -1234567890l32;  -1234567890l64;  -1234567890UL8;  -1234567890L16U;  -1234567890Ul32;  -1234567890l64u;
     182
     183#ifdef __LP64__ // 64-bit processor
     184        1234567890l128;   1234567890l128u;
     185        +1234567890l128;  +1234567890l128u;
     186        -1234567890l128;  -1234567890l128u;
     187#endif // __LP64__
    170188
    171189        // hexadecimal
Note: See TracChangeset for help on using the changeset viewer.