Changeset e3b2474
- Timestamp: Jul 3, 2018, 3:25:56 PM (7 years ago)
- Branches: ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, no_list, persistent-indexer, pthread-emulation, qualifiedEnum
- Children: 638ac26
- Parents: c653b37 (diff), bbe1a87 (diff)
Note: this is a merge changeset; the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
- Files:
- 3 deleted
- 11 edited
doc/papers/concurrency/Paper.tex
rc653b37 re3b2474 3 3 \articletype{RESEARCH ARTICLE}% 4 4 5 \received{ 26 April 2016}6 \revised{ 6 June 2016}7 \accepted{ 6 June 2016}5 \received{XXXXX} 6 \revised{XXXXX} 7 \accepted{XXXXX} 8 8 9 9 \raggedbottom … … 32 32 \renewcommand{\linenumberfont}{\scriptsize\sffamily} 33 33 34 \renewcommand{\topfraction}{0.8} % float must be greater than X of the page before it is forced onto its own page 35 \renewcommand{\bottomfraction}{0.8} % float must be greater than X of the page before it is forced onto its own page 36 \renewcommand{\floatpagefraction}{0.8} % float must be greater than X of the page before it is forced onto its own page 34 37 \renewcommand{\textfraction}{0.0} % the entire page maybe devoted to floats with no text on the page at all 35 38 … … 132 135 \makeatother 133 136 134 \newenvironment{cquote}{% 135 \list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=3pt\parsep=0pt\leftmargin=\parindentlnth\rightmargin\leftmargin}% 136 \item\relax 137 }{% 138 \endlist 139 }% cquote 137 \newenvironment{cquote} 138 {\list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=3pt\parsep=0pt\leftmargin=\parindentlnth\rightmargin\leftmargin}% 139 \item\relax} 140 {\endlist} 141 142 %\newenvironment{cquote}{% 143 %\list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=3pt\parsep=0pt\leftmargin=\parindentlnth\rightmargin\leftmargin}% 144 %\item\relax% 145 %}{% 146 %\endlist% 147 %}% cquote 140 148 141 149 % CFA programming language, based on ANSI C (with some gcc additions) … … 234 242 \abstract[Summary]{ 235 243 \CFA is a modern, polymorphic, \emph{non-object-oriented} extension of the C programming language. 236 This paper discusses the design of the concurrency and parallelism features in \CFA, and theconcurrent runtime-system.244 This paper discusses the design of the concurrency and parallelism features in \CFA, and its concurrent runtime-system. 
237 245 These features are created from scratch as ISO C lacks concurrency, relying largely on the pthreads library for concurrency. 238 Coroutines and lightweight (user) threads are introduced into the language.239 In addition, monitors are added as a high-level mechanism for mutual exclusion and synchronization.240 A unique contribution is allowing multiple monitors to be safely acquired simultaneously.246 Coroutines and lightweight (user) threads are introduced into \CFA; 247 as well, monitors are added as a high-level mechanism for mutual exclusion and synchronization. 248 A unique contribution of this work is allowing multiple monitors to be safely acquired \emph{simultaneously}. 241 249 All features respect the expectations of C programmers, while being fully integrate with the \CFA polymorphic type-system and other language features. 242 Finally, experimental results show comparable performance of the new features with similar mechanisms in other concurrent programming-languages.250 Experimental results show comparable performance of the new features with similar mechanisms in other concurrent programming-languages. 243 251 }% 244 252 … … 271 279 Concurrency tools handle mutual exclusion and synchronization, while parallelism tools handle performance, cost, and resource utilization. 272 280 273 The proposed concurrency API is implemented in a dialect of C, called \CFA .281 The proposed concurrency API is implemented in a dialect of C, called \CFA (pronounced C-for-all). 274 282 The paper discusses how the language features are added to the \CFA translator with respect to parsing, semantics, and type checking, and the corresponding high-performance runtime-library to implement the concurrent features. 275 283 … … 281 289 282 290 \CFA is a non-object-oriented extension of ISO-C, and hence, supports all C paradigms. 
283 %It is a non-object-oriented system-language, meaning most of the major abstractions have either no runtime overhead or can be opted out easily.284 291 Like C, the building blocks of \CFA are structures and routines. 285 292 Virtually all of the code generated by the \CFA translator respects C memory layouts and calling conventions. 286 While \CFA is not object oriented, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C does havea notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}.287 While some \CFA features are common in object-oriented programming, they are independent capabilitiesallowing \CFA to adopt them while maintaining a procedural paradigm.293 While \CFA is not object oriented, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C has a notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}. 294 While some object-oriented features appear in \CFA, they are independent capabilities, allowing \CFA to adopt them while maintaining a procedural paradigm. 288 295 289 296 … … 338 345 Object-oriented programming languages only provide implicit qualification for the receiver. 339 346 340 In detail, the @with@ statement has the form:347 In detail, the @with@-statement syntax is: 341 348 \begin{cfa} 342 349 $\emph{with-statement}$: … … 348 355 (Enumerations are already opened.) 349 356 The object is the implicit qualifier for the open structure-fields. 350 All expressions in the expression list are open in parallel within the compound statement, which is different from Pascal, which nests the openings from left to right.357 All expressions in the expression list are opened in parallel within the compound statement, which is different from Pascal, which nests the openings from left to right. 
351 358 352 359 … … 354 361 355 362 \CFA maximizes the ability to reuse names via overloading to aggressively address the naming problem. 356 Both variables and routines may be overloaded, where selection is based on types, and number of returns (as in Ada~\cite{Ada}) and arguments. 363 Both variables and routines may be overloaded, where selection is based on number and types of returns and arguments (as in Ada~\cite{Ada}). 364 \newpage 365 \vspace*{-2\baselineskip}%??? 357 366 \begin{cquote} 358 \vspace*{-\baselineskip}%??? 367 \begin{cfa} 368 // selection based on type 369 \end{cfa} 359 370 \lstDeleteShortInline@% 360 \begin{cfa}361 // selection based on type362 \end{cfa}363 371 \begin{tabular}{@{}l@{\hspace{2\parindentlnth}}|@{\hspace{2\parindentlnth}}l@{}} 364 372 \begin{cfa} … … 411 419 Therefore, overloading eliminates long prefixes and other naming conventions to prevent name clashes. 412 420 As seen in Section~\ref{s:Concurrency}, routine @main@ is heavily overloaded. 413 For example, variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name:421 As another example, variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name: 414 422 \begin{cfa} 415 423 struct S { int `i`; int j; double m; } s; … … 464 472 \lstMakeShortInline@% 465 473 \end{cquote} 466 While concurrency does not use operator overloading directly, it provides an introduction for the syntax of constructors.467 474 468 475 … … 470 477 471 478 Object lifetime is a challenge in non-managed programming languages. 472 \CFA responds with \CC-like constructors and destructors :479 \CFA responds with \CC-like constructors and destructors, using a different operator-overloading syntax. 
473 480 \begin{cfa} 474 481 struct VLA { int len, * data; }; $\C{// variable length array of integers}$ … … 490 497 Like \CC, construction is implicit on allocation (stack/heap) and destruction is implicit on deallocation. 491 498 The object and all their fields are constructed/destructed. 492 \CFA also provides @new@ and @delete@ , which behave like @malloc@ and @free@, in addition to constructing and destructing objects:499 \CFA also provides @new@ and @delete@ as library routines, which behave like @malloc@ and @free@, in addition to constructing and destructing objects: 493 500 \begin{cfa} 494 501 { … … 518 525 int i = sum( sa, 5 ); $\C{// use S's 0 construction and +}$ 519 526 \end{cfa} 520 The builtin type @zero_t@ (and @one_t@) overload constant 0 (and 1) for a new types, where both 0 and 1 have special meaning in C. 527 Type variables can be @otype@ or @dtype@. 528 @otype@ refers to a \emph{complete type}, \ie, a type with size, alignment, default constructor, copy constructor, destructor, and assignment operator. 529 @dtype@ refers to an \emph{incomplete type}, \eg, void or a forward-declared type. 530 The builtin types @zero_t@ and @one_t@ overload constant 0 and 1 for a new types, where both 0 and 1 have special meaning in C. 
521 531 522 532 \CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each routine declaration: … … 533 543 \end{cfa} 534 544 535 Assertions can be @otype@ or @dtype@.536 @otype@ refers to a \emph{complete} object, \ie an object has a size, alignment, default constructor, copy constructor, destructor and an assignment operator.537 @dtype@ refers to an \emph{incomplete} object, \ie, an object only has a size and alignment.538 539 545 Using the return type for overload discrimination, it is possible to write a type-safe @alloc@ based on the C @malloc@: 540 546 \begin{cfa} … … 554 560 In coroutining, the single thread is self-scheduling across the stacks, so execution is deterministic, \ie the execution path from input to output is fixed and predictable. 555 561 A \newterm{stackless} coroutine executes on the caller's stack~\cite{Python} but this approach is restrictive, \eg preventing modularization and supporting only iterator/generator-style programming; 556 a \newterm{stackful l} coroutine executes on its own stack, allowing full generality.557 Only stackful lcoroutines are a stepping stone to concurrency.562 a \newterm{stackful} coroutine executes on its own stack, allowing full generality. 563 Only stackful coroutines are a stepping stone to concurrency. 558 564 559 565 The transition to concurrency, even for execution with a single thread and multiple stacks, occurs when coroutines also context switch to a \newterm{scheduling oracle}, introducing non-determinism from the coroutine perspective~\cite[\S~3]{Buhr05a}. … … 561 567 The resulting execution system now follows a cooperative threading-model, called \newterm{non-preemptive scheduling}. 562 568 563 Because the scheduler is special, it can either be a stackless or stackful lcoroutine.569 Because the scheduler is special, it can either be a stackless or stackful coroutine. 
564 570 For stackless, the scheduler performs scheduling on the stack of the current coroutine and switches directly to the next coroutine, so there is one context switch. 565 For stackful l, the current coroutine switches to the scheduler, which performs scheduling, and it then switches to the next coroutine, so there are two context switches.566 A stackful l scheduler is often used for simplicity and security, even through there is a slightly higher runtime-cost.571 For stackful, the current coroutine switches to the scheduler, which performs scheduling, and it then switches to the next coroutine, so there are two context switches. 572 A stackful scheduler is often used for simplicity and security. 567 573 568 574 Regardless of the approach used, a subset of concurrency related challenges start to appear. 569 575 For the complete set of concurrency challenges to occur, the missing feature is \newterm{preemption}, where context switching occurs randomly between any two instructions, often based on a timer interrupt, called \newterm{preemptive scheduling}. 570 While a scheduler introduces uncertainty in the order of execution, preemption introduces uncertainty where context switches occur.576 While a scheduler introduces uncertainty in the order of execution, preemption introduces uncertainty about where context switches occur. 571 577 Interestingly, uncertainty is necessary for the runtime (operating) system to give the illusion of parallelism on a single processor and increase performance on multiple processors. 572 578 The reason is that only the runtime has complete knowledge about resources and how to best utilized them. … … 574 580 otherwise, it is impossible to write meaningful programs. 575 581 Optimal performance in concurrent applications is often obtained by having as much non-determinism as correctness allows. 
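The stackful context switching described above can be sketched in C with the POSIX `ucontext` primitives (deprecated in later POSIX revisions but still widely available), where each resume/suspend is an explicit `swapcontext` between the resumer's stack and the coroutine's own stack. This is only an illustrative sketch of the mechanism, not the \CFA runtime's implementation:

```c
#include <ucontext.h>

// Minimal stackful-coroutine sketch: the "coroutine" body runs on its
// own stack and each swapcontext is one suspend or resume.
static ucontext_t resumer, coro;
static int trace[6], n;

static void coro_main( void ) {
    for ( int i = 0; i < 3; i += 1 ) {
        trace[n++] = 10 + i;                 // work on the coroutine's stack
        swapcontext( &coro, &resumer );      // suspend back to the resumer
    }
}

int run_demo( void ) {
    static char stack[64 * 1024];            // the coroutine's private stack
    n = 0;
    getcontext( &coro );
    coro.uc_stack.ss_sp = stack;
    coro.uc_stack.ss_size = sizeof(stack);
    coro.uc_link = &resumer;                 // where control goes if main ends
    makecontext( &coro, coro_main, 0 );
    for ( int i = 0; i < 3; i += 1 ) {
        swapcontext( &resumer, &coro );      // resume the coroutine
        trace[n++] = i;                      // interleaved work on this side
    }
    return n;                                // six interleaved steps
}
```

The trace interleaves perfectly (`10, 0, 11, 1, 12, 2`), showing execution ping-ponging between the two stacks with one context switch per hand-off.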
576 577 578 \subsection{\protect\CFA's Thread Building Blocks}579 582 580 583 An important missing feature in C is threading\footnote{While the C11 standard defines a \protect\lstinline@threads.h@ header, it is minimal and defined as optional. … … 590 593 591 594 While the focus of this discussion is concurrency and parallelism, it is important to address coroutines, which are a significant building block of a concurrency system (but not concurrent among themselves). 592 Coroutines are generalized routines allowing execution to be temporarily suspend and later resumed.595 Coroutines are generalized routines allowing execution to be temporarily suspended and later resumed. 593 596 Hence, unlike a normal routine, a coroutine may not terminate when it returns to its caller, allowing it to be restarted with the values and execution location present at the point of suspension. 594 This capability is accomplish via the coroutine's stack, where suspend/resume context switch among stacks.597 This capability is accomplished via the coroutine's stack, where suspend/resume context switch among stacks. 595 598 Because threading design-challenges are present in coroutines, their design effort is relevant, and this effort can be easily exposed to programmers giving them a useful new programming paradigm because a coroutine handles the class of problems that need to retain state between calls, \eg plugins, device drivers, and finite-state machines. 596 Therefore, the core \CFA coroutine-API for has two fundamental features:independent call-stacks and @suspend@/@resume@ operations.599 Therefore, the two fundamental features of the core \CFA coroutine-API are independent call-stacks and @suspend@/@resume@ operations. 
597 600 598 601 For example, a problem made easier with coroutines is unbounded generators, \eg generating an infinite sequence of Fibonacci numbers … … 722 725 Figure~\ref{f:Coroutine3States} creates a @coroutine@ type, @`coroutine` Fib { int fn; }@, which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface routines, \eg @next@. 723 726 Like the structure in Figure~\ref{f:ExternalState}, the coroutine type allows multiple instances, where instances of this type are passed to the (overloaded) coroutine main. 724 The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's @resume@.727 The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code represents the three states in the Fibonacci formula via the three suspend points, to context switch back to the caller's @resume@. 725 728 The interface routine @next@, takes a Fibonacci instance and context switches to it using @resume@; 726 729 on restart, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned. … … 747 750 \end{quote} 748 751 The example takes advantage of resuming a coroutine in the constructor to prime the loops so the first character sent for formatting appears inside the nested loops. 749 The destruct ionprovides a newline, if formatted text ends with a full line.750 Figure~\ref{f:CFmt} shows the C equivalent formatter, where the loops of the coroutine are flatten (linearized) and rechecked on each call because execution location is not retained between calls.752 The destructor provides a newline, if formatted text ends with a full line. 
753 Figure~\ref{f:CFmt} shows the C equivalent formatter, where the loops of the coroutine are flattened (linearized) and rechecked on each call because execution location is not retained between calls. 751 754 (Linearized code is the bane of device drivers.) 752 755 … … 761 764 }; 762 765 void main( Format & fmt ) with( fmt ) { 763 for ( ;; ) { 766 for ( ;; ) { 764 767 for ( g = 0; g < 5; g += 1 ) { // group 765 768 for ( b = 0; b < 4; b += 1 ) { // block … … 835 838 The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming routine for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal routines. 836 839 However, @resume@ and @suspend@ context switch among existing stack-frames, rather than create new ones so there is no stack growth. 837 \newterm{Symmetric (full) coroutine}s have a coroutine call to a resuming routine for another coroutine, and its coroutine main has a call toanother resuming routine, which eventually forms a resuming-call cycle.840 \newterm{Symmetric (full) coroutine}s have a coroutine call to a resuming routine for another coroutine, and its coroutine main calls another resuming routine, which eventually forms a resuming-call cycle. 838 841 (The trivial cycle is a coroutine resuming itself.) 839 842 This control flow is similar to recursion for normal routines, but again there is no stack growth from the context switch. … … 926 929 The call from the consumer to @payment@ introduces the cycle between producer and consumer. 927 930 When @payment@ is called, the consumer copies values into the producer's communication variable and a resume is executed. 928 The context switch restarts the producer at the point where it waslast context switched, so it continues in @delivery@ after the resume.931 The context switch restarts the producer at the point where it last context switched, so it continues in @delivery@ after the resume. 
929 932 930 933 @delivery@ returns the status value in @prod@'s coroutine main, where the status is printed. … … 945 948 \subsection{Coroutine Implementation} 946 949 947 A significant implementation challenge for coroutines (and threads, see section\ref{threads}) is adding extra fields and executing code after/before the coroutine constructor/destructor and coroutine main to create/initialize/de-initialize/destroy extra fields and the stack.950 A significant implementation challenge for coroutines (and threads, see Section~\ref{threads}) is adding extra fields and executing code after/before the coroutine constructor/destructor and coroutine main to create/initialize/de-initialize/destroy extra fields and the stack. 948 951 There are several solutions to this problem and the chosen option forced the \CFA coroutine design. 949 952 … … 953 956 \end{cfa} 954 957 and the programming language (and possibly its tool set, \eg debugger) may need to understand @baseCoroutine@ because of the stack. 955 Furthermore, the execution of construct s/destructors is in the wrong order for certain operations.958 Furthermore, the execution of constructors/destructors is in the wrong order for certain operations. 956 959 For example, for threads if the thread is implicitly started, it must start \emph{after} all constructors, because the thread relies on a completely initialized object, but the inherited constructor runs \emph{before} the derived. 957 960 958 An alternative lyis composition:961 An alternative is composition: 959 962 \begin{cfa} 960 963 struct mycoroutine { … … 1004 1007 Users wanting to extend coroutines or build their own for various reasons can only do so in ways offered by the language. 1005 1008 Furthermore, implementing coroutines without language supports also displays the power of a programming language. 
1006 While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed without using thelanguage support.1007 The reserved keyword simply eases use for the common case s.1008 1009 Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait is used to restrictcoroutine-manipulation routines:1009 While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed without language support. 1010 The reserved keyword simply eases use for the common case. 1011 1012 Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait restricts the available set of coroutine-manipulation routines: 1010 1013 \begin{cfa} 1011 1014 trait is_coroutine( `dtype` T ) { … … 1016 1019 forall( `dtype` T | is_coroutine(T) ) void resume( T & ); 1017 1020 \end{cfa} 1018 The @dtype@ property of the trait ensures the coroutine descriptor is non-copyable, so all coroutines must be passed by reference (pointer).1021 The @dtype@ property provides no implicit copying operations and the @is_coroutine@ trait provides no explicit copying operations, so all coroutines must be passed by reference (pointer). 1019 1022 The routine definitions ensures there is a statically-typed @main@ routine that is the starting point (first stack frame) of a coroutine, and a mechanism to get (read) the currently executing coroutine handle. 1020 1023 The @main@ routine has no return value or additional parameters because the coroutine type allows an arbitrary number of interface routines with corresponding arbitrary typed input/output values versus fixed ones. 
1021 The generic routines @suspend@ and @resume@ can be redefined, but any object passed to them is a coroutine since it must satisfy the @is_coroutine@ trait to compile.1022 1024 The advantage of this approach is that users can easily create different types of coroutines, \eg changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine, and possibly redefining @suspend@ and @resume@. 1023 1025 The \CFA keyword @coroutine@ implicitly implements the getter and forward declarations required for implementing the coroutine main: … … 1150 1152 \begin{cfa} 1151 1153 int main() { 1152 MyThread * heapLive d;1154 MyThread * heapLive; 1153 1155 { 1154 MyThread blockLive d;$\C{// fork block-based thread}$1155 heapLive d= `new`( MyThread ); $\C{// fork heap-based thread}$1156 MyThread blockLive; $\C{// fork block-based thread}$ 1157 heapLive = `new`( MyThread ); $\C{// fork heap-based thread}$ 1156 1158 ... 1157 1159 } $\C{// join block-based thread}$ 1158 1160 ... 1159 `delete`( heapLive d); $\C{// join heap-based thread}$1161 `delete`( heapLive ); $\C{// join heap-based thread}$ 1160 1162 } 1161 1163 \end{cfa} 1162 1164 The heap-based approach allows arbitrary thread-creation topologies, with respect to fork/join-style concurrency. 1163 1165 1164 Figure~\ref{s:ConcurrentMatrixSummation} shows concurrently adding the rows of a matrix and then totalling the subtotals sequential , after all the row threads have terminated.1166 Figure~\ref{s:ConcurrentMatrixSummation} shows concurrently adding the rows of a matrix and then totalling the subtotals sequentially, after all the row threads have terminated. 1165 1167 The program uses heap-based threads because each thread needs different constructor values. 1166 1168 (Python provides a simple iteration mechanism to initialize array elements to different values allowing stack allocation.) 
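The matrix-summation pattern maps directly onto raw pthreads, where `pthread_join` provides the synchronization that `delete` provides implicitly in the \CFA version; a sketch with fixed dimensions for brevity (names hypothetical):

```c
#include <pthread.h>

// pthreads sketch of the matrix summation: one thread per row; join in
// linear order provides the synchronization before totalling subtotals.
#define ROWS 4
#define COLS 3

typedef struct { int * row; int cols; long subtotal; } Adder;

static void * adder_main( void * arg ) {      // per-row thread body
    Adder * a = arg;
    a->subtotal = 0;
    for ( int c = 0; c < a->cols; c += 1 ) a->subtotal += a->row[c];
    return NULL;
}

long sum_matrix( int matrix[ROWS][COLS] ) {
    pthread_t ids[ROWS];
    Adder adders[ROWS];
    for ( int r = 0; r < ROWS; r += 1 ) {     // fork one thread per row
        adders[r] = (Adder){ matrix[r], COLS, 0 };
        pthread_create( &ids[r], NULL, adder_main, &adders[r] );
    }
    long total = 0;
    for ( int r = 0; r < ROWS; r += 1 ) {     // join, then total subtotals
        pthread_join( ids[r], NULL );
        total += adders[r].subtotal;
    }
    return total;
}
```

As in the \CFA figure, joining in linear rather than completion order costs little because the computation is bounded by the critical-path thread.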
1167 1169 The allocation/deallocation pattern appears unusual because allocated objects are immediately deallocated without any intervening code. 1168 1170 However, for threads, the deletion provides implicit synchronization, which is the intervening code. 1169 While the subtotals are added in linear order rather than completion order, which slight inhibits concurrency, the computation is restricted by the critical-path thread (\ie the thread that takes the longest), and so any inhibited concurrency is very small as totalling the subtotals is trivial.1171 While the subtotals are added in linear order rather than completion order, which slightly inhibits concurrency, the computation is restricted by the critical-path thread (\ie the thread that takes the longest), and so any inhibited concurrency is very small as totalling the subtotals is trivial. 1170 1172 1171 1173 \begin{figure} 1172 1174 \begin{cfa} 1173 thread Adder { 1174 int * row, cols, & subtotal; $\C{// communication}$ 1175 }; 1175 `thread` Adder { int * row, cols, & subtotal; } $\C{// communication variables}$ 1176 1176 void ?{}( Adder & adder, int row[], int cols, int & subtotal ) { 1177 1177 adder.[ row, cols, &subtotal ] = [ row, cols, &subtotal ]; … … 1187 1187 Adder * adders[rows]; 1188 1188 for ( int r = 0; r < rows; r += 1 ) { $\C{// start threads to sum rows}$ 1189 adders[r] = new( matrix[r], cols, &subtotals[r] );1189 adders[r] = `new( matrix[r], cols, &subtotals[r] );` 1190 1190 } 1191 1191 for ( int r = 0; r < rows; r += 1 ) { $\C{// wait for threads to finish}$ 1192 delete( adders[r] );$\C{// termination join}$1192 `delete( adders[r] );` $\C{// termination join}$ 1193 1193 total += subtotals[r]; $\C{// total subtotal}$ 1194 1194 } … … 1206 1206 To reestablish meaningful execution requires mechanisms to reintroduce determinism, \ie restrict non-determinism, called mutual exclusion and synchronization, where mutual exclusion is an access-control mechanism on data shared by threads, and 
synchronization is a timing relationship among threads~\cite[\S~4]{Buhr05a}. 1207 1207 Since many deterministic challenges appear with the use of mutable shared state, some languages/libraries disallow it, \eg Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka~\cite{Akka} (Scala). 1208 In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely relate to networking concepts, \eg channels~\cite{CSP,Go}.1208 In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely related to networking concepts, \eg channels~\cite{CSP,Go}. 1209 1209 However, in call/return-based languages, these approaches force a clear distinction, \ie introduce a new programming paradigm between regular and concurrent computation, \eg routine call versus message passing. 1210 1210 Hence, a programmer must learn and manipulate two sets of design patterns. 1211 1211 While this distinction can be hidden away in library code, effective use of the library still has to take both paradigms into account. 1212 In contrast, approaches based on stateful lmodels more closely resemble the standard call/return programming-model, resulting in a single programming paradigm.1212 In contrast, approaches based on stateful models more closely resemble the standard call/return programming-model, resulting in a single programming paradigm. 1213 1213 1214 1214 At the lowest level, concurrent control is implemented by atomic operations, upon which different kinds of locking mechanisms are constructed, \eg semaphores~\cite{Dijkstra68b}, barriers, and path expressions~\cite{Campbell74}. 1215 1215 However, for productivity it is always desirable to use the highest-level construct that provides the necessary efficiency~\cite{Hochstein05}. 
1216 1216 A newer approach for restricting non-determinism is transactional memory~\cite{Herlihy93}. 1217 While this approach is pursued in hardware~\cite{Nakaike15} and system languages, like \CC~\cite{Cpp-Transactions}, the performance and feature set is still too restrictive to be the main concurrency paradigm for system languages, which is why it was rejected as the core paradigm for concurrency in \CFA.1217 While this approach is pursued in hardware~\cite{Nakaike15} and system languages, like \CC~\cite{Cpp-Transactions}, the performance and feature set is still too restrictive to be the main concurrency paradigm for system languages, which is why it is rejected as the core paradigm for concurrency in \CFA. 1218 1218 1219 1219 One of the most natural, elegant, and efficient mechanisms for mutual exclusion and synchronization for shared-memory systems is the \emph{monitor}. … … 1244 1244 higher-level mechanisms often simplify usage by adding better coupling between synchronization and data, \eg message passing, or offering a simpler solution to otherwise involved challenges, \eg barrier lock. 1245 1245 Often synchronization is used to order access to a critical section, \eg ensuring a reader thread is the next kind of thread to enter a critical section. 1246 If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, that reader has\newterm{barged}.1246 If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, that reader \newterm{barged}. 1247 1247 Barging can result in staleness/freshness problems, where a reader barges ahead of a writer and reads temporally stale data, or a writer barges ahead of another writer overwriting data with a fresh value preventing the previous value from ever being read (lost computation). 
1248 1248 Preventing or detecting barging is an involved challenge with low-level locks, which can be made much easier by higher-level constructs. … … 1250 1250 Algorithms that allow a barger, but divert it until later using current synchronization state (flags), are avoiding the barger; 1251 1251 algorithms that preclude a barger from entering during synchronization in the critical section prevent barging completely. 1252 Techniques like baton-pass locks~\cite{Andrews89} between threads instead of unconditionally releasing locks is an example of barging prevention.1252 Techniques like baton-passing locks~\cite{Andrews89} between threads instead of unconditionally releasing locks is an example of barging prevention. 1253 1253 1254 1254 … … 1263 1263 Copying a lock is insecure because it is possible to copy an open lock and then use the open copy when the original lock is closed to simultaneously access the shared data. 1264 1264 Copying a monitor is secure because both the lock and shared data are copies, but copying the shared data is meaningless because it no longer represents a unique entity. 1265 As for coroutines/tasks, a non-copyable (@dtype@) trait is used to capture this requirement, so all locks/monitors must be passed by reference (pointer).1265 As for coroutines/tasks, the @dtype@ property provides no implicit copying operations and the @is_monitor@ trait provides no explicit copying operations, so all locks/monitors must be passed by reference (pointer). 1266 1266 \begin{cfa} 1267 1267 trait is_monitor( `dtype` T ) { … … 1275 1275 \label{s:MutexAcquisition} 1276 1276 1277 While correctness impli citly implies a monitor's mutual exclusion is acquired and released, there are implementation options about when and where the locking/unlocking occurs.1277 While correctness implies a monitor's mutual exclusion is acquired and released, there are implementation options about when and where the locking/unlocking occurs. 
1278 1278 (Much of this discussion also applies to basic locks.) 1279 1279 For example, a monitor may need to be passed through multiple helper routines before it becomes necessary to acquire the monitor mutual-exclusion. … … 1301 1301 1302 1302 For maximum usability, monitors have \newterm{multi-acquire} semantics allowing a thread to acquire it multiple times without deadlock. 1303 For example, atomically printing the contents of a binary tree: 1304 \begin{cfa} 1305 monitor Tree { 1306 Tree * left, right; 1307 // value 1308 }; 1309 void print( Tree & mutex tree ) { $\C{// prefix traversal}$ 1310 // write value 1311 print( tree->left ); $\C{// multiply acquire monitor lock on each recursion}$ 1312 print( tree->right ); 1313 } 1314 \end{cfa} 1315 1316 Mandatory monitor qualifiers have the benefit of being self-documenting, but requiring both @mutex@ and \lstinline[morekeywords=nomutex]@nomutex@ for all monitor parameter is redundant. 1317 Instead, one of qualifier semantics can be the default, and the other required. 1318 For example, assume the safe @mutex@ qualifier for all monitor parameters because assuming \lstinline[morekeywords=nomutex]@nomutex@ may cause subtle errors. 1319 On the other hand, assuming \lstinline[morekeywords=nomutex]@nomutex@ is the \emph{normal} parameter behaviour, stating explicitly \emph{this parameter is not special}. 1303 \begin{cfa} 1304 monitor M { ... } m; 1305 void foo( M & mutex m ) { ... } $\C{// acquire mutual exclusion}$ 1306 void bar( M & mutex m ) { $\C{// acquire mutual exclusion}$ 1307 ... `foo( m );` ... $\C{// reacquire mutual exclusion}$ 1308 } 1309 `bar( m );` $\C{// nested monitor call}$ 1310 \end{cfa} 1311 1312 The benefit of mandatory monitor qualifiers is self-documentation, but requiring both @mutex@ and \lstinline[morekeywords=nomutex]@nomutex@ for all monitor parameters is redundant. 1313 Instead, the semantics have one qualifier as the default, and the other required. 
1314 For example, make the safe @mutex@ qualifier the default because assuming \lstinline[morekeywords=nomutex]@nomutex@ may cause subtle errors. 1315 Alternatively, make the unsafe \lstinline[morekeywords=nomutex]@nomutex@ qualifier the default because it is the \emph{normal} parameter semantics while @mutex@ parameters are rare. 1320 1316 Providing a default qualifier implies knowing whether a parameter is a monitor. 1321 1317 Since \CFA relies heavily on traits as an abstraction mechanism, the distinction between a type that is a monitor and a type that looks like a monitor can become blurred. … … 1340 1336 1341 1337 To make the issue tractable, \CFA only acquires one monitor per parameter with at most one level of indirection. 1342 However, the C type-system has an ambiguitywith respects to arrays. 1338 However, there is an ambiguity in the C type-system with respect to arrays. 1343 1339 Is the argument for @f2@ a single object or an array of objects? 1344 1340 If it is an array, only the first element of the array is acquired, which seems unsafe; … … 1355 1351 1356 1352 For object-oriented monitors, calling a mutex member \emph{implicitly} acquires mutual exclusion of the receiver object, @`rec`.foo(...)@. 1357 \CFA has no receiver, and hence, must use an explicit mechanism to specify which object has mutual exclusion acquired. 1353 \CFA has no receiver, and hence, must use an explicit mechanism to specify which object acquires mutual exclusion. 1358 1354 A positive consequence of this design decision is the ability to support multi-monitor routines. 1359 1355 \begin{cfa} … … 1366 1362 Having language support for such a feature is therefore a significant asset for \CFA.
1367 1363 1368 The capability to acquire multiple locks before entering a critical section is called \newterm{bulk acquire} .1364 The capability to acquire multiple locks before entering a critical section is called \newterm{bulk acquire} (see Section~\ref{s:Implementation} for implementation details). 1369 1365 In the previous example, \CFA guarantees the order of acquisition is consistent across calls to different routines using the same monitors as arguments. 1370 1366 This consistent ordering means acquiring multiple monitors is safe from deadlock. … … 1384 1380 1385 1381 However, such use leads to lock acquiring order problems resulting in deadlock~\cite{Lister77}, where detecting it requires dynamic tracking of monitor calls, and dealing with it requires rollback semantics~\cite{Dice10}. 1386 In \CFA, safety is guaranteed by using bulk acquire of all monitors to shared objects, whereas other monitor systems provide no aid.1387 While \CFA provides only a partial solution, the \CFA partial solution handles many useful cases.1388 \begin{cfa} 1389 monitor Bank { ... };1390 void deposit( Bank & `mutex` b, int deposit );1391 void transfer( Bank & `mutex` mybank, Bank & `mutex` yourbank, int me2you) {1392 deposit( my bank, `-`me2you );$\C{// debit}$1393 deposit( your bank, me2you );$\C{// credit}$1382 In \CFA, a safety aid is provided by using bulk acquire of all monitors to shared objects, whereas other monitor systems provide no aid. 1383 While \CFA provides only a partial solution, it handles many useful cases, \eg: 1384 \begin{cfa} 1385 monitor BankAccount { ... 
}; 1386 void deposit( BankAccount & `mutex` b, int deposit ); 1387 void transfer( BankAccount & `mutex` my, BankAccount & `mutex` your, int me2you ) { 1388 deposit( my, `-`me2you ); $\C{// debit}$ 1389 deposit( your, me2you ); $\C{// credit}$ 1394 1390 } 1395 1391 \end{cfa} … … 1398 1394 1399 1395 1400 \subsection{\protect\lstinline|mutex| statement} \label{mutex-stmt} 1396 \subsection{\protect\lstinline@mutex@ statement} 1397 \label{mutex-stmt} 1401 1398 1402 1399 The monitor call-semantics associate all locking semantics to routines. … … 1405 1402 \begin{tabular}{@{}l@{\hspace{3\parindentlnth}}l@{}} 1406 1403 \begin{cfa} 1407 monitor M { }; 1404 monitor M { ... }; 1408 1405 void foo( M & mutex m1, M & mutex m2 ) { 1409 1406 // critical section … … 1435 1432 For example, Figure~\ref{f:GenericBoundedBuffer} shows a bounded buffer that may be full/empty so producer/consumer threads must block. 1436 1433 Leaving the monitor and trying again (busy waiting) is impractical for high-level programming. 1437 Monitors eliminate busy waiting by providing internal synchronization to schedule threads needing access to the shared data, where the synchronization is blocking (threads are parked)versus spinning. 1434 Monitors eliminate busy waiting by providing synchronization to schedule threads needing access to the shared data, where threads block versus spinning. 1438 1435 Synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} defines which thread acquires the critical section next. 1439 1436 \newterm{Internal scheduling} is characterized by each thread entering the monitor and making an individual decision about proceeding or blocking, while \newterm{external scheduling} is characterized by an entering thread making a decision about proceeding for itself and on behalf of other threads attempting entry.
… … 1525 1522 \end{figure} 1526 1523 1527 Figure~\ref{f:BBExt} shows a \CFA generic bounded-buffer with external scheduling, where producers/consumers detecting a full/empty buffer block and prevent more producers/consumers from entering the monitor until the buffer has a free/empty slot. 1524 Figure~\ref{f:BBExt} shows a \CFA generic bounded-buffer with external scheduling, where producers/consumers detecting a full/empty buffer block and prevent more producers/consumers from entering the monitor until there is a free/empty slot in the buffer. 1528 1525 External scheduling is controlled by the @waitfor@ statement, which atomically blocks the calling thread, releases the monitor lock, and restricts the routine calls that can next acquire mutual exclusion. 1529 1526 If the buffer is full, only calls to @remove@ can acquire the buffer, and if the buffer is empty, only calls to @insert@ can acquire the buffer. 1530 Threads making calls to routines that are currently excluded, block outside (external) of the monitor on a calling queue, versus blocking on condition queues inside (internal) of the monitor. 1531 % External scheduling is more constrained and explicit, which helps programmers reduce the non-deterministic nature of concurrency. 1527 Threads making calls to routines that are currently excluded block outside of (external to) the monitor on a calling queue, versus blocking on condition queues inside of (internal to) the monitor. 1532 1528 External scheduling allows users to wait for events from other threads without concern of unrelated events occurring. 1533 1529 The mechanism can be done in terms of control flow, \eg Ada @accept@ or \uC @_Accept@, or in terms of data, \eg Go channels. … … 1546 1542 A thread blocks until an appropriate partner arrives. 1547 1543 The complexity is exchanging phone numbers in the monitor because of the mutual-exclusion property.
1548 For signal scheduling, the @exchange@ condition is necessary to block the thread finding the match, while the matcher unblocks to take the oppose number, post its phone number, and unblock the partner. 1544 For signal scheduling, the @exchange@ condition is necessary to block the thread finding the match, while the matcher unblocks to take the opposite number, post its phone number, and unblock the partner. 1549 1545 For signal-block scheduling, the implicit urgent-queue replaces the explicit @exchange@-condition and @signal_block@ puts the finding thread on the urgent condition and unblocks the matcher. 1550 1546 … … 1567 1563 wait( Girls[ccode] ); 1568 1564 GirlPhNo = phNo; 1569 ` exchange.signal();` 1565 `signal( exchange );` 1570 1566 } else { 1571 1567 GirlPhNo = phNo; 1572 1568 `signal( Boys[ccode] );` 1573 ` exchange.wait();` 1569 `wait( exchange );` 1574 1570 } // if 1575 1571 return BoyPhNo; … … 1646 1642 } 1647 1643 \end{cfa} 1648 must have acquired monitor locks that are greater than or equal to the number of locks for the waiting thread signalled from the condition queue. 1644 must have acquired at least the same locks as the waiting thread signalled from the condition queue. 1649 1645 1650 1646 Similarly, for @waitfor( rtn )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@. … … 1652 1648 @waitfor@ statically verifies the released monitors are the same as the acquired mutex-parameters of the given routine or routine pointer. 1653 1649 To statically verify the released monitors match with the accepted routine's mutex parameters, the routine (pointer) prototype must be accessible.
1657 As always, overloaded routines can be disambiguated using a cast: 1650 % When an overloaded routine appears in an @waitfor@ statement, calls to any routine with that name are accepted. 1651 % The rationale is that members with the same name should perform a similar function, and therefore, all should be eligible to accept a call. 1652 Overloaded routines can be disambiguated using a cast: 1658 1653 \begin{cfa} 1659 1654 void rtn( M & mutex m ); 1660 1655 `int` rtn( M & mutex m ); 1661 waitfor( (`int` (*)( M & mutex ))rtn, m 1, m2);1662 \end{cfa} 1663 1664 Given the ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.1656 waitfor( (`int` (*)( M & mutex ))rtn, m ); 1657 \end{cfa} 1658 1659 The ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock. 1665 1660 \begin{cfa} 1666 1661 void foo( M & mutex m1, M & mutex m2 ) { … … 1673 1668 1674 1669 Finally, an important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads? 1675 If barging is allowed, synchronization between a singller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met). 1670 If barging is allowed, synchronization between a signaller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met). 1671 In fact, signals-as-hints is completely opposite from that proposed by Hoare in the seminal paper on monitors: 1676 1672 \begin{quote} 1677 1673 However, we decree that a signal operation be followed immediately by resumption of a waiting program, without possibility of an intervening procedure call from yet a third program. … … 1679 1675 \end{quote} 1680 1676 \CFA scheduling \emph{precludes} barging, which simplifies synchronization among threads in the monitor and increases correctness. 
1677 Furthermore, \CFA concurrency has no spurious wakeup~\cite[\S~9]{Buhr05a}, which eliminates an implicit form of barging. 1681 1678 For example, there are no loops in either bounded buffer solution in Figure~\ref{f:GenericBoundedBuffer}. 1682 1679 Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency. 1683 Furthermore, \CFA concurrency has no superious wakeup~\cite[\S~9]{Buhr05a}, which eliminates an implict form of barging. 1684 1680 1685 1681 … … 1753 1749 1754 1750 One scheduling solution is for the signaller to keep ownership of all locks until the last lock is ready to be transferred, because this semantics fits most closely to the behaviour of single-monitor scheduling. 1755 However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting i s choices when the signaller finishes the inner mutex-statement. 1751 However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting in options when the signaller finishes the inner mutex-statement. 1756 The si ngaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@. 1752 The signaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@. 1757 1753 In the latter case, waiter W2 must eventually pass @m2@ to waiter W1, which is complex because W1 may have waited before W2, so W2 is unaware of it. 1758 1754 Furthermore, there is an execution sequence where the signaller always finds waiter W2, and hence, waiter W1 starves.
… … 1761 1757 Signalled threads are moved to the urgent queue and the waiter at the front defines the set of monitors necessary for it to unblock. 1762 1758 Partial signalling transfers ownership of monitors to the front waiter. 1763 When the signaller thread exits or waits in the monitor the front waiter is unblocked if all its monitors are released.1764 Th is solution has the benefit that complexity is encapsulatedinto only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.1759 When the signaller thread exits or waits in the monitor, the front waiter is unblocked if all its monitors are released. 1760 The benefit is encapsulating complexity into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met. 1765 1761 1766 1762 … … 1769 1765 1770 1766 In an object-oriented programming-language, a class includes an exhaustive list of operations. 1771 However, new members can be added via static inheritance or dyna ic members, \eg JavaScript~\cite{JavaScript}.1767 However, new members can be added via static inheritance or dynamic members, \eg JavaScript~\cite{JavaScript}. 1772 1768 Similarly, monitor routines can be added at any time in \CFA, making it less clear for programmers and more difficult to implement. 1773 1769 \begin{cfa} 1774 monitor M { };1770 monitor M { ... 
}; 1775 1771 void `f`( M & mutex m ); 1776 1772 void g( M & mutex m ) { waitfor( `f` ); } $\C{// clear which f}$ … … 1778 1774 void h( M & mutex m ) { waitfor( `f` ); } $\C{// unclear which f}$ 1779 1775 \end{cfa} 1780 Hence, the cfa-code for theentering a monitor looks like:1776 Hence, the cfa-code for entering a monitor looks like: 1781 1777 \begin{cfa} 1782 1778 if ( $\textrm{\textit{monitor is free}}$ ) $\LstCommentStyle{// \color{red}enter}$ … … 1818 1814 External scheduling, like internal scheduling, becomes significantly more complex for multi-monitor semantics. 1819 1815 Even in the simplest case, new semantics needs to be established. 1820 \begin{cfa} 1821 monitor M {}; 1816 \newpage 1817 \begin{cfa} 1818 monitor M { ... }; 1822 1819 void f( M & mutex m1 ); 1823 1820 void g( M & mutex m1, M & mutex m2 ) { … … 1829 1826 waitfor( f, m2 ); $\C{\color{red}// wait for call to f with argument m2}$ 1830 1827 \end{cfa} 1831 Routine @g@ has acquired both locks, so when routine @f@ is called, the lock for monitor @m2@ is passed from @g@ to @f@, while @g@ still holds lock @m1@.1828 Both locks are acquired by routine @g@, so when routine @f@ is called, the lock for monitor @m2@ is passed from @g@ to @f@, while @g@ still holds lock @m1@. 1832 1829 This behaviour can be extended to the multi-monitor @waitfor@ statement. 1833 1830 \begin{cfa} 1834 monitor M { };1831 monitor M { ... }; 1835 1832 void f( M & mutex m1, M & mutex m2 ); 1836 1833 void g( M & mutex m1, M & mutex m2 ) { … … 1839 1836 \end{cfa} 1840 1837 Again, the set of monitors passed to the @waitfor@ statement must be entirely contained in the set of monitors already acquired by the accepting routine. 1841 1842 Note, for internal and external scheduling with multiple monitors, a signalling or accepting thread must match exactly, \ie partial matching results in waiting. 1843 \begin{cquote} 1838 Also, the order of the monitors in a @waitfor@ statement is unimportant. 
1839 1840 Figure~\ref{f:UnmatchedMutexSets} shows an example where, for internal and external scheduling with multiple monitors, a signalling or accepting thread must match exactly, \ie partial matching results in waiting. 1841 For both examples, the set of monitors is disjoint so unblocking is impossible. 1842 1843 \begin{figure} 1844 1844 \lstDeleteShortInline@% 1845 1845 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}|@{\hspace{\parindentlnth}}l@{}} … … 1854 1854 wait( c ); 1855 1855 } 1856 g( `m11`, m2 ); // block on wait 1857 f( `m12`, m2 ); // cannot fulfil 1858 \end{cfa} 1859 & 1860 \begin{cfa} 1861 monitor M1 {} m11, m12; 1862 monitor M2 {} m2; 1863 1864 void f( M1 & mutex m1, M2 & mutex m2 ) { 1865 1866 } 1867 void g( M1 & mutex m1, M2 & mutex m2 ) { 1868 waitfor( f, m1, m2 ); 1869 } 1856 1870 g( `m11`, m2 ); // block on accept 1857 1871 f( `m12`, m2 ); // cannot fulfil 1858 1872 \end{cfa} 1859 &1860 \begin{cfa}1861 monitor M1 {} m11, m12;1862 monitor M2 {} m2;1863 1864 void f( M1 & mutex m1, M2 & mutex m2 ) {1865 1866 }1867 void g( M1 & mutex m1, M2 & mutex m2 ) {1868 waitfor( f, m1, m2 );1869 }1870 g( `m11`, m2 ); // block on accept1871 f( `m12`, m2 ); // cannot fulfil1872 \end{cfa}1873 1873 \end{tabular} 1874 1874 \lstMakeShortInline@% 1875 \end{cquote} 1876 For both internal and external scheduling, the set of monitors is disjoint so unblocking is impossible. 1875 \caption{Unmatched \protect\lstinline@mutex@ sets} 1876 \label{f:UnmatchedMutexSets} 1877 \end{figure} 1877 1878 1878 1879 1879 1880 \subsection{Extended \protect\lstinline@waitfor@} 1880 1881 1881 The extended form of the @waitfor@ statement conditionally accepts one of a group of mutex routines and allows a specific action to be performed \emph{after} the mutex routine finishes. 
1882 Figure~\ref{f:ExtendedWaitfor} shows the extended form of the @waitfor@ statement to conditionally accept one of a group of mutex routines, with a specific action to be performed \emph{after} the mutex routine finishes. 1883 For a @waitfor@ clause to be executed, its @when@ must be true and an outstanding call to its corresponding member(s) must exist. 1884 The \emph{conditional-expression} of a @when@ may call a routine, but the routine must not block or context switch. 1885 If there are multiple acceptable mutex calls, selection occurs top-to-bottom (prioritized) in the @waitfor@ clauses, whereas some programming languages with similar mechanisms accept non-deterministically for this case, \eg Go \lstinline[morekeywords=select]@select@. 1886 If some accept guards are true and there are no outstanding calls to these members, the acceptor is accept-blocked until a call to one of these members is made. 1887 If all the accept guards are false, the statement does nothing, unless there is a terminating @else@ clause with a true guard, which is executed instead. 1888 Hence, the terminating @else@ clause allows a conditional attempt to accept a call without blocking. 1889 If there is a @timeout@ clause, it provides an upper bound on waiting. 1890 If both a @timeout@ clause and an @else@ clause are present, the @else@ must be conditional, or the @timeout@ is never triggered. 1891 In all cases, the statement following is executed \emph{after} a clause is executed to know which of the clauses executed. 1892 1893 \begin{figure} 1882 1894 \begin{cfa} 1883 1895 `when` ( $\emph{conditional-expression}$ ) $\C{// optional guard}$ … … 1895 1907 $\emph{statement}$ $\C{// action when no immediate calls}$ 1896 1908 \end{cfa} 1897 For a @waitfor@ clause to be executed, its @when@ must be true and an outstanding call to its corresponding member(s) must exist. 1898 The \emph{conditional-expression} of a @when@ may call a routine, but the routine must not block or context switch.
1899 If there are several mutex calls that can be accepted, selection occurs top-to-bottom in the @waitfor@ clauses versus non-deterministically. 1900 If some accept guards are true and there are no outstanding calls to these members, the acceptor is accept-blocked until a call to one of these members is made. 1901 If all the accept guards are false, the statement does nothing, unless there is a terminating @else@ clause with a true guard, which is executed instead. 1902 Hence, the terminating @else@ clause allows a conditional attempt to accept a call without blocking. 1903 If there is a @timeout@ clause, it provides an upper bound on waiting, and can only appear with a conditional @else@, otherwise the timeout cannot be triggered. 1904 In all cases, the statement following is executed \emph{after} a clause is executed to know which of the clauses executed. 1909 \caption{Extended \protect\lstinline@waitfor@} 1910 \label{f:ExtendedWaitfor} 1911 \end{figure} 1905 1912 1906 1913 Note, a group of conditional @waitfor@ clauses is \emph{not} the same as a group of @if@ statements, e.g.: … … 1915 1922 \begin{cfa} 1916 1923 void insert( Buffer(T) & mutex buffer, T elem ) with( buffer ) { 1917 if ( count == BufferSize)1924 if ( count == 10 ) 1918 1925 waitfor( remove, buffer ) { 1919 elements[back] = elem; 1920 back = ( back + 1 ) % BufferSize; 1921 count += 1; 1926 // insert elem into buffer 1922 1927 } or `waitfor( ^?{}, buffer )` throw insertFail; 1923 1928 } … … 1925 1930 When the buffer is deallocated, the current waiter is unblocked and informed, so it can perform an appropriate action. 1926 1931 However, the basic @waitfor@ semantics do not support this functionality, since using an object after its destructor is called is undefined. 
1927 Therefore, to make this useful capability work, the semantics for accepting the destructor is the same as @signal@, \ie the call to the destructor is placed on the urgent queue and the acceptor continues execution, which throws an exception to the acceptor and then the caller is unbloc ed from the urgent queue to deallocate the object.1932 Therefore, to make this useful capability work, the semantics for accepting the destructor is the same as @signal@, \ie the call to the destructor is placed on the urgent queue and the acceptor continues execution, which throws an exception to the acceptor and then the caller is unblocked from the urgent queue to deallocate the object. 1928 1933 Accepting the destructor is an idiomatic way to terminate a thread in \CFA. 1929 1934 … … 1933 1938 Threads in \CFA are monitors to allow direct communication among threads, \ie threads can have mutex routines that are called by other threads. 1934 1939 Hence, all monitor features are available when using threads. 1935 Figure~\ref{f:pingpong} shows an example of two threads directly calling each other and accepting calls from each other in a cycle. 1936 Note, both ping/pong threads are globally declared, @pi@/@po@, and hence, start (and possibly complete) before the program main starts. 1937 1938 \begin{figure} 1939 \lstDeleteShortInline@% 1940 \begin{cquote} 1940 The following shows an example of two threads directly calling each other and accepting calls from each other in a cycle. 
1941 1941 \begin{cfa} 1942 1942 thread Ping {} pi; … … 1946 1946 int main() {} 1947 1947 \end{cfa} 1948 \vspace{-0.8\baselineskip} 1949 \begin{cquote} 1948 1950 \begin{tabular}{@{}l@{\hspace{3\parindentlnth}}l@{}} 1949 1951 \begin{cfa} … … 1965 1967 \end{cfa} 1966 1968 \end{tabular} 1967 \lstMakeShortInline@%1968 1969 \end{cquote} 1969 \caption{Threads ping/pong using external scheduling} 1970 \label{f:pingpong} 1971 \end{figure} 1970 % \lstMakeShortInline@% 1971 % \caption{Threads ping/pong using external scheduling} 1972 % \label{f:pingpong} 1973 % \end{figure} 1974 Note, the ping/pong threads are globally declared, @pi@/@po@, and hence, start (and possibly complete) before the program main starts. 1975 1976 1977 \subsection{Low-level Locks} 1978 1979 For completeness and efficiency, \CFA provides a standard set of low-level locks: recursive mutex, condition, semaphore, barrier, \etc, and atomic instructions: @fetchAssign@, @fetchAdd@, @testSet@, @compareSet@, \etc. 1980 1972 1981 1973 1982 \section{Parallelism} 1974 1983 1975 1984 Historically, computer performance was about processor speeds. 1976 However, with heat dissipation being a direct consequence of speed increase, parallelism has becomethe new source for increased performance~\cite{Sutter05, Sutter05b}.1985 However, with heat dissipation being a direct consequence of speed increase, parallelism is the new source for increased performance~\cite{Sutter05, Sutter05b}. 1977 1986 Now, high-performance applications must care about parallelism, which requires concurrency. 1978 1987 The lowest-level approach of parallelism is to use \newterm{kernel threads} in combination with semantics like @fork@, @join@, \etc. 
… … 1993 2002 \label{s:fibers} 1994 2003 1995 A variant of user thread is \newterm{fibers}, which removes preemption, \eg Go~\cite{Go} .1996 Like functional programming, which removes mutation and its associated problems, removing preemption from concurrency reduces nondeterminism, hence race and deadlock errors aremore difficult to generate.2004 A variant of user thread is \newterm{fibers}, which removes preemption, \eg Go~\cite{Go} @goroutine@s. 2005 Like functional programming, which removes mutation and its associated problems, removing preemption from concurrency reduces nondeterminism, making race and deadlock errors more difficult to generate. 1997 2006 However, preemption is necessary for concurrency that relies on spinning, so there are a class of problems that cannot be programmed without preemption. 1998 2007 … … 2000 2009 \subsection{Thread Pools} 2001 2010 2002 In contrast to direct threading is indirect \newterm{thread pools}, where small jobs (work units) are insert into a work pool for execution.2011 In contrast to direct threading is indirect \newterm{thread pools}, where small jobs (work units) are inserted into a work pool for execution. 2003 2012 If the jobs are dependent, \ie interact, there is an implicit/explicit dependency graph that ties them together. 2004 2013 While removing direct concurrency, and hence the amount of context switching, thread pools significantly limit the interaction that can occur among jobs. 2005 Indeed, jobs should not block because that also block the underlying thread, which effectively means the CPU utilization, and therefore throughput, suffers.2014 Indeed, jobs should not block because that also blocks the underlying thread, which effectively means the CPU utilization, and therefore throughput, suffers. 2006 2015 While it is possible to tune the thread pool with sufficient threads, it becomes difficult to obtain high throughput and good core utilization as job interaction increases. 
2007 2016 As well, concurrency errors return, which thread pools are supposed to mitigate. 2008 Intel's TBB library~\cite{TBB} is the gold standard for thread pools. 2009 2017 2010 2018 … … 2036 2044 The user cluster is created to contain the application user-threads. 2037 2045 Having all threads execute on the one cluster often maximizes utilization of processors, which minimizes runtime. 2038 However, because of limitations of the underlying operating system, heterogeneous hardware, or scheduling requirements (real-time), it is sometimes necessary to have multiple clusters. 2046 However, because of limitations of the underlying operating system, heterogeneous hardware, or scheduling requirements (real-time), multiple clusters are sometimes necessary. … … 2061 2069 2062 2070 \section{Implementation} 2071 \label{s:Implementation} 2063 2072 2064 2073 Currently, \CFA has fixed-sized stacks, where the stack size can be set at coroutine/thread creation but with no subsequent growth. 2065 2074 Schemes exist for dynamic stack-growth, such as stack copying and chained stacks. 2066 However, stack copying requires pointer adjustment to items on the stack, which is impossible without some form of garage collection. 2075 However, stack copying requires pointer adjustment to items on the stack, which is impossible without some form of garbage collection. 2067 2076 As well, chained stacks require all modules be recompiled to use this feature, which breaks backward compatibility with existing C libraries. 2068 2077 In the long term, it is likely C libraries will migrate to stack chaining to support concurrency, at only a minimal cost to sequential programs. … … 2074 2083 This storage is allocated at the base of a thread's stack before blocking, which means programmers must add a small amount of extra space for stacks.
2075 2084 2076 In \CFA, ordering of monitor acquisition relies on memory ordering to prevent deadlock~\cite{Havender68}, because all objects are guaranteed tohave distinct non-overlapping memory layouts, and mutual-exclusion for a monitor is only defined for its lifetime.2085 In \CFA, ordering of monitor acquisition relies on memory ordering to prevent deadlock~\cite{Havender68}, because all objects have distinct non-overlapping memory layouts, and mutual-exclusion for a monitor is only defined for its lifetime. 2077 2086 When a mutex call is made, pointers to the concerned monitors are aggregated into a variable-length array and sorted. 2078 This array persists for the entire duration of the mutual -exclusion and its ordering reused extensively.2079 2080 To improve performance and simplicity, context switching occur inside a routine call, so only callee-saved registers are copied onto the stack and then the stack register is switched;2087 This array persists for the entire duration of the mutual exclusion and is used extensively for synchronization operations. 2088 2089 To improve performance and simplicity, context switching occurs inside a routine call, so only callee-saved registers are copied onto the stack and then the stack register is switched; 2081 2090 the corresponding registers are then restored for the other context. 2082 2091 Note, the instruction pointer is untouched since the context switch is always inside the same routine. … … 2085 2094 This method is a 2-step context-switch and provides a clear distinction between user and kernel code, where scheduling and other system operations happen. 2086 2095 The alternative 1-step context-switch uses the \emph{from} thread's stack to schedule and then context-switches directly to the \emph{to} thread's stack. 
2087 Experimental results (not presented) show the performance difference between these two approaches is virtually equivalent, because the 1-step performance is dominated by a lock instruction to prevent a race condition.2096 Experimental results (not presented) show the performance of these two approaches is virtually equivalent, because both approaches are dominated by locking to prevent a race condition. 2088 2097 2089 2098 All kernel threads (@pthreads@) are created with a stack. … … 2093 2102 2094 2103 Finally, an important aspect for a complete threading system is preemption, which introduces extra non-determinism via transparent interleaving, rather than cooperation among threads for proper scheduling and processor fairness from long-running threads. 2095 Because preemption frequency is usually long , 1 millisecond, performance cost is negligible.2104 Because preemption frequency is usually long (1 millisecond), performance cost is negligible. 2096 2105 Preemption is normally handled by setting a count-down timer on each virtual processor. 2097 2106 When the timer expires, an interrupt is delivered, and the interrupt handler resets the count-down timer, and if the virtual processor is executing in user code, the signal handler performs a user-level context-switch, or if executing in the language runtime-kernel, the preemption is ignored or rolled forward to the point where the runtime kernel context switches back to user code. 2098 2107 Multiple signal handlers may be pending.
2099 When control eventually switches back to the signal handler, it returns normally, and execution continues in the interrupted user thread, even though the return from the signal handler may be on a different kernel thread than the one where the signal was delivered.2108 When control eventually switches back to the signal handler, it returns normally, and execution continues in the interrupted user thread, even though the return from the signal handler may be on a different kernel thread than the one where the signal is delivered. 2100 2109 The only issue with this approach is that signal masks from one kernel thread may be restored on another as part of returning from the signal handler; 2101 therefore, all virtual processors in a cluster need to have the same signal mask.2110 therefore, the same signal mask is required for all virtual processors in a cluster. 2102 2111 2103 2112 However, on current UNIX systems: … … 2157 2166 \end{cfa} 2158 2167 The method used to get time is @clock_gettime( CLOCK_REALTIME )@. 2159 Each benchmark is performed @N@ times, where @N@ varies depending on the benchmark, the total time is divided by @N@ to obtain the average time for a benchmark. 2168 Each benchmark is performed @N@ times, where @N@ varies depending on the benchmark; 2169 the total time is divided by @N@ to obtain the average time for a benchmark. 2160 2170 All omitted tests for other languages are functionally identical to the shown \CFA test. 
2161 2171 … … 2215 2225 \multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\ 2216 2226 \hline 2217 Kernel Thread & 241.5 & 243.86 & 5.08\\2218 \CFA Coroutine & 38 & 38 & 0\\2219 \CFA Thread & 10 3 & 102.96 & 2.96\\2220 \uC Coroutine & 4 6 & 45.86 & 0.35\\2221 \uC Thread & 98 & 99.11 & 1.42\\2222 Goroutine & 1 50 & 149.96 & 3.16\\2223 Java Thread & 289 & 290.68& 8.72 \\2227 Kernel Thread & 333.5 & 332.96 & 4.1 \\ 2228 \CFA Coroutine & 49 & 48.68 & 0.47 \\ 2229 \CFA Thread & 105 & 105.57 & 1.37 \\ 2230 \uC Coroutine & 44 & 44 & 0 \\ 2231 \uC Thread & 100 & 99.29 & 0.96 \\ 2232 Goroutine & 145 & 147.25 & 4.15 \\ 2233 Java Thread & 373.5 & 375.14 & 8.72 \\ 2224 2234 \hline 2225 2235 \end{tabular} … … 2230 2240 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}} 2231 2241 \begin{cfa} 2232 monitor M { } m1/*, m2, m3, m4*/;2242 monitor M { ... } m1/*, m2, m3, m4*/; 2233 2243 void __attribute__((noinline)) do_call( M & mutex m/*, m2, m3, m4*/ ) {} 2234 2244 int main() { … … 2250 2260 \multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\ 2251 2261 \hline 2252 C routine & 2 & 2& 0 \\2253 FetchAdd + FetchSub 2254 Pthreads Mutex Lock & 31 & 31.86 & 0.99\\2255 \uC @monitor@ member routine & 3 0 & 30& 0 \\2256 \CFA @mutex@ routine, 1 argument & 4 1 & 41.57 & 0.9\\2257 \CFA @mutex@ routine, 2 argument & 76 & 76.96 & 1.57\\2258 \CFA @mutex@ routine, 4 argument & 1 45 & 146.68 & 3.85\\2259 Java synchronized routine & 27 & 28.57 & 2.6\\2262 C routine & 2 & 2 & 0 \\ 2263 FetchAdd + FetchSub & 26 & 26 & 0 \\ 2264 Pthreads Mutex Lock & 31 & 31.71 & 0.97 \\ 2265 \uC @monitor@ member routine & 31 & 31 & 0 \\ 2266 \CFA @mutex@ routine, 1 argument & 46 & 46.68 & 0.93 \\ 2267 \CFA @mutex@ routine, 2 argument & 84 & 85.36 & 1.99 \\ 2268 \CFA @mutex@ routine, 4 argument & 158 & 161 & 4.22 \\ 2269 Java synchronized routine & 27.5 & 29.79 
& 2.93 \\ 2260 2270 \hline 2261 2271 \end{tabular} … … 2277 2287 Figure~\ref{f:int-sched} shows the code for \CFA, with results in Table~\ref{tab:int-sched}. 2278 2288 Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects. 2289 Java scheduling is significantly greater because the benchmark explicitly creates multiple threads in order to prevent the JIT from making the program sequential, \ie removing all locking. 2279 2290 2280 2291 \begin{figure} … … 2283 2294 volatile int go = 0; 2284 2295 condition c; 2285 monitor M { } m;2296 monitor M { ... } m; 2286 2297 void __attribute__((noinline)) do_call( M & mutex a1 ) { signal( c ); } 2287 2298 thread T {}; … … 2301 2312 } 2302 2313 \end{cfa} 2303 \captionof{figure}{\CFA Internal 2314 \captionof{figure}{\CFA Internal-scheduling benchmark} 2304 2315 \label{f:int-sched} 2305 2316 2306 2317 \centering 2307 \captionof{table}{Internal 2318 \captionof{table}{Internal-scheduling comparison (nanoseconds)} 2308 2319 \label{tab:int-sched} 2309 2320 \bigskip … … 2313 2324 \multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\ 2314 2325 \hline 2315 Pthreads Condition Variable & 5902.5 & 6093.29 & 714.78\\2316 \uC @signal@ & 32 2 & 323 & 3.36\\2317 \CFA @signal@, 1 @monitor@ & 3 52.5 & 353.11 & 3.66\\2318 \CFA @signal@, 2 @monitor@ & 4 30 & 430.29 & 8.97\\2319 \CFA @signal@, 4 @monitor@ & 594.5 & 606.57 & 18.33 \\2320 Java @notify@ & 1 3831.5 & 15698.21 & 4782.3\\2326 Pthreads Condition Variable & 6005 & 5681.43 & 835.45 \\ 2327 \uC @signal@ & 324 & 325.54 & 3.02 \\ 2328 \CFA @signal@, 1 @monitor@ & 368.5 & 370.61 & 4.77 \\ 2329 \CFA @signal@, 2 @monitor@ & 467 & 470.5 & 6.79 \\ 2330 \CFA @signal@, 4 @monitor@ & 700.5 & 702.46 & 7.23 \\ 2331 Java @notify@ & 15471 & 172511 & 5689 \\ 2321 2332 \hline 2322 2333 \end{tabular} … … 2334 2345 \begin{cfa} 2335 2346 volatile int go = 0; 2336 monitor M { } m;2347
monitor M { ... } m; 2337 2348 thread T {}; 2338 2349 void __attribute__((noinline)) do_call( M & mutex ) {} … … 2352 2363 } 2353 2364 \end{cfa} 2354 \captionof{figure}{\CFA external 2365 \captionof{figure}{\CFA external-scheduling benchmark} 2355 2366 \label{f:ext-sched} 2356 2367 2357 2368 \centering 2358 2369 2359 \captionof{table}{External 2370 \captionof{table}{External-scheduling comparison (nanoseconds)} 2360 2371 \label{tab:ext-sched} 2361 2372 \bigskip … … 2364 2375 \multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} &\multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\ 2365 2376 \hline 2366 \uC @_Accept@ & 35 0 & 350.61 & 3.11\\2367 \CFA @waitfor@, 1 @monitor@ & 35 8.5 & 358.36 & 3.82\\2368 \CFA @waitfor@, 2 @monitor@ & 4 22 & 426.79 & 7.95\\2369 \CFA @waitfor@, 4 @monitor@ & 579.5 & 585.46 & 11.25\\2377 \uC @_Accept@ & 358 & 359.11 & 2.53 \\ 2378 \CFA @waitfor@, 1 @monitor@ & 359 & 360.93 & 4.07 \\ 2379 \CFA @waitfor@, 2 @monitor@ & 450 & 449.39 & 6.62 \\ 2380 \CFA @waitfor@, 4 @monitor@ & 652 & 655.64 & 7.73 \\ 2370 2381 \hline 2371 2382 \end{tabular} … … 2383 2394 } 2384 2395 \end{cfa} 2385 \captionof{figure}{\CFA object 2396 \captionof{figure}{\CFA object-creation benchmark} 2386 2397 \label{f:creation} 2387 2398 … … 2396 2407 \multicolumn{1}{c|}{} & \multicolumn{1}{c|}{Median} & \multicolumn{1}{c|}{Average} & \multicolumn{1}{c|}{Std Dev} \\ 2397 2408 \hline 2398 Pthreads & 2 6996 & 26984.71 & 156.6\\2399 \CFA Coroutine Lazy & 6 & 5.71 & 0.45\\2400 \CFA Coroutine Eager & 708 & 706.68 & 4.82\\2401 \CFA Thread & 1173.5 & 1176.18 & 15.18\\2402 \uC Coroutine & 10 9 & 107.46 & 1.74\\2403 \uC Thread & 5 26 & 530.89 & 9.73\\2404 Goroutine & 2520.5 & 2530.93 & 61.56\\2405 Java Thread & 91114.5 & 92272.79 & 961.58\\2409 Pthreads & 28091 & 28073.39 & 163.1 \\ 2410 \CFA Coroutine Lazy & 6 & 6.07 & 0.26 \\ 2411 \CFA Coroutine Eager & 520 & 520.61 & 2.04 \\ 2412 \CFA Thread & 2032 & 2016.29 & 112.07 \\ 2413 \uC Coroutine & 106 & 107.36 & 1.47 \\ 2414 \uC 
Thread & 536.5 & 537.07 & 4.64 \\ 2415 Goroutine & 3103 & 3086.29 & 90.25 \\ 2416 Java Thread & 103416.5 & 103732.29 & 1137 \\ 2406 2417 \hline 2407 2418 \end{tabular} … … 2418 2429 \section{Conclusion} 2419 2430 2420 This paper demonstrate a concurrency API that is simple, efficient, and able to build higher-level concurrency features.2431 This paper demonstrates a concurrency API that is simple, efficient, and able to build higher-level concurrency features. 2421 2432 The approach provides concurrency based on a preemptive M:N user-level threading-system, executing in clusters, which encapsulate scheduling of work on multiple kernel threads providing parallelism. 2422 2433 The M:N model is judged to be efficient and provide greater flexibility than a 1:1 threading model. … … 2439 2450 However, no single scheduler is optimal for all workloads and therefore there is value in being able to change the scheduler for given programs. 2440 2451 One solution is to offer various tweaking options, allowing the scheduler to be adjusted to the requirements of the workload. 2441 However, to be truly flexible, it is necessary to have a pluggable scheduler.2452 However, to be truly flexible, a pluggable scheduler is necessary. 2442 2453 Currently, the \CFA pluggable scheduler is too simple to handle complex scheduling, \eg quality of service and real-time, where the scheduler must interact with mutex objects to deal with issues like priority inversion. 2443 2454 … … 2449 2460 At its core, non-blocking I/O is an operating-system level feature queuing IO operations, \eg network operations, and registering for notifications instead of waiting for requests to complete. 2450 2461 Current trends use asynchronous programming like callbacks, futures, and/or promises, \eg Node.js~\cite{NodeJs} for JavaScript, Spring MVC~\cite{SpringMVC} for Java, and Django~\cite{Django} for Python. 
2451 However, these solutions lead to code that is hard create, read and maintain.2462 However, these solutions lead to code that is hard to create, read and maintain. 2452 2463 A better approach is to tie non-blocking I/O into the concurrency system to provide ease of use with low overhead, \eg thread-per-connection web-services. 2453 2464 A non-blocking I/O library is currently under development for \CFA. … … 2456 2467 \label{futur:tools} 2457 2468 2458 While monitors offer a flexible and powerful concurrent for \CFA, other concurrency tools are also necessary for a complete multi-paradigm concurrency package.2469 While monitors offer flexible and powerful concurrency for \CFA, other concurrency tools are also necessary for a complete multi-paradigm concurrency package. 2459 2470 Examples of such tools can include futures and promises~\cite{promises}, executors and actors. 2460 2471 These additional features are useful when monitors offer a level of abstraction that is inadequate for certain tasks. 2472 As well, new \CFA extensions should make it possible to create a uniform interface for virtually all mutual exclusion, including monitors and low-level locks. 2461 2473 2462 2474 \paragraph{Implicit Threading} … … 2467 2479 The canonical example of implicit concurrency is concurrent nested @for@ loops, which are amenable to divide and conquer algorithms~\cite{uC++book}. 2468 2480 The \CFA language features should make it possible to develop a reasonable number of implicit concurrency mechanisms to solve basic HPC data-concurrency problems. 2469 However, implicit concurrency is a restrictive solution and has its limitations, so it can never replace explicit concurrent programming.2481 However, implicit concurrency is a restrictive solution with significant limitations, so it can never replace explicit concurrent programming. 2470 2482 2471 2483 -
doc/papers/concurrency/figures/ext_monitor.fig
rc653b37 re3b2474 89 89 4 1 -1 0 0 0 12 0.0000 2 135 525 3525 3675 shared\001 90 90 4 1 -1 0 0 0 12 0.0000 2 135 735 3525 3975 variables\001 91 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 1875 Y\00192 4 0 4 50 -1 0 11 0.0000 2 120 1 05 4150 2175 Z\00193 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 2475 X\00194 4 0 4 50 -1 0 11 0.0000 2 120 1 65 4150 2775 W\00191 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 1875 X\001 92 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 2175 Y\001 93 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 2475 Y\001 94 4 0 4 50 -1 0 11 0.0000 2 120 135 4150 2775 X\001 95 95 4 0 -1 0 0 3 12 0.0000 2 150 540 5025 4275 urgent\001 96 96 4 1 0 50 -1 0 11 0.0000 2 165 600 3150 3150 accepted\001 -
src/Parser/LinkageSpec.h
rc653b37 re3b2474 10 10 // Created On : Sat May 16 13:24:28 2015 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Sat Jul 22 09:32:16 201713 // Update Count : 1 412 // Last Modified On : Mon Jul 2 07:46:49 2018 13 // Update Count : 16 14 14 // 15 15 … … 22 22 namespace LinkageSpec { 23 23 // All linkage specs are some combination of these flags: 24 enum { 25 Mangle = 1 << 0, 26 Generate = 1 << 1, 27 Overrideable = 1 << 2, 28 Builtin = 1 << 3, 29 GccBuiltin = 1 << 4, 30 31 NoOfSpecs = 1 << 5, 32 }; 24 enum { Mangle = 1 << 0, Generate = 1 << 1, Overrideable = 1 << 2, Builtin = 1 << 3, GccBuiltin = 1 << 4, NoOfSpecs = 1 << 5, }; 33 25 34 26 union Spec { … … 42 34 }; 43 35 constexpr Spec( unsigned int val ) : val( val ) {} 44 constexpr Spec( Spec const & other ) : val( other.val ) {}36 constexpr Spec( Spec const & other ) : val( other.val ) {} 45 37 // Operators may go here. 46 38 // Supports == and != 47 constexpr operator unsigned int 39 constexpr operator unsigned int() const { return val; } 48 40 }; 49 41 -
src/Parser/parser.yy
rc653b37 re3b2474 10 10 // Created On : Sat Sep 1 20:22:55 2001 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Sun Jun 24 10:41:10201813 // Update Count : 3 58712 // Last Modified On : Mon Jul 2 20:23:14 2018 13 // Update Count : 3607 14 14 // 15 15 … … 112 112 for ( DeclarationNode *iter = declaration; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) { 113 113 iter->set_extension( true ); 114 } // for 115 } // distExt 116 117 void distQual( DeclarationNode * declaration, DeclarationNode * qualifiers ) { 118 // distribute qualifiers across all declarations in a distribution statemement 119 for ( DeclarationNode * iter = declaration; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) { 120 if ( isMangled( iter->linkage ) ) { // ignore extern "C" 121 iter->addQualifiers( qualifiers->clone() ); 122 } // if 114 123 } // for 115 124 } // distExt … … 1136 1145 1137 1146 waitfor: 1138 WAITFOR '(' identifier ')' 1139 { 1140 $$ = new ExpressionNode( new NameExpr( *$3 ) ); 1141 delete $3; 1142 } 1143 | WAITFOR '(' identifier ',' argument_expression_list ')' 1144 { 1145 $$ = new ExpressionNode( new NameExpr( *$3 ) ); 1146 $$->set_last( $5 ); 1147 delete $3; 1148 } 1147 WAITFOR '(' cast_expression ')' 1148 { $$ = $3; } 1149 | WAITFOR '(' cast_expression ',' argument_expression_list ')' 1150 { $$ = (ExpressionNode *)$3->set_last( $5 ); } 1149 1151 ; 1150 1152 … … 1163 1165 { $$ = build_waitfor_timeout( nullptr, $3, $1 ); } 1164 1166 // "else" must be conditional after timeout or timeout is never triggered (i.e., it is meaningless) 1167 | when_clause_opt timeout statement WOR ELSE statement 1168 { SemanticError( yylloc, "else clause must be conditional after timeout or timeout never triggered." 
); $$ = nullptr; } 1165 1169 | when_clause_opt timeout statement WOR when_clause ELSE statement 1166 1170 { $$ = build_waitfor_timeout( $2, $3, $1, $7, $5 ); } … … 2380 2384 '{' up external_definition_list_opt down '}' // CFA, namespace 2381 2385 { 2382 for ( DeclarationNode * iter = $5; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) { 2383 if ( isMangled( iter->linkage ) ) { // ignore extern "C" 2384 iter->addQualifiers( $1->clone() ); 2385 } // if 2386 } // for 2386 distQual( $5, $1 ); 2387 2387 xxx = false; 2388 2388 delete $1; … … 2391 2391 | declaration_qualifier_list 2392 2392 { 2393 if ( $1->type ->qualifiers.val ) { SemanticError( yylloc, "CV qualifiers cannot be distributed; only storage-class and forall qualifiers." ); }2394 if ( $1->type ->forall ) xxx = forall = true; // remember generic type2393 if ( $1->type && $1->type->qualifiers.val ) { SemanticError( yylloc, "CV qualifiers cannot be distributed; only storage-class and forall qualifiers." ); } 2394 if ( $1->type && $1->type->forall ) xxx = forall = true; // remember generic type 2395 2395 } 2396 2396 '{' up external_definition_list_opt down '}' // CFA, namespace 2397 2397 { 2398 for ( DeclarationNode * iter = $5; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) { 2399 if ( isMangled( iter->linkage ) ) { // ignore extern "C" 2400 iter->addQualifiers( $1->clone() ); 2401 } // if 2402 } // for 2398 distQual( $5, $1 ); 2403 2399 xxx = false; 2404 2400 delete $1; … … 2412 2408 '{' up external_definition_list_opt down '}' // CFA, namespace 2413 2409 { 2414 for ( DeclarationNode * iter = $6; iter != nullptr; iter = (DeclarationNode *)iter->get_next() ) { 2415 if ( isMangled( iter->linkage ) && isMangled( $2->linkage ) ) { // ignore extern "C" 2416 iter->addQualifiers( $1->clone() ); 2417 iter->addQualifiers( $2->clone() ); 2418 } // if 2419 } // for 2410 distQual( $6, $2 ); 2411 distQual( $6, $1 ); 2420 2412 xxx = false; 2421 2413 delete $1; -
src/ResolvExpr/Resolver.cc
rc653b37 re3b2474 583 583 // Make sure we don't widen any existing bindings 584 584 resultEnv.forbidWidening(); 585 585 586 586 // Find any unbound type variables 587 587 resultEnv.extractOpenVars( openVars ); … … 590 590 auto param_end = function->parameters.end(); 591 591 592 int n_mutex_ arg= 0;592 int n_mutex_param = 0; 593 593 594 594 // For every arguments of its set, check if it matches one of the parameter … … 600 600 // We ran out of parameters but still have arguments 601 601 // this function doesn't match 602 SemanticError( function, toString("candidate function not viable: too many mutex arguments, expected ", n_mutex_ arg, "\n" ));602 SemanticError( function, toString("candidate function not viable: too many mutex arguments, expected ", n_mutex_param, "\n" )); 603 603 } 604 604 605 n_mutex_ arg++;605 n_mutex_param++; 606 606 607 607 // Check if the argument matches the parameter type in the current scope … … 626 626 // Check if parameters are missing 627 627 if( advance_to_mutex( param, param_end ) ) { 628 do { 629 n_mutex_param++; 630 param++; 631 } while( advance_to_mutex( param, param_end ) ); 632 628 633 // We ran out of arguments but still have parameters left 629 634 // this function doesn't match 630 SemanticError( function, toString("candidate function not viable: too few mutex arguments, expected ", n_mutex_ arg, "\n" ));635 SemanticError( function, toString("candidate function not viable: too few mutex arguments, expected ", n_mutex_param, "\n" )); 631 636 } 632 637 -
src/libcfa/clock
rc653b37 re3b2474 10 10 // Created On : Thu Apr 12 14:36:06 2018 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Thu Apr 12 16:53:31 201813 // Update Count : 312 // Last Modified On : Mon Jul 2 21:40:01 2018 13 // Update Count : 7 14 14 // 15 15 … … 34 34 }; 35 35 36 static inline void resetClock( Clock & clk ) with( clk ) { 37 clocktype = CLOCK_REALTIME_COARSE; 38 } // Clock::resetClock 36 static inline { 37 void resetClock( Clock & clk ) with( clk ) { 38 clocktype = CLOCK_REALTIME_COARSE; 39 } // Clock::resetClock 39 40 40 static inlinevoid resetClock( Clock & clk, Duration adj ) with( clk ) {41 clocktype = -1;42 offset = adj + timezone`s;// timezone (global) is (UTC - local time) in seconds43 } // resetClock41 void resetClock( Clock & clk, Duration adj ) with( clk ) { 42 clocktype = -1; 43 offset = adj + __timezone`s; // timezone (global) is (UTC - local time) in seconds 44 } // resetClock 44 45 45 static inlinevoid ?{}( Clock & clk ) { resetClock( clk ); }46 static inlinevoid ?{}( Clock & clk, Duration adj ) { resetClock( clk, adj ); }46 void ?{}( Clock & clk ) { resetClock( clk ); } 47 void ?{}( Clock & clk, Duration adj ) { resetClock( clk, adj ); } 47 48 48 static inline Duration getRes() {49 struct timespec res;50 clock_getres( CLOCK_REALTIME_COARSE, &res );51 return ((int64_t)res.tv_sec * TIMEGRAN + res.tv_nsec)`ns;52 } // getRes49 Duration getResNsec() { 50 struct timespec res; 51 clock_getres( CLOCK_REALTIME, &res ); 52 return ((int64_t)res.tv_sec * TIMEGRAN + res.tv_nsec)`ns; 53 } // getRes 53 54 54 static inline Time getTimeNsec() { // with nanoseconds 55 timespec curr;56 clock_gettime( CLOCK_REALTIME_COARSE, &curr);57 return (Time){ curr };58 } // getTime 55 Duration getRes() { 56 struct timespec res; 57 clock_getres( CLOCK_REALTIME_COARSE, &res ); 58 return ((int64_t)res.tv_sec * TIMEGRAN + res.tv_nsec)`ns; 59 } // getRes 59 60 60 static inline Time getTime() { // without nanoseconds 61 timespec curr; 62 clock_gettime( 
CLOCK_REALTIME_COARSE, &curr ); 63 curr.tv_nsec = 0; 64 return (Time){ curr }; 65 } // getTime 61 Time getTimeNsec() { // with nanoseconds 62 timespec curr; 63 clock_gettime( CLOCK_REALTIME, &curr ); 64 return (Time){ curr }; 65 } // getTime 66 66 67 static inline Time getTime( Clock & clk ) with( clk ) { 68 return getTime() + offset; 69 } // getTime 67 Time getTime() { // without nanoseconds 68 timespec curr; 69 clock_gettime( CLOCK_REALTIME_COARSE, &curr ); 70 curr.tv_nsec = 0; 71 return (Time){ curr }; 72 } // getTime 70 73 71 static inline Time ?()( Clock & clk ) with( clk ) { // alternative syntax 72 return getTime() + offset;73 } // getTime74 Time getTime( Clock & clk ) with( clk ) { 75 return getTime() + offset; 76 } // getTime 74 77 75 static inline timeval getTime( Clock & clk ) { 76 return (timeval){ clk() };77 } // getTime78 Time ?()( Clock & clk ) with( clk ) { // alternative syntax 79 return getTime() + offset; 80 } // getTime 78 81 79 static inline tm getTime( Clock & clk ) with( clk ) { 80 tm ret; 81 localtime_r( getTime( clk ).tv_sec, &ret ); 82 return ret; 83 } // getTime 82 timeval getTime( Clock & clk ) { 83 return (timeval){ clk() }; 84 } // getTime 85 86 tm getTime( Clock & clk ) with( clk ) { 87 tm ret; 88 localtime_r( getTime( clk ).tv_sec, &ret ); 89 return ret; 90 } // getTime 91 } // distribution 84 92 85 93 // Local Variables: // -
src/libcfa/iostream
rc653b37 re3b2474 10 10 // Created On : Wed May 27 17:56:53 2015 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : S at Jun 2 08:07:55201813 // Update Count : 15 312 // Last Modified On : Sun Jul 1 12:12:22 2018 13 // Update Count : 155 14 14 // 15 15 … … 42 42 void open( ostype & os, const char * name, const char * mode ); 43 43 void close( ostype & os ); 44 ostype & write( ostype &, const char *, unsigned long int );44 ostype & write( ostype &, const char *, size_t ); 45 45 int fmt( ostype &, const char fmt[], ... ); 46 46 }; // ostream … … 117 117 void open( istype & is, const char * name ); 118 118 void close( istype & is ); 119 istype & read( istype &, char *, unsigned long int );119 istype & read( istype &, char *, size_t ); 120 120 istype & ungetc( istype &, char ); 121 121 int fmt( istype &, const char fmt[], ... ); -
src/libcfa/stdlib
rc653b37 re3b2474 10 10 // Created On : Thu Jan 28 17:12:35 2016 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Sat Jun 2 08:46:35201813 // Update Count : 3 0612 // Last Modified On : Tue Jul 3 08:17:28 2018 13 // Update Count : 324 14 14 // 15 15 … … 264 264 //--------------------------------------- 265 265 266 extern "C" { void srandom( unsigned int seed ); } // override C version 267 char random( void ); 268 char random( char u ); 269 char random( char l, char u ); 270 int random( void ); 271 int random( int u ); 272 int random( int l, int u ); 273 unsigned int random( void ); 274 unsigned int random( unsigned int u ); 275 unsigned int random( unsigned int l, unsigned int u ); 276 extern "C" { long int random( void ); } // override C version 277 long int random( long int u ); 278 long int random( long int l, long int u ); 279 unsigned long int random( void ); 280 unsigned long int random( unsigned long int u ); 281 unsigned long int random( unsigned long int l, unsigned long int u ); 282 float random( void ); 283 double random( void ); 284 float _Complex random( void ); 285 double _Complex random( void ); 286 long double _Complex random( void ); 266 extern "C" { // override C version 267 void srandom( unsigned int seed ); 268 long int random( void ); 269 } // extern "C" 270 271 static inline { 272 long int random( long int l, long int u ) { if ( u < l ) [u, l] = [l, u]; return lrand48() % (u - l) + l; } // [l,u) 273 long int random( long int u ) { if ( u < 0 ) return random( u, 0 ); else return random( 0, u ); } // [0,u) 274 unsigned long int random( void ) { return lrand48(); } 275 unsigned long int random( unsigned long int l, unsigned long int u ) { if ( u < l ) [u, l] = [l, u]; return lrand48() % (u - l) + l; } // [l,u) 276 unsigned long int random( unsigned long int u ) { return lrand48() % u; } // [0,u) 277 278 char random( void ) { return (unsigned long int)random(); } 279 char random( char u ) { return random( (unsigned long int)u ); 
} // [0,u) 280 char random( char l, char u ) { return random( (unsigned long int)l, (unsigned long int)u ); } // [l,u) 281 int random( void ) { return (long int)random(); } 282 int random( int u ) { return random( (long int)u ); } // [0,u] 283 int random( int l, int u ) { return random( (long int)l, (long int)u ); } // [l,u) 284 unsigned int random( void ) { return (unsigned long int)random(); } 285 unsigned int random( unsigned int u ) { return random( (unsigned long int)u ); } // [0,u] 286 unsigned int random( unsigned int l, unsigned int u ) { return random( (unsigned long int)l, (unsigned long int)u ); } // [l,u) 287 } // distribution 288 289 float random( void ); // [0.0, 1.0) 290 double random( void ); // [0.0, 1.0) 291 float _Complex random( void ); // [0.0, 1.0)+[0.0, 1.0)i 292 double _Complex random( void ); // [0.0, 1.0)+[0.0, 1.0)i 293 long double _Complex random( void ); // [0.0, 1.0)+[0.0, 1.0)i 287 294 288 295 //--------------------------------------- -
src/libcfa/stdlib.c
rc653b37 re3b2474 10 10 // Created On : Thu Jan 28 17:10:29 2016 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Sat Jun 2 06:15:05201813 // Update Count : 4 4812 // Last Modified On : Tue Jul 3 08:17:30 2018 13 // Update Count : 457 14 14 // 15 15 … … 249 249 //--------------------------------------- 250 250 251 extern "C" { void srandom( unsigned int seed ) { srand48( (long int)seed ); } } // override C version 252 char random( void ) { return (unsigned long int)random(); } 253 char random( char u ) { return random( (unsigned long int)u ); } 254 char random( char l, char u ) { return random( (unsigned long int)l, (unsigned long int)u ); } 255 int random( void ) { return (long int)random(); } 256 int random( int u ) { return random( (long int)u ); } 257 int random( int l, int u ) { return random( (long int)l, (long int)u ); } 258 unsigned int random( void ) { return (unsigned long int)random(); } 259 unsigned int random( unsigned int u ) { return random( (unsigned long int)u ); } 260 unsigned int random( unsigned int l, unsigned int u ) { return random( (unsigned long int)l, (unsigned long int)u ); } 261 extern "C" { long int random( void ) { return mrand48(); } } // override C version 262 long int random( long int u ) { if ( u < 0 ) return random( u, 0 ); else return random( 0, u ); } 263 long int random( long int l, long int u ) { assert( l < u ); return lrand48() % (u - l) + l; } 264 unsigned long int random( void ) { return lrand48(); } 265 unsigned long int random( unsigned long int u ) { return lrand48() % u; } 266 unsigned long int random( unsigned long int l, unsigned long int u ) { assert( l < u ); return lrand48() % (u - l) + l; } 251 extern "C" { // override C version 252 void srandom( unsigned int seed ) { srand48( (long int)seed ); } 253 long int random( void ) { return mrand48(); } 254 } // extern "C" 255 267 256 float random( void ) { return (float)drand48(); } // cast otherwise float uses lrand48 268 257 double random( void ) { 
return drand48(); } -
src/libcfa/time
rc653b37 re3b2474
10 10 // Created On : Wed Mar 14 23:18:57 2018
11 11 // Last Modified By : Peter A. Buhr
12 // Last Modified On : Sat Apr 14 17:48:23 2018
13 // Update Count : 636
12 // Last Modified On : Mon Jul 2 21:28:38 2018
13 // Update Count : 641
14 14 //
15 15
… …
27 27 enum { TIMEGRAN = 1_000_000_000LL }; // nanosecond granularity, except for timeval
28 28
29
30 29 //######################### Duration #########################
31 30
32 static inline Duration ?=?( Duration & dur, zero_t ) { return dur{ 0 }; }
33
34 static inline Duration +?( Duration rhs ) with( rhs ) { return (Duration)@{ +tv }; }
35 static inline Duration ?+?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv + rhs.tv }; }
36 static inline Duration ?+=?( Duration & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
37
38 static inline Duration -?( Duration rhs ) with( rhs ) { return (Duration)@{ -tv }; }
39 static inline Duration ?-?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
40 static inline Duration ?-=?( Duration & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
41
42 static inline Duration ?*?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv * rhs }; }
43 static inline Duration ?*?( int64_t lhs, Duration rhs ) { return (Duration)@{ lhs * rhs.tv }; }
44 static inline Duration ?*=?( Duration & lhs, int64_t rhs ) { lhs = lhs * rhs; return lhs; }
45
46 static inline int64_t ?/?( Duration lhs, Duration rhs ) { return lhs.tv / rhs.tv; }
47 static inline Duration ?/?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv / rhs }; }
48 static inline Duration ?/=?( Duration & lhs, int64_t rhs ) { lhs = lhs / rhs; return lhs; }
49 static inline double div( Duration lhs, Duration rhs ) { return (double)lhs.tv / (double)rhs.tv; }
50
51 static inline Duration ?%?( Duration lhs, Duration rhs ) { return (Duration)@{ lhs.tv % rhs.tv }; }
52 static inline Duration ?%=?( Duration & lhs, Duration rhs ) { lhs = lhs % rhs; return lhs; }
53
54 static inline _Bool ?==?( Duration lhs, Duration rhs ) { return lhs.tv == rhs.tv; }
55 static inline _Bool ?!=?( Duration lhs, Duration rhs ) { return lhs.tv != rhs.tv; }
56 static inline _Bool ?<? ( Duration lhs, Duration rhs ) { return lhs.tv < rhs.tv; }
57 static inline _Bool ?<=?( Duration lhs, Duration rhs ) { return lhs.tv <= rhs.tv; }
58 static inline _Bool ?>? ( Duration lhs, Duration rhs ) { return lhs.tv > rhs.tv; }
59 static inline _Bool ?>=?( Duration lhs, Duration rhs ) { return lhs.tv >= rhs.tv; }
60
61 static inline _Bool ?==?( Duration lhs, zero_t ) { return lhs.tv == 0; }
62 static inline _Bool ?!=?( Duration lhs, zero_t ) { return lhs.tv != 0; }
63 static inline _Bool ?<? ( Duration lhs, zero_t ) { return lhs.tv < 0; }
64 static inline _Bool ?<=?( Duration lhs, zero_t ) { return lhs.tv <= 0; }
65 static inline _Bool ?>? ( Duration lhs, zero_t ) { return lhs.tv > 0; }
66 static inline _Bool ?>=?( Duration lhs, zero_t ) { return lhs.tv >= 0; }
67
68 static inline Duration abs( Duration rhs ) { return rhs.tv >= 0 ? rhs : -rhs; }
69
70 static inline Duration ?`ns( int64_t nsec ) { return (Duration)@{ nsec }; }
71 static inline Duration ?`us( int64_t usec ) { return (Duration)@{ usec * (TIMEGRAN / 1_000_000LL) }; }
72 static inline Duration ?`ms( int64_t msec ) { return (Duration)@{ msec * (TIMEGRAN / 1_000LL) }; }
73 static inline Duration ?`s( int64_t sec ) { return (Duration)@{ sec * TIMEGRAN }; }
74 static inline Duration ?`s( double sec ) { return (Duration)@{ sec * TIMEGRAN }; }
75 static inline Duration ?`m( int64_t min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
76 static inline Duration ?`m( double min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
77 static inline Duration ?`h( int64_t hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
78 static inline Duration ?`h( double hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
79 static inline Duration ?`d( int64_t days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
80 static inline Duration ?`d( double days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
81 static inline Duration ?`w( int64_t weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
82 static inline Duration ?`w( double weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
83
84 static inline int64_t ?`ns( Duration dur ) { return dur.tv; }
85 static inline int64_t ?`us( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000_000LL); }
86 static inline int64_t ?`ms( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000LL); }
87 static inline int64_t ?`s( Duration dur ) { return dur.tv / TIMEGRAN; }
88 static inline int64_t ?`m( Duration dur ) { return dur.tv / (60LL * TIMEGRAN); }
89 static inline int64_t ?`h( Duration dur ) { return dur.tv / (60LL * 60LL * TIMEGRAN); }
90 static inline int64_t ?`d( Duration dur ) { return dur.tv / (24LL * 60LL * 60LL * TIMEGRAN); }
91 static inline int64_t ?`w( Duration dur ) { return dur.tv / (7LL * 24LL * 60LL * 60LL * TIMEGRAN); }
92
93 static inline Duration max( Duration lhs, Duration rhs ) { return (lhs.tv < rhs.tv) ? rhs : lhs;}
94 static inline Duration min( Duration lhs, Duration rhs ) { return !(rhs.tv < lhs.tv) ? lhs : rhs;}
95
31 static inline {
32 Duration ?=?( Duration & dur, zero_t ) { return dur{ 0 }; }
33
34 Duration +?( Duration rhs ) with( rhs ) { return (Duration)@{ +tv }; }
35 Duration ?+?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv + rhs.tv }; }
36 Duration ?+=?( Duration & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
37
38 Duration -?( Duration rhs ) with( rhs ) { return (Duration)@{ -tv }; }
39 Duration ?-?( Duration & lhs, Duration rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
40 Duration ?-=?( Duration & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
41
42 Duration ?*?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv * rhs }; }
43 Duration ?*?( int64_t lhs, Duration rhs ) { return (Duration)@{ lhs * rhs.tv }; }
44 Duration ?*=?( Duration & lhs, int64_t rhs ) { lhs = lhs * rhs; return lhs; }
45
46 int64_t ?/?( Duration lhs, Duration rhs ) { return lhs.tv / rhs.tv; }
47 Duration ?/?( Duration lhs, int64_t rhs ) { return (Duration)@{ lhs.tv / rhs }; }
48 Duration ?/=?( Duration & lhs, int64_t rhs ) { lhs = lhs / rhs; return lhs; }
49 double div( Duration lhs, Duration rhs ) { return (double)lhs.tv / (double)rhs.tv; }
50
51 Duration ?%?( Duration lhs, Duration rhs ) { return (Duration)@{ lhs.tv % rhs.tv }; }
52 Duration ?%=?( Duration & lhs, Duration rhs ) { lhs = lhs % rhs; return lhs; }
53
54 _Bool ?==?( Duration lhs, Duration rhs ) { return lhs.tv == rhs.tv; }
55 _Bool ?!=?( Duration lhs, Duration rhs ) { return lhs.tv != rhs.tv; }
56 _Bool ?<? ( Duration lhs, Duration rhs ) { return lhs.tv < rhs.tv; }
57 _Bool ?<=?( Duration lhs, Duration rhs ) { return lhs.tv <= rhs.tv; }
58 _Bool ?>? ( Duration lhs, Duration rhs ) { return lhs.tv > rhs.tv; }
59 _Bool ?>=?( Duration lhs, Duration rhs ) { return lhs.tv >= rhs.tv; }
60
61 _Bool ?==?( Duration lhs, zero_t ) { return lhs.tv == 0; }
62 _Bool ?!=?( Duration lhs, zero_t ) { return lhs.tv != 0; }
63 _Bool ?<? ( Duration lhs, zero_t ) { return lhs.tv < 0; }
64 _Bool ?<=?( Duration lhs, zero_t ) { return lhs.tv <= 0; }
65 _Bool ?>? ( Duration lhs, zero_t ) { return lhs.tv > 0; }
66 _Bool ?>=?( Duration lhs, zero_t ) { return lhs.tv >= 0; }
67
68 Duration abs( Duration rhs ) { return rhs.tv >= 0 ? rhs : -rhs; }
69
70 Duration ?`ns( int64_t nsec ) { return (Duration)@{ nsec }; }
71 Duration ?`us( int64_t usec ) { return (Duration)@{ usec * (TIMEGRAN / 1_000_000LL) }; }
72 Duration ?`ms( int64_t msec ) { return (Duration)@{ msec * (TIMEGRAN / 1_000LL) }; }
73 Duration ?`s( int64_t sec ) { return (Duration)@{ sec * TIMEGRAN }; }
74 Duration ?`s( double sec ) { return (Duration)@{ sec * TIMEGRAN }; }
75 Duration ?`m( int64_t min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
76 Duration ?`m( double min ) { return (Duration)@{ min * (60LL * TIMEGRAN) }; }
77 Duration ?`h( int64_t hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
78 Duration ?`h( double hours ) { return (Duration)@{ hours * (60LL * 60LL * TIMEGRAN) }; }
79 Duration ?`d( int64_t days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
80 Duration ?`d( double days ) { return (Duration)@{ days * (24LL * 60LL * 60LL * TIMEGRAN) }; }
81 Duration ?`w( int64_t weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
82 Duration ?`w( double weeks ) { return (Duration)@{ weeks * (7LL * 24LL * 60LL * 60LL * TIMEGRAN) }; }
83
84 int64_t ?`ns( Duration dur ) { return dur.tv; }
85 int64_t ?`us( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000_000LL); }
86 int64_t ?`ms( Duration dur ) { return dur.tv / (TIMEGRAN / 1_000LL); }
87 int64_t ?`s( Duration dur ) { return dur.tv / TIMEGRAN; }
88 int64_t ?`m( Duration dur ) { return dur.tv / (60LL * TIMEGRAN); }
89 int64_t ?`h( Duration dur ) { return dur.tv / (60LL * 60LL * TIMEGRAN); }
90 int64_t ?`d( Duration dur ) { return dur.tv / (24LL * 60LL * 60LL * TIMEGRAN); }
91 int64_t ?`w( Duration dur ) { return dur.tv / (7LL * 24LL * 60LL * 60LL * TIMEGRAN); }
92
93 Duration max( Duration lhs, Duration rhs ) { return (lhs.tv < rhs.tv) ? rhs : lhs;}
94 Duration min( Duration lhs, Duration rhs ) { return !(rhs.tv < lhs.tv) ? lhs : rhs;}
95 } // distribution
96 96
97 97 //######################### C timeval #########################
98 98
99 static inline void ?{}( timeval & t ) {}
100 static inline void ?{}( timeval & t, time_t sec, suseconds_t usec ) { t.tv_sec = sec; t.tv_usec = usec; }
101 static inline void ?{}( timeval & t, time_t sec ) { t{ sec, 0 }; }
102 static inline void ?{}( timeval & t, zero_t ) { t{ 0, 0 }; }
103 static inline timeval ?=?( timeval & t, zero_t ) { return t{ 0 }; }
104 static inline timeval ?+?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_usec + rhs.tv_usec }; }
105 static inline timeval ?-?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_usec - rhs.tv_usec }; }
106 static inline _Bool ?==?( timeval lhs, timeval rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_usec == rhs.tv_usec; }
107 static inline _Bool ?!=?( timeval lhs, timeval rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_usec != rhs.tv_usec; }
108
99 static inline {
100 void ?{}( timeval & t ) {}
101 void ?{}( timeval & t, time_t sec, suseconds_t usec ) { t.tv_sec = sec; t.tv_usec = usec; }
102 void ?{}( timeval & t, time_t sec ) { t{ sec, 0 }; }
103 void ?{}( timeval & t, zero_t ) { t{ 0, 0 }; }
104
105 timeval ?=?( timeval & t, zero_t ) { return t{ 0 }; }
106 timeval ?+?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_usec + rhs.tv_usec }; }
107 timeval ?-?( timeval & lhs, timeval rhs ) { return (timeval)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_usec - rhs.tv_usec }; }
108 _Bool ?==?( timeval lhs, timeval rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_usec == rhs.tv_usec; }
109 _Bool ?!=?( timeval lhs, timeval rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_usec != rhs.tv_usec; }
110 } // distribution
109 111
110 112 //######################### C timespec #########################
111 113
112 static inline void ?{}( timespec & t ) {}
113 static inline void ?{}( timespec & t, time_t sec, __syscall_slong_t nsec ) { t.tv_sec = sec; t.tv_nsec = nsec; }
114 static inline void ?{}( timespec & t, time_t sec ) { t{ sec, 0}; }
115 static inline void ?{}( timespec & t, zero_t ) { t{ 0, 0 }; }
116 static inline timespec ?=?( timespec & t, zero_t ) { return t{ 0 }; }
117 static inline timespec ?+?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_nsec + rhs.tv_nsec }; }
118 static inline timespec ?-?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_nsec - rhs.tv_nsec }; }
119 static inline _Bool ?==?( timespec lhs, timespec rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_nsec == rhs.tv_nsec; }
120 static inline _Bool ?!=?( timespec lhs, timespec rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_nsec != rhs.tv_nsec; }
121
114 static inline {
115 void ?{}( timespec & t ) {}
116 void ?{}( timespec & t, time_t sec, __syscall_slong_t nsec ) { t.tv_sec = sec; t.tv_nsec = nsec; }
117 void ?{}( timespec & t, time_t sec ) { t{ sec, 0}; }
118 void ?{}( timespec & t, zero_t ) { t{ 0, 0 }; }
119
120 timespec ?=?( timespec & t, zero_t ) { return t{ 0 }; }
121 timespec ?+?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec + rhs.tv_sec, lhs.tv_nsec + rhs.tv_nsec }; }
122 timespec ?-?( timespec & lhs, timespec rhs ) { return (timespec)@{ lhs.tv_sec - rhs.tv_sec, lhs.tv_nsec - rhs.tv_nsec }; }
123 _Bool ?==?( timespec lhs, timespec rhs ) { return lhs.tv_sec == rhs.tv_sec && lhs.tv_nsec == rhs.tv_nsec; }
124 _Bool ?!=?( timespec lhs, timespec rhs ) { return lhs.tv_sec != rhs.tv_sec || lhs.tv_nsec != rhs.tv_nsec; }
125 } // distribution
122 126
123 127 //######################### C itimerval #########################
124 128
125 static inline void ?{}( itimerval & itv, Duration alarm ) with( itv ) {
126 // itimerval contains durations but but uses time data-structure timeval.
127 it_value{ alarm`s, (alarm % 1`s)`us }; // seconds, microseconds
128 it_interval{ 0 }; // 0 seconds, 0 microseconds
129 } // itimerval
130
131 static inline void ?{}( itimerval & itv, Duration alarm, Duration interval ) with( itv ) {
132 // itimerval contains durations but but uses time data-structure timeval.
133 it_value{ alarm`s, (alarm % 1`s)`us }; // seconds, microseconds
134 it_interval{ interval`s, interval`us }; // seconds, microseconds
135 } // itimerval
136
129 static inline {
130 void ?{}( itimerval & itv, Duration alarm ) with( itv ) {
131 // itimerval contains durations but but uses time data-structure timeval.
132 it_value{ alarm`s, (alarm % 1`s)`us }; // seconds, microseconds
133 it_interval{ 0 }; // 0 seconds, 0 microseconds
134 } // itimerval
135
136 void ?{}( itimerval & itv, Duration alarm, Duration interval ) with( itv ) {
137 // itimerval contains durations but but uses time data-structure timeval.
138 it_value{ alarm`s, (alarm % 1`s)`us }; // seconds, microseconds
139 it_interval{ interval`s, interval`us }; // seconds, microseconds
140 } // itimerval
141 } // distribution
137 142
138 143 //######################### Time #########################
139 144
140 145 void ?{}( Time & time, int year, int month = 0, int day = 0, int hour = 0, int min = 0, int sec = 0, int nsec = 0 );
141 static inline Time ?=?( Time & time, zero_t ) { return time{ 0 }; }
142
143 static inline void ?{}( Time & time, timeval t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * 1000; }
144 static inline Time ?=?( Time & time, timeval t ) with( time ) {
145 tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * (TIMEGRAN / 1_000_000LL);
146 return time;
147 } // ?=?
148
149 static inline void ?{}( Time & time, timespec t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec; }
150 static inline Time ?=?( Time & time, timespec t ) with( time ) {
151 tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec;
152 return time;
153 } // ?=?
154
155 static inline Time ?+?( Time & lhs, Duration rhs ) { return (Time)@{ lhs.tv + rhs.tv }; }
156 static inline Time ?+?( Duration lhs, Time rhs ) { return rhs + lhs; }
157 static inline Time ?+=?( Time & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
158
159 static inline Duration ?-?( Time lhs, Time rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
160 static inline Time ?-?( Time lhs, Duration rhs ) { return (Time)@{ lhs.tv - rhs.tv }; }
161 static inline Time ?-=?( Time & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
162 static inline _Bool ?==?( Time lhs, Time rhs ) { return lhs.tv == rhs.tv; }
163 static inline _Bool ?!=?( Time lhs, Time rhs ) { return lhs.tv != rhs.tv; }
164 static inline _Bool ?<?( Time lhs, Time rhs ) { return lhs.tv < rhs.tv; }
165 static inline _Bool ?<=?( Time lhs, Time rhs ) { return lhs.tv <= rhs.tv; }
166 static inline _Bool ?>?( Time lhs, Time rhs ) { return lhs.tv > rhs.tv; }
167 static inline _Bool ?>=?( Time lhs, Time rhs ) { return lhs.tv >= rhs.tv; }
146 static inline {
147 Time ?=?( Time & time, zero_t ) { return time{ 0 }; }
148
149 void ?{}( Time & time, timeval t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * 1000; }
150 Time ?=?( Time & time, timeval t ) with( time ) {
151 tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_usec * (TIMEGRAN / 1_000_000LL);
152 return time;
153 } // ?=?
154
155 void ?{}( Time & time, timespec t ) with( time ) { tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec; }
156 Time ?=?( Time & time, timespec t ) with( time ) {
157 tv = (int64_t)t.tv_sec * TIMEGRAN + t.tv_nsec;
158 return time;
159 } // ?=?
160
161 Time ?+?( Time & lhs, Duration rhs ) { return (Time)@{ lhs.tv + rhs.tv }; }
162 Time ?+?( Duration lhs, Time rhs ) { return rhs + lhs; }
163 Time ?+=?( Time & lhs, Duration rhs ) { lhs = lhs + rhs; return lhs; }
164
165 Duration ?-?( Time lhs, Time rhs ) { return (Duration)@{ lhs.tv - rhs.tv }; }
166 Time ?-?( Time lhs, Duration rhs ) { return (Time)@{ lhs.tv - rhs.tv }; }
167 Time ?-=?( Time & lhs, Duration rhs ) { lhs = lhs - rhs; return lhs; }
168 _Bool ?==?( Time lhs, Time rhs ) { return lhs.tv == rhs.tv; }
169 _Bool ?!=?( Time lhs, Time rhs ) { return lhs.tv != rhs.tv; }
170 _Bool ?<?( Time lhs, Time rhs ) { return lhs.tv < rhs.tv; }
171 _Bool ?<=?( Time lhs, Time rhs ) { return lhs.tv <= rhs.tv; }
172 _Bool ?>?( Time lhs, Time rhs ) { return lhs.tv > rhs.tv; }
173 _Bool ?>=?( Time lhs, Time rhs ) { return lhs.tv >= rhs.tv; }
174 } // distribution
168 175
169 176 char * yy_mm_dd( Time time, char * buf );
-
src/tests/literals.c
rc653b37 re3b2474
10 10 // Created On : Sat Sep 9 16:34:38 2017
11 11 // Last Modified By : Peter A. Buhr
12 // Last Modified On : Sun Mar 4 10:41:31 2018
13 // Update Count : 134
12 // Last Modified On : Sun Jul 1 15:12:15 2018
13 // Update Count : 137
14 14 //
15 15
… …
155 155
156 156 // binary
157 0b01101011_l8; 0b01101011_l16; 0b01101011_l32; 0b01101011_l64; 0b01101011_l128; 0b01101011_l8u; 0b01101011_ul16; 0b01101011_l32u; 0b01101011_ul64; 0b01101011_ul128;
158 +0b01101011_l8; +0b01101011_l16; +0b01101011_l32; +0b01101011_l64; +0b01101011_l128; +0b01101011_l8u; +0b01101011_ul16; +0b01101011_l32u; +0b01101011_ul64; +0b01101011_ul128;
159 -0b01101011_l8; -0b01101011_l16; -0b01101011_l32; -0b01101011_l64; -0b01101011_l128; -0b01101011_l8u; -0b01101011_ul16; -0b01101011_l32u; -0b01101011_ul64; -0b01101011_ul128;
157 0b01101011_l8; 0b01101011_l16; 0b01101011_l32; 0b01101011_l64; 0b01101011_l8u; 0b01101011_ul16; 0b01101011_l32u; 0b01101011_ul64;
158 +0b01101011_l8; +0b01101011_l16; +0b01101011_l32; +0b01101011_l64; +0b01101011_l8u; +0b01101011_ul16; +0b01101011_l32u; +0b01101011_ul64;
159 -0b01101011_l8; -0b01101011_l16; -0b01101011_l32; -0b01101011_l64; -0b01101011_l8u; -0b01101011_ul16; -0b01101011_l32u; -0b01101011_ul64;
160
161 #ifdef __LP64__ // 64-bit processor
162 0b01101011_l128; 0b01101011_ul128;
163 +0b01101011_l128; +0b01101011_ul128;
164 -0b01101011_l128; -0b01101011_ul128;
165 #endif // __LP64__
160 166
161 167 // octal
162 01234567_l8; 01234567_l16; 01234567_l32; 01234567_l64; 01234567_l128; 01234567_l8u; 01234567_ul16; 01234567_l32u; 01234567_ul64; 01234567_ul128;
163 +01234567_l8; +01234567_l16; +01234567_l32; +01234567_l64; +01234567_l128; +01234567_l8u; +01234567_ul16; +01234567_l32u; +01234567_ul64; +01234567_ul128;
164 -01234567_l8; -01234567_l16; -01234567_l32; -01234567_l64; -01234567_l128; -01234567_l8u; -01234567_ul16; -01234567_l32u; -01234567_ul64; -01234567_ul128;
168 01234567_l8; 01234567_l16; 01234567_l32; 01234567_l64; 01234567_l8u; 01234567_ul16; 01234567_l32u; 01234567_ul64;
169 +01234567_l8; +01234567_l16; +01234567_l32; +01234567_l64; +01234567_l8u; +01234567_ul16; +01234567_l32u; +01234567_ul64;
170 -01234567_l8; -01234567_l16; -01234567_l32; -01234567_l64; -01234567_l8u; -01234567_ul16; -01234567_l32u; -01234567_ul64;
171
172 #ifdef __LP64__ // 64-bit processor
173 01234567_l128; 01234567_ul128;
174 +01234567_l128; +01234567_ul128;
175 -01234567_l128; -01234567_ul128;
176 #endif // __LP64__
165 177
166 178 // decimal
167 1234567890L8; 1234567890L16; 1234567890l32; 1234567890l64; 1234567890l128; 1234567890UL8; 1234567890L16U; 1234567890Ul32; 1234567890l64u; 1234567890l128u;
168 +1234567890L8; +1234567890L16; +1234567890l32; +1234567890l64; +1234567890l128; +1234567890UL8; +1234567890L16U; +1234567890Ul32; +1234567890l64u; +1234567890l128u;
169 -1234567890L8; -1234567890L16; -1234567890l32; -1234567890l64; -1234567890l128; -1234567890UL8; -1234567890L16U; -1234567890Ul32; -1234567890l64u; -1234567890l128u;
179 1234567890L8; 1234567890L16; 1234567890l32; 1234567890l64; 1234567890UL8; 1234567890L16U; 1234567890Ul32; 1234567890l64u;
180 +1234567890L8; +1234567890L16; +1234567890l32; +1234567890l64; +1234567890UL8; +1234567890L16U; +1234567890Ul32; +1234567890l64u;
181 -1234567890L8; -1234567890L16; -1234567890l32; -1234567890l64; -1234567890UL8; -1234567890L16U; -1234567890Ul32; -1234567890l64u;
182
183 #ifdef __LP64__ // 64-bit processor
184 1234567890l128; 1234567890l128u;
185 +1234567890l128; +1234567890l128u;
186 -1234567890l128; -1234567890l128u;
187 #endif // __LP64__
170 188
171 189 // hexadecimal