Changes in / [dafdbe7:5510027]


Files: 2 added, 2 deleted, 7 edited

  • doc/papers/concurrency/Paper.tex

    --- rdafdbe7
    +++ r5510027
      \CFA is an extension of ISO-C, and hence, supports all C paradigms.
      %It is a non-object-oriented system-language, meaning most of the major abstractions have either no runtime overhead or can be opted out easily.
    - Like C, the basics of \CFA revolve around structures and routines.
    + Like C, the basics of \CFA revolve around structures and functions.
      Virtually all of the code generated by the \CFA translator respects C memory layouts and calling conventions.
      While \CFA is not an object-oriented language, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C does have a notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}.
     
      'with' '(' $\emph{expression-list}$ ')' $\emph{compound-statement}$
      \end{cfa}
    - and may appear as the body of a routine or nested within a routine body.
    + and may appear as the body of a function or nested within a function body.
      Each expression in the expression-list provides a type and object.
      The type must be an aggregate type.
     

      \CFA maximizes the ability to reuse names via overloading to aggressively address the naming problem.
    - Both variables and routines may be overloaded, where selection is based on types, and number of returns (as in Ada~\cite{Ada}) and arguments.
    + Both variables and functions may be overloaded, where selection is based on types, and number of returns (as in Ada~\cite{Ada}) and arguments.
      \begin{cquote}
      \vspace*{-\baselineskip}%???
     
      Overloading is important for \CFA concurrency since the runtime system relies on creating different types to represent concurrency objects.
      Therefore, overloading is necessary to prevent the need for long prefixes and other naming conventions to prevent name clashes.
    - As seen in Section~\ref{basics}, routine @main@ is heavily overloaded.
    + As seen in Section~\ref{basics}, function @main@ is heavily overloaded.

      Variable overloading is useful in the parallel semantics of the @with@ statement for fields with the same name:
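(Editor's illustrative aside, not part of the paper: the type-based selection described in this hunk resembles \CC function overloading, though \CC cannot select on return type or overload variables as \CFA does. The names below are invented for the sketch.)

```cpp
#include <cassert>
#include <string>

// Overloading sketch: one name, selection by argument type.
// (CFA additionally selects on the return type and allows overloaded
// variables; neither is expressible in C++.)
int twice(int x) { return x + x; }
double twice(double x) { return x + x; }
std::string twice(const std::string& s) { return s + s; }
```

Calling `twice` with an `int`, `double`, or string argument picks the matching definition at compile time.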
     

      Overloading also extends to operators.
    - Operator-overloading syntax creates a routine name with an operator symbol and question marks for the operands:
    + Operator-overloading syntax names a routine with the operator symbol and question marks for the operands:
      \begin{cquote}
      \lstDeleteShortInline@%
     
      \label{s:ParametricPolymorphism}

    - The signature feature of \CFA is parametric-polymorphic routines~\cite{} with routines generalized using a @forall@ clause (giving the language its name), which allow separately compiled routines to support generic usage over multiple types.
    - For example, the following sum routine works for any type that supports construction from 0 and addition:
    + The signature feature of \CFA is parametric-polymorphic functions~\cite{} with functions generalized using a @forall@ clause (giving the language its name), which allow separately compiled functions to support generic usage over multiple types.
    + For example, the following sum function works for any type that supports construction from 0 and addition:
      \begin{cfa}
      forall( otype T | { void `?{}`( T *, zero_t ); T `?+?`( T, T ); } ) // constraint type, 0 and +
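(Editor's sketch, not \CFA: the constraint "construction from 0 and addition" maps roughly to a \CC template whose body uses only zero-initialization and `+`. The names and container choice are assumptions of this sketch.)

```cpp
#include <cassert>
#include <vector>

// Generic sum: works for any T that can be zero-initialized and added,
// mirroring the forall constraint (construction from zero_t, ?+?).
template<typename T>
T sum(const std::vector<T>& xs) {
    T total{};                                // "construction from 0"
    for (const T& x : xs) total = total + x;  // requires T + T
    return total;
}
```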
     
      \end{cfa}

    - \CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each routine declaration:
    + \CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each function declaration:
      \begin{cfa}
      trait `sumable`( otype T ) {
     
      `coroutine` Fib { int fn; };
      \end{cfa}
    - which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface routines @next@.
    + which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface functions, @next@.
      Like the structure in Figure~\ref{f:ExternalState}, the coroutine type allows multiple instances, where instances of this type are passed to the (overloaded) coroutine main.
      The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's resume.
    - The interface routine @next@, takes a Fibonacci instance and context switches to it using @resume@;
    + The interface function, @next@, takes a Fibonacci instance and context switches to it using @resume@;
      on restart, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned.
      The first @resume@ is special because it cocalls the coroutine at its coroutine main and allocates the stack;
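(Editor's sketch: the behaviour of the Fibonacci coroutine's `next` interface can be approximated by a hand-rolled generator that keeps explicit state instead of a suspended stack. All names here are invented for the illustration.)

```cpp
#include <cassert>

// Explicit-state stand-in for the coroutine: f1/f2 live in the struct
// rather than on a suspended coroutine stack.
struct Fib { long f1 = 0, f2 = 1; };

long next(Fib& f) {          // analogue of the interface function next
    long r = f.f1;
    long n = f.f1 + f.f2;    // Fibonacci step
    f.f1 = f.f2;
    f.f2 = n;
    return r;
}
```

Each call returns the next value in the sequence while the retained state advances, which is exactly what the suspend points in the coroutine main provide without an explicit state struct.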
     
      \end{figure}

    - The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming routine for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal routines
    + The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming function for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal functions.
      However, there is no stack growth because @resume@/@suspend@ context switch to existing stack-frames rather than create new ones.
    - \newterm{Symmetric (full) coroutine}s have a coroutine call a resuming routine for another coroutine, which eventually forms a resuming-call cycle.
    + \newterm{Symmetric (full) coroutine}s have a coroutine call a resuming function for another coroutine, which eventually forms a resuming-call cycle.
      (The trivial cycle is a coroutine resuming itself.)
      This control flow is similar to recursion for normal routines, but again there is no stack growth from the context switch.
     
      Figure~\ref{f:ProdCons} shows a producer/consumer symmetric-coroutine performing bi-directional communication.
      Since the solution involves a full-coroutining cycle, the program main creates one coroutine in isolation, passes this coroutine to its partner, and closes the cycle at the call to @start@.
    - The @start@ routine communicates both the number of elements to be produced and the consumer into the producer's coroutine structure.
    + The @start@ function communicates both the number of elements to be produced and the consumer into the producer's coroutine structure.
      Then the @resume@ to @prod@ creates @prod@'s stack with a frame for @prod@'s coroutine main at the top, and context switches to it.
      @prod@'s coroutine main starts, creates local variables that are retained between coroutine activations, and executes $N$ iterations, each generating two random values, calling the consumer to deliver the values, and printing the status returned from the consumer.
     
      However, there is nothing preventing wrong placement or multiple declarations.

    - For coroutines as for threads, many implementations are based on routine pointers or routine objects~\cite{Butenhof97, C++14, MS:VisualC++, BoostCoroutines15}.
    + For coroutines as for threads, many implementations are based on function pointers or function objects~\cite{Butenhof97, C++14, MS:VisualC++, BoostCoroutines15}.
      For example, Boost implements coroutines in terms of four functor object-types:
      \begin{cfa}
     
      symmetric_coroutine<>::yield_type
      \end{cfa}
    - Similarly, the canonical threading paradigm is often based on routine pointers, \eg @pthread@~\cite{pthreads}, \Csharp~\cite{Csharp}, Go~\cite{Go}, and Scala~\cite{Scala}.
    + Similarly, the canonical threading paradigm is often based on function pointers, \eg @pthread@~\cite{pthreads}, \Csharp~\cite{Csharp}, Go~\cite{Go}, and Scala~\cite{Scala}.
      However, the generic thread-handle (identifier) is limited (few operations), unless it is wrapped in a custom type.
      \begin{cfa}
     
      \end{cfa}
      Since the custom type is simple to write in \CFA and solves several issues, added support for routine/lambda-based coroutines adds very little.
    - 
    - Note, the type @coroutine_t@ must be an abstract handle to the coroutine, because the coroutine descriptor and its stack are non-copyable.
    - Copying the coroutine descriptor results in copies being out of date with the current state of the stack.
    - Correspondingly, copying the stack results is copies being out of date with coroutine descriptor, and pointers in the stack being out of date to data on the stack.
    - (There is no mechanism in C to find all stack-specific pointers and update them as part of a copy.)

      The selected approach is to use language support by introducing a new kind of aggregate (structure):
     
      The reserved keyword eases use for the common cases.

    - Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait is used to restrict coroutine-manipulation routines:
    - \begin{cfa}
    - trait is_coroutine( `dtype` T ) {
    - 	void main( T & );
    - 	coroutine_desc * get_coroutine( T & );
    + Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait is used to restrict coroutine-manipulation functions:
    + \begin{cfa}
    + trait is_coroutine( dtype T ) {
    + 	void main( T & this );
    + 	coroutine_desc * get_coroutine( T & this );
      };
    - forall( `dtype` T | is_coroutine(T) ) void suspend( T & );
    - forall( `dtype` T | is_coroutine(T) ) void resume( T & );
    - \end{cfa}
    - The @dtype@ property of the trait ensures the coroutine descriptor is non-copyable, so all coroutines must be passed by reference (pointer).
    - The routine definitions ensures there is a statically-typed @main@ routine that is the starting point (first stack frame) of a coroutine, and a mechanism to get (read) the currently executing coroutine handle.
    - The @main@ routine has no return value or additional parameters because the coroutine type allows an arbitrary number of interface routines with corresponding arbitrary typed input/output values versus fixed ones.
    - The generic routines @suspend@ and @resume@ can be redefined, but any object passed to them is a coroutine since it must satisfy the @is_coroutine@ trait to compile.
    - The advantage of this approach is that users can easily create different types of coroutines, for example, changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine, and possibly redefining @suspend@ and @resume@.
    + forall( dtype T | is_coroutine(T) ) void get_coroutine( T & );
    + forall( dtype T | is_coroutine(T) ) void suspend( T & );
    + forall( dtype T | is_coroutine(T) ) void resume( T & );
    + \end{cfa}
    + This definition ensures there is a statically-typed @main@ function that is the starting point (first stack frame) of a coroutine.
    + No return value or additional parameters are necessary for this function, because the coroutine type allows an arbitrary number of interface functions with corresponding arbitrary typed input/output values.
    + As well, any object passed to @suspend@ and @resume@ is a coroutine since it must satisfy the @is_coroutine@ trait to compile.
    + The advantage of this approach is that users can easily create different types of coroutines, for example, changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine.
      The \CFA keyword @coroutine@ implicitly implements the getter and forward declarations required for implementing the coroutine main:
      \begin{cquote}
     
      Like coroutines and for the same design reasons, the selected approach for user threads is to use language support by introducing a new kind of aggregate (structure) and a \CFA trait:
      \begin{cquote}
    - \begin{tabular}{@{}c@{\hspace{3\parindentlnth}}c@{}}
    + \begin{tabular}{@{}c@{\hspace{2\parindentlnth}}c@{}}
      \begin{cfa}
      thread myThread {
     
      &
      \begin{cfa}
    - trait is_thread( `dtype` T ) {
    - 	void main( T & );
    - 	thread_desc * get_thread( T & );
    - 	void ^?{}( T & `mutex` );
    + trait is_thread( dtype T ) {
    + 	void main( T & this );
    + 	thread_desc * get_thread( T & this );
    + 	void ^?{}( T & `mutex` this );
      };
      \end{cfa}
     
      \end{cquote}
      (The qualifier @mutex@ for the destructor parameter is discussed in Section~\ref{s:Monitors}.)
    - Like a coroutine, the statically-typed @main@ routine is the starting point (first stack frame) of a user thread.
    + Like a coroutine, the statically-typed @main@ function is the starting point (first stack frame) of a user thread.
      The difference is that a coroutine borrows a thread from its caller, so the first thread resuming a coroutine creates an instance of @main@;
      whereas, a user thread receives its own thread from the runtime system, which starts in @main@ at some point after the thread constructor is run.\footnote{
    - The \lstinline@main@ routine is already a special routine in C (where the program begins), so it is a natural extension of the semantics to use overloading to declare mains for different coroutines/threads (the normal main being the main of the initial thread).}
    - No return value or additional parameters are necessary for this routine because the task type allows an arbitrary number of interface routines with corresponding arbitrary typed input/output values.
    + The \lstinline@main@ function is already a special function in C (where the program begins), so it is a natural extension of the semantics to use overloading to declare mains for different coroutines/threads (the normal main being the main of the initial thread).}
    + No return value or additional parameters are necessary for this function, because the task type allows an arbitrary number of interface functions with corresponding arbitrary typed input/output values.

      \begin{comment} % put in appendix with coroutine version ???
     

      In this example, threads of type @foo@ start execution in the @void main(foo &)@ routine, which prints @"Hello World!"@. While this paper encourages this approach to enforce strongly typed programming, users may prefer to use the routine-based thread semantics for the sake of simplicity.
    - With the static semantics it is trivial to write a thread type that takes a routine pointer as a parameter and executes it on its stack asynchronously.
    - \begin{cfa}
    - typedef void (*voidRtn)(int);
    - 
    - thread RtnRunner {
    - 	voidRtn func;
    + With the static semantics it is trivial to write a thread type that takes a function pointer as a parameter and executes it on its stack asynchronously.
    + \begin{cfa}
    + typedef void (*voidFunc)(int);
    + 
    + thread FuncRunner {
    + 	voidFunc func;
      	int arg;
      };
      
    - void ?{}(RtnRunner & this, voidRtn inRtn, int arg) {
    - 	this.func = inRtn;
    + void ?{}(FuncRunner & this, voidFunc inFunc, int arg) {
    + 	this.func = inFunc;
      	this.arg  = arg;
      }
      
    - void main(RtnRunner & this) {
    - 	// thread starts here and runs the routine
    + void main(FuncRunner & this) {
    + 	// thread starts here and runs the function
      	this.func( this.arg );
      }
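(Editor's sketch of the `FuncRunner` pattern in \CC terms: the stored function is run synchronously here, whereas a real thread type would invoke it on its own stack. All names, including the `record`/`last` sink, are invented for the illustration.)

```cpp
#include <cassert>

typedef void (*voidFunc)(int);

int last = 0;                  // hypothetical sink, for demonstration only
void record(int x) { last = x; }

// Wrapper holding a function pointer and its argument, mirroring FuncRunner.
struct FuncRunner {
    voidFunc func;
    int arg;
    FuncRunner(voidFunc f, int a) : func(f), arg(a) {}
    void run() { func(arg); }  // a thread's main would call this on its own stack
};
```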
     

      int main() {
    - 	RtnRunner f = {hello, 42};
    + 	FuncRunner f = {hello, 42};
      	return 0;
      }
     


    - \section{Mutual Exclusion / Synchronization}
    + \section{Synchronization / Mutual Exclusion}

      Uncontrolled non-deterministic execution is meaningless.
    - To reestablish meaningful execution requires mechanisms to reintroduce determinism (\ie restrict non-determinism), called mutual exclusion and synchronization, where mutual exclusion is an access-control mechanism on data shared by threads, and synchronization is a timing relationship among threads~\cite[\S~4]{Buhr05a}.
    - Since many deterministic challenges appear with the use of mutable shared state, some languages/libraries disallow it, \eg Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka~\cite{Akka} (Scala).
    + To reestablish meaningful execution requires mechanisms to reintroduce determinism (control non-determinism), called synchronization and mutual exclusion, where synchronization is a timing relationship among threads and mutual exclusion is an access-control mechanism on data shared by threads.
    + Since many deterministic challenges appear with the use of mutable shared state, some languages/libraries disallow it (Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka~\cite{Akka} (Scala)).
      In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely related to networking concepts (\eg channels~\cite{CSP,Go}).
    - However, in call/return-based languages, these approaches force a clear distinction (\ie introduce a new programming paradigm) between regular and concurrent computation (\ie routine call versus message passing).
    - Hence, a programmer must learn and manipulate two sets of design patterns.
    + However, in call/return-based languages, these approaches force a clear distinction (\ie introduce a new programming paradigm) between non-concurrent and concurrent computation (\ie function call versus message passing).
    + This distinction means a programmer needs to learn two sets of design patterns.
      While this distinction can be hidden away in library code, effective use of the library still has to take both paradigms into account.
      In contrast, approaches based on stateful models more closely resemble the standard call/return programming-model, resulting in a single programming paradigm.

    - At the lowest level, concurrent control is implemented by atomic operations, upon which different kinds of locks mechanism are constructed, \eg semaphores~\cite{Dijkstra68b}, barriers, and path expressions~\cite{Campbell74}.
    + At the lowest level, concurrent control is implemented as atomic operations, upon which different kinds of locking mechanisms are constructed, \eg semaphores~\cite{Dijkstra68b} and path expressions~\cite{Campbell74}.
      However, for productivity it is always desirable to use the highest-level construct that provides the necessary efficiency~\cite{Hochstein05}.
    - A newer approach for restricting non-determinism is transactional memory~\cite{Herlihy93}.
    + A newer approach is transactional memory~\cite{Herlihy93}.
      While this approach is pursued in hardware~\cite{Nakaike15} and system languages, like \CC~\cite{Cpp-Transactions}, the performance and feature set is still too restrictive to be the main concurrency paradigm for system languages, which is why it was rejected as the core paradigm for concurrency in \CFA.

    - One of the most natural, elegant, and efficient mechanisms for mutual exclusion and synchronization for shared-memory systems is the \emph{monitor}.
    - First proposed by Brinch Hansen~\cite{Hansen73} and later described and extended by C.A.R.~Hoare~\cite{Hoare74}, many concurrent programming-languages provide monitors as an explicit language construct: \eg Concurrent Pascal~\cite{ConcurrentPascal}, Mesa~\cite{Mesa}, Modula~\cite{Modula-2}, Turing~\cite{Turing:old}, Modula-3~\cite{Modula-3}, NeWS~\cite{NeWS}, Emerald~\cite{Emerald}, \uC~\cite{Buhr92a} and Java~\cite{Java}.
    - In addition, operating-system kernels and device drivers have a monitor-like structure, although they often use lower-level primitives such as mutex locks or semaphores to simulate monitors.
    - For these reasons, \CFA selected monitors as the core high-level concurrency-construct, upon which higher-level approaches can be easily constructed.
    + One of the most natural, elegant, and efficient mechanisms for synchronization and mutual exclusion for shared-memory systems is the \emph{monitor}.
    + Monitors were first proposed by Brinch Hansen~\cite{Hansen73} and later described and extended by C.A.R.~Hoare~\cite{Hoare74}.
    + Many programming languages -- \eg Concurrent Pascal~\cite{ConcurrentPascal}, Mesa~\cite{Mesa}, Modula~\cite{Modula-2}, Turing~\cite{Turing:old}, Modula-3~\cite{Modula-3}, NeWS~\cite{NeWS}, Emerald~\cite{Emerald}, \uC~\cite{Buhr92a} and Java~\cite{Java} -- provide monitors as explicit language constructs.
    + In addition, operating-system kernels and device drivers have a monitor-like structure, although they often use lower-level primitives such as semaphores or locks to simulate monitors.
    + For these reasons, this project proposes monitors as the core concurrency construct, upon which even higher-level approaches can be easily constructed.

    12381234
     

      A group of instructions manipulating a specific instance of shared data that must be performed atomically is called an (individual) \newterm{critical-section}~\cite{Dijkstra65}.
    - The generalization is called a \newterm{group critical-section}~\cite{Joung00}, where multiple tasks with the same session may use the resource simultaneously, but different sessions may not use the resource simultaneously.
    + A generalization is a \newterm{group critical-section}~\cite{Joung00}, where multiple tasks with the same session may use the resource simultaneously, but different sessions may not use the resource simultaneously.
      The readers/writer problem~\cite{Courtois71} is an instance of a group critical-section, where readers have the same session and all writers have a unique session.
    - \newterm{Mutual exclusion} enforces that the correct kind and number of threads are using a critical section.
    + \newterm{Mutual exclusion} enforces that the correct number of threads is using a critical section at the same time.

      However, many solutions exist for mutual exclusion, which vary in terms of performance, flexibility and ease of use.
      Methods range from low-level locks, which are fast and flexible but require significant attention for correctness, to higher-level concurrency techniques, which sacrifice some performance to improve ease of use.
      Ease of use comes by either guaranteeing some problems cannot occur (\eg deadlock free), or by offering a more explicit coupling between shared data and critical section.
    - For example, the \CC @std::atomic<T>@ offers an easy way to express mutual-exclusion on a restricted set of operations (\eg reading/writing) for numerical types.
    - However, a significant challenge with locks is composability because it takes careful organization for multiple locks to be used while preventing deadlock.
    - Easing composability is another feature higher-level mutual-exclusion mechanisms can offer.
    + For example, the \CC @std::atomic<T>@ offers an easy way to express mutual-exclusion on a restricted set of operations (\eg reading/writing large types atomically).
    + However, a significant challenge with (low-level) locks is composability because it takes careful organization for multiple locks to be used while preventing deadlock.
    + Easing composability is another feature higher-level mutual-exclusion mechanisms offer.

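(Editor's sketch of the `std::atomic` point above, run single-threaded here; the same calls are race-free under concurrent use.)

```cpp
#include <atomic>
#include <cassert>

// std::atomic restricts mutual exclusion to a fixed operation set
// (load, store, fetch_add, ...) without an explicit lock.
std::atomic<int> cnt{0};

int bump() { return cnt.fetch_add(1) + 1; }  // atomic read-modify-write
```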
    12531249
     

      Synchronization enforces relative ordering of execution, and synchronization tools provide numerous mechanisms to establish these timing relationships.
    - Low-level synchronization primitives offer good performance and flexibility at the cost of ease of use;
    - higher-level mechanisms often simplify usage by adding better coupling between synchronization and data (\eg message passing), or offering a simpler solution to otherwise involved challenges, \eg barrier lock.
    - Often synchronization is used to order access to a critical section, \eg ensuring a reader thread is the next kind of thread to enter a critical section.
    - If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, that reader has \newterm{barged}.
    - Barging can result in staleness/freshness problems, where a reader barges ahead of a writer and reads temporally stale data, or a writer barges ahead of another writer overwriting data with a fresh value preventing the previous value from ever being read (lost computation).
    + Low-level synchronization primitives offer good performance and flexibility at the cost of ease of use.
    + Higher-level mechanisms often simplify usage by adding better coupling between synchronization and data (\eg message passing), or offering a simpler solution to otherwise involved challenges, \eg barrier lock.
    + As mentioned above, synchronization can be expressed as guaranteeing that event \textit{X} always happens before \textit{Y}.
    + Often synchronization is used to order access to a critical section, \eg ensuring the next kind of thread to enter a critical section is a reader thread.
    + If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, the reader has \newterm{barged}.
    + Barging can result in staleness/freshness problems, where a reader barges ahead of a writer and reads temporally stale data, or a writer barges ahead of another writer, overwriting data with a fresh value and preventing the previous value from having an opportunity to be read.
      Preventing or detecting barging is an involved challenge with low-level locks, which can be made much easier by higher-level constructs.
    - This challenge is often split into two different approaches: barging avoidance and barging prevention.
    - Algorithms that allow a barger, but divert it until later using current synchronization state (flags), are avoiding the barger;
    - algorithms that preclude a barger from entering during synchronization in the critical section prevent barging completely.
    - Techniques like baton-pass locks~\cite{Andrews89} between threads instead of unconditionally releasing locks is an example of barging prevention.
    + This challenge is often split into two different approaches: barging avoidance and barging prevention.
    + Algorithms that allow a barger but divert it until later are avoiding the barger, while algorithms that preclude a barger from entering during synchronization in the critical section prevent the barger completely.
    + Baton-pass locks~\cite{Andrews89}, which pass a lock between threads instead of releasing it, are an example of barging prevention.

    12681264
     
    12701266\label{s:Monitors}
    12711267
    1272 A \textbf{monitor} is a set of routines that ensure mutual exclusion when accessing shared state.
    1273 More precisely, a monitor is a programming technique that binds mutual exclusion to routine scope, as opposed to locks, where mutual-exclusion is defined by acquire/release calls, independent of lexical context (analogous to block and heap storage allocation).
    1274 The strong association with the call/return paradigm eases programmability, readability and maintainability, at a slight cost in flexibility and efficiency.
    1275 
    1276 Note, like coroutines/threads, both locks and monitors require an abstract handle to reference them, because at their core, both mechanisms are manipulating non-copyable shared state.
    1277 Copying a lock is insecure because it is possible to copy an open lock and then use the open copy when the original lock is closed to simultaneously access the shared data.
    1278 Copying a monitor is secure because both the lock and shared data are copies, but copying the shared data is meaningless because it no longer represents a unique entity.
    1279 As for coroutines/tasks, a non-copyable (@dtype@) trait is used to capture this requirement, so all locks/monitors must be passed by reference (pointer).
    1280 \begin{cfa}
    1281 trait is_monitor( `dtype` T ) {
     1268A \textbf{monitor} is a set of routines that ensure mutual-exclusion when accessing shared state.
     1269More precisely, a monitor is a programming technique that associates mutual-exclusion to routine scopes, as opposed to mutex locks, where mutual-exclusion is defined by lock/release calls independently of any scoping of the calling routine.
     1270This strong association eases readability and maintainability, at the cost of flexibility.
     1271Note that both monitors and mutex locks, require an abstract handle to identify them.
     1272This concept is generally associated with object-oriented languages like Java~\cite{Java} or \uC~\cite{uC++book} but does not strictly require OO semantics.
     1273The only requirement is the ability to declare a handle to a shared object and a set of routines that act on it:
     1274\begin{cfa}
     1275typedef /*some monitor type*/ monitor;
     1276int f(monitor & m);
     1277
     1278int main() {
     1279        monitor m;  // Handle m
     1280        f(m);       // Routine using handle
     1281}
     1282\end{cfa}
     1283
     1284% ======================================================================
     1285% ======================================================================
     1286\subsection{Call Semantics} \label{call}
     1287% ======================================================================
     1288% ======================================================================
     1289The above monitor example displays some of the intrinsic characteristics.
     1290First, it is necessary to use pass-by-reference over pass-by-value for monitor routines.
     1291This semantics is important, because at their core, monitors are implicit mutual-exclusion objects (locks), and these objects cannot be copied.
     1292Therefore, monitors are non-copy-able objects (@dtype@).
     1293
     1294Another aspect to consider is when a monitor acquires its mutual exclusion.
     1295For example, a monitor may need to be passed through multiple helper routines that do not acquire the monitor mutual-exclusion on entry.
     1296Passthrough can occur for generic helper routines (@swap@, @sort@, \etc) or specific helper routines like the following to implement an atomic counter:
     1297
     1298\begin{cfa}
     1299monitor counter_t { /*...see section $\ref{data}$...*/ };
     1300
     1301void ?{}(counter_t & nomutex this); // constructor
     1302size_t ++?(counter_t & mutex this); // increment
     1303
     1304// need for mutex is platform dependent
     1305void ?{}(size_t * this, counter_t & mutex cnt); // conversion
     1306\end{cfa}
     1307This counter is used as follows:
     1308\begin{center}
     1309\begin{tabular}{c @{\hskip 0.35in} c @{\hskip 0.35in} c}
     1310\begin{cfa}
     1311// shared counter
     1312counter_t cnt1, cnt2;
     1313
     1314// multiple threads access counter
     1315thread 1 : cnt1++; cnt2++;
     1316thread 2 : cnt1++; cnt2++;
     1317thread 3 : cnt1++; cnt2++;
     1318        ...
     1319thread N : cnt1++; cnt2++;
     1320\end{cfa}
     1321\end{tabular}
     1322\end{center}
     1323Notice how the counter is used without any explicit synchronization and yet supports thread-safe semantics for both reading and writing, which is similar in usage to the \CC template @std::atomic@.
     1324
     1325Here, the constructor (@?{}@) uses the @nomutex@ keyword to signify that it does not acquire the monitor mutual-exclusion when constructing.
      1326This semantics is sound because an object not yet constructed should never be shared and therefore does not require mutual exclusion.
     1327Furthermore, it allows the implementation greater freedom when it initializes the monitor locking.
     1328The prefix increment operator uses @mutex@ to protect the incrementing process from race conditions.
     1329Finally, there is a conversion operator from @counter_t@ to @size_t@.
     1330This conversion may or may not require the @mutex@ keyword depending on whether or not reading a @size_t@ is an atomic operation.
     1331
     1332For maximum usability, monitors use \textbf{multi-acq} semantics, which means a single thread can acquire the same monitor multiple times without deadlock.
     1333For example, listing \ref{fig:search} uses recursion and \textbf{multi-acq} to print values inside a binary tree.
     1334\begin{figure}
     1335\begin{cfa}[caption={Recursive printing algorithm using \textbf{multi-acq}.},label={fig:search}]
     1336monitor printer { ... };
     1337struct tree {
      1338        tree * left, * right;
     1339        char * value;
     1340};
     1341void print(printer & mutex p, char * v);
     1342
     1343void print(printer & mutex p, tree * t) {
     1344        print(p, t->value);
     1345        print(p, t->left );
     1346        print(p, t->right);
     1347}
     1348\end{cfa}
     1349\end{figure}
     1350
     1351Having both @mutex@ and @nomutex@ keywords can be redundant, depending on the meaning of a routine having neither of these keywords.
     1352For example, it is reasonable that it should default to the safest option (@mutex@) when given a routine without qualifiers @void foo(counter_t & this)@, whereas assuming @nomutex@ is unsafe and may cause subtle errors.
      1353On the other hand, @nomutex@ is the ``normal'' parameter behaviour; it effectively states that ``this routine is not special''.
     1354Another alternative is making exactly one of these keywords mandatory, which provides the same semantics but without the ambiguity of supporting routines with neither keyword.
      1355Mandatory keywords would also have the added benefit of being self-documenting but at the cost of extra typing.
     1356While there are several benefits to mandatory keywords, they do bring a few challenges.
      1357Mandatory keywords in \CFA would imply that the compiler must know without doubt whether a parameter is a monitor.
     1358Since \CFA relies heavily on traits as an abstraction mechanism, the distinction between a type that is a monitor and a type that looks like a monitor can become blurred.
     1359For this reason, \CFA only has the @mutex@ keyword and uses no keyword to mean @nomutex@.
     1360
     1361The next semantic decision is to establish when @mutex@ may be used as a type qualifier.
     1362Consider the following declarations:
     1363\begin{cfa}
     1364int f1(monitor & mutex m);
     1365int f2(const monitor & mutex m);
     1366int f3(monitor ** mutex m);
     1367int f4(monitor * mutex m []);
     1368int f5(graph(monitor *) & mutex m);
     1369\end{cfa}
     1370The problem is to identify which object(s) should be acquired.
     1371Furthermore, each object needs to be acquired only once.
     1372In the case of simple routines like @f1@ and @f2@ it is easy to identify an exhaustive list of objects to acquire on entry.
     1373Adding indirections (@f3@) still allows the compiler and programmer to identify which object is acquired.
     1374However, adding in arrays (@f4@) makes it much harder.
      1375Array lengths are not necessarily known in C, and even then, making sure objects are acquired only once becomes non-trivial.
     1376This problem can be extended to absurd limits like @f5@, which uses a graph of monitors.
     1377To make the issue tractable, this project imposes the requirement that a routine may only acquire one monitor per parameter and it must be the type of the parameter with at most one level of indirection (ignoring potential qualifiers).
     1378Also note that while routine @f3@ can be supported, meaning that monitor @**m@ is acquired, passing an array to this routine would be type-safe and yet result in undefined behaviour because only the first element of the array is acquired.
      1379However, this ambiguity is part of the C type-system with respect to arrays.
     1380For this reason, @mutex@ is disallowed in the context where arrays may be passed:
     1381\begin{cfa}
     1382int f1(monitor & mutex m);    // Okay : recommended case
     1383int f2(monitor * mutex m);    // Not Okay : Could be an array
     1384int f3(monitor mutex m []);  // Not Okay : Array of unknown length
     1385int f4(monitor ** mutex m);   // Not Okay : Could be an array
     1386int f5(monitor * mutex m []); // Not Okay : Array of unknown length
     1387\end{cfa}
      1388Note that not all of these routines are actually distinct in the type system: @f2@ and @f3@ have the same type, as do @f4@ and @f5@.
      1389However, even if the code generation could tell the difference, the extra information is still not sufficient to meaningfully extend the monitor call semantics.
     1390
     1391Unlike object-oriented monitors, where calling a mutex member \emph{implicitly} acquires mutual-exclusion of the receiver object, \CFA uses an explicit mechanism to specify the object that acquires mutual-exclusion.
     1392A consequence of this approach is that it extends naturally to multi-monitor calls.
     1393\begin{cfa}
     1394int f(MonitorA & mutex a, MonitorB & mutex b);
     1395
     1396MonitorA a;
     1397MonitorB b;
     1398f(a,b);
     1399\end{cfa}
     1400While OO monitors could be extended with a mutex qualifier for multiple-monitor calls, no example of this feature could be found.
     1401The capability to acquire multiple locks before entering a critical section is called \emph{\textbf{bulk-acq}}.
     1402In practice, writing multi-locking routines that do not lead to deadlocks is tricky.
     1403Having language support for such a feature is therefore a significant asset for \CFA.
     1404In the case presented above, \CFA guarantees that the order of acquisition is consistent across calls to different routines using the same monitors as arguments.
     1405This consistent ordering means acquiring multiple monitors is safe from deadlock when using \textbf{bulk-acq}.
     1406However, users can still force the acquiring order.
     1407For example, notice which routines use @mutex@/@nomutex@ and how this affects acquiring order:
     1408\begin{cfa}
     1409void foo(A& mutex a, B& mutex b) { // acquire a & b
     1410        ...
     1411}
     1412
     1413void bar(A& mutex a, B& /*nomutex*/ b) { // acquire a
     1414        ... foo(a, b); ... // acquire b
     1415}
     1416
     1417void baz(A& /*nomutex*/ a, B& mutex b) { // acquire b
     1418        ... foo(a, b); ... // acquire a
     1419}
     1420\end{cfa}
      1421The \textbf{multi-acq} semantics allows a monitor lock to be acquired by either @bar@ or @baz@ and acquired again in @foo@.
     1422In the calls to @bar@ and @baz@ the monitors are acquired in opposite order.
     1423
     1424However, such use leads to lock acquiring order problems.
      1425In the example above, the programmer relies on implicit ordering in the case of routine @foo@ but on explicit ordering in the case of @bar@ and @baz@.
     1426This subtle difference means that calling these routines concurrently may lead to deadlock and is therefore undefined behaviour.
     1427As shown~\cite{Lister77}, solving this problem requires:
     1428\begin{enumerate}
     1429        \item Dynamically tracking the monitor-call order.
      1430        \item Implementing rollback semantics.
     1431\end{enumerate}
     1432While the first requirement is already a significant constraint on the system, implementing a general rollback semantics in a C-like language is still prohibitively complex~\cite{Dice10}.
      1433In \CFA, users simply need to be careful when acquiring multiple monitors at the same time, or use \textbf{bulk-acq} to acquire all the monitors at once.
     1434While \CFA provides only a partial solution, most systems provide no solution and the \CFA partial solution handles many useful cases.
     1435
     1436For example, \textbf{multi-acq} and \textbf{bulk-acq} can be used together in interesting ways:
     1437\begin{cfa}
     1438monitor bank { ... };
     1439
     1440void deposit( bank & mutex b, int deposit );
     1441
     1442void transfer( bank & mutex mybank, bank & mutex yourbank, int me2you) {
     1443        deposit( mybank, -me2you );
     1444        deposit( yourbank, me2you );
     1445}
     1446\end{cfa}
     1447This example shows a trivial solution to the bank-account transfer problem~\cite{BankTransfer}.
     1448Without \textbf{multi-acq} and \textbf{bulk-acq}, the solution to this problem is much more involved and requires careful engineering.
     1449
     1450
     1451\subsection{\protect\lstinline|mutex| statement} \label{mutex-stmt}
     1452
      1453The call semantics discussed above have one software engineering issue: only a routine can acquire the mutual-exclusion of a set of monitors. \CFA offers the @mutex@ statement to work around the need for unnecessary names, avoiding a major software engineering problem~\cite{2FTwoHardThings}.
      1454Table \ref{f:mutex-stmt} shows an example of the @mutex@ statement, which introduces a new scope in which the mutual-exclusion of a set of monitors is acquired.
     1455Beyond naming, the @mutex@ statement has no semantic difference from a routine call with @mutex@ parameters.
     1456
     1457\begin{table}
     1458\begin{center}
     1459\begin{tabular}{|c|c|}
     1460function call & @mutex@ statement \\
     1461\hline
     1462\begin{cfa}[tabsize=3]
     1463monitor M {};
     1464void foo( M & mutex m1, M & mutex m2 ) {
     1465        // critical section
     1466}
     1467
     1468void bar( M & m1, M & m2 ) {
     1469        foo( m1, m2 );
     1470}
     1471\end{cfa}&\begin{cfa}[tabsize=3]
     1472monitor M {};
     1473void bar( M & m1, M & m2 ) {
     1474        mutex(m1, m2) {
     1475                // critical section
     1476        }
     1477}
     1478
     1479
     1480\end{cfa}
     1481\end{tabular}
     1482\end{center}
     1483\caption{Regular call semantics vs. \protect\lstinline|mutex| statement}
     1484\label{f:mutex-stmt}
     1485\end{table}
     1486
     1487% ======================================================================
     1488% ======================================================================
     1489\subsection{Data semantics} \label{data}
     1490% ======================================================================
     1491% ======================================================================
     1492Once the call semantics are established, the next step is to establish data semantics.
     1493Indeed, until now a monitor is used simply as a generic handle but in most cases monitors contain shared data.
     1494This data should be intrinsic to the monitor declaration to prevent any accidental use of data without its appropriate protection.
     1495For example, here is a complete version of the counter shown in section \ref{call}:
     1496\begin{cfa}
     1497monitor counter_t {
     1498        int value;
     1499};
     1500
     1501void ?{}(counter_t & this) {
      1502        this.value = 0;
     1503}
     1504
      1505int ++?(counter_t & mutex this) {
     1506        return ++this.value;
     1507}
     1508
     1509// need for mutex is platform dependent here
     1510void ?{}(int * this, counter_t & mutex cnt) {
     1511        *this = (int)cnt;
     1512}
     1513\end{cfa}
     1514
     1515Like threads and coroutines, monitors are defined in terms of traits with some additional language support in the form of the @monitor@ keyword.
     1516The monitor trait is:
     1517\begin{cfa}
     1518trait is_monitor(dtype T) {
    12821519        monitor_desc * get_monitor( T & );
    12831520        void ^?{}( T & mutex );
    12841521};
    12851522\end{cfa}
    1286 
    1287 
    1288 \subsection{Mutex Acquisition}
    1289 \label{s:MutexAcquisition}
    1290 
    1291 While correctness implicitly implies a monitor's mutual exclusion is acquired and released, there are implementation options about when and where the locking/unlocking occurs.
    1292 (Much of this discussion also applies to basic locks.)
    1293 For example, a monitor may need to be passed through multiple helper routines before it becomes necessary to acquire the monitor mutual-exclusion.
    1294 \begin{cfa}[morekeywords=nomutex]
    1295 monitor Aint { int cnt; };                                      $\C{// atomic integer counter}$
    1296 void ?{}( Aint & `nomutex` this ) with( this ) { cnt = 0; } $\C{// constructor}$
     1297 int ?=?( Aint & `mutex`$\(_{opt}\)$ lhs, int rhs ) with( lhs ) { return cnt = rhs; } $\C{// conversions}$
    1298 void ?{}( int & this, Aint & `mutex`$\(_{opt}\)$ v ) { this = v.cnt; }
     1299 int ?=?( int & lhs, Aint & `mutex`$\(_{opt}\)$ rhs ) with( rhs ) { return lhs = cnt; }
    1300 int ++?( Aint & `mutex`$\(_{opt}\)$ this ) with( this ) { return ++cnt; } $\C{// increment}$
    1301 \end{cfa}
    1302 The @Aint@ constructor, @?{}@, uses the \lstinline[morekeywords=nomutex]@nomutex@ qualifier indicating mutual exclusion is unnecessary during construction because an object is inaccessible (private) until after it is initialized.
    1303 (While a constructor may publish its address into a global variable, doing so generates a race-condition.)
    1304 The conversion operators for initializing and assigning with a normal integer only need @mutex@, if reading/writing the implementation type is not atomic.
     1305 Finally, the prefix increment operator, @++?@, is normally @mutex@ to protect the incrementing from race conditions, unless there is an atomic increment instruction for the implementation type.
    1306 
    1307 The atomic counter is used without any explicit mutual-exclusion and provides thread-safe semantics, which is similar to the \CC template @std::atomic@.
    1308 \begin{cfa}
    1309 Aint x, y, z;
    1310 ++x; ++y; ++z;                                                          $\C{// safe increment by multiple threads}$
    1311 x = 2; y = 2; z = 2;                                            $\C{// conversions}$
    1312 int i = x, j = y, k = z;
    1313 i = x; j = y; k = z;
    1314 \end{cfa}
    1315 
     1316 For maximum usability, monitors have \newterm{multi-acquire} semantics allowing a thread to acquire the same monitor multiple times without deadlock.
    1317 For example, atomically printing the contents of a binary tree:
    1318 \begin{cfa}
    1319 monitor Tree {
     1320         Tree * left, * right;
    1321         // value
    1322 };
    1323 void print( Tree & mutex tree ) {                       $\C{// prefix traversal}$
    1324         // write value
     1325         print( *tree.left );                                    $\C{// multiply acquire monitor lock on each recursion}$
     1326         print( *tree.right );
    1327 }
    1328 \end{cfa}
    1329 
     1330 Mandatory monitor qualifiers have the benefit of being self-documenting, but requiring both @mutex@ and \lstinline[morekeywords=nomutex]@nomutex@ for all monitor parameters is redundant.
     1331 Instead, one of the two qualifier semantics can be the default, and the other required.
    1332 For example, assume the safe @mutex@ option for a monitor parameter because assuming \lstinline[morekeywords=nomutex]@nomutex@ may cause subtle errors.
    1333 On the other hand, assuming \lstinline[morekeywords=nomutex]@nomutex@ is the \emph{normal} parameter behaviour, stating explicitly ``this parameter is not special''.
    1334 Providing a default qualifier implies knowing whether a parameter is a monitor.
    1335 Since \CFA relies heavily on traits as an abstraction mechanism, the distinction between a type that is a monitor and a type that looks like a monitor can become blurred.
    1336 For this reason, \CFA requires programmers to identify the kind of parameter with the @mutex@ keyword and uses no keyword to mean \lstinline[morekeywords=nomutex]@nomutex@.
    1337 
    1338 The next semantic decision is establishing which parameter \emph{types} may be qualified with @mutex@.
    1339 Given:
    1340 \begin{cfa}
    1341 monitor M { ... }
    1342 int f1( M & mutex m );
    1343 int f2( M * mutex m );
    1344 int f3( M * mutex m[] );
    1345 int f4( stack( M * ) & mutex m );
    1346 \end{cfa}
    1347 the issue is that some of these parameter types are composed of multiple objects.
    1348 For @f1@, there is only a single parameter object.
    1349 Adding indirection in @f2@ still identifies a single object.
     1350 However, the array in @f3@ introduces multiple objects.
     1351 While multiple acquisition is possible, as shown shortly,
     1352 array lengths are often unknown in C.
    1353 This issue is exacerbated in @f4@, where the data structure must be safely traversed to acquire all of its elements.
    1354 
    1355 To make the issue tractable, \CFA only acquires one monitor per parameter with at most one level of indirection.
     1356 However, the C type-system has an ambiguity with respect to arrays.
    1357 Is the argument for @f2@ a single object or an array of objects?
    1358 If it is an array, only the first element of the array is acquired, which seems unsafe;
    1359 hence, @mutex@ is disallowed for array parameters.
    1360 \begin{cfa}
    1361 int f1( M & mutex m );                                          $\C{// allowed: recommended case}$
    1362 int f2( M * mutex m );                                          $\C{// disallowed: could be an array}$
    1363 int f3( M mutex m[$\,$] );                                      $\C{// disallowed: array length unknown}$
    1364 int f4( M ** mutex m );                                         $\C{// disallowed: could be an array}$
    1365 int f5( M * mutex m[$\,$] );                            $\C{// disallowed: array length unknown}$
    1366 \end{cfa}
    1367 % Note, not all array routines have distinct types: @f2@ and @f3@ have the same type, as do @f4@ and @f5@.
    1368 % However, even if the code generation could tell the difference, the extra information is still not sufficient to extend meaningfully the monitor call semantic.
    1369 
    1370 For object-oriented monitors, calling a mutex member \emph{implicitly} acquires mutual exclusion of the receiver object, @`rec`.foo(...)@.
    1371 \CFA has no receiver, and hence, must use an explicit mechanism to specify which object has mutual exclusion acquired.
    1372 A positive consequence of this design decision is the ability to support multi-monitor routines.
    1373 \begin{cfa}
    1374 int f( M & mutex x, M & mutex y );              $\C{// multiple monitor parameter of any type}$
    1375 M m1, m2;
    1376 f( m1, m2 );
    1377 \end{cfa}
    1378 (While object-oriented monitors can be extended with a mutex qualifier for multiple-monitor members, no prior example of this feature could be found.)
     1379 In practice, writing multi-locking routines that do not deadlock is tricky.
    1380 Having language support for such a feature is therefore a significant asset for \CFA.
    1381 
    1382 The capability to acquire multiple locks before entering a critical section is called \newterm{bulk acquire}.
     1383 In the previous example, \CFA guarantees the order of acquisition is consistent across calls to different routines using the same monitors as arguments.
    1384 This consistent ordering means acquiring multiple monitors is safe from deadlock.
    1385 However, users can force the acquiring order.
    1386 For example, notice the use of @mutex@/\lstinline[morekeywords=nomutex]@nomutex@ and how this affects the acquiring order:
    1387 \begin{cfa}
    1388 void foo( M & mutex m1, M & mutex m2 );         $\C{// acquire m1 and m2}$
    1389 void bar( M & mutex m1, M & /* nomutex */ m2 ) { $\C{// acquire m1}$
    1390         ... foo( m1, m2 ); ...                                  $\C{// acquire m2}$
    1391 }
    1392 void baz( M & /* nomutex */ m1, M & mutex m2 ) { $\C{// acquire m2}$
    1393         ... foo( m1, m2 ); ...                                  $\C{// acquire m1}$
    1394 }
    1395 \end{cfa}
    1396 The multi-acquire semantics allows @bar@ or @baz@ to acquire a monitor lock and reacquire it in @foo@.
    1397 In the calls to @bar@ and @baz@, the monitors are acquired in opposite order.
    1398 
     1399 However, such use leads to lock acquiring-order problems resulting in deadlock~\cite{Lister77}, where detecting it requires dynamic tracking of monitor calls, and dealing with it requires implementing rollback semantics~\cite{Dice10}.
     1400 In \CFA, safety is guaranteed by using bulk acquire of all the monitors needed for an operation, whereas other monitor systems provide no aid.
    1401 While \CFA provides only a partial solution, the \CFA partial solution handles many useful cases.
    1402 \begin{cfa}
    1403 monitor Bank { ... };
    1404 void deposit( Bank & `mutex` b, int deposit );
    1405 void transfer( Bank & `mutex` mybank, Bank & `mutex` yourbank, int me2you) {
    1406         deposit( mybank, `-`me2you );                   $\C{// debit}$
    1407         deposit( yourbank, me2you );                    $\C{// credit}$
    1408 }
    1409 \end{cfa}
    1410 This example shows a trivial solution to the bank-account transfer problem~\cite{BankTransfer}.
    1411 Without multi- and bulk acquire, the solution to this problem requires careful engineering.
    1412 
    1413 
    1414 \subsection{\protect\lstinline|mutex| statement} \label{mutex-stmt}
    1415 
     1416 The monitor call-semantics associate all locking semantics with routines.
    1417 Like Java, \CFA offers an alternative @mutex@ statement to reduce refactoring and naming.
    1418 \begin{cquote}
    1419 \begin{tabular}{@{}c|@{\hspace{\parindentlnth}}c@{}}
    1420 routine call & @mutex@ statement \\
    1421 \begin{cfa}
    1422 monitor M {};
    1423 void foo( M & mutex m1, M & mutex m2 ) {
    1424         // critical section
    1425 }
    1426 void bar( M & m1, M & m2 ) {
    1427         foo( m1, m2 );
    1428 }
    1429 \end{cfa}
    1430 &
    1431 \begin{cfa}
    1432 
    1433 void bar( M & m1, M & m2 ) {
     1434         mutex( m1, m2 ) {       // removes refactoring and naming
    1435                 // critical section
    1436         }
    1437 }
    1438 
    1439 \end{cfa}
    1440 \end{tabular}
    1441 \end{cquote}
    1442 
    1443 
    1444 \section{Internal Scheduling}
    1445 \label{s:InternalScheduling}
    1446 
     1447 While monitor mutual-exclusion provides safe access to shared data, the monitor data may indicate that a thread accessing it cannot proceed, \eg a bounded buffer, Figure~\ref{f:BoundedBuffer}, may be full/empty so producer/consumer threads must block.
    1448 Leaving the monitor and trying again (busy waiting) is impractical for high-level programming.
    1449 Monitors eliminate busy waiting by providing internal synchronization to schedule threads needing access to the shared data, where the synchronization is blocking (threads are parked) versus spinning.
    1450 The synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} is defined as indicating which thread acquires the critical section next.
     1451 \newterm{Internal scheduling} is characterized by each thread entering the monitor and making an individual decision about proceeding or blocking, while \newterm{external scheduling} is characterized by an entering thread making a decision about proceeding for itself and on behalf of other threads attempting entry.
    1452 
    1453 Figure~\ref{f:BBInt} shows a \CFA bounded-buffer with internal scheduling, where producers/consumers enter the monitor, see the buffer is full/empty, and block on an appropriate condition lock, @full@/@empty@.
    1454 The @wait@ routine atomically blocks the calling thread and implicitly releases the monitor lock(s) for all monitors in the routine's parameter list.
    1455 The appropriate condition lock is signalled to unblock an opposite kind of thread after an element is inserted/removed from the buffer.
    1456 Signalling is unconditional, because signalling an empty condition lock does nothing.
    1457 Signalling semantics cannot have the signaller and signalled thread in the monitor simultaneously, which means:
    1458 \begin{enumerate}
    1459 \item
    1460 The signalling thread leaves immediately, and the signalled thread continues.
    1461 \item
    1462 The signalling thread continues and the signalled thread is marked for urgent unblocking at subsequent scheduling points (exit/wait).
    1463 \item
     1464 The signalling thread blocks but is marked for urgent unblocking and the signalled thread continues.
    1465 \end{enumerate}
    1466 The first approach is too restrictive, as it precludes solving a reasonable class of problems (\eg dating service).
    1467 \CFA supports the next two semantics as both are useful.
    1468 Finally, while it is common to store a @condition@ as a field of the monitor, in \CFA, a @condition@ variable can be created/stored independently.
    1469 
    1470 \begin{figure}
    1471 \centering
    1472 \newbox\myboxA
    1473 \begin{lrbox}{\myboxA}
    1474 \begin{lstlisting}[aboveskip=0pt,belowskip=0pt]
    1475 forall( otype T ) { // distribute forall
    1476         monitor Buffer {
    1477                 `condition` full, empty;
    1478                 int front, back, count;
    1479                 T elements[10];
    1480         };
    1481         void ?{}( Buffer(T) & buffer ) with(buffer) {
    1482                 [front, back, count] = 0;
    1483         }
    1484 
    1485         void insert( Buffer(T) & mutex buffer, T elem )
    1486                                 with(buffer) {
    1487                 if ( count == 10 ) `wait( empty )`;
    1488                 // insert elem into buffer
    1489                 `signal( full )`;
    1490         }
    1491         T remove( Buffer(T) & mutex buffer ) with(buffer) {
    1492                 if ( count == 0 ) `wait( full )`;
    1493                 // remove elem from buffer
    1494                 `signal( empty )`;
    1495                 return elem;
    1496         }
    1497 }
    1498 \end{lstlisting}
    1499 \end{lrbox}
    1500 
    1501 \newbox\myboxB
    1502 \begin{lrbox}{\myboxB}
    1503 \begin{lstlisting}[aboveskip=0pt,belowskip=0pt]
    1504 forall( otype T ) { // distribute forall
    1505         monitor Buffer {
    1506 
    1507                 int front, back, count;
    1508                 T elements[10];
    1509         };
    1510         void ?{}( Buffer(T) & buffer ) with(buffer) {
    1511                 [front, back, count] = 0;
    1512         }
    1513         T remove( Buffer(T) & mutex buffer ); // forward
    1514         void insert( Buffer(T) & mutex buffer, T elem )
    1515                                 with(buffer) {
    1516                 if ( count == 10 ) `waitfor( remove, buffer )`;
    1517                 // insert elem into buffer
    1518 
    1519         }
    1520         T remove( Buffer(T) & mutex buffer ) with(buffer) {
    1521                 if ( count == 0 ) `waitfor( insert, buffer )`;
    1522                 // remove elem from buffer
    1523 
    1524                 return elem;
    1525         }
    1526 }
    1527 \end{lstlisting}
    1528 \end{lrbox}
    1529 
    1530 \subfloat[Internal Scheduling]{\label{f:BBInt}\usebox\myboxA}
    1531 %\qquad
    1532 \subfloat[External Scheduling]{\label{f:BBExt}\usebox\myboxB}
    1533 \caption{Generic Bounded-Buffer}
    1534 \label{f:BoundedBuffer}
    1535 \end{figure}
    1536 
    1537 Figure~\ref{f:BBExt} shows a \CFA bounded-buffer with external scheduling, where producers/consumers detecting a full/empty buffer block and prevent more producers/consumers from entering the monitor until the buffer has a free/empty slot.
    1538 External scheduling is controlled by the @waitfor@ statement, which atomically blocks the calling thread, releases the monitor lock, and restricts the routine calls that can next acquire mutual exclusion.
    1539 If the buffer is full, only calls to @remove@ can acquire the buffer, and if the buffer is empty, only calls to @insert@ can acquire the buffer.
    1540 Threads making calls to routines that are currently excluded wait outside (externally) of the monitor on a calling queue.
    1541 
    1542 An important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads?
     1543 If barging is allowed, synchronization between a signaller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
    1544 \CFA scheduling does \emph{not} have barging, which simplifies synchronization among threads in the monitor.
    1545 Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
    1546 
    1547 Indeed, like the bulk acquire semantics, internal scheduling extends to multiple monitors in a way that is natural to the user but requires additional complexity on the implementation side.
     1523Note that the destructor of a monitor must be a @mutex@ routine to prevent deallocation while a thread is accessing the monitor.
      1524As with any object, calls to a monitor, using @mutex@ or otherwise, are undefined behaviour after the destructor has run.
     1525
     1526% ======================================================================
     1527% ======================================================================
     1528\section{Internal Scheduling} \label{intsched}
     1529% ======================================================================
     1530% ======================================================================
     1531In addition to mutual exclusion, the monitors at the core of \CFA's concurrency can also be used to achieve synchronization.
     1532With monitors, this capability is generally achieved with internal or external scheduling as in~\cite{Hoare74}.
     1533With \textbf{scheduling} loosely defined as deciding which thread acquires the critical section next, \textbf{internal scheduling} means making the decision from inside the critical section (\ie with access to the shared state), while \textbf{external scheduling} means making the decision when entering the critical section (\ie without access to the shared state).
     1534Since internal scheduling within a single monitor is mostly a solved problem, this paper concentrates on extending internal scheduling to multiple monitors.
     1535Indeed, like the \textbf{bulk-acq} semantics, internal scheduling extends to multiple monitors in a way that is natural to the user but requires additional complexity on the implementation side.
    15481536
    15491537First, here is a simple example of internal scheduling:
     
    15681556}
    15691557\end{cfa}
     1558There are two details to note here.
     1559First, @signal@ is a delayed operation; it only unblocks the waiting thread when it reaches the end of the critical section.
     1560This semantics is needed to respect mutual-exclusion, \ie the signaller and signalled thread cannot be in the monitor simultaneously.
     1561The alternative is to return immediately after the call to @signal@, which is significantly more restrictive.
     1562Second, in \CFA, while it is common to store a @condition@ as a field of the monitor, a @condition@ variable can be stored/created independently of a monitor.
     1563Here routine @foo@ waits for the @signal@ from @bar@ before making further progress, ensuring a basic ordering.
     1564
     1565An important aspect of the implementation is that \CFA does not allow barging, which means that once function @bar@ releases the monitor, @foo@ is guaranteed to be the next thread to acquire the monitor (unless some other thread waited on the same condition).
     1566This guarantee offers the benefit of not having to loop around waits to recheck that a condition is met.
     1567The main reason \CFA offers this guarantee is that users can easily introduce barging if it becomes a necessity, but adding barging prevention or barging avoidance is more involved without language support.
     1568Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
    15701569
    15711570% ======================================================================
     
    16011600This semantics is a logical requirement for barging prevention.
    16021601
    1603 A direct extension of the previous example is a bulk acquire version:
     1602A direct extension of the previous example is a \textbf{bulk-acq} version:
    16041603\begin{multicols}{2}
    16051604\begin{cfa}
     
    16151614\end{cfa}
    16161615\end{multicols}
    1617 \noindent This version uses bulk acquire (denoted using the {\sf\&} symbol), but the presence of multiple monitors does not add a particularly new meaning.
     1616\noindent This version uses \textbf{bulk-acq} (denoted using the {\sf\&} symbol), but the presence of multiple monitors does not add a particularly new meaning.
    16181617Synchronization happens between the two threads in exactly the same way and order.
    16191618The only difference is that mutual exclusion covers a group of monitors.
     
    16761675% ======================================================================
    16771676
    1678 A larger example is presented to show complex issues for bulk acquire and its implementation options are analyzed.
    1679 Figure~\ref{f:int-bulk-cfa} shows an example where bulk acquire adds a significant layer of complexity to the internal signalling semantics, and listing \ref{f:int-bulk-cfa} shows the corresponding \CFA code to implement the cfa-code in listing \ref{f:int-bulk-cfa}.
     1677A larger example is presented to show the complex issues arising with \textbf{bulk-acq}, and its implementation options are analyzed.
     1678Figure~\ref{f:int-bulk-cfa} shows an example where \textbf{bulk-acq} adds a significant layer of complexity to the internal signalling semantics, and listing \ref{f:int-bulk-cfa} shows the corresponding \CFA code implementing the cfa-code.
    16801679For the purpose of translating the given cfa-code into \CFA-code, any method of introducing a monitor is acceptable, \eg @mutex@ parameters, global variables, pointer parameters, or using locals with the @mutex@ statement.
    16811680
     
    17081707\end{cfa}
    17091708\end{multicols}
    1710 \begin{cfa}[caption={Internal scheduling with bulk acquire},label={f:int-bulk-cfa}]
     1709\begin{cfa}[caption={Internal scheduling with \textbf{bulk-acq}},label={f:int-bulk-cfa}]
    17111710\end{cfa}
    17121711\begin{center}
     
    17741773
    17751774The complexity begins at code sections 4 and 8 in listing \ref{f:int-bulk-cfa}, which are where the existing semantics of internal scheduling needs to be extended for multiple monitors.
    1776 The root of the problem is that bulk acquire is used in a context where one of the monitors is already acquired, which is why it is important to define the behaviour of the previous cfa-code.
     1775The root of the problem is that \textbf{bulk-acq} is used in a context where one of the monitors is already acquired, which is why it is important to define the behaviour of the previous cfa-code.
    17771776When the signaller thread reaches the location where it should ``release @A & B@'' (listing \ref{f:int-bulk-cfa} line \ref{line:releaseFirst}), it must actually transfer ownership of monitor @B@ to the waiting thread.
    17781777This ownership transfer is required in order to prevent barging into @B@ by another thread, since both the signalling and signalled threads still need monitor @A@.
     
    21112110}% subfloat
    21122111\qquad
    2113 \subfloat[bulk acquire Monitor] {
     2112\subfloat[\textbf{bulk-acq} Monitor] {
    21142113\label{fig:BulkMonitor}
    21152114{\resizebox{0.45\textwidth}{!}{\input{ext_monitor}}}
     
    21282127Here, the mutex routine called is associated with a thread on the entry queue while a list of acceptable routines is kept separate.
    21292128Generating a mask dynamically means that the storage for the mask information can vary between calls to @waitfor@, allowing for more flexibility and extensions.
    2130 Storing an array of accepted routine pointers replaces the single instruction bitmask comparison with dereferencing a pointer followed by a linear search.
     2129Storing an array of accepted function pointers replaces the single instruction bitmask comparison with dereferencing a pointer followed by a linear search.
    21312130Furthermore, supporting nested external scheduling (\eg listing \ref{f:nest-ext}) may now require additional searches for the @waitfor@ statement to check if a routine is already queued.
    21322131
     
    21462145\end{figure}
    21472146
    2148 Note that in the right picture, tasks need to always keep track of the monitors associated with mutex routines, and the routine mask needs to have both a routine pointer and a set of monitors, as is discussed in the next section.
     2147Note that in the right picture, tasks need to always keep track of the monitors associated with mutex routines, and the routine mask needs to have both a function pointer and a set of monitors, as is discussed in the next section.
    21492148These details are omitted from the picture for the sake of simplicity.
    21502149
     
    22342233% ======================================================================
    22352234
    2236 Syntactically, the @waitfor@ statement takes a routine identifier and a set of monitors.
    2237 While the set of monitors can be any list of expressions, the routine name is more restricted because the compiler validates at compile time the validity of the routine type and the parameters used with the @waitfor@ statement.
    2238 It checks that the set of monitors passed in matches the requirements for a routine call.
     2235Syntactically, the @waitfor@ statement takes a function identifier and a set of monitors.
     2236While the set of monitors can be any list of expressions, the function name is more restricted because the compiler validates the function type and the parameters used with the @waitfor@ statement at compile time.
     2237It checks that the set of monitors passed in matches the requirements for a function call.
    22392238Figure~\ref{f:waitfor} shows various usages of the waitfor statement, indicating which are acceptable.
    2240 The choice of the routine type is made ignoring any non-@mutex@ parameter.
     2239The choice of the function type is made ignoring any non-@mutex@ parameter.
    22412240One limitation of the current implementation is that it does not handle overloading, but overloading is possible.
    22422241\begin{figure}
     
    22642263        waitfor(f2, a1, a2); // Incorrect : Mutex arguments don't match
    22652264        waitfor(f1, 1);      // Incorrect : 1 not a mutex argument
    2266         waitfor(f9, a1);     // Incorrect : f9 routine does not exist
     2265        waitfor(f9, a1);     // Incorrect : f9 function does not exist
    22672266        waitfor(*fp, a1 );   // Incorrect : fp not an identifier
    22682267        waitfor(f4, a1);     // Incorrect : f4 ambiguous
     
    22742273
    22752274Finally, for added flexibility, \CFA supports constructing a complex @waitfor@ statement using the @or@, @timeout@ and @else@.
    2276 Indeed, multiple @waitfor@ clauses can be chained together using @or@; this chain forms a single statement that uses baton pass to any routine that fits one of the routine+monitor set passed in.
    2277 To enable users to tell which accepted routine executed, @waitfor@s are followed by a statement (including the null statement @;@) or a compound statement, which is executed after the clause is triggered.
    2278 A @waitfor@ chain can also be followed by a @timeout@, to signify an upper bound on the wait, or an @else@, to signify that the call should be non-blocking, which checks for a matching routine call already arrived and otherwise continues.
     2275Indeed, multiple @waitfor@ clauses can be chained together using @or@; this chain forms a single statement that uses baton pass to any function that fits one of the function+monitor set passed in.
     2276To enable users to tell which accepted function executed, @waitfor@s are followed by a statement (including the null statement @;@) or a compound statement, which is executed after the clause is triggered.
     2277A @waitfor@ chain can also be followed by a @timeout@, to signify an upper bound on the wait, or an @else@, to signify that the call should be non-blocking, which checks for a matching function call already arrived and otherwise continues.
    22792278Any and all of these clauses can be preceded by a @when@ condition to dynamically toggle the accept clauses on or off based on some current state.
    22802279Figure~\ref{f:waitfor2} demonstrates several complex masks and some incorrect ones.
     
    24422441\section{Behind the Scenes}
    24432442There are several challenges specific to \CFA when implementing concurrency.
    2444 These challenges are a direct result of bulk acquire and loose object definitions.
     2443These challenges are a direct result of \textbf{bulk-acq} and loose object definitions.
    24452444These two constraints are the root cause of most design decisions in the implementation.
    24462445Furthermore, to avoid contention from dynamically allocating memory in a concurrent environment, the internal-scheduling design is (almost) entirely free of mallocs.
     
    24502449The main memory concern for concurrency is queues.
    24512450All blocking operations are made by parking threads onto queues and all queues are designed with intrusive nodes, where each node has pre-allocated link fields for chaining, to avoid the need for memory allocation.
    2452 Since several concurrency operations can use an unbound amount of memory (depending on bulk acquire), statically defining information in the intrusive fields of threads is insufficient.The only way to use a variable amount of memory without requiring memory allocation is to pre-allocate large buffers of memory eagerly and store the information in these buffers.
     2451Since several concurrency operations can use an unbounded amount of memory (depending on \textbf{bulk-acq}), statically defining information in the intrusive fields of threads is insufficient. The only way to use a variable amount of memory without requiring memory allocation is to pre-allocate large buffers of memory eagerly and store the information in these buffers.
    24532452Conveniently, the call stack fits that description and is easy to use, which is why it is used heavily in the implementation of internal scheduling, particularly variable-length arrays.
    24542453Since stack allocation is based on scopes, the first step of the implementation is to identify the scopes that are available to store the information, and which of these can have a variable-length array.
    24552454The threads and the condition both have a fixed amount of memory, while @mutex@ routines and blocking calls allow for an unbounded amount, within the stack size.
    24562455
    2457 Note that since the major contributions of this paper are extending monitor semantics to bulk acquire and loose object definitions, any challenges that are not resulting of these characteristics of \CFA are considered as solved problems and therefore not discussed.
     2456Note that since the major contributions of this paper are extending monitor semantics to \textbf{bulk-acq} and loose object definitions, any challenges that do not result from these characteristics of \CFA are considered solved problems and therefore not discussed.
    24582457
    24592458% ======================================================================
     
    25652564void foo(T * mutex t);
    25662565
    2567 // Correct: this routine only works on monitors (any monitor)
     2566// Correct: this function only works on monitors (any monitor)
    25682567forall(dtype T | is_monitor(T))
    25692568void bar(T * mutex t));
     
    25722571Both entry point and \textbf{callsite-locking} are feasible implementations.
    25732572The current \CFA implementation uses entry-point locking because it requires less work when using \textbf{raii}, effectively transferring the burden of implementation to object construction/destruction.
    2574 It is harder to use \textbf{raii} for call-site locking, as it does not necessarily have an existing scope that matches exactly the scope of the mutual exclusion, \ie the routine body.
     2573It is harder to use \textbf{raii} for call-site locking, as it does not necessarily have an existing scope that matches exactly the scope of the mutual exclusion, \ie the function body.
    25752574For example, the monitor call can appear in the middle of an expression.
    25762575Furthermore, entry-point locking requires less code generation since any useful routine is called multiple times but there is only one entry point for many call sites.
     
    26102609\subsection{Context Switching}
    26112610As mentioned in section \ref{coroutine}, coroutines are a stepping stone for implementing threading, because they share the same mechanism for context-switching between different stacks.
    2612 To improve performance and simplicity, context-switching is implemented using the following assumption: all context-switches happen inside a specific routine call.
     2611To improve performance and simplicity, context-switching is implemented using the following assumption: all context-switches happen inside a specific function call.
    26132612This assumption means that the context-switch only has to copy the callee-saved registers onto the stack and then switch the stack registers with the ones of the target coroutine/thread.
    2614 Note that the instruction pointer can be left untouched since the context-switch is always inside the same routine
     2613Note that the instruction pointer can be left untouched since the context-switch is always inside the same function.
    26152614Threads, however, do not context-switch between each other directly.
    26162615They context-switch to the scheduler.
     
    26492648As a result, a signal handler can start on one kernel thread and terminate on a second kernel thread (but the same user thread).
    26502649It is important to note that signal handlers save and restore signal masks because user-thread migration can cause a signal mask to migrate from one kernel thread to another.
    2651 This behaviour is only a problem if all kernel threads, among which a user thread can migrate, differ in terms of signal masks\footnote{Sadly, official POSIX documentation is silent on what distinguishes ``async-signal-safe'' routines from other routines}.
     2650This behaviour is only a problem if all kernel threads, among which a user thread can migrate, differ in terms of signal masks\footnote{Sadly, official POSIX documentation is silent on what distinguishes ``async-signal-safe'' functions from other functions.}.
    26522651However, since the kernel thread handling preemption requires a different signal mask, executing user threads on the kernel-alarm thread can cause deadlocks.
    26532652For this reason, the alarm thread is in a tight loop around a system call to @sigwaitinfo@, requiring very little CPU time for preemption.
     
    26792678
    26802679For \CFA, this picture does not have support for blocking multiple monitors on a single condition.
    2681 To support bulk acquire two changes to this picture are required.
     2680To support \textbf{bulk-acq} two changes to this picture are required.
    26822681First, it is no longer helpful to attach the condition to \emph{a single} monitor.
    26832682Secondly, the thread waiting on the condition has to be separated across multiple monitors, seen in figure \ref{fig:monitor_cfa}.
     
    27282727\end{figure}
    27292728
    2730 The solution discussed in \ref{s:InternalScheduling} can be seen in the exit routine of listing \ref{f:entry2}.
     2729The solution discussed in \ref{intsched} can be seen in the exit routine of listing \ref{f:entry2}.
    27312730Basically, the solution boils down to having a separate data structure for the condition queue and the AS-stack, and unconditionally transferring ownership of the monitors but only unblocking the thread when the last monitor has transferred ownership.
    27322731This solution is deadlock safe as well as preventing any potential barging.
     
    29642963}
    29652964\end{cfa}
    2966 This routine is called by the kernel to fetch the default preemption rate, where 0 signifies an infinite time-slice, \ie no preemption.
     2965This function is called by the kernel to fetch the default preemption rate, where 0 signifies an infinite time-slice, \ie no preemption.
    29672966However, once clusters are fully implemented, it will be possible to create fibers and \textbf{uthread} in the same system, as in listing \ref{f:fiber-uthread}.
    29682967\begin{figure}
     
    31493148For monitors, the simplest approach is to measure how long it takes to enter and leave a monitor routine.
    31503149Figure~\ref{f:mutex} shows the code for \CFA.
    3151 To put the results in context, the cost of entering a non-inline routine and the cost of acquiring and releasing a @pthread_mutex@ lock is also measured.
     3150To put the results in context, the cost of entering a non-inline function and the cost of acquiring and releasing a @pthread_mutex@ lock is also measured.
    31523151The results are shown in table \ref{tab:mutex}.
    31533152
     
    34003399Therefore, there is still significant work to improve performance.
    34013400Many of the data structures and algorithms may change in the future to more efficient versions.
    3402 For example, the number of monitors in a single bulk acquire is only bound by the stack size, this is probably unnecessarily generous.
     3401For example, the number of monitors in a single \textbf{bulk-acq} is only bounded by the stack size, which is probably unnecessarily generous.
    34033402It may be possible that limiting the number helps increase performance.
    34043403However, it is not obvious that the benefit would be significant.
  • doc/papers/general/Paper.tex

    rdafdbe7 r5510027  
    201201
    202202\abstract[Summary]{
    203 The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from hobby projects to commercial operating-systems.
     203The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from commercial operating-systems to hobby projects.
    204204This installation base and the programmers producing it represent a massive software-engineering investment spanning decades and likely to continue for decades more.
    205205Nevertheless, C, first standardized almost forty years ago, lacks many features that make programming in more modern languages safer and more productive.
     
    224224\section{Introduction}
    225225
    226 The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from hobby projects to commercial operating-systems.
     226The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from commercial operating-systems to hobby projects.
    227227This installation base and the programmers producing it represent a massive software-engineering investment spanning decades and likely to continue for decades more.
    228228The TIOBE~\cite{TIOBE} ranks the top 5 most \emph{popular} programming languages as: Java 15\%, \Textbf{C 12\%}, \Textbf{\CC 5.5\%}, Python 5\%, \Csharp 4.5\% = 42\%, where the next 50 languages are less than 4\% each with a long tail.
  • src/Parser/ExpressionNode.cc

    rdafdbe7 r5510027  
    1010// Created On       : Sat May 16 13:17:07 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Mon Jun  4 21:24:45 2018
    13 // Update Count     : 802
     12// Last Modified On : Thu Mar 22 11:57:39 2018
     13// Update Count     : 801
    1414//
    1515
     
    314314
    315315Expression * build_constantStr( string & str ) {
    316         assert( str.length() > 0 );
    317316        string units;                                                                           // units
    318317        sepString( str, units, '"' );                                           // separate constant from units
  • src/Parser/ParseNode.h

    rdafdbe7 r5510027  
    1010// Created On       : Sat May 16 13:28:16 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Mon Jun  4 22:21:04 2018
    13 // Update Count     : 832
     12// Last Modified On : Mon Apr 30 09:19:17 2018
     13// Update Count     : 831
    1414//
    1515
     
    408408Statement * build_case( ExpressionNode * ctl );
    409409Statement * build_default();
    410 Statement * build_while( IfCtl * ctl, StatementNode * stmt );
    411 Statement * build_do_while( ExpressionNode * ctl, StatementNode * stmt );
     410Statement * build_while( ExpressionNode * ctl, StatementNode * stmt, bool kind = false );
    412411Statement * build_for( ForCtl * forctl, StatementNode * stmt );
    413412Statement * build_branch( BranchStmt::Type kind );
  • src/Parser/StatementNode.cc

    rdafdbe7 r5510027  
    1010// Created On       : Sat May 16 14:59:41 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Tue Jun  5 08:58:34 2018
    13 // Update Count     : 362
     12// Last Modified On : Mon Apr 30 09:21:16 2018
     13// Update Count     : 354
    1414//
    1515
     
    6969        caseStmt->get_statements().splice( caseStmt->get_statements().end(), stmts );
    7070        return this;
    71 } // StatementNode::append_last_case
     71}
    7272
    7373Statement * build_expr( ExpressionNode * ctl ) {
    7474        Expression * e = maybeMoveBuild< Expression >( ctl );
    7575
    76         if ( e ) return new ExprStmt( e );
    77         else return new NullStmt();
    78 } // build_expr
     76        if ( e )
     77                return new ExprStmt( e );
     78        else
     79                return new NullStmt();
     80}
    7981
    8082Expression * build_if_control( IfCtl * ctl, std::list< Statement * > & init ) {
     
    98100        delete ctl;
    99101        return cond;
    100 } // build_if_control
     102}
    101103
    102104Statement * build_if( IfCtl * ctl, StatementNode * then_stmt, StatementNode * else_stmt ) {
    103         Statement * thenb, * elseb = nullptr;
     105        Statement * thenb, * elseb = 0;
    104106        std::list< Statement * > branches;
    105107        buildMoveList< Statement, StatementNode >( then_stmt, branches );
     
    117119        Expression * cond = build_if_control( ctl, init );
    118120        return new IfStmt( cond, thenb, elseb, init );
    119 } // build_if
     121}
    120122
    121123Statement * build_switch( bool isSwitch, ExpressionNode * ctl, StatementNode * stmt ) {
     
    133135        // branches.size() == 0 for switch (...) {}, i.e., no declaration or statements
    134136        return new SwitchStmt( maybeMoveBuild< Expression >(ctl), branches );
    135 } // build_switch
    136 
     137}
    137138Statement * build_case( ExpressionNode * ctl ) {
    138139        std::list< Statement * > branches;
    139140        return new CaseStmt( maybeMoveBuild< Expression >(ctl), branches );
    140 } // build_case
    141 
     141}
    142142Statement * build_default() {
    143143        std::list< Statement * > branches;
    144144        return new CaseStmt( nullptr, branches, true );
    145 } // build_default
    146 
    147 Statement * build_while( IfCtl * ctl, StatementNode * stmt ) {
     145}
     146
     147Statement * build_while( ExpressionNode * ctl, StatementNode * stmt, bool kind ) {
    148148        std::list< Statement * > branches;
    149149        buildMoveList< Statement, StatementNode >( stmt, branches );
     
    151151
    152152        std::list< Statement * > init;
    153         Expression * cond = build_if_control( ctl, init );
    154         return new WhileStmt( cond, branches.front(), init, false );
    155 } // build_while
    156 
    157 Statement * build_do_while( ExpressionNode * ctl, StatementNode * stmt ) {
    158         std::list< Statement * > branches;
    159         buildMoveList< Statement, StatementNode >( stmt, branches );
    160         assert( branches.size() == 1 );
    161 
    162         std::list< Statement * > init;
    163         return new WhileStmt( notZeroExpr( maybeMoveBuild< Expression >(ctl) ), branches.front(), init, true );
    164 } // build_do_while
     153        return new WhileStmt( notZeroExpr( maybeMoveBuild< Expression >(ctl) ), branches.front(), init, kind );
     154}
    165155
    166156Statement * build_for( ForCtl * forctl, StatementNode * stmt ) {
     
    184174        delete forctl;
    185175        return new ForStmt( init, cond, incr, branches.front() );
    186 } // build_for
     176}
    187177
    188178Statement * build_branch( BranchStmt::Type kind ) {
    189179        Statement * ret = new BranchStmt( "", kind );
    190180        return ret;
    191 } // build_branch
    192 
     181}
    193182Statement * build_branch( std::string * identifier, BranchStmt::Type kind ) {
    194183        Statement * ret = new BranchStmt( * identifier, kind );
    195184        delete identifier;                                                                      // allocated by lexer
    196185        return ret;
    197 } // build_branch
    198 
     186}
    199187Statement * build_computedgoto( ExpressionNode * ctl ) {
    200188        return new BranchStmt( maybeMoveBuild< Expression >(ctl), BranchStmt::Goto );
    201 } // build_computedgoto
     189}
    202190
    203191Statement * build_return( ExpressionNode * ctl ) {
     
    205193        buildMoveList( ctl, exps );
    206194        return new ReturnStmt( exps.size() > 0 ? exps.back() : nullptr );
    207 } // build_return
     195}
    208196
    209197Statement * build_throw( ExpressionNode * ctl ) {
     
    212200        assertf( exps.size() < 2, "This means we are leaking memory");
    213201        return new ThrowStmt( ThrowStmt::Terminate, !exps.empty() ? exps.back() : nullptr );
    214 } // build_throw
     202}
    215203
    216204Statement * build_resume( ExpressionNode * ctl ) {
     
    219207        assertf( exps.size() < 2, "This means we are leaking memory");
    220208        return new ThrowStmt( ThrowStmt::Resume, !exps.empty() ? exps.back() : nullptr );
    221 } // build_resume
     209}
    222210
    223211Statement * build_resume_at( ExpressionNode * ctl, ExpressionNode * target ) {
     
    225213        (void)target;
    226214        assertf( false, "resume at (non-local throw) is not yet supported," );
    227 } // build_resume_at
     215}
    228216
    229217Statement * build_try( StatementNode * try_stmt, StatementNode * catch_stmt, StatementNode * finally_stmt ) {
     
    233221        FinallyStmt * finallyBlock = dynamic_cast< FinallyStmt * >(maybeMoveBuild< Statement >(finally_stmt) );
    234222        return new TryStmt( tryBlock, branches, finallyBlock );
    235 } // build_try
    236 
     223}
 Statement * build_catch( CatchStmt::Kind kind, DeclarationNode * decl, ExpressionNode * cond, StatementNode * body ) {
 	std::list< Statement * > branches;
…
 	assert( branches.size() == 1 );
 	return new CatchStmt( kind, maybeMoveBuild< Declaration >(decl), maybeMoveBuild< Expression >(cond), branches.front() );
-} // build_catch
-
+}
 Statement * build_finally( StatementNode * stmt ) {
 	std::list< Statement * > branches;
…
 	assert( branches.size() == 1 );
 	return new FinallyStmt( dynamic_cast< CompoundStmt * >( branches.front() ) );
-} // build_finally
+}

 WaitForStmt * build_waitfor( ExpressionNode * targetExpr, StatementNode * stmt, ExpressionNode * when ) {
…

 	return node;
-} // build_waitfor
+}

 WaitForStmt * build_waitfor( ExpressionNode * targetExpr, StatementNode * stmt, ExpressionNode * when, WaitForStmt * node ) {
…

 	return node;
-} // build_waitfor
+}

 WaitForStmt * build_waitfor_timeout( ExpressionNode * timeout, StatementNode * stmt, ExpressionNode * when ) {
…
 		node->timeout.statement = maybeMoveBuild<Statement >( stmt    );
 		node->timeout.condition = notZeroExpr( maybeMoveBuild<Expression>( when ) );
-	} else {
+	}
+	else {
 		node->orelse.statement  = maybeMoveBuild<Statement >( stmt );
 		node->orelse.condition  = notZeroExpr( maybeMoveBuild<Expression>( when ) );
-	} // if
+	}

 	return node;
-} // build_waitfor_timeout
+}

 WaitForStmt * build_waitfor_timeout( ExpressionNode * timeout, StatementNode * stmt, ExpressionNode * when,  StatementNode * else_stmt, ExpressionNode * else_when ) {
…

 	return node;
-} // build_waitfor_timeout
+}

 WithStmt * build_with( ExpressionNode * exprs, StatementNode * stmt ) {
…
 	Statement * s = maybeMoveBuild<Statement>( stmt );
 	return new WithStmt( e, s );
-} // build_with
+}

 Statement * build_compound( StatementNode * first ) {
…
 	buildMoveList( first, cs->get_kids() );
 	return cs;
-} // build_compound
+}

 Statement * build_asm( bool voltile, Expression * instruction, ExpressionNode * output, ExpressionNode * input, ExpressionNode * clobber, LabelNode * gotolabels ) {
…
 	buildMoveList( clobber, clob );
 	return new AsmStmt( voltile, instruction, out, in, clob, gotolabels ? gotolabels->labels : noLabels );
-} // build_asm
+}

 Statement * build_directive( string * directive ) {
 	return new DirectiveStmt( *directive );
-} // build_directive
+}

 // Local Variables: //
  • src/Parser/parser.yy

    rdafdbe7 r5510027  
 // Created On       : Sat Sep  1 20:22:55 2001
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Mon Jun  4 22:22:04 2018
-// Update Count     : 3492
+// Last Modified On : Fri Jun  1 17:59:57 2018
+// Update Count     : 3476
 //

…

 iteration_statement:
-	WHILE '(' push if_control_expression ')' statement pop
-		{ $$ = new StatementNode( build_while( $4, $6 ) ); }
+	WHILE '(' comma_expression ')' statement
+		{ $$ = new StatementNode( build_while( $3, $5 ) ); }
 	| DO statement WHILE '(' comma_expression ')' ';'
-		{ $$ = new StatementNode( build_do_while( $5, $2 ) ); }
+		{ $$ = new StatementNode( build_while( $5, $2, true ) ); }
 	| FOR '(' push for_control_expression ')' statement pop
 		{ $$ = new StatementNode( build_for( $4, $6 ) ); }
…
 	c_declaration ';'
 	| cfa_declaration ';'                                                           // CFA
-	| static_assert                                                                         // C11
+	| static_assert
 	;

…
 	STATICASSERT '(' constant_expression ',' string_literal ')' ';' // C11
 		{ $$ = DeclarationNode::newStaticAssert( $3, $5 ); }
-	| STATICASSERT '(' constant_expression ')' ';'          // CFA
-		{ $$ = DeclarationNode::newStaticAssert( $3, build_constantStr( *new string( "\"\"" ) ) ); }

 // C declaration syntax is notoriously confusing and error prone. Cforall provides its own type, variable and function
…

 field_declaration:
-	type_specifier field_declaring_list ';'
-		{ $$ = distAttr( $1, $2 ); }
+	cfa_field_declaring_list ';'                                            // CFA, new style field declaration
+	| EXTENSION cfa_field_declaring_list ';'                        // GCC
+		{
+			distExt( $2 );                                                          // mark all fields in list
+			$$ = $2;
+		}
+	| type_specifier field_declaring_list ';'
+		{
+			$$ = distAttr( $1, $2 ); }
 	| EXTENSION type_specifier field_declaring_list ';'     // GCC
-		{ distExt( $3 ); $$ = distAttr( $2, $3 ); }             // mark all fields in list
-	| typedef_declaration ';'                                                       // CFA
-		{ SemanticError( yylloc, "Typedef in aggregate is currently unimplemented." ); $$ = nullptr; }
-	| cfa_field_declaring_list ';'                                          // CFA, new style field declaration
-	| EXTENSION cfa_field_declaring_list ';'                        // GCC
-		{ distExt( $2 ); $$ = $2; }                                             // mark all fields in list
-	| cfa_typedef_declaration ';'                                           // CFA
-		{ SemanticError( yylloc, "Typedef in aggregate is currently unimplemented." ); $$ = nullptr; }
-	| static_assert                                                                         // C11
+		{
+			distExt( $3 );                                                          // mark all fields in list
+			$$ = distAttr( $2, $3 );
+		}
+	| static_assert
 	;

  • src/tests/sum.c

    rdafdbe7 r5510027  
 // Created On       : Wed May 27 17:56:53 2015
 // Last Modified By : Peter A. Buhr
-// Last Modified On : Sun Jun  3 19:23:41 2018
-// Update Count     : 278
+// Last Modified On : Sat Feb 17 11:49:17 2018
+// Update Count     : 273
 //

…
 #include <stdlib>

-void ?{}( int & c, zero_t ) { c = 0; }                                  // not in prelude
+void ?{}( int & c, zero_t ) { c = 0; }

 trait sumable( otype T ) {
-	void ?{}( T &, zero_t );                                                        // 0 literal constructor
+	void ?{}( T &, zero_t );                                                        // constructor from 0 literal
 	T ?+?( T, T );                                                                          // assortment of additions
 	T ?+=?( T &, T );
…

 forall( otype T | sumable( T ) )                                                // use trait
-T sum( size_t size, T a[] ) {
-	T total = 0;                                                                            // initialize by 0 constructor
-	for ( size_t i = 0; i < size; i += 1 )
+T sum( unsigned int size, T a[] ) {
+	T total = 0;                                                                            // instantiate T from 0 by calling constructor
+	for ( unsigned int i = 0; i < size; i += 1 )
 		total += a[i];                                                                  // select appropriate +
 	return total;
…
 	for ( int i = 0; i < size; i += 1, v += 1 ) {
 		s += (int)v;
-		gs.x[i] = (int)v;                                                               // set field array in generic type
+		gs.x[i] = (int)v;                                                               // set filed array in generic type
 	} // for
 	sout | "sum from" | low | "to" | High | "is"
-		 | sum( size, gs.x ) | ", check" | (int)s | endl; // add field array in generic type
+		| sum( size, gs.x ) | ", check" | (int)s | endl; // add filed array in generic type
 } // main
    118118
Note: See TracChangeset for help on using the changeset viewer.