Changes in / [07ca4dd:ea05f8d]
- Files:
-
- 2 edited
-
doc/papers/concurrency/Paper.tex (modified) (15 diffs)
-
libcfa/src/clock.hfa (modified) (2 diffs)
Legend:
- Unmodified
- Added
- Removed
-
doc/papers/concurrency/Paper.tex
r07ca4dd rea05f8d 311 311 Libraries like pthreads were developed for C, and the Solaris operating-system switched from user (JDK 1.1~\cite{JDK1.1}) to kernel threads. 312 312 As a result, languages like Java, Scala, Objective-C~\cite{obj-c-book}, \CCeleven~\cite{C11}, and C\#~\cite{Csharp} adopt the 1:1 kernel-threading model, with a variety of presentation mechanisms. 313 From 2000 onwards, languages like Go~\cite{Go}, Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, D~\cite{D}, and \uC~\cite{uC++,uC++book} have championed the M:N user-threading model, and many user-threading libraries have appeared~\cite{Qthreads,MPC, Marcel}, including putting green threads back into Java~\cite{Quasar}.313 From 2000 onwards, languages like Go~\cite{Go}, Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, D~\cite{D}, and \uC~\cite{uC++,uC++book} have championed the M:N user-threading model, and many user-threading libraries have appeared~\cite{Qthreads,MPC,BoostThreads}, including putting green threads back into Java~\cite{Quasar}. 314 314 The main argument for user-level threading is that they are lighter weight than kernel threads (locking and context switching do not cross the kernel boundary), so there is less restriction on programming styles that encourage large numbers of threads performing medium work-units to facilitate load balancing by the runtime~\cite{Verch12}. 315 315 As well, user-threading facilitates a simpler concurrency approach using thread objects that leverage sequential patterns versus events with call-backs~\cite{vonBehren03}. … … 327 327 328 328 Finally, it is important for a language to provide safety over performance \emph{as the default}, allowing careful reduction of safety for performance when necessary. 329 Two concurrency violations of this philosophy are \emph{spurious wakeup} (random wakeup~\cite[\S~8]{Buhr05a}) and \emph{barging} (signals-as-hints~\cite[\S~8]{Buhr05a}), where one is a consequence of the other, i.e., once there is spurious wakeup, signals-as-hints follows.329 Two concurrency violations of this philosophy are \emph{spurious wakeup} and \emph{barging}, i.e., random wakeup~\cite[\S~8]{Buhr05a} and signals-as-hints~\cite[\S~8]{Buhr05a}, where one is a consequence of the other, i.e., once there is spurious wakeup, signals-as-hints follows. 330 330 However, spurious wakeup is \emph{not} a foundational concurrency property~\cite[\S~8]{Buhr05a}, it is a performance design choice. 331 331 Similarly, signals-as-hints is often a performance decision. 332 332 We argue removing spurious wakeup and signals-as-hints makes concurrent programming significantly safer because it removes local non-determinism and matches with programmer expectation. 333 (Author experience teaching concurrency is that students are highly confused by these semantics.)333 (Authors experience teaching concurrency is that students are highly confused by these semantics.) 334 334 Clawing back performance, when local non-determinism is unimportant, should be an option not the default. 335 335 … … 367 367 \section{Stateful Function} 368 368 369 The stateful function is an old idea~\cite{Conway63,Marlin80} that is new again~\cite{C++20Coroutine19}, where execution is temporarily suspended and later resumed, e.g., plugin, device driver, finite-state machine. 369 The generator/coroutine provides a stateful function, which is an old idea~\cite{Conway63,Marlin80} that is new again~\cite{C++20Coroutine19}. 370 A stateful function allows execution to be temporarily suspended and later resumed, e.g., plugin, device driver, finite-state machine. 370 371 Hence, a stateful function may not end when it returns to its caller, allowing it to be restarted with the data and execution location present at the point of suspension. 371 372 This capability is accomplished by retaining a data/execution \emph{closure} between invocations. … … 542 543 \end{figure} 543 544 544 Stateful functions appear as generators, coroutines, and threads, where presentations are based on function objects or pointers~\cite{Butenhof97, C++14, MS:VisualC++, BoostCoroutines15}.545 For generators, coroutines, and threads, many designs are based on function objects or pointers~\cite{Butenhof97, C++14, MS:VisualC++, BoostCoroutines15}. 545 546 For example, Python presents generators as a function object: 546 547 \begin{python} … … 586 587 587 588 Figure~\ref{f:FibonacciAsymmetricGenerator} shows an unbounded asymmetric generator for an infinite sequence of Fibonacci numbers written in C and \CFA, with a simple C implementation for the \CFA version. 588 This generator is an \emph{output generator}, producing a new result on each resumption.589 This kind of generator is an \emph{output generator}, producing a new result on each resumption. 589 590 To compute Fibonacci, the previous two values in the sequence are retained to generate the next value, \ie @fn1@ and @fn@, plus the execution location where control restarts when the generator is resumed, \ie top or middle. 590 An additional requirement is the ability to create an arbitrary number of generators (of any kind), \ie retaining onestate in global variables is insufficient;591 An additional requirement is the ability to create an arbitrary number of generators (of any kind), \ie retaining state in global variables is insufficient; 591 592 hence, state is retained in a closure between calls. 592 593 Figure~\ref{f:CFibonacci} shows the C approach of manually creating the closure in structure @Fib@, and multiple instances of this closure provide multiple Fibonacci generators. 593 The C version only has the middle execution state because the top execution state is declaration initialization.594 The C version only has the middle execution state because the top execution state becomes declaration initialization. 594 595 Figure~\ref{f:CFAFibonacciGen} shows the \CFA approach, which also has a manual closure, but replaces the structure with a custom \CFA @generator@ type. 595 596 This generator type is then connected to a function that \emph{must be named \lstinline|main|},\footnote{ … … 667 668 As well, the data state is small, where variables @byte@ and @msg@ are communication variables for passing in message bytes and returning the message, and variables @lnth@, @crc@, and @sum@ are local variable that must be retained between calls and are manually hoisted into the generator type. 668 669 % Manually, detecting and hoisting local-state variables is easy when the number is small. 669 In contrast, the execution state is large, with one @resume@ and seven @suspend@s.670 Finally, the execution state is large, with one @resume@ and seven @suspend@s. 670 671 Hence, the key benefits of the generator are correctness, safety, and maintenance because the execution states are transcribed directly into the programming language rather than using a table-driven approach. 671 672 Because FSMs can be complex and occur frequently in important domains, direct support of the generator is crucial in a systems programming-language. … … 779 780 Figure~\ref{f:CFAPingPongGen} shows a symmetric generator, where the generator resumes another generator, forming a resume/resume cycle. 780 781 (The trivial cycle is a generator resuming itself.) 781 This control flow is similar to recursion for functions but without stack growth.782 This control flow is similar to recursion for functions, but without stack growth. 782 783 The steps for symmetric control-flow are creating, executing, and terminating the cycle. 783 784 Constructing the cycle must deal with definition-before-use to close the cycle, \ie, the first generator must know about the last generator, which is not within scope. … … 788 789 Terminating the cycle is accomplished by @suspend@ or @return@, both of which go back to the stack frame that started the cycle (program main in the example). 789 790 The starting stack-frame is below the last active generator because the resume/resume cycle does not grow the stack. 790 Also, since local variables are not retained in the generator function, it does not contain an yobjects with destructors that must be called, so the cost is the same as a function return.791 Also, since local variables are not retained in the generator function, it does not contain an objects with destructors that must be called, so the cost is the same as a function return. 791 792 Destructor cost occurs when the generator instance is deallocated, which is easily controlled by the programmer. 792 793 … … 1219 1220 Hence, the starter coroutine is remembered on the first resume and ending the coroutine resumes the starter. 1220 1221 Figure~\ref{f:ProdConsRuntimeStacks} shows this semantic by the dashed lines from the end of the coroutine mains: @prod@ starts @cons@ so @cons@ resumes @prod@ at the end, and the program main starts @prod@ so @prod@ resumes the program main at the end. 1221 For other scenarios, it is always possible to devise a solution with additional programming effort , such as forcing the cycle forward (backward) to a safe point before starting termination.1222 For other scenarios, it is always possible to devise a solution with additional programming effort. 1222 1223 1223 1224 The producer/consumer example does not illustrate the full power of the starter semantics because @cons@ always ends first. … … 1289 1290 The function definitions ensures there is a statically-typed @main@ function that is the starting point (first stack frame) of a coroutine, and a mechanism to get (read) the currently executing coroutine handle. 1290 1291 The @main@ function has no return value or additional parameters because the coroutine type allows an arbitrary number of interface functions with corresponding arbitrary typed input/output values versus fixed ones. 1291 The advantage of this approach is that users can easily create different types of coroutines, \eg changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ function, and possibly redefining \textsf{suspend}and @resume@.1292 The advantage of this approach is that users can easily create different types of coroutines, \eg changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ function, and possibly redefining @suspend@ and @resume@. 1292 1293 1293 1294 The \CFA custom-type @coroutine@ implicitly implements the getter and forward declarations for the coroutine main. … … 1337 1338 Once allocated, a VLS is fixed sized.} 1338 1339 on the allocating stack, provided the allocating stack is large enough. 1339 For a VLS stack allocation /deallocation is an inexpensive adjustment of the stack pointer, modulo any stack constructor costs (\eg initial frame setup).1340 For a VLS stack allocation, allocation/deallocation is an inexpensive adjustment of the stack point, modulo any stack constructor costs (\eg initial frame setup). 1340 1341 For heap stack allocation, allocation/deallocation is an expensive heap allocation (where the heap can be a shared resource), modulo any stack constructor costs. 1341 1342 With heap stack allocation, it is also possible to use a split (segmented) stack calling-convention, available with gcc and clang, so the stack is variable sized. … … 1362 1363 However, coroutines are a stepping stone towards concurrency. 1363 1364 1364 The transition to concurrency, even for a single thread with multiple stacks, occurs when coroutines context switch to a \newterm{scheduling coroutine}, introducing non-determinism from the coroutine perspective~\cite[\S~3,]{Buhr05a} .1365 The transition to concurrency, even for a single thread with multiple stacks, occurs when coroutines context switch to a \newterm{scheduling coroutine}, introducing non-determinism from the coroutine perspective~\cite[\S~3,]{Buhr05a}\cite{Adya02}. 1365 1366 Therefore, a minimal concurrency system requires coroutines \emph{in conjunction with a nondeterministic scheduler}. 1366 The resulting execution system now follows a cooperative threading-model ~\cite{Adya02,libdill}, called \newterm{non-preemptive scheduling}.1367 The resulting execution system now follows a cooperative threading-model, called \newterm{non-preemptive scheduling}. 1367 1368 Adding \newterm{preemption} introduces non-cooperative scheduling, where context switching occurs randomly between any two instructions often based on a timer interrupt, called \newterm{preemptive scheduling}. 1368 1369 While a scheduler introduces uncertain execution among explicit context switches, preemption introduces uncertainty by introducing implicit context switches. … … 1423 1424 This semantic ensures a thread is started and stopped exactly once, eliminating some programming error, and scales to multiple threads for basic (termination) synchronization. 1424 1425 For block allocation to arbitrary depth, including recursion, threads are created/destroyed in a lattice structure (tree with top and bottom). 1425 Arbitrary topologies are possible using dynamic allocation, allowing threads to outlive their declaration scope, identical to normal dynamic allocation.1426 Arbitrary topologies are possible using dynamic allocation, allowing threads to outlive their declaration scope, identical to normal dynamically allocating. 1426 1427 \begin{cfa} 1427 1428 MyTask * factory( int N ) { ... return `anew( N )`; } $\C{// allocate heap-based threads, implicit start after construction}$ … … 1524 1525 \subsection{Mutual Exclusion} 1525 1526 1526 A group of instructions manipulating a specific instance of shared data that must be performed atomically is called a \newterm{critical section}~\cite{Dijkstra65}, which is enforced by \newterm{simple mutual-exclusion}.1527 The generalization is called a \newterm{group critical-section}~\cite{Joung00}, where multiple tasks with the same session use the resource simultaneously and different sessions are segregated, which is enforced by \newterm{complex mutual-exclusion} providing the correct kind and number of threads using a group critical-section.1527 A group of instructions manipulating a specific instance of shared data that must be performed atomically is called an (individual) \newterm{critical-section}~\cite{Dijkstra65}. 1528 The generalization is called a \newterm{group critical-section}~\cite{Joung00}, where multiple tasks with the same session may use the resource simultaneously, but different sessions may not use the resource simultaneously. 1528 1529 The readers/writer problem~\cite{Courtois71} is an instance of a group critical-section, where readers share a session but writers have a unique session. 1530 \newterm{Mutual exclusion} enforces the correct kind and number of threads using a critical section. 1529 1531 1530 1532 However, many solutions exist for mutual exclusion, which vary in terms of performance, flexibility and ease of use. … … 1546 1548 Preventing or detecting barging is an involved challenge with low-level locks, which is made easier through higher-level constructs. 1547 1549 This challenge is often split into two different approaches: barging avoidance and prevention. 1548 Algorithms that unconditionally releasing a lock for competing threads to acquire use barging avoidance during synchronization to force a barging thread to wait ;1550 Algorithms that unconditionally releasing a lock for competing threads to acquire use barging avoidance during synchronization to force a barging thread to wait. 1549 1551 algorithms that conditionally hold locks during synchronization, \eg baton-passing~\cite{Andrews89}, prevent barging completely. 1550 1552 -
libcfa/src/clock.hfa
r07ca4dd rea05f8d 10 10 // Created On : Thu Apr 12 14:36:06 2018 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Thu Jun 13 21:21:13 201913 // Update Count : 812 // Last Modified On : Mon Jul 2 21:40:01 2018 13 // Update Count : 7 14 14 // 15 15 … … 63 63 clock_gettime( CLOCK_REALTIME, &curr ); 64 64 return (Time){ curr }; 65 } // getTime Nsec65 } // getTime 66 66 67 67 Time getTime() { // without nanoseconds
Note:
See TracChangeset
for help on using the changeset viewer.