Changeset 48b9b36 for doc/papers
- Timestamp:
- May 16, 2018, 10:50:48 PM (7 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, with_gc
- Children:
- 4358c1e
- Parents:
- e9a7e90b
- Location:
- doc/papers
- Files:
- 2 edited
doc/papers/concurrency/Paper.tex
re9a7e90b r48b9b36 73 73 % \def\{{\ttfamily\upshape\myCHarFont \char`\}}}% 74 74 75 \renewcommand*{\thefootnote}{\Alph{footnote}} % hack because fnsymbol does not work 76 %\renewcommand*{\thefootnote}{\fnsymbol{footnote}} 77 75 78 \makeatletter 76 79 % parindent is relative, i.e., toggled on/off in environments like itemize, so store the value for … … 87 90 \setlength{\gcolumnposn}{3.5in} 88 91 \setlength{\columnposn}{\gcolumnposn} 92 89 93 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\lst@basicstyle{\LstCommentStyle{#2}}}} 90 94 \newcommand{\CRT}{\global\columnposn=\gcolumnposn} … … 172 176 literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptstyle\land\,$}}1 173 177 {~}{\raisebox{0.3ex}{$\scriptstyle\sim\,$}}1 % {`}{\ttfamily\upshape\hspace*{-0.1ex}`}1 174 {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex{\textgreater}}2, 178 {<}{\textrm{\textless}}1 {>}{\textrm{\textgreater}}1 179 {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.5ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex{\textrm{\textgreater}}}2, 175 180 moredelim=**[is][\color{red}]{`}{`}, 176 181 }% lstset … … 214 219 \lstMakeShortInline@% 215 220 221 \let\OLDthebibliography\thebibliography 222 \renewcommand\thebibliography[1]{ 223 \OLDthebibliography{#1} 224 \setlength{\parskip}{0pt} 225 \setlength{\itemsep}{4pt plus 0.3ex} 226 } 216 227 217 228 \title{\texorpdfstring{Concurrency in \protect\CFA}{Concurrency in Cforall}} … … 230 241 \CFA is a modern, polymorphic, \emph{non-object-oriented} extension of the C programming language. 231 242 This paper discusses the design of the concurrency and parallelism features in \CFA, and the concurrent runtime-system. 
232 These features are created from scratch as ISO C lacks concurrency, relying largely on pthreads library.243 These features are created from scratch as ISO C lacks concurrency, relying largely on the pthreads library. 233 244 Coroutines and lightweight (user) threads are introduced into the language. 234 245 In addition, monitors are added as a high-level mechanism for mutual exclusion and synchronization. … … 255 266 Examples of high-level approaches are task (work) based~\cite{TBB}, implicit threading~\cite{OpenMP}, monitors~\cite{Java}, channels~\cite{CSP,Go}, and message passing~\cite{Erlang,MPI}. 256 267 257 Th is paper uses the following terminology.268 The following terminology is used. 258 269 A \newterm{thread} is a fundamental unit of execution that runs a sequence of code and requires a stack to maintain state. 259 270 Multiple simultaneous threads give rise to \newterm{concurrency}, which requires locking to ensure safe communication and access to shared data. 260 271 % Correspondingly, concurrency is defined as the concepts and challenges that occur when multiple independent (sharing memory, timing dependencies, \etc) concurrent threads are introduced. 261 \newterm{Locking}, and by extension locks, are defined as a mechanism to prevent progress of threads to provide safety.272 \newterm{Locking}, and by extension \newterm{locks}, are defined as a mechanism to prevent progress of threads to provide safety. 262 273 \newterm{Parallelism} is running multiple threads simultaneously. 263 274 Parallelism implies \emph{actual} simultaneous execution, where concurrency only requires \emph{apparent} simultaneous execution. 
264 As such, parallelism only affects performance, which is observed through differences in space and/or time .265 266 Hence, there are two problems to be solved in the design of concurrency for a programming language: concurrency and parallelism.275 As such, parallelism only affects performance, which is observed through differences in space and/or time at runtime. 276 277 Hence, there are two problems to be solved: concurrency and parallelism. 267 278 While these two concepts are often combined, they are distinct, requiring different tools~\cite[\S~2]{Buhr05a}. 268 279 Concurrency tools handle synchronization and mutual exclusion, while parallelism tools handle performance, cost and resource utilization. 269 280 270 281 The proposed concurrency API is implemented in a dialect of C, called \CFA. 271 The paper discusses how the language features are added to the \CFA translator with respect to parsing, semantic, and type checking, and the corresponding high-perfor amnce runtime-library to implement the concurrency features.282 The paper discusses how the language features are added to the \CFA translator with respect to parsing, semantic, and type checking, and the corresponding high-performance runtime-library to implement the concurrency features. 272 283 273 284 … … 275 286 276 287 The following is a quick introduction to the \CFA language, specifically tailored to the features needed to support concurrency. 277 Extended versions of the following code examples are available at the \CFA website~\cite{Cforall} or Moss~\etal~\cite{XXX}.288 Extended versions and explanation of the following code examples are available at the \CFA website~\cite{Cforall} or in Moss~\etal~\cite{Moss18}. 278 289 279 290 \CFA is an extension of ISO-C, and hence, supports all C paradigms. … … 282 293 Virtually all of the code generated by the \CFA translator respects C memory layouts and calling conventions. 
283 294 While \CFA is not an object-oriented language, lacking the concept of a receiver (\eg @this@) and nominal inheritance-relationships, C does have a notion of objects: ``region of data storage in the execution environment, the contents of which can represent values''~\cite[3.15]{C11}. 295 While some \CFA features are common in object-oriented programming-languages, they are an independent capability allowing \CFA to adopt them while retaining a procedural paradigm. 284 296 285 297 … … 296 308 r1 = 3; r2 = 3; r3 = 3; // change x: implicit dereferences *r1, **r2, ***r3 297 309 **p3 = &y; *p3 = &p4; // change p1, p2 298 `&`r3 = &y; `&&`r3 = &`&`r4; // change r1, r2: cancel implicit de ferences (&*)**r3, (&(&*)*)*r3, &(&*)r4310 `&`r3 = &y; `&&`r3 = &`&`r4; // change r1, r2: cancel implicit dereferences (&*)**r3, (&(&*)*)*r3, &(&*)r4 299 311 \end{cfa} 300 312 A reference is a handle to an object, like a pointer, but is automatically dereferenced the specified number of levels. … … 305 317 \label{s:WithStatement} 306 318 307 Heterogeneous data is oftenaggregated into a structure/union.319 Heterogeneous data is aggregated into a structure/union. 308 320 To reduce syntactic noise, \CFA provides a @with@ statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate field-qualification by opening a scope containing the field identifiers. 
309 321 \begin{cquote} … … 315 327 // multiple aggregate parameters 316 328 \end{cfa} 317 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}|@{\hspace{\parindentlnth}}l@{}}329 \begin{tabular}{@{}l@{\hspace{2\parindentlnth}}|@{\hspace{2\parindentlnth}}l@{}} 318 330 \begin{cfa} 319 331 void f( S & s, T & t ) { … … 357 369 // selection based on type 358 370 \end{cfa} 359 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}|@{\hspace{\parindentlnth}}l@{}}360 \begin{cfa} 361 const short int MIN= -32768;362 const int MIN= -2147483648;363 const long int MIN= -9223372036854775808L;371 \begin{tabular}{@{}l@{\hspace{2\parindentlnth}}|@{\hspace{2\parindentlnth}}l@{}} 372 \begin{cfa} 373 const short int `MIN` = -32768; 374 const int `MIN` = -2147483648; 375 const long int `MIN` = -9223372036854775808L; 364 376 \end{cfa} 365 377 & 366 378 \begin{cfa} 367 short int si = MIN;368 int i = MIN;369 long int li = MIN;379 short int si = `MIN`; 380 int i = `MIN`; 381 long int li = `MIN`; 370 382 \end{cfa} 371 383 \end{tabular} … … 373 385 // selection based on type and number of parameters 374 386 \end{cfa} 375 \begin{tabular}{@{}l@{\hspace{ 1.7\parindentlnth}}|@{\hspace{\parindentlnth}}l@{}}376 \begin{cfa} 377 void f( void );378 void f( char );379 void f( int, double );387 \begin{tabular}{@{}l@{\hspace{2.7\parindentlnth}}|@{\hspace{2\parindentlnth}}l@{}} 388 \begin{cfa} 389 void `f`( void ); 390 void `f`( char ); 391 void `f`( int, double ); 380 392 \end{cfa} 381 393 & 382 394 \begin{cfa} 383 f();384 f( 'a' );385 f( 3, 5.2 );395 `f`(); 396 `f`( 'a' ); 397 `f`( 3, 5.2 ); 386 398 \end{cfa} 387 399 \end{tabular} … … 389 401 // selection based on type and number of returns 390 402 \end{cfa} 391 \begin{tabular}{@{}l@{\hspace{ \parindentlnth}}|@{\hspace{\parindentlnth}}l@{}}392 \begin{cfa} 393 char f( int );394 double f( int );395 [char, double] f( int );403 \begin{tabular}{@{}l@{\hspace{2\parindentlnth}}|@{\hspace{2\parindentlnth}}l@{}} 404 \begin{cfa} 405 char `f`( int ); 406 double `f`( int ); 
407 [char, double] `f`( int ); 396 408 \end{cfa} 397 409 & 398 410 \begin{cfa} 399 char c = f( 3 );400 double d = f( 3 );401 [d, c] = f( 3 );411 char c = `f`( 3 ); 412 double d = `f`( 3 ); 413 [d, c] = `f`( 3 ); 402 414 \end{cfa} 403 415 \end{tabular} … … 405 417 \end{cquote} 406 418 Overloading is important for \CFA concurrency since the runtime system relies on creating different types to represent concurrency objects. 407 Therefore, overloading is necessary to prevent the need for long prefixes and other naming conventions t hatprevent name clashes.419 Therefore, overloading is necessary to prevent the need for long prefixes and other naming conventions to prevent name clashes. 408 420 As seen in Section~\ref{basics}, function @main@ is heavily overloaded. 409 421 … … 414 426 with ( s, t ) { 415 427 j + k; $\C{// unambiguous, s.j + t.k}$ 416 m = 5.0; $\C{// unambiguous, t.m = 5.0}$417 m = 1; $\C{// unambiguous, s.m = 1}$418 int a = m; $\C{// unambiguous, a = s.i}$419 double b = m; $\C{// unambiguous, b = t.m}$428 m = 5.0; $\C{// unambiguous, s.m = 5.0}$ 429 m = 1; $\C{// unambiguous, t.m = 1}$ 430 int a = m; $\C{// unambiguous, a = t.m }$ 431 double b = m; $\C{// unambiguous, b = s.m}$ 420 432 int c = `s.i` + `t.i`; $\C{// unambiguous, qualification}$ 421 (double)m; $\C{// unambiguous, cast }$422 } 423 \end{cfa} 424 For parallel semantics, both @s.i@ and @t.i@ are visible withthe same type, so only @i@ is ambiguous without qualification.433 (double)m; $\C{// unambiguous, cast s.m}$ 434 } 435 \end{cfa} 436 For parallel semantics, both @s.i@ and @t.i@ are visible the same type, so only @i@ is ambiguous without qualification. 
425 437 426 438 … … 514 526 \begin{cfa} 515 527 struct VLA { int len, * data; }; $\C{// variable length array of integers}$ 516 void ?{}( VLA & vla ) with ( vla ) { len = 10; data = alloc( len ); } // default constructor528 void ?{}( VLA & vla ) with ( vla ) { len = 10; data = alloc( len ); } $\C{// default constructor}$ 517 529 void ?{}( VLA & vla, int size, char fill ) with ( vla ) { len = size; data = alloc( len, fill ); } // initialization 518 void ?{}( VLA & vla, VLA other ) { vla.len = other.len; vla.data = other.data; } // copy, shallow530 void ?{}( VLA & vla, VLA other ) { vla.len = other.len; vla.data = other.data; } $\C{// copy, shallow}$ 519 531 void ^?{}( VLA & vla ) with ( vla ) { free( data ); } $\C{// destructor}$ 520 532 { 521 533 VLA x, y = { 20, 0x01 }, z = y; $\C{// z points to y}$ 522 // ?{}( x ); ?{}( y, 20, 0x01 ); ?{}( z, y );534 // x{}; y{ 20, 0x01 }; z{ z, y }; 523 535 ^x{}; $\C{// deallocate x}$ 524 536 x{}; $\C{// reallocate x}$ … … 527 539 y{ x }; $\C{// reallocate y, points to x}$ 528 540 x{}; $\C{// reallocate x, not pointing to y}$ 529 // ^?{}(z); ^?{}(y); ^?{}(x);541 // ^z{}; ^y{}; ^x{}; 530 542 } 531 543 \end{cfa} … … 546 558 \section{Concurrency Basics}\label{basics} 547 559 548 At its core, concurrency is based on multiple call-stacks and scheduling among threads executing on these stacks. 549 Multiple call stacks (or contexts) and a single thread of execution does \emph{not} imply concurrency. 550 Execution with a single thread and multiple stacks where the thread is deterministically self-scheduling across the stacks is called \newterm{coroutining}; 551 execution with a single thread and multiple stacks but where the thread is scheduled by an oracle (non-deterministic from the thread's perspective) across the stacks is called concurrency~\cite[\S~3]{Buhr05a}. 
552 Therefore, a minimal concurrency system can be achieved using coroutines (see Section \ref{coroutine}), which instead of context-switching among each other, always defer to an oracle for where to context-switch next. 553 554 While coroutines can execute on the caller's stack-frame, stack-full coroutines allow full generality and are sufficient as the basis for concurrency. 555 The aforementioned oracle is a scheduler and the whole system now follows a cooperative threading-model (a.k.a., non-preemptive scheduling). 556 The oracle/scheduler can either be a stack-less or stack-full entity and correspondingly require one or two context-switches to run a different coroutine. 557 In any case, a subset of concurrency related challenges start to appear. 558 For the complete set of concurrency challenges to occur, the only feature missing is preemption. 559 560 A scheduler introduces order of execution uncertainty, while preemption introduces uncertainty about where context switches occur. 561 Mutual exclusion and synchronization are ways of limiting non-determinism in a concurrent system. 562 Now it is important to understand that uncertainty is desirable; uncertainty can be used by runtime systems to significantly increase performance and is often the basis of giving a user the illusion that tasks are running in parallel. 560 At its core, concurrency is based on multiple call-stacks and scheduling threads executing on these stacks. 561 Multiple call stacks (or contexts) and a single thread of execution, called \newterm{coroutining}~\cite{Conway63,Marlin80}, does \emph{not} imply concurrency~\cite[\S~2]{Buhr05a}. 562 In coroutining, the single thread is self-scheduling across the stacks, so execution is deterministic, \ie given fixed inputs, the execution path to the outputs is fixed and predictable. 
563 A \newterm{stackless} coroutine executes on the caller's stack~\cite{Python} but this approach is restrictive, \eg preventing modularization and supporting only iterator/generator-style programming; 564 a \newterm{stackfull} coroutine executes on its own stack, allowing full generality. 565 Only stackfull coroutines are a stepping-stone to concurrency. 566 567 The transition to concurrency, even for execution with a single thread and multiple stacks, occurs when coroutines also context switch to a scheduling oracle, introducing non-determinism from the coroutine perspective~\cite[\S~3]{Buhr05a}. 568 Therefore, a minimal concurrency system is possible using coroutines (see Section \ref{coroutine}) in conjunction with a scheduler to decide where to context switch next. 569 The resulting execution system now follows a cooperative threading-model, called \newterm{non-preemptive scheduling}. 570 571 Because the scheduler is special, it can either be a stackless or stackfull coroutine. 572 For stackless, the scheduler performs scheduling on the stack of the current coroutine and switches directly to the next coroutine, so there is one context switch. 573 For stackfull, the current coroutine switches to the scheduler, which performs scheduling, and it then switches to the next coroutine, so there are two context switches. 574 A stackfull scheduler is often used for simplicity and security, even through there is a slightly higher runtime-cost. 575 576 Regardless of the approach used, a subset of concurrency related challenges start to appear. 577 For the complete set of concurrency challenges to occur, the missing feature is \newterm{preemption}, where context switching occurs randomly between any two instructions, often based on a timer interrupt, called \newterm{preemptive scheduling}. 578 While a scheduler introduces uncertainty in the order of execution, preemption introduces uncertainty where context switches occur. 
579 Interestingly, uncertainty is necessary for the runtime (operating) system to give the illusion of parallelism on a single processor and increase performance on multiple processors. 580 The reason is that only the runtime has complete knowledge about resources and how to best utilized them. 581 However, the introduction of unrestricted non-determinism results in the need for \newterm{mutual exclusion} and \newterm{synchronization} to restrict non-determinism for correctness; 582 otherwise, it is impossible to write meaningful programs. 563 583 Optimal performance in concurrent applications is often obtained by having as much non-determinism as correctness allows. 564 584 … … 566 586 \subsection{\protect\CFA's Thread Building Blocks} 567 587 568 One of the important features that are missingin C is threading\footnote{While the C11 standard defines a ``threads.h'' header, it is minimal and defined as optional.588 An important missing feature in C is threading\footnote{While the C11 standard defines a ``threads.h'' header, it is minimal and defined as optional. 569 589 As such, library support for threading is far from widespread. 570 590 At the time of writing the paper, neither \protect\lstinline|gcc| nor \protect\lstinline|clang| support ``threads.h'' in their standard libraries.}. 571 On modern architectures, a lack of threading is unacceptable~\cite{Sutter05, Sutter05b}, and therefore modern programming languages must have the proper tools to allow users to writeefficient concurrent programs to take advantage of parallelism.591 On modern architectures, a lack of threading is unacceptable~\cite{Sutter05, Sutter05b}, and therefore existing and new programming languages must have tools for writing efficient concurrent programs to take advantage of parallelism. 572 592 As an extension of C, \CFA needs to express these concepts in a way that is as natural as possible to programmers familiar with imperative languages. 
573 And being a system-level language means programmers expect to choose precisely which features they need and which cost they are willing to pay. 593 Furthermore, because C is a system-level language, programmers expect to choose precisely which features they need and which cost they are willing to pay. 594 Hence, concurrent programs should be written using high-level mechanisms, and only step down to lower-level mechanisms when performance bottlenecks are encountered. 574 595 575 596 576 597 \subsection{Coroutines: A Stepping Stone}\label{coroutine} 577 598 578 While the focus of this proposalis concurrency and parallelism, it is important to address coroutines, which are a significant building block of a concurrency system.579 \newterm{Coroutine}s are generalized routines with points where execution is suspended and resumed at a later time.580 Suspend/resume is a context switche and coroutines have other context-management operations.581 Many design challenges of threads are partially present in designing coroutines, which makes the design effort relevant.582 The core \textbf{api} of coroutines has two features: independent call-stacks and @suspend@/@resume@.583 584 A coroutine handles the class of problems that need to retain state between calls (\eg plugin, device driver, finite-state machine). 585 For example, a problem made easier with coroutines is unbounded generators, \eg generating an infinite sequence of Fibonacci numbers :599 While the focus of this discussion is concurrency and parallelism, it is important to address coroutines, which are a significant building block of a concurrency system. 600 Coroutines are generalized routines allowing execution to be temporarily suspend and later resumed. 601 Hence, unlike a normal routine, a coroutine may not terminate when it returns to its caller, allowing it to be restarted with the values and execution location present at the point of suspension. 
602 This capability is accomplish via the coroutine's stack, where suspend/resume context switch among stacks. 603 Because threading design-challenges are present in coroutines, their design effort is relevant, and this effort can be easily exposed to programmers giving them a useful new programming paradigm because a coroutine handles the class of problems that need to retain state between calls, \eg plugins, device drivers, and finite-state machines. 604 Therefore, the core \CFA coroutine-API for has two fundamental features: independent call-stacks and @suspend@/@resume@ operations. 605 606 For example, a problem made easier with coroutines is unbounded generators, \eg generating an infinite sequence of Fibonacci numbers, where Figure~\ref{f:C-fibonacci} shows conventional approaches for writing a Fibonacci generator in C. 586 607 \begin{displaymath} 587 f(n) = \left \{608 \mathsf{fib}(n) = \left \{ 588 609 \begin{array}{ll} 589 0 & n = 0 \\590 1 & n = 1 \\591 f(n-1) + f(n-2) & n \ge 2 \\610 0 & n = 0 \\ 611 1 & n = 1 \\ 612 \mathsf{fib}(n-1) + \mathsf{fib}(n-2) & n \ge 2 \\ 592 613 \end{array} 593 614 \right. 594 615 \end{displaymath} 595 Figure~\ref{f:C-fibonacci} shows conventional approaches for writing a Fibonacci generator in C.596 597 616 Figure~\ref{f:GlobalVariables} illustrates the following problems: 598 un encapsulated global variables necessary to retain state between calls;599 only one fibonacci generator can run at a time;600 execution state must be explicitly retained .617 unique unencapsulated global variables necessary to retain state between calls; 618 only one Fibonacci generator; 619 execution state must be explicitly retained via explicit state variables. 
601 620 Figure~\ref{f:ExternalState} addresses these issues: 602 621 unencapsulated program global variables become encapsulated structure variables; 603 multiple fibonacci generators can run at a time by declaring multiple fibonacci objects;604 explicit execution state is removed by precomputing the first two Fibonacci numbers and returning $ f(n-2)$.622 unique global variables are replaced by multiple Fibonacci objects; 623 explicit execution state is removed by precomputing the first two Fibonacci numbers and returning $\mathsf{fib}(n-2)$. 605 624 606 625 \begin{figure} … … 662 681 \begin{lstlisting}[aboveskip=0pt,belowskip=0pt] 663 682 `coroutine` Fib { int fn; }; 664 void ?{}( Fibonacci & fib ) with( fib ) { fn = 0; } 665 void main( Fib & f ) with( f ) { 683 void main( Fib & fib ) with( fib ) { 666 684 int f1, f2; 667 685 fn = 0; f1 = fn; `suspend()`; … … 687 705 \begin{lstlisting}[aboveskip=0pt,belowskip=0pt] 688 706 `coroutine` Fib { int ret; }; 689 void main( Fib & f ) with( f ) {707 void main( Fib & f ) with( fib ) { 690 708 int fn, f1 = 1, f2 = 0; 691 709 for ( ;; ) { … … 714 732 \end{figure} 715 733 716 Figure~\ref{f:Coroutine3States} creates a @coroutine@ type, which provides communication for multiple interface functions, and the \newterm{coroutine main}, which runs on the coroutine stack. 717 \begin{cfa} 718 `coroutine C { char c; int i; _Bool s; };` $\C{// used for communication}$ 719 void ?{}( C & c ) { s = false; } $\C{// constructor}$ 720 void main( C & cor ) with( cor ) { $\C{// actual coroutine}$ 721 while ( ! s ) // process c 722 if ( v == ... 
) s = false; 723 } 724 // interface functions 725 char cont( C & cor, char ch ) { c = ch; resume( cor ); return c; } 726 _Bool stop( C & cor, int v ) { s = true; i = v; resume( cor ); return s; } 727 \end{cfa} 728 729 encapsulates the Fibonacci state in the shows is an example of a solution to the Fibonacci problem using \CFA coroutines, where the coroutine stack holds sufficient state for the next generation. 730 This solution has the advantage of having very strong decoupling between how the sequence is generated and how it is used. 731 Indeed, this version is as easy to use as the @fibonacci_state@ solution, while the implementation is very similar to the @fibonacci_func@ example. 732 733 Figure~\ref{f:fmt-line} shows the @Format@ coroutine for restructuring text into groups of character blocks of fixed size. 734 The example takes advantage of resuming coroutines in the constructor to simplify the code and highlights the idea that interesting control flow can occur in the constructor. 734 Using a coroutine, it is possible to express the Fibonacci formula directly without any of the C problems. 735 Figure~\ref{f:Coroutine3States} creates a @coroutine@ type: 736 \begin{cfa} 737 `coroutine` Fib { int fn; }; 738 \end{cfa} 739 which provides communication, @fn@, for the \newterm{coroutine main}, @main@, which runs on the coroutine stack, and possibly multiple interface functions, @next@. 740 Like the structure in Figure~\ref{f:ExternalState}, the coroutine type allows multiple instances, where instances of this type are passed to the (overloaded) coroutine main. 741 The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's resume. 
742 The interface function, @next@, takes a Fibonacci instance and context switches to it using @resume@; 743 on return, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned. 744 The first @resume@ is special because it cocalls the coroutine at its coroutine main and allocates the stack; 745 when the coroutine main returns, its stack is deallocated. 746 Hence, @Fib@ is an object at creation, transitions to a coroutine on its first resume, and transitions back to an object when the coroutine main finishes. 747 Figure~\ref{f:Coroutine1State} shows the coroutine version of the C version in Figure~\ref{f:ExternalState}. 748 Coroutine generators are called \newterm{output coroutines} because values are returned by the coroutine. 749 750 Figure~\ref{f:CFAFmt} shows an \newterm{input coroutine}, @Format@, for restructuring text into groups of character blocks of fixed size. 751 For example, the input of the left is reformatted into the output on the right. 752 \begin{quote} 753 \tt 754 \begin{tabular}{@{}l|l@{}} 755 \multicolumn{1}{c|}{\textbf{\textrm{input}}} & \multicolumn{1}{c}{\textbf{\textrm{output}}} \\ 756 abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz 757 & 758 \begin{tabular}[t]{@{}lllll@{}} 759 abcd & efgh & ijkl & mnop & qrst \\ 760 uvwx & yzab & cdef & ghij & klmn \\ 761 opqr & stuv & wxyz & & 762 \end{tabular} 763 \end{tabular} 764 \end{quote} 765 The example takes advantage of resuming coroutines in the constructor to prime the coroutine loops so the first character sent for formatting appears inside the nested loops. 766 The destruction provides a newline if formatted text ends with a full line. 767 Figure~\ref{f:CFmt} shows the C equivalent formatter, where the loops of the coroutine are flatten (linearized) and rechecked on each call because execution location is not retained between calls. 
735 768 736 769 \begin{figure} 737 \begin{cfa}[xleftmargin=4\parindentlnth] 770 \centering 771 \newbox\myboxA 772 \begin{lrbox}{\myboxA} 773 \begin{lstlisting}[aboveskip=0pt,belowskip=0pt] 738 774 `coroutine` Format { 739 char ch; $\C{// used for communication}$740 int g, b; $\C{// global because used in destructor}$775 char ch; // used for communication 776 int g, b; // global because used in destructor 741 777 }; 742 void ?{}( Format & fmt ) { `resume( fmt );` } $\C{// prime (start) coroutine}$743 void ^?{}( Format & fmt ) with( fmt ) { if ( g != 0 || b != 0 ) sout | endl; }744 778 void main( Format & fmt ) with( fmt ) { 745 for ( ;; ) { $\C{// for as many characters}$746 for ( g = 0; g < 5; g += 1 ) { $\C{// groups of 5 blocks}$747 for ( b = 0; b < 4; b += 1 ) { $\C{// blocks of 4 characters}$779 for ( ;; ) { 780 for ( g = 0; g < 5; g += 1 ) { // group 781 for ( b = 0; b < 4; b += 1 ) { // block 748 782 `suspend();` 749 sout | ch; $\C{// print character}$783 sout | ch; // separator 750 784 } 751 sout | " "; $\C{// print block separator}$785 sout | " "; // separator 752 786 } 753 sout | endl; $\C{// print group separator}$787 sout | endl; 754 788 } 755 789 } 756 void prt( Format & fmt, char ch ) { 757 fmt.ch = ch; 790 void ?{}( Format & fmt ) { `resume( fmt );` } 791 void ^?{}( Format & fmt ) with( fmt ) { 792 if ( g != 0 || b != 0 ) sout | endl; 793 } 794 void format( Format & fmt ) { 758 795 `resume( fmt );` 759 796 } 760 797 int main() { 761 798 Format fmt; 799 eof: for ( ;; ) { 800 sin | fmt.ch; 801 if ( eof( sin ) ) break eof; 802 format( fmt ); 803 } 804 } 805 \end{lstlisting} 806 \end{lrbox} 807 808 \newbox\myboxB 809 \begin{lrbox}{\myboxB} 810 \begin{lstlisting}[aboveskip=0pt,belowskip=0pt] 811 struct Format { 762 812 char ch; 763 for ( ;; ) { $\C{// read until end of file}$ 764 sin | ch; $\C{// read one character}$ 765 if ( eof( sin ) ) break; $\C{// eof ?}$ 766 prt( fmt, ch ); $\C{// push character for formatting}$ 813 int g, b; 814 }; 815 void format( 
struct Format * fmt ) { 816 if ( fmt->ch != -1 ) { // not EOF 817 printf( "%c", fmt->ch ); 818 fmt->b += 1; 819 if ( fmt->b == 4 ) { // block 820 printf( " " ); // separator 821 fmt->b = 0; 822 fmt->g += 1; 823 } 824 if ( fmt->g == 5 ) { // group 825 printf( "\n" ); // separator 826 fmt->g = 0; 827 } 828 } else { 829 if ( fmt->g != 0 || fmt->b != 0 ) printf( "\n" ); 767 830 } 768 831 } 769 \end{cfa} 832 int main() { 833 struct Format fmt = { 0, 0, 0 }; 834 for ( ;; ) { 835 scanf( "%c", &fmt.ch ); 836 if ( feof( stdin ) ) break; 837 format( &fmt ); 838 } 839 fmt.ch = -1; 840 format( &fmt ); 841 } 842 \end{lstlisting} 843 \end{lrbox} 844 \subfloat[\CFA Coroutine]{\label{f:CFAFmt}\usebox\myboxA} 845 \qquad 846 \subfloat[C Linearized]{\label{f:CFmt}\usebox\myboxB} 770 847 \caption{Formatting text into lines of 5 blocks of 4 characters.} 771 848 \label{f:fmt-line} 772 849 \end{figure} 773 850 851 The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming function for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal functions. 852 However, there is no stack growth because @resume@/@suspend@ context switch to an existing stack frames rather than create a new one. 853 \newterm{Symmetric (full) coroutine}s have a coroutine call a resuming function for another coroutine, which eventually forms a cycle. 854 (The trivial cycle is a coroutine resuming itself.) 855 This control flow is similar to recursion for normal routines, but again there is no stack growth from the context switch. 
856 774 857 \begin{figure} 775 858 \centering 776 \lstset{language=CFA,escapechar={},moredelim=**[is][\protect\color{red}]{`}{`}} 859 \lstset{language=CFA,escapechar={},moredelim=**[is][\protect\color{red}]{`}{`}}% allow $ 777 860 \begin{tabular}{@{}l@{\hspace{2\parindentlnth}}l@{}} 778 861 \begin{cfa} … … 806 889 Prod prod; 807 890 Cons cons = { prod }; 808 srandom( getpid() );809 891 start( prod, 5, cons ); 810 892 } … … 843 925 `resume( cons );` 844 926 } 845 846 927 \end{cfa} 847 928 \end{tabular} … … 849 930 \label{f:ProdCons} 850 931 \end{figure} 932 933 Figure~\ref{f:ProdCons} shows a producer/consumer symmetric-coroutine performing bi-directional communication. 934 Since the solution involves a full-coroutining cycle, the program main creates one coroutine in isolation, passes this coroutine to its partner, and closes the cycle at the call to @start@. 935 The @start@ function communicates both the number of elements to be produced and the consumer into the producer's coroutine structure. 936 Then the @resume@ to @prod@ creates @prod@'s stack with a frame for @prod@'s coroutine main at the top, and context switches to it. 937 @prod@'s coroutine main starts, creates local variables that are retained between coroutine activations, and executes $N$ iterations, each generating two random vales, calling the consumer to deliver the values, and printing the status returned from the consumer. 938 939 The producer call to @delivery@ transfers values into the consumer's communication variables, resumes the consumer, and returns the consumer status. 940 For the first resume, @cons@'s stack is initialized, creating local variables retained between subsequent activations of the coroutine. 941 The consumer iterates until the @done@ flag is set, prints, increments status, and calls back to the producer's @payment@ member, and on return prints the receipt from the producer and increments the money for the next payment. 
942 The call from the consumer to the producer's @payment@ member introduces the cycle between producer and consumer. 943 When @payment@ is called, the consumer copies values into the producer's communication variable and a resume is executed. 944 The context switch restarts the producer at the point where it was last context switched and it continues in member @delivery@ after the resume. 945 946 The @delivery@ member returns the status value in @prod@'s @main@ member, where the status is printed. 947 The loop then repeats calling @delivery@, where each call resumes the consumer coroutine. 948 The context switch to the consumer continues in @payment@. 949 The consumer increments and returns the receipt to the call in @cons@'s @main@ member. 950 The loop then repeats calling @payment@, where each call resumes the producer coroutine. 951 952 After iterating $N$ times, the producer calls @stop@. 953 The @done@ flag is set to stop the consumer's execution and a resume is executed. 954 The context switch restarts @cons@ in @payment@ and it returns with the last receipt. 955 The consumer terminates its loops because @done@ is true, its @main@ terminates, so @cons@ transitions from a coroutine back to an object, and @prod@ reactivates after the resume in @stop@. 956 The @stop@ member returns and @prod@'s @main@ member terminates. 957 The program main restarts after the resume in @start@. 958 The @start@ member returns and the program main terminates. 851 959 852 960 … … 3493 3601 \bibliography{pl,local} 3494 3602 3603 3495 3604 \end{document} 3496 3605 -
doc/papers/general/Paper.tex
re9a7e90b r48b9b36 53 53 %\newcommand{\TODO}[1]{\textbf{TODO}: {\itshape #1}} % TODO included 54 54 \newcommand{\TODO}[1]{} % TODO elided 55 56 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 55 57 56 58 % Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore … … 163 165 literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptstyle\land\,$}}1 164 166 {~}{\raisebox{0.3ex}{$\scriptstyle\sim\,$}}1 % {`}{\ttfamily\upshape\hspace*{-0.1ex}`}1 165 {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex{\textgreater}}2, 167 {<}{\textrm{\textless}}1 {>}{\textrm{\textgreater}}1 168 {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.5ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex{\textrm{\textgreater}}}2, 166 169 moredelim=**[is][\color{red}]{`}{`}, 167 170 }% lstset … … 203 206 204 207 The goal of the \CFA project (pronounced ``C-for-all'') is to create an extension of C that provides modern safety and productivity features while still ensuring strong backwards compatibility with C and its programmers. 205 Prior projects have attempted similar goals but failed to honour C programming-style; for instance, adding object-oriented or functional programming with garbage collection is a non-starter for many C developers. 208 Prior projects have attempted similar goals but failed to honour C programming-style; 209 for instance, adding object-oriented or functional programming with garbage collection is a non-starter for many C developers. 206 210 Specifically, \CFA is designed to have an orthogonal feature-set based closely on the C programming paradigm, so that \CFA features can be added \emph{incrementally} to existing C code-bases, and C programmers can learn \CFA extensions on an as-needed basis, preserving investment in existing code and programmers. 
207 211 This paper presents a quick tour of \CFA features showing how their design avoids shortcomings of similar features in C and other C-like languages. … … 219 223 220 224 \section{Introduction} 225 221 226 The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from commercial operating-systems to hobby projects. 222 227 This installation base and the programmers producing it represent a massive software-engineering investment spanning decades and likely to continue for decades more. … … 2000 2005 Destruction parameters are useful for specifying storage-management actions, such as de-initialize but not deallocate.}. 2001 2006 \begin{cfa} 2002 struct VLA { int len, * data; }; 2007 struct VLA { int len, * data; }; $\C{// variable length array of integers}$ 2003 2008 void ?{}( VLA & vla ) with ( vla ) { len = 10; data = alloc( len ); } $\C{// default constructor}$ 2004 2009 void ^?{}( VLA & vla ) with ( vla ) { free( data ); } $\C{// destructor}$