Changes in / [b0ab7853:35a408b7]
Files:
- 2 added
- 4 edited

- benchmark/Makefile.am (modified) (7 diffs)
- benchmark/creation/qthreads.c (added)
- benchmark/ctxswitch/qthreads.c (added)
- doc/bibliography/pl.bib (modified) (1 diff)
- doc/papers/concurrency/Paper.tex (modified) (7 diffs)
- doc/user/user.tex (modified) (8 diffs)
benchmark/Makefile.am
  rb0ab7853 r35a408b7
    11   11  ## Created On : Sun May 31 09:08:15 2015
    12   12  ## Last Modified By : Peter A. Buhr
-   13       ## Last Modified On : Mon Jun 24 16:45:42 2019
-   14       ## Update Count : 53
+        13  ## Last Modified On : Sun Jun 23 12:34:29 2019
+        14  ## Update Count : 52
    15   15  ###############################################################################
    16   16
     …    …
    31   31  BENCH_V_JAVAC = $(__bench_v_JAVAC_$(__quiet))
    32   32  BENCH_V_UPP = $(__bench_v_UPP_$(__quiet))
+        33  BENCH_V_QTHREAD = $(__bench_v_QTHREAD_$(__quiet))
    33   34
    34   35  __quiet = verbose
     …    …
    45   46  __bench_v_JAVAC_verbose = $(AM_V_JAVAC)
    46   47  __bench_v_UPP_verbose = $(AM_V_UPP)
+        48  __bench_v_QTHREAD_verbose = $(AM_V_CC)
    47   49
    48   50
     …    …
   174  176      ctxswitch-upp_thread.run \
   175  177      ctxswitch-goroutine.run \
-  176           ctxswitch-java_thread.run
+       178      ctxswitch-java_thread.run \
+       179      ctxswitch-qthreads.run
   177  180
   178  181
     …    …
   221  224      @echo "java JavaThread" >> a.out
   222  225      @chmod a+x a.out
+       226
+       227  ctxswitch-qthreads$(EXEEXT):
+       228      $(BENCH_V_QTHREADS)$(COMPILE) -DBENCH_N=50000000 -I/u/pabuhr/software/qthreads/include -L/u/pabuhr/software/qthreads/lib -Xlinker -R/u/pabuhr/software/qthreads/lib $(srcdir)/ctxswitch/qthreads.c -lqthread
   223  229
   224  230  ## =========================================================================================================
     …    …
   314  320      creation-upp_thread.run \
   315  321      creation-goroutine.run \
-  316           creation-java_thread.run
+       322      creation-java_thread.run \
+       323      creation-qthreads.run
   317  324
   318  325  creation-cfa_coroutine$(EXEEXT):
     …    …
   342  349      @echo "java JavaThread" >> a.out
   343  350      @chmod a+x a.out
+       351
+       352  creation-qthreads$(EXEEXT):
+       353      $(BENCH_V_QTHREADS)$(COMPILE) -DBENCH_N=50000000 -I/u/pabuhr/software/qthreads/include -L/u/pabuhr/software/qthreads/lib -Xlinker -R/u/pabuhr/software/qthreads/lib $(srcdir)/ctxswitch/qthreads.c -lqthread
   344  354
   345  355  ## =========================================================================================================
doc/bibliography/pl.bib
  rb0ab7853 r35a408b7
   954  954  key = {Cforall Benchmarks},
   955  955  author = {{\textsf{C}{$\mathbf{\forall}$} Benchmarks}},
-  956       howpublished= {\href{https://plg.uwaterloo.ca/~cforall/benchmark.tar}{https://\-plg.uwaterloo.ca/\-$\sim$cforall/\-benchmark.tar}},
+       956  howpublished= {\href{https://plg.uwaterloo.ca/~cforall/benchmarks}{https://\-plg.uwaterloo.ca/\-$\sim$cforall/\-benchmarks}},
   957  957  }
   958  958
doc/papers/concurrency/Paper.tex
  rb0ab7853 r35a408b7
   316  316  Finally, performant user-threading implementations (both time and space) meet or exceed direct kernel-threading implementations, while achieving the programming advantages of high concurrency levels and safety.
   317  317
-  318       A further effort over the past two decades is the development of language memory models to deal with the conflict between language features and compiler/hardware optimizations, \ie some language features are unsafe in the presence of aggressive sequential optimizations~\cite{Buhr95a,Boehm05}.
+       318  A further effort over the past two decades is the development of language memory models to deal with the conflict between language features and compiler/hardware optimizations, \ie, some language features are unsafe in the presence of aggressive sequential optimizations~\cite{Buhr95a,Boehm05}.
   319  319  The consequence is that a language must provide sufficient tools to program around safety issues, as inline and library code is all sequential to the compiler.
-  320       One solution is low-level qualifiers and functions (\eg @volatile@ and atomics) allowing \emph{programmers} to explicitly write safe (race-free~\cite{Boehm12}) programs.
+       320  One solution is low-level qualifiers and functions (\eg, @volatile@ and atomics) allowing \emph{programmers} to explicitly write safe (race-free~\cite{Boehm12}) programs.
   321  321  A safer solution is high-level language constructs so the \emph{compiler} knows the optimization boundaries, and hence, provides implicit safety.
   322  322  This problem is best known with respect to concurrency, but applies to other complex control-flow, like exceptions\footnote{
     …    …
   324  324  The key feature that dovetails with this paper is nonlocal exceptions allowing exceptions to be raised across stacks, with synchronous exceptions raised among coroutines and asynchronous exceptions raised among threads, similar to that in \uC~\cite[\S~5]{uC++}
   325  325  } and coroutines.
-  326       Finally, language solutions allow matching constructs with language paradigm, \ie imperative and functional languages often have different presentations of the same concept to fit their programming model.
+       326  Finally, language solutions allow matching constructs with language paradigm, \ie, imperative and functional languages often have different presentations of the same concept to fit their programming model.
   327  327
   328  328  Finally, it is important for a language to provide safety over performance \emph{as the default}, allowing careful reduction of safety for performance when necessary.
-  329       Two concurrency violations of this philosophy are \emph{spurious wakeup} (random wakeup~\cite[\S~8]{Buhr05a}) and \emph{barging} (signals-as-hints~\cite[\S~8]{Buhr05a}), where one is a consequence of the other, \ie once there is spurious wakeup, signals-as-hints follow.
+       329  Two concurrency violations of this philosophy are \emph{spurious wakeup} (random wakeup~\cite[\S~8]{Buhr05a}) and \emph{barging} (signals-as-hints~\cite[\S~8]{Buhr05a}), where one is a consequence of the other, \ie, once there is spurious wakeup, signals-as-hints follow.
   330  330  However, spurious wakeup is \emph{not} a foundational concurrency property~\cite[\S~8]{Buhr05a}, it is a performance design choice.
   331  331  Similarly, signals-as-hints are often a performance decision.
     …    …
   337  337  Most augmented traditional (Fortran 18~\cite{Fortran18}, Cobol 14~\cite{Cobol14}, Ada 12~\cite{Ada12}, Java 11~\cite{Java11}) and new languages (Go~\cite{Go}, Rust~\cite{Rust}, and D~\cite{D}), except \CC, diverge from C with different syntax and semantics, only interoperate indirectly with C, and are not systems languages, for those with managed memory.
   338  338  As a result, there is a significant learning curve to move to these languages, and C legacy-code must be rewritten.
-  339       While \CC, like \CFA, takes an evolutionary approach to extend C, \CC's constantly growing complex and interdependent features-set (\eg objects, inheritance, templates, etc.) mean idiomatic \CC code is difficult to use from C, and C programmers must expend significant effort learning \CC.
+       339  While \CC, like \CFA, takes an evolutionary approach to extend C, \CC's constantly growing complex and interdependent features-set (\eg, objects, inheritance, templates, etc.) mean idiomatic \CC code is difficult to use from C, and C programmers must expend significant effort learning \CC.
   340  340  Hence, rewriting and retraining costs for these languages, even \CC, are prohibitive for companies with a large C software-base.
   341  341  \CFA with its orthogonal feature-set, its high-performance runtime, and direct access to all existing C libraries circumvents these problems.
     …    …
   367  367  \section{Stateful Function}
   368  368
-  369       The stateful function is an old idea~\cite{Conway63,Marlin80} that is new again~\cite{C++20Coroutine19}, where execution is temporarily suspended and later resumed, \eg plugin, device driver, finite-state machine.
+       369  The stateful function is an old idea~\cite{Conway63,Marlin80} that is new again~\cite{C++20Coroutine19}, where execution is temporarily suspended and later resumed, \eg, plugin, device driver, finite-state machine.
   370  370  Hence, a stateful function may not end when it returns to its caller, allowing it to be restarted with the data and execution location present at the point of suspension.
   371  371  This capability is accomplished by retaining a data/execution \emph{closure} between invocations.
-  372       If the closure is fixed size, we call it a \emph{generator} (or \emph{stackless}), and its control flow is restricted, \eg suspending outside the generator is prohibited.
-  373       If the closure is variable size, we call it a \emph{coroutine} (or \emph{stackful}), and as the names implies, often implemented with a separate stack with no programming restrictions.
+       372  If the closure is fixed size, we call it a \emph{generator} (or \emph{stackless}), and its control flow is restricted, \eg, suspending outside the generator is prohibited.
+       373  If the closure is variably sized, we call it a \emph{coroutine} (or \emph{stackful}), and as the names implies, often implemented with a separate stack with no programming restrictions.
   374  374  Hence, refactoring a stackless coroutine may require changing it to stackful.
-  375       A foundational property of all \emph{stateful functions} is that resume/suspend \emph{do not} cause incremental stack growth, \ie resume/suspend operations are remembered through the closure not the stack.
+       375  A foundational property of all \emph{stateful functions} is that resume/suspend \emph{do not} cause incremental stack growth, \ie, resume/suspend operations are remembered through the closure not the stack.
   376  376  As well, activating a stateful function is \emph{asymmetric} or \emph{symmetric}, identified by resume/suspend (no cycles) and resume/resume (cycles).
   377  377  A fixed closure activated by modified call/return is faster than a variable closure activated by context switching.
-  378       Additionally, any storage management for the closure (especially in unmanaged languages, \ie no garbage collection) must also be factored into design and performance.
+       378  Additionally, any storage management for the closure (especially in unmanaged languages, \ie, no garbage collection) must also be factored into design and performance.
   379  379  Therefore, selecting between stackless and stackful semantics is a tradeoff between programming requirements and performance, where stackless is faster and stackful is more general.
   380  380  Note, creation cost is amortized across usage, so activation cost is usually the dominant factor.
     …    …
   648  648  \end{center}
   649  649  The example takes advantage of resuming a generator in the constructor to prime the loops so the first character sent for formatting appears inside the nested loops.
-  650       The destructor provides a newline, if formatted text ends with a full line.
+       650  The destructor provides a newline if formatted text ends with a full line.
   651  651  Figure~\ref{f:CFormatSim} shows the C implementation of the \CFA input generator with one additional field and the computed @goto@.
   652  652  For contrast, Figure~\ref{f:PythonFormatter} shows the equivalent Python format generator with the same properties as the Fibonacci generator.
     …    …
  2719 2719  Each benchmark experiment is run 31 times.
  2720 2720  All omitted tests for other languages are functionally identical to the \CFA tests and available online~\cite{CforallBenchMarks}.
+      2721  % tar --exclude=.deps --exclude=Makefile --exclude=Makefile.in --exclude=c.c --exclude=cxx.cpp --exclude=fetch_add.c -cvhf benchmark.tar benchmark
  2721 2722
- 2722
  2723 2723  \paragraph{Object Creation}
     …    …
  2749 2749  \multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
  2750 2750  \CFA Coroutine Lazy & 14.3 & 14.3 & 0.32 \\
- 2751       \CFA Coroutine Eager & 522.8 & 525.3 & 5.81 \\
+      2751  \CFA Coroutine Eager & 2203.7 & 2205.6 & 26.03 \\
  2752 2752  \CFA Thread & 1257.8 & 1291.2 & 86.19 \\
  2753 2753  \uC Coroutine & 92.2 & 91.4 & 1.58 \\
doc/user/user.tex
  rb0ab7853 r35a408b7
    11   11  %% Created On : Wed Apr 6 14:53:29 2016
    12   12  %% Last Modified By : Peter A. Buhr
-   13       %% Last Modified On : Tue Jun 25 08:51:33 2019
-   14       %% Update Count : 3871
+        13  %% Last Modified On : Sat Jun 15 16:29:45 2019
+        14  %% Update Count : 3847
    15   15  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    16   16
     …    …
  3346 3346
  3347 3347
- 3348       \section{Stream I/O Library}
- 3349       \label{s:StreamIOLibrary}
+      3348  \section{I/O Stream Library}
+      3349  \label{s:IOStreamLibrary}
  3350 3350  \index{input/output stream library}
  3351 3351  \index{stream library}
  3352 3352
- 3353       The goal of \CFA stream input/output (I/O) is to simplify the common cases\index{I/O!common case}, while fully supporting polymorphism and user defined types in a consistent way.
- 3354       Stream I/O can be implicitly or explicitly formatted.
- 3355       Implicit formatting means \CFA selects the output or input format for values that match with the type of a variable.
- 3356       Explicit formatting means additional information is specified to augment how an output or input of value is interpreted.
- 3357       \CFA formatting is a cross between C ©printf© and \CC ©cout© manipulators, and Python implicit spacing and newline.
- 3358       Specifically:
+      3353  The goal of \CFA input/output (I/O) is to simplify the common cases\index{I/O!common case}, while fully supporting polymorphism and user defined types in a consistent way.
+      3354  \CFA I/O combines ideas from C ©printf©, \CC, and Python.
+      3355  I/O can be unformatted or formatted.
+      3356  Unformatted means \CFA selects the output or input format for values that match with the type of a variable.
+      3357  Formatted means additional information is specified to augment how an output or input of value is interpreted.
+      3358  \CFA formatting is a cross between C ©printf© and \CC ©cout© manipulators.
  3359 3359  \begin{itemize}
  3360 3360  \item
- 3361       ©printf©/Python format codes are dense, making them difficult to read and remember.
+      3361  ©printf© format codes are dense, making them difficult to read and remember.
  3362 3362  \CFA/\CC format manipulators are named, making them easier to read and remember.
  3363 3363  \item
- 3364       ©printf©/Python separates format codes from associated variables, making it difficult to match codes with variables.
+      3364  ©printf© separates format codes from associated variables, making it difficult to match codes with variables.
  3365 3365  \CFA/\CC co-locate codes with associated variables, where \CFA has the tighter binding.
  3366 3366  \item
- 3367       Format manipulators in \CFA have local effect, whereas \CC have global effect, except ©setw©.
+      3367  Format manipulators in \CC have global rather than local effect, except ©setw©.
  3368 3368  Hence, it is common programming practice to toggle manipulators on and then back to the default to prevent downstream side-effects.
  3369 3369  Without this programming style, errors occur when moving prints, as manipulator effects incorrectly flow into the new location.
  3370 3370  (To guarantee no side-effects, manipulator values must be saved and restored across function calls.)
- 3371       \item
- 3372       \CFA has more sophisticated implicit spacing between values than Python, plus implicit newline at the end of a print.
  3373 3371  \end{itemize}
  3374 3372  The \CFA header file for the I/O library is \Indexc{fstream.hfa}.
  3375 3373
- 3376       For implicit formatted output, the common case is printing a series of variables separated by whitespace.
+      3374  For unformatted output, the common case is printing a sequence of variables separated by whitespace.
  3377 3375  \begin{cquote}
- 3378       \begin{tabular}{@{}l@{\hspace{2em}}l@{\hspace{2em}}l@{}}
- 3379       \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CC}} & \multicolumn{1}{c}{\textbf{Python}} \\
+      3376  \begin{tabular}{@{}l@{\hspace{3em}}l@{}}
+      3377  \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{\CC}} \\
  3380 3378  \begin{cfa}
  3381 3379  int x = 1, y = 2, z = 3;
     …    …
  3387 3385  cout << x ®<< " "® << y ®<< " "® << z << endl;
  3388 3386  \end{cfa}
- 3389       &
- 3390       \begin{cfa}
- 3391       x = 1; y = 2; z = 3
- 3392       print( x, y, z )
- 3393       \end{cfa}
  3394 3387  \\
- 3395       \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
- 3396       1® ®2® ®3
- 3397       \end{cfa}
- 3398       &
  3399 3388  \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
  3400 3389  1® ®2® ®3
     …    …
  3440 3429  There is a weak similarity between the \CFA logical-or operator and the \Index{Shell pipe-operator} for moving data, where data flows in the correct direction for input but the opposite direction for output.
  3441 3430
- 3442       For implicit formatted input, the common case is reading a sequence of values separated by whitespace, where the type of an input constant must match with the type of the input variable.
+      3431  For unformatter input, the common case is reading a sequence of values separated by whitespace, where the type of an input constant must match with the type of the input variable.
  3443 3432  \begin{cquote}
  3444 3433  \begin{lrbox}{\LstBox}
     …    …
  3447 3436  \end{cfa}
  3448 3437  \end{lrbox}
- 3449       \begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{3em}}l@{}}
+      3438  \begin{tabular}{@{}l@{\hspace{3em}}l@{}}
  3450 3439  \multicolumn{1}{@{}l@{}}{\usebox\LstBox} \\
- 3451       \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CFA}} & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CC}} & \multicolumn{1}{c}{\textbf{Python}} \\
+      3440  \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{\CC}} \\
  3452 3441  \begin{cfa}[aboveskip=0pt,belowskip=0pt]
  3453 3442  sin | x | y | z;
     …    …
  3457 3446  cin >> x >> y >> z;
  3458 3447  \end{cfa}
- 3459       &
- 3460       \begin{cfa}[aboveskip=0pt,belowskip=0pt]
- 3461       x = int(input()); y = float(input()); z = input();
- 3462       \end{cfa}
  3463 3448  \\
  3464 3449  \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
  3465 3450  ®1® ®2.5® ®A®
- 3466
- 3467
  3468 3451  \end{cfa}
  3469 3452  &
  3470 3453  \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
  3471 3454  ®1® ®2.5® ®A®
- 3472
- 3473
- 3474       \end{cfa}
- 3475       &
- 3476       \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
- 3477       ®1®
- 3478       ®2.5®
- 3479       ®A®
  3480 3455  \end{cfa}
  3481 3456  \end{tabular}
     …    …
  3730 3705  0b0 0b11011 0b11011 0b11011 0b11011
  3731 3706  sout | bin( -27HH ) | bin( -27H ) | bin( -27 ) | bin( -27L );
- 3732       0b11100101 0b1111111111100101 0b11111111111111111111111111100101 0b®(58 1s)®100101
+      3707  0b11100101 0b1111111111100101 0b11111111111111111111111111100101 0b(58 1s)100101
  3733 3708  \end{cfa}
  3734 3709
     …    …
  3807 3782  ® ®4.000000 ® ®4.000000 4.000000
  3808 3783  ® ®ab ® ®ab ab
+      3784  ab ab ab
  3809 3785  \end{cfa}
  3810 3786  If the value is larger, it is printed without truncation, ignoring the ©minimum©.