Changeset b1d3ee1


Ignore:
Timestamp:
May 8, 2019, 10:28:04 AM (5 years ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
6a625de, 7bb6bd8
Parents:
cedb545 (diff), 1c9568f (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Files:
5 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/concurrency/Paper.tex

    rcedb545 rb1d3ee1  
    241241\corres{*Peter A. Buhr, Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada. \email{pabuhr{\char`\@}uwaterloo.ca}}
    242242
    243 \fundingInfo{Natural Sciences and Engineering Research Council of Canada}
     243% \fundingInfo{Natural Sciences and Engineering Research Council of Canada}
    244244
    245245\abstract[Summary]{
    246 \CFA is a modern, polymorphic, non-object-oriented, backwards-compatible extension of the C programming language.
    247 This paper discusses some advanced control-flow and concurrency/parallelism features in \CFA, along with the supporting runtime.
    248 These features are created from scratch because they do not exist in ISO C, or are low-level and/or unimplemented, so C programmers continue to rely on library features, like C pthreads.
    249 \CFA introduces language-level control-flow mechanisms, like coroutines, user-level threading, and monitors for mutual exclusion and synchronization.
    250 A unique contribution of this work is allowing multiple monitors to be safely acquired \emph{simultaneously} (deadlock free), while integrating this capability with monitor synchronization mechanisms.
    251 These features also integrate with the \CFA polymorphic type-system and exception handling, while respecting the expectations and style of C programmers.
     246\CFA is a polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language.
     247This paper discusses the design philosophy and implementation of its advanced control-flow and concurrent/parallel features, along with the supporting runtime.
     248These features are created from scratch as ISO C has only low-level and/or unimplemented concurrency, so C programmers continue to rely on library features like C pthreads.
     249\CFA introduces modern language-level control-flow mechanisms, like coroutines, user-level threading, and monitors for mutual exclusion and synchronization.
     250Library extension for executors, futures, and actors are built on these basic mechanisms.
     251The runtime provides significant programmer simplification and safety by eliminating spurious wakeup and reducing monitor barging.
     252The runtime also ensures multiple monitors can be safely acquired \emph{simultaneously} (deadlock free), and this feature is fully integrated with all monitor synchronization mechanisms.
     253All language features integrate with the \CFA polymorphic type-system and exception handling, while respecting the expectations and style of C programmers.
    252254Experimental results show comparable performance of the new features with similar mechanisms in other concurrent programming-languages.
    253255}%
     
    264266\section{Introduction}
    265267
    266 This paper discusses the design of language-level control-flow and concurrency/parallelism extensions in \CFA and its runtime.
     268This paper discusses the design philosophy and implementation of advanced language-level control-flow and concurrent/parallel features in \CFA~\cite{Moss18} and its runtime.
    267269\CFA is a modern, polymorphic, non-object-oriented\footnote{
    268270\CFA has features often associated with object-oriented programming languages, such as constructors, destructors, virtuals and simple inheritance.
    269271However, functions \emph{cannot} be nested in structures, so there is no lexical binding between a structure and set of functions (member/method) implemented by an implicit \lstinline@this@ (receiver) parameter.},
    270 backwards-compatible extension of the C programming language~\cite{Moss18}.
     272backwards-compatible extension of the C programming language.
    271273Within the \CFA framework, new control-flow features are created from scratch.
    272274ISO \Celeven defines only a subset of the \CFA extensions, where the overlapping features are concurrency~\cite[\S~7.26]{C11}.
     
    275277no high-level language concurrency features are defined.
    276278Interestingly, almost a decade after publication of the \Celeven standard, neither gcc-8, clang-8 nor msvc-19 (most recent versions) support the \Celeven include @threads.h@, indicating little interest in the C11 concurrency approach.
    277 Finally, while the \Celeven standard does not state a concurrent threading-model, the historical association with pthreads suggests implementations would adopt kernel-level threading (1:1)~\cite{ThreadModel}.
     279Finally, while the \Celeven standard does not state a threading model, the historical association with pthreads suggests implementations would adopt kernel-level threading (1:1)~\cite{ThreadModel}.
    278280
    279281In contrast, there has been a renewed interest during the past decade in user-level (M:N, green) threading in old and new programming languages.
     
    281283Kernel threading was chosen, largely because of its simplicity and fit with the simpler operating systems and hardware architectures at the time, which gave it a performance advantage~\cite{Drepper03}.
    282284Libraries like pthreads were developed for C, and the Solaris operating-system switched from user (JDK 1.1~\cite{JDK1.1}) to kernel threads.
    283 As a result, languages like Java, Scala~\cite{Scala}, Objective-C~\cite{obj-c-book}, \CCeleven~\cite{C11}, and C\#~\cite{Csharp} adopted the 1:1 kernel-threading model, with a variety of presentation mechanisms.
     285As a result, languages like Java, Scala~\cite{Scala}, Objective-C~\cite{obj-c-book}, \CCeleven~\cite{C11}, and C\#~\cite{Csharp} adopt the 1:1 kernel-threading model, with a variety of presentation mechanisms.
    284286From 2000 onwards, languages like Go~\cite{Go}, Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, D~\cite{D}, and \uC~\cite{uC++,uC++book} have championed the M:N user-threading model, and many user-threading libraries have appeared~\cite{Qthreads,MPC,BoostThreads}, including putting green threads back into Java~\cite{Quasar}.
    285287The main argument for user-level threading is that they are lighter weight than kernel threads (locking and context switching do not cross the kernel boundary), so there is less restriction on programming styles that encourage large numbers of threads performing smaller work-units to facilitate load balancing by the runtime~\cite{Verch12}.
     
    287289Finally, performant user-threading implementations (both time and space) are largely competitive with direct kernel-threading implementations, while achieving the programming advantages of high concurrency levels and safety.
    288290
    289 A further effort over the past decade is the development of language memory-models to deal with the conflict between certain language features and compiler/hardware optimizations.
    290 This issue can be rephrased as: some language features are pervasive (language and runtime) and cannot be safely added via a library to prevent invalidation by sequential optimizations~\cite{Buhr95a,Boehm05}.
    291 The consequence is that a language must be cognizant of these features and provide sufficient tools to program around any safety issues.
    292 For example, C created the @volatile@ qualifier to provide correct execution for @setjmp@/@logjmp@ (concurrency came later).
    293 The common solution is to provide a handful of complex qualifiers and functions (e.g., @volatile@ and atomics) allowing programmers to write consistent/race-free programs, often in the sequentially-consistent memory-model~\cite{Boehm12}.
    294 
    295 While having a sufficient memory-model allows sound libraries to be constructed, writing these libraries can quickly become awkward and error prone, and using these low-level libraries has the same issues.
    296 Essentially, using low-level explicit locks is the concurrent equivalent of assembler programming.
    297 Just as most assembler programming is replaced with high-level programming, explicit locks can be replaced with high-level concurrency in a programming language.
    298 Then the goal is for the compiler to check for correct usage and follow any complex coding conventions implicitly.
    299 The drawback is that language constructs may preclude certain specialized techniques, therefore introducing inefficiency or inhibiting concurrency.
    300 For most concurrent programs, these drawbacks are insignificant in comparison to the speed of composition, and subsequent reliability and maintainability of the high-level concurrent program.
    301 (The same is true for high-level programming versus assembler programming.)
    302 Only very rarely should it be necessary to drop down to races and/or explicit locks to apply a specialized technique to achieve maximum speed or concurrency.
    303 As stated, this observation applies to non-concurrent forms of complex control-flow, like exception handling and coroutines.
    304 
    305 Adapting the programming language to these features also allows matching the control-flow model with the programming-language style, versus adopting one general (sound) library/paradigm.
     291A further effort over the past two decades is the development of language memory-models to deal with the conflict between language features and compiler/hardware optimizations, i.e., some language features are unsafe in the presence of aggressive sequential optimizations~\cite{Buhr95a,Boehm05}.
     292The consequence is that a language must provide sufficient tools to program around safety issues, as inline and library code is all sequential to the compiler.
     293One solution is low-level qualifiers and functions (e.g., @volatile@ and atomics) allowing \emph{programmers} to explicitly write safe (race-free~\cite{Boehm12}) programs.
     294A safer solution is high-level language constructs so the \emph{compiler} knows the optimization boundaries, and hence, provides implicit safety.
     295This problem is best know with respect to concurrency, but applies to other complex control-flow, like exceptions\footnote{
     296\CFA exception handling will be presented in a separate paper.
     297The key feature that dovetails with this paper is non-local exceptions allowing exceptions to be raised across stacks, with synchronous exceptions raised among coroutines and asynchronous exceptions raised among threads, similar to that in \uC~\cite[\S~5]{uC++}
     298} and coroutines.
     299Finally, solutions in the language allows matching constructs with language paradigm, i.e., imperative and functional languages have different presentations of the same concept.
     300
     301Finally, it is important for a language to provide safety over performance \emph{as the default}, allowing careful reduction of safety for performance when necessary.
     302Two concurrency violations of this philosophy are \emph{spurious wakeup} and \emph{barging}, i.e., random wakeup~\cite[\S~8]{Buhr05a} and signalling-as-hints~\cite[\S~8]{Buhr05a}, where one begats the other.
     303If you believe spurious wakeup is a foundational concurrency property, than unblocking (signalling) a thread is always a hint.
     304If you \emph{do not} believe spurious wakeup is foundational, than signalling-as-hints is a performance decision.
     305Most importantly, removing spurious wakeup and signals-as-hints makes concurrent programming significantly safer because it removes local non-determinism.
     306Clawing back performance where the local non-determinism is unimportant, should be an option not the default.
     307
     308\begin{comment}
    306309For example, it is possible to provide exceptions, coroutines, monitors, and tasks as specialized types in an object-oriented language, integrating these constructs to allow leveraging the type-system (static type-checking) and all other object-oriented capabilities~\cite{uC++}.
    307310It is also possible to leverage call/return for blocking communication via new control structures, versus switching to alternative communication paradigms, like channels or message passing.
     
    321324Hence, rewriting and retraining costs for these languages, even \CC, are prohibitive for companies with a large C software-base.
    322325\CFA with its orthogonal feature-set, its high-performance runtime, and direct access to all existing C libraries circumvents these problems.
    323 
    324 We present comparative examples so the reader can judge if the \CFA control-flow extensions are equivalent or better than those in or proposed for \Celeven, \CC and other concurrent, imperative programming languages, and perform experiments to show the \CFA runtime is competitive with other similar mechanisms.
    325 The detailed contributions of this work are:
     326\end{comment}
     327
     328\CFA embraces user-level threading, language extensions for advanced control-flow, and safety as the default.
     329We present comparative examples so the reader can judge if the \CFA control-flow extensions are better and safer than those in or proposed for \Celeven, \CC and other concurrent, imperative programming languages, and perform experiments to show the \CFA runtime is competitive with other similar mechanisms.
     330The main contributions of this work are:
    326331\begin{itemize}
    327332\item
    328 allowing multiple monitors to be safely acquired \emph{simultaneously} (deadlock free), while seamlessly integrating this capability with all monitor synchronization mechanisms.
     333expressive language-level coroutines and user-level threading, which respect the expectations of C programmers.
    329334\item
    330 all control-flow features respect the expectations of C programmers, with statically type-safe interfaces that integrate with the \CFA polymorphic type-system and other language features.
     335monitor synchronization without barging.
    331336\item
    332 experimental results show comparable performance of the new features with similar mechanisms in other concurrent programming-languages.
     337safely acquiring multiple monitors \emph{simultaneously} (deadlock free), while seamlessly integrating this capability with all monitor synchronization mechanisms.
     338\item
     339providing statically type-safe interfaces that integrate with the \CFA polymorphic type-system and other language features.
     340\item
     341library extensions for executors, futures, and actors built on the basic mechanisms.
     342\item
     343a runtime system with no spurious wakeup.
     344\item
     345experimental results showing comparable performance of the new features with similar mechanisms in other concurrent programming-languages.
    333346\end{itemize}
    334347
  • doc/user/user.tex

    rcedb545 rb1d3ee1  
    1111%% Created On       : Wed Apr  6 14:53:29 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Sun Apr 14 11:02:34 2019
    14 %% Update Count     : 3443
     13%% Last Modified On : Sun May  5 18:24:50 2019
     14%% Update Count     : 3489
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1616
     
    193193\end{center}
    194194While the \CFA I/O looks similar to the \Index*[C++]{\CC{}} output style, there are important differences, such as automatic spacing between variables as in \Index*{Python} (see~\VRef{s:IOLibrary}).
     195
    195196
    196197\subsection{Background}
     
    431432\end{cfa}
    432433which conditionally includes the correct header file, if the program is compiled using \Indexc{gcc} or \Indexc{cfa}.
     434
     435The \CFA translator has multiple steps.
     436The following flags control how the tranlator works, the stages run, and printing within a stage.
     437The majority of these flags are used by \CFA developers, but some are occasionally useful to programmers.
     438\begin{description}[topsep=5pt,itemsep=0pt,parsep=0pt]
     439\item
     440\Indexc{-h}\index{translator option!-h@{©-h©}}, \Indexc{--help}\index{translator option!--help@{©--help©}} \, print help message
     441\item
     442\Indexc{-l}\index{translator option!-l@{©-l©}}, \Indexc{--libcfa}\index{translator option!--libcfa@{©--libcfa©}} \, generate libcfa.c
     443\item
     444\Indexc{-L}\index{translator option!-L@{©-L©}}, \Indexc{--linemarks}\index{translator option!--linemarks@{©--linemarks©}} \, generate line marks
     445\item
     446\Indexc{-m}\index{translator option!-m@{©-m©}}, \Indexc{--no-main}\index{translator option!--no-main@{©--no-main©}} \, do not replace main
     447\item
     448\Indexc{-N}\index{translator option!-N@{©-N©}}, \Indexc{--no-linemarks}\index{translator option!--no-linemarks@{©--no-linemarks©}} \, do not generate line marks
     449\item
     450\Indexc{-n}\index{translator option!-n@{©-n©}}, \Indexc{--no-prelude}\index{translator option!--no-prelude@{©--no-prelude©}} \, do not read prelude
     451\item
     452\Indexc{-p}\index{translator option!-p@{©-p©}}, \Indexc{--prototypes}\index{translator option!--prototypes@{©--prototypes©}} \, generate prototypes for prelude functions
     453\item
     454\Indexc{-P}\index{translator option!-P@{©-P©}}, \Indexc{--print}\index{translator option!--print@{©--print©}} \, one of:
     455\begin{description}[topsep=0pt,itemsep=0pt,parsep=0pt]
     456\item
     457\Indexc{altexpr}\index{translator option!-P@{©-P©}!©altexpr©}\index{translator option!--print@{©-print©}!©altexpr©} \, alternatives for expressions
     458\item
     459\Indexc{ascodegen}\index{translator option!-P@{©-P©}!©ascodegen©}\index{translator option!--print@{©-print©}!©ascodegen©} \, as codegen rather than AST
     460\item
     461\Indexc{ast}\index{translator option!-P@{©-P©}!©ast©}\index{translator option!--print@{©-print©}!©ast©} \, AST after parsing
     462\item
     463\Indexc{astdecl}\index{translator option!-P@{©-P©}!©astdecl©}\index{translator option!--print@{©-print©}!©astdecl©} \, AST after declaration validation pass
     464\item
     465\Indexc{asterr}\index{translator option!-P@{©-P©}!©asterr©}\index{translator option!--print@{©-print©}!©asterr©} \, AST on error
     466\item
     467\Indexc{astexpr}\index{translator option!-P@{©-P©}!©astexpr©}\index{translator option!--print@{©-print©}!©altexpr©} \, AST after expression analysis
     468\item
     469\Indexc{astgen}\index{translator option!-P@{©-P©}!©astgen©}\index{translator option!--print@{©-print©}!©astgen©} \, AST after instantiate generics
     470\item
     471\Indexc{box}\index{translator option!-P@{©-P©}!©box©}\index{translator option!--print@{©-print©}!©box©} \, before box step
     472\item
     473\Indexc{ctordtor}\index{translator option!-P@{©-P©}!©ctordtor©}\index{translator option!--print@{©-print©}!©ctordtor©} \, after ctor/dtor are replaced
     474\item
     475\Indexc{codegen}\index{translator option!-P@{©-P©}!©codegen©}\index{translator option!--print@{©-print©}!©codegen©} \, before code generation
     476\item
     477\Indexc{declstats}\index{translator option!-P@{©-P©}!©declstats©}\index{translator option!--print@{©-print©}!©declstats©} \, code property statistics
     478\item
     479\Indexc{parse}\index{translator option!-P@{©-P©}!©parse©}\index{translator option!--print@{©-print©}!©parse©} \, yacc (parsing) debug information
     480\item
     481\Indexc{pretty}\index{translator option!-P@{©-P©}!©pretty©}\index{translator option!--print@{©-print©}!©pretty©} \, prettyprint for ascodegen flag
     482\item
     483\Indexc{resolver}\index{translator option!-P@{©-P©}!©resolver©}\index{translator option!--print@{©-print©}!©resolver©} \, before resolver step
     484\item
     485\Indexc{rproto}\index{translator option!-P@{©-P©}!©rproto©}\index{translator option!--print@{©-print©}!©rproto©} \, resolver-proto instance
     486\item
     487\Indexc{rsteps}\index{translator option!-P@{©-P©}!©rsteps©}\index{translator option!--print@{©-print©}!©rsteps©} \, resolver steps
     488\item
     489\Indexc{symevt}\index{translator option!-P@{©-P©}!©symevt©}\index{translator option!--print@{©-print©}!©symevt©} \, symbol table events
     490\item
     491\Indexc{tree}\index{translator option!-P@{©-P©}!©tree©}\index{translator option!--print@{©-print©}!©tree©} \, parse tree
     492\item
     493\Indexc{tuple}\index{translator option!-P@{©-P©}!©tuple©}\index{translator option!--print@{©-print©}!©tuple©} \, after tuple expansion
     494\end{description}
     495\item
     496\Indexc{--prelude-dir} <directory> \, prelude directory for debug/nodebug
     497\item
     498\Indexc{-S}\index{translator option!-S@{©-S©}!©counters,heap,time,all,none©}, \Indexc{--statistics}\index{translator option!--statistics@{©--statistics©}!©counters,heap,time,all,none©} <option-list> \, enable profiling information:
     499\begin{description}[topsep=0pt,itemsep=0pt,parsep=0pt]
     500\item
     501\Indexc{counters,heap,time,all,none}
     502\end{description}
     503\item
     504\Indexc{-t}\index{translator option!-t@{©-t©}}, \Indexc{--tree}\index{translator option!--tree@{©--tree©}} build in tree
     505\end{description}
    433506
    434507
  • libcfa/src/iostream.hfa

    rcedb545 rb1d3ee1  
    1010// Created On       : Wed May 27 17:56:53 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sat Apr 20 12:04:07 2019
    13 // Update Count     : 226
     12// Last Modified On : Fri May  3 22:55:04 2019
     13// Update Count     : 230
    1414//
    1515
     
    4848        void close( ostype & os );
    4949        ostype & write( ostype &, const char *, size_t );
    50         int fmt( ostype &, const char format[], ... );
     50        int fmt( ostype &, const char format[], ... ) __attribute__(( format(printf, 2, 3) ));
    5151}; // ostream
    5252
     
    158158        istype & read( istype &, char *, size_t );
    159159        istype & ungetc( istype &, char );
    160         int fmt( istype &, const char format[], ... );
     160        int fmt( istype &, const char format[], ... ) __attribute__(( format(scanf, 2, 3) ));
    161161}; // istream
    162162
  • src/BasicTypes-gen.cc

    rcedb545 rb1d3ee1  
     1#include <algorithm>
    12#include <queue>
    23#include <iostream>
     
    340341        } // for
    341342        code << "\t}; // costMatrix" << endl;
     343
     344        // maximum conversion cost from int
     345        code << "\tstatic const int maxIntCost = " << *max_element(costMatrix[SignedInt], costMatrix[SignedInt] + NUMBER_OF_BASIC_TYPES) << ";" << endl;
    342346        code << "\t";                                                                           // indentation for end marker
    343 
     347       
    344348        if ( (start = str.find( ENDMK, start + 1 )) == string::npos ) Abort( "end", ConversionCost );
    345349        if ( (end = str.find( STARTMK, start + 1 )) == string::npos ) Abort( "start", ConversionCost );
  • src/ResolvExpr/ConversionCost.cc

    rcedb545 rb1d3ee1  
    1010// Created On       : Sun May 17 07:06:19 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Fri Apr 26 16:33:04 2019
    13 // Update Count     : 24
     12// Last Modified On : Mon May  6 14:18:22 2019
     13// Update Count     : 25
    1414//
    1515
     
    249249                /*_FLDXC*/ {  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,   0, },
    250250        }; // costMatrix
     251        static const int maxIntCost = 15;
    251252        // GENERATED END
    252253        static_assert(
     
    461462                        } // if
    462463                } else if ( dynamic_cast< PointerType* >( dest ) ) {
    463                         cost = Cost::safe;
     464                        cost = Cost::zero;
     465                        cost.incSafe( maxIntCost + 2 ); // +1 for zero_t -> int, +1 for disambiguation
    464466                } // if
    465467        }
Note: See TracChangeset for help on using the changeset viewer.