Changeset 8dbedfc for doc/papers
- Timestamp: May 25, 2018, 1:37:38 PM
- Branches: ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, with_gc
- Children: 58e822a
- Parents: 13073be (diff), 34ca532 (diff)

Note: this is a merge changeset; the changes displayed below correspond to the merge itself. Use the (diff) links above to see all the changes relative to each parent.

- Location: doc/papers
- Files: 2 edited
  - concurrency/Paper.tex (modified) (12 diffs)
  - general/Paper.tex (modified) (25 diffs)
doc/papers/concurrency/Paper.tex
r13073be r8dbedfc 70 70 %\DeclareTextCommandDefault{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.1ex}}} 71 71 \renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}} 72 %\def\myCHarFont{\fontencoding{T1}\selectfont}%73 % \def\{{\ttfamily\upshape\myCHarFont \char`\}}}%74 72 75 73 \renewcommand*{\thefootnote}{\Alph{footnote}} % hack because fnsymbol does not work … … 741 739 The coroutine main's stack holds the state for the next generation, @f1@ and @f2@, and the code has the three suspend points, representing the three states in the Fibonacci formula, to context switch back to the caller's resume. 742 740 The interface function, @next@, takes a Fibonacci instance and context switches to it using @resume@; 743 on re turn, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned.741 on restart, the Fibonacci field, @fn@, contains the next value in the sequence, which is returned. 744 742 The first @resume@ is special because it cocalls the coroutine at its coroutine main and allocates the stack; 745 743 when the coroutine main returns, its stack is deallocated. 746 744 Hence, @Fib@ is an object at creation, transitions to a coroutine on its first resume, and transitions back to an object when the coroutine main finishes. 747 745 Figure~\ref{f:Coroutine1State} shows the coroutine version of the C version in Figure~\ref{f:ExternalState}. 748 Coroutine generators are called \newterm{output coroutines} because values are returned by the coroutine.749 750 Figure~\ref{f:CFAFmt} shows an \newterm{input coroutine}, @Format@, for restructuring text into groups of character blocks of fixed size.746 Coroutine generators are called \newterm{output coroutines} because values are only returned. 747 748 Figure~\ref{f:CFAFmt} shows an \newterm{input coroutine}, @Format@, for restructuring text into groups of characters of fixed-size blocks. 751 749 For example, the input of the left is reformatted into the output on the right. 752 750 \begin{quote} … … 763 761 \end{tabular} 764 762 \end{quote} 765 The example takes advantage of resuming coroutines in the constructor to prime the coroutine loops so the first character sent for formatting appears inside the nested loops.763 The example takes advantage of resuming a coroutine in the constructor to prime the loops so the first character sent for formatting appears inside the nested loops. 766 764 The destruction provides a newline if formatted text ends with a full line. 767 765 Figure~\ref{f:CFmt} shows the C equivalent formatter, where the loops of the coroutine are flatten (linearized) and rechecked on each call because execution location is not retained between calls. … … 778 776 void main( Format & fmt ) with( fmt ) { 779 777 for ( ;; ) { 780 for ( g = 0; g < 5; g += 1 ) { // group778 for ( g = 0; g < 5; g += 1 ) { // group 781 779 for ( b = 0; b < 4; b += 1 ) { // block 782 780 `suspend();` … … 814 812 }; 815 813 void format( struct Format * fmt ) { 816 if ( fmt->ch != -1 ) { // not EOF814 if ( fmt->ch != -1 ) { // not EOF ? 
817 815 printf( "%c", fmt->ch ); 818 816 fmt->b += 1; … … 823 821 } 824 822 if ( fmt->g == 5 ) { // group 825 printf( "\n" ); // separator823 printf( "\n" ); // separator 826 824 fmt->g = 0; 827 825 } … … 850 848 851 849 The previous examples are \newterm{asymmetric (semi) coroutine}s because one coroutine always calls a resuming function for another coroutine, and the resumed coroutine always suspends back to its last resumer, similar to call/return for normal functions. 852 However, there is no stack growth because @resume@/@suspend@ context switch to an existing stack frames rather than create a new one.853 \newterm{Symmetric (full) coroutine}s have a coroutine call a resuming function for another coroutine, which eventually forms a cycle.850 However, there is no stack growth because @resume@/@suspend@ context switch to existing stack-frames rather than create new ones. 851 \newterm{Symmetric (full) coroutine}s have a coroutine call a resuming function for another coroutine, which eventually forms a resuming-call cycle. 854 852 (The trivial cycle is a coroutine resuming itself.) 855 853 This control flow is similar to recursion for normal routines, but again there is no stack growth from the context switch. … … 935 933 The @start@ function communicates both the number of elements to be produced and the consumer into the producer's coroutine structure. 936 934 Then the @resume@ to @prod@ creates @prod@'s stack with a frame for @prod@'s coroutine main at the top, and context switches to it. 937 @prod@'s coroutine main starts, creates local variables that are retained between coroutine activations, and executes $N$ iterations, each generating two random val es, calling the consumer to deliver the values, and printing the status returned from the consumer.935 @prod@'s coroutine main starts, creates local variables that are retained between coroutine activations, and executes $N$ iterations, each generating two random values, calling the consumer to deliver the values, and printing the status returned from the consumer. 938 936 939 937 The producer call to @delivery@ transfers values into the consumer's communication variables, resumes the consumer, and returns the consumer status. 940 938 For the first resume, @cons@'s stack is initialized, creating local variables retained between subsequent activations of the coroutine. 941 The consumer iterates until the @done@ flag is set, prints, increments status, and calls back to the producer 's @payment@ member, and on return prints the receipt from the producer and increments the money for the next payment.942 The call from the consumer to the producer's @payment@ memberintroduces the cycle between producer and consumer.939 The consumer iterates until the @done@ flag is set, prints, increments status, and calls back to the producer via @payment@, and on return from @payment@, prints the receipt from the producer and increments @money@ (inflation). 940 The call from the consumer to the @payment@ introduces the cycle between producer and consumer. 943 941 When @payment@ is called, the consumer copies values into the producer's communication variable and a resume is executed. 
944 The context switch restarts the producer at the point where it was last context switched and it continues in member@delivery@ after the resume.945 946 The @delivery@ member returns the status value in @prod@'s @main@ member, where the status is printed.942 The context switch restarts the producer at the point where it was last context switched, so it continues in @delivery@ after the resume. 943 944 @delivery@ returns the status value in @prod@'s coroutine main, where the status is printed. 947 945 The loop then repeats calling @delivery@, where each call resumes the consumer coroutine. 948 946 The context switch to the consumer continues in @payment@. 949 The consumer increments and returns the receipt to the call in @cons@'s @main@ member.947 The consumer increments and returns the receipt to the call in @cons@'s coroutine main. 950 948 The loop then repeats calling @payment@, where each call resumes the producer coroutine. 951 949 … … 954 952 The context switch restarts @cons@ in @payment@ and it returns with the last receipt. 955 953 The consumer terminates its loops because @done@ is true, its @main@ terminates, so @cons@ transitions from a coroutine back to an object, and @prod@ reactivates after the resume in @stop@. 956 The @stop@ member returns and @prod@'s @main@ memberterminates.954 @stop@ returns and @prod@'s coroutine main terminates. 957 955 The program main restarts after the resume in @start@. 958 The @start@ member returns and the program main terminates. 959 960 961 \subsubsection{Construction} 962 963 One important design challenge for implementing coroutines and threads (shown in section \ref{threads}) is that the runtime system needs to run code after the user-constructor runs to connect the fully constructed object into the system. 964 In the case of coroutines, this challenge is simpler since there is no non-determinism from preemption or scheduling. 965 However, the underlying challenge remains the same for coroutines and threads. 966 967 The runtime system needs to create the coroutine's stack and, more importantly, prepare it for the first resumption. 968 The timing of the creation is non-trivial since users expect both to have fully constructed objects once execution enters the coroutine main and to be able to resume the coroutine from the constructor. 969 There are several solutions to this problem but the chosen option effectively forces the design of the coroutine. 970 971 Furthermore, \CFA faces an extra challenge as polymorphic routines create invisible thunks when cast to non-polymorphic routines and these thunks have function scope. 
972 For example, the following code, while looking benign, can run into undefined behaviour because of thunks: 973 974 \begin{cfa} 975 // async: Runs function asynchronously on another thread 976 forall(otype T) 977 extern void async(void (*func)(T*), T* obj); 978 979 forall(otype T) 980 void noop(T*) {} 981 982 void bar() { 983 int a; 984 async(noop, &a); // start thread running noop with argument a 985 } 986 \end{cfa} 987 988 The generated C code\footnote{Code trimmed down for brevity} creates a local thunk to hold type information: 989 990 \begin{cfa} 991 extern void async(/* omitted */, void (*func)(void*), void* obj); 992 993 void noop(/* omitted */, void* obj){} 994 995 void bar(){ 996 int a; 997 void _thunk0(int* _p0){ 998 /* omitted */ 999 noop(/* omitted */, _p0); 1000 } 1001 /* omitted */ 1002 async(/* omitted */, ((void (*)(void*))(&_thunk0)), (&a)); 1003 } 1004 \end{cfa} 1005 The problem in this example is a storage management issue, the function pointer @_thunk0@ is only valid until the end of the block, which limits the viable solutions because storing the function pointer for too long causes undefined behaviour; \ie the stack-based thunk being destroyed before it can be used. 1006 This challenge is an extension of challenges that come with second-class routines. 1007 Indeed, GCC nested routines also have the limitation that nested routine cannot be passed outside of the declaration scope. 1008 The case of coroutines and threads is simply an extension of this problem to multiple call stacks. 1009 1010 1011 \subsubsection{Alternative: Composition} 1012 1013 One solution to this challenge is to use composition/containment, where coroutine fields are added to manage the coroutine. 1014 1015 \begin{cfa} 1016 struct Fibonacci { 1017 int fn; // used for communication 1018 coroutine c; // composition 1019 }; 1020 1021 void FibMain(void*) { 1022 //... 1023 } 1024 1025 void ?{}(Fibonacci& this) { 1026 this.fn = 0; 1027 // Call constructor to initialize coroutine 1028 (this.c){myMain}; 1029 } 1030 \end{cfa} 1031 The downside of this approach is that users need to correctly construct the coroutine handle before using it. 1032 Like any other objects, the user must carefully choose construction order to prevent usage of objects not yet constructed. 1033 However, in the case of coroutines, users must also pass to the coroutine information about the coroutine main, like in the previous example. 1034 This opens the door for user errors and requires extra runtime storage to pass at runtime information that can be known statically. 1035 1036 1037 \subsubsection{Alternative: Reserved keyword} 1038 1039 The next alternative is to use language support to annotate coroutines as follows: 1040 \begin{cfa} 1041 coroutine Fibonacci { 1042 int fn; // used for communication 1043 }; 1044 \end{cfa} 1045 The @coroutine@ keyword means the compiler can find and inject code where needed. 1046 The downside of this approach is that it makes coroutine a special case in the language. 1047 Users wanting to extend coroutines or build their own for various reasons can only do so in ways offered by the language. 1048 Furthermore, implementing coroutines without language supports also displays the power of the programming language used. 1049 While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed by users without using the language support. 1050 The reserved keywords are only present to improve ease of use for the common cases. 
1051 1052 1053 \subsubsection{Alternative: Lambda Objects} 956 @start@ returns and the program main terminates. 957 958 959 \subsection{Coroutine Implementation} 960 961 A significant implementation challenge for coroutines (and threads, see section \ref{threads}) is adding extra fields and executing code after/before the coroutine constructor/destructor and coroutine main to create/initialize/de-initialize/destroy extra fields and the stack. 962 There are several solutions to this problem and the chosen option forced the \CFA coroutine design. 963 964 Object-oriented inheritance provides extra fields and code in a restricted context, but it requires programmers to explicitly perform the inheritance: 965 \begin{cfa} 966 struct mycoroutine $\textbf{\textsf{inherits}}$ baseCoroutine { ... } 967 \end{cfa} 968 and the programming language (and possibly its tool set, \eg debugger) may need to understand @baseCoroutine@ because of the stack. 969 Furthermore, the execution of constructs/destructors is in the wrong order for certain operations, \eg for threads; 970 \eg, if the thread is implicitly started, it must start \emph{after} all constructors, because the thread relies on a completely initialized object, but the inherited constructor runs \emph{before} the derived. 971 972 An alternatively is composition: 973 \begin{cfa} 974 struct mycoroutine { 975 ... // declarations 976 baseCoroutine dummy; // composition, last declaration 977 } 978 \end{cfa} 979 which also requires an explicit declaration that must be the last one to ensure correct initialization order. 980 However, there is nothing preventing wrong placement or multiple declarations. 1054 981 1055 982 For coroutines as for threads, many implementations are based on routine pointers or function objects~\cite{Butenhof97, C++14, MS:VisualC++, BoostCoroutines15}. 1056 For example, Boost implements coroutines in terms of four functor object types:983 For example, Boost implements coroutines in terms of four functor object-types: 1057 984 \begin{cfa} 1058 985 asymmetric_coroutine<>::pull_type … … 1061 988 symmetric_coroutine<>::yield_type 1062 989 \end{cfa} 1063 Often, the canonical threading paradigm in languages is based on function pointers, @pthread@ being one of the most well-known examples. 1064 The main problem of this approach is that the thread usage is limited to a generic handle that must otherwise be wrapped in a custom type. 1065 Since the custom type is simple to write in \CFA and solves several issues, added support for routine/lambda based coroutines adds very little. 1066 1067 A variation of this would be to use a simple function pointer in the same way @pthread@ does for threads: 1068 \begin{cfa} 1069 void foo( coroutine_t cid, void* arg ) { 1070 int* value = (int*)arg; 990 Similarly, the canonical threading paradigm is often based on function pointers, \eg @pthread@~\cite{pthreads}, \Csharp~\cite{Csharp}, Go~\cite{Go}, and Scala~\cite{Scala}. 991 However, the generic thread-handle (identifier) is limited (few operations), unless it is wrapped in a custom type. 992 \begin{cfa} 993 void mycor( coroutine_t cid, void * arg ) { 994 int * value = (int *)arg; $\C{// type unsafe, pointer-size only}$ 1071 995 // Coroutine body 1072 996 } 1073 1074 997 int main() { 1075 int value = 0; 1076 coroutine_t cid = coroutine_create( &foo, (void*)&value ); 1077 coroutine_resume( &cid ); 1078 } 1079 \end{cfa} 1080 This semantics is more common for thread interfaces but coroutines work equally well. 
1081 As discussed in section \ref{threads}, this approach is superseded by static approaches in terms of expressivity. 1082 1083 1084 \subsubsection{Alternative: Trait-Based Coroutines} 1085 1086 Finally, the underlying approach, which is the one closest to \CFA idioms, is to use trait-based lazy coroutines. 1087 This approach defines a coroutine as anything that satisfies the trait @is_coroutine@ (as defined below) and is used as a coroutine. 1088 1089 \begin{cfa} 1090 trait is_coroutine(dtype T) { 1091 void main(T& this); 1092 coroutine_desc* get_coroutine(T& this); 998 int input = 0, output; 999 coroutine_t cid = coroutine_create( &mycor, (void *)&input ); $\C{// type unsafe, pointer-size only}$ 1000 coroutine_resume( cid, (void *)input, (void **)&output ); $\C{// type unsafe, pointer-size only}$ 1001 } 1002 \end{cfa} 1003 Since the custom type is simple to write in \CFA and solves several issues, added support for routine/lambda-based coroutines adds very little. 1004 1005 The selected approach is to use language support by introducing a new kind of aggregate (structure): 1006 \begin{cfa} 1007 coroutine Fibonacci { 1008 int fn; // communication variables 1093 1009 }; 1094 1095 forall( dtype T | is_coroutine(T) ) void suspend(T&); 1096 forall( dtype T | is_coroutine(T) ) void resume (T&); 1097 \end{cfa} 1098 This ensures that an object is not a coroutine until @resume@ is called on the object. 1099 Correspondingly, any object that is passed to @resume@ is a coroutine since it must satisfy the @is_coroutine@ trait to compile. 1010 \end{cfa} 1011 The @coroutine@ keyword means the compiler (and tool set) can find and inject code where needed. 1012 The downside of this approach is that it makes coroutine a special case in the language. 1013 Users wanting to extend coroutines or build their own for various reasons can only do so in ways offered by the language. 1014 Furthermore, implementing coroutines without language supports also displays the power of a programming language. 1015 While this is ultimately the option used for idiomatic \CFA code, coroutines and threads can still be constructed without using the language support. 1016 The reserved keyword eases use for the common cases. 1017 1018 Part of the mechanism to generalize coroutines is using a \CFA trait, which defines a coroutine as anything satisfying the trait @is_coroutine@, and this trait is used to restrict coroutine-manipulation functions: 1019 \begin{cfa} 1020 trait is_coroutine( dtype T ) { 1021 void main( T & this ); 1022 coroutine_desc * get_coroutine( T & this ); 1023 }; 1024 forall( dtype T | is_coroutine(T) ) void get_coroutine( T & ); 1025 forall( dtype T | is_coroutine(T) ) void suspend( T & ); 1026 forall( dtype T | is_coroutine(T) ) void resume( T & ); 1027 \end{cfa} 1028 This definition ensures there is a statically-typed @main@ function that is the starting point (first stack frame) of a coroutine. 1029 No return value or additional parameters are necessary for this function, because the coroutine type allows an arbitrary number of interface functions with corresponding arbitrary typed input/output values. 1030 As well, any object passed to @suspend@ and @resume@ is a coroutine since it must satisfy the @is_coroutine@ trait to compile. 1100 1031 The advantage of this approach is that users can easily create different types of coroutines, for example, changing the memory layout of a coroutine is trivial when implementing the @get_coroutine@ routine. 
1101 The \CFA keyword @coroutine@ simply has the effect of implementing the getter and forward declarations required for users to implement the main routine.1102 1103 \begin{ center}1104 \begin{ tabular}{c c c}1105 \begin{cfa}[tabsize=3] 1106 coroutine MyCoroutine { 1107 int someValue; 1032 The \CFA keyword @coroutine@ implicitly implements the getter and forward declarations required for implementing the coroutine main: 1033 \begin{cquote} 1034 \begin{tabular}{@{}ccc@{}} 1035 \begin{cfa} 1036 coroutine MyCor { 1037 int value; 1038 1108 1039 }; 1109 \end{cfa} & == & \begin{cfa}[tabsize=3] 1110 struct MyCoroutine { 1111 int someValue; 1112 coroutine_desc __cor; 1040 \end{cfa} 1041 & {\Large $\Rightarrow$} & 1042 \begin{tabular}{@{}ccc@{}} 1043 \begin{cfa} 1044 struct MyCor { 1045 int value; 1046 coroutine_desc cor; 1113 1047 }; 1114 1115 static inline 1116 coroutine_desc* get_coroutine( 1117 struct MyCoroutine& this 1118 ) { 1119 return &this.__cor; 1120 } 1121 1122 void main(struct MyCoroutine* this); 1048 \end{cfa} 1049 & 1050 \begin{cfa} 1051 static inline coroutine_desc * 1052 get_coroutine( MyCor & this ) { 1053 return &this.cor; 1054 } 1055 \end{cfa} 1056 & 1057 \begin{cfa} 1058 void main( MyCor * this ); 1059 1060 1061 1123 1062 \end{cfa} 1124 1063 \end{tabular} 1125 \end{center} 1126 1127 The combination of these two approaches allows users new to coroutining and concurrency to have an easy and concise specification, while more advanced users have tighter control on memory layout and initialization. 1128 1129 \subsection{Thread Interface}\label{threads} 1130 The basic building blocks of multithreading in \CFA are \textbf{cfathread}. 1131 Both user and kernel threads are supported, where user threads are the concurrency mechanism and kernel threads are the parallel mechanism. 1132 User threads offer a flexible and lightweight interface. 1133 A thread can be declared using a struct declaration @thread@ as follows: 1134 1135 \begin{cfa} 1136 thread foo {}; 1137 \end{cfa} 1138 1139 As for coroutines, the keyword is a thin wrapper around a \CFA trait: 1140 1141 \begin{cfa} 1142 trait is_thread(dtype T) { 1143 void ^?{}(T & mutex this); 1144 void main(T & this); 1145 thread_desc* get_thread(T & this); 1064 \end{tabular} 1065 \end{cquote} 1066 The combination of these two approaches allows an easy and concise specification to coroutining (and concurrency) for normal users, while more advanced users have tighter control on memory layout and initialization. 1067 1068 1069 \subsection{Thread Interface} 1070 \label{threads} 1071 1072 Both user and kernel threads are supported, where user threads provide concurrency and kernel threads provide parallelism. 1073 Like coroutines and for the same design reasons, the selected approach for user threads is to use language support by introducing a new kind of aggregate (structure) and a \CFA trait: 1074 \begin{cquote} 1075 \begin{tabular}{@{}c@{\hspace{2\parindentlnth}}c@{}} 1076 \begin{cfa} 1077 thread myThread { 1078 // communication variables 1146 1079 }; 1147 \end{cfa} 1148 1149 Obviously, for this thread implementation to be useful it must run some user code. 1150 Several other threading interfaces use a function-pointer representation as the interface of threads (for example \Csharp~\cite{Csharp} and Scala~\cite{Scala}). 1151 However, this proposal considers that statically tying a @main@ routine to a thread supersedes this approach. 
1152 Since the @main@ routine is already a special routine in \CFA (where the program begins), it is a natural extension of the semantics to use overloading to declare mains for different threads (the normal main being the main of the initial thread). 1080 1081 1082 \end{cfa} 1083 & 1084 \begin{cfa} 1085 trait is_thread( dtype T ) { 1086 void main( T & this ); 1087 thread_desc * get_thread( T & this ); 1088 void ^?{}( T & `mutex` this ); 1089 }; 1090 \end{cfa} 1091 \end{tabular} 1092 \end{cquote} 1093 (The qualifier @mutex@ for the destructor parameter is discussed in Section~\ref{s:Monitors}.) 1094 Like a coroutine, the statically-typed @main@ function is the starting point (first stack frame) of a user thread. 1095 The difference is that a coroutine borrows a thread from its caller, so the first thread resuming a coroutine creates an instance of @main@; 1096 whereas, a user thread receives its own thread from the runtime system, which starts in @main@ as some point after the thread constructor is run.\footnote{ 1097 The \lstinline@main@ function is already a special routine in C (where the program begins), so it is a natural extension of the semantics to use overloading to declare mains for different coroutines/threads (the normal main being the main of the initial thread).} 1098 No return value or additional parameters are necessary for this function, because the task type allows an arbitrary number of interface functions with corresponding arbitrary typed input/output values. 1099 1100 \begin{comment} % put in appendix with coroutine version ??? 1153 1101 As such the @main@ routine of a thread can be defined as 1154 1102 \begin{cfa} … … 1189 1137 } 1190 1138 \end{cfa} 1191 1192 1139 A consequence of the strongly typed approach to main is that memory layout of parameters and return values to/from a thread are now explicitly specified in the \textbf{api}. 1193 1194 Of course, for threads to be useful, it must be possible to start and stop threads and wait for them to complete execution. 1195 While using an \textbf{api} such as @fork@ and @join@ is relatively common in the literature, such an interface is unnecessary.1196 Indeed, the simplest approach is to use \textbf{raii} principles and have threads @fork@ after the constructor has completed and @join@ before the destructor runs.1197 \begin{cfa} 1198 thread World; 1199 1200 void main( World & this) {1140 \end{comment} 1141 1142 For user threads to be useful, it must be possible to start and stop the underlying thread, and wait for it to complete execution. 1143 While using an API such as @fork@ and @join@ is relatively common, such an interface is awkward and unnecessary. 1144 A simple approach is to use allocation/deallocation principles, and have threads implicitly @fork@ after construction and @join@ before destruction. 1145 \begin{cfa} 1146 thread World {}; 1147 void main( World & this ) { 1201 1148 sout | "World!" | endl; 1202 1149 } 1203 1204 void main() { 1205 World w; 1206 // Thread forks here 1207 1208 // Printing "Hello " and "World!" are run concurrently 1209 sout | "Hello " | endl; 1210 1211 // Implicit join at end of scope 1212 } 1213 \end{cfa} 1214 1215 This semantic has several advantages over explicit semantics: a thread is always started and stopped exactly once, users cannot make any programming errors, and it naturally scales to multiple threads meaning basic synchronization is very simple. 1216 1217 \begin{cfa} 1218 thread MyThread { 1219 //... 
1150 int main() { 1151 World w`[10]`; $\C{// implicit forks after creation}$ 1152 sout | "Hello " | endl; $\C{// "Hello " and 10 "World!" printed concurrently}$ 1153 } $\C{// implicit joins before destruction}$ 1154 \end{cfa} 1155 This semantics ensures a thread is started and stopped exactly once, eliminating some programming error, and scales to multiple threads for basic (termination) synchronization. 1156 This tree-structure (lattice) create/delete from C block-structure is generalized by using dynamic allocation, so threads can outlive the scope in which they are created, much like dynamically allocating memory lets objects outlive the scope in which they are created. 1157 \begin{cfa} 1158 int main() { 1159 MyThread * heapLived; 1160 { 1161 MyThread blockLived; $\C{// fork block-based thread}$ 1162 heapLived = `new`( MyThread ); $\C{// fork heap-based thread}$ 1163 ... 1164 } $\C{// join block-based thread}$ 1165 ... 1166 `delete`( heapLived ); $\C{// join heap-based thread}$ 1167 } 1168 \end{cfa} 1169 The heap-based approach allows arbitrary thread-creation topologies, with respect to fork/join-style concurrency. 1170 1171 Figure~\ref{s:ConcurrentMatrixSummation} shows concurrently adding the rows of a matrix and then totalling the subtotals sequential, after all the row threads have terminated. 1172 The program uses heap-based threads because each thread needs different constructor values. 1173 (Python provides a simple iteration mechanism to initialize array elements to different values allowing stack allocation.) 1174 The allocation/deallocation pattern appears unusual because allocated objects are immediately deleted without any intervening code. 1175 However, for threads, the deletion provides implicit synchronization, which is the intervening code. 1176 While the subtotals are added in linear order rather than completion order, which slight inhibits concurrency, the computation is restricted by the critical-path thread (\ie the thread that takes the longest), and so any inhibited concurrency is very small as totalling the subtotals is trivial. 1177 1178 \begin{figure} 1179 \begin{cfa} 1180 thread Adder { 1181 int * row, cols, & subtotal; $\C{// communication}$ 1220 1182 }; 1221 1222 // main 1223 void main(MyThread& this) { 1224 //... 1225 } 1226 1227 void foo() { 1228 MyThread thrds[10]; 1229 // Start 10 threads at the beginning of the scope 1230 1231 DoStuff(); 1232 1233 // Wait for the 10 threads to finish 1234 } 1235 \end{cfa} 1236 1237 However, one of the drawbacks of this approach is that threads always form a tree where nodes must always outlive their children, \ie they are always destroyed in the opposite order of construction because of C scoping rules. 1238 This restriction is relaxed by using dynamic allocation, so threads can outlive the scope in which they are created, much like dynamically allocating memory lets objects outlive the scope in which they are created. 1239 1240 \begin{cfa} 1241 thread MyThread { 1242 //... 1243 }; 1244 1245 void main(MyThread& this) { 1246 //... 
1247 } 1248 1249 void foo() { 1250 MyThread* long_lived; 1251 { 1252 // Start a thread at the beginning of the scope 1253 MyThread short_lived; 1254 1255 // create another thread that will outlive the thread in this scope 1256 long_lived = new MyThread; 1257 1258 DoStuff(); 1259 1260 // Wait for the thread short_lived to finish 1261 } 1262 DoMoreStuff(); 1263 1264 // Now wait for the long_lived to finish 1265 delete long_lived; 1266 } 1267 \end{cfa} 1268 1269 1270 % ====================================================================== 1271 % ====================================================================== 1272 \section{Concurrency} 1273 % ====================================================================== 1274 % ====================================================================== 1275 Several tools can be used to solve concurrency challenges. 1276 Since many of these challenges appear with the use of mutable shared state, some languages and libraries simply disallow mutable shared state (Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka (Scala)~\cite{Akka}). 1277 In these paradigms, interaction among concurrent objects relies on message passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely relate to networking concepts (channels~\cite{CSP,Go} for example). 1278 However, in languages that use routine calls as their core abstraction mechanism, these approaches force a clear distinction between concurrent and non-concurrent paradigms (\ie message passing versus routine calls). 1279 This distinction in turn means that, in order to be effective, programmers need to learn two sets of design patterns. 1183 void ?{}( Adder & adder, int row[], int cols, int & subtotal ) { 1184 adder.[ row, cols, &subtotal ] = [ row, cols, &subtotal ]; 1185 } 1186 void main( Adder & adder ) with( adder ) { 1187 subtotal = 0; 1188 for ( int c = 0; c < cols; c += 1 ) { 1189 subtotal += row[c]; 1190 } 1191 } 1192 int main() { 1193 const int rows = 10, cols = 1000; 1194 int matrix[rows][cols], subtotals[rows], total = 0; 1195 // read matrix 1196 Adder * adders[rows]; 1197 for ( int r = 0; r < rows; r += 1 ) { $\C{// start threads to sum rows}$ 1198 adders[r] = new( matrix[r], cols, &subtotals[r] ); 1199 } 1200 for ( int r = 0; r < rows; r += 1 ) { $\C{// wait for threads to finish}$ 1201 delete( adders[r] ); $\C{// termination join}$ 1202 total += subtotals[r]; $\C{// total subtotal}$ 1203 } 1204 sout | total | endl; 1205 } 1206 \end{cfa} 1207 \caption{Concurrent Matrix Summation} 1208 \label{s:ConcurrentMatrixSummation} 1209 \end{figure} 1210 1211 1212 \section{Synchronization / Mutual Exclusion} 1213 1214 Uncontrolled non-deterministic execution is meaningless. 1215 To reestablish meaningful execution requires mechanisms to reintroduce determinism (control non-determinism), called synchronization and mutual exclusion, where synchronization is a timing relationship among threads and mutual exclusion is an access-control mechanism on data shared by threads. 1216 Since many deterministic challenges appear with the use of mutable shared state, some languages/libraries disallow it (Erlang~\cite{Erlang}, Haskell~\cite{Haskell}, Akka~\cite{Akka} (Scala)). 1217 In these paradigms, interaction among concurrent objects is performed by stateless message-passing~\cite{Thoth,Harmony,V-Kernel} or other paradigms closely relate to networking concepts (\eg channels~\cite{CSP,Go}). 
1218 However, in call/return-based languages, these approaches force a clear distinction (\ie introduce a new programming paradigm) between non-concurrent and concurrent computation (\ie function call versus message passing). 1219 This distinction means a programmers needs to learn two sets of design patterns. 1280 1220 While this distinction can be hidden away in library code, effective use of the library still has to take both paradigms into account. 1281 1282 Approaches based on shared memory are more closely related to non-concurrent paradigms since they often rely on basic constructs like routine calls and shared objects. 1283 At the lowest level, concurrent paradigms are implemented as atomic operations and locks. 1284 Many such mechanisms have been proposed, including semaphores~\cite{Dijkstra68b} and path expressions~\cite{Campbell74}. 1285 However, for productivity reasons it is desirable to have a higher-level construct be the core concurrency paradigm~\cite{Hochstein05}. 1286 1287 An approach that is worth mentioning because it is gaining in popularity is transactional memory~\cite{Herlihy93}. 1288 While this approach is even pursued by system languages like \CC~\cite{Cpp-Transactions}, the performance and feature set is currently too restrictive to be the main concurrency paradigm for system languages, which is why it was rejected as the core paradigm for concurrency in \CFA. 1289 1290 One of the most natural, elegant, and efficient mechanisms for synchronization and communication, especially for shared-memory systems, is the \emph{monitor}. 1221 In contrast, approaches based on statefull models more closely resemble the standard call/return programming-model, resulting in a single programming paradigm. 1222 1223 At the lowest level, concurrent control is implemented as atomic operations, upon which different kinds of locks mechanism are constructed, \eg semaphores~\cite{Dijkstra68b} and path expressions~\cite{Campbell74}. 1224 However, for productivity it is always desirable to use the highest-level construct that provides the necessary efficiency~\cite{Hochstein05}. 1225 A newer approach is transactional memory~\cite{Herlihy93}. 1226 While this approach is pursued in hardware~\cite{Nakaike15} and system languages, like \CC~\cite{Cpp-Transactions}, the performance and feature set is still too restrictive to be the main concurrency paradigm for system languages, which is why it was rejected as the core paradigm for concurrency in \CFA. 1227 1228 One of the most natural, elegant, and efficient mechanisms for synchronization and mutual exclusion for shared-memory systems is the \emph{monitor}. 1291 1229 Monitors were first proposed by Brinch Hansen~\cite{Hansen73} and later described and extended by C.A.R.~Hoare~\cite{Hoare74}. 1292 Many programming languages ---\eg Concurrent Pascal~\cite{ConcurrentPascal}, Mesa~\cite{Mesa}, Modula~\cite{Modula-2}, Turing~\cite{Turing:old}, Modula-3~\cite{Modula-3}, NeWS~\cite{NeWS}, Emerald~\cite{Emerald}, \uC~\cite{Buhr92a} and Java~\cite{Java}---provide monitors as explicit language constructs.1230 Many programming languages -- \eg Concurrent Pascal~\cite{ConcurrentPascal}, Mesa~\cite{Mesa}, Modula~\cite{Modula-2}, Turing~\cite{Turing:old}, Modula-3~\cite{Modula-3}, NeWS~\cite{NeWS}, Emerald~\cite{Emerald}, \uC~\cite{Buhr92a} and Java~\cite{Java} -- provide monitors as explicit language constructs. 
1293 1231 In addition, operating-system kernels and device drivers have a monitor-like structure, although they often use lower-level primitives such as semaphores or locks to simulate monitors. 1294 For these reasons, this project proposes monitors as the core concurrency construct. 1295 1296 1297 \subsection{Basics} 1298 1299 Non-determinism requires concurrent systems to offer support for mutual-exclusion and synchronization. 1300 Mutual-exclusion is the concept that only a fixed number of threads can access a critical section at any given time, where a critical section is a group of instructions on an associated portion of data that requires the restricted access. 1301 On the other hand, synchronization enforces relative ordering of execution and synchronization tools provide numerous mechanisms to establish timing relationships among threads. 1302 1303 1304 \subsubsection{Mutual-Exclusion} 1305 1306 As mentioned above, mutual-exclusion is the guarantee that only a fix number of threads can enter a critical section at once. 1232 For these reasons, this project proposes monitors as the core concurrency construct, upon which even higher-level approaches can be easily constructed.. 1233 1234 1235 \subsection{Mutual Exclusion} 1236 1237 A group of instructions manipulating a specific instance of shared data that must be performed atomically is called an (individual) \newterm{critical-section}~\cite{Dijkstra65}. 1238 A generalization is a \newterm{group critical-section}~\cite{Joung00}, where multiple tasks with the same session may use the resource simultaneously, but different sessions may not use the resource simultaneously. 1239 The readers/writer problem~\cite{Courtois71} is an instance of a group critical-section, where readers have the same session and all writers have a unique session. 1240 \newterm{Mutual exclusion} enforces the correction number of threads are using a critical section at the same time. 1241 1307 1242 However, many solutions exist for mutual exclusion, which vary in terms of performance, flexibility and ease of use. 1308 Methods range from low-level locks, which are fast and flexible but require significant attention to be correct, to higher-level concurrency techniques, which sacrifice some performance in orderto improve ease of use.1309 Ease of use comes by either guaranteeing some problems cannot occur (\eg being deadlock free) or by offering a more explicit coupling between data and correspondingcritical section.1243 Methods range from low-level locks, which are fast and flexible but require significant attention for correctness, to higher-level concurrency techniques, which sacrifice some performance to improve ease of use. 1244 Ease of use comes by either guaranteeing some problems cannot occur (\eg deadlock free), or by offering a more explicit coupling between shared data and critical section. 1310 1245 For example, the \CC @std::atomic<T>@ offers an easy way to express mutual-exclusion on a restricted set of operations (\eg reading/writing large types atomically). 1311 Another challenge with low-level locks is composability.1312 Locks have restricted composability because it takes careful organizing for multiple locks to be used while preventing deadlocks.1313 Easing composability is another feature higher-level mutual-exclusion mechanisms often offer. 
1314 1315 1316 \subsubsection{Synchronization} 1317 1318 As with mutual-exclusion, low-level synchronization primitives often offer good performance and good flexibility at the cost of ease of use.1319 Again, higher-level mechanisms often simplify usage by adding either better coupling between synchronization and data (\eg message passing) or offering a simpler solution to otherwise involved challenges.1246 However, a significant challenge with (low-level) locks is composability because it takes careful organization for multiple locks to be used while preventing deadlock. 1247 Easing composability is another feature higher-level mutual-exclusion mechanisms offer. 1248 1249 1250 \subsection{Synchronization} 1251 1252 Synchronization enforces relative ordering of execution, and synchronization tools provide numerous mechanisms to establish these timing relationships. 1253 Low-level synchronization primitives offer good performance and flexibility at the cost of ease of use. 1254 Higher-level mechanisms often simplify usage by adding better coupling between synchronization and data (\eg message passing), or offering a simpler solution to otherwise involved challenges, \eg barrier lock. 1320 1255 As mentioned above, synchronization can be expressed as guaranteeing that event \textit{X} always happens before \textit{Y}. 1321 Most of the time, synchronization happens within a critical section, where threads must acquire mutual-exclusion in a certain order. 1322 However, it may also be desirable to guarantee that event \textit{Z} does not occur between \textit{X} and \textit{Y}. 1323 Not satisfying this property is called \textbf{barging}. 1324 For example, where event \textit{X} tries to effect event \textit{Y} but another thread acquires the critical section and emits \textit{Z} before \textit{Y}. 1325 The classic example is the thread that finishes using a resource and unblocks a thread waiting to use the resource, but the unblocked thread must compete to acquire the resource. 1256 Often synchronization is used to order access to a critical section, \eg ensuring the next kind of thread to enter a critical section is a reader thread 1257 If a writer thread is scheduled for next access, but another reader thread acquires the critical section first, the reader has \newterm{barged}. 1258 Barging can result in staleness/freshness problems, where a reader barges ahead of a write and reads temporally stale data, or a writer barges ahead of another writer overwriting data with a fresh value preventing the previous value from having an opportunity to be read. 1326 1259 Preventing or detecting barging is an involved challenge with low-level locks, which can be made much easier by higher-level constructs. 1327 This challenge is often split into two different methods, barging avoidance and barging prevention. 1328 Algorithms that use flag variables to detect barging threads are said to be using barging avoidance, while algorithms that baton-pass locks~\cite{Andrews89} between threads instead of releasing the locks are said to be using barging prevention. 1329 1330 1331 % ====================================================================== 1332 % ====================================================================== 1260 This challenge is often split into two different approaches, barging avoidance and barging prevention. 
1261 Algorithms that allow a barger but divert it until later are avoiding the barger, while algorithms that preclude a barger from entering during synchronization in the critical section prevent the barger completely. 1262 baton-pass locks~\cite{Andrews89} between threads instead of releasing the locks are said to be using barging prevention. 1263 1264 1333 1265 \section{Monitors} 1334 % ====================================================================== 1335 % ====================================================================== 1266 \label{s:Monitors} 1267 1336 1268 A \textbf{monitor} is a set of routines that ensure mutual-exclusion when accessing shared state. 1337 1269 More precisely, a monitor is a programming technique that associates mutual-exclusion to routine scopes, as opposed to mutex locks, where mutual-exclusion is defined by lock/release calls independently of any scoping of the calling routine. … … 2501 2433 Given these building blocks, it is possible to reproduce all three of the popular paradigms. 2502 2434 Indeed, \textbf{uthread} is the default paradigm in \CFA. 2503 However, disabling \textbf{preemption} on the \textbf{cfacluster} means \textbf{cfathread} effectively become \textbf{fiber}.2435 However, disabling \textbf{preemption} on a cluster means threads effectively become fibers. 2504 2436 Since several \textbf{cfacluster} with different scheduling policy can coexist in the same application, this allows \textbf{fiber} and \textbf{uthread} to coexist in the runtime of an application. 2505 2437 Finally, it is possible to build executors for thread pools from \textbf{uthread} or \textbf{fiber}, which includes specialized jobs like actors~\cite{Actors}. -
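As a concrete illustration of the monitor semantics described in the concurrency changes above, the following minimal sketch (not part of this changeset) shows how a \CFA monitor ties mutual exclusion to routine scope through the @mutex@ parameter qualifier; the @Aint@ counter and its interface routine are invented here for illustration and only follow the paper's conventions.
\begin{cfa}
monitor Aint { int cnt; };                      // monitor: counter with implicitly protected shared state
void ?{}( Aint & this ) { this.cnt = 0; }       // constructor
int incr( Aint & mutex this ) {                 // mutex qualifier: acquire monitor on entry, release on return
	return ++this.cnt;
}
\end{cfa}
Concurrent calls such as @incr( x )@ are serialized on the monitor instance @x@ without any explicit locking by the caller.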
doc/papers/general/Paper.tex
r13073be r8dbedfc 243 243 Nevertheless, C, first standardized almost forty years ago~\cite{ANSI89:C}, lacks many features that make programming in more modern languages safer and more productive. 244 244 245 \CFA (pronounced ``C-for-all'', and written \CFA or Cforall) is an evolutionary extension of the C programming language that adds modern language-features to C, while maintaining both source and runtime compatibility with C and a familiar programming model for programmers.245 \CFA (pronounced ``C-for-all'', and written \CFA or Cforall) is an evolutionary extension of the C programming language that adds modern language-features to C, while maintaining source and runtime compatibility in the familiar C programming model. 246 246 The four key design goals for \CFA~\cite{Bilson03} are: 247 247 (1) The behaviour of standard C code must remain the same when translated by a \CFA compiler as when translated by a C compiler; … … 273 273 Starting with a translator versus a compiler makes it easier and faster to generate and debug C object-code rather than intermediate, assembler or machine code. 274 274 The translator design is based on the \emph{visitor pattern}, allowing multiple passes over the abstract code-tree, which works well for incrementally adding new feature through additional visitor passes. 275 At the heart of the translator is the type resolver, which handles the polymorphic routine/type overload-resolution.275 At the heart of the translator is the type resolver, which handles the polymorphic function/type overload-resolution. 276 276 % @plg2[8]% cd cfa-cc/src; cloc libcfa 277 277 % ------------------------------------------------------------------------------- … … 310 310 311 311 Finally, it is impossible to describe a programming language without usages before definitions. 312 Therefore, syntax and semantics appear before explanations ;313 hence, patience is necessary until details are presented.312 Therefore, syntax and semantics appear before explanations, and related work (Section~\ref{s:RelatedWork}) is deferred until \CFA is presented; 313 hence, patience is necessary until details are discussed. 314 314 315 315 … … 329 329 \end{quote} 330 330 \vspace{-9pt} 331 C already has a limited form of ad-hoc polymorphism in the form ofits basic arithmetic operators, which apply to a variety of different types using identical syntax.331 C already has a limited form of ad-hoc polymorphism in its basic arithmetic operators, which apply to a variety of different types using identical syntax. 332 332 \CFA extends the built-in operator overloading by allowing users to define overloads for any function, not just operators, and even any variable; 333 333 Section~\ref{sec:libraries} includes a number of examples of how this overloading simplifies \CFA programming relative to C. … … 653 653 } 654 654 \end{cfa} 655 Since @pair( T *, T * )@ is a concrete type, there are no implicit parameters passed to @lexcmp@, so the generated code is identical to a function written in standard C using @void *@, yet the \CFA version is type-checked to ensure the fields of both pairs and the arguments to the comparison function match in type.655 Since @pair( T *, T * )@ is a concrete type, there are no implicit parameters passed to @lexcmp@, so the generated code is identical to a function written in standard C using @void *@, yet the \CFA version is type-checked to ensure the members of both pairs and the arguments to the comparison function match in type. 
656 656 657 657 Another useful pattern enabled by reused dtype-static type instantiations is zero-cost \newterm{tag-structures}. … … 815 815 \subsection{Member Access} 816 816 817 It is also possible to access multiple fields from a single expression using a \newterm{member-access}.817 It is also possible to access multiple members from a single expression using a \newterm{member-access}. 818 818 The result is a single tuple-valued expression whose type is the tuple of the types of the members, \eg: 819 819 \begin{cfa} … … 1020 1020 \begin{cfa} 1021 1021 forall( dtype T0, dtype T1 | sized(T0) | sized(T1) ) struct _tuple2 { 1022 T0 field_0; T1 field_1; $\C{// generated before the first 2-tuple}$1022 T0 member_0; T1 member_1; $\C{// generated before the first 2-tuple}$ 1023 1023 }; 1024 1024 _tuple2(int, int) f() { 1025 1025 _tuple2(double, double) x; 1026 1026 forall( dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2) ) struct _tuple3 { 1027 T0 field_0; T1 field_1; T2 field_2; $\C{// generated before the first 3-tuple}$1027 T0 member_0; T1 member_1; T2 member_2; $\C{// generated before the first 3-tuple}$ 1028 1028 }; 1029 1029 _tuple3(int, double, int) y; … … 1033 1033 1034 1034 \begin{comment} 1035 Since tuples are essentially structures, tuple indexing expressions are just fieldaccesses:1035 Since tuples are essentially structures, tuple indexing expressions are just member accesses: 1036 1036 \begin{cfa} 1037 1037 void f(int, [double, char]); … … 1047 1047 _tuple2(int, double) x; 1048 1048 1049 x. field_0+x.field_1;1050 printf("%d %g\n", x. field_0, x.field_1);1051 f(x. field_0, (_tuple2){ x.field_1, 'z' });1052 \end{cfa} 1053 Note that due to flattening, @x@ used in the argument position is converted into the list of its fields.1049 x.member_0+x.member_1; 1050 printf("%d %g\n", x.member_0, x.member_1); 1051 f(x.member_0, (_tuple2){ x.member_1, 'z' }); 1052 \end{cfa} 1053 Note that due to flattening, @x@ used in the argument position is converted into the list of its members. 1054 1054 In the call to @f@, the second and third argument components are structured into a tuple argument. 1055 1055 Similarly, tuple member expressions are recursively expanded into a list of member access expressions. … … 1083 1083 1084 1084 The various kinds of tuple assignment, constructors, and destructors generate GNU C statement expressions. 1085 A variable is generated to store the value produced by a statement expression, since its fields may need to be constructed with a non-trivial constructor and it may need to be referred to multiple time, \eg in a unique expression.1085 A variable is generated to store the value produced by a statement expression, since its members may need to be constructed with a non-trivial constructor and it may need to be referred to multiple time, \eg in a unique expression. 1086 1086 The use of statement expressions allows the translator to arbitrarily generate additional temporary variables as needed, but binds the implementation to a non-standard extension of the C language. 1087 1087 However, there are other places where the \CFA translator makes use of GNU C extensions, such as its use of nested functions, so this restriction is not new. … … 1493 1493 1494 1494 Heterogeneous data is often aggregated into a structure/union. 
1495 To reduce syntactic noise, \CFA provides a @with@ statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate field-qualification by opening a scope containing the fieldidentifiers.1495 To reduce syntactic noise, \CFA provides a @with@ statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate member-qualification by opening a scope containing the member identifiers. 1496 1496 \begin{cquote} 1497 1497 \vspace*{-\baselineskip}%??? … … 1530 1530 The type must be an aggregate type. 1531 1531 (Enumerations are already opened.) 1532 The object is the implicit qualifier for the open structure- fields.1532 The object is the implicit qualifier for the open structure-members. 1533 1533 1534 1534 All expressions in the expression list are open in parallel within the compound statement, which is different from Pascal, which nests the openings from left to right. 1535 The difference between parallel and nesting occurs for fields with the same name and type:1536 \begin{cfa} 1537 struct S { int `i`; int j; double m; } s, w; 1535 The difference between parallel and nesting occurs for members with the same name and type: 1536 \begin{cfa} 1537 struct S { int `i`; int j; double m; } s, w; $\C{// member i has same type in structure types S and T}$ 1538 1538 struct T { int `i`; int k; int m; } t, w; 1539 with ( s, t ) { 1539 with ( s, t ) { $\C{// open structure variables s and t in parallel}$ 1540 1540 j + k; $\C{// unambiguous, s.j + t.k}$ 1541 1541 m = 5.0; $\C{// unambiguous, s.m = 5.0}$ … … 1549 1549 For parallel semantics, both @s.i@ and @t.i@ are visible, so @i@ is ambiguous without qualification; 1550 1550 for nested semantics, @t.i@ hides @s.i@, so @i@ implies @t.i@. 1551 \CFA's ability to overload variables means fields with the same name but different types are automatically disambiguated, eliminating most qualification when opening multiple aggregates.1551 \CFA's ability to overload variables means members with the same name but different types are automatically disambiguated, eliminating most qualification when opening multiple aggregates. 1552 1552 Qualification or a cast is used to disambiguate. 1553 1553 … … 1555 1555 \begin{cfa} 1556 1556 void ?{}( S & s, int i ) with ( s ) { $\C{// constructor}$ 1557 `s.i = i;` j = 3; m = 5.5; $\C{// initialize fields}$1557 `s.i = i;` j = 3; m = 5.5; $\C{// initialize members}$ 1558 1558 } 1559 1559 \end{cfa} … … 1659 1659 \lstMakeShortInline@% 1660 1660 \end{cquote} 1661 The only exception is bit field specification, which always appear to the right of the base type.1661 The only exception is bit-field specification, which always appear to the right of the base type. 1662 1662 % Specifically, the character @*@ is used to indicate a pointer, square brackets @[@\,@]@ are used to represent an array or function return value, and parentheses @()@ are used to indicate a function parameter. 1663 1663 However, unlike C, \CFA type declaration tokens are distributed across all variables in the declaration list. 
… … 1715 1715 // pointer to array of 5 doubles 1716 1716 1717 // common bit field syntax1717 // common bit-field syntax 1718 1718 1719 1719 … … 1911 1911 \subsection{Type Nesting} 1912 1912 1913 Nested types provide a mechanism to organize associated types and refactor a subset of fields into a named aggregate (\eg sub-aggregates @name@, @address@, @department@, within aggregate @employe@).1913 Nested types provide a mechanism to organize associated types and refactor a subset of members into a named aggregate (\eg sub-aggregates @name@, @address@, @department@, within aggregate @employe@). 1914 1914 Java nested types are dynamic (apply to objects), \CC are static (apply to the \lstinline[language=C++]@class@), and C hoists (refactors) nested types into the enclosing scope, meaning there is no need for type qualification. 1915 1915 Since \CFA in not object-oriented, adopting dynamic scoping does not make sense; 1916 instead \CFA adopts \CC static nesting, using the field-selection operator ``@.@'' for type qualification, as does Java, rather than the \CC type-selection operator ``@::@'' (see Figure~\ref{f:TypeNestingQualification}).1916 instead \CFA adopts \CC static nesting, using the member-selection operator ``@.@'' for type qualification, as does Java, rather than the \CC type-selection operator ``@::@'' (see Figure~\ref{f:TypeNestingQualification}). 1917 1917 \begin{figure} 1918 1918 \centering … … 2005 2005 Destruction parameters are useful for specifying storage-management actions, such as de-initialize but not deallocate.}. 2006 2006 \begin{cfa} 2007 struct VLA { int len, * data; }; $\C{// variable length array of integers}$2008 void ?{}( VLA & vla ) with ( vla ) { len = 10; data = alloc( len); } $\C{// default constructor}$2007 struct VLA { int size, * data; }; $\C{// variable length array of integers}$ 2008 void ?{}( VLA & vla ) with ( vla ) { size = 10; data = alloc( size ); } $\C{// default constructor}$ 2009 2009 void ^?{}( VLA & vla ) with ( vla ) { free( data ); } $\C{// destructor}$ 2010 2010 { … … 2013 2013 \end{cfa} 2014 2014 @VLA@ is a \newterm{managed type}\footnote{ 2015 A managed type affects the runtime environment versus a self-contained type.}: a type requiring a non-trivial constructor or destructor, or with a fieldof a managed type.2015 A managed type affects the runtime environment versus a self-contained type.}: a type requiring a non-trivial constructor or destructor, or with a member of a managed type. 2016 2016 A managed type is implicitly constructed at allocation and destructed at deallocation to ensure proper interaction with runtime resources, in this case, the @data@ array in the heap. 2017 2017 For details of the code-generation placement of implicit constructor and destructor calls among complex executable statements see~\cite[\S~2.2]{Schluntz17}. … … 2019 2019 \CFA also provides syntax for \newterm{initialization} and \newterm{copy}: 2020 2020 \begin{cfa} 2021 void ?{}( VLA & vla, int size, char fill ) with ( vla) { $\C{// initialization}$2022 len = size; data = alloc( len, fill );2021 void ?{}( VLA & vla, int size, char fill = '\0' ) { $\C{// initialization}$ 2022 vla.[ size, data ] = [ size, alloc( size, fill ) ]; 2023 2023 } 2024 2024 void ?{}( VLA & vla, VLA other ) { $\C{// copy, shallow}$ 2025 vla .len = other.len; vla.data = other.data;2025 vla = other; 2026 2026 } 2027 2027 \end{cfa} … … 2036 2036 2037 2037 \CFA constructors may be explicitly called, like Java, and destructors may be explicitly called, like \CC. 
2038 Explicit calls to constructors double as a \CC-style \emph{placement syntax}, useful for construction of member fields in user-defined constructors and reuse of existing storage allocations.2038 Explicit calls to constructors double as a \CC-style \emph{placement syntax}, useful for construction of members in user-defined constructors and reuse of existing storage allocations. 2039 2039 Like the other operators in \CFA, there is a concise syntax for constructor/destructor function calls: 2040 2040 \begin{cfa} … … 2048 2048 y{ x }; $\C{// reallocate y, points to x}$ 2049 2049 x{}; $\C{// reallocate x, not pointing to y}$ 2050 // ^z{}; ^y{}; ^x{}; 2051 } 2050 } // ^z{}; ^y{}; ^x{}; 2052 2051 \end{cfa} 2053 2052 … … 2060 2059 For compatibility with C, a copy constructor from the first union member type is also defined. 2061 2060 For @struct@ types, each of the four functions are implicitly defined to call their corresponding functions on each member of the struct. 2062 To better simulate the behaviour of C initializers, a set of \newterm{ fieldconstructors} is also generated for structures.2061 To better simulate the behaviour of C initializers, a set of \newterm{member constructors} is also generated for structures. 2063 2062 A constructor is generated for each non-empty prefix of a structure's member-list to copy-construct the members passed as parameters and default-construct the remaining members. 2064 To allow users to limit the set of constructors available for a type, when a user declares any constructor or destructor, the corresponding generated function and all fieldconstructors for that type are hidden from expression resolution;2063 To allow users to limit the set of constructors available for a type, when a user declares any constructor or destructor, the corresponding generated function and all member constructors for that type are hidden from expression resolution; 2065 2064 similarly, the generated default constructor is hidden upon declaration of any constructor. 2066 2065 These semantics closely mirror the rule for implicit declaration of constructors in \CC\cite[p.~186]{ANSI98:C++}. … … 2740 2739 2741 2740 \section{Related Work} 2741 \label{s:RelatedWork} 2742 2742 2743 2743 … … 2793 2793 C provides variadic functions through @va_list@ objects, but the programmer is responsible for managing the number of arguments and their types, so the mechanism is type unsafe. 2794 2794 KW-C~\cite{Buhr94a}, a predecessor of \CFA, introduced tuples to C as an extension of the C syntax, taking much of its inspiration from SETL. 2795 The main contributions of that work were adding MRVF, tuple mass and multiple assignment, and record- fieldaccess.2795 The main contributions of that work were adding MRVF, tuple mass and multiple assignment, and record-member access. 2796 2796 \CCeleven introduced @std::tuple@ as a library variadic template structure. 2797 2797 Tuples are a generalization of @std::pair@, in that they allow for arbitrary length, fixed-size aggregation of heterogeneous values.
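The related-work discussion above compares \CFA tuples with KW-C and \CC @std::tuple@; as a hedged sketch of the multiple-return-value and multiple-assignment idiom the paper describes (the routine name @divmod@ and the use of C @printf@ are choices made here for a self-contained example, not taken from the changeset):
\begin{cfa}
#include <stdio.h>                              // C library works unchanged in CFA
[ int, int ] divmod( int num, int den ) {       // multiple return values as a tuple
	return [ num / den, num % den ];
}
int main() {
	int q, r;
	[ q, r ] = divmod( 13, 5 );                 // multiple (tuple) assignment
	printf( "%d %d\n", q, r );                  // prints 2 3
}
\end{cfa}
The tuple returned by @divmod@ is assigned componentwise to @q@ and @r@, so no temporary structure is visible to the caller.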