- Timestamp:
- Jan 22, 2021, 12:02:59 PM
- Branches:
- ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- 4706098c
- Parents:
- 62c0f8a
- File:
- 1 edited
doc/theses/andrew_beach_MMath/existing.tex
r62c0f8a → r6c79bef

\chapter{\CFA Existing Features}
\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with modern safety and productivity features, while still ensuring backwards compatibility with C and its programmers. \CFA is designed to have an orthogonal feature-set based closely on the C programming paradigm (non-object-oriented) and these features can be added incrementally to an existing C code-base, allowing programmers to learn \CFA on an as-needed basis.

Only those \CFA features pertinent to this thesis are discussed. Many of the \CFA syntactic and semantic features used in the thesis should be fairly obvious to the reader.

\section{Overloading and \lstinline|extern|}
\CFA has extensive overloading, allowing multiple definitions of the same name to be defined~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;			$\C[3.75in]{// variable overload}$
int f(); double f();				$\C{// return overload}$
void g( int ); void g( double );	$\C{// parameter overload}\CRT$
\end{cfa}
This feature requires name mangling so the assembly symbols are unique for different overloads. For compatibility with names in C, there is also a syntax to disable name mangling. These unmangled names cannot be overloaded but act as the interface between C and \CFA code. The syntax for disabling/enabling mangling is:
\begin{cfa}
// name mangling
int i;						// _X1ii_1
@extern "C"@ {				// no name mangling
	int j;					// j
	@extern "Cforall"@ {	// name mangling
		int k;				// _X1ki_1
	}
	// no name mangling
}
// name mangling
\end{cfa}
Both forms of @extern@ affect all the declarations within their nested lexical scope and transition back to the previous mangling state when the lexical scope ends.

\section{Reference Type}
\CFA adds a rebindable reference type to C, which is more expressive than the \CC reference.
Multi-level references are allowed and act like auto-dereferenced pointers, using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA references may also be mutable or non-mutable. If mutable, a reference variable may be assigned to using the address-of operator (@&@), which converts the reference to a pointer.
\begin{cfa}
int i, j;
int @&@ ri = i, @&&@ rri = ri;
rri = 3;		// auto-dereference assign to i
@&@ri = @&@j;	// rebindable
ri = 5;			// assign to j
\end{cfa}

\section{Constructors and Destructors}

Both constructors and destructors are operators, which means they are just functions with special operator names rather than type names as in \CC. The special operator names may be used to call the functions explicitly (not allowed in \CC for constructors).

In general, operator names in \CFA are constructed by bracketing an operator token with @?@, which indicates where the arguments go. For example, infixed multiplication is @?*?@ while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations (such as @++?@) and postfix operations (@?++@).

The special name for a constructor is @?{}@, which comes from the initialization syntax in C. The special name for a destructor is @^?{}@, where the @^@ has no special meaning.
% I don't like the \^{} symbol but $^\wedge$ isn't better.
\begin{cfa}
struct T { ... };
void @?{}@( @T &@ this, ... ) { ... }	// constructor
void @^?{}@( @T &@ this, ... ) { ... }	// destructor
{
	T s = @{@ ... @}@;	// same constructor/initialization braces
} // destructor call automatically generated
\end{cfa}
The first parameter is a reference parameter to the type for the constructor/destructor. Constructors may have multiple parameters, while the implicitly-called destructor has only the object parameter. The compiler implicitly matches an overloaded constructor @void ?{}(T &, ...);@ to an object declaration with associated initialization, and generates a construction call after the object is allocated. When an object goes out of scope, the matching overloaded destructor @void ^?{}(T &);@ is called. Without explicit definition, \CFA creates a default and copy constructor, destructor and assignment (like \CC). It is possible to define constructors/destructors for basic and existing types.

\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are defined over multiple types. \CFA polymorphic declarations serve the same role as \CC templates or Java generics. The ``parametric'' means the polymorphism is accomplished by passing argument operations to associate \emph{parameters} at the call site, and these parameters are used in the function to differentiate among the types the function operates on.

Polymorphic declarations start with a universal @forall@ clause that goes before the standard (monomorphic) declaration. These declarations have the same syntax except they may use the universal type names introduced by the @forall@ clause. For example, the following is a polymorphic identity function that works on any type @T@:
\begin{cfa}
@forall( T )@ @T@ identity( @T@ val ) { return val; }
int forty_two = identity( 42 );		// T bound to int, forty_two == 42
\end{cfa}

To allow a polymorphic function to be separately compiled, the type @T@ must be constrained by the operations used on @T@ in the function body.
The @forall@ clause is augmented with a list of polymorphic variables (local type names) and assertions (constraints), which represent the required operations on those types used in a function, \eg:
\begin{cfa}
forall( T @| { void do_once(T &); }@ )	// assertion
void do_twice(T & value) {
	do_once(value);
	do_once(value);
}
void do_once(int & i) { ... }	// provide assertion
int i;
do_twice(i);	// implicitly pass assertion do_once to do_twice
\end{cfa}
Any object with a type fulfilling the assertion may be passed as an argument to a @do_twice@ call.

A polymorphic function can be used in the same way as a normal function. The polymorphic variables are filled in with concrete types and the assertions are checked. An assertion is checked by verifying that each assertion operation (with all the variables replaced with the concrete types from the arguments) is defined at a call site.

Note, a function named @do_once@ is not required in the scope of @do_twice@ to compile it, unlike \CC template expansion. Furthermore, call-site inferencing allows local replacement of the most specific parametric function needed for a call.
\begin{cfa}
void do_once(double & y) { ... }	// global
int quadruple(int x) {
	void do_once(int & y) { y = y * 2; }	// local
	do_twice(x);	// using local "do_once"
	return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the argument @x@. It then looks for the most specific definition matching the assertion, which is the nested integral @do_once@ defined within the function. The matched assertion function is then passed as a function pointer to @do_twice@ and called within it.

To avoid typing long lists of assertions, constraints can be collected into convenient packages called a @trait@, which can then be used in an assertion instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
	void do_once(T &);
}
\end{cfa}
and the @forall@ list in the previous example is replaced with the trait.
\begin{cfa}
forall( T | @done_once(T)@ )
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both functions and variables, and is usually used to create a shorthand for, and give a descriptive name to, a common grouping of assertions describing a certain functionality, like @sumable@, @listable@, \etc.

Polymorphic structures and unions are defined by qualifying the aggregate type with @forall@. The type variables work the same way except they are used in field declarations instead of parameters, returns, and local variable declarations.
\begin{cfa}
forall( T )
struct node {
	node(T) * next;		// generic linked node
	T * data;
};
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic-type usage. Like \CC template usage, a polymorphic-type usage must specify a type parameter.

There are many other polymorphism features in \CFA but these are the ones used by the exception system.

\section{Concurrency}
\CFA has a number of concurrency features: @thread@, @monitor@, @mutex@ parameters, @coroutine@ and @generator@. The two features that interact with the exception system are @thread@ and @coroutine@; they and their supporting constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not required to finish execution when control is handed back to the caller. Instead they may suspend execution at any time and be resumed later at the point of last suspension. (Generators are stackless and coroutines are stackful.) These types are not concurrent but share some similarities along with common underpinnings, so they are combined with the \CFA threading library. Further discussion in this section only refers to the coroutine because generators are similar.

In \CFA, a coroutine is created using the @coroutine@ keyword, which is an aggregate type like @struct@, except the structure is implicitly modified by the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is restricted by the type system to types that provide this special trait.
The coroutine structure acts as the interface between callers and the coroutine, and its fields are used to pass information in and out of coroutine interface functions.

Here is a simple example where a single field is used to pass (communicate) the next number in a sequence.
\begin{cfa}
coroutine CountUp {
	unsigned int next;	// communication variable
};
CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine object and returns @void@.
\begin{cfa}[numbers=left]
void main(@CountUp & this@) {	// argument matches trait is_coroutine
	unsigned int up = 0;		// retained between calls
	while (true) {
		this.next = up;			// make "up" available outside function
		@suspend;@$\label{suspend}$
		up += 1;
	}
}
\end{cfa}
In this function, or functions called by this function (helper functions), the @suspend@ statement is used to return execution to the coroutine's caller without terminating the coroutine.

A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@. The first resume calls the @main@ function at the top. Thereafter, resume calls continue a coroutine in the last suspended function after the @suspend@ statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes a reference to the coroutine structure and returns the same reference. The return value allows easy access to communication variables defined in the coroutine object. For example, the @next@ value for coroutine object @countup@ is both generated and collected in the single expression: @resume(countup).next@.

\subsection{Monitors and Mutex}

Concurrency does not guarantee ordering; without ordering results are non-deterministic.
To claw back ordering, \CFA uses monitors and @mutex@ (mutual exclusion) parameters. A monitor is another kind of aggregate, where the compiler implicitly inserts a lock and instances are compatible with @mutex@ parameters.

A function that requires deterministic (ordered) execution acquires mutual exclusion on a monitor object by qualifying an object reference parameter with @mutex@.
\begin{cfa}
void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of the mutex parameters without deadlock. This semantics means all functions with the same mutex type(s) are part of a critical section for objects of that type and only one runs at a time.

\subsection{Threads}
Functions, generators, and coroutines are sequential, so there is only a single (but potentially sophisticated) execution path in a program. Threads introduce multiple execution paths that continue independently.

For threads to work safely with objects requires mutual exclusion using monitors and mutex parameters. For threads to work safely with other threads also requires mutual exclusion, in the form of a communication rendezvous, which additionally supports internal synchronization, as for mutex objects.
For exceptions, only the two basic operations are important: thread fork and join.

Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
	const char * input;
	int result;
};
void main(StringWorker & this) {
	const char * localCopy = this.input;
	// ... do some work, perhaps hashing the string into a local "result" ...
	this.result = result;
}
{
	StringWorker stringworker;	// fork thread running in "main"
} // implicitly join with thread $\(\Rightarrow\)$ wait for completion
\end{cfa}
The thread main is where a new thread starts execution after a fork operation, and then the thread continues executing until it is finished. If another thread joins with an executing thread, it waits until the executing main completes execution. In other words, everything a thread does is between a fork and join.

From the outside, this behaviour is accomplished through creation and destruction of a thread object. Implicitly, fork happens after a thread object's constructor is run and join happens before the destructor runs.
Join can also be specified explicitly using the @join@ function to wait for a thread's completion independently from its deallocation (\ie destructor call). If @join@ is called explicitly, the destructor does not implicitly join.
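For example, the explicit form can be sketched as follows, reusing the @StringWorker@ type from above; the result-access pattern shown is illustrative rather than prescribed.
\begin{cfa}
{
	StringWorker stringworker;		// fork thread running in "main"
	// ... caller performs other work concurrently ...
	join( stringworker );			// explicit join: wait for completion
	int r = stringworker.result;	// thread finished, safe to read result
} // destructor runs, but does not join again
\end{cfa}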