Changeset d95969a for doc/theses
- Timestamp:
- Jan 25, 2021, 3:45:42 PM (4 years ago)
- Branches:
- ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- c292244
- Parents:
- b6a8b31 (diff), 7158202 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
- Location:
- doc/theses
- Files:
- 3 added
- 9 edited
-
doc/theses/andrew_beach_MMath/.gitignore
rb6a8b31 rd95969a 3 3 4 4 # Final Files: 5 thesis.pdf 5 *.pdf 6 6 7 7 # The Makefile here is not generated. -
doc/theses/andrew_beach_MMath/Makefile
rb6a8b31 rd95969a 1 1 ### Makefile for Andrew Beach's Masters Thesis 2 2 3 DOC= thesis.pdf 3 DOC=uw-ethesis.pdf 4 4 BUILD=out 5 5 TEXSRC=$(wildcard *.tex) … 7 7 STYSRC=$(wildcard *.sty) 8 8 CLSSRC=$(wildcard *.cls) 9 TEXLIB= .: ${BUILD}: 9 TEXLIB= .:../../LaTeXmacros:${BUILD}: 10 10 BIBLIB= .:../../bibliography … 29 29 ${LATEX} ${BASE} 30 30 ${BIBTEX} ${BUILD}/${BASE} 31 ${LATEX} ${BASE} 31 32 ${GLOSSARY} ${BUILD}/${BASE} 32 33 ${LATEX} ${BASE} -
doc/theses/andrew_beach_MMath/existing.tex
rb6a8b31 rd95969a 1 \chapter{\CFA{} Existing Features} 2 3 \section{Overloading and extern} 4 Cforall has overloading, allowing multiple definitions of the same name to 5 be defined. 6 7 This also adds name mangling so that the assembly symbols are unique for 8 different overloads. For compatability with names in C there is also a 9 syntax to diable the name mangling. These unmangled names cannot be overloaded 10 but act as the interface between C and \CFA code. 11 12 The syntax for disabling mangling is: 13 \begin{lstlisting} 14 extern "C" { 15 ... 16 } 17 \end{lstlisting} 18 19 To re-enable mangling once it is disabled the syntax is: 20 \begin{lstlisting} 21 extern "Cforall" { 22 ... 23 } 24 \end{lstlisting} 25 26 Both should occur at the declaration level and effect all the declarations 27 in \texttt{...}. Neither care about the state of mangling when they begin 28 and will return to that state after the group is finished. So re-enabling 29 is only used to nest areas of mangled and unmangled declarations. 30 31 \section{References} 32 \CFA adds references to C. These are auto-dereferencing pointers and use the 33 same syntax as pointers except they use ampersand (\codeCFA{\&}) instead of 34 the asterisk (\codeCFA{*}). They can also be constaint or mutable, if they 35 are mutable they may be assigned to by using the address-of operator 36 (\codeCFA\&) which converts them into a pointer. 1 \chapter{\texorpdfstring{\CFA Existing Features}{Cforall Existing Features}} 2 3 \CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with 4 modern safety and productivity features, while still ensuring backwards 5 compatibility with C and its programmers. \CFA is designed to have an 6 orthogonal feature-set based closely on the C programming paradigm 7 (non-object-oriented) and these features can be added incrementally to an 8 existing C code-base allowing programmers to learn \CFA on an as-needed basis. 9 10 Only those \CFA features pertinent to this thesis are discussed. Many of the 11 \CFA syntactic and semantic features used in the thesis should be fairly 12 obvious to the reader. 13 14 \section{\texorpdfstring{Overloading and \lstinline|extern|}{Overloading and extern}} 15 \CFA has extensive overloading, allowing multiple definitions of the same name 16 to be defined.~\cite{Moss18} 17 \begin{cfa} 18 char i; int i; double i; $\C[3.75in]{// variable overload}$ 19 int f(); double f(); $\C{// return overload}$ 20 void g( int ); void g( double ); $\C{// parameter overload}\CRT$ 21 \end{cfa} 22 This feature requires name mangling so the assembly symbols are unique for 23 different overloads. For compatibility with names in C, there is also a syntax 24 to disable name mangling. These unmangled names cannot be overloaded but act as 25 the interface between C and \CFA code. The syntax for disabling/enabling 26 mangling is: 27 \begin{cfa} 28 // name mangling 29 int i; // _X1ii_1 30 @extern "C"@ { // no name mangling 31 int j; // j 32 @extern "Cforall"@ { // name mangling 33 int k; // _X1ki_1 34 } 35 // no name mangling 36 } 37 // name mangling 38 \end{cfa} 39 Both forms of @extern@ affect all the declarations within their nested lexical 40 scope and transition back to the previous mangling state when the lexical scope 41 ends. 42 43 \section{Reference Type} 44 \CFA adds a rebindable reference type to C, but more expressive than the \CC 45 reference. 
Multi-level references are allowed and act like auto-dereferenced 46 pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA 47 references may also be mutable or non-mutable. If mutable, a reference variable 48 may be assigned to using the address-of operator (@&@), which converts the 49 reference to a pointer. 50 \begin{cfa} 51 int i, j; 52 int @&@ ri = i, @&&@ rri = ri; 53 rri = 3; // auto-dereference assign to i 54 @&@ri = @&@j; // rebindable 55 ri = 5; // assign to j 56 \end{cfa} 37 57 38 58 \section{Constructors and Destructors} 39 59 40 60 Both constructors and destructors are operators, which means they are just 41 functions with special names. The special names are used to define them and 42 may be used to call the functions expicately. The \CFA special names are 43 constructed by taking the tokens in the operators and putting \texttt{?} where 44 the arguments would go. So multiplication is \texttt{?*?} while dereference 45 is \texttt{*?}. This also make it easy to tell the difference between 46 pre-fix operations (such as \texttt{++?}) and post-fix operations 47 (\texttt{?++}). 48 49 The special name for contructors is \texttt{?\{\}}, which comes from the 50 initialization syntax in C. The special name for destructors is 51 \texttt{\^{}?\{\}}. % I don't like the \^{} symbol but $^\wedge$ isn't better. 52 53 Any time a type T goes out of scope the destructor matching 54 \codeCFA{void ^?\{\}(T \&);} is called. In theory this is also true of 55 primitive types such as \codeCFA{int}, but in practice those are no-ops and 56 are usually omitted for optimization. 61 functions with special operator names rather than type names in \CC. The 62 special operator names may be used to call the functions explicitly (not 63 allowed in \CC for constructors). 64 65 In general, operator names in \CFA are constructed by bracketing an operator 66 token with @?@, which indicates where the arguments go. For example, infix 67 multiplication is @?*?@ while prefix dereference is @*?@. This syntax makes it 68 easy to tell the difference between prefix operations (such as @++?@) and 69 post-fix operations (@?++@). 70 71 The special name for a constructor is @?{}@, which comes from the 72 initialization syntax in C. The special name for a destructor is @^?{}@, where 73 the @^@ has no special meaning. 74 % I don't like the \^{} symbol but $^\wedge$ isn't better. 75 \begin{cfa} 76 struct T { ... }; 77 void ?@{}@(@T &@ this, ...) { ... } // constructor 78 void ^?@{}@(@T &@ this, ...) { ... } // destructor 79 { 80 T s = @{@ ... @}@; // same constructor/initialization braces 81 } // destructor call automatically generated 82 \end{cfa} 83 The first parameter is a reference parameter to the type for the 84 constructor/destructor. Constructors may have multiple parameters. The compiler 85 implicitly matches an overloaded constructor @void ?{}(T &, ...);@ to an 86 object declaration with associated initialization, and generates a construction 87 call after the object is allocated. When an object goes out of scope, the 88 matching overloaded destructor @void ^?{}(T &);@ is called. Without explicit 89 definition, \CFA creates a default and copy constructor, destructor and 90 assignment (like \CC). It is possible to define constructors/destructors for 91 basic and existing types. 57 92 58 93 \section{Polymorphism} 59 \CFA uses polymorphism to create functions and types that are defined over 60 different types. \CFA polymorphic declarations serve the same role as \CPP 61 templates or Java generics.
62 63 Polymorphic declaractions start with a forall clause that goes before the 64 standard (monomorphic) declaration. These declarations have the same syntax 65 except that you may use the names introduced by the forall clause in them. 66 67 Forall clauses are written \codeCFA{forall( ... )} where \codeCFA{...} becomes 68 the list of polymorphic variables (local type names) and assertions, which 69 repersent required operations on those types. 70 71 \begin{lstlisting} 72 forall(dtype T | { void do_once(T &); }) 73 void do_twice(T & value) { 74 do_once(value); 75 do_once(value); 76 } 77 \end{lstlisting} 78 79 A polymorphic function can be used in the same way normal functions are. 80 The polymorphics variables are filled in with concrete types and the 81 assertions are checked. An assertion checked by seeing if that name of that 82 type (with all the variables replaced with the concrete types) is defined at 83 the the call site. 84 85 As an example, even if no function named \codeCFA{do_once} is not defined 86 near the definition of \codeCFA{do_twice} the following code will work. 87 \begin{lstlisting} 94 \CFA uses parametric polymorphism to create functions and types that are 95 defined over multiple types. \CFA polymorphic declarations serve the same role 96 as \CC templates or Java generics. The ``parametric'' means the polymorphism is 97 accomplished by passing argument operations to associate \emph{parameters} at 98 the call site, and these parameters are used in the function to differentiate 99 among the types the function operates on. 100 101 Polymorphic declarations start with a universal @forall@ clause that goes 102 before the standard (monomorphic) declaration. These declarations have the same 103 syntax except they may use the universal type names introduced by the @forall@ 104 clause. For example, the following is a polymorphic identity function that 105 works on any type @T@: 106 \begin{cfa} 107 @forall( T )@ @T@ identity( @T@ val ) { return val; } 108 int forty_two = identity( 42 ); // T bound to int, forty_two == 42 109 \end{cfa} 110 111 To allow a polymorphic function to be separately compiled, the type @T@ must be 112 constrained by the operations used on @T@ in the function body. The @forall@ 113 clause is augmented with a list of polymorphic variables (local type names) 114 and assertions (constraints), which represent the required operations on those 115 types used in a function, \eg: 116 \begin{cfa} 117 forall( T @| { void do_once(T); }@) // assertion 118 void do_twice(T value) { 119 do_once(value); 120 do_once(value); 121 } 122 void do_once(int i) { ... } // provide assertion 123 int i; 124 do_twice(i); // implicitly pass assertion do_once to do_twice 125 \end{cfa} 126 Any object with a type fulfilling the assertion may be passed as an argument to 127 a @do_twice@ call. 128 129 A polymorphic function can be used in the same way as a normal function. The 130 polymorphic variables are filled in with concrete types and the assertions are 131 checked. An assertion is checked by verifying each assertion operation (with 132 all the variables replaced with the concrete types from the arguments) is 133 defined at a call site. 134 135 Note, a function named @do_once@ is not required in the scope of @do_twice@ to 136 compile it, unlike \CC template expansion. Furthermore, call-site inferencing 137 allows local replacement of the most specific parametric function needed for a 138 call. 139 \begin{cfa} 140 void do_once(double y) { ...
} // global 88 141 int quadruple(int x) { 89 void do_once(int & y) { 90 y = y * 2; 91 } 92 do_twice(x); 93 return x; 94 } 95 \end{lstlisting} 96 This is not the recommended way to implement a quadruple function but it 97 does work. The complier will deduce that \codeCFA{do_twice}'s T is an 98 integer from the argument. It will then look for a definition matching the 99 assertion which is the \codeCFA{do_once} defined within the function. That 100 function will be passed in as a function pointer to \codeCFA{do_twice} and 101 called within it. 102 103 To avoid typing out long lists of assertions again and again there are also 104 traits which collect assertions into convenent packages that can then be used 105 in assertion lists instead of all of their components. 106 \begin{lstlisting} 107 trait done_once(dtype T) { 108 void do_once(T &); 109 } 110 \end{lstlisting} 111 112 After this the forall list in the previous example could instead be written 113 with the trait instead of the assertion itself. 114 \begin{lstlisting} 115 forall(dtype T | done_once(T)) 116 \end{lstlisting} 117 118 Traits can have arbitrary number of assertions in them and are usually used to 119 create short hands for, and give descriptive names to, commond groupings of 120 assertions. 121 122 Polymorphic structures and unions may also be defined by putting a forall 123 clause before the declaration. The type variables work the same way except 124 are now used in field declaractions instead of parameters and local variables. 125 126 \begin{lstlisting} 142 void do_once(int y) { y = y * 2; } // local 143 do_twice(x); // using local "do_once" 144 return x; 145 } 146 \end{cfa} 147 Specifically, the compiler deduces that @do_twice@'s T is an integer from the 148 argument @x@. It then looks for the most specific definition matching the 149 assertion, which is the nested integral @do_once@ defined within the 150 function. The matched assertion function is then passed as a function pointer 151 to @do_twice@ and called within it. 152 153 To avoid typing long lists of assertions, constraints can be collected into 154 a convenient package called a @trait@, which can then be used in an assertion 155 instead of the individual constraints. 156 \begin{cfa} 157 trait done_once(T) { 158 void do_once(T); 159 } 160 \end{cfa} 161 and the @forall@ list in the previous example is replaced with the trait. 162 \begin{cfa} 163 forall(dtype T | @done_once(T)@) 164 \end{cfa} 165 In general, a trait can contain an arbitrary number of assertions, both 166 functions and variables, and is usually used to create a shorthand for, and 167 give descriptive names to, common groupings of assertions describing a certain 168 functionality, like @sumable@, @listable@, \etc. 169 170 Polymorphic structures and unions are defined by qualifying the aggregate type 171 with @forall@. The type variables work the same except they are used in field 172 declarations instead of parameters, returns, and local variable declarations. 173 \begin{cfa} 127 174 forall(dtype T) 128 175 struct node { 129 node(T) * next; 130 T * data; 131 } 132 \end{lstlisting} 133 134 The \codeCFA{node(T)} is a use of a polymorphic structure. Polymorphic types 135 must be provided their polymorphic parameters. 136 137 There are many other features of polymorphism that have not given here but 138 these are the ones used by the exception system. 176 node(T) * next; // generic linked node 177 T * data; 178 } 179 \end{cfa} 180 The generic type @node(T)@ is an example of a polymorphic-type usage.
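As a small usage sketch (hypothetical code, not from the thesis; it assumes C-style initialization applies to generic structures as it does for plain structures):
\begin{cfa}
int i = 42;
node(int) n = { 0p, &i }; // T bound to int: next == 0p, data == &i
node(int) * head = &n; // the type argument recurs at every usage
\end{cfa}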
Like \CC 181 templates usage, a polymorphic-type usage must specify a type parameter. 182 183 There are many other polymorphism features in \CFA but these are the ones used 184 by the exception system. 139 185 140 186 \section{Concurrency} 141 142 \CFA has a number of concurrency features, \codeCFA{thread}s, 143 \codeCFA{monitor}s and \codeCFA{mutex} parameters, \codeCFA{coroutine}s and 144 \codeCFA{generator}s. The two features that interact with the exception system 145 are \codeCFA{thread}s and \codeCFA{coroutine}s; they and their supporting 146 constructs will be described here. 147 148 \subsection{Coroutines} 149 Coroutines are routines that do not have to finish execution to hand control 150 back to their caller, instead they may suspend their execution at any time 151 and resume it later. 152 Coroutines are not true concurrency but share some similarities and many of 153 the same underpinnings and so are included as part of the \CFA threading 154 library. 155 156 In \CFA coroutines are created using the \codeCFA{coroutine} keyword which 157 works just like \codeCFA{struct} except that the created structure will be 158 modified by the compiler to satify the \codeCFA{is_coroutine} trait. 159 160 These structures act as the interface between callers and the coroutine, 161 the fields are used to pass information in and out. Here is a simple example 162 where the single field is used to pass the next number in a sequence out. 163 \begin{lstlisting} 187 \CFA has a number of concurrency features: @thread@, @monitor@, @mutex@ 188 parameters, @coroutine@ and @generator@. The two features that interact with 189 the exception system are @thread@ and @coroutine@; they and their supporting 190 constructs are described here. 191 192 \subsection{Coroutine} 193 A coroutine is a type with associated functions, where the functions are not 194 required to finish execution when control is handed back to the caller. Instead 195 they may suspend execution at any time and be resumed later at the point of 196 last suspension. (Generators are stackless and coroutines are stackful.) These 197 types are not concurrent but share some similarities along with common 198 underpinnings, so they are combined with the \CFA threading library. Further 199 discussion in this section only refers to the coroutine because generators are 200 similar. 201 202 In \CFA, a coroutine is created using the @coroutine@ keyword, which is an 203 aggregate type like @struct,@ except the structure is implicitly modified by 204 the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is 205 restricted by the type system to types that provide this special trait. The 206 coroutine structure acts as the interface between callers and the coroutine, 207 and its fields are used to pass information in and out of coroutine interface 208 functions. 209 210 Here is a simple example where a single field is used to pass (communicate) the 211 next number in a sequence. 212 \begin{cfa} 164 213 coroutine CountUp { 165 unsigned int next; 166 } 167 \end{lstlisting} 168 169 The routine part of the coroutine is a main function for the coroutine. It 170 takes a reference to a coroutine object and returns nothing. In this function, 171 and any functions called by this function, the suspend statement may be used 172 to return execution to the coroutine's caller. When control returns to the 173 function it continue from that same suspend statement instead of at the top 174 of the function. 
175 \begin{lstlisting} 176 void main(CountUp & this) { 177 unsigned int next = 0; 178 while (true) { 179 this.next = next; 180 suspend; 181 next = next + 1; 182 } 183 } 184 \end{lstlisting} 185 186 Control is passed to the coroutine with the resume function. This includes the 187 first time when the coroutine is starting up. The resume function takes a 188 reference to the coroutine structure and returns the same reference. The 189 return value is for easy access to communication variables. For example the 190 next value from a count-up can be generated and collected in a single 191 expression: \codeCFA{resume(count).next}. 214 unsigned int next; // communication variable 215 } 216 CountUp countup; 217 \end{cfa} 218 Each coroutine has a @main@ function, which takes a reference to a coroutine 219 object and returns @void@. 220 \begin{cfa}[numbers=left] 221 void main(@CountUp & this@) { // argument matches trait is_coroutine 222 unsigned int up = 0; // retained between calls 223 while (true) { 224 next = up; // make "up" available outside function 225 @suspend;@$\label{suspend}$ 226 up += 1; 227 } 228 } 229 \end{cfa} 230 In this function, or functions called by this function (helper functions), the 231 @suspend@ statement is used to return execution to the coroutine's caller 232 without terminating the coroutine. 233 234 A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@. 235 The first resume calls the @main@ function at the top. Thereafter, resume calls 236 continue a coroutine in the last suspended function after the @suspend@ 237 statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes 238 a reference to the coroutine structure and returns the same reference. The 239 return value allows easy access to communication variables defined in the 240 coroutine object. For example, the @next@ value for coroutine object @countup@ 241 is both generated and collected in the single expression: 242 @resume(countup).next@. 192 243 193 244 \subsection{Monitors and Mutex} 194 195 True concurrency does not garrenty ordering. To get some of that ordering back 196 \CFA uses monitors and mutex (mutual exclution) parameters. A monitor is 197 another special declaration that contains a lock and is compatable with mutex 198 parameters. 199 200 Function parameters can have the \codeCFA{mutex} qualifiers on reference 201 arguments, for example \codeCFA{void example(a_monitor & mutex arg);}. When the 202 function is called it will acquire the lock on all of the mutex parameters. 203 204 This means that all functions that mutex on a type are part of a critical 205 section and only one will ever run at a time. 245 Concurrency does not guarantee ordering; without ordering results are 246 non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@ 247 (mutual exclusion) parameters. A monitor is another kind of aggregate, where 248 the compiler implicitly inserts a lock and instances are compatible with 249 @mutex@ parameters. 250 251 A function that requires deterministic (ordered) execution acquires mutual 252 exclusion on a monitor object by qualifying an object reference parameter with 253 @mutex@. 254 \begin{cfa} 255 void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB); 256 \end{cfa} 257 When the function is called, it implicitly acquires the monitor lock for all of 258 the mutex parameters without deadlock.
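As an illustration only (the @Counter@ monitor is hypothetical; its declaration follows the aggregate pattern used for @coroutine@ above):
\begin{cfa}
monitor Counter { int count; };
void increment( Counter & @mutex@ this ) { // monitor lock acquired at call
	this.count += 1; // critical section
} // lock released on return
\end{cfa}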
This semantics means all functions with 259 the same mutex type(s) are part of a critical section for objects of that type 260 and only one runs at a time. 206 261 207 262 \subsection{Threads} 208 While coroutines allow new things to be done with a single execution path 209 threads actually introduce new paths of execution that continue independently. 210 Now for threads to work together their must be some communication between them 211 and that means the timing of certain operations does have to be known. There 212 or various means of syncronization and mutual exclution provided by \CFA but 213 for exceptions only the basic two -- fork and join -- are needed. 214 215 Threads are created like coroutines except the keyword is changed: 216 \begin{lstlisting} 263 Functions, generators, and coroutines are sequential so there is only a single 264 (but potentially sophisticated) execution path in a program. Threads introduce 265 multiple execution paths that continue independently. 266 267 For threads to work safely with objects requires mutual exclusion using 268 monitors and mutex parameters. For threads to work safely with other threads 269 requires mutual exclusion in the form of a communication rendezvous, which 270 also supports internal synchronization as for mutex objects. For exceptions, 271 only the two basic operations are important: thread fork and join. 272 273 Threads are created like coroutines with an associated @main@ function: 274 \begin{cfa} 217 275 thread StringWorker { 218 219 276 const char * input; 277 int result; 220 278 }; 221 222 279 void main(StringWorker & this) { 223 const char * localCopy = this.input; 224 // ... do some work, perhaps hashing the string ... 225 this.result = result; 226 } 227 \end{lstlisting} 228 The main function will start executing after the fork operation and continue 229 executing until it is finished. If another thread joins with this one it will 230 wait until main has completed execution. In other words everything the thread 231 does is between fork and join. 232 233 From the outside this is the creation and destruction of the thread object. 234 Fork happens after the constructor is run and join happens before the 235 destructor runs. Join also happens during the \codeCFA{join} function which 236 can be used to join a thread earlier. If it is used the destructor does not 237 join as that has already been completed. 280 const char * localCopy = this.input; 281 // ... do some work, perhaps hashing the string ... 282 this.result = result; 283 } 284 { 285 StringWorker stringworker; // fork thread running in "main" 286 } // implicitly join with thread $\(\Rightarrow\)$ wait for completion 287 \end{cfa} 288 The thread main is where a new thread starts execution after a fork operation 289 and then the thread continues executing until it is finished. If another thread 290 joins with an executing thread, it waits until the executing main completes 291 execution. In other words, everything a thread does is between a fork and join. 292 293 From the outside, this behaviour is accomplished through creation and 294 destruction of a thread object. Implicitly, fork happens after a thread 295 object's constructor is run and join happens before the destructor runs. Join 296 can also be specified explicitly using the @join@ function to wait for a 297 thread's completion independently from its deallocation (\ie destructor 298 call). If @join@ is called explicitly, the destructor does not implicitly join.
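As a sketch of the explicit form (reusing @StringWorker@ from above, and assuming a generated field-wise constructor can set @input@ before the fork):
\begin{cfa}
{
	StringWorker sw = { "some text" }; // construct, then fork
	// ... caller continues concurrently ...
	join( sw ); // explicit join: wait for sw's main to finish
	int r = sw.result; // safe: the thread has completed
} // destructor runs without a second join
\end{cfa} -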
doc/theses/andrew_beach_MMath/features.tex
rb6a8b31 rd95969a 1 \chapter{Features} 2 3 This chapter covers the design and user interface of the \CFA exception 4 handling mechanism. 5 6 \section{Virtual Casts} 7 8 Virtual casts and virtual types are not truly part of the exception system but 9 they did not exist in \CFA and are useful in exceptions. So a minimal version 10 of they virtual system was designed and implemented. 11 12 Virtual types are organizied in simple hierarchies. Each virtual type may have 13 a parent and can have any number of children. A type's descendants are its 14 children and its children's descendants. A type may not be its own descendant. 15 16 Each virtual type has an associated virtual table type. A virtual table is a 17 structure that has fields for all the virtual members of a type. A virtual 18 type has all the virtual members of its parent and can add more. It may also 19 update the values of the virtual members. 20 21 Except for virtual casts, this is only used internally in the exception 22 system. There is no general purpose interface for the other features. A 23 a virtual cast has the following syntax: 24 25 \begin{lstlisting} 1 \chapter{Exception Features} 2 3 This chapter covers the design and user interface of the \CFA 4 exception-handling mechanism. 5 6 \section{Virtuals} 7 Virtual types and casts are not required for a basic exception-system but are 8 useful for advanced exception features. However, \CFA is not object-oriented so 9 there is no obvious concept of virtuals. Hence, to create advanced exception 10 features for this work, I needed to design and implement a virtual-like 11 system for \CFA. 12 13 Object-oriented languages often organize exceptions into a simple hierarchy, 14 \eg Java. 15 \begin{center} 16 \setlength{\unitlength}{4000sp}% 17 \begin{picture}(1605,612)(2011,-1951) 18 \put(2100,-1411){\vector(1, 0){225}} 19 \put(3450,-1411){\vector(1, 0){225}} 20 \put(3550,-1411){\line(0,-1){225}} 21 \put(3550,-1636){\vector(1, 0){150}} 22 \put(3550,-1636){\line(0,-1){225}} 23 \put(3550,-1861){\vector(1, 0){150}} 24 \put(2025,-1490){\makebox(0,0)[rb]{\LstBasicStyle{exception}}} 25 \put(2400,-1460){\makebox(0,0)[lb]{\LstBasicStyle{arithmetic}}} 26 \put(3750,-1460){\makebox(0,0)[lb]{\LstBasicStyle{underflow}}} 27 \put(3750,-1690){\makebox(0,0)[lb]{\LstBasicStyle{overflow}}} 28 \put(3750,-1920){\makebox(0,0)[lb]{\LstBasicStyle{zerodivide}}} 29 \end{picture}% 30 \end{center} 31 The hierarchy provides the ability to handle an exception at different degrees 32 of specificity (left to right). Hence, it is possible to catch a more general 33 exception-type in higher-level code where the implementation details are 34 unknown, which reduces tight coupling to the lower-level implementation. 35 Otherwise, low-level code changes require higher-level code changes, \eg, 36 changing from raising @underflow@ to @overflow@ at the low level means changing 37 the matching catch at the high level versus catching the general @arithmetic@ 38 exception. In detail, each virtual type may have a parent and can have any 39 number of children. A type's descendants are its children and its children's 40 descendants. A type may not be its own descendant. 41 42 The exception hierarchy allows a handler (@catch@ clause) to match multiple 43 exceptions, \eg a base-type handler catches both base and derived 44 exception-types. 45 \begin{cfa} 46 try { 47 ... 48 } catch(arithmetic &) { 49 ...
// handle arithmetic, underflow, overflow, zerodivide 50 } 51 \end{cfa} 52 Most exception mechanisms perform a linear search of the handlers and select 53 the first matching handler, so the order of handlers is now important because 54 matching is many to one. 55 56 Each virtual type needs an associated virtual table. A virtual table is a 57 structure with fields for all the virtual members of a type. A virtual type has 58 all the virtual members of its parent and can add more. It may also update the 59 values of the virtual members and often does. 60 61 While much of the virtual infrastructure is created, it is currently only used 62 internally for exception handling. The only user-level feature is the virtual 63 cast, which is the same as the \CC \lstinline[language=C++]|dynamic_cast|. 64 \begin{cfa} 26 65 (virtual TYPE)EXPRESSION 27 \end{lstlisting} 28 29 This has the same precidence as a traditional C-cast and can be used in the 30 same places. This will convert the result of EXPRESSION to the type TYPE. Both 31 the type of EXPRESSION and TYPE must be pointers to virtual types. 32 33 The cast is checked and will either return the original value or null, based 34 on the result of the check. The check is does the object pointed at have a 35 type that is a descendant of the target type. If it is the result is the 36 pointer, otherwise the result is null. 37 38 \section{Exceptions} 66 \end{cfa} 67 Note, the syntax and semantics match a C-cast, rather than the unusual \CC 68 syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be a 69 pointer to a virtual type. The cast dynamically checks if the @EXPRESSION@ type 70 is the same or a subtype of @TYPE@, and if true, returns a pointer to the 71 @EXPRESSION@ object, otherwise it returns @0p@ (null pointer). 72 73 \section{Exception} 39 74 % Leaving until later, hopefully it can talk about actual syntax instead 40 75 % of my many strange macros. Syntax aside I will also have to talk about the 41 76 % features all exceptions support. 42 77 43 \section{Termination} 44 45 Termination exception throws are likely the most framilar kind, as they are 46 used in several popular programming languages. A termination will throw an 47 exception, search the stack for a handler, unwind the stack to where the 48 handler is defined, execute the handler and then continue execution after 49 the handler. They are used when execution cannot continue here. 50 51 Termination has two pieces of syntax it uses. The first is the throw: 52 \begin{lstlisting} 78 Exceptions are defined by the trait system; there are a series of traits, and 79 if a type satisfies them, then it can be used as an exception. The following 80 is the base trait all exceptions need to match. 81 \begin{cfa} 82 trait is_exception(exceptT &, virtualT &) { 83 virtualT const & @get_exception_vtable@(exceptT *); 84 }; 85 \end{cfa} 86 The function takes any pointer, including the null pointer, and returns a 87 reference to the virtual-table object. Defining this function also introduces 88 the virtual type and a virtual-table pair to the \CFA type-resolver and 89 promises @exceptT@ is a virtual type and a child of the base exception-type. 90 91 {\color{blue} PAB: I do not understand this paragraph.} 92 One odd thing about @get_exception_vtable@ is that it should always be a 93 constant function, returning the same value regardless of its argument.
A 94 pointer or reference to the virtual table instance could be used instead; 95 however, using a function has some ease of implementation advantages and allows 96 for easier disambiguation because the virtual type name (or the address of an 97 instance that is in scope) can be used instead of the mangled virtual table 98 name. Also note the use of the word ``promise'' in the trait 99 description. Currently, \CFA cannot check to see if either @exceptT@ or 100 @virtualT@ match the layout requirements. This is considered part of 101 @get_exception_vtable@'s correct implementation. 102 103 \section{Raise} 104 \CFA provides two kinds of exception raise: termination (see 105 \VRef{s:Termination}) and resumption (see \VRef{s:Resumption}), which are 106 specified with the following traits. 107 \begin{cfa} 108 trait is_termination_exception( 109 exceptT &, virtualT & | is_exception(exceptT, virtualT)) { 110 void @defaultTerminationHandler@(exceptT &); 111 }; 112 \end{cfa} 113 The function is required to allow a termination raise, but is only called if a 114 termination raise does not find an appropriate handler. 115 116 Allowing a resumption raise is similar. 117 \begin{cfa} 118 trait is_resumption_exception( 119 exceptT &, virtualT & | is_exception(exceptT, virtualT)) { 120 void @defaultResumptionHandler@(exceptT &); 121 }; 122 \end{cfa} 123 The function is required to allow a resumption raise, but is only called if a 124 resumption raise does not find an appropriate handler. 125 126 Finally, there are three convenience macros for referring to these traits: 127 @IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@. Each 128 takes the virtual type's name, and for polymorphic types only, the 129 parenthesized list of polymorphic arguments. These macros do the name mangling 130 to get the virtual-table name and provide the arguments to both sides. 131 {\color{blue}(PAB: What's a ``side''?)} 132 133 \subsection{Termination} 134 \label{s:Termination} 135 136 Termination raise, called ``throw'', is familiar and used in most programming 137 languages with exception handling. The semantics of termination is: search the 138 stack for a matching handler, unwind the stack frames to the matching handler, 139 execute the handler, and continue execution after the handler. Termination is 140 used when execution \emph{cannot} return to the throw. To continue execution, 141 the program must \emph{recover} in the handler from the failed (unwound) 142 execution at the raise to safely proceed after the handler. 143 144 A termination raise is started with the @throw@ statement: 145 \begin{cfa} 53 146 throw EXPRESSION; 54 \end{lstlisting} 55 56 The expression must evaluate to a reference to a termination exception. A 57 termination exception is any exception with a 58 \codeCFA{void defaultTerminationHandler(T &);} (the default handler) defined 59 on it. The handler is taken from the call sight with \CFA's trait system and 60 passed into the exception system along with the exception itself. 61 62 The exception passed into the system is then copied into managed memory. 63 This is to ensure it remains in scope during unwinding. It is the user's 64 responsibility to make sure the original exception is freed when it goes out 65 of scope. Being allocated on the stack is sufficient for this. 66 67 Then the exception system will search the stack starting from the throw and 68 proceding towards the base of the stack, from callee to caller.
As it goes 69 it will check any termination handlers it finds: 70 71 \begin{lstlisting} 72 try { 73 TRY_BLOCK 74 } catch (EXCEPTION_TYPE * NAME) { 75 HANDLER 76 } 77 \end{lstlisting} 78 79 This shows a try statement with a single termination handler. The statements 80 in TRY\_BLOCK will be executed when control reaches this statement. While 81 those statements are being executed if a termination exception is thrown and 82 it is not handled by a try statement further up the stack the EHM will check 83 all of the terminations handlers attached to the try block, top to bottom. 84 85 At each handler the EHM will check to see if the thrown exception is a 86 descendant of EXCEPTION\_TYPE. If it is the pointer to the exception is 87 bound to NAME and the statements in HANDLER are executed. If control reaches 88 the end of the handler then it exits the block, the exception is freed and 89 control continues after the try statement. 90 91 The default handler is only used if no handler for the exception is found 92 after the entire stack is searched. When that happens the default handler 93 is called with a reference to the exception as its only argument. If the 94 handler returns control continues from after the throw statement. 95 96 \paragraph{Conditional Catches} 97 98 Catch clauses may also be written as: 99 \begin{lstlisting} 100 catch (EXCEPTION_TYPE * NAME ; CONDITION) 101 \end{lstlisting} 102 This has the same behaviour as a regular catch clause except that if the 103 exception matches the given type the condition is also run. If the result is 104 true only then is this considered a matching handler. If the result is false 105 then the handler does not match and the search continues with the next clause 106 in the try block. 107 108 The condition considers all names in scope at the beginning of the try block 109 to be in scope along with the name introduce in the catch clause itself. 110 111 \paragraph{Re-Throwing} 112 113 You can also rethrow the most recent termination exception with 114 \codeCFA{throw;}. % This is terrible and you should never do it. 115 This can be done in a handler or any function that could be called from a 116 handler. 117 118 This will start another termination throw reusing the exception, meaning it 119 does not copy the exception or allocated any more memory for it. However the 120 default handler is still at the original through and could refer to data that 121 was on the unwound section of the stack. So instead a new default handler that 122 does a program level abort is used. 123 124 \section{Resumption} 125 126 Resumption exceptions are less popular then termination but in many 127 regards are simpler and easier to understand. A resumption throws an exception, 128 searches for a handler on the stack, executes that handler on top of the stack 129 and then continues execution from the throw. These are used when a problem 130 needs to be fixed before execution continues. 131 132 A resumption is thrown with a throw resume statement: 133 \begin{lstlisting} 147 \end{cfa} 148 The expression must return a termination-exception reference, where the 149 termination exception has a type with a @void defaultTerminationHandler(T &)@ 150 (default handler) defined. The handler is found at the call site using \CFA's 151 trait system and passed into the exception system along with the exception 152 itself. 
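As an illustrative sketch only (@SomeError@ is a hypothetical exception type assumed to satisfy @is_termination_exception@; its declaration and virtual table are elided):
\begin{cfa}
void f( int fd ) {
	if ( fd < 0 ) {
		SomeError err; // exception object allocated on the stack
		throw err; // termination raise: the handler search starts here
	}
}
\end{cfa}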
153 154 At runtime, a representation of the exception type and an instance of the 155 exception type are copied into managed memory (heap) to ensure they remain in 156 scope during unwinding. It is the user's responsibility to ensure the original 157 exception object at the throw is freed when it goes out of scope. Being 158 allocated on the stack is sufficient for this. 159 160 Then the exception system searches the stack starting from the throw and 161 proceeding towards the base of the stack, from callee to caller. At each stack 162 frame, a check is made for termination handlers defined by the @catch@ clauses 163 of a @try@ statement. 164 \begin{cfa} 165 try { 166 GUARDED_BLOCK 167 } @catch (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // termination handler 1 168 HANDLER_BLOCK$\(_1\)$ 169 } @catch (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // termination handler 2 170 HANDLER_BLOCK$\(_2\)$ 171 } 172 \end{cfa} 173 The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any 174 functions invoked from those statements, throw an exception, and the exception 175 is not handled by a try statement further up the stack, the termination 176 handlers are searched for a matching exception type from top to bottom. 177 178 Exception matching checks the representation of the thrown exception-type is 179 the same or a descendant type of the exception types in the handler clauses. If 180 there is a match, a pointer to the exception object created at the throw is 181 bound to @NAME@ and the statements in the associated @HANDLER_BLOCK@ are 182 executed. If control reaches the end of the handler, the exception is freed, 183 and control continues after the try statement. 184 185 The default handler visible at the throw statement is used if no matching 186 termination handler is found after the entire stack is searched. At that point, 187 the default handler is called with a reference to the exception object 188 generated at the throw. If the default handler returns, the system default 189 action is executed, which often terminates the program. This feature allows 190 each exception type to define its own action, such as printing an informative 191 error message, when an exception is not handled in the program. 192 193 \subsection{Resumption} 194 \label{s:Resumption} 195 196 Resumption raise, called ``resume'', is as old as termination 197 raise~\cite{Goodenough75} but is less popular. In many ways, resumption is 198 simpler and easier to understand, as it is simply a dynamic call (as in 199 Lisp). The semantics of resumption is: search the stack for a matching handler, 200 execute the handler, and continue execution after the resume. Notice, the stack 201 cannot be unwound because execution returns to the raise point. Resumption is 202 used when execution \emph{can} return to the resume. To continue 203 execution, the program must \emph{correct} in the handler for the failed 204 execution at the raise so execution can safely continue after the resume. 205 206 A resumption raise is started with the @throwResume@ statement: 207 \begin{cfa} 134 208 throwResume EXPRESSION; 135 \end{lstlisting} 136 The result of EXPRESSION must be a resumption exception type. A resumption 137 exception type is any type that satifies the assertion 138 \codeCFA{void defaultResumptionHandler(T &);} (the default handler). When the 139 statement is executed the expression is evaluated and the result is thrown.
140 141 Handlers are declared using clauses in try statements: 142 \begin{lstlisting} 143 try { 144 TRY_BLOCK 145 } catchResume (EXCEPTION_TYPE * NAME) { 146 HANDLER 147 } 148 \end{lstlisting} 149 This is a simple example with the try block and a single resumption handler. 150 Multiple resumption handlers can be put in a try statement and they can be 151 mixed with termination handlers. 152 153 When a resumption begins it will start searching the stack starting from 154 the throw statement and working its way to the callers. In each try statement 155 handlers will be tried top to bottom. Each handler is checked by seeing if 156 the thrown exception is a descendant of EXCEPTION\_TYPE. If not the search 157 continues. Otherwise NAME is bound to a pointer to the exception and the 158 HANDLER statements are executed. After they are finished executing control 159 continues from the throw statement. 160 161 If no approprate handler is found then the default handler is called. The 162 throw statement acts as a regular function call passing the exception to 163 the default handler and after the handler finishes executing control continues 164 from the throw statement. 165 166 The exception system also tracks the position of a search on the stack. If 167 another resumption exception is thrown while a resumption handler is running 168 it will first check handlers pushed to the stack by the handler and any 169 functions it called, then it will continue from the try statement that the 170 handler is a part of; except for the default handler where it continues from 171 the throw the default handler was passed to. 172 173 This makes the search pattern for resumption reflect the one for termination, 174 which is what most users expect. 175 176 % This might need a diagram. But it is an important part of the justifaction 209 \end{cfa} 210 The semantics of the @throwResume@ statement are like the @throw@, but the 211 expression has a type with a @void defaultResumptionHandler(T &)@ (default 212 handler) defined, where the handler is found at the call site by the type 213 system. At runtime, a representation of the exception type and an instance of 214 the exception type are \emph{not} copied because the stack is maintained during 215 the handler search. 216 217 Then the exception system searches the stack starting from the resume and 218 proceeding towards the base of the stack, from callee to caller. At each stack 219 frame, a check is made for resumption handlers defined by the @catchResume@ 220 clauses of a @try@ statement. 221 \begin{cfa} 222 try { 223 GUARDED_BLOCK 224 } @catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // resumption handler 1 225 HANDLER_BLOCK$\(_1\)$ 226 } @catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // resumption handler 2 227 HANDLER_BLOCK$\(_2\)$ 228 } 229 \end{cfa} 230 The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any 231 functions invoked from those statements, resume an exception, and the 232 exception is not handled by a try statement further up the stack, the 233 resumption handlers are searched for a matching exception type from top to 234 bottom. (Note, termination and resumption handlers may be intermixed in a @try@ 235 statement but the kind of raise (throw/resume) only matches with the 236 corresponding kind of handler clause.) 237 238 The exception search and matching for resumption is the same as for 239 termination, including exception inheritance.
The difference is when control 240 reaches the end of the handler: the resumption handler returns after the resume 241 rather than after the try statement. The resume point assumes the handler has 242 corrected the problem so execution can safely continue. 243 244 Like termination, if no resumption handler is found, the default handler 245 visible at the resume statement is called, and the system default action is 246 executed. 247 248 For resumption, the exception system uses stack marking to partition the 249 resumption search. If another resumption exception is raised in a resumption 250 handler, the second exception search does not start at the point of the 251 original raise. (Remember the stack is not unwound and the current handler is 252 at the top of the stack.) The search for the second resumption starts at the 253 current point on the stack because new try statements may have been pushed by 254 the handler or functions called from the handler. If there is no match back to 255 the point of the current handler, the search skips the stack frames already 256 searched by the first resume and continues after the try statement. The default 257 handler always continues from the default handler associated with the point where 258 the exception is created. 259 260 % This might need a diagram. But it is an important part of the justification 177 261 % of the design of the traversal order. 178 It also avoids the recursive resumption problem. If the entire stack is 179 searched loops of resumption can form. Consider a handler that handles an 180 exception of type A by resuming an exception of type B and on the same stack, 181 later in the search path, is a second handler that handles B by resuming A. 182 183 Assuming no other handlers on the stack handle A or B then in either traversal 184 system an A resumed from the top of the stack will be handled by the first 185 handler. A B resumed from the top or from the first handler it will be handled 186 by the second hander. The only difference is when A is thrown from the second 187 handler. The entire stack search will call the first handler again, creating a 188 loop. Starting from the position in the stack though will break this loop. 189 190 \paragraph{Conditional Catches} 191 192 Resumption supports conditional catch clauses like termination does. They 193 use the same syntax except the keyword is changed: 194 \begin{lstlisting} 195 catchResume (EXCEPTION_TYPE * NAME ; CONDITION) 196 \end{lstlisting} 197 198 It also has the same behaviour, after the exception type has been matched 199 with the EXCEPTION\_TYPE the CONDITION is evaluated with NAME in scope. If 200 the result is true then the hander is run, otherwise the search continues 201 just as if there had been a type mismatch. 202 203 \paragraph{Re-Throwing} 204 205 You may also re-throw resumptions with a \codeCFA{throwResume;} statement. 206 This can only be done from inside of a \codeCFA{catchResume} block. 207 208 Outside of any side effects of any code already run in the handler this will 209 have the same effect as if the exception had not been caught in the first 210 place. 262 \begin{verbatim} 263 throwResume2 ----------. 264 | | 265 generated from handler | 266 | | 267 handler | 268 | | 269 throwResume1 -----. : 270 | | : 271 try | : search skip 272 | | : 273 catchResume <----' : 274 | | 275 \end{verbatim} 276 277 This resumption search-pattern reflects the one for termination, which matches 278 with programmer expectations.
However, it avoids the \emph{recursive 279 resumption} problem. If parts of the stack are searched multiple times, loops 280 can easily form resulting in infinite recursion. 281 282 Consider the trivial case: 283 \begin{cfa} 284 try { 285 throwResume$\(_1\)$ (E &){}; 286 } catch( E * ) { 287 throwResume; 288 } 289 \end{cfa} 290 Based on termination semantics, programmer expectation is for the re-resume to 291 continue searching the stack frames after the try statement. However, the 292 current try statement is still on the stack below the handler issuing the 293 reresume (see \VRef{s:Reraise}). Hence, the try statement catches the re-raise 294 again and does another re-raise \emph{ad infinitum}, which is confusing and 295 difficult to debug. The \CFA resumption search-pattern skips the try statement 296 so the reresume search continues after the try, matching programmer 297 expectation. 298 299 \section{Conditional Catch} 300 Both termination and resumption handler-clauses may perform conditional matching: 301 \begin{cfa} 302 catch (EXCEPTION_TYPE * NAME ; @CONDITION@) 303 \end{cfa} 304 First, the same semantics is used to match the exception type. Second, if the 305 exception matches, @CONDITION@ is executed. The condition expression may 306 reference all names in scope at the beginning of the try block and @NAME@ 307 introduced in the handler clause. If the condition is true, then the handler 308 matches. Otherwise, the exception search continues at the next appropriate kind 309 of handler clause in the try block. 310 \begin{cfa} 311 try { 312 f1 = open( ... ); 313 f2 = open( ... ); 314 ... 315 } catch( IOFailure * f ; fd( f ) == f1 ) { 316 // only handle IO failure for f1 317 } 318 \end{cfa} 319 Note, catching @IOFailure@, checking for @f1@ in the handler, and reraising the 320 exception if not @f1@ is different because the reraise does not examine any of the 321 remaining handlers in the current try statement. 322 323 \section{Reraise} 324 \label{s:Reraise} 325 Within the handler block or functions called from the handler block, it is 326 possible to reraise the most recently caught exception with @throw@ or 327 @throwResume@, respectively. 328 \begin{cfa} 329 catch( ... ) { 330 ... throw; // rethrow 331 } catchResume( ... ) { 332 ... throwResume; // reresume 333 } 334 \end{cfa} 335 The only difference between a raise and a reraise is that reraise does not 336 create a new exception; instead it continues using the current exception, \ie 337 no allocation and copy. However, the default handler is still set to the one 338 visible at the raise point, and hence, for termination could refer to data that 339 is part of an unwound stack frame. To prevent this problem, a new default 340 handler is generated that does a program-level abort. 341 211 342 212 343 \section{Finally Clauses} 213 214 A \codeCFA{finally} clause may be placed at the end of a try statement after 215 all the handler clauses. In the simply case, with no handlers, it looks like 216 this: 217 218 \begin{lstlisting} 219 try { 220 TRY_BLOCK 344 A @finally@ clause may be placed at the end of a @try@ statement. 345 \begin{cfa} 346 try { 347 GUARDED_BLOCK 348 } ... // any number or kind of handler clauses 221 349 } finally { 222 FINAL_STATEMENTS 223 } 224 \end{lstlisting} 225 226 Any number of termination handlers and resumption handlers may proceed the 227 finally clause. 228 229 The FINAL\_STATEMENTS, the finally block, are executed whenever the try 230 statement is removed from the stack.
This includes: the TRY\_BLOCK finishes 231 executing, a termination exception finishes executing and the stack unwinds. 232 233 Execution of the finally block should finish by letting control run off 234 the end of the block. This is because after the finally block is complete 235 control will continue to where ever it would if the finally clause was not 236 present. 237 238 Because of this local control flow out of the finally block is forbidden. 239 The compiler rejects uses of \codeCFA{break}, \codeCFA{continue}, 240 \codeCFA{fallthru} and \codeCFA{return} that would cause control to leave 241 the finally block. Other ways to leave the finally block - such as a long 242 jump or termination - are much harder to check, at best requiring additional 243 runtime overhead, and so are merely discouraged. 350 FINALLY_BLOCK 351 } 352 \end{cfa} 353 The @FINALLY_BLOCK@ is executed when the try statement is unwound from the 354 stack, \ie when the @GUARDED_BLOCK@ or any handler clause finishes. Hence, the 355 finally block is always executed. 356 357 Execution of the finally block should always finish, meaning control runs off 358 the end of the block. This requirement ensures execution always continues as if the 359 finally clause is not present, \ie finally is for cleanup not changing control 360 flow. Because of this requirement, local control flow out of the finally block 361 is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or 362 @return@ that causes control to leave the finally block. Other ways to leave 363 the finally block, such as a long jump or termination, are much harder to check, 364 at best requiring additional run-time overhead, and so are discouraged. 244 365 245 366 \section{Cancellation} 246 247 Cancellation can be thought of as a stack-level abort or as an uncatchable 248 termination. It unwinds the entirety of the current exception and if possible 249 passes an exception to a different stack as a message. 250 251 There is no special statement for starting a cancellation, instead you call 252 the standard libary function \codeCFA{cancel\_stack} which takes an exception. 253 Unlike in a throw this exception is not used in control flow but is just there 254 to pass information about why the cancellation happened. 255 256 The handler is decided entirely by which stack is being cancelled. There are 257 three handlers that apply to three different groups of stacks: 258 \begin{itemize} 259 \item Main Stack: 260 The main stack is the one on which the program main is called at the beginning 261 of your program. It is also the only stack you have without the libcfathreads. 262 263 Because of this there is no other stack ``above" (or possibly at all) for main 264 to notify when a cancellation occurs. So after the stack is unwound we do a 265 program level abort. 266 267 \item Thread Stack: 268 Thread stacks are those created \codeCFA{thread} or otherwise satify the 269 \codeCFA{is\_thread} trait. 270 271 Threads only have two structural points of communication that must happen, 272 start and join. As the thread must be running to preform a cancellation it 273 will be after start and before join, so join is one cancellation uses. 274 275 After the stack is unwound the thread will halt as if had completed normally 276 and wait for another thread to join with it. The other thread, when it joins, 277 checks for a cancellation. If so it will throw the resumption exception 278 \codeCFA{ThreadCancelled}.
There is a difference here in how explicate joins (with the \codeCFA{join} 280 281 function) and implicate joins (from a destructor call). Explicate joins will 282 take the default handler (\codeCFA{defaultResumptionHandler}) from the context 283 and use like a regular through does if the exception is not caught. The 284 implicate join does a program abort instead. 285 286 This is for safety. One of the big problems in exceptions is you cannot handle 287 two terminations or cancellations on the same stack as either can destroy the 288 context required for the other. This can happen with join but as the 289 destructors will always be run when the stack is being unwound and one 290 termination/cancellation is already active. Also since they are implicite they 291 are easier to forget about. 292 293 \item Coroutine Stack: 294 Coroutine stacks are those created with \codeCFA{coroutine} or otherwise 295 satify the \codeCFA{is\_coroutine} trait. 296 297 A coroutine knows of two other coroutines, its starter and its last resumer. 298 The last resumer is ``closer" so that is the one notified. 299 300 After the stack is unwound control goes to the last resumer. 301 Resume will resume throw a \codeCFA{CoroutineCancelled} exception, which is 302 polymorphic over the coroutine type and has a pointer to the coroutine being 303 cancelled and the cancelling exception. The resume function also has an 304 assertion that the \codeCFA{defaultResumptionHandler} for the exception. So it 305 will use the default handler like a regular throw. 306 307 \end{itemize} 367 Cancellation is a stack-level abort, which can be thought of as an 368 uncatchable termination. It unwinds the entirety of the current stack, and if 369 possible forwards the cancellation exception to a different stack. 370 371 There is no special statement for starting a cancellation; instead the standard 372 library function @cancel_stack@ is called passing an exception. Unlike a 373 raise, this exception is not used in matching, only to pass information about 374 the cause of the cancellation. 375 376 Handling of a cancellation depends on which stack is being cancelled. 377 \begin{description} 378 \item[Main Stack:] 379 380 The main stack is the one used by the program main at the start of execution, 381 and is the only stack in a sequential program. Hence, when cancellation is 382 forwarded to the main stack, there is no other forwarding stack, so after the 383 stack is unwound, there is a program-level abort. 384 385 \item[Thread Stack:] 386 A thread stack is created for a @thread@ object or object that satisfies the 387 @is_thread@ trait. A thread only has two points of communication that must 388 happen: start and join. As the thread must be running to perform a 389 cancellation, it must occur after start and before join, so join is a 390 cancellation point. After the stack is unwound, the thread halts and waits for 391 another thread to join with it. The joining thread checks for a cancellation 392 and, if present, resumes exception @ThreadCancelled@. 393 394 There is a subtle difference between the explicit join (@join@ function) and 395 implicit join (from a destructor call). The explicit join takes the default 396 handler (@defaultResumptionHandler@) from its calling context, which is used if 397 the exception is not caught. The implicit join does a program abort instead. 398 399 This semantics is for safety.
One difficult problem for any exception system is 400 defining semantics when an exception is raised during an exception search: 401 which exception has priority, the original or new exception? No matter which 402 exception is selected, it is possible for the selected one to disrupt or 403 destroy the context required for the other. This conflict can occur at a join 404 because the thread destructor is always run while the stack is being unwound, 405 so an implicit join may encounter the cancellation when another termination or 406 cancellation is already active. As well, implicit joins are easy to forget 407 about, so a program abort is the safer semantics. 408 409 \item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object 410 or object that satisfies the @is_coroutine@ trait. A coroutine only knows of 411 two other coroutines, its starter and its last resumer. The last resumer has 412 the tightest coupling to the coroutine it activated. Hence, cancellation of 413 the active coroutine is forwarded to the last resumer after the stack is 414 unwound, as the last resumer has the most precise knowledge about the current 415 execution. When the resumer restarts, it resumes exception 416 @CoroutineCancelled@, which is polymorphic over the coroutine type and has a 417 pointer to the cancelled coroutine. 418 419 The resume function also has an assertion that the @defaultResumptionHandler@ 420 is defined for the exception, so it uses the default handler like a regular throw. 421 \end{description} -
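To make the cancellation interface concrete, the following is a minimal sketch of starting a cancellation; the exception-declaration macro @TRIVIAL_EXCEPTION@ and the exact calling convention of @cancel_stack@ are assumptions for illustration, not the confirmed library interface.
\begin{cfa}
TRIVIAL_EXCEPTION(corrupt_state);          // assumed library macro declaring an exception type

void worker( int * shared ) {
	if ( shared == 0p ) {                  // unrecoverable state detected
		// unwind this entire stack; the exception is informational,
		// it is never matched against handlers
		cancel_stack( (corrupt_state){} );
	}
	// ... normal work ...
}
\end{cfa}
If @worker@ runs on a thread stack, the joining thread then observes the cancellation and has @ThreadCancelled@ resumed at the join, as described above.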
doc/theses/andrew_beach_MMath/future.tex
rb6a8b31 rd95969a 8 8 parts of the exception system that use the current version. 9 9 10 For instance a full virtual system would probably allow for several 11 improvements to the exception traits. Although they do currently work they 12 could be made easier to use by making the virtual table type implitate in the 13 trait (which would remove the need for those wrapper marcos) or allowing 14 for assertions that give the layout of a virtual table for safety. 10 There are several improvements to the virtual system that would improve 11 the exception traits. The biggest one is an assertion that checks that one 12 virtual type is a child of another virtual type. This would capture many of 13 the requirements much more precisely. 14 15 The full virtual system might also include other improvements, such as associated 16 types. This is a proposed feature that would allow traits to refer to types 17 not listed in their header. This would allow the exception traits to not 18 refer to the virtual table type explicitly, which would remove the need for 19 the interface macros. 15 20 16 21 \section{Additional Throws} 17 Several other kinds of throws, beyond the termination throw ( \codeCFA{throw}), 18 the resumption throw ( \codeCFA{throwResume}) and the re-throws, were considered. 22 Several other kinds of throws, beyond the termination throw (@throw@), 23 the resumption throw (@throwResume@) and the re-throws, were considered. 19 24 None were as useful as the core throws but they would likely be worth 20 25 revisiting. … 65 70 66 71 Also new techniques to skip previously searched parts of the stack will have 67 to be developed. 72 to be developed. The recursive resume problem also remains, and ideally the 73 same pattern of ignoring sections of the stack could be reused. 68 74 69 \section{Support for More Platforms} 70 Termination is not portable because it is implemented with inline assembly. 71 Those sections will have to be rewritten to support different architectures 75 \section{Signal Exceptions} 76 \textit{Exception Handling: Issues and a Proposed Notation} suggests there are three 77 types of exceptions: escape, notify and signal. 78 Escape exceptions are our termination exceptions, notify exceptions are 79 resumption exceptions and that leaves signal exceptions unimplemented. 72 80 73 \section{Quality-of-Life Improvements} 74 Finally come various improvements to the usability of \CFA. Most of these 75 would just require time. Time that would not lead to interesting research so 76 it has been left aside for now. A few examples are included here but there 77 are more: 81 Signal exceptions allow either behaviour: after the exception is 82 handled, control can either return to the throw or continue from where the handler is 83 defined. 84 85 The design should be re-examined and updated for \CFA. A very direct 86 translation would perhaps have a new throw and catch pair and a statement 87 (or statements) could be used to decide if the handler returns to the throw 88 or continues where it is, but there are other options. 89 90 For instance, resumption could be extended to cover this use by allowing 91 local control flow out of it. This would require an unwind as part of the 92 transition as there are stack frames that have to be removed. 93 This would mean there is no notify-like throw; but because \CFA does not have 94 exception signatures, a termination can already be thrown from any resumption handler, 95 so there are existing ways to approximate this behaviour in \CFA.
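As a thought experiment, a direct translation along the lines suggested above might look as follows; the @throwSignal@/@catchSignal@ pair and the two exit statements are entirely hypothetical syntax, used only to make the two behaviours visible, and @retryable@ is a made-up helper.
\begin{cfa}
try {
	throwSignal (io_stall){};       // hypothetical signal raise
} catchSignal ( io_stall * e ) {
	if ( retryable( e ) ) resume;   // hypothetical: return to the raise point
	else escape;                    // hypothetical: unwind, continue after handler
}
\end{cfa}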
96 97 % Maybe talk about the escape; and escape CONTROL_STMT; statements or how 98 % if we could choose if _Unwind_Resume proceeded to the clean-up stage this 99 % would be much easier to implement. 100 101 \section{Language Improvements} 102 There is also a lot of work that is not a research follow-up to this work, 103 some of which involves no interesting research at all, but that would 104 improve \CFA as a programming language. The full list of these would 105 naturally be quite extensive but here are a few examples that involve 106 exceptions: 78 107 79 108 \begin{itemize} 109 \item The implementation of termination is not portable because it includes 110 some assembly statements. These sections will have to be re-written so 111 \CFA has full support on more machines. 80 112 \item Allowing an exception handler to bind the exception to a reference instead 81 113 of a pointer (see the sketch after this list). This should actually result in no change in behaviour so there 82 114 is no reason not to allow it. It is however a small improvement; giving a bit 83 115 of flexibility to the user in what style they want to use. 84 116 \item Enabling local control flow (by @break@, @return@ and 85 117 similar statements) out of a termination handler. The current set-up makes 86 118 this very difficult but the catch function that runs the handler after it has … 88 120 much easier. (To do the same for try blocks would probably wait for zero-cost 89 121 exceptions, which would allow the try block to be inlined as well.) 90 \item Enabling local control flow out of a resumption handler. This would be 91 a weighty operation, causing a stack unwind like a termination, so there might 92 be a different statement or a statement modifier to make sure the user does 93 this purposefully. 94 95 However this would require the more complex system as they cannot be inlined 96 into the original function as they can be run at a different place on the 97 stack. So instead the unwinding will have to carry with it information on 98 which one of these points to continue at and possibly also the return value 99 for the function if a \codeCFA{return} statement was used. 100 122 \end{itemize} -
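The reference-binding improvement mentioned in the list is a one-line change at the use site; the @some_exception@ type is only for illustration, and the reference form is the proposed, not yet supported, syntax.
\begin{cfa}
try { ... } catch ( some_exception * e ) { ... }   // current: handler binds a pointer
try { ... } catch ( some_exception & e ) { ... }   // proposed: handler binds a reference
\end{cfa}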
doc/theses/andrew_beach_MMath/implement.tex
rb6a8b31 rd95969a 9 9 10 10 All of this is accessed through a field inserted at the beginning of every 11 virtual type. Currently it is called \codeC{virtual_table} but it is not 11 virtual type. Currently it is called @virtual_table@ but it is not 12 12 meant to be accessed by the user. This field is a pointer to the type's 13 13 virtual table instance. It is assigned once during the object's construction … 40 40 using that to calculate the mangled name of the parent's virtual table type. 41 41 There are two special fields that are included like normal fields but have 42 special initialization rules: the \codeC{size} field is the type's size and is 43 initialized with a sizeof expression, the \codeC{align} field is the type's 42 special initialization rules: the @size@ field is the type's size and is 43 initialized with a sizeof expression, the @align@ field is the type's 44 44 alignment and uses an alignof expression. The remaining fields are resolved 45 45 to a name matching the field's name and type using the normal visibility … 56 56 The declarations include the virtual type definition and forward declarations 57 57 of the virtual table instance, constructor, message function and 58 \codeCFA{get_exception_vtable}. The definition includes the storage and 58 @get_exception_vtable@. The definition includes the storage and 59 59 initialization of the virtual table instance and the bodies of the three 60 60 functions. … 65 65 from the per-instance information. The virtual table type and most of the 66 66 functions are polymorphic so they are all part of the core. The virtual table 67 instance and the \codeCFA{get_exception_vtable} function. 67 instance and the @get_exception_vtable@ function are the per-instance components. 68 69 Coroutines and threads need instances of \codeCFA{CoroutineCancelled} and 70 \codeCFA{ThreadCancelled} respectively to use all of their functionality. 71 When a new data type is declared with \codeCFA{coroutine} or \codeCFA{thread} 69 Coroutines and threads need instances of @CoroutineCancelled@ and 70 @ThreadCancelled@ respectively to use all of their functionality. 71 When a new data type is declared with @coroutine@ or @thread@ 72 72 the forward declaration for the instance is created as well. The definition 73 73 of the virtual table is created at the definition of the main function. … 79 79 function. 80 80 81 The function is \codeC{__cfa__virtual_cast} and it is implemented in the 81 The function is @__cfa__virtual_cast@ and it is implemented in the 82 82 standard library. It takes a pointer to the target type's virtual table and 83 83 the object pointer being cast. The function is very simple, getting the … 87 87 88 88 For the generated code, a forward declaration of the virtual cast works as follows. 89 There is a forward declaration of \codeC{__cfa__virtual_cast} in every cfa 89 There is a forward declaration of @__cfa__virtual_cast@ in every cfa 90 90 file so it can just be used. The object argument is the expression being cast 91 91 so that is just placed in the argument list. … 110 110 often across functions. 111 111 112 At a very basic level this can be done with \codeC{setjmp} \& \codeC{longjmp} 112 At a very basic level this can be done with @setjmp@ \& @longjmp@ 113 113 which simply move the top of the stack, discarding everything on the stack 114 114 above a certain point. However this ignores all the clean-up code that should … 118 118 both of these problems.
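The following minimal C program, a sketch for illustration, shows the danger: the @longjmp@ discards every frame above @main@ without running any of the skipped clean-up code.
\begin{cfa}
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf env;

void inner( void ) {
	void * resource = malloc( 100 );   // needs clean-up before the frame dies
	longjmp( env, 1 );                 // discard this frame immediately
	free( resource );                  // never reached: clean-up is skipped
}

int main( void ) {
	if ( setjmp( env ) == 0 ) {        // 0 means the initial, direct return
		inner();
	}
	// control returns here from longjmp with inner's clean-up never run
}
\end{cfa}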
119 119 120 Libunwind, provided in \texttt{unwind.h} on most platorms, is a C library 120 Libunwind, provided in @unwind.h@ on most platforms, is a C library 121 121 that provides \CPP style stack unwinding. Its operation is divided into two 122 122 phases. The search phase -- phase 1 -- is used to scan the stack and decide … 142 142 143 143 GCC will generate an LSDA and attach its personality function with the 144 \texttt{-fexceptions} flag. However this only handles the cleanup attribute. 144 @-fexceptions@ flag. However this only handles the cleanup attribute. 145 145 This attribute is used on a variable and specifies a function that should be 146 146 run when the variable goes out of scope. The function is passed a pointer to … 165 165 messages for special cases (some of which should never be used by the 166 166 personality function) and error codes but unless otherwise noted the 167 personality function should always return \codeC{_URC_CONTINUE_UNWIND}. 168 169 The \codeC{version} argument is the verson of the implementation that is 167 personality function should always return @_URC_CONTINUE_UNWIND@. 168 169 The @version@ argument is the version of the implementation that is 170 170 calling the personality function. At this point it appears to always be 1 and 171 171 it will likely stay that way until a new version of the API is released. 172 172 173 The \codeC{action} argument is set of flags that tell the personality 173 The @action@ argument is a set of flags that tell the personality 174 174 function when it is being called and what it must do on this invocation. 175 175 The flags are as follows: 176 176 \begin{itemize} 177 \item \codeC{_UA_SEARCH_PHASE}: This flag is set whenever the personality 177 \item@_UA_SEARCH_PHASE@: This flag is set whenever the personality 178 178 function is called during the search phase. The personality function should 179 179 decide if unwinding will stop in this function or not. If it does then the 180 personality function should return \codeC{_URC_HANDLER_FOUND}. 181 \item \codeC{_UA_CLEANUP_PHASE}: This flag is set whenever the personality 180 personality function should return @_URC_HANDLER_FOUND@. 181 \item@_UA_CLEANUP_PHASE@: This flag is set whenever the personality 182 182 function is called during the cleanup phase. If no other flags are set this 183 183 means the entire frame will be unwound and all cleanup code should be run. 184 \item \codeC{_UA_HANDLER_FRAME}: This flag is set during the cleanup phase 184 \item@_UA_HANDLER_FRAME@: This flag is set during the cleanup phase 185 185 on the function frame that found the handler. The personality function must 186 186 prepare to return to normal code execution and return 187 \codeC{_URC_INSTALL_CONTEXT}. 188 \item \codeC{_UA_FORCE_UNWIND}: This flag is set if the personality function 187 @_URC_INSTALL_CONTEXT@. 188 \item@_UA_FORCE_UNWIND@: This flag is set if the personality function 189 189 is called through a forced unwind call. Forced unwind only performs the 190 190 cleanup phase and uses a different means to decide when to stop. See its … 192 192 \end{itemize} 193 193 194 The \codeC{exception_class} argument is a copy of the \codeC{exception}'s 195 \codeC{exception_class} field. 196 197 The \codeC{exception} argument is a pointer to the user provided storage 194 The @exception_class@ argument is a copy of the @exception@'s 195 @exception_class@ field. 196 197 The @exception@ argument is a pointer to the user provided storage 198 198 object.
It has two public fields, the exception class which is actually just 199 199 a number that identifies the exception handling mechanism that created it and 200 200 … 201 201 exception needs to 202 202 203 The \codeC{context} argument is a pointer to an opaque type. This is passed 203 The @context@ argument is a pointer to an opaque type. This is passed 204 204 to the many helper functions that can be called inside the personality 205 205 function. … 218 218 functions traversing the stack new-to-old until a function finds a handler or 219 219 the end of the stack is reached. In the latter case raise exception will 220 return with \codeC{_URC_END_OF_STACK}. 220 return with @_URC_END_OF_STACK@. 221 221 222 222 Once a handler has been found raise exception continues onto the cleanup … 227 227 228 228 If an error is encountered raise exception will return either 229 \codeC{_URC_FATAL_PHASE1_ERROR} or \codeC{_URC_FATAL_PHASE2_ERROR} depending 229 @_URC_FATAL_PHASE1_ERROR@ or @_URC_FATAL_PHASE2_ERROR@ depending 230 230 on when the error occurred. … 259 259 been unwound. 260 260 261 Each time it is called the stop function should return \codeC{_URC_NO_REASON} 261 Each time it is called the stop function should return @_URC_NO_REASON@ 262 262 or transfer control directly to other code outside of libunwind. The 263 263 framework does not provide any assistance here. 264 264 265 265 Its arguments are the same as the paired personality function. 266 The actions \codeC{_UA_CLEANUP_PHASE} and \codeC{_UA_FORCE_UNWIND} are always 266 The actions @_UA_CLEANUP_PHASE@ and @_UA_FORCE_UNWIND@ are always 267 267 set when it is called. By the official standard that is all, but both GCC and 268 268 Clang add an extra action on the last call at the end of the stack: 269 \codeC{_UA_END_OF_STACK}. 269 @_UA_END_OF_STACK@. 270 270 271 271 \section{Exception Context} … 280 280 Each stack has its own exception context. In a purely sequential program, using 281 281 only core Cforall, there is only one stack and the context is global. However 282 if the library \texttt{libcfathread} is linked then there can be multiple 282 if the library @libcfathread@ is linked then there can be multiple 283 283 stacks so they will each need their own. 284 284 285 285 To handle this code always gets the exception context from the function 286 \codeC{this_exception_context}. The main exception handling code is in 287 \texttt{libcfa} and that library also defines the function as a weak symbol 288 so it acts as a default. Meanwhile in \texttt{libcfathread} the function is 286 @this_exception_context@. The main exception handling code is in 287 @libcfa@ and that library also defines the function as a weak symbol 288 so it acts as a default. Meanwhile in @libcfathread@ the function is 289 289 defined as a strong symbol that replaces it when the libraries are linked 290 290 together. 291 291 292 The version of the function defined in \texttt{libcfa} is very simple. It 292 The version of the function defined in @libcfa@ is very simple. It 293 293 returns a pointer to a global static variable. With only one stack this 294 294 global instance is associated with the only stack. 295 295 296 The version of the function defined in \texttt{libcfathread} has to handle 296 The version of the function defined in @libcfathread@ has to handle 297 297 more as there are multiple stacks. The exception context is included as 298 298 part of the per-stack data stored as part of coroutines.
In the cold data 299 299 section, stored at the base of each stack, is the exception context for that 300 stack. The \codeC{this_exception_context} uses the concurrency library to get 300 stack. The @this_exception_context@ uses the concurrency library to get 301 301 the current coroutine and through it the cold data section and the exception 302 302 context. … 323 323 to store the exception. Macros with pointer arithmetic and type cast are 324 324 used to move between the components or go from the embedded 325 \codeC{_Unwind_Exception} to the entire node. 325 @_Unwind_Exception@ to the entire node. 326 326 327 327 All of these nodes are strung together in a linked list. One linked list per … 347 347 C which is what the \CFA compiler outputs so a work-around is used. 348 348 349 This work around is a function called \codeC{__cfaehm_try_terminate} in the 349 This work-around is a function called @__cfaehm_try_terminate@ in the 350 350 standard library. The contents of a try block and the termination handlers 351 351 are converted into functions. These are then passed to the try terminate … 385 385 386 386 These nested functions and all other functions besides 387 \codeC{__cfaehm_try_terminate} in \CFA use the GCC personality function and 388 the \texttt{-fexceptions} flag to generate the LSDA. This allows destructors 387 @__cfaehm_try_terminate@ in \CFA use the GCC personality function and 388 the @-fexceptions@ flag to generate the LSDA. This allows destructors 389 389 to be implemented with the cleanup attribute. 390 390 … 401 401 402 402 The handler function does both the matching and catching. It tries 403 the condition of \codeCFA{catchResume} in order, top-to-bottom and until it 403 the condition of each @catchResume@ in order, top-to-bottom, until it 404 404 finds a handler that matches. If no handler matches then the function returns 405 405 false. Otherwise the matching handler is run; if it completes successfully 406 the function returns true. Rethrows, through the \codeCFA{throwResume;} 406 the function returns true. Rethrows, through the @throwResume;@ 407 407 statement, cause the function to return true. 408 409 % Recursive Resumption Stuff: 410 Blocking out part of the stack is accomplished by updating the front of the 411 list as the search continues. Before the handler at a node is called, the head 412 of the list is updated to the next node of the current node. After the search 413 is complete, successful or not, the head of the list is reset. 414 415 This means the current handler and every handler that has already been 416 checked are not on the list while a handler is run. If a resumption is thrown 417 during the handling of another resumption the active handlers and all the 418 other handlers checked up to this point will not be checked again. 419 420 This structure also supports new handlers added while the resumption is being 421 handled. These are added to the front of the list, pointing back along the 422 stack -- the first one will point over all the checked handlers -- and the 423 ordering is maintained. 408 424 409 425 \subsection{Libunwind Compatibility} … 438 454 439 455 Cancellation also uses libunwind to do its stack traversal and unwinding, 440 however it uses a different primary function \codeC{_Unwind_ForcedUnwind}. 456 however it uses a different primary function @_Unwind_ForcedUnwind@. 441 457 Details of its interface can be found in the unwind section. 442 458 -
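The recursive-resumption masking described above is small enough to sketch; the node layout and handler signature below are simplified assumptions, not the exact code in libcfa.
\begin{cfa}
struct resume_node {                           // one node per try statement with catchResume
	struct resume_node * next;                 // next node down the stack
	bool (*handle)( void * exception );        // handler function: true if handled
};

bool resume_search( struct resume_node ** head, void * exception ) {
	struct resume_node * original = *head;
	bool handled = false;
	for ( struct resume_node * cur = original; cur; cur = cur->next ) {
		*head = cur->next;                     // mask this node and every node already checked
		if ( cur->handle( exception ) ) { handled = true; break; }
	}
	*head = original;                          // reset the head after the search
	return handled;
}
\end{cfa}
Any new handlers pushed while a handler runs are added at the current head, so they point over the masked section, preserving the ordering described above.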
doc/theses/andrew_beach_MMath/unwinding.tex
rb6a8b31 rd95969a 1 \chapter{ Unwinding in \CFA}1 \chapter{\texorpdfstring{Unwinding in \CFA}{Unwinding in Cforall}} 2 2 3 3 Stack unwinding is the process of removing things from the stack. Within … 10 10 Even this is fairly simple if nothing needs to happen when the stack unwinds. 11 11 Traditional C can unwind the stack by saving and restoring state (with 12 \codeC{setjmp} \& \codeC{longjmp}). However many languages define actions that 12 @setjmp@ \& @longjmp@). However many languages define actions that 13 13 have to be taken when something is removed from the stack, such as running 14 a variable's destructor or a \codeCFA{try} statement's \codeCFA{finally} 14 a variable's destructor or a @try@ statement's @finally@ 15 15 clause. Handling this requires walking the stack going through each stack 16 16 frame. … 29 29 30 30 \CFA uses two primary functions in libunwind to create most of its 31 exceptional control-flow: \codeC{_Unwind_RaiseException} and 32 \codeC{_Unwind_ForcedUnwind}. 31 exceptional control-flow: @_Unwind_RaiseException@ and 32 @_Unwind_ForcedUnwind@. 33 33 Their operation is divided into two phases: search and clean-up. The search 34 34 phase -- phase 1 -- is used to scan the stack but does not unwind it. The … 44 44 A personality function performs three tasks, although not all have to be 45 45 present. The tasks performed are decided by the actions provided. 46 \codeC{_Unwind_Action} is a bitmask of possible actions and an argument of 46 @_Unwind_Action@ is a bitmask of possible actions and an argument of 47 47 this type is passed into the personality function. 48 48 \begin{itemize} 49 \item \codeC{_UA_SEARCH_PHASE} is passed in search phase and tells the 49 \item@_UA_SEARCH_PHASE@ is passed in the search phase and tells the 50 50 personality function to check for handlers. If there is a handler in this 51 51 stack frame, as defined by the language, the personality function should 52 return \codeC{_URC_HANDLER_FOUND}. Otherwise it should return 53 \codeC{_URC_CONTINUE_UNWIND}. 54 \item \codeC{_UA_CLEANUP_PHASE} is passed in during the clean-up phase and 52 return @_URC_HANDLER_FOUND@. Otherwise it should return 53 @_URC_CONTINUE_UNWIND@. 54 \item@_UA_CLEANUP_PHASE@ is passed in during the clean-up phase and 55 55 means part or all of the stack frame is removed. The personality function 56 56 should do whatever clean-up the language defines 57 57 (such as running destructors/finalizers) and then generally returns 58 \codeC{_URC_CONTINUE_UNWIND}. 59 \item \codeC{_UA_HANDLER_FRAME} means the personality function must install 58 @_URC_CONTINUE_UNWIND@. 59 \item@_UA_HANDLER_FRAME@ means the personality function must install 60 60 a handler. It is also passed in during the clean-up phase and is in addition 61 61 to the clean-up action. libunwind provides several helpers for the personality 62 62 function here. Once it is done, the personality function must return 63 \codeC{_URC_INSTALL_CONTEXT}. 63 @_URC_INSTALL_CONTEXT@. 64 64 \end{itemize} 65 65 The personality function is given a number of other arguments. Some are for 66 compatability and there is the \codeC{struct _Unwind_Context} pointer which 66 compatibility and there is the @struct _Unwind_Context@ pointer which 67 67 is passed to many helpers to get information about the current stack frame. 68 68 … 72 72 raise-exception but with some extras.
73 73 First, it passes an extra action to the personality function on each 74 74 stack frame, \codeC{_UA_FORCE_UNWIND}, which means a handler cannot be 74 stack frame, @_UA_FORCE_UNWIND@, which means a handler cannot be 75 75 installed. … 83 83 stack frames have been removed. By the standard API this is marked by setting 84 84 the stack pointer inside the context passed to the stop function. However both 85 GCC and Clang add an extra action for this case \codeC{_UA_END_OF_STACK}. 85 GCC and Clang add an extra action for this case @_UA_END_OF_STACK@. 86 86 87 87 Each time the stop function is called it can do one or two things. 88 When it is not the end of the stack it can return \codeC{_URC_NO_REASON} to 88 When it is not the end of the stack it can return @_URC_NO_REASON@ to 89 89 continue unwinding. 90 90 % Is there a reason that NO_REASON is used instead of CONTINUE_UNWIND? … 93 93 are provided to do it. 94 94 95 \section{\ CFA Implementation} 95 \section{\texorpdfstring{\CFA Implementation}{Cforall Implementation}} 96 96 97 97 To use libunwind, \CFA provides several wrappers, its own storage, … 113 113 114 114 The stop function is very simple. It checks the end of stack flag to see if 115 it is finished unwinding. If so, it calls \codeC{exit} to end the process, 115 it is finished unwinding. If so, it calls @exit@ to end the process, 116 116 otherwise it returns with no-reason to continue unwinding. 117 117 % Yeah, this is going to have to change. … 128 128 location of the instruction pointer and stack layout, which varies with 129 129 compiler and optimization levels. So for frames where there are only 130 destructors, GCC's attribute cleanup with the \texttt{-fexception} flag is 130 destructors, GCC's attribute cleanup with the @-fexceptions@ flag is 131 131 sufficient to handle unwinding. 132 132 133 133 The only functions that require more than that are those that contain 134 \codeCFA{try} statements. A \codeCFA{try} statement has a \codeCFA{try} 135 clause, some number of \codeCFA{catch} clauses and \codeCFA{catchResume} 136 clauses and may have a \codeCFA{finally} clause. Of these only \codeCFA{try} 137 statements with \codeCFA{catch} clauses need to be transformed and only they 138 and the \codeCFA{try} clause are involved. 134 @try@ statements. A @try@ statement has a @try@ 135 clause, some number of @catch@ clauses and @catchResume@ 136 clauses and may have a @finally@ clause. Of these only @try@ 137 statements with @catch@ clauses need to be transformed and only they 138 and the @try@ clause are involved. 139 139 140 The \codeCFA{try} statement is converted into a series of closures which can 140 The @try@ statement is converted into a series of closures which can 141 141 access other parts of the function according to scoping rules but can be 142 passed around. The \codeCFA{try} clause is converted into the try functions, 143 almost entirely unchanged. The \codeCFA{catch} clauses are converted into two 142 passed around. The @try@ clause is converted into the try functions, 143 almost entirely unchanged. The @catch@ clauses are converted into two 144 144 functions; the match function and the catch function. 145 145 … 153 153 runs the handler's body. 154 154 155 These three functions are passed to \codeC{try_terminate}. This is an 155 These three functions are passed to @try_terminate@. This is an 156 156 % Maybe I shouldn't quote that, it isn't its actual name.
157 157 internal hand-written function that has its own personality function and … 167 167 handler was found in this frame. If it was then the personality function 168 168 installs the handler by setting the instruction pointer in 169 \codeC{try_terminate} to an otherwise unused section that calls the catch 169 @try_terminate@ to an otherwise unused section that calls the catch 170 170 function, passing it the current exception and handler index. 171 \codeC{try_terminate} returns as soon as the catch function returns. 171 @try_terminate@ returns as soon as the catch function returns. 172 172 173 173 At this point control has returned to normal control flow. -
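To summarize the shape of the transformation, here is a hedged sketch of the closures generated for a try statement with one catch clause; all names, the @exception_t@ type, the match helper, and the argument order of @try_terminate@ are assumptions for illustration only.
\begin{cfa}
// for:  try { do_work(); } catch ( some_exception * e ) { recover( e ); }
void _try_body( void ) { do_work(); }                // the try clause, unchanged

int _handler_match( exception_t * e ) {              // used during the search phase
	if ( is_some_exception( e ) ) return 1;          // 1-based index of the handler
	return 0;                                        // zero: no handler matches
}

void _handler_catch( exception_t * e, int index ) {  // runs after unwinding
	if ( index == 1 ) recover( (some_exception *)e );
}

// the try statement is replaced by a single call (argument order assumed):
try_terminate( _try_body, _handler_catch, _handler_match );
\end{cfa}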
doc/theses/fangren_yu_COOP_F20/Report.tex
rb6a8b31 rd95969a 17 17 \usepackage[usenames]{color} 18 18 \input{common} % common CFA document macros 19 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true, pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref} 19 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref} 20 20 \usepackage{breakurl} 21 21 \urlstyle{sf} … 76 76 \renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}} 77 77 \pagenumbering{roman} 78 \linenumbers % comment out to turn off line numbering 78 %\linenumbers % comment out to turn off line numbering 79 79 80 80 \maketitle 81 81 \pdfbookmark[1]{Contents}{section} 82 \tableofcontents 83 84 \clearpage 82 85 83 \thispagestyle{plain} 86 84 \pagenumbering{arabic} 87 85 88 86 \begin{abstract} 87 \CFA is an evolutionary, non-object-oriented extension of the C programming language, featuring a parametric type-system, and is currently under active development. The reference compiler for the \CFA language, @cfa-cc@, has some of its major components dating back to the early 2000s, which are based on inefficient data structures and algorithms. This report introduces improvements targeting the expression resolution algorithm, suggested by a recent prototype experiment on a simplified model, which are implemented in @cfa-cc@ to support the full \CFA language. These optimizations speed up the compiler by a factor of 20 across the existing \CFA codebase, bringing the compilation time of a mid-sized \CFA source file down to the 10-second level. A few problem cases derived from realistic code examples are analyzed in detail, with proposed solutions. This work is a critical step in the \CFA project development to achieve its eventual goal of being used alongside C for large software systems. 89 88 \end{abstract} 90 89 91 90 \clearpage 92 \section*{Acknowledgements} 93 \begin{sloppypar} 94 I would like to thank everyone in the \CFA team for their contribution towards this project. Programming language design and development is a tough subject and requires a lot of teamwork. Without the collaborative efforts from the team, this project could not have been a success. Specifically, I would like to thank Andrew Beach for introducing me to the \CFA codebase, Thierry Delisle for maintaining the test and build automation framework, Michael Brooks for providing example programs of various experimental language and type system features, and most importantly, Professor Martin Karsten for recommending me to the \CFA team, and my supervisor, Professor Peter Buhr for encouraging me to explore intricate compiler algorithms in depth. Finally, I gratefully acknowledge the help from Aaron Moss, a former graduate of the team and the author of the preceding thesis work, for participating in the \CFA team's virtual conferences and email correspondence, and for providing many critical arguments and suggestions. 2020 was an unusually challenging year for everyone, and we managed to keep a steady pace.
\end{sloppypar} 95 96 \clearpage 97 \tableofcontents 98 99 \clearpage 91 100 \section{Introduction} 92 101 93 \section{Completed work} 102 The \CFA language, developed by the Programming Language Group at the University of Waterloo, has a long history, with the initial language design in 1992 by Glen Ditchfield~\cite{Ditchfield92} and the first proof-of-concept compiler built in 2003 by Richard Bilson~\cite{Bilson03}. Many new features have been added to the language over time, but the core of \CFA's type-system --- parametric functions introduced by the @forall@ clause (hence the name of the language) providing parametric overloading --- remains mostly unchanged. 103 104 The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take multiple minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work. 105 106 This report presents a series of optimizations to the performance-critical parts of the resolver, with a major rework of the compiler data-structures using a functional-programming approach to reduce memory complexity. The improvements were suggested by running the compiler builds with a performance profiler against the \CFA standard-library source-code and a test suite to find the most underperforming components in the compiler algorithm. 107 108 The \CFA team endorses a pragmatic philosophy that focuses on practical implications of language design and implementation rather than theoretical limits. In particular, the compiler is designed to be expressive with respect to code reuse while maintaining type safety, but compromises theoretical soundness in extreme corner cases. However, when these corner cases do appear in actual usage, they need to be thoroughly investigated. A case-by-case analysis is presented for several of these corner cases, some of which point to certain weaknesses in the language design with solutions proposed based on experimental results. 109 110 \section{AST restructuring} 94 111 95 112 \subsection{Memory model with sharing} 96 113 97 A major rework of the abstract syntax tree (AST) data structure in the compiler is completed as the first step of the project. The majority of work were documented in the reference manual of the compiler~\cite{cfa-cc}. To summarize: 98 \begin{itemize} 99 \item 100 AST nodes (and therefore subtrees) can be shared without copying when reused. 101 \item 102 Modifications apply the functional programming principle, making copies for local changes without affecting the original data shared by other owners. In-place mutations are permitted as a special case when sharing does not happen. The logic is implemented by reference counting. 103 \item 104 Memory allocation and freeing are performed automatically using smart pointers.
105 \end{itemize} 106 The resolver algorithm designed for overload resolution naturally introduces a significant amount of reused intermediate representations, especially in the following two places: 107 \begin{itemize} 108 \item 109 Function overload candidates are computed by combining the argument candidates bottom-up, with many of them being a common term. For example, if $n$ overloads of a function @f@ all take an integer for the first parameter but different types for the second (@f( int, int )@, @f( int, double )@, etc.) the first term is reused $n$ times for each of the generated candidate expressions. This effect is particularly bad for deep expression trees. 110 \item 111 In the unification algorithm and candidate elimination step, actual types are obtained by substituting the type parameters by their bindings. Let $n$ be the complexity (\ie number of nodes in representation) of the original type, $m$ be the complexity of bound type for parameters, and $k$ be the number of occurrences of type parameters in the original type. If everything needs to be deep-copied, the substitution step takes $O(n+mk)$ time and memory, while using shared nodes it is reduced to $O(n)$ time and $O(k)$ memory. 112 \end{itemize} 113 One of the worst examples for the old compiler is a long chain of I/O operations 114 \begin{cfa} 115 sout | 1 | 2 | 3 | 4 | ... 116 \end{cfa} 117 The pipe operator is overloaded by \CFA I/O library for every primitive type in C language, as well as I/O manipulators defined by the library. In total there are around 50 overloads for the output stream operation. On resolving the $n$-th pipe operator in the sequence, the first term, which is the result of sub-expression containing $n-1$ pipe operators, is reused to resolve every overload. Therefore at least $O(n^2)$ copies of expression nodes are made during resolution, not even counting type unification cost; combined with two large factors from number of overloads of pipe operators, and that the ``output stream type'' in \CFA is a trait with 27 assertions (which adds to complexity of the pipe operator's type) this makes compiling a long output sequence extremely slow. In new AST representation only $O(n)$ copies are required and type of pipe operator is not copied at all. 118 119 Reduction in space complexity is especially important, as preliminary profiling result on the old compiler build shows that over half of time spent in expression resolution are on memory allocations. 120 114 A major rework of the AST data-structure in the compiler was completed as the first step of the project. The majority of this work is documented in my prior report documenting the compiler reference-manual~\cite{cfa-cc}. To summarize: 115 \begin{itemize} 116 \item 117 AST nodes (and therefore subtrees) can be shared without copying. 118 \item 119 Modifications are performed using functional-programming principles, making copies for local changes without affecting the original data shared by other owners. In-place mutations are permitted as a special case when there is no sharing. The logic is implemented by reference counting. 120 \item 121 Memory allocation and freeing are performed automatically using smart pointers~\cite{smartpointers}. 
\end{itemize} 123 124 The resolver algorithm, designed for overload resolution, reuses a significant number of intermediate representations, and hence copies them, especially in the following two places: 125 126 \begin{itemize} 127 \item 128 Function overload candidates are computed by combining the argument candidates bottom-up, with many being a common term. For example, if $n$ overloads of a function @f@ all take an integer for the first parameter but different types for the second, \eg @f( int, int )@, @f( int, double )@, etc., the first term is copied $n$ times, once for each of the generated candidate expressions. This copying is particularly bad for deep expression trees. 129 \item 130 In the unification algorithm and candidate elimination step, actual types are obtained by substituting the type parameters by their bindings. Let $n$ be the complexity (\ie number of nodes in representation) of the original type, $m$ be the complexity of the bound type for parameters, and $k$ be the number of occurrences of type parameters in the original type. If every substitution needs to be deep-copied, this copy step takes $O(n+mk)$ time and memory, while using shared nodes it is reduced to $O(n)$ time and $O(k)$ memory. 131 \end{itemize} 132 One of the worst examples for the old compiler is a long chain of I/O operations: 133 \begin{cfa} 134 sout | 1 | 2 | 3 | 4 | ...; // print integer constants 135 \end{cfa} 136 The pipe operator is overloaded by the \CFA I/O library for every primitive type in the C language, as well as I/O manipulators defined by the library. In total, there are around 50 overloads for the output stream operation. On resolving the $n$-th pipe operator in the sequence, the first term, which is the result of sub-expression containing $n-1$ pipe operators, is reused to resolve every overload. Therefore at least $O(n^2)$ copies of expression nodes are made during resolution, not even counting type unification cost; combined with the two large factors from the number of overloads of the pipe operator, and that the ``output stream type'' in \CFA is a trait with 27 assertions (which adds to complexity of the pipe operator's type) this makes compiling a long output sequence extremely slow. In the new AST representation, only $O(n)$ copies are required and the type of the pipe operator is not copied at all. 137 Reduction in space complexity is especially important, as preliminary profiling results on the old compiler build showed over half of the time spent in expression resolution is on memory allocations. 138 139 Since the compiler codebase is large and the new memory model mostly benefits expression resolution, some of the old data structures are still kept, and a conversion pass happens before and after the general resolve phase. Rewriting every compiler module will take longer, and whether the new model is correct was unknown when this project started, therefore only the resolver is currently implemented with the new data structure. 121 140 122 141 \subsection{Merged resolver calls} 123 142 124 The pre-resolve phase of compilation, inadequately called ``validate'' in the compiler source code, does more than just simple syntax validation, as it also normalizes input program. Some of them, however, requires type information on expressions and therefore needs to call the resolver before the general resolve phase.
There are three notable places where the resolver is invoked: 125 \begin{itemize} 126 \item 127 Attempt to generate default constructor, copy constructor and destructor for user-defined @struct@ types 128 \item 129 Resolve @with@ statements (the same as in Python, which introduces fields of a structure directly in scope) 143 144 The pre-resolve phase of compilation, inappropriately called ``validate'' in the compiler source code, has a number of passes that do more than simple syntax and semantic validation; some passes also normalize the input program. A few of these passes require type information for expressions, and therefore, need to call the resolver before the general resolve phase. There are three notable places where the resolver is invoked: 145 146 \begin{itemize} 147 \item 148 Generate default constructor, copy constructor and destructor for user-defined @struct@ types. 149 \item 150 Resolve @with@ statements (the same as in Pascal~\cite{pascal}), which introduces fields of a structure directly into a scope. 130 149 \item 131 150 Resolve @typeof@ expressions (cf. @decltype@ in \CC); note that this step may depend on symbols introduced by @with@ statements. 132 151 \end{itemize} 133 Since the compiler codebase is large and the new memory model mostly only benefits expression resolution, the old data structure is still kept, and a conversion pass happens before and after resolve phase. Rewriting every compiler module will take a long time, and whether the new model is correct is still unknown when started, therefore only the resolver is implemented with the new data structure. 134 135 Since the constructor calls were one of the most expensive to resolve (reason will be shown in the next section), pre-resolve phase were taking more time after resolver moves to the more efficient new implementation. To better facilitate the new resolver, every step that requires type information are reintegrated as part of resolver. 136 137 A by-product of this work is that the reversed dependence of @with@ statement and @typeof@ can now be handled. Previously, the compiler is unable to handle cases such as 152 153 Since the constructor calls are one of the most expensive to resolve (reason given in~\VRef{s:SpecialFunctionLookup}), this pre-resolve phase was taking a large amount of time even after the resolver was changed to the more efficient new implementation. The problem is that multiple resolutions repeat a significant amount of work. Therefore, to better facilitate the new resolver, every step that requires type information should be integrated as part of the general resolver phase. 154 155 A by-product of this work is that reversed dependence between @with@ statement and @typeof@ can now be handled. Previously, the compiler was unable to handle cases such as: 138 156 \begin{cfa} 139 157 struct S { int x; }; 140 158 S foo(); 141 159 typeof( foo() ) s; // type is S 142 with (s) { 160 with (s) { 143 161 x; // refers to s.x 144 162 } 145 163 \end{cfa} 146 since type of @s@ is still unresolved when handling @with@ expressions. Instead, the new (and correct) approach is to evaluate @typeof@ expressions when the declaration is first seen, and it suffices because of the declaration-before-use rule. 164 since the type of @s@ is unresolved when handling @with@ expressions because the @with@ pass follows the @typeof@ pass (interchanging passes only interchanges the problem).
Instead, the new (and correct) approach is to evaluate @typeof@ expressions when the declaration is first seen during resolution, and it suffices because of the declaration-before-use rule. 147 165 148 166 149 167 \subsection{Special function lookup} 150 151 Reducing the number of functions looked up for overload resolution is an effective way to gain performance when there are many overloads but most of them are trivially wrong. In practice, most functions have few (if any) overloads but there are notable exceptions. Most importantly, constructor @?{}@, destructor @^?{}@, and assignment @?=?@ are generated for every user-defined type, and in a large source file there can be hundreds of them. Furthermore, many calls to them are generated for initializing variables and passing arguments. This fact makes them the most overloaded and most called functions. 152 153 In an object-oriented programming language, object has methods declared with their types, so a call such as @obj.f()@ only needs to perform lookup in the method table corresponding to type of @obj@. \CFA on the other hand, does not have methods, and all types are open (\ie new operations can be defined on them), so a similar approach will not work in general. However, the ``big 3'' operators have a unique property enforced by the language rules, such that the first parameter must have a reference type. Since \CFA does not have class inheritance, reference type must always match exactly. Therefore, argument-dependent lookup can be implemented for these operators, by using a dedicated symbol table. 154 155 The lookup key used for the special functions is the mangled type name of the first parameter, which acts as the @this@ parameter in an object-oriented language. To handle generic types, the type parameters are stripped off, and only the base type is matched. Note that a constructor (destructor, assignment operator) taking arbitrary @this@ argument, for example @forall( dtype T ) void ?{}( T & );@ is not allowed, and it guarantees that if the @this@ type is known, all possible overloads can be found by searching with the given type. In case that the @this@ argument itself is overloaded, it is resolved first and all possible result types are used for lookup. 156 157 Note that for the generated expressions, the particular variable for @this@ argument is fully known, without overloads, so the majority of constructor call resolutions only need to check for one given object type. Explicit constructor calls and assignment statements sometimes may require lookup for multiple types. In the extremely rare case that type of @this@ argument is yet unbound, everything will have to be checked, just like without the argument-dependent lookup algorithm; fortunately, this case almost never happens in practice. An example is found in the library function @new@: 168 \label{s:SpecialFunctionLookup} 169 170 Reducing the number of function looked ups for overload resolution is an effective way to gain performance when there are many overloads but most of them are trivially wrong. In practice, most functions have few (if any) overloads but there are notable exceptions. Most importantly, constructor @?{}@, destructor @^?{}@, and assignment @?=?@ are generated for every user-defined type (@struct@ and @union@ in C), and in a large source file there can be hundreds of them. Furthermore, many calls are generated for initializing variables, passing arguments and copying values. This fact makes them the most overloaded and most called functions. 
171 172 In an object-oriented programming language, the object-method types are scoped within a class, so a call such as @obj.f()@ only needs to perform lookup in the method table corresponding to the type of @obj@. \CFA on the other hand, does not have methods, and all types are open, \ie new operations can be defined on them without inheritance; at best a \CFA type can be constrained by a translation unit. However, the ``big 3'' operators have a unique property enforced by the language rules: the first parameter must be a reference to its associated type, which acts as the @this@ parameter in an object-oriented language. Since \CFA does not have class inheritance, the reference type must always match exactly. Therefore, argument-dependent lookup can be implemented for these operators by using a dedicated, fast symbol-table. 173 174 The lookup key for the special functions is the mangled type name of the first parameter. To handle generic types, the type parameters are stripped off, and only the base type is matched. Note a constructor (destructor, assignment operator) may not take an arbitrary @this@ argument, \eg @forall( dtype T ) void ?{}( T & )@, thus guaranteeing that if the @this@ type is known, all possible overloads can be found by searching with this given type. In the case where the @this@ argument itself is overloaded, it is resolved first and all possible result types are used for lookup. 175 176 Note that for a generated expression, the particular variable for the @this@ argument is fully known, without overloads, so the majority of constructor-call resolutions only need to check for one given object type. Explicit constructor calls and assignment statements sometimes require lookup for multiple types. In the extremely rare case that the @this@-argument type is unbound, all necessary types are guaranteed to be checked, as for the previous lookup without the argument-dependent lookup; fortunately, this complex case almost never happens in practice. An example is found in the library function @new@: 158 177 \begin{cfa} 159 178 forall( dtype T | sized( T ), ttype TT | { void ?{}( T &, TT ); } ) 160 179 T * new( TT p ) { return &(*malloc()){ p }; } 161 180 \end{cfa} 162 as @malloc@ may return a pointer to any type, depending on context. 163 164 Interestingly, this particular line of code actually caused another complicated issue, where the unusually massive work of checking every constructor in presence makes the case even worse. Section~\ref{s:TtypeResolutionInfiniteRecursion} presents a detailed analysis for theproblem.165 166 The ``callable'' operator @?()@ (cf. @operator()@ in \CC) c ould also be included in the special operator list, as it is usually only on user-defined types, and the restriction that first argument must be a reference seems reasonable in this case.181 as @malloc@ may return a pointer to any type, depending on context. 182 183 Interestingly, this particular declaration actually causes another complicated issue, making the complex checking of every constructor even worse. \VRef[Section]{s:TtypeResolutionInfiniteRecursion} presents a detailed analysis of this problem. 184 185 The ``callable'' operator @?()@ (cf. @operator()@ in \CC) can also be included in this special operator list, as it is usually only on user-defined types, and the restriction that the first argument must be a reference seems reasonable in this case. 
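For example, for a hypothetical type @Vec@, all of the special operators share the required shape below, where the type of the first (reference) parameter provides the single lookup key; the signatures follow the usual \CFA operator conventions, and @Vec@ itself is only for illustration.
\begin{cfa}
struct Vec { int x, y; };
void ?{}( Vec & this );               // default constructor
void ?{}( Vec & this, Vec that );     // copy constructor
void ^?{}( Vec & this );              // destructor
Vec ?=?( Vec & this, Vec that );      // assignment operator
// every overload above is found under the one key for type Vec
\end{cfa}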
167 186 168 187 169 188 \subsection{Improvement of function type representation} 170 189 171 Since substituting type parameters with their bound types is one fundamental operation in many parts of resolver algorithm (particularly unification and environment binding), making as few copies of type nodes as possible helps reducing memory complexity. Even with the new memory management model, allocation is still a significant factor of resolver performance. Conceptually, operations on type nodes of AST should be performed in functional programming style, treating the data structure as immutable and only copy when necessary. The in-place mutation is a mere optimization that does not change logic of operations. 172 The model was broken on function types by an inappropriate design. Function types require some special treatment due to the existence of assertions. In particular, it must be able to distinguish two different kinds of type parameter usage: 190 Since substituting type parameters with their bound types is one fundamental operation in many parts of resolver algorithm (particularly unification and environment binding), making as few copies of type nodes as possible helps reducing memory complexity. Even with the new memory management model, allocation is still a significant factor of resolver performance. Conceptually, operations on type nodes of the AST should be performed in functional-programming style, treating the data structure as immutable and only copying when necessary. The in-place mutation is a mere optimization that does not change the logic for operations. 191 192 However, the model was broken for function types by an inappropriate design. Function types require special treatment due to the existence of assertions that constrain the types it supports. Specifically, it must be possible to distinguish two different kinds of type parameter usage: 173 193 \begin{cfa} 174 194 forall( dtype T ) void foo( T * t ) { 175 forall( dtype U ) void bar( T * t, U* u ) { ... }176 } 177 \end{cfa} 178 Here, only @U@ is a free parameter in declaration of @bar@, as it appears in the function's own forall clause; while @T@ is not free.179 180 Moreover, the resolution algorithm also has to distinguish type bindings of multiple calls to the same function, for example with195 forall( dtype U ) void bar( @T@ * t, @U@ * u ) { ... } 196 } 197 \end{cfa} 198 Here, only @U@ is a free parameter in the nested declaration of function @bar@, as @T@ must be bound at the call site when resolving @bar@. 199 200 Moreover, the resolution algorithm also has to distinguish type bindings of multiple calls to the same function, \eg: 181 201 \begin{cfa} 182 202 forall( dtype T ) int foo( T x ); 183 foo( foo( 1.0 ) );184 \end{cfa} 185 The inner call has binding (T: double) while the outer call has binding (T: int). Therefore a unique representation of free parameters in each expression is required. This was previously done by creating a copy of the parameter declarations inside function type, and fixing references afterwards. 
However, fixing references is an inherently deep operation that does not work well with functional programming model, as it must be evaluated eagerlyon the entire syntax tree representing the function type.186 187 The revised approach generates a unique ID value for each function call expression instance and represents an occurrence of free parameter type with a pair of generated ID and the original parameter declaration, so that references do not need to be fixed, and a shallow copy of function type is possible.188 189 Note that after the change, all declaration nodes in syntax tree representation maps one-to-one with the actual declarations in the program, and therefore are guaranteed to be unique. Such property can potentially enable more optimizations, and some related ideas are presented after Section~\ref{s:SharedSub-ExpressionCaseUniqueExpressions}.203 int i = foo( foo( 1.0 ) ); 204 \end{cfa} 205 The inner call has binding (T: double) while the outer call has binding (T: int). Therefore a unique representation for the free parameters is required in each expression. This type binding was previously done by creating a copy of the parameter declarations inside the function type and fixing references afterwards. However, fixing references is an inherently deep operation that does not work well with the functional-programming style, as it forces eager evaluation on the entire syntax tree representing the function type. 206 207 The revised approach generates a unique ID value for each function call expression instance and represents an occurrence of a free-parameter type with a pair of generated ID and original parameter declaration, so references are unique and a shallow copy of the function type is possible. 208 209 Note that after the change, all declaration nodes in the syntax-tree representation now map one-to-one with the actual declarations in the program, and therefore are guaranteed to be unique. This property can potentially enable more optimizations, and some related ideas are presented at the end of \VRef{s:SharedSub-ExpressionCaseUniqueExpressions}. 190 210 191 211 192 212 \subsection{Improvement of pruning steps} 193 213 194 A minor improvement for candidate elimination is to skip the step on the function overloads themselves and only perform on results of function application. As function calls are usually by name, the name resolution rule dictates that every function candidate necessarily has a different type; indirect function calls are rare, and when they do appear, they usually will not have many possible interpretations, and those rarely matches exactly in argument type. Since function types have a much more complex representation than data types (with multiple parameters and assertions), checking equality on them also takes longer.195 196 A brief test of this approach shows that the number of function overloads considered in expression resolution increases by a negligible amount of less than 1 percent, while type comparisons in candidate elimination are cut by more than half. Improvement is consistent over all \CFA source files in the test suite.214 A minor improvement for candidate elimination is to skip the step on the function overloads and only check the results of function application. 
As function calls are usually by name (versus pointers to functions), the name resolution rule dictates that every function candidate necessarily has a different type; indirect function calls are rare, and when they do appear, there are even fewer cases with multiple interpretations, and these rarely match exactly in argument type. Since function types have a much more complex representation (with multiple parameters and assertions) than data types, checking equality on them also takes longer. 215 216 A brief test of this approach shows that the number of function overloads considered in expression resolution increases by less than 1 percent, while type comparisons in candidate elimination are reduced by more than half. This improvement is consistent over all \CFA source files in the test suite. 197 217 198 218 … … 200 220 \label{s:SharedSub-ExpressionCaseUniqueExpressions} 201 221 202 Unique expression denotes an expression that must be evaluated only once, to prevent unwanted side effects. It is currently only a compiler artifact, generated on tuple member expression of the form 222 Unique expression denotes an expression evaluated only once to prevent unwanted side effects. It is currently only a compiler artifact, generated for a tuple-member expression of the form: 203 223 \begin{cfa} 204 224 struct S { int a; int b; }; … … 206 226 s.[a, b]; // tuple member expression, type is [int, int] 207 227 \end{cfa} 208 If the aggregate expression contains function calls, it cannot be evaluated multiple times: 228 If the aggregate expression is a function call, it cannot be evaluated multiple times: 209 229 \begin{cfa} 210 230 S makeS(); 211 makeS().[a, b]; // this should only make one S 231 makeS().[a, b]; // this should only generate a unique S 212 232 \end{cfa} 213 233 Before code generation, the above expression is internally represented as … … 226 246 \end{cfa} 227 247 at code generation, where @_unique_var@ and @_unique_var_evaluated@ are generated variables whose scope covers all appearances of the same expression. 228 229 Note that although the unique expression is only used for tuple expansion now, it is a generally useful construction, and can be seen in other languages, such as Scala's @lazy val@~\cite{Scala}; therefore it could be worthwhile to introduce the unique expression to a broader context in \CFA and even make it directly available to programmers. 230 231 In the compiler's visitor pattern, however, this creates a problem where multiple paths to a logically unique expression exist, so it may be modified more than once and become ill-formed; some specific intervention is required to ensure that unique expressions are only visited once. Furthermore, a unique expression appearing in more than one places will be copied on mutation so its representation is no longer unique. Some hacks are required to keep it in sync, and the methods are different when mutating the unique expression instance itself or its underlying expression. 232 233 Example when mutating the underlying expression (visit-once guard) 248 The conditional check ensures a single call to @makeS()@ even though there are logically multiple calls because of the tuple field expansion.
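As a usage sketch, an illustrative side effect (the hypothetical counter @calls@) makes the single evaluation observable through tuple multiple-assignment:
\begin{cfa}
int calls = 0;
S makeS() { calls += 1; return (S){ 1, 2 }; } // count evaluations
int x, y;
[x, y] = makeS().[a, b]; // expansion wraps makeS() in a unique expression
// afterwards: calls == 1, x == 1, y == 2
\end{cfa}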
249 250 Note that although the unique expression is only used for tuple expansion now, it is a generally useful construction, and is seen in other programming languages, such as Scala's @lazy val@~\cite{Scala}; therefore it may be worthwhile to introduce the unique expression to a broader context in \CFA and even make it directly available to programmers. 251 252 In the compiler's visitor pattern, however, this creates a problem where multiple paths to a logically unique expression exist, so it may be modified more than once and become ill-formed; some specific intervention is required to ensure unique expressions are only visited once. Furthermore, a unique expression appearing in more than one places is copied on mutation so its representation is no longer unique. 253 254 Currently, special cases are required to keep everything synchronized, and the methods are different when mutating the unique expression instance itself or its underlying expression: 255 \begin{itemize} 256 \item 257 When mutating the underlying expression (visit-once guard) 234 258 \begin{cfa} 235 259 void InsertImplicitCalls::previsit( const ast::UniqueExpr * unqExpr ) { 236 if ( visitedIds.count( unqExpr->id ) ) visit_children = false;260 @if ( visitedIds.count( unqExpr->id ) ) visit_children = false;@ 237 261 else visitedIds.insert( unqExpr->id ); 238 262 } 239 263 \end{cfa} 240 Example when mutating the unique instance itself, which actually creates copies 264 \item 265 When mutating the unique instance itself, which actually creates copies 241 266 \begin{cfa} 242 267 auto mutExpr = mutate( unqExpr ); // internally calls copy when shared 243 if ( ! unqMap.count( unqExpr->id ) ) { 268 @if ( ! unqMap.count( unqExpr->id ) ) {@ 244 269 ... 245 270 } else { … … 248 273 } 249 274 \end{cfa} 250 Such workaround seems difficult to be fit into a common visitor template. This suggests the memory model may need different kinds of nodes to accurately represent the syntax tree. 251 252 Together with the fact that declaration nodes are always unique, it is possible that AST nodes can be classified by three different types: 253 \begin{itemize} 254 \item 255 \textbf{Strictly unique} with only one owner (declarations); 256 \item 257 \textbf{Logically unique} with (possibly) many owners but should not be copied (unique expression example presented here); 258 \item 259 \textbf{Shared} by functional programming model, which assume immutable data structure and are copied on mutation. 275 \end{itemize} 276 Such workarounds are difficult to fit into the common visitor pattern, which suggests the memory model may need different kinds of nodes to accurately represent this feature in the AST. 277 278 Given that declaration nodes are unique, it is possible for AST nodes to be divided into three different types: 279 \begin{itemize} 280 \item 281 \textbf{Singleton} with only one owner (declarations); 282 \item 283 \textbf{No-copy} with multiple owners but cannot be copied (unique expression example presented here); 284 \item 285 \textbf{Copy} by functional-programming style, which assumes immutable data structures that are copied on mutation. 260 286 \end{itemize} 261 287 The boilerplate code can potentially handle these three cases differently. … … 264 290 \section{Analysis of resolver algorithm complexity} 265 291 266 The focus of this chapter is to identify and analyze some realistic cases that cause resolver algorithm to have an exponential run time. 
As previous work has shown [3], the overload resolution problem in \CFA has worst-case exponential complexity; however, only few specific patterns can trigger the exponential complexity in practice. Implementing heuristic-based optimization for those selected cases is helpful to alleviate the problem. 292 The focus of this section is to identify and analyze some realistic cases that cause the resolver algorithm to have an exponential runtime. As previous work has shown~\cite[\S~4.2.1]{Moss19}, the overload resolution problem in \CFA has worst-case exponential complexity; however, only a few specific patterns can trigger the exponential complexity in practice. Implementing heuristic-based optimization for those selected cases is helpful to alleviate the problem. 267 293 268 294 … … 270 296 271 297 272 The interaction of return type overloading and polymorphic functions creates this problem of function calls with unbound return type, and is further complicated by the presence of assertions. 298 The interaction of return-type overloading and polymorphic functions creates function calls with an unbound return-type, and is further complicated by the presence of assertions. 273 299 The prime example of a function with unbound return type is the type-safe version of C @malloc@: 274 300 \begin{cfa} 275 // size deduced from type, so no need to provide the size argument 276 forall( dtype T | sized( T ) ) T * malloc( void ); 277 \end{cfa} 278 Unbound return type can be problematic in resolver algorithm complexity because a single match of function call with unbound return type may create multiple candidates. In the worst case, consider a function declared to return any @otype@: 301 forall( dtype T | sized( T ) ) 302 T * malloc( void ) { return (T *)malloc( sizeof(T) ); } // call C malloc 303 int * i = malloc(); // type deduced from left-hand side $\Rightarrow$ no size argument or return cast 304 \end{cfa} 305 An unbound return-type is problematic in resolver complexity because a single match of a function call with an unbound return type may create multiple candidates. In the worst case, consider a function declared that returns any @otype@ (defined \VPageref{otype}): 279 306 \begin{cfa} 280 307 forall( otype T ) T anyObj( void ); 281 308 \end{cfa} 282 As the resolver attempts to satisfy the otype constraint on @T@, a single call to @anyObj()@ without the result type known creates at least as many candidates as the number of complete types currently in scope; with generic types it becomes even worse, for example, assuming a declaration of generic pair is available at that point: 309 As the resolver attempts to satisfy the otype constraint on @T@, a call to @anyObj()@ in an expression, without the result type known, creates at least as many candidates as the number of complete types currently in scope; with generic types it becomes even worse, \eg assuming a declaration of a generic @pair@ is available at that point: 283 310 \begin{cfa} 284 311 forall( otype T, otype U ) struct pair { T first; U second; }; 285 312 \end{cfa} 286 Then an @anyObj()@ call can result in arbitrarily complex types, such as @pair( pair( int, int ), pair( int, int ) )@, and the depth can grow indefinitely until the specified parameter depth limit, thus creating exponentially many candidates.
However, the expected types allowed by parent expressions are practically very few, so most of those interpretations are invalid; if the result type is never bound up to top level, by the semantic rules it is ambiguous if there are more than one valid bindings, and resolution can fail fast. It is therefore reasonable to delay resolving assertions on an unbound parameter in return type; however, with the current cost model, such behavior may further cause irregularities in candidate selection, such that the presence of assertions can change the preferred candidate, even when order of expression costs are supposed to stay the same. Detailed analysis of this issue will be presented later, in the correctness part.313 Then an @anyObj()@ call can result in arbitrarily complex types, such as @pair( pair( int, int ), pair( int, int ) )@, and the depth can grow indefinitely until a specified parameter-depth limit, thus creating exponentially many candidates. However, the expected types allowed by parent expressions are practically very few, so most of those interpretations are invalid; if the result type is never bound up to the top level, by the semantic rules it is ambiguous if there is more than one valid binding and resolution fails quickly. It is therefore reasonable to delay resolving assertions on an unbound parameter in a return type; however, with the current cost model, such behavior may further cause irregularities in candidate selection, such that the presence of assertions can change the preferred candidate, even when order of expression costs are supposed to stay the same. A detailed analysis of this issue is presented in \VRef{s:AnalysisTypeSystemCorrectness}. 287 314 288 315 … … 290 317 \label{s:TtypeResolutionInfiniteRecursion} 291 318 292 @ttype@ (``tuple type'') is a relatively new addition to the language that attempts to provide type-safe variadic argument semantics. Unlike regular @dtype@ parameters, @ttype@ is only valid in function parameter list, and may only appear once as the type of last parameter. At the call site, a @ttype@ parameter is bound to the tuple type of all remaining functioncall arguments.319 @ttype@ (``tuple type'') is a relatively new addition to the language that attempts to provide type-safe variadic argument semantics. Unlike regular @dtype@ parameters, @ttype@ is only valid in a function parameter-list, and may only appear once as the last parameter type. At the call site, a @ttype@ parameter is bound to the tuple type of all remaining function-call arguments. 293 320 294 321 There are two kinds of idiomatic @ttype@ usage: one is to provide flexible argument forwarding, similar to the variadic template in \CC (\lstinline[language=C++]|template<typename... 
args>|), as shown below in the implementation of @unique_ptr@ … … 298 325 T * data; 299 326 }; 300 forall( dtype T | sized( T ), ttype Args| { void ?{}( T &, Args ); })301 void ?{}( unique_ptr( T ) & this, Args args) {302 this.data = new( args );303 } 304 \end{cfa} 305 the other is to implement structural recursion in the first-rest manner:306 \begin{cfa} 307 forall( otype T, ttype Params| { void process( T ); void func( Params ); })327 forall( dtype T | sized( T ), @ttype Args@ | { void ?{}( T &, Args ); }) 328 void ?{}( unique_ptr( T ) & this, Args @args@ ) { 329 this.data = new( @args@ ); // forward constructor arguments to dynamic allocator 330 } 331 \end{cfa} 332 The other usage is to implement structural recursion in the first-rest pattern: 333 \begin{cfa} 334 forall( otype T, @ttype Params@ | { void process( T ); void func( Params ); }) 308 335 void func( T arg1, Params p ) { 309 336 process( arg1 ); 310 func( p ); 311 } 312 \end{cfa} 313 For the second use case, it is important that the number of parameters in the recursive call go down, since the call site must deduce all assertion candidates, and that is only possible if by just looking at argument types (and not their values), the recursion is known to be completed in a finite number of steps. 314 315 In recent experiments, however, some flaw in the type binding rules can lead to the first kind of @ttype@ use case produce an invalid candidate that the resolver enters an infinite loop. 316 317 This bug was discovered in an attempt to raise assertion recursive depth limit and one of the library program takes exponentially longer time to compile. The cause of the problem is identified to be the following set of functions. 318 File @memory.cfa@ contains 319 \begin{cfa} 320 #include "memory.hfa" 321 #include "stdlib.hfa" 322 \end{cfa} 323 where file @memory.hfa@ contains the @unique_ptr@ declaration above, and two other similar functions with @ttype@ parameter: 324 \begin{cfa} 325 forall( dtype T | sized( T ), ttype Args | { void ?{}( T &, Args ); }) { 337 func( @p@ ); // recursive call until base case of one argument 338 } 339 \end{cfa} 340 For the second use case, it is imperative the number of parameters in the recursive call goes down, since the call site must deduce all assertion candidates, and that is only possible if by observation of the argument types (and not their values), the recursion is known to be completed in a finite number of steps. 341 342 In recent experiments, however, a flaw in the type-binding rules can lead to the first kind of @ttype@ use case producing an invalid candidate and the resolver enters an infinite loop. 343 This bug was discovered in an attempt to raise the assertion recursive-depth limit and one of the library programs took exponentially longer to compile. 
The cause of the problem is the following set of functions: 344 \begin{cfa} 345 // unique_ptr declaration from above 346 347 forall( dtype T | sized( T ), ttype Args | { void ?{}( T &, Args ); } ) { // distribute forall clause 326 348 void ?{}( counter_data( T ) & this, Args args ); 327 349 void ?{}( counter_ptr( T ) & this, Args args ); 328 350 void ?{}( unique_ptr( T ) & this, Args args ); 329 351 } 330 \end{cfa} 331 File @stdlib.hfa@ contains 332 \begin{cfa} 352 333 353 forall( dtype T | sized( T ), ttype TT | { void ?{}( T &, TT ); } ) 334 T * new( TT p ) { return &(*malloc()){ p }; } 335 \end{cfa} 336 337 In the expression @(*malloc()){p}@, the type of object being constructed is yet unknown, since the return type information is not immediately provided. That caused every constructor to be searched, and while normally a bound @ttype@ cannot be unified with any free parameter, it is possible with another free @ttype@. Therefore in addition to the correct option provided by assertion, 3 wrong options are examined, each of which again requires the same assertion, for an unknown base type T and @ttype@ arguments, and that becomes an infinite loop, until the specified recursion limit and resolution is forced to fail. Moreover, during the recursion steps, number of candidates grows exponentially, since there are always 3 options at each step. 338 339 Unfortunately, @ttype@ to @ttype@ binding is necessary, to allow calling the function provided by assertion indirectly. 340 \begin{cfa} 341 forall( dtype T | sized( T ), ttype Args | { void ?{}( T &, Args ); }) 342 void ?{}( unique_ptr( T ) & this, Args args ) { this.data = (T * )new( args ); } 343 \end{cfa} 344 Here the constructor assertion is used for the @new( args )@ call. 354 T * new( TT p ) { return @&(*malloc()){ p };@ } 355 \end{cfa} 356 In the expression @(*malloc()){p}@, the type of the object being constructed is unknown, since the return-type information is not immediately available. That causes every constructor to be searched, and while normally a bound @ttype@ cannot be unified with any free parameter, it is possible with another free @ttype@. Therefore, in addition to the correct option provided by the assertion, 3 wrong options are examined, each of which again requires the same assertion, for an unknown base-type @T@ and @ttype@ argument, which becomes an infinite loop until the specified recursion limit is reached and resolution fails. Moreover, during the recursion steps, the number of candidates grows exponentially, since there are always 3 options at each step. 357 358 Unfortunately, @ttype@ to @ttype@ binding is necessary, to allow indirectly calling a function provided in an assertion. 359 \begin{cfa} 360 forall( dtype T | sized( T ), ttype Args | { @void ?{}( T &, Args );@ }) 361 void ?{}( unique_ptr( T ) & this, Args args ) { this.data = (T *)@new( args )@; } // constructor call 362 \end{cfa} 363 Here the constructor assertion is used by the @new( args )@ call to indirectly call the constructor on the allocated storage. 345 364 Therefore, it is hard, perhaps impossible, to solve this problem by tweaking the type binding rules. An assertion caching algorithm can help improve this case by detecting cycles in recursion. 346 365 347 Meanwhile, without the caching algorithm implemented, some changes in the \CFA source code are enough to eliminate this problem, at least in the current codebase.
Note that the issue only happens with an overloaded variadic function, which rarely appears in practice, since the idiomatic use cases are for argument forwarding and self-recursion. The only overloaded @ttype@ function so far discovered in all of \CFA standard library code is the constructor, and by utilizing the argument-dependent lookup process described in Section~\ref{s:UnboundReturnType}, adding a cast before constructor call gets rid of the issue. 348 \begin{cfa} 349 T * new( TT p ) { return &(*(T * )malloc()){ p }; } 366 Meanwhile, without a caching algorithm implemented, some changes in the \CFA source code are enough to eliminate this problem, at least in the current codebase. Note that the issue only happens with an overloaded variadic function, which rarely appears in practice, since the idiomatic use cases are for argument forwarding and self-recursion. The only overloaded @ttype@ function so far discovered in all of the \CFA standard library is the constructor, and by utilizing the argument-dependent lookup process described in \VRef{s:UnboundReturnType}, adding a cast before the constructor call removes the issue. 367 \begin{cfa} 368 T * new( TT p ) { return &(*@(T * )@malloc()){ p }; } 350 369 \end{cfa} 351 370 … … 353 372 \subsection{Reused assertions in nested generic type} 354 373 355 The following test of deeply nested dynamic generic type reveals that locally caching reused assertions is necessary, rather than just a resolver optimization, because recomputing assertions can result in bloated generated code size: 374 The following test of deeply nested, dynamic generic type reveals that locally caching reused assertions is necessary, rather than just a resolver optimization, because recomputing assertions can result in bloated generated code size: 356 375 \begin{cfa} 357 376 struct nil {}; … … 361 380 int main() { 362 381 #if N==0 363 nil x; 382 nil @x@; 364 383 #elif N==1 365 cons( size_t, nil ) x; 384 cons( size_t, nil ) @x@; 366 385 #elif N==2 367 cons( size_t, cons( size_t, nil ) ) x; 386 cons( size_t, cons( size_t, nil ) ) @x@; 368 387 #elif N==3 369 cons( size_t, cons( size_t, cons( size_t, nil ) ) ) x; 388 cons( size_t, cons( size_t, cons( size_t, nil ) ) ) @x@; 370 389 // similarly for N=4,5,6 371 390 #endif 372 391 } 373 392 \end{cfa} 374 At the declaration of @x@, it is implicitly initialized by generated constructor call, whose signature is given by 393 At the declaration of @x@, it is implicitly initialized by a generated constructor call, with signature: 375 394 \begin{cfa} 376 395 forall( otype L, otype R ) void ?{}( cons( L, R ) & ); 377 396 \end{cfa} 378 Note that the @otype@ constraint contains 4 assertions: 397 where the @otype@ constraint contains the 4 assertions:\label{otype} 379 398 \begin{cfa} 380 399 void ?{}( L & ); // default constructor … … 383 402 L & ?=?( L &, L & ); // assignment 384 403 \end{cfa} 385 Now since the right hand side of outermost cons is again a cons, recursive assertions are required. When the compiler cannot cache and reuse already resolved assertions, it becomes a problem, as each of those 4 pending assertions again asks for 4 more assertions one level below. Without any caching, number of resolved assertions grows exponentially, while that is obviously unnecessary since there are only $n+1$ different types involved. Even worse, this causes exponentially many wrapper functions generated later at the codegen step, and results in huge compiled binary.
386 387 \begin{table}[h] 405 \begin{table}[htb] 406 \centering 388 407 \caption{Compilation results of nested cons test} 408 \label{t:NestedConsTest} 389 409 \begin{tabular}{|r|r|r|} 390 410 \hline … … 402 422 \end{table} 403 423 404 As the local functions are implemented by emitting executable code on the stack~\cite{gcc-nested-func}, it eventually means that compiled code also has exponential run time. This problem has evident practical implications, as nested collection types are frequently used in real production code. 405 424 Now, since the right-hand side of the outermost cons is again a cons, recursive assertions are required. \VRef[Table]{t:NestedConsTest} shows that, when the compiler does not cache and reuse already-resolved assertions, a problem arises: each of these 4 pending assertions again asks for 4 more assertions one level below. Without caching, the number of resolved assertions grows exponentially, which is unnecessary since there are only $n+1$ different types involved. Even worse, this problem causes exponentially many wrapper functions to be generated at the backend, resulting in a huge binary. As the local functions are implemented by emitting executable code on the stack~\cite{gcc-nested-func}, it means that the compiled code also has exponential run time. This problem has practical implications, as nested collection types are frequently used in real production code. 406 425 407 426 \section{Analysis of type system correctness} 427 \label{s:AnalysisTypeSystemCorrectness} 408 428 409 429 In Moss' thesis~\cite[\S~4.1.2,~p.~45]{Moss19}, the author presents the following example: … … 422 442 From the set of candidates whose parameter and argument types have been unified and whose assertions have been satisfied, those whose sub-expression interpretations have the smallest total cost of conversion are selected ... The total cost of conversion for each of these candidates is then calculated based on the implicit conversions and polymorphism involved in adapting the types of the sub-expression interpretations to the formal parameter types. 423 443 \end{quote} 424 With this model, the algorithm picks @g1@ in resolving the @f( g( 42 ) )@ call, which seems to be undesirable. 425 426 There are further evidence that shows the Bilson model is fundamentally incorrect, following the discussion of unbound return type in Section~\ref{s:UnboundReturnType}. By the conversion cost specification, a binding from a polymorphic type parameter to a concrete type incurs a polymorphic cost of 1. It remains unspecified \emph{when} the type parameters should become bound. When the parameterized types appear in the function parameters, they can be deduced from the argument type, and there is no ambiguity. In the unbound return case, however, the binding may happen at any stage in expression resolution, therefore it is impossible to define a unique local conversion cost. Note that type binding happens exactly once per parameter in resolving the entire expression, so the global binding cost is unambiguously 1. 427 428 As per the current compiler implementation, it does have a notable inconsistency in handling such case. For any unbound parameter that does \emph{not} come with an associated assertion, it remains unbound to the parent expression; for those that does however, they are immediately bound in the assertion resolution step, and concrete result types are used in the parent expressions. 429 444 With this model, the algorithm picks @g1@ in resolving the @f( g( 42 ) )@ call, which is undesirable.
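Moss' original declarations are elided in this changeset, but the failure shape can be sketched with hypothetical declarations, where the locally cheapest interpretation of the inner call forces a strictly worse conversion in the outer call:
\begin{cfa}
void f( int );
long g( int );  // g1: exact match for the argument 42, local cost 0
int g( long );  // g2: requires a safe int->long conversion, local cost 1
f( g( 42 ) );   // selecting g1 locally forces an unsafe long->int conversion at f;
                // globally, g2 plus an exact match at f is strictly cheaper
\end{cfa}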
445 446 There is further evidence that shows the Bilson model is fundamentally incorrect, following the discussion of unbound return type in \VRef{s:UnboundReturnType}. By the conversion-cost specification, a binding from a polymorphic type-parameter to a concrete type incurs a polymorphic cost of 1. It remains unspecified \emph{when} the type parameters should become bound. When the parameterized types appear in function parameters, they can be deduced from the argument type, and there is no ambiguity. In the unbound return case, however, the binding may happen at any stage in expression resolution, therefore it is impossible to define a unique local conversion cost. Note that type binding happens exactly once per parameter in resolving the entire expression, so the global binding cost is unambiguously 1. 447 448 In the current compiler implementation, there is a notable inconsistency in handling this case. For any unbound parameter that does \emph{not} come with an associated assertion, it remains unbound to the parent expression; for those that do, however, they are immediately bound in the assertion resolution step, and concrete result types are used in the parent expressions. 430 449 Consider the following example: 431 450 … … 433 452 void h( int * ); 434 453 \end{cfa} 435 The expression @h( f() )@ eventually has a total cost of 1 from binding (T: int), but in the eager resolution model, the cost of 1 may occur either at call to @f@ or at call to @h@, and with the assertion resolution triggering a binding, the local cost of @f()@ is (0 poly, 0 spec) with no assertions, but (1 poly, -1 spec) with an assertion: 454 The expression @h( f() )@ eventually has a total cost of 1 from binding (T: int), but in the eager-resolution model, the cost of 1 may occur either at the call to @f@ or at the call to @h@, and with the assertion resolution triggering a binding, the local cost of @f()@ is (0 poly, 0 spec) with no assertions, but (1 poly, -1 spec) with an assertion: 436 455 \begin{cfa} 437 456 forall( dtype T | @{ void g( T * ); }@ ) T * f( void ); 438 457 void g( int * ); 439 458 void h( int * ); 440 459 \end{cfa} 441 and that contradicts the principle that adding assertions should make expression cost lower. Furthermore, the time at which type binding and assertion resolution happens is an implementation detail of the compiler, but not a part of language definition. That means two compliant \CFA compilers, one performing immediate assertion resolution at each step, and one delaying assertion resolution on unbound types, can produce different expression costs and therefore different candidate selection, making the language rule itself partially undefined and therefore unsound. By the above reasoning, the updated cost model using global sum of costs should be accepted as the standard. It also allows the compiler to freely choose when to resolve assertions, as the sum of total costs is independent of that choice; more optimizations regarding assertion resolution can also be implemented. 460 and that contradicts the principle that adding assertions should make expression cost lower. Furthermore, the time at which type binding and assertion resolution happens is an implementation detail of the compiler, not part of the language definition.
That means two compliant \CFA compilers, one performing immediate assertion resolution at each step, and one delaying assertion resolution on unbound types, can produce different expression costs and therefore different candidate selection, making the language rule itself partially undefined, and therefore, unsound. By the above reasoning, the updated cost model using global sum of costs should be accepted as the standard. It also allows the compiler to freely choose when to resolve assertions, as the sum of total costs is independent of that choice; more optimizations regarding assertion resolution can also be implemented. 442 461 443 462 444 463 \section{Timing results} 445 464 446 For the timing results presented here, the \CFA compiler is built with gcc 9.3.0, and tested on a server machine running Ubuntu 20.04, 64GB RAM and 32-core 2.2 GHz CPU, results reported by the time command, and using only 8 cores in parallel such that the time is close to the case with 100% CPU utilization on a single thread. 447 448 On the most recent build, the \CFA standard library (~1.3 MB of source code) compiles in 4 minutes 47 seconds total processor time (single thread equivalent), with the slowest file taking 13 seconds. The test suite (178 test cases, ~2.2MB of source code) completes within 25 minutes total processor time,\footnote{Including a few runtime tests; total time spent in compilation is approximately 21 minutes.} with the slowest file taking 23 seconds. In contrast, the library build on old compiler takes 85 minutes total, 5 minutes for the slowest file. Full test suite takes too long with old compiler build and is therefore not run, but the slowest test cases take approximately 5 minutes. Overall, the most recent build compared to old build in April 2020, before the project started, is consistently faster by a factor of 20. 449 450 Additionally, 6 selected \CFA source files with distinct features from library and test suite are used to test compiler performance after each of the optimizations are implemented. Test files are from the most recent build and run through C preprocessor to eliminate the factor of header file changes. The selected tests are: 451 \begin{itemize} 452 \item 453 @lib/fstream@ (112 KB)\footnote{File sizes are after preprocessing, with no line information (\lstinline|gcc -E -P|).}: implementation of I/O library 465 For the timing results presented here, the \CFA compiler is built with gcc 9.3.0, and tested on a server machine running Ubuntu 20.04, 64GB RAM and 32-core 2.2 GHz CPU. 466 Timing is reported by the @time@ command and an experiment is run using 8 cores, where each core is at 100\% CPU utilization. 467 468 On the most recent build, the \CFA standard library ($\approx$1.3 MB of source code) compiles in 4 minutes 47 seconds total processor time (single thread equivalent), with the slowest file taking 13 seconds. The test suite (178 test cases, $\approx$2.2 MB of source code) completes within 25 minutes total processor time, 469 % PAB: I do not understand this footnote. 470 %\footnote{Including a few runtime tests; total time spent in compilation is approximately 21 minutes.} 471 with the slowest file taking 23 seconds. In contrast, the library build with the old compiler takes 85 minutes total, 5 minutes for the slowest file. The full test-suite takes too long with the old compiler build and is therefore not run, but the slowest test cases take approximately 5 minutes. Overall, the most recent build compared to an old build is consistently faster by a factor of 20.
472 473 Additionally, 6 selected \CFA source files with distinct features from the library and test suite are used to illustrate the compiler performance change after each of the implemented optimizations. Test files are from the most recent build and run through the C preprocessor to expand header files and perform macro expansion, but with no line-number information (@gcc -E -P@). 474 \VRef[Table]{t:SelectedFileByCompilerBuild} shows the selected tests: 475 \begin{itemize} 476 \item 477 @lib/fstream@ (112 KB): implementation of I/O library 454 478 \item 455 479 @lib/mutex@ (166 KB): implementation of concurrency primitive … … 459 483 @lib/stdlib@ (64 KB): type-safe wrapper to @void *@-based C standard library functions 460 484 \item 461 @test/ISO2@ (55 KB): application of I/O library 485 486 @test/io2@ (55 KB): application of I/O library 462 487 \item 463 488 @test/thread@ (188 KB): application of threading library 464 489 \end{itemize} 465 466 The \CFA compiler builds are picked from git commit history that passed the test suite, and implement the optimizations incrementally: 467 \begin{itemize} 468 \item 469 \#0 is the first working build of new AST data structure 490 versus \CFA compiler builds picked from the git commit history that implement the optimizations incrementally: 491 \begin{itemize} 492 \item 493 old resolver 494 \item 495 \#0 is the first working build of the new AST data structure 470 496 \item 471 497 \#1 implements special symbol table and argument-dependent lookup 472 498 \item 473 \#2 implements late assertion satisfaction 474 \item 475 \#3 implements revised function type representation 476 \item 477 \#4 skips pruning on expressions with function type (most recent build) 478 \end{itemize} 479 The old resolver with no memory sharing and none of the optimizations above is also tested. 480 \begin{table} 498 \#2 implements late assertion-satisfaction 499 \item 500 \#3 implements revised function-type representation 501 \item 502 \#4 skips pruning on expressions for function types (most recent build) 503 \end{itemize} 504 Reading left to right for a test shows the benefit of each optimization on the cost of compilation. 505 506 \begin{table}[htb] 507 \centering 481 508 \caption{Compile time of selected files by compiler build, in seconds} 509 \label{t:SelectedFileByCompilerBuild} 482 510 \begin{tabular}{|l|r|r|r|r|r|r|} 483 511 \hline … … 502 530 \end{table} 503 531 504 532 \section{Conclusion} 505 533 506 534 507 Over the course of 8 months of active research and development in \CFA type system and compiler algorithm, performance of the reference \CFA compiler, cfa-cc, has been greatly improved, allowing mid-sized \CFA programs to be compiled and built reasonably fast. As there are also ongoing efforts in the team on building a standard library, evaluating the runtime performance, and attempting to incorporate \CFA with existing software written in C, this project is especially meaningful for practical purposes. 508 509 Analysis conducted in the project were based significantly on heuristics and practical evidence, as the theoretical bounds and average cases for the expression resolution problem differ. This approach was difficult at start to follow, with an unacceptably slow compiler, since running the program through debugger and validation tools (\eg @gdb@, @valgrind@) adds another order of magnitude to run time, which was already in minutes. However, near the end of the project, many significant improvements have already been made and new optimizations can be tested immediately.
The positive feedback in development cycle benefits the \CFA team as a whole, more than just for the compiler optimizations. 510 511 Some potential issues of the language that may happen frequently in practice have been identified. Due to the time constraint and complex nature of these problems, a handful of them remain unsolved, but some constructive proposals are made. Notably, introducing a local assertion cache in the resolver is a common solution for a few remaining problems, so that should be the focus of work soon. 512 513 The \CFA team are planning on a public alpha release of the language as the compiler performance becomes promising, and other parts of the system, such as a standard library, are also being enhanced. Ideally, the remaining problems should be resolved before release, and the solutions will also be integral to drafting a formal specification. 534 Over the course of 8 months of active research and development of the \CFA type system and compiler algorithms, performance of the reference \CFA compiler, cfa-cc, has been greatly improved. Now, mid-sized \CFA programs are compiled reasonably fast. Currently, there are ongoing efforts by the \CFA team to augment the standard library and evaluate its runtime performance, and incorporate \CFA with existing software written in C; therefore this project is especially meaningful for these practical purposes. 535 536 Accomplishing this work was difficult. Analysis conducted in the project is based significantly on heuristics and practical evidence, as the theoretical bounds and average cases for the expression resolution problem differ. As well, the slowness of the initial compiler made attempts to understand why and where problems exist extremely difficult because both debugging and validation tools (\eg @gdb@, @valgrind@, @perf@) further slowed down compilation time. However, by the end of the project, I had found and fixed several significant problems and new optimizations are easier to introduce and test. The reduction in the development cycle benefits the \CFA team as a whole. 537 538 Some potential issues of the language, which happen frequently in practice, have been identified. Due to the time constraint and complex nature of these problems, a handful of them remain unsolved, but some constructive proposals are made. Notably, introducing a local assertion cache in the resolver is a reasonable solution for a few remaining problems, so that should be the focus of future work. 539 540 The \CFA team are planning on a public alpha release of the language as the compiler performance, given my recent improvements, is now usable. Other parts of the system, such as the standard library, have made significant gains due to the speed-up in the development cycle. Ideally, the remaining problems should be resolved before release, and the solutions will also be integral to drafting a formal specification. 514 541 515 542 \addcontentsline{toc}{section}{\refname} -
doc/theses/fangren_yu_COOP_S20/Report.tex
rb6a8b31 rd95969a 17 17 \usepackage[usenames]{color} 18 18 \input{common} % common CFA document macros 19 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true, pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}19 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref} 20 20 \usepackage{breakurl} 21 21 \urlstyle{sf}