Changeset 95b3a9c for doc/theses
- Timestamp:
- Feb 17, 2021, 12:45:36 PM (5 years ago)
- Branches:
- ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum, stuck-waitfor-destruct
- Children:
- e7c077a
- Parents:
- 5e99a9a (diff), 9fb1367 (diff)
Note: this is a merge changeset; the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
- Location:
- doc/theses
- Files:
- 3 added
- 17 edited
- andrew_beach_MMath/existing.tex (modified) (8 diffs)
- andrew_beach_MMath/features.tex (modified) (11 diffs)
- andrew_beach_MMath/future.tex (modified) (2 diffs)
- andrew_beach_MMath/implement.tex (modified) (1 diff)
- andrew_beach_MMath/thesis-frontpgs.tex (modified) (7 diffs)
- andrew_beach_MMath/thesis.tex (modified) (5 diffs)
- andrew_beach_MMath/unwinding.tex (modified) (2 diffs)
- andrew_beach_MMath/uw-ethesis-frontpgs.tex (modified) (6 diffs)
- andrew_beach_MMath/uw-ethesis.tex (modified) (10 diffs)
- fangren_yu_COOP_F20/Report.tex (modified) (4 diffs)
- thierry_delisle_PhD/thesis/.gitignore (added)
- thierry_delisle_PhD/thesis/Makefile (modified) (6 diffs)
- thierry_delisle_PhD/thesis/fig/io_uring.fig (added)
- thierry_delisle_PhD/thesis/fig/pivot_ring.fig (added)
- thierry_delisle_PhD/thesis/local.bib (modified) (3 diffs)
- thierry_delisle_PhD/thesis/text/core.tex (modified) (2 diffs)
- thierry_delisle_PhD/thesis/text/intro.tex (modified) (1 diff)
- thierry_delisle_PhD/thesis/text/io.tex (modified) (3 diffs)
- thierry_delisle_PhD/thesis/text/runtime.tex (modified) (3 diffs)
- thierry_delisle_PhD/thesis/thesis.tex (modified) (12 diffs)
doc/theses/andrew_beach_MMath/existing.tex
r5e99a9a r95b3a9c 1 \chapter{\texorpdfstring{\CFA Existing Features}{Cforall Existing Features}} 1 \chapter{\CFA Existing Features} 2 2 3 3 \CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with … … 12 12 obvious to the reader. 13 13 14 \section{\texorpdfstring{Overloading and \lstinline|extern|}{Overloading and extern}} 14 \section{Overloading and \lstinline{extern}} 15 15 \CFA has extensive overloading, allowing multiple definitions of the same name 16 16 to be defined.~\cite{Moss18} … … 42 42 43 43 \section{Reference Type} 44 \CFA adds a rebindable reference type to C, but more expressive than the \CC 44 \CFA adds a rebindable reference type to C, but more expressive than the \Cpp 45 45 reference. Multi-level references are allowed and act like auto-dereferenced 46 46 pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA … … 59 59 60 60 Both constructors and destructors are operators, which means they are just 61 functions with special operator names rather than type names in \CC. The 61 functions with special operator names rather than type names in \Cpp. The 62 62 special operator names may be used to call the functions explicitly (not 63 allowed in \CC for constructors). 63 allowed in \Cpp for constructors). 64 64 65 65 In general, operator names in \CFA are constructed by bracketing an operator … … 88 88 matching overloaded destructor @void ^?{}(T &);@ is called. Without explicit 89 89 definition, \CFA creates a default and copy constructor, destructor and 90 assignment (like \CC). It is possible to define constructors/destructors for 90 assignment (like \Cpp). It is possible to define constructors/destructors for 91 91 basic and existing types. 92 92 … … 94 94 \CFA uses parametric polymorphism to create functions and types that are 95 95 defined over multiple types. \CFA polymorphic declarations serve the same role 96 as \CC templates or Java generics. 
The ``parametric'' means the polymorphism is 96 as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is 97 97 accomplished by passing argument operations to associate \emph{parameters} at 98 98 the call site, and these parameters are used in the function to differentiate … … 134 134 135 135 Note, a function named @do_once@ is not required in the scope of @do_twice@ to 136 compile it, unlike \CC template expansion. Furthermore, call-site inferencing 136 compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing 137 137 allows local replacement of the most specific parametric functions needed for a 138 138 call. … … 178 178 } 179 179 \end{cfa} 180 The generic type @node(T)@ is an example of a polymorphic-type usage. Like \CC 180 The generic type @node(T)@ is an example of a polymorphic-type usage. Like \Cpp 181 181 templates usage, a polymorphic-type usage must specify a type parameter. 182 182
doc/theses/andrew_beach_MMath/features.tex
r5e99a9a r95b3a9c 5 5 6 6 \section{Virtuals} 7 Virtual types and casts are not part of the exception system nor are they 8 required for an exception system. But an object-oriented style hierarchy is a 9 great way of organizing exceptions so a minimal virtual system has been added 10 to \CFA. 11 12 The pattern of a simple hierarchy, borrowed from object-oriented 13 programming, was chosen for several reasons. 14 The first is that it allows new exceptions to be added in user code 15 and in libraries independently of each other. Another is it allows for 16 different levels of exception grouping (all exceptions, all IO exceptions or 17 a particular IO exception). It also provides a simple way of passing 18 data back and forth across the throw. 19 7 20 Virtual types and casts are not required for a basic exception-system but are 8 21 useful for advanced exception features. However, \CFA is not object-oriented so 9 there is no obvious concept of virtuals. Hence, to create advanced exception 10 features for this work, I needed to designed and implemented a virtual-like 22 there is no obvious concept of virtuals. Hence, to create advanced exception 23 features for this work, I needed to design and implement a virtual-like 11 24 system for \CFA. 12 25 26 % NOTE: Maybe we should put less of the rationale here. 13 27 Object-oriented languages often organize exceptions into a simple hierarchy, 14 28 \eg Java. … … 30 44 \end{center} 31 45 The hierarchy provides the ability to handle an exception at different degrees 32 of specificity (left to right). Hence, it is possible to catch a more general 46 of specificity (left to right). Hence, it is possible to catch a more general 33 47 exception-type in higher-level code where the implementation details are 34 48 unknown, which reduces tight coupling to the lower-level implementation. … … 61 75 While much of the virtual infrastructure is created, it is currently only used 62 76 internally for exception handling. 
The only user-level feature is the virtual 63 cast, which is the same as the \CC \lstinline[language=C++]|dynamic_cast|. 77 cast, which is the same as the \Cpp \lstinline[language=C++]|dynamic_cast|. 64 78 \label{p:VirtualCast} 65 79 \begin{cfa} 66 80 (virtual TYPE)EXPRESSION 67 81 \end{cfa} 68 Note, the syntax and semantics match a C-cast, rather than the unusual \CC 69 syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be a 70 pointer to a virtual type. The cast dynamically checks if the @EXPRESSION@ type 71 is the same or a subtype of @TYPE@, and if true, returns a pointer to the 82 Note, the syntax and semantics match a C-cast, rather than the function-like 83 \Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be 84 a pointer to a virtual type. 85 The cast dynamically checks if the @EXPRESSION@ type is the same or a subtype 86 of @TYPE@, and if true, returns a pointer to the 72 87 @EXPRESSION@ object, otherwise it returns @0p@ (null pointer). 73 88 … … 78 93 79 94 Exceptions are defined by the trait system; there is a series of traits, and 80 if a type satisfies them, then it can be used as an exception. The following 95 if a type satisfies them, then it can be used as an exception. The following 81 96 is the base trait all exceptions need to match. 82 97 \begin{cfa} 83 98 trait is_exception(exceptT &, virtualT &) { 84 virtualT const & @get_exception_vtable@(exceptT *); 99 virtualT const & get_exception_vtable(exceptT *); 85 100 }; 86 101 \end{cfa} 87 The function takes any pointer, including the null pointer, and returns a 88 reference to the virtual-table object. Defining this function also establishes 89 the virtual type and a virtual-table pair to the \CFA type-resolver and 90 promises @exceptT@ is a virtual type and a child of the base exception-type. 
91 92 \PAB{I do not understand this paragraph.} 93 One odd thing about @get_exception_vtable@ is that it should always be a 94 constant function, returning the same value regardless of its argument. A 95 pointer or reference to the virtual table instance could be used instead, 96 however using a function has some ease of implementation advantages and allows 97 for easier disambiguation because the virtual type name (or the address of an 98 instance that is in scope) can be used instead of the mangled virtual table 99 name. Also note the use of the word ``promise'' in the trait 100 description. Currently, \CFA cannot check to see if either @exceptT@ or 101 @virtualT@ match the layout requirements. This is considered part of 102 @get_exception_vtable@'s correct implementation. 103 104 \section{Raise} 105 \CFA provides two kinds of exception raise: termination 106 \see{\VRef{s:Termination}} and resumption \see{\VRef{s:Resumption}}, which are 107 specified with the following traits. 102 The trait is defined over two types, the exception type and the virtual table 103 type. This should be one-to-one: each exception type has only one virtual 104 table type and vice versa. The only assertion in the trait is 105 @get_exception_vtable@, which takes a pointer to the exception type and 106 returns a reference to the virtual table type instance. 107 108 The function @get_exception_vtable@ is actually a constant function. 109 Regardless of the value passed in (including the null pointer) it should 110 return a reference to the virtual table instance for that type. 111 The reason it is a function instead of a constant is that it makes type 112 annotations easier to write as you can use the exception type instead of the 113 virtual table type, which usually has a mangled name. 114 % Also \CFA's trait system handles functions better than constants and doing 115 % it this way reduces the amount of boilerplate we need. 
116 117 % I did have a note about how it is the programmer's responsibility to make 118 % sure the function is implemented correctly. But this is true of every 119 % similar system I know of (except Agda's I guess) so I took it out. 120 121 There are two more traits for exceptions: @is_termination_exception@ and 122 @is_resumption_exception@. They are defined as follows: 123 108 124 \begin{cfa} 109 125 trait is_termination_exception( 110 126 exceptT &, virtualT & | is_exception(exceptT, virtualT)) { 111 void @defaultTerminationHandler@(exceptT &); 127 void defaultTerminationHandler(exceptT &); 112 128 }; 113 \end{cfa} 114 The function is required to allow a termination raise, but is only called if a 115 termination raise does not find an appropriate handler. 116 117 Allowing a resumption raise is similar. 118 \begin{cfa} 129 119 130 trait is_resumption_exception( 120 131 exceptT &, virtualT & | is_exception(exceptT, virtualT)) { 121 void @defaultResumptionHandler@(exceptT &); 132 void defaultResumptionHandler(exceptT &); 122 133 }; 123 134 \end{cfa} 124 The function is required to allow a resumption raise, but is only called if a 125 resumption raise does not find an appropriate handler. 126 127 Finally there are three convenience macros for referring to these traits: 128 @IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@. Each 129 takes the virtual type's name, and for polymorphic types only, the 130 parenthesized list of polymorphic arguments. These macros do the name mangling 131 to get the virtual-table name and provide the arguments to both sides 132 \PAB{What's a ``side''?} 135 136 In other words, they make sure that a given type and virtual type form an 137 exception and define one of the two default handlers. These default handlers 138 are used in the main exception handling operations \see{Exception Handling} 139 and their use will be detailed there. 140 141 However, all three of these traits can be tricky to use directly. 
142 There is a bit of repetition required but 143 the largest issue is that the virtual table type is mangled and not in a user 144 facing way. So there are three macros that can be used to wrap these traits 145 when you need to refer to the names: 146 @IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@. 147 148 All take one or two arguments. The first argument is the name of the 149 exception type. Its unmangled and mangled form are passed to the trait. 150 The second (optional) argument is a parenthesized list of polymorphic 151 arguments. This argument should only be used with polymorphic exceptions and the 152 list will be passed to both types. 153 In the current set-up the base name and the polymorphic arguments have to 154 match so these macros can be used without losing flexibility. 155 156 For example, consider a function that is polymorphic over types that have a 157 defined arithmetic exception: 158 \begin{cfa} 159 forall(Num | IS_EXCEPTION(Arithmetic, (Num))) 160 void some_math_function(Num & left, Num & right); 161 \end{cfa} 162 163 \section{Exception Handling} 164 \CFA provides two kinds of exception handling, termination and resumption. 165 These twin operations are the core of the exception handling mechanism and 166 are the reason for the features of exceptions. 167 This section will cover the general patterns shared by the two operations and 168 then go on to cover the details of each individual operation. 169 170 Both operations follow the same set of steps to do their operation. They both 171 start with the user performing a throw on an exception. 172 Then there is the search for a handler; if one is found then the exception 173 is caught and the handler is run. After that control returns to normal 174 execution. 175 176 If the search fails a default handler is run and then control 177 returns to normal execution immediately. That is where the default handlers 178 @defaultTerminationHandler@ and @defaultResumptionHandler@ are used. 
133 179 134 180 \subsection{Termination} 135 181 \label{s:Termination} 136 182 137 Termination raise, called ``throw'', is familiar and used in most programming 138 languages with exception handling. The semantics of termination is: search the 139 stack for a matching handler, unwind the stack frames to the matching handler, 140 execute the handler, and continue execution after the handler. Termination is 141 used when execution \emph{cannot} return to the throw. To continue execution, 142 the program must \emph{recover} in the handler from the failed (unwound) 143 execution at the raise to safely proceed after the handler. 144 145 A termination raise is started with the @throw@ statement: 183 Termination handling is the more familiar kind and is used in most programming 184 languages with exception handling. 185 It is a dynamic, non-local goto. If a throw is successful then the stack will 186 be unwound and control will (usually) continue in a different function on 187 the call stack. It is commonly used when an error has occurred and recovery 188 is impossible in the current function. 189 190 % (usually) Control can continue in the current function but then a different 191 % control flow construct should be used. 192 193 A termination throw is started with the @throw@ statement: 146 194 \begin{cfa} 147 195 throw EXPRESSION; 148 196 \end{cfa} 149 The expression must return a termination-exception reference, where the 150 termination exception has a type with a @void defaultTerminationHandler(T &)@ 151 (default handler) defined. The handler is found at the call site using \CFA's 152 trait system and passed into the exception system along with the exception 153 itself. 154 155 At runtime, a representation of the exception type and an instance of the 156 exception type is copied into managed memory (heap) to ensure it remains in 157 scope during unwinding. 
It is the user's responsibility to ensure the original 158 exception object at the throw is freed when it goes out of scope. Being 159 allocated on the stack is sufficient for this. 160 161 Then the exception system searches the stack starting from the throw and 162 proceeding towards the base of the stack, from callee to caller. At each stack 163 frame, a check is made for termination handlers defined by the @catch@ clauses 164 of a @try@ statement. 197 The expression must return a reference to a termination exception, where the 198 termination exception is any type that satisfies @is_termination_exception@ 199 at the call site. 200 Through \CFA's trait system the functions in the traits are passed into the 201 throw code. A new @defaultTerminationHandler@ can be defined in any scope to 202 change the throw's behavior (see below). 203 204 The throw will copy the provided exception into managed memory. It is the 205 user's responsibility to ensure the original exception is cleaned up if the 206 stack is unwound (allocating it on the stack should be sufficient). 207 208 Then the exception system searches the stack using the copied exception. 209 It starts from the throw and proceeds to the base of the stack, 210 from callee to caller. 211 At each stack frame, a check is made for termination handlers defined by the 212 @catch@ clauses of a @try@ statement. 165 213 \begin{cfa} 166 214 try { 167 215 GUARDED_BLOCK 168 } @catch (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // termination handler 1 216 } catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) { 169 217 HANDLER_BLOCK$\(_1\)$ 170 } @catch (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // termination handler 2 218 } catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) { 171 219 HANDLER_BLOCK$\(_2\)$ 172 220 } 173 221 \end{cfa} 174 The statements in the @GUARDED_BLOCK@ are executed. 
If those statements, or any 175 functions invoked from those statements, throws an exception, and the exception 222 When viewed on its own a try statement will simply execute the statements in 223 @GUARDED_BLOCK@ and when those are finished the try statement finishes. 224 225 However, while the guarded statements are being executed, including any 226 functions they invoke, all the handlers following the try block are now 227 or any functions invoked from those 228 statements, throws an exception, and the exception 176 229 is not handled by a try statement further up the stack, the termination 177 230 handlers are searched for a matching exception type from top to bottom. … … 179 232 Exception matching checks the representation of the thrown exception-type is 180 233 the same or a descendant type of the exception types in the handler clauses. If 181 there is a match, a pointer to the exception object created at the throw is 182 bound to @NAME@ and the statements in the associated @HANDLER_BLOCK@ are 183 executed. If control reaches the end of the handler, the exception is freed, 184 and control continues after the try statement. 185 186 The default handler visible at the throw statement is used if no matching 187 termination handler is found after the entire stack is searched. At that point, 188 the default handler is called with a reference to the exception object 189 generated at the throw. If the default handler returns, the system default 190 action is executed, which often terminates the program. This feature allows 191 each exception type to define its own action, such as printing an informative 192 error message, when an exception is not handled in the program. 234 it is the same as or a descendant of @EXCEPTION_TYPE@$_i$ then @NAME@$_i$ is 235 bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$ 236 are executed. If control reaches the end of the handler, the exception is 237 freed and control continues after the try statement. 
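The matching rule above (a handler matches the thrown type or any descendant, and the clauses of one try statement are tried from top to bottom) can be sketched with a rough analogue in \Cpp rather than \CFA. The types `io_error` and `file_not_found` and the three functions are hypothetical names chosen for illustration and do not appear in the thesis; note also that \Cpp binds the exception by reference where \CFA binds a pointer.

```cpp
#include <cassert>
#include <string>

// Hypothetical two-level exception hierarchy (illustrative only).
struct io_error { virtual ~io_error() = default; };
struct file_not_found : io_error {};

// The throw happens in a callee, so the handler search crosses a stack frame.
void open_file() { throw file_not_found{}; }

// Handlers are tried top to bottom; the first matching clause wins.
std::string run_specific() {
    try {
        open_file();
        return "no exception";
    } catch (const file_not_found &) {
        return "file_not_found handler";
    } catch (const io_error &) {
        return "io_error handler";
    }
}

// A handler for the parent type also matches the descendant type.
std::string run_general() {
    try {
        open_file();
        return "no exception";
    } catch (const io_error &) {
        return "io_error handler";
    }
}
```

Calling `run_specific()` selects the first clause, while `run_general()` shows the same exception being caught through its parent type, which is the kind of grouping the hierarchy is meant to allow.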
238 239 If no handler is found during the search then the default handler is run. 240 Through \CFA's trait system the best match at the throw site will be used. 241 This function is run and is passed the copied exception. After the default 242 handler is run, control continues after the throw statement. 243 244 There is a global @defaultTerminationHandler@ that cancels the current stack 245 with the copied exception. However, it is generic over all exception types so 246 new default handlers can be defined for different exception types and so 247 different exception types can have different default handlers. 193 248 194 249 \subsection{Resumption} 195 250 \label{s:Resumption} 196 251 197 Resumption raise, called ``resume'', is as old as termination 198 raise~\cite{Goodenough75} but is less popular. In many ways, resumption is 199 simpler and easier to understand, as it is simply a dynamic call (as in 200 Lisp). The semantics of resumption is: search the stack for a matching handler, 201 execute the handler, and continue execution after the resume. Notice, the stack 202 cannot be unwound because execution returns to the raise point. Resumption is 203 used when execution \emph{can} return to the resume. To continue 204 execution, the program must \emph{correct} in the handler for the failed 205 execution at the raise so execution can safely continue after the resume. 252 Resumption exception handling is a less common form than termination but is 253 just as old~\cite{Goodenough75} and is in some sense simpler. 254 It is a dynamic, non-local function call. If the throw is successful a 255 closure will be taken from up the stack and executed, after which the throwing 256 function will continue executing. 257 These are most often used when an error occurred and if the error is repaired 258 then the function can continue. 
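The description of resumption as a dynamic, non-local function call can be made concrete with a small sketch. \Cpp has no resumption, so the code below emulates it with an explicit stack of closures; `handler_stack`, `throw_resume`, `parse_or_repair` and `demo` are all hypothetical names, and this is not how the \CFA runtime is implemented. It only illustrates the control flow: the handler runs on top of the raising frame, nothing is unwound, and execution continues after the raise.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the catchResume clauses currently on the stack.
// A handler returns true if it handled (repaired) the problem.
using Handler = std::function<bool(int &)>;
static std::vector<Handler> handler_stack;

// Search from the most recently installed handler towards the oldest,
// i.e. from callee to caller. The handler is an ordinary call, so no
// stack frames are unwound.
bool throw_resume(int &value) {
    for (auto it = handler_stack.rbegin(); it != handler_stack.rend(); ++it) {
        if ((*it)(value)) return true;
    }
    return false; // a default handler would be run at this point
}

int parse_or_repair(int input) {
    if (input < 0) throw_resume(input); // handler may repair input in place
    return input * 2;                   // execution continues after the raise
}

int demo() {
    // Install a handler that repairs negative inputs (the "catchResume").
    handler_stack.push_back([](int &v) { v = 0; return true; });
    int result = parse_or_repair(-5);
    handler_stack.pop_back();           // leaving the "try" removes the handler
    return result;
}
```

In `demo()`, the raise inside `parse_or_repair` calls the installed closure, which repairs the value, and the function then finishes normally instead of being abandoned.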
206 259 207 260 A resumption raise is started with the @throwResume@ statement: … … 210 263 \end{cfa} 211 264 The semantics of the @throwResume@ statement are like the @throw@, but the 212 expression has a type with a @void defaultResumptionHandler(T &)@ (default 213 handler) defined, where the handler is found at the call site by the type 214 system. At runtime, a representation of the exception type and an instance of 215 the exception type is \emph{not} copied because the stack is maintained during 216 the handler search. 217 218 Then the exception system searches the stack starting from the resume and 219 proceeding towards the base of the stack, from callee to caller. At each stack 220 frame, a check is made for resumption handlers defined by the @catchResume@ 221 clauses of a @try@ statement. 265 expression has to return a reference to a type that satisfies the trait 266 @is_resumption_exception@. The assertions from this trait are available to 267 the exception system while handling the exception. 268 269 At runtime, no copies are made. As the stack is not unwound the exception and 270 any values on the stack will remain in scope while the resumption is handled. 271 272 Then the exception system searches the stack using the provided exception. 273 It starts from the throw and proceeds to the base of the stack, 274 from callee to caller. 275 At each stack frame, a check is made for resumption handlers defined by the 276 @catchResume@ clauses of a @try@ statement. 222 277 \begin{cfa} 223 278 try { 224 279 GUARDED_BLOCK 225 } @catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // resumption handler 1 280 } catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) { 226 281 HANDLER_BLOCK$\(_1\)$ 227 } @catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // resumption handler 2 282 } catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) { 228 283 HANDLER_BLOCK$\(_2\)$ 229 284 } 230 285 \end{cfa} 231 The statements in the @GUARDED_BLOCK@ are executed. 
If those statements, or any 232 functions invoked from those statements, resumes an exception, and the 233 exception is not handled by a try statement further up the stack, the 234 resumption handlers are searched for a matching exception type from top to 235 bottom. (Note, termination and resumption handlers may be intermixed in a @try@ 236 statement but the kind of raise (throw/resume) only matches with the 237 corresponding kind of handler clause.) 238 239 The exception search and matching for resumption is the same as for 240 termination, including exception inheritance. The difference is when control 241 reaches the end of the handler: the resumption handler returns after the resume 242 rather than after the try statement. The resume point assumes the handler has 243 corrected the problem so execution can safely continue. 286 If the handlers are not involved in a search this will simply execute the 287 @GUARDED_BLOCK@ and then continue to the next statement. 288 Its purpose is to add handlers onto the stack. 289 (Note, termination and resumption handlers may be intermixed in a @try@ 290 statement but the kind of throw must be the same as the handler for it to be 291 considered as a possible match.) 292 293 If a search for a resumption handler reaches a try block it will check each 294 @catchResume@ clause, top-to-bottom. 295 At each handler if the thrown exception is or is a child type of 296 @EXCEPTION_TYPE@$_i$ then a pointer to the exception is bound to 297 @NAME@$_i$ and then @HANDLER_BLOCK@$_i$ is executed. After the block is 298 finished control will return to the @throwResume@ statement. 244 299 245 300 Like termination, if no resumption handler is found, the default handler 246 visible at the resume statement is called, and the system default action is 247 executed. 248 249 For resumption, the exception system uses stack marking to partition the 250 resumption search. 
If another resumption exception is raised in a resumption 251 handler, the second exception search does not start at the point of the 252 original raise. (Remember the stack is not unwound and the current handler is 253 at the top of the stack.) The search for the second resumption starts at the 254 current point on the stack because new try statements may have been pushed by 255 the handler or functions called from the handler. If there is no match back to 256 the point of the current handler, the search skips\label{p:searchskip} the stack frames already 257 searched by the first resume and continues after the try statement. The default 258 handler always continues from the default handler associated with the point where 259 the exception is created. 301 visible at the throw statement is called. It will use the best match at the 302 call site according to \CFA's overloading rules. The default handler is 303 passed the exception given to the throw. When the default handler finishes, 304 execution continues after the throw statement. 305 306 There is a global @defaultResumptionHandler@ that is polymorphic over all 307 termination exceptions and performs a termination throw on the exception. 308 The @defaultTerminationHandler@ for that throw is matched at the original 309 throw statement (the resumption @throwResume@) and it can be customized by 310 introducing a new or better match as well. 311 312 % \subsubsection? 313 314 A key difference between resumption and termination is that resumption does 315 not unwind the stack. A side effect is that when a handler is matched 316 and run, its try block (the guarded statements) and every try statement 317 searched before it are still on the stack. This can lead to the recursive 318 resumption problem. 319 320 The recursive resumption problem is any situation where a resumption handler 321 ends up being called while it is running. 
322 Consider a trivial case: 323 \begin{cfa} 324 try { 325 throwResume (E &){}; 326 } catchResume(E *) { 327 throwResume (E &){}; 328 } 329 \end{cfa} 330 When this code is executed the guarded @throwResume@ will throw, start a 331 search and match the handler in the @catchResume@ clause. This will be 332 called and placed on the stack on top of the try-block. The second throw then 333 throws and will search the same try block and call another instance of the 334 same handler leading to an infinite loop. 335 336 This situation is trivial and easy to avoid, but much more complex cycles 337 can form with multiple handlers and different exception types. 338 339 To prevent all of these cases we mask sections of the stack, or equivalently 340 the try statements on the stack, so that the resumption search skips over 341 them and continues with the next unmasked section of the stack. 342 343 A section of the stack is marked when it is searched to see if it contains 344 a handler for an exception and unmarked when that exception has been handled 345 or the search was completed without finding a handler. 260 346 261 347 % This might need a diagram. But it is an important part of the justification … … 276 362 \end{verbatim} 277 363 278 This resumption search-pattern reflects the one for termination, which matches 279 with programmer expectations. However, it avoids the \emph{recursive 280 resumption} problem. If parts of the stack are searched multiple times, loops 281 can easily form resulting in infinite recursion. 282 283 Consider the trivial case: 284 \begin{cfa} 285 try { 286 throwResume$\(_1\)$ (E &){}; 287 } catch( E * ) { 288 throwResume; 289 } 290 \end{cfa} 291 Based on termination semantics, programmer expectation is for the re-resume to 292 continue searching the stack frames after the try statement. However, the 293 current try statement is still on the stack below the handler issuing the 294 reresume \see{\VRef{s:Reraise}}. 
Hence, the try statement catches the re-raise 295 again and does another re-raise \emph{ad infinitum}, which is confusing and 296 difficult to debug. The \CFA resumption search-pattern skips the try statement 297 so the reresume search continues after the try, matching programmer 298 expectation. 364 The rules can be remembered by thinking about what would be searched in 365 termination. So when a throw happens in a handler: a termination handler 366 skips everything from the original throw to the original catch because that 367 part of the stack has been unwound; a resumption handler skips the same 368 section of stack because it has been masked. 369 A throw in a default handler will perform the same search as the original 370 throw because, for termination, nothing has been unwound and, for resumption, 371 the mask will be the same. 372 373 The symmetry with termination is why this pattern was picked. Other patterns, 374 such as marking just the handlers that caught, also work but lack the 375 symmetry, which means there is more to remember. 299 376 300 377 \section{Conditional Catch} 301 Both termination and resumption handler-clauses may perform conditional matching: 302 \begin{cfa} 303 catch (EXCEPTION_TYPE * NAME ; @CONDITION@) 378 Both termination and resumption handler clauses can be given an additional 379 condition to further control which exceptions they handle: 380 \begin{cfa} 381 catch (EXCEPTION_TYPE * NAME ; CONDITION) 304 382 \end{cfa} 305 383 First, the same semantics is used to match the exception type. Second, if the 306 384 exception matches, @CONDITION@ is executed. The condition expression may 307 385 reference all names in scope at the beginning of the try block and @NAME@ 308 introduced in the handler clause. If the condition is true, then the handler 309 matches. Otherwise, the exception search continues at the next appropriate kind 310 of handler clause in the try block. 386 introduced in the handler clause. 
If the condition is true, then the handler 387 matches. Otherwise, the exception search continues as if the exception type 388 did not match. 311 389 \begin{cfa} 312 390 try { … … 322 400 remaining handlers in the current try statement. 323 401 324 \section{Reraise} 325 \label{s:Reraise} 402 \section{Rethrowing} 403 \colour{red}{From Andrew: I recommend we talk about why the language doesn't 404 have rethrows/reraises instead.} 405 406 \label{s:Rethrowing} 326 407 Within the handler block or functions called from the handler block, it is 327 408 possible to reraise the most recently caught exception with @throw@ or 328 @throwResume@, respective. 329 \begin{cfa} 330 catch( ... ) { 331 ... throw; // rethrow 409 @throwResume@, respectively. 410 \begin{cfa} 411 try { 412 ... 413 } catch( ... ) { 414 ... throw; 332 415 } catchResume( ... ) { 333 ... throwResume; // reresume 416 ... throwResume; 334 417 } 335 418 \end{cfa} … … 341 424 handler is generated that does a program-level abort. 342 425 343 344 426 \section{Finally Clauses} 345 A @finally@ clause may be placed at the end of a @try@ statement. 427 Finally clauses are used to perform unconditional clean-up when leaving a 428 scope. They are placed at the end of a try statement: 346 429 \begin{cfa} 347 430 try { 348 431 GUARDED_BLOCK 349 } ... // any number or kind of handler clauses 350 } finally { 432 } ... // any number or kind of handler clauses 433 ... finally { 351 434 FINALLY_BLOCK 352 435 } 353 436 \end{cfa} 354 The @FINALLY_BLOCK@ is executed when the try statement is unwound from the 355 stack, \ie when the @GUARDED_BLOCK@ or any handler clause finishes. Hence, the 356 finally block is always executed. 437 The @FINALLY_BLOCK@ is executed when the try statement is removed from the 438 stack, including when the @GUARDED_BLOCK@ finishes, any termination handler 439 finishes or during an unwind. 440 The only time the block is not executed is if the program is exited before 441 the stack is unwound. 
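The guarantee just stated, that the clean-up runs whether the guarded block finishes normally or a termination handler finishes, can be approximated in \Cpp, which has no finally clause, by a destructor on an object that encloses the try statement. The following is a minimal sketch under that assumption; `ScopeGuard` and `trace` are hypothetical names, and the guard is only an analogue of a \CFA finally clause, not its implementation.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Runs the stored clean-up action when the enclosing scope is left.
class ScopeGuard {
    std::function<void()> cleanup_;
public:
    explicit ScopeGuard(std::function<void()> f) : cleanup_(std::move(f)) {}
    ~ScopeGuard() { cleanup_(); }
    ScopeGuard(const ScopeGuard &) = delete;
    ScopeGuard &operator=(const ScopeGuard &) = delete;
};

// Records the order of events so the two paths can be compared.
std::string trace(bool fail) {
    std::string log;
    {
        // Declared outside the try so it runs after any handler, like finally.
        ScopeGuard finally_block([&log] { log += "cleanup;"; });
        try {
            log += "guarded;";
            if (fail) throw std::runtime_error("boom");
            log += "done;";
        } catch (const std::runtime_error &) {
            log += "handler;";
        }
    } // the guard's destructor runs here on every path out of the scope
    return log;
}
```

On the normal path the log reads guarded, done, cleanup; on the failing path it reads guarded, handler, cleanup. The guard class has to be written once before it can be used, whereas a finally block is written in place and can refer directly to local names.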
357 442 358 443 Execution of the finally block should always finish, meaning control runs off 359 444 the end of the block. This requirement ensures control always continues as if the 360 445 finally clause is not present, \ie finally is for cleanup not changing control 361 flow. Because of this requirement, local control flow out of the finally block 362 is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or 446 flow. Because of this requirement, local control flow out of the finally block 447 is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or 363 448 @return@ that causes control to leave the finally block. Other ways to leave 364 449 the finally block, such as a long jump or termination are much harder to check, 365 and at best requiring additional run-time overhead, and so are discouraged. 450 and at best requiring additional run-time overhead, and so are merely 451 discouraged. 452 453 Not all languages with exceptions have finally clauses. Notably \Cpp does 454 without it as destructors serve a similar role. Although destructors and 455 finally clauses can be used in many of the same areas they have their own 456 use cases like top-level functions and lambda functions with closures. 457 Destructors take a bit more work to set up but are much easier to reuse while 458 finally clauses are good for one-offs and can include local information. 366 459 367 460 \section{Cancellation} … … 370 463 possible forwards the cancellation exception to a different stack. 371 464 465 Cancellation is not an exception operation like termination or resumption. 372 466 There is no special statement for starting a cancellation; instead the standard 373 library function @cancel_stack@ is called passing an exception. Unlike a 374 raise, this exception is not used in matching only to pass information about 467 library function @cancel_stack@ is called passing an exception. 
Unlike a 468 throw, this exception is not used in matching only to pass information about 375 469 the cause of the cancellation. 376 377 Handling of a cancellation depends on which stack is being cancelled. 470 (This also means matching cannot fail so there is no default handler either.) 471 472 After @cancel_stack@ is called the exception is copied into the exception 473 handling mechanism's memory. Then the entirety of the current stack is 474 unwound. After that it depends on which stack is being cancelled. 378 475 \begin{description} 379 476 \item[Main Stack:] 380 477 The main stack is the one used by the program main at the start of execution, 381 and is the only stack in a sequential program. Hence, when cancellation is 382 forwarded to the main stack, there is no other forwarding stack, so after the 383 stack is unwound, there is a program-level abort. 478 and is the only stack in a sequential program. Even in a concurrent program 479 the main stack is only dependent on the environment that started the program. 480 Hence, when the main stack is cancelled there is nowhere else in the program 481 to notify. After the stack is unwound, there is a program-level abort. 384 482 385 483 \item[Thread Stack:] 386 484 A thread stack is created for a @thread@ object or object that satisfies the 387 @is_thread@ trait. A thread only has two points of communication that must 485 @is_thread@ trait. A thread only has two points of communication that must 388 486 happen: start and join. As the thread must be running to perform a 389 cancellation, it must occur after start and before join, so join is a 390 cancellation point. After the stack is unwound, the thread halts and waits for 391 another thread to join with it. The joining thread, checks for a cancellation, 487 cancellation, it must occur after start and before join, so join is used 488 for communication here. 489 After the stack is unwound, the thread halts and waits for 490 another thread to join with it. 
The joining thread checks for a cancellation, 392 491 and if present, resumes exception @ThreadCancelled@. 393 492 … … 397 496 the exception is not caught. The implicit join does a program abort instead. 398 497 399 This semantics is for safety. One difficult problem for any exception system is 400 defining semantics when an exception is raised during an exception search: 401 which exception has priority, the original or new exception? No matter which 402 exception is selected, it is possible for the selected one to disrupt or 403 destroy the context required for the other. \PAB{I do not understand the 404 following sentences.} This loss of information can happen with join but as the 405 thread destructor is always run when the stack is being unwound and one 406 termination/cancellation is already active. Also since they are implicit they 407 are easier to forget about. 498 This semantics is for safety. If an unwind is triggered while another unwind 499 is underway only one of them can proceed as they both want to ``consume'' the 500 stack. Letting both try to proceed leads to very undefined behaviour. 501 Both termination and cancellation involve unwinding and, since the default 502 @defaultResumptionHandler@ performs a termination that could more easily 503 happen in an implicit join inside a destructor. So there is an error message 504 and an abort instead. 505 \todo{Perhaps have a more general discussion of unwind collisions before 506 this point.} 507 508 The recommended way to avoid the abort is to handle the initial resumption 509 from the implicit join. If required you may put an explicit join inside a 510 finally clause to disable the check and use the local 511 @defaultResumptionHandler@ instead. 408 512 409 513 \item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object 410 or object that satisfies the @is_coroutine@ trait. A coroutine only knows of 411 two other coroutines, its starter and its last resumer. 
The last resumer has 412 the tightest coupling to the coroutine it activated. Hence, cancellation of 413 the active coroutine is forwarded to the last resumer after the stack is 414 unwound, as the last resumer has the most precise knowledge about the current 415 execution. When the resumer restarts, it resumes exception 514 or object that satisfies the @is_coroutine@ trait. A coroutine only knows of 515 two other coroutines, its starter and its last resumer. Of the two the last 516 resumer has the tightest coupling to the coroutine it activated and the most 517 up-to-date information. 518 519 Hence, cancellation of the active coroutine is forwarded to the last resumer 520 after the stack is unwound. When the resumer restarts, it resumes exception 416 521 @CoroutineCancelled@, which is polymorphic over the coroutine type and has a 417 522 pointer to the cancelled coroutine. -
doc/theses/andrew_beach_MMath/future.tex
r5e99a9a r95b3a9c 10 10 \item 11 11 The implementation of termination is not portable because it includes 12 hand-crafted assembly statements. These sections must be generalized to support 13 more hardware architectures, \eg ARM processor. 12 hand-crafted assembly statements. These sections must be ported by hand to 13 support more hardware architectures, such as the ARM processor. 14 14 \item 15 15 Due to a type-system problem, the catch clause cannot bind the exception to a … … 24 24 scope of the @try@ statement, where the local control-flow transfers are 25 25 meaningful. 26 \item 27 There is no detection of colliding unwinds. It is possible for clean-up code 28 run during an unwind to trigger another unwind that escapes the clean-up code 29 itself; such as a termination exception caught further down the stack or a 30 cancellation. There do exist ways to handle this but currently they are not 31 even detected and the first unwind will simply be forgotten, often leaving 32 it in a bad state. 33 \item 34 Also the exception system did not have a lot of time to be tried and tested. 35 So just letting people use the exception system more will reveal new 36 quality of life upgrades that can be made with time. 26 37 \end{itemize} 27 38 -
doc/theses/andrew_beach_MMath/implement.tex
r5e99a9a r95b3a9c 278 278 @_URC_END_OF_STACK@. 279 279 280 Second, when a handler is matched, raise exception continues onto the cleanup phase. 280 Second, when a handler is matched, raise exception continues onto the cleanup 281 phase. 281 282 Once again, it calls the personality functions of each stack frame from newest 282 283 to oldest. This pass stops at the stack frame containing the matching handler. -
doc/theses/andrew_beach_MMath/thesis-frontpgs.tex
r5e99a9a r95b3a9c 36 36 37 37 A thesis \\ 38 presented to the University of Waterloo \\ 38 presented to the University of Waterloo \\ 39 39 in fulfillment of the \\ 40 40 thesis requirement for the degree of \\ … … 64 64 \cleardoublepage 65 65 66 66 67 67 %---------------------------------------------------------------------- 68 68 % EXAMINING COMMITTEE (Required for Ph.D. theses only) … … 71 71 \begin{center}\textbf{Examining Committee Membership}\end{center} 72 72 \noindent 73 The following served on the Examining Committee for this thesis. The decision of the Examining Committee is by majority vote. 74 \bigskip 75 76 \noindent 77 \begin{tabbing} 78 Internal-External Member: \= \kill % using longest text to define tab length 79 External Examiner: \> Bruce Bruce \\ 73 The following served on the Examining Committee for this thesis. The decision 74 of the Examining Committee is by majority vote. 75 \bigskip 76 77 \noindent 78 \begin{tabbing} 79 Internal-External Member: \= \kill % using longest text to define tab length 80 External Examiner: \> Bruce Bruce \\ 80 81 \> Professor, Dept. of Philosophy of Zoology, University of Wallamaloo \\ 81 \end{tabbing} 82 \bigskip 83 82 \end{tabbing} 83 \bigskip 84 84 85 \noindent 85 86 \begin{tabbing} … … 91 92 \end{tabbing} 92 93 \bigskip 93 94 94 95 \noindent 95 96 \begin{tabbing} … … 99 100 \end{tabbing} 100 101 \bigskip 101 102 102 103 \noindent 103 104 \begin{tabbing} … … 107 108 \end{tabbing} 108 109 \bigskip 109 110 110 111 \noindent 111 112 \begin{tabbing} … … 123 124 % December 13th, 2006. It is designed for an electronic thesis. 124 125 \noindent 125 I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. 126 127 \bigskip 128 126 I hereby declare that I am the sole author of this thesis. This is a true copy 127 of the thesis, including any required final revisions, as accepted by my 128 examiners. 
129 130 \bigskip 131 129 132 \noindent 130 133 I understand that my thesis may be made electronically available to the public. -
doc/theses/andrew_beach_MMath/thesis.tex
r5e99a9a r95b3a9c 45 45 % FRONT MATERIAL 46 46 %---------------------------------------------------------------------- 47 \input{thesis-frontpgs} 47 \input{thesis-frontpgs} 48 48 49 49 %---------------------------------------------------------------------- … … 65 65 A \gls{computer} could compute $\pi$ all day long. In fact, subsets of digits 66 66 of $\pi$'s decimal approximation would make a good source for psuedo-random 67 vectors, \gls{rvec} . 67 vectors, \gls{rvec} . 68 68 69 69 %---------------------------------------------------------------------- … … 96 96 97 97 \begin{itemize} 98 \item A well-prepared PDF should be 98 \item A well-prepared PDF should be 99 99 \begin{enumerate} 100 100 \item Of reasonable size, {\it i.e.} photos cropped and compressed. 101 \item Scalable, to allow enlargment of text and drawings. 102 \end{enumerate} 101 \item Scalable, to allow enlargment of text and drawings. 102 \end{enumerate} 103 103 \item Photos must be bit maps, and so are not scaleable by definition. TIFF and 104 104 BMP are uncompressed formats, while JPEG is compressed. Most photos can be 105 105 compressed without losing their illustrative value. 106 \item Drawings that you make should be scalable vector graphics, \emph{not} 106 \item Drawings that you make should be scalable vector graphics, \emph{not} 107 107 bit maps. Some scalable vector file formats are: EPS, SVG, PNG, WMF. These can 108 all be converted into PNG or PDF, that pdflatex recognizes. Your drawing 109 package probably can export to one of these formats directly. Otherwise, a 110 common procedure is to print-to-file through a Postscript printer driver to 111 create a PS file, then convert that to EPS (encapsulated PS, which has a 112 bounding box to describe its exact size rather than a whole page). 108 all be converted into PNG or PDF, that pdflatex recognizes. Your drawing 109 package probably can export to one of these formats directly. 
Otherwise, a 110 common procedure is to print-to-file through a Postscript printer driver to 111 create a PS file, then convert that to EPS (encapsulated PS, which has a 112 bounding box to describe its exact size rather than a whole page). 113 113 Programs such as GSView (a Ghostscript GUI) can create both EPS and PDF from 114 114 PS files. Appendix~\ref{AppendixA} shows how to generate properly sized Matlab 115 115 plots and save them as PDF. 116 116 \item It's important to crop your photos and draw your figures to the size that 117 you want to appear in your thesis. Scaling photos with the 118 includegraphics command will cause loss of resolution. And scaling down 117 you want to appear in your thesis. Scaling photos with the 118 includegraphics command will cause loss of resolution. And scaling down 119 119 drawings may cause any text annotations to become too small. 120 120 \end{itemize} 121 121 122 122 For more information on \LaTeX\, see the uWaterloo Skills for the 123 Academic Workplace \href{https://uwaterloo.ca/information-systems-technology/services/electronic-thesis-preparation-and-submission-support/ethesis-guide/creating-pdf-version-your-thesis/creating-pdf-files-using-latex/latex-ethesis-and-large-documents}{course notes}. 123 Academic Workplace \href{https://uwaterloo.ca/information-systems-technology/services/electronic-thesis-preparation-and-submission-support/ethesis-guide/creating-pdf-version-your-thesis/creating-pdf-files-using-latex/latex-ethesis-and-large-documents}{course notes}. 124 124 \footnote{ 125 125 Note that while it is possible to include hyperlinks to external documents, 126 it is not wise to do so, since anything you can't control may change over time. 127 It \emph{would} be appropriate and necessary to provide external links to 128 additional resources for a multimedia ``enhanced'' thesis. 
129 But also note that if the \package{hyperref} package is not included, 130 as for the print-optimized option in this thesis template, any \cmmd{href} 126 it is not wise to do so, since anything you can't control may change over time. 127 It \emph{would} be appropriate and necessary to provide external links to 128 additional resources for a multimedia ``enhanced'' thesis. 129 But also note that if the \package{hyperref} package is not included, 130 as for the print-optimized option in this thesis template, any \cmmd{href} 131 131 commands in your logical document are no longer defined. 132 132 A work-around employed by this thesis template is to define a dummy 133 \cmmd{href} command (which does nothing) in the preamble of the document, 134 before the \package{hyperref} package is included. 133 \cmmd{href} command (which does nothing) in the preamble of the document, 134 before the \package{hyperref} package is included. 135 135 The dummy definition is then redifined by the 136 136 \package{hyperref} package when it is included. … … 138 138 139 139 The classic book by Leslie Lamport \cite{lamport.book}, author of \LaTeX , is 140 worth a look too, and the many available add-on packages are described by 140 worth a look too, and the many available add-on packages are described by 141 141 Goossens \textit{et al} \cite{goossens.book}. 142 142 … … 180 180 Export Setup button in the figure Property Editor. 181 181 182 \section{From the Command Line} 182 \section{From the Command Line} 183 183 All figure properties can also be manipulated from the command line. Here's an 184 example: 184 example: 185 185 \begin{verbatim} 186 186 x=[0:0.1:pi]; -
doc/theses/andrew_beach_MMath/unwinding.tex
r5e99a9a r95b3a9c 1 \chapter{\texorpdfstring{Unwinding in \CFA}{Unwinding in Cforall}} 1 \chapter{Unwinding in \CFA} 2 2 3 3 Stack unwinding is the process of removing stack frames (activations) from the … … 110 110 alternate transfers of control. 111 111 112 \section{\texorpdfstring{\CFA Implementation}{Cforall Implementation}} 112 \section{\CFA Implementation} 113 113 114 114 To use libunwind, \CFA provides several wrappers, its own storage, personality -
doc/theses/andrew_beach_MMath/uw-ethesis-frontpgs.tex
r5e99a9a r95b3a9c 13 13 \vspace*{1.0cm} 14 14 15 \Huge 16 {\bf Exception Handling in \CFA} 15 {\Huge\bf Exception Handling in \CFA} 17 16 18 17 \vspace*{1.0cm} 19 18 20 \normalsize21 19 by \\ 22 20 23 21 \vspace*{1.0cm} 24 22 25 \Large 26 Andrew James Beach \\ 23 {\Large Andrew James Beach} \\ 27 24 28 25 \vspace*{3.0cm} 29 26 30 \normalsize31 27 A thesis \\ 32 presented to the University of Waterloo \\ 28 presented to the University of Waterloo \\ 33 29 in fulfillment of the \\ 34 30 thesis requirement for the degree of \\ … … 43 39 \vspace*{1.0cm} 44 40 45 \copyright \Andrew James Beach \the\year \\41 \copyright{} Andrew James Beach \the\year \\ 46 42 \end{center} 47 43 \end{titlepage} 48 44 49 % The rest of the front pages should contain no headers and be numbered using Roman numerals starting with `ii' 45 % The rest of the front pages should contain no headers and be numbered using 46 % Roman numerals starting with `ii'. 50 47 \pagestyle{plain} 51 48 \setcounter{page}{2} 52 49 53 \cleardoublepage % Ends the current page and causes all figures and tables that have so far appeared in the input to be printed. 54 % In a two-sided printing style, it also makes the next page a right-hand (odd-numbered) page, producing a blank page if necessary. 50 \cleardoublepage % Ends the current page and causes all figures and tables 51 % that have so far appeared in the input to be printed. In a two-sided 52 % printing style, it also makes the next page a right-hand (odd-numbered) 53 % page, producing a blank page if necessary. 55 54 56 \begin{comment} 55 \begin{comment} 57 56 % E X A M I N I N G C O M M I T T E E (Required for Ph.D. theses only) 58 57 % Remove or comment out the lines below to remove this page 59 58 \begin{center}\textbf{Examining Committee Membership}\end{center} 60 59 \noindent 61 The following served on the Examining Committee for this thesis. The decision of the Examining Committee is by majority vote. 
60 The following served on the Examining Committee for this thesis. 61 The decision of the Examining Committee is by majority vote. 62 62 \bigskip 63 63 64 64 \noindent 65 65 \begin{tabbing} 66 66 Internal-External Member: \= \kill % using longest text to define tab length 67 External Examiner: \> Bruce Bruce \\ 67 External Examiner: \> Bruce Bruce \\ 68 68 \> Professor, Dept. of Philosophy of Zoology, University of Wallamaloo \\ 69 \end{tabbing} 69 \end{tabbing} 70 70 \bigskip 71 71 72 72 \noindent 73 73 \begin{tabbing} … … 79 79 \end{tabbing} 80 80 \bigskip 81 81 82 82 \noindent 83 83 \begin{tabbing} … … 87 87 \end{tabbing} 88 88 \bigskip 89 89 90 90 \noindent 91 91 \begin{tabbing} … … 95 95 \end{tabbing} 96 96 \bigskip 97 97 98 98 \noindent 99 99 \begin{tabbing} … … 111 111 % December 13th, 2006. It is designed for an electronic thesis. 112 112 \begin{center}\textbf{Author's Declaration}\end{center} 113 113 114 114 \noindent 115 I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. 115 I hereby declare that I am the sole author of this thesis. This is a true copy 116 of the thesis, including any required final revisions, as accepted by my 117 examiners. 116 118 117 119 \bigskip 118 120 119 121 \noindent 120 122 I understand that my thesis may be made electronically available to the public. -
doc/theses/andrew_beach_MMath/uw-ethesis.tex
r5e99a9a r95b3a9c 1 1 %====================================================================== 2 % University of Waterloo Thesis Template for LaTeX 3 % Last Updated November, 2020 4 % by Stephen Carr, IST Client Services, 2 % University of Waterloo Thesis Template for LaTeX 3 % Last Updated November, 2020 4 % by Stephen Carr, IST Client Services, 5 5 % University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada 6 6 % FOR ASSISTANCE, please send mail to request@uwaterloo.ca 7 7 8 8 % DISCLAIMER 9 % To the best of our knowledge, this template satisfies the current uWaterloo thesis requirements. 10 % However, it is your responsibility to assure that you have met all requirements of the University and your particular department. 11 12 % Many thanks for the feedback from many graduates who assisted the development of this template. 13 % Also note that there are explanatory comments and tips throughout this template. 9 % To the best of our knowledge, this template satisfies the current uWaterloo 10 % thesis requirements. However, it is your responsibility to assure that you 11 % have met all requirements of the University and your particular department. 12 13 % Many thanks for the feedback from many graduates who assisted the 14 % development of this template. Also note that there are explanatory comments 15 % and tips throughout this template. 14 16 %====================================================================== 15 17 % Some important notes on using this template and making it your own... 16 18 17 % The University of Waterloo has required electronic thesis submission since October 2006. 18 % See the uWaterloo thesis regulations at 19 % https://uwaterloo.ca/graduate-studies/thesis. 20 % This thesis template is geared towards generating a PDF version optimized for viewing on an electronic display, including hyperlinks within the PDF. 21 22 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package configuration below. 
23 % THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT. 24 % You can view the information if you view properties of the PDF document. 25 26 % Many faculties/departments also require one or more printed copies. 27 % This template attempts to satisfy both types of output. 19 % The University of Waterloo has required electronic thesis submission since 20 % October 2006. See the uWaterloo thesis regulations at: 21 % https://uwaterloo.ca/graduate-studies/thesis. 22 % This thesis template is geared towards generating a PDF version optimized 23 % for viewing on an electronic display, including hyperlinks within the PDF. 24 25 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package 26 % configuration below. THIS INFORMATION GETS EMBEDDED IN THE FINAL PDF 27 % DOCUMENT. You can view the information if you view properties of the PDF. 28 29 % Many faculties/departments also require one or more printed copies. 30 % This template attempts to satisfy both types of output. 28 31 % See additional notes below. 29 % It is based on the standard "book" document class which provides all necessary sectioning structures and allows multi-part theses. 30 31 % If you are using this template in Overleaf (cloud-based collaboration service), then it is automatically processed and previewed for you as you edit. 32 33 % For people who prefer to install their own LaTeX distributions on their own computers, and process the source files manually, the following notes provide the sequence of tasks: 34 32 % It is based on the standard "book" document class which provides all 33 % necessary sectioning structures and allows multi-part theses. 34 35 % If you are using this template in Overleaf (cloud-based collaboration 36 % service), then it is automatically processed and previewed for you as you 37 % edit. 
38 39 % For people who prefer to install their own LaTeX distributions on their own 40 % computers, and process the source files manually, the following notes 41 % provide the sequence of tasks: 42 35 43 % E.g. to process a thesis called "mythesis.tex" based on this template, run: 36 44 37 45 % pdflatex mythesis -- first pass of the pdflatex processor 38 46 % bibtex mythesis -- generates bibliography from .bib data file(s) 39 % makeindex -- should be run only if an index is used 40 % pdflatex mythesis -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc. 41 % pdflatex mythesis -- it takes a couple of passes to completely process all cross-references 42 43 % If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu). 44 % Then click the PDFLaTeX button two more times. 45 % If you have an index as well,you'll need to run MakeIndex from the Tools menu as well, before running pdflatex 46 % the last two times. 47 48 % N.B. The "pdftex" program allows graphics in the following formats to be included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF 49 % Tip: Generate your figures and photos in the size you want them to appear in your thesis, rather than scaling them with \includegraphics options. 50 % Tip: Any drawings you do should be in scalable vector graphic formats: SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the final PDF as well. 47 % makeindex -- should be run only if an index is used 48 % pdflatex mythesis -- fixes numbering in cross-references, bibliographic 49 % references, glossaries, index, etc. 50 % pdflatex mythesis -- it takes a couple of passes to completely process all 51 % cross-references 52 53 % If you use the recommended LaTeX editor, Texmaker, you would open the 54 % mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under 55 % the Tools menu). 
Then click the PDFLaTeX button two more times. 56 % If you have an index as well, you'll need to run MakeIndex from the Tools 57 % menu as well, before running pdflatex the last two times. 58 59 % N.B. The "pdftex" program allows graphics in the following formats to be 60 % included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF 61 % Tip: Generate your figures and photos in the size you want them to appear 62 % in your thesis, rather than scaling them with \includegraphics options. 63 % Tip: Any drawings you do should be in scalable vector graphic formats: SVG, 64 % PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the 65 % final PDF as well. 51 66 % Tip: Photographs should be cropped and compressed so as not to be too large. 52 67 53 % To create a PDF output that is optimized for double-sided printing: 54 % 1) comment-out the \documentclass statement in the preamble below, and un-comment the second \documentclass line. 55 % 2) change the value assigned below to the boolean variable "PrintVersion" from " false" to "true". 56 57 %====================================================================== 68 % To create a PDF output that is optimized for double-sided printing: 69 % 1) comment-out the \documentclass statement in the preamble below, and 70 % un-comment the second \documentclass line. 71 % 2) change the value assigned below to the boolean variable "PrintVersion" 72 % from "false" to "true". 73 74 % ====================================================================== 58 75 % D O C U M E N T P R E A M B L E 59 % Specify the document class, default style attributes, andpage dimensions, etc.76 % Specify the document class, default style attributes, page dimensions, etc. 
60 77 % For hyperlinked PDF, suitable for viewing on a computer, use this: 61 78 \documentclass[letterpaper,12pt,titlepage,oneside,final]{book} 62 79 63 % For PDF, suitable for double-sided printing, change the PrintVersion variable below to "true" and use this \documentclass line instead of the one above: 80 % For PDF, suitable for double-sided printing, change the PrintVersion 81 % variable below to "true" and use this \documentclass line instead of the 82 % one above: 64 83 %\documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book} 65 84 85 \usepackage{etoolbox} 86 66 87 % Some LaTeX commands I define for my own nomenclature. 67 % If you have to, it's easier to make changes to nomenclature once here than in a million places throughout your thesis! 88 % If you have to, it's easier to make changes to nomenclature once here than 89 % in a million places throughout your thesis! 68 90 \newcommand{\package}[1]{\textbf{#1}} % package names in bold text 69 \newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font 70 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the print-optimized version will ignore \href tags (redefined by hyperref pkg).71 % \newcommand{\texorpdfstring}[2]{#1} % does nothing, but defines the command91 \newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font 92 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the 93 % print-optimized version will ignore \href tags (redefined by hyperref pkg). 72 94 % Anything defined here may be redefined by packages added below... 73 95 … … 76 98 \newboolean{PrintVersion} 77 99 \setboolean{PrintVersion}{false} 78 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies by overriding some options of the hyperref package, called below. 100 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for 101 % hard copies by overriding some options of the hyperref package, called below. 
79 102 80 103 %\usepackage{nomencl} % For a nomenclature (optional; available from ctan.org) 81 \usepackage{amsmath,amssymb,amstext} % Lots of math symbols and environments 82 \usepackage[pdftex]{graphicx} % For including graphics N.B. pdftex graphics driver 104 % Lots of math symbols and environments 105 \usepackage{amsmath,amssymb,amstext} 106 % For including graphics N.B. pdftex graphics driver 107 \usepackage[pdftex]{graphicx} 108 % Removes large sections of the document. 109 \usepackage{comment} 110 % Adds todos (Must be included after comment.) 111 \usepackage{todonotes} 112 83 113 84 114 % Hyperlinks make it very easy to navigate an electronic document. 85 % In addition, this is where you should specify the thesis title and author as they appear in the properties of the PDF document. 115 % In addition, this is where you should specify the thesis title and author as 116 % they appear in the properties of the PDF document. 86 117 % Use the "hyperref" package 87 118 % N.B. HYPERREF MUST BE THE LAST PACKAGE LOADED; ADD ADDITIONAL PKGS ABOVE 88 119 \usepackage[pdftex,pagebackref=true]{hyperref} % with basic options 89 120 %\usepackage[pdftex,pagebackref=true]{hyperref} 90 % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing. 121 % N.B. pagebackref=true provides links back from the References to the body 122 % text. This can cause trouble for printing. 91 123 \hypersetup{ 92 124 plainpages=false, % needed if Roman numbers in frontpages … … 96 128 pdffitwindow=false, % window fit to page when opened 97 129 pdfstartview={FitH}, % fits the width of the page to the window 98 % pdftitle={uWaterloo\ LaTeX\ Thesis\ Template}, % title: CHANGE THIS TEXT!130 % pdftitle={uWaterloo\ LaTeX\ Thesis\ Template}, % title: CHANGE THIS TEXT! 99 131 % pdfauthor={Author}, % author: CHANGE THIS TEXT! and uncomment this line 100 132 % pdfsubject={Subject}, % subject: CHANGE THIS TEXT! 
and uncomment this line 101 % pdfkeywords={keyword1} {key2} {key3}, % list of keywords, and uncomment this line if desired133 % pdfkeywords={keyword1} {key2} {key3}, % optional list of keywords 102 134 pdfnewwindow=true, % links in new window 103 135 colorlinks=true, % false: boxed links; true: colored links … … 107 139 urlcolor=cyan % color of external links 108 140 } 109 \ifthenelse{\boolean{PrintVersion}}{ % for improved print quality, change some hyperref options 141 % for improved print quality, change some hyperref options 142 \ifthenelse{\boolean{PrintVersion}}{ 110 143 \hypersetup{ % override some previously defined hyperref options 111 144 % colorlinks,% … … 116 149 }{} % end of ifthenelse (no else) 117 150 118 \usepackage[automake,toc,abbreviations]{glossaries-extra} % Exception to the rule of hyperref being the last add-on package 119 % If glossaries-extra is not in your LaTeX distribution, get it from CTAN (http://ctan.org/pkg/glossaries-extra), 120 % although it's supposed to be in both the TeX Live and MikTeX distributions. There are also documentation and 121 % installation instructions there. 151 % Exception to the rule of hyperref being the last add-on package 152 \usepackage[automake,toc,abbreviations]{glossaries-extra} 153 % If glossaries-extra is not in your LaTeX distribution, get it from CTAN 154 % (http://ctan.org/pkg/glossaries-extra), although it's supposed to be in 155 % both the TeX Live and MikTeX distributions. There are also documentation 156 % and installation instructions there. 122 157 123 158 % Setting up the page margins... 124 \setlength{\textheight}{9in}\setlength{\topmargin}{-0.45in}\setlength{\headsep}{0.25in} 125 % uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at the 126 % top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin (on binding side). 
127 % While this is not an issue for electronic viewing, a PDF may be printed, and so we have the same page layout for both printed and electronic versions, we leave the gutter margin in. 128 % Set margins to minimum permitted by uWaterloo thesis regulations: 159 \setlength{\textheight}{9in} 160 \setlength{\topmargin}{-0.45in} 161 \setlength{\headsep}{0.25in} 162 % uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at 163 % the top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin 164 % (on binding side). While this is not an issue for electronic viewing, a PDF 165 % may be printed, and so we have the same page layout for both printed and 166 % electronic versions, we leave the gutter margin in. Set margins to minimum 167 % permitted by uWaterloo thesis regulations: 129 168 \setlength{\marginparwidth}{0pt} % width of margin notes 130 169 % N.B. If margin notes are used, you must adjust \textwidth, \marginparwidth 131 170 % and \marginparsep so that the space left between the margin notes and page 132 171 % edge is less than 15 mm (0.6 in.) 133 \setlength{\marginparsep}{0pt} % width of space between body text and margin notes 134 \setlength{\evensidemargin}{0.125in} % Adds 1/8 in. to binding side of all 172 % width of space between body text and margin notes 173 \setlength{\marginparsep}{0pt} 174 % Adds 1/8 in. to binding side of all 135 175 % even-numbered pages when the "twoside" printing option is selected 136 \setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages when "oneside" printing is selected, and to the left of all odd-numbered pages when "twoside" printing is selected 137 \setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and side margins as above 176 \setlength{\evensidemargin}{0.125in} 177 % Adds 1/8 in. 
to the left of all pages when "oneside" printing is selected, 178 % and to the left of all odd-numbered pages when "twoside" printing is selected 179 \setlength{\oddsidemargin}{0.125in} 180 % assuming US letter paper (8.5 in. x 11 in.) and side margins as above 181 \setlength{\textwidth}{6.375in} 138 182 \raggedbottom 139 183 140 % The following statement specifies the amount of space between paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount. 184 % The following statement specifies the amount of space between paragraphs. 185 % Other reasonable specifications are \bigskipamount and \smallskipamount. 141 186 \setlength{\parskip}{\medskipamount} 142 187 143 % The following statement controls the line spacing. 144 % The default spacing corresponds to good typographic conventions and only slight changes (e.g., perhaps "1.2"), if any, should be made. 188 % The following statement controls the line spacing. 189 % The default spacing corresponds to good typographic conventions and only 190 % slight changes (e.g., perhaps "1.2"), if any, should be made. 145 191 \renewcommand{\baselinestretch}{1} % this is the default line space setting 146 192 147 193 % By default, each chapter will start on a recto (right-hand side) page. 148 % We also force each section of the front pages to start on a recto page by inserting \cleardoublepage commands. 149 % In many cases, this will require that the verso (left-hand) page be blank, and while it should be counted, a page number should not be printed. 150 % The following statements ensure a page number is not printed on an otherwise blank verso page. 194 % We also force each section of the front pages to start on a recto page by 195 % inserting \cleardoublepage commands. In many cases, this will require that 196 % the verso (left-hand) page be blank, and while it should be counted, a page 197 % number should not be printed. 
The following statements ensure a page number 198 % is not printed on an otherwise blank verso page. 151 199 \let\origdoublepage\cleardoublepage 152 200 \newcommand{\clearemptydoublepage}{% … … 154 202 \let\cleardoublepage\clearemptydoublepage 155 203 156 % Define Glossary terms (This is properly done here, in the preamble and could also be \input{} from a separate file...) 204 % Define Glossary terms (This is properly done here, in the preamble and 205 % could also be \input{} from a separate file...) 157 206 \input{glossaries} 158 207 \makeglossaries 159 208 160 \usepackage{comment}161 209 % cfa macros used in the document 162 210 %\usepackage{cfalab} 211 % I'm going to bring back eventually. 212 \makeatletter 213 % Combines all \CC* commands: 214 \newrobustcmd*\Cpp[1][\xspace]{\cfalab@Cpp#1} 215 \newcommand\cfalab@Cpp{C\kern-.1em\hbox{+\kern-.25em+}} 216 % Optional arguments do not work with pdf string. (Some fix-up required.) 217 \pdfstringdefDisableCommands{\def\Cpp{C++}} 218 219 % Colour text, formatted in LaTeX style instead of TeX style. 220 \newcommand*\colour[2]{{\color{#1}#2}} 221 \makeatother 222 163 223 \input{common} 164 \CFAStyle % CFA code-style for all languages 165 \lstset{language=CFA,basicstyle=\linespread{0.9}\tt} % CFA default lnaguage 224 % CFA code-style for all languages 225 \CFAStyle 226 % CFA default lnaguage 227 \lstset{language=CFA,basicstyle=\linespread{0.9}\tt} 228 % Annotations from Peter: 166 229 \newcommand{\PAB}[1]{{\color{blue}PAB: #1}} 230 % Change the style of abbreviations: 231 \renewcommand{\abbrevFont}{} 167 232 168 233 %====================================================================== 169 234 % L O G I C A L D O C U M E N T 170 235 % The logical document contains the main content of your thesis. 171 % Being a large document, it is a good idea to divide your thesis into several files, each one containing one chapter or other significant chunk of content, so you can easily shuffle things around later if desired. 
236 % Being a large document, it is a good idea to divide your thesis into several 237 % files, each one containing one chapter or other significant chunk of content, 238 % so you can easily shuffle things around later if desired. 172 239 %====================================================================== 173 240 \begin{document} … … 176 243 % FRONT MATERIAL 177 244 % title page,declaration, borrowers' page, abstract, acknowledgements, 178 % dedication, table of contents, list of tables, list of figures, nomenclature, etc. 179 %---------------------------------------------------------------------- 180 \input{uw-ethesis-frontpgs} 245 % dedication, table of contents, list of tables, list of figures, 246 % nomenclature, etc. 247 %---------------------------------------------------------------------- 248 \input{uw-ethesis-frontpgs} 181 249 182 250 %---------------------------------------------------------------------- 183 251 % MAIN BODY 184 252 % We suggest using a separate file for each chapter of your thesis. 185 % Start each chapter file with the \chapter command. 186 % Only use \documentclass or\begin{document} and \end{document} commands in this master document.253 % Start each chapter file with the \chapter command. Only use \documentclass, 254 % \begin{document} and \end{document} commands in this master document. 187 255 % Tip: Putting each sentence on a new line is a way to simplify later editing. 188 256 %---------------------------------------------------------------------- … … 200 268 % Bibliography 201 269 202 % The following statement selects the style to use for references. 203 % It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels. 270 % The following statement selects the style to use for references. 271 % It controls the sort order of the entries in the bibliography and also the 272 % formatting for the in-text labels. 
204 273 \bibliographystyle{plain} 205 % This specifies the location of the file containing the bibliographic information. 206 % It assumes you're using BibTeX to manage your references (if not, why not?). 207 \cleardoublepage % This is needed if the "book" document class is used, to place the anchor in the correct page, because the bibliography will start on its own page. 208 % Use \clearpage instead if the document class uses the "oneside" argument 209 \phantomsection % With hyperref package, enables hyperlinking from the table of contents to bibliography 210 % The following statement causes the title "References" to be used for the bibliography section: 274 % This specifies the location of the file containing the bibliographic 275 % information. It assumes you're using BibTeX to manage your references (if 276 % not, why not?). 277 \cleardoublepage % This is needed if the "book" document class is used, to 278 % place the anchor in the correct page, because the bibliography will start 279 % on its own page. 280 % Use \clearpage instead if the document class uses the "oneside" argument. 281 \phantomsection % With hyperref package, enables hyperlinking from the table 282 % of contents to bibliography. 283 % The following statement causes the title "References" to be used for the 284 % bibliography section: 211 285 \renewcommand*{\bibname}{References} 212 286 … … 215 289 216 290 \bibliography{uw-ethesis,pl} 217 % Tip: You can create multiple .bib files to organize your references. 218 % Just list them all in the \bibliogaphy command, separated by commas (no spaces). 219 220 % The following statement causes the specified references to be added to the bibliography even if they were not cited in the text. 221 % The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional). 291 % Tip: You can create multiple .bib files to organize your references. 
Just 292 % list them all in the \bibliogaphy command, separated by commas (no spaces). 293 294 % The following statement causes the specified references to be added to the 295 % bibliography even if they were not cited in the text. The asterisk is a 296 % wildcard that causes all entries in the bibliographic database to be 297 % included (optional). 222 298 % \nocite{*} 223 299 %---------------------------------------------------------------------- … … 227 303 % The \appendix statement indicates the beginning of the appendices. 228 304 \appendix 229 % Add an un-numbered title page before the appendices and a line in the Table of Contents 305 % Add an un-numbered title page before the appendices and a line in the Table 306 % of Contents 230 307 % \chapter*{APPENDICES} 231 308 % \addcontentsline{toc}{chapter}{APPENDICES} 232 % Appendices are just more chapters, with different labeling (letters instead of numbers). 309 % Appendices are just more chapters, with different labeling (letters instead 310 % of numbers). 233 311 % \input{appendix-matlab_plots.tex} 234 312 235 % GLOSSARIES (Lists of definitions, abbreviations, symbols, etc. provided by the glossaries-extra package) 313 % GLOSSARIES (Lists of definitions, abbreviations, symbols, etc. 314 % provided by the glossaries-extra package) 236 315 % ----------------------------- 237 316 \printglossaries -
doc/theses/fangren_yu_COOP_F20/Report.tex
r5e99a9a r95b3a9c 102 102 \CFA language, developed by the Programming Language Group at the University of Waterloo, has a long history, with the initial language design in 1992 by Glen Ditchfield~\cite{Ditchfield92} and the first proof-of-concept compiler built in 2003 by Richard Bilson~\cite{Bilson03}. Many new features have been added to the language over time, but the core of \CFA's type-system --- parametric functions introduced by the @forall@ clause (hence the name of the language) providing parametric overloading --- remains mostly unchanged. 103 103 104 The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take amultiple minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work.104 The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take multiple minutes to compile. 
The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work.
105 105
106 106 This report presents a series of optimizations to the performance-critical parts of the resolver, with a major rework of the compiler data-structures using a functional-programming approach to reduce memory complexity. The improvements were suggested by running the compiler builds with a performance profiler against the \CFA standard-library source-code and a test suite to find the most underperforming components in the compiler algorithm.
… …
122 122 \end{itemize}
123 123
124 The resolver algorithm, designed for overload resolution, uses a significant amount ofreused, and hence copying, for the intermediate representations, especially in the following two places:
124 The resolver algorithm, designed for overload resolution, allows a significant amount of code reuse, and hence copying, for the intermediate representations, especially in the following two places:
125 125 \begin{itemize}
126 126 \item
… …
301 301 forall( dtype T | sized( T ) )
302 302 T * malloc( void ) { return (T *)malloc( sizeof(T) ); } // call C malloc
303 int * i = malloc(); // type deduced from left-hand size $\ Rightarrow$ no size argument or return cast
303 int * i = malloc(); // type deduced from left-hand side $\(\Rightarrow\)$ no size argument or return cast
304 304 \end{cfa}
305 305 An unbound return-type is problematic in resolver complexity because a single match of a function call with an unbound return type may create multiple candidates. In the worst case, consider a function declared to return any @otype@ (defined \VPageref{otype}):
… …
432 432 \begin{cfa}
433 433 void f( int );
434 double g$ _1$( int );
435 int g$ _2$( long );
434 double g$\(_1\)$( int );
435 int g$\(_2\)$( long );
436 436 f( g( 42 ) );
437 437 \end{cfa}
-
doc/theses/thierry_delisle_PhD/thesis/Makefile
r5e99a9a r95b3a9c 8 8 BibTeX = BIBINPUTS=${TeXLIB} && export BIBINPUTS && bibtex 9 9 10 MAKEFLAGS = --no-print-directory --silent10 MAKEFLAGS = --no-print-directory # --silent 11 11 VPATH = ${Build} ${Figures} 12 12 … … 32 32 emptytree \ 33 33 fairness \ 34 io_uring \ 35 pivot_ring \ 34 36 system \ 35 37 } … … 43 45 ## Define the documents that need to be made. 44 46 all: thesis.pdf 45 thesis.pdf: ${TEXTS} ${FIGURES} ${PICTURES} glossary.tex local.bib47 thesis.pdf: ${TEXTS} ${FIGURES} ${PICTURES} thesis.tex glossary.tex local.bib 46 48 47 49 DOCUMENT = thesis.pdf … … 49 51 50 52 # Directives # 53 54 .NOTPARALLEL: # cannot make in parallel 51 55 52 56 .PHONY : all clean # not file names … … 81 85 ${LaTeX} $< 82 86 83 build/fairness.svg : fig/fairness.py | ${Build}84 python3 $< $@85 86 87 ## Define the default recipes. 87 88 … … 105 106 sed -i 's/$@/${Build}\/$@/g' ${Build}/$@_t 106 107 108 build/fairness.svg : fig/fairness.py | ${Build} 109 python3 $< $@ 110 107 111 ## pstex with inverted colors 108 112 %.dark.pstex : fig/%.fig Makefile | ${Build} -
doc/theses/thierry_delisle_PhD/thesis/local.bib
r5e99a9a r95b3a9c 512 512 } 513 513 514 @manual{MAN:bsd/kqueue, 515 title = {KQUEUE(2) - FreeBSD System Calls Manual}, 516 url = {https://www.freebsd.org/cgi/man.cgi?query=kqueue}, 517 year = {2020}, 518 month = {may} 519 } 520 514 521 % Apple's MAC OS X 515 522 @manual{MAN:apple/scheduler, … … 577 584 578 585 % -------------------------------------------------- 586 % Man Pages 587 @manual{MAN:open, 588 key = "open", 589 title = "open(2) Linux User's Manual", 590 year = "2020", 591 month = "February", 592 } 593 594 @manual{MAN:accept, 595 key = "accept", 596 title = "accept(2) Linux User's Manual", 597 year = "2019", 598 month = "March", 599 } 600 601 @manual{MAN:select, 602 key = "select", 603 title = "select(2) Linux User's Manual", 604 year = "2019", 605 month = "March", 606 } 607 608 @manual{MAN:poll, 609 key = "poll", 610 title = "poll(2) Linux User's Manual", 611 year = "2019", 612 month = "July", 613 } 614 615 @manual{MAN:epoll, 616 key = "epoll", 617 title = "epoll(7) Linux User's Manual", 618 year = "2019", 619 month = "March", 620 } 621 622 @manual{MAN:aio, 623 key = "aio", 624 title = "aio(7) Linux User's Manual", 625 year = "2019", 626 month = "March", 627 } 628 629 @misc{MAN:io_uring, 630 title = {Efficient IO with io\_uring}, 631 author = {Axboe, Jens}, 632 year = "2019", 633 month = "March", 634 version = {0,4}, 635 howpublished = {\url{https://kernel.dk/io_uring.pdf}} 636 } 637 638 % -------------------------------------------------- 579 639 % Wikipedia Entries 580 640 @misc{wiki:taskparallel, … … 617 677 note = "[Online; accessed 2-January-2021]" 618 678 } 679 680 @misc{wiki:future, 681 author = "{Wikipedia contributors}", 682 title = "Futures and promises --- {W}ikipedia{,} The Free Encyclopedia", 683 year = "2020", 684 url = "https://en.wikipedia.org/wiki/Futures_and_promises", 685 note = "[Online; accessed 9-February-2021]" 686 } -
doc/theses/thierry_delisle_PhD/thesis/text/core.tex
r5e99a9a r95b3a9c 49 49 50 50 \section{Design} 51 In general, a na\"{i}ve \glsxtrshort{fifo} ready-queue does not scale with increased parallelism from \glspl{hthrd}, resulting in decreased performance. The problem is adding/removing \glspl{thrd} is a single point of contention. As shown in the evaluation sections, most production schedulers do scale when adding \glspl{hthrd}. The common solution to the single point of contention is to shard the ready-queue so each \gls{hthrd} can access the ready-queue without contention, increasing performance though lack of contention.51 In general, a na\"{i}ve \glsxtrshort{fifo} ready-queue does not scale with increased parallelism from \glspl{hthrd}, resulting in decreased performance. The problem is adding/removing \glspl{thrd} is a single point of contention. As shown in the evaluation sections, most production schedulers do scale when adding \glspl{hthrd}. The common solution to the single point of contention is to shard the ready-queue so each \gls{hthrd} can access the ready-queue without contention, increasing performance. 52 52 53 53 \subsection{Sharding} \label{sec:sharding} 54 An interesting approach to sharding a queue is presented in \cit{Trevors paper}. This algorithm presents a queue with a relaxed \glsxtrshort{fifo} guarantee using an array of strictly \glsxtrshort{fifo} sublists as shown in Figure~\ref{fig:base}. Each \emph{cell} of the array has a timestamp for the last operation and a pointer to a linked-list with a lock and each node in the list is marked with a timestamp indicating when it is added to the list. A push operation is done by picking a random cell, acquiring the list lock, and pushing to the list. If the cell is locked, the operation is simply retried on another random cell until a lock is acquired. A pop operation is done in a similar fashion except two random cells are picked. If both cells are unlocked with non-empty lists, the operation pops the node with the oldest celltimestamp. 
If one of the cells is unlocked and non-empty, the operation pops from that cell. If both cells are either locked or empty, the operation picks two new random cells and tries again.54 An interesting approach to sharding a queue is presented in \cit{Trevors paper}. This algorithm presents a queue with a relaxed \glsxtrshort{fifo} guarantee using an array of strictly \glsxtrshort{fifo} sublists as shown in Figure~\ref{fig:base}. Each \emph{cell} of the array has a timestamp for the last operation and a pointer to a linked-list with a lock. Each node in the list is marked with a timestamp indicating when it is added to the list. A push operation is done by picking a random cell, acquiring the list lock, and pushing to the list. If the cell is locked, the operation is simply retried on another random cell until a lock is acquired. A pop operation is done in a similar fashion except two random cells are picked. If both cells are unlocked with non-empty lists, the operation pops the node with the oldest timestamp. If one of the cells is unlocked and non-empty, the operation pops from that cell. If both cells are either locked or empty, the operation picks two new random cells and tries again. 55 55 56 56 \begin{figure} … … 100 100 \paragraph{Local Information} Figure~\ref{fig:emptytls} shows an approach using dense information, similar to the bitmap, but each \gls{hthrd} keeps its own independent copy. While this approach can offer good scalability \emph{and} low latency, the liveliness and discovery of the information can become a problem. This case is made worst in systems with few processors where even blind random picks can find \glspl{thrd} in a few tries. 101 101 102 I built a prototype of these approaches and none of these techniques offer satisfying performance when few threads are present. All of these approach hit the same 2 problems. 
First, randomly picking sub-queues is very fast but means any improvement to the hit rate can easily be countered by a slow-down in look-up speed when there are empty lists. Second, the array is already assharded to avoid contention bottlenecks, so any denser data structure tends to become a bottleneck. In all cases, these factors meant the best cases scenario, \ie many threads, would get worst throughput, and the worst-case scenario, few threads, would get a better hit rate, but an equivalent poor throughput. As a result I tried an entirely different approach.
102 I built a prototype of these approaches and none of these techniques offers satisfying performance when few threads are present. All of these approaches hit the same two problems. First, randomly picking sub-queues is very fast. That speed means any improvement to the hit rate can easily be countered by a slow-down in look-up speed, whether or not there are empty lists. Second, the array is already sharded to avoid contention bottlenecks, so any denser data structure tends to become a bottleneck. In all cases, these factors meant the best-case scenario, \ie many threads, would get worse throughput, and the worst-case scenario, few threads, would get a better hit rate, but equivalently poor throughput. As a result, I tried an entirely different approach.
103 103
104 104 \subsection{Dynamic Entropy}\cit{https://xkcd.com/2318/}
105 In the worst-case scenario there are only few \glspl{thrd} ready to run, or more precisely given $P$ \glspl{proc}\footnote{For simplicity, this assumes there is a one-to-one match between \glspl{proc} and \glspl{hthrd}.}, $T$ \glspl{thrd} and $\epsilon$ a very small number, than the worst case scenario can be represented by $ \epsilon \ll P$, than $T = P + \epsilon$. It is important to note in this case that fairness is effectively irrelevant. Indeed, this case is close to \emph{actually matching} the model of the ``Ideal multi-tasking CPU'' on page \pageref{q:LinuxCFS}.
In this context, it is possible to use a purely internal-locality based approach and still meet the fairness requirements. This approach simply has each \gls{proc} running a single \gls{thrd} repeatedly. Or from the shared ready-queue viewpoint, each \gls{proc} pushes to a given sub-queue and then popes from the \emph{same} subqueue. In cases where $T \gg P$, the scheduler should also achieves similar performance without affecting the fairness guarantees.
105 In the worst-case scenario there are only a few \glspl{thrd} ready to run, or more precisely, given $P$ \glspl{proc}\footnote{For simplicity, this assumes there is a one-to-one match between \glspl{proc} and \glspl{hthrd}.}, $T$ \glspl{thrd} and $\epsilon$ a very small number, the worst-case scenario can be represented by $T = P + \epsilon$, with $\epsilon \ll P$. It is important to note in this case that fairness is effectively irrelevant. Indeed, this case is close to \emph{actually matching} the model of the ``Ideal multi-tasking CPU'' on page \pageref{q:LinuxCFS}. In this context, it is possible to use a purely internal-locality based approach and still meet the fairness requirements. This approach simply has each \gls{proc} running a single \gls{thrd} repeatedly. Or, from the shared ready-queue viewpoint, each \gls{proc} pushes to a given sub-queue and then pops from the \emph{same} subqueue. The challenge is for the scheduler to achieve good performance in both the $T = P + \epsilon$ case and the $T \gg P$ case, without affecting the fairness guarantees in the latter.
106 106
107 To handle this case, I use a pseudo random-number generator, \glsxtrshort{prng} in a novel way. When the scheduler uses a \glsxtrshort{prng} instance per \gls{proc} exclusively, the random-number seed effectively starts an encoding that produces a list of all accessed subqueues, from latest to oldest.
The novel approach is to be able to ``replay'' the \glsxtrshort{prng} backwards and there exist \glsxtrshort{prng}s that are fast, compact \emph{and} can be run forward and backwards. Linear congruential generators~\cite{wiki:lcg} are an example of \glsxtrshort{prng}s that match these requirements.
107 To handle this case, I use a \glsxtrshort{prng}\todo{Fix missing long form} in a novel way. There exist \glsxtrshort{prng}s that are fast, compact and can be run forward \emph{and} backwards. Linear congruential generators~\cite{wiki:lcg} are an example of such \glsxtrshort{prng}s. The novel approach is to use the ability to run backwards to ``replay'' the \glsxtrshort{prng}. When the scheduler uses an exclusive \glsxtrshort{prng} instance per \gls{proc}, the random-number seed effectively starts an encoding that produces a list of all accessed subqueues, from latest to oldest. Replaying the \glsxtrshort{prng} identifies the cells accessed recently, which probably still have data cached.
108 108
109 109 The algorithm works as follows:
-
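As a side illustration of the property the paragraph on linear congruential generators relies on, the sketch below shows how such a generator over 64-bit state can be stepped backwards as well as forwards. It is not the thesis's scheduler code: the constants (Knuth's MMIX multiplier and increment), the function names and the way a state is mapped to a cell index are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants (Knuth's MMIX generator); any odd multiplier is invertible. */
#define LCG_A UINT64_C(6364136223846793005)
#define LCG_C UINT64_C(1442695040888963407)

/* Multiplicative inverse of an odd value modulo 2^64, by Newton iteration. */
uint64_t inverse_mod_2_64(uint64_t a) {
    assert(a & 1);              /* even values have no inverse modulo 2^64 */
    uint64_t x = a;             /* correct to 3 bits because a*a == 1 (mod 8) */
    for (int i = 0; i < 5; i++)
        x *= 2 - a * x;         /* each step doubles the number of correct bits */
    return x;
}

/* One step forward: s' = A*s + C (mod 2^64, via unsigned wrap-around). */
uint64_t lcg_forward(uint64_t state) {
    return LCG_A * state + LCG_C;
}

/* One step backward: s = A^-1 * (s' - C) (mod 2^64). */
uint64_t lcg_backward(uint64_t state) {
    return inverse_mod_2_64(LCG_A) * (state - LCG_C);
}

/* Hypothetical mapping from a generator state to one of ncells subqueues. */
unsigned lcg_cell(uint64_t state, unsigned ncells) {
    return (unsigned)((state >> 32) % ncells);
}
```

Stepping forward while pushing and backward while popping visits the same cell indices in reverse order, which is the "replay" idea described in the text. A real implementation would presumably precompute the inverse multiplier once rather than on every backward step.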
doc/theses/thierry_delisle_PhD/thesis/text/intro.tex
r5e99a9a r95b3a9c
7 7 While previous work on the concurrent package of \CFA focused on features and interfaces, this thesis focuses on performance, introducing \glsxtrshort{api} changes only when required by performance considerations. More specifically, this thesis concentrates on scheduling and \glsxtrshort{io}. Prior to this work, the \CFA runtime used a strictly \glsxtrshort{fifo} \gls{rQ}.
8 8
9 This work exclusively concentrates on Linux as it's operating system since the existing \CFA runtime and compiler does not already support other operating systems. Furthermore, as \CFA is yet to be released, supporting version of Linux older tha tthe latest version is not a goal of this work.
9 This work exclusively concentrates on Linux as its operating system since the existing \CFA runtime and compiler do not already support other operating systems. Furthermore, as \CFA is yet to be released, supporting versions of Linux older than the latest version is not a goal of this work.
-
doc/theses/thierry_delisle_PhD/thesis/text/io.tex
r5e99a9a r95b3a9c
1 \chapter{User Level \ glsxtrshort{io}}
2 As mention ned in Section~\ref{prev:io}, User-Level \glsxtrshort{io} requires multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc} using asynchronous \glsxtrshort{io} operations. Various operating systems offer various forms of asynchronous operations and as mentioned in Chapter~\ref{intro}, this work is exclusively focuesd on Linux.
1 \chapter{User Level \io}
2 As mentioned in Section~\ref{prev:io}, User-Level \io requires multiplexing the \io operations of many \glspl{thrd} onto fewer \glspl{proc} using asynchronous \io operations. Different operating systems offer various forms of asynchronous operations and, as mentioned in Chapter~\ref{intro}, this work is exclusively focused on the Linux operating-system.
3 3
4 \section{ Existing options}
5 Since \glsxtrshort{io} operations are generally handled by the
4 \section{Kernel Interface}
5 Since this work fundamentally depends on operating-system support, the first step of any design is to discuss the available interfaces and pick one (or more) as the foundation of the non-blocking \io subsystem.
6 6
7 \subsection{\lstinline|epoll|, \lstinline|poll| and \lstinline|select|}
7 \subsection{\lstinline{O_NONBLOCK}}
8 In Linux, files can be opened with the flag @O_NONBLOCK@~\cite{MAN:open} (or @SOCK_NONBLOCK@~\cite{MAN:accept}, the equivalent for sockets) to use the file descriptors in ``nonblocking mode''. In this mode, ``Neither the @open()@ nor any subsequent \io operations on the [opened file descriptor] will cause the calling
9 process to wait''~\cite{MAN:open}. This feature can be used as the foundation for the non-blocking \io subsystem.
However, for the subsystem to know when an \io operation completes, @O_NONBLOCK@ must be used in conjunction with a system call that monitors when a file descriptor becomes ready, \ie, the next \io operation on it does not cause the process to wait\footnote{In this context, ready means \emph{some} operation can be performed without blocking. It does not mean an operation returning \lstinline{EAGAIN} succeeds on the next try. For example, a ready read may only return a subset of bytes and the read must be issued again for the remaining bytes, at which point it may return \lstinline{EAGAIN}.}.
10 This mechanism is also crucial in determining when all \glspl{thrd} are blocked and the application \glspl{kthrd} can now block.
8 11
9 \subsection{Linux's AIO}
12 There are three options to monitor file descriptors in Linux\footnote{For simplicity, this section omits \lstinline{pselect} and \lstinline{ppoll}. The difference between these system calls and \lstinline{select} and \lstinline{poll}, respectively, is not relevant for this discussion.}: @select@~\cite{MAN:select}, @poll@~\cite{MAN:poll} and @epoll@~\cite{MAN:epoll}. All three of these options offer a system call that blocks a \gls{kthrd} until at least one of many file descriptors becomes ready. The group of file descriptors being waited on is called the \newterm{interest set}.
10 13
14 \paragraph{\lstinline{select}} is the oldest of these options. It takes as input a contiguous array of bits, where each bit represents a file descriptor of interest. On return, it modifies the set in place to identify which of the file descriptors changed status. This destructive change means that calling @select@ in a loop requires re-initializing the array each time. In addition, the number of file descriptors supported has a hard limit. Another limit of @select@ is that once the call is started, the interest set can no longer be modified.
Monitoring a new file descriptor generally requires aborting any in-progress call to @select@\footnote{Starting a new call to \lstinline{select} is possible but requires a distinct kernel thread, and as a result is not an acceptable multiplexing solution when the interest set is large and highly dynamic unless the number of parallel calls to \lstinline{select} can be strictly bounded.}.
11 15
16 \paragraph{\lstinline{poll}} is an improvement over @select@, which removes the hard limit on the number of file descriptors and the need to re-initialize the input on every call. It works using an array of structures as an input rather than an array of bits, thus allowing a more compact input for small interest sets. Like @select@, @poll@ suffers from the limitation that the interest set cannot be changed while the call is blocked.
17
18 \paragraph{\lstinline{epoll}} further improves on these two functions by allowing file descriptors to be dynamically added to and removed from the interest set while a \gls{kthrd} is blocked on an @epoll@ call. This dynamic capability is accomplished by creating an \emph{epoll instance} with a persistent interest set, which is used across multiple calls. This capability significantly reduces synchronization overhead on the part of the caller (in this case the \io subsystem), since the interest set can be modified when adding or removing file descriptors without having to synchronize with other \glspl{kthrd} potentially calling @epoll@.
19
20 However, all three of these system calls have limitations. The @man@ page for @O_NONBLOCK@ mentions that ``[@O_NONBLOCK@] has no effect for regular files and block devices'', which means none of these three system calls are viable multiplexing strategies for these types of \io operations. Furthermore, @epoll@ has been shown to have problems with pipes and ttys~\cit{Peter's examples in some fashion}.
Finally, none of these three system calls is a useful solution for multiplexing \io operations that do not have a corresponding file descriptor, and they can be awkward for operations using multiple file descriptors. 21 22 \subsection{POSIX asynchronous I/O (AIO)} 23 An alternative to @O_NONBLOCK@ is the AIO interface. Its interface lets programmers enqueue operations to be performed asynchronously by the kernel. Completion of these operations can be communicated in various ways: by spawning a new \gls{kthrd}, by sending a Linux signal, or by polling for completion of one or more operations. For this work, spawning a new \gls{kthrd} is counter-productive but a related solution is discussed in Section~\ref{io:morethreads}. Using interrupt handlers can also lead to fairly complicated interactions between subsystems. This leaves polling for completion, which is similar to the previous system calls. While AIO only supports read and write operations to file descriptors, it does not have the same limitation as @O_NONBLOCK@, \ie, the file descriptors can be regular files and block devices. It also supports batching multiple operations in a single system call. 24 25 AIO offers two different approaches to polling: @aio_error@ can be used as a spinning form of polling, returning @EINPROGRESS@ until the operation is completed, and @aio_suspend@ can be used similarly to @select@, @poll@ or @epoll@, to wait until one or more requests have completed. For the purpose of \io multiplexing, @aio_suspend@ is the best interface. However, even if AIO requests can be submitted concurrently, @aio_suspend@ suffers from the same limitation as @select@ and @poll@, \ie, the interest set cannot be dynamically changed while a call to @aio_suspend@ is in progress. AIO also suffers from the limitation of not specifying which requests have completed, \ie, programmers have to poll each request in the interest set using @aio_error@ to identify the completed requests. 
This limitation means that, like @select@ and @poll@ but not @epoll@, the time needed to examine polling results increases based on the total number of requests monitored, not the number of completed requests. 26 Finally, AIO does not seem to be a popular interface, which I believe is due in part to this poor polling interface. Linus Torvalds talks about this interface as follows: 12 27 13 28 \begin{displayquote} 14 AIO is a horrible ad-hoc design, with the main excuse being "other,29 AIO is a horrible ad-hoc design, with the main excuse being ``other, 15 30 less gifted people, made that design, and we are implementing it for 16 31 compatibility because database people - who seldom have any shred of 17 taste - actually use it ".32 taste - actually use it''. 18 33 19 34 But AIO was always really really ugly. … … 24 39 \end{displayquote} 25 40 26 Interestingly, in this e-mail answer, Linus goes on to describe41 Interestingly, in this e-mail, Linus goes on to describe 27 42 ``a true \textit{asynchronous system call} interface'' 28 43 that does … … 30 45 in 31 46 ``some kind of arbitrary \textit{queue up asynchronous system call} model''. 32 This description is actually quite close to the interface of the interfacedescribed in the next section.47 This description is actually quite close to the interface described in the next section. 33 48 34 \subsection{\texttt{io\_uring}} 35 A very recent addition to Linux, @io_uring@\cit{io\_uring} is a framework that aims to solve many of the problems listed with the above mentioned solutions. 49 \subsection{\lstinline{io_uring}} 50 A very recent addition to Linux, @io_uring@~\cite{MAN:io_uring}, is a framework that aims to solve many of the problems listed in the above interfaces. Like AIO, it represents \io operations as entries added to a queue. But like @epoll@, new requests can be submitted while a blocking call waiting for requests to complete is already in progress. 
The @io_uring@ interface uses two ring buffers (referred to simply as rings) at its core: a submit ring to which programmers push \io requests and a completion ring from which programmers poll for completion. 51 52 One of the big advantages over the prior interfaces is that @io_uring@ also supports a much wider range of operations. In addition to supporting reads and writes to any file descriptor like AIO, it supports other operations like @open@, @close@, @fsync@, @accept@, @connect@, @send@, @recv@, @splice@, \etc. 53 54 On top of these, @io_uring@ adds many extras like avoiding copies between the kernel and user-space using shared memory, allowing different mechanisms to communicate with device drivers, and supporting chains of requests, \ie, requests that automatically trigger followup requests on completion. 36 55 37 56 \subsection{Extra Kernel Threads}\label{io:morethreads} 38 Finally, if the operating system does not offer a ny satisfying forms of asynchronous \glsxtrshort{io} operations, a solution is to fake it by creating a pool of \glspl{kthrd} and delegating operations to them in order to avoid blocking \glspl{proc}.57 Finally, if the operating system does not offer a satisfactory form of asynchronous \io operations, an ad-hoc solution is to create a pool of \glspl{kthrd} and delegate operations to it to avoid blocking \glspl{proc}, which is a compromise for multiplexing. In the worst case, where all \glspl{thrd} are consistently blocking on \io, it devolves into 1-to-1 threading. However, regardless of the frequency of \io operations, it achieves the fundamental goal of not blocking \glspl{proc} when \glspl{thrd} are ready to run. This approach is used by languages like Go\cit{Go} and frameworks like libuv\cit{libuv}, since it has the advantage that it can easily be used across multiple operating systems. This advantage is especially relevant for languages like Go, which offer a homogeneous \glsxtrshort{api} across all platforms. 
In contrast, C has a very limited standard \glsxtrshort{api} for \io, \eg, the C standard library has no networking. 39 58 40 59 \subsection{Discussion} 60 These options effectively fall into two broad camps: waiting for \io to be ready versus waiting for \io to complete. All operating systems that support asynchronous \io must offer an interface along one of these lines, but the details vary drastically. For example, FreeBSD offers @kqueue@~\cite{MAN:bsd/kqueue}, which behaves similarly to @epoll@, but with some small quality-of-use improvements, while Windows (Win32)~\cit{https://docs.microsoft.com/en-us/windows/win32/fileio/synchronous-and-asynchronous-i-o} offers ``overlapped I/O'', which handles submissions similarly to @O_NONBLOCK@ with extra flags on the synchronous system call, but waits for completion events, similarly to @io_uring@. 41 61 62 For this project, I selected @io_uring@, in large part because of its generality. While @epoll@ has been shown to be a good solution for socket \io~\cite{DBLP:journals/pomacs/KarstenB20}, @io_uring@'s transparent support for files, pipes, and more complex operations, like @splice@ and @tee@, makes it a better choice as the foundation for a general \io subsystem. 42 63 43 64 \section{Event-Engine} 65 An event engine's responsibility is to use the kernel interface to multiplex many \io operations onto few \glspl{kthrd}. In concrete terms, this means \glspl{thrd} enter the engine through an interface; the event engine then starts the operation and parks the calling \glspl{thrd}, returning control to the \gls{proc}. The parked \glspl{thrd} are then rescheduled by the event engine once the desired operation has completed. 66 67 \subsection{\lstinline{io_uring} in depth} 68 Before going into details on the design of my event engine, more details on @io_uring@ usage are presented, each important in the design of the engine. 69 Figure~\ref{fig:iouring} shows an overview of an @io_uring@ instance. 
70 Two ring buffers are used to communicate with the kernel: one for submissions~(left) and one for completions~(right). 71 The submission ring contains entries, \newterm{Submit Queue Entries} (SQE), produced (appended) by the application when an operation starts and then consumed by the kernel. 72 The completion ring contains entries, \newterm{Completion Queue Entries} (CQE), produced (appended) by the kernel when an operation completes and then consumed by the application. 73 The submission ring contains indexes into the SQE array (denoted \emph{S}) containing entries describing the I/O operation to start; 74 the completion ring contains entries for the completed I/O operation. 75 Multiple @io_uring@ instances can be created, in which case they each have a copy of the data structures in the figure. 76 77 \begin{figure} 78 \centering 79 \input{io_uring.pstex_t} 80 \caption{Overview of \lstinline{io_uring}} 81 % \caption[Overview of \lstinline{io_uring}]{Overview of \lstinline{io_uring} \smallskip\newline Two ring buffer are used to communicate with the kernel, one for completions~(right) and one for submissions~(left). The completion ring contains entries, \newterm{CQE}s: Completion Queue Entries, that are produced by the kernel when an operation completes and then consumed by the application. On the other hand, the application produces \newterm{SQE}s: Submit Queue Entries, which it appends to the submission ring for the kernel to consume. Unlike the completion ring, the submission ring does not contain the entries directly, it indexes into the SQE array (denoted \emph{S}) instead.} 82 \label{fig:iouring} 83 \end{figure} 84 85 New \io operations are submitted to the kernel following 4 steps, which use the components shown in the figure. 86 \begin{enumerate} 87 \item 88 An SQE is allocated from the pre-allocated array (denoted \emph{S} in Figure~\ref{fig:iouring}). 
This array is created at the same time as the @io_uring@ instance, is in kernel-locked memory visible to both the kernel and the application, and has a fixed size determined at creation. How these entries are allocated is not important for the functioning of @io_uring@; the only requirement is that no entry is reused before the kernel has consumed it. 89 \item 90 The SQE is filled according to the desired operation. This step is straightforward; the only detail worth mentioning is that SQEs have a @user_data@ field that must be filled in order to match submission and completion entries. 91 \item 92 The SQE is submitted to the submission ring by appending the index of the SQE to the ring following regular ring buffer steps: \lstinline{buffer[head] = item; head++}. Since the head is visible to the kernel, some memory barriers may be required to prevent the compiler from reordering these operations. Since the submission ring is a regular ring buffer, more than one SQE can be added at once and the head is updated only after all entries are updated. 93 \item 94 The kernel is notified of the change to the ring using the system call @io_uring_enter@. The number of elements appended to the submission ring is passed as a parameter and the number of elements consumed is returned. The @io_uring@ instance can be constructed so this step is not required, but this requires elevated privileges.% and an early version of @io_uring@ had additional restrictions. 95 \end{enumerate} 96 97 \begin{sloppypar} 98 The completion side is simpler: applications call @io_uring_enter@ with the flag @IORING_ENTER_GETEVENTS@ to wait on a desired number of operations to complete. The same call can be used to both submit SQEs and wait for operations to complete. When operations do complete, the kernel appends a CQE to the completion ring and advances the head of the ring. Each CQE contains the result of the operation as well as a copy of the @user_data@ field of the SQE that triggered the operation. 
It is not necessary to call @io_uring_enter@ to get new events because the kernel can directly modify the completion ring. The system call is only needed if the application wants to block waiting for operations to complete. 99 \end{sloppypar} 100 101 The @io_uring_enter@ system call is protected by a lock inside the kernel. This protection means that concurrent calls to @io_uring_enter@ using the same instance are possible, but no performance is gained from parallel calls to @io_uring_enter@. It is possible to do the first three submission steps in parallel; however, doing so requires careful synchronization. 102 103 @io_uring@ also introduces constraints on the number of simultaneous operations that can be ``in flight''. Obviously, SQEs are allocated from a fixed-size array, meaning that there is a hard limit to how many SQEs can be submitted at once. In addition, the @io_uring_enter@ system call can fail because ``The kernel [...] ran out of resources to handle [a request]'' or ``The application is attempting to overcommit the number of requests it can have pending.''. This restriction means \io request bursts may have to be subdivided and submitted in chunks at a later time. 104 105 \subsection{Multiplexing \io: Submission} 106 The submission side is the most complicated aspect of @io_uring@ and its design largely dictates the completion side. 107 108 While it is possible to do the first steps of submission in parallel, the duration of the system call scales with the number of entries submitted. The consequence is that the amount of parallelism used to prepare submissions for the next system call is limited. Beyond this limit, the length of the system call is the throughput-limiting factor. I concluded from early experiments that preparing submissions seems to take about as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}. 
Therefore, the design of the submission engine must manage multiple instances of @io_uring@ running in parallel, effectively sharding @io_uring@ instances. Similarly to scheduling, this sharding can be done privately, \ie, one instance per \gls{proc}, or in decoupled pools, \ie, a pool of \glspl{proc} uses a pool of @io_uring@ instances without one-to-one coupling between any given instance and any given \gls{proc}. 109 110 \subsubsection{Pool of Instances} 111 One approach is to have multiple shared instances. \Glspl{thrd} attempting \io operations pick one of the available instances and submit operations to that instance. Since completions are sent to the same instance, all instances with pending operations must be polled continuously\footnote{As will be described in Chapter~\ref{practice}, this does not translate into constant CPU usage.}. Since there is no coupling between \glspl{proc} and @io_uring@ instances in this approach, \glspl{thrd} running on more than one \gls{proc} can attempt to submit to the same instance concurrently. Since @io_uring@ effectively sets the amount of sharding needed to avoid contention on its internal locks, performance in this approach is based on two aspects: the synchronization needed to submit does not induce more contention than @io_uring@ already does and the scheme to route \io requests to specific @io_uring@ instances does not introduce contention. This second aspect has an oversized importance because it comes into play before the sharding of instances, and as such, all \glspl{hthrd} can contend on the routing algorithm. 112 113 Allocation in this scheme can be handled fairly easily. Free SQEs, \ie, SQEs that are not currently being used to represent a request, can be written to safely and have a field called @user_data@, which the kernel only reads to copy to CQEs. Allocation also requires no ordering guarantee as all free SQEs are interchangeable. Therefore, allocation only requires a simple concurrent bag. 
The only added complexity is that the number of SQEs is fixed, which means allocation can fail. This failure needs to be pushed up to the routing algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without any available SQEs. Ideally, the routing algorithm would block operations up front if none of the instances have available SQEs. 114 115 Once an SQE is allocated, a \gls{thrd} can fill it normally; it simply needs to keep track of the SQE index and which instance it belongs to. 116 117 Once an SQE is filled in, it must be added to the submission ring buffer, an operation that is not thread-safe by itself, and the kernel must be notified using the @io_uring_enter@ system call. The submission ring buffer is the same size as the pre-allocated SQE buffer; therefore, pushing to the ring buffer cannot fail\footnote{This is because it is invalid to have the same \lstinline{sqe} multiple times in the ring buffer.}. However, as mentioned, the system call itself can fail with the expectation that it will be retried once some of the already submitted operations complete. Since multiple SQEs can be submitted to the kernel at once, it is important to strike a balance between batching and latency. Operations that are ready to be submitted should be batched together in few system calls, but at the same time, operations should not be left pending for long periods of time before being submitted. This can be handled by either designating one of the submitting \glspl{thrd} as being responsible for the system call for the current batch of SQEs or by having some other party regularly submit all ready SQEs, \eg, the poller \gls{thrd} mentioned later in this section. 
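The append step described above can be sketched as follows, as a minimal illustration in C11 rather than the actual @io_uring@ memory layout. The type @sub_ring@, the constant @RING_ENTRIES@ and the functions @ring_init@, @ring_push@ and @ring_drain@ are hypothetical names; @ring_drain@ merely stands in for the kernel consuming entries. The index called the head in the text is named @prod@ here, to make clear that it is the one advanced by the application. The sketch assumes a single producer at a time, which is exactly why the shared approach needs extra synchronization around it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_ENTRIES 8u /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for the shared submission ring. */
struct sub_ring {
    uint32_t array[RING_ENTRIES]; /* indexes into the SQE array */
    _Atomic uint32_t prod;        /* next slot to fill, advanced by the application */
    _Atomic uint32_t cons;        /* next slot to consume, advanced by the consumer */
};

static void ring_init(struct sub_ring *r) {
    atomic_init(&r->prod, 0);
    atomic_init(&r->cons, 0);
}

/* Append count SQE indexes, then publish all of them with one release store,
 * so the consumer never observes an index that has not been written yet.
 * Not thread-safe: callers must ensure a single producer at a time. */
static void ring_push(struct sub_ring *r, const uint32_t *idxs, uint32_t count) {
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);
    /* Cannot overflow if every SQE index appears at most once in the ring. */
    assert(count <= RING_ENTRIES - (prod - cons));
    for (uint32_t i = 0; i < count; i++)
        r->array[(prod + i) & (RING_ENTRIES - 1)] = idxs[i];
    atomic_store_explicit(&r->prod, prod + count, memory_order_release);
}

/* Stand-in for the kernel side: copy every published index into out and
 * return how many were consumed. */
static uint32_t ring_drain(struct sub_ring *r, uint32_t *out) {
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_acquire);
    uint32_t n = 0;
    while (cons != prod) {
        out[n++] = r->array[cons & (RING_ENTRIES - 1)];
        cons++;
    }
    atomic_store_explicit(&r->cons, cons, memory_order_release);
    return n;
}
```

Publishing a whole batch with one store is what allows several SQEs to be handed over in a single system call; the trade-off between batch size and latency discussed above is a matter of when that store and the following @io_uring_enter@ call happen.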
118 119 In the case of designating a \gls{thrd}, ideally, when multiple \glspl{thrd} attempt to submit operations to the same @io_uring@ instance, all requests would be batched together and one of the \glspl{thrd}, referred to as the \newterm{submitter}, would do the system call on behalf of the others. In practice, however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a current submitter and a next submitter. Indeed, as long as there is a ``next'' submitter, \glspl{thrd} submitting new \io requests can move on, knowing that some future system call will include their request. Once the system call is done, the submitter must also free SQEs so that the allocator can reuse them. 120 121 Finally, the completion side is much simpler since the @io_uring@ system call enforces a natural synchronization point. Polling simply needs to regularly do the system call, go through the produced CQEs and communicate the result back to the originating \glspl{thrd}. Since a CQE only carries a signed 32-bit result, in addition to the copy of the @user_data@ field, all that is needed to communicate the result is a simple future~\cite{wiki:future}. If the submission side does not designate submitters, polling can also submit all SQEs as it is polling events. A simple approach to polling is to allocate a \gls{thrd} per @io_uring@ instance and simply let the poller \glspl{thrd} poll their respective instances when scheduled. This design is especially convenient for reasons explained in Chapter~\ref{practice}. 122 123 The big advantage of this pool-of-instances approach is that it is fairly flexible. It does not impose restrictions on what \glspl{thrd} submitting \io operations can and cannot do between allocations and submissions. It can also gracefully handle running out of resources, whether SQEs are exhausted or the kernel returns @EBUSY@. 
The downside of this flexibility is that many of the steps used for submitting need complex synchronization to work properly. The routing and allocation algorithm needs to keep track of which ring instances have available SQEs, block incoming requests if no instance is available, prevent barging if \glspl{thrd} are already queued up waiting for SQEs and handle SQEs being freed. The submission side needs to safely append SQEs to the ring buffer, make sure no SQE is dropped or left pending forever, notify the allocation side when SQEs can be reused and handle the kernel returning @EBUSY@. Sharding the @io_uring@ instances should alleviate much of the contention caused by this, but all this synchronization may still have non-zero cost. 124 125 \subsubsection{Private Instances} 126 Another approach is to simply create one ring instance per \gls{proc}. This alleviates the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps. This is effectively the same requirement as using @thread_local@ variables. Since SQEs that are allocated must be submitted to the same ring, on the same \gls{proc}, this effectively forces the application to submit SQEs in allocation order\footnote{The actual requirement is that \glspl{thrd} cannot context switch between allocation and submission. This requirement means that from the subsystem's point of view, the allocation and submission are sequential. To remove this requirement, a \gls{thrd} would need the ability to ``yield to a specific \gls{proc}'', \ie, park with the promise that it will be run next on a specific \gls{proc}, the \gls{proc} attached to the correct ring. This is not a current or planned feature of \CFA.}, greatly simplifying both allocation and submission. In this design, allocation and submission form a partitioned ring buffer as shown in Figure~\ref{fig:pring}. 
Once SQEs are added to the ring buffer, the attached \gls{proc} has a significant amount of flexibility with regard to when to do the system call. Possible options are: when the \gls{proc} runs out of \glspl{thrd} to run, after running a given number of \glspl{thrd}, \etc 127 128 \begin{figure} 129 \centering 130 \input{pivot_ring.pstex_t} 131 \caption[Partitioned ring buffer]{Partitioned ring buffer \smallskip\newline Allocated SQEs are appended to the first partition. When submitting, the partition is simply advanced to include all the SQEs that should be submitted. The kernel considers the partition as the head of the ring.} 132 \label{fig:pring} 133 \end{figure} 134 135 This approach has the advantage that it does not require much of the synchronization needed in the shared approach. This comes at the cost that \glspl{thrd} submitting \io operations have less flexibility: they cannot park or yield, and several exceptional cases are handled poorly. Instances running out of SQEs cannot run \glspl{thrd} wanting to do \io operations; in such a case, the \gls{thrd} needs to be moved to a different \gls{proc}. The only current way of achieving this is to @yield()@, hoping to be scheduled on a different \gls{proc}, which is not guaranteed. Another problematic case is that \glspl{thrd} that do not park for long periods of time will delay the submission of any SQE not already submitted. This issue is similar to the fairness issues of schedulers that use work-stealing, mentioned in the previous chapter. 136 44 137 45 138 46 139 \section{Interface} 140 Finally, the last important part of the \io subsystem is its interface. There are multiple approaches that can be offered to programmers, each with advantages and disadvantages. The new \io subsystem can replace the C runtime's API or extend it. In the latter case, the interface can go from very similar to vastly different. The following sections discuss some useful options using @read@ as an example. 
The standard Linux interface for C is : 141 142 @ssize_t read(int fd, void *buf, size_t count);@. 143 144 \subsection{Replacement} 145 Replacing the C \glsxtrshort{api} 146 147 \subsection{Synchronous Extension} 148 149 \subsection{Asynchronous Extension} 150 151 \subsection{Interface directly to \lstinline{io_uring}} -
doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex
r5e99a9a r95b3a9c 11 11 12 12 \section{Clusters} 13 \CFA allows the option to group user-level threading, in the form of clusters. Both \glspl{thrd} and \glspl{proc} belong to a specific cluster. \Glspl{thrd} are only bescheduled onto \glspl{proc} in the same cluster and scheduling is done independently of other clusters. Figure~\ref{fig:system} shows an overview of the \CFA runtime, which allows programmers to tightly control parallelism. It also opens the door to handling effects like NUMA, by pining clusters to a specific NUMA node\footnote{This is not currently implemented in \CFA, but the only hurdle left is creating a generic interface for cpu masks.}.13 \CFA allows the option to group user-level threading, in the form of clusters. Both \glspl{thrd} and \glspl{proc} belong to a specific cluster. \Glspl{thrd} are only scheduled onto \glspl{proc} in the same cluster and scheduling is done independently of other clusters. Figure~\ref{fig:system} shows an overview of the \CFA runtime, which allows programmers to tightly control parallelism. It also opens the door to handling effects like NUMA, by pining clusters to a specific NUMA node\footnote{This is not currently implemented in \CFA, but the only hurdle left is creating a generic interface for cpu masks.}. 14 14 15 15 \begin{figure} … … 25 25 26 26 \section{\glsxtrshort{io}}\label{prev:io} 27 Prior to this work, the \CFA runtime did not add any particular support for \glsxtrshort{io} operations. %\CFA being built on C, this means that,28 While all I/O operations available in C are available in \CFA, \glsxtrshort{io} operations are designed for the POSIX threading model~\cite{pthreads}. Using these 1:1 threading operations in an M:N threading model means I/O operations block \glspl{proc} instead of \glspl{thrd}. While this can work in certain cases, it limits the number of concurrent operations to the number of \glspl{proc} rather than \glspl{thrd}. 
It also means deadlock can occur because all \glspl{proc} are blocked even if at least one \gls{thrd} is ready to run. A simple example of this type of deadlock would be as follows: 27 Prior to this work, the \CFA runtime did not add any particular support for \glsxtrshort{io} operations. While all \glsxtrshort{io} operations available in C are available in \CFA, \glsxtrshort{io} operations are designed for the POSIX threading model~\cite{pthreads}. Using these 1:1 threading operations in an M:N threading model means \glsxtrshort{io} operations block \glspl{proc} instead of \glspl{thrd}. While this can work in certain cases, it limits the number of concurrent operations to the number of \glspl{proc} rather than \glspl{thrd}. It also means deadlock can occur because all \glspl{proc} are blocked even if at least one \gls{thrd} is ready to run. A simple example of this type of deadlock would be as follows: 28 29 29 \begin{quote} 30 30 Given a simple network program with 2 \glspl{thrd} and a single \gls{proc}, one \gls{thrd} sends network requests to a server and the other \gls{thrd} waits for a response from the server. If the second \gls{thrd} races ahead, it may wait for responses to requests that have not been sent yet. In theory, this should not be a problem, even if the second \gls{thrd} waits, because the first \gls{thrd} is still ready to run and should be able to get CPU time to send the request. With M:N threading, while the first \gls{thrd} is ready, the lone \gls{proc} \emph{cannot} run the first \gls{thrd} if it is blocked in the \glsxtrshort{io} operation of the second \gls{thrd}. If this happen, the system is in a synchronization deadlock\footnote{In this example, the deadlocked could be resolved if the server sends unprompted messages to the client. However, this solution is not general and may not be appropriate even in this simple case.}. 
31 31 \end{quote} 32 Therefore, one of the objective of this work is to introduce \emph{User-Level \glsxtrshort{io}}, like \glslink{uthrding}{User-Level \emph{Threading}} blocks \glspl{thrd} rather than \glspl{proc} when doing \glsxtrshort{io} operations, which entails multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc}. This multiplexing requires that a single \gls{proc} be able to execute multiple I/O operations in parallel. This requirement cannot be done with operations that block \glspl{proc}, \ie \glspl{kthrd}, since the first operation would prevent starting new operations for its blocking duration. Executing I/O operations in parallel requires \emph{asynchronous} \glsxtrshort{io}, sometimes referred to as \emph{non-blocking}, since the \gls{kthrd} does not block.33 32 34 \section{Interoperating with C} 33 Therefore, one of the objective of this work is to introduce \emph{User-Level \glsxtrshort{io}}, like \glslink{uthrding}{User-Level \emph{Threading}} blocks \glspl{thrd} rather than \glspl{proc} when doing \glsxtrshort{io} operations, which entails multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc}. This multiplexing requires that a single \gls{proc} be able to execute multiple \glsxtrshort{io} operations in parallel. This requirement cannot be done with operations that block \glspl{proc}, \ie \glspl{kthrd}, since the first operation would prevent starting new operations for its blocking duration. Executing \glsxtrshort{io} operations in parallel requires \emph{asynchronous} \glsxtrshort{io}, sometimes referred to as \emph{non-blocking}, since the \gls{kthrd} does not block. 34 35 \section{Interoperating with \texttt{C}} 35 36 While \glsxtrshort{io} operations are the classical example of operations that block \glspl{kthrd}, the non-blocking challenge extends to all blocking system-calls. 
The POSIX standard states~\cite[\S~2.9.1]{POSIX17}: 36 37 \begin{quote} … … 44 45 \begin{enumerate} 45 46 \item Precisely identifying blocking C calls is difficult. 46 \item Introducing newcode can have a significant impact on general performance.47 \item Introducing control points code can have a significant impact on general performance. 47 48 \end{enumerate} 48 Because of these consequences, this work does not attempt to ``sandbox'' calls to C. Therefore, it is possible for an unidentified library calls toblock a \gls{kthrd} leading to deadlocks in \CFA's M:N threading model, which would not occur in a traditional 1:1 threading model. Currently, all M:N thread systems interacting with UNIX without sandboxing suffer from this problem but manage to work very well in the majority of applications. Therefore, a complete solution to this problem is outside the scope of this thesis.49 Because of these consequences, this work does not attempt to ``sandbox'' calls to C. Therefore, it is possible calls from an unidentified library will block a \gls{kthrd} leading to deadlocks in \CFA's M:N threading model, which would not occur in a traditional 1:1 threading model. Currently, all M:N thread systems interacting with UNIX without sandboxing suffer from this problem but manage to work very well in the majority of applications. Therefore, a complete solution to this problem is outside the scope of this thesis. -
doc/theses/thierry_delisle_PhD/thesis/thesis.tex
r5e99a9a r95b3a9c 1 % uWaterloo Thesis Template for LaTeX 2 % Last Updated June 14, 2017 by Stephen Carr, IST Client Services 3 % FOR ASSISTANCE, please send mail to rt-IST-CSmathsci@ist.uwaterloo.ca 4 5 % Effective October 2006, the University of Waterloo 6 % requires electronic thesis submission. See the uWaterloo thesis regulations at 1 %====================================================================== 2 % University of Waterloo Thesis Template for LaTeX 3 % Last Updated November, 2020 4 % by Stephen Carr, IST Client Services, 5 % University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada 6 % FOR ASSISTANCE, please send mail to request@uwaterloo.ca 7 8 % DISCLAIMER 9 % To the best of our knowledge, this template satisfies the current uWaterloo thesis requirements. 10 % However, it is your responsibility to assure that you have met all requirements of the University and your particular department. 11 12 % Many thanks for the feedback from many graduates who assisted the development of this template. 13 % Also note that there are explanatory comments and tips throughout this template. 14 %====================================================================== 15 % Some important notes on using this template and making it your own... 16 17 % The University of Waterloo has required electronic thesis submission since October 2006. 18 % See the uWaterloo thesis regulations at 7 19 % https://uwaterloo.ca/graduate-studies/thesis. 8 9 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package 10 % configuration below. THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT. 11 % You can view the information if you view Properties of the PDF document. 12 13 % Many faculties/departments also require one or more printed 14 % copies. This template attempts to satisfy both types of output. 15 % It is based on the standard "book" document class which provides all necessary 16 % sectioning structures and allows multi-part theses. 
17 18 % DISCLAIMER 19 % To the best of our knowledge, this template satisfies the current uWaterloo requirements. 20 % However, it is your responsibility to assure that you have met all 21 % requirements of the University and your particular department. 22 % Many thanks for the feedback from many graduates that assisted the development of this template. 23 24 % ----------------------------------------------------------------------- 25 26 % By default, output is produced that is geared toward generating a PDF 27 % version optimized for viewing on an electronic display, including 28 % hyperlinks within the PDF. 29 20 % This thesis template is geared towards generating a PDF version optimized for viewing on an electronic display, including hyperlinks within the PDF. 21 22 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package configuration below. 23 % THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT. 24 % You can view the information if you view properties of the PDF document. 25 26 % Many faculties/departments also require one or more printed copies. 27 % This template attempts to satisfy both types of output. 28 % See additional notes below. 29 % It is based on the standard "book" document class which provides all necessary sectioning structures and allows multi-part theses. 30 31 % If you are using this template in Overleaf (cloud-based collaboration service), then it is automatically processed and previewed for you as you edit. 32 33 % For people who prefer to install their own LaTeX distributions on their own computers, and process the source files manually, the following notes provide the sequence of tasks: 34 30 35 % E.g. 
to process a thesis called "mythesis.tex" based on this template, run: 31 36 32 37 % pdflatex mythesis -- first pass of the pdflatex processor 33 38 % bibtex mythesis -- generates bibliography from .bib data file(s) 34 % makeindex -- should be run only if an index is used 39 % makeindex -- should be run only if an index is used 35 40 % pdflatex mythesis -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc. 36 % pdflatex mythesis -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc. 37 38 % If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex 39 % file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu). 40 % Then click the PDFLaTeX button two more times. If you have an index as well, 41 % you'll need to run MakeIndex from the Tools menu as well, before running pdflatex 41 % pdflatex mythesis -- it takes a couple of passes to completely process all cross-references 42 43 % If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu). 44 % Then click the PDFLaTeX button two more times. 45 % If you have an index as well,you'll need to run MakeIndex from the Tools menu as well, before running pdflatex 42 46 % the last two times. 43 47 44 % N.B. The "pdftex" program allows graphics in the following formats to be 45 % included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF 46 % Tip 1: Generate your figures and photos in the size you want them to appear 47 % in your thesis, rather than scaling them with \includegraphics options. 48 % Tip 2: Any drawings you do should be in scalable vector graphic formats: 49 % SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in 50 % the final PDF as well. 51 % Tip 3: Photographs should be cropped and compressed so as not to be too large. 
52 53 % To create a PDF output that is optimized for double-sided printing: 54 % 55 % 1) comment-out the \documentclass statement in the preamble below, and 56 % un-comment the second \documentclass line. 57 % 58 % 2) change the value assigned below to the boolean variable 59 % "PrintVersion" from "false" to "true". 60 61 % --------------------- Start of Document Preamble ----------------------- 62 63 % Specify the document class, default style attributes, and page dimensions 48 % N.B. The "pdftex" program allows graphics in the following formats to be included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF 49 % Tip: Generate your figures and photos in the size you want them to appear in your thesis, rather than scaling them with \includegraphics options. 50 % Tip: Any drawings you do should be in scalable vector graphic formats: SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the final PDF as well. 51 % Tip: Photographs should be cropped and compressed so as not to be too large. 52 53 % To create a PDF output that is optimized for double-sided printing: 54 % 1) comment-out the \documentclass statement in the preamble below, and un-comment the second \documentclass line. 55 % 2) change the value assigned below to the boolean variable "PrintVersion" from " false" to "true". 56 57 %====================================================================== 58 % D O C U M E N T P R E A M B L E 59 % Specify the document class, default style attributes, and page dimensions, etc. 
64 60 % For hyperlinked PDF, suitable for viewing on a computer, use this: 65 61 \documentclass[letterpaper,12pt,titlepage,oneside,final]{book} 66 62 67 % For PDF, suitable for double-sided printing, change the PrintVersion variable below 68 % to "true" and use this \documentclass line instead of the one above: 63 % For PDF, suitable for double-sided printing, change the PrintVersion variable below to "true" and use this \documentclass line instead of the one above: 69 64 %\documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book} 70 65 71 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the 72 % print-optimized version will ignore \href tags (redefined by hyperref pkg). 66 % Some LaTeX commands I define for my own nomenclature. 67 % If you have to, it's easier to make changes to nomenclature once here than in a million places throughout your thesis! 68 \newcommand{\package}[1]{\textbf{#1}} % package names in bold text 69 \newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font 70 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the print-optimized version will ignore \href tags (redefined by hyperref pkg). 71 %\newcommand{\texorpdfstring}[2]{#1} % does nothing, but defines the command 72 % Anything defined here may be redefined by packages added below... 73 73 74 74 % This package allows if-then-else control structures. … … 76 76 \newboolean{PrintVersion} 77 77 \setboolean{PrintVersion}{false} 78 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies 79 % by overriding some options of the hyperref package below. 78 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies by overriding some options of the hyperref package, called below. 
80 79 81 80 %\usepackage{nomencl} % For a nomenclature (optional; available from ctan.org) 82 81 \usepackage{amsmath,amssymb,amstext} % Lots of math symbols and environments 82 \usepackage{xcolor} 83 83 \usepackage{graphicx} % For including graphics 84 84 85 85 % Hyperlinks make it very easy to navigate an electronic document. 86 % In addition, this is where you should specify the thesis title 87 % and author as they appear in the properties of the PDF document. 86 % In addition, this is where you should specify the thesis title and author as they appear in the properties of the PDF document. 88 87 % Use the "hyperref" package 89 88 % N.B. HYPERREF MUST BE THE LAST PACKAGE LOADED; ADD ADDITIONAL PKGS ABOVE 90 89 \usepackage[pagebackref=false]{hyperref} % with basic options 91 % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing. 90 %\usepackage[pdftex,pagebackref=true]{hyperref} 91 % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing. 92 92 \hypersetup{ 93 93 plainpages=false, % needed if Roman numbers in frontpages 94 unicode=false, % non-Latin characters in Acrobat ’s bookmarks95 pdftoolbar=true, % show Acrobat ’s toolbar?96 pdfmenubar=true, % show Acrobat ’s menu?94 unicode=false, % non-Latin characters in Acrobat's bookmarks 95 pdftoolbar=true, % show Acrobat's toolbar? 96 pdfmenubar=true, % show Acrobat's menu? 
97 97 pdffitwindow=false, % window fit to page when opened 98 98 pdfstartview={FitH}, % fits the width of the page to the window … … 110 110 \ifthenelse{\boolean{PrintVersion}}{ % for improved print quality, change some hyperref options 111 111 \hypersetup{ % override some previously defined hyperref options 112 citecolor=black, 113 filecolor=black, 114 linkcolor=black, 112 citecolor=black,% 113 filecolor=black,% 114 linkcolor=black,% 115 115 urlcolor=black 116 116 }}{} % end of ifthenelse (no else) … … 120 120 % although it's supposed to be in both the TeX Live and MikTeX distributions. There are also documentation and 121 121 % installation instructions there. 122 \renewcommand*{\glstextformat}[1]{\textsf{#1}} 122 \makeatletter 123 \newcommand*{\glsplainhyperlink}[2]{% 124 \colorlet{currenttext}{.}% store current text color 125 \colorlet{currentlink}{\@linkcolor}% store current link color 126 \hypersetup{linkcolor=currenttext}% set link color 127 \hyperlink{#1}{#2}% 128 \hypersetup{linkcolor=currentlink}% reset to default 129 } 130 \let\@glslink\glsplainhyperlink 131 \makeatother 123 132 124 133 \usepackage{csquotes} … … 126 135 127 136 % Setting up the page margins... 128 \setlength{\textheight}{9in}\setlength{\topmargin}{-0.45in}\setlength{\headsep}{0.25in} 137 \setlength{\textheight}{9in} 138 \setlength{\topmargin}{-0.45in} 139 \setlength{\headsep}{0.25in} 129 140 % uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at the 130 % top, bottom, and outside page edges and a 1.125 in. (81pt) gutter 131 % margin (on binding side). While this is not an issue for electronic 132 % viewing, a PDF may be printed, and so we have the same page layout for 133 % both printed and electronic versions, we leave the gutter margin in. 141 % top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin (on binding side). 
142 % While this is not an issue for electronic viewing, a PDF may be printed, and so we have the same page layout for both printed and electronic versions, we leave the gutter margin in. 134 143 % Set margins to minimum permitted by uWaterloo thesis regulations: 135 144 \setlength{\marginparwidth}{0pt} % width of margin notes … … 140 149 \setlength{\evensidemargin}{0.125in} % Adds 1/8 in. to binding side of all 141 150 % even-numbered pages when the "twoside" printing option is selected 142 \setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages 143 % when "oneside" printing is selected, and to the left of all odd-numbered 144 % pages when "twoside" printing is selected 145 \setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and 146 % side margins as above 151 \setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages when "oneside" printing is selected, and to the left of all odd-numbered pages when "twoside" printing is selected 152 \setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and side margins as above 147 153 \raggedbottom 148 154 149 % The following statement specifies the amount of space between 150 % paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount. 155 % The following statement specifies the amount of space between paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount. 151 156 \setlength{\parskip}{\medskipamount} 152 157 153 % The following statement controls the line spacing. The default 154 % spacing corresponds to good typographic conventions and only slight 155 % changes (e.g., perhaps "1.2"), if any, should be made. 158 % The following statement controls the line spacing. 159 % The default spacing corresponds to good typographic conventions and only slight changes (e.g., perhaps "1.2"), if any, should be made. 
156 160 \renewcommand{\baselinestretch}{1} % this is the default line space setting 157 161 158 % By default, each chapter will start on a recto (right-hand side) 159 % page. We also force each section of the front pages to start on 160 % a recto page by inserting \cleardoublepage commands. 161 % In many cases, this will require that the verso page be 162 % blank and, while it should be counted, a page number should not be 163 % printed. The following statements ensure a page number is not 164 % printed on an otherwise blank verso page. 162 % By default, each chapter will start on a recto (right-hand side) page. 163 % We also force each section of the front pages to start on a recto page by inserting \cleardoublepage commands. 164 % In many cases, this will require that the verso (left-hand) page be blank, and while it should be counted, a page number should not be printed. 165 % The following statements ensure a page number is not printed on an otherwise blank verso page. 165 166 \let\origdoublepage\cleardoublepage 166 167 \newcommand{\clearemptydoublepage}{% … … 194 195 \input{common} 195 196 \CFAStyle % CFA code-style for all languages 196 \lstset{ basicstyle=\linespread{0.9}\tt}197 \lstset{language=CFA,basicstyle=\linespread{0.9}\tt} % CFA default language 197 198 198 199 % glossary of terms to use … … 200 201 \makeindex 201 202 202 %====================================================================== 203 % L O G I C A L D O C U M E N T -- the content of your thesis 203 \newcommand\io{\glsxtrshort{io}\xspace}% 204 205 %====================================================================== 206 % L O G I C A L D O C U M E N T 207 % The logical document contains the main content of your thesis. 208 % Being a large document, it is a good idea to divide your thesis into several files, each one containing one chapter or other significant chunk of content, so you can easily shuffle things around later if desired. 
204 209 %====================================================================== 205 210 \begin{document} 206 211 207 % For a large document, it is a good idea to divide your thesis208 % into several files, each one containing one chapter.209 % To illustrate this idea, the "front pages" (i.e., title page,210 % declaration, borrowers' page, abstract, acknowledgements,211 % dedication, table of contents, list of tables, list of figures,212 % nomenclature) are contained within the file "uw-ethesis-frontpgs.tex" which is213 % included into the document by the following statement.214 212 %---------------------------------------------------------------------- 215 213 % FRONT MATERIAL 214 % title page,declaration, borrowers' page, abstract, acknowledgements, 215 % dedication, table of contents, list of tables, list of figures, nomenclature, etc. 216 216 %---------------------------------------------------------------------- 217 217 \input{text/front.tex} 218 218 219 220 219 %---------------------------------------------------------------------- 221 220 % MAIN BODY 222 % ----------------------------------------------------------------------223 % Because this is a short document, and to reduce the number of files224 % needed for this template, the chapters are not separate225 % documents as suggested above, but you get the idea. If they were226 % separate documents, they would each start with the \chapter command, i.e,227 % do not contain \documentclass or \begin{document} and \end{document} commands. 221 % We suggest using a separate file for each chapter of your thesis. 222 % Start each chapter file with the \chapter command. 223 % Only use \documentclass or \begin{document} and \end{document} commands in this master document. 224 % Tip: Putting each sentence on a new line is a way to simplify later editing. 
225 %---------------------------------------------------------------------- 226 228 227 \part{Introduction} 229 228 \input{text/intro.tex} … … 232 231 \part{Design} 233 232 \input{text/core.tex} 233 \input{text/io.tex} 234 234 \input{text/practice.tex} 235 \input{text/io.tex}236 235 \part{Evaluation} 237 236 \label{Evaluation} … … 243 242 %---------------------------------------------------------------------- 244 243 % END MATERIAL 245 %---------------------------------------------------------------------- 246 247 % B I B L I O G R A P H Y 248 % ----------------------- 249 250 % The following statement selects the style to use for references. It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels. 244 % Bibliography, Appendices, Index, etc. 245 %---------------------------------------------------------------------- 246 247 % Bibliography 248 249 % The following statement selects the style to use for references. 250 % It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels. 251 251 \bibliographystyle{plain} 252 252 % This specifies the location of the file containing the bibliographic information. 253 % It assumes you're using BibTeX (if not, why not?). 254 \cleardoublepage % This is needed if the book class is used, to place the anchor in the correct page, 255 % because the bibliography will start on its own page. 256 % Use \clearpage instead if the document class uses the "oneside" argument 253 % It assumes you're using BibTeX to manage your references (if not, why not?). 254 \cleardoublepage % This is needed if the "book" document class is used, to place the anchor in the correct page, because the bibliography will start on its own page. 
255 % Use \clearpage instead if the document class uses the "oneside" argument 257 256 \phantomsection % With hyperref package, enables hyperlinking from the table of contents to bibliography 258 257 % The following statement causes the title "References" to be used for the bibliography section: … … 263 262 264 263 \bibliography{local,pl} 265 % Tip 5: You can create multiple .bib files to organize your references.264 % Tip: You can create multiple .bib files to organize your references. 266 265 % Just list them all in the \bibliogaphy command, separated by commas (no spaces). 267 266 268 % % The following statement causes the specified references to be added to the bibliography% even if they were not269 % % cited in the text.The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional).267 % The following statement causes the specified references to be added to the bibliography even if they were not cited in the text. 268 % The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional). 270 269 % \nocite{*} 270 %---------------------------------------------------------------------- 271 272 % Appendices 271 273 272 274 % The \appendix statement indicates the beginning of the appendices. 273 275 \appendix 274 % Add a title page before the appendices and a line in the Table of Contents276 % Add an un-numbered title page before the appendices and a line in the Table of Contents 275 277 \chapter*{APPENDICES} 276 278 \addcontentsline{toc}{chapter}{APPENDICES} 279 % Appendices are just more chapters, with different labeling (letters instead of numbers). 
277 280 %====================================================================== 278 281 \chapter[PDF Plots From Matlab]{Matlab Code for Making a PDF Plot} … … 312 315 %\input{thesis.ind} % index 313 316 314 \phantomsection 315 316 \end{document} 317 \phantomsection % allows hyperref to link to the correct page 318 319 %---------------------------------------------------------------------- 320 \end{document} % end of logical document
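The print/screen switch touched by this diff can be illustrated in isolation. The following is a minimal sketch, not the thesis source: it assumes only the `ifthen`, `xcolor`, and `hyperref` packages, and the screen link colours (`blue`, `green`, `cyan`) and the chapter label `ch:example` are illustrative choices that do not come from the changeset. It shows how setting the `PrintVersion` boolean overrides the hyperref colour options so links print in black.

```latex
% Minimal sketch (assumed standalone file, not thesis.tex itself).
\documentclass[letterpaper,12pt,titlepage,oneside,final]{book}

\usepackage{ifthen}                 % provides \newboolean, \setboolean, \ifthenelse
\newboolean{PrintVersion}
\setboolean{PrintVersion}{false}    % change to "true" for hard copies

\usepackage{xcolor}
% hyperref is loaded last, as the template comments advise.
\usepackage[pagebackref=false]{hyperref}
\hypersetup{
    colorlinks=true,                % coloured text instead of boxed links
    linkcolor=blue,                 % illustrative screen colours (assumption)
    citecolor=green,
    urlcolor=cyan
}

% For print, override the colours so all links render in black.
\ifthenelse{\boolean{PrintVersion}}{%
    \hypersetup{%
        citecolor=black,%
        filecolor=black,%
        linkcolor=black,%
        urlcolor=black
    }%
}{} % no else branch

\begin{document}
\chapter{Example}\label{ch:example}
See Chapter~\ref{ch:example} for a coloured (or, in print mode, black) link.
\end{document}
```

Following the processing notes in the template comments, a file like this would be run through `pdflatex` twice so the cross-reference resolves; `bibtex` and `makeindex` are only needed once a bibliography or index is added.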