Changeset 95b3a9c for doc/theses


Timestamp:
Feb 17, 2021, 12:45:36 PM (5 years ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum, stuck-waitfor-destruct
Children:
e7c077a
Parents:
5e99a9a (diff), 9fb1367 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Location:
doc/theses
Files:
3 added
17 edited

  • doc/theses/andrew_beach_MMath/existing.tex

    r5e99a9a r95b3a9c  
    1 \chapter{\texorpdfstring{\CFA Existing Features}{Cforall Existing Features}}
     1\chapter{\CFA Existing Features}
    22
    33\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with
     
    1212obvious to the reader.
    1313
    14 \section{\texorpdfstring{Overloading and \lstinline|extern|}{Overloading and extern}}
     14\section{Overloading and \lstinline{extern}}
    1515\CFA has extensive overloading, allowing multiple definitions of the same name
    1616to be defined.~\cite{Moss18}
     
    4242
    4343\section{Reference Type}
    44 \CFA adds a rebindable reference type to C, but more expressive than the \CC
     44\CFA adds a rebindable reference type to C, but more expressive than the \Cpp
    4545reference.  Multi-level references are allowed and act like auto-dereferenced
    4646pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA
     
    5959
    6060Both constructors and destructors are operators, which means they are just
    61 functions with special operator names rather than type names in \CC. The
     61functions with special operator names rather than type names in \Cpp. The
    6262special operator names may be used to call the functions explicitly (not
    63 allowed in \CC for constructors).
     63allowed in \Cpp for constructors).
    6464
    6565In general, operator names in \CFA are constructed by bracketing an operator
     
    8888matching overloaded destructor @void ^?{}(T &);@ is called.  Without explicit
    8989definition, \CFA creates a default and copy constructor, destructor and
    90 assignment (like \CC). It is possible to define constructors/destructors for
     90assignment (like \Cpp). It is possible to define constructors/destructors for
    9191basic and existing types.
    9292
     
    9494\CFA uses parametric polymorphism to create functions and types that are
    9595defined over multiple types. \CFA polymorphic declarations serve the same role
    96 as \CC templates or Java generics. The ``parametric'' means the polymorphism is
     96as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
    9797accomplished by passing argument operations to associate \emph{parameters} at
    9898the call site, and these parameters are used in the function to differentiate
     
    134134
    135135Note, a function named @do_once@ is not required in the scope of @do_twice@ to
    136 compile it, unlike \CC template expansion. Furthermore, call-site inferencing
     136compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
    137137allows local replacement of the most specific parametric functions needs for a
    138138call.
     
    178178}
    179179\end{cfa}
    180 The generic type @node(T)@ is an example of a polymorphic-type usage.  Like \CC
     180The generic type @node(T)@ is an example of a polymorphic-type usage.  Like \Cpp
    181181templates usage, a polymorphic-type usage must specify a type parameter.
    182182
  • doc/theses/andrew_beach_MMath/features.tex

    r5e99a9a r95b3a9c  
    55
    66\section{Virtuals}
     7Virtual types and casts are not part of the exception system nor are they
     8required for an exception system. But an object-oriented style hierarchy is a
     9great way of organizing exceptions so a minimal virtual system has been added
     10to \CFA.
     11
     12The pattern of a simple hierarchy, borrowed from object-oriented
     13programming, was chosen for several reasons.
     14The first is that it allows new exceptions to be added in user code
     15and in libraries independently of each other. Another is it allows for
     16different levels of exception grouping (all exceptions, all IO exceptions or
     17a particular IO exception). It also provides a simple way of passing
     18data back and forth across the throw.
     19
    720Virtual types and casts are not required for a basic exception-system but are
    821useful for advanced exception features. However, \CFA is not object-oriented so
    9 there is no obvious concept of virtuals.  Hence, to create advanced exception
    10 features for this work, I needed to designed and implemented a virtual-like
     22there is no obvious concept of virtuals. Hence, to create advanced exception
     23features for this work, I needed to design and implement a virtual-like
    1124system for \CFA.
    1225
     26% NOTE: Maybe we should put less of the rationale here.
    1327Object-oriented languages often organize exceptions into a simple hierarchy,
    1428\eg Java.
     
    3044\end{center}
    3145The hierarchy provides the ability to handle an exception at different degrees
    32 of specificity (left to right).  Hence, it is possible to catch a more general
     46of specificity (left to right). Hence, it is possible to catch a more general
    3347exception-type in higher-level code where the implementation details are
    3448unknown, which reduces tight coupling to the lower-level implementation.
     
    6175While much of the virtual infrastructure is created, it is currently only used
    6276internally for exception handling. The only user-level feature is the virtual
    63 cast, which is the same as the \CC \lstinline[language=C++]|dynamic_cast|.
     77cast, which is the same as the \Cpp \lstinline[language=C++]|dynamic_cast|.
    6478\label{p:VirtualCast}
    6579\begin{cfa}
    6680(virtual TYPE)EXPRESSION
    6781\end{cfa}
    68 Note, the syntax and semantics matches a C-cast, rather than the unusual \CC
    69 syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be a
    70 pointer to a virtual type. The cast dynamically checks if the @EXPRESSION@ type
    71 is the same or a subtype of @TYPE@, and if true, returns a pointer to the
     82Note, the syntax and semantics match a C-cast, rather than the function-like
     83\Cpp syntax for special casts. Both the type of @EXPRESSION@ and @TYPE@ must be
     84a pointer to a virtual type.
     85The cast dynamically checks if the @EXPRESSION@ type is the same or a subtype
     86of @TYPE@, and if true, returns a pointer to the
    7287@EXPRESSION@ object, otherwise it returns @0p@ (null pointer).
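
The check performed by the virtual cast can be pictured with a small C model. This is only a sketch under stated assumptions: the @vtable@ and @object@ layouts, the @parent@ link and the function @virtual_cast@ are hypothetical names invented for illustration and are not the \CFA implementation. It assumes every virtual table records its parent table, so the cast walks the parent chain looking for the target type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: each virtual table records its parent table. */
struct vtable {
    const struct vtable *parent;   /* NULL for the root of the hierarchy */
    const char *name;
};

/* Hypothetical layout: each virtual object starts with its table pointer. */
struct object {
    const struct vtable *vt;
};

/* A tiny hierarchy: exception <- io_error, exception <- arithmetic. */
const struct vtable exception_vt  = { NULL,          "exception"  };
const struct vtable io_error_vt   = { &exception_vt, "io_error"   };
const struct vtable arithmetic_vt = { &exception_vt, "arithmetic" };

/* Return obj if its dynamic type is target or a descendant of target,
 * otherwise return NULL. Accepting a NULL obj is an assumption of this
 * sketch. */
struct object *virtual_cast(const struct vtable *target, struct object *obj) {
    if (obj == NULL) return NULL;
    for (const struct vtable *t = obj->vt; t != NULL; t = t->parent) {
        if (t == target) return obj;
    }
    return NULL;
}
```

In this model, casting an @io_error@ object to @exception@ or to @io_error@ returns the object, while casting it to @arithmetic@ returns the null pointer.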
    7388
     
    7893
    7994Exceptions are defined by the trait system; there are a series of traits, and
    80 if a type satisfies them, then it can be used as an exception.  The following
     95if a type satisfies them, then it can be used as an exception. The following
    8196is the base trait all exceptions need to match.
    8297\begin{cfa}
    8398trait is_exception(exceptT &, virtualT &) {
    84         virtualT const & @get_exception_vtable@(exceptT *);
     99        virtualT const & get_exception_vtable(exceptT *);
    85100};
    86101\end{cfa}
    87 The function takes any pointer, including the null pointer, and returns a
    88 reference to the virtual-table object. Defining this function also establishes
    89 the virtual type and a virtual-table pair to the \CFA type-resolver and
    90 promises @exceptT@ is a virtual type and a child of the base exception-type.
    91 
    92 \PAB{I do not understand this paragraph.}
    93 One odd thing about @get_exception_vtable@ is that it should always be a
    94 constant function, returning the same value regardless of its argument.  A
    95 pointer or reference to the virtual table instance could be used instead,
    96 however using a function has some ease of implementation advantages and allows
    97 for easier disambiguation because the virtual type name (or the address of an
    98 instance that is in scope) can be used instead of the mangled virtual table
    99 name.  Also note the use of the word ``promise'' in the trait
    100 description. Currently, \CFA cannot check to see if either @exceptT@ or
    101 @virtualT@ match the layout requirements. This is considered part of
    102 @get_exception_vtable@'s correct implementation.
    103 
    104 \section{Raise}
    105 \CFA provides two kinds of exception raise: termination
    106 \see{\VRef{s:Termination}} and resumption \see{\VRef{s:Resumption}}, which are
    107 specified with the following traits.
     102The trait is defined over two types, the exception type and the virtual table
     103type. This should be one-to-one: each exception type has only one virtual
     104table type and vice versa. The only assertion in the trait is
     105@get_exception_vtable@, which takes a pointer of the exception type and
     106returns a reference to the virtual table type instance.
     107
     108The function @get_exception_vtable@ is actually a constant function.
     109Regardless of the value passed in (including the null pointer) it should
     110return a reference to the virtual table instance for that type.
     111The reason it is a function instead of a constant is that it makes type
     112annotations easier to write as you can use the exception type instead of the
     113virtual table type, which usually has a mangled name.
     114% Also \CFA's trait system handles functions better than constants and doing
     115% it this way reduces the amount of boilerplate we need.
     116
     117% I did have a note about how it is the programmer's responsibility to make
     118% sure the function is implemented correctly. But this is true of every
     119% similar system I know of (except Agda's I guess) so I took it out.
     120
     121There are two more traits for exceptions: @is_termination_exception@ and
     122@is_resumption_exception@. They are defined as follows:
     123
    108124\begin{cfa}
    109125trait is_termination_exception(
    110126                exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
    111         void @defaultTerminationHandler@(exceptT &);
     127        void defaultTerminationHandler(exceptT &);
    112128};
    113 \end{cfa}
    114 The function is required to allow a termination raise, but is only called if a
    115 termination raise does not find an appropriate handler.
    116 
    117 Allowing a resumption raise is similar.
    118 \begin{cfa}
     129
    119130trait is_resumption_exception(
    120131                exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
    121         void @defaultResumptionHandler@(exceptT &);
     132        void defaultResumptionHandler(exceptT &);
    122133};
    123134\end{cfa}
    124 The function is required to allow a resumption raise, but is only called if a
    125 resumption raise does not find an appropriate handler.
    126 
    127 Finally there are three convenience macros for referring to the these traits:
    128 @IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.  Each
    129 takes the virtual type's name, and for polymorphic types only, the
    130 parenthesized list of polymorphic arguments. These macros do the name mangling
    131 to get the virtual-table name and provide the arguments to both sides
    132 \PAB{What's a ``side''?}
     135
     136In other words, they make sure that a given type and virtual type form an
     137exception and define one of the two default handlers. These default handlers
     138are used in the main exception handling operations \see{Exception Handling}
     139and their use will be detailed there.
     140
     141However, all three of these traits can be tricky to use directly.
     142There is a bit of repetition required but
     143the largest issue is that the virtual table type is mangled and not in a
     144user-facing way. So there are three macros that can be used to wrap these traits
     145when you need to refer to the names:
     146@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.
     147
     148All take one or two arguments. The first argument is the name of the
     149exception type. Its unmangled and mangled forms are passed to the trait.
     150The second (optional) argument is a parenthesized list of polymorphic
     151arguments. This argument should only be used with polymorphic exceptions and
     152the list will be passed to both types.
     153In the current set-up the base name and the polymorphic arguments have to
     154match so these macros can be used without losing flexibility.
     155
     156For example consider a function that is polymorphic over types that have a
     157defined arithmetic exception:
     158\begin{cfa}
     159forall(Num | IS_EXCEPTION(Arithmetic, (Num)))
     160void some_math_function(Num & left, Num & right);
     161\end{cfa}
     162
     163\section{Exception Handling}
     164\CFA provides two kinds of exception handling, termination and resumption.
     165These twin operations are the core of the exception handling mechanism and
     166are the reason for the features of exceptions.
     167This section will cover the general patterns shared by the two operations and
     168then go on to cover the details of each individual operation.
     169
     170Both operations follow the same set of steps. They both
     171start with the user performing a throw on an exception.
     172Then there is the search for a handler; if one is found then the exception
     173is caught and the handler is run. After that control returns to normal
     174execution.
     175
     176If the search fails a default handler is run and then control
     177returns to normal execution immediately. That is where the default handlers
     178@defaultTerminationHandler@ and @defaultResumptionHandler@ are used.
    133179
    134180\subsection{Termination}
    135181\label{s:Termination}
    136182
    137 Termination raise, called ``throw'', is familiar and used in most programming
    138 languages with exception handling. The semantics of termination is: search the
    139 stack for a matching handler, unwind the stack frames to the matching handler,
    140 execute the handler, and continue execution after the handler. Termination is
    141 used when execution \emph{cannot} return to the throw. To continue execution,
    142 the program must \emph{recover} in the handler from the failed (unwound)
    143 execution at the raise to safely proceed after the handler.
    144 
    145 A termination raise is started with the @throw@ statement:
     183Termination handling is the more familiar kind and is used in most programming
     184languages with exception handling.
     185It is a dynamic, non-local goto. If a throw is successful then the stack will
     186be unwound and control will (usually) continue in a different function on
     187the call stack. It is commonly used when an error has occurred and recovery
     188is impossible in the current function.
     189
     190% (usually) Control can continue in the current function but then a different
     191% control flow construct should be used.
     192
     193A termination throw is started with the @throw@ statement:
    146194\begin{cfa}
    147195throw EXPRESSION;
    148196\end{cfa}
    149 The expression must return a termination-exception reference, where the
    150 termination exception has a type with a @void defaultTerminationHandler(T &)@
    151 (default handler) defined. The handler is found at the call site using \CFA's
    152 trait system and passed into the exception system along with the exception
    153 itself.
    154 
    155 At runtime, a representation of the exception type and an instance of the
    156 exception type is copied into managed memory (heap) to ensure it remains in
    157 scope during unwinding. It is the user's responsibility to ensure the original
    158 exception object at the throw is freed when it goes out of scope. Being
    159 allocated on the stack is sufficient for this.
    160 
    161 Then the exception system searches the stack starting from the throw and
    162 proceeding towards the base of the stack, from callee to caller. At each stack
    163 frame, a check is made for termination handlers defined by the @catch@ clauses
    164 of a @try@ statement.
     197The expression must return a reference to a termination exception, where the
     198termination exception is any type that satisfies @is_termination_exception@
     199at the call site.
     200Through \CFA's trait system the functions in the traits are passed into the
     201throw code. A new @defaultTerminationHandler@ can be defined in any scope to
     202change the throw's behavior (see below).
     203
     204The throw will copy the provided exception into managed memory. It is the
     205user's responsibility to ensure the original exception is cleaned up if the
     206stack is unwound (allocating it on the stack should be sufficient).
     207
     208Then the exception system searches the stack using the copied exception.
     209It starts from the throw and proceeds to the base of the stack,
     210from callee to caller.
     211At each stack frame, a check is made for termination handlers defined by the
     212@catch@ clauses of a @try@ statement.
    165213\begin{cfa}
    166214try {
    167215        GUARDED_BLOCK
    168 } @catch (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // termination handler 1
     216} catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
    169217        HANDLER_BLOCK$\(_1\)$
    170 } @catch (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // termination handler 2
     218} catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
    171219        HANDLER_BLOCK$\(_2\)$
    172220}
    173221\end{cfa}
    174 The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
    175 functions invoked from those statements, throws an exception, and the exception
     222When viewed on its own a try statement will simply execute the statements in
     223@GUARDED_BLOCK@ and when those are finished the try statement finishes.
     224
     225However, while the guarded statements are being executed, including any
     226functions they invoke, all the handlers following the try block are now
     227active. If those statements, or any functions invoked from those
     228statements, throw an exception, and the exception
    176229is not handled by a try statement further up the stack, the termination
    177230handlers are searched for a matching exception type from top to bottom.
     
    179232Exception matching checks the representation of the thrown exception-type is
    180233the same or a descendant type of the exception types in the handler clauses. If
    181 there is a match, a pointer to the exception object created at the throw is
    182 bound to @NAME@ and the statements in the associated @HANDLER_BLOCK@ are
    183 executed. If control reaches the end of the handler, the exception is freed,
    184 and control continues after the try statement.
    185 
    186 The default handler visible at the throw statement is used if no matching
    187 termination handler is found after the entire stack is searched. At that point,
    188 the default handler is called with a reference to the exception object
    189 generated at the throw. If the default handler returns, the system default
    190 action is executed, which often terminates the program. This feature allows
    191 each exception type to define its own action, such as printing an informative
    192 error message, when an exception is not handled in the program.
     234it is the same as or a descendant of @EXCEPTION_TYPE@$_i$ then @NAME@$_i$ is
     235bound to a pointer to the exception and the statements in @HANDLER_BLOCK@$_i$
     236are executed. If control reaches the end of the handler, the exception is
     237freed and control continues after the try statement.
     238
     239If no handler is found during the search then the default handler is run.
     240Through \CFA's trait system the best match at the throw site will be used.
     241This function is run and is passed the copied exception. After the default
     242handler is run control continues after the throw statement.
     243
     244There is a global @defaultTerminationHandler@ that cancels the current stack
     245with the copied exception. However it is generic over all exception types so
     246new default handlers can be defined for different exception types and so
     247different exception types can have different default handlers.
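
The termination steps above can be modelled in plain C with @setjmp@ and @longjmp@. This is a minimal sketch under stated assumptions, not the \CFA runtime: the names @handler_stack@, @throw_termination@, @guarded_call@ and @default_termination_handler@ are hypothetical, exception types are reduced to integer tags matched by equality (no hierarchy), the exception is not copied, and the default handler only counts its calls instead of cancelling the stack.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { MAX_HANDLERS = 16 };
enum { IO_ERROR = 1, ARITHMETIC = 2 };   /* hypothetical exception tags */

/* One entry per active try statement (one catch clause each). */
struct handler {
    jmp_buf env;   /* where to resume if this handler catches */
    int type;      /* exception tag this handler catches */
};

struct handler handler_stack[MAX_HANDLERS];
int handler_top = 0;            /* number of active try statements */
int default_handler_runs = 0;

/* Stand-in for defaultTerminationHandler: it only counts its calls. */
void default_termination_handler(int type) {
    (void)type;
    ++default_handler_runs;
}

/* Search from the throw toward the base of the stack. On a match, drop the
 * matching try statement and everything above it, then jump to the handler.
 * With no match, run the default handler and continue after the throw. */
void throw_termination(int type) {
    for (int i = handler_top - 1; i >= 0; --i) {
        if (handler_stack[i].type == type) {
            handler_top = i;
            longjmp(handler_stack[i].env, 1);
        }
    }
    default_termination_handler(type);
}

void failing_operation(int type) {
    throw_termination(type);
}

/* Models: try { failing_operation(thrown); } catch (caught) { ... }
 * Returns 1 if the handler block ran, 0 if the guarded block finished. */
int guarded_call(int caught_type, int thrown_type) {
    assert(handler_top < MAX_HANDLERS);
    struct handler *h = &handler_stack[handler_top];
    h->type = caught_type;
    if (setjmp(h->env) == 0) {
        ++handler_top;                  /* the try statement is now active */
        failing_operation(thrown_type);
        --handler_top;                  /* guarded block finished normally */
        return 0;
    }
    return 1;                           /* handler block */
}
```

Calling @guarded_call(IO_ERROR, IO_ERROR)@ reaches the handler block, while @guarded_call(IO_ERROR, ARITHMETIC)@ finds no match, runs the default handler once and then lets the guarded block finish.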
    193248
    194249\subsection{Resumption}
    195250\label{s:Resumption}
    196251
    197 Resumption raise, called ``resume'', is as old as termination
    198 raise~\cite{Goodenough75} but is less popular. In many ways, resumption is
    199 simpler and easier to understand, as it is simply a dynamic call (as in
    200 Lisp). The semantics of resumption is: search the stack for a matching handler,
    201 execute the handler, and continue execution after the resume. Notice, the stack
    202 cannot be unwound because execution returns to the raise point. Resumption is
    203 used used when execution \emph{can} return to the resume. To continue
    204 execution, the program must \emph{correct} in the handler for the failed
    205 execution at the raise so execution can safely continue after the resume.
     252Resumption exception handling is a less common form than termination but is
     253just as old~\cite{Goodenough75} and is in some sense simpler.
     254It is a dynamic, non-local function call. If the throw is successful a
     255closure will be taken from up the stack and executed, after which the throwing
     256function will continue executing.
     257It is most often used when an error occurred and, if the error is repaired,
     258the function can continue.
    206259
    207260A resumption raise is started with the @throwResume@ statement:
     
    210263\end{cfa}
    211264The semantics of the @throwResume@ statement are like the @throw@, but the
    212 expression has a type with a @void defaultResumptionHandler(T &)@ (default
    213 handler) defined, where the handler is found at the call site by the type
    214 system.  At runtime, a representation of the exception type and an instance of
    215 the exception type is \emph{not} copied because the stack is maintained during
    216 the handler search.
    217 
    218 Then the exception system searches the stack starting from the resume and
    219 proceeding towards the base of the stack, from callee to caller. At each stack
    220 frame, a check is made for resumption handlers defined by the @catchResume@
    221 clauses of a @try@ statement.
     265expression has to return a reference to a type that satisfies the trait
     266@is_resumption_exception@. The assertions from this trait are available to
     267the exception system while handling the exception.
     268
     269At runtime, no copies are made. As the stack is not unwound the exception and
     270any values on the stack will remain in scope while the resumption is handled.
     271
     272Then the exception system searches the stack using the provided exception.
     273It starts from the throw and proceeds to the base of the stack,
     274from callee to caller.
     275At each stack frame, a check is made for resumption handlers defined by the
     276@catchResume@ clauses of a @try@ statement.
    222277\begin{cfa}
    223278try {
    224279        GUARDED_BLOCK
    225 } @catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME)@ { // resumption handler 1
     280} catchResume (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
    226281        HANDLER_BLOCK$\(_1\)$
    227 } @catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME)@ { // resumption handler 2
     282} catchResume (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
    228283        HANDLER_BLOCK$\(_2\)$
    229284}
    230285\end{cfa}
    231 The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
    232 functions invoked from those statements, resumes an exception, and the
    233 exception is not handled by a try statement further up the stack, the
    234 resumption handlers are searched for a matching exception type from top to
    235 bottom. (Note, termination and resumption handlers may be intermixed in a @try@
    236 statement but the kind of raise (throw/resume) only matches with the
    237 corresponding kind of handler clause.)
    238 
    239 The exception search and matching for resumption is the same as for
    240 termination, including exception inheritance. The difference is when control
    241 reaches the end of the handler: the resumption handler returns after the resume
    242 rather than after the try statement. The resume point assumes the handler has
    243 corrected the problem so execution can safely continue.
     286If the handlers are not involved in a search this will simply execute the
     287@GUARDED_BLOCK@ and then continue to the next statement.
     288Its purpose is to add handlers onto the stack.
     289(Note, termination and resumption handlers may be intermixed in a @try@
     290statement but the kind of throw must be the same as the handler for it to be
     291considered as a possible match.)
     292
     293If a search for a resumption handler reaches a try block it will check each
     294@catchResume@ clause, top-to-bottom.
     295At each handler if the thrown exception is or is a child type of
     296@EXCEPTION_TYPE@$_i$ then a pointer to the exception is bound to
     297@NAME@$_i$ and then @HANDLER_BLOCK@$_i$ is executed. After the block is
     298finished control will return to the @throwResume@ statement.
    244299
    245300Like termination, if no resumption handler is found, the default handler
    246 visible at the resume statement is called, and the system default action is
    247 executed.
    248 
    249 For resumption, the exception system uses stack marking to partition the
    250 resumption search. If another resumption exception is raised in a resumption
    251 handler, the second exception search does not start at the point of the
    252 original raise. (Remember the stack is not unwound and the current handler is
    253 at the top of the stack.) The search for the second resumption starts at the
    254 current point on the stack because new try statements may have been pushed by
    255 the handler or functions called from the handler. If there is no match back to
    256 the point of the current handler, the search skips\label{p:searchskip} the stack frames already
    257 searched by the first resume and continues after the try statement. The default
    258 handler always continues from default handler associated with the point where
    259 the exception is created.
     301visible at the throw statement is called. It will use the best match at the
     302call site according to \CFA's overloading rules. The default handler is
     303passed the exception given to the throw. When the default handler finishes
     304execution continues after the throw statement.
     305
     306There is a global @defaultResumptionHandler@ that is polymorphic over all
     307termination exceptions and performs a termination throw on the exception.
     308The @defaultTerminationHandler@ for that throw is matched at the original
     309throw statement (the resumption @throwResume@) and it can be customized by
     310introducing a new or better match as well.
     311
     312% \subsubsection?
     313
     314A key difference between resumption and termination is that resumption does
     315not unwind the stack. A side effect is that when a handler is matched
     316and run, its try block (the guarded statements) and every try statement
     317searched before it are still on the stack. This can lead to the recursive
     318resumption problem.
     319
     320The recursive resumption problem is any situation where a resumption handler
     321ends up being called while it is running.
     322Consider a trivial case:
     323\begin{cfa}
     324try {
     325        throwResume (E &){};
     326} catchResume(E *) {
     327        throwResume (E &){};
     328}
     329\end{cfa}
     330When this code is executed the guarded @throwResume@ will throw, start a
     331search and match the handler in the @catchResume@ clause. The handler will be
     332called and placed on the stack on top of the try block. The second throw then
     333searches the same try block and calls another instance of the
     334same handler, leading to an infinite loop.
     335
     336This situation is trivial and easy to avoid, but much more complex cycles
     337can form with multiple handlers and different exception types.
     338
     339To prevent all of these cases we mask sections of the stack, or equivalently
     340the try statements on the stack, so that the resumption search skips over
     341them and continues with the next unmasked section of the stack.
     342
     343A section of the stack is marked when it is searched to see if it contains
     344a handler for an exception and unmarked when that exception has been handled
     345or the search was completed without finding a handler.
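
The marking rule can be sketched in C to show how it breaks the trivial cycle. This is an illustrative model under stated assumptions, not the \CFA runtime: @handler_stack@, @throw_resume@, @rethrowing_handler@ and @run_trivial_case@ are hypothetical names, exception types are integer tags matched by equality, and the default handler only counts its calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_HANDLERS = 16 };
enum { E = 1 };                          /* hypothetical exception tag */

typedef void (*resume_handler)(int type);

/* One entry per active try statement (one catchResume clause each). */
struct handler {
    int type;             /* exception tag this handler catches */
    resume_handler run;   /* handler block, called like a function */
    bool masked;          /* set while a search has marked this entry */
};

struct handler handler_stack[MAX_HANDLERS];
int handler_top = 0;
int handler_runs = 0;
int default_handler_runs = 0;

/* Stand-in for defaultResumptionHandler: it only counts its calls. */
void default_resumption_handler(int type) {
    (void)type;
    ++default_handler_runs;
}

/* Search from the throw toward the base of the stack, skipping masked
 * entries and marking every entry examined. The marks stay in place while
 * the matched handler runs and are cleared once the exception is handled
 * or the search fails. Nothing is unwound. */
void throw_resume(int type) {
    bool marked_here[MAX_HANDLERS] = { false };
    resume_handler found = NULL;
    int top = handler_top;
    for (int i = top - 1; i >= 0 && found == NULL; --i) {
        struct handler *h = &handler_stack[i];
        if (h->masked) continue;
        h->masked = true;
        marked_here[i] = true;
        if (h->type == type) found = h->run;
    }
    if (found != NULL) found(type);
    for (int i = 0; i < top; ++i) {
        if (marked_here[i]) handler_stack[i].masked = false;
    }
    if (found == NULL) default_resumption_handler(type);
}

/* Handler from the trivial case: it throws the same exception again. */
void rethrowing_handler(int type) {
    ++handler_runs;
    throw_resume(type);   /* the enclosing try statement is masked */
}

/* Models: try { throwResume E; } catchResume (E *) { throwResume E; } */
void run_trivial_case(void) {
    assert(handler_top < MAX_HANDLERS);
    handler_stack[handler_top++] = (struct handler){ E, rethrowing_handler, false };
    throw_resume(E);
    --handler_top;
}
```

In this model the handler runs exactly once: the second throw skips the masked try statement, finds no other handler and falls through to the default handler instead of looping.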
    260346
    261347% This might need a diagram. But it is an important part of the justification
     
    276362\end{verbatim}
    277363
    278 This resumption search-pattern reflect the one for termination, which matches
    279 with programmer expectations. However, it avoids the \emph{recursive
    280 resumption} problem. If parts of the stack are searched multiple times, loops
    281 can easily form resulting in infinite recursion.
    282 
    283 Consider the trivial case:
    284 \begin{cfa}
    285 try {
    286         throwResume$\(_1\)$ (E &){};
    287 } catch( E * ) {
    288         throwResume;
    289 }
    290 \end{cfa}
    291 Based on termination semantics, programmer expectation is for the re-resume to
    292 continue searching the stack frames after the try statement. However, the
    293 current try statement is still on the stack below the handler issuing the
    294 reresume \see{\VRef{s:Reraise}}. Hence, the try statement catches the re-raise
    295 again and does another re-raise \emph{ad infinitum}, which is confusing and
    296 difficult to debug. The \CFA resumption search-pattern skips the try statement
    297 so the reresume search continues after the try, mathcing programmer
    298 expectation.
     364The rules can be remembered by thinking about what would be searched in
     365termination. When a throw happens in a handler, a termination handler
     366skips everything from the original throw to the original catch because that
     367part of the stack has been unwound; a resumption handler skips the same
     368section of stack because it has been masked.
     369A throw in a default handler will perform the same search as the original
     370throw because, for termination, nothing has been unwound and, for resumption,
     371the mask will be the same.
     372
     373The symmetry with termination is why this pattern was picked. Other patterns,
     374such as marking just the handlers that caught, also work but lack the
     375symmetry, which means there is more to remember.
    299376
    300377\section{Conditional Catch}
    301 Both termination and resumption handler-clauses may perform conditional matching:
    302 \begin{cfa}
    303 catch (EXCEPTION_TYPE * NAME ; @CONDITION@)
     378Both termination and resumption handler clauses can be given an additional
     379condition to further control which exceptions they handle:
     380\begin{cfa}
     381catch (EXCEPTION_TYPE * NAME ; CONDITION)
    304382\end{cfa}
    305383First, the same semantics is used to match the exception type. Second, if the
    306384exception matches, @CONDITION@ is executed. The condition expression may
    307385reference all names in scope at the beginning of the try block and @NAME@
    308 introduced in the handler clause.  If the condition is true, then the handler
    309 matches. Otherwise, the exception search continues at the next appropriate kind
    310 of handler clause in the try block.
     386introduced in the handler clause. If the condition is true, then the handler
     387matches. Otherwise, the exception search continues as if the exception type
     388did not match.
    311389\begin{cfa}
    312390try {
     
    322400remaining handlers in the current try statement.
    323401
    324 \section{Reraise}
    325 \label{s:Reraise}
     402\section{Rethrowing}
     403\colour{red}{From Andrew: I recommend we talk about why the language doesn't
     404have rethrows/reraises instead.}
     405
     406\label{s:Rethrowing}
    326407Within the handler block or functions called from the handler block, it is
    327408possible to reraise the most recently caught exception with @throw@ or
    328 @throwResume@, respective.
    329 \begin{cfa}
    330 catch( ... ) {
    331         ... throw; // rethrow
     409@throwResume@, respectively.
     410\begin{cfa}
     411try {
     412        ...
     413} catch( ... ) {
     414        ... throw;
    332415} catchResume( ... ) {
    333         ... throwResume; // reresume
     416        ... throwResume;
    334417}
    335418\end{cfa}
     
    341424handler is generated that does a program-level abort.
    342425
    343 
    344426\section{Finally Clauses}
    345 A @finally@ clause may be placed at the end of a @try@ statement.
     427Finally clauses are used to perform unconditional clean-up when leaving a
     428scope. They are placed at the end of a try statement:
    346429\begin{cfa}
    347430try {
    348431        GUARDED_BLOCK
    349 } ...   // any number or kind of handler clauses
    350 } finally {
     432} ... // any number or kind of handler clauses
     433... finally {
    351434        FINALLY_BLOCK
    352435}
    353436\end{cfa}
    354 The @FINALLY_BLOCK@ is executed when the try statement is unwound from the
    355 stack, \ie when the @GUARDED_BLOCK@ or any handler clause finishes. Hence, the
    356 finally block is always executed.
     437The @FINALLY_BLOCK@ is executed when the try statement is removed from the
     438stack, including when the @GUARDED_BLOCK@ finishes, any termination handler
     439finishes or during an unwind.
     440The only time the block is not executed is if the program is exited before
     441the stack is unwound.
    357442
    358443Execution of the finally block should always finish, meaning control runs off
    359444the end of the block. This requirement ensures control always continues as if the
    360445finally clause is not present, \ie finally is for cleanup not changing control
    361 flow.  Because of this requirement, local control flow out of the finally block
    362 is forbidden.  The compiler precludes any @break@, @continue@, @fallthru@ or
     446flow. Because of this requirement, local control flow out of the finally block
     447is forbidden. The compiler precludes any @break@, @continue@, @fallthru@ or
    363448@return@ that causes control to leave the finally block. Other ways to leave
    364449the finally block, such as a long jump or termination are much harder to check,
    365 and at best requiring additional run-time overhead, and so are discouraged.
     450and at best require additional run-time overhead, and so are merely
     451discouraged.
     452
     453Not all languages with exceptions have finally clauses. Notably \Cpp does
     454without it as destructors serve a similar role. Although destructors and
     455finally clauses can be used in many of the same areas they have their own
     456use cases like top-level functions and lambda functions with closures.
     457Destructors take a bit more work to set up but are much easier to reuse while
     458finally clauses are good for one-offs and can include local information.
    366459
    367460\section{Cancellation}
     
    370463possible forwards the cancellation exception to a different stack.
    371464
     465Cancellation is not an exception operation like termination or resumption.
    372466There is no special statement for starting a cancellation; instead the standard
    373 library function @cancel_stack@ is called passing an exception.  Unlike a
    374 raise, this exception is not used in matching only to pass information about
     467library function @cancel_stack@ is called passing an exception. Unlike a
     468throw, this exception is not used in matching but only to pass information about
    375469the cause of the cancellation.
    376 
    377 Handling of a cancellation depends on which stack is being cancelled.
     470(This also means matching cannot fail so there is no default handler either.)
     471
     472After @cancel_stack@ is called the exception is copied into the exception
     473handling mechanism's memory. Then the entirety of the current stack is
     474unwound. After that it depends on which stack is being cancelled.
    378475\begin{description}
    379476\item[Main Stack:]
    380477The main stack is the one used by the program main at the start of execution,
    381 and is the only stack in a sequential program.  Hence, when cancellation is
    382 forwarded to the main stack, there is no other forwarding stack, so after the
    383 stack is unwound, there is a program-level abort.
     478and is the only stack in a sequential program. Even in a concurrent program
     479the main stack is only dependent on the environment that started the program.
     480Hence, when the main stack is cancelled there is nowhere else in the program
     481to notify. After the stack is unwound, there is a program-level abort.
    384482
    385483\item[Thread Stack:]
    386484A thread stack is created for a @thread@ object or object that satisfies the
    387 @is_thread@ trait.  A thread only has two points of communication that must
     485@is_thread@ trait. A thread only has two points of communication that must
    388486happen: start and join. As the thread must be running to perform a
    389 cancellation, it must occur after start and before join, so join is a
    390 cancellation point.  After the stack is unwound, the thread halts and waits for
    391 another thread to join with it. The joining thread, checks for a cancellation,
     487cancellation, it must occur after start and before join, so join is used
     488for communication here.
     489After the stack is unwound, the thread halts and waits for
     490another thread to join with it. The joining thread checks for a cancellation,
    392491and if present, resumes exception @ThreadCancelled@.
    393492
     
    397496the exception is not caught. The implicit join does a program abort instead.
    398497
    399 This semantics is for safety. One difficult problem for any exception system is
    400 defining semantics when an exception is raised during an exception search:
    401 which exception has priority, the original or new exception? No matter which
    402 exception is selected, it is possible for the selected one to disrupt or
    403 destroy the context required for the other. \PAB{I do not understand the
    404 following sentences.} This loss of information can happen with join but as the
    405 thread destructor is always run when the stack is being unwound and one
    406 termination/cancellation is already active. Also since they are implicit they
    407 are easier to forget about.
     498This semantics is for safety. If an unwind is triggered while another unwind
     499is underway only one of them can proceed as they both want to ``consume'' the
     500stack. Letting both try to proceed leads to very undefined behaviour.
     501Both termination and cancellation involve unwinding and, since the default
     502@defaultResumptionHandler@ preforms a termination that could more easily
     503happen in an implicate join inside a destructor. So there is an error message
     504and an abort instead.
     505\todo{Perhaps have a more general disucssion of unwind collisions before
     506this point.}
     507
     508The recommended way to avoid the abort is to handle the intial resumption
     509from the implicate join. If required you may put an explicate join inside a
     510finally clause to disable the check and use the local
     511@defaultResumptionHandler@ instead.
    408512
    409513\item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object
    410 or object that satisfies the @is_coroutine@ trait.  A coroutine only knows of
    411 two other coroutines, its starter and its last resumer.  The last resumer has
    412 the tightest coupling to the coroutine it activated.  Hence, cancellation of
    413 the active coroutine is forwarded to the last resumer after the stack is
    414 unwound, as the last resumer has the most precise knowledge about the current
    415 execution. When the resumer restarts, it resumes exception
     514or object that satisfies the @is_coroutine@ trait. A coroutine only knows of
     515two other coroutines, its starter and its last resumer. Of the two the last
     516resumer has the tightest coupling to the coroutine it activated and the most
     517up-to-date information.
     518
     519Hence, cancellation of the active coroutine is forwarded to the last resumer
     520after the stack is unwound. When the resumer restarts, it resumes exception
    416521@CoroutineCancelled@, which is polymorphic over the coroutine type and has a
    417522pointer to the cancelled coroutine.
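
As an illustrative aside to the finally-clause paragraph in the hunk above, which notes that \Cpp does without finally clauses because destructors serve a similar role, the following is a minimal sketch of destructor-based clean-up in \Cpp. The names @ScopeGuard@ and @run_guarded@ are hypothetical and introduced only for this example; they are not part of \CFA, the thesis, or any library.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (an assumption for illustration): runs a callable when
// the enclosing scope is left, either normally or during stack unwinding.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {
        // An empty std::function would throw when called from the destructor.
        assert(static_cast<bool>(cleanup_));
    }
    ~ScopeGuard() { cleanup_(); }
    ScopeGuard(const ScopeGuard &) = delete;
    ScopeGuard &operator=(const ScopeGuard &) = delete;

private:
    std::function<void()> cleanup_;
};

// Records the order of events so the control flow can be inspected.
std::vector<std::string> run_guarded(bool fail) {
    std::vector<std::string> log;
    try {
        // The guard plays the role of the FINALLY_BLOCK for this scope.
        ScopeGuard guard([&log] { log.push_back("cleanup"); });
        log.push_back("guarded block");
        if (fail) throw std::runtime_error("failure");
        log.push_back("normal exit");
    } catch (const std::runtime_error &) {
        log.push_back("handler");
    }
    return log;
}
```

Calling @run_guarded(false)@ records the guarded block, the normal exit and then the clean-up; calling @run_guarded(true)@ records the clean-up before the handler, because the guard lives inside the try block and is destroyed during the unwind. This ordering is a property of the sketch, not of a \CFA finally clause attached to the same try statement. It also shows the trade-off mentioned above: the guard class takes more work to set up but can be reused, while the lambda passed to it can capture local information.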
  • doc/theses/andrew_beach_MMath/future.tex

    r5e99a9a r95b3a9c  
    1010\item
    1111The implementation of termination is not portable because it includes
    12 hand-crafted assembly statements. These sections must be generalized to support
    13 more hardware architectures, \eg ARM processor.
     12hand-crafted assembly statements. These sections must be ported by hand to
     13support more hardware architectures, such as the ARM processor.
    1414\item
    1515Due to a type-system problem, the catch clause cannot bind the exception to a
     
    2424scope of the @try@ statement, where the local control-flow transfers are
    2525meaningful.
     26\item
     27There is no detection of colliding unwinds. It is possible for clean-up code
     28run during an unwind to trigger another unwind that escapes the clean-up code
     29itself, such as a termination exception caught further down the stack or a
     30cancellation. Ways to handle this exist, but currently collisions are not
     31even detected and the first unwind is simply forgotten, often leaving
     32it in a bad state.
     33\item
     34Also the exception system did not have a lot of time to be tried and tested.
     35So just letting people use the exception system more will reveal new
     36quality of life upgrades that can be made with time.
    2637\end{itemize}
    2738
  • doc/theses/andrew_beach_MMath/implement.tex

    r5e99a9a r95b3a9c  
    278278@_URC_END_OF_STACK@.
    279279
    280 Second, when a handler is matched, raise exception continues onto the cleanup phase.
     280Second, when a handler is matched, raise exception continues onto the cleanup
     281phase.
    281282Once again, it calls the personality functions of each stack frame from newest
    282283to oldest. This pass stops at the stack frame containing the matching handler.
  • doc/theses/andrew_beach_MMath/thesis-frontpgs.tex

    r5e99a9a r95b3a9c  
    3636
    3737        A thesis \\
    38         presented to the University of Waterloo \\ 
     38        presented to the University of Waterloo \\
    3939        in fulfillment of the \\
    4040        thesis requirement for the degree of \\
     
    6464\cleardoublepage
    6565
    66  
     66
    6767%----------------------------------------------------------------------
    6868% EXAMINING COMMITTEE (Required for Ph.D. theses only)
     
    7171\begin{center}\textbf{Examining Committee Membership}\end{center}
    7272  \noindent
    73 The following served on the Examining Committee for this thesis. The decision of the Examining Committee is by majority vote.
    74   \bigskip
    75  
    76   \noindent
    77 \begin{tabbing}
    78 Internal-External Member: \=  \kill % using longest text to define tab length
    79 External Examiner: \>  Bruce Bruce \\
     73The following served on the Examining Committee for this thesis. The decision
     74of the Examining Committee is by majority vote.
     75  \bigskip
     76
     77  \noindent
     78\begin{tabbing}
     79Internal-External Member: \=  \kill % using longest text to define tab length
     80External Examiner: \>  Bruce Bruce \\
    8081\> Professor, Dept. of Philosophy of Zoology, University of Wallamaloo \\
    81 \end{tabbing} 
    82   \bigskip
    83  
     82\end{tabbing}
     83  \bigskip
     84
    8485  \noindent
    8586\begin{tabbing}
     
    9192\end{tabbing}
    9293  \bigskip
    93  
     94
    9495  \noindent
    9596  \begin{tabbing}
     
    99100\end{tabbing}
    100101  \bigskip
    101  
     102
    102103  \noindent
    103104\begin{tabbing}
     
    107108\end{tabbing}
    108109  \bigskip
    109  
     110
    110111  \noindent
    111112\begin{tabbing}
     
    123124  % December 13th, 2006.  It is designed for an electronic thesis.
    124125  \noindent
    125 I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.
    126 
    127   \bigskip
    128  
     126I hereby declare that I am the sole author of this thesis. This is a true copy
     127of the thesis, including any required final revisions, as accepted by my
     128examiners.
     129
     130  \bigskip
     131
    129132  \noindent
    130133I understand that my thesis may be made electronically available to the public.
  • doc/theses/andrew_beach_MMath/thesis.tex

    r5e99a9a r95b3a9c  
    4545% FRONT MATERIAL
    4646%----------------------------------------------------------------------
    47 \input{thesis-frontpgs} 
     47\input{thesis-frontpgs}
    4848
    4949%----------------------------------------------------------------------
     
    6565A \gls{computer} could compute $\pi$ all day long. In fact, subsets of digits
    6666of $\pi$'s decimal approximation would make a good source for pseudo-random
    67 vectors, \gls{rvec} . 
     67vectors, \gls{rvec} .
    6868
    6969%----------------------------------------------------------------------
     
    9696
    9797\begin{itemize}
    98 \item A well-prepared PDF should be 
     98\item A well-prepared PDF should be
    9999  \begin{enumerate}
    100100    \item Of reasonable size, {\it i.e.} photos cropped and compressed.
    101     \item Scalable, to allow enlargment of text and drawings. 
    102   \end{enumerate} 
     101    \item Scalable, to allow enlargement of text and drawings.
     102  \end{enumerate}
    103103\item Photos must be bit maps, and so are not scaleable by definition. TIFF and
    104104BMP are uncompressed formats, while JPEG is compressed. Most photos can be
    105105compressed without losing their illustrative value.
    106 \item Drawings that you make should be scalable vector graphics, \emph{not} 
     106\item Drawings that you make should be scalable vector graphics, \emph{not}
    107107bit maps. Some scalable vector file formats are: EPS, SVG, PNG, WMF. These can
    108 all be converted into PNG or PDF, that pdflatex recognizes. Your drawing 
    109 package probably can export to one of these formats directly. Otherwise, a 
    110 common procedure is to print-to-file through a Postscript printer driver to 
    111 create a PS file, then convert that to EPS (encapsulated PS, which has a 
    112 bounding box to describe its exact size rather than a whole page). 
     108all be converted into PNG or PDF, that pdflatex recognizes. Your drawing
     109package probably can export to one of these formats directly. Otherwise, a
     110common procedure is to print-to-file through a Postscript printer driver to
     111create a PS file, then convert that to EPS (encapsulated PS, which has a
     112bounding box to describe its exact size rather than a whole page).
    113113Programs such as GSView (a Ghostscript GUI) can create both EPS and PDF from
    114114PS files. Appendix~\ref{AppendixA} shows how to generate properly sized Matlab
    115115plots and save them as PDF.
    116116\item It's important to crop your photos and draw your figures to the size that
    117 you want to appear in your thesis. Scaling photos with the 
    118 includegraphics command will cause loss of resolution. And scaling down 
     117you want to appear in your thesis. Scaling photos with the
     118includegraphics command will cause loss of resolution. And scaling down
    119119drawings may cause any text annotations to become too small.
    120120\end{itemize}
    121  
     121
    122122For more information on \LaTeX\, see the uWaterloo Skills for the
    123 Academic Workplace \href{https://uwaterloo.ca/information-systems-technology/services/electronic-thesis-preparation-and-submission-support/ethesis-guide/creating-pdf-version-your-thesis/creating-pdf-files-using-latex/latex-ethesis-and-large-documents}{course notes}. 
     123Academic Workplace \href{https://uwaterloo.ca/information-systems-technology/services/electronic-thesis-preparation-and-submission-support/ethesis-guide/creating-pdf-version-your-thesis/creating-pdf-files-using-latex/latex-ethesis-and-large-documents}{course notes}.
    124124\footnote{
    125125Note that while it is possible to include hyperlinks to external documents,
    126 it is not wise to do so, since anything you can't control may change over time. 
    127 It \emph{would} be appropriate and necessary to provide external links to 
    128 additional resources for a multimedia ``enhanced'' thesis. 
    129 But also note that if the \package{hyperref} package is not included, 
    130 as for the print-optimized option in this thesis template, any \cmmd{href} 
     126it is not wise to do so, since anything you can't control may change over time.
     127It \emph{would} be appropriate and necessary to provide external links to
     128additional resources for a multimedia ``enhanced'' thesis.
     129But also note that if the \package{hyperref} package is not included,
     130as for the print-optimized option in this thesis template, any \cmmd{href}
    131131commands in your logical document are no longer defined.
    132132A work-around employed by this thesis template is to define a dummy
    133 \cmmd{href} command (which does nothing) in the preamble of the document, 
    134 before the \package{hyperref} package is included. 
     133\cmmd{href} command (which does nothing) in the preamble of the document,
     134before the \package{hyperref} package is included.
    135135The dummy definition is then redefined by the
    136136\package{hyperref} package when it is included.
     
    138138
    139139The classic book by Leslie Lamport \cite{lamport.book}, author of \LaTeX , is
    140 worth a look too, and the many available add-on packages are described by 
     140worth a look too, and the many available add-on packages are described by
    141141Goossens \textit{et al} \cite{goossens.book}.
    142142
     
    180180Export Setup button in the figure Property Editor.
    181181
    182 \section{From the Command Line} 
     182\section{From the Command Line}
    183183All figure properties can also be manipulated from the command line. Here's an
    184 example: 
     184example:
    185185\begin{verbatim}
    186186x=[0:0.1:pi];
  • doc/theses/andrew_beach_MMath/unwinding.tex

    r5e99a9a r95b3a9c  
    1 \chapter{\texorpdfstring{Unwinding in \CFA}{Unwinding in Cforall}}
     1\chapter{Unwinding in \CFA}
    22
    33Stack unwinding is the process of removing stack frames (activations) from the
     
    110110alternate transfers of control.
    111111
    112 \section{\texorpdfstring{\CFA Implementation}{Cforall Implementation}}
     112\section{\CFA Implementation}
    113113
    114114To use libunwind, \CFA provides several wrappers, its own storage, personality
  • doc/theses/andrew_beach_MMath/uw-ethesis-frontpgs.tex

    r5e99a9a r95b3a9c  
    1313        \vspace*{1.0cm}
    1414
    15         \Huge
    16         {\bf Exception Handling in \CFA}
     15        {\Huge\bf Exception Handling in \CFA}
    1716
    1817        \vspace*{1.0cm}
    1918
    20         \normalsize
    2119        by \\
    2220
    2321        \vspace*{1.0cm}
    2422
    25         \Large
    26         Andrew James Beach \\
     23        {\Large Andrew James Beach} \\
    2724
    2825        \vspace*{3.0cm}
    2926
    30         \normalsize
    3127        A thesis \\
    32         presented to the University of Waterloo \\ 
     28        presented to the University of Waterloo \\
    3329        in fulfillment of the \\
    3430        thesis requirement for the degree of \\
     
    4339        \vspace*{1.0cm}
    4440
    45         \copyright\ Andrew James Beach \the\year \\
     41        \copyright{} Andrew James Beach \the\year \\
    4642        \end{center}
    4743\end{titlepage}
    4844
    49 % The rest of the front pages should contain no headers and be numbered using Roman numerals starting with `ii'
     45% The rest of the front pages should contain no headers and be numbered using
     46% Roman numerals starting with `ii'.
    5047\pagestyle{plain}
    5148\setcounter{page}{2}
    5249
    53 \cleardoublepage % Ends the current page and causes all figures and tables that have so far appeared in the input to be printed.
    54 % In a two-sided printing style, it also makes the next page a right-hand (odd-numbered) page, producing a blank page if necessary.
     50\cleardoublepage % Ends the current page and causes all figures and tables
     51% that have so far appeared in the input to be printed. In a two-sided
     52% printing style, it also makes the next page a right-hand (odd-numbered)
     53% page, producing a blank page if necessary.
    5554
    56 \begin{comment} 
     55\begin{comment}
    5756% E X A M I N I N G   C O M M I T T E E (Required for Ph.D. theses only)
    5857% Remove or comment out the lines below to remove this page
    5958\begin{center}\textbf{Examining Committee Membership}\end{center}
    6059  \noindent
    61 The following served on the Examining Committee for this thesis. The decision of the Examining Committee is by majority vote.
     60The following served on the Examining Committee for this thesis.
     61The decision of the Examining Committee is by majority vote.
    6262  \bigskip
    63  
     63
    6464  \noindent
    6565\begin{tabbing}
    6666Internal-External Member: \=  \kill % using longest text to define tab length
    67 External Examiner: \>  Bruce Bruce \\ 
     67External Examiner: \>  Bruce Bruce \\
    6868\> Professor, Dept. of Philosophy of Zoology, University of Wallamaloo \\
    69 \end{tabbing} 
     69\end{tabbing}
    7070  \bigskip
    71  
     71
    7272  \noindent
    7373\begin{tabbing}
     
    7979\end{tabbing}
    8080  \bigskip
    81  
     81
    8282  \noindent
    8383  \begin{tabbing}
     
    8787\end{tabbing}
    8888  \bigskip
    89  
     89
    9090  \noindent
    9191\begin{tabbing}
     
    9595\end{tabbing}
    9696  \bigskip
    97  
     97
    9898  \noindent
    9999\begin{tabbing}
     
    111111  % December 13th, 2006.  It is designed for an electronic thesis.
    112112 \begin{center}\textbf{Author's Declaration}\end{center}
    113  
     113
    114114 \noindent
    115 I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.
     115I hereby declare that I am the sole author of this thesis. This is a true copy
     116of the thesis, including any required final revisions, as accepted by my
     117examiners.
    116118
    117119  \bigskip
    118  
     120
    119121  \noindent
    120122I understand that my thesis may be made electronically available to the public.
  • doc/theses/andrew_beach_MMath/uw-ethesis.tex

    r5e99a9a r95b3a9c  
    11%======================================================================
    2 % University of Waterloo Thesis Template for LaTeX 
    3 % Last Updated November, 2020 
    4 % by Stephen Carr, IST Client Services, 
     2% University of Waterloo Thesis Template for LaTeX
     3% Last Updated November, 2020
     4% by Stephen Carr, IST Client Services,
    55% University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada
    66% FOR ASSISTANCE, please send mail to request@uwaterloo.ca
    77
    88% DISCLAIMER
    9 % To the best of our knowledge, this template satisfies the current uWaterloo thesis requirements.
    10 % However, it is your responsibility to assure that you have met all requirements of the University and your particular department.
    11 
    12 % Many thanks for the feedback from many graduates who assisted the development of this template.
    13 % Also note that there are explanatory comments and tips throughout this template.
     9% To the best of our knowledge, this template satisfies the current uWaterloo
     10% thesis requirements. However, it is your responsibility to assure that you
     11% have met all requirements of the University and your particular department.
     12
     13% Many thanks for the feedback from many graduates who assisted the
     14% development of this template. Also note that there are explanatory comments
     15% and tips throughout this template.
    1416%======================================================================
    1517% Some important notes on using this template and making it your own...
    1618
    17 % The University of Waterloo has required electronic thesis submission since October 2006.
    18 % See the uWaterloo thesis regulations at
    19 % https://uwaterloo.ca/graduate-studies/thesis.
    20 % This thesis template is geared towards generating a PDF version optimized for viewing on an electronic display, including hyperlinks within the PDF.
    21 
    22 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package configuration below.
    23 % THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT.
    24 % You can view the information if you view properties of the PDF document.
    25 
    26 % Many faculties/departments also require one or more printed copies.
    27 % This template attempts to satisfy both types of output.
     19% The University of Waterloo has required electronic thesis submission since
     20% October 2006. See the uWaterloo thesis regulations at:
     21%   https://uwaterloo.ca/graduate-studies/thesis.
     22% This thesis template is geared towards generating a PDF version optimized
     23% for viewing on an electronic display, including hyperlinks within the PDF.
     24
     25% DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package
     26% configuration below. THIS INFORMATION GETS EMBEDDED IN THE FINAL PDF
     27% DOCUMENT. You can view the information if you view properties of the PDF.
     28
     29% Many faculties/departments also require one or more printed copies.
     30% This template attempts to satisfy both types of output.
    2831% See additional notes below.
    29 % It is based on the standard "book" document class which provides all necessary sectioning structures and allows multi-part theses.
    30 
    31 % If you are using this template in Overleaf (cloud-based collaboration service), then it is automatically processed and previewed for you as you edit.
    32 
    33 % For people who prefer to install their own LaTeX distributions on their own computers, and process the source files manually, the following notes provide the sequence of tasks:
    34  
     32% It is based on the standard "book" document class which provides all
     33% necessary sectioning structures and allows multi-part theses.
     34
     35% If you are using this template in Overleaf (cloud-based collaboration
     36% service), then it is automatically processed and previewed for you as you
     37% edit.
     38
     39% For people who prefer to install their own LaTeX distributions on their own
     40% computers, and process the source files manually, the following notes
     41% provide the sequence of tasks:
     42
    3543% E.g. to process a thesis called "mythesis.tex" based on this template, run:
    3644
    3745% pdflatex mythesis     -- first pass of the pdflatex processor
    3846% bibtex mythesis       -- generates bibliography from .bib data file(s)
    39 % makeindex         -- should be run only if an index is used
    40 % pdflatex mythesis     -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc.
    41 % pdflatex mythesis     -- it takes a couple of passes to completely process all cross-references
    42 
    43 % If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu).
    44 % Then click the PDFLaTeX button two more times.
    45 % If you have an index as well,you'll need to run MakeIndex from the Tools menu as well, before running pdflatex
    46 % the last two times.
    47 
    48 % N.B. The "pdftex" program allows graphics in the following formats to be included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF
    49 % Tip: Generate your figures and photos in the size you want them to appear in your thesis, rather than scaling them with \includegraphics options.
    50 % Tip: Any drawings you do should be in scalable vector graphic formats: SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the final PDF as well.
     47% makeindex         -- should be run only if an index is used
     48% pdflatex mythesis     -- fixes numbering in cross-references, bibliographic
     49%                      references, glossaries, index, etc.
     50% pdflatex mythesis     -- it takes a couple of passes to completely process all
     51%                      cross-references
     52
     53% If you use the recommended LaTeX editor, Texmaker, you would open the
     54% mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under
     55% the Tools menu). Then click the PDFLaTeX button two more times.
     56% If you have an index as well, you'll need to run MakeIndex from the Tools
     57% menu as well, before running pdflatex the last two times.
     58
     59% N.B. The "pdftex" program allows graphics in the following formats to be
     60% included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF
     61% Tip: Generate your figures and photos in the size you want them to appear
     62% in your thesis, rather than scaling them with \includegraphics options.
     63% Tip: Any drawings you do should be in scalable vector graphic formats: SVG,
     64% PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the
     65% final PDF as well.
    5166% Tip: Photographs should be cropped and compressed so as not to be too large.
    5267
    53 % To create a PDF output that is optimized for double-sided printing:
    54 % 1) comment-out the \documentclass statement in the preamble below, and un-comment the second \documentclass line.
    55 % 2) change the value assigned below to the boolean variable "PrintVersion" from " false" to "true".
    56 
    57 %======================================================================
     68% To create a PDF output that is optimized for double-sided printing:
     69% 1) comment-out the \documentclass statement in the preamble below, and
     70%    un-comment the second \documentclass line.
     71% 2) change the value assigned below to the boolean variable "PrintVersion"
     72%    from "false" to "true".
     73
     74% ======================================================================
    5875%   D O C U M E N T   P R E A M B L E
    59 % Specify the document class, default style attributes, and page dimensions, etc.
     76% Specify the document class, default style attributes, page dimensions, etc.
    6077% For hyperlinked PDF, suitable for viewing on a computer, use this:
    6178\documentclass[letterpaper,12pt,titlepage,oneside,final]{book}
    6279
    63 % For PDF, suitable for double-sided printing, change the PrintVersion variable below to "true" and use this \documentclass line instead of the one above:
     80% For PDF, suitable for double-sided printing, change the PrintVersion
     81% variable below to "true" and use this \documentclass line instead of the
     82% one above:
    6483%\documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book}
    6584
     85\usepackage{etoolbox}
     86
    6687% Some LaTeX commands I define for my own nomenclature.
    67 % If you have to, it's easier to make changes to nomenclature once here than in a million places throughout your thesis!
     88% If you have to, it's easier to make changes to nomenclature once here than
     89% in a million places throughout your thesis!
    6890\newcommand{\package}[1]{\textbf{#1}} % package names in bold text
    69 \newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font 
    70 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the print-optimized version will ignore \href tags (redefined by hyperref pkg).
    71 %\newcommand{\texorpdfstring}[2]{#1} % does nothing, but defines the command
     91\newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font
     92\newcommand{\href}[1]{#1} % does nothing, but defines the command so the
     93% print-optimized version will ignore \href tags (redefined by hyperref pkg).
    7294% Anything defined here may be redefined by packages added below...
    7395
     
    7698\newboolean{PrintVersion}
    7799\setboolean{PrintVersion}{false}
    78 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies by overriding some options of the hyperref package, called below.
     100% CHANGE THIS VALUE TO "true" as necessary, to improve printed results for
     101% hard copies by overriding some options of the hyperref package, called below.
    79102
    80103%\usepackage{nomencl} % For a nomenclature (optional; available from ctan.org)
    81 \usepackage{amsmath,amssymb,amstext} % Lots of math symbols and environments
    82 \usepackage[pdftex]{graphicx} % For including graphics N.B. pdftex graphics driver
     104% Lots of math symbols and environments
     105\usepackage{amsmath,amssymb,amstext}
     106% For including graphics N.B. pdftex graphics driver
     107\usepackage[pdftex]{graphicx}
     108% Removes large sections of the document.
     109\usepackage{comment}
     110% Adds todos (Must be included after comment.)
     111\usepackage{todonotes}
     112
    83113
    84114% Hyperlinks make it very easy to navigate an electronic document.
    85 % In addition, this is where you should specify the thesis title and author as they appear in the properties of the PDF document.
     115% In addition, this is where you should specify the thesis title and author as
     116% they appear in the properties of the PDF document.
    86117% Use the "hyperref" package
    87118% N.B. HYPERREF MUST BE THE LAST PACKAGE LOADED; ADD ADDITIONAL PKGS ABOVE
    88119\usepackage[pdftex,pagebackref=true]{hyperref} % with basic options
    89120%\usepackage[pdftex,pagebackref=true]{hyperref}
    90                 % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing.
     121% N.B. pagebackref=true provides links back from the References to the body
     122% text. This can cause trouble for printing.
    91123\hypersetup{
    92124    plainpages=false,       % needed if Roman numbers in frontpages
     
    96128    pdffitwindow=false,     % window fit to page when opened
    97129    pdfstartview={FitH},    % fits the width of the page to the window
    98 %    pdftitle={uWaterloo\ LaTeX\ Thesis\ Template},    % title: CHANGE THIS TEXT!
     130%    pdftitle={uWaterloo\ LaTeX\ Thesis\ Template}, % title: CHANGE THIS TEXT!
    99131%    pdfauthor={Author},    % author: CHANGE THIS TEXT! and uncomment this line
    100132%    pdfsubject={Subject},  % subject: CHANGE THIS TEXT! and uncomment this line
    101 %    pdfkeywords={keyword1} {key2} {key3}, % list of keywords, and uncomment this line if desired
     133%    pdfkeywords={keyword1} {key2} {key3}, % optional list of keywords
    102134    pdfnewwindow=true,      % links in new window
    103135    colorlinks=true,        % false: boxed links; true: colored links
     
    107139    urlcolor=cyan           % color of external links
    108140}
    109 \ifthenelse{\boolean{PrintVersion}}{   % for improved print quality, change some hyperref options
     141% for improved print quality, change some hyperref options
     142\ifthenelse{\boolean{PrintVersion}}{
    110143\hypersetup{    % override some previously defined hyperref options
    111144%    colorlinks,%
     
    116149}{} % end of ifthenelse (no else)
    117150
    118 \usepackage[automake,toc,abbreviations]{glossaries-extra} % Exception to the rule of hyperref being the last add-on package
    119 % If glossaries-extra is not in your LaTeX distribution, get it from CTAN (http://ctan.org/pkg/glossaries-extra),
    120 % although it's supposed to be in both the TeX Live and MikTeX distributions. There are also documentation and
    121 % installation instructions there.
     151% Exception to the rule of hyperref being the last add-on package
     152\usepackage[automake,toc,abbreviations]{glossaries-extra}
     153% If glossaries-extra is not in your LaTeX distribution, get it from CTAN
     154% (http://ctan.org/pkg/glossaries-extra), although it's supposed to be in
     155% both the TeX Live and MikTeX distributions. There are also documentation
     156% and installation instructions there.
    122157
    123158% Setting up the page margins...
    124 \setlength{\textheight}{9in}\setlength{\topmargin}{-0.45in}\setlength{\headsep}{0.25in}
    125 % uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at the
    126 % top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin (on binding side).
    127 % While this is not an issue for electronic viewing, a PDF may be printed, and so we have the same page layout for both printed and electronic versions, we leave the gutter margin in.
    128 % Set margins to minimum permitted by uWaterloo thesis regulations:
     159\setlength{\textheight}{9in}
     160\setlength{\topmargin}{-0.45in}
     161\setlength{\headsep}{0.25in}
     162% uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at
     163% the top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin
     164% (on binding side). While this is not an issue for electronic viewing, a PDF
     165% may be printed, and so we have the same page layout for both printed and
     166% electronic versions, we leave the gutter margin in. Set margins to minimum
     167% permitted by uWaterloo thesis regulations:
    129168\setlength{\marginparwidth}{0pt} % width of margin notes
    130169% N.B. If margin notes are used, you must adjust \textwidth, \marginparwidth
    131170% and \marginparsep so that the space left between the margin notes and page
    132171% edge is less than 15 mm (0.6 in.)
    133 \setlength{\marginparsep}{0pt} % width of space between body text and margin notes
    134 \setlength{\evensidemargin}{0.125in} % Adds 1/8 in. to binding side of all
     172% width of space between body text and margin notes
     173\setlength{\marginparsep}{0pt}
     174% Adds 1/8 in. to binding side of all
    135175% even-numbered pages when the "twoside" printing option is selected
    136 \setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages when "oneside" printing is selected, and to the left of all odd-numbered pages when "twoside" printing is selected
    137 \setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and side margins as above
     176\setlength{\evensidemargin}{0.125in}
     177% Adds 1/8 in. to the left of all pages when "oneside" printing is selected,
     178% and to the left of all odd-numbered pages when "twoside" printing is selected
     179\setlength{\oddsidemargin}{0.125in}
     180% assuming US letter paper (8.5 in. x 11 in.) and side margins as above
     181\setlength{\textwidth}{6.375in}
    138182\raggedbottom
    139183
    140 % The following statement specifies the amount of space between paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount.
     184% The following statement specifies the amount of space between paragraphs.
     185% Other reasonable specifications are \bigskipamount and \smallskipamount.
    141186\setlength{\parskip}{\medskipamount}
    142187
    143 % The following statement controls the line spacing. 
    144 % The default spacing corresponds to good typographic conventions and only slight changes (e.g., perhaps "1.2"), if any, should be made.
     188% The following statement controls the line spacing.
     189% The default spacing corresponds to good typographic conventions and only
     190% slight changes (e.g., perhaps "1.2"), if any, should be made.
    145191\renewcommand{\baselinestretch}{1} % this is the default line space setting
    146192
    147193% By default, each chapter will start on a recto (right-hand side) page.
    148 % We also force each section of the front pages to start on a recto page by inserting \cleardoublepage commands.
    149 % In many cases, this will require that the verso (left-hand) page be blank, and while it should be counted, a page number should not be printed.
    150 % The following statements ensure a page number is not printed on an otherwise blank verso page.
     194% We also force each section of the front pages to start on a recto page by
     195% inserting \cleardoublepage commands. In many cases, this will require that
     196% the verso (left-hand) page be blank, and while it should be counted, a page
     197% number should not be printed. The following statements ensure a page number
     198% is not printed on an otherwise blank verso page.
    151199\let\origdoublepage\cleardoublepage
    152200\newcommand{\clearemptydoublepage}{%
     
    154202\let\cleardoublepage\clearemptydoublepage
    155203
    156 % Define Glossary terms (This is properly done here, in the preamble and could also be \input{} from a separate file...)
     204% Define Glossary terms (This is properly done here, in the preamble and
     205% could also be \input{} from a separate file...)
    157206\input{glossaries}
    158207\makeglossaries
    159208
    160 \usepackage{comment}
    161209% cfa macros used in the document
    162210%\usepackage{cfalab}
     211% I'm going to bring back eventually.
     212\makeatletter
     213% Combines all \CC* commands:
     214\newrobustcmd*\Cpp[1][\xspace]{\cfalab@Cpp#1}
     215\newcommand\cfalab@Cpp{C\kern-.1em\hbox{+\kern-.25em+}}
     216% Optional arguments do not work with pdf string. (Some fix-up required.)
     217\pdfstringdefDisableCommands{\def\Cpp{C++}}
     218
     219% Colour text, formatted in LaTeX style instead of TeX style.
     220\newcommand*\colour[2]{{\color{#1}#2}}
     221\makeatother
     222
    163223\input{common}
    164 \CFAStyle                                               % CFA code-style for all languages
    165 \lstset{language=CFA,basicstyle=\linespread{0.9}\tt}    % CFA default language
     224% CFA code-style for all languages
     225\CFAStyle
     226% CFA default language
     227\lstset{language=CFA,basicstyle=\linespread{0.9}\tt}
     228% Annotations from Peter:
    166229\newcommand{\PAB}[1]{{\color{blue}PAB: #1}}
     230% Change the style of abbreviations:
     231\renewcommand{\abbrevFont}{}
    167232
    168233%======================================================================
    169234%   L O G I C A L    D O C U M E N T
    170235% The logical document contains the main content of your thesis.
    171 % Being a large document, it is a good idea to divide your thesis into several files, each one containing one chapter or other significant chunk of content, so you can easily shuffle things around later if desired.
     236% Being a large document, it is a good idea to divide your thesis into several
     237% files, each one containing one chapter or other significant chunk of content,
     238% so you can easily shuffle things around later if desired.
    172239%======================================================================
    173240\begin{document}
     
    176243% FRONT MATERIAL
    177244% title page,declaration, borrowers' page, abstract, acknowledgements,
    178 % dedication, table of contents, list of tables, list of figures, nomenclature, etc.
    179 %----------------------------------------------------------------------
    180 \input{uw-ethesis-frontpgs}
     245% dedication, table of contents, list of tables, list of figures,
     246% nomenclature, etc.
     247%----------------------------------------------------------------------
     248\input{uw-ethesis-frontpgs}
    181249
    182250%----------------------------------------------------------------------
    183251% MAIN BODY
    184252% We suggest using a separate file for each chapter of your thesis.
    185 % Start each chapter file with the \chapter command.
    186 % Only use \documentclass or \begin{document} and \end{document} commands in this master document.
     253% Start each chapter file with the \chapter command. Only use \documentclass,
     254% \begin{document} and \end{document} commands in this master document.
    187255% Tip: Putting each sentence on a new line is a way to simplify later editing.
    188256%----------------------------------------------------------------------
     
    200268% Bibliography
    201269
    202 % The following statement selects the style to use for references. 
    203 % It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels.
     270% The following statement selects the style to use for references.
     271% It controls the sort order of the entries in the bibliography and also the
     272% formatting for the in-text labels.
    204273\bibliographystyle{plain}
    205 % This specifies the location of the file containing the bibliographic information. 
    206 % It assumes you're using BibTeX to manage your references (if not, why not?).
    207 \cleardoublepage % This is needed if the "book" document class is used, to place the anchor in the correct page, because the bibliography will start on its own page.
    208 % Use \clearpage instead if the document class uses the "oneside" argument
    209 \phantomsection  % With hyperref package, enables hyperlinking from the table of contents to bibliography             
    210 % The following statement causes the title "References" to be used for the bibliography section:
     274% This specifies the location of the file containing the bibliographic
     275% information. It assumes you're using BibTeX to manage your references (if
     276% not, why not?).
     277\cleardoublepage % This is needed if the "book" document class is used, to
     278% place the anchor in the correct page, because the bibliography will start
     279% on its own page.
     280% Use \clearpage instead if the document class uses the "oneside" argument.
     281\phantomsection  % With hyperref package, enables hyperlinking from the table
     282% of contents to bibliography.
     283% The following statement causes the title "References" to be used for the
     284% bibliography section:
    211285\renewcommand*{\bibname}{References}
    212286
     
    215289
    216290\bibliography{uw-ethesis,pl}
    217 % Tip: You can create multiple .bib files to organize your references.
    218 % Just list them all in the \bibliography command, separated by commas (no spaces).
    219 
    220 % The following statement causes the specified references to be added to the bibliography even if they were not cited in the text.
    221 % The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional).
     291% Tip: You can create multiple .bib files to organize your references. Just
     292% list them all in the \bibliography command, separated by commas (no spaces).
     293
     294% The following statement causes the specified references to be added to the
     295% bibliography even if they were not cited in the text. The asterisk is a
     296% wildcard that causes all entries in the bibliographic database to be
     297% included (optional).
    222298% \nocite{*}
    223299%----------------------------------------------------------------------
     
    227303% The \appendix statement indicates the beginning of the appendices.
    228304\appendix
    229 % Add an un-numbered title page before the appendices and a line in the Table of Contents
     305% Add an un-numbered title page before the appendices and a line in the Table
     306% of Contents
    230307% \chapter*{APPENDICES}
    231308% \addcontentsline{toc}{chapter}{APPENDICES}
    232 % Appendices are just more chapters, with different labeling (letters instead of numbers).
     309% Appendices are just more chapters, with different labeling (letters instead
     310% of numbers).
    233311% \input{appendix-matlab_plots.tex}
    234312
    235 % GLOSSARIES (Lists of definitions, abbreviations, symbols, etc. provided by the glossaries-extra package)
     313% GLOSSARIES (Lists of definitions, abbreviations, symbols, etc.
     314% provided by the glossaries-extra package)
    236315% -----------------------------
    237316\printglossaries
  • doc/theses/fangren_yu_COOP_F20/Report.tex

    r5e99a9a r95b3a9c  
    102102\CFA language, developed by the Programming Language Group at the University of Waterloo, has a long history, with the initial language design in 1992 by Glen Ditchfield~\cite{Ditchfield92} and the first proof-of-concept compiler built in 2003 by Richard Bilson~\cite{Bilson03}. Many new features have been added to the language over time, but the core of \CFA's type-system --- parametric functions introduced by the @forall@ clause (hence the name of the language) providing parametric overloading --- remains mostly unchanged.
    103103
    104 The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take a multiple minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work.
     104The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take multiple minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work.
    105105
    106106This report presents a series of optimizations to the performance-critical parts of the resolver, with a major rework of the compiler data-structures using a functional-programming approach to reduce memory complexity. The improvements were suggested by running the compiler builds with a performance profiler against the \CFA standard-library source-code and a test suite to find the most underperforming components in the compiler algorithm.
     
    122122\end{itemize}
    123123
    124 The resolver algorithm, designed for overload resolution, uses a significant amount of reused, and hence copying, for the intermediate representations, especially in the following two places:
     124The resolver algorithm, designed for overload resolution, relies on a significant amount of code reuse, and hence copying, of the intermediate representations, especially in the following two places:
    125125\begin{itemize}
    126126\item
     
    301301forall( dtype T | sized( T ) )
    302302T * malloc( void ) { return (T *)malloc( sizeof(T) ); } // call C malloc
    303 int * i = malloc();  // type deduced from left-hand side $\Rightarrow$ no size argument or return cast
     303int * i = malloc();  // type deduced from left-hand side $\(\Rightarrow\)$ no size argument or return cast
    304304\end{cfa}
    305305An unbound return-type is problematic in resolver complexity because a single match of a function call with an unbound return type may create multiple candidates. In the worst case, consider a function declared that returns any @otype@ (defined \VPageref{otype}):
     
    432432\begin{cfa}
    433433void f( int );
    434 double g$_1$( int );
    435 int g$_2$( long );
     434double g$\(_1\)$( int );
     435int g$\(_2\)$( long );
    436436f( g( 42 ) );
    437437\end{cfa}
  • doc/theses/thierry_delisle_PhD/thesis/Makefile

    r5e99a9a r95b3a9c  
    88BibTeX = BIBINPUTS=${TeXLIB} && export BIBINPUTS && bibtex
    99
    10 MAKEFLAGS = --no-print-directory --silent
     10MAKEFLAGS = --no-print-directory # --silent
    1111VPATH = ${Build} ${Figures}
    1212
     
    3232        emptytree \
    3333        fairness \
     34        io_uring \
     35        pivot_ring \
    3436        system \
    3537}
     
    4345## Define the documents that need to be made.
    4446all: thesis.pdf
    45 thesis.pdf: ${TEXTS} ${FIGURES} ${PICTURES} glossary.tex local.bib
     47thesis.pdf: ${TEXTS} ${FIGURES} ${PICTURES} thesis.tex glossary.tex local.bib
    4648
    4749DOCUMENT = thesis.pdf
     
    4951
    5052# Directives #
     53
     54.NOTPARALLEL:                                           # cannot make in parallel
    5155
    5256.PHONY : all clean                                      # not file names
     
    8185        ${LaTeX} $<
    8286
    83 build/fairness.svg : fig/fairness.py | ${Build}
    84         python3 $< $@
    85 
    8687## Define the default recipes.
    8788
     
    105106        sed -i 's/$@/${Build}\/$@/g' ${Build}/$@_t
    106107
     108build/fairness.svg : fig/fairness.py | ${Build}
     109        python3 $< $@
     110
    107111## pstex with inverted colors
    108112%.dark.pstex : fig/%.fig Makefile | ${Build}
  • doc/theses/thierry_delisle_PhD/thesis/local.bib

    r5e99a9a r95b3a9c  
    512512}
    513513
     514@manual{MAN:bsd/kqueue,
     515  title = {KQUEUE(2) - FreeBSD System Calls Manual},
     516  url   = {https://www.freebsd.org/cgi/man.cgi?query=kqueue},
     517  year  = {2020},
     518  month = {may}
     519}
     520
    514521% Apple's MAC OS X
    515522@manual{MAN:apple/scheduler,
     
    577584
    578585% --------------------------------------------------
     586% Man Pages
     587@manual{MAN:open,
     588  key        = "open",
     589  title      = "open(2) Linux User's Manual",
     590  year       = "2020",
     591  month      = "February",
     592}
     593
     594@manual{MAN:accept,
     595  key        = "accept",
     596  title      = "accept(2) Linux User's Manual",
     597  year       = "2019",
     598  month      = "March",
     599}
     600
     601@manual{MAN:select,
     602  key        = "select",
     603  title      = "select(2) Linux User's Manual",
     604  year       = "2019",
     605  month      = "March",
     606}
     607
     608@manual{MAN:poll,
     609  key        = "poll",
     610  title      = "poll(2) Linux User's Manual",
     611  year       = "2019",
     612  month      = "July",
     613}
     614
     615@manual{MAN:epoll,
     616  key        = "epoll",
     617  title      = "epoll(7) Linux User's Manual",
     618  year       = "2019",
     619  month      = "March",
     620}
     621
     622@manual{MAN:aio,
     623  key        = "aio",
     624  title      = "aio(7) Linux User's Manual",
     625  year       = "2019",
     626  month      = "March",
     627}
     628
     629@misc{MAN:io_uring,
     630  title   = {Efficient IO with io\_uring},
     631  author  = {Axboe, Jens},
     632  year    = "2019",
     633  month   = "March",
     634  version = {0,4},
     635  howpublished = {\url{https://kernel.dk/io_uring.pdf}}
     636}
     637
     638% --------------------------------------------------
    579639% Wikipedia Entries
    580640@misc{wiki:taskparallel,
     
    617677  note = "[Online; accessed 2-January-2021]"
    618678}
     679
     680@misc{wiki:future,
     681  author = "{Wikipedia contributors}",
     682  title = "Futures and promises --- {W}ikipedia{,} The Free Encyclopedia",
     683  year = "2020",
     684  url = "https://en.wikipedia.org/wiki/Futures_and_promises",
     685  note = "[Online; accessed 9-February-2021]"
     686}
  • doc/theses/thierry_delisle_PhD/thesis/text/core.tex

    r5e99a9a r95b3a9c  
    4949
    5050\section{Design}
    51 In general, a na\"{i}ve \glsxtrshort{fifo} ready-queue does not scale with increased parallelism from \glspl{hthrd}, resulting in decreased performance. The problem is adding/removing \glspl{thrd} is a single point of contention. As shown in the evaluation sections, most production schedulers do scale when adding \glspl{hthrd}. The common solution to the single point of contention is to shard the ready-queue so each \gls{hthrd} can access the ready-queue without contention, increasing performance though lack of contention.
     51In general, a na\"{i}ve \glsxtrshort{fifo} ready-queue does not scale with increased parallelism from \glspl{hthrd}, resulting in decreased performance. The problem is adding/removing \glspl{thrd} is a single point of contention. As shown in the evaluation sections, most production schedulers do scale when adding \glspl{hthrd}. The common solution to the single point of contention is to shard the ready-queue so each \gls{hthrd} can access the ready-queue without contention, increasing performance.
    5252
    5353\subsection{Sharding} \label{sec:sharding}
    54 An interesting approach to sharding a queue is presented in \cit{Trevors paper}. This algorithm presents a queue with a relaxed \glsxtrshort{fifo} guarantee using an array of strictly \glsxtrshort{fifo} sublists as shown in Figure~\ref{fig:base}. Each \emph{cell} of the array has a timestamp for the last operation and a pointer to a linked-list with a lock and each node in the list is marked with a timestamp indicating when it is added to the list. A push operation is done by picking a random cell, acquiring the list lock, and pushing to the list. If the cell is locked, the operation is simply retried on another random cell until a lock is acquired. A pop operation is done in a similar fashion except two random cells are picked. If both cells are unlocked with non-empty lists, the operation pops the node with the oldest cell timestamp. If one of the cells is unlocked and non-empty, the operation pops from that cell. If both cells are either locked or empty, the operation picks two new random cells and tries again.
     54An interesting approach to sharding a queue is presented in \cit{Trevors paper}. This algorithm presents a queue with a relaxed \glsxtrshort{fifo} guarantee using an array of strictly \glsxtrshort{fifo} sublists as shown in Figure~\ref{fig:base}. Each \emph{cell} of the array has a timestamp for the last operation and a pointer to a linked-list with a lock. Each node in the list is marked with a timestamp indicating when it is added to the list. A push operation is done by picking a random cell, acquiring the list lock, and pushing to the list. If the cell is locked, the operation is simply retried on another random cell until a lock is acquired. A pop operation is done in a similar fashion except two random cells are picked. If both cells are unlocked with non-empty lists, the operation pops the node with the oldest timestamp. If one of the cells is unlocked and non-empty, the operation pops from that cell. If both cells are either locked or empty, the operation picks two new random cells and tries again.
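The push and pop operations described above can be sketched as follows. This is a minimal illustration rather than the thesis's implementation: the class name `RelaxedFifoQueue`, the logical counter standing in for timestamps, and the bounded retry in `try_pop` are all assumptions made for the example. It also compares the timestamps of the head nodes directly instead of keeping a separate per-cell timestamp.

```python
import itertools
import random
import threading
from collections import deque


class RelaxedFifoQueue:
    """Hypothetical sketch: an array of strictly FIFO sublists, one lock per cell."""

    def __init__(self, num_cells, seed=None):
        self._cells = [deque() for _ in range(num_cells)]
        self._locks = [threading.Lock() for _ in range(num_cells)]
        # Assumption: a logical counter replaces a real clock or timestamp counter.
        self._clock = itertools.count()
        self._rng = random.Random(seed)

    def push(self, item):
        """Pick random cells until one can be locked, then append to its list."""
        while True:
            i = self._rng.randrange(len(self._cells))
            if self._locks[i].acquire(blocking=False):
                try:
                    self._cells[i].append((next(self._clock), item))
                    return
                finally:
                    self._locks[i].release()
            # Cell was locked: retry on another random cell.

    def try_pop(self, max_attempts=64):
        """Pick two random cells and pop the older head; None if nothing is found.

        Assumption: the retry loop is bounded so an empty queue returns None
        instead of spinning forever.
        """
        n = len(self._cells)
        for _ in range(max_attempts):
            picks = {self._rng.randrange(n), self._rng.randrange(n)}
            held = [k for k in picks if self._locks[k].acquire(blocking=False)]
            try:
                ready = [k for k in held if self._cells[k]]
                if ready:
                    # Among unlocked, non-empty cells, take the oldest head node.
                    oldest = min(ready, key=lambda k: self._cells[k][0][0])
                    return self._cells[oldest].popleft()[1]
            finally:
                for k in held:
                    self._locks[k].release()
            # Both cells locked or empty: pick two new cells and try again.
        return None
```

Each sublist stays strictly FIFO, while ordering across sublists is only approximate, which is the relaxed guarantee the paragraph describes.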
    5555
    5656\begin{figure}
     
    100100\paragraph{Local Information} Figure~\ref{fig:emptytls} shows an approach using dense information, similar to the bitmap, but each \gls{hthrd} keeps its own independent copy. While this approach can offer good scalability \emph{and} low latency, the liveliness and discovery of the information can become a problem. This case is made worse in systems with few processors where even blind random picks can find \glspl{thrd} in a few tries.
    101101
    102 I built a prototype of these approaches and none of these techniques offer satisfying performance when few threads are present. All of these approaches hit the same two problems. First, randomly picking sub-queues is very fast but means any improvement to the hit rate can easily be countered by a slow-down in look-up speed when there are empty lists. Second, the array is already as sharded to avoid contention bottlenecks, so any denser data structure tends to become a bottleneck. In all cases, these factors meant the best-case scenario, \ie many threads, would get worse throughput, and the worst-case scenario, few threads, would get a better hit rate, but an equivalently poor throughput. As a result, I tried an entirely different approach.
     102I built a prototype of these approaches and none of these techniques offer satisfying performance when few threads are present. All of these approaches hit the same two problems. First, randomly picking sub-queues is very fast. That speed means any improvement to the hit rate can easily be countered by a slow-down in look-up speed, whether or not there are empty lists. Second, the array is already sharded to avoid contention bottlenecks, so any denser data structure tends to become a bottleneck. In all cases, these factors meant the best-case scenario, \ie many threads, would get worse throughput, and the worst-case scenario, few threads, would get a better hit rate, but an equivalently poor throughput. As a result, I tried an entirely different approach.
    103103
    104104\subsection{Dynamic Entropy}\cit{https://xkcd.com/2318/}
    105 In the worst-case scenario there are only few \glspl{thrd} ready to run, or more precisely given $P$ \glspl{proc}\footnote{For simplicity, this assumes there is a one-to-one match between \glspl{proc} and \glspl{hthrd}.}, $T$ \glspl{thrd} and $\epsilon$ a very small number, than the worst case scenario can be represented by $\epsilon \ll P$, than $T = P + \epsilon$. It is important to note in this case that fairness is effectively irrelevant. Indeed, this case is close to \emph{actually matching} the model of the ``Ideal multi-tasking CPU'' on page \pageref{q:LinuxCFS}. In this context, it is possible to use a purely internal-locality based approach and still meet the fairness requirements. This approach simply has each \gls{proc} running a single \gls{thrd} repeatedly. Or from the shared ready-queue viewpoint, each \gls{proc} pushes to a given sub-queue and then popes from the \emph{same} subqueue. In cases where $T \gg P$, the scheduler should also achieves similar performance without affecting the fairness guarantees.
     105In the worst-case scenario there are only a few \glspl{thrd} ready to run. More precisely, given $P$ \glspl{proc}\footnote{For simplicity, this assumes there is a one-to-one match between \glspl{proc} and \glspl{hthrd}.}, $T$ \glspl{thrd} and $\epsilon$ a very small number, the worst-case scenario can be represented by $T = P + \epsilon$, with $\epsilon \ll P$. It is important to note in this case that fairness is effectively irrelevant. Indeed, this case is close to \emph{actually matching} the model of the ``Ideal multi-tasking CPU'' on page \pageref{q:LinuxCFS}. In this context, it is possible to use a purely internal-locality based approach and still meet the fairness requirements. This approach simply has each \gls{proc} running a single \gls{thrd} repeatedly. Or from the shared ready-queue viewpoint, each \gls{proc} pushes to a given sub-queue and then pops from the \emph{same} subqueue. The challenge is for the scheduler to achieve good performance in both the $T = P + \epsilon$ case and the $T \gg P$ case, without affecting the fairness guarantees in the latter.
    106106
    107 To handle this case, I use a pseudo random-number generator, \glsxtrshort{prng} in a novel way. When the scheduler uses a \glsxtrshort{prng} instance per \gls{proc} exclusively, the random-number seed effectively starts an encoding that produces a list of all accessed subqueues, from latest to oldest. The novel approach is to be able to ``replay'' the \glsxtrshort{prng} backwards and there exist \glsxtrshort{prng}s that are fast, compact \emph{and} can be run forward and backwards. Linear congruential generators~\cite{wiki:lcg} are an example of \glsxtrshort{prng}s that match these requirements.
     107To handle this case, I use a \glsxtrshort{prng}\todo{Fix missing long form} in a novel way. There exist \glsxtrshort{prng}s that are fast, compact and can be run forward \emph{and} backwards.  Linear congruential generators~\cite{wiki:lcg} are an example of such \glsxtrshort{prng}s. The novel approach is to use the ability to run backwards to ``replay'' the \glsxtrshort{prng}. The scheduler uses an exclusive \glsxtrshort{prng} instance per \gls{proc}, so the random-number seed effectively starts an encoding that produces a list of all accessed subqueues, from latest to oldest. Replaying the \glsxtrshort{prng} identifies cells accessed recently, which probably still have data cached.
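
The reversibility property can be illustrated with a minimal sketch in C. It assumes a 64-bit linear congruential generator with Knuth's MMIX constants; the names @prng_forward@, @prng_backward@ and @subqueue@ are hypothetical and are not taken from the \CFA runtime. Running backwards relies on the multiplier being odd, and therefore invertible modulo $2^{64}$.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants (Knuth's MMIX LCG); any odd multiplier is invertible mod 2^64. */
#define LCG_A 6364136223846793005ULL
#define LCG_C 1442695040888963407ULL

typedef struct { uint64_t state; } prng_t;

/* Multiplicative inverse of an odd value modulo 2^64, by Newton iteration. */
static uint64_t inverse_mod_2_64(uint64_t a) {
    uint64_t x = a;                   /* a*a == 1 (mod 8), so 3 bits are correct */
    for (int i = 0; i < 5; i++)
        x *= 2 - a * x;               /* each round doubles the correct bits */
    return x;
}

/* Step forward and return the new state. */
static uint64_t prng_forward(prng_t *g) {
    g->state = LCG_A * g->state + LCG_C;
    return g->state;
}

/* Return the current state, then step back to the previous one. */
static uint64_t prng_backward(prng_t *g) {
    uint64_t current = g->state;
    g->state = inverse_mod_2_64(LCG_A) * (g->state - LCG_C);
    return current;
}

/* Map a random value to one of n sub-queues, using the stronger high bits. */
static unsigned subqueue(uint64_t r, unsigned n) {
    return (unsigned)((r >> 32) % n);
}
```

With this sketch, a \gls{proc} that pushed using three successive calls to @prng_forward@ can call @prng_backward@ three times to revisit the same sub-queues from latest to oldest, without storing the list of indexes.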
    108108
    109109The algorithm works as follows:
  • doc/theses/thierry_delisle_PhD/thesis/text/intro.tex

    r5e99a9a r95b3a9c  
    77While previous work on the concurrent package of \CFA focused on features and interfaces, this thesis focuses on performance, introducing \glsxtrshort{api} changes only when required by performance considerations. More specifically, this thesis concentrates on scheduling and \glsxtrshort{io}. Prior to this work, the \CFA runtime used a strictly \glsxtrshort{fifo} \gls{rQ}.
    88
    9 This work exclusively concentrates on Linux as it's operating system since the existing \CFA runtime and compiler does not already support other operating systems. Furthermore, as \CFA is yet to be released, supporting version of Linux older that the latest version is not a goal of this work.
     9This work exclusively concentrates on Linux as its operating system since the existing \CFA runtime and compiler do not already support other operating systems. Furthermore, as \CFA is yet to be released, supporting versions of Linux older than the latest version is not a goal of this work.
  • doc/theses/thierry_delisle_PhD/thesis/text/io.tex

    r5e99a9a r95b3a9c  
    1 \chapter{User Level \glsxtrshort{io}}
    2 As mentionned in Section~\ref{prev:io}, User-Level \glsxtrshort{io} requires multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc} using asynchronous \glsxtrshort{io} operations. Various operating systems offer various forms of asynchronous operations and as mentioned in Chapter~\ref{intro}, this work is exclusively focuesd on Linux.
     1\chapter{User Level \io}
     2As mentioned in Section~\ref{prev:io}, User-Level \io requires multiplexing the \io operations of many \glspl{thrd} onto fewer \glspl{proc} using asynchronous \io operations. Different operating systems offer various forms of asynchronous operations and as mentioned in Chapter~\ref{intro}, this work is exclusively focused on the Linux operating-system.
    33
    4 \section{Existing options}
    5 Since \glsxtrshort{io} operations are generally handled by the
     4\section{Kernel Interface}
     5Since this work fundamentally depends on operating-system support, the first step of any design is to discuss the available interfaces and pick one (or more) as the foundations of the non-blocking \io subsystem.
    66
    7 \subsection{\lstinline|epoll|, \lstinline|poll| and \lstinline|select|}
     7\subsection{\lstinline{O_NONBLOCK}}
     8In Linux, files can be opened with the flag @O_NONBLOCK@~\cite{MAN:open} (or @SOCK_NONBLOCK@~\cite{MAN:accept}, the equivalent for sockets) to use the file descriptors in ``nonblocking mode''. In this mode, ``Neither the @open()@ nor any subsequent \io operations on the [opened file descriptor] will cause the calling
     9process to wait''~\cite{MAN:open}. This feature can be used as the foundation for the non-blocking \io subsystem. However, for the subsystem to know when an \io operation completes, @O_NONBLOCK@ must be used in conjunction with a system call that monitors when a file descriptor becomes ready, \ie, the next \io operation on it does not cause the process to wait\footnote{In this context, ready means \emph{some} operation can be performed without blocking. It does not mean an operation returning \lstinline{EAGAIN} succeeds on the next try. For example, a ready read may only return a subset of bytes and the read must be issued again for the remaining bytes, at which point it may return \lstinline{EAGAIN}.}.
     10This mechanism is also crucial in determining when all \glspl{thrd} are blocked and the application \glspl{kthrd} can now block.
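
As an illustration of nonblocking mode, the following minimal sketch switches an already-open descriptor to @O_NONBLOCK@ using @fcntl@ and reports when a @read@ would have blocked. The helper names @set_nonblocking@ and @try_read@ are hypothetical; only the standard @fcntl@ and @read@ calls are assumed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch an already-open descriptor to nonblocking mode; returns 0 on success. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Attempt a read without waiting. Returns the byte count (0 at end-of-file),
 * or -1 on error; *would_block is set to 1 when the failure only means that
 * no data is ready yet (EAGAIN), and to 0 otherwise. */
static ssize_t try_read(int fd, void *buf, size_t len, int *would_block) {
    ssize_t n = read(fd, buf, len);
    *would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```

On an empty pipe, @try_read@ returns immediately with @would_block@ set instead of parking the calling \gls{kthrd}; the caller then needs one of the monitoring system calls discussed next to learn when to retry.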
    811
    9 \subsection{Linux's AIO}
     12There are three options to monitor file descriptors in Linux\footnote{For simplicity, this section omits \lstinline{pselect} and \lstinline{ppoll}. The difference between these system calls and \lstinline{select} and \lstinline{poll}, respectively, is not relevant for this discussion.}: @select@~\cite{MAN:select}, @poll@~\cite{MAN:poll} and @epoll@~\cite{MAN:epoll}. All three of these options offer a system call that blocks a \gls{kthrd} until at least one of many file descriptors becomes ready. The group of file descriptors being waited on is called the \newterm{interest set}.
    1013
     14\paragraph{\lstinline{select}} is the oldest of these options. It takes as input a contiguous array of bits, where each bit represents a file descriptor of interest. On return, it modifies the set in place to identify which of the file descriptors changed status. This destructive change means that calling @select@ in a loop requires re-initializing the array each time and the number of file descriptors supported has a hard limit. Another limit of @select@ is that once the call is started, the interest set can no longer be modified. Monitoring a new file descriptor generally requires aborting any in-progress call to @select@\footnote{Starting a new call to \lstinline{select} is possible but requires a distinct kernel thread, and as a result is not an acceptable multiplexing solution when the interest set is large and highly dynamic unless the number of parallel calls to \lstinline{select} can be strictly bounded.}.
    1115
     16\paragraph{\lstinline{poll}} is an improvement over select, which removes the hard limit on the number of file descriptors and the need to re-initialize the input on every call. It works using an array of structures as an input rather than an array of bits, thus allowing a more compact input for small interest sets. Like @select@, @poll@ suffers from the limitation that the interest set cannot be changed while the call is blocked.
     17
     18\paragraph{\lstinline{epoll}} further improves these two functions by allowing the interest set to be dynamically added to and removed from while a \gls{kthrd} is blocked on an @epoll@ call. This dynamic capability is accomplished by creating an \emph{epoll instance} with a persistent interest set, which is used across multiple calls. This capability significantly reduces synchronization overhead on the part of the caller (in this case the \io subsystem), since the interest set can be modified when adding or removing file descriptors without having to synchronize with other \glspl{kthrd} potentially calling @epoll@.
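
A minimal sketch of the persistent interest set follows, using the standard @epoll_create1@, @epoll_ctl@ and @epoll_wait@ calls. The helper names @interest_add@ and @wait_one@ are hypothetical. The sketch is single threaded, so it does not itself demonstrate the concurrent case; the assumption illustrated is only that registration and waiting are separate calls on a shared instance, which is what lets another \gls{kthrd} call @interest_add@ while a waiter is blocked.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Add fd to the persistent interest set of the epoll instance epfd,
 * asking to be told when it becomes readable. Returns 0 on success. */
static int interest_add(int epfd, int fd) {
    struct epoll_event ev = { .events = EPOLLIN, .data = { .fd = fd } };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms milliseconds for one ready descriptor.
 * Returns the ready descriptor, or -1 if none became ready (or on error). */
static int wait_one(int epfd, int timeout_ms) {
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

Unlike @select@ and @poll@, nothing needs to be rebuilt between calls to @wait_one@: the interest set lives in the kernel object identified by @epfd@.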
     19
     20However, all three of these system calls have limitations. The @man@ page for @O_NONBLOCK@ mentions that ``[@O_NONBLOCK@] has no effect for regular files and block devices'', which means none of these three system calls are viable multiplexing strategies for these types of \io operations. Furthermore, @epoll@ has been shown to have problems with pipes and ttys~\cit{Peter's examples in some fashion}. Finally, none of these are useful solutions for multiplexing \io operations that do not have a corresponding file descriptor and can be awkward for operations using multiple file descriptors.
     21
     22\subsection{POSIX asynchronous I/O (AIO)}
     23An alternative to @O_NONBLOCK@ is the AIO interface. Its interface lets programmers enqueue operations to be performed asynchronously by the kernel. Completions of these operations can be communicated in various ways: either by spawning a new \gls{kthrd}, sending a Linux signal, or by polling for completion of one or more operations. For this work, spawning a new \gls{kthrd} is counter-productive but a related solution is discussed in Section~\ref{io:morethreads}. Using interrupt handlers can also lead to fairly complicated interactions between subsystems. This leaves polling for completion, which is similar to the previous system calls. While AIO only supports read and write operations to file descriptors, it does not have the same limitation as @O_NONBLOCK@, \ie, the file descriptors can be regular files and block devices. It also supports batching multiple operations in a single system call.
     24
     25AIO offers two different approaches to polling: @aio_error@ can be used as a spinning form of polling, returning @EINPROGRESS@ until the operation is completed, and @aio_suspend@ can be used similarly to @select@, @poll@ or @epoll@, to wait until one or more requests have completed. For the purpose of \io multiplexing, @aio_suspend@ is the best interface. However, even if AIO requests can be submitted concurrently, @aio_suspend@ suffers from the same limitation as @select@ and @poll@, \ie, the interest set cannot be dynamically changed while a call to @aio_suspend@ is in progress. AIO also suffers from the limitation of not specifying which requests have completed, \ie programmers have to poll each request in the interest set using @aio_error@ to identify the completed requests. This limitation means that, like @select@ and @poll@ but not @epoll@, the time needed to examine polling results increases based on the total number of requests monitored, not the number of completed requests.
     26Finally, AIO does not seem to be a popular interface, which I believe is due in part to this poor polling interface. Linus Torvalds talks about this interface as follows:
    1227
    1328\begin{displayquote}
    14         AIO is a horrible ad-hoc design, with the main excuse being "other,
     29        AIO is a horrible ad-hoc design, with the main excuse being ``other,
    1530        less gifted people, made that design, and we are implementing it for
    1631        compatibility because database people - who seldom have any shred of
    17         taste - actually use it".
     32        taste - actually use it''.
    1833
    1934        But AIO was always really really ugly.
     
    2439\end{displayquote}
    2540
    26 Interestingly, in this e-mail answer, Linus goes on to describe
     41Interestingly, in this e-mail, Linus goes on to describe
    2742``a true \textit{asynchronous system call} interface''
    2843that does
     
    3045in
    3146``some kind of arbitrary \textit{queue up asynchronous system call} model''.
    32 This description is actually quite close to the interface of the interface described in the next section.
     47This description is actually quite close to the interface described in the next section.
    3348
    34 \subsection{\texttt{io\_uring}}
    35 A very recent addition to Linux, @io_uring@\cit{io\_uring} is a framework that aims to solve many of the problems listed with the above mentioned solutions.
     49\subsection{\lstinline{io_uring}}
     50A very recent addition to Linux, @io_uring@~\cite{MAN:io_uring}, is a framework that aims to solve many of the problems listed in the above interfaces. Like AIO, it represents \io operations as entries added to a queue. But like @epoll@, new requests can be submitted while a blocking call waiting for requests to complete is already in progress. The @io_uring@ interface uses two ring buffers (referred to simply as rings) at its core: a submit ring to which programmers push \io requests and a completion ring from which programmers poll for completion.
     51
     52One of the big advantages over the prior interfaces is that @io_uring@ also supports a much wider range of operations. In addition to supporting reads and writes to any file descriptor like AIO, it supports other operations like @open@, @close@, @fsync@, @accept@, @connect@, @send@, @recv@, @splice@, \etc.
     53
     54On top of these, @io_uring@ adds many extras like avoiding copies between the kernel and user-space using shared memory, allowing different mechanisms to communicate with device drivers, and supporting chains of requests, \ie, requests that automatically trigger followup requests on completion.
    3655
    3756\subsection{Extra Kernel Threads}\label{io:morethreads}
    38 Finally, if the operating system does not offer any satisfying forms of asynchronous \glsxtrshort{io} operations, a solution is to fake it by creating a pool of \glspl{kthrd} and delegating operations to them in order to avoid blocking \glspl{proc}.
     57Finally, if the operating system does not offer a satisfactory form of asynchronous \io operations, an ad-hoc solution is to create a pool of \glspl{kthrd} and delegate operations to it to avoid blocking \glspl{proc}, which is a compromise for multiplexing. In the worst case, where all \glspl{thrd} are consistently blocking on \io, it devolves into 1-to-1 threading. However, regardless of the frequency of \io operations, it achieves the fundamental goal of not blocking \glspl{proc} when \glspl{thrd} are ready to run. This approach is used by languages like Go\cit{Go} and frameworks like libuv\cit{libuv}, since it has the advantage that it can easily be used across multiple operating systems. This advantage is especially relevant for languages like Go, which offer a homogeneous \glsxtrshort{api} across all platforms. This contrasts with C, which has a very limited standard \glsxtrshort{api} for \io, \eg, the C standard library has no networking.
    3958
    4059\subsection{Discussion}
     60These options effectively fall into two broad camps: waiting for \io to be ready versus waiting for \io to complete. All operating systems that support asynchronous \io must offer an interface along one of these lines, but the details vary drastically. For example, FreeBSD offers @kqueue@~\cite{MAN:bsd/kqueue}, which behaves similarly to @epoll@, but with some small quality-of-use improvements, while Windows (Win32)~\cit{https://docs.microsoft.com/en-us/windows/win32/fileio/synchronous-and-asynchronous-i-o} offers ``overlapped I/O'', which handles submissions similarly to @O_NONBLOCK@ with extra flags on the synchronous system call, but waits for completion events, similarly to @io_uring@.
    4161
     62For this project, I selected @io_uring@, in large part because of its generality. While @epoll@ has been shown to be a good solution for socket \io~\cite{DBLP:journals/pomacs/KarstenB20}, @io_uring@'s transparent support for files, pipes, and more complex operations, like @splice@ and @tee@, makes it a better choice as the foundation for a general \io subsystem.
    4263
    4364\section{Event-Engine}
     65An event engine's responsibility is to use the kernel interface to multiplex many \io operations onto few \glspl{kthrd}. In concrete terms, this means \glspl{thrd} enter the engine through an interface, the event engine then starts the operation and parks the calling \glspl{thrd}, returning control to the \gls{proc}. The parked \glspl{thrd} are then rescheduled by the event engine once the desired operation has completed.
     66
     67\subsection{\lstinline{io_uring} in depth}
     68Before going into details on the design of my event engine, more details on @io_uring@ usage are presented, each important in the design of the engine.
     69Figure~\ref{fig:iouring} shows an overview of an @io_uring@ instance.
     70Two ring buffers are used to communicate with the kernel: one for submissions~(left) and one for completions~(right).
     71The submission ring contains entries, \newterm{Submit Queue Entries} (SQE), produced (appended) by the application when an operation starts and then consumed by the kernel.
     72The completion ring contains entries, \newterm{Completion Queue Entries} (CQE), produced (appended) by the kernel when an operation completes and then consumed by the application.
     73The submission ring contains indexes into the SQE array (denoted \emph{S}) containing entries describing the I/O operation to start;
     74the completion ring contains entries for the completed I/O operation.
     75Multiple @io_uring@ instances can be created, in which case they each have a copy of the data structures in the figure.
     76
     77\begin{figure}
     78        \centering
     79        \input{io_uring.pstex_t}
     80        \caption{Overview of \lstinline{io_uring}}
     81%       \caption[Overview of \lstinline{io_uring}]{Overview of \lstinline{io_uring} \smallskip\newline Two ring buffer are used to communicate with the kernel, one for completions~(right) and one for submissions~(left). The completion ring contains entries, \newterm{CQE}s: Completion Queue Entries, that are produced by the kernel when an operation completes and then consumed by the application. On the other hand, the application produces \newterm{SQE}s: Submit Queue Entries, which it appends to the submission ring for the kernel to consume. Unlike the completion ring, the submission ring does not contain the entries directly, it indexes into the SQE array (denoted \emph{S}) instead.}
     82        \label{fig:iouring}
     83\end{figure}
     84
     85New \io operations are submitted to the kernel following 4 steps, which use the components shown in the figure.
     86\begin{enumerate}
     87\item
     88An SQE is allocated from the pre-allocated array (denoted \emph{S} in Figure~\ref{fig:iouring}). This array is created at the same time as the @io_uring@ instance, is in kernel-locked memory visible by both the kernel and the application, and has a fixed size determined at creation. How these entries are allocated is not important for the functioning of @io_uring@, the only requirement is that no entry is reused before the kernel has consumed it.
     89\item
     90The SQE is filled according to the desired operation. This step is straightforward; the only detail worth mentioning is that SQEs have a @user_data@ field that must be filled in order to match submission and completion entries.
     91\item
     92The SQE is submitted to the submission ring by appending the index of the SQE to the ring following regular ring buffer steps: \lstinline{buffer[head] = item; head++}. Since the head is visible to the kernel, some memory barriers may be required to prevent the compiler from reordering these operations. Since the submission ring is a regular ring buffer, more than one SQE can be added at once and the head is updated only after all entries are updated.
     93\item
     94The kernel is notified of the change to the ring using the system call @io_uring_enter@. The number of elements appended to the submission ring is passed as a parameter and the number of elements consumed is returned. The @io_uring@ instance can be constructed so this step is not required, but this requires elevated privilege.% and an early version of @io_uring@ had additional restrictions.
     95\end{enumerate}
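
The four steps above can be sketched with a small user-space model in C. This is not the kernel interface: the types @sqe_model@ and @sq_model@ and the function @enter_model@ are hypothetical stand-ins for the real SQE layout and for @io_uring_enter@, and the ring size is an arbitrary assumption. The model uses the neutral names @produced@ and @consumed@ for the two ring indexes (in the kernel's own structures, the index advanced by the application on the submission ring is named @tail@). The release store in @sq_push@ plays the role of the memory barrier mentioned in step 3.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define ENTRIES 8u                      /* assumed size, a power of two */

struct sqe_model { uint8_t opcode; int fd; uint64_t user_data; };

struct sq_model {
    struct sqe_model sqes[ENTRIES];     /* the pre-allocated array S */
    uint32_t ring[ENTRIES];             /* submission ring: indexes into sqes */
    _Atomic uint32_t produced;          /* advanced by the application */
    _Atomic uint32_t consumed;          /* advanced by the (modelled) kernel */
};

/* Step 3: append the index of a filled SQE, then publish it.
 * Returns 0 on success, -1 if the ring is full. */
static int sq_push(struct sq_model *sq, uint32_t idx) {
    uint32_t p = atomic_load_explicit(&sq->produced, memory_order_relaxed);
    uint32_t c = atomic_load_explicit(&sq->consumed, memory_order_acquire);
    if (p - c == ENTRIES) return -1;
    sq->ring[p & (ENTRIES - 1)] = idx;
    /* release: the entry must be visible before the index moves */
    atomic_store_explicit(&sq->produced, p + 1, memory_order_release);
    return 0;
}

/* Stand-in for step 4: consume up to to_submit entries, recording the
 * user_data of each in seen[]. Returns the number of entries consumed. */
static uint32_t enter_model(struct sq_model *sq, uint32_t to_submit, uint64_t *seen) {
    uint32_t c = atomic_load_explicit(&sq->consumed, memory_order_relaxed);
    uint32_t p = atomic_load_explicit(&sq->produced, memory_order_acquire);
    uint32_t n = 0;
    while (n < to_submit && c != p) {
        seen[n++] = sq->sqes[sq->ring[c & (ENTRIES - 1)]].user_data;
        c++;
    }
    atomic_store_explicit(&sq->consumed, c, memory_order_release);
    return n;
}
```

Steps 1 and 2 correspond to choosing a slot in @sqes@ and assigning its fields. Because the ring stores indexes rather than entries, SQEs can be submitted in a different order than they were allocated.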
     96
     97\begin{sloppypar}
     98The completion side is simpler: applications call @io_uring_enter@ with the flag @IORING_ENTER_GETEVENTS@ to wait on a desired number of operations to complete. The same call can be used to both submit SQEs and wait for operations to complete. When operations do complete, the kernel appends a CQE to the completion ring and advances the head of the ring. Each CQE contains the result of the operation as well as a copy of the @user_data@ field of the SQE that triggered the operation. It is not necessary to call @io_uring_enter@ to get new events because the kernel can directly modify the completion ring. The system call is only needed if the application wants to block waiting for operations to complete.
     99\end{sloppypar}
     100
     101The @io_uring_enter@ system call is protected by a lock inside the kernel. This protection means that concurrent calls to @io_uring_enter@ using the same instance are possible, but there is no performance gained from parallel calls to @io_uring_enter@. It is possible to do the first three submission steps in parallel, however, doing so requires careful synchronization.
     102
     103@io_uring@ also introduces constraints on the number of simultaneous operations that can be ``in flight''. Obviously, SQEs are allocated from a fixed-size array, meaning that there is a hard limit to how many SQEs can be submitted at once. In addition, the @io_uring_enter@ system call can fail because ``The  kernel [...] ran out of resources to handle [a request]'' or ``The application is attempting to overcommit the number of requests it can  have  pending.''. This restriction means \io request bursts may have to be subdivided and submitted in chunks at a later time.
     104
     105\subsection{Multiplexing \io: Submission}
     106The submission side is the most complicated aspect of @io_uring@ and its design largely dictates the completion side.
     107
     108While it is possible to do the first steps of submission in parallel, the duration of the system call scales with the number of entries submitted. The consequence is that the amount of parallelism used to prepare submissions for the next system call is limited. Beyond this limit, the length of the system call is the throughput limiting factor. I concluded from early experiments that preparing submissions seems to take about as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}. Therefore the design of the submission engine must manage multiple instances of @io_uring@ running in parallel, effectively sharding @io_uring@ instances. Similarly to scheduling, this sharding can be done privately, \ie, one instance per \gls{proc}, or in decoupled pools, \ie, a pool of \glspl{proc} uses a pool of @io_uring@ instances without one-to-one coupling between any given instance and any given \gls{proc}.
     109
     110\subsubsection{Pool of Instances}
     111One approach is to have multiple shared instances. \Glspl{thrd} attempting \io operations pick one of the available instances and submit operations to that instance. Since the completion will be sent to the same instance, all instances with pending operations must be polled continuously\footnote{As will be described in Chapter~\ref{practice}, this does not translate into constant CPU usage.}. Since there is no coupling between \glspl{proc} and @io_uring@ instances in this approach, \glspl{thrd} running on more than one \gls{proc} can attempt to submit to the same instance concurrently. Since @io_uring@ effectively sets the amount of sharding needed to avoid contention on its internal locks, performance in this approach is based on two aspects: the synchronization needed to submit does not induce more contention than @io_uring@ already does and the scheme to route \io requests to specific @io_uring@ instances does not introduce contention. This second aspect has an oversized importance because it comes into play before the sharding of instances, and as such, all \glspl{hthrd} can contend on the routing algorithm.
     112
     113Allocation in this scheme can be handled fairly easily. Free SQEs, \ie, SQEs that are not currently being used to represent a request, can be written to safely and have a field called @user_data@ which the kernel only reads to copy to CQEs. Allocation also requires no ordering guarantee as all free SQEs are interchangeable. Therefore, allocation only requires a simple concurrent bag. The only added complexity is that the number of SQEs is fixed, which means allocation can fail. This failure needs to be pushed up to the routing algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without any available SQEs. Ideally, the routing algorithm would block operations up-front if none of the instances have available SQEs.
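
One possible shape for such a concurrent bag is an atomic bitmap, sketched below in C. This is an illustrative assumption and not necessarily the structure used in the \CFA runtime: the type @sqe_bag@ and its functions are hypothetical, the sketch supports at most 64 SQEs per instance, and @__builtin_ctzll@ is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Bit i set means SQE i is free. */
struct sqe_bag { _Atomic uint64_t free_mask; };

/* Mark the first n SQEs (1 <= n <= 64) as free. */
static void bag_init(struct sqe_bag *b, unsigned n) {
    atomic_init(&b->free_mask, n == 64 ? ~0ULL : ((1ULL << n) - 1));
}

/* Take any free SQE. Returns its index, or -1 when none is available,
 * which is the failure that must be reported to the routing algorithm. */
static int bag_alloc(struct sqe_bag *b) {
    uint64_t m = atomic_load(&b->free_mask);
    while (m != 0) {
        int i = __builtin_ctzll(m);   /* lowest free index */
        if (atomic_compare_exchange_weak(&b->free_mask, &m, m & ~(1ULL << i)))
            return i;
        /* on failure, m was reloaded with the current mask; retry */
    }
    return -1;
}

/* Return SQE i to the bag once the kernel has consumed it. */
static void bag_free(struct sqe_bag *b, int i) {
    atomic_fetch_or(&b->free_mask, 1ULL << i);
}
```

Since free SQEs are interchangeable, the sketch makes no attempt to be fair or ordered; it only guarantees that an index is handed to one allocator at a time.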
     114
     115Once an SQE is allocated, \glspl{thrd} can fill it normally; they simply need to keep track of the SQE index and which instance it belongs to.
     116
     117Once an SQE is filled in, it must be added to the submission ring buffer, an operation that is not thread-safe by itself, and the kernel must be notified using the @io_uring_enter@ system call. The submission ring buffer is the same size as the pre-allocated SQE buffer, therefore pushing to the ring buffer cannot fail\footnote{This is because it is invalid to have the same \lstinline{sqe} multiple times in the ring buffer.}. However, as mentioned, the system call itself can fail with the expectation that it will be retried once some of the already submitted operations complete. Since multiple SQEs can be submitted to the kernel at once, it is important to strike a balance between batching and latency. Operations that are ready to be submitted should be batched together in few system calls, but at the same time, operations should not be left pending for long periods of time before being submitted. This can be handled by either designating one of the submitting \glspl{thrd} as being responsible for the system call for the current batch of SQEs or by having some other party regularly submitting all ready SQEs, \eg, the poller \gls{thrd} mentioned later in this section.
     118
     119In the case of designating a \gls{thrd}, ideally, when multiple \glspl{thrd} attempt to submit operations to the same @io_uring@ instance, all requests would be batched together and one of the \glspl{thrd}, referred to as the \newterm{submitter}, would do the system call on behalf of the others. In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a current submitter and a next submitter. Indeed, as long as there is a ``next'' submitter, \glspl{thrd} submitting new \io requests can move on, knowing that some future system call will include their request. Once the system call is done, the submitter must also free SQEs so that the allocator can reuse them.
     120
     121Finally, the completion side is much simpler since the @io_uring@ system call enforces a natural synchronization point. Polling simply needs to regularly do the system call, go through the produced CQEs and communicate the result back to the originating \glspl{thrd}. Since CQEs only own a signed 32-bit result, in addition to the copy of the @user_data@ field, all that is needed to communicate the result is a simple future~\cite{wiki:future}. If the submission side does not designate submitters, polling can also submit all SQEs as it is polling events.  A simple approach to polling is to allocate a \gls{thrd} per @io_uring@ instance and simply let the poller \glspl{thrd} poll their respective instances when scheduled. This design is especially convenient for reasons explained in Chapter~\ref{practice}.
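
A minimal sketch of this future-based hand-off follows, in plain C. It assumes the address of the future is stored in @user_data@ at submission time; the types @io_future@ and @cqe_model@ and all function names are hypothetical, and parking and unparking of \glspl{thrd} is replaced by a simple flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One future per in-flight operation, owned by the waiting thread. */
struct io_future { _Atomic int done; int32_t result; };

/* Model of a CQE: the copied user_data plus a signed 32-bit result. */
struct cqe_model { uint64_t user_data; int32_t res; };

/* Value to store in the SQE's user_data field at submission time. */
static uint64_t future_to_user_data(struct io_future *f) {
    return (uint64_t)(uintptr_t)f;
}

/* Poller side: walk the produced CQEs and fulfil the matching futures. */
static void drain_completions(const struct cqe_model *cqes, unsigned count) {
    for (unsigned i = 0; i < count; i++) {
        struct io_future *f = (struct io_future *)(uintptr_t)cqes[i].user_data;
        f->result = cqes[i].res;
        /* release: result must be visible before done is observed */
        atomic_store_explicit(&f->done, 1, memory_order_release);
        /* a real runtime would unpark the waiting thread here */
    }
}

/* Waiter side: returns 1 and stores the result in *out once fulfilled. */
static int future_poll(struct io_future *f, int32_t *out) {
    if (!atomic_load_explicit(&f->done, memory_order_acquire)) return 0;
    *out = f->result;
    return 1;
}
```

Completions may arrive in any order; matching through @user_data@ means the poller never needs to know which \gls{thrd} issued which request.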
     122
     123With this pool of instances approach, the big advantage is that it is fairly flexible. It does not impose restrictions on what \glspl{thrd} submitting \io operations can and cannot do between allocations and submissions. It also can gracefully handle running out of resources, SQEs or the kernel returning @EBUSY@. The down side to this is that many of the steps used for submitting need complex synchronization to work properly. The routing and allocation algorithm needs to keep track of which ring instances have available SQEs, block incoming requests if no instance is available, prevent barging if \glspl{thrd} are already queued up waiting for SQEs and handle SQEs being freed. The submission side needs to safely append SQEs to the ring buffer, make sure no SQE is dropped or left pending forever, notify the allocation side when SQEs can be reused and handle the kernel returning @EBUSY@. Sharding the @io_uring@ instances should alleviate much of the contention caused by this, but all this synchronization may still have non-zero cost.
     124
     125\subsubsection{Private Instances}
     126Another approach is to simply create one ring instance per \gls{proc}. This alleviates the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps. This is effectively the same requirement as using @thread_local@ variables. Since SQEs that are allocated must be submitted to the same ring, on the same \gls{proc}, this effectively forces the application to submit SQEs in allocation order\footnote{The actual requirement is that \glspl{thrd} cannot context switch between allocation and submission. This requirement means that from the subsystem's point of view, the allocation and submission are sequential. To remove this requirement, a \gls{thrd} would need the ability to ``yield to a specific \gls{proc}'', \ie, park with the promise that it will be run next on a specific \gls{proc}, the \gls{proc} attached to the correct ring. This is not a current or planned feature of \CFA.}, greatly simplifying both allocation and submission. In this design, allocation and submission form a partitioned ring buffer as shown in Figure~\ref{fig:pring}. Once added to the ring buffer, the attached \gls{proc} has a significant amount of flexibility with regards to when to do the system call. Possible options are: when the \gls{proc} runs out of \glspl{thrd} to run, after running a given number of \glspl{thrd}, \etc.
     127
     128\begin{figure}
     129        \centering
     130        \input{pivot_ring.pstex_t}
     131        \caption[Partitioned ring buffer]{Partitioned ring buffer \smallskip\newline Allocated SQEs are appended to the first partition. When submitting, the partition is simply advanced to include all the SQEs that should be submitted. The kernel considers the partition as the head of the ring.}
     132        \label{fig:pring}
     133\end{figure}
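
The partitioned ring buffer described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the \CFA runtime's code: the type @pring@, its fields, the capacity @RING_SIZE@, and the functions @pring_alloc@, @pring_submit@ and @pring_consume@ are all hypothetical names, and plain integers stand in for real SQEs. The sketch uses no atomic operations because it assumes a single owning \gls{proc} that is not interrupted between allocation and submission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical capacity, a power of two */

/* Hypothetical single-owner ring; indices only ever increase and are
   reduced modulo RING_SIZE when used to address the array. */
struct pring {
    uint32_t consumed; /* entries before this index were taken by the kernel */
    uint32_t ready;    /* partition point: [consumed, ready) is submitted */
    uint32_t alloc;    /* [ready, alloc) is allocated but not yet submitted */
    int sqes[RING_SIZE]; /* stand-in payload for real SQEs */
};

/* Append one entry to the allocated partition; false if the ring is full. */
static bool pring_alloc(struct pring *r, int payload) {
    if (r->alloc - r->consumed == RING_SIZE) return false;
    r->sqes[r->alloc % RING_SIZE] = payload;
    r->alloc++;
    return true;
}

/* Submit by advancing the partition point over everything allocated so far.
   Returns the number of entries newly made visible. */
static uint32_t pring_submit(struct pring *r) {
    uint32_t n = r->alloc - r->ready;
    r->ready = r->alloc;
    return n;
}

/* Model the kernel consuming up to n submitted entries, freeing their slots. */
static uint32_t pring_consume(struct pring *r, uint32_t n) {
    uint32_t avail = r->ready - r->consumed;
    if (n > avail) n = avail;
    r->consumed += n;
    return n;
}
```

Because submission only moves the partition point, entries necessarily become visible in allocation order, which is the restriction this design places on \glspl{thrd}.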
     134
     135This approach has the advantage that it does not require much of the synchronization needed in the shared approach. This comes at the cost that \glspl{thrd} submitting \io operations have less flexibility: they cannot park or yield, and several exceptional cases are handled poorly. Instances running out of SQEs cannot run \glspl{thrd} wanting to do \io operations. In such a case, the \gls{thrd} needs to be moved to a different \gls{proc}, and the only current way of achieving this is to @yield()@ hoping to be scheduled on a different \gls{proc}, which is not guaranteed. Another problematic case is that \glspl{thrd} that do not park for long periods of time delay the submission of any SQE not already submitted. This issue is similar to the fairness issues of work-stealing schedulers mentioned in the previous chapter.
     136
    44137
    45138
    46139\section{Interface}
     140Finally, the last important part of the \io subsystem is its interface. There are multiple approaches that can be offered to programmers, each with advantages and disadvantages. The new \io subsystem can replace the C runtime's API or extend it. In the latter case, the interface can range from very similar to vastly different. The following sections discuss some useful options using @read@ as an example. The standard Linux interface for C is:
     141
     142@ssize_t read(int fd, void *buf, size_t count);@.
     143
     144\subsection{Replacement}
     145Replacing the C \glsxtrshort{api}
     146
     147\subsection{Synchronous Extension}
     148
     149\subsection{Asynchronous Extension}
     150
     151\subsection{Interface directly to \lstinline{io_uring}}
  • doc/theses/thierry_delisle_PhD/thesis/text/runtime.tex

    r5e99a9a r95b3a9c  
    1111
    1212\section{Clusters}
    13 \CFA allows the option to group user-level threading, in the form of clusters. Both \glspl{thrd} and \glspl{proc} belong to a specific cluster. \Glspl{thrd} are only be scheduled onto \glspl{proc} in the same cluster and scheduling is done independently of other clusters. Figure~\ref{fig:system} shows an overview of the \CFA runtime, which allows programmers to tightly control parallelism. It also opens the door to handling effects like NUMA, by pining clusters to a specific NUMA node\footnote{This is not currently implemented in \CFA, but the only hurdle left is creating a generic interface for cpu masks.}.
     13\CFA allows the option to group user-level threading, in the form of clusters. Both \glspl{thrd} and \glspl{proc} belong to a specific cluster. \Glspl{thrd} are only scheduled onto \glspl{proc} in the same cluster and scheduling is done independently of other clusters. Figure~\ref{fig:system} shows an overview of the \CFA runtime, which allows programmers to tightly control parallelism. It also opens the door to handling effects like NUMA, by pinning clusters to a specific NUMA node\footnote{This is not currently implemented in \CFA, but the only hurdle left is creating a generic interface for CPU masks.}.
    1414
    1515\begin{figure}
     
    2525
    2626\section{\glsxtrshort{io}}\label{prev:io}
    27 Prior to this work, the \CFA runtime did not add any particular support for \glsxtrshort{io} operations. %\CFA being built on C, this means that,
    28 While all I/O operations available in C are available in \CFA, \glsxtrshort{io} operations are designed for the POSIX threading model~\cite{pthreads}. Using these 1:1 threading operations in an M:N threading model means I/O operations block \glspl{proc} instead of \glspl{thrd}. While this can work in certain cases, it limits the number of concurrent operations to the number of \glspl{proc} rather than \glspl{thrd}. It also means deadlock can occur because all \glspl{proc} are blocked even if at least one \gls{thrd} is ready to run. A simple example of this type of deadlock would be as follows:
     27Prior to this work, the \CFA runtime did not add any particular support for \glsxtrshort{io} operations. While all \glsxtrshort{io} operations available in C are available in \CFA, \glsxtrshort{io} operations are designed for the POSIX threading model~\cite{pthreads}. Using these 1:1 threading operations in an M:N threading model means \glsxtrshort{io} operations block \glspl{proc} instead of \glspl{thrd}. While this can work in certain cases, it limits the number of concurrent operations to the number of \glspl{proc} rather than \glspl{thrd}. It also means deadlock can occur because all \glspl{proc} are blocked even if at least one \gls{thrd} is ready to run. A simple example of this type of deadlock would be as follows:
     28
    2929\begin{quote}
    3030Given a simple network program with 2 \glspl{thrd} and a single \gls{proc}, one \gls{thrd} sends network requests to a server and the other \gls{thrd} waits for a response from the server. If the second \gls{thrd} races ahead, it may wait for responses to requests that have not been sent yet. In theory, this should not be a problem, even if the second \gls{thrd} waits, because the first \gls{thrd} is still ready to run and should be able to get CPU time to send the request. With M:N threading, while the first \gls{thrd} is ready, the lone \gls{proc} \emph{cannot} run the first \gls{thrd} if it is blocked in the \glsxtrshort{io} operation of the second \gls{thrd}. If this happens, the system is in a synchronization deadlock\footnote{In this example, the deadlock could be resolved if the server sends unprompted messages to the client. However, this solution is not general and may not be appropriate even in this simple case.}.
    3131\end{quote}
    32 Therefore, one of the objective of this work is to introduce \emph{User-Level \glsxtrshort{io}}, like \glslink{uthrding}{User-Level \emph{Threading}} blocks \glspl{thrd} rather than \glspl{proc} when doing \glsxtrshort{io} operations, which entails multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc}. This multiplexing requires that a single \gls{proc} be able to execute multiple I/O operations in parallel. This requirement cannot be done with operations that block \glspl{proc}, \ie \glspl{kthrd}, since the first operation would prevent starting new operations for its blocking duration. Executing I/O operations in parallel requires \emph{asynchronous} \glsxtrshort{io}, sometimes referred to as \emph{non-blocking}, since the \gls{kthrd} does not block.
    3332
    34 \section{Interoperating with C}
     33Therefore, one of the objectives of this work is to introduce \emph{User-Level \glsxtrshort{io}}, which, like \glslink{uthrding}{User-Level \emph{Threading}}, blocks \glspl{thrd} rather than \glspl{proc} when doing \glsxtrshort{io} operations, which entails multiplexing the \glsxtrshort{io} operations of many \glspl{thrd} onto fewer \glspl{proc}. This multiplexing requires that a single \gls{proc} be able to execute multiple \glsxtrshort{io} operations in parallel. This requirement cannot be met with operations that block \glspl{proc}, \ie \glspl{kthrd}, since the first operation would prevent starting new operations for its blocking duration. Executing \glsxtrshort{io} operations in parallel requires \emph{asynchronous} \glsxtrshort{io}, sometimes referred to as \emph{non-blocking}, since the \gls{kthrd} does not block.
     34
     35\section{Interoperating with \texttt{C}}
    3536While \glsxtrshort{io} operations are the classical example of operations that block \glspl{kthrd}, the non-blocking challenge extends to all blocking system-calls. The POSIX standard states~\cite[\S~2.9.1]{POSIX17}:
    3637\begin{quote}
     
    4445\begin{enumerate}
    4546        \item Precisely identifying blocking C calls is difficult.
    46         \item Introducing new code can have a significant impact on general performance.
     47        \item Introducing control points code can have a significant impact on general performance.
    4748\end{enumerate}
    48 Because of these consequences, this work does not attempt to ``sandbox'' calls to C. Therefore, it is possible for an unidentified library calls to block a \gls{kthrd} leading to deadlocks in \CFA's M:N threading model, which would not occur in a traditional 1:1 threading model. Currently, all M:N thread systems interacting with UNIX without sandboxing suffer from this problem but manage to work very well in the majority of applications. Therefore, a complete solution to this problem is outside the scope of this thesis.
     49Because of these consequences, this work does not attempt to ``sandbox'' calls to C. Therefore, it is possible calls from an unidentified library will block a \gls{kthrd} leading to deadlocks in \CFA's M:N threading model, which would not occur in a traditional 1:1 threading model. Currently, all M:N thread systems interacting with UNIX without sandboxing suffer from this problem but manage to work very well in the majority of applications. Therefore, a complete solution to this problem is outside the scope of this thesis.
  • doc/theses/thierry_delisle_PhD/thesis/thesis.tex

    r5e99a9a r95b3a9c  
    1 % uWaterloo Thesis Template for LaTeX
    2 % Last Updated June 14, 2017 by Stephen Carr, IST Client Services
    3 % FOR ASSISTANCE, please send mail to rt-IST-CSmathsci@ist.uwaterloo.ca
    4 
    5 % Effective October 2006, the University of Waterloo
    6 % requires electronic thesis submission. See the uWaterloo thesis regulations at
     1%======================================================================
     2% University of Waterloo Thesis Template for LaTeX
     3% Last Updated November, 2020
     4% by Stephen Carr, IST Client Services,
     5% University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada
     6% FOR ASSISTANCE, please send mail to request@uwaterloo.ca
     7
     8% DISCLAIMER
     9% To the best of our knowledge, this template satisfies the current uWaterloo thesis requirements.
     10% However, it is your responsibility to assure that you have met all requirements of the University and your particular department.
     11
     12% Many thanks for the feedback from many graduates who assisted the development of this template.
     13% Also note that there are explanatory comments and tips throughout this template.
     14%======================================================================
     15% Some important notes on using this template and making it your own...
     16
     17% The University of Waterloo has required electronic thesis submission since October 2006.
     18% See the uWaterloo thesis regulations at
    719% https://uwaterloo.ca/graduate-studies/thesis.
    8 
    9 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package
    10 % configuration below. THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT.
    11 % You can view the information if you view Properties of the PDF document.
    12 
    13 % Many faculties/departments also require one or more printed
    14 % copies. This template attempts to satisfy both types of output.
    15 % It is based on the standard "book" document class which provides all necessary
    16 % sectioning structures and allows multi-part theses.
    17 
    18 % DISCLAIMER
    19 % To the best of our knowledge, this template satisfies the current uWaterloo requirements.
    20 % However, it is your responsibility to assure that you have met all
    21 % requirements of the University and your particular department.
    22 % Many thanks for the feedback from many graduates that assisted the development of this template.
    23 
    24 % -----------------------------------------------------------------------
    25 
    26 % By default, output is produced that is geared toward generating a PDF
    27 % version optimized for viewing on an electronic display, including
    28 % hyperlinks within the PDF.
    29 
     20% This thesis template is geared towards generating a PDF version optimized for viewing on an electronic display, including hyperlinks within the PDF.
     21
     22% DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package configuration below.
     23% THIS INFORMATION GETS EMBEDDED IN THE FINAL PDF DOCUMENT.
     24% You can view the information if you view properties of the PDF document.
     25
     26% Many faculties/departments also require one or more printed copies.
     27% This template attempts to satisfy both types of output.
     28% See additional notes below.
     29% It is based on the standard "book" document class which provides all necessary sectioning structures and allows multi-part theses.
     30
     31% If you are using this template in Overleaf (cloud-based collaboration service), then it is automatically processed and previewed for you as you edit.
     32
     33% For people who prefer to install their own LaTeX distributions on their own computers, and process the source files manually, the following notes provide the sequence of tasks:
     34 
    3035% E.g. to process a thesis called "mythesis.tex" based on this template, run:
    3136
    3237% pdflatex mythesis     -- first pass of the pdflatex processor
    3338% bibtex mythesis       -- generates bibliography from .bib data file(s)
    34 % makeindex         -- should be run only if an index is used
     39% makeindex         -- should be run only if an index is used 
    3540% pdflatex mythesis     -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc.
    36 % pdflatex mythesis     -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc.
    37 
    38 % If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex
    39 % file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu).
    40 % Then click the PDFLaTeX button two more times. If you have an index as well,
    41 % you'll need to run MakeIndex from the Tools menu as well, before running pdflatex
     41% pdflatex mythesis     -- it takes a couple of passes to completely process all cross-references
     42
     43% If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu).
     44% Then click the PDFLaTeX button two more times.
     45% If you have an index as well,you'll need to run MakeIndex from the Tools menu as well, before running pdflatex
    4246% the last two times.
    4347
    44 % N.B. The "pdftex" program allows graphics in the following formats to be
    45 % included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF
    46 % Tip 1: Generate your figures and photos in the size you want them to appear
    47 % in your thesis, rather than scaling them with \includegraphics options.
    48 % Tip 2: Any drawings you do should be in scalable vector graphic formats:
    49 % SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in
    50 % the final PDF as well.
    51 % Tip 3: Photographs should be cropped and compressed so as not to be too large.
    52 
    53 % To create a PDF output that is optimized for double-sided printing:
    54 %
    55 % 1) comment-out the \documentclass statement in the preamble below, and
    56 % un-comment the second \documentclass line.
    57 %
    58 % 2) change the value assigned below to the boolean variable
    59 % "PrintVersion" from "false" to "true".
    60 
    61 % --------------------- Start of Document Preamble -----------------------
    62 
    63 % Specify the document class, default style attributes, and page dimensions
     48% N.B. The "pdftex" program allows graphics in the following formats to be included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF
     49% Tip: Generate your figures and photos in the size you want them to appear in your thesis, rather than scaling them with \includegraphics options.
     50% Tip: Any drawings you do should be in scalable vector graphic formats: SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the final PDF as well.
     51% Tip: Photographs should be cropped and compressed so as not to be too large.
     52
     53% To create a PDF output that is optimized for double-sided printing:
     54% 1) comment-out the \documentclass statement in the preamble below, and un-comment the second \documentclass line.
     55% 2) change the value assigned below to the boolean variable "PrintVersion" from " false" to "true".
     56
     57%======================================================================
     58%   D O C U M E N T   P R E A M B L E
     59% Specify the document class, default style attributes, and page dimensions, etc.
    6460% For hyperlinked PDF, suitable for viewing on a computer, use this:
    6561\documentclass[letterpaper,12pt,titlepage,oneside,final]{book}
    6662
    67 % For PDF, suitable for double-sided printing, change the PrintVersion variable below
    68 % to "true" and use this \documentclass line instead of the one above:
     63% For PDF, suitable for double-sided printing, change the PrintVersion variable below to "true" and use this \documentclass line instead of the one above:
    6964%\documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book}
    7065
    71 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the
    72     % print-optimized version will ignore \href tags (redefined by hyperref pkg).
     66% Some LaTeX commands I define for my own nomenclature.
     67% If you have to, it's easier to make changes to nomenclature once here than in a million places throughout your thesis!
     68\newcommand{\package}[1]{\textbf{#1}} % package names in bold text
     69\newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font
     70\newcommand{\href}[1]{#1} % does nothing, but defines the command so the print-optimized version will ignore \href tags (redefined by hyperref pkg).
     71%\newcommand{\texorpdfstring}[2]{#1} % does nothing, but defines the command
     72% Anything defined here may be redefined by packages added below...
    7373
    7474% This package allows if-then-else control structures.
     
    7676\newboolean{PrintVersion}
    7777\setboolean{PrintVersion}{false}
    78 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies
    79 % by overriding some options of the hyperref package below.
     78% CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies by overriding some options of the hyperref package, called below.
    8079
    8180%\usepackage{nomencl} % For a nomenclature (optional; available from ctan.org)
    8281\usepackage{amsmath,amssymb,amstext} % Lots of math symbols and environments
     82\usepackage{xcolor}
    8383\usepackage{graphicx} % For including graphics
    8484
    8585% Hyperlinks make it very easy to navigate an electronic document.
    86 % In addition, this is where you should specify the thesis title
    87 % and author as they appear in the properties of the PDF document.
     86% In addition, this is where you should specify the thesis title and author as they appear in the properties of the PDF document.
    8887% Use the "hyperref" package
    8988% N.B. HYPERREF MUST BE THE LAST PACKAGE LOADED; ADD ADDITIONAL PKGS ABOVE
    9089\usepackage[pagebackref=false]{hyperref} % with basic options
    91                 % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing.
     90%\usepackage[pdftex,pagebackref=true]{hyperref}
     91% N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing.
    9292\hypersetup{
    9393        plainpages=false,       % needed if Roman numbers in frontpages
    94         unicode=false,          % non-Latin characters in Acrobats bookmarks
    95         pdftoolbar=true,        % show Acrobats toolbar?
    96         pdfmenubar=true,        % show Acrobats menu?
     94        unicode=false,          % non-Latin characters in Acrobat's bookmarks
     95        pdftoolbar=true,        % show Acrobat's toolbar?
     96        pdfmenubar=true,        % show Acrobat's menu?
    9797        pdffitwindow=false,     % window fit to page when opened
    9898        pdfstartview={FitH},    % fits the width of the page to the window
     
    110110\ifthenelse{\boolean{PrintVersion}}{   % for improved print quality, change some hyperref options
    111111\hypersetup{    % override some previously defined hyperref options
    112         citecolor=black,
    113         filecolor=black,
    114         linkcolor=black,
     112        citecolor=black,%
     113        filecolor=black,%
     114        linkcolor=black,%
    115115        urlcolor=black
    116116}}{} % end of ifthenelse (no else)
     
    120120% although it's supposed to be in both the TeX Live and MikTeX distributions. There are also documentation and
    121121% installation instructions there.
    122 \renewcommand*{\glstextformat}[1]{\textsf{#1}}
     122\makeatletter
     123\newcommand*{\glsplainhyperlink}[2]{%
     124  \colorlet{currenttext}{.}% store current text color
     125  \colorlet{currentlink}{\@linkcolor}% store current link color
     126  \hypersetup{linkcolor=currenttext}% set link color
     127  \hyperlink{#1}{#2}%
     128  \hypersetup{linkcolor=currentlink}% reset to default
     129}
     130\let\@glslink\glsplainhyperlink
     131\makeatother
    123132
    124133\usepackage{csquotes}
     
    126135
    127136% Setting up the page margins...
    128 \setlength{\textheight}{9in}\setlength{\topmargin}{-0.45in}\setlength{\headsep}{0.25in}
     137\setlength{\textheight}{9in}
     138\setlength{\topmargin}{-0.45in}
     139\setlength{\headsep}{0.25in}
    129140% uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at the
    130 % top, bottom, and outside page edges and a 1.125 in. (81pt) gutter
    131 % margin (on binding side). While this is not an issue for electronic
    132 % viewing, a PDF may be printed, and so we have the same page layout for
    133 % both printed and electronic versions, we leave the gutter margin in.
     141% top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin (on binding side).
     142% While this is not an issue for electronic viewing, a PDF may be printed, and so we have the same page layout for both printed and electronic versions, we leave the gutter margin in.
    134143% Set margins to minimum permitted by uWaterloo thesis regulations:
    135144\setlength{\marginparwidth}{0pt} % width of margin notes
     
    140149\setlength{\evensidemargin}{0.125in} % Adds 1/8 in. to binding side of all
    141150% even-numbered pages when the "twoside" printing option is selected
    142 \setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages
    143 % when "oneside" printing is selected, and to the left of all odd-numbered
    144 % pages when "twoside" printing is selected
    145 \setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and
    146 % side margins as above
     151\setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages when "oneside" printing is selected, and to the left of all odd-numbered pages when "twoside" printing is selected
     152\setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and side margins as above
    147153\raggedbottom
    148154
    149 % The following statement specifies the amount of space between
    150 % paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount.
     155% The following statement specifies the amount of space between paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount.
    151156\setlength{\parskip}{\medskipamount}
    152157
    153 % The following statement controls the line spacing.  The default
    154 % spacing corresponds to good typographic conventions and only slight
    155 % changes (e.g., perhaps "1.2"), if any, should be made.
     158% The following statement controls the line spacing.
     159% The default spacing corresponds to good typographic conventions and only slight changes (e.g., perhaps "1.2"), if any, should be made.
    156160\renewcommand{\baselinestretch}{1} % this is the default line space setting
    157161
    158 % By default, each chapter will start on a recto (right-hand side)
    159 % page.  We also force each section of the front pages to start on
    160 % a recto page by inserting \cleardoublepage commands.
    161 % In many cases, this will require that the verso page be
    162 % blank and, while it should be counted, a page number should not be
    163 % printed.  The following statements ensure a page number is not
    164 % printed on an otherwise blank verso page.
     162% By default, each chapter will start on a recto (right-hand side) page.
     163% We also force each section of the front pages to start on a recto page by inserting \cleardoublepage commands.
     164% In many cases, this will require that the verso (left-hand) page be blank, and while it should be counted, a page number should not be printed.
     165% The following statements ensure a page number is not printed on an otherwise blank verso page.
    165166\let\origdoublepage\cleardoublepage
    166167\newcommand{\clearemptydoublepage}{%
     
    194195\input{common}
    195196\CFAStyle                                               % CFA code-style for all languages
    196 \lstset{basicstyle=\linespread{0.9}\tt}
     197\lstset{language=CFA,basicstyle=\linespread{0.9}\tt}    % CFA default language
    197198
    198199% glossary of terms to use
     
    200201\makeindex
    201202
    202 %======================================================================
    203 %   L O G I C A L    D O C U M E N T -- the content of your thesis
     203\newcommand\io{\glsxtrshort{io}\xspace}%
     204
     205%======================================================================
     206%   L O G I C A L    D O C U M E N T
     207% The logical document contains the main content of your thesis.
     208% Being a large document, it is a good idea to divide your thesis into several files, each one containing one chapter or other significant chunk of content, so you can easily shuffle things around later if desired.
    204209%======================================================================
    205210\begin{document}
    206211
    207 % For a large document, it is a good idea to divide your thesis
    208 % into several files, each one containing one chapter.
    209 % To illustrate this idea, the "front pages" (i.e., title page,
    210 % declaration, borrowers' page, abstract, acknowledgements,
    211 % dedication, table of contents, list of tables, list of figures,
    212 % nomenclature) are contained within the file "uw-ethesis-frontpgs.tex" which is
    213 % included into the document by the following statement.
    214212%----------------------------------------------------------------------
    215213% FRONT MATERIAL
     214% title page,declaration, borrowers' page, abstract, acknowledgements,
     215% dedication, table of contents, list of tables, list of figures, nomenclature, etc.
    216216%----------------------------------------------------------------------
    217217\input{text/front.tex}
    218218
    219 
    220219%----------------------------------------------------------------------
    221220% MAIN BODY
    222 %----------------------------------------------------------------------
    223 % Because this is a short document, and to reduce the number of files
    224 % needed for this template, the chapters are not separate
    225 % documents as suggested above, but you get the idea. If they were
    226 % separate documents, they would each start with the \chapter command, i.e,
    227 % do not contain \documentclass or \begin{document} and \end{document} commands.
     221% We suggest using a separate file for each chapter of your thesis.
     222% Start each chapter file with the \chapter command.
     223% Only use \documentclass or \begin{document} and \end{document} commands in this master document.
     224% Tip: Putting each sentence on a new line is a way to simplify later editing.
     225%----------------------------------------------------------------------
     226
    228227\part{Introduction}
    229228\input{text/intro.tex}
     
    232231\part{Design}
    233232\input{text/core.tex}
     233\input{text/io.tex}
    234234\input{text/practice.tex}
    235 \input{text/io.tex}
    236235\part{Evaluation}
    237236\label{Evaluation}
     
    243242%----------------------------------------------------------------------
    244243% END MATERIAL
    245 %----------------------------------------------------------------------
    246 
    247 % B I B L I O G R A P H Y
    248 % -----------------------
    249 
    250 % The following statement selects the style to use for references.  It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels.
     244% Bibliography, Appendices, Index, etc.
     245%----------------------------------------------------------------------
     246
     247% Bibliography
     248
     249% The following statement selects the style to use for references.
     250% It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels.
    251251\bibliographystyle{plain}
    252252% This specifies the location of the file containing the bibliographic information.
    253 % It assumes you're using BibTeX (if not, why not?).
    254 \cleardoublepage % This is needed if the book class is used, to place the anchor in the correct page,
    255                  % because the bibliography will start on its own page.
    256                  % Use \clearpage instead if the document class uses the "oneside" argument
     253% It assumes you're using BibTeX to manage your references (if not, why not?).
     254\cleardoublepage % This is needed if the "book" document class is used, to place the anchor in the correct page, because the bibliography will start on its own page.
     255% Use \clearpage instead if the document class uses the "oneside" argument
    257256\phantomsection  % With hyperref package, enables hyperlinking from the table of contents to bibliography
    258257% The following statement causes the title "References" to be used for the bibliography section:
     
    263262
    264263\bibliography{local,pl}
    265 % Tip 5: You can create multiple .bib files to organize your references.
     264% Tip: You can create multiple .bib files to organize your references.
    266265% Just list them all in the \bibliogaphy command, separated by commas (no spaces).
    267266
    268 % % The following statement causes the specified references to be added to the bibliography% even if they were not
    269 % % cited in the text. The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional).
     267% The following statement causes the specified references to be added to the bibliography even if they were not cited in the text.
     268% The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional).
    270269% \nocite{*}
     270%----------------------------------------------------------------------
     271
     272% Appendices
    271273
    272274% The \appendix statement indicates the beginning of the appendices.
    273275\appendix
    274 % Add a title page before the appendices and a line in the Table of Contents
     276% Add an un-numbered title page before the appendices and a line in the Table of Contents
    275277\chapter*{APPENDICES}
    276278\addcontentsline{toc}{chapter}{APPENDICES}
     279% Appendices are just more chapters, with different labeling (letters instead of numbers).
    277280%======================================================================
    278281\chapter[PDF Plots From Matlab]{Matlab Code for Making a PDF Plot}
     
    312315%\input{thesis.ind}                             % index
    313316
    314 \phantomsection
    315 
    316 \end{document}
     317\phantomsection         % allows hyperref to link to the correct page
     318
     319%----------------------------------------------------------------------
     320\end{document} % end of logical document