Changes in / [f6664bf2:14533d4]


Location: doc
Files: 8 edited

  • doc/LaTeXmacros/common.tex

    rf6664bf2 r14533d4  
    1111%% Created On       : Sat Apr  9 10:06:17 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Sun Feb 14 15:52:46 2021
    14 %% Update Count     : 524
     13%% Last Modified On : Mon Feb  8 21:45:41 2021
     14%% Update Count     : 522
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1616
     
    146146% The star version does not lowercase the index information, e.g., \newterm*{IBM}.
    147147\newcommand{\newtermFontInline}{\emph}
    148 \newcommand{\newterm}{\protect\@ifstar\@snewterm\@newterm}
     148\newcommand{\newterm}{\@ifstar\@snewterm\@newterm}
    149149\newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
    150150\newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
     
    294294
    295295\ifdefined\CFALatin% extra Latin-1 escape characters
    296 \lstnewenvironment{cfa}[1][]{% necessary
     296\lstnewenvironment{cfa}[1][]{
    297297\lstset{
    298298language=CFA,
     
    303303%moredelim=[is][\lstset{keywords={}}]{¶}{¶}, % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
    304304}% lstset
    305 \lstset{#1}% necessary
     305\lstset{#1}
    306306}{}
    307307% inline code ©...© (copyright symbol) emacs: C-q M-)
    308308\lstMakeShortInline©                                    % single-character for \lstinline
    309309\else% regular ASCII characters
    310 \lstnewenvironment{cfa}[1][]{% necessary
     310\lstnewenvironment{cfa}[1][]{
    311311\lstset{
    312312language=CFA,
     
    315315moredelim=**[is][\color{red}]{@}{@},    % red highlighting @...@
    316316}% lstset
    317 \lstset{#1}% necessary
     317\lstset{#1}
    318318}{}
    319319% inline code @...@ (at symbol)
  • doc/papers/concurrency/mail2

    rf6664bf2 r14533d4  
    12881288
    12891289
    1290 From: "Wiley Online Proofing" <onlineproofing@eproofing.in>
    1291 To: pabuhr@uwaterloo.ca
    1292 Reply-To: eproofing@wiley.com
    1293 Date: 3 Nov 2020 08:25:06 +0000
    1294 Subject: Action: Proof of SPE_EV_SPE2925 for Software: Practice And Experience ready for review
    1295 
    1296 Dear Dr. Peter Buhr,
    1297 
    1298 The proof of your Software: Practice And Experience article Advanced control-flow in Cforall is now available for review:
    1299 
    1300 Edit Article https://wiley.eproofing.in/Proof.aspx?token=ab7739d5678447fbbe5036f3bcba2445081500061
    1301 
    1302 To review your article, please complete the following steps, ideally within 48 hours*, so we can publish your article as quickly as possible.
    1303 
    1304 1. Open your proof in the online proofing system using the button above.
    1305 2. Check the article for correctness and respond to all queries.For instructions on using the system, please see the "Help" menu in the upper right corner.
    1306 3. Submit your changes by clicking the "Submit" button in the proofing system.
    1307 
    1308 Helpful Tips
    1309 
    1310 *  Your manuscript has been formatted following the style requirements for the journal. Any requested changes that go against journal style will not be made.
    1311 *  Your proof will include queries. These must be replied to using the system before the proof can be submitted.
    1312 *  The only acceptable changes at this stage are corrections to grammatical errors or data accuracy, or to provide higher resolution figure files (if requested by the typesetter).
    1313 *  Any changes to scientific content or authorship will require editorial review and approval.
    1314 *  Once your changes are complete, submit the article after which no additional corrections can be requested.
    1315 *  Most authors complete their corrections within 48 hours. Returning any corrections promptly will accelerate publication of your article.
    1316 
    1317 If you encounter any problems or have questions, please contact the production office at (SPEproofs@wiley.com). For the quickest response, include the journal name and your article ID (found in the subject line) in all correspondence.
    1318 
    1319 Best regards,
    1320 Software: Practice And Experience Production Office
    1321 
    1322 * We appreciate that the COVID-19 pandemic may create conditions for you that make it difficult for you to review your proof within standard timeframes. If you have any problems keeping to this schedule, please reach out to me at (SPEproofs@wiley.com) to discuss alternatives.
    1323 
    1324 
    1325 
    13261290From: "Pacaanas, Joel -" <jpacaanas@wiley.com>
    13271291To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
     
    13811345
    13821346Since the proof was reset, your added corrections before has also been removed. Please add them back.
     1347
    13831348Please return your corrections at your earliest convenience.
    13841349
     
    14191384Best regards,
    14201385Joel Pacaanas
    1421 
    1422 
    1423 
    1424 Date: Wed, 2 Dec 2020 08:49:52 +0000
    1425 From: <cs-author@wiley.com>
    1426 To: <pabuhr@uwaterloo.ca>
    1427 Subject: Published: Your article is now published in Early View!
    1428 
    1429 Dear Peter Buhr,
    1430 
    1431 Your article Advanced Control-flow and Concurrency in C A in Software: Practice and Experience has the following publication status: Published as Early View
    1432 
    1433 To access your article, please click the following link to register or log in:
    1434 
    1435   https://authorservices.wiley.com/index.html#register
    1436 
    1437 You can also access your published article via this link: http://dx.doi.org/10.1002/spe.2925
    1438 
    1439 If you need any assistance, please click here https://hub.wiley.com/community/support/authorservices to view our Help section.
    1440 
    1441 Sincerely,                                                                                 
    1442 Wiley Author Services
    1443 
    1444 
    1445 Date: Wed, 2 Dec 2020 02:16:23 -0500
    1446 From: <no-reply@copyright.com>
    1447 To: <pabuhr@uwaterloo.ca>
    1448 CC: <SPEproofs@wiley.com>
    1449 Subject: Please submit your publication fee(s) SPE2925
    1450  
    1451 John Wiley and Sons
    1452 Please submit your selection and payment for publication fee(s).
    1453 
    1454 Dear Peter A. Buhr,
    1455 
    1456 Congratulations, your article in Software: Practice and Experience has published online:
    1457 
    1458 Manuscript DOI: 10.1002/spe.2925
    1459 Manuscript ID: SPE2925
    1460 Manuscript Title: Advanced control-flow in Cforall
    1461 Published by: John Wiley and Sons
    1462 
    1463 Please carefully review your publication options. If you wish your colour
    1464 figures to be printed in colour, you must select and pay for that option now
    1465 using the RightsLink e-commerce solution from CCC.
    1466 
    1467   Review my options & pay charges 
    1468   https://oa.copyright.com/apc-payment-ui/overview?id=f46ba36a-2565-4c8d-8865-693bb94d87e5&chargeset=CHARGES 
    1469 
    1470 To review and pay your charge(s), please click here
    1471 <https://oa.copyright.com/apc-payment-ui/overview?id=f46ba36a-2565-4c8d-8865-693bb94d87e5&chargeset=CHARGES>. You
    1472 can also forward this link to another party for processing.
    1473 
    1474 To complete a secure transaction, you will need a RightsLink account
    1475 <https://oa.copyright.com/apc-payment-ui/registration?id=f46ba36a-2565-4c8d-8865-693bb94d87e5&chargeset=CHARGES>. If
    1476 you do not have one already, you will be prompted to register as you are
    1477 checking out your author charges. This is a very quick process; the majority of
    1478 your registration form will be pre-populated automatically with information we
    1479 have already supplied to RightsLink.
    1480 
    1481 If you have any questions about these charges, please contact CCC Customer
    1482 Service <wileysupport@copyright.com> using the information below. Please do not
    1483 reply directly to this email as this is an automated email notification sent
    1484 from an unmonitored account.
    1485 
    1486 Sincerely,
    1487 John Wiley and Sons
    1488        
    1489 Tel.: +1-877-622-5543 / +1-978-646-2777
    1490 wileysupport@copyright.com
    1491 www.copyright.com
    1492        
    1493 Copyright Clearance Center
    1494 RightsLink
    1495        
    1496 This message (including attachments) is confidential, unless marked
    1497 otherwise. It is intended for the addressee(s) only. If you are not an intended
    1498 recipient, please delete it without further distribution and reply to the
    1499 sender that you have received the message in error.
    1500 
    1501 
    1502 
    1503 From: "Pacaanas, Joel -" <jpacaanas@wiley.com>
    1504 To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
    1505 Subject: RE: Please submit your publication fee(s) SPE2925
    1506 Date: Thu, 3 Dec 2020 08:45:10 +0000
    1507 
    1508 Dear Dr Buhr,
    1509 
    1510 Thank you for your email and concern with regard to the RightsLink account. As
    1511 you have mentioned that all figures will be printed as black and white, then I
    1512 have selected it manually from the system to proceed further.
    1513 
    1514 Best regards,
    1515 Joel
    1516 
    1517 Joel Q. Pacaanas
    1518 Production Editor
    1519 On behalf of Wiley
    1520 Manila
    1521 We partner with global experts to further innovative research.
    1522 
    1523 E-mail: jpacaanas@wiley.com
    1524 Tel: +632 88558618
    1525 Fax: +632 5325 0768
    1526 
    1527 -----Original Message-----
    1528 From: Peter A. Buhr [mailto:pabuhr@uwaterloo.ca]
    1529 Sent: Thursday, December 3, 2020 12:28 AM
    1530 To: SPE Proofs <speproofs@wiley.com>
    1531 Subject: Re: Please submit your publication fee(s) SPE2925
    1532 
    1533 I am trying to complete the forms to submit my publication fee.
    1534 
    1535 I clicked all the boxs to print in Black and White, so there is no fee.
    1536 
    1537 I then am asked to create RightsLink account, which I did.
    1538 
    1539 However, it requires that I click a box agreeing to:
    1540 
    1541    I consent to have my contact information shared with my publisher and/or
    1542    funding organization, as needed, to facilitate APC payment(s), reporting and
    1543    customer care.
    1544 
    1545 I do not agree to this sharing and will not click this button.
    1546 
    1547 How would you like to proceed?
    1548 
    1549 
    1550 
    1551 From: "Pacaanas, Joel -" <jpacaanas@wiley.com>
    1552 To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
    1553 Subject: RE: Please submit your publication fee(s) SPE2925
    1554 Date: Fri, 4 Dec 2020 07:55:59 +0000
    1555 
    1556 Dear Peter,
    1557 
    1558 Yes, you are now done with this selection.
    1559 
    1560 Thank you.
    1561 
    1562 Best regards,
    1563 Joel
    1564 
    1565 Joel Q. Pacaanas
    1566 Production Editor
    1567 On behalf of Wiley
    1568 Manila
    1569 We partner with global experts to further innovative research.
    1570 
    1571 E-mail: jpacaanas@wiley.com
    1572 Tel: +632 88558618
    1573 Fax: +632 5325 0768
    1574 
    1575 -----Original Message-----
    1576 From: Peter A. Buhr [mailto:pabuhr@uwaterloo.ca]
    1577 Sent: Thursday, December 3, 2020 10:29 PM
    1578 To: Pacaanas, Joel - <jpacaanas@wiley.com>
    1579 Subject: Re: Please submit your publication fee(s) SPE2925
    1580 
    1581     Thank you for your email and concern with regard to the RightsLink
    1582     account. As you have mentioned that all figures will be printed as black and
    1583     white, then I have selected it manually from the system to proceed further.
    1584 
    1585 Just be clear, am I done? Meaning I do not have to go back to that web-page again.
  • doc/theses/andrew_beach_MMath/features.tex

    rf6664bf2 r14533d4  
    113113virtual table type; which usually has a mangled name.
    114114% Also \CFA's trait system handles functions better than constants and doing
    115 % it this way reduce the amount of boiler plate we need.
     115% it this way
    116116
    117117% I did have a note about how it is the programmer's responsibility to make
     
    119119% similar system I know of (except Agda's I guess) so I took it out.
    120120
    121 There are two more traits for exceptions @is_termination_exception@ and
    122 @is_resumption_exception@. They are defined as follows:
    123 
     121\section{Raise}
     122\CFA provides two kinds of exception raise: termination
     123\see{\VRef{s:Termination}} and resumption \see{\VRef{s:Resumption}}, which are
     124specified with the following traits.
    124125\begin{cfa}
    125126trait is_termination_exception(
     
    127128        void defaultTerminationHandler(exceptT &);
    128129};
    129 
     130\end{cfa}
     131The function is required to allow a termination raise, but is only called if a
     132termination raise does not find an appropriate handler.
     133
     134Allowing a resumption raise is similar.
     135\begin{cfa}
    130136trait is_resumption_exception(
    131137                exceptT &, virtualT & | is_exception(exceptT, virtualT)) {
     
    133139};
    134140\end{cfa}
    135 
    136 In other words they make sure that a given type and virtual type is an
    137 exception and defines one of the two default handlers. These default handlers
    138 are used in the main exception handling operations \see{Exception Handling}
    139 and their use will be detailed there.
    140 
    141 However all three of these traits can be trickly to use directly.
    142 There is a bit of repetition required but
    143 the largest issue is that the virtual table type is mangled and not in a user
    144 facing way. So there are three macros that can be used to wrap these traits
    145 when you need to refer to the names:
     141The function is required to allow a resumption raise, but is only called if a
     142resumption raise does not find an appropriate handler.
     143
     144Finally there are three convenience macros for referring to these traits:
    146145@IS_EXCEPTION@, @IS_TERMINATION_EXCEPTION@ and @IS_RESUMPTION_EXCEPTION@.
    147 
    148 All take one or two arguments. The first argument is the name of the
    149 exception type. Its unmangled and mangled form are passed to the trait.
    150 The second (optional) argument is a parenthesized list of polymorphic
    151 arguments. This argument should only with polymorphic exceptions and the
    152 list will be passed to both types.
    153 In the current set-up the base name and the polymorphic arguments have to
    154 match so these macros can be used without losing flexability.
     146All three traits are hard to use while naming the virtual table as it has an
     147internal mangled name. These macros take the exception name as their first
     148argument and do the mangling. They all take a second argument for polymorphic
     149types which is the parenthesized list of polymorphic arguments. These
     150arguments are passed to both the exception type and the virtual table type as
     151the arguments do have to match.
    155152
    156153For example consider a function that is polymorphic over types that have a
     
    161158\end{cfa}
    162159
    163 \section{Exception Handling}
    164 \CFA provides two kinds of exception handling, termination and resumption.
    165 These twin operations are the core of the exception handling mechanism and
    166 are the reason for the features of exceptions.
    167 This section will cover the general patterns shared by the two operations and
    168 then go on to cover the details each individual operation.
    169 
    170 Both operations follow the same set of steps to do their operation. They both
    171 start with the user preforming a throw on an exception.
    172 Then there is the search for a handler, if one is found than the exception
    173 is caught and the handler is run. After that control returns to normal
    174 execution.
    175 
    176 If the search fails a default handler is run and then control
    177 returns to normal execution immediately. That is where the default handlers
    178 @defaultTermiationHandler@ and @defaultResumptionHandler@ are used.
    179 
    180160\subsection{Termination}
    181161\label{s:Termination}
    182162
    183 Termination handling is more familiar kind and used in most programming
    184 languages with exception handling.
    185 It is dynamic, non-local goto. If a throw is successful then the stack will
    186 be unwound and control will (usually) continue in a different function on
    187 the call stack. They are commonly used when an error has occured and recovery
    188 is impossible in the current function.
    189 
    190 % (usually) Control can continue in the current function but then a different
    191 % control flow construct should be used.
    192 
    193 A termination throw is started with the @throw@ statement:
     163Termination raise, called ``throw'', is familiar and used in most programming
     164languages with exception handling. The semantics of termination is: search the
     165stack for a matching handler, unwind the stack frames to the matching handler,
     166execute the handler, and continue execution after the handler. Termination is
     167used when execution \emph{cannot} return to the throw. To continue execution,
     168the program must \emph{recover} in the handler from the failed (unwound)
     169execution at the raise to safely proceed after the handler.
     170
     171A termination raise is started with the @throw@ statement:
    194172\begin{cfa}
    195173throw EXPRESSION;
     
    202180change the throw's behavior (see below).
    203181
    204 The throw will copy the provided exception into managed memory. It is the
    205 user's responcibility to ensure the original exception is cleaned up if the
    206 stack is unwound (allocating it on the stack should be sufficient).
    207 
    208 Then the exception system searches the stack using the copied exception.
    209 It starts starts from the throw and proceeds to the base of the stack,
    210 from callee to caller.
    211 At each stack frame, a check is made for resumption handlers defined by the
    212 @catch@ clauses of a @try@ statement.
     182At runtime, the exception returned by the expression
     183is copied into managed memory (heap) to ensure it remains in
     184scope during unwinding. It is the user's responsibility to ensure the original
     185exception object at the throw is freed when it goes out of scope. Being
     186allocated on the stack is sufficient for this.
     187
     188Then the exception system searches the stack starting from the throw and
     189proceeding towards the base of the stack, from callee to caller. At each stack
     190frame, a check is made for termination handlers defined by the @catch@ clauses
     191of a @try@ statement.
    213192\begin{cfa}
    214193try {
    215194        GUARDED_BLOCK
    216 } catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) {
     195} catch (EXCEPTION_TYPE$\(_1\)$ * NAME$\(_1\)$) { // termination handler 1
    217196        HANDLER_BLOCK$\(_1\)$
    218 } catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) {
     197} catch (EXCEPTION_TYPE$\(_2\)$ * NAME$\(_2\)$) { // termination handler 2
    219198        HANDLER_BLOCK$\(_2\)$
    220199}
    221200\end{cfa}
    222 When viewed on its own a try statement will simply exceute the statements in
    223 @GUARDED_BLOCK@ and when those are finished the try statement finishes.
    224 
    225 However, while the guarded statements are being executed, including any
    226 functions they invoke, all the handlers following the try block are now
    227 or any functions invoked from those
    228 statements, throws an exception, and the exception
     201The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
     202functions invoked from those statements, throw an exception, and the exception
    229203is not handled by a try statement further up the stack, the termination
    230204handlers are searched for a matching exception type from top to bottom.
     
    237211freed and control continues after the try statement.
    238212
    239 If no handler is found during the search then the default handler is run.
    240 Through \CFA's trait system the best match at the throw sight will be used.
    241 This function is run and is passed the copied exception. After the default
    242 handler is run control continues after the throw statement.
    243 
    244 There is a global @defaultTerminationHandler@ that cancels the current stack
    245 with the copied exception. However it is generic over all exception types so
    246 new default handlers can be defined for different exception types and so
    247 different exception types can have different default handlers.
     213The default handler visible at the throw statement is used if no matching
     214termination handler is found after the entire stack is searched. At that point,
     215the default handler is called with a reference to the exception object
     216generated at the throw. If the default handler returns, control continues
     217from after the throw statement. This feature allows
     218each exception type to define its own action, such as printing an informative
     219error message, when an exception is not handled in the program.
     220However the default handler for all exception types triggers a cancellation
     221using the exception.
    248222
    249223\subsection{Resumption}
    250224\label{s:Resumption}
    251225
    252 Resumption exception handling is a less common form than termination but is
    253 just as old~\cite{Goodenough75} and is in some sense simpler.
    254 It is a dynamic, non-local function call. If the throw is successful a
    255 closure will be taken from up the stack and executed, after which the throwing
    256 function will continue executing.
    257 These are most often used when an error occured and if the error is repaired
    258 then the function can continue.
     226Resumption raise, called ``resume'', is as old as termination
     227raise~\cite{Goodenough75} but is less popular. In many ways, resumption is
     228simpler and easier to understand, as it is simply a dynamic call.
     229The semantics of resumption is: search the stack for a matching handler,
     230execute the handler, and continue execution after the resume. Notice, the stack
     231cannot be unwound because execution returns to the raise point. Resumption is
     232used when execution \emph{can} return to the resume. To continue
     233execution, the program must \emph{correct} in the handler for the failed
     234execution at the raise so execution can safely continue after the resume.
    259235
    260236A resumption raise is started with the @throwResume@ statement:
     
    264240The semantics of the @throwResume@ statement are like the @throw@, but the
    265241expression has to return a reference to a type that satisfies the trait
    266 @is_resumption_exception@. The assertions from this trait are available to
    267 the exception system while handling the exception.
     242@is_resumption_exception@. Like with termination the exception system can
     243use these assertions while (throwing/raising/handling) the exception.
    268244
    269245At runtime, no copies are made. As the stack is not unwound the exception and
    270246any values on the stack will remain in scope while the resumption is handled.
    271247
    272 Then the exception system searches the stack using the provided exception.
    273 It starts starts from the throw and proceeds to the base of the stack,
    274 from callee to caller.
    275 At each stack frame, a check is made for resumption handlers defined by the
    276 @catchResume@ clauses of a @try@ statement.
     248Then the exception system searches the stack starting from the resume and
     249proceeding to the base of the stack, from callee to caller. At each stack
     250frame, a check is made for resumption handlers defined by the @catchResume@
     251clauses of a @try@ statement.
    277252\begin{cfa}
    278253try {
     
    284259}
    285260\end{cfa}
    286 If the handlers are not involved in a search this will simply execute the
    287 @GUARDED_BLOCK@ and then continue to the next statement.
    288 Its purpose is to add handlers onto the stack.
    289 (Note, termination and resumption handlers may be intermixed in a @try@
    290 statement but the kind of throw must be the same as the handler for it to be
    291 considered as a possible match.)
    292 
    293 If a search for a resumption handler reaches a try block it will check each
    294 @catchResume@ clause, top-to-bottom.
    295 At each handler if the thrown exception is or is a child type of
    296 @EXCEPTION_TYPE@$_i$ then the a pointer to the exception is bound to
    297 @NAME@$_i$ and then @HANDLER_BLOCK@$_i$ is executed. After the block is
    298 finished control will return to the @throwResume@ statement.
     261The statements in the @GUARDED_BLOCK@ are executed. If those statements, or any
     262functions invoked from those statements, resume an exception, and the
     263exception is not handled by a try statement further up the stack, the
     264resumption handlers are searched for a matching exception type from top to
     265bottom. (Note, termination and resumption handlers may be intermixed in a @try@
     266statement but the kind of raise (throw/resume) only matches with the
     267corresponding kind of handler clause.)
     268
     269The exception search and matching for resumption is the same as for
     270termination, including exception inheritance. The difference is when control
     271reaches the end of the handler: the resumption handler returns after the resume
     272rather than after the try statement. The resume point assumes the handler has
     273corrected the problem so execution can safely continue.
    299274
    300275Like termination, if no resumption handler is found, the default handler
    301 visible at the throw statement is called. It will use the best match at the
    302 call sight according to \CFA's overloading rules. The default handler is
    303 passed the exception given to the throw. When the default handler finishes
    304 execution continues after the throw statement.
    305 
    306 There is a global @defaultResumptionHandler@ is polymorphic over all
    307 termination exceptions and preforms a termination throw on the exception.
    308 The @defaultTerminationHandler@ for that throw is matched at the original
    309 throw statement (the resumption @throwResume@) and it can be customized by
    310 introducing a new or better match as well.
    311 
    312 % \subsubsection?
    313 
    314 A key difference between resumption and termination is that resumption does
    315 not unwind the stack. A side effect that is that when a handler is matched
    316 and run it's try block (the guarded statements) and every try statement
    317 searched before it are still on the stack. This can lead to the recursive
    318 resumption problem.
    319 
    320 The recursive resumption problem is any situation where a resumption handler
    321 ends up being called while it is running.
    322 Consider a trivial case:
    323 \begin{cfa}
    324 try {
    325         throwResume (E &){};
    326 } catchResume(E *) {
    327         throwResume (E &){};
    328 }
    329 \end{cfa}
    330 When this code is executed the guarded @throwResume@ will throw, start a
    331 search and match the handler in the @catchResume@ clause. This will be
    332 call and placed on the stack on top of the try-block. The second throw then
    333 throws and will seach the same try block and put call another instance of the
    334 same handler leading to an infinite loop.
    335 
    336 This situation is trivial and easy to avoid, but much more complex cycles
    337 can form with multiple handlers and different exception types.
    338 
    339 To prevent all of these cases we mask sections of the stack, or equvilantly
    340 the try statements on the stack, so that the resumption seach skips over
    341 them and continues with the next unmasked section of the stack.
    342 
    343 A section of the stack is marked when it is searched to see if it contains
    344 a handler for an exception and unmarked when that exception has been handled
    345 or the search was completed without finding a handler.
     276visible at the resume statement is called, and the system default action is
     277executed.
     278
     279For resumption, the exception system uses stack marking to partition the
     280resumption search. If another resumption exception is raised in a resumption
     281handler, the second exception search does not start at the point of the
     282original raise. (Remember the stack is not unwound and the current handler is
     283at the top of the stack.) The search for the second resumption starts at the
     284current point on the stack because new try statements may have been pushed by
     285the handler or functions called from the handler. If there is no match back to
     286the point of the current handler, the search skips\label{p:searchskip} the
     287stack frames already searched by the first resume and continues after
     288the try statement. The default handler always continues from default
     289handler associated with the point where the exception is created.
    346290
    347291% This might need a diagram. But it is an important part of the justification
     
    362306\end{verbatim}
    363307
    364 The rules can be remembered as thinking about what would be searched in
    365 termination. So when a throw happens in a handler; a termination handler
    366 skips everything from the original throw to the original catch because that
    367 part of the stack has been unwound, a resumption handler skips the same
    368 section of stack because it has been masked.
    369 A throw in a default handler will preform the same search as the original
    370 throw because; for termination nothing has been unwound, for resumption
    371 the mask will be the same.
    372 
    373 The symmetry with termination is why this pattern was picked. Other patterns,
    374 such as marking just the handlers that caught, also work but lack the
    375 symmetry whih means there is more to remember.
     308This resumption search pattern reflects the one for termination, and so
     309should come naturally to most programmers.
     310However, it avoids the \emph{recursive resumption} problem.
     311If parts of the stack are searched multiple times, loops
     312can easily form resulting in infinite recursion.
     313
     314Consider the trivial case:
     315\begin{cfa}
     316try {
     317        throwResume (E &){}; // first
     318} catchResume(E *) {
     319        throwResume (E &){}; // second
     320}
     321\end{cfa}
     322If this handler is ever used it will be placed on top of the stack above the
     323try statement. If the stack were not masked, then the @throwResume@ in the
     324handler would always be caught by the handler, leading to an infinite loop.
     325Masking avoids this problem and other more complex versions of it involving
     326multiple handlers and exception types.
     327
     328Other masking strategies could be used, such as masking the handlers that
     329have caught an exception. This one was chosen because it creates a symmetry
     330with termination (masked sections of the stack would be unwound with
     331termination) and having only one pattern to learn is easier.
    376332
    377333\section{Conditional Catch}
     
    379335condition to further control which exceptions they handle:
    380336\begin{cfa}
    381 catch (EXCEPTION_TYPE * NAME ; CONDITION)
     337catch (EXCEPTION_TYPE * NAME ; @CONDITION@)
    382338\end{cfa}
    383339First, the same semantics is used to match the exception type. Second, if the
     
    385341reference all names in scope at the beginning of the try block and @NAME@
    386342introduced in the handler clause. If the condition is true, then the handler
    387 matches. Otherwise, the exception search continues as if the exception type
    388 did not match.
     343matches. Otherwise, the exception search continues at the next appropriate kind
     344of handler clause in the try block.
    389345\begin{cfa}
    390346try {
     
    400356remaining handlers in the current try statement.
    401357
    402 \section{Rethrowing}
    403 \colour{red}{From Andrew: I recommend we talk about why the language doesn't
     358\section{Reraise}
     359\color{red}{From Andrew: I recommend we talk about why the language doesn't
    404360have rethrows/reraises instead.}
    405361
    406 \label{s:Rethrowing}
     362\label{s:Reraise}
    407363Within the handler block or functions called from the handler block, it is
    408364possible to reraise the most recently caught exception with @throw@ or
    409 @throwResume@, respectively.
     365@throwResume@, respectively.
    410366\begin{cfa}
    411367try {
    412368        ...
    413369} catch( ... ) {
    414         ... throw;
     370        ... throw; // rethrow
    415371} catchResume( ... ) {
    416         ... throwResume;
     372        ... throwResume; // reresume
    417373}
    418374\end{cfa}
     
    425381
    426382\section{Finally Clauses}
    427 Finally clauses are used to perform unconditional clean-up when leaving a
    428 scope. They are placed at the end of a try statement:
     383A @finally@ clause may be placed at the end of a @try@ statement.
    429384\begin{cfa}
    430385try {
     
    436391\end{cfa}
    437392The @FINALLY_BLOCK@ is executed when the try statement is removed from the
    438 stack, including when the @GUARDED_BLOCK@ finishes, any termination handler
    439 finishes or during an unwind.
     393stack, including when the @GUARDED_BLOCK@ or any handler clause finishes or
     394during an unwind.
    440395The only time the block is not executed is if the program is exited before
    441 the stack is unwound.
     396that happens.
    442397
    443398Execution of the finally block should always finish, meaning control runs off
     
    448403@return@ that causes control to leave the finally block. Other ways to leave
    449404the finally block, such as a long jump or termination are much harder to check,
    450 and at best require additional run-time overhead, and so are merely
    451 discouraged.
    452 
    453 Not all languages with exceptions have finally clauses. Notably \Cpp does
    454 without it as destructors serve a similar role. Although destructors and
    455 finally clauses can be used in many of the same areas they have their own
    456 use cases like top-level functions and lambda functions with closures.
    457 Destructors take a bit more work to set up but are much easier to reuse while
    458 finally clauses are good for once offs and can include local information.
     405and at best require additional run-time overhead, and so are discouraged.
    459406
    460407\section{Cancellation}
     
    466413There is no special statement for starting a cancellation; instead the standard
    467414library function @cancel_stack@ is called passing an exception. Unlike a
    468 throw, this exception is not used in matching only to pass information about
     415raise, this exception is not used in matching, only to pass information about
    469416the cause of the cancellation.
    470 (This also means matching cannot fail so there is no default handler either.)
    471 
    472 After @cancel_stack@ is called the exception is copied into the exception
    473 handling mechanism's memory. Then the entirety of the current stack is
    474 unwound. After that it depends on which stack is being cancelled.
     417
     418Handling of a cancellation depends on which stack is being cancelled.
    475419\begin{description}
    476420\item[Main Stack:]
     
    503447happen in an implicit join inside a destructor. So there is an error message
    504448and an abort instead.
    505 \todo{Perhaps have a more general discussion of unwind collisions before
    506 this point.}
    507449
    508450The recommended way to avoid the abort is to handle the initial resumption
     
    513455\item[Coroutine Stack:] A coroutine stack is created for a @coroutine@ object
    514456or object that satisfies the @is_coroutine@ trait. A coroutine only knows of
    515 two other coroutines, its starter and its last resumer. Of the two the last
    516 resumer has the tightest coupling to the coroutine it activated and the most
    517 up-to-date information.
    518 
    519 Hence, cancellation of the active coroutine is forwarded to the last resumer
    520 after the stack is unwound. When the resumer restarts, it resumes exception
     457two other coroutines, its starter and its last resumer. The last resumer has
     458the tightest coupling to the coroutine it activated. Hence, cancellation of
     459the active coroutine is forwarded to the last resumer after the stack is
     460unwound, as the last resumer has the most precise knowledge about the current
     461execution. When the resumer restarts, it resumes exception
    521462@CoroutineCancelled@, which is polymorphic over the coroutine type and has a
    522463pointer to the cancelled coroutine.
  • doc/theses/andrew_beach_MMath/uw-ethesis.tex

    rf6664bf2 r14533d4  
    108108% Removes large sections of the document.
    109109\usepackage{comment}
    110 % Adds todos (Must be included after comment.)
    111 \usepackage{todonotes}
    112 
    113110
    114111% Hyperlinks make it very easy to navigate an electronic document.
     
    216213% Optional arguments do not work with pdf string. (Some fix-up required.)
    217214\pdfstringdefDisableCommands{\def\Cpp{C++}}
    218 
    219 % Colour text, formatted in LaTeX style instead of TeX style.
    220 \newcommand*\colour[2]{{\color{#1}#2}}
    221215\makeatother
    222216
  • doc/theses/thierry_delisle_PhD/thesis/Makefile

    rf6664bf2 r14533d4  
    88BibTeX = BIBINPUTS=${TeXLIB} && export BIBINPUTS && bibtex
    99
    10 MAKEFLAGS = --no-print-directory # --silent
     10MAKEFLAGS = --no-print-directory --silent
    1111VPATH = ${Build} ${Figures}
    1212
     
    5252# Directives #
    5353
    54 .NOTPARALLEL:                                           # cannot make in parallel
    55 
    5654.PHONY : all clean                                      # not file names
    5755
     
    8583        ${LaTeX} $<
    8684
     85build/fairness.svg : fig/fairness.py | ${Build}
     86        python3 $< $@
     87
    8788## Define the default recipes.
    8889
     
    106107        sed -i 's/$@/${Build}\/$@/g' ${Build}/$@_t
    107108
    108 build/fairness.svg : fig/fairness.py | ${Build}
    109         python3 $< $@
     109build/fairness.svg: fig/fairness.py | ${Build}
     110        python3 fig/fairness.py build/fairness.svg
    110111
    111112## pstex with inverted colors
  • doc/theses/thierry_delisle_PhD/thesis/text/io.tex

    rf6664bf2 r14533d4  
    11\chapter{User Level \io}
    2 As mentioned in Section~\ref{prev:io}, User-Level \io requires multiplexing the \io operations of many \glspl{thrd} onto fewer \glspl{proc} using asynchronous \io operations. Different operating systems offer various forms of asynchronous operations and as mentioned in Chapter~\ref{intro}, this work is exclusively focused on the Linux operating-system.
     2As mentioned in Section~\ref{prev:io}, User-Level \io requires multiplexing the \io operations of many \glspl{thrd} onto fewer \glspl{proc} using asynchronous \io operations. Different operating systems offer various forms of asynchronous operations and, as mentioned in Chapter~\ref{intro}, this work is exclusively focused on Linux.
    33
    44\section{Kernel Interface}
    5 Since this work fundamentally depends on operating-system support, the first step of any design is to discuss the available interfaces and pick one (or more) as the foundations of the non-blocking \io subsystem.
    6 
    7 \subsection{\lstinline{O_NONBLOCK}}
    8 In Linux, files can be opened with the flag @O_NONBLOCK@~\cite{MAN:open} (or @SO_NONBLOCK@~\cite{MAN:accept}, the equivalent for sockets) to use the file descriptors in ``nonblocking mode''. In this mode, ``Neither the @open()@ nor any subsequent \io operations on the [opened file descriptor] will cause the calling
    9 process to wait''~\cite{MAN:open}. This feature can be used as the foundation for the non-blocking \io subsystem. However, for the subsystem to know when an \io operation completes, @O_NONBLOCK@ must be used in conjunction with a system call that monitors when a file descriptor becomes ready, \ie, the next \io operation on it does not cause the process to wait\footnote{In this context, ready means \emph{some} operation can be performed without blocking. It does not mean an operation returning \lstinline{EAGAIN} succeeds on the next try. For example, a ready read may only return a subset of bytes and the read must be issued again for the remaining bytes, at which point it may return \lstinline{EAGAIN}.}.
    10 This mechanism is also crucial in determining when all \glspl{thrd} are blocked and the application \glspl{kthrd} can now block.
    11 
    12 There are three options to monitor file descriptors in Linux\footnote{For simplicity, this section omits \lstinline{pselect} and \lstinline{ppoll}. The difference between these system calls and \lstinline{select} and \lstinline{poll}, respectively, is not relevant for this discussion.}, @select@~\cite{MAN:select}, @poll@~\cite{MAN:poll} and @epoll@~\cite{MAN:epoll}. All three of these options offer a system call that blocks a \gls{kthrd} until at least one of many file descriptors becomes ready. The group of file descriptors being waited is called the \newterm{interest set}.
    13 
    14 \paragraph{\lstinline{select}} is the oldest of these options; it takes as an input a contiguous array of bits, where each bit represents a file descriptor of interest. On return, it modifies the set in place to identify which of the file descriptors changed status. This destructive change means that calling select in a loop requires re-initializing the array each time and the number of file descriptors supported has a hard limit. Another limit of @select@ is that once the call is started, the interest set can no longer be modified. Monitoring a new file descriptor generally requires aborting any in progress call to @select@\footnote{Starting a new call to \lstinline{select} is possible but requires a distinct kernel thread, and as a result is not an acceptable multiplexing solution when the interest set is large and highly dynamic unless the number of parallel calls to \lstinline{select} can be strictly bounded.}.
    15 
    16 \paragraph{\lstinline{poll}} is an improvement over select, which removes the hard limit on the number of file descriptors and the need to re-initialize the input on every call. It works using an array of structures as an input rather than an array of bits, thus allowing a more compact input for small interest sets. Like @select@, @poll@ suffers from the limitation that the interest set cannot be changed while the call is blocked.
    17 
    18 \paragraph{\lstinline{epoll}} further improves these two functions by allowing the interest set to be dynamically added to and removed from while a \gls{kthrd} is blocked on an @epoll@ call. This dynamic capability is accomplished by creating an \emph{epoll instance} with a persistent interest set, which is used across multiple calls. This capability significantly reduces synchronization overhead on the part of the caller (in this case the \io subsystem), since the interest set can be modified when adding or removing file descriptors without having to synchronize with other \glspl{kthrd} potentially calling @epoll@.
    19 
    20 However, all three of these system calls have limitations. The @man@ page for @O_NONBLOCK@ mentions that ``[@O_NONBLOCK@] has no effect for regular files and block devices'', which means none of these three system calls are viable multiplexing strategies for these types of \io operations. Furthermore, @epoll@ has been shown to have problems with pipes and ttys~\cit{Peter's examples in some fashion}. Finally, none of these are useful solutions for multiplexing \io operations that do not have a corresponding file descriptor and can be awkward for operations using multiple file descriptors.
    21 
    22 \subsection{POSIX asynchronous I/O (AIO)}
    23 An alternative to @O_NONBLOCK@ is the AIO interface. Its interface lets programmers enqueue operations to be performed asynchronously by the kernel. Completions of these operations can be communicated in various ways: either by spawning a new \gls{kthrd}, sending a Linux signal, or by polling for completion of one or more operations. For this work, spawning a new \gls{kthrd} is counter-productive but a related solution is discussed in Section~\ref{io:morethreads}. Using interrupt handlers can also lead to fairly complicated interactions between subsystems. This leaves polling for completion, which is similar to the previous system calls. While AIO only supports read and write operations to file descriptors, it does not have the same limitation as @O_NONBLOCK@, \ie, the file descriptors can be regular files and block devices. It also supports batching multiple operations in a single system call.
    24 
    25 AIO offers two different approaches to polling: @aio_error@ can be used as a spinning form of polling, returning @EINPROGRESS@ until the operation is completed, and @aio_suspend@ can be used similarly to @select@, @poll@ or @epoll@, to wait until one or more requests have completed. For the purpose of \io multiplexing, @aio_suspend@ is the best interface. However, even if AIO requests can be submitted concurrently, @aio_suspend@ suffers from the same limitation as @select@ and @poll@, \ie, the interest set cannot be dynamically changed while a call to @aio_suspend@ is in progress. AIO also suffers from the limitation of not specifying which requests have completed, \ie, programmers have to poll each request in the interest set using @aio_error@ to identify the completed requests. This limitation means that, like @select@ and @poll@ but not @epoll@, the time needed to examine polling results increases based on the total number of requests monitored, not the number of completed requests.
    26 Finally, AIO does not seem to be a popular interface, which I believe is due in part to this poor polling interface. Linus Torvalds talks about this interface as follows:
     5Since this work fundamentally depends on operating system support, the first step of any design is to discuss the available interfaces and pick one (or more) as the foundations of the \io subsystem.
     6
     7\subsection{\lstinline|O_NONBLOCK|}
     8In Linux, files can be opened with the flag @O_NONBLOCK@~\cite{MAN:open} (or @SO_NONBLOCK@~\cite{MAN:accept}, the equivalent for sockets) to use the file descriptors in ``nonblocking mode''. In this mode, ``Neither the open() nor any subsequent \io operations on the [opened file descriptor] will cause the calling
     9process to wait.'' This feature can be used as the foundation for the \io subsystem. However, for the subsystem to be able to block \glspl{thrd} until an operation completes, @O_NONBLOCK@ must be used in conjunction with a system call that monitors when a file descriptor becomes ready, \ie, the next \io operation on it will not cause the process to wait\footnote{In this context, ready means \emph{some} operation can be performed without blocking. It does not mean that the last operation that returned \lstinline|EAGAIN| will succeed on the next try. A file that is ready to read but has only 1 byte available would be an example of this distinction.}.
     10
     11There are three options to monitor file descriptors in Linux\footnote{For simplicity, this section omits \lstinline|pselect| and \lstinline|ppoll|. The difference between these system calls and \lstinline|select| and \lstinline|poll|, respectively, is not relevant for this discussion.}, @select@~\cite{MAN:select}, @poll@~\cite{MAN:poll} and @epoll@~\cite{MAN:epoll}. All three of these options offer a system call that blocks a \gls{kthrd} until at least one of many file descriptors becomes ready. The group of file descriptors being waited on is often referred to as the \newterm{interest set}.
     12
     13\paragraph{\lstinline|select|} is the oldest of these options; it takes as an input a contiguous array of bits, where each bit represents a file descriptor of interest. On return, it modifies the set in place to identify which of the file descriptors changed status. This means that calling select in a loop requires re-initializing the array each time and the number of file descriptors supported has a hard limit. Another limit of @select@ is that once the call is started, the interest set can no longer be modified. Monitoring a new file descriptor generally requires aborting any in progress call to @select@\footnote{Starting a new call to \lstinline|select| in this case is possible but requires a distinct kernel thread, and as a result is not an acceptable multiplexing solution when the interest set is large and highly dynamic unless the number of parallel calls to select can be strictly bounded.}.
     14
     15\paragraph{\lstinline|poll|} is an improvement over select, which removes the hard limit on the number of file descriptors and the need to re-initialize the input on every call. It works using an array of structures as an input rather than an array of bits, thus allowing a more compact input for small interest sets. Like @select@, @poll@ suffers from the limitation that the interest set cannot be changed while the call is blocked.
     16
     17\paragraph{\lstinline|epoll|} further improves on these two functions, by allowing the interest set to be dynamically added to and removed from while a \gls{kthrd} is blocked on a call to @epoll@. This is done by creating an \emph{epoll instance} with a persistent interest set that is used across multiple calls. This advantage significantly reduces synchronization overhead on the part of the caller (in this case the \io subsystem) since the interest set can be modified when adding or removing file descriptors without having to synchronize with other \glspl{kthrd} potentially calling @epoll@.
     18
     19However, all three of these system calls suffer from generality problems to some extent. The man page for @O_NONBLOCK@ mentions that ``[@O_NONBLOCK@] has no effect for regular files and block devices'', which means none of these three system calls are viable multiplexing strategies for these types of \io operations. Furthermore, @epoll@ has been shown to have some problems with pipes and ttys\cit{Peter's examples in some fashion}. Finally, none of these are useful solutions for multiplexing \io operations that do not have a corresponding file descriptor and can be awkward for operations using multiple file descriptors.
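
The readiness model described above can be sketched outside the thesis code base. The following is a minimal illustration in Python (not \CFA), assuming a Linux host; the function name is hypothetical, and a pipe stands in for any descriptor that honours nonblocking mode. It shows a nonblocking read failing with the equivalent of EAGAIN, then succeeding once epoll reports the descriptor as ready.

```python
import os
import select


def read_when_ready(payload: bytes):
    """Return (would_block, data) for a nonblocking pipe read gated by epoll.

    Hypothetical helper for illustration only; Linux-specific.
    """
    rfd, wfd = os.pipe()
    os.set_blocking(rfd, False)           # put the read end in nonblocking mode
    ep = select.epoll()                   # epoll instance with a persistent interest set
    try:
        ep.register(rfd, select.EPOLLIN)  # add the read end to the interest set
        try:
            os.read(rfd, 1)               # nothing has been written yet
            would_block = False
        except BlockingIOError:           # EAGAIN: the call would have waited
            would_block = True
        os.write(wfd, payload)            # small write, fits in the pipe buffer
        events = ep.poll(1.0)             # block until a descriptor is ready
        ready = [fd for fd, mask in events if mask & select.EPOLLIN]
        data = os.read(rfd, len(payload)) if rfd in ready else b""
        return would_block, data
    finally:
        ep.close()
        os.close(rfd)
        os.close(wfd)
```

Because the interest set lives in the epoll instance, another thread could call `register` or `unregister` on it while a `poll` call is blocked, which is the property that distinguishes epoll from select and poll in the discussion above.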
     20
     21\subsection{The POSIX asynchronous I/O (AIO)}
     22An alternative to using @O_NONBLOCK@ is to use the AIO interface. Its interface lets programmers enqueue operations to be performed asynchronously by the kernel. Completions of these operations can be communicated in various ways, either by sending a Linux signal, spawning a new \gls{kthrd} or by polling for completion of one or more operations. For the purpose of multiplexing operations, spawning a new \gls{kthrd} is counter-productive but a related solution is discussed in Section~\ref{io:morethreads}. Since using interrupt handlers can also lead to fairly complicated interactions between subsystems, I will concentrate on the different polling methods. AIO only supports read and write operations to file descriptors and those do not have the same limitation as @O_NONBLOCK@, \ie, the file descriptors can be regular files and block devices. It also supports batching more than one of these operations in a single system call.
     23
     24AIO offers two different approaches to polling. @aio_error@ can be used as a spinning form of polling, returning @EINPROGRESS@ until the operation is completed, and @aio_suspend@ can be used similarly to @select@, @poll@ or @epoll@, to wait until one or more requests have completed. For the purpose of \io multiplexing, @aio_suspend@ is the intended interface. Even if AIO requests can be submitted concurrently, @aio_suspend@ suffers from the same limitation as @select@ and @poll@, \ie, the interest set cannot be dynamically changed while a call to @aio_suspend@ is in progress. Unlike @select@ and @poll@ however, it also suffers from the limitation that it does not specify which requests have completed, meaning programmers then have to poll each request in the interest set using @aio_error@ to identify which requests have completed. This means that, like @select@ and @poll@ but not @epoll@, the time needed to examine polling results increases based on the total number of requests monitored, not the number of completed requests.
     25
     26AIO does not seem to be a particularly popular interface, which I believe is in part due to this less than ideal polling interface. Linus Torvalds talks about this interface as follows:
    2727
    2828\begin{displayquote}
    29         AIO is a horrible ad-hoc design, with the main excuse being ``other,
     29        AIO is a horrible ad-hoc design, with the main excuse being "other,
    3030        less gifted people, made that design, and we are implementing it for
    3131        compatibility because database people - who seldom have any shred of
    32         taste - actually use it''.
     32        taste - actually use it".
    3333
    3434        But AIO was always really really ugly.
     
    3939\end{displayquote}
    4040
    41 Interestingly, in this e-mail, Linus goes on to describe
     41Interestingly, in this e-mail answer, Linus goes on to describe
    4242``a true \textit{asynchronous system call} interface''
    4343that does
     
    4747This description is actually quite close to the interface described in the next section.
    4848
    49 \subsection{\lstinline{io_uring}}
    50 A very recent addition to Linux, @io_uring@~\cite{MAN:io_uring}, is a framework that aims to solve many of the problems listed in the above interfaces. Like AIO, it represents \io operations as entries added to a queue. But like @epoll@, new requests can be submitted while a blocking call waiting for requests to complete is already in progress. The @io_uring@ interface uses two ring buffers (referred to simply as rings) at its core: a submit ring to which programmers push \io requests and a completion ring from which programmers poll for completion.
    51 
    52 One of the big advantages over the prior interfaces is that @io_uring@ also supports a much wider range of operations. In addition to supporting reads and writes to any file descriptor like AIO, it supports other operations like @open@, @close@, @fsync@, @accept@, @connect@, @send@, @recv@, @splice@, \etc.
    53 
    54 On top of these, @io_uring@ adds many extras like avoiding copies between the kernel and user-space using shared memory, allowing different mechanisms to communicate with device drivers, and supporting chains of requests, \ie, requests that automatically trigger followup requests on completion.
     49\subsection{\lstinline|io_uring|}
     50A very recent addition to Linux, @io_uring@\cite{MAN:io_uring} is a framework that aims to solve many of the problems listed with the above mentioned interfaces. Like AIO, it represents \io operations as entries added on a queue. But like @epoll@, new requests can be submitted while a blocking call waiting for requests to complete is already in progress. The @io_uring@ interface uses two ring buffers (referred to simply as rings) as its core, a submit ring to which programmers push \io requests and a completion buffer which programmers poll for completion.
     51
     52One of the big advantages over the interfaces listed above is that it also supports a much wider range of operations. In addition to supporting reads and writes to any file descriptor like AIO, it supports other operations like @open@, @close@, @fsync@, @accept@, @connect@, @send@, @recv@, @splice@, \etc.
     53
     54On top of these, @io_uring@ adds many ``bells and whistles'' like avoiding copies between the kernel and user-space with shared memory, allowing different mechanisms to communicate with device drivers and supporting chains of requests, \ie, requests that automatically trigger followup requests on completion.
    5555
    5656\subsection{Extra Kernel Threads}\label{io:morethreads}
    57 Finally, if the operating system does not offer a satisfactory form of asynchronous \io operations, an ad-hoc solution is to create a pool of \glspl{kthrd} and delegate operations to it to avoid blocking \glspl{proc}, which is a compromise for multiplexing. In the worst case, where all \glspl{thrd} are consistently blocking on \io, it devolves into 1-to-1 threading. However, regardless of the frequency of \io operations, it achieves the fundamental goal of not blocking \glspl{proc} when \glspl{thrd} are ready to run. This approach is used by languages like Go\cit{Go} and frameworks like libuv\cit{libuv}, since it has the advantage that it can easily be used across multiple operating systems. This advantage is especially relevant for languages like Go, which offer a homogeneous \glsxtrshort{api} across all platforms. As opposed to C, which has a very limited standard api for \io, \eg, the C standard library has no networking.
     57Finally, if the operating system does not offer any satisfying forms of asynchronous \io operations, a solution is to fake it by creating a pool of \glspl{kthrd} and delegating operations to them in order to avoid blocking \glspl{proc}. This is a compromise on multiplexing. In the worst case, where all \glspl{thrd} are consistently blocking on \io, it devolves into 1-to-1 threading. However, regardless of the frequency of \io operations, it achieves the fundamental goal of not blocking \glspl{proc} when \glspl{thrd} are ready to run. This approach is used by languages like Go\cit{Go} and frameworks like libuv\cit{libuv}, since it has the advantage that it can easily be used across multiple operating systems. This advantage is especially relevant for languages like Go, which offer a homogeneous \glsxtrshort{api} across all platforms. As opposed to C, which has a very limited standard api for \io, \eg, the C standard library has no networking.
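
The delegation idea in this paragraph can be sketched as follows. This is a minimal illustration in Python rather than the thesis runtime, assuming CPython, where each pool worker is backed by an operating-system thread; the function names and the pool size of two are hypothetical choices. A blocking read is handed to the pool so the submitting thread stays free to run other work.

```python
import os
from concurrent.futures import Future, ThreadPoolExecutor


def delegated_read(pool: ThreadPoolExecutor, fd: int, nbytes: int) -> Future:
    """Hand a blocking read to a pool worker; the caller does not block."""
    return pool.submit(os.read, fd, nbytes)


def demo() -> bytes:
    """Hypothetical driver: one blocking read served by a two-worker pool."""
    rfd, wfd = os.pipe()
    try:
        with ThreadPoolExecutor(max_workers=2) as pool:
            future = delegated_read(pool, rfd, 5)  # a worker blocks in read()
            os.write(wfd, b"hello")                # the submitter keeps running
            return future.result(timeout=5)        # collect the completed read
    finally:
        os.close(rfd)
        os.close(wfd)
```

If every submitted operation blocks, the pool needs one worker per outstanding operation, which is the 1-to-1 threading worst case noted above.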
    5858
    5959\subsection{Discussion}
    60 These options effectively fall into two broad camps: waiting for \io to be ready versus waiting for \io to complete. All operating systems that support asynchronous \io must offer an interface along one of these lines, but the details vary drastically. For example, Free BSD offers @kqueue@~\cite{MAN:bsd/kqueue}, which behaves similarly to @epoll@, but with some small quality of use improvements, while Windows (Win32)~\cit{https://docs.microsoft.com/en-us/windows/win32/fileio/synchronous-and-asynchronous-i-o} offers ``overlapped I/O'', which handles submissions similarly to @O_NONBLOCK@ with extra flags on the synchronous system call, but waits for completion events, similarly to @io_uring@.
    61 
    62 For this project, I selected @io_uring@, in large part because of its generality. While @epoll@ has been shown to be a good solution for socket \io (\cite{DBLP:journals/pomacs/KarstenB20}), @io_uring@'s transparent support for files, pipes, and more complex operations, like @splice@ and @tee@, make it a better choice as the foundation for a general \io subsystem.
     60These options effectively fall into two broad camps of solutions, waiting for \io to be ready versus waiting for \io to be completed. All operating systems that support asynchronous \io must offer an interface along one of these lines, but the details can vary drastically. For example, Free BSD offers @kqueue@~\cite{MAN:bsd/kqueue} which behaves similarly to @epoll@ but with some small quality of life improvements, while Windows (Win32)~\cit{https://docs.microsoft.com/en-us/windows/win32/fileio/synchronous-and-asynchronous-i-o} offers ``overlapped I/O'' which handles submissions similarly to @O_NONBLOCK@, with extra flags on the synchronous system call, but waits for completion events, similarly to @io_uring@.
     61
     62For this project, I have chosen to use @io_uring@, in large part due to its generality. While @epoll@ has been shown to be a good solution to socket \io (\cite{DBLP:journals/pomacs/KarstenB20}), @io_uring@'s transparent support for files, pipes and more complex operations, like @splice@ and @tee@, make it a better choice as the foundation for a general \io subsystem.
    6363
    6464\section{Event-Engine}
    65 An event engine's responsibility is to use the kernel interface to multiplex many \io operations onto few \glspl{kthrd}. In concrete terms, this means \glspl{thrd} enter the engine through an interface, the event engine then starts the operation and parks the calling \glspl{thrd}, returning control to the \gls{proc}. The parked \glspl{thrd} are then rescheduled by the event engine once the desired operation has completed.
    66 
    67 \subsection{\lstinline{io_uring} in depth}
    68 Before going into details on the design of my event engine, more details on @io_uring@ usage are presented, each important in the design of the engine.
    69 Figure~\ref{fig:iouring} shows an overview of an @io_uring@ instance.
    70 Two ring buffers are used to communicate with the kernel: one for submissions~(left) and one for completions~(right).
    71 The submission ring contains entries, \newterm{Submit Queue Entries} (SQE), produced (appended) by the application when an operation starts and then consumed by the kernel.
    72 The completion ring contains entries, \newterm{Completion Queue Entries} (CQE), produced (appended) by the kernel when an operation completes and then consumed by the application.
    73 The submission ring contains indexes into the SQE array (denoted \emph{S}) containing entries describing the I/O operation to start;
    74 the completion ring contains entries for the completed I/O operation.
    75 Multiple @io_uring@ instances can be created, in which case they each have a copy of the data structures in the figure.
     65
     66The event engine's responsibility is to use the kernel interface to multiplex many \io operations onto few \glspl{kthrd}. In concrete terms, this means that \glspl{thrd} enter the engine through an interface, the event engine then starts the operation and parks the calling \glspl{thrd}, returning control to the \gls{proc}. The parked \glspl{thrd} are then rescheduled by the event engine once the desired operation has completed.
     67
     68\subsection{\lstinline|io_uring| in depth}
     69Before going into details on the design of the event engine, I will present some more details on the usage of @io_uring@ which are important for the design of the engine.
    7670
    7771\begin{figure}
    7872        \centering
    7973        \input{io_uring.pstex_t}
    80         \caption{Overview of \lstinline{io_uring}}
    81 %       \caption[Overview of \lstinline{io_uring}]{Overview of \lstinline{io_uring} \smallskip\newline Two ring buffer are used to communicate with the kernel, one for completions~(right) and one for submissions~(left). The completion ring contains entries, \newterm{CQE}s: Completion Queue Entries, that are produced by the kernel when an operation completes and then consumed by the application. On the other hand, the application produces \newterm{SQE}s: Submit Queue Entries, which it appends to the submission ring for the kernel to consume. Unlike the completion ring, the submission ring does not contain the entries directly, it indexes into the SQE array (denoted \emph{S}) instead.}
     74        \caption[Overview of \lstinline|io_uring|]{Overview of \lstinline|io_uring| \smallskip\newline Two ring buffers are used to communicate with the kernel, one for completions~(right) and one for submissions~(left). The completion ring contains entries, \newterm{CQE}s: Completion Queue Entries, that are produced by the kernel when an operation completes and then consumed by the application. On the other hand, the application produces \newterm{SQE}s: Submit Queue Entries, which it appends to the submission ring for the kernel to consume. Unlike the completion ring, the submission ring does not contain the entries directly; it indexes into the SQE array (denoted \emph{S}) instead.}
    8275        \label{fig:iouring}
    8376\end{figure}
    8477
    85 New \io operations are submitted to the kernel following 4 steps, which use the components shown in the figure.
    86 \begin{enumerate}
    87 \item
    88 An SQE is allocated from the pre-allocated array (denoted \emph{S} in Figure~\ref{fig:iouring}). This array is created at the same time as the @io_uring@ instance, is in kernel-locked memory visible by both the kernel and the application, and has a fixed size determined at creation. How these entries are allocated is not important for the functioning of @io_uring@, the only requirement is that no entry is reused before the kernel has consumed it.
    89 \item
    90 The SQE is filled according to the desired operation. This step is straightforward; the only detail worth mentioning is that SQEs have a @user_data@ field that must be filled in order to match submission and completion entries.
    91 \item
    92 The SQE is submitted to the submission ring by appending the index of the SQE to the ring following regular ring buffer steps: \lstinline{buffer[head] = item; head++}. Since the head is visible to the kernel, some memory barriers may be required to prevent the compiler from reordering these operations. Since the submission ring is a regular ring buffer, more than one SQE can be added at once and the head is updated only after all entries are updated.
    93 \item
    94 The kernel is notified of the change to the ring using the system call @io_uring_enter@. The number of elements appended to the submission ring is passed as a parameter and the number of elements consumed is returned. The @io_uring@ instance can be constructed so this step is not required, but this requires elevated privilege.% and an early version of @io_uring@ had additional restrictions.
    95 \end{enumerate}
    96 
    97 \begin{sloppypar}
    98 The completion side is simpler: applications call @io_uring_enter@ with the flag @IORING_ENTER_GETEVENTS@ to wait on a desired number of operations to complete. The same call can be used to both submit SQEs and wait for operations to complete. When operations do complete, the kernel appends a CQE to the completion ring and advances the head of the ring. Each CQE contains the result of the operation as well as a copy of the @user_data@ field of the SQE that triggered the operation. It is not necessary to call @io_uring_enter@ to get new events because the kernel can directly modify the completion ring. The system call is only needed if the application wants to block waiting for operations to complete.
    99 \end{sloppypar}
    100 
    101 The @io_uring_enter@ system call is protected by a lock inside the kernel. This protection means that concurrent calls to @io_uring_enter@ using the same instance are possible, but there is no performance gained from parallel calls to @io_uring_enter@. It is possible to do the first three submission steps in parallel; however, doing so requires careful synchronization.
    102 
    103 @io_uring@ also introduces constraints on the number of simultaneous operations that can be ``in flight''. Obviously, SQEs are allocated from a fixed-size array, meaning that there is a hard limit to how many SQEs can be submitted at once. In addition, the @io_uring_enter@ system call can fail because ``The  kernel [...] ran out of resources to handle [a request]'' or ``The application is attempting to overcommit the number of requests it can  have  pending.''. This restriction means \io request bursts may have to be subdivided and submitted in chunks at a later time.
     78Figure~\ref{fig:iouring} shows an overview of an @io_uring@ instance. Multiple @io_uring@ instances can be created, in which case they each have a copy of the data structures in the figure. New \io operations are submitted to the kernel following four steps, which use the components shown in the figure.
     79
     80\paragraph{First} an @sqe@ must be allocated from the pre-allocated array (denoted \emph{S} in Figure~\ref{fig:iouring}). This array is created at the same time as the @io_uring@ instance, is in kernel-locked memory, which means it is visible to both the kernel and the application, and has a fixed size determined at creation. How these entries are allocated is not important for the functioning of @io_uring@; the only requirement is that no entry is reused before the kernel has consumed it.
     81
     82\paragraph{Secondly} the @sqe@ must be filled according to the desired operation. This step is straightforward; the only detail worth mentioning is that @sqe@s have a @user_data@ field that must be filled in order to match submission and completion entries.
     83
     84\paragraph{Thirdly} the @sqe@ must be submitted to the submission ring; this requires appending the index of the @sqe@ to the ring following regular ring buffer steps: \lstinline|{ buffer[head] = item; head++ }|. Since the head is visible to the kernel, some memory barriers may be required to prevent the compiler from reordering these operations. Since the submission ring is a regular ring buffer, more than one @sqe@ can be added at once and the head can be updated only after the entire batch has been written.
     85
     86\paragraph{Finally} the kernel must be notified of the change to the ring using the system call @io_uring_enter@. The number of elements appended to the submission ring is passed as a parameter and the number of elements consumed is returned. The @io_uring@ instance can be constructed so that this step is not required, but this requires elevated privilege and early versions of @io_uring@ had additional restrictions.
     87
     88The completion side is simpler: applications call @io_uring_enter@ with the flag @IORING_ENTER_GETEVENTS@ to wait on a desired number of operations to complete. The same call can be used to both submit @sqe@s and wait for operations to complete. When operations do complete, the kernel appends a @cqe@ to the completion ring and advances the head of the ring. Each @cqe@ contains the result of the operation as well as a copy of the @user_data@ field of the @sqe@ that triggered the operation. It is not necessary to call @io_uring_enter@ to get new events because the kernel can directly modify the completion ring. The system call is only needed if the application wants to block waiting on operations to complete.
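
The four submission steps and the completion side can be modelled in a few lines. The following is a minimal single-threaded sketch in Python, not the kernel interface: the class @RingModel@ and its methods @alloc@, @fill@, @append@, @enter@ and @reap@ are hypothetical names introduced only for illustration, @enter@ merely stands in for @io_uring_enter@, and every operation is assumed to complete as soon as it is consumed.

```python
from collections import deque


class RingModel:
    """Toy model of one instance: an SQE array, a ring of indexes, a ring of results."""

    def __init__(self, size):
        self.sqes = [None] * size        # pre-allocated SQE array (S in the figure)
        self.free = deque(range(size))   # indexes of SQEs not currently in use
        self.sq_ring = deque()           # submission ring: holds indexes into sqes
        self.cq_ring = deque()           # completion ring: holds (user_data, result)

    def alloc(self):
        # Step 1: take an unused entry; None models "no SQE available".
        return self.free.popleft() if self.free else None

    def fill(self, idx, op, user_data):
        # Step 2: describe the operation and tag it with user_data.
        self.sqes[idx] = (op, user_data)

    def append(self, idx):
        # Step 3: publish the index of the SQE on the submission ring.
        self.sq_ring.append(idx)

    def enter(self, to_submit):
        # Step 4: stand-in for the system call; returns the number consumed.
        consumed = 0
        while consumed < to_submit and self.sq_ring:
            idx = self.sq_ring.popleft()
            op, user_data = self.sqes[idx]
            # Assumption: the operation completes immediately.
            self.cq_ring.append((user_data, op()))
            self.free.append(idx)        # entry may be reused only once consumed
            consumed += 1
        return consumed

    def reap(self):
        # Completion side: drain the ring without any system call.
        done = list(self.cq_ring)
        self.cq_ring.clear()
        return done
```

In this model the only link between a submission and its completion is the @user_data@ value copied from the SQE to the CQE, which is why that field must be filled in step two.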
     89
     90The @io_uring_enter@ system call is protected by a lock inside the kernel. This means that concurrent calls to @io_uring_enter@ using the same instance are possible, but there is no performance gained from parallel calls to @io_uring_enter@. It is possible to do the first three submission steps in parallel; however, doing so requires careful synchronization.
     91
     92@io_uring@ also introduces some constraints on the number of operations that can be ``in flight'' at the same time. Obviously, @sqe@s are allocated from a fixed-size array, meaning that there is a hard limit to how many @sqe@s can be submitted at once. In addition, the @io_uring_enter@ system call can fail because ``The kernel [...] ran out of resources to handle [a request]'' or ``The application is attempting to overcommit the number of requests it can have pending.'' This restriction means that bursts of \io requests may have to be handled by holding back some of the requests so they can be submitted at a later time.
    10493
    10594\subsection{Multiplexing \io: Submission}
    106 The submission side is the most complicated aspect of @io_uring@ and the completion side effectively follows from the design decisions made in the submission side. While it is possible to do the first steps of submission in parallel, the duration of the system call scales with the number of entries submitted. The consequence is that the amount of parallelism used to prepare submissions for the next system call is limited.
    107 Beyond this limit, the length of the system call is the throughput limiting factor. I concluded from early experiments that preparing submissions seems to take about as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}. Therefore the design of the submission engine must manage multiple instances of @io_uring@ running in parallel, effectively sharding @io_uring@ instances. Similarly to scheduling, this sharding can be done privately, \ie, one instance per \gls{proc}, in decoupled pools, \ie, a pool of \glspl{proc} uses a pool of @io_uring@ instances without one-to-one coupling between any given instance and any given \gls{proc}, or some mix of the two. Since completions are sent to the instance where requests were submitted, all instances with pending operations must be polled continuously\footnote{As will be described in Chapter~\ref{practice}, this does not translate into constant CPU usage.}.
     95The submission side is the most complicated aspect of @io_uring@ and the completion side effectively follows from the design decisions made in the submission side.
     96
     97While it is possible to do the first steps of submission in parallel, the duration of the system call scales with the number of entries submitted. The consequence is that the amount of parallelism that can be used to prepare submissions for the next system call is limited. Beyond this limit, the length of the system call will be the throughput limiting factor. I have concluded from early experiments that preparing submissions seems to take about as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}. Therefore the design of the submission engine must manage multiple instances of @io_uring@ running in parallel, effectively sharding @io_uring@ instances. Similarly to scheduling, this sharding can be done privately, \ie, one instance per \gls{proc}, in decoupled pools, \ie, a pool of \glspl{proc} uses a pool of @io_uring@ instances without one-to-one coupling between any given instance and any given \gls{proc}, or some mix of the two. Since completions are sent to the instance where requests were submitted, all instances with pending operations must be polled continuously\footnote{As will be described in Chapter~\ref{practice}, this does not translate into constant CPU usage.}.
    10898
    10999\subsubsection{Shared Instances}
     
    114104Allocation failures need to be pushed up to the routing algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without sufficient @sqe@s available. Furthermore, the routing algorithm should block operations up-front if none of the instances have available @sqe@s.
    115105
    116 Once an SQE is allocated, \glspl{thrd} can fill it normally; they simply need to keep track of the SQE index and the instance it belongs to.
    117 
    118 Once an SQE is filled in, the SQE must be added to the submission ring buffer, an operation that is not thread-safe by itself, and the kernel must be notified using the @io_uring_enter@ system call. The submission ring buffer is the same size as the pre-allocated SQE buffer, therefore pushing to the ring buffer cannot fail\footnote{This is because it is invalid to have the same \lstinline{sqe} multiple times in the ring buffer.}. However, as mentioned, the system call itself can fail with the expectation that it will be retried once some of the already submitted operations complete. Since multiple SQEs can be submitted to the kernel at once, it is important to strike a balance between batching and latency. Operations that are ready to be submitted should be batched together in few system calls, but at the same time, operations should not be left pending for long periods of time before being submitted. This can be handled by either designating one of the submitting \glspl{thrd} as being responsible for the system call for the current batch of SQEs or by having some other party regularly submit all ready SQEs, \eg, the poller \gls{thrd} mentioned later in this section.
    119 
    120 In the case of designating a \gls{thrd}, ideally, when multiple \glspl{thrd} attempt to submit operations to the same @io_uring@ instance, all requests would be batched together and one of the \glspl{thrd}, referred to as the \newterm{submitter}, would do the system call on behalf of the others. In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a current submitter and a next submitter. Indeed, as long as there is a ``next'' submitter, \glspl{thrd} submitting new \io requests can move on, knowing that some future system call will include their request. Once the system call is done, the submitter must also free SQEs so that the allocator can reuse them.
    121 
    122 Finally, the completion side is much simpler since the @io_uring@ system call enforces a natural synchronization point. Polling simply needs to regularly do the system call, go through the produced CQEs and communicate the result back to the originating \glspl{thrd}. Since CQEs only carry a signed 32-bit result, in addition to the copy of the @user_data@ field, all that is needed to communicate the result is a simple future~\cite{wiki:future}. If the submission side does not designate submitters, polling can also submit all SQEs as it is polling events. A simple approach to polling is to allocate a \gls{thrd} per @io_uring@ instance and simply let the poller \glspl{thrd} poll their respective instances when scheduled. This design is especially convenient for reasons explained in Chapter~\ref{practice}.
    123 
    124 <<<<<<< HEAD
     106Once an @sqe@ is allocated, \glspl{thrd} can fill it normally; they simply need to keep track of the @sqe@ index and the instance it belongs to.
     107
     108Once an @sqe@ is filled in, the @sqe@ must be added to the submission ring buffer, an operation that is not thread-safe by itself, and the kernel must be notified using the @io_uring_enter@ system call. The submission ring buffer is the same size as the pre-allocated @sqe@ buffer, therefore pushing to the ring buffer cannot fail\footnote{This is because it is invalid to have the same \lstinline|sqe| multiple times in the ring buffer.}. However, as mentioned, the system call itself can fail with the expectation that it will be retried once some of the already submitted operations complete. Since multiple @sqe@s can be submitted to the kernel at once, it is important to strike a balance between batching and latency. Operations that are ready to be submitted should be batched together in few system calls, but at the same time, operations should not be left pending for long periods of time before being submitted. This can be handled by either designating one of the submitting \glspl{thrd} as being responsible for the system call for the current batch of @sqe@s or by having some other party regularly submit all ready @sqe@s, \eg, the poller \gls{thrd} mentioned later in this section.
     109
     110In the case of designating a \gls{thrd}, ideally, when multiple \glspl{thrd} attempt to submit operations to the same @io_uring@ instance, all requests would be batched together and one of the \glspl{thrd}, referred to as the \newterm{submitter}, would do the system call on behalf of the others. In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a current submitter and a next submitter. Indeed, as long as there is a ``next'' submitter, \glspl{thrd} submitting new \io requests can move on, knowing that some future system call will include their request. Once the system call is done, the submitter must also free @sqe@s so that the allocator can reuse them.
     111
     112Finally, the completion side is much simpler since the @io_uring@ system call enforces a natural synchronization point. Polling simply needs to regularly do the system call, go through the produced @cqe@s and communicate the result back to the originating \glspl{thrd}. Since @cqe@s only carry a signed 32-bit result, in addition to the copy of the @user_data@ field, all that is needed to communicate the result is a simple future~\cite{wiki:future}. If the submission side does not designate submitters, polling can also submit all @sqe@s as it is polling events. A simple approach to polling is to allocate a \gls{thrd} per @io_uring@ instance and simply let the poller \glspl{thrd} poll their respective instances when scheduled. This design is especially convenient for reasons explained in Chapter~\ref{practice}.
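
The future-based hand-off described above can be illustrated as follows. This is a sketch under assumptions, not the engine's implementation: it uses Python's standard @concurrent.futures.Future@ in place of the runtime's own future type, the class @CompletionDispatcher@ and its methods are hypothetical, and CQEs are represented as plain (@user_data@, result) pairs rather than being read from a real completion ring.

```python
from concurrent.futures import Future


class CompletionDispatcher:
    """Maps each user_data value to the future of the request that carries it."""

    def __init__(self):
        self.pending = {}   # user_data -> Future awaiting its result
        self.next_id = 0

    def register(self):
        # Called on the submission side: pick a fresh user_data value and
        # create the future the submitting thread will wait on.
        user_data = self.next_id
        self.next_id += 1
        fut = Future()
        self.pending[user_data] = fut
        return user_data, fut

    def poll(self, cqes):
        # Called by the poller: walk the produced completions and fulfil
        # the matching futures. Returns the number of completions handled.
        handled = 0
        for user_data, res in cqes:
            self.pending.pop(user_data).set_result(res)
            handled += 1
        return handled
```

Completions may arrive in any order, so the lookup is keyed on @user_data@ alone; a negative result would carry an error code and is delivered through the same future.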
     113
    125114With this pool of instances approach, the big advantage is that it is fairly flexible. It does not impose restrictions on what \glspl{thrd} submitting \io operations can and cannot do between allocations and submissions. It also can gracefully handle running out of resources, @sqe@s or the kernel returning @EBUSY@. The down side to this is that many of the steps used for submitting need complex synchronization to work properly. The routing and allocation algorithm needs to keep track of which ring instances have available @sqe@s, block incoming requests if no instance is available, prevent barging if \glspl{thrd} are already queued up waiting for @sqe@s and handle @sqe@s being freed. The submission side needs to safely append @sqe@s to the ring buffer, make sure no @sqe@ is dropped or left pending forever, notify the allocation side when @sqe@s can be reused and handle the kernel returning @EBUSY@. All this synchronization may have a significant cost and, compared to the next approach presented, this synchronization is entirely overhead.
    126115
    127116\subsubsection{Private Instances}
    128117Another approach is to simply create one ring instance per \gls{proc}. This alleviates the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps. This is effectively the same requirement as using @thread_local@ variables. Since @sqe@s that are allocated must be submitted to the same ring, on the same \gls{proc}, this effectively forces the application to submit @sqe@s in allocation order\footnote{The actual requirement is that \glspl{thrd} cannot context switch between allocation and submission. This requirement means that from the subsystem's point of view, the allocation and submission are sequential. To remove this requirement, a \gls{thrd} would need the ability to ``yield to a specific \gls{proc}'', \ie, park with the promise that it will be run next on a specific \gls{proc}, the \gls{proc} attached to the correct ring.}, greatly simplifying both allocation and submission. In this design, allocation and submission form a partitioned ring buffer as shown in Figure~\ref{fig:pring}. Once added to the ring buffer, the attached \gls{proc} has a significant amount of flexibility with regards to when to do the system call. Possible options are: when the \gls{proc} runs out of \glspl{thrd} to run, after running a given number of \glspl{thrd}, etc.
    129 =======
    130 With this pool of instances approach, the big advantage is that it is fairly flexible. It does not impose restrictions on what \glspl{thrd} submitting \io operations can and cannot do between allocations and submissions. It also can gracefully handle running out of resources, SQEs or the kernel returning @EBUSY@. The down side to this is that many of the steps used for submitting need complex synchronization to work properly. The routing and allocation algorithm needs to keep track of which ring instances have available SQEs, block incoming requests if no instance is available, prevent barging if \glspl{thrd} are already queued up waiting for SQEs and handle SQEs being freed. The submission side needs to safely append SQEs to the ring buffer, make sure no SQE is dropped or left pending forever, notify the allocation side when SQEs can be reused and handle the kernel returning @EBUSY@. Sharding the @io_uring@ instances should alleviate much of the contention caused by this, but all this synchronization may still have non-zero cost.
    131 
    132 \subsubsection{Private Instances}
    133 Another approach is to simply create one ring instance per \gls{proc}. This alleviates the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps. This is effectively the same requirement as using @thread_local@ variables. Since SQEs that are allocated must be submitted to the same ring, on the same \gls{proc}, this effectively forces the application to submit SQEs in allocation order\footnote{The actual requirement is that \glspl{thrd} cannot context switch between allocation and submission. This requirement means that from the subsystem's point of view, the allocation and submission are sequential. To remove this requirement, a \gls{thrd} would need the ability to ``yield to a specific \gls{proc}'', \ie, park with the promise that it will be run next on a specific \gls{proc}, the \gls{proc} attached to the correct ring. This is not a current or planned feature of \CFA.}, greatly simplifying both allocation and submission. In this design, allocation and submission form a partitioned ring buffer as shown in Figure~\ref{fig:pring}. Once added to the ring buffer, the attached \gls{proc} has a significant amount of flexibility with regards to when to do the system call. Possible options are: when the \gls{proc} runs out of \glspl{thrd} to run, after running a given number of \glspl{thrd}, etc.
    134 >>>>>>> 1830a8657cb302a89a7ca045bee06baa48b18101
    135118
    136119\begin{figure}
    137120        \centering
    138121        \input{pivot_ring.pstex_t}
    139         \caption[Partitioned ring buffer]{Partitioned ring buffer \smallskip\newline Allocated sqes are appended to the first partition. When submitting, the partition is simply advanced to include all the sqes that should be submitted. The kernel considers the partition as the head of the ring.}
     122        \caption[Partitioned ring buffer]{Partitioned ring buffer \smallskip\newline Allocated sqes are appended to the first partition. When submitting, the partition is simply advanced to include all the sqes that should be submitted. The kernel considers the partition as the head of the ring.}
    140123        \label{fig:pring}
    141124\end{figure}
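
The partitioned ring buffer can be sketched as a ring with one extra index. The following Python model is one reading of the caption above, offered as an illustration only: the class @PartitionedRing@ and the field names @consumed@, @visible@ and @alloc_end@ are hypothetical, and the model is single-threaded because, in this design, only the owning \gls{proc} allocates and submits.

```python
class PartitionedRing:
    """Ring buffer split into a submitted part and an allocated-only part."""

    def __init__(self, size):
        self.slots = [None] * size
        self.consumed = 0    # next entry the kernel will consume
        self.visible = 0     # the partition: the producer index the kernel sees
        self.alloc_end = 0   # end of the allocated but not yet submitted part

    def alloc(self, item):
        # Allocation appends past the partition; the kernel cannot see it yet.
        if self.alloc_end - self.consumed == len(self.slots):
            return False     # every slot is in use
        self.slots[self.alloc_end % len(self.slots)] = item
        self.alloc_end += 1
        return True

    def submit(self):
        # Submission only advances the partition over everything allocated.
        count = self.alloc_end - self.visible
        self.visible = self.alloc_end
        return count

    def kernel_consume(self):
        # Stand-in for the kernel: it reads entries up to the partition only.
        items = []
        while self.consumed < self.visible:
            items.append(self.slots[self.consumed % len(self.slots)])
            self.consumed += 1
        return items
```

Because allocation and submission happen in the same order on the same \gls{proc}, submitting is a single index update, which is what removes the synchronization required by the shared design.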
    142125
    143 <<<<<<< HEAD
    144126This approach has the advantage that it does not require much of the synchronization needed in the shared approach. This comes at the cost that \glspl{thrd} submitting \io operations have less flexibility, since they cannot park or yield, and several exceptional cases are handled poorly. Instances running out of @sqe@s cannot run \glspl{thrd} wanting to do \io operations. In such a case, the \gls{thrd} needs to be moved to a different \gls{proc}; the only current way of achieving this would be to @yield()@ hoping to be scheduled on a different \gls{proc}, which is not guaranteed.
    145127
     
    208190%       if cltr.io.flag || proc.io != alloc.io || proc.io->flag:
    209191%               return submit_slow(cltr.io)
    210 =======
    211 This approach has the advantage that it does not require much of the synchronization needed in the shared approach. This comes at the cost that \glspl{thrd} submitting \io operations have less flexibility, since they cannot park or yield, and several exceptional cases are handled poorly. Instances running out of SQEs cannot run \glspl{thrd} wanting to do \io operations. In such a case, the \gls{thrd} needs to be moved to a different \gls{proc}; the only current way of achieving this would be to @yield()@ hoping to be scheduled on a different \gls{proc}, which is not guaranteed. Another problematic case is that \glspl{thrd} that do not park for long periods of time will delay the submission of any SQE not already submitted. This issue is similar to the fairness issues of schedulers that use work-stealing, mentioned in the previous chapter.
    212 >>>>>>> 1830a8657cb302a89a7ca045bee06baa48b18101
    213192
    214193%       submit_fast(proc.io, a)
     
    235214\subsection{Asynchronous Extension}
    236215
    237 \subsection{Interface directly to \lstinline{io_uring}}
     216\subsection{Interface directly to \lstinline|io_uring|}
  • doc/theses/thierry_delisle_PhD/thesis/thesis.tex

    rf6664bf2 r14533d4  
    1 %======================================================================
    2 % University of Waterloo Thesis Template for LaTeX
    3 % Last Updated November, 2020
    4 % by Stephen Carr, IST Client Services,
    5 % University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada
    6 % FOR ASSISTANCE, please send mail to request@uwaterloo.ca
     1% uWaterloo Thesis Template for LaTeX
     2% Last Updated June 14, 2017 by Stephen Carr, IST Client Services
     3% FOR ASSISTANCE, please send mail to rt-IST-CSmathsci@ist.uwaterloo.ca
     4
     5% Effective October 2006, the University of Waterloo
     6% requires electronic thesis submission. See the uWaterloo thesis regulations at
     7% https://uwaterloo.ca/graduate-studies/thesis.
     8
     9% DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package
     10% configuration below. THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT.
     11% You can view the information if you view Properties of the PDF document.
     12
     13% Many faculties/departments also require one or more printed
     14% copies. This template attempts to satisfy both types of output.
     15% It is based on the standard "book" document class which provides all necessary
     16% sectioning structures and allows multi-part theses.
    717
    818% DISCLAIMER
    9 % To the best of our knowledge, this template satisfies the current uWaterloo thesis requirements.
    10 % However, it is your responsibility to assure that you have met all requirements of the University and your particular department.
    11 
    12 % Many thanks for the feedback from many graduates who assisted the development of this template.
    13 % Also note that there are explanatory comments and tips throughout this template.
    14 %======================================================================
    15 % Some important notes on using this template and making it your own...
    16 
    17 % The University of Waterloo has required electronic thesis submission since October 2006.
    18 % See the uWaterloo thesis regulations at
    19 % https://uwaterloo.ca/graduate-studies/thesis.
    20 % This thesis template is geared towards generating a PDF version optimized for viewing on an electronic display, including hyperlinks within the PDF.
    21 
    22 % DON'T FORGET TO ADD YOUR OWN NAME AND TITLE in the "hyperref" package configuration below.
    23 % THIS INFORMATION GETS EMBEDDED IN THE PDF FINAL PDF DOCUMENT.
    24 % You can view the information if you view properties of the PDF document.
    25 
    26 % Many faculties/departments also require one or more printed copies.
    27 % This template attempts to satisfy both types of output.
    28 % See additional notes below.
    29 % It is based on the standard "book" document class which provides all necessary sectioning structures and allows multi-part theses.
    30 
    31 % If you are using this template in Overleaf (cloud-based collaboration service), then it is automatically processed and previewed for you as you edit.
    32 
    33 % For people who prefer to install their own LaTeX distributions on their own computers, and process the source files manually, the following notes provide the sequence of tasks:
    34  
     19% To the best of our knowledge, this template satisfies the current uWaterloo requirements.
     20% However, it is your responsibility to assure that you have met all
     21% requirements of the University and your particular department.
     22% Many thanks for the feedback from many graduates that assisted the development of this template.
     23
     24% -----------------------------------------------------------------------
     25
     26% By default, output is produced that is geared toward generating a PDF
     27% version optimized for viewing on an electronic display, including
     28% hyperlinks within the PDF.
     29
    3530% E.g. to process a thesis called "mythesis.tex" based on this template, run:
    3631
    3732% pdflatex mythesis     -- first pass of the pdflatex processor
    3833% bibtex mythesis       -- generates bibliography from .bib data file(s)
    39 % makeindex         -- should be run only if an index is used 
     34% makeindex         -- should be run only if an index is used
    4035% pdflatex mythesis     -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc.
    41 % pdflatex mythesis     -- it takes a couple of passes to completely process all cross-references
    42 
    43 % If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu).
    44 % Then click the PDFLaTeX button two more times.
    45 % If you have an index as well,you'll need to run MakeIndex from the Tools menu as well, before running pdflatex
     36% pdflatex mythesis     -- fixes numbering in cross-references, bibliographic references, glossaries, index, etc.
     37
     38% If you use the recommended LaTeX editor, Texmaker, you would open the mythesis.tex
     39% file, then click the PDFLaTeX button. Then run BibTeX (under the Tools menu).
     40% Then click the PDFLaTeX button two more times. If you have an index as well,
     41% you'll need to run MakeIndex from the Tools menu as well, before running pdflatex
    4642% the last two times.
    4743
    48 % N.B. The "pdftex" program allows graphics in the following formats to be included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF
    49 % Tip: Generate your figures and photos in the size you want them to appear in your thesis, rather than scaling them with \includegraphics options.
    50 % Tip: Any drawings you do should be in scalable vector graphic formats: SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in the final PDF as well.
    51 % Tip: Photographs should be cropped and compressed so as not to be too large.
    52 
    53 % To create a PDF output that is optimized for double-sided printing:
    54 % 1) comment-out the \documentclass statement in the preamble below, and un-comment the second \documentclass line.
    55 % 2) change the value assigned below to the boolean variable "PrintVersion" from "false" to "true".
    56 
    57 %======================================================================
    58 %   D O C U M E N T   P R E A M B L E
    59 % Specify the document class, default style attributes, and page dimensions, etc.
     44% N.B. The "pdftex" program allows graphics in the following formats to be
     45% included with the "\includegraphics" command: PNG, PDF, JPEG, TIFF
     46% Tip 1: Generate your figures and photos in the size you want them to appear
     47% in your thesis, rather than scaling them with \includegraphics options.
     48% Tip 2: Any drawings you do should be in scalable vector graphic formats:
     49% SVG, PNG, WMF, EPS and then converted to PNG or PDF, so they are scalable in
     50% the final PDF as well.
     51% Tip 3: Photographs should be cropped and compressed so as not to be too large.
     52
     53% To create a PDF output that is optimized for double-sided printing:
     54%
     55% 1) comment-out the \documentclass statement in the preamble below, and
     56% un-comment the second \documentclass line.
     57%
     58% 2) change the value assigned below to the boolean variable
     59% "PrintVersion" from "false" to "true".
     60
     61% --------------------- Start of Document Preamble -----------------------
     62
     63% Specify the document class, default style attributes, and page dimensions
    6064% For hyperlinked PDF, suitable for viewing on a computer, use this:
    6165\documentclass[letterpaper,12pt,titlepage,oneside,final]{book}
    6266
    63 % For PDF, suitable for double-sided printing, change the PrintVersion variable below to "true" and use this \documentclass line instead of the one above:
     67% For PDF, suitable for double-sided printing, change the PrintVersion variable below
     68% to "true" and use this \documentclass line instead of the one above:
    6469%\documentclass[letterpaper,12pt,titlepage,openright,twoside,final]{book}
    6570
    66 % Some LaTeX commands I define for my own nomenclature.
    67 % If you have to, it's easier to make changes to nomenclature once here than in a million places throughout your thesis!
    68 \newcommand{\package}[1]{\textbf{#1}} % package names in bold text
    69 \newcommand{\cmmd}[1]{\textbackslash\texttt{#1}} % command name in tt font
    70 \newcommand{\href}[1]{#1} % does nothing, but defines the command so the print-optimized version will ignore \href tags (redefined by hyperref pkg).
    71 %\newcommand{\texorpdfstring}[2]{#1} % does nothing, but defines the command
    72 % Anything defined here may be redefined by packages added below...
     71\newcommand{\href}[1]{#1} % does nothing, but defines the command so the
     72    % print-optimized version will ignore \href tags (redefined by hyperref pkg).
    7373
    7474% This package allows if-then-else control structures.
     
    7676\newboolean{PrintVersion}
    7777\setboolean{PrintVersion}{false}
    78 % CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies by overriding some options of the hyperref package, called below.
     78% CHANGE THIS VALUE TO "true" as necessary, to improve printed results for hard copies
     79% by overriding some options of the hyperref package below.
    7980
    8081%\usepackage{nomencl} % For a nomenclature (optional; available from ctan.org)
     
    8485
    8586% Hyperlinks make it very easy to navigate an electronic document.
    86 % In addition, this is where you should specify the thesis title and author as they appear in the properties of the PDF document.
     87% In addition, this is where you should specify the thesis title
     88% and author as they appear in the properties of the PDF document.
    8789% Use the "hyperref" package
    8890% N.B. HYPERREF MUST BE THE LAST PACKAGE LOADED; ADD ADDITIONAL PKGS ABOVE
    8991\usepackage[pagebackref=false]{hyperref} % with basic options
    90 %\usepackage[pdftex,pagebackref=true]{hyperref}
    91 % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing.
     92                % N.B. pagebackref=true provides links back from the References to the body text. This can cause trouble for printing.
    9293\hypersetup{
    9394        plainpages=false,       % needed if Roman numbers in frontpages
    94         unicode=false,          % non-Latin characters in Acrobat's bookmarks
    95         pdftoolbar=true,        % show Acrobat's toolbar?
    96         pdfmenubar=true,        % show Acrobat's menu?
     95        unicode=false,          % non-Latin characters in Acrobats bookmarks
     96        pdftoolbar=true,        % show Acrobats toolbar?
     97        pdfmenubar=true,        % show Acrobats menu?
    9798        pdffitwindow=false,     % window fit to page when opened
    9899        pdfstartview={FitH},    % fits the width of the page to the window
     
    110111\ifthenelse{\boolean{PrintVersion}}{   % for improved print quality, change some hyperref options
    111112\hypersetup{    % override some previously defined hyperref options
    112         citecolor=black,%
    113         filecolor=black,%
    114         linkcolor=black,%
     113        citecolor=black,
     114        filecolor=black,
     115        linkcolor=black,
    115116        urlcolor=black
    116117}}{} % end of ifthenelse (no else)
     
    135136
    136137% Setting up the page margins...
    137 \setlength{\textheight}{9in}
    138 \setlength{\topmargin}{-0.45in}
    139 \setlength{\headsep}{0.25in}
     138\setlength{\textheight}{9in}\setlength{\topmargin}{-0.45in}\setlength{\headsep}{0.25in}
    140139% uWaterloo thesis requirements specify a minimum of 1 inch (72pt) margin at the
    141 % top, bottom, and outside page edges and a 1.125 in. (81pt) gutter margin (on binding side).
    142 % While this is not an issue for electronic viewing, a PDF may be printed, and so we have the same page layout for both printed and electronic versions, we leave the gutter margin in.
     140% top, bottom, and outside page edges and a 1.125 in. (81pt) gutter
     141% margin (on binding side). While this is not an issue for electronic
     142% viewing, a PDF may be printed, and so we have the same page layout for
     143% both printed and electronic versions, we leave the gutter margin in.
    143144% Set margins to minimum permitted by uWaterloo thesis regulations:
    144145\setlength{\marginparwidth}{0pt} % width of margin notes
     
    149150\setlength{\evensidemargin}{0.125in} % Adds 1/8 in. to binding side of all
    150151% even-numbered pages when the "twoside" printing option is selected
    151 \setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages when "oneside" printing is selected, and to the left of all odd-numbered pages when "twoside" printing is selected
    152 \setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and side margins as above
     152\setlength{\oddsidemargin}{0.125in} % Adds 1/8 in. to the left of all pages
     153% when "oneside" printing is selected, and to the left of all odd-numbered
     154% pages when "twoside" printing is selected
     155\setlength{\textwidth}{6.375in} % assuming US letter paper (8.5 in. x 11 in.) and
     156% side margins as above
    153157\raggedbottom
    154158
    155 % The following statement specifies the amount of space between paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount.
     159% The following statement specifies the amount of space between
     160% paragraphs. Other reasonable specifications are \bigskipamount and \smallskipamount.
    156161\setlength{\parskip}{\medskipamount}
    157162
    158 % The following statement controls the line spacing.
    159 % The default spacing corresponds to good typographic conventions and only slight changes (e.g., perhaps "1.2"), if any, should be made.
     163% The following statement controls the line spacing.  The default
     164% spacing corresponds to good typographic conventions and only slight
     165% changes (e.g., perhaps "1.2"), if any, should be made.
    160166\renewcommand{\baselinestretch}{1} % this is the default line space setting
    161167
    162 % By default, each chapter will start on a recto (right-hand side) page.
    163 % We also force each section of the front pages to start on a recto page by inserting \cleardoublepage commands.
    164 % In many cases, this will require that the verso (left-hand) page be blank, and while it should be counted, a page number should not be printed.
    165 % The following statements ensure a page number is not printed on an otherwise blank verso page.
     168% By default, each chapter will start on a recto (right-hand side)
     169% page.  We also force each section of the front pages to start on
     170% a recto page by inserting \cleardoublepage commands.
     171% In many cases, this will require that the verso page be
     172% blank and, while it should be counted, a page number should not be
     173% printed.  The following statements ensure a page number is not
     174% printed on an otherwise blank verso page.
    166175\let\origdoublepage\cleardoublepage
    167176\newcommand{\clearemptydoublepage}{%
     
    195204\input{common}
    196205\CFAStyle                                               % CFA code-style for all languages
    197 \lstset{language=CFA,basicstyle=\linespread{0.9}\tt}    % CFA default language
     206\lstset{basicstyle=\linespread{0.9}\tt}
    198207
    199208% glossary of terms to use
     
    201210\makeindex
    202211
    203 \newcommand\io{\glsxtrshort{io}\xspace}%
     212\newcommand\io{\glsxtrshort{io}}%
    204213
    205214%======================================================================
    206 %   L O G I C A L    D O C U M E N T
    207 % The logical document contains the main content of your thesis.
    208 % Being a large document, it is a good idea to divide your thesis into several files, each one containing one chapter or other significant chunk of content, so you can easily shuffle things around later if desired.
     215%   L O G I C A L    D O C U M E N T -- the content of your thesis
    209216%======================================================================
    210217\begin{document}
    211218
     219% For a large document, it is a good idea to divide your thesis
     220% into several files, each one containing one chapter.
     221% To illustrate this idea, the "front pages" (i.e., title page,
     222% declaration, borrowers' page, abstract, acknowledgements,
     223% dedication, table of contents, list of tables, list of figures,
     224% nomenclature) are contained within the file "uw-ethesis-frontpgs.tex" which is
     225% included into the document by the following statement.
    212226%----------------------------------------------------------------------
    213227% FRONT MATERIAL
    214 % title page,declaration, borrowers' page, abstract, acknowledgements,
    215 % dedication, table of contents, list of tables, list of figures, nomenclature, etc.
    216228%----------------------------------------------------------------------
    217229\input{text/front.tex}
    218230
     231
    219232%----------------------------------------------------------------------
    220233% MAIN BODY
    221 % We suggest using a separate file for each chapter of your thesis.
    222 % Start each chapter file with the \chapter command.
    223 % Only use \documentclass or \begin{document} and \end{document} commands in this master document.
    224 % Tip: Putting each sentence on a new line is a way to simplify later editing.
    225 %----------------------------------------------------------------------
    226 
     234%----------------------------------------------------------------------
     235% Because this is a short document, and to reduce the number of files
     236% needed for this template, the chapters are not separate
     237% documents as suggested above, but you get the idea. If they were
     238% separate documents, they would each start with the \chapter command, i.e.,
     239% do not contain \documentclass or \begin{document} and \end{document} commands.
    227240\part{Introduction}
    228241\input{text/intro.tex}
     
    242255%----------------------------------------------------------------------
    243256% END MATERIAL
    244 % Bibliography, Appendices, Index, etc.
    245 %----------------------------------------------------------------------
    246 
    247 % Bibliography
    248 
    249 % The following statement selects the style to use for references.
    250 % It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels.
     257%----------------------------------------------------------------------
     258
     259% B I B L I O G R A P H Y
     260% -----------------------
     261
     262% The following statement selects the style to use for references.  It controls the sort order of the entries in the bibliography and also the formatting for the in-text labels.
    251263\bibliographystyle{plain}
    252264% This specifies the location of the file containing the bibliographic information.
    253 % It assumes you're using BibTeX to manage your references (if not, why not?).
    254 \cleardoublepage % This is needed if the "book" document class is used, to place the anchor in the correct page, because the bibliography will start on its own page.
    255 % Use \clearpage instead if the document class uses the "oneside" argument
     265% It assumes you're using BibTeX (if not, why not?).
     266\cleardoublepage % This is needed if the book class is used, to place the anchor in the correct page,
     267                 % because the bibliography will start on its own page.
     268                 % Use \clearpage instead if the document class uses the "oneside" argument
    256269\phantomsection  % With hyperref package, enables hyperlinking from the table of contents to bibliography
    257270% The following statement causes the title "References" to be used for the bibliography section:
     
    262275
    263276\bibliography{local,pl}
    264 % Tip: You can create multiple .bib files to organize your references.
     277% Tip 5: You can create multiple .bib files to organize your references.
    265278% Just list them all in the \bibliography command, separated by commas (no spaces).
    266279
    267 % The following statement causes the specified references to be added to the bibliography even if they were not cited in the text.
    268 % The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional).
     280% % The following statement causes the specified references to be added to the bibliography even if they were not
     281% % cited in the text. The asterisk is a wildcard that causes all entries in the bibliographic database to be included (optional).
    269282% \nocite{*}
    270 %----------------------------------------------------------------------
    271 
    272 % Appendices
    273283
    274284% The \appendix statement indicates the beginning of the appendices.
    275285\appendix
    276 % Add an un-numbered title page before the appendices and a line in the Table of Contents
     286% Add a title page before the appendices and a line in the Table of Contents
    277287\chapter*{APPENDICES}
    278288\addcontentsline{toc}{chapter}{APPENDICES}
    279 % Appendices are just more chapters, with different labeling (letters instead of numbers).
    280289%======================================================================
    281290\chapter[PDF Plots From Matlab]{Matlab Code for Making a PDF Plot}
     
    315324%\input{thesis.ind}                             % index
    316325
    317 \phantomsection         % allows hyperref to link to the correct page
    318 
    319 %----------------------------------------------------------------------
    320 \end{document} % end of logical document
     326\phantomsection
     327
     328\end{document}
  • doc/user/user.tex

    rf6664bf2 r14533d4  
    1111%% Created On       : Wed Apr  6 14:53:29 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Mon Feb 15 13:48:53 2021
    14 %% Update Count     : 4452
     13%% Last Modified On : Mon Feb  8 21:53:31 2021
     14%% Update Count     : 4327
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1616
     
    105105
    106106\author{
    107 \huge \CFA Team (past and present) \medskip \\
     107\huge \CFA Team \medskip \\
    108108\Large Andrew Beach, Richard Bilson, Michael Brooks, Peter A. Buhr, Thierry Delisle, \smallskip \\
    109109\Large Glen Ditchfield, Rodolfo G. Esteves, Aaron Moss, Colby Parsons, Rob Schluntz, \smallskip \\
     
    129129\vspace*{\fill}
    130130\noindent
    131 \copyright\,2016, 2018, 2021 \CFA Project \\ \\
     131\copyright\,2016 \CFA Project \\ \\
    132132\noindent
    133133This work is licensed under the Creative Commons Attribution 4.0 International License.
     
    970970\hline
    971971\begin{cfa}
    972 while @($\,$)@ { sout | "empty"; break; }
    973 do { sout | "empty"; break; } while @($\,$)@;
    974 for @($\,$)@ { sout | "empty"; break; }
     972while @()@ { sout | "empty"; break; }
     973do { sout | "empty"; break; } while @()@;
     974for @()@ { sout | "empty"; break; }
    975975for ( @0@ ) { sout | "A"; } sout | "zero";
    976976for ( @1@ ) { sout | "A"; }
     
    11451145\subsection{\texorpdfstring{Labelled \LstKeywordStyle{continue} / \LstKeywordStyle{break} Statement}{Labelled continue / break Statement}}
    11461146
    1147 C ©continue© and ©break© statements, for altering control flow, are restricted to one level of nesting for a particular control structure.
    1148 This restriction forces programmers to use \Indexc{goto} to achieve the equivalent control-flow for more than one level of nesting.
     1147While C provides ©continue© and ©break© statements for altering control flow, both are restricted to one level of nesting for a particular control structure.
     1148Unfortunately, this restriction forces programmers to use \Indexc{goto} to achieve the equivalent control-flow for more than one level of nesting.
    11491149To prevent having to switch to the ©goto©, \CFA extends the \Indexc{continue}\index{continue@©continue©!labelled}\index{labelled!continue@©continue©} and \Indexc{break}\index{break@©break©!labelled}\index{labelled!break@©break©} with a target label to support static multi-level exit\index{multi-level exit}\index{static multi-level exit}~\cite{Buhr85}, as in Java.
    11501150For both ©continue© and ©break©, the target label must be directly associated with a ©for©, ©while© or ©do© statement;
    11511151for ©break©, the target label can also be associated with a ©switch©, ©if© or compound (©{}©) statement.
    1152 \VRef[Figure]{f:MultiLevelExit} shows a comparison between labelled ©continue© and ©break© and the corresponding C equivalent using ©goto© and labels.
     1152\VRef[Figure]{f:MultiLevelExit} shows ©continue© and ©break© indicating the specific control structure, and the corresponding C program using only ©goto© and labels.
    11531153The innermost loop has 8 exit points, which cause continuation or termination of one or more of the 7 \Index{nested control-structure}s.
    11541154
     
    12151215\end{lrbox}
    12161216
     1217\hspace*{-10pt}
    12171218\subfloat[\CFA]{\label{f:CFibonacci}\usebox\myboxA}
    1218 \hspace{3pt}
     1219\hspace{2pt}
    12191220\vrule
    1220 \hspace{3pt}
    12211221\subfloat[C]{\label{f:CFAFibonacciGen}\usebox\myboxB}
    12221222\caption{Multi-level Exit}
     
    12331233This restriction prevents missing declarations and/or initializations at the start of a control structure resulting in undefined behaviour.
    12341234\end{itemize}
    1235 The advantage of the labelled ©continue©/©break© is allowing static multi-level exits without having to use the ©goto© statement, and tying control flow to the target control structure rather than an arbitrary point in a program via a label.
     1235The advantage of the labelled ©continue©/©break© is allowing static multi-level exits without having to use the ©goto© statement, and tying control flow to the target control structure rather than an arbitrary point in a program.
    12361236Furthermore, the location of the label at the \emph{beginning} of the target control structure informs the reader (\Index{eye candy}) that complex control-flow is occurring in the body of the control structure.
    12371237With ©goto©, the label is at the end of the control structure, which fails to convey this important clue early enough to the reader.
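The ©goto© emulation mentioned above can be sketched in plain C. This is a minimal illustration only, not code from the manual: the function ©find_first©, the constants ©ROWS© and ©COLS©, and the label ©done© are hypothetical names. Note that the label necessarily sits \emph{after} the loops, so a reader only learns about the multi-level exit on reaching the end of the control structure.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* Search a matrix for target. On success, store the position of the first
   match (row-major order) and return 1; otherwise return 0.
   Plain C break leaves only the inner loop, so a goto is used to leave
   both loops at once (a two-level exit). */
int find_first( int m[ROWS][COLS], int target, size_t * row, size_t * col ) {
	assert( row != NULL && col != NULL );
	int found = 0;
	for ( size_t i = 0; i < ROWS; i += 1 ) {
		for ( size_t j = 0; j < COLS; j += 1 ) {
			if ( m[i][j] == target ) {
				*row = i;
				*col = j;
				found = 1;
				goto done;			/* exits inner and outer loop */
			}
		}
	}
  done:								/* label appears at the end, not the start */
	return found;
}
```

With a labelled ©break©, as described in the text, the label would instead be attached to the outer ©for© statement, so the exit target is visible before the loop body is read.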
     
    12401240
    12411241
    1242 %\subsection{\texorpdfstring{\protect\lstinline@with@ Statement}{with Statement}}
    1243 \subsection{\texorpdfstring{\LstKeywordStyle{with} Statement}{with Statement}}
     1242%\section{\texorpdfstring{\protect\lstinline@with@ Statement}{with Statement}}
     1243\section{\texorpdfstring{\LstKeywordStyle{with} Statement}{with Statement}}
    12441244\label{s:WithStatement}
    12451245
    1246 Grouping heterogeneous data into an \newterm{aggregate} (structure/union) is a common programming practice, and aggregates may be nested:
    1247 \begin{cfa}
    1248 struct Person {                                                         $\C{// aggregate}$
    1249         struct Name { char first[20], last[20]; } name; $\C{// nesting}$
    1250         struct Address { ... } address;                 $\C{// nesting}$
    1251         int sex;
     1246Grouping heterogeneous data into \newterm{aggregate}s (structure/union) is a common programming practice, and an aggregate can be further organized into more complex structures, such as arrays and containers:
     1247\begin{cfa}
     1248struct S { $\C{// aggregate}$
     1249        char c; $\C{// fields}$
     1250        int i;
     1251        double d;
    12521252};
    1253 \end{cfa}
    1254 Functions manipulating aggregates must repeat the aggregate name to access its containing fields.
    1255 \begin{cfa}
    1256 Person p;
    1257 @p.@name; @p.@address; @p.@sex; $\C{// access containing fields}$
    1258 \end{cfa}
    1259 which extends to multiple levels of qualification for nested aggregates and multiple aggregates.
    1260 \begin{cfa}
    1261 struct Ticket { ... } t;
    1262 @p.name@.first; @p.address@.street;             $\C{// access nested fields}$
    1263 @t.@departure; @t.@cost;                                $\C{// access multiple aggregate}$
    1264 \end{cfa}
    1265 Repeated aggregate qualification is tedious and makes code difficult to read.
    1266 Therefore, reducing aggregate qualification is a useful language design goal.
    1267 
    1268 C allows unnamed nested aggregates that open their scope into the containing aggregate.
    1269 This feature is used to group fields for attributes and/or with ©union© aggregates.
    1270 \begin{cfa}
    1271 struct S {
    1272         struct { int g,  h; } __attribute__(( aligned(64) ));
    1273         int tag;
    1274         union {
    1275                 struct { char c1,  c2; } __attribute__(( aligned(128) ));
    1276                 struct { int i1,  i2; };
    1277                 struct { double d1,  d2; };
    1278         };
    1279 };
    1280 s.g; s.h; s.tag; s.c1; s.c2; s.i1; s.i2; s.d1; s.d2;
    1281 \end{cfa}
    1282 
    1283 Object-oriented languages reduce qualification for class variables within member functions, \eg \CC:
     1253S s, as[10];
     1254\end{cfa}
     1255However, functions manipulating aggregates must repeat the aggregate name to access its containing fields:
     1256\begin{cfa}
     1257void f( S s ) {
     1258        @s.@c; @s.@i; @s.@d; $\C{// access containing fields}$
     1259}
     1260\end{cfa}
     1261which extends to multiple levels of qualification for nested aggregates.
     1262A similar situation occurs in object-oriented programming, \eg \CC:
    12841263\begin{C++}
    12851264struct S {
    1286         char @c@;   int @i@;   double @d@;
    1287         void f( /* S * this */ ) {                              $\C{// implicit ``this'' parameter}$
    1288                 @c@;   @i@;   @d@;                                      $\C{// this->c; this->i; this->d;}$
     1265        char c; $\C{// fields}$
     1266        int i;
     1267        double d;
     1268        void f() { $\C{// implicit ``this'' aggregate}$
     1269                @this->@c; @this->@i; @this->@d; $\C{// access containing fields}$
    12891270        }
    12901271}
    12911272\end{C++}
    1292 In general, qualification is elided for the variables and functions in the lexical scopes visible from a member function.
    1293 However, qualification is necessary for name shadowing and explicit aggregate parameters.
    1294 \begin{cfa}
    1295 struct T {
    1296         char @m@;   int @i@;   double @n@;              $\C{// derived class variables}$
    1297 };
    1298 struct S : public T {
    1299         char @c@;   int @i@;   double @d@;              $\C{// class variables}$
    1300         void g( double @d@, T & t ) {
    1301                 d;   @t@.m;   @t@.i;   @t@.n;           $\C{// function parameter}$
    1302                 c;   i;   @this->@d;   @S::@d;          $\C{// class S variables}$
    1303                 m;   @T::@i;   n;                                       $\C{// class T variables}$
    1304         }
    1305 };
    1306 \end{cfa}
    1307 Note the three different forms of qualification syntax in \CC, ©.©, ©->©, ©::©, which is confusing.
    1308 
    1309 Since \CFA is not object-oriented, it has no implicit parameter with its implicit qualification.
    1310 Instead \CFA introduces a general mechanism using the ©with© statement \see{Pascal~\cite[\S~4.F]{Pascal}} to explicitly elide aggregate qualification by opening a scope containing the field identifiers.
    1311 Hence, the qualified fields become variables, with the side-effect that code is simpler to write and easier to read, and field references in a block are easier to optimize.
    1312 \begin{cfa}
    1313 void f( S & this ) @with ( this )@ {            $\C{// with statement}$
    1314         @c@;   @i@;   @d@;                                              $\C{// this.c, this.i, this.d}$
     1273Object-oriented nesting of member functions in a \lstinline[language=C++]@class/struct@ allows eliding \lstinline[language=C++]@this->@ because of lexical scoping.
     1274However, for other aggregate parameters, qualification is necessary:
     1275\begin{cfa}
     1276struct T { double m, n; };
     1277int S::f( T & t ) { $\C{// multiple aggregate parameters}$
     1278        c; i; d; $\C{\R{// this--{\textgreater}c, this--{\textgreater}i, this--{\textgreater}d}}$
     1279        @t.@m; @t.@n; $\C{// must qualify}$
     1280}
     1281\end{cfa}
     1282
     1283To simplify the programmer experience, \CFA provides a ©with© statement \see{Pascal~\cite[\S~4.F]{Pascal}} to elide aggregate qualification to fields by opening a scope containing the field identifiers.
     1284Hence, the qualified fields become variables with the side-effect that it is easier to optimize field references in a block.
     1285\begin{cfa}
     1286void f( S & this ) @with ( this )@ { $\C{// with statement}$
     1287        c; i; d; $\C{\R{// this.c, this.i, this.d}}$
    13151288}
    13161289\end{cfa}
    13171290with the generality of opening multiple aggregate-parameters:
    13181291\begin{cfa}
    1319 void g( S & s, T & t ) @with ( s, t )@ {        $\C{// multiple aggregate parameters}$
    1320         c;   @s.@i;   d;                                                $\C{// s.c, s.i, s.d}$
    1321         m;   @t.@i;   n;                                                $\C{// t.m, t.i, t.n}$
    1322 }
    1323 \end{cfa}
    1324 where qualification is only necessary to disambiguate the shadowed variable ©i©.
    1325 
    1326 In detail, the ©with© statement may appear as the body of a function or nested within a function body.
    1327 The ©with© clause takes a list of expressions, where each expression provides an aggregate type and object.
     1292void f( S & s, T & t ) @with ( s, t )@ { $\C{// multiple aggregate parameters}$
     1293        c; i; d; $\C{\R{// s.c, s.i, s.d}}$
     1294        m; n; $\C{\R{// t.m, t.n}}$
     1295}
     1296\end{cfa}
     1297
     1298In detail, the ©with© statement has the form:
     1299\begin{cfa}
     1300$\emph{with-statement}$:
     1301        'with' '(' $\emph{expression-list}$ ')' $\emph{compound-statement}$
     1302\end{cfa}
     1303and may appear as the body of a function or nested within a function body.
     1304Each expression in the expression-list provides a type and object.
     1305The type must be an aggregate type.
    13281306(Enumerations are already opened.)
    1329 To open a pointer type, the pointer must be dereferenced to obtain a reference to the aggregate type.
    1330 \begin{cfa}
    1331 S * sp;
    1332 with ( *sp ) { ... }
    1333 \end{cfa}
    1334 The expression object is the implicit qualifier for the open structure-fields.
    1335 \CFA's ability to overload variables \see{\VRef{s:VariableOverload}} and use the left-side of assignment in type resolution means most fields with the same name but different types are automatically disambiguated, eliminating qualification.
     1307The object is the implicit qualifier for the open structure-fields.
     1308
    13361309All expressions in the expression list are open in parallel within the compound statement.
    13371310This semantic is different from Pascal, which nests the openings from left to right.
    13381311The difference between parallel and nesting occurs for fields with the same name and type:
    13391312\begin{cfa}
    1340 struct Q { int @i@; int k; int @m@; } q, w;
    1341 struct R { int @i@; int j; double @m@; } r, w;
    1342 with ( r, q ) {
    1343         j + k;                                                                  $\C{// unambiguous, r.j + q.k}$
    1344         m = 5.0;                                                                $\C{// unambiguous, r.m = 5.0}$
    1345         m = 1;                                                                  $\C{// unambiguous, q.m = 1}$
    1346         int a = m;                                                              $\C{// unambiguous, a = q.m}$
    1347         double b = m;                                                   $\C{// unambiguous, b = r.m}$
    1348         int c = r.i + q.i;                                              $\C{// disambiguate with qualification}$
    1349         (double)m;                                                              $\C{// disambiguate with cast}$
    1350 }
    1351 \end{cfa}
    1352 For parallel semantics, both ©r.i© and ©q.i© are visible, so ©i© is ambiguous without qualification;
    1353 for nested semantics, ©q.i© hides ©r.i©, so ©i© implies ©q.i©.
    1354 Pascal nested-semantics is possible by nesting ©with© statements.
    1355 \begin{cfa}
    1356 with ( r ) {
    1357         i;                                                                              $\C{// unambiguous, r.i}$
    1358         with ( q ) {
    1359                 i;                                                                      $\C{// unambiguous, q.i}$
    1360         }
    1361 }
    1362 \end{cfa}
    1363 A cast or qualification can be used to disambiguate variables within a ©with© \emph{statement}.
    1364 A cast can be used to disambiguate among overload variables in a ©with© \emph{expression}:
    1365 \begin{cfa}
    1366 with ( w ) { ... }                                                      $\C{// ambiguous, same name and no context}$
    1367 with ( (Q)w ) { ... }                                           $\C{// unambiguous, cast}$
    1368 \end{cfa}
    1369 Because there is no left-side in the ©with© expression to implicitly disambiguate between the ©w© variables, it is necessary to explicitly disambiguate by casting ©w© to type ©Q© or ©R©.
    1370 
    1371 Finally, there is an interesting problem between parameters and the function-body ©with©, \eg:
     1313struct S { int @i@; int j; double m; } s, w;
     1314struct T { int @i@; int k; int m; } t, w;
     1315with ( s, t ) {
     1316        j + k; $\C{// unambiguous, s.j + t.k}$
     1317        m = 5.0; $\C{// unambiguous, s.m = 5.0}$
     1318        m = 1; $\C{// unambiguous, t.m = 1}$
     1319        int a = m; $\C{// unambiguous, a = t.m}$
     1320        double b = m; $\C{// unambiguous, b = s.m}$
     1321        int c = s.i + t.i; $\C{// unambiguous, qualification}$
     1322        (double)m; $\C{// unambiguous, cast}$
     1323}
     1324\end{cfa}
     1325For parallel semantics, both ©s.i© and ©t.i© are visible, so ©i© is ambiguous without qualification;
     1326for nested semantics, ©t.i© hides ©s.i©, so ©i© implies ©t.i©.
     1327\CFA's ability to overload variables means fields with the same name but different types are automatically disambiguated, eliminating most qualification when opening multiple aggregates.
     1328Qualification or a cast is used to disambiguate.
     1329
     1330There is an interesting problem between parameters and the function-body ©with©, \eg:
    13721331\begin{cfa}
    13731332void ?{}( S & s, int i ) with ( s ) { $\C{// constructor}$
     
    13851344and implicitly opened \emph{after} a function-body open, to give them higher priority:
    13861345\begin{cfa}
    1387 void ?{}( S & s, int @i@ ) with ( s ) @with( $\emph{\R{params}}$ )@ { // syntax not allowed, illustration only
     1346void ?{}( S & s, int @i@ ) with ( s ) @with( $\emph{\R{params}}$ )@ {
    13881347        s.i = @i@; j = 3; m = 5.5;
    13891348}
    13901349\end{cfa}
    1391 This implicit semantic matches with programmer expectation.
    1392 
     1350Finally, a cast may be used to disambiguate among overload variables in a ©with© expression:
     1351\begin{cfa}
     1352with ( w ) { ... } $\C{// ambiguous, same name and no context}$
     1353with ( (S)w ) { ... } $\C{// unambiguous, cast}$
     1354\end{cfa}
     1355and ©with© expressions may be complex expressions of reference type \see{\VRef{s:References}} to an aggregate:
     1356% \begin{cfa}
     1357% struct S { int i, j; } sv;
     1358% with ( sv ) { $\C{// implicit reference}$
     1359%       S & sr = sv;
     1360%       with ( sr ) { $\C{// explicit reference}$
     1361%               S * sp = &sv;
     1362%               with ( *sp ) { $\C{// computed reference}$
     1363%                       i = 3; j = 4; $\C{\color{red}// sp--{\textgreater}i, sp--{\textgreater}j}$
     1364%               }
     1365%               i = 2; j = 3; $\C{\color{red}// sr.i, sr.j}$
     1366%       }
     1367%       i = 1; j = 2; $\C{\color{red}// sv.i, sv.j}$
     1368% }
     1369% \end{cfa}
     1370
     1371In \Index{object-oriented} programming, there is an implicit first parameter, often named \textbf{©self©} or \textbf{©this©}, which is elided.
     1372\begin{C++}
     1373class C {
     1374        int i, j;
     1375        int mem() { $\C{\R{// implicit "this" parameter}}$
     1376                i = 1; $\C{\R{// this->i}}$
     1377                j = 2; $\C{\R{// this->j}}$
     1378        }
     1379}
     1380\end{C++}
     1381Since \CFA is non-object-oriented, the equivalent object-oriented program looks like:
     1382\begin{cfa}
     1383struct S { int i, j; };
     1384int mem( S & @this@ ) { $\C{// explicit "this" parameter}$
     1385        @this.@i = 1; $\C{// "this" is not elided}$
     1386        @this.@j = 2;
     1387}
     1388\end{cfa}
     1389but it is cumbersome having to write ``©this.©'' many times in a member.
     1390
     1391\CFA provides a ©with© clause/statement \see{Pascal~\cite[\S~4.F]{Pascal}} to elide the "©this.©" by opening a scope containing field identifiers, changing the qualified fields into variables and giving an opportunity for optimizing qualified references.
     1392\begin{cfa}
     1393int mem( S & this ) @with( this )@ { $\C{// with clause}$
     1394        i = 1; $\C{\R{// this.i}}$
     1395        j = 2; $\C{\R{// this.j}}$
     1396}
     1397\end{cfa}
     1398which extends to multiple routine parameters:
     1399\begin{cfa}
     1400struct T { double m, n; };
     1401int mem2( S & this1, T & this2 ) @with( this1, this2 )@ {
     1402        i = 1; j = 2;
     1403        m = 1.0; n = 2.0;
     1404}
     1405\end{cfa}
     1406
     1407The statement form is used within a block:
     1408\begin{cfa}
     1409int foo() {
     1410        struct S1 { ... } s1;
     1411        struct S2 { ... } s2;
     1412        @with( s1 )@ { $\C{// with statement}$
     1413                // access fields of s1 without qualification
     1414                @with s2@ { $\C{// nesting}$
     1415                        // access fields of s1 and s2 without qualification
     1416                }
     1417        }
     1418        @with s1, s2@ {
     1419                // access unambiguous fields of s1 and s2 without qualification
     1420        }
     1421}
     1422\end{cfa}
     1423
     1424When opening multiple structures, fields with the same name and type are ambiguous and must be fully qualified.
     1425For fields with the same name but different type, context/cast can be used to disambiguate.
     1426\begin{cfa}
     1427struct S { int i; int j; double m; } a, c;
     1428struct T { int i; int k; int m; } b, c;
     1429with( a, b )
     1430{
     1431}
     1432\end{cfa}
     1433
     1434\begin{comment}
     1435The components in the "with" clause
     1436
     1437  with a, b, c { ... }
     1438
     1439serve 2 purposes: each component provides a type and object. The type must be a
     1440structure type. Enumerations are already opened, and I think a union is opened
     1441to some extent, too. (Or is that just unnamed unions?) The object is the target
     1442that the naked structure-fields apply to. The components are open in "parallel"
     1443at the scope of the "with" clause/statement, so opening "a" does not affect
     1444opening "b", etc. This semantic is different from Pascal, which nests the
     1445openings.
     1446
     1447Having said the above, it seems reasonable to allow a "with" component to be an
     1448expression. The type is the static expression-type and the object is the result
     1449of the expression. Again, the type must be an aggregate. Expressions require
     1450parenthesis around the components.
     1451
     1452  with( a, b, c ) { ... }
     1453
     1454Does this now make sense?
     1455
     1456Having written more CFA code, it is becoming clear to me that I *really* want
     1457the "with" to be implemented because I hate having to type all those object
     1458names for fields. It's a great way to drive people away from the language.
     1459\end{comment}
    13931460
    13941461
     
    42784345
    42794346
    4280 \subsection{Constant}
     4347\subsection{Overloaded Constant}
    42814348
    42824349The constants 0 and 1 have special meaning.
     
    43174384
    43184385
    4319 \subsection{Variable}
    4320 \label{s:VariableOverload}
     4386\subsection{Variable Overloading}
    43214387
    43224388The overload rules of \CFA allow a programmer to define multiple variables with the same name, but different types.
     
    43614427
    43624428
    4363 \subsection{Operator}
     4429\subsection{Operator Overloading}
    43644430
    43654431\CFA also allows operators to be overloaded, to simplify the use of user-defined types.
     
    56195685\end{cfa}
    56205686&
    5621 \begin{C++}
     5687\begin{lstlisting}[language=C++]
    56225688class Line {
    56235689        float lnth;
     
    56465712Line line1;
    56475713Line line2( 3.4 );
    5648 \end{C++}
     5714\end{lstlisting}
    56495715&
    56505716\begin{lstlisting}[language=Golang]