Ignore:
Timestamp:
Apr 16, 2020, 9:23:02 AM (5 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
41af19c
Parents:
849f2c6b
Message:

final comments on Thierry's comp II

Location:
doc/theses/thierry_delisle_PhD/comp_II
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/thierry_delisle_PhD/comp_II/comp_II_PAB.tex

    r849f2c6b r1c412aa  
    1515\setlist{topsep=6pt,parsep=0pt}         % global reduce spacing between points
    1616\newcommand{\uC}{$\mu$\CC}
    17 \usepackage[hidelinks]{hyperref}
     17\usepackage[breaklinks=true,pagebackref=true,hidelinks]{hyperref}
     18\urlstyle{rm}
    1819\setlength{\abovecaptionskip}{5pt plus 3pt minus 2pt}
    1920\lstMakeShortInline$%                   % single-character for \lstinline
    2021%\usepackage[margin=1in]{geometry}
    2122%\usepackage{float}
     23%\usepackage{breakurl}
    2224
    2325\input{glossary}
     
    5961\section{Introduction}
    6062\subsection{\CFA and the \CFA concurrency package}
    61 \CFA\cite{Moss18} is a modern, polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language.
     63\CFA~\cite{Moss18} is a modern, polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language.
    6264It aims to add high-productivity features while maintaining the predictable performance of C.
    63 As such, concurrency in \CFA\cit aims to offer simple and safe high-level tools while still allowing performant code.
     65As such, concurrency in \CFA~\cite{Delisle19} aims to offer simple and safe high-level tools while still allowing performant code.
    6466\CFA concurrent code is written in the synchronous programming paradigm but uses \glspl{uthrd} in order to achieve the simplicity and maintainability of synchronous programming without sacrificing the efficiency of asynchronous programing.
    6567As such, the \CFA \newterm{scheduler} is a preemptive user-level scheduler that maps \glspl{uthrd} onto \glspl{kthrd}.
     
    327329
    328330How much scalability is actually needed is highly debatable.
    329 \emph{libfibre}\cit has compared favorably to other schedulers in webserver tests\cit and uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask.
     331\emph{libfibre}~\cite{libfibre} has compared favorably to other schedulers in webserver tests~\cite{Karsten20} and uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask.
    330332As such, the single atomic instruction on a shared cacheline may be sufficiently performant.
    331333
     
    375377
    376378There are possible alternatives to the reader-writer lock solution.
    377 This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject\cit.
     379This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject~\cite{michael2004hazard, brown2015reclaiming}.
    378380However, the reader-write lock-solution is simple and can be leveraged to solve other problems (\eg processor ordering and memory reclamation of threads), which makes it an attractive solution.
    379381
     
    421423Processors that are unnecessarily unblocked lead to unnecessary contention, CPU usage, and power consumption, while too many sleeping processors can lead to sub-optimal throughput.
    422424Furthermore, transitions from sleeping to awake and vice-versa also add unnecessary latency.
    423 There is already a wealth of research on the subject \TODO{(reference Libfibre)} and I may use an existing approach for the idle-sleep heuristic in this project.
     425There is already a wealth of research on the subject~\cite{schillings1996engineering, wiki:thunderherd} and I may use an existing approach for the idle-sleep heuristic in this project, \eg~\cite{Karsten20}.
    424426
    425427\subsection{Asynchronous I/O}
     
    444446While there exists many different APIs for asynchronous I/O, it is not part of this proposal to create a novel API.
    445447It is sufficient to make one work in the complex context of the \CFA runtime.
    446 \uC uses the $select$ as its interface, which handles ttys, pipes and sockets, but not disk.
     448\uC uses the $select$~\cite{select} as its interface, which handles ttys, pipes and sockets, but not disk.
    447449$select$ entails significant complexity and is being replaced in UNIX operating-systems, which make it a less interesting alternative.
    448 Another popular interface is $epoll$\cit, which is supposed to be cheaper than $select$.
    449 However, epoll also does not handle the file system and seems to have problem to linux pipes and $TTY$s\cit.
    450 A very recent alternative that I am investigating is $io_uring$.
    451 It claims to address some of the issues with $epoll$ but is too recent to be confident that it does.
    452 Finally, a popular cross-platform alternative is $libuv$, which offers asynchronous sockets and asynchronous file system operations (among other features).
     450Another popular interface is $epoll$~\cite{epoll}, which is supposed to be cheaper than $select$.
     451However, $epoll$ also does not handle the file system and anecdotal evidence suggest it has problem with linux pipes and $TTY$s.
     452A popular cross-platform alternative is $libuv$~\cite{libuv}, which offers asynchronous sockets and asynchronous file system operations (among other features).
    453453However, as a full-featured library it includes much more than I need and could conflict with other features of \CFA unless significant effort is made to merge them together.
     454A very recent alternative that I am investigating is $io_uring$~\cite{io_uring}.
     455It claims to address some of the issues with $epoll$ and my early investigating suggest that the claim is accurate.
     456$io_uring$ uses a much more general approach where system calls are register to a queue and later executed by the kernel, rather than relying on system calls to return an error instead of blocking and subsequently waiting for changes on file descriptors.
     457I believe this approach allows for fewer problems, \eg the manpage for $open$~\cite{open} states:
     458\begin{quote}
     459        Note that [the $O_NONBLOCK$ flag] has no effect for regular files and block devices;
     460        that is, I/O operations will (briefly) block when device activity is required, regardless of whether $O_NONBLOCK$ is set.
     461        Since $O_NONBLOCK$ semantics might eventually be implemented, applications should not depend upon blocking behavior when specifying this flag for regular files and block devices.
     462\end{quote}
     463This makes approach based on $epoll$/$select$ less reliable since they may not work for every file descriptors.
     464For this reason, I have started to use $io_uring$ as the OS abstraction for the \CFA runtime, until further work shows problems I have not encountered yet.
     465However, only a small subset of the features are available in Ubuntu as of April 2020~\cite{wiki:ubuntu-linux}, which limits performance comparisons.
     466The currently available features are sufficient to start basic socket performance experiments and comparisons.
    454467
    455468\paragraph{Event Engine}
    456469Laying on top of the asynchronous interface layer is the event engine.
    457 This engine is responsible for multiplexing (batching) the synchronous I/O requests into asynchronous I/O requests and demultiplexing the results to appropriate blocked user threads.
     470This engine is responsible for multiplexing (batching) the synchronous I/O requests into asynchronous I/O requests and demultiplexing the results to appropriate blocked user-threads.
    458471This step can be straightforward for simple cases, but becomes quite complex when there are thousands of user threads performing both reads and writes, possibly on overlapping file descriptors.
    459472Decisions that need to be made include:
     
    469482\paragraph{Interface}
    470483Finally, for these non-blocking I/O components to be available, it is necessary to expose them through a synchronous interface because that is the \CFA concurrent programming style.
    471 The interface can be novel but it is preferable to match the existing POSIX interface when possible to be compatible with existing code.
     484The interface can add novel features but it is preferable to match the existing POSIX interface when possible to be compatible with existing code.
    472485Matching allows C programs written using this interface to be transparently converted to \CFA with minimal effort.
    473 Where new functionality is needed, I will create a novel interface to fill gaps and provide advanced features.
     486Where new functionality is needed, I will create a novel interface extensions to fill gaps and provide advanced features.
    474487
    475488
     
    477490% ===============================================================================
    478491\section{Discussion}
    479 
     492I believe that runtime systems and scheduling are still open topics.
     493Many ``state of the art'' production frameworks still use single threaded event-loops because of performance considerations, \eg~\cite{nginx-design}, and, to my knowledge, no widely available system language offers modern threading facilities.
     494I believe the proposed work offers a novel runtime and scheduling package embedded in the \CFA programming language, where existing work only offers fragments that users must assemble themselves when possible.
    480495
    481496% ===============================================================================
    482497% ===============================================================================
    483498\section{Timeline}
    484 
     499\begin{center}
     500\begin{tabular}{ | r @{--} l | p{4in} | }
     501\hline May 2020 & October 2020   & Creation of the performance benchmark. \\
     502\hline November 2020 & March 2021   & Completion of the implementation. \\
     503\hline March 2021 & April 2021  & Final Performance experiments. \\
     504\hline May 2017 & August 2021 & Thesis writing and defense. \\
     505\hline
     506\end{tabular}
     507\end{center}
    485508
    486509% B I B L I O G R A P H Y
     
    489512\phantomsection         % allows hyperref to link to the correct page
    490513\addcontentsline{toc}{section}{\refname}
     514\lstDeleteShortInline$% turn off as $ used in bibliography entries
    491515\bibliographystyle{plain}
    492516\bibliography{pl,local}
  • doc/theses/thierry_delisle_PhD/comp_II/local.bib

    r849f2c6b r1c412aa  
    7676
    7777@article{finkel1987dib,
    78   title={DIBa distributed implementation of backtracking},
     78  title={DIB-a distributed implementation of backtracking},
    7979  author={Finkel, Raphael and Manber, Udi},
    8080  journal={ACM Transactions on Programming Languages and Systems (TOPLAS)},
     
    266266% ===============================================================================
    267267@manual{open,
     268  key        = "open",
    268269  title      = "open(2) Linux User's Manual",
    269270  year       = "2020",
     
    272273
    273274@manual{epoll,
     275  key        = "epoll",
    274276  title      = "epoll(7) Linux User's Manual",
    275277  year       = "2019",
     
    278280
    279281@manual{select,
     282  key        = "select",
    280283  title      = "select(7) Linux User's Manual",
    281284  year       = "2019",
     
    289292  month   = "March",
    290293  version = {0,4},
    291   url     = {https://kernel.dk/io_uring.pdf}
     294  howpublished = {\url{https://kernel.dk/io_uring.pdf}}
    292295}
    293296
    294297@misc{libuv,
     298  key   = "libuv",
    295299  title = {libuv},
    296   url   = {https://github.com/libuv/libuv}
     300  howpublished = {\url{https://github.com/libuv/libuv}}
    297301}
    298302
     
    301305% ===============================================================================
    302306
    303 % libfibre web server
    304 @article{karstenuser,
    305   title={User-level Threading: Have Your Cake and Eat It Too},
    306   author={KARSTEN, MARTIN and BARGHI, SAMAN}
    307 }
    308 
    309 % libfibre
    310 @misc{libfibre,
    311   title={libfibre},
    312   author={KARSTEN, MARTIN},
    313   journal={URL: https://git.uwaterloo.ca/mkarsten/libfibre}
    314 }
    315 
    316307@misc{nginx-design,
    317   title={Inside NGINX: How We Designed for Performance \& Scale},
    318   url = "https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/",
    319 }
    320 
    321 @article{Delisle19,
    322   keywords      = {concurrency, Cforall},
    323   contributer   = {pabuhr@plg},
    324   author        = {Thierry Delisle and Peter A. Buhr},
    325   title = {Advanced Control-flow and Concurrency in \textsf{C}$\mathbf{\forall}$},
    326   year  = 2019,
    327   journal       = spe,
    328   pages = {1-33},
    329   note  = {submitted},
     308  key   = "nginx",
     309  title={Inside {NGINX}: How We Designed for Performance \& Scale},
     310  howpublished= {\href{https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale}
     311                {https://\-www.nginx.com/\-blog/\-inside\--nginx\--how\--we\--designed\--for\--performance\--scale}},
    330312}
    331313
     
    343325   title = "Thundering herd problem --- {W}ikipedia{,} The Free Encyclopedia",
    344326   year = "2020",
    345    url = "https://en.wikipedia.org/wiki/Thundering_herd_problem",
     327   howpublished = {\href{https://en.wikipedia.org/wiki/Thundering_herd_problem}
     328                  {https://\-en.wikipedia.org/\-wiki/\-Thundering\_herd\_problem}},},
    346329   note = "[Online; accessed 14-April-2020]"
    347  }
    348 
    349  @misc{wiki:ubuntu-linux,
     330}
     331
     332@misc{wiki:ubuntu-linux,
    350333   author = "{Wikipedia contributors}",
    351334   title = "Ubuntu version history : Table of versions --- {W}ikipedia{,} The Free Encyclopedia",
    352335   year = "2020",
    353    url = "https://en.wikipedia.org/wiki/Ubuntu_version_history#Table_of_versions",
     336   howpublished = {\href{https://en.wikipedia.org/wiki/Ubuntu_version_history\#Table_of_versions}
     337                  {https://\-en.wikipedia.org/\-wiki/\-Ubuntu\_version\_history\#Table\_of\_versions}},
    354338   note = "[Online; accessed 15-April-2020]"
    355  }
     339}
Note: See TracChangeset for help on using the changeset viewer.