Changeset 1c412aa for doc/theses/thierry_delisle_PhD/comp_II
- Timestamp:
- Apr 16, 2020, 9:23:02 AM (4 years ago)
- Branches:
- ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- 41af19c
- Parents:
- 849f2c6b
- Location:
- doc/theses/thierry_delisle_PhD/comp_II
- Files:
-
- 2 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/thierry_delisle_PhD/comp_II/comp_II_PAB.tex
r849f2c6b r1c412aa 15 15 \setlist{topsep=6pt,parsep=0pt} % global reduce spacing between points 16 16 \newcommand{\uC}{$\mu$\CC} 17 \usepackage[hidelinks]{hyperref} 17 \usepackage[breaklinks=true,pagebackref=true,hidelinks]{hyperref} 18 \urlstyle{rm} 18 19 \setlength{\abovecaptionskip}{5pt plus 3pt minus 2pt} 19 20 \lstMakeShortInline$% % single-character for \lstinline 20 21 %\usepackage[margin=1in]{geometry} 21 22 %\usepackage{float} 23 %\usepackage{breakurl} 22 24 23 25 \input{glossary} … … 59 61 \section{Introduction} 60 62 \subsection{\CFA and the \CFA concurrency package} 61 \CFA \cite{Moss18} is a modern, polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language.63 \CFA~\cite{Moss18} is a modern, polymorphic, non-object-oriented, concurrent, backwards-compatible extension of the C programming language. 62 64 It aims to add high-productivity features while maintaining the predictable performance of C. 63 As such, concurrency in \CFA \citaims to offer simple and safe high-level tools while still allowing performant code.65 As such, concurrency in \CFA~\cite{Delisle19} aims to offer simple and safe high-level tools while still allowing performant code. 64 66 \CFA concurrent code is written in the synchronous programming paradigm but uses \glspl{uthrd} in order to achieve the simplicity and maintainability of synchronous programming without sacrificing the efficiency of asynchronous programing. 65 67 As such, the \CFA \newterm{scheduler} is a preemptive user-level scheduler that maps \glspl{uthrd} onto \glspl{kthrd}. … … 327 329 328 330 How much scalability is actually needed is highly debatable. 329 \emph{libfibre} \cit has compared favorably to other schedulers in webserver tests\citand uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask.331 \emph{libfibre}~\cite{libfibre} has compared favorably to other schedulers in webserver tests~\cite{Karsten20} and uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask. 330 332 As such, the single atomic instruction on a shared cacheline may be sufficiently performant. 331 333 … … 375 377 376 378 There are possible alternatives to the reader-writer lock solution. 377 This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject \cit.379 This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject~\cite{michael2004hazard, brown2015reclaiming}. 378 380 However, the reader-write lock-solution is simple and can be leveraged to solve other problems (\eg processor ordering and memory reclamation of threads), which makes it an attractive solution. 379 381 … … 421 423 Processors that are unnecessarily unblocked lead to unnecessary contention, CPU usage, and power consumption, while too many sleeping processors can lead to sub-optimal throughput. 422 424 Furthermore, transitions from sleeping to awake and vice-versa also add unnecessary latency. 423 There is already a wealth of research on the subject \TODO{(reference Libfibre)} and I may use an existing approach for the idle-sleep heuristic in this project.425 There is already a wealth of research on the subject~\cite{schillings1996engineering, wiki:thunderherd} and I may use an existing approach for the idle-sleep heuristic in this project, \eg~\cite{Karsten20}. 424 426 425 427 \subsection{Asynchronous I/O} … … 444 446 While there exists many different APIs for asynchronous I/O, it is not part of this proposal to create a novel API. 445 447 It is sufficient to make one work in the complex context of the \CFA runtime. 446 \uC uses the $select$ as its interface, which handles ttys, pipes and sockets, but not disk.448 \uC uses the $select$~\cite{select} as its interface, which handles ttys, pipes and sockets, but not disk. 447 449 $select$ entails significant complexity and is being replaced in UNIX operating-systems, which make it a less interesting alternative. 448 Another popular interface is $epoll$\cit, which is supposed to be cheaper than $select$. 449 However, epoll also does not handle the file system and seems to have problem to linux pipes and $TTY$s\cit. 450 A very recent alternative that I am investigating is $io_uring$. 451 It claims to address some of the issues with $epoll$ but is too recent to be confident that it does. 452 Finally, a popular cross-platform alternative is $libuv$, which offers asynchronous sockets and asynchronous file system operations (among other features). 450 Another popular interface is $epoll$~\cite{epoll}, which is supposed to be cheaper than $select$. 451 However, $epoll$ also does not handle the file system and anecdotal evidence suggest it has problem with linux pipes and $TTY$s. 452 A popular cross-platform alternative is $libuv$~\cite{libuv}, which offers asynchronous sockets and asynchronous file system operations (among other features). 453 453 However, as a full-featured library it includes much more than I need and could conflict with other features of \CFA unless significant effort is made to merge them together. 454 A very recent alternative that I am investigating is $io_uring$~\cite{io_uring}. 455 It claims to address some of the issues with $epoll$ and my early investigating suggest that the claim is accurate. 456 $io_uring$ uses a much more general approach where system calls are register to a queue and later executed by the kernel, rather than relying on system calls to return an error instead of blocking and subsequently waiting for changes on file descriptors. 457 I believe this approach allows for fewer problems, \eg the manpage for $open$~\cite{open} states: 458 \begin{quote} 459 Note that [the $O_NONBLOCK$ flag] has no effect for regular files and block devices; 460 that is, I/O operations will (briefly) block when device activity is required, regardless of whether $O_NONBLOCK$ is set. 461 Since $O_NONBLOCK$ semantics might eventually be implemented, applications should not depend upon blocking behavior when specifying this flag for regular files and block devices. 462 \end{quote} 463 This makes approach based on $epoll$/$select$ less reliable since they may not work for every file descriptors. 464 For this reason, I have started to use $io_uring$ as the OS abstraction for the \CFA runtime, until further work shows problems I have not encountered yet. 465 However, only a small subset of the features are available in Ubuntu as of April 2020~\cite{wiki:ubuntu-linux}, which limits performance comparisons. 466 The currently available features are sufficient to start basic socket performance experiments and comparisons. 454 467 455 468 \paragraph{Event Engine} 456 469 Laying on top of the asynchronous interface layer is the event engine. 457 This engine is responsible for multiplexing (batching) the synchronous I/O requests into asynchronous I/O requests and demultiplexing the results to appropriate blocked user 470 This engine is responsible for multiplexing (batching) the synchronous I/O requests into asynchronous I/O requests and demultiplexing the results to appropriate blocked user-threads. 458 471 This step can be straightforward for simple cases, but becomes quite complex when there are thousands of user threads performing both reads and writes, possibly on overlapping file descriptors. 459 472 Decisions that need to be made include: … … 469 482 \paragraph{Interface} 470 483 Finally, for these non-blocking I/O components to be available, it is necessary to expose them through a synchronous interface because that is the \CFA concurrent programming style. 471 The interface can be novelbut it is preferable to match the existing POSIX interface when possible to be compatible with existing code.484 The interface can add novel features but it is preferable to match the existing POSIX interface when possible to be compatible with existing code. 472 485 Matching allows C programs written using this interface to be transparently converted to \CFA with minimal effort. 473 Where new functionality is needed, I will create a novel interface to fill gaps and provide advanced features.486 Where new functionality is needed, I will create a novel interface extensions to fill gaps and provide advanced features. 474 487 475 488 … … 477 490 % =============================================================================== 478 491 \section{Discussion} 479 492 I believe that runtime systems and scheduling are still open topics. 493 Many ``state of the art'' production frameworks still use single threaded event-loops because of performance considerations, \eg~\cite{nginx-design}, and, to my knowledge, no widely available system language offers modern threading facilities. 494 I believe the proposed work offers a novel runtime and scheduling package embedded in the \CFA programming language, where existing work only offers fragments that users must assemble themselves when possible. 480 495 481 496 % =============================================================================== 482 497 % =============================================================================== 483 498 \section{Timeline} 484 499 \begin{center} 500 \begin{tabular}{ | r @{--} l | p{4in} | } 501 \hline May 2020 & October 2020 & Creation of the performance benchmark. \\ 502 \hline November 2020 & March 2021 & Completion of the implementation. \\ 503 \hline March 2021 & April 2021 & Final Performance experiments. \\ 504 \hline May 2017 & August 2021 & Thesis writing and defense. \\ 505 \hline 506 \end{tabular} 507 \end{center} 485 508 486 509 % B I B L I O G R A P H Y … … 489 512 \phantomsection % allows hyperref to link to the correct page 490 513 \addcontentsline{toc}{section}{\refname} 514 \lstDeleteShortInline$% turn off as $ used in bibliography entries 491 515 \bibliographystyle{plain} 492 516 \bibliography{pl,local} -
doc/theses/thierry_delisle_PhD/comp_II/local.bib
r849f2c6b r1c412aa 76 76 77 77 @article{finkel1987dib, 78 title={DIB —a distributed implementation of backtracking},78 title={DIB-a distributed implementation of backtracking}, 79 79 author={Finkel, Raphael and Manber, Udi}, 80 80 journal={ACM Transactions on Programming Languages and Systems (TOPLAS)}, … … 266 266 % =============================================================================== 267 267 @manual{open, 268 key = "open", 268 269 title = "open(2) Linux User's Manual", 269 270 year = "2020", … … 272 273 273 274 @manual{epoll, 275 key = "epoll", 274 276 title = "epoll(7) Linux User's Manual", 275 277 year = "2019", … … 278 280 279 281 @manual{select, 282 key = "select", 280 283 title = "select(7) Linux User's Manual", 281 284 year = "2019", … … 289 292 month = "March", 290 293 version = {0,4}, 291 url = {https://kernel.dk/io_uring.pdf}294 howpublished = {\url{https://kernel.dk/io_uring.pdf}} 292 295 } 293 296 294 297 @misc{libuv, 298 key = "libuv", 295 299 title = {libuv}, 296 url = {https://github.com/libuv/libuv}300 howpublished = {\url{https://github.com/libuv/libuv}} 297 301 } 298 302 … … 301 305 % =============================================================================== 302 306 303 % libfibre web server304 @article{karstenuser,305 title={User-level Threading: Have Your Cake and Eat It Too},306 author={KARSTEN, MARTIN and BARGHI, SAMAN}307 }308 309 % libfibre310 @misc{libfibre,311 title={libfibre},312 author={KARSTEN, MARTIN},313 journal={URL: https://git.uwaterloo.ca/mkarsten/libfibre}314 }315 316 307 @misc{nginx-design, 317 title={Inside NGINX: How We Designed for Performance \& Scale}, 318 url = "https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/", 319 } 320 321 @article{Delisle19, 322 keywords = {concurrency, Cforall}, 323 contributer = {pabuhr@plg}, 324 author = {Thierry Delisle and Peter A. Buhr}, 325 title = {Advanced Control-flow and Concurrency in \textsf{C}$\mathbf{\forall}$}, 326 year = 2019, 327 journal = spe, 328 pages = {1-33}, 329 note = {submitted}, 308 key = "nginx", 309 title={Inside {NGINX}: How We Designed for Performance \& Scale}, 310 howpublished= {\href{https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale} 311 {https://\-www.nginx.com/\-blog/\-inside\--nginx\--how\--we\--designed\--for\--performance\--scale}}, 330 312 } 331 313 … … 343 325 title = "Thundering herd problem --- {W}ikipedia{,} The Free Encyclopedia", 344 326 year = "2020", 345 url = "https://en.wikipedia.org/wiki/Thundering_herd_problem", 327 howpublished = {\href{https://en.wikipedia.org/wiki/Thundering_herd_problem} 328 {https://\-en.wikipedia.org/\-wiki/\-Thundering\_herd\_problem}},}, 346 329 note = "[Online; accessed 14-April-2020]" 347 348 349 330 } 331 332 @misc{wiki:ubuntu-linux, 350 333 author = "{Wikipedia contributors}", 351 334 title = "Ubuntu version history : Table of versions --- {W}ikipedia{,} The Free Encyclopedia", 352 335 year = "2020", 353 url = "https://en.wikipedia.org/wiki/Ubuntu_version_history#Table_of_versions", 336 howpublished = {\href{https://en.wikipedia.org/wiki/Ubuntu_version_history\#Table_of_versions} 337 {https://\-en.wikipedia.org/\-wiki/\-Ubuntu\_version\_history\#Table\_of\_versions}}, 354 338 note = "[Online; accessed 15-April-2020]" 355 339 }
Note: See TracChangeset
for help on using the changeset viewer.