Changeset 4e21942 for doc/theses/thierry_delisle_PhD/thesis
- Timestamp:
- Aug 1, 2022, 3:27:07 PM (2 years ago)
- Branches:
- ADT, ast-experimental, master, pthread-emulation
- Children:
- 3fe4acd
- Parents:
- 30159e5
- File:
- 1 edited
doc/theses/thierry_delisle_PhD/thesis/text/front.tex
r30159e5 → r4e21942

  % D E C L A R A T I O N   P A G E
  % -------------------------------
- % The following is a sample Delaration Page as provided by the GSO
+ % The following is a sample Declaration Page as provided by the GSO
  % December 13th, 2006. It is designed for an electronic thesis.
  \noindent
  …

  User-Level threading (M:N) is gaining popularity over kernel-level threading (1:1) in many programming languages.
- The user-level approach is often a better mechanism to express complex concurrent applications by efficiently running 10,000+ threads on multi-core systems.
- Indeed, over-partitioning into small work-units significantly eases load balancing while providing user threads for each unit of work offers greater freedom to the programmer.
+ The user threading approach is often a better mechanism to express complex concurrent applications by efficiently running 10,000+ threads on multi-core systems.
+ Indeed, over-partitioning into small work-units with user threading significantly eases load bal\-ancing, while simultaneously providing advanced synchronization and mutual exclusion mechanisms.
  To manage these high levels of concurrency, the underlying runtime must efficiently schedule many user threads across a few kernel threads;
- which begs of the question of how many kernel threads are needed and when should the need be re-evaliated.
- Furthermore, the scheduler must prevent kernel threads from blocking, otherwise user-thread parallelism drops, and put idle kernel-threads to sleep to avoid wasted resources.
+ which begs the question of how many kernel threads are needed and whether the number should be dynamically reevaluated.
+ Furthermore, scheduling must prevent kernel threads from blocking, otherwise user-thread parallelism drops.
+ When user-thread parallelism does drop, how and when should idle kernel-threads be put to sleep to avoid wasting CPU resources?
  Finally, the scheduling system must provide fairness to prevent a user thread from monopolizing a kernel thread;
- otherwise other user threads can experience short/long term starvation or kernel threads can deadlock waiting for events to occur.
+ otherwise other user threads can experience short/long-term starvation or kernel threads can deadlock waiting for events to occur on busy kernel threads.

  This thesis analyses multiple scheduler systems, where each system attempts to fulfill the necessary requirements for user-level threading.
- The predominant technique for manage high levels of concurrency is sharding the ready-queue with one queue per kernel-threads and using some form of work stealing/sharing to dynamically rebalance workload shifts.
- Fairness can be handled through preemption or ad-hoc solutions, which leads to coarse-grained fairness and pathological cases.
+ The predominant technique for managing high levels of concurrency is sharding the ready-queue with one queue per kernel-thread and using some form of work stealing/sharing to dynamically rebalance workload shifts.
  Preventing kernel blocking is accomplished by transforming kernel locks and I/O operations into user-level operations that do not block the kernel thread, or by spinning up new kernel threads to manage the blocking.
+ Fairness is handled through preemption and/or ad-hoc solutions, which leads to coarse-grained fairness with some pathological cases.

- After selecting specific approaches to these scheduling issues, a complete implementation was created and tested in the \CFA (C-for-all) runtime system.
+ After testing and selecting specific approaches to these scheduling issues, a complete implementation was created and tested in the \CFA (C-for-all) runtime system.
  \CFA is a modern extension of C using user-level threading as its fundamental threading model.
  As one of its primary goals, \CFA aims to offer increased safety and productivity without sacrificing performance.
  The new scheduler achieves this goal by demonstrating equivalent performance to work-stealing schedulers while offering better fairness.
- This is achieved through several optimization that successfully eliminate the cost of the additional fairness, some of these optimization relying on interesting hardware optimizations present on most modern cpus.
- This work also includes support for user-level \io, allowing programmers to have many more user-threads blocking on \io operations than there are \glspl{kthrd}.
+ The implementation uses several optimizations that successfully balance the cost of fairness against performance;
+ some of these optimizations rely on interesting hardware optimizations present on modern CPUs.
+ The new scheduler also includes support for implicit nonblocking \io, allowing applications to have more user-threads blocking on \io operations than there are \glspl{kthrd}.
  The implementation is based on @io_uring@, a recent addition to the Linux kernel, and achieves the same performance and fairness.
- To complete the picture, the idle sleep mechanism that goes along is presented.
+ To complete the scheduler, an idle sleep mechanism is implemented that significantly reduces wasted CPU cycles, which are then available outside of the application.

  \cleardoublepage
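The abstract's predominant technique is sharding the ready-queue with one queue per kernel thread and using work stealing to rebalance load. The C sketch below illustrates that general technique only; it is not the \CFA scheduler's implementation, and all names (shard, push_local, pop_shard, next_task, NSHARDS) are hypothetical.

/* Minimal illustration of a sharded ready-queue with work stealing.
 * Hypothetical names throughout; this is not the \CFA scheduler. */
#include <pthread.h>
#include <stddef.h>

typedef struct task {
    struct task *next;
    void (*run)(void *);
    void *arg;
} task;

typedef struct shard {
    pthread_mutex_t lock;   /* protects this shard's FIFO of ready tasks */
    task *head, *tail;
} shard;

#define NSHARDS 8           /* typically one shard per kernel thread */
static shard shards[NSHARDS];

static void shards_init(void) {
    for (int i = 0; i < NSHARDS; i++)
        pthread_mutex_init(&shards[i].lock, NULL);
}

/* Push a ready task onto the calling kernel thread's own shard. */
static void push_local(int self, task *t) {
    shard *s = &shards[self];
    pthread_mutex_lock(&s->lock);
    t->next = NULL;
    if (s->tail) s->tail->next = t; else s->head = t;
    s->tail = t;
    pthread_mutex_unlock(&s->lock);
}

/* Pop from one shard; returns NULL when that shard is empty. */
static task *pop_shard(shard *s) {
    pthread_mutex_lock(&s->lock);
    task *t = s->head;
    if (t) { s->head = t->next; if (!s->head) s->tail = NULL; }
    pthread_mutex_unlock(&s->lock);
    return t;
}

/* Prefer local work; when the local shard is empty, probe the other shards
 * and steal one task, dynamically rebalancing workload shifts. */
static task *next_task(int self) {
    task *t = pop_shard(&shards[self]);
    for (int i = 1; !t && i < NSHARDS; i++)
        t = pop_shard(&shards[(self + i) % NSHARDS]);
    return t;               /* NULL means no work anywhere: consider idle sleep */
}

Per-shard locks are only one possible design choice here; the point is that each kernel thread mostly touches its own queue and pays for cross-thread synchronization only when stealing.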
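The abstract also states that the user-level \io support is based on @io_uring@. The liburing sketch below shows the general shape of submitting a read and later reaping its completion without blocking the kernel thread on the I/O itself; it is an illustrative stand-alone program, not the \CFA runtime's integration, which would park the requesting user thread after submission and unpark it when the matching completion arrives.

/* Illustrative liburing sketch: submit a read, then reap the completion.
 * The user-thread parking/unparking a runtime would do is omitted. */
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    struct io_uring ring;
    if (io_uring_queue_init(32, &ring, 0) < 0) return 1;

    int fd = open("/etc/hostname", O_RDONLY);
    char buf[256];

    /* Fill a submission-queue entry describing the read. */
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);
    io_uring_sqe_set_data(sqe, buf);   /* tag used to find the waiter later */
    io_uring_submit(&ring);            /* does not block on the I/O itself  */

    /* A scheduler would run other user threads here; this sketch just waits. */
    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);
    printf("read %d bytes\n", cqe->res);
    io_uring_cqe_seen(&ring, cqe);

    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}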
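Finally, the abstract's idle-sleep point can be illustrated by a kernel thread that finds no ready work blocking instead of spinning, and being woken when work is pushed. The condition-variable approach below is one common way to do this; the mechanism in the thesis may differ, and the names (idle_sleep, wake_one, work_available) are hypothetical.

/* Illustrative idle-sleep sketch: an idle kernel thread blocks on a condition
 * variable rather than burning CPU, and is woken when new work arrives. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t idle_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  idle_cond = PTHREAD_COND_INITIALIZER;
static bool work_available = false;

/* Called by a kernel thread whose local shard and steal targets are all empty. */
static void idle_sleep(void) {
    pthread_mutex_lock(&idle_lock);
    while (!work_available)                 /* guard against spurious wakeups */
        pthread_cond_wait(&idle_cond, &idle_lock);
    work_available = false;
    pthread_mutex_unlock(&idle_lock);
}

/* Called after pushing new work, to wake at most one sleeping kernel thread. */
static void wake_one(void) {
    pthread_mutex_lock(&idle_lock);
    work_available = true;
    pthread_cond_signal(&idle_cond);
    pthread_mutex_unlock(&idle_lock);
}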