Changeset f2bc9fa for doc/theses/thierry_delisle_PhD/thesis/text/io.tex
- Timestamp: Apr 4, 2022, 7:47:48 PM
- Branches: ADT, ast-experimental, enum, master, pthread-emulation, qualifiedEnum
- Children: f134c25
- Parents: 1a9592a
- Files: 1 edited
doc/theses/thierry_delisle_PhD/thesis/text/io.tex
r1a9592a  rf2bc9fa
173  173    The consequence is that the amount of parallelism used to prepare submissions for the next system call is limited.
174  174    Beyond this limit, the length of the system call is the throughput limiting factor.
175       - I concluded from early experiments that preparing submissions seems to take a bout as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}.
     175  + I concluded from early experiments that preparing submissions seems to take at most as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}.
176  176    Therefore the design of the submission engine must manage multiple instances of @io_uring@ running in parallel, effectively sharding @io_uring@ instances.
177  177    Similarly to scheduling, this sharding can be done privately, \ie, one instance per \glspl{proc}, in decoupled pools, \ie, a pool of \glspl{proc} use a pool of @io_uring@ instances without one-to-one coupling between any given instance and any given \gls{proc}, or some mix of the two.
…
200  200    The only added complexity is that the number of SQEs is fixed, which means allocation can fail.
201  201
202       - Allocation failures need to be pushed up to therouting algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without sufficient SQEs available.
     202  + Allocation failures need to be pushed up to a routing algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without sufficient SQEs available.
203  203    Furthermore, the routing algorithm should block operations up-front if none of the instances have available SQEs.
204  204
…
214  214
215  215    In the case of designating a \gls{thrd}, ideally, when multiple \glspl{thrd} attempt to submit operations to the same @io_uring@ instance, all requests would be batched together and one of the \glspl{thrd} would do the system call on behalf of the others, referred to as the \newterm{submitter}.
216       - In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a current submitter and a next submitter.
     216  + In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a ``next submitter'' that guarentees everything that is missed by the current submitter is seen by the next one.
217  217    Indeed, as long as there is a ``next'' submitter, \glspl{thrd} submitting new \io requests can move on, knowing that some future system call will include their request.
218  218    Once the system call is done, the submitter must also free SQEs so that the allocator can reused them.
…
223  223    If the submission side does not designate submitters, polling can also submit all SQEs as it is polling events.
224  224    A simple approach to polling is to allocate a \gls{thrd} per @io_uring@ instance and simply let the poller \glspl{thrd} poll their respective instances when scheduled.
225       - This design is especially convenient for reasons explained in Chapter~\ref{practice}.
226  225
227  226    With this pool of instances approach, the big advantage is that it is fairly flexible.
228  227    It does not impose restrictions on what \glspl{thrd} submitting \io operations can and cannot do between allocations and submissions.
229       - It also can gracefully handle srunning out of ressources, SQEs or the kernel returning @EBUSY@.
     228  + It also can gracefully handle running out of ressources, SQEs or the kernel returning @EBUSY@.
230  229    The down side to this is that many of the steps used for submitting need complex synchronization to work properly.
231  230    The routing and allocation algorithm needs to keep track of which ring instances have available SQEs, block incoming requests if no instance is available, prevent barging if \glspl{thrd} are already queued up waiting for SQEs and handle SQEs being freed.
232  231    The submission side needs to safely append SQEs to the ring buffer, correctly handle chains, make sure no SQE is dropped or left pending forever, notify the allocation side when SQEs can be reused and handle the kernel returning @EBUSY@.
233       - All this synchronization may have a significant cost and, compare to the next approach presented, this synchronization is entirely overhead.
     232  + All this synchronization may have a significant cost and, compared to the next approach presented, this synchronization is entirely overhead.
234  233
235  234    \subsubsection{Private Instances}
236  235    Another approach is to simply create one ring instance per \gls{proc}.
237       - This alleviate the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps.
     236  + This alleviates the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps.
238  237    This is effectively the same requirement as using @thread_local@ variables.
239  238    Since SQEs that are allocated must be submitted to the same ring, on the same \gls{proc}, this effectively forces the application to submit SQEs in allocation order
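Reader's note on the hunk around thesis lines 200 to 203: the rule that allocation from a fixed pool of SQEs can fail, and that the failure is pushed up to a routing step, may be easier to follow with a small sketch. The C code below is not taken from the thesis or from the Cforall runtime, and it does not call io_uring or liburing. Every name in it (`ring_sketch`, `sqe_alloc`, `sqe_free`, `route`, `NUM_RINGS`, `SQES_PER_RING`) is hypothetical. It is single-threaded on purpose, so it leaves out the synchronization that the changed text describes as the costly part of the pooled design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_RINGS 2      /* hypothetical number of io_uring instances */
#define SQES_PER_RING 2  /* hypothetical fixed SQE count per instance */

/* Stand-in for one io_uring instance: it only tracks which SQE slots are taken. */
struct ring_sketch {
    bool in_use[SQES_PER_RING];
    size_t free_count;
};

void ring_init(struct ring_sketch *r) {
    for (size_t i = 0; i < SQES_PER_RING; i++) r->in_use[i] = false;
    r->free_count = SQES_PER_RING;
}

/* Allocate one SQE slot from the fixed pool; returns the slot index or -1 on failure. */
int sqe_alloc(struct ring_sketch *r) {
    if (r->free_count == 0) return -1;  /* the pool is fixed, so allocation can fail */
    for (size_t i = 0; i < SQES_PER_RING; i++) {
        if (!r->in_use[i]) {
            r->in_use[i] = true;
            r->free_count--;
            return (int)i;
        }
    }
    return -1;  /* not reached while free_count matches in_use */
}

/* Return a slot to the pool, as a submitter would after its system call completes. */
void sqe_free(struct ring_sketch *r, int slot) {
    assert(slot >= 0 && slot < SQES_PER_RING && r->in_use[slot]);
    r->in_use[slot] = false;
    r->free_count++;
}

/* Routing: never direct a request to an instance without a free SQE.
 * Returns the chosen ring index and stores the slot in *slot_out.
 * Returns -1 when no instance has a free SQE, which is the point where
 * a real runtime would block the requesting thread up-front. */
int route(struct ring_sketch rings[], size_t n, int *slot_out) {
    for (size_t i = 0; i < n; i++) {
        int slot = sqe_alloc(&rings[i]);
        if (slot >= 0) {
            *slot_out = slot;
            return (int)i;
        }
    }
    return -1;
}
```

With two rings of two slots each, the first four calls to `route` succeed, the fifth returns -1, and a call made after `sqe_free` succeeds again. A real implementation would also need the queueing and anti-barging behaviour mentioned at thesis line 230, which this sketch does not attempt.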