Ignore:
Timestamp:
Apr 4, 2022, 7:47:48 PM (6 months ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
enum, master, pthread-emulation, qualifiedEnum
Children:
f134c25
Parents:
1a9592a
Message:

Small changes to io and intro section.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/thierry_delisle_PhD/thesis/text/io.tex

    r1a9592a rf2bc9fa  
    173173The consequence is that the amount of parallelism used to prepare submissions for the next system call is limited.
    174174Beyond this limit, the length of the system call is the throughput limiting factor.
    175 I concluded from early experiments that preparing submissions seems to take about as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}.
     175I concluded from early experiments that preparing submissions seems to take at most as long as the system call itself, which means that with a single @io_uring@ instance, there is no benefit in terms of \io throughput to having more than two \glspl{hthrd}.
    176176Therefore the design of the submission engine must manage multiple instances of @io_uring@ running in parallel, effectively sharding @io_uring@ instances.
    177177Similarly to scheduling, this sharding can be done privately, \ie, one instance per \glspl{proc}, in decoupled pools, \ie, a pool of \glspl{proc} use a pool of @io_uring@ instances without one-to-one coupling between any given instance and any given \gls{proc}, or some mix of the two.
     
    200200The only added complexity is that the number of SQEs is fixed, which means allocation can fail.
    201201
    202 Allocation failures need to be pushed up to the routing algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without sufficient SQEs available.
     202Allocation failures need to be pushed up to a routing algorithm: \glspl{thrd} attempting \io operations must not be directed to @io_uring@ instances without sufficient SQEs available.
    203203Furthermore, the routing algorithm should block operations up-front if none of the instances have available SQEs.
    204204
     
    214214
    215215In the case of designating a \gls{thrd}, ideally, when multiple \glspl{thrd} attempt to submit operations to the same @io_uring@ instance, all requests would be batched together and one of the \glspl{thrd} would do the system call on behalf of the others, referred to as the \newterm{submitter}.
    216 In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a current submitter and a next submitter.
     216In practice however, it is important that the \io requests are not left pending indefinitely and as such, it may be required to have a ``next submitter'' that guarentees everything that is missed by the current submitter is seen by the next one.
    217217Indeed, as long as there is a ``next'' submitter, \glspl{thrd} submitting new \io requests can move on, knowing that some future system call will include their request.
    218218Once the system call is done, the submitter must also free SQEs so that the allocator can reused them.
     
    223223If the submission side does not designate submitters, polling can also submit all SQEs as it is polling events.
    224224A simple approach to polling is to allocate a \gls{thrd} per @io_uring@ instance and simply let the poller \glspl{thrd} poll their respective instances when scheduled.
    225 This design is especially convenient for reasons explained in Chapter~\ref{practice}.
    226225
    227226With this pool of instances approach, the big advantage is that it is fairly flexible.
    228227It does not impose restrictions on what \glspl{thrd} submitting \io operations can and cannot do between allocations and submissions.
    229 It also can gracefully handles running out of ressources, SQEs or the kernel returning @EBUSY@.
     228It also can gracefully handle running out of ressources, SQEs or the kernel returning @EBUSY@.
    230229The down side to this is that many of the steps used for submitting need complex synchronization to work properly.
    231230The routing and allocation algorithm needs to keep track of which ring instances have available SQEs, block incoming requests if no instance is available, prevent barging if \glspl{thrd} are already queued up waiting for SQEs and handle SQEs being freed.
    232231The submission side needs to safely append SQEs to the ring buffer, correctly handle chains, make sure no SQE is dropped or left pending forever, notify the allocation side when SQEs can be reused and handle the kernel returning @EBUSY@.
    233 All this synchronization may have a significant cost and, compare to the next approach presented, this synchronization is entirely overhead.
     232All this synchronization may have a significant cost and, compared to the next approach presented, this synchronization is entirely overhead.
    234233
    235234\subsubsection{Private Instances}
    236235Another approach is to simply create one ring instance per \gls{proc}.
    237 This alleviate the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps.
     236This alleviates the need for synchronization on the submissions, requiring only that \glspl{thrd} are not interrupted in between two submission steps.
    238237This is effectively the same requirement as using @thread_local@ variables.
    239238Since SQEs that are allocated must be submitted to the same ring, on the same \gls{proc}, this effectively forces the application to submit SQEs in allocation order
Note: See TracChangeset for help on using the changeset viewer.