Changeset 123e8b9


Ignore:
Timestamp:
May 6, 2024, 1:15:43 PM (7 weeks ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
1379c96e
Parents:
0775468
Message:

move background material from list chapter to background chapter

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/list.tex

    r0775468 r123e8b9  
    1212TODO: more summary
    1313
    14 
    15 \section{Design Issues}
    16 \label{toc:lst:issue}
    17 
    18 This section introduces the design space for linked lists that target system programmers.
    19 
    20 All design-issue discussions assume the following invariants.
    21 \PAB{They are stated here to clarify that none of the discussed design issues refers to one of these.}
    22 Alternatives to the assumptions are discussed under Future Work (Section~\ref{toc:lst:futwork}).
    23 \begin{itemize}
    24     \item A doubly-linked list is being designed.
    25           Generally, the discussed issues apply similarly for singly-linked lists.
    26           Circular vs ordered linking is discussed under List identity (Section~\ref{toc:lst:issue:ident}).
    27     \item Link fields are system-managed.
    28           The user works with the system-provided API to query and modify list membership.
    29           The system has freedom over how to represent these links.
    30           The library is not providing applied wrapper operations that consume a user's hand-implemented list primitives.
    31     \item \PAB{These issues are compared at a requirement/functional level.}
    32 \end{itemize}
    33 
    34 Two preexisting linked-list libraries are used throughout, to show examples of the concepts being defined,
    35 and further libraries are introduced as needed.
    36 \begin{description}
    37     \item[LQ] Linux Queue library\cite{lst:linuxq} of @<sys/queue.h>@.
    38     \item[STL] C++ Standard Template Library's @std::list@\cite{lst:stl}
    39 \end{description}
    40 A general comparison of libraries' abilities is given under Related Work (Section~\ref{toc:lst:relwork}).
    41 
    42 The fictional type @req@ (request) is the user's payload in examples.
    43 The list library is helping the user track requests.
    44 A request represents work that the user must do but has not done yet.
    45 This work is on the level of handling a network arrival event or scheduling a thread.
    46 
    47 
    48 
    49 \subsection{Link attachment: Intrusive vs.\ Wrapped}
    50 \label{toc:lst:issue:attach}
    51 
    52 Link attachment deals with the question:
    53 Where are the system's inter-element link fields stored, in relation to the user's payload data fields?
    54 An intrusive list places the link fields inside the payload structure.
    55 A wrapped list places the payload inside a generic system-provided structure that also defines the link fields.
    56 LQ is intrusive; STL is wrapped.
    57 
    58 The wrapped style admits the further distinction between wrapping a reference and wrapping a value.
    59 This distinction is pervasive in all STL collections; @list<req *>@ wraps a reference; @list<req>@ wraps a value.
    60 (For this discussion, @list<req &>@ is similar to @list<req *>@.)
    61 This difference is one of user style, not framework capability.
    62 Figure~\ref{fig:lst-issues-attach} compares the three styles.
    63 
    64 \begin{comment}
    65 \begin{figure}
    66     \begin{tabularx}{\textwidth}{Y|Y|Y}
    67                 \lstinput[language=C]{20-39}{lst-issues-intrusive.run.c}
    68         &\lstinputlisting[language=C++]{20-39}{lst-issues-wrapped-byref.run.cpp}
    69         &\lstinputlisting[language=C++]{20-39}{lst-issues-wrapped-emplaced.run.cpp}
    70       \\ & &
    71       \\
    72         \includegraphics[page=1]{lst-issues-attach.pdf}
    73         &
    74         \includegraphics[page=2]{lst-issues-attach.pdf}
    75         &
    76         \includegraphics[page=3]{lst-issues-attach.pdf}
    77       \\ & &
    78       \\
    79         (a) & (b) & (c)
    80     \end{tabularx}
    81 \caption{
    82         Three styles of link attachment: (a)~intrusive, (b)~wrapped reference, and (c)~wrapped value.
    83         The diagrams show the memory layouts that result after the code runs, eliding the head object \lstinline{reqs};
    84         head objects are discussed in Section~\ref{toc:lst:issue:ident}.
    85         In (a), the field \lstinline{req.x} names a list direction;
    86         these are discussed in Section~\ref{toc:lst:issue:derection}.
    87         In (b) and (c), the type \lstinline{node} represents a system-internal type,
    88         which is \lstinline{std::_List_node} in the GNU implementation.
    89         (TODO: cite? found in  /usr/include/c++/7/bits/stl\_list.h )
    90     }
    91      \label{fig:lst-issues-attach}
    92 \end{figure}
    93 \end{comment}
    94 
    95 \begin{figure}
    96 \centering
    97 \newsavebox{\myboxA}                                    % used with subfigure
    98 \newsavebox{\myboxB}
    99 \newsavebox{\myboxC}
    100 
    101 \begin{lrbox}{\myboxA}
    102 \begin{tabular}{@{}l@{}}
    103 \lstinput[language=C]{20-39}{lst-issues-intrusive.run.c} \\
    104 \ \\
    105 \includegraphics[page=1]{lst-issues-attach.pdf}
    106 \end{tabular}
    107 \end{lrbox}
    108 
    109 \begin{lrbox}{\myboxB}
    110 \begin{tabular}{@{}l@{}}
    111 \lstinput[language=C++]{20-39}{lst-issues-wrapped-byref.run.cpp} \\
    112 \ \\
    113 \includegraphics[page=2]{lst-issues-attach.pdf}
    114 \end{tabular}
    115 \end{lrbox}
    116 
    117 \begin{lrbox}{\myboxC}
    118 \begin{tabular}{@{}l@{}}
    119 \lstinput[language=C++]{20-39}{lst-issues-wrapped-emplaced.run.cpp} \\
    120 \ \\
    121 \includegraphics[page=3]{lst-issues-attach.pdf}
    122 \end{tabular}
    123 \end{lrbox}
    124 
    125 \subfloat[Intrusive]{\label{f:Intrusive}\usebox\myboxA}
    126 \hspace{10pt}
    127 \vrule
    128 \hspace{10pt}
    129 \subfloat[Wrapped reference]{\label{f:WrappedRef}\usebox\myboxB}
    130 \hspace{10pt}
    131 \vrule
    132 \hspace{10pt}
    133 \subfloat[Wrapped value]{\label{f:WrappedValue}\usebox\myboxC}
    134 
    135 \caption{
    136         Three styles of link attachment: \protect\subref*{f:Intrusive}~intrusive, \protect\subref*{f:WrappedRef}~wrapped
    137         reference, and \protect\subref*{f:WrappedValue}~wrapped value.
    138         The diagrams show the memory layouts that result after the code runs, eliding the head object \lstinline{reqs};
    139         head objects are discussed in Section~\ref{toc:lst:issue:ident}.
    140         In \protect\subref*{f:Intrusive}, the field \lstinline{req.x} names a list direction;
    141         these are discussed in Section~\ref{toc:lst:issue:derection}.
    142         In \protect\subref*{f:WrappedRef} and \protect\subref*{f:WrappedValue}, the type \lstinline{node} represents a
    143         system-internal type, which is \lstinline{std::_List_node} in the GNU implementation.
    144         (TODO: cite? found in  /usr/include/c++/7/bits/stl\_list.h )
    145     }
    146     \label{fig:lst-issues-attach}
    147 \end{figure}
    148 
    149 The advantage of intrusive attachment is the control that it gives the user over memory layout.
    150 Each diagrammed example is using the fewest dynamic allocations that its respective style allows.
    151 Both wrapped attachment styles imply system-induced heap allocations.
    152 Such an allocation has a lifetime that matches the item's membership in the list.
    153 In \subref*{f:Intrusive} and \subref*{f:WrappedRef}, one @req@ object can enter and leave a list many times.
    154 In \subref*{f:WrappedRef}, it implies a dynamic allocation/deallocation for each enter/leave; in \subref*{f:Intrusive}, it does not.
    155 
    156 A further aspect of layout control is allowing the user to specify the location of the link fields within the @req@ object.
    157 LQ allows this ability; a different mechanism of intrusion, such as inheriting from a @linkable@ base type, may not.
    158 Having this control means the user can allocate the link fields to cache lines along with the other @req@ fields.
    159 Doing this allocation sensibly can help improve locality or avoid false sharing.
    160 With an attachment mechanism that does not offer this control,
    161 a framework design choice or fact of the host language forces the links to be contiguous with either the beginning or end of the @req@.
    162 All wrapping realizations have this limitation in their wrapped-value configurations.
    163 
    164 Another subtle advantage of intrusive arrangement is that
    165 a reference to a user-level item (@req@) is sufficient to navigate or manage the item's membership.
    166 In LQ, \subref*{f:Intrusive}, a @req@ pointer is the right argument type for operations @LIST_NEXT@ or @LIST_REMOVE@;
    167 there is no distinguishing a @req@ from ``a @req@ in a list.''
    168 The same is not true of STL, \subref*{f:WrappedRef} or \subref*{f:WrappedValue}.
    169 There, the analogous operations work on a parameter of type @list<T>::iterator@;
    170 they are @iterator::operator++()@, @iterator::operator*()@, and @list::erase(iterator)@.
    171 There is no mapping from @req &@ to @list<req>::iterator@, except for linear search.
    172 
    173 The advantage of wrapped attachment is the abstraction of a data item from its list membership(s).
    174 In the wrapped style, the @req@ type can come from a library that serves many independent uses,
    175 which generally have no need for listing.
    176 Then, a novel use can still put @req@ in a (further) list, without requiring any upstream change in the @req@ library.
    177 In intrusive attachment, the ability to be listed must be planned during the definition of @req@.
    178 Similarly, style \subref*{f:WrappedRef} allows for one @req@ to occur at several positions in one list.
    179 Styles \subref*{f:Intrusive} and \subref*{f:WrappedValue} do not support this ability.
    180 \PAB{But style \subref*{f:WrappedValue} can sort of mimic this effect by have multiple copies of \lstinline{req} in the list, modulo changes to the copies are not seen by the original.}
    181 
    182 \begin{figure}
    183     \lstinput[language=C++]{100-117}{lst-issues-attach-reduction.hpp}
    184     \lstinput[language=C++]{150-150}{lst-issues-attach-reduction.hpp}
    185     \caption{
    186         Reduction of wrapped attachment to intrusive attachment.
    187         Illustrated by pseudocode implementation of an STL-compatible API fragment
    188         using LQ as the underlying implementation.
    189         The gap that makes it pseudocode is that
    190         the LQ C macros do not expand to valid C++ when instantiated with template parameters---there is no \lstinline{struct El}.
    191         When using a custom-patched version of LQ to work around this issue,
    192         the programs of Figure~\ref{f:WrappedRef} and \protect\subref*{f:WrappedValue} work with this shim in place of real STL.
    193         Their executions lead to the same memory layouts.
    194     }
    195     \label{fig:lst-issues-attach-reduction}
    196 \end{figure}
    197 
    198 Wrapped attachment has a straightforward reduction to intrusive attachment, illustrated in Figure~\ref{fig:lst-issues-attach-reduction}.
    199 This shim layer performs the implicit dynamic allocations that pure intrusion avoids.
    200 But there is no reduction going the other way.
    201 No shimming can cancel the allocations to which wrapped membership commits.
    202 
    203 So intrusion is a lower-level listing primitive.
    204 And so, the system design choice is not between forcing users to use intrusion or wrapping.
    205 The choice is whether or not to provide access to an allocation-free layer of functionality.
    206 A wrapped-primitive library like STL forces users to incur the costs of wrapping, whether or not they access its benefits.
    207 An intrusive-primitive library like LQ lets users choose when to make this tradeoff.
    208 
    209 
    210 \subsection{Directionality: Single vs.\ Multi-Static vs.\ Dynamic}
    211 \label{toc:lst:issue:derection}
    212 
    213 Axis?
    214 
    215 \PAB{I'm not sure about the term \newterm{Directionality}. Directionality to me, means going forward or backwards through a list.
    216 Would \newterm{dimensionality} work? Think of each list containing the node as a different dimension in which the node sits.}
    217 
    218 Directionality deals with the question:
    219 In how many different lists can an item be stored, at a given time?
    220 
    221 Consider STL in the wrapped-value arrangement of Figure~\ref{f:WrappedValue}.
    222 The STL API completely hides its @node@ type from a user; the user cannot configure this choice or impose a custom one.
    223 STL's @node@ type offers the sole set of links shown in the diagram.
    224 Therefore, every @req@ in existence is allocated either to belong to an occurrence of the diagrammed arrangement,
    225 or to be apart from all occurrences of it.
    226 In the first case, the @req@ belongs to exactly one list (of the style in question).
    227 STL with wrapped values supports a single link direction.
    228 
    229 \begin{figure}
    230     \parbox[t]{3.5in} {
    231         \lstinput[language=C++]{20-60}{lst-issues-multi-static.run.c}
    232     }\parbox[t]{20in} {
    233         ~\\
    234         \includegraphics[page=1]{lst-issues-direct.pdf} \\
    235         ~\\
    236         \hspace*{1.5in}\includegraphics[page=2]{lst-issues-direct.pdf}
    237     }
    238     \caption{
    239         Example of two link directions, with an LQ realization. 
    240         The zoomed-out diagram portion shows the whole example dataset, conceptually.
    241         A consumer of this structure can navigate all requests in priority order, and navigate among requests from a common requestor.
    242         The code is the LQ implementation.
    243         The zoomed-in diagram portion shows the field-level state that results from running the LQ code.
    244     }
    245     \label{fig:lst-issues-multi-static}
    246 \end{figure}
    247 
    248 
    249 The user may benefit from a further link direction.
    250 Suppose the user must both: navigate all requests in priority order, and navigate among requests from a common requestor.
    251 Figure~\ref{fig:lst-issues-multi-static} shows such a situation.
    252 Each of its ``by priority'' and ``by requestor'' is a link direction.
    253 The example shows that a link direction can occur either as one global list (by-priority) or as many lists (there are three by-requestor lists).
    254 
    255 The limitation of intrusive attachment presented in Section~\ref{toc:lst:issue:attach}
    256 has a straightforward extension to multiple directions.
    257 The set of directions by which an item is to be listed must be planned during the definition of the item.
    258 Thus, intrusive LQ supports multiple, but statically many, link directions.
    259 
    260 % https://www.geeksforgeeks.org/introduction-to-multi-linked-list/ -- example of building a bespoke multi-linked list out of STL primitives (providing indication that STL doesn't offer one); offers dynamic directionality by embedding `vector<struct node*> pointers;`
    261 
    262 The corresponding flexibility of wrapped attachment means
    263 the STL wrapped-reference arrangement supports an item being a member of arbitrarily many lists.
    264 This support also applies to the wrapped-value list because the @req@ is copied,
    265 but wrapped-reference lists provide further link directions.
    266 \PAB{Explain how}
    267 STL with wrapped references supports dynamic link directions.
    268 \PAB{Expand}
    269 
    270 When allowing multiple static directions, frameworks differ in their ergonomics for
    271 the typical case: when the user needs only one direction, vs.\ the atypical case, when the user needs several.
    272 LQ's ergonomics are well-suited to the uncommon case of multiple list directions.
    273 Its intrusion declaration and insertion operation both use a mandatory explicit parameter naming the direction.
    274 This decision works well in Figure~\ref{fig:lst-issues-multi-static}, where the names @by_pri@ and @by_rqr@ work well,
    275 but it clutters Figure~\ref{f:Intrusive}, where a contrived name must be invented and used.
    276 The example uses @x@; @reqs@ would be a more readily ignored choice. \PAB{wording?}
    277 
    278 \uCpp offers an intrusive list that makes the opposite choice.  TODO: elaborate on inheritance for first direction and acrobatics for subsequent directions.
    279 
    280 
    281 \subsection{User integration: Preprocessed vs.\ Type-System Mediated}
    282 
    283 % example of poor error message due to LQ's preprocessed integration
    284 % programs/lst-issues-multi-static.run.c:46:1: error: expected identifier or '(' before 'do'
    285 %    46 | LIST_INSERT_HEAD(&reqs_rtr_42, &r42b, by_rqr);
    286 %       | ^~~~~~~~~~~~~~~~
    287 %
    288 % ... not a wonderful example; it was a missing semicolon on the preceding line; but at least real
    289 
    290 
    291 \subsection{List identity: Headed vs.\ Ad-hoc}
    292 \label{toc:lst:issue:ident}
    293 
    294 All examples so far have used distinct user-facing types:
    295 an item found in a list (type @req@, of variables like @r1@), and
    296 a list (type @reql@ or @list<req>@, of variables like @reqs@ or @reqs_rqr_42@).
    297 \see{Figure~\ref{fig:lst-issues-attach} and Figure~\ref{fig:lst-issues-multi-static}}
    298 The latter type is a head, and these examples are of headed lists.
    299 
    300 A bespoke ``pointer to next @req@'' implementation often omits the latter type.
    301 Such a list model is ad-hoc.
    302 
    303 In headed thinking, there are length-zero lists (heads with no elements), and an element can be listed or not listed.
    304 In ad-hoc thinking, there are no length-zero lists and every element belongs to a list of length at least one.
    305 \PAB{Create a figure for this.}
    306 
    307 By omitting the head, elements can enter into an adjacency relationship,
    308 without requiring that someone allocate room for the head of the possibly-resulting list,
    309 or being able to find a correct existing head.
    310 
    311 A head defines one or more element roles, among elements that share a transitive adjacency.
    312 ``First'' and ``last'' are element roles.
    313 One moment's ``first'' need not be the next moment's.
    314 
    315 There is a cost to maintaining these roles.
    316 A runtime component of this cost is evident in LQ's offering the choice of type generators @LIST@ vs.~@TAILQ@.
    317 Its @LIST@ maintains a ``first,'' but not a ``last;'' its @TAILQ@ maintains both roles.
    318 (Both types are doubly linked and an analogous choice is available for singly linked.)
    319 
    320 TODO: finish making this point
    321 
    322 See WIP in lst-issues-adhoc-*.ignore.*.
    323 
    324 The code-complexity component of the cost ...
    325 
    326 Ability to offer heads is good.  Point: Does maintaining a head mean that the user has to provide more state when manipulating the list?  Requiring the user to do so is bad, because the user may have lots of "list" typed variables in scope, and detecting that the user passed the wrong one requires testing all the listing edge cases.
    327 
    328 \subsection{End treatment: Uniform }
    32914
    33015
     
    424109Links in GDB.
    425110
    426 \section{Related Work}
    427 \label{toc:lst:relwork}
Note: See TracChangeset for help on using the changeset viewer.