Ignore:
Timestamp:
Mar 14, 2026, 1:08:08 PM (3 days ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Parents:
c979afa
Message:

final proofread of background chapter

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/background.tex

    rc979afa r1329d78  
    302302
    303303Subscripting proceeds first with pointer decay, if needed.
    304 Next, \cite[\S~6.5.2.1.2]{C11} explains that @ar[i]@ is treated as if it were @(*((a)+(i)))@.
    305 \cite[\S~6.5.6.8]{C11} explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array, with a meaning of @i@ elements away from, which is valid if @ar@ is big enough and @i@ is small enough.
     304Next, \cite[\S~6.5.2.1.2]{C11} explains that @ar[i]@ is treated as if it is @(*((ar)+(i)))@.
     305\cite[\S~6.5.6.8]{C11} explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array.
     306The addition gives the address @i@ elements away from the element (@ar@, or @&ar[0]@).
     307This address is valid if @ar@ is big enough and @i@ is small enough.
    306308Finally, \cite[\S~6.5.3.2.4]{C11} explains that the @*@ operator's result is the referenced element.
    307309Taken together, these rules illustrate that @ar[i]@ and @i[a]@ mean the same thing, as plus is commutative.
     
    319321
    320322Under this assumption, a pointer, @p@, being subscripted (or added to, then dereferenced) by any value (positive, zero, or negative), gives a view of the program's entire address space, centred around @p@'s address, divided into adjacent @sizeof(*p)@ chunks, each potentially (re)interpreted as @typeof(*p).t@.
    321 I call this phenomenon \emph{array diffraction}, which is a diffraction of a single-element pointer into the assumption that its target is conceptually in the middle of an array whose size is unlimited in both directions.
     323I call this phenomenon \emph{array diffraction}, which is a diffraction of a single-element pointer into the assumption that its target is conceptually in the middle of an array whose size is unlimited in both directions, \eg @(&ar[5])[-200]@ or @(&ar[5])[200]@).
    322324No pointer is exempt from array diffraction.
    323325No array shows its elements without pointer decay.
     
    354356\end{cfa}
    355357The basic two meanings, with a syntactic difference helping to distinguish, are illustrated in the declarations of @ca@ \vs @cp@, whose subsequent @edit@ calls behave differently.
    356 The syntax-caused confusion is in the comparison of the first and last lines, both of which use a literal to initialize an object declared with spelling @T x[]@.
     358The syntax-caused confusion is in the comparison of the first and last lines, both of which use a literal to initialize an object declared with spelling @T x[ ]@.
    357359But these initialized declarations get opposite meanings, depending on whether the object is a local variable or a parameter!
    358360
     
    417419
    418420As of C99, the C standard supports a \newterm{variable length array} (VLA)~\cite[\S~6.7.5.2.5]{C99}, providing a dynamic-fixed array feature \see{\VRef{s:ArrayIntro}}.
    419 Note, the \CC standard does not support VLAs, but @g++@ provides them.
     421Note, the \CC standard does not support VLAs, but @g++@ and @clang@ provide them.
    420422A VLA is used when the desired number of array elements is \emph{unknown} at compile time.
    421423\begin{cfa}
     
    527529\VRef[Figure]{f:ContiguousNon-contiguous} shows a powerful extension made in C99 for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.}
    528530For contiguous-array arguments (including VLA), C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@.
    529 There is now sufficient information to support array copying and subscript checking along the columns to prevent changing the argument or buffer-overflow problems, \emph{but neither feature is provided}.
    530 If the declaration of @fc@ is changed to:
     531Hence, if the declaration of @fc@ is changed to:
    531532\begin{cfa}
    532533void fc( int rows, int cols, int m[@rows@][@cols@] ) ...
    533534\end{cfa}
    534 it is possible for C to perform bound checking across all subscripting.
     535there is now sufficient information to support array copying and subscript checking to prevent changing the argument or buffer-overflow problems, \emph{but neither feature is provided}.
    535536While this contiguous-array capability is a step forward, it is still the programmer's responsibility to manually manage the number of dimensions and their sizes, both at the function definition and call sites.
    536537That is, the array does not automatically carry its structure and sizes for use in computing subscripts.
     
    673674Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment.
    674675An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}.
    675 The rationale for rejecting the first invalid row follows shortly, while the second invalid row is nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
     676Examining the rows, the rationale for rejecting the first invalid row (row 3) follows shortly, while the second invalid row (row 4) is nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
    676677Note, in the leftmost style, the typechecker ignores the actual value, even for a dynamic expression.
    677678\begin{cfa}
     
    684685% So are @float[5]*@, @float[]*@ and @float (*)*@.  These latter ones are simply nonsense, though they hint at ``1d array of pointers'', whose equivalent syntax options are, @float *[5]@, @float *[]@, and @float **@.
    685686
    686 It is a matter of taste as to whether a programmer should use the left form to get the most out of commenting subscripting and dimension sizes, sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript'').
     687It is a matter of taste as to whether a programmer should use the left form to get the most out of commenting subscripting and dimension sizes, sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I am subscript'').
    687688
    688689Note that this equivalence of pointer and array declarations is special to parameters.
    689 It does not apply to local variables, where true array declarations are possible.
     690It does not apply to local variables, where true array declarations are possible (formal \vs actual declarations).
    690691\begin{cfa}
    691692void f( float * a ) {
     
    738739
    739740With multidimensional arrays, on dimensions after the first, a size is required and, is not ignored.
    740 These sizes (strides) are required for the callee to be able to subscript.
     741These sizes (strides) are required for the callee to subscript.
    741742\begin{cfa}
    742743void f( float a[][10], float b[][100] ) {
     
    749750The significance of an inner dimension's length is a fact of the callee's perspective.
    750751In the caller's perspective, the type system is quite lax.
    751 Here, there is (some, but) little checking of what is being passed matches the parameter.
     752Here, there is (some, but) little checking if the argument being passed matches the parameter.
    752753% void f( float [][10] );
    753754% int n = 100;
     
    838839\label{toc:lst:issue}
    839840
    840 This thesis focuses on a reduced design space for linked lists that target \emph{system programmers}.
     841This thesis focuses on a reduced design space for linked lists that are important to, but not limited to, \emph{system programmers}.
    841842Within this restricted space, all design-issue discussions assume the following invariants.
    842843\begin{itemize}
     
    874875\VRef[Figure]{f:Intrusive} shows the \newterm{intrusive} style, placing the link fields inside the payload structure.
    875876\VRef[Figures]{f:WrappedRef} and \subref*{f:WrappedValue} show the two \newterm{wrapped} styles, which place the payload inside a generic library-provided structure that then defines the link fields.
    876 The wrapped style distinguishes between wrapping a reference and wrapping a value, \eg @list<req *>@ or @list<req>@.
     877The wrapped style distinguishes between wrapping a reference or a value, \eg @list<req *>@ or @list<req>@.
    877878(For this discussion, @list<req &>@ is similar to @list<req *>@.)
    878879This difference is one of user style and performance (copying), not framework capability.
    879 Library LQ is intrusive; STL is wrapped with reference and value.
     880Library LQ is intrusive; STL is wrapped with reference or value.
    880881
    881882\begin{comment}
     
    961962\end{figure}
    962963
    963 Each diagrammed example is using the fewest dynamic allocations for its respective style:
    964 in intrusive, here is no dynamic allocation, in wrapped reference only the linked fields are dynamically allocated, and in wrapped value the copied data and linked fields are dynamically allocated.
     964Each diagram in \VRef[Figure]{fig:lst-issues-attach} is using the fewest dynamic allocations for its respective style:
     965in intrusive, here is no dynamic allocation, in wrapped reference only the linked fields are dynamically allocated, and in wrapped value the copy data-area and linked fields are dynamically allocated.
    965966The advantage of intrusive is the control in memory layout and storage placement.
    966967Both wrapped styles have independent storage layout and imply library-induced heap allocations, with lifetime that matches the item's membership in the list.
     
    975976The macro @LIST_INSERT_HEAD( &reqs, &r2, d )@ takes the list header, a pointer to the node, and the offset of the link fields in the node.
    976977One of the fields generated by @LIST_ENTRY@ is a pointer to the node, which is set to the node address, \eg @r2@.
    977 Hence, the offset to the link fields provides an access to the entire node, \ie the node points at itself.
     978Hence, the offset to the link fields provides an access to the entire node, because the node points at itself.
    978979For list traversal, @LIST_FOREACH( cur, &reqs_pri, by_pri )@, there is the node cursor, the list, and the offset of the link fields within the node.
    979980The traversal actually moves from link fields to link fields within a node and sets the node cursor from the pointer within the link fields back to the node.
     
    9991000Then, a novel use can put a @req@ in a list, without requiring any upstream change in the @req@ library.
    10001001In intrusive, the ability to be listed must be planned during the definition of @req@.
    1001 Optimistically adding a couple links for future use is normally cheap because links are small and memory is big.
     1002When in doubt, optimistically adding a couple links for future use is cheap because links are small and memory is big.
    10021003
    10031004\begin{figure}
     
    10161017\end{figure}
    10171018
    1018 It is possible to simulate wrapped using intrusive, illustrated in \VRef[Figure]{fig:lst-issues-attach-reduction}.
     1019Finally, it is possible to simulate wrapped using intrusive, illustrated in \VRef[Figure]{fig:lst-issues-attach-reduction}.
    10191020This shim layer performs the implicit dynamic allocations that pure intrusion avoids.
    10201021But there is no reduction going the other way.
     
    10531054\VRef[Figure]{fig:lst-issues-multi-static} shows an example that can traverse all requests in priority order (field @pri@) or navigate among requests with the same request value (field @rqr@).
    10541055Each of ``by priority'' and ``by common request value'' is a separate list.
    1055 For example, there is a single priority-list linked in order [1, 2, 2, 3, 3, 4], where nodes may have the same priority, and there are three common request-value lists combining requests with the same values: [42, 42], [17, 17, 17], and [99], giving four head nodes one for each list.
     1056For example, there is a single priority-list linked in order [1, 2, 2, 3, 3, 4], where nodes may have the same priority, and there are three common request-value lists combining requests with the same values: [42, 42], [17, 17, 17], and [99], giving four head nodes, one for each list.
    10561057The example shows a list can encompass all the nodes (by-priority) or only a subset of the nodes (three request-value lists).
    10571058
     
    11201121\begin{tabular}{@{}ll@{}}
    11211122\begin{c++}
     1123struct Node : public uColable {
     1124        int i;  // data
     1125        @NodeDL nodeseq;@  // embedded intrusive links
     1126        Node( int i ) : i{ i }, @nodeseq{ this }@ {}
     1127};
     1128\end{c++}
     1129&
     1130\begin{c++}
    11221131struct NodeDL : public uSeqable {
    11231132        @Node & node;@  // node pointer
    11241133        NodeDL( Node & node ) : node( node ) {}
    11251134        Node & get() const { return node; }
    1126 };
    1127 \end{c++}
    1128 &
    1129 \begin{c++}
    1130 struct Node : public uColable {
    1131         int i;  // data
    1132         @NodeDL nodeseq;@  // embedded intrusive links
    1133         Node( int i ) : i{ i }, @nodeseq{ this }@ {}
    11341135};
    11351136\end{c++}
     
    11791180an item found in a list (type @req@ of variable @r1@, see \VRef[Figure]{fig:lst-issues-attach}), and the list (type @reql@ of variable @reqs_pri@, see \VRef[Figure]{fig:lst-issues-ident}).
    11801181This kind of list is \newterm{headed}, where the empty list is just a head.
    1181 An alternate ad-hoc approach omits the header, where the empty list is no nodes.
     1182An alternate \emph{ad-hoc} approach omits the header, where the empty list is no nodes.
    11821183Here, a pointer to any node can traverse its link fields: right or left and around, depending on the data structure.
    1183 Note, a headed list is superset of an ad-hoc list, and can normally perform all of the ad-hoc operations.
     1184Note, a headed list is a superset of an ad-hoc list, and can normally perform all of the ad-hoc operations.
    11841185\VRef[Figure]{fig:lst-issues-ident} shows both approaches for different list lengths and unlisted elements.
    11851186For headed, there are length-zero lists (heads with no elements), and an element can be listed or not listed.
     
    12961297Finally, the need to repeatedly scan an entire string to determine its length can result in significant cost, as it is impossible to cache the length in many cases, \eg when a string is passed into another function.
    12971298
    1298 C strings are fixed size because arrays are used for the implementation.
     1299C strings are fixed size, because arrays are used for the implementation.
    12991300However, string manipulation commonly results in dynamically-sized temporary and final string values, \eg @strcpy@, @strcat@, @strcmp@, @strlen@, @strstr@, \etc.
    13001301As a result, storage management for C strings is a nightmare, quickly resulting in array overruns and incorrect results.
Note: See TracChangeset for help on using the changeset viewer.