background.tex

Timestamp:

Mar 14, 2026, 1:08:08 PM (3 days ago)

Author:

Peter A. Buhr <pabuhr@…>

Branches:

master

Parents:

c979afa

Message:

final proofread of background chapter

File:

: 1 edited

doc/theses/mike_brooks_MMath/background.tex (modified) (19 diffs)


doc/theses/mike_brooks_MMath/background.tex

-              rc979afa
+              r1329d78
 Subscripting proceeds first with pointer decay, if needed.
+Next, \cite[\S~6.5.2.1.2]{C11} explains that @ar[i]@ is treated as if it were @(*((a)+(i)))@.
+\cite[\S~6.5.6.8]{C11} explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array, with a meaning of @i@ elements away from, which is valid if @ar@ is big enough and @i@ is small enough.
+Next, \cite[\S~6.5.2.1.2]{C11} explains that @ar[i]@ is treated as if it is @(*((ar)+(i)))@.
+\cite[\S~6.5.6.8]{C11} explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array.
+The addition gives the address @i@ elements away from the element (@ar@, or @&ar[0]@).
+This address is valid if @ar@ is big enough and @i@ is small enough.
 Finally, \cite[\S~6.5.3.2.4]{C11} explains that the @*@ operator's result is the referenced element.
 Taken together, these rules illustrate that @ar[i]@ and @i[ar]@ mean the same thing, as plus is commutative.
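As a minimal sketch of these rules, the following C function compares the addresses produced by the three spellings of a subscript. The name @same_element@ is hypothetical, and the caller is assumed to pass an index that lies within the array.

```c
#include <assert.h>
#include <stddef.h>

// Returns 1 when ar[i], *(ar + i), and i[ar] all designate the same element.
// Assumption: i is a valid index into the array that ar points into.
int same_element( const int * ar, ptrdiff_t i ) {
	assert( ar != NULL );
	const int * by_addition = ar + i;	// pointer plus integer
	return &ar[i] == by_addition		// usual subscript spelling
		&& &i[ar] == by_addition;	// operands swapped, same address
}
```

For an array @int a[4]@, the call @same_element( a, 2 )@ returns 1, because all three forms reduce to the same pointer addition.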
 …
 Under this assumption, a pointer, @p@, being subscripted (or added to, then dereferenced) by any value (positive, zero, or negative), gives a view of the program's entire address space, centred around @p@'s address, divided into adjacent @sizeof(*p)@ chunks, each potentially (re)interpreted as @typeof(*p).t@.
 I call this phenomenon \emph{array diffraction}, which is a diffraction of a single-element pointer into the assumption that its target is conceptually in the middle of an array whose size is unlimited in both directions.
+I call this phenomenon \emph{array diffraction}, which is a diffraction of a single-element pointer into the assumption that its target is conceptually in the middle of an array whose size is unlimited in both directions, \eg @(&ar[5])[-200]@ or @(&ar[5])[200]@.
 No pointer is exempt from array diffraction.
 No array shows its elements without pointer decay.
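A small sketch of subscripting a mid-array pointer in both directions follows. It assumes a hypothetical 10-element array, so only offsets from -5 to 4 stay inside the array. The language accepts larger offsets without complaint, but their behaviour is undefined. The name @around_middle@ is illustrative only.

```c
#include <assert.h>

// Reads the element that lies off elements away from ar[5]; off may be negative.
// Assumption: ar has exactly 10 elements, so valid offsets are -5 through 4.
int around_middle( const int ar[10], int off ) {
	assert( -5 <= off && off <= 4 );	// the check the language does not perform
	const int * p = &ar[5];			// single-element pointer into the middle
	return p[off];				// treated as *(p + off)
}
```

With elements initialized to their own indices, @around_middle( ar, -2 )@ yields 3 and @around_middle( ar, 4 )@ yields 9.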
 …
 \end{cfa}
 The basic two meanings, with a syntactic difference helping to distinguish, are illustrated in the declarations of @ca@ \vs @cp@, whose subsequent @edit@ calls behave differently.
 The syntax-caused confusion is in the comparison of the first and last lines, both of which use a literal to initialize an object declared with spelling @T x[]@.
+The syntax-caused confusion is in the comparison of the first and last lines, both of which use a literal to initialize an object declared with spelling @T x[ ]@.
 But these initialized declarations get opposite meanings, depending on whether the object is a local variable or a parameter!
 …
 As of C99, the C standard supports a \newterm{variable length array} (VLA)~\cite[\S~6.7.5.2.5]{C99}, providing a dynamic-fixed array feature \see{\VRef{s:ArrayIntro}}.
 Note, the \CC standard does not support VLAs, but @g++@ provides them.
+Note, the \CC standard does not support VLAs, but @g++@ and @clang@ provide them.
 A VLA is used when the desired number of array elements is \emph{unknown} at compile time.
 \begin{cfa}
 …
 \VRef[Figure]{f:ContiguousNon-contiguous} shows a powerful extension made in C99 for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.}
 For contiguous-array arguments (including VLA), C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@.
+There is now sufficient information to support array copying and subscript checking along the columns to prevent changing the argument or buffer-overflow problems, \emph{but neither feature is provided}.
+If the declaration of @fc@ is changed to:
+Hence, if the declaration of @fc@ is changed to:
 \begin{cfa}
 void fc( int rows, int cols, int m[@rows@][@cols@] ) ...
 \end{cfa}
 it is possible for C to perform bound checking across all subscripting.
+there is now sufficient information to support array copying and subscript checking to prevent changing the argument or buffer-overflow problems, \emph{but neither feature is provided}.
 While this contiguous-array capability is a step forward, it is still the programmer's responsibility to manually manage the number of dimensions and their sizes, both at the function definition and call sites.
 That is, the array does not automatically carry its structure and sizes for use in computing subscripts.
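The following sketch shows how a callee relies on the dimension parameters for subscripting. The function name @sum_matrix@ is hypothetical, and it assumes a C99 compiler with VLA support and a caller that passes @rows@ and @cols@ consistent with the actual argument, since nothing checks this.

```c
#include <assert.h>

// Sums a contiguous rows x cols matrix; cols supplies the row stride of m.
// Assumption: the caller's array really has rows x cols elements.
long sum_matrix( int rows, int cols, int m[rows][cols] ) {
	assert( rows > 0 && cols > 0 );
	long total = 0;
	for ( int r = 0; r < rows; r += 1 ) {
		for ( int c = 0; c < cols; c += 1 ) {
			total += m[r][c];	// element offset is r * cols + c
		}
	}
	return total;
}
```

Passing a mismatched @cols@ still compiles, but the stride computation then walks the wrong elements, which is the manual management burden described above.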
 …
 Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment.
 An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}.
 The rationale for rejecting the first invalid row follows shortly, while the second invalid row is nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
+Examining the rows, the rationale for rejecting the first invalid row (row 3) follows shortly, while the second invalid row (row 4) is nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
 Note, in the leftmost style, the typechecker ignores the actual value, even for a dynamic expression.
 \begin{cfa}
 …
 % So are @float[5]*@, @float[]*@ and @float (*)*@.  These latter ones are simply nonsense, though they hint at ``1d array of pointers'', whose equivalent syntax options are, @float *[5]@, @float *[]@, and @float **@.
 It is a matter of taste as to whether a programmer should use the left form to get the most out of commenting subscripting and dimension sizes, sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript'').
+It is a matter of taste as to whether a programmer should use the left form to get the most out of commenting subscripting and dimension sizes, sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript'').
 Note that this equivalence of pointer and array declarations is special to parameters.
 It does not apply to local variables, where true array declarations are possible.
+It does not apply to local variables, where true array declarations are possible (formal \vs actual declarations).
 \begin{cfa}
 void f( float * a ) {
 …
 With multidimensional arrays, on dimensions after the first, a size is required and is not ignored.
 These sizes (strides) are required for the callee to be able to subscript.
+These sizes (strides) are required for the callee to subscript.
 \begin{cfa}
 void f( float a[][10], float b[][100] ) {
 …
 The significance of an inner dimension's length is a fact of the callee's perspective.
 In the caller's perspective, the type system is quite lax.
 Here, there is (some, but) little checking of what is being passed matches the parameter.
+Here, there is some, but little, checking that the argument being passed matches the parameter.
 % void f( float [][10] );
 % int n = 100;
 …
 \label{toc:lst:issue}
 This thesis focuses on a reduced design space for linked lists that target \emph{system programmers}.
+This thesis focuses on a reduced design space for linked lists that are important to, but not limited to, \emph{system programmers}.
 Within this restricted space, all design-issue discussions assume the following invariants.
 \begin{itemize}
 …
 \VRef[Figure]{f:Intrusive} shows the \newterm{intrusive} style, placing the link fields inside the payload structure.
 \VRef[Figures]{f:WrappedRef} and \subref*{f:WrappedValue} show the two \newterm{wrapped} styles, which place the payload inside a generic library-provided structure that then defines the link fields.
 The wrapped style distinguishes between wrapping a reference and wrapping a value, \eg @list<req *>@ or @list<req>@.
+The wrapped style distinguishes between wrapping a reference or a value, \eg @list<req *>@ or @list<req>@.
 (For this discussion, @list<req &>@ is similar to @list<req *>@.)
 This difference is one of user style and performance (copying), not framework capability.
 Library LQ is intrusive; STL is wrapped with reference and value.
+Library LQ is intrusive; STL is wrapped with reference or value.
 \begin{comment}
 …
 \end{figure}
 Each diagrammed example is using the fewest dynamic allocations for its respective style:
 in intrusive, there is no dynamic allocation; in wrapped reference, only the link fields are dynamically allocated; and in wrapped value, the copied data and link fields are dynamically allocated.
+Each diagram in \VRef[Figure]{fig:lst-issues-attach} is using the fewest dynamic allocations for its respective style:
+in intrusive, there is no dynamic allocation; in wrapped reference, only the link fields are dynamically allocated; and in wrapped value, the copy data-area and link fields are dynamically allocated.
 The advantage of intrusive is the control in memory layout and storage placement.
 Both wrapped styles have independent storage layout and imply library-induced heap allocations, with lifetime that matches the item's membership in the list.
 …
 The macro @LIST_INSERT_HEAD( &reqs, &r2, d )@ takes the list header, a pointer to the node, and the offset of the link fields in the node.
 One of the fields generated by @LIST_ENTRY@ is a pointer to the node, which is set to the node address, \eg @r2@.
 Hence, the offset to the link fields provides an access to the entire node, \ie the node points at itself.
+Hence, the offset to the link fields provides access to the entire node, because the node points at itself.
 For list traversal, @LIST_FOREACH( cur, &reqs_pri, by_pri )@, there is the node cursor, the list, and the offset of the link fields within the node.
 The traversal actually moves from link fields to link fields within a node and sets the node cursor from the pointer within the link fields back to the node.
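A minimal sketch of the intrusive style with the @<sys/queue.h>@ macros follows. It reuses the names @req@, @reql@, @reqs_pri@, and @by_pri@ from the discussion. The helper @sum_pri@ and the reduced field set of @req@ are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct req {
	int pri;			// payload
	LIST_ENTRY(req) by_pri;		// link fields embedded in the payload
};
LIST_HEAD(reql, req);			// declares struct reql, the list header

// Links n caller-owned nodes at the head of a list, then sums pri by traversal.
// No dynamic allocation occurs: the nodes live wherever the caller placed them.
int sum_pri( struct req nodes[], size_t n ) {
	assert( nodes != NULL );
	struct reql reqs_pri;
	LIST_INIT( &reqs_pri );
	for ( size_t i = 0; i < n; i += 1 ) {
		LIST_INSERT_HEAD( &reqs_pri, &nodes[i], by_pri );
	}
	int total = 0;
	struct req * cur;		// node cursor
	LIST_FOREACH( cur, &reqs_pri, by_pri ) {
		total += cur->pri;
	}
	return total;
}
```

The third macro argument names the link field, which is how the macros locate the links within each node.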
 …
 Then, a novel use can put a @req@ in a list, without requiring any upstream change in the @req@ library.
 In intrusive, the ability to be listed must be planned during the definition of @req@.
 Optimistically adding a couple links for future use is normally cheap because links are small and memory is big.
+When in doubt, optimistically adding a couple links for future use is cheap because links are small and memory is big.
 \begin{figure}
 …
 \end{figure}
 It is possible to simulate wrapped using intrusive, illustrated in \VRef[Figure]{fig:lst-issues-attach-reduction}.
+Finally, it is possible to simulate wrapped using intrusive, illustrated in \VRef[Figure]{fig:lst-issues-attach-reduction}.
 This shim layer performs the implicit dynamic allocations that pure intrusion avoids.
 But there is no reduction going the other way.
 …
 \VRef[Figure]{fig:lst-issues-multi-static} shows an example that can traverse all requests in priority order (field @pri@) or navigate among requests with the same request value (field @rqr@).
 Each of ``by priority'' and ``by common request value'' is a separate list.
 For example, there is a single priority-list linked in order [1, 2, 2, 3, 3, 4], where nodes may have the same priority, and there are three common request-value lists combining requests with the same values: [42, 42], [17, 17, 17], and [99], giving four head nodes one for each list.
+For example, there is a single priority-list linked in order [1, 2, 2, 3, 3, 4], where nodes may have the same priority, and there are three common request-value lists combining requests with the same values: [42, 42], [17, 17, 17], and [99], giving four head nodes, one for each list.
 The example shows a list can encompass all the nodes (by-priority) or only a subset of the nodes (three request-value lists).
 …
 \begin{tabular}{@{}ll@{}}
 \begin{c++}
+struct Node : public uColable {
+        int i;  // data
+        @NodeDL nodeseq;@  // embedded intrusive links
+        Node( int i ) : i{ i }, @nodeseq{ *this }@ {}
+};
+\end{c++}
+&
+\begin{c++}
 struct NodeDL : public uSeqable {
         @Node & node;@  // node pointer
         NodeDL( Node & node ) : node( node ) {}
         Node & get() const { return node; }
-};
-\end{c++}
-&
-\begin{c++}
-struct Node : public uColable {
-        int i;  // data
-        @NodeDL nodeseq;@  // embedded intrusive links
-        Node( int i ) : i{ i }, @nodeseq{ this }@ {}
 };
 \end{c++}
 …
 an item found in a list (type @req@ of variable @r1@, see \VRef[Figure]{fig:lst-issues-attach}), and the list (type @reql@ of variable @reqs_pri@, see \VRef[Figure]{fig:lst-issues-ident}).
 This kind of list is \newterm{headed}, where the empty list is just a head.
 An alternate ad-hoc approach omits the header, where the empty list is no nodes.
+An alternate \emph{ad-hoc} approach omits the header, where the empty list is no nodes.
 Here, a pointer to any node can traverse its link fields: right or left and around, depending on the data structure.
 Note, a headed list is superset of an ad-hoc list, and can normally perform all of the ad-hoc operations.
+Note, a headed list is a superset of an ad-hoc list, and can normally perform all of the ad-hoc operations.
 \VRef[Figure]{fig:lst-issues-ident} shows both approaches for different list lengths and unlisted elements.
 For headed, there are length-zero lists (heads with no elements), and an element can be listed or not listed.
 …
 Finally, the need to repeatedly scan an entire string to determine its length can result in significant cost, as it is impossible to cache the length in many cases, \eg when a string is passed into another function.
 C strings are fixed size because arrays are used for the implementation.
+C strings are fixed size, because arrays are used for the implementation.
 However, string manipulation commonly results in dynamically-sized temporary and final string values, \eg @strcpy@, @strcat@, @strcmp@, @strlen@, @strstr@, \etc.
 As a result, storage management for C strings is a nightmare, quickly resulting in array overruns and incorrect results.
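To illustrate the bookkeeping this forces on the programmer, the following sketch appends to a fixed-size C string only when the result fits. The name @append_checked@ and its capacity parameter are hypothetical; the standard @strcat@ performs no such check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Appends src to the string in dst, whose underlying array holds cap bytes.
// Returns 1 on success, or 0 (leaving dst unchanged) when the result and its
// terminator would not fit.
int append_checked( char * dst, size_t cap, const char * src ) {
	assert( dst != NULL && src != NULL );
	size_t dlen = strlen( dst );		// linear scan: the length is not stored
	size_t slen = strlen( src );		// second linear scan
	if ( dlen >= cap || slen >= cap - dlen ) return 0;
	memcpy( dst + dlen, src, slen + 1 );	// characters plus terminating '\0'
	return 1;
}
```

Each call rescans both strings to find their lengths, and the capacity must be threaded through by hand because the array does not carry it.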
