Changeset 0554c1a for doc/theses/mike_brooks_MMath

Timestamp: Apr 9, 2024, 10:11:29 PM (12 months ago)
Author: Peter A. Buhr <pabuhr@…>
Message: finish current proofreading of background chapter
Location: doc/theses/mike_brooks_MMath
Files: 2 edited

background.tex (modified) (12 diffs)
programs/bkgd-carray-decay.c (modified) (2 diffs)

doc/theses/mike_brooks_MMath/background.tex

-                      r0bbe172
+                      r0554c1a
 \chapter{Background}
 Since this work builds on C, it is necessary to explain the C mechanisms and their shortcomings for array, linked list, and string,
+Since this work builds on C, it is necessary to explain the C mechanisms and their shortcomings for array, linked list, and string.
 …
 \begin{cquote}
 \begin{tabular}{@{}ll@{}}
 \multicolumn{1}{@{}c}{\textbf{Array}} & \multicolumn{1}{c@{}}{\textbf{Function}} \\
+\multicolumn{1}{@{}c}{\textbf{Array}} & \multicolumn{1}{c@{}}{\textbf{Function Pointer}} \\
 \begin{cfa}
 int @(*@ar@)[@5@]@; // definition
 …
 After all, reading a C array type is easy: just read it from the inside out, and know when to look left and when to look right!
 \CFA provides its own type, variable and routine declarations, using a different syntax.
+\CFA provides its own type, variable and routine declarations, using a simpler syntax.
 The new declarations place qualifiers to the left of the base type, while C declarations place qualifiers to the right of the base type.
 The qualifiers have the same meaning in \CFA as in C.
 Hence, a \CFA declaration is read left to right, where a function return type is enclosed in brackets @[]@.
+Then, a \CFA declaration is read left to right, where a function return type is enclosed in brackets @[@\,@]@.
 \begin{cquote}
 \begin{tabular}{@{}l@{\hspace{3em}}ll@{}}
 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{C}}   & \multicolumn{1}{c}{\textbf{\CFA}} &   \\
+\multicolumn{1}{c@{\hspace{3em}}}{\textbf{C}}   & \multicolumn{1}{c}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{read left to right}}   \\
 \begin{cfa}
 int @*@ x1 @[5]@;
 …
 Each row of the table shows alternate syntactic forms.
 The simplest occurrences of types distinguished in the preceding discussion are marked with $\triangleright$.
 Removing the declared variable @x@, gives the type used for variable, structure field, cast or error message \PAB{(though note Section TODO points out that some types cannot be casted to)}.
+Removing the declared variable @x@ gives the type used for a variable, structure field, cast, or error message \PAB{(though note Section TODO points out that some types cannot be cast to)}.
 Unfortunately, parameter declarations \PAB{(section TODO)} have more syntactic forms and rules.
 \begin{table}
 \centering
 \caption{Syntactic Reference for Array vs Pointer. Includes interaction with constness.}
+\caption{Syntactic Reference for Array vs Pointer. Includes interaction with \lstinline{const}ness.}
 \label{bkgd:ar:usr:avp}
 \begin{tabular}{ll|l|l|l}
         & Description & \multicolumn{1}{c|}{C} & \multicolumn{1}{c|}{\CFA}  & \multicolumn{1}{c}{\CFA-thesis} \\
         \hline
         $\triangleright$ & value & @T x;@ & @T x;@ & \\
+$\triangleright$ & value & @T x;@ & @T x;@ & \\
         \hline
         & immutable value & @const T x;@ & @const T x;@ & \\
         & & @T const x;@ & @T const x;@ & \\
         \hline \hline
         $\triangleright$ & pointer to value & @T * x;@ & @* T x;@ & \\
+$\triangleright$ & pointer to value & @T * x;@ & @* T x;@ & \\
         \hline
         & immutable ptr. to val. & @T * const x;@ & @const * T x;@ & \\
 …
         & & @T const * x;@ & @* T const x;@ & \\
         \hline \hline
         $\triangleright$ & array of value & @T x[10];@ & @[10] T x@ & @array(T, 10) x@ \\
+$\triangleright$ & array of value & @T x[10];@ & @[10] T x@ & @array(T, 10) x@ \\
         \hline
         & ar.\ of immutable val. & @const T x[10];@ & @[10] const T x@ & @const array(T, 10) x@ \\
 …
         & & @T const * x[10];@ & @[10] * T const x@ & @array(* const T, 10) x@ \\
         \hline \hline
         $\triangleright$ & ptr.\ to ar.\ of value & @T (*x)[10];@ & @* [10] T x@ & @* array(T, 10) x@ \\
+$\triangleright$ & ptr.\ to ar.\ of value & @T (*x)[10];@ & @* [10] T x@ & @* array(T, 10) x@ \\
         \hline
         & imm. ptr.\ to ar.\ of val. & @T (* const x)[10];@ & @const * [10] T x@ & @const * array(T, 10) x@ \\
 …
         \item static
         \item star as dimension
         \item under pointer decay:                              int p1[const 3]  being  int const *p1
+        \item under pointer decay: @int p1[const 3]@ being @int const *p1@
 \end{itemize}
 …
 \lstinput{10-10}{bkgd-carray-decay.c}
 So, C provides an implicit conversion from @float[10]@ to @float*@, as described in ARM-6.3.2.1.3:
+So, C provides an implicit conversion from @float[10]@ to @float *@.
 \begin{quote}
+Except when it is the operand of the @sizeof@ operator, or the unary @&@ operator, or is a
+string literal used to initialize an array
+an expression that has type ``array of type'' is
+converted to an expression with type ``pointer to type'' that points to the initial element of
+the array object
+Except when it is the operand of the @sizeof@ operator, or the unary @&@ operator, or is a string literal used to
+initialize an array, an expression that has type ``array of \emph{type}'' is converted to an expression with type
+``pointer to \emph{type}'' that points to the initial element of the array object~\cite[\S~6.3.2.1.3]{C11}
 \end{quote}
 This phenomenon is the famous ``pointer decay,'' which is a decay of an array-typed expression into a pointer-typed one.
 It is worthy to note that the list of exception cases does not feature the occurrence of @ar@ in @ar[i]@.
+Thus, subscripting happens on pointers, not arrays.
+Subscripting proceeds first with pointer decay, if needed.  Next, ARM-6.5.2.1.2 explains that @ar[i]@ is treated as if it were @(*((a)+(i)))@.
+ARM-6.5.6.8 explains that the addition, of a pointer with an integer type,  is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @ar@ is big enough and @i@ is small enough.
+Finally, ARM-6.5.3.2.4 explains that the @*@ operator's result is the referenced element.
+Taken together, these rules also happen to illustrate that @ar[i]@ and @i[a]@ mean the same thing.
+Thus, subscripting happens on pointers not arrays.
+Subscripting proceeds first with pointer decay, if needed.  Next, \cite[\S~6.5.2.1.2]{C11} explains that @ar[i]@ is treated as if it were @(*((ar)+(i)))@.
+\cite[\S~6.5.6.8]{C11} explains that the addition of a pointer and an integer is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @ar@ is big enough and @i@ is small enough.
+Finally, \cite[\S~6.5.3.2.4]{C11} explains that the @*@ operator's result is the referenced element.
+Taken together, these rules illustrate that @ar[i]@ and @i[ar]@ mean the same thing!
 Subscripting a pointer when the target is standard-inappropriate is still practically well-defined.
 …
 fs[5] = 3.14;
 \end{cfa}
+The @malloc@ behaviour is specified as returning a pointer to ``space for an object whose size is'' as requested (ARM-7.22.3.4.2).
+But program says \emph{nothing} more about this pointer value, that might cause its referent to \emph{be} an array, before doing the subscript.
+Under this assumption, a pointer being subscripted (or added to, then dereferenced)
+by any value (positive, zero, or negative), gives a view of the program's entire address space,
+centred around the @p@ address, divided into adjacent @sizeof(*p)@ chunks,
+each potentially (re)interpreted as @typeof(*p)@.
+I call this phenomenon ``array diffraction,''  which is a diffraction of a single-element pointer
+into the assumption that its target is in the middle of an array whose size is unlimited in both directions.
+The @malloc@ behaviour is specified as returning a pointer to ``space for an object whose size is'' as requested (\cite[\S~7.22.3.4.2]{C11}).
+But \emph{nothing} more is said about this pointer value, specifically that its referent might \emph{be} an array allowing subscripting.
+Under this assumption, a pointer being subscripted (or added to, then dereferenced) by any value (positive, zero, or negative), gives a view of the program's entire address space, centred around the @p@ address, divided into adjacent @sizeof(*p)@ chunks, each potentially (re)interpreted as @typeof(*p)@.
+I call this phenomenon ``array diffraction,''  which is a diffraction of a single-element pointer into the assumption that its target is in the middle of an array whose size is unlimited in both directions.
 No pointer is exempt from array diffraction.
 No array shows its elements without pointer decay.
 A further pointer--array confusion, closely related to decay, occurs in parameter declarations.
 ARM-6.7.6.3.7 explains that when an array type is written for a parameter,
 the parameter's type becomes a type that I summarize as being the array-decayed type.
+\cite[\S~6.7.6.3.7]{C11} explains that when an array type is written for a parameter,
+the parameter's type becomes a type that can be summarized as the array-decayed type.
 The respective handling of the following two parameter spellings shows that the array-spelled one is really, like the other, a pointer.
 \lstinput{12-16}{bkgd-carray-decay.c}
 As the @sizeof(x)@ meaning changed, compared with when run on a similarly-spelled local variable declaration,
+GCC also gives this code the warning: ```sizeof' on array function parameter `x' will return size of `float *'.''
+The caller of such a function is left with the reality that a pointer parameter is a pointer, no matter how it's spelled:
+GCC also gives this code the warning for the first assertion:
+\begin{cfa}
+warning: 'sizeof' on array function parameter 'x' will return size of 'float *'
+\end{cfa}
+The caller of such a function is left with the reality that a pointer parameter is a pointer, no matter how it is spelled:
 \lstinput{18-21}{bkgd-carray-decay.c}
 This fragment gives no warnings.
 …
 depending on whether the object is a local variable or a parameter.
 In summary, when a function is written with an array-typed parameter,
 \begin{itemize}
 …
 As a result, a function with a pointer-to-array parameter sees the parameter exactly as the caller does:
 \lstinput{32-42}{bkgd-carray-decay.c}
+\VRef[Figure]{bkgd:ar:usr:decay-parm} gives the reference for the decay phenomenon seen in parameter declarations.
+\begin{figure}
+\VRef[Table]{bkgd:ar:usr:decay-parm} gives the reference for the decay phenomenon seen in parameter declarations.
+\begin{table}
+\caption{Syntactic Reference for Decay during Parameter-Passing.
+Includes interaction with \lstinline{const}ness, where ``immutable'' refers to a restriction on the callee's ability.}
+\label{bkgd:ar:usr:decay-parm}
 \centering
 \begin{tabular}{llllll}
+        & Description & Type & Param. Decl & \CFA-C  \\ \hline
+        $\triangleright$ & ptr.\ to val.
+            & @T *@
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{T * x,} \\ \lstinline{T x[10],} \\ \lstinline{T x[],}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{[ * T ]} \\ \lstinline{[ [10] T ]} \\ \lstinline{[ [] T  ]}   }
+            \\ \hline
+        & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}}   }\vspace{2pt}
+            & @T * const@
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{T * const x,} \\ \lstinline{T x[const 10],} \\ \lstinline{T x[const],}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{[ const * T ]} \\ \lstinline{[ [const 10] T ]} \\ \lstinline{[ [const] T  ]}   }
+            \\ \hline
+        & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *}   }
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x,} \\ \lstinline{T const * x,} \\ \lstinline{const T x[10],} \\ \lstinline{T const x[10],} \\ \lstinline{const T x[],} \\ \lstinline{T const x[],}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{[* const T]} \\ \lstinline{[ [10] const T ]} \\ \lstinline{[ [] const T  ]}   }
+            \\ \hline \hline
+        $\triangleright$ & ptr.\ to ar.\ of val.
+            & @T(*)[10]@
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{T (*x)[10],} \\ \lstinline{T x[3][10],} \\ \lstinline{T x[][10],}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{[* [10] T]} \\ \lstinline{[ [3] [10] T ]} \\ \lstinline{[ [] [10] T  ]}   }
+            \\ \hline
+        & ptr.\ to ptr.\ to val.
+            & @T **@
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{T ** x,} \\ \lstinline{T *x[10],} \\ \lstinline{T *x[],}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{[ * * T ]} \\ \lstinline{[ [10] * T ]} \\ \lstinline{[ [] * T  ]}   }
+            \\ \hline
+        & \pbox{20cm}{ \vspace{2pt} ptr.\ to ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{**argv}}   }\vspace{2pt}
+            & @const char **@
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{const char *argv[],} \\ \footnotesize{(others elided)}   }\vspace{2pt}
+            & \pbox{20cm}{ \vspace{2pt} \lstinline{[ [] * const char ]} \\ \footnotesize{(others elided)}   }
+            \\ \hline
+        & Description & Type & Parameter Declaration & \CFA  \\
+        \hline
+        & & & @T * x,@ & @* T x,@ \\
+$\triangleright$ & pointer to value & @T *@ & @T x[10],@ & @[10] T x,@ \\
+        & & & @T x[],@ & @[] T x,@ \\
+        \hline
+        & & & @T * const x,@ & @const * T x,@ \\
+        & immutable ptr.\ to val. & @T * const@ & @T x[const 10],@ & @[const 10] T x,@ \\
+        & & & @T x[const],@ & @[const] T x,@\\
+        \hline
+        & & & @const T * x,@ & @ * const T x,@ \\
+        & &     & @T const * x,@ & @ * T const x,@ \\
+        & ptr.\ to immutable val. & @const T *@ & @const T x[10],@ & @[10] const T x,@ \\
+        & & @T const *@ & @T const x[10],@ &  @[10] T const x,@ \\
+        & & & @const T x[],@ & @[] const T x,@ \\
+        & & & @T const x[],@ & @[] T const x,@ \\
+        \hline \hline
+        & & & @T (*x)[10],@ & @* [10] T x,@ \\
+$\triangleright$ & ptr.\ to ar.\ of val. & @T(*)[10]@ & @T x[3][10],@ & @[3][10] T x,@ \\
+        & & & @T x[][10],@ & @[][10] T x,@ \\
+        \hline
+        & & & @T ** x,@ & @** T x,@ \\
+        & ptr.\ to ptr.\ to val. & @T **@ & @T * x[10],@ & @[10] * T x,@ \\
+        & & & @T * x[],@ & @[] * T x,@ \\
+        \hline
+        & ptr.\ to ptr.\ to imm.\ val. & @const char **@ & @const char * argv[],@ & @[] * const char argv,@ \\
+    & & & \emph{others elided} & \emph{others elided} \\
+        \hline
 \end{tabular}
+\caption{Unfortunate Syntactic Reference for Decay during Parameter-Passing.  Includes interaction with constness, where ``no writing'' refers to a restriction on the callee's ability.}
+\label{bkgd:ar:usr:decay-parm}
+\end{figure}
+\end{table}
 \subsection{Lengths may vary, checking does not}
+When the desired number of elements is unknown at compile time,
+a variable-length array is a solution:
+\begin{cfa}
+int main( int argc, const char *argv[] ) {
+When the desired number of elements is unknown at compile time, a variable-length array is a solution:
+\begin{cfa}
+int main( int argc, const char * argv[] ) {
         assert( argc == 2 );
         size_t n = atol( argv[1] );
+        assert( 0 < n && n < 1000 );
+        assert( 0 < n );
         float ar[n];
         float b[10];
         // ... discussion continues here
+}
 \end{cfa}
+This arrangement allocates @n@ elements on the @main@ stack frame for @ar@, just as it puts 10 elements on the @main@ stack frame for @b@.
+The variable-sized allocation of @ar@ is provided by @alloca@.
+In a situation where the array sizes are not known to be small enough for stack allocation to be sensible, corresponding heap allocations are achievable as:
+\begin{cfa}
+float *ax1 = malloc( sizeof( float[n] ) );
+float *ax2 = malloc( n * sizeof( float ) );
+float *bx1 = malloc( sizeof( float[1000000] ) );
+float *bx2 = malloc( 1000000 * sizeof( float ) );
+\end{cfa}
+VLA
+This arrangement allocates @n@ elements on the @main@ stack frame for @ar@, called a \newterm{variable length array} (VLA), as well as 10 elements in the same stack frame for @b@.
+The variable-sized allocation of @ar@ is provided by the @alloca@ routine, which bumps the stack pointer.
+Note, the C standard supports VLAs, but the \CC standard does not;
+both @gcc@ and @g++@ support VLAs.
+As well, there is misinformation about VLAs, \eg VLAs cause stack failures or are inefficient.
+VLAs exist as far back as Algol W~\cite[\S~5.2]{AlgolW} and are a sound and efficient data type.
+In situations where the stack size has a small bound (coroutines or user-level threads), unbounded VLAs can overflow the stack so a heap allocation is used.
+\begin{cfa}
+float * ax1 = malloc( sizeof( float[n] ) );
+float * ax2 = malloc( n * sizeof( float ) );
+float * bx1 = malloc( sizeof( float[1000000] ) );
+float * bx2 = malloc( 1000000 * sizeof( float ) );
+\end{cfa}
 Parameter dependency
 …
+\subsection{C has full-service, dynamically sized, multidimensional arrays (and \CC does not)}
+\subsection{Dynamically sized, multidimensional arrays}
 In C and \CC, ``multidimensional array'' means ``array of arrays.''  Other meanings are discussed in TODO.

doc/theses/mike_brooks_MMath/programs/bkgd-carray-decay.c

-                      r0bbe172
+                      r0554c1a
         float (*pa)[10] = &a;           $\C{// pointer to array}$
         float a0 = a[0];                        $\C{// element}$
         float *pa0 = &(a[0]);           $\C{// pointer to element}$
+        float * pa0 = &(a[0]);          $\C{// pointer to element}$
         float *pa0x = a;                        $\C{// (ok)}$
+        float * pa0x = a;                       $\C{// (ok)}$
         assert( pa0 == pa0x );
         assert( sizeof(pa0x) != sizeof(a) );
         void f( float x[10], float *y ) {
                 static_assert( sizeof(x) == sizeof(void*) );
                 static_assert( sizeof(y) == sizeof(void*) );
+        void f( float x[10], float * y ) {
+                static_assert( sizeof(x) == sizeof(void *) );
+                static_assert( sizeof(y) == sizeof(void *) );
+        }
         f(0,0);
 …
         char ca[] = "hello";            $\C{// array on stack, initialized from read-only data}$
         char *cp = "hello";                     $\C{// pointer to read-only data [decay here]}$
         void edit( char c[] ) {         $\C{// param is pointer}$
+        char * cp = "hello";            $\C{// pointer to read-only data [decay here]}$
+        void edit( char c[] ) {         $\C{// parameter is pointer}$
                 c[3] = 'p';
+        }
         edit( ca );                                     $\C{// ok [decay here]}$
         edit( c p );                            $\C{// Segmentation fault}$
+        edit( cp );                                     $\C{// Segmentation fault}$
         edit( "hello" );                        $\C{// Segmentation fault [decay here]}$
         void decay( float x[10] ) {
                 static_assert( sizeof(x) == sizeof(void*) );
+                static_assert( sizeof(x) == sizeof(void *) );
+        }
         static_assert( sizeof(a) == 10 * sizeof(float) );
