| 1 | \chapter{Background}
|
|---|
| 2 |
|
|---|
| 3 | Since this work builds on C, it is necessary to explain the C mechanisms and their shortcomings for array, linked list, and string,
|
|---|
| 4 |
|
|---|
| 5 |
|
|---|
| 6 | \section{Array}
|
|---|
| 7 |
|
|---|
| 8 | At the start, the C programming language made a significant design mistake.
|
|---|
| 9 | \begin{quote}
|
|---|
| 10 | In C, there is a strong relationship between pointers and arrays, strong enough that pointers and arrays really should be treated simultaneously.
|
|---|
| 11 | Any operation which can be achieved by array subscripting can also be done with pointers.~\cite[p.~93]{C:old}
|
|---|
| 12 | \end{quote}
|
|---|
| 13 | Accessing any storage requires pointer arithmetic, even if it is just base-displacement addressing in an instruction.
|
|---|
| 14 | The conjoining of pointers and arrays could also be applied to structures, where a pointer references a structure field like an array element.
|
|---|
| 15 | Finally, while subscripting involves pointer arithmetic (as does field references @x.y.z@), it is very complex for multi-dimensional arrays and requires array descriptors to know stride lengths along dimensions.
|
|---|
| 16 | Many C errors result from performing pointer arithmetic instead of using subscripting.
|
|---|
| 17 | Some C textbooks erroneously teach pointer arithmetic suggesting it is faster than subscripting.
|
|---|
| 18 |
|
|---|
| 19 | C semantics want a programmer to \emph{believe} an array variable is a ``pointer to its first element.''
|
|---|
| 20 | This desire becomes apparent by a detailed inspection of an array declaration.
|
|---|
| 21 | \lstinput{34-34}{bkgd-carray-arrty.c}
|
|---|
| 22 | The inspection begins by using @sizeof@ to provide definite program semantics for the intuition of an expression's type.
|
|---|
| 23 | \lstinput{35-36}{bkgd-carray-arrty.c}
|
|---|
| 24 | Now consider the sizes of expressions derived from @ar@, modified by adding ``pointer to'' and ``first element'' (and including unnecessary parentheses to avoid confusion about precedence).
|
|---|
| 25 | \lstinput{37-40}{bkgd-carray-arrty.c}
|
|---|
| 26 | Given the size of @float@ is 4, the size of @ar@ with 10 floats being 40 bytes is common reasoning for C programmers.
|
|---|
| 27 | Equally, C programmers know the size of a \emph{pointer} to the first array element is 8 (or 4 depending on the addressing architecture).
|
|---|
| 28 | % Now, set aside for a moment the claim that this first assertion is giving information about a type.
|
|---|
| 29 | Clearly, an array and a pointer to its first element are different things.
|
|---|
| 30 |
|
|---|
| 31 | In fact, the idea that there is such a thing as a pointer to an array may be surprising and it is not the same thing as a pointer to the first element.
|
|---|
| 32 | \lstinput{42-45}{bkgd-carray-arrty.c}
|
|---|
| 33 | The first assignment gets
|
|---|
| 34 | \begin{cfa}
|
|---|
| 35 | warning: assignment to `float (*)[10]' from incompatible pointer type `float *'
|
|---|
| 36 | \end{cfa}
|
|---|
| 37 | and the second assignment gets the opposite.
|
|---|
| 38 |
|
|---|
| 39 | The inspection now refutes any suggestion that @sizeof@ is informing about allocation rather than type information.
|
|---|
| 40 | Note, @sizeof@ has two forms, one operating on an expression and the other on a type.
|
|---|
| 41 | Using the type form yields the same results as the prior expression form.
|
|---|
| 42 | \lstinput{46-49}{bkgd-carray-arrty.c}
|
|---|
| 43 | The results are also the same when there is \emph{no allocation} using a pointer-to-array type.
|
|---|
| 44 | \lstinput{51-57}{bkgd-carray-arrty.c}
|
|---|
| 45 | Hence, in all cases, @sizeof@ is informing about type information.
|
|---|
| 46 |
|
|---|
| 47 | So, thinking of an array as a pointer to its first element is too simplistic an analogue and it is not backed up the type system.
|
|---|
| 48 | This misguided analogue can be forced onto single-dimension arrays but there is no advantage other than possibly teaching beginning programmers about basic runtime array-access.
|
|---|
| 49 |
|
|---|
| 50 | Continuing, a shortened form for declaring local variables exists, provided that length information is given in the initializer:
|
|---|
| 51 | \lstinput{59-62}{bkgd-carray-arrty.c}
|
|---|
| 52 | In these declarations, the resulting types are both arrays, but their lengths are inferred.
|
|---|
| 53 |
|
|---|
| 54 | My contribution is enabled by recognizing
|
|---|
| 55 | \begin{itemize}
|
|---|
| 56 | \item There is value in using a type that knows how big the whole thing is.
|
|---|
| 57 | \item The type pointer to (first) element does not.
|
|---|
| 58 | \item C \emph{has} a type that knows the whole picture: array, e.g. @T[10]@.
|
|---|
| 59 | \item This type has all the usual derived forms, which also know the whole picture. A usefully noteworthy example is pointer to array, e.g. @T(*)[10]@.
|
|---|
| 60 | \end{itemize}
|
|---|
| 61 |
|
|---|
| 62 | Each of these sections, which introduces another layer of of the C arrays' story,
|
|---|
| 63 | concludes with an \emph{Unfortunate Syntactic Reference}.
|
|---|
| 64 | It shows how to spell the types under discussion,
|
|---|
| 65 | along with interactions with orthogonal (but easily confused) language features.
|
|---|
| 66 | Alternate spellings are listed within a row.
|
|---|
| 67 | The simplest occurrences of types distinguished in the preceding discussion are marked with $\triangleright$.
|
|---|
| 68 | The Type column gives the spelling used in a cast or error message (though note Section TODO points out that some types cannot be casted to).
|
|---|
| 69 | The Declaration column gives the spelling used in an object declaration, such as variable or aggregate member; parameter declarations (section TODO) follow entirely different rules.
|
|---|
| 70 |
|
|---|
| 71 | After all, reading a C array type is easy: just read it from the inside out, and know when to look left and when to look right!
|
|---|
| 72 |
|
|---|
| 73 |
|
|---|
| 74 | \CFA-specific spellings (not yet introduced) are also included here for referenceability; these can be skipped on linear reading.
|
|---|
| 75 | The \CFA-C column gives the, more fortunate, ``new'' syntax of section TODO, for spelling \emph{exactly the same type}.
|
|---|
| 76 | This fortunate syntax does not have different spellings for types vs declarations;
|
|---|
| 77 | a declaration is always the type followed by the declared identifier name;
|
|---|
| 78 | for the example of letting @x@ be a \emph{pointer to array}, the declaration is spelled:
|
|---|
| 79 | \begin{cfa}
|
|---|
| 80 | [ * [10] T ] x;
|
|---|
| 81 | \end{cfa}
|
|---|
| 82 | The \CFA-Full column gives the spelling of a different type, introduced in TODO, which has all of my contributed improvements for safety and ergonomics.
|
|---|
| 83 |
|
|---|
| 84 | \VRef[Figure]{bkgd:ar:usr:avp} gives this reference for the discussion so far.
|
|---|
| 85 |
|
|---|
| 86 | \begin{figure}
|
|---|
| 87 | \centering
|
|---|
| 88 | \setlength{\tabcolsep}{3pt}
|
|---|
| 89 | \begin{tabular}{llllll}
|
|---|
| 90 | & Description & Type & Declaration & \CFA-C & \CFA-Full \\ \hline
|
|---|
| 91 | $\triangleright$ & val.
|
|---|
| 92 | & @T@
|
|---|
| 93 | & @T x;@
|
|---|
| 94 | & @[ T ]@
|
|---|
| 95 | &
|
|---|
| 96 | \\ \hline
|
|---|
| 97 | & \pbox{20cm}{ \vspace{2pt} val.\\ \footnotesize{no writing the val.\ in \lstinline{x}} }\vspace{2pt}
|
|---|
| 98 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T} \\ \lstinline{T const} }
|
|---|
| 99 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T x;} \\ \lstinline{T const x;} }
|
|---|
| 100 | & @[ const T ]@
|
|---|
| 101 | &
|
|---|
| 102 | \\ \hline \hline
|
|---|
| 103 | $\triangleright$ & ptr.\ to val.
|
|---|
| 104 | & @T *@
|
|---|
| 105 | & @T * x;@
|
|---|
| 106 | & @[ * T ]@
|
|---|
| 107 | &
|
|---|
| 108 | \\ \hline
|
|---|
| 109 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}} }\vspace{2pt}
|
|---|
| 110 | & @T * const@
|
|---|
| 111 | & @T * const x;@
|
|---|
| 112 | & @[ const * T ]@
|
|---|
| 113 | &
|
|---|
| 114 | \\ \hline
|
|---|
| 115 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}} }\vspace{2pt}
|
|---|
| 116 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *} }
|
|---|
| 117 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x;} \\ \lstinline{T const * x;} }
|
|---|
| 118 | & @[ * const T ]@
|
|---|
| 119 | &
|
|---|
| 120 | \\ \hline \hline
|
|---|
| 121 | $\triangleright$ & ar.\ of val.
|
|---|
| 122 | & @T[10]@
|
|---|
| 123 | & @T x[10];@
|
|---|
| 124 | & @[ [10] T ]@
|
|---|
| 125 | & @[ array(T, 10) ]@
|
|---|
| 126 | \\ \hline
|
|---|
| 127 | & \pbox{20cm}{ \vspace{2pt} ar.\ of val.\\ \footnotesize{no writing the val.\ in \lstinline{x[5]}} }\vspace{2pt}
|
|---|
| 128 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T[10]} \\ \lstinline{T const[10]} }
|
|---|
| 129 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T x[10];} \\ \lstinline{T const x[10];} }
|
|---|
| 130 | & @[ [10] const T ]@
|
|---|
| 131 | & @[ const array(T, 10) ]@
|
|---|
| 132 | \\ \hline
|
|---|
| 133 | & ar.\ of ptr.\ to val.
|
|---|
| 134 | & @T*[10]@
|
|---|
| 135 | & @T *x[10];@
|
|---|
| 136 | & @[ [10] * T ]@
|
|---|
| 137 | & @[ array(* T, 10) ]@
|
|---|
| 138 | \\ \hline
|
|---|
| 139 | & \pbox{20cm}{ \vspace{2pt} ar.\ of ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x[5]}} }\vspace{2pt}
|
|---|
| 140 | & @T * const [10]@
|
|---|
| 141 | & @T * const x[10];@
|
|---|
| 142 | & @[ [10] const * T ]@
|
|---|
| 143 | & @[ array(const * T, 10) ]@
|
|---|
| 144 | \\ \hline
|
|---|
| 145 | & \pbox{20cm}{ \vspace{2pt} ar.\ of ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*(x[5])}} }\vspace{2pt}
|
|---|
| 146 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * [10]} \\ \lstinline{T const * [10]} }
|
|---|
| 147 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x[10];} \\ \lstinline{T const * x[10];} }
|
|---|
| 148 | & @[ [10] * const T ]@
|
|---|
| 149 | & @[ array(* const T, 10) ]@
|
|---|
| 150 | \\ \hline \hline
|
|---|
| 151 | $\triangleright$ & ptr.\ to ar.\ of val.
|
|---|
| 152 | & @T(*)[10]@
|
|---|
| 153 | & @T (*x)[10];@
|
|---|
| 154 | & @[ * [10] T ]@
|
|---|
| 155 | & @[ * array(T, 10) ]@
|
|---|
| 156 | \\ \hline
|
|---|
| 157 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to ar.\ of val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}} }\vspace{2pt}
|
|---|
| 158 | & @T(* const)[10]@
|
|---|
| 159 | & @T (* const x)[10];@
|
|---|
| 160 | & @[ const * [10] T ]@
|
|---|
| 161 | & @[ const * array(T, 10) ]@
|
|---|
| 162 | \\ \hline
|
|---|
| 163 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to ar.\ of val.\\ \footnotesize{no writing the val.\ in \lstinline{(*x)[5]}} }\vspace{2pt}
|
|---|
| 164 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T(*)[10]} \\ \lstinline{T const (*) [10]} }
|
|---|
| 165 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T (*x)[10];} \\ \lstinline{T const (*x)[10];} }
|
|---|
| 166 | & @[ * [10] const T ]@
|
|---|
| 167 | & @[ * const array(T, 10) ]@
|
|---|
| 168 | \\ \hline
|
|---|
| 169 | & ptr.\ to ar.\ of ptr.\ to val.
|
|---|
| 170 | & @T*(*)[10]@
|
|---|
| 171 | & @T *(*x)[10];@
|
|---|
| 172 | & @[ * [10] * T ]@
|
|---|
| 173 | & @[ * array(* T, 10) ]@
|
|---|
| 174 | \\ \hline
|
|---|
| 175 | \end{tabular}
|
|---|
| 176 | \caption{Unfortunate Syntactic Reference for Array vs Pointer. Includes interaction with constness.}
|
|---|
| 177 | \label{bkgd:ar:usr:avp}
|
|---|
| 178 | \end{figure}
|
|---|
| 179 |
|
|---|
| 180 |
|
|---|
| 181 |
|
|---|
| 182 |
|
|---|
| 183 |
|
|---|
| 184 | TODO: Address these parked unfortunate syntaxes
|
|---|
| 185 | \begin{itemize}
|
|---|
| 186 | \item static
|
|---|
| 187 | \item star as dimension
|
|---|
| 188 | \item under pointer decay: int p1[const 3] being int const *p1
|
|---|
| 189 | \end{itemize}
|
|---|
| 190 |
|
|---|
| 191 |
|
|---|
| 192 | \subsection{Arrays decay and pointers diffract}
|
|---|
| 193 |
|
|---|
| 194 | The last section established the difference between these four types:
|
|---|
| 195 | \lstinput{3-6}{bkgd-carray-decay.c}
|
|---|
| 196 | But the expression used for obtaining the pointer to the first element is pedantic.
|
|---|
| 197 | The root of all C programmer experience with arrays is the shortcut
|
|---|
| 198 | \lstinput{8-8}{bkgd-carray-decay.c}
|
|---|
| 199 | which reproduces @pa0@, in type and value:
|
|---|
| 200 | \lstinput{9-9}{bkgd-carray-decay.c}
|
|---|
| 201 | The validity of this initialization is unsettling, in the context of the facts established in the last section.
|
|---|
| 202 | Notably, it initializes name @pa0x@ from expression @ar@, when they are not of the same type:
|
|---|
| 203 | \lstinput{10-10}{bkgd-carray-decay.c}
|
|---|
| 204 |
|
|---|
| 205 | So, C provides an implicit conversion from @float[10]@ to @float*@, as described in ARM-6.3.2.1.3:
|
|---|
| 206 | \begin{quote}
|
|---|
| 207 | Except when it is the operand of the @sizeof@ operator, or the unary @&@ operator, or is a
|
|---|
| 208 | string literal used to initialize an array
|
|---|
| 209 | an expression that has type ``array of type'' is
|
|---|
| 210 | converted to an expression with type ``pointer to type'' that points to the initial element of
|
|---|
| 211 | the array object
|
|---|
| 212 | \end{quote}
|
|---|
| 213 |
|
|---|
| 214 | This phenomenon is the famous ``pointer decay,'' which is a decay of an array-typed expression into a pointer-typed one.
|
|---|
| 215 |
|
|---|
| 216 | It is worthy to note that the list of exception cases does not feature the occurrence of @ar@ in @ar[i]@.
|
|---|
| 217 | Thus, subscripting happens on pointers, not arrays.
|
|---|
| 218 |
|
|---|
| 219 | Subscripting proceeds first with pointer decay, if needed. Next, ARM-6.5.2.1.2 explains that @ar[i]@ is treated as if it were @(*((a)+(i)))@.
|
|---|
| 220 | ARM-6.5.6.8 explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @ar@ is big enough and @i@ is small enough.
|
|---|
| 221 | Finally, ARM-6.5.3.2.4 explains that the @*@ operator's result is the referenced element.
|
|---|
| 222 |
|
|---|
| 223 | Taken together, these rules also happen to illustrate that @ar[i]@ and @i[a]@ mean the same thing.
|
|---|
| 224 |
|
|---|
| 225 | Subscripting a pointer when the target is standard-inappropriate is still practically well-defined.
|
|---|
| 226 | While the standard affords a C compiler freedom about the meaning of an out-of-bound access,
|
|---|
| 227 | or of subscripting a pointer that does not refer to an array element at all,
|
|---|
| 228 | the fact that C is famously both generally high-performance, and specifically not bound-checked,
|
|---|
| 229 | leads to an expectation that the runtime handling is uniform across legal and illegal accesses.
|
|---|
| 230 | Moreover, consider the common pattern of subscripting on a @malloc@ result:
|
|---|
| 231 | \begin{cfa}
|
|---|
| 232 | float * fs = malloc( 10 * sizeof(float) );
|
|---|
| 233 | fs[5] = 3.14;
|
|---|
| 234 | \end{cfa}
|
|---|
| 235 | The @malloc@ behaviour is specified as returning a pointer to ``space for an object whose size is'' as requested (ARM-7.22.3.4.2).
|
|---|
| 236 | But program says \emph{nothing} more about this pointer value, that might cause its referent to \emph{be} an array, before doing the subscript.
|
|---|
| 237 |
|
|---|
| 238 | Under this assumption, a pointer being subscripted (or added to, then dereferenced)
|
|---|
| 239 | by any value (positive, zero, or negative), gives a view of the program's entire address space,
|
|---|
| 240 | centred around the @p@ address, divided into adjacent @sizeof(*p)@ chunks,
|
|---|
| 241 | each potentially (re)interpreted as @typeof(*p)@.
|
|---|
| 242 |
|
|---|
| 243 | I call this phenomenon ``array diffraction,'' which is a diffraction of a single-element pointer
|
|---|
| 244 | into the assumption that its target is in the middle of an array whose size is unlimited in both directions.
|
|---|
| 245 |
|
|---|
| 246 | No pointer is exempt from array diffraction.
|
|---|
| 247 |
|
|---|
| 248 | No array shows its elements without pointer decay.
|
|---|
| 249 |
|
|---|
| 250 | A further pointer--array confusion, closely related to decay, occurs in parameter declarations.
|
|---|
| 251 | ARM-6.7.6.3.7 explains that when an array type is written for a parameter,
|
|---|
| 252 | the parameter's type becomes a type that I summarize as being the array-decayed type.
|
|---|
| 253 | The respective handlings of the following two parameter spellings shows that the array-spelled one is really, like the other, a pointer.
|
|---|
| 254 | \lstinput{12-16}{bkgd-carray-decay.c}
|
|---|
| 255 | As the @sizeof(x)@ meaning changed, compared with when run on a similarly-spelled local variable declaration,
|
|---|
| 256 | GCC also gives this code the warning: ```sizeof' on array function parameter `x' will return size of `float *'.''
|
|---|
| 257 |
|
|---|
| 258 | The caller of such a function is left with the reality that a pointer parameter is a pointer, no matter how it's spelled:
|
|---|
| 259 | \lstinput{18-21}{bkgd-carray-decay.c}
|
|---|
| 260 | This fragment gives no warnings.
|
|---|
| 261 |
|
|---|
| 262 | The shortened parameter syntax @T x[]@ is a further way to spell ``pointer.''
|
|---|
| 263 | Note the opposite meaning of this spelling now, compared with its use in local variable declarations.
|
|---|
| 264 | This point of confusion is illustrated in:
|
|---|
| 265 | \lstinput{23-30}{bkgd-carray-decay.c}
|
|---|
| 266 | The basic two meanings, with a syntactic difference helping to distinguish,
|
|---|
| 267 | are illustrated in the declarations of @ca@ vs.\ @cp@,
|
|---|
| 268 | whose subsequent @edit@ calls behave differently.
|
|---|
| 269 | The syntax-caused confusion is in the comparison of the first and last lines,
|
|---|
| 270 | both of which use a literal to initialize an object declared with spelling @T x[]@.
|
|---|
| 271 | But these initialized declarations get opposite meanings,
|
|---|
| 272 | depending on whether the object is a local variable or a parameter.
|
|---|
| 273 |
|
|---|
| 274 |
|
|---|
| 275 | In summary, when a function is written with an array-typed parameter,
|
|---|
| 276 | \begin{itemize}
|
|---|
| 277 | \item an appearance of passing an array by value is always an incorrect understanding
|
|---|
| 278 | \item a dimension value, if any is present, is ignored
|
|---|
| 279 | \item pointer decay is forced at the call site and the callee sees the parameter having the decayed type
|
|---|
| 280 | \end{itemize}
|
|---|
| 281 |
|
|---|
| 282 | Pointer decay does not affect pointer-to-array types, because these are already pointers, not arrays.
|
|---|
| 283 | As a result, a function with a pointer-to-array parameter sees the parameter exactly as the caller does:
|
|---|
| 284 | \lstinput{32-42}{bkgd-carray-decay.c}
|
|---|
| 285 |
|
|---|
| 286 | \VRef[Figure]{bkgd:ar:usr:decay-parm} gives the reference for the decay phenomenon seen in parameter decalarations.
|
|---|
| 287 |
|
|---|
| 288 | \begin{figure}
|
|---|
| 289 | \centering
|
|---|
| 290 | \begin{tabular}{llllll}
|
|---|
| 291 | & Description & Type & Param. Decl & \CFA-C \\ \hline
|
|---|
| 292 | $\triangleright$ & ptr.\ to val.
|
|---|
| 293 | & @T *@
|
|---|
| 294 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T * x,} \\ \lstinline{T x[10],} \\ \lstinline{T x[],} }\vspace{2pt}
|
|---|
| 295 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ * T ]} \\ \lstinline{[ [10] T ]} \\ \lstinline{[ [] T ]} }
|
|---|
| 296 | \\ \hline
|
|---|
| 297 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}} }\vspace{2pt}
|
|---|
| 298 | & @T * const@
|
|---|
| 299 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T * const x,} \\ \lstinline{T x[const 10],} \\ \lstinline{T x[const],} }\vspace{2pt}
|
|---|
| 300 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ const * T ]} \\ \lstinline{[ [const 10] T ]} \\ \lstinline{[ [const] T ]} }
|
|---|
| 301 | \\ \hline
|
|---|
| 302 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}} }\vspace{2pt}
|
|---|
| 303 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *} }
|
|---|
| 304 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x,} \\ \lstinline{T const * x,} \\ \lstinline{const T x[10],} \\ \lstinline{T const x[10],} \\ \lstinline{const T x[],} \\ \lstinline{T const x[],} }\vspace{2pt}
|
|---|
| 305 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[* const T]} \\ \lstinline{[ [10] const T ]} \\ \lstinline{[ [] const T ]} }
|
|---|
| 306 | \\ \hline \hline
|
|---|
| 307 | $\triangleright$ & ptr.\ to ar.\ of val.
|
|---|
| 308 | & @T(*)[10]@
|
|---|
| 309 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T (*x)[10],} \\ \lstinline{T x[3][10],} \\ \lstinline{T x[][10],} }\vspace{2pt}
|
|---|
| 310 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[* [10] T]} \\ \lstinline{[ [3] [10] T ]} \\ \lstinline{[ [] [10] T ]} }
|
|---|
| 311 | \\ \hline
|
|---|
| 312 | & ptr.\ to ptr.\ to val.
|
|---|
| 313 | & @T **@
|
|---|
| 314 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T ** x,} \\ \lstinline{T *x[10],} \\ \lstinline{T *x[],} }\vspace{2pt}
|
|---|
| 315 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ * * T ]} \\ \lstinline{[ [10] * T ]} \\ \lstinline{[ [] * T ]} }
|
|---|
| 316 | \\ \hline
|
|---|
| 317 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{**argv}} }\vspace{2pt}
|
|---|
| 318 | & @const char **@
|
|---|
| 319 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const char *argv[],} \\ \footnotesize{(others elided)} }\vspace{2pt}
|
|---|
| 320 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ [] * const char ]} \\ \footnotesize{(others elided)} }
|
|---|
| 321 | \\ \hline
|
|---|
| 322 | \end{tabular}
|
|---|
| 323 | \caption{Unfortunate Syntactic Reference for Decay during Parameter-Passing. Includes interaction with constness, where ``no writing'' refers to a restriction on the callee's ability.}
|
|---|
| 324 | \label{bkgd:ar:usr:decay-parm}
|
|---|
| 325 | \end{figure}
|
|---|
| 326 |
|
|---|
| 327 |
|
|---|
| 328 | \subsection{Lengths may vary, checking does not}
|
|---|
| 329 |
|
|---|
| 330 | When the desired number of elements is unknown at compile time,
|
|---|
| 331 | a variable-length array is a solution:
|
|---|
| 332 | \begin{cfa}
|
|---|
| 333 | int main( int argc, const char *argv[] ) {
|
|---|
| 334 | assert( argc == 2 );
|
|---|
| 335 | size_t n = atol( argv[1] );
|
|---|
| 336 | assert( 0 < n && n < 1000 );
|
|---|
| 337 |
|
|---|
| 338 | float ar[n];
|
|---|
| 339 | float b[10];
|
|---|
| 340 |
|
|---|
| 341 | // ... discussion continues here
|
|---|
| 342 | }
|
|---|
| 343 | \end{cfa}
|
|---|
| 344 | This arrangement allocates @n@ elements on the @main@ stack frame for @ar@, just as it puts 10 elements on the @main@ stack frame for @b@.
|
|---|
| 345 | The variable-sized allocation of @ar@ is provided by @alloca@.
|
|---|
| 346 |
|
|---|
| 347 | In a situation where the array sizes are not known to be small enough for stack allocation to be sensible, corresponding heap allocations are achievable as:
|
|---|
| 348 | \begin{cfa}
|
|---|
| 349 | float *ax1 = malloc( sizeof( float[n] ) );
|
|---|
| 350 | float *ax2 = malloc( n * sizeof( float ) );
|
|---|
| 351 | float *bx1 = malloc( sizeof( float[1000000] ) );
|
|---|
| 352 | float *bx2 = malloc( 1000000 * sizeof( float ) );
|
|---|
| 353 | \end{cfa}
|
|---|
| 354 |
|
|---|
| 355 |
|
|---|
| 356 | VLA
|
|---|
| 357 |
|
|---|
| 358 | Parameter dependency
|
|---|
| 359 |
|
|---|
| 360 | Checking is best-effort / unsound
|
|---|
| 361 |
|
|---|
| 362 | Limited special handling to get the dimension value checked (static)
|
|---|
| 363 |
|
|---|
| 364 |
|
|---|
| 365 |
|
|---|
| 366 | \subsection{C has full-service, dynamically sized, multidimensional arrays (and \CC does not)}
|
|---|
| 367 |
|
|---|
| 368 | In C and \CC, ``multidimensional array'' means ``array of arrays.'' Other meanings are discussed in TODO.
|
|---|
| 369 |
|
|---|
| 370 | Just as an array's element type can be @float@, so can it be @float[10]@.
|
|---|
| 371 |
|
|---|
| 372 | While any of @float*@, @float[10]@ and @float(*)[10]@ are easy to tell apart from @float@, telling them apart from each other may need occasional reference back to TODO intro section.
|
|---|
| 373 | The sentence derived by wrapping each type in @-[3]@ follows.
|
|---|
| 374 |
|
|---|
| 375 | While any of @float*[3]@, @float[3][10]@ and @float(*)[3][10]@ are easy to tell apart from @float[3]@,
|
|---|
| 376 | telling them apart from each other is what it takes to know what ``array of arrays'' really means.
|
|---|
| 377 |
|
|---|
| 378 | Pointer decay affects the outermost array only
|
|---|
| 379 |
|
|---|
| 380 | TODO: unfortunate syntactic reference with these cases:
|
|---|
| 381 |
|
|---|
| 382 | \begin{itemize}
|
|---|
| 383 | \item ar. of ar. of val (be sure about ordering of dimensions when the declaration is dropped)
|
|---|
| 384 | \item ptr. to ar. of ar. of val
|
|---|
| 385 | \end{itemize}
|
|---|
| 386 |
|
|---|
| 387 |
|
|---|
| 388 | \subsection{Arrays are (but) almost values}
|
|---|
| 389 |
|
|---|
| 390 | Has size; can point to
|
|---|
| 391 |
|
|---|
| 392 | Can't cast to
|
|---|
| 393 |
|
|---|
| 394 | Can't pass as value
|
|---|
| 395 |
|
|---|
| 396 | Can initialize
|
|---|
| 397 |
|
|---|
| 398 | Can wrap in aggregate
|
|---|
| 399 |
|
|---|
| 400 | Can't assign
|
|---|
| 401 |
|
|---|
| 402 |
|
|---|
| 403 | \subsection{Returning an array is (but) almost possible}
|
|---|
| 404 |
|
|---|
| 405 |
|
|---|
| 406 | \subsection{The pointer-to-array type has been noticed before}
|
|---|
| 407 |
|
|---|
| 408 | \subsection{Multi-Dimensional}
|
|---|
| 409 |
|
|---|
| 410 | As in the last section, we inspect the declaration ...
|
|---|
| 411 | \lstinput{16-18}{bkgd-carray-mdim.c}
|
|---|
| 412 | The significant axis of deriving expressions from @ar@ is now ``itself,'' ``first element'' or ``first grand-element (meaning, first element of first element).''
|
|---|
| 413 | \lstinput{20-44}{bkgd-carray-mdim.c}
|
|---|
| 414 |
|
|---|
| 415 |
|
|---|
| 416 | \section{Linked List}
|
|---|
| 417 |
|
|---|
| 418 |
|
|---|
| 419 | \section{String}
|
|---|