| [27f1055] | 1 | \chapter{Background} | 
|---|
|  | 2 |  | 
|---|
| [40ab446] | 3 | This chapter states facts about the prior work, upon which my contributions build. | 
|---|
|  | 4 | Each receives a justification of the extent to which its statement is phrased to provoke controversy or surprise. | 
|---|
| [27f1055] | 5 |  | 
|---|
| [40ab446] | 6 | \section{C} | 
|---|
|  | 7 |  | 
|---|
|  | 8 | \subsection{Common knowledge} | 
|---|
|  | 9 |  | 
|---|
|  | 10 | The reader is assumed to have used C or \CC for the coursework of at least four university-level courses, or have equivalent experience. | 
|---|
|  | 11 | The current discussion introduces facts, unaware of which, such a functioning novice may be operating. | 
|---|
|  | 12 |  | 
|---|
|  | 13 | % TODO: decide if I'm also claiming this collection of facts, and test-oriented presentation is a contribution; if so, deal with (not) arguing for its originality | 
|---|
|  | 14 |  | 
|---|
|  | 15 | \subsection{Convention: C is more touchable than its standard} | 
|---|
|  | 16 |  | 
|---|
|  | 17 | When it comes to explaining how C works, I like illustrating definite program semantics. | 
|---|
|  | 18 | I prefer doing so, over a quoting manual's suggested programmer's intuition, or showing how some compiler writers chose to model their problem. | 
|---|
|  | 19 | To illustrate definite program semantics, I devise a program, whose behaviour exercises the point at issue, and I show its behaviour. | 
|---|
|  | 20 |  | 
|---|
|  | 21 | This behaviour is typically one of | 
|---|
|  | 22 | \begin{itemize} | 
|---|
|  | 23 | \item my statement that the compiler accepts or rejects the program | 
|---|
|  | 24 | \item the program's printed output, which I show | 
|---|
|  | 25 | \item my implied assurance that its assertions do not fail when run | 
|---|
|  | 26 | \end{itemize} | 
|---|
|  | 27 |  | 
|---|
|  | 28 | The compiler whose program semantics is shown is | 
|---|
|  | 29 | \begin{lstlisting} | 
|---|
|  | 30 | $ gcc --version | 
|---|
|  | 31 | gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 | 
|---|
|  | 32 | \end{lstlisting} | 
|---|
|  | 33 | running on Architecture @x86_64@, with the same environment targeted. | 
|---|
|  | 34 |  | 
|---|
|  | 35 | Unless explicit discussion ensues about differences among compilers or with (versions of) the standard, it is further implied that there exists a second version of GCC and some version of Clang, running on and for the same platform, that give substantially similar behaviour. | 
|---|
|  | 36 | In this case, I do not argue that my sample of major Linux compilers is doing the right thing with respect to the C standard. | 
|---|
|  | 37 |  | 
|---|
|  | 38 |  | 
|---|
|  | 39 | \subsection{C reports many ill-typed expressions as warnings} | 
|---|
|  | 40 |  | 
|---|
|  | 41 | TODO: typeset | 
|---|
| [f277ab6c] | 42 | \lstinputlisting[language=C, firstline=13, lastline=56]{bkgd-c-tyerr.c} | 
|---|
| [40ab446] | 43 |  | 
|---|
|  | 44 |  | 
|---|
|  | 45 | \section{C Arrays} | 
|---|
|  | 46 |  | 
|---|
|  | 47 | \subsection{C has an array type (!)} | 
|---|
|  | 48 |  | 
|---|
|  | 49 | TODO: typeset | 
|---|
| [f277ab6c] | 50 | \lstinputlisting[language=C, firstline=35, lastline=116]{bkgd-carray-arrty.c} | 
|---|
| [40ab446] | 51 |  | 
|---|
|  | 52 | My contribution is enabled by recognizing | 
|---|
|  | 53 | \begin{itemize} | 
|---|
|  | 54 | \item There is value in using a type that knows how big the whole thing is. | 
|---|
|  | 55 | \item The type pointer to (first) element does not. | 
|---|
|  | 56 | \item C \emph{has} a type that knows the whole picture: array, e.g. @T[10]@. | 
|---|
|  | 57 | \item This type has all the usual derived forms, which also know the whole picture.  A usefully noteworthy example is pointer to array, e.g. @T(*)[10]@. | 
|---|
|  | 58 | \end{itemize} | 
|---|
|  | 59 |  | 
|---|
|  | 60 | Each of these sections, which introduces another layer of of the C arrays' story, | 
|---|
|  | 61 | concludes with an \emph{Unfortunate Syntactic Reference}. | 
|---|
|  | 62 | It shows how to spell the types under discussion, | 
|---|
|  | 63 | along with interactions with orthogonal (but easily confused) language features. | 
|---|
|  | 64 | Alterrnate spellings are listed withing a row. | 
|---|
|  | 65 | The simplest occurrences of types distinguished in the preceding discussion are marked with $\triangleright$. | 
|---|
|  | 66 | The Type column gives the spelling used in a cast or error message (though note Section TODO points out that some types cannot be casted to). | 
|---|
|  | 67 | The Declaration column gives the spelling used in an object declaration, such as variable or aggregate member; parameter declarations (section TODO) follow entirely different rules. | 
|---|
|  | 68 |  | 
|---|
|  | 69 | After all, reading a C array type is easy: just read it from the inside out, and know when to look left and when to look right! | 
|---|
|  | 70 |  | 
|---|
|  | 71 |  | 
|---|
|  | 72 | \CFA-specific spellings (not yet introduced) are also included here for referenceability; these can be skipped on linear reading. | 
|---|
|  | 73 | The \CFA-C column gives the, more fortunate, ``new'' syntax of section TODO, for spelling \emph{exactly the same type}. | 
|---|
|  | 74 | This fortunate syntax does not have different spellings for types vs declarations; | 
|---|
|  | 75 | a declaration is always the type followed by the declared identifier name; | 
|---|
|  | 76 | for the example of letting @x@ be a \emph{pointer to array}, the declaration is spelled: | 
|---|
|  | 77 | \begin{lstlisting} | 
|---|
|  | 78 | [ * [10] T ] x; | 
|---|
|  | 79 | \end{lstlisting} | 
|---|
|  | 80 | The \CFA-Full column gives the spelling of a different type, introduced in TODO, which has all of my contributed improvements for safety and ergonomics. | 
|---|
|  | 81 |  | 
|---|
|  | 82 | \noindent | 
|---|
|  | 83 | \textbf{Unfortunate Syntactic Reference} | 
|---|
|  | 84 |  | 
|---|
|  | 85 | \noindent | 
|---|
|  | 86 | \begin{tabular}{llllll} | 
|---|
|  | 87 | & Description & Type & Declaration & \CFA-C  & \CFA-Full \\ \hline | 
|---|
|  | 88 | $\triangleright$ & val. | 
|---|
|  | 89 | & @T@ | 
|---|
|  | 90 | & @T x;@ | 
|---|
|  | 91 | & @[ T ]@ | 
|---|
|  | 92 | & | 
|---|
|  | 93 | \\ \hline | 
|---|
|  | 94 | & \pbox{20cm}{ \vspace{2pt} val.\\ \footnotesize{no writing the val.\ in \lstinline{x}}   }\vspace{2pt} | 
|---|
|  | 95 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T} \\ \lstinline{T const}   } | 
|---|
|  | 96 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T x;} \\ \lstinline{T const x;}   } | 
|---|
|  | 97 | & @[ const T ]@ | 
|---|
|  | 98 | & | 
|---|
|  | 99 | \\ \hline | 
|---|
|  | 100 | $\triangleright$ & ptr.\ to val. | 
|---|
|  | 101 | & @T *@ | 
|---|
|  | 102 | & @T * x;@ | 
|---|
|  | 103 | & @[ * T ]@ | 
|---|
|  | 104 | & | 
|---|
|  | 105 | \\ \hline | 
|---|
|  | 106 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}}   }\vspace{2pt} | 
|---|
|  | 107 | & @T * const@ | 
|---|
|  | 108 | & @T * const x;@ | 
|---|
|  | 109 | & @[ const * T ]@ | 
|---|
|  | 110 | & | 
|---|
|  | 111 | \\ \hline | 
|---|
|  | 112 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}}   }\vspace{2pt} | 
|---|
|  | 113 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *}   } | 
|---|
|  | 114 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x;} \\ \lstinline{T const * x;}   } | 
|---|
|  | 115 | & @[ * const T ]@ | 
|---|
|  | 116 | & | 
|---|
|  | 117 | \\ \hline | 
|---|
|  | 118 | $\triangleright$ & ar.\ of val. | 
|---|
|  | 119 | & @T[10]@ | 
|---|
|  | 120 | & @T x[10];@ | 
|---|
|  | 121 | & @[ [10] T ]@ | 
|---|
|  | 122 | & @[ array(T, 10) ]@ | 
|---|
|  | 123 | \\ \hline | 
|---|
|  | 124 | & \pbox{20cm}{ \vspace{2pt} ar.\ of val.\\ \footnotesize{no writing the val.\ in \lstinline{x[5]}}   }\vspace{2pt} | 
|---|
|  | 125 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T[10]} \\ \lstinline{T const[10]}   } | 
|---|
|  | 126 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T x[10];} \\ \lstinline{T const x[10];}   } | 
|---|
|  | 127 | & @[ [10] const T ]@ | 
|---|
|  | 128 | & @[ const array(T, 10) ]@ | 
|---|
|  | 129 | \\ \hline | 
|---|
|  | 130 | & ar.\ of ptr.\ to val. | 
|---|
|  | 131 | & @T*[10]@ | 
|---|
|  | 132 | & @T *x[10];@ | 
|---|
|  | 133 | & @[ [10] * T ]@ | 
|---|
|  | 134 | & @[ array(* T, 10) ]@ | 
|---|
|  | 135 | \\ \hline | 
|---|
|  | 136 | & \pbox{20cm}{ \vspace{2pt} ar.\ of ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x[5]}}   }\vspace{2pt} | 
|---|
|  | 137 | & @T * const [10]@ | 
|---|
|  | 138 | & @T * const x[10];@ | 
|---|
|  | 139 | & @[ [10] const * T ]@ | 
|---|
|  | 140 | & @[ array(const * T, 10) ]@ | 
|---|
|  | 141 | \\ \hline | 
|---|
|  | 142 | & \pbox{20cm}{ \vspace{2pt} ar.\ of ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*(x[5])}}   }\vspace{2pt} | 
|---|
|  | 143 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * [10]} \\ \lstinline{T const * [10]}   } | 
|---|
|  | 144 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x[10];} \\ \lstinline{T const * x[10];}   } | 
|---|
|  | 145 | & @[ [10] * const T ]@ | 
|---|
|  | 146 | & @[ array(* const T, 10) ]@ | 
|---|
|  | 147 | \\ \hline | 
|---|
|  | 148 | $\triangleright$ & ptr.\ to ar.\ of val. | 
|---|
|  | 149 | & @T(*)[10]@ | 
|---|
|  | 150 | & @T (*x)[10];@ | 
|---|
|  | 151 | & @[ * [10] T ]@ | 
|---|
|  | 152 | & @[ * array(T, 10) ]@ | 
|---|
|  | 153 | \\ \hline | 
|---|
|  | 154 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to ar.\ of val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}}   }\vspace{2pt} | 
|---|
|  | 155 | & @T(* const)[10]@ | 
|---|
|  | 156 | & @T (* const x)[10];@ | 
|---|
|  | 157 | & @[ const * [10] T ]@ | 
|---|
|  | 158 | & @[ const * array(T, 10) ]@ | 
|---|
|  | 159 | \\ \hline | 
|---|
|  | 160 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to ar.\ of val.\\ \footnotesize{no writing the val.\ in \lstinline{(*x)[5]}}   }\vspace{2pt} | 
|---|
|  | 161 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T(*)[10]} \\ \lstinline{T const (*) [10]}   } | 
|---|
|  | 162 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T (*x)[10];} \\ \lstinline{T const (*x)[10];}   } | 
|---|
|  | 163 | & @[ * [10] const T ]@ | 
|---|
|  | 164 | & @[ * const array(T, 10) ]@ | 
|---|
|  | 165 | \\ \hline | 
|---|
|  | 166 | & ptr.\ to ar.\ of ptr.\ to val. | 
|---|
|  | 167 | & @T*(*)[10]@ | 
|---|
|  | 168 | & @T *(*x)[10];@ | 
|---|
|  | 169 | & @[ * [10] * T ]@ | 
|---|
|  | 170 | & @[ * array(* T, 10) ]@ | 
|---|
|  | 171 | \\ \hline | 
|---|
|  | 172 | \end{tabular} | 
|---|
|  | 173 |  | 
|---|
|  | 174 |  | 
|---|
|  | 175 | \subsection{Arrays decay and pointers diffract} | 
|---|
|  | 176 |  | 
|---|
|  | 177 | TODO: typeset | 
|---|
| [f277ab6c] | 178 | \lstinputlisting[language=C, firstline=4, lastline=26]{bkgd-carray-decay.c} | 
|---|
| [40ab446] | 179 |  | 
|---|
|  | 180 |  | 
|---|
|  | 181 | So, C provides an implicit conversion from @float[10]@ to @float*@, as described in ARM-6.3.2.1.3: | 
|---|
|  | 182 |  | 
|---|
|  | 183 | \begin{quote} | 
|---|
|  | 184 | Except when it is the operand of the @sizeof@ operator, or the unary @&@ operator, or is a | 
|---|
|  | 185 | string literal used to initialize an array | 
|---|
|  | 186 | an expression that has type ``array of type'' is | 
|---|
|  | 187 | converted to an expression with type ``pointer to type'' that points to the initial element of | 
|---|
|  | 188 | the array object | 
|---|
|  | 189 | \end{quote} | 
|---|
|  | 190 |  | 
|---|
|  | 191 | This phenomenon is the famous ``pointer decay,'' which is a decay of an array-typed expression into a pointer-typed one. | 
|---|
|  | 192 |  | 
|---|
|  | 193 | It is worthy to note that the list of exception cases does not feature the occurrence of @a@ in @a[i]@. | 
|---|
|  | 194 | Thus, subscripting happens on pointers, not arrays. | 
|---|
|  | 195 |  | 
|---|
|  | 196 | Subscripting proceeds first with pointer decay, if needed.  Next, ARM-6.5.2.1.2 explains that @a[i]@ is treated as if it were @(*((a)+(i)))@. | 
|---|
|  | 197 | ARM-6.5.6.8 explains that the addition, of a pointer with an integer type,  is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @a@ is big enough and @i@ is small enough. | 
|---|
|  | 198 | Finally, ARM-6.5.3.2.4 explains that the @*@ operator's result is the referenced element. | 
|---|
|  | 199 |  | 
|---|
|  | 200 | Taken together, these rules also happen to illustrate that @a[i]@ and @i[a]@ mean the same thing. | 
|---|
|  | 201 |  | 
|---|
|  | 202 | Subscripting a pointer when the target is standard-inappropriate is still practically well-defined. | 
|---|
|  | 203 | While the standard affords a C compiler freedom about the meaning of an out-of-bound access, | 
|---|
|  | 204 | or of subscripting a pointer that does not refer to an array element at all, | 
|---|
|  | 205 | the fact that C is famously both generally high-performance, and specifically not bound-checked, | 
|---|
|  | 206 | leads to an expectation that the runtime handling is uniform across legal and illegal accesses. | 
|---|
|  | 207 | Moreover, consider the common pattern of subscripting on a malloc result: | 
|---|
|  | 208 | \begin{lstlisting} | 
|---|
|  | 209 | float * fs = malloc( 10 * sizeof(float) ); | 
|---|
|  | 210 | fs[5] = 3.14; | 
|---|
|  | 211 | \end{lstlisting} | 
|---|
|  | 212 | The @malloc@ behaviour is specified as returning a pointer to ``space for an object whose size is'' as requested (ARM-7.22.3.4.2). | 
|---|
|  | 213 | But program says \emph{nothing} more about this pointer value, that might cause its referent to \emph{be} an array, before doing the subscript. | 
|---|
|  | 214 |  | 
|---|
|  | 215 | Under this assumption, a pointer being subscripted (or added to, then dereferenced) | 
|---|
|  | 216 | by any value (positive, zero, or negative), gives a view of the program's entire address space, | 
|---|
|  | 217 | centred around the @p@ address, divided into adjacent @sizeof(*p)@ chunks, | 
|---|
|  | 218 | each potentially (re)interpreted as @typeof(*p)@. | 
|---|
|  | 219 |  | 
|---|
|  | 220 | I call this phenomenon ``array diffraction,''  which is a diffraction of a single-element pointer | 
|---|
|  | 221 | into the assumption that its target is in the middle of an array whose size is unlimited in both directions. | 
|---|
|  | 222 |  | 
|---|
|  | 223 | No pointer is exempt from array diffraction. | 
|---|
|  | 224 |  | 
|---|
|  | 225 | No array shows its elements without pointer decay. | 
|---|
|  | 226 |  | 
|---|
|  | 227 | A further pointer--array confusion, closely related to decay, occurs in parameter declarations. | 
|---|
|  | 228 | ARM-6.7.6.3.7 explains that when an array type is written for a parameter, | 
|---|
|  | 229 | the parameter's type becomes a type that I summarize as being the array-decayed type. | 
|---|
|  | 230 | The respective handlings of the following two parameter spellings shows that the array-spelled one is really, like the other, a pointer. | 
|---|
| [f277ab6c] | 231 | \lstinputlisting[language=C, firstline=40, lastline=44]{bkgd-carray-decay.c} | 
|---|
| [40ab446] | 232 | As the @sizeof(x)@ meaning changed, compared with when run on a similarly-spelled local variariable declaration, | 
|---|
|  | 233 | GCC also gives this code the warning: ```sizeof' on array function parameter `x' will return size of `float *'.'' | 
|---|
|  | 234 |  | 
|---|
|  | 235 | The caller of such a function is left with the reality that a pointer parameter is a pointer, no matter how it's spelled: | 
|---|
| [f277ab6c] | 236 | \lstinputlisting[language=C, firstline=60, lastline=63]{bkgd-carray-decay.c} | 
|---|
| [40ab446] | 237 | This fragment gives no warnings. | 
|---|
|  | 238 |  | 
|---|
|  | 239 | The shortened parameter syntax @T x[]@ is a further way to spell ``pointer.'' | 
|---|
|  | 240 | Note the opposite meaning of this spelling now, compared with its use in local variable declarations. | 
|---|
|  | 241 | This point of confusion is illustrated in: | 
|---|
| [f277ab6c] | 242 | \lstinputlisting[language=C, firstline=80, lastline=87]{bkgd-carray-decay.c} | 
|---|
| [40ab446] | 243 | The basic two meanings, with a syntactic difference helping to distinguish, | 
|---|
|  | 244 | are illustrated in the declarations of @ca@ vs.\ @cp@, | 
|---|
|  | 245 | whose subsequent @edit@ calls behave differently. | 
|---|
|  | 246 | The syntax-caused confusion is in the comparison of the first and last lines, | 
|---|
|  | 247 | both of which use a literal to initialze an object decalared with spelling @T x[]@. | 
|---|
|  | 248 | But these initialized declarations get opposite meanings, | 
|---|
|  | 249 | depending on whether the object is a local variable or a parameter. | 
|---|
|  | 250 |  | 
|---|
|  | 251 |  | 
|---|
|  | 252 | In sumary, when a funciton is written with an array-typed parameter, | 
|---|
|  | 253 | \begin{itemize} | 
|---|
|  | 254 | \item an appearance of passing an array by value is always an incorrect understanding | 
|---|
|  | 255 | \item a dimension value, if any is present, is ignorred | 
|---|
|  | 256 | \item pointer decay is forced at the call site and the callee sees the parameter having the decayed type | 
|---|
|  | 257 | \end{itemize} | 
|---|
|  | 258 |  | 
|---|
|  | 259 | Pointer decay does not affect pointer-to-array types, because these are already pointers, not arrays. | 
|---|
|  | 260 | As a result, a function with a pointer-to-array parameter sees the parameter exactly as the caller does: | 
|---|
| [f277ab6c] | 261 | \lstinputlisting[language=C, firstline=100, lastline=110]{bkgd-carray-decay.c} | 
|---|
| [40ab446] | 262 |  | 
|---|
|  | 263 |  | 
|---|
|  | 264 | \noindent | 
|---|
|  | 265 | \textbf{Unfortunate Syntactic Reference} | 
|---|
|  | 266 |  | 
|---|
|  | 267 | \noindent | 
|---|
|  | 268 | (Parameter declaration; ``no writing'' refers to the callee's ability) | 
|---|
|  | 269 |  | 
|---|
|  | 270 | \noindent | 
|---|
|  | 271 | \begin{tabular}{llllll} | 
|---|
|  | 272 | & Description & Type & Param. Decl & \CFA-C  \\ \hline | 
|---|
|  | 273 | $\triangleright$ & ptr.\ to val. | 
|---|
|  | 274 | & @T *@ | 
|---|
|  | 275 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T * x,} \\ \lstinline{T x[10],} \\ \lstinline{T x[],}   }\vspace{2pt} | 
|---|
|  | 276 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ * T ]} \\ \lstinline{[ [10] T ]} \\ \lstinline{[ [] T  ]}   } | 
|---|
|  | 277 | \\ \hline | 
|---|
|  | 278 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}}   }\vspace{2pt} | 
|---|
|  | 279 | & @T * const@ | 
|---|
|  | 280 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T * const x,} \\ \lstinline{T x[const 10],} \\ \lstinline{T x[const],}   }\vspace{2pt} | 
|---|
|  | 281 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ const * T ]} \\ \lstinline{[ [const 10] T ]} \\ \lstinline{[ [const] T  ]}   } | 
|---|
|  | 282 | \\ \hline | 
|---|
|  | 283 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}}   }\vspace{2pt} | 
|---|
|  | 284 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *}   } | 
|---|
|  | 285 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x,} \\ \lstinline{T const * x,} \\ \lstinline{const T x[10],} \\ \lstinline{T const x[10],} \\ \lstinline{const T x[],} \\ \lstinline{T const x[],}   }\vspace{2pt} | 
|---|
|  | 286 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[* const T]} \\ \lstinline{[ [10] const T ]} \\ \lstinline{[ [] const T  ]}   } | 
|---|
|  | 287 | \\ \hline | 
|---|
|  | 288 | $\triangleright$ & ptr.\ to ar.\ of val. | 
|---|
|  | 289 | & @T(*)[10]@ | 
|---|
|  | 290 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T (*x)[10],} \\ \lstinline{T x[3][10],} \\ \lstinline{T x[][10],}   }\vspace{2pt} | 
|---|
|  | 291 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[* [10] T]} \\ \lstinline{[ [3] [10] T ]} \\ \lstinline{[ [] [10] T  ]}   } | 
|---|
|  | 292 | \\ \hline | 
|---|
|  | 293 | & ptr.\ to ptr.\ to val. | 
|---|
|  | 294 | & @T **@ | 
|---|
|  | 295 | & \pbox{20cm}{ \vspace{2pt} \lstinline{T ** x,} \\ \lstinline{T *x[10],} \\ \lstinline{T *x[],}   }\vspace{2pt} | 
|---|
|  | 296 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ * * T ]} \\ \lstinline{[ [10] * T ]} \\ \lstinline{[ [] * T  ]}   } | 
|---|
|  | 297 | \\ \hline | 
|---|
|  | 298 | & \pbox{20cm}{ \vspace{2pt} ptr.\ to ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{**argv}}   }\vspace{2pt} | 
|---|
|  | 299 | & @const char **@ | 
|---|
|  | 300 | & \pbox{20cm}{ \vspace{2pt} \lstinline{const char *argv[],} \\ \footnotesize{(others elided)}   }\vspace{2pt} | 
|---|
|  | 301 | & \pbox{20cm}{ \vspace{2pt} \lstinline{[ [] * const char ]} \\ \footnotesize{(others elided)}   } | 
|---|
|  | 302 | \\ \hline | 
|---|
|  | 303 | \end{tabular} | 
|---|
|  | 304 |  | 
|---|
|  | 305 |  | 
|---|
|  | 306 |  | 
|---|
|  | 307 | \subsection{Lengths may vary, checking does not} | 
|---|
|  | 308 |  | 
|---|
|  | 309 | When the desired number of elements is unknown at compile time, | 
|---|
|  | 310 | a variable-length array is a solution: | 
|---|
|  | 311 | \begin{lstlisting} | 
|---|
|  | 312 | int main( int argc, const char *argv[] ) { | 
|---|
|  | 313 |  | 
|---|
|  | 314 | assert( argc == 2 ); | 
|---|
|  | 315 | size_t n = atol( argv[1] ); | 
|---|
|  | 316 | assert( 0 < n && n < 1000 ); | 
|---|
|  | 317 |  | 
|---|
|  | 318 | float a[n]; | 
|---|
|  | 319 | float b[10]; | 
|---|
|  | 320 |  | 
|---|
|  | 321 | // ... discussion continues here | 
|---|
|  | 322 | } | 
|---|
|  | 323 | \end{lstlisting} | 
|---|
|  | 324 | This arrangement allocates @n@ elements on the @main@ stack frame for @a@, | 
|---|
|  | 325 | just as it puts 10 elements on the @main@ stack frame for @b@. | 
|---|
|  | 326 | The variable-sized allocation of @a@ is provided by @alloca@. | 
|---|
|  | 327 |  | 
|---|
|  | 328 | In a situation where the array sizes are not known to be small enough | 
|---|
|  | 329 | for stack allocation to be sensible, | 
|---|
|  | 330 | corresponding heap allocations are achievable as: | 
|---|
|  | 331 | \begin{lstlisting} | 
|---|
|  | 332 | float *ax1 = malloc( sizeof( float[n] ) ); | 
|---|
|  | 333 | float *ax2 = malloc( n * sizeof( float ) ); | 
|---|
|  | 334 | float *bx1 = malloc( sizeof( float[1000000] ) ); | 
|---|
|  | 335 | float *bx2 = malloc( 1000000 * sizeof( float ) ); | 
|---|
|  | 336 | \end{lstlisting} | 
|---|
|  | 337 |  | 
|---|
|  | 338 |  | 
|---|
|  | 339 | VLA | 
|---|
|  | 340 |  | 
|---|
|  | 341 | Parameter dependency | 
|---|
|  | 342 |  | 
|---|
|  | 343 | Checking is best-effort / unsound | 
|---|
|  | 344 |  | 
|---|
|  | 345 | Limited special handling to get the dimension value checked (static) | 
|---|
|  | 346 |  | 
|---|
|  | 347 |  | 
|---|
|  | 348 |  | 
|---|
|  | 349 | \subsection{C has full-service, dynamically sized, multidimensional arrays (and \CC does not)} | 
|---|
|  | 350 |  | 
|---|
|  | 351 | In C and \CC, ``multidimensional array'' means ``array of arrays.''  Other meanings are discussed in TODO. | 
|---|
|  | 352 |  | 
|---|
|  | 353 | Just as an array's element type can be @float@, so can it be @float[10]@. | 
|---|
|  | 354 |  | 
|---|
|  | 355 | While any of @float*@, @float[10]@ and @float(*)[10]@ are easy to tell apart from @float@, | 
|---|
|  | 356 | telling them apart from each other may need occasional reference back to TODO intro section. | 
|---|
|  | 357 | The sentence derived by wrapping each type in @-[3]@ follows. | 
|---|
|  | 358 |  | 
|---|
|  | 359 | While any of @float*[3]@, @float[3][10]@ and @float(*)[3][10]@ are easy to tell apart from @float[3]@, | 
|---|
|  | 360 | telling them apart from each other is what it takes to know what ``array of arrays'' really means. | 
|---|
|  | 361 |  | 
|---|
|  | 362 |  | 
|---|
|  | 363 | Pointer decay affects the outermost array only | 
|---|
|  | 364 |  | 
|---|
|  | 365 |  | 
|---|
|  | 366 | TODO: unfortunate syntactic reference with these cases: | 
|---|
|  | 367 |  | 
|---|
|  | 368 | \begin{itemize} | 
|---|
|  | 369 | \item ar. of ar. of val (be sure about ordering of dimensions when the declaration is dropped) | 
|---|
|  | 370 | \item ptr. to ar. of ar. of val | 
|---|
|  | 371 | \end{itemize} | 
|---|
|  | 372 |  | 
|---|
|  | 373 |  | 
|---|
|  | 374 |  | 
|---|
|  | 375 |  | 
|---|
|  | 376 |  | 
|---|
|  | 377 | \subsection{Arrays are (but) almost values} | 
|---|
|  | 378 |  | 
|---|
|  | 379 | Has size; can point to | 
|---|
|  | 380 |  | 
|---|
|  | 381 | Can't cast to | 
|---|
|  | 382 |  | 
|---|
|  | 383 | Can't pass as value | 
|---|
|  | 384 |  | 
|---|
|  | 385 | Can initialize | 
|---|
|  | 386 |  | 
|---|
|  | 387 | Can wrap in aggregate | 
|---|
|  | 388 |  | 
|---|
|  | 389 | Can't assign | 
|---|
|  | 390 |  | 
|---|
|  | 391 |  | 
|---|
|  | 392 | \subsection{Returning an array is (but) almost possible} | 
|---|
|  | 393 |  | 
|---|
|  | 394 |  | 
|---|
|  | 395 |  | 
|---|
|  | 396 |  | 
|---|
|  | 397 | \subsection{The pointer-to-array type has been noticed before} | 
|---|
|  | 398 |  | 
|---|
|  | 399 |  | 
|---|
|  | 400 | \section{\CFA} | 
|---|
|  | 401 |  | 
|---|
|  | 402 | Traditionally, fixing C meant leaving the C-ism alone, while providing a better alternative beside it. | 
|---|
|  | 403 | (For later:  That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.) | 
|---|
|  | 404 |  | 
|---|
|  | 405 | \subsection{\CFA features interacting with arrays} | 
|---|
|  | 406 |  | 
|---|
|  | 407 | Prior work on \CFA included making C arrays, as used in C code from the wild, | 
|---|
|  | 408 | work, if this code is fed into @cfacc@. | 
|---|
|  | 409 | The quality of this this treatment was fine, with no more or fewer bugs than is typical. | 
|---|
|  | 410 |  | 
|---|
|  | 411 | More mixed results arose with feeding these ``C'' arrays into preexisting \CFA features. | 
|---|
|  | 412 |  | 
|---|
|  | 413 | A notable success was with the \CFA @alloc@ function, | 
|---|
|  | 414 | which type information associated with a polymorphic return type | 
|---|
|  | 415 | replaces @malloc@'s use of programmer-supplied size information. | 
|---|
| [ed79428] | 416 | \begin{lstlisting} | 
|---|
| [40ab446] | 417 | // C, library | 
|---|
|  | 418 | void * malloc( size_t ); | 
|---|
|  | 419 | // C, user | 
|---|
|  | 420 | struct tm * el1 = malloc(      sizeof(struct tm) ); | 
|---|
|  | 421 | struct tm * ar1 = malloc( 10 * sizeof(struct tm) ); | 
|---|
|  | 422 |  | 
|---|
|  | 423 | // CFA, library | 
|---|
|  | 424 | forall( T * ) T * alloc(); | 
|---|
|  | 425 | // CFA, user | 
|---|
|  | 426 | tm * el2 = alloc(); | 
|---|
|  | 427 | tm (*ar2)[10] = alloc(); | 
|---|
| [ed79428] | 428 | \end{lstlisting} | 
|---|
| [40ab446] | 429 | The alloc polymorphic return compiles into a hidden parameter, which receives a compiler-generated argument. | 
|---|
|  | 430 | This compiler's argument generation uses type information from the left-hand side of the initialization to obtain the intended type. | 
|---|
|  | 431 | Using a compiler-produced value eliminates an opportunity for user error. | 
|---|
|  | 432 |  | 
|---|
|  | 433 | TODO: fix in following: even the alloc call gives bad code gen: verify it was always this way; walk back the wording about things just working here; assignment (rebind) seems to offer workaround, as in bkgd-cfa-arrayinteract.cfa | 
|---|
|  | 434 |  | 
|---|
|  | 435 | Bringing in another \CFA feature, reference types, both resolves a sore spot of the last example, and gives a first example of an array-interaction bug. | 
|---|
|  | 436 | In the last example, the choice of ``pointer to array'' @ar2@ breaks a parallel with @ar1@. | 
|---|
|  | 437 | They are not subscripted in the same way. | 
|---|
| [ed79428] | 438 | \begin{lstlisting} | 
|---|
| [40ab446] | 439 | ar1[5]; | 
|---|
|  | 440 | (*ar2)[5]; | 
|---|
| [ed79428] | 441 | \end{lstlisting} | 
|---|
| [40ab446] | 442 | Using ``reference to array'' works at resolving this issue.  TODO: discuss connection with Doug-Lea \CC proposal. | 
|---|
| [ed79428] | 443 | \begin{lstlisting} | 
|---|
| [40ab446] | 444 | tm (&ar3)[10] = *alloc(); | 
|---|
|  | 445 | ar3[5]; | 
|---|
| [ed79428] | 446 | \end{lstlisting} | 
|---|
| [40ab446] | 447 | The implicit size communication to @alloc@ still works in the same ways as for @ar2@. | 
|---|
|  | 448 |  | 
|---|
| [ed79428] | 449 | Using proper array types (@ar2@ and @ar3@) addresses a concern about using raw element pointers (@ar1@), albeit a theoretical one. | 
|---|
| [40ab446] | 450 | TODO xref C standard does not claim that @ar1@ may be subscripted, | 
|---|
|  | 451 | because no stage of interpreting the construction of @ar1@ has it be that ``there is an \emph{array object} here.'' | 
|---|
|  | 452 | But both @*ar2@ and the referent of @ar3@ are the results of \emph{typed} @alloc@ calls, | 
|---|
|  | 453 | where the type requested is an array, making the result, much more obviously, an array object. | 
|---|
|  | 454 |  | 
|---|
| [ed79428] | 455 | The ``reference to array'' type has its sore spots too.  TODO see also @dimexpr-match-c/REFPARAM_CALL (under TRY_BUG_1)@ | 
|---|
| [40ab446] | 456 |  | 
|---|
|  | 457 |  | 
|---|
|  | 458 |  | 
|---|
|  | 459 | TODO: I fixed a bug associated with using an array as a T.  I think.  Did I really?  What was the bug? | 
|---|