Changeset b5bfb16 for doc/theses/mike_brooks_MMath
- Timestamp:
- Mar 27, 2024, 10:08:54 AM (9 months ago)
- Branches:
- master
- Children:
- 2d82999
- Parents:
- b8cb388
- git-author:
- Peter A. Buhr <pabuhr@…> (03/27/24 10:08:01)
- git-committer:
- Peter A. Buhr <pabuhr@…> (03/27/24 10:08:54)
- Location:
- doc/theses/mike_brooks_MMath
- Files:
- 2 edited
doc/theses/mike_brooks_MMath/background.tex
--- doc/theses/mike_brooks_MMath/background.tex (rb8cb388)
+++ doc/theses/mike_brooks_MMath/background.tex (rb5bfb16)
@@ -6,36 +6,48 @@
 \section{Array}
 
-When a programmer works with an array, C semantics provide access to a type that is different in every way from ``pointer to its first element.''
-Its qualities become apparent by inspecting the declaration
+At the start, the C programming language made a significant design mistake.
+\begin{quote}
+In C, there is a strong relationship between pointers and arrays, strong enough that pointers and arrays really should be treated simultaneously.
+Any operation which can be achieved by array subscripting can also be done with pointers.~\cite[p.~93]{C:old}
+\end{quote}
+Accessing any storage requires pointer arithmetic, even if it is just base-displacement addressing in an instruction.
+The conjoining of pointers and arrays could also be applied to structures, where a pointer references a structure field like an array element.
+Finally, while subscripting involves pointer arithmetic (as does field references @x.y.z@), it is very complex for multi-dimensional arrays and requires array descriptors to know stride lengths along dimensions.
+Many C errors result from performing pointer arithmetic instead of using subscripting.
+Some C textbooks erroneously teach pointer arithmetic suggesting it is faster than subscripting.
+
+C semantics want a programmer to \emph{believe} an array variable is a ``pointer to its first element.''
+This desire becomes apparent by a detailed inspection of an array declaration.
 \lstinput{34-34}{bkgd-carray-arrty.c}
 The inspection begins by using @sizeof@ to provide definite program semantics for the intuition of an expression's type.
-Assuming a target platform keeps things concrete:
 \lstinput{35-36}{bkgd-carray-arrty.c}
-Consider the sizes of expressions derived from @a@, modified by adding ``pointer to'' and ``first element'' (and including unnecessary parentheses to avoid confusion about precedence).
+Now consider the sizes of expressions derived from @ar@, modified by adding ``pointer to'' and ``first element'' (and including unnecessary parentheses to avoid confusion about precedence).
 \lstinput{37-40}{bkgd-carray-arrty.c}
-That @a@ takes up 40 bytes is common reasoning for C programmers.
-Set aside for a moment the claim that this first assertion is giving information about a type.
-For now, note that an array and a pointer to its first element are, sometimes, different things.
-
-The idea that there is such a thing as a pointer to an array may be surprising.
-It is not the same thing as a pointer to the first element:
+Given the size of @float@ is 4, the size of @ar@ with 10 floats being 40 bytes is common reasoning for C programmers.
+Equally, C programmers know the size of a \emph{pointer} to the first array element is 8 (or 4 depending on the addressing architecture).
+% Now, set aside for a moment the claim that this first assertion is giving information about a type.
+Clearly, an array and a pointer to its first element are different things.
+
+In fact, the idea that there is such a thing as a pointer to an array may be surprising and it is not the same thing as a pointer to the first element.
 \lstinput{42-45}{bkgd-carray-arrty.c}
-The first gets
+The first assignment gets
 \begin{cfa}
 warning: assignment to `float (*)[10]' from incompatible pointer type `float *'
 \end{cfa}
-and the second gets the opposite.
-
-We now refute a concern that @sizeof(a)@ is reporting on special knowledge from @a@ being an local variable,
-say that it is informing about an allocation, rather than simply a type.
-
-First, recognizing that @sizeof@ has two forms, one operating on an expression, the other on a type, we observe that the original answers are unaffected by using the type-parameterized form:
-\lstinput{46-50}{bkgd-carray-arrty.c}
-Finally, the same sizing is reported when there is no allocation at all, and we launch the analysis instead from the pointer-to-array type.
+and the second assignment gets the opposite.
+
+The inspection now refutes any suggestion that @sizeof@ is informing about allocation rather than type information.
+Note, @sizeof@ has two forms, one operating on an expression and the other on a type.
+Using the type form yields the same results as the prior expression form.
+\lstinput{46-49}{bkgd-carray-arrty.c}
+The results are also the same when there is \emph{no allocation} using a pointer-to-array type.
 \lstinput{51-57}{bkgd-carray-arrty.c}
-So, in spite of considerable programmer success enabled by an understanding that an array just a pointer to its first element (revisited TODO pointer decay), this understanding is simplistic.
-
-A shortened form for declaring local variables exists, provided that length information is given in the initializer:
-\lstinput{59-63}{bkgd-carray-arrty.c}
+Hence, in all cases, @sizeof@ is informing about type information.
+
+So, thinking of an array as a pointer to its first element is too simplistic an analogue and it is not backed up the type system.
+This misguided analogue can be forced onto single-dimension arrays but there is no advantage other than possibly teaching beginning programmers about basic runtime array-access.
+
+Continuing, a shortened form for declaring local variables exists, provided that length information is given in the initializer:
+\lstinput{59-62}{bkgd-carray-arrty.c}
 In these declarations, the resulting types are both arrays, but their lengths are inferred.
 
@@ -102,5 +114,5 @@
 It shows how to spell the types under discussion,
 along with interactions with orthogonal (but easily confused) language features.
-Alterrnate spellings are listed withing a row.
+Alternate spellings are listed within a row.
 The simplest occurrences of types distinguished in the preceding discussion are marked with $\triangleright$.
 The Type column gives the spelling used in a cast or error message (though note Section TODO points out that some types cannot be casted to).
@@ -227,5 +239,5 @@
 \lstinput{9-9}{bkgd-carray-decay.c}
 The validity of this initialization is unsettling, in the context of the facts established in the last section.
-Notably, it initializes name @pa0x@ from expression @a@, when they are not of the same type:
+Notably, it initializes name @pa0x@ from expression @ar@, when they are not of the same type:
 \lstinput{10-10}{bkgd-carray-decay.c}
 
@@ -241,12 +253,12 @@
 This phenomenon is the famous ``pointer decay,'' which is a decay of an array-typed expression into a pointer-typed one.
 
-It is worthy to note that the list of exception cases does not feature the occurrence of @a@ in @a[i]@.
+It is worthy to note that the list of exception cases does not feature the occurrence of @ar@ in @ar[i]@.
 Thus, subscripting happens on pointers, not arrays.
 
-Subscripting proceeds first with pointer decay, if needed. Next, ARM-6.5.2.1.2 explains that @a[i]@ is treated as if it were @(*((a)+(i)))@.
-ARM-6.5.6.8 explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @a@ is big enough and @i@ is small enough.
+Subscripting proceeds first with pointer decay, if needed. Next, ARM-6.5.2.1.2 explains that @ar[i]@ is treated as if it were @(*((a)+(i)))@.
+ARM-6.5.6.8 explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @ar@ is big enough and @i@ is small enough.
 Finally, ARM-6.5.3.2.4 explains that the @*@ operator's result is the referenced element.
 
-Taken together, these rules also happen to illustrate that @a[i]@ and @i[a]@ mean the same thing.
+Taken together, these rules also happen to illustrate that @ar[i]@ and @i[a]@ mean the same thing.
 
 Subscripting a pointer when the target is standard-inappropriate is still practically well-defined.
@@ -255,5 +267,5 @@
 the fact that C is famously both generally high-performance, and specifically not bound-checked,
 leads to an expectation that the runtime handling is uniform across legal and illegal accesses.
-Moreover, consider the common pattern of subscripting on a malloc result:
+Moreover, consider the common pattern of subscripting on a @malloc@ result:
 \begin{cfa}
 float * fs = malloc( 10 * sizeof(float) );
@@ -280,5 +292,5 @@
 The respective handlings of the following two parameter spellings shows that the array-spelled one is really, like the other, a pointer.
 \lstinput{12-16}{bkgd-carray-decay.c}
-As the @sizeof(x)@ meaning changed, compared with when run on a similarly-spelled local variariable declaration,
+As the @sizeof(x)@ meaning changed, compared with when run on a similarly-spelled local variable declaration,
 GCC also gives this code the warning: ```sizeof' on array function parameter `x' will return size of `float *'.''
 
@@ -295,13 +307,13 @@
 whose subsequent @edit@ calls behave differently.
 The syntax-caused confusion is in the comparison of the first and last lines,
-both of which use a literal to initialze an object decalared with spelling @T x[]@.
+both of which use a literal to initialize an object declared with spelling @T x[]@.
 But these initialized declarations get opposite meanings,
 depending on whether the object is a local variable or a parameter.
 
 
-In sumary, when a funciton is written with an array-typed parameter,
+In summary, when a function is written with an array-typed parameter,
 \begin{itemize}
 \item an appearance of passing an array by value is always an incorrect understanding
-\item a dimension value, if any is present, is ignorred
+\item a dimension value, if any is present, is ignored
 \item pointer decay is forced at the call site and the callee sees the parameter having the decayed type
 \end{itemize}
@@ -366,5 +378,5 @@
     assert( 0 < n && n < 1000 );
 
-    float a[n];
+    float ar[n];
     float b[10];
 
@@ -372,6 +384,6 @@
 }
 \end{cfa}
-This arrangement allocates @n@ elements on the @main@ stack frame for @a@, just as it puts 10 elements on the @main@ stack frame for @b@.
-The variable-sized allocation of @a@ is provided by @alloca@.
+This arrangement allocates @n@ elements on the @main@ stack frame for @ar@, just as it puts 10 elements on the @main@ stack frame for @b@.
+The variable-sized allocation of @ar@ is provided by @alloca@.
 
 In a situation where the array sizes are not known to be small enough for stack allocation to be sensible, corresponding heap allocations are achievable as:
@@ -440,5 +452,5 @@
 As in the last section, we inspect the declaration ...
 \lstinput{16-18}{bkgd-carray-mdim.c}
-The significant axis of deriving expressions from @a@ is now ``itself,'' ``first element'' or ``first grand-element (meaning, first element of first element).''
+The significant axis of deriving expressions from @ar@ is now ``itself,'' ``first element'' or ``first grand-element (meaning, first element of first element).''
 \lstinput{20-44}{bkgd-carray-mdim.c}
 
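Illustration (not part of the changeset): the decay rules described in the revised background.tex text can be checked with a short C11 sketch. All function and macro names below are hypothetical, chosen for this note only, and the sketch assumes a compiler that supports `_Generic`. It shows that the three subscript spellings name the same element, that an array-spelled parameter is really a pointer, and that a callee's write reaches the caller's array.

```c
#include <assert.h>

/* Hypothetical helper for this note: 1 when &v has type "pointer to array of 10 float". */
#define ADDR_IS_PTR_TO_ARRAY(v) _Generic( &(v), float (*)[10]: 1, default: 0 )

/* Subscripting is defined on pointers: ar[i], *(ar + i) and i[ar] name the same element. */
int subscript_forms_agree( void ) {
    float ar[10] = { 0 };
    ar[3] = 2.5f;
    return ar[3] == *(ar + 3) && *(ar + 3) == 3[ar];
}

/* A local declared float ar[10] really is an array, so &ar is float (*)[10]. */
int local_is_array( void ) {
    float ar[10] = { 0 };
    return ADDR_IS_PTR_TO_ARRAY( ar );
}

/* A parameter spelled float x[10] is adjusted to float *, so &x is float **. */
int param_is_array( float x[10] ) {
    return ADDR_IS_PTR_TO_ARRAY( x );
}

/* The callee receives a pointer, so its write lands in the caller's array. */
void set_first( float x[10] ) {
    x[0] = 1.0f;
}

int callee_write_is_visible( void ) {
    float ar[10] = { 0 };
    set_first( ar );                /* ar decays to &ar[0] at the call site */
    return ar[0] == 1.0f;
}
```

Taking the address before classifying is deliberate: the expression `ar` on its own would decay, which would hide the very difference the text is describing.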
doc/theses/mike_brooks_MMath/programs/bkgd-carray-arrty.c
--- doc/theses/mike_brooks_MMath/programs/bkgd-carray-arrty.c (rb8cb388)
+++ doc/theses/mike_brooks_MMath/programs/bkgd-carray-arrty.c (rb5bfb16)
@@ -32,34 +32,34 @@
 
 int main() {
-    float a[10];
-    static_assert( sizeof(float) == 4 ); $\C{// floats (array elements) are 4 bytes}$
-    static_assert( sizeof(void*) == 8 ); $\C{// pointers are 8 bytes}$
-    static_assert( sizeof(a) == 40 ); $\C{// array}$
-    static_assert( sizeof(&a) == 8 ); $\C{// pointer to array}$
-    static_assert( sizeof(a[0]) == 4 ); $\C{// first element}$
-    static_assert( sizeof(&(a[0])) == 8 ); $\C{// pointer to first element}$
+    float ar[10];
+    static_assert( sizeof(float) == 4 ); $\C{// floats (array elements) are 4 bytes}$
+    static_assert( sizeof(void*) == 8 ); $\C{// pointers are 8 bytes}$
+    static_assert( sizeof(ar) == 40 ); $\C{// array}$
+    static_assert( sizeof(&ar) == 8 ); $\C{// pointer to array}$
+    static_assert( sizeof(ar[0]) == 4 ); $\C{// first element}$
+    static_assert( sizeof(&(ar[0])) == 8 ); $\C{// pointer to first element}$
 
-    typeof(&a) x; $\C{// x is pointer to array}$
-    typeof(&(a[0])) y; $\C{// y is pointer to first element}$
+    typeof(&ar) x = &ar; $\C{// x is pointer to array}$
+    typeof(&(ar[0])) y = &ar[0]; $\C{// y is pointer to first element}$
     @x = y;@ $\C{// ill-typed}$
     @y = x;@ $\C{// ill-typed}$
-    static_assert( sizeof(typeof(a)) == 40 );
-    static_assert( sizeof(typeof(&a)) == 8 );
-    static_assert( sizeof(typeof(a[0])) == 4 );
-    static_assert( sizeof(typeof(&(a[0]))) == 8 );
+    static_assert( sizeof(typeof(ar)) == 40 ); $\C{// array}$
+    static_assert( sizeof(typeof(&ar)) == 8 ); $\C{// pointer to array}$
+    static_assert( sizeof(typeof(ar[0])) == 4 ); $\C{// first element}$
+    static_assert( sizeof(typeof(&(ar[0]))) == 8 ); $\C{// pointer to first element}$
 
     void f( float (*pa)[10] ) {
-        static_assert(sizeof( *pa ) == 40 ); $\C{// array}$
-        static_assert(sizeof( pa ) == 8 ); $\C{// pointer to array}$
-        static_assert(sizeof((*pa)[0] ) == 4 ); $\C{// first element}$
-        static_assert(sizeof(&((*pa)[0])) == 8 ); $\C{// pointer to first element}$
+        static_assert( sizeof( *pa ) == 40 ); $\C{// array}$
+        static_assert( sizeof( pa ) == 8 ); $\C{// pointer to array}$
+        static_assert( sizeof( (*pa)[0] ) == 4 ); $\C{// first element}$
+        static_assert( sizeof(&((*pa)[0])) == 8 ); $\C{// pointer to first element}$
     }
-    f( &a );
+    f( &ar );
 
     float fs[] = {3.14, 1.707};
     char cs[] = "hello";
-
     static_assert( sizeof(fs) == 2 * sizeof(float) );
     static_assert( sizeof(cs) == 6 * sizeof(char) ); $\C{// 5 letters + 1 null terminator}$
+
 }
 
@@ -144,8 +144,8 @@
 void stx2() { const T x[10];
     // x[5] = 3.14; // bad
 }
 void stx3() { T const x[10];
     // x[5] = 3.14; // bad
 }
 
 // Local Variables: //
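Illustration (not part of the changeset): the assertions in bkgd-carray-arrty.c hard-code a platform with 4-byte floats and 8-byte pointers, and the file uses a nested function, which is a GNU extension. The sketch below restates the same distinctions in portable C11, under the assumption that `_Generic` and `static_assert` are available. The names `PTR_KIND`, `KIND_*`, and the four functions are hypothetical and do not appear in the thesis sources.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical classification codes for this note. */
enum { KIND_OTHER, KIND_PTR_TO_ARRAY, KIND_PTR_TO_ELEM };

/* Classify an expression by its static type. */
#define PTR_KIND(e) _Generic( (e),        \
    float (*)[10]: KIND_PTR_TO_ARRAY,     \
    float *:       KIND_PTR_TO_ELEM,      \
    default:       KIND_OTHER )

static float ar[10];

/* Sizes stated relative to element and pointer sizes, so no platform is assumed. */
static_assert( sizeof(ar) == 10 * sizeof(float), "array holds 10 elements" );
static_assert( sizeof(&ar) == sizeof(float (*)[10]), "pointer to array" );
static_assert( sizeof(ar[0]) == sizeof(float), "first element" );
static_assert( sizeof(&(ar[0])) == sizeof(float *), "pointer to first element" );

/* &ar and &ar[0] have different types, which is why x = y is ill-typed. */
int kind_of_addr_array( void ) { return PTR_KIND( &ar ); }
int kind_of_addr_first( void ) { return PTR_KIND( &(ar[0]) ); }

/* The two pointers advance by different strides under pointer arithmetic. */
size_t stride_ptr_to_array( void ) {
    return (size_t)( (char *)(&ar + 1) - (char *)&ar );
}
size_t stride_ptr_to_elem( void ) {
    return (size_t)( (char *)(&(ar[0]) + 1) - (char *)&(ar[0]) );
}
```

The stride functions give a runtime view of the same type difference: stepping a pointer-to-array by one skips the whole array, while stepping a pointer-to-element skips one `float`.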