
-              rb7921d8
+              rad9f593
 The conjoining of pointers and arrays could also be applied to structures, where a pointer references a structure field like an array element.
 Finally, while subscripting involves pointer arithmetic (as does a field reference @x.y.z@), the computation is complex for multi-dimensional arrays and requires array descriptors to know stride lengths along dimensions.
+Many C errors result from manually performing pointer arithmetic instead of using language subscripting, letting the compiler performs any arithmetic;
+some C textbooks erroneously suggest manual pointer arithmetic is faster than subscripting.
+Many C errors result from manually performing pointer arithmetic instead of using language subscripting, letting the compiler perform any arithmetic.
+Some C textbooks erroneously suggest manual pointer arithmetic is faster than subscripting.
 A sound and efficient C program does not require explicit pointer arithmetic.
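As a small illustration (a sketch only; the names @sum_subscript@ and @sum_pointer@ are hypothetical and do not come from the thesis sources), the following C fragment computes the same sum twice, once by subscripting and once by manual pointer arithmetic. It makes no claim about relative speed; it only shows that the subscripted form expresses the same computation while leaving the address arithmetic to the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Subscripting: the compiler derives the address arithmetic for a[i]. */
float sum_subscript(const float a[], size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i += 1) {
        total += a[i];
    }
    return total;
}

/* Manual pointer arithmetic: same result, but the programmer now owns
   the start pointer, the end pointer, and the step. */
float sum_pointer(const float *a, size_t n) {
    float total = 0.0f;
    for (const float *p = a; p != a + n; p += 1) {
        total += *p;
    }
    return total;
}
```

Both functions accept the same arguments, so a caller can swap one for the other without changing the call site.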
+C semantics want a programmer to \emph{believe} an array variable is a ``pointer to its first element.''
+TODO: provide an example, explain the belief, and give modern refutation
+C semantics wants a programmer to \emph{believe} an array variable is a ``pointer to its first element.''
 This desire becomes apparent by a detailed inspection of an array declaration.
 \lstinput{34-34}{bkgd-carray-arrty.c}
 The inspection begins by using @sizeof@ to provide program semantics for the intuition of an expression's type.
+An architecture with 64-bit pointer size is used, to keep irrelevant details fixed.
 \lstinput{35-36}{bkgd-carray-arrty.c}
 Now consider the @sizeof@ expressions derived from @ar@, modified by adding pointer-to and first-element (and including unnecessary parentheses to avoid any confusion about precedence).
 \lstinput{37-40}{bkgd-carray-arrty.c}
 Given that arrays are contiguous and the size of @float@ is 4, then the size of @ar@ with 10 floats being 40 bytes is common reasoning for C programmers.
 Equally, C programmers know the size of a pointer to the first array element is 8 (or 4 depending on the addressing architecture).
+Equally, C programmers know the size of a pointer to the first array element is 8.
 % Now, set aside for a moment the claim that this first assertion is giving information about a type.
 Clearly, an array and a pointer to its first element are different.
+In fact, the idea that there is such a thing as a pointer to an array may be surprising and it is not the same thing as a pointer to the first element.
+In fact, the idea that there is such a thing as a pointer to an array may be surprising.
+It is not the same thing as a pointer to the first element.
 \lstinput{42-45}{bkgd-carray-arrty.c}
 The first assignment generates:
 …
 Using the type form yields the same results as the prior expression form.
 \lstinput{46-49}{bkgd-carray-arrty.c}
+The results are also the same when there is no allocation using a pointer-to-array type.
+The results are also the same when there is no allocation at all.
+This time, starting from a pointer-to-array type:
 \lstinput{51-57}{bkgd-carray-arrty.c}
 Hence, in all cases, @sizeof@ is informing about type information.
+Hence, in all cases, @sizeof@ is reporting on type information.
 Therefore, thinking of an array as a pointer to its first element is too simplistic an analogue and it is not backed up by the type system.
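The distinctions drawn above can be restated in a few lines of plain C. This is a minimal sketch, not the contents of @bkgd-carray-arrty.c@; the helper names are hypothetical, and the comparisons are written against @sizeof@ of the corresponding types rather than against fixed numbers, so no pointer width is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the array discussed in the text. */
static float ar[10];

/* Size of the array object itself: 10 contiguous floats. */
size_t size_of_array(void) { return sizeof(ar); }

/* Size of a pointer to the whole array; its type is float (*)[10]. */
size_t size_of_ptr_to_array(void) { return sizeof(&ar); }

/* Size of a pointer to the first element; its type is float *. */
size_t size_of_ptr_to_first(void) { return sizeof(&ar[0]); }

/* Dereferencing the pointer-to-array gives back the array type. */
size_t size_of_deref_ptr_to_array(void) { return sizeof(*&ar); }
```

The array's size differs from that of either pointer, while dereferencing the pointer-to-array recovers the full array size.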
 …
 \subsection{Array Parameter Declaration}
+C has a formal and actual declaration for functions to allow definition-before-use and separate compilation, where formal describes a type and an actual defines the type.
+\begin{cfa}
+int foo( int, float, char );                            $\C{// formal, parameter names option}$
+int foo( int i, float f, char c ) { ... }       $\C{// actual}$
+\end{cfa}
+For array parameters, a formal parameter array declaration can specify the first dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@:
+Passing an array along with a function call is obviously useful.
+Let us say that a parameter is an array parameter when the called function intends to subscript it.
+This section asserts that a more satisfactory/formal characterization does not exist in C, surveys the ways that C API authors communicate ``@p@ has zero or more @T@s,'' and calls out the minority cases where the C type system is using or verifying such claims.
+A C function's parameter declarations look different from the caller's and callee's perspectives.
+Both perspectives consist of the text read by a programmer and the semantics enforced by the type system.
+The caller's perspective is available from a mere function declaration (which allows definition-before-use and separate compilation), but can also be read from (the non-body part of) a function definition.
+The callee's perspective is what is available inside the function.
+\begin{cfa}
+        int foo( int, float, char );                            $\C{// declaration, names optional}$
+        int bar( int i, float f, char c ) {             $\C{// definition, names mandatory}$
+                $/* caller's perspective of foo's; callee's perspective of bar's */$
+                ...
+        }
+        $/* caller's perspectives of foo's and bar's */$
+\end{cfa}
+The caller's perspective is more limited.
+The example shows, so far, that parameter names (by virtue of being optional) are really comments in the caller's perspective, while they are semantically significant in the callee's perspective.
+Array parameters introduce a further, subtle, semantic difference and considerable freedom to comment.
+At the semantic level, there is no such thing as an array parameter, except for one case (@T[static 5]@) discussed shortly.
+Rather, there are only pointer parameters.
+This fact probably shares considerable responsibility for the common sense of ``an array is just a pointer,'' which has been refuted in non-parameter contexts.
+This fact holds in both the caller's and callee's perspectives.
+However, a parameter's type can include ``array of.''
+For example, the type ``pointer to array of 5 ints'' (@T(*)[5]@) is a pointer type, a fully meaningful parameter type (in the sense that this description does not contain any information that the type system ignores), and a type that appears the same in the caller's \vs callee's perspectives.
+The outermost type constructor (syntactically first dimension) is really the one that determines the flavour of parameter.
+\begin{figure}
 \begin{cquote}
 \begin{tabular}{@{}llll@{}}
 \begin{cfa}
+double sum( double [5] );
+double sum( double *[5] );
+float sum( float a[5] );
+float sum( float a[5][4] );
+float sum( float a[5][] );
+float sum( float a[5]* );
+float sum( float *a[5] );
 \end{cfa}
+&
 \begin{cfa}
+double sum( double [ ] );
+double sum( double *[ ] );
+float sum( float a[] );
+float sum( float a[][4] );
+float sum( float a[][] );
+float sum( float a[]* );
+float sum( float *a[] );
 \end{cfa}
+&
 \begin{cfa}
+double sum( double * );
+double sum( double ** );
+float sum( float *a );
+float sum( float (*a)[4] );
+float sum( float (*a)[] );
+float sum( float (*a)* );
+float sum( float **a );
 \end{cfa}
+&
 \begin{cfa}
+// array
+// matrix
+// ar of float
+// mat of float
+// invalid
+// invalid
+// ar of ptr to float
 \end{cfa}
 \end{tabular}
 \end{cquote}
+Good practice uses the middle form as it clearly indicates the parameter is subscripted.
+However, an actual declaration cannot use @[ ]@;
+it must use @*@.
+\begin{cfa}
+double sum( double v[ ] ) {                                     $\C{// formal declaration}$
+double * cv;                                                            $\C{// actual declaration, think cv[ ]}$
+sum( cv );                                                                      $\C{// address assignment v = cv}$
+\end{cfa}
+Given the formal dimension forms @[ ]@ or @[5]@, it raises the question of qualifying the implicit array pointer rather than the array element type.
+\caption{Multiple ways to declare an array parameter.  Across a valid row, every declaration is equivalent.  Each column gives a declaration style.  Really, the style can be read from the first row only.  The second row shows how the style extends to multiple dimensions, with the rows thereafter providing context for the choice of which second-row \lstinline{[]} receives the column-style variation.}
+\label{f:ArParmEquivDecl}
+\end{figure}
+Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment.
+An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}.  The rationale for rejecting the first ``invalid'' row follows shortly, while the second ``invalid'' row is simple nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
+In the leftmost style, the typechecker ignores the actual value in most practical cases.
+This value is allowed to be a dynamic expression, so it is \emph{possible} to use the leftmost style in many practical cases.
+% To help contextualize the matrix part of this example, the syntaxes @float [5][]@, @float [][]@ and @float (*)[]@ are all rejected, for reasons discussed shortly.
+% So are @float[5]*@, @float[]*@ and @float (*)*@.  These latter ones are simply nonsense, though they hint at ``1d array of pointers'', whose equivalent syntax options are, @float *[5]@, @float *[]@, and @float **@.
+It is a matter of taste as to whether a programmer should use a form as far left as possible (getting the most out of syntactically integrated comments), stick to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromise in the middle (reducing unchecked information, yet clearly stating, ``I will subscript this one'').
+Note that this equivalence of pointer and array declarations is special to parameters.
+It does not apply to local variables, where true array declarations are possible.
+\begin{cfa}
+void f( float * a ) {
+        float * b = a; // ok
+        float c[] = a; // reject
+        float d[] = { 1.0, 2.0, 3.0 }; // ok
+        static_assert( sizeof(b) == sizeof(float*) );
+        static_assert( sizeof(d) != sizeof(float*) );
+}
+\end{cfa}
+This equivalence has the consequence that the type system does not help a caller get it right.
+\begin{cfa}
+float sum( float v[] );
+float arg = 3.14;
+sum( &arg );                                                            $\C{// accepted, v := \&arg}$
+\end{cfa}
+The syntactic dimension forms @[ ]@ and @[5]@ raise the question of how to qualify the implied array pointer rather than the array element type.
 For example, the qualifiers after the @*@ apply to the array pointer.
 \begin{cfa}
 …
 void foo( const volatile int [ ] @const volatile@ ); // does not parse
 \end{cfa}
 C addressed this shortcoming by moving the pointer qualifiers into the first dimension.
+C instead puts these pointer qualifiers syntactically into the first dimension.
 \begin{cquote}
 @[@ \textit{type-qualifier-list}$_{opt}$ \textit{assignment-expression}$_{opt}$ @]@
 …
 \end{cfa}
 To make the first formal dimension size meaningful, C adds this form.
+To make the first dimension size meaningful, C adds this form.
 \begin{cquote}
 @[@ @static@ \textit{type-qualifier-list}$_{opt}$ \textit{assignment-expression} @]@
 …
 \end{cfa}
 Here, the @static@ storage qualifier defines the minimum array size for its argument.
+@gcc@ ignores this dimension qualifier, \ie it gives no warning if the argument array size is less than the parameter minimum.
+Finally, to handle VLAs, C repurposed the @*@ \emph{within} the dimension in the formal declaration context to mean the argument must be a VLA (contiguous).
+@gcc@ ignores this dimension qualifier, \ie it gives no warning if the argument array size is less than the parameter minimum.  However, @clang@ implements the check, in accordance with the standard.  TODO: be specific about versions
+Note that there are now two different meanings for modifiers in the same position.  In
+\begin{cfa}
+void foo( int x[static const volatile 3] );
+\end{cfa}
+the @static@ applies to the 3, while the @const volatile@ applies to the @x@.
+With multidimensional arrays, on dimensions after the first, a size is required and is not ignored.
+These sizes are required for the callee to be able to subscript.
+\begin{cfa}
+void f( float a[][10], float b[][100] ) {
+    static_assert( ((char*)&a[1]) - ((char*)&a[0]) == 10 * sizeof(float) );
+    static_assert( ((char*)&b[1]) - ((char*)&b[0]) == 100 * sizeof(float) );
+}
+\end{cfa}
+Here, the distance between the first and second elements of each array depends on the inner dimension size.
+The last observation is a fact of the callee's perspective.
+There is little type-system checking, in the caller's perspective, that what is being passed matches.
+\begin{cfa}
+void f( float [][10] );
+int n = 100;
+float a[100], b[n];
+f(&a); // reject
+f(&b); // accept
+\end{cfa}
+This size is, therefore, a callee's assumption.
+Finally, to handle higher-dimensional VLAs, C repurposed the @*@ \emph{within} the dimension in a declaration to mean that the callee will have to make an assumption about the size here, but no (unchecked, possibly wrong) information about this assumption is included for the caller-programmer's benefit/overconfidence.
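The callee-side dependence of row distance on the inner dimension size can also be observed at run time in plain C. The following is a sketch under stated assumptions: the function names @row_stride_10@ and @row_stride_100@ are hypothetical, and a run-time comparison is used because a pointer difference involving a parameter is not an integer constant expression in standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance between a[0] and a[1] when each row holds 10 floats. */
ptrdiff_t row_stride_10(float a[][10]) {
    return (char *)&a[1] - (char *)&a[0];
}

/* Byte distance between b[0] and b[1] when each row holds 100 floats. */
ptrdiff_t row_stride_100(float b[][100]) {
    return (char *)&b[1] - (char *)&b[0];
}
```

Each function trusts the inner dimension written in its own parameter list; nothing in the body consults the argument's actual shape.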
 \begin{cquote}
 @[@ \textit{type-qualifier-list$_{opt}$} @* ]@
 \end{cquote}
 \begin{cfa}
+void foo( int [@*@][@*@] );                                     $\C{// formal}$
+void foo( int ar[10][10] ) { ... }                      $\C{// actual}$
+int ar[2][10];                                                          $\C{// contiguous}$
+foo( ar );                                                                      $\C{// valid}$
+int * arp[10];                                                          $\C{// non-contiguous}$
+foo( arp );                                                                     $\C{// invalid}$
+\end{cfa}
+This syntactic form for the formal prototype means the header file does not have to commit to specific dimension values, but the compiler knows the argument is a contiguous array.
+void foo( float [][@*@] );                                              $\C{// declaration}$
+void foo( float a[][10] ) { ... }                               $\C{// definition}$
+\end{cfa}
+Repeating it with the full context of a VLA is useful:
+\begin{cfa}
+void foo( int, float [][@*@] );                                 $\C{// declaration}$
+void foo( int n, float a[][n] ) { ... }                 $\C{// definition}$
+\end{cfa}
+Omitting the dimension from the declaration is consistent with omitting parameter names, for the declaration case has no name @n@ in scope.
+The omission is also redacting all information not needed to generate correct caller-side code.
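The declaration and definition pairing above can be exercised in a complete, if small, C fragment. This is a sketch assuming a compiler with VLA support (optional since C11); the function name @total@ and the extra @rows@ parameter are hypothetical additions for illustration.

```c
#include <assert.h>

/* Declaration: [*] marks an inner dimension whose size is not named here. */
float total( int, int, float [][*] );

/* Definition: the inner dimension is the parameter n, so a[r][c] is
   subscripted with a row length known only at run time. */
float total( int rows, int n, float a[][n] ) {
    float sum = 0.0f;
    for (int r = 0; r < rows; r += 1) {
        for (int c = 0; c < n; c += 1) {
            sum += a[r][c];
        }
    }
    return sum;
}
```

A caller passes the dimensions explicitly, for example @total( 2, 3, m )@ for a @float m[2][3]@; the type system does not verify that the passed @n@ matches the argument's real row length.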

Changeset ad9f593 for doc/theses


doc/theses/mike_brooks_MMath/background.tex
