source: doc/theses/mike_brooks_MMath/background.tex@ 30aab55

Last change on this file since 30aab55 was b64d0f4, checked in by Peter A. Buhr <pabuhr@…>, 19 months ago
second attempt changing program-input style
1	\chapter{Background}
2
3	This chapter states facts about the prior work, upon which my contributions build.
4	Each receives a justification of the extent to which its statement is phrased to provoke controversy or surprise.
5
6	\section{C}
7
8	\subsection{Common knowledge}
9
10	The reader is assumed to have used C or \CC for the coursework of at least four university-level courses, or have equivalent experience.
11	The current discussion introduces facts of which such a functioning novice may be unaware.
12
13	% TODO: decide if I'm also claiming this collection of facts, and test-oriented presentation is a contribution; if so, deal with (not) arguing for its originality
14
15	\subsection{Convention: C is more touchable than its standard}
16
17	When it comes to explaining how C works, I like illustrating definite program semantics.
18	I prefer doing so over quoting a manual's suggested programmer's intuition, or showing how some compiler writers chose to model their problem.
19	To illustrate definite program semantics, I devise a program, whose behaviour exercises the point at issue, and I show its behaviour.
20
21	This behaviour is typically one of
22	\begin{itemize}
23	\item my statement that the compiler accepts or rejects the program
24	\item the program's printed output, which I show
25	\item my implied assurance that its assertions do not fail when run
26	\end{itemize}
27
28	The compiler whose program semantics is shown is
29	\begin{cfa}
30	$ gcc --version
31	gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
32	\end{cfa}
33	running on Architecture @x86_64@, with the same environment targeted.
34
35	Unless explicit discussion ensues about differences among compilers or with (versions of) the standard, it is further implied that there exists a second version of GCC and some version of Clang, running on and for the same platform, that give substantially similar behaviour.
36	In this case, I do not argue that my sample of major Linux compilers is doing the right thing with respect to the C standard.
37
38
39	\subsection{C reports many ill-typed expressions as warnings}
40
41	These attempts to assign @y@ to @x@ and vice-versa are obviously ill-typed.
42	\lstinput{12-15}{bkgd-c-tyerr.c}
43	with warnings:
44	\begin{cfa}
45	warning: assignment to 'float *' from incompatible pointer type 'void (*)(void)'
46	warning: assignment to 'void (*)(void)' from incompatible pointer type 'float *'
47	\end{cfa}
48	Similarly,
49	\lstinput{17-19}{bkgd-c-tyerr.c}
50	with warning:
51	\begin{cfa}
52	warning: passing argument 1 of 'f' from incompatible pointer type
53	note: expected 'void (*)(void)' but argument is of type 'float *'
54	\end{cfa}
55	with a segmentation fault at runtime.
56
57	That @f@'s attempt to call @g@ fails is not due to 3.14 being a particularly unlucky choice of value to put in the variable @pi@.
58	Rather, it is because obtaining a program that includes this essential fragment, yet exhibits a behaviour other than ``doomed to crash,'' is a matter for an obfuscated coding competition.
59
60	A ``tractable syntactic method for proving the absence of certain program behaviours by classifying phrases according to the kinds of values they compute''*1 rejected the program.
61	The behaviour (whose absence is unprovable) is neither minor nor unlikely.
62	The rejection shows that the program is ill-typed.
63
64	Yet, the rejection presents as a GCC warning.
65
66	In the discussion following, ``ill-typed'' means giving a nonzero @gcc -Werror@ exit condition with a message that discusses typing.
67
68	*1 TAPL-pg1 definition of a type system
69
70
71	\section{C Arrays}
72
73	\subsection{C has an array type (!)}
74
75	When a programmer works with an array, C semantics provide access to a type that is different in every way from ``pointer to its first element.''
76	Its qualities become apparent by inspecting the declaration
77	\lstinput{34-34}{bkgd-carray-arrty.c}
78	The inspection begins by using @sizeof@ to provide definite program semantics for the intuition of an expression's type.
79	Assuming a target platform keeps things concrete:
80	\lstinput{35-36}{bkgd-carray-arrty.c}
81	Consider the sizes of expressions derived from @a@, modified by adding ``pointer to'' and ``first element'' (and including unnecessary parentheses to avoid confusion about precedence).
82	\lstinput{37-40}{bkgd-carray-arrty.c}
83	That @a@ takes up 40 bytes is common reasoning for C programmers.
84	Set aside for a moment the claim that this first assertion is giving information about a type.
85	For now, note that an array and a pointer to its first element are, sometimes, different things.
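The size comparison above can be sketched as a small stand-alone C fragment.
It is only an illustration, not the listing from @bkgd-carray-arrty.c@; the helper names are hypothetical, and the concrete values (40, 4, 8, 8) hold only for the x86\_64 target assumed earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative object; the helper names below are hypothetical. */
float a[10];

/* Each helper reports the size of one expression derived from a. */
size_t size_of_array(void)        { return sizeof(a); }      /* float[10]     */
size_t size_of_first_elem(void)   { return sizeof(a[0]); }   /* float         */
size_t size_of_ptr_to_first(void) { return sizeof(&a[0]); }  /* float *       */
size_t size_of_ptr_to_array(void) { return sizeof(&a); }     /* float (*)[10] */
```

Calling these helpers from any test driver shows the array reporting ten elements' worth of bytes, while both pointer forms report only the size of a pointer.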
86
87	The idea that there is such a thing as a pointer to an array may be surprising.
88	It is not the same thing as a pointer to the first element:
89	\lstinput{42-45}{bkgd-carray-arrty.c}
90	The first gets
91	\begin{cfa}
92	warning: assignment to `float (*)[10]' from incompatible pointer type `float *'
93	\end{cfa}
94	and the second gets the opposite.
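A further way to see the difference, sketched here under the assumption of a file-scope array @a@ and hypothetical helper names, is that the two pointers hold the same address yet step by different amounts under pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

float a[10];

/* Both pointers refer to the same address... */
int same_address(void) {
    float (*pa)[10] = &a;      /* pointer to the whole array   */
    float *pe = &a[0];         /* pointer to the first element */
    return (void *)pa == (void *)pe;
}

/* ...but adding one advances by the size of the pointed-to type. */
ptrdiff_t stride_of_ptr_to_array(void) {
    float (*pa)[10] = &a;
    return (char *)(pa + 1) - (char *)pa;   /* sizeof(float[10]) */
}

ptrdiff_t stride_of_ptr_to_elem(void) {
    float *pe = &a[0];
    return (char *)(pe + 1) - (char *)pe;   /* sizeof(float) */
}
```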
95
96	We now refute a concern that @sizeof(a)@ is reporting on special knowledge from @a@ being a local variable,
97	say that it is informing about an allocation, rather than simply a type.
98
99	First, recognizing that @sizeof@ has two forms, one operating on an expression, the other on a type, we observe that the original answers are unaffected by using the type-parameterized form:
100	\lstinput{46-50}{bkgd-carray-arrty.c}
101	Finally, the same sizing is reported when there is no allocation at all, and we launch the analysis instead from the pointer-to-array type.
102	\lstinput{51-57}{bkgd-carray-arrty.c}
103	So, in spite of considerable programmer success enabled by an understanding that an array is just a pointer to its first element (revisited TODO pointer decay), this understanding is simplistic.
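The two steps of this argument can be sketched without any array object at all.
The fragment below is an illustration with hypothetical helper names, not the thesis listing; it relies on @sizeof@ leaving its operand unevaluated when the operand is not variable-length.

```c
#include <assert.h>
#include <stddef.h>

/* Type-parameterized sizeof: no object is involved. */
size_t size_of_array_type(void)        { return sizeof(float[10]); }
size_t size_of_ptr_to_array_type(void) { return sizeof(float (*)[10]); }

/* Launching from a pointer-to-array that points at nothing:
   sizeof does not evaluate *pa, so no dereference takes place. */
size_t size_via_null_ptr_to_array(void) {
    float (*pa)[10] = NULL;
    return sizeof(*pa);
}
```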
104
105	A shortened form for declaring local variables exists, provided that length information is given in the initializer:
106	\lstinput{59-63}{bkgd-carray-arrty.c}
107	In these declarations, the resulting types are both arrays, but their lengths are inferred.
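A minimal sketch of length inference, using hypothetical helper names and initializers chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* The initializer list has three items, so x has type float[3]. */
size_t inferred_float_len(void) {
    float x[] = { 1.0f, 2.0f, 3.0f };
    return sizeof(x) / sizeof(x[0]);
}

/* The string literal has five characters plus '\0', so s has type char[6]. */
size_t inferred_char_len(void) {
    char s[] = "hello";
    return sizeof(s);
}
```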
108
109	\begin{tabular}{lllllll}
110	@float x;@ & $\rightarrow$ & (base element) & @float@ & @float x;@ & @[ float ]@ & @[ float ]@ \\
111	@float * x;@ & $\rightarrow$ & pointer & @float *@ & @float * x;@ & @[ * float ]@ & @[ * float ]@ \\
112	@float x[10];@ & $\rightarrow$ & array & @float[10]@ & @float x[10];@ & @[ [10] float ]@ & @[ array(float, 10) ]@ \\
113	@float * x[10];@ & $\rightarrow$ & array of pointers & @float *[10]@ & @float * x[10];@ & @[ [10] * float ]@ & @[ array(*float, 10) ]@ \\
114	@float (*x)[10];@ & $\rightarrow$ & pointer to array & @float (*)[10]@ & @float (*x)[10];@ & @[ * [10] float ]@ & @[ * array(float, 10) ]@ \\
115	@float * (*x5)[10];@ & $\rightarrow$ & pointer to array of pointers & @float *(*)[10]@ & @float * (*x)[10];@ & @[ * [10] * float ]@ & @[ * array(*float, 10) ]@
116	\end{tabular}
117	\begin{cfa}
118	x5 = (float*(*)[10]) x4;
119	// x5 = (float(*)[10]) x4; // wrong target type; meta test suggesting above cast uses correct type
120
121	// [here]
122	// const
123
124	// [later]
125	// static
126	// star as dimension
127	// under pointer decay: int p1[const 3] being int const *p1
128
129	const float * y1;
130	float const * y2;
131	float * const y3;
132
133	y1 = 0;
134	y2 = 0;
135	// y3 = 0; // bad
136
137	// *y1 = 3.14; // bad
138	// *y2 = 3.14; // bad
139	*y3 = 3.14;
140
141	const float z1 = 1.414;
142	float const z2 = 1.414;
143
144	// z1 = 3.14; // bad
145	// z2 = 3.14; // bad
146
147
148	}
149
150	#define T float
151	void stx2() { const T x[10];
152	// x[5] = 3.14; // bad
153	}
154	void stx3() { T const x[10];
155	// x[5] = 3.14; // bad
156	}
157	\end{cfa}
158
159	My contribution is enabled by recognizing
160	\begin{itemize}
161	\item There is value in using a type that knows how big the whole thing is.
162	\item The type pointer to (first) element does not.
163	\item C \emph{has} a type that knows the whole picture: array, e.g. @T[10]@.
164	\item This type has all the usual derived forms, which also know the whole picture. A usefully noteworthy example is pointer to array, e.g. @T(*)[10]@.
165	\end{itemize}
166
167	Each of these sections, which introduces another layer of the C arrays' story,
168	concludes with an \emph{Unfortunate Syntactic Reference}.
169	It shows how to spell the types under discussion,
170	along with interactions with orthogonal (but easily confused) language features.
171	Alternate spellings are listed within a row.
172	The simplest occurrences of types distinguished in the preceding discussion are marked with $\triangleright$.
173	The Type column gives the spelling used in a cast or error message (though note Section TODO points out that some types cannot be cast to).
174	The Declaration column gives the spelling used in an object declaration, such as variable or aggregate member; parameter declarations (section TODO) follow entirely different rules.
175
176	After all, reading a C array type is easy: just read it from the inside out, and know when to look left and when to look right!
177
178
179	\CFA-specific spellings (not yet introduced) are also included here for referenceability; these can be skipped on linear reading.
180	The \CFA-C column gives the, more fortunate, ``new'' syntax of section TODO, for spelling \emph{exactly the same type}.
181	This fortunate syntax does not have different spellings for types vs declarations;
182	a declaration is always the type followed by the declared identifier name;
183	for the example of letting @x@ be a \emph{pointer to array}, the declaration is spelled:
184	\begin{cfa}
185	[ * [10] T ] x;
186	\end{cfa}
187	The \CFA-Full column gives the spelling of a different type, introduced in TODO, which has all of my contributed improvements for safety and ergonomics.
188
189	\noindent
190	\textbf{Unfortunate Syntactic Reference}
191
192	\begin{figure}
193	\centering
194	\setlength{\tabcolsep}{3pt}
195	\begin{tabular}{llllll}
196	& Description & Type & Declaration & \CFA-C & \CFA-Full \\ \hline
197	$\triangleright$ & val.
198	& @T@
199	& @T x;@
200	& @[ T ]@
201	&
202	\\ \hline
203	& \pbox{20cm}{ \vspace{2pt} val.\\ \footnotesize{no writing the val.\ in \lstinline{x}} }\vspace{2pt}
204	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T} \\ \lstinline{T const} }
205	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T x;} \\ \lstinline{T const x;} }
206	& @[ const T ]@
207	&
208	\\ \hline \hline
209	$\triangleright$ & ptr.\ to val.
210	& @T *@
211	& @T * x;@
212	& @[ * T ]@
213	&
214	\\ \hline
215	& \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}} }\vspace{2pt}
216	& @T * const@
217	& @T * const x;@
218	& @[ const * T ]@
219	&
220	\\ \hline
221	& \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}} }\vspace{2pt}
222	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *} }
223	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x;} \\ \lstinline{T const * x;} }
224	& @[ * const T ]@
225	&
226	\\ \hline \hline
227	$\triangleright$ & ar.\ of val.
228	& @T[10]@
229	& @T x[10];@
230	& @[ [10] T ]@
231	& @[ array(T, 10) ]@
232	\\ \hline
233	& \pbox{20cm}{ \vspace{2pt} ar.\ of val.\\ \footnotesize{no writing the val.\ in \lstinline{x[5]}} }\vspace{2pt}
234	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T[10]} \\ \lstinline{T const[10]} }
235	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T x[10];} \\ \lstinline{T const x[10];} }
236	& @[ [10] const T ]@
237	& @[ const array(T, 10) ]@
238	\\ \hline
239	& ar.\ of ptr.\ to val.
240	& @T*[10]@
241	& @T *x[10];@
242	& @[ [10] * T ]@
243	& @[ array(* T, 10) ]@
244	\\ \hline
245	& \pbox{20cm}{ \vspace{2pt} ar.\ of ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x[5]}} }\vspace{2pt}
246	& @T * const [10]@
247	& @T * const x[10];@
248	& @[ [10] const * T ]@
249	& @[ array(const * T, 10) ]@
250	\\ \hline
251	& \pbox{20cm}{ \vspace{2pt} ar.\ of ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*(x[5])}} }\vspace{2pt}
252	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T * [10]} \\ \lstinline{T const * [10]} }
253	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x[10];} \\ \lstinline{T const * x[10];} }
254	& @[ [10] * const T ]@
255	& @[ array(* const T, 10) ]@
256	\\ \hline \hline
257	$\triangleright$ & ptr.\ to ar.\ of val.
258	& @T(*)[10]@
259	& @T (*x)[10];@
260	& @[ * [10] T ]@
261	& @[ * array(T, 10) ]@
262	\\ \hline
263	& \pbox{20cm}{ \vspace{2pt} ptr.\ to ar.\ of val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}} }\vspace{2pt}
264	& @T(* const)[10]@
265	& @T (* const x)[10];@
266	& @[ const * [10] T ]@
267	& @[ const * array(T, 10) ]@
268	\\ \hline
269	& \pbox{20cm}{ \vspace{2pt} ptr.\ to ar.\ of val.\\ \footnotesize{no writing the val.\ in \lstinline{(*x)[5]}} }\vspace{2pt}
270	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T(*)[10]} \\ \lstinline{T const (*) [10]} }
271	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T (*x)[10];} \\ \lstinline{T const (*x)[10];} }
272	& @[ * [10] const T ]@
273	& @[ * const array(T, 10) ]@
274	\\ \hline
275	& ptr.\ to ar.\ of ptr.\ to val.
276	& @T*(*)[10]@
277	& @T *(*x)[10];@
278	& @[ * [10] * T ]@
279	& @[ * array(* T, 10) ]@
280	\\ \hline
281	\end{tabular}
282	\caption{Figure}
283	\end{figure}
284
285
286	\subsection{Arrays decay and pointers diffract}
287
288	The last section established the difference between these four types:
289	\lstinput{3-6}{bkgd-carray-decay.c}
290	But the expression used for obtaining the pointer to the first element is pedantic.
291	The root of all C programmer experience with arrays is the shortcut
292	\lstinput{8-8}{bkgd-carray-decay.c}
293	which reproduces @pa0@, in type and value:
294	\lstinput{9-9}{bkgd-carray-decay.c}
295	The validity of this initialization is unsettling, in the context of the facts established in the last section.
296	Notably, it initializes name @pa0x@ from expression @a@, when they are not of the same type:
297	\lstinput{10-10}{bkgd-carray-decay.c}
298
299	So, C provides an implicit conversion from @float[10]@ to @float*@, as described in ARM-6.3.2.1.3:
300	\begin{quote}
301	Except when it is the operand of the @sizeof@ operator, or the unary @&@ operator, or is a
302	string literal used to initialize an array,
303	an expression that has type ``array of type'' is
304	converted to an expression with type ``pointer to type'' that points to the initial element of
305	the array object
306	\end{quote}
307
308	This phenomenon is the famous ``pointer decay,'' which is a decay of an array-typed expression into a pointer-typed one.
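The conversion and two of its exceptions can be sketched as follows.
This is an illustration with hypothetical helper names, assuming a C11 compiler for @_Generic@; it is not the listing from @bkgd-carray-decay.c@.

```c
#include <assert.h>
#include <stddef.h>

float a[10];

/* In an initializer, a decays: the result matches &a[0] in type and value. */
int decay_matches_first_elem(void) {
    float *p = a;
    return p == &a[0];
}

/* As the operand of sizeof, a does not decay. */
int sizeof_sees_whole_array(void) {
    return sizeof(a) == sizeof(float[10]);
}

/* As the operand of unary &, a does not decay either:
   the result is a pointer to array, not a pointer to pointer. */
int addressof_yields_ptr_to_array(void) {
    return _Generic(&a, float (*)[10]: 1, default: 0);
}
```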
309
310	It is worth noting that the list of exception cases does not feature the occurrence of @a@ in @a[i]@.
311	Thus, subscripting happens on pointers, not arrays.
312
313	Subscripting proceeds first with pointer decay, if needed. Next, ARM-6.5.2.1.2 explains that @a[i]@ is treated as if it were @(*((a)+(i)))@.
314	ARM-6.5.6.8 explains that the addition, of a pointer with an integer type, is defined only when the pointer refers to an element that is in an array, with a meaning of ``@i@ elements away from,'' which is valid if @a@ is big enough and @i@ is small enough.
315	Finally, ARM-6.5.3.2.4 explains that the @*@ operator's result is the referenced element.
316
317	Taken together, these rules also happen to illustrate that @a[i]@ and @i[a]@ mean the same thing.
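A short sketch of this equivalence, with a hypothetical helper name and an index assumed to be in bounds:

```c
#include <assert.h>

float a[10];

/* All four spellings designate the same element, for 0 <= i < 10. */
int subscript_forms_agree(int i) {
    return &a[i] == &i[a]
        && &a[i] == &*(a + i)
        && &a[i] == &*(i + a);
}
```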
318
319	Subscripting a pointer when the target is standard-inappropriate is still practically well-defined.
320	While the standard affords a C compiler freedom about the meaning of an out-of-bound access,
321	or of subscripting a pointer that does not refer to an array element at all,
322	the fact that C is famously both generally high-performance, and specifically not bound-checked,
323	leads to an expectation that the runtime handling is uniform across legal and illegal accesses.
324	Moreover, consider the common pattern of subscripting on a malloc result:
325	\begin{cfa}
326	float * fs = malloc( 10 * sizeof(float) );
327	fs[5] = 3.14;
328	\end{cfa}
329	The @malloc@ behaviour is specified as returning a pointer to ``space for an object whose size is'' as requested (ARM-7.22.3.4.2).
330	But the program says \emph{nothing} more about this pointer value, that might cause its referent to \emph{be} an array, before doing the subscript.
331
332	Under this assumption, a pointer being subscripted (or added to, then dereferenced)
333	by any value (positive, zero, or negative), gives a view of the program's entire address space,
334	centred around the @p@ address, divided into adjacent @sizeof(*p)@ chunks,
335	each potentially (re)interpreted as @typeof(*p)@.
336
337	I call this phenomenon ``array diffraction,'' which is a diffraction of a single-element pointer
338	into the assumption that its target is in the middle of an array whose size is unlimited in both directions.
339
340	No pointer is exempt from array diffraction.
341
342	No array shows its elements without pointer decay.
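Diffraction can be sketched with a pointer aimed at one element in the middle of an array.
The helper name is hypothetical, and the offsets used are deliberately kept inside the array so that every access remains defined by the standard.

```c
#include <assert.h>

float a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

/* p targets a single element; subscripting p views the neighbours on
   both sides of it, in sizeof(*p) steps. */
float neighbour(int offset) {
    float *p = &a[5];
    return p[offset];          /* same as *(p + offset) */
}
```

Positive, zero and negative offsets all work the same way; nothing in the pointer's type limits how far the view extends.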
343
344	A further pointer--array confusion, closely related to decay, occurs in parameter declarations.
345	ARM-6.7.6.3.7 explains that when an array type is written for a parameter,
346	the parameter's type becomes a type that I summarize as being the array-decayed type.
347	The respective handlings of the following two parameter spellings show that the array-spelled one is really, like the other, a pointer.
348	\lstinput{12-16}{bkgd-carray-decay.c}
349	As the @sizeof(x)@ meaning changed, compared with when run on a similarly-spelled local variable declaration,
350	GCC also gives this code the warning: ```sizeof' on array function parameter `x' will return size of `float *'.''
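The parameter adjustment can also be observed without triggering that warning, by asking for the type of @&x@ instead of the size of @x@.
The sketch below assumes a C11 compiler for @_Generic@; the function and variable names are hypothetical.

```c
#include <assert.h>

float sample[10];   /* illustrative argument for the calls below */

/* Array-spelled parameter: &x is a float **, so x itself is a float *. */
int array_spelled_param_is_pointer(float x[10]) {
    return _Generic(&x, float **: 1, default: 0);
}

/* Pointer-spelled parameter: the same answer. */
int pointer_spelled_param_is_pointer(float *x) {
    return _Generic(&x, float **: 1, default: 0);
}

/* Contrast: for a similarly spelled local variable, &a is a pointer to array. */
int local_is_array(void) {
    float a[10];
    return _Generic(&a, float (*)[10]: 1, default: 0);
}
```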
351
352	The caller of such a function is left with the reality that a pointer parameter is a pointer, no matter how it's spelled:
353	\lstinput{18-21}{bkgd-carray-decay.c}
354	This fragment gives no warnings.
355
356	The shortened parameter syntax @T x[]@ is a further way to spell ``pointer.''
357	Note the opposite meaning of this spelling now, compared with its use in local variable declarations.
358	This point of confusion is illustrated in:
359	\lstinput{23-30}{bkgd-carray-decay.c}
360	The basic two meanings, with a syntactic difference helping to distinguish,
361	are illustrated in the declarations of @ca@ vs.\ @cp@,
362	whose subsequent @edit@ calls behave differently.
363	The syntax-caused confusion is in the comparison of the first and last lines,
364	both of which use a literal to initialize an object declared with spelling @T x[]@.
365	But these initialized declarations get opposite meanings,
366	depending on whether the object is a local variable or a parameter.
367
368
369	In summary, when a function is written with an array-typed parameter,
370	\begin{itemize}
371	\item an appearance of passing an array by value is always an incorrect understanding
372	\item a dimension value, if any is present, is ignored
373	\item pointer decay is forced at the call site and the callee sees the parameter having the decayed type
374	\end{itemize}
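The first and third points can be sketched as follows, using hypothetical function names: a write made through the array-spelled parameter lands in the caller's array, because only a pointer was passed.

```c
#include <assert.h>

/* Looks like an array received by value, but x is a pointer. */
void zero_first(float x[10]) {
    x[0] = 0.0f;
}

float caller_sees_write(void) {
    float a[10] = { 3.14f };
    zero_first(a);             /* a decays to &a[0] at the call site */
    return a[0];               /* the callee's write is visible here */
}
```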
375
376	Pointer decay does not affect pointer-to-array types, because these are already pointers, not arrays.
377	As a result, a function with a pointer-to-array parameter sees the parameter exactly as the caller does:
378	\lstinput{32-42}{bkgd-carray-decay.c}
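A minimal sketch of the same idea, with hypothetical names and not taken from the listing above: the callee recovers the full array size through its pointer-to-array parameter.

```c
#include <assert.h>
#include <stddef.h>

float a[10];

/* The parameter type keeps the length, so sizeof(*pa) is sizeof(float[10]). */
size_t callee_size(float (*pa)[10]) {
    return sizeof(*pa);
}

int caller_and_callee_agree(void) {
    float (*pa)[10] = &a;
    return sizeof(*pa) == callee_size(pa);
}
```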
379
380	\noindent
381	\textbf{Unfortunate Syntactic Reference}
382
383	\noindent
384	(Parameter declaration; ``no writing'' refers to the callee's ability)
385
386	\begin{figure}
387	\centering
388	\begin{tabular}{llllll}
389	& Description & Type & Param. Decl & \CFA-C \\ \hline
390	$\triangleright$ & ptr.\ to val.
391	& @T *@
392	& \pbox{20cm}{ \vspace{2pt} \lstinline{T * x,} \\ \lstinline{T x[10],} \\ \lstinline{T x[],} }\vspace{2pt}
393	& \pbox{20cm}{ \vspace{2pt} \lstinline{[ * T ]} \\ \lstinline{[ [10] T ]} \\ \lstinline{[ [] T ]} }
394	\\ \hline
395	& \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the ptr.\ in \lstinline{x}} }\vspace{2pt}
396	& @T * const@
397	& \pbox{20cm}{ \vspace{2pt} \lstinline{T * const x,} \\ \lstinline{T x[const 10],} \\ \lstinline{T x[const],} }\vspace{2pt}
398	& \pbox{20cm}{ \vspace{2pt} \lstinline{[ const * T ]} \\ \lstinline{[ [const 10] T ]} \\ \lstinline{[ [const] T ]} }
399	\\ \hline
400	& \pbox{20cm}{ \vspace{2pt} ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{*x}} }\vspace{2pt}
401	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T *} \\ \lstinline{T const *} }
402	& \pbox{20cm}{ \vspace{2pt} \lstinline{const T * x,} \\ \lstinline{T const * x,} \\ \lstinline{const T x[10],} \\ \lstinline{T const x[10],} \\ \lstinline{const T x[],} \\ \lstinline{T const x[],} }\vspace{2pt}
403	& \pbox{20cm}{ \vspace{2pt} \lstinline{[* const T]} \\ \lstinline{[ [10] const T ]} \\ \lstinline{[ [] const T ]} }
404	\\ \hline \hline
405	$\triangleright$ & ptr.\ to ar.\ of val.
406	& @T(*)[10]@
407	& \pbox{20cm}{ \vspace{2pt} \lstinline{T (*x)[10],} \\ \lstinline{T x[3][10],} \\ \lstinline{T x[][10],} }\vspace{2pt}
408	& \pbox{20cm}{ \vspace{2pt} \lstinline{[* [10] T]} \\ \lstinline{[ [3] [10] T ]} \\ \lstinline{[ [] [10] T ]} }
409	\\ \hline
410	& ptr.\ to ptr.\ to val.
411	& @T **@
412	& \pbox{20cm}{ \vspace{2pt} \lstinline{T ** x,} \\ \lstinline{T *x[10],} \\ \lstinline{T *x[],} }\vspace{2pt}
413	& \pbox{20cm}{ \vspace{2pt} \lstinline{[ * * T ]} \\ \lstinline{[ [10] * T ]} \\ \lstinline{[ [] * T ]} }
414	\\ \hline
415	& \pbox{20cm}{ \vspace{2pt} ptr.\ to ptr.\ to val.\\ \footnotesize{no writing the val.\ in \lstinline{**argv}} }\vspace{2pt}
416	& @const char **@
417	& \pbox{20cm}{ \vspace{2pt} \lstinline{const char *argv[],} \\ \footnotesize{(others elided)} }\vspace{2pt}
418	& \pbox{20cm}{ \vspace{2pt} \lstinline{[ [] * const char ]} \\ \footnotesize{(others elided)} }
419	\\ \hline
420	\end{tabular}
421	\caption{Figure}
422	\end{figure}
423
424
425	\subsection{Lengths may vary, checking does not}
426
427	When the desired number of elements is unknown at compile time,
428	a variable-length array is a solution:
429	\begin{cfa}
430	int main( int argc, const char *argv[] ) {
431	assert( argc == 2 );
432	size_t n = atol( argv[1] );
433	assert( 0 < n && n < 1000 );
434
435	float a[n];
436	float b[10];
437
438	// ... discussion continues here
439	}
440	\end{cfa}
441	This arrangement allocates @n@ elements on the @main@ stack frame for @a@, just as it puts 10 elements on the @main@ stack frame for @b@.
442	The variable-sized allocation of @a@ is made in the manner of @alloca@.
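One observable consequence, sketched here with hypothetical helper names, is that @sizeof@ applied to a variable-length array is computed at run time from the length, while for the fixed-length array it remains a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* n is assumed to be positive and small enough for the stack. */
size_t vla_size(size_t n) {
    float a[n];
    return sizeof(a);          /* evaluated at run time: n * sizeof(float) */
}

size_t fixed_size(void) {
    float b[10];
    return sizeof(b);          /* compile-time constant */
}
```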
443
444	In a situation where the array sizes are not known to be small enough for stack allocation to be sensible, corresponding heap allocations are achievable as:
445	\begin{cfa}
446	float *ax1 = malloc( sizeof( float[n] ) );
447	float *ax2 = malloc( n * sizeof( float ) );
448	float *bx1 = malloc( sizeof( float[1000000] ) );
449	float *bx2 = malloc( 1000000 * sizeof( float ) );
450	\end{cfa}
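The two spellings of each request ask for the same number of bytes.
A small sketch with hypothetical helper names, adding the failure check and the @free@ that a complete program would need:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The array-type spelling and the multiplication spelling agree (n > 0). */
int size_spellings_agree(size_t n) {
    return sizeof(float[n]) == n * sizeof(float);
}

/* Allocate n floats on the heap, use the last one, and release the block.
   Returns 1 on success and 0 if malloc fails; n is assumed positive. */
int heap_array_roundtrip(size_t n) {
    float *ax = malloc(sizeof(float[n]));
    if (ax == NULL) return 0;
    ax[n - 1] = 3.14f;
    int ok = (ax[n - 1] == 3.14f);
    free(ax);
    return ok;
}
```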
451
452
453	VLA
454
455	Parameter dependency
456
457	Checking is best-effort / unsound
458
459	Limited special handling to get the dimension value checked (static)
460
461
462
463	\subsection{C has full-service, dynamically sized, multidimensional arrays (and \CC does not)}
464
465	In C and \CC, ``multidimensional array'' means ``array of arrays.'' Other meanings are discussed in TODO.
466
467	Just as an array's element type can be @float@, so can it be @float[10]@.
468
469	While any of @float*@, @float[10]@ and @float(*)[10]@ are easy to tell apart from @float@, telling them apart from each other may need occasional reference back to TODO intro section.
470	The sentence derived by wrapping each type in @-[3]@ follows.
471
472	While any of @float*[3]@, @float[3][10]@ and @float(*)[3][10]@ are easy to tell apart from @float[3]@,
473	telling them apart from each other is what it takes to know what ``array of arrays'' really means.
474
475	Pointer decay affects the outermost array only
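This point can be sketched with an assumed declaration @float a[3][10]@ and hypothetical helper names: the decayed pointer targets a whole row, so the inner dimension is still part of its type.

```c
#include <assert.h>
#include <stddef.h>

float a[3][10];

/* a decays to a pointer to its first element, which is itself an array. */
int outer_decays_to_ptr_to_row(void) {
    float (*row)[10] = a;      /* no cast needed */
    return row == &a[0];
}

/* The inner dimension is kept: stepping the decayed pointer skips a whole row. */
int step_skips_whole_row(void) {
    float (*row)[10] = a;
    return (size_t)((char *)(row + 1) - (char *)row) == sizeof(float[10]);
}
```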
476
477	TODO: unfortunate syntactic reference with these cases:
478
479	\begin{itemize}
480	\item ar. of ar. of val (be sure about ordering of dimensions when the declaration is dropped)
481	\item ptr. to ar. of ar. of val
482	\end{itemize}
483
484
485	\subsection{Arrays are (but) almost values}
486
487	Has size; can point to
488
489	Can't cast to
490
491	Can't pass as value
492
493	Can initialize
494
495	Can wrap in aggregate
496
497	Can't assign
498
499
500	\subsection{Returning an array is (but) almost possible}
501
502
503	\subsection{The pointer-to-array type has been noticed before}
504
505	\subsection{Multi-Dimensional}
506
507	As in the last section, we inspect the declaration ...
508	\lstinput{16-18}{bkgd-carray-mdim.c}
509	The significant axis of deriving expressions from @a@ is now ``itself,'' ``first element'' or ``first grand-element (meaning, first element of first element).''
510	\lstinput{20-44}{bkgd-carray-mdim.c}
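The three levels of this axis can be sketched by their sizes.
The declaration @float a[3][10]@ and the helper names below are assumptions made for illustration; they are not taken from @bkgd-carray-mdim.c@.

```c
#include <assert.h>
#include <stddef.h>

float a[3][10];   /* assumed shape, for illustration only */

size_t size_itself(void)           { return sizeof(a); }        /* float[3][10] */
size_t size_first_elem(void)       { return sizeof(a[0]); }     /* float[10]    */
size_t size_first_grand_elem(void) { return sizeof(a[0][0]); }  /* float        */
```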
511
512
513	\section{\CFA}
514
515	Traditionally, fixing C meant leaving the C-ism alone, while providing a better alternative beside it.
516	(For later: That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.)
517
518	\subsection{\CFA features interacting with arrays}
519
520	Prior work on \CFA included making C arrays, as used in C code from the wild,
521	work, if this code is fed into @cfacc@.
522	The quality of this treatment was fine, with no more or fewer bugs than is typical.
523
524	More mixed results arose with feeding these ``C'' arrays into preexisting \CFA features.
525
526	A notable success was with the \CFA @alloc@ function,
527	in which type information associated with a polymorphic return type
528	replaces @malloc@'s use of programmer-supplied size information.
529	\begin{cfa}
530	// C, library
531	void * malloc( size_t );
532	// C, user
533	struct tm * el1 = malloc( sizeof(struct tm) );
534	struct tm * ar1 = malloc( 10 * sizeof(struct tm) );
535
536	// CFA, library
537	forall( T * ) T * alloc();
538	// CFA, user
539	tm * el2 = alloc();
540	tm (*ar2)[10] = alloc();
541	\end{cfa}
542	The alloc polymorphic return compiles into a hidden parameter, which receives a compiler-generated argument.
543	The compiler's argument generation uses type information from the left-hand side of the initialization to obtain the intended type.
544	Using a compiler-produced value eliminates an opportunity for user error.
545
546	TODO: fix in following: even the alloc call gives bad code gen: verify it was always this way; walk back the wording about things just working here; assignment (rebind) seems to offer workaround, as in bkgd-cfa-arrayinteract.cfa
547
548	Bringing in another \CFA feature, reference types, both resolves a sore spot of the last example, and gives a first example of an array-interaction bug.
549	In the last example, the choice of ``pointer to array'' @ar2@ breaks a parallel with @ar1@.
550	They are not subscripted in the same way.
551	\begin{cfa}
552	ar1[5];
553	(*ar2)[5];
554	\end{cfa}
555	Using ``reference to array'' works at resolving this issue. TODO: discuss connection with Doug-Lea \CC proposal.
556	\begin{cfa}
557	tm (&ar3)[10] = *alloc();
558	ar3[5];
559	\end{cfa}
560	The implicit size communication to @alloc@ still works in the same way as for @ar2@.
561
562	Using proper array types (@ar2@ and @ar3@) addresses a concern about using raw element pointers (@ar1@), albeit a theoretical one.
563	TODO xref C standard does not claim that @ar1@ may be subscripted,
564	because no stage of interpreting the construction of @ar1@ has it be that ``there is an \emph{array object} here.''
565	But both @*ar2@ and the referent of @ar3@ are the results of \emph{typed} @alloc@ calls,
566	where the type requested is an array, making the result, much more obviously, an array object.
567
568	The ``reference to array'' type has its sore spots too.
569	TODO see also @dimexpr-match-c/REFPARAM_CALL@ (under @TRY_BUG_1@)
570
571	TODO: I fixed a bug associated with using an array as a T. I think. Did I really? What was the bug?
