1 | |
---|
2 | TODO: introduce multidimensional array feature and approaches |
---|
3 | |
---|
4 | The new \CFA standard library @array@ datatype supports multidimensional uses more richly than the C array. The new array's multimentsional interface and implementation, follows an array-of-arrays setup, meaning, like C's @float[n][m]@ type, one contiguous object, with coarsely-strided dimensions directly wrapping finely-strided dimensions. This setup is in contrast with the pattern of array of pointers to other allocations representing a sub-array. Beyond what C's type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order. C and C++ require a programmer with such a need to manage pointer/offset arithmetic manually. |
---|
5 | |
---|
6 | Examples are shown using a $5 \times 7$ float array, @a@, loaded with increments of $0.1$ when stepping across the length-7 finely-strided dimension shown on columns, and with increments of $1.0$ when stepping across the length-5 corsely-strided dimension shown on rows. |
---|
7 | \lstinputlisting[language=C, firstline=120, lastline=128]{hello-md.cfa} |
---|
8 | The memory layout of @a@ has strictly increasing numbers along its 35 contiguous positions. |
---|
9 | |
---|
10 | A trivial form of slicing extracts a contiguous inner array, within an array-of-arrays. Like with the C array, a lesser-dimensional array reference can be bound to the result of subscripting a greater-dimensional array, by a prefix of its dimensions. This action first subscripts away the most coaresly strided dimensions, leaving a result that expects to be be subscripted by the more finely strided dimensions. |
---|
11 | \lstinputlisting[language=C, firstline=60, lastline=66]{hello-md.cfa} |
---|
12 | \lstinputlisting[language=C, firstline=140, lastline=140]{hello-md.cfa} |
---|
13 | |
---|
14 | This function declaration is asserting too much knowledge about its parameter @c@, for it to be usable for printing either a row slice or a column slice. Specifically, declaring the parameter @c@ with type @array@ means that @c@ is contiguous. However, the function does not use this fact. For the function to do its job, @c@ need only be of a container type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@), with governed length @N@. The new-array library provides the trait @ix@, so-defined. With it, the original declaration can be generalized, while still implemented with the same body, to the latter declaration: |
---|
15 | \lstinputlisting[language=C, firstline=40, lastline=44]{hello-md.cfa} |
---|
16 | \lstinputlisting[language=C, firstline=145, lastline=145]{hello-md.cfa} |
---|
17 | |
---|
18 | Nontrivial slicing, in this example, means passing a noncontiguous slice to @print1d@. The new-array library provides a ``subscript by all'' operation for this purpose. In a multi-dimensional subscript operation, any dimension given as @all@ is left ``not yet subscripted by a value,'' implementing the @ix@ trait, waiting for such a value. |
---|
19 | \lstinputlisting[language=C, firstline=150, lastline=151]{hello-md.cfa} |
---|
20 | |
---|
21 | The example has shown that @a[2]@ and @a[[2, all]]@ both refer to the same, ``2.*'' slice. Indeed, the various @print1d@ calls under discussion access the entry with value 2.3 as @a[2][3]@, @a[[2,all]][3]@, and @a[[all,3]][2]@. This design preserves (and extends) C array semantics by defining @a[[i,j]]@ to be @a[i][j]@ for numeric subscripts, but also for ``subscripting by all''. That is: |
---|
22 | |
---|
23 | \begin{tabular}{cccccl} |
---|
24 | @a[[2,all]][3]@ & $=$ & @a[2][all][3]@ & $=$ & @a[2][3]@ & (here, @all@ is redundant) \\ |
---|
25 | @a[[all,3]][2]@ & $=$ & @a[all][3][2]@ & $=$ & @a[2][3]@ & (here, @all@ is effective) |
---|
26 | \end{tabular} |
---|
27 | |
---|
28 | Narrating progress through each of the @-[-][-][-]@ expressions gives, firstly, a definition of @-[all]@, and secondly, a generalization of C's @-[i]@. |
---|
29 | |
---|
30 | \noindent Where @all@ is redundant: |
---|
31 | |
---|
32 | \begin{tabular}{ll} |
---|
33 | @a@ & 2-dimensional, want subscripts for coarse then fine \\ |
---|
34 | @a[2]@ & 1-dimensional, want subscript for fine; lock coarse = 2 \\ |
---|
35 | @a[2][all]@ & 1-dimensional, want subscript for fine \\ |
---|
36 | @a[2][all][3]@ & 0-dimensional; lock fine = 3 |
---|
37 | \end{tabular} |
---|
38 | |
---|
39 | \noindent Where @all@ is effective: |
---|
40 | |
---|
41 | \begin{tabular}{ll} |
---|
42 | @a@ & 2-dimensional, want subscripts for coarse then fine \\ |
---|
43 | @a[all]@ & 2-dimensional, want subscripts for fine then coarse \\ |
---|
44 | @a[all][3]@ & 1-dimensional, want subscript for coarse; lock fine = 3 \\ |
---|
45 | @a[all][3][2]@ & 0-dimensional; lock coarse = 2 |
---|
46 | \end{tabular} |
---|
47 | |
---|
48 | The semantics of @-[all]@ is to dequeue from the front of the ``want subscripts'' list and re-enqueue at its back. The semantics of @-[i]@ is to dequeue from the front of the ``want subscripts'' list and lock its value to be @i@. |
---|
49 | |
---|
50 | Contiguous arrays, and slices of them, are all realized by the same underlying parameterized type. It includes stride information in its metatdata. The @-[all]@ operation is a conversion from a reference to one instantiation, to a reference to another instantiation. The running example's @all@-effective step, stated more concretely, is: |
---|
51 | |
---|
52 | \begin{tabular}{ll} |
---|
53 | @a@ & : 5 of ( 7 of float each spaced 1 float apart ) each spaced 7 floats apart \\ |
---|
54 | @a[all]@ & : 7 of ( 5 of float each spaced 7 floats apart ) each spaced 1 float apart |
---|
55 | \end{tabular} |
---|
56 | |
---|
57 | \begin{figure} |
---|
58 | \includegraphics{measuring-like-layout} |
---|
59 | \caption{Visualization of subscripting by value and by \lstinline[language=C,basicstyle=\ttfamily]{all}, for \lstinline[language=C,basicstyle=\ttfamily]{a} of type \lstinline[language=C,basicstyle=\ttfamily]{array( float, Z(5), Z(7) )}. The horizontal dimension represents memory addresses while vertical layout is conceptual.} |
---|
60 | \label{fig:subscr-all} |
---|
61 | \end{figure} |
---|
62 | |
---|
63 | \noindent While the latter description implies overlapping elements, Figure \ref{fig:subscr-all} shows that the overlaps only occur with unused spaces between elements. Its depictions of @a[all][...]@ show the navigation of a memory layout with nontrivial strides, that is, with ``spaced \_ floats apart'' values that are greater or smaller than the true count of valid indeces times the size of a logically indexed element. Reading from the bottom up, the expression @a[all][3][2]@ shows a float, that is masquerading as a @float[7]@, for the purpose of being arranged among its peers; five such occurrences form @a[all][3]@. The tail of flatter boxes extending to the right of a poper element represents this stretching. At the next level of containment, the structure @a[all][3]@ masquerades as a @float[1]@, for the purpose of being arranged among its peers; seven such occurrences form @a[all]@. The verical staircase arrangement represents this compression, and resulting overlapping. |
---|
64 | |
---|
65 | The new-array library defines types and operations that ensure proper elements are accessed soundly in spite of the overlapping. The private @arpk@ structure (array with explicit packing) is generic over these two types (and more): the contained element, what it is masquerading as. This structure's public interface is the @array(...)@ construction macro and the two subscript operators. Construction by @array@ initializes the masquerading-as type information to be equal to the contained-element information. Subscrpting by @all@ rearranges the order of masquerading-as types to achieve, in genernal, nontrivial striding. Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns there element found there, in unmasked form. |
---|
66 | |
---|
67 | The @arpk@ structure and its @-[i]@ operator are thus defined as: |
---|
68 | \begin{lstlisting} |
---|
69 | forall( ztype(N), // length of current dimension |
---|
70 | dtype(S) | sized(S), // masquerading-as |
---|
71 | dtype E_im, // immediate element, often another array |
---|
72 | dtype E_base // base element, e.g. float, never array |
---|
73 | ) { |
---|
74 | struct arpk { |
---|
75 | S strides[z(N)]; // so that sizeof(this) is N of S |
---|
76 | }; |
---|
77 | |
---|
78 | // expose E_im, stride by S |
---|
79 | E_im & ?[?]( arpk(N, S, E_im, E_base) & a, ptrdiff_t i ) { |
---|
80 | return (E_im &) a.strides[i]; |
---|
81 | } |
---|
82 | } |
---|
83 | \end{lstlisting} |
---|
84 | |
---|
85 | An instantion of the @arpk@ generic is given by the @array(E_base, N0, N1, ...)@ exapnsion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array(E_base, N1, ...)@. In the base case, @array(E_base)@ is just @E_base@. Because this construction uses the same value for the generic parameters @S@ and @E_im@, the resulting layout has trivial strides. |
---|
86 | |
---|
87 | Subscripting by @all@, to operate on nontrivial strides, is a dequeue-enqueue operation on the @E_im@ chain, which carries @S@ instatiations, intact, to new positions. Expressed as an operation on types, this rotation is: |
---|
88 | \begin{eqnarray*} |
---|
89 | suball( arpk(N, S, E_i, E_b) ) & = & enq( N, S, E_i, E_b ) \\ |
---|
90 | enq( N, S, E_b, E_b ) & = & arpk( N, S, E_b, E_b ) \\ |
---|
91 | enq( N, S, arpk(N', S', E_i', E_b), E_b ) & = & arpk( N', S', enq(N, S, E_i', E_b), E_b ) |
---|
92 | \end{eqnarray*} |
---|
93 | |
---|