Index: doc/theses/mike_brooks_MMath/array.tex
===================================================================
--- doc/theses/mike_brooks_MMath/array.tex	(revision b166b1c58adcab718a2a7cee11b22b4dcfdfa0a2)
+++ doc/theses/mike_brooks_MMath/array.tex	(revision 77328d0000bb5e2ccf01c5cf7f003fb00520f9a1)
@@ -248,15 +248,139 @@
 \begin{figure}
 \includegraphics{measuring-like-layout}
-\caption{Visualization of subscripting by value and by \lstinline[language=CFA,basicstyle=\ttfamily]{all}, for \lstinline[language=CFA,basicstyle=\ttfamily]{a} of type \lstinline[language=CFA,basicstyle=\ttfamily]{array( float, 5, 7 )}.
-The horizontal dimension represents memory addresses while vertical layout is conceptual.}
+\caption{Visualization of subscripting, by numeric value, and by \lstinline[language=CFA]{all}.
+	Here \lstinline[language=CFA]{x} has type \lstinline[language=CFA]{array( float, 5, 7 )}, understood as 5 rows by 7 columns.
+	The horizontal layout represents contiguous memory addresses while the vertical layout uses artistic license.
+	The vertical shaded band highlights the location of the targeted element, 2.3.
+	Any such vertical contains various interpretations of a single address.}
 \label{fig:subscr-all}
 \end{figure}
 
-\noindent While the latter description implies overlapping elements, Figure \ref{fig:subscr-all} shows that the overlaps only occur with unused spaces between elements.
-Its depictions of @a[all][...]@ show the navigation of a memory layout with nontrivial strides, that is, with ``spaced \_ floats apart'' values that are greater or smaller than the true count of valid indices times the size of a logically indexed element.
-Reading from the bottom up, the expression @a[all][3][2]@ shows a float, that is masquerading as a @float[7]@, for the purpose of being arranged among its peers; five such occurrences form @a[all][3]@.
-The tail of flatter boxes extending to the right of a proper element represents this stretching.
-At the next level of containment, the structure @a[all][3]@ masquerades as a @float[1]@, for the purpose of being arranged among its peers; seven such occurrences form @a[all]@.
-The vertical staircase arrangement represents this compression, and resulting overlapping.
+\noindent BEGIN: Paste looking for a home
+
+The world of multidimensional array implementation has, or abuts, four relevant levels of abstraction, highest to lowest:
+
+1, purpose:
+If you are doing linear algebra, you might call its dimensions ``column'' and ``row.''
+If you are treating an acrostic poem as a grid of characters, you might say,
+the direction of reading the phrases vs the direction of reading the keyword.
+
+2, flexible-stride memory:
+assuming, from here on, a need to see/use contiguous memory,
+this model offers the ability to slice by (provide an index for) any dimension
+
+3, fixed-stride memory:
+this model offers the ability to slice by (provide an index for) only the coarsest dimension
+
+4, explicit-displacement memory:
+no awareness of dimensions, so no distinguishing them; just the ability to access memory at a distance from a reference point
+
+C offers style-3 arrays.  Fortran, Matlab and APL offer style-2 arrays.
+Offering style-2 implies offering style-3 as a sub-case.
+My CFA arrays are style-2.
+
+Some debate is reasonable as to whether the abstraction ordering actually goes $ 1 < \{2, 3\} < 4 $ (the diamond view),
+rather than my numerically-ordered chain (the chain view).
+According to the diamond view, styles 2 and 3 are at the same abstraction level, with 3 merely offering a more limited set of functionality.
+The chain view reflects the design decision I made in my mission to offer a style-2 abstraction;
+I chose to build it upon a style-3 abstraction.
+(Justification of the decision follows, after the description of the design.)
+
+The following discussion first dispenses with API styles 1 and 4, then elaborates on my work with styles 2 and 3.
+
+Style 1 is not a concern of array implementations.
+It concerns documentation and identifier choices of the purpose-specific API.
+If one is offering a matrix-multiply function, one must specify which dimension(s) is/are being summed over
+(or rely on the familiar convention of these being the first argument's rows and second argument's columns).
+Some libraries offer a style-1 abstraction that is not directly backed by a single array
+(e.g.\ by making quadrants contiguous, which may help cache coherence during a parallel matrix multiply),
+but such designs are out of scope for a discussion on arrays; they are applications of several arrays.
+I typically include style-1 language with examples to help guide intuition.
+
+It is often said that C has row-major arrays while Fortran has column-major arrays.
+This comparison brings an unhelpful pollution of style-1 thinking into issues of array implementation.
+Unfortunately, ``-major'' has two senses: the program's order of presenting indices and the array's layout in memory.
+(The program's order could be either lexical, as in @x[1,2,3]@ subscripting, or runtime, as in the @x[1][2][3]@ version.)
+Style 2 is concerned with introducing a nontrivial relationship between program order and memory order,
+while style 3 sees program order identical with memory order.
+Both C and (the style-3 subset of) Fortran actually use the same relationship here:
+an earlier subscript in program order controls coarser steps in memory.
+The job of a style-2/3 system is to implement program-ordered subscripting according to a defined memory layout.
+C and Fortran do not use opposite orders in doing this job.
+Fortran is only ``backward'' in its style-1 conventions for reading/writing and linear algebra.
+Fortran subscripts as $m(c,r)$.  When I use style-1 language, I am following the C/mathematical convention of $m(r,c)$.
+
+Style 4 is the inevitable target of any array implementation.
+The hardware offers this model to the C compiler, with bytes as the unit of displacement.
+C offers this model to its programmer as pointer arithmetic, with arbitrary sizes as the unit.
+I consider casting a multidimensional array as a single-dimensional array/pointer,
+then using @x[i]@ syntax to access its elements, to be a form of pointer arithmetic.
+But style 4 does not offer arrays.
+
+Now stepping into the implementation
+of CFA's new style-2 multidimensional arrays in terms of C's existing style-3 multidimensional arrays,
+it helps to clarify that even the interface is quite low-level.
+A C/CFA array interface includes the resulting memory layout.
+The defining requirement of a style-2 system is the ability to slice a column from a column-finest matrix.
+The required memory shape of such a slice is fixed before any discussion of implementation.
+The implementation presented here is how the CFA array library wrangles the C type system,
+to make it do memory steps that are consistent with this layout.
+TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
+
+Figure~\ref{fig:subscr-all} shows one element (in the shaded band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
+In both cases, value 2 selects from the coarser dimension (rows of @x@),
+while the value 3 selects from the finer dimension (columns of @x@).
+The figure illustrates the value of each subexpression, comparing how numeric subscripting proceeds from @x@, vs from @x[all]@.
+Proceeding from @x@ gives the numeric indices as coarse then fine, while proceeding from @x[all]@ gives them fine then coarse.
+These two starting expressions, which are the example's only multidimensional subexpressions
+(those that have received no numeric indices so far), are illustrated with vertical steps showing where a \emph{first} numeric index would select.
+
+The figure's presentation offers an intuition answering the question: what is an atomic element of @x[all]@?
+From there, @x[all]@ itself is simply a two-dimensional array, in the strict C sense, of these strange building blocks.
+An atom (like the bottommost value, @x[all][3][2]@) is the contained value (in the square box)
+and a lie about its size (the wedge above it, growing upward).
+An array of these atoms (like the intermediate @x[all][3]@) is just a contiguous arrangement of them,
+done according to their size, as announced.  Call such an array a column.
+A column is almost ready to be arranged into a matrix; it is the \emph{contained value} of the next-level building block,
+but another lie about size is required.
+At first, an atom needed to be arranged as if it were bigger,
+but now a column needs to be arranged as if it were smaller (the wedge above it, shrinking upward).
+These lying columns, arranged contiguously according to their size (as announced), form the matrix @x[all]@.
+Because @x[all]@ takes indices, first for the fine stride, then for the coarse stride,
+it achieves the requirement of representing the transpose of @x@.
+Yet every time the programmer presents an index, a mere C-array subscript performs the offset calculation.
+
+In the @x[all]@ case, after the finely strided subscript is done (column 3 is selected),
+the locations referenced by the coarse subscript options (rows 0..4) are offset by 3 floats,
+compared with where analogous rows appear when the row-level option is presented for @x@.
+
+These size lies create an appearance of overlap.
+For example, in @x[all]@, the shaded band touches atoms 2.0, 2.1, 2.2, 2.3, 1.4, 1.5 and 1.6.
+But only the atom 2.3 is storing its value there.
+The rest make (conflicting) claims on this location, but never exercise these alleged claims.
+
+Lying is implemented as casting.
+The arrangement just described is implemented in the structure @arpk@.
+This structure uses one type in its internal field declaration and offers a different type as the return of its subscript operator.
+The field within is a plain-C array of the fictional type, which is 7 floats long for @x[all][3][2]@ and 1 float long for @x[all][3]@.
+The subscript operator presents what's really inside, by casting to the type below the wedge of lie.
+
+%  Does x[all] have to lie too?  The picture currently glosses over how it advertises a size of 7 floats.  I'm leaving that as an edge case benignly misrepresented in the picture.  Edge cases only have to be handled right in the code.
+
+Casting, overlapping and lying are unsafe.
+The mission here is to implement a style-2 feature that the type system helps the programmer use safely.
+The offered style-2 system is allowed to be internally unsafe,
+just as C's implementation of a style-3 system (upon a style-4 system) is unsafe within,
+even when the programmer is using it without casts or pointer arithmetic.
+Having a style-2 system relieves the programmer from resorting to unsafe pointer arithmetic when working with noncontiguous slices.
+
+The choice to implement this style-2 system upon C's style-3 arrays, rather than its style-4 pointer arithmetic,
+reduces the attack surface of unsafe code.
+My casting is unsafe, but I do not do any pointer arithmetic.
+When a programmer works in the common-case style-3 subset (in the no-@[all]@ top of Figure~\ref{fig:subscr-all}),
+my casts are identities, and the C compiler is doing its usual displacement calculations.
+If I had implemented my system upon style-4 pointer arithmetic,
+then this common case would be circumventing C's battle-hardened displacement calculations in favour of my own.
+
+\noindent END: Paste looking for a home
 
 The new-array library defines types and operations that ensure proper elements are accessed soundly in spite of the overlapping.
