Changeset 1037b4c


Ignore:
Timestamp:
Apr 27, 2026, 2:03:17 AM (2 days ago)
Author:
Michael Brooks <mlbrooks@…>
Branches:
master
Children:
eaee33e
Parents:
037dc57
Message:

proof-read array chapter

Location:
doc/theses/mike_brooks_MMath
Files:
3 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/array.tex

    r037dc57 r1037b4c  
    1919\end{cfa}
    2020Here, the arguments to the @array@ type are @float@ (element type) and @99@ (dimension).
    21 When this type is used as a function parameter, the type-system requires the argument is a perfect match.
     21When this type is used as a function parameter, the type system requires the argument to be a perfect match.
    2222\begin{cfa}
    2323void f( @array( float, 42 )@ & p ) {}   $\C{// p accepts 42 floats}$
     
    6565}
    6666\end{cfa}
    67 Both of the stack declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@.
     67Both of the variables, @x1@ and @x2@, name 42 contiguous, stack-allocated @float@ values.
    6868The two variables have identical size and layout, with no additional ``bookkeeping'' allocations or headers.
    6969The C array, @x1@, has no subscript checking, while \CFA array, @x2@, does.
    70 Providing this explicit generic approach requires a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA.
     70This explicit, generic approach requires a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array type, which forms the foundation for more complex array forms in \CFA.
    7171In all following discussion, ``C array'' means types like @x1@ and ``\CFA array'' means types like @x2@.
    7272
    7373A future goal is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@).
    7474Then, the library @array@ type could be removed, giving \CFA a largely uniform array type.
    75 At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis.
     75At present, new-array features are only partially supported through C-style syntax, so the generic @array@ is used exclusively in the thesis.
    7676
    7777My contributions in this chapter are:
    7878\begin{enumerate}[leftmargin=*]
    7979\item A type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@.
    80 \item Provide a dimension/subscript-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued.
     80\item Provide a dimension/subscript-checked array type in the \CFA standard library, where the array's length is statically managed and dynamically valued.
    8181\item Provide argument/parameter passing safety for arrays and subscript safety.
    8282\item Identify the interesting abilities available by the new @array@ type.
    83 \item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap.
    8483\end{enumerate}
    8584
     
    121120
    122121\begin{figure}
     122\begin{cfa}
     123forall( [N] ) array( bool, N ) & f( array( float, N ) &, array( float, N ) & );
     124\end{cfa}
    123125\lstinput{30-43}{hello-array.cfa}
    124126\lstinput{45-48}{hello-array.cfa}
    125 \caption{\lstinline{f} Example}
     127\caption{Example of calling a dimension-generic function.}
    126128\label{f:fExample}
    127129\end{figure}
     
    144146\item
    145147The value of an @N@-expression is the acquired length, derived from the usage site, \ie generic declaration or function call.
     148\begin{comment} % was not discussed yet
    146149\item
    147150@array( T, N0, N1, ... )@ is a multi-dimensional type wrapping $\prod_i N_i$ contiguous @T@-typed objects.
     151\end{comment}
    148152\end{itemize}
    149153
    150 \VRef[Figure]{f:TemplateVsGenericType} shows @N@ is not the same as a @size_t@ declaration in a \CC \lstinline[language=C++]{template}.\footnote{
    151 The \CFA program requires a snapshot declaration for \lstinline{n} to compile, as described at the end of \Vref{s:ArrayTypingC}.}
     154\VRef[Figure]{f:TemplateVsGenericType} shows @N@ is not the same as a @size_t@ declaration in a \CC \lstinline[language=C++]{template}.  It illustrates these differences:
    152155\begin{enumerate}[leftmargin=*]
    153156\item
     
    158161\item
    159162\label{p:DimensionPassing}
    160 The \CC template @N@ must be passed explicitly at the call, unless @N@ has a default value, even when \CC can deduct the type of @T@.
     163The \CC template @N@ must be passed explicitly at the call, unless @N@ has a default value, even when \CC can deduce the type of @T@.
    161164The \CFA @N@ is part of the array type and passed implicitly at the call.
    162165% fixed by comparing to std::array
    163166% mycode/arrr/thesis-examples/check-peter/cs-cpp.cpp, v2
     167\begin{comment} % does not appear in the example; I think we've found CFA suffering a similar "belief"
    164168\item
    165169\CC cannot have an array of references, but can have an array of @const@ pointers.
     
    167171In the \CC example, the arrays fall back on C arrays, which have a duality with references with respect to automatic dereferencing.
    168172The \CFA array is a contiguous object with an address, which can be stored as a reference or pointer.
     173\end{comment}
    169174% not really about forall-N vs template-N
    170175% any better CFA support is how Rob left references, not what Mike did to arrays
     
    189194        for ( int i = 0; i < N; i += 1 ) ret[i] = x[i];
    190195}
    191 int main() {
    192         const size_t  n = 10;   $\C[1.9in]{// must be constant}$
    193         int ret[n], x[n];               $\C{// static size}$
     196void demo() {
     197        const size_t  n = 10;
     198        int ret[n], x[n];               $\C[1.9in]{// static size}$
    194199        for ( int i = 0; i < n; i += 1 ) x[i] = i;
    195200        @copy<int, n >( ret, x );@
    196201        for ( int i = 0; i < n; i += 1 ) cout << ret[i] << ' ';
    197202        cout << endl;
     203}
     204int main() {
     205        demo();
    198206}
    199207\end{c++}
     
    205213                for ( i; N ) ret[i] = x[i];
    206214        }
    207         size_t  n;
    208         sin | n;
    209         array( int, n ) ret, x; $\C{// dynamic-fixed size}$
    210         for ( i; n ) x[i] = i;  $\C{// initialize}$
    211         @copy( ret,  x );@              $\C{// alternative ret = x}$
    212         for ( i; n ) sout | ret[i] | nonl; $\C{// print}\CRT$
    213         sout | nl;
     215        void demo( const size_t n ) {
     216
     217                array( int, n ) ret, x; $\C{// dynamic-fixed size}$
     218                for ( i; n ) x[i] = i;  $\C{// initialize}$
     219                @copy( ret,  x );@              $\C{// alternative ret = x}$
     220                for ( i; n ) sout | ret[i] | nonl;
     221                sout | nl;
     222        }
     223        size_t  n;   sin | n;   $\C{// obtain size}\CRT$
     224        demo( n );
    214225}
    215226\end{cfa}
     
    247258\CFA allows length-parameterized array members to be nested at arbitrary locations, with intervening member declarations.
    248259\lstinput{10-15}{hello-accordion.cfa}
    249 The structure has course- and student-level metatdata (their respective field names) and a position-based preferences' matrix.
    250 Its layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters.
     260The structure has course- and student-level metadata (their respective field names) and a position-based preferences' matrix.
     261The structure's layout is dynamic; the starting offset of @student_ids@ varies according to the generic parameter @C@; the offset of @preferences@ varies by both.
    251262
    252263\VRef[Figure]{f:checkExample} shows a program main using @School@ and results with different array sizes.
     
    299310
    300311\section{Single Dimension Array Implementation}
    301 
    302 The core of the preexisting \CFA compiler already has the ``heavy equipment'' needed to provide the feature set just reviewed (up to bugs in cases not yet exercised).
     312\label{s:1d-impl}
     313
     314The core of the preexisting \CFA compiler already has the ``heavy equipment'' needed to provide the feature set just reviewed.
    303315To apply this equipment in tracking array lengths, I encoded a dimension (array's length) as a type.
    304316The type in question does not describe any data that the program actually uses at runtime.
     
    309321% Furthermore, the @array@ type itself is really ``icing on the cake.''
    310322% Presenting its full version is deferred until \VRef[Section]{s:ArrayMdImpl} (where added complexity needed for multiple dimensions is considered).
    311 The new array implementation is broken into single and multiple dimensions \see{\VRef[Section]{s:ArrayMdImpl}}.
    312 Details of the @array@ macros are sprinkles among the implementation discussion.
    313 The single dimension implementation begins with a simple example without @array@ macros.
     323Discussion of the new array implementation is broken into single- and multi-dimensional features.
     324The multidimensional feature set of \VRef[Section]{s:ArrayMdImpl} is a toolkit for managing repeated application of the single-dimensional features.
     325Details of the @array@ type are introduced as needed.
     326The single-dimension discussions begins with a simple example stripped of the @array@ type.
    314327\begin{cfa}
    315328forall( [N] )
    316329struct array_1d_float {
    317         float items[N];                 $\C[2in]{// monomorphic type}$
     330        float items[N];                 $\C[2in]{// float elements }$
    318331};
    319332forall( T, [N] )
    320 struct array_1d_T {
    321         T items[N];                             $\C{// polymorphic type}\CRT$
     333struct array_1d {
     334        T items[N];                             $\C{// any-typed elements}\CRT$
    322335};
    323336\end{cfa}
    324337These two structure patterns, plus a subscript operator, is all that @array@ provides.
    325 My main work is letting a programmer define such a structure (one whose type is parameterized by @[N]@) and functions that operate on it (these being similarly parameterized).
     338My main work is letting a programmer define such a structure (one whose type is parameterized by @[N]@) and define functions that operate on it (these being parameterized similarly).
    326339
    327340The repurposed heavy equipment is
    328341\begin{itemize}[leftmargin=*]
    329342\item
    330         Resolver provided values for a declaration's type-system variables, gathered from type information in scope at the usage site.
    331 \item
    332         The box pass, encoding information about type parameters into ``extra'' regular parameters and arguments on declarations and calls.
     343        The ``resolver'' compiler pass, which provides argument values for a declaration's type-system parameters, gathered from type information in scope at the usage site.
     344\item
     345        The ``box'' compiler pass, which encodes information about type parameters into ``extra'' regular parameters on declarations and and arguments on calls.
    333346        Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter, and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter.
    334347\end{itemize}
     
    337350This work is detailed in \VRef[Section]{s:ArrayTypingC}.
    338351However, the resolution--boxing scheme, in its preexisting state, is equipped to work on (desugared) dimension parameters.
    339 The following discussion explains the desugaring and how correctly lowered code results.
     352The following discussion explains the desugaring and why its lowered code is correct.
    340353
    341354A simpler structure, and a toy function on it, demonstrate what is needed for the encoding.
     
    429442The compiler's action produces the more complex form, which if handwritten, is error-prone.
    430443
    431 At the compiler front end, the parsing changes AST schema extensions and validation rules for enabling the sugared user input.
     444At the \CFA compiler's front end, changes to the parser are AST schema extensions and validation rules for enabling the sugared user input.
    432445\begin{itemize}[leftmargin=*]
    433446\item
    434         Recognize the form @[N]@ as a type-variable declaration within a @forall@.
    435 \item
    436         Have the new brand of type-variable, \emph{Dimension}, in the AST form of a type-variable, to represent one parsed from @[-]@.
     447        Recognize the syntax @forall( ... [N] ... )@ as a type-variable declaration.
     448\item
     449        Where the AST's representation of a type variable encodes its \newterm{brand}, previously including full-object types @T@ and opaque datatypes @T&@, define new brand, \newterm{Dimension}.  Apply this brand to the new syntax.
    437450\item
    438451        Allow a type variable to occur in an expression.  Validate (after parsing) that only dimension-branded type-variables are used here.
     
    614627\end{itemize}
    615628\VRef[Figure]{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets.
    616 It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafe.
    617 It also shows that C-incompatibilities only occur in cases that C treats unsafe.
     629It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafely.
     630It also shows that C-incompatibilities only occur in cases that C treats unsafely.
    618631
    619632\begin{figure}
     
    753766\end{comment}
    754767
     768These rules being conservative is the reason that \VRef{s:1d-impl} calls the \CFA array type only \emph{limited} dependent.
     769In programming languages featuring general dependent typing, considerably more effort is invested in being able to conclude that two type-parameterizing expressions always compute the same value.
     770While conservatism is inescapable, these languages use good approximations of eventual runtime behaviour to enable conclusions about more dynamic behaviours, such as a program never popping from an empty stack.
     771
    755772The conservatism of the new rule set can leave a programmer requiring a recourse, when needing to use a dimension expression whose stability argument is more subtle than current-state analysis.
    756773This recourse is to declare an explicit constant for the dimension value.
     
    767784The @nr@ reference is never written, no implicit code (load) between declarations, and control does not leave the function between the uses.
    768785As well, the build-in @?+?@ function is predictable.
    769 To make these cases work, the programmer must add the follow constant declarations (cast does not work):
     786
     787To make these cases work, the programmer adds these constant declarations:
    770788\begin{cfa}
    771789void f( int & nr, const int nv ) {
     
    780798The result is the originally intended semantics,
    781799achieved by adding a superfluous ``snapshot it as of now'' directive.
     800Note that casting does not help with this issue, because it does not express information about values at different points in time.
    782801
    783802The snapshot trick is also used by the \CFA translation, though to achieve a different outcome.
     
    812831\end{cfa}
    813832Dimension hoisting already existed in the \CFA compiler.
    814 However, it was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.
    815 For example, when a programmer has already hoisted to perform an optimization to prelude duplicate code (expression) and/or expression evaluation.
     833However, it was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'' for skipping this hoisting is clearly desirable in some cases.
     834For example, when a programmer has already done manual hoisting.
    816835In the new implementation, these cases are correct, harmonized with the accept/reject criteria.
    817836
     
    828847Fixed-stride memory:
    829848this model binds the first subscript and the first memory layout dimension, offering the ability to slice by (provide an index for) only the coarsest dimension, @m[row][*]@ or @c[plane][*][*]@, \eg slice only by row (2D) or plane (3D).
    830 After which, subscripting and memory layout are independent.
    831849\item
    832850Explicit-displacement memory:
    833 this model has no awareness of dimensions just the ability to access memory at a distance from a reference point (base-displacement addressing), \eg @x + 23@ or @x[23}@ $\Rightarrow$ 23rd element from the start of @x@.
     851this model has no awareness of dimensions just the ability to access memory at a distance from a reference point (base-displacement addressing), \eg @x + 23@ or @x[23]@ $\Rightarrow$ 23rd element after the start of @x@.
    834852A programmer must manually build any notion of dimensions using other tools;
    835853hence, this style is not offering multidimensional arrays \see{\VRef[Figure]{f:FixedVariable} right example}.
    836854\end{enumerate}
    837855
    838 There is some debate as to whether the abstraction point ordering above goes $\{1, 2\} < 3$, rather than my numerically-ordering.
     856There is some debate as to whether the abstraction point ordering above goes $\{1, 2\} < 3$, rather than my numeric ordering.
    839857That is, styles 1 and 2 are at the same abstraction level, with 3 offering a limited set of functionality.
    840858I chose to build the \CFA style-1 array upon a style-2 abstraction.
    841 (Justification of the decision follows, after the description of the design.)
     859Justification of the decision follows, after the description of the design.
    842860
    843861Style 3 is the inevitable target of any array implementation.
     
    846864Casting a multidimensional array as a single-dimensional array/pointer, then using @x[i]@ syntax to access its elements, is still a form of pointer arithmetic.
    847865
    848 To step into the implementation of \CFA's new type-1 multidimensional arrays in terms of C's existing type-2 multidimensional arrays, it helps to clarify that the interface is low-level, \ie a C/\CFA array interface includes the resulting memory layout.
    849 Specifically, the defining requirement of a type-2 system is the ability to slice a column from a column-finest matrix.
     866To step into the implementation of \CFA's new type-1 multidimensional arrays in terms of C's existing type-2 multidimensional arrays, it helps to clarify that the interface is low-level, \ie a C/\CFA array interface includes memory-layout detail.
     867Specifically, the defining requirement of a type-1 system is the ability to slice a column from a column-finest matrix.
    850868Hence, the required memory shape of such a slice is fixed, before any discussion of implementation.
    851869The implementation presented here is how the \CFA array-library wrangles the C type system to make it do memory steps that are consistent with this layout while not affecting legacy C programs.
     
    856874Beyond what C's array type offers, the new array brings direct support for working with a \emph{noncontiguous} array slice, allowing a program to work with dimension subscripts given in a non-physical order.
    857875
    858 The following examples use the matrix declaration @array( float, 5, 7 ) m@, loaded with values incremented by $0.1$, when stepping across the length-7 finely-strided column dimension, and stepping across the length-5 coarsely-strided row dimension.
     876The following examples use the matrix declaration @array( float, 5, 7 ) m@, loaded with values incremented by $0.1$, when stepping across the length-7 finely-strided column dimension, and incremented by $1$, when stepping across the length-5 coarsely-strided row dimension.
    859877\par
    860878\mbox{\lstinput{121-126}{hello-md.cfa}}
     
    880898\lstinput{150-151}{hello-md.cfa}
    881899
    882 The example shows @x[2]@ and @x[[2, all]]@ both refer to the same, ``2.*'' slice.
    883 Indeed, the various @print1d@ calls under discussion access the entry with value @2.3@ as @x[2][3]@, @x[[2,all]][3]@, and @x[[all,3]][2]@.
    884 This design preserves (and extends) C array semantics by defining @x[[i,j]]@ to be @x[i][j]@ for numeric subscripts, but also for ``subscripting by all''.
     900The example shows @x[2]@ and @x[2, all]@ both refer to the same, ``2.*'' slice.
     901Indeed, the various @print1d@ calls under discussion access the entry with value @2.3@ as @x[2][3]@, @x[2,all][3]@, and @x[all,3][2]@.
     902This design preserves (and extends) C array semantics by defining @x[i,j]@ to be @x[i][j]@ for numeric subscripts, but also for ``subscripting by all''.
    885903That is:
    886904\begin{cquote}
    887905\begin{tabular}{@{}cccccl@{}}
    888 @x[[2,all]][3]@ & $\equiv$      & @x[2][all][3]@  & $\equiv$    & @x[2][3]@  & (here, @all@ is redundant)  \\
    889 @x[[all,3]][2]@ & $\equiv$      & @x[all][3][2]@  & $\equiv$    & @x[2][3]@  & (here, @all@ is effective)
     906@x[2,all][3]@   & $\equiv$      & @x[2][all][3]@  & $\equiv$    & @x[2][3]@  & (here, @all@ is redundant)  \\
     907@x[all,3][2]@   & $\equiv$      & @x[all][3][2]@  & $\equiv$    & @x[2][3]@  & (here, @all@ is effective)
    890908\end{tabular}
    891909\end{cquote}
     
    916934The semantics of @-[i]@ is to dequeue from the front of the ``want subscripts'' list and lock its value to be @i@.
    917935
    918 Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metatdata.
     936Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metadata.
    919937The \lstinline{-[all]} operation takes subscripts, \eg @x[2][all]@, @x[all][3]@, \etc, and converts (transposes) from the base reference @x[all]@ to a specific reference of the appropriate form.
    920938The running example's @all@-effective step, stated more concretely, is:
     
    935953\end{figure}
    936954
    937 \VRef[Figure]{fig:subscr-all} shows one element (in the white band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
     955\VRef[Figure]{fig:subscr-all} shows one element (in the shaded band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
    938956In both cases, subscript 2 selects from the coarser dimension (rows of @x@),
    939957while subscript 3 selects from the finer dimension (columns of @x@).
     
    15361554
    15371555Java arrays are references, so multi-dimension arrays are arrays-of-arrays \see{\VRef{toc:mdimpl}}.
    1538 For each array, Java implicitly storages the array dimension in a descriptor, supporting array length, subscript checking, and allowing dynamically-sized array-parameter declarations.
     1556For each array, Java implicitly stores the array dimension in a descriptor, supporting array length, subscript checking, and allowing dynamically-sized array-parameter declarations.
    15391557\begin{cquote}
    15401558\begin{tabular}{rl}
  • doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa

    r037dc57 r1037b4c  
    1212        @array@( string, C ) course_codes; $\C{// nested VLAs}$
    1313        @array@( string, S ) student_ids;
    14         @array@( int, C, S ) preferences; $\C{// multidimensional}$
     14        @array@( int, C, S ) preferences; $\C{// matrix, C x S}$
    1515};
    1616
  • doc/theses/mike_brooks_MMath/programs/hello-md.cfa

    r037dc57 r1037b4c  
    119119fill( m );
    120120/*
    121 r/c   0     1     2     3     4     5     6
    122 0.0  0.1  0.2  @0.3@  0.4  0.5  0.6 
    123 1.0  1.1  1.2  @1.3@  1.4  1.5  1.6 
    124 @2.0  2.1  2.2  2.3  2.4  2.5  2.6@ 
    125 3.0  3.1  3.2  @3.3@  3.4  3.5  3.6 
    126 4.0  4.1  4.2  @4.3@  4.4  4.5  4.6
     121r\c     0      1      2      3      4      5      6
     122  0.0   0.1   0.2   @0.3@   0.4   0.5   0.6 
     123  1.0   1.1   1.2   @1.3@   1.4   1.5   1.6 
     124  @2.0   2.1   2.2   2.3   2.4   2.5   2.6@ 
     125  3.0   3.1   3.2   @3.3@   3.4   3.5   3.6 
     126  4.0   4.1   4.2   @4.3@   4.4   4.5   4.6
    127127*/
    128128
Note: See TracChangeset for help on using the changeset viewer.