Context Navigation

-              r1a40870
+              rcecb260
 where a characterization of the return value (giving it a precise type, generally dependent upon the parameters)
 is a sufficient postcondition.
+In an imperative language like C and \CFA, it is also necessary to discuss side effects,
+for which an even heavier formalism, like separation logic, is required.
+In an imperative language like C and \CFA, it is also necessary to discuss side effects, for which an even heavier formalism, like separation logic, is required.
 Secondly, TODO: bash Rust.
 TODO: cite the crap out of these claims.
 …
 \section{Dimension Parameter Implementation}
+The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed
+to provide the feature set just reviewed (up to bugs in cases not yet exercised).
+The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed to provide the feature set just reviewed (up to bugs in cases not yet exercised).
 To apply this equipment in tracking array lengths, I encoded a dimension (array's length) as a type.
 The type in question does not describe any data that the program actually uses at runtime.
 It simply carries information through intermediate stages of \CFA-to-C lowering.
 And this type takes a form such that, once \emph{it} gets lowered, the result does the right thing.
+This new dimension type, encoding the array dimension within it, is the first limited \newterm{dependent type}~\cite{DependentType} added to the \CFA type-system and is used at appropriate points during type resolution.
 Furthermore, the @array@ type itself is really ``icing on the cake.''
 …
 \end{cfa}
 These two structure patterns, plus a subscript operator, is all that @array@ provides.
+My main work is letting a programmer define
+such a structure (one whose type is parameterized by @[N]@)
+and functions that operate on it (these being similarly parameterized).
+My main work is letting a programmer define such a structure (one whose type is parameterized by @[N]@) and functions that operate on it (these being similarly parameterized).
 The repurposed heavy equipment is
 \begin{itemize}[leftmargin=*]
 \item
+        Resolver provided values for a used declaration's type-system variables,
+        gathered from type information in scope at the usage site.
+\item
+        The box pass, encoding information about type parameters
+        into ``extra'' regular parameters/arguments on declarations and calls.
+        Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter,
+        and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter.
+        Resolver provided values for a used declaration's type-system variables, gathered from type information in scope at the usage site.
+\item
+        The box pass, encoding information about type parameters into ``extra'' regular parameters/arguments on declarations and calls.
+        Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter, and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter.
 \end{itemize}
 …
 Dimension hoisting already existed in the \CFA compiler.
 But its was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.
 For example, when a programmer has already hoisted to perform an optimiation to prelude duplicate code (expression) and/or expression evaluation.
+For example, when a programmer has already hoisted to perform an optimization to prelude duplicate code (expression) and/or expression evaluation.
 In the new implementation, these cases are correct, harmonized with the accept/reject criteria.
 …
 \end{enumerate}
 There is some debate as to whether the abstraction ordering goes $\{1, 2\} < 3$, rather than my numerically-ordering.
+There is some debate as to whether the abstraction point ordering above goes $\{1, 2\} < 3$, rather than my numerically-ordering.
 That is, styles 1 and 2 are at the same abstraction level, with 3 offering a limited set of functionality.
 I chose to build the \CFA style-1 array upon a style-2 abstraction.
 …
 The defining requirement of a type-2 system is the ability to slice a column from a column-finest matrix.
 The required memory shape of such a slice is fixed, before any discussion of implementation.
 The implementation presented here is how the \CFA array library wrangles the C type system, to make it do memory steps that are consistent with this layout.
 TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
 The new \CFA standard library @array@ datatype supports richer multidimensional features than C.
+The implementation presented here is how the \CFA array-library wrangles the C type system, to make it do memory steps that are consistent with this layout while not affecting legacy C programs.
+% TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
+The new \CFA standard-library @array@-datatype supports richer multidimensional features than C.
 The new array implementation follows C's contiguous approach, \ie @float [r][c]@, with one contiguous object subscripted by coarsely-strided dimensions directly wrapping finely-strided dimensions.
 Beyond what C's array type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order.
 …
 Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metatdata.
+\PAB{Do not understand this sentence: The \lstinline{-[all]} operation is a conversion from a reference to one instantiation to a reference to another instantiation.}
+The \lstinline{-[all]} operation takes subscripts, \eg @x[2][all]@, @x[all][3]@, \etc, and converts (transposes) from the base reference @x[all]@ to a specific reference of the appropriate form.
 The running example's @all@-effective step, stated more concretely, is:
 \begin{cquote}
 …
 \end{figure}
 \VRef[Figure]{fig:subscr-all} shows one element (in the shaded band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
 In both cases, value 2 selects from the coarser dimension (rows of @x@),
 while the value 3 selects from the finer dimension (columns of @x@).
+\VRef[Figure]{fig:subscr-all} shows one element (in the white band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
+In both cases, subscript 2 selects from the coarser dimension (rows of @x@),
+while subscript 3 selects from the finer dimension (columns of @x@).
 The figure illustrates the value of each subexpression, comparing how numeric subscripting proceeds from @x@, \vs from @x[all]@.
 Proceeding from @x@ gives the numeric indices as coarse then fine, while proceeding from @x[all]@ gives them fine then coarse.
 …
 (those that received zero numeric indices so far), are illustrated with vertical steps where a \emph{first} numeric index would select.
 The figure's presentation offers an intuition answering to: What is an atomic element of @x[all]@?
+The figure's presentation offers an intuition answer to: What is an atomic element of @x[all]@?
 From there, @x[all]@ itself is simply a two-dimensional array, in the strict C sense, of these building blocks.
 An atom (like the bottommost value, @x[all][3][2]@), is the contained value (in the square box)
 …
 the locations referenced by the coarse subscript options (rows 0..4) are offset by 3 floats,
 compared with where analogous rows appear when the row-level option is presented for @x@.
+For example, in \lstinline{x[all]}, the shaded band touches atoms 2.0, 2.1, 2.2, 2.3, 1.4, 1.5 and 1.6 (left diagonal).
+For example, in \lstinline{x[all]}, the shaded band and its immediate values to the left touches atoms 2.3, 2.0, 2.1, 2.2, 1.4, 1.5 and 1.6.
 But only the atom 2.3 is storing its value there.
 The rest are lying about (conflicting) claims on this location, but never exercising these alleged claims.
 …
 Construction by @array@ initializes the masquerading-as type information to be equal to the contained-element information.
 Subscripting by @all@ rearranges the order of masquerading-as types to achieve, in general, nontrivial striding.
 Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns there element found there, in unmasked form.
 An instantiation of the @arpk@ generic is given by the @array(E_base, N0, N1, ...)@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array(E_base, N1, ...)@.
 In the base case, @array(E_base)@ is just @E_base@.
+Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns the element found there, in unmasked form.
+An instantiation of the @arpk@ generic is given by the @array( E_base, N0, N1, ... )@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array( E_base, N1, ... )@.
+In the base case, @array( E_base )@ is just @E_base@.
 Because this construction uses the same value for the generic parameters @S@ and @E_im@, the resulting layout has trivial strides.
 …
 \CFA array subscripting is protected with runtime bound checks.
 Having dependent typing causes the optimizer to remove more of these bound checks than it would without them.
+The array dependent-typing provides information to the C optimizer allowing it remove many of the bound checks.
 This section provides a demonstration of the effect.
 …
 \section{Array Lifecycle}
+An array is an aggregate, like a structure;
+both are containers wrapping subordinate objects.
+Any arbitrary object type, like @string@, can be an array element or structure member.
+An array, like a structure, wraps subordinate objects.
+Any object type, like @string@, can be an array element or structure member.
 A consequence is that the lifetime of the container must match with its subordinate objects: all elements and members must be initialized/uninitialized implicitly as part of the container's allocation/deallocation.
 Modern programming languages implicitly perform these operations via a type's constructor and destructor.
 Therefore, \CFA must assure that an array's subordinate objects' lifetime operations are called.
 Preexisting \CFA mechanisms achieve this requirement, but with poor performance.
 Furthermore, advanced array users need an exception to the basic mechanism, which does not occur with other aggregates.
 Hence, arrays introduce subtleties in supporting an element's lifecycle.
 The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (@otype@) pseudo-trait.
 A type is an @otype@, if it provides a default (parameterless) constructor, copy constructor, assignment operator, and destructor (like \CC).
 When declaring a structure with @otype@ members, the compiler implicitly generates implementations of the four @otype@ functions for the outer structure.
 Then the generated default constructor for the outer structure calls the default constructor for each member, and the other @otype@ functions work similarly.
+The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (otype) pseudo-trait.
+A type is an otype, if it provides a default (parameterless) constructor, copy constructor, assignment operator, and destructor (like \CC).
+For a structure with otype members, the compiler implicitly generates implementations of the four otype functions for the outer structure.
+Then the generated default constructor for the outer structure calls the default constructor for each member, and the other otype functions work similarly.
 For a member that is a C array, these calls occur in a loop for each array element, which even works for VLAs.
 This logic works the same, whether the member is a concrete type (that happens to be an @otype@) or if the member is a polymorphic type asserted to be an @otype@ (which is implicit in the syntax, @forall(T)@).
+This logic works the same, whether the member is a concrete type (that happens to be an otype) or if the member is a polymorphic type asserted to be an otype (which is implicit in the syntax, @forall(T)@).
 The \CFA array has the simplified form (similar to one seen before):
 \begin{cfa}
 …
 };
 \end{cfa}
+Being a structure with a C-array member, using the otype-form declaration for @T@ causes @array5(float)@ to implement @otype@ too.
+But this @otype@-recursion pattern has a performance issue.
+For example, in a cube of @float@:
+Being a structure with a C-array member, the otype-form declaration @T@ causes @array5( float )@ to be an otype too.
+But this otype-recursion pattern has a performance issue.
+\VRef[Figure]{f:OtypeRecursionBlowup} shows the steps to find lifecycle functions, under the otype-recursion pattern, for a cube of @float@:
 \begin{cfa}
 array5( array5( array5( float ) ) )
 \end{cfa}
+the first few steps of the compiler's work to find the lifecycle functions, under the @otype@-recursion pattern, are shown in \VRef[Figure]{f:OtypeRecursionBlowup}.
+All the work needed for the full @float@-cube would have 256 leaves.
+The work needed for the full @float@-cube is 256 leaves.
+Then the otype-recursion pattern generates helper functions and thunks that are exponential in the number of dimensions.
+Anecdotal experience is annoyingly slow compile time at three dimensions and unusable at four.
 %array5(T) offers
 …
 \begin{figure}
 \centering
 \setlength{\tabcolsep}{15pt}
+\setlength{\tabcolsep}{20pt}
 \begin{tabular}{@{}lll@{}}
 \begin{cfa}[deletekeywords={default}]
 …
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 …
+&
 \begin{cfa}[deletekeywords={default}]
 array5(float) has
 default ctor, using float's
+array5( float ) has
+default ctor, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 copy ctor, using float's
+copy ctor, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 asgt, using float's
+assignment, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 dtor, using float's
+dtor, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 …
+&
 \begin{cfa}[deletekeywords={default}]
 array5(array5(float)) has
 default ctor, using array5(float)'s
 default ctor, using float's
+array5( array5( float ) ) has
+default ctor, using array5( float )
+default ctor, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 copy ctor, using float's
+copy ctor, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 asgt, using float's
+assignment, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 dtor, using float's
+dtor, using float$'$s
 default ctor
 copy ctor
 asgt
+assignment
 dtor
 copy ctor, using array5(float)'s
+copy ctor, using array5( float )
         ... 4 children, 16 leaves
 asgt, using array5(float)'s
+assignment, using array5( float )
         ... 4 children, 16 leaves
 dtor, using array5(float)'s
+dtor, using array5( float )
         ... 4 children, 16 leaves
 (64 leaves)
 …
 \end{tabular}
 \caption{Exponential thunk generation under the otype-recursion pattern.
         Each time that one type's function (\eg ctor) uses another type's function, the \CFA compiler generates a thunk, to capture the used function's dependencies, presented according to the using function's need.
+        Each time one type's function (\eg ctor) uses another type's function, the \CFA compiler generates a thunk, to capture the used function's dependencies, presented according to the using function's need.
         So, each non-leaf line represents a generated thunk and every line represents a search request for the resolver to find a satisfying function.}
 \label{f:OtypeRecursionBlowup}
 \end{figure}
-So the @otype@-recursion pattern seeks a quantity of helper functions, and generates a quantity of thunks, that are exponential in the number of dimensions.
-Anecdotal experience with this solution found the resulting compile times annoyingly slow at three dimensions, and unusable at four.
 The issue is that the otype-recursion pattern uses more assertions than needed.
+Consider how @array5(float)@'s default constructor is getting all four lifecycle assertions about the element type, @float@.
+It only needs @float@'s default constructor;
+the full set of operations is never used.
+For example, @array5( float )@'s default constructor has all four lifecycle assertions about the element type, @float@.
+However, it only needs @float@'s default constructor, as the other operations are never used.
 Current work by the \CFA team aims to improve this situation.
 Therefore, a workaround is needed for now.
+The workaround is to provide a default constructor, copy constructor and destructor, defined with lean, bespoke assertions:
+Therefore, I had to construct a workaround.
+My workaround moves from otype (value) to dtype (pointer) with a default-constructor assertion, where dtype does not generate any constructors but the assertion claws back the default otype constructor.
 \begin{cquote}
+\begin{tabular}{@{}l@{\hspace{0.5in}}l@{}}
+\setlength{\tabcolsep}{10pt}
+\begin{tabular}{@{}ll@{}}
 \begin{cfa}
 // autogenerated for otype-recursion:
 forall( T )
 void ?{}( array5(T) & this ) {
         for (i; 5) { ( this[i] ){}; }
+void ?{}( array5( T ) & this ) {
+        for ( i; 5 ) { ( this[i] ){}; }
+}
 forall( T )
 void ?{}( array5(T) & this, array5(T) & src ) {
         for (i; 5) { ( this[i] ){ src[i] }; }
+void ?{}( array5( T ) & this, array5( T ) & src ) {
+        for ( i; 5 ) { ( this[i] ){ src[i] }; }
+}
 forall( T )
 void ^?{}( array5(T) & this ) {
         for (i; 5) { ^( this[i] ){}; }
+void ^?{}( array5( T ) & this ) {
+        for ( i; 5 ) { ^( this[i] ){}; }
+}
 \end{cfa}
 …
 // lean, bespoke:
 forall( T* | { void @?{}( T & )@; } )
 void ?{}( array5(T) & this ) {
         for (i; 5) { ( this[i] ){}; }
+void ?{}( array5( T ) & this ) {
+        for ( i; 5 ) { ( this[i] ){}; }
+}
 forall( T* | { void @?{}( T &, T )@; } )
 void ?{}( array5(T) & this, array5(T) & src ) {
         for (i; 5) { ( this[i] ){ src[i] }; }
+void ?{}( array5( T ) & this, array5( T ) & src ) {
+        for ( i; 5 ) { ( this[i] ){ src[i] }; }
+}
 forall( T* | { void @?{}( T & )@; } )
 …
 \end{cquote}
 Moreover, the assignment operator is skipped, to avoid hitting a lingering growth case.
 Skipping assignment is tolerable because array assignment is not a common operation.
 With this solution, the critical lifecycle functions are available, with no growth in thunk creation.
+Temporarily skipping assignment is tolerable because array assignment is not a common operation.
+With this solution, the critical lifecycle functions are available, with linear growth in thunk creation for the default constructor.
 Finally, the intuition that a programmer using an array always wants the elements' default constructor called \emph{automatically} is simplistic.
 …
 void ?{}( worker &, int id );
 array( worker, 5 ) ws = @{}@; // rejected; but desire is for no initialization yet
 for (i; 5) (ws[i]){ @i@ }; // explicitly initialize each thread, giving id
+for ( i; 5 ) (ws[i]){ @i@ }; // explicitly initialize each thread, giving id
 \end{cfa}
 Note the use of the \CFA explicit constructor call, analogous to \CC's placement-@new@.
 …
 Therefore, there is a conflict between the implicit actions of the builtin @thread@ type and a user's desire to defer these actions.
 Another \CFA feature may, at first, seem viable for initializing the array @ws@, though on closer inspection, it is not.
 C initialization, \lstinline|array(worker, 5) ws @= {};|, ignores all \CFA lifecycle management and uses C empty initialization.
+Another \CFA feature for providing C backwards compatibility, at first seem viable for initializing the array @ws@, though on closer inspection is not.
+C initialization without a constructor is specified with \lstinline|@=|, \eg \lstinline|array(worker, 5) ws @= {}| ignores all \CFA lifecycle management and uses C empty initialization.
 This option does achieve the desired semantics on the construction side.
 But on destruction side, the desired semantics is for implicit destructor calls to continue, to keep the join operation tied to lexical scope.
 …
 \end{cfa}
 Both cause the @ws@-construction-time implicit call chain to stop before reaching a @worker@ constructor, while leaving the implicit destruction calls intact.
 Two forms are available, to parallel the development of this feature in \uCpp.
+Two forms are available, to parallel the development of this feature in \uCpp~\cite{uC++}.
 Originally \uCpp offered only the @ws1@ form, using the class-template @uNoCtor@ equivalent to \CFA's @uninit@.
 More recently, \uCpp was extended with the declaration macro, @uArray@, with usage similar to the @ws2@ case.
 Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it might be possible to remove the first option.
+Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it should be possible to remove the first option.
 % note to Mike, I have more fragments on some details available in push24\fragments\uNoCtor.txt
 \section{Comparison with Other Arrays}
+\section{Array Comparison}
 \subsection{Rust}
+\CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C.
+Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference.
+These systems, therefore, ask the programmer to convince the type checker that every pointer dereference is valid.
+\CFA imposes the lighter-weight obligation, with the more limited guarantee, that initially-declared bounds are respected thereafter.
+\CFA's array is also the first extension of C to use its tracked bounds to generate the pointer arithmetic implied by advanced allocation patterns.
+Other bound-tracked extensions of C either forbid certain C patterns entirely, or address the problem of \emph{verifying} that the user's provided pointer arithmetic is self-consistent.
+The \CFA array, applied to accordion structures [TOD: cross-reference] \emph{implies} the necessary pointer arithmetic, generated automatically, and not appearing at all in a user's program.
+A Rust array is a set of mutable or immutable (@const@) contiguous objects of the same type @T@, where the compile-time dimension(s) is part of the type signature @[T; dimension]@.
+\begin{rust}
+let val = 42;
+let mut ia: [usize; 5] = [val; 5];      $\C{// mutable, repeated initialization [42, 42, 42, 42, 42]}$
+let fval = 42.2;
+let fa: [f64; 5] = [1.2, fval, 5.6, 7.3, 9.1];  $\C{// immutable, individual initialization}$
+\end{rust}
+Initialization is required even if the  array is subsequently initialized.
+Rust arrays are not VLAs, but the compile-time length can be queried using member @len()@.
+Arrays can be assigned and passed to parameters by value or reference, if and only if, the type and dimension match, meaning a different function is needed for each array size.
+Array iteration is by a range or array variable.
+\begin{rust}
+for i in @0..ia.len()@ { print!("{:?} ", ia[i] ); }  $\C{// 42 42 42 42 42}$
+for fv in @fa@ { print!("{:?} ", fv ); }  $\C{// 1.2 42.2 5.6 7.3 9.1}$
+for i in 0..=1 { ia[i] = i; }    $\C{// mutable changes}$
+for iv in ia { print!("{:?} ", iv ); }  $\C{// 0 1 42 42 42}$
+\end{rust}
+An array can be assigned and printed as a set.
+\begin{rust}
+ia = @[5; 5]@;   println!( "{:?} {:?}", @ia@, ia.len() );       $\C{// [5, 5, 5, 5, 5] 5}$
+ia = @[1, 2, 3, 4, 5]@;   println!( "{:?} {:?}", @ia@, ia.len() );      $\C{// [1, 2, 3, 4, 5] 5}$
+\end{rust}
+Multi-dimensional arrays use nested dimensions, resulting in columns before rows.
+Here the outer array is a length @ROWS@ array whose elements are @f64@ arrays of length @COLS@ each.
+\begin{cquote}
+\setlength{\tabcolsep}{10pt}
+\begin{tabular}{@{}ll@{}}
+\begin{rust}
+const ROWS: usize = 4;   const COLS: usize = 6;
+let mut fmatrix: [[f64; @COLS@]; @ROWS@] = [[0.; @COLS@]; @ROWS@];
+for r in 0 .. ROWS {
+        for c in 0 .. COLS {  fmatrix[r][c] = r as f64 + c as f64;  }  }
+\end{rust}
+&
+\begin{rust}
+[ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0 ]
+[ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ]
+[ 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 ]
+[ 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 ]
+\end{rust}
+\end{tabular}
+\end{cquote}
+Slices borrow a section of an array (subarray) and have type @&[T]@.
+A slice has a dynamic size and does not implicitly coerce to an array.
+\begin{rust}
+println!( "{:?}", @&ia[0 .. 3]@ ); $\C{// fixed bounds, [1, 2, 3]}$
+println!( "{:?}", @&ia[ia.len() - 2 .. ia.len()@] ); $\C{// variable bounds, [4, 5]}$
+println!( "{:?}", @&ia[ia.len() - 2 .. ]@ ); $\C{// drop upper bound, [4, 5]}$
+println!( "{:?}", @&ia[.. ia.len() - 2 ]@ ); $\C{// drop lower bound, [1, 2, 3]}$
+println!( "{:?}", @&ia[..]@ ); $\C{// drop both bound, [1, 2, 3, 4, 5]}$
+\end{rust}
+A multi-dimensional slice can only borrow subrows because a slice requires contiguous memory.
+Here columns 2--4 are sliced from row 3.
+\begin{rust}
+let mut slice2: &[f64] = &fmatrix[3][@2 ..= 4@];
+println!( "{:?}", slice2 );  $\C{// [5.0, 6.0, 7.0]}$
+\end{rust}
+Here row 2 is sliced from the sub-matrix formed from rows 1--3.
+\begin{rust}
+slice2 = &fmatrix[@1 ..= 3@][1];
+println!( "{:?}", slice2 );  $\C{// [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}$
+\end{rust}
+A slice can be explicitly converted into an array or reference to an array.
+\begin{rust}
+let slice: &[f64];
+slice = &fa[0 ..= 1]; $\C{// create slice, [1.2, 42.2]}$
+let mut fa2: [f64; 2] = @<[f64; 2]>::try_from( slice ).unwrap()@; $\C{// convert slice to array, [1.2, 42.2]}$
+fa2 = @(& fa[2 ..= 3]).try_into().unwrap()@; $\C{// convert slice to array, [5.6, 7.3]}$
+\end{rust}
+The @vec@ macro (using @Vec@ type) provides dynamically-size dynamically-growable arrays of arrays only using the heap (\CC @vector@ type).
+\begin{cquote}
+\setlength{\tabcolsep}{10pt}
+\begin{tabular}{@{}ll@{}}
+\multicolumn{1}{c}{Rust} &\multicolumn{1}{c}{\CC} \\
+\begin{rust}
+let (rows, cols) = (4, 6);
+let mut matrix = vec![vec![0; cols]; rows];
+...     matrix[r][c] = r + c; // fill matrix
+\end{rust}
+&
+\begin{c++}
+int rows = 4, cols = 6;
+vector<vector<int>> matrix( rows, vector<int>(cols) );
+...     matrix[r][c] = r + c; // fill matrix}
+\end{c++}
+\end{tabular}
+\end{cquote}
+A dynamically-size array is sliced the same as a fixed-size one.
+\begin{rust}
+let mut slice3: &[usize] = &matrix[3][2 ..= 4]; $\C{// [5, 6, 7]}$
+slice3 = &matrix[1 ..= 3][1]; $\C{// [2, 3, 4, 5, 6, 7]}$
+\end{rust}
+Finally, to mitigate the restriction of matching array dimensions between argument and parameter types, it is possible for a parameter to be a slice.
+\begin{rust}
+fn zero( arr: @& mut [usize]@ ){
+        for i in 0 .. arr.len() { arr[i] = 0; }
+}
+zero( & mut ia[0..5] ); // arbitrary sized slice
+zero( & mut ia[0..3] );
+\end{rust}
+Or write a \emph{generic} function taking an array of fixed-element type and any size.
+\begin{rust}
+fn aprint<@const N: usize@>( arg: [usize; N] ) {
+        for r in 0 .. arg.len() {   print!( "{} ", arg[r] );   }
+}
+aprint( ia );
+\end{rust}
 \subsection{Java}
 Java arrays are arrays-of-arrays because all objects are references \see{\VRef{toc:mdimpl}}.
+Java arrays are references, so multi-dimension arrays are arrays-of-arrays \see{\VRef{toc:mdimpl}}.
 For each array, Java implicitly storages the array dimension in a descriptor, supporting array length, subscript checking, and allowing dynamically-sized array-parameter declarations.
 \begin{cquote}
 …
 \end{tabular}
 \end{cquote}
 However, in the C prototype, the parameters @n@ and @m@  are documentation only as the intended size of the first and second dimension of @x@.
+However, in the C prototype, the parameters @n@ and @m@ are manually managed, \ie there is no guarantee they match with the argument matrix for parameter @x@.
 \VRef[Figure]{f:JavaVsCTriangularMatrix} compares a triangular matrix (array-of-arrays) in dynamically safe Java to unsafe C.
 Each dynamically sized row in Java stores its dimension, while C requires the programmer to manage these sizes explicitly (@rlnth@).
 …
 \subsection{Levels of Dependently Typed Arrays}
+\CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C.
+Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference.
+These systems, therefore, ask the programmer to convince the type checker that every pointer dereference is valid.
+\CFA imposes the lighter-weight obligation, with the more limited guarantee, that initially-declared bounds are respected thereafter.
+\CFA's array is also the first extension of C to use its tracked bounds to generate the pointer arithmetic implied by advanced allocation patterns.
+Other bound-tracked extensions of C either forbid certain C patterns entirely, or address the problem of \emph{verifying} that the user's provided pointer arithmetic is self-consistent.
+The \CFA array, applied to accordion structures [TOD: cross-reference] \emph{implies} the necessary pointer arithmetic, generated automatically, and not appearing at all in a user's program.
 The \CFA array and the field of ``array language'' comparators all leverage dependent types to improve on the expressiveness over C and Java, accommodating examples such as:
 \begin{itemize}[leftmargin=*]
 …
 Having no instances means there is no type for a variable @i@ that constrains @i@ to be in the range for @N@, unlike Dex, [TODO: verify], but like Futhark.
-\subsection{Static Safety in C Extensions}
-\section{Future Work}
-\subsection{Declaration Syntax}
-\subsection{Range Slicing}
-\subsection{With a Module System}
-\subsection{With Described Enumerations}
-A project in \CFA's current portfolio will improve enumerations.
-In the incumbent state, \CFA has C's enumerations, unmodified.
-I will not discuss the core of this project, which has a tall mission already, to improve type safety, maintain appropriate C compatibility and offer more flexibility about storage use.
-It also has a candidate stretch goal, to adapt \CFA's @forall@ generic system to communicate generalized enumerations:
-\begin{cfa}
-forall( T | is_enum(T) )
-void show_in_context( T val ) {
-        for( T i ) {
-                string decorator = "";
-                if ( i == val-1 ) decorator = "< ready";
-                if ( i == val   ) decorator = "< go"   ;
-                sout | i | decorator;
+        }
+}
-enum weekday { mon, tue, wed = 500, thu, fri };
-show_in_context( wed );
-\end{cfa}
-with output
-\begin{cfa}
-mon
-tue < ready
-wed < go
-thu
-fri
-\end{cfa}
-The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA.
-But the example shows these abilities:
-\begin{itemize}[leftmargin=*]
-\item a built-in way (the @is_enum@ trait) for a generic routine to require enumeration-like information about its instantiating type
-\item an implicit implementation of the trait whenever a user-written enum occurs (@weekday@'s declaration implies @is_enum@)
-\item a total order over the enumeration constants, with predecessor/successor (@val-1@) available, and valid across gaps in values (@tue == 1 && wed == 500 && tue == wed - 1@)
-\item a provision for looping (the @for@ form used) over the values of the type.
-\end{itemize}
 If \CFA gets such a system for describing the list of values in a type, then \CFA arrays are poised to move from the Futhark level of expressiveness, up to the Dex level.
 …
 In the case of adapting this pattern to \CFA, my current work provides an adapter from ``successively subscripted'' to ``subscripted by tuple,'' so it is likely that generalizing my adapter beyond ``subscripted by @ptrdiff_t@'' is sufficient to make a user-provided adapter unnecessary.
+\subsection{Static Safety in C Extensions}
+\section{Future Work}
+\subsection{Array Syntax}
+An important goal is to recast @array(...)@ syntax into C-style @[]@.
+The proposal (which is partially implemented) is an alternate dimension and subscript syntax.
+C multi-dimension and subscripting syntax uses multiple brackets.
+\begin{cfa}
+int m@[2][3]@;  // dimension
+m@[0][1]@ = 3;  // subscript
+\end{cfa}
+Leveraging this syntax, the following (simpler) syntax should be intuitive to C programmers with only a small backwards compatibility issue.
+\begin{cfa}
+int m@[2, 3]@;  // dimension
+m@[0, 1]@ = 3;  // subscript
+\end{cfa}
+However, the new subscript syntax is not backwards compatible, as @0, 1@ is a comma expression.
+Interestingly, disallowed the comma expression in this context eliminates an unreported error: subscripting a matrix with @m[i, j]@ instead of @m[i][j]@, which selects the @j@th row not the @i, j@ element.
+Hence, a comma expression in this context is rare.
+Finally, it is possible to write @m[(i, j)]@ in the new syntax to achieve the equivalent of the old @m[i, j]@.
+Note, the new subscript syntax can easily be internally lowered to @[-][-]...@ and handled as regular subscripting.
+The only ambiguity with C syntax is for a single dimension array, where the syntax for old and new are the same.
+\begin{cfa}
+int m[2@,@];  // single dimension
+m[0] = 3;  // subscript
+\end{cfa}
+The solution for the dimension is to use a terminating comma to denote a new single-dimension array.
+This syntactic form is also used for the (rare) singleton tuple @[y@{\color{red},}@]@.
+The extra comma in the dimension is only mildly annoying, and acts as eye-candy differentiating old and new arrays.
+The subscript operator is not an issue as overloading selects the correct single-dimension operation for old/new array types.
+The ultimately goal is to replace all C arrays with \CFA arrays, establishing a higher level of safety in C programs, and eliminating the need for the terminating comma.
+\subsection{Range Slicing}
+\subsection{With a Module System}
 \subsection{Retire Pointer Arithmetic}
+\section{\CFA}
+\begin{comment}
+\section{\texorpdfstring{\CFA}{Cforall}}
 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \\
 …
 (For later:  That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.)
 \subsection{\CFA Features Interacting with Arrays}
+\subsection{\texorpdfstring{\CFA}{Cforall} Features Interacting with Arrays}
 Prior work on \CFA included making C arrays, as used in C code from the wild,
 …
 TODO: I fixed a bug associated with using an array as a T.  I think.  Did I really?  What was the bug?
+\end{comment}

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset cecb260 for doc/theses

Legend:

doc/theses/mike_brooks_MMath/array.tex

Download in other formats: