Changeset cecb260 for doc/theses


Ignore:
Timestamp:
May 27, 2025, 3:35:06 PM (4 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
8d049ad
Parents:
1a40870
Message:

proofread array chapter

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/array.tex

    r1a40870 rcecb260  
    9191where a characterization of the return value (giving it a precise type, generally dependent upon the parameters)
    9292is a sufficient postcondition.
    93 In an imperative language like C and \CFA, it is also necessary to discuss side effects,
    94 for which an even heavier formalism, like separation logic, is required.
     93In an imperative language like C and \CFA, it is also necessary to discuss side effects, for which an even heavier formalism, like separation logic, is required.
    9594Secondly, TODO: bash Rust.
    9695TODO: cite the crap out of these claims.
     
    297296\section{Dimension Parameter Implementation}
    298297
    299 The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed
    300 to provide the feature set just reviewed (up to bugs in cases not yet exercised).
     298The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed to provide the feature set just reviewed (up to bugs in cases not yet exercised).
    301299To apply this equipment in tracking array lengths, I encoded a dimension (array's length) as a type.
    302300The type in question does not describe any data that the program actually uses at runtime.
    303301It simply carries information through intermediate stages of \CFA-to-C lowering.
    304302And this type takes a form such that, once \emph{it} gets lowered, the result does the right thing.
     303This new dimension type, encoding the array dimension within it, is the first limited \newterm{dependent type}~\cite{DependentType} added to the \CFA type-system and is used at appropriate points during type resolution.
    305304
    306305Furthermore, the @array@ type itself is really ``icing on the cake.''
     
    319318\end{cfa}
    320319These two structure patterns, plus a subscript operator, is all that @array@ provides.
    321 
    322 My main work is letting a programmer define
    323 such a structure (one whose type is parameterized by @[N]@)
    324 and functions that operate on it (these being similarly parameterized).
     320My main work is letting a programmer define such a structure (one whose type is parameterized by @[N]@) and functions that operate on it (these being similarly parameterized).
    325321
    326322The repurposed heavy equipment is
    327323\begin{itemize}[leftmargin=*]
    328324\item
    329         Resolver provided values for a used declaration's type-system variables,
    330         gathered from type information in scope at the usage site.
    331 \item
    332         The box pass, encoding information about type parameters
    333         into ``extra'' regular parameters/arguments on declarations and calls.
    334         Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter,
    335         and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter.
     325        Resolver provided values for a used declaration's type-system variables, gathered from type information in scope at the usage site.
     326\item
     327        The box pass, encoding information about type parameters into ``extra'' regular parameters/arguments on declarations and calls.
     328        Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter, and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter.
    336329\end{itemize}
    337330
     
    816809Dimension hoisting already existed in the \CFA compiler.
    817810But its was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.
    818 For example, when a programmer has already hoisted to perform an optimiation to prelude duplicate code (expression) and/or expression evaluation.
     811For example, when a programmer has already hoisted to perform an optimization to prelude duplicate code (expression) and/or expression evaluation.
    819812In the new implementation, these cases are correct, harmonized with the accept/reject criteria.
    820813
     
    839832\end{enumerate}
    840833
    841 There is some debate as to whether the abstraction ordering goes $\{1, 2\} < 3$, rather than my numerically-ordering.
     834There is some debate as to whether the abstraction point ordering above goes $\{1, 2\} < 3$, rather than my numerically-ordering.
    842835That is, styles 1 and 2 are at the same abstraction level, with 3 offering a limited set of functionality.
    843836I chose to build the \CFA style-1 array upon a style-2 abstraction.
     
    853846The defining requirement of a type-2 system is the ability to slice a column from a column-finest matrix.
    854847The required memory shape of such a slice is fixed, before any discussion of implementation.
    855 The implementation presented here is how the \CFA array library wrangles the C type system, to make it do memory steps that are consistent with this layout.
    856 TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
    857 
    858 The new \CFA standard library @array@ datatype supports richer multidimensional features than C.
     848The implementation presented here is how the \CFA array-library wrangles the C type system, to make it do memory steps that are consistent with this layout while not affecting legacy C programs.
     849% TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
     850
     851The new \CFA standard-library @array@-datatype supports richer multidimensional features than C.
    859852The new array implementation follows C's contiguous approach, \ie @float [r][c]@, with one contiguous object subscripted by coarsely-strided dimensions directly wrapping finely-strided dimensions.
    860853Beyond what C's array type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order.
     
    921914
    922915Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metatdata.
    923 \PAB{Do not understand this sentence: The \lstinline{-[all]} operation is a conversion from a reference to one instantiation to a reference to another instantiation.}
     916The \lstinline{-[all]} operation takes subscripts, \eg @x[2][all]@, @x[all][3]@, \etc, and converts (transposes) from the base reference @x[all]@ to a specific reference of the appropriate form.
    924917The running example's @all@-effective step, stated more concretely, is:
    925918\begin{cquote}
     
    939932\end{figure}
    940933
    941 \VRef[Figure]{fig:subscr-all} shows one element (in the shaded band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
    942 In both cases, value 2 selects from the coarser dimension (rows of @x@),
    943 while the value 3 selects from the finer dimension (columns of @x@).
     934\VRef[Figure]{fig:subscr-all} shows one element (in the white band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
     935In both cases, subscript 2 selects from the coarser dimension (rows of @x@),
     936while subscript 3 selects from the finer dimension (columns of @x@).
    944937The figure illustrates the value of each subexpression, comparing how numeric subscripting proceeds from @x@, \vs from @x[all]@.
    945938Proceeding from @x@ gives the numeric indices as coarse then fine, while proceeding from @x[all]@ gives them fine then coarse.
     
    947940(those that received zero numeric indices so far), are illustrated with vertical steps where a \emph{first} numeric index would select.
    948941
    949 The figure's presentation offers an intuition answering to: What is an atomic element of @x[all]@?
     942The figure's presentation offers an intuition answer to: What is an atomic element of @x[all]@?
    950943From there, @x[all]@ itself is simply a two-dimensional array, in the strict C sense, of these building blocks.
    951944An atom (like the bottommost value, @x[all][3][2]@), is the contained value (in the square box)
     
    963956the locations referenced by the coarse subscript options (rows 0..4) are offset by 3 floats,
    964957compared with where analogous rows appear when the row-level option is presented for @x@.
    965 
    966 For example, in \lstinline{x[all]}, the shaded band touches atoms 2.0, 2.1, 2.2, 2.3, 1.4, 1.5 and 1.6 (left diagonal).
     958For example, in \lstinline{x[all]}, the shaded band and its immediate values to the left touches atoms 2.3, 2.0, 2.1, 2.2, 1.4, 1.5 and 1.6.
    967959But only the atom 2.3 is storing its value there.
    968960The rest are lying about (conflicting) claims on this location, but never exercising these alleged claims.
     
    10131005Construction by @array@ initializes the masquerading-as type information to be equal to the contained-element information.
    10141006Subscripting by @all@ rearranges the order of masquerading-as types to achieve, in general, nontrivial striding.
    1015 Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns there element found there, in unmasked form.
    1016 
    1017 An instantiation of the @arpk@ generic is given by the @array(E_base, N0, N1, ...)@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array(E_base, N1, ...)@.
    1018 In the base case, @array(E_base)@ is just @E_base@.
     1007Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns the element found there, in unmasked form.
     1008
     1009An instantiation of the @arpk@ generic is given by the @array( E_base, N0, N1, ... )@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array( E_base, N1, ... )@.
     1010In the base case, @array( E_base )@ is just @E_base@.
    10191011Because this construction uses the same value for the generic parameters @S@ and @E_im@, the resulting layout has trivial strides.
    10201012
     
    10311023
    10321024\CFA array subscripting is protected with runtime bound checks.
    1033 Having dependent typing causes the optimizer to remove more of these bound checks than it would without them.
     1025The array dependent-typing provides information to the C optimizer allowing it remove many of the bound checks.
    10341026This section provides a demonstration of the effect.
    10351027
     
    10521044\section{Array Lifecycle}
    10531045
    1054 An array is an aggregate, like a structure;
    1055 both are containers wrapping subordinate objects.
    1056 Any arbitrary object type, like @string@, can be an array element or structure member.
     1046An array, like a structure, wraps subordinate objects.
     1047Any object type, like @string@, can be an array element or structure member.
    10571048A consequence is that the lifetime of the container must match with its subordinate objects: all elements and members must be initialized/uninitialized implicitly as part of the container's allocation/deallocation.
    10581049Modern programming languages implicitly perform these operations via a type's constructor and destructor.
    10591050Therefore, \CFA must assure that an array's subordinate objects' lifetime operations are called.
    1060 
    10611051Preexisting \CFA mechanisms achieve this requirement, but with poor performance.
    10621052Furthermore, advanced array users need an exception to the basic mechanism, which does not occur with other aggregates.
    10631053Hence, arrays introduce subtleties in supporting an element's lifecycle.
    10641054
    1065 The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (@otype@) pseudo-trait.
    1066 A type is an @otype@, if it provides a default (parameterless) constructor, copy constructor, assignment operator, and destructor (like \CC).
    1067 When declaring a structure with @otype@ members, the compiler implicitly generates implementations of the four @otype@ functions for the outer structure.
    1068 Then the generated default constructor for the outer structure calls the default constructor for each member, and the other @otype@ functions work similarly.
     1055The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (otype) pseudo-trait.
     1056A type is an otype, if it provides a default (parameterless) constructor, copy constructor, assignment operator, and destructor (like \CC).
     1057For a structure with otype members, the compiler implicitly generates implementations of the four otype functions for the outer structure.
     1058Then the generated default constructor for the outer structure calls the default constructor for each member, and the other otype functions work similarly.
    10691059For a member that is a C array, these calls occur in a loop for each array element, which even works for VLAs.
    1070 This logic works the same, whether the member is a concrete type (that happens to be an @otype@) or if the member is a polymorphic type asserted to be an @otype@ (which is implicit in the syntax, @forall(T)@).
     1060This logic works the same, whether the member is a concrete type (that happens to be an otype) or if the member is a polymorphic type asserted to be an otype (which is implicit in the syntax, @forall(T)@).
    10711061The \CFA array has the simplified form (similar to one seen before):
    10721062\begin{cfa}
     
    10771067};
    10781068\end{cfa}
    1079 Being a structure with a C-array member, using the otype-form declaration for @T@ causes @array5(float)@ to implement @otype@ too.
    1080 
    1081 But this @otype@-recursion pattern has a performance issue.
    1082 For example, in a cube of @float@:
     1069Being a structure with a C-array member, the otype-form declaration @T@ causes @array5( float )@ to be an otype too.
     1070But this otype-recursion pattern has a performance issue.
     1071\VRef[Figure]{f:OtypeRecursionBlowup} shows the steps to find lifecycle functions, under the otype-recursion pattern, for a cube of @float@:
    10831072\begin{cfa}
    10841073array5( array5( array5( float ) ) )
    10851074\end{cfa}
    1086 the first few steps of the compiler's work to find the lifecycle functions, under the @otype@-recursion pattern, are shown in \VRef[Figure]{f:OtypeRecursionBlowup}.
    1087 All the work needed for the full @float@-cube would have 256 leaves.
     1075The work needed for the full @float@-cube is 256 leaves.
     1076Then the otype-recursion pattern generates helper functions and thunks that are exponential in the number of dimensions.
     1077Anecdotal experience is annoyingly slow compile time at three dimensions and unusable at four.
    10881078
    10891079%array5(T) offers
     
    11111101\begin{figure}
    11121102\centering
    1113 \setlength{\tabcolsep}{15pt}
     1103\setlength{\tabcolsep}{20pt}
    11141104\begin{tabular}{@{}lll@{}}
    11151105\begin{cfa}[deletekeywords={default}]
     
    111711071 default ctor
    111811082 copy ctor
    1119 3 asgt
     11093 assignment
    112011104 dtor
    11211111
     
    11461136&
    11471137\begin{cfa}[deletekeywords={default}]
    1148 array5(float) has
    1149 1 default ctor, using float's
     1138array5( float ) has
     11391 default ctor, using float$'$s
    11501140        1 default ctor
    11511141        2 copy ctor
    1152         3 asgt
     1142        3 assignment
    11531143        4 dtor
    1154 2 copy ctor, using float's
     11442 copy ctor, using float$'$s
    11551145        1 default ctor
    11561146        2 copy ctor
    1157         3 asgt
     1147        3 assignment
    11581148        4 dtor
    1159 3 asgt, using float's
     11493 assignment, using float$'$s
    11601150        1 default ctor
    11611151        2 copy ctor
    1162         3 asgt
     1152        3 assignment
    11631153        4 dtor
    1164 4 dtor, using float's
     11544 dtor, using float$'$s
    11651155        1 default ctor
    11661156        2 copy ctor
    1167         3 asgt
     1157        3 assignment
    11681158        4 dtor
    11691159
     
    11781168&
    11791169\begin{cfa}[deletekeywords={default}]
    1180 array5(array5(float)) has
    1181 1 default ctor, using array5(float)'s
    1182         1 default ctor, using float's
     1170array5( array5( float ) ) has
     11711 default ctor, using array5( float )
     1172        1 default ctor, using float$'$s
    11831173                1 default ctor
    11841174                2 copy ctor
    1185                 3 asgt
     1175                3 assignment
    11861176                4 dtor
    1187         2 copy ctor, using float's
     1177        2 copy ctor, using float$'$s
    11881178                1 default ctor
    11891179                2 copy ctor
    1190                 3 asgt
     1180                3 assignment
    11911181                4 dtor
    1192         3 asgt, using float's
     1182        3 assignment, using float$'$s
    11931183                1 default ctor
    11941184                2 copy ctor
    1195                 3 asgt
     1185                3 assignment
    11961186                4 dtor
    1197         4 dtor, using float's
     1187        4 dtor, using float$'$s
    11981188                1 default ctor
    11991189                2 copy ctor
    1200                 3 asgt
     1190                3 assignment
    12011191                4 dtor
    1202 2 copy ctor, using array5(float)'s
     11922 copy ctor, using array5( float )
    12031193        ... 4 children, 16 leaves
    1204 3 asgt, using array5(float)'s
     11943 assignment, using array5( float )
    12051195        ... 4 children, 16 leaves
    1206 4 dtor, using array5(float)'s
     11964 dtor, using array5( float )
    12071197        ... 4 children, 16 leaves
    12081198(64 leaves)
     
    12101200\end{tabular}
    12111201\caption{Exponential thunk generation under the otype-recursion pattern.
    1212         Each time that one type's function (\eg ctor) uses another type's function, the \CFA compiler generates a thunk, to capture the used function's dependencies, presented according to the using function's need.
     1202        Each time one type's function (\eg ctor) uses another type's function, the \CFA compiler generates a thunk, to capture the used function's dependencies, presented according to the using function's need.
    12131203        So, each non-leaf line represents a generated thunk and every line represents a search request for the resolver to find a satisfying function.}
    12141204\label{f:OtypeRecursionBlowup}
    12151205\end{figure}
    12161206
    1217 So the @otype@-recursion pattern seeks a quantity of helper functions, and generates a quantity of thunks, that are exponential in the number of dimensions.
    1218 Anecdotal experience with this solution found the resulting compile times annoyingly slow at three dimensions, and unusable at four.
    1219 
    12201207The issue is that the otype-recursion pattern uses more assertions than needed.
    1221 Consider how @array5(float)@'s default constructor is getting all four lifecycle assertions about the element type, @float@.
    1222 It only needs @float@'s default constructor;
    1223 the full set of operations is never used.
     1208For example, @array5( float )@'s default constructor has all four lifecycle assertions about the element type, @float@.
     1209However, it only needs @float@'s default constructor, as the other operations are never used.
    12241210Current work by the \CFA team aims to improve this situation.
    1225 Therefore, a workaround is needed for now.
    1226 
    1227 The workaround is to provide a default constructor, copy constructor and destructor, defined with lean, bespoke assertions:
     1211Therefore, I had to construct a workaround.
     1212
     1213My workaround moves from otype (value) to dtype (pointer) with a default-constructor assertion, where dtype does not generate any constructors but the assertion claws back the default otype constructor.
    12281214\begin{cquote}
    1229 \begin{tabular}{@{}l@{\hspace{0.5in}}l@{}}
     1215\setlength{\tabcolsep}{10pt}
     1216\begin{tabular}{@{}ll@{}}
    12301217\begin{cfa}
    12311218// autogenerated for otype-recursion:
    12321219forall( T )
    1233 void ?{}( array5(T) & this ) {
    1234         for (i; 5) { ( this[i] ){}; }
     1220void ?{}( array5( T ) & this ) {
     1221        for ( i; 5 ) { ( this[i] ){}; }
    12351222}
    12361223forall( T )
    1237 void ?{}( array5(T) & this, array5(T) & src ) {
    1238         for (i; 5) { ( this[i] ){ src[i] }; }
     1224void ?{}( array5( T ) & this, array5( T ) & src ) {
     1225        for ( i; 5 ) { ( this[i] ){ src[i] }; }
    12391226}
    12401227forall( T )
    1241 void ^?{}( array5(T) & this ) {
    1242         for (i; 5) { ^( this[i] ){}; }
     1228void ^?{}( array5( T ) & this ) {
     1229        for ( i; 5 ) { ^( this[i] ){}; }
    12431230}
    12441231\end{cfa}
     
    12471234// lean, bespoke:
    12481235forall( T* | { void @?{}( T & )@; } )
    1249 void ?{}( array5(T) & this ) {
    1250         for (i; 5) { ( this[i] ){}; }
     1236void ?{}( array5( T ) & this ) {
     1237        for ( i; 5 ) { ( this[i] ){}; }
    12511238}
    12521239forall( T* | { void @?{}( T &, T )@; } )
    1253 void ?{}( array5(T) & this, array5(T) & src ) {
    1254         for (i; 5) { ( this[i] ){ src[i] }; }
     1240void ?{}( array5( T ) & this, array5( T ) & src ) {
     1241        for ( i; 5 ) { ( this[i] ){ src[i] }; }
    12551242}
    12561243forall( T* | { void @?{}( T & )@; } )
     
    12621249\end{cquote}
    12631250Moreover, the assignment operator is skipped, to avoid hitting a lingering growth case.
    1264 Skipping assignment is tolerable because array assignment is not a common operation.
    1265 With this solution, the critical lifecycle functions are available, with no growth in thunk creation.
     1251Temporarily skipping assignment is tolerable because array assignment is not a common operation.
     1252With this solution, the critical lifecycle functions are available, with linear growth in thunk creation for the default constructor.
    12661253
    12671254Finally, the intuition that a programmer using an array always wants the elements' default constructor called \emph{automatically} is simplistic.
     
    12731260void ?{}( worker &, int id );
    12741261array( worker, 5 ) ws = @{}@; // rejected; but desire is for no initialization yet
    1275 for (i; 5) (ws[i]){ @i@ }; // explicitly initialize each thread, giving id
     1262for ( i; 5 ) (ws[i]){ @i@ }; // explicitly initialize each thread, giving id
    12761263\end{cfa}
    12771264Note the use of the \CFA explicit constructor call, analogous to \CC's placement-@new@.
     
    12831270Therefore, there is a conflict between the implicit actions of the builtin @thread@ type and a user's desire to defer these actions.
    12841271
    1285 Another \CFA feature may, at first, seem viable for initializing the array @ws@, though on closer inspection, it is not.
    1286 C initialization, \lstinline|array(worker, 5) ws @= {};|, ignores all \CFA lifecycle management and uses C empty initialization.
     1272Another \CFA feature for providing C backwards compatibility, at first seem viable for initializing the array @ws@, though on closer inspection is not.
     1273C initialization without a constructor is specified with \lstinline|@=|, \eg \lstinline|array(worker, 5) ws @= {}| ignores all \CFA lifecycle management and uses C empty initialization.
    12871274This option does achieve the desired semantics on the construction side.
    12881275But on destruction side, the desired semantics is for implicit destructor calls to continue, to keep the join operation tied to lexical scope.
     
    12951282\end{cfa}
    12961283Both cause the @ws@-construction-time implicit call chain to stop before reaching a @worker@ constructor, while leaving the implicit destruction calls intact.
    1297 Two forms are available, to parallel the development of this feature in \uCpp.
     1284Two forms are available, to parallel the development of this feature in \uCpp~\cite{uC++}.
    12981285Originally \uCpp offered only the @ws1@ form, using the class-template @uNoCtor@ equivalent to \CFA's @uninit@.
    12991286More recently, \uCpp was extended with the declaration macro, @uArray@, with usage similar to the @ws2@ case.
    1300 Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it might be possible to remove the first option.
     1287Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it should be possible to remove the first option.
    13011288
    13021289% note to Mike, I have more fragments on some details available in push24\fragments\uNoCtor.txt
    13031290
    1304 \section{Comparison with Other Arrays}
     1291\section{Array Comparison}
    13051292
    13061293
    13071294\subsection{Rust}
    13081295
    1309 \CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C.
    1310 Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference.
    1311 These systems, therefore, ask the programmer to convince the type checker that every pointer dereference is valid.
    1312 \CFA imposes the lighter-weight obligation, with the more limited guarantee, that initially-declared bounds are respected thereafter.
    1313 
    1314 \CFA's array is also the first extension of C to use its tracked bounds to generate the pointer arithmetic implied by advanced allocation patterns.
    1315 Other bound-tracked extensions of C either forbid certain C patterns entirely, or address the problem of \emph{verifying} that the user's provided pointer arithmetic is self-consistent.
    1316 The \CFA array, applied to accordion structures [TOD: cross-reference] \emph{implies} the necessary pointer arithmetic, generated automatically, and not appearing at all in a user's program.
     1296A Rust array is a set of mutable or immutable (@const@) contiguous objects of the same type @T@, where the compile-time dimension(s) is part of the type signature @[T; dimension]@.
     1297\begin{rust}
     1298let val = 42;
     1299let mut ia: [usize; 5] = [val; 5];      $\C{// mutable, repeated initialization [42, 42, 42, 42, 42]}$
     1300let fval = 42.2;
     1301let fa: [f64; 5] = [1.2, fval, 5.6, 7.3, 9.1];  $\C{// immutable, individual initialization}$
     1302\end{rust}
     1303Initialization is required even if the  array is subsequently initialized.
     1304Rust arrays are not VLAs, but the compile-time length can be queried using member @len()@.
     1305Arrays can be assigned and passed to parameters by value or reference, if and only if, the type and dimension match, meaning a different function is needed for each array size.
     1306
     1307Array iteration is by a range or array variable.
     1308\begin{rust}
     1309for i in @0..ia.len()@ { print!("{:?} ", ia[i] ); }  $\C{// 42 42 42 42 42}$
     1310for fv in @fa@ { print!("{:?} ", fv ); }  $\C{// 1.2 42.2 5.6 7.3 9.1}$
     1311for i in 0..=1 { ia[i] = i; }    $\C{// mutable changes}$
     1312for iv in ia { print!("{:?} ", iv ); }  $\C{// 0 1 42 42 42}$
     1313\end{rust}
     1314An array can be assigned and printed as a set.
     1315\begin{rust}
     1316ia = @[5; 5]@;   println!( "{:?} {:?}", @ia@, ia.len() );       $\C{// [5, 5, 5, 5, 5] 5}$
     1317ia = @[1, 2, 3, 4, 5]@;   println!( "{:?} {:?}", @ia@, ia.len() );      $\C{// [1, 2, 3, 4, 5] 5}$
     1318\end{rust}
     1319
     1320Multi-dimensional arrays use nested dimensions, resulting in columns before rows.
     1321Here the outer array is a length @ROWS@ array whose elements are @f64@ arrays of length @COLS@ each.
     1322\begin{cquote}
     1323\setlength{\tabcolsep}{10pt}
     1324\begin{tabular}{@{}ll@{}}
     1325\begin{rust}
     1326const ROWS: usize = 4;   const COLS: usize = 6;
     1327let mut fmatrix: [[f64; @COLS@]; @ROWS@] = [[0.; @COLS@]; @ROWS@];
     1328for r in 0 .. ROWS {
     1329        for c in 0 .. COLS {  fmatrix[r][c] = r as f64 + c as f64;  }  }
     1330\end{rust}
     1331&
     1332\begin{rust}
     1333[ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0 ]
     1334[ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ]
     1335[ 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 ]
     1336[ 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 ]
     1337\end{rust}
     1338\end{tabular}
     1339\end{cquote}
     1340
     1341Slices borrow a section of an array (subarray) and have type @&[T]@.
     1342A slice has a dynamic size and does not implicitly coerce to an array.
     1343\begin{rust}
     1344println!( "{:?}", @&ia[0 .. 3]@ ); $\C{// fixed bounds, [1, 2, 3]}$
     1345println!( "{:?}", @&ia[ia.len() - 2 .. ia.len()@] ); $\C{// variable bounds, [4, 5]}$
     1346println!( "{:?}", @&ia[ia.len() - 2 .. ]@ ); $\C{// drop upper bound, [4, 5]}$
     1347println!( "{:?}", @&ia[.. ia.len() - 2 ]@ ); $\C{// drop lower bound, [1, 2, 3]}$
     1348println!( "{:?}", @&ia[..]@ ); $\C{// drop both bound, [1, 2, 3, 4, 5]}$
     1349\end{rust}
     1350A multi-dimensional slice can only borrow subrows because a slice requires contiguous memory.
     1351Here columns 2--4 are sliced from row 3.
     1352\begin{rust}
     1353let mut slice2: &[f64] = &fmatrix[3][@2 ..= 4@];
     1354println!( "{:?}", slice2 );  $\C{// [5.0, 6.0, 7.0]}$
     1355\end{rust}
     1356Here row 2 is sliced from the sub-matrix formed from rows 1--3.
     1357\begin{rust}
     1358slice2 = &fmatrix[@1 ..= 3@][1];
     1359println!( "{:?}", slice2 );  $\C{// [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}$
     1360\end{rust}
     1361A slice can be explicitly converted into an array or reference to an array.
     1362\begin{rust}
     1363let slice: &[f64];
     1364slice = &fa[0 ..= 1]; $\C{// create slice, [1.2, 42.2]}$
     1365let mut fa2: [f64; 2] = @<[f64; 2]>::try_from( slice ).unwrap()@; $\C{// convert slice to array, [1.2, 42.2]}$
     1366fa2 = @(& fa[2 ..= 3]).try_into().unwrap()@; $\C{// convert slice to array, [5.6, 7.3]}$
     1367\end{rust}
     1368
     1369The @vec@ macro (using @Vec@ type) provides dynamically-size dynamically-growable arrays of arrays only using the heap (\CC @vector@ type).
     1370\begin{cquote}
     1371\setlength{\tabcolsep}{10pt}
     1372\begin{tabular}{@{}ll@{}}
     1373\multicolumn{1}{c}{Rust} &\multicolumn{1}{c}{\CC} \\
     1374\begin{rust}
     1375let (rows, cols) = (4, 6);
     1376let mut matrix = vec![vec![0; cols]; rows];
     1377...     matrix[r][c] = r + c; // fill matrix
     1378\end{rust}
     1379&
     1380\begin{c++}
     1381int rows = 4, cols = 6;
     1382vector<vector<int>> matrix( rows, vector<int>(cols) );
     1383...     matrix[r][c] = r + c; // fill matrix}
     1384\end{c++}
     1385\end{tabular}
     1386\end{cquote}
     1387A dynamically-size array is sliced the same as a fixed-size one.
     1388\begin{rust}
     1389let mut slice3: &[usize] = &matrix[3][2 ..= 4]; $\C{// [5, 6, 7]}$
     1390slice3 = &matrix[1 ..= 3][1]; $\C{// [2, 3, 4, 5, 6, 7]}$
     1391\end{rust}
     1392Finally, to mitigate the restriction of matching array dimensions between argument and parameter types, it is possible for a parameter to be a slice.
     1393\begin{rust}
     1394fn zero( arr: @& mut [usize]@ ){
     1395        for i in 0 .. arr.len() { arr[i] = 0; }
     1396}
     1397zero( & mut ia[0..5] ); // arbitrary sized slice
     1398zero( & mut ia[0..3] );
     1399\end{rust}
     1400Or write a \emph{generic} function taking an array of fixed-element type and any size.
     1401\begin{rust}
     1402fn aprint<@const N: usize@>( arg: [usize; N] ) {
     1403        for r in 0 .. arg.len() {   print!( "{} ", arg[r] );   }
     1404}
     1405aprint( ia );
     1406\end{rust}
    13171407
    13181408
    13191409\subsection{Java}
    13201410
    1321 Java arrays are arrays-of-arrays because all objects are references \see{\VRef{toc:mdimpl}}.
     1411Java arrays are references, so multi-dimension arrays are arrays-of-arrays \see{\VRef{toc:mdimpl}}.
    13221412For each array, Java implicitly storages the array dimension in a descriptor, supporting array length, subscript checking, and allowing dynamically-sized array-parameter declarations.
    13231413\begin{cquote}
     
    13271417\end{tabular}
    13281418\end{cquote}
    1329 However, in the C prototype, the parameters @n@ and @m@  are documentation only as the intended size of the first and second dimension of @x@.
     1419However, in the C prototype, the parameters @n@ and @m@ are manually managed, \ie there is no guarantee they match with the argument matrix for parameter @x@.
    13301420\VRef[Figure]{f:JavaVsCTriangularMatrix} compares a triangular matrix (array-of-arrays) in dynamically safe Java to unsafe C.
    13311421Each dynamically sized row in Java stores its dimension, while C requires the programmer to manage these sizes explicitly (@rlnth@).
     
    14201510\subsection{Levels of Dependently Typed Arrays}
    14211511
     1512\CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C.
     1513Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference.
     1514These systems, therefore, ask the programmer to convince the type checker that every pointer dereference is valid.
     1515\CFA imposes the lighter-weight obligation, with the more limited guarantee, that initially-declared bounds are respected thereafter.
     1516
     1517\CFA's array is also the first extension of C to use its tracked bounds to generate the pointer arithmetic implied by advanced allocation patterns.
     1518Other bound-tracked extensions of C either forbid certain C patterns entirely, or address the problem of \emph{verifying} that the user's provided pointer arithmetic is self-consistent.
     1519The \CFA array, applied to accordion structures [TOD: cross-reference] \emph{implies} the necessary pointer arithmetic, generated automatically, and not appearing at all in a user's program.
     1520
    14221521The \CFA array and the field of ``array language'' comparators all leverage dependent types to improve on the expressiveness over C and Java, accommodating examples such as:
    14231522\begin{itemize}[leftmargin=*]
     
    14651564Having no instances means there is no type for a variable @i@ that constrains @i@ to be in the range for @N@, unlike Dex, [TODO: verify], but like Futhark.
    14661565
    1467 \subsection{Static Safety in C Extensions}
    1468 
    1469 
    1470 \section{Future Work}
    1471 
    1472 \subsection{Declaration Syntax}
    1473 
    1474 \subsection{Range Slicing}
    1475 
    1476 \subsection{With a Module System}
    1477 
    1478 \subsection{With Described Enumerations}
    1479 
    1480 A project in \CFA's current portfolio will improve enumerations.
    1481 In the incumbent state, \CFA has C's enumerations, unmodified.
    1482 I will not discuss the core of this project, which has a tall mission already, to improve type safety, maintain appropriate C compatibility and offer more flexibility about storage use.
    1483 It also has a candidate stretch goal, to adapt \CFA's @forall@ generic system to communicate generalized enumerations:
    1484 \begin{cfa}
    1485 forall( T | is_enum(T) )
    1486 void show_in_context( T val ) {
    1487         for( T i ) {
    1488                 string decorator = "";
    1489                 if ( i == val-1 ) decorator = "< ready";
    1490                 if ( i == val   ) decorator = "< go"   ;
    1491                 sout | i | decorator;
    1492         }
    1493 }
    1494 enum weekday { mon, tue, wed = 500, thu, fri };
    1495 show_in_context( wed );
    1496 \end{cfa}
    1497 with output
    1498 \begin{cfa}
    1499 mon
    1500 tue < ready
    1501 wed < go
    1502 thu
    1503 fri
    1504 \end{cfa}
    1505 The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA.
    1506 But the example shows these abilities:
    1507 \begin{itemize}[leftmargin=*]
    1508 \item a built-in way (the @is_enum@ trait) for a generic routine to require enumeration-like information about its instantiating type
    1509 \item an implicit implementation of the trait whenever a user-written enum occurs (@weekday@'s declaration implies @is_enum@)
    1510 \item a total order over the enumeration constants, with predecessor/successor (@val-1@) available, and valid across gaps in values (@tue == 1 && wed == 500 && tue == wed - 1@)
    1511 \item a provision for looping (the @for@ form used) over the values of the type.
    1512 \end{itemize}
    1513 
    15141566If \CFA gets such a system for describing the list of values in a type, then \CFA arrays are poised to move from the Futhark level of expressiveness, up to the Dex level.
    15151567
     
    15631615In the case of adapting this pattern to \CFA, my current work provides an adapter from ``successively subscripted'' to ``subscripted by tuple,'' so it is likely that generalizing my adapter beyond ``subscripted by @ptrdiff_t@'' is sufficient to make a user-provided adapter unnecessary.
    15641616
     1617\subsection{Static Safety in C Extensions}
     1618
     1619
     1620\section{Future Work}
     1621
     1622\subsection{Array Syntax}
     1623
     1624An important goal is to recast @array(...)@ syntax into C-style @[]@.
     1625The proposal (which is partially implemented) is an alternate dimension and subscript syntax.
     1626C multi-dimension and subscripting syntax uses multiple brackets.
     1627\begin{cfa}
     1628int m@[2][3]@;  // dimension
     1629m@[0][1]@ = 3;  // subscript
     1630\end{cfa}
     1631Leveraging this syntax, the following (simpler) syntax should be intuitive to C programmers with only a small backwards compatibility issue.
     1632\begin{cfa}
     1633int m@[2, 3]@;  // dimension
     1634m@[0, 1]@ = 3;  // subscript
     1635\end{cfa}
     1636However, the new subscript syntax is not backwards compatible, as @0, 1@ is a comma expression.
     1637Interestingly, disallowed the comma expression in this context eliminates an unreported error: subscripting a matrix with @m[i, j]@ instead of @m[i][j]@, which selects the @j@th row not the @i, j@ element.
     1638Hence, a comma expression in this context is rare.
     1639Finally, it is possible to write @m[(i, j)]@ in the new syntax to achieve the equivalent of the old @m[i, j]@.
     1640Note, the new subscript syntax can easily be internally lowered to @[-][-]...@ and handled as regular subscripting.
     1641The only ambiguity with C syntax is for a single dimension array, where the syntax for old and new are the same.
     1642\begin{cfa}
     1643int m[2@,@];  // single dimension
     1644m[0] = 3;  // subscript
     1645\end{cfa}
     1646The solution for the dimension is to use a terminating comma to denote a new single-dimension array.
     1647This syntactic form is also used for the (rare) singleton tuple @[y@{\color{red},}@]@.
     1648The extra comma in the dimension is only mildly annoying, and acts as eye-candy differentiating old and new arrays.
     1649The subscript operator is not an issue as overloading selects the correct single-dimension operation for old/new array types.
     1650The ultimately goal is to replace all C arrays with \CFA arrays, establishing a higher level of safety in C programs, and eliminating the need for the terminating comma.
     1651
     1652
     1653\subsection{Range Slicing}
     1654
     1655\subsection{With a Module System}
     1656
     1657
    15651658\subsection{Retire Pointer Arithmetic}
    15661659
    15671660
    1568 \section{\CFA}
     1661\begin{comment}
     1662\section{\texorpdfstring{\CFA}{Cforall}}
    15691663
    15701664XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \\
     
    15751669(For later:  That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.)
    15761670
    1577 \subsection{\CFA Features Interacting with Arrays}
     1671\subsection{\texorpdfstring{\CFA}{Cforall} Features Interacting with Arrays}
    15781672
    15791673Prior work on \CFA included making C arrays, as used in C code from the wild,
     
    16291723
    16301724TODO: I fixed a bug associated with using an array as a T.  I think.  Did I really?  What was the bug?
     1725\end{comment}
Note: See TracChangeset for help on using the changeset viewer.