Changeset cecb260 for doc/theses
- Timestamp: May 27, 2025, 3:35:06 PM
- Branches: master
- Children: 8d049ad
- Parents: 1a40870
- File: 1 edited
doc/theses/mike_brooks_MMath/array.tex
r1a40870 rcecb260 91 91 where a characterization of the return value (giving it a precise type, generally dependent upon the parameters) 92 92 is a sufficient postcondition. 93 In an imperative language like C and \CFA, it is also necessary to discuss side effects, 94 for which an even heavier formalism, like separation logic, is required. 93 In an imperative language like C and \CFA, it is also necessary to discuss side effects, for which an even heavier formalism, like separation logic, is required. 95 94 Secondly, TODO: bash Rust. 96 95 TODO: cite the crap out of these claims. … … 297 296 \section{Dimension Parameter Implementation} 298 297 299 The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed 300 to provide the feature set just reviewed (up to bugs in cases not yet exercised). 298 The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed to provide the feature set just reviewed (up to bugs in cases not yet exercised). 301 299 To apply this equipment in tracking array lengths, I encoded a dimension (array's length) as a type. 302 300 The type in question does not describe any data that the program actually uses at runtime. 303 301 It simply carries information through intermediate stages of \CFA-to-C lowering. 304 302 And this type takes a form such that, once \emph{it} gets lowered, the result does the right thing. 303 This new dimension type, encoding the array dimension within it, is the first limited \newterm{dependent type}~\cite{DependentType} added to the \CFA type-system and is used at appropriate points during type resolution. 305 304 306 305 Furthermore, the @array@ type itself is really ``icing on the cake.'' … … 319 318 \end{cfa} 320 319 These two structure patterns, plus a subscript operator, is all that @array@ provides. 321 322 My main work is letting a programmer define 323 such a structure (one whose type is parameterized by @[N]@) 324 and functions that operate on it (these being similarly parameterized). 320 My main work is letting a programmer define such a structure (one whose type is parameterized by @[N]@) and functions that operate on it (these being similarly parameterized). 325 321 326 322 The repurposed heavy equipment is 327 323 \begin{itemize}[leftmargin=*] 328 324 \item 329 Resolver provided values for a used declaration's type-system variables, 330 gathered from type information in scope at the usage site. 331 \item 332 The box pass, encoding information about type parameters 333 into ``extra'' regular parameters/arguments on declarations and calls. 334 Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter, 335 and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter. 325 Resolver provided values for a used declaration's type-system variables, gathered from type information in scope at the usage site. 326 \item 327 The box pass, encoding information about type parameters into ``extra'' regular parameters/arguments on declarations and calls. 328 Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter, and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter. 336 329 \end{itemize} 337 330 … … 816 809 Dimension hoisting already existed in the \CFA compiler. 817 810 But its was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases. 
818 For example, when a programmer has already hoisted to perform an optimi ation to prelude duplicate code (expression) and/or expression evaluation.811 For example, when a programmer has already hoisted to perform an optimization to prelude duplicate code (expression) and/or expression evaluation. 819 812 In the new implementation, these cases are correct, harmonized with the accept/reject criteria. 820 813 … … 839 832 \end{enumerate} 840 833 841 There is some debate as to whether the abstraction orderinggoes $\{1, 2\} < 3$, rather than my numerically-ordering.834 There is some debate as to whether the abstraction point ordering above goes $\{1, 2\} < 3$, rather than my numerically-ordering. 842 835 That is, styles 1 and 2 are at the same abstraction level, with 3 offering a limited set of functionality. 843 836 I chose to build the \CFA style-1 array upon a style-2 abstraction. … … 853 846 The defining requirement of a type-2 system is the ability to slice a column from a column-finest matrix. 854 847 The required memory shape of such a slice is fixed, before any discussion of implementation. 855 The implementation presented here is how the \CFA array library wrangles the C type system, to make it do memory steps that are consistent with this layout.856 TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?857 858 The new \CFA standard library @array@datatype supports richer multidimensional features than C.848 The implementation presented here is how the \CFA array-library wrangles the C type system, to make it do memory steps that are consistent with this layout while not affecting legacy C programs. 849 % TODO: do I have/need a presentation of just this layout, just the semantics of -[all]? 850 851 The new \CFA standard-library @array@-datatype supports richer multidimensional features than C. 859 852 The new array implementation follows C's contiguous approach, \ie @float [r][c]@, with one contiguous object subscripted by coarsely-strided dimensions directly wrapping finely-strided dimensions. 860 853 Beyond what C's array type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order. … … 921 914 922 915 Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metatdata. 923 \PAB{Do not understand this sentence: The \lstinline{-[all]} operation is a conversion from a reference to one instantiation to a reference to another instantiation.} 916 The \lstinline{-[all]} operation takes subscripts, \eg @x[2][all]@, @x[all][3]@, \etc, and converts (transposes) from the base reference @x[all]@ to a specific reference of the appropriate form. 924 917 The running example's @all@-effective step, stated more concretely, is: 925 918 \begin{cquote} … … 939 932 \end{figure} 940 933 941 \VRef[Figure]{fig:subscr-all} shows one element (in the shadedband) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.942 In both cases, value2 selects from the coarser dimension (rows of @x@),943 while the value3 selects from the finer dimension (columns of @x@).934 \VRef[Figure]{fig:subscr-all} shows one element (in the white band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@. 935 In both cases, subscript 2 selects from the coarser dimension (rows of @x@), 936 while subscript 3 selects from the finer dimension (columns of @x@). 
944 937 The figure illustrates the value of each subexpression, comparing how numeric subscripting proceeds from @x@, \vs from @x[all]@. 945 938 Proceeding from @x@ gives the numeric indices as coarse then fine, while proceeding from @x[all]@ gives them fine then coarse. … … 947 940 (those that received zero numeric indices so far), are illustrated with vertical steps where a \emph{first} numeric index would select. 948 941 949 The figure's presentation offers an intuition answer ingto: What is an atomic element of @x[all]@?942 The figure's presentation offers an intuition answer to: What is an atomic element of @x[all]@? 950 943 From there, @x[all]@ itself is simply a two-dimensional array, in the strict C sense, of these building blocks. 951 944 An atom (like the bottommost value, @x[all][3][2]@), is the contained value (in the square box) … … 963 956 the locations referenced by the coarse subscript options (rows 0..4) are offset by 3 floats, 964 957 compared with where analogous rows appear when the row-level option is presented for @x@. 965 966 For example, in \lstinline{x[all]}, the shaded band touches atoms 2.0, 2.1, 2.2, 2.3, 1.4, 1.5 and 1.6 (left diagonal). 958 For example, in \lstinline{x[all]}, the shaded band and its immediate values to the left touches atoms 2.3, 2.0, 2.1, 2.2, 1.4, 1.5 and 1.6. 967 959 But only the atom 2.3 is storing its value there. 968 960 The rest are lying about (conflicting) claims on this location, but never exercising these alleged claims. … … 1013 1005 Construction by @array@ initializes the masquerading-as type information to be equal to the contained-element information. 1014 1006 Subscripting by @all@ rearranges the order of masquerading-as types to achieve, in general, nontrivial striding. 1015 Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns the reelement found there, in unmasked form.1016 1017 An instantiation of the @arpk@ generic is given by the @array( E_base, N0, N1, ...)@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array(E_base, N1, ...)@.1018 In the base case, @array( E_base)@ is just @E_base@.1007 Subscripting by a number consumes the masquerading-as size of the contained element type, does normal array stepping according to that size, and returns the element found there, in unmasked form. 1008 1009 An instantiation of the @arpk@ generic is given by the @array( E_base, N0, N1, ... )@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array( E_base, N1, ... )@. 1010 In the base case, @array( E_base )@ is just @E_base@. 1019 1011 Because this construction uses the same value for the generic parameters @S@ and @E_im@, the resulting layout has trivial strides. 1020 1012 … … 1031 1023 1032 1024 \CFA array subscripting is protected with runtime bound checks. 1033 Having dependent typing causes the optimizer to remove more of these bound checks than it would without them.1025 The array dependent-typing provides information to the C optimizer allowing it remove many of the bound checks. 1034 1026 This section provides a demonstration of the effect. 1035 1027 … … 1052 1044 \section{Array Lifecycle} 1053 1045 1054 An array is an aggregate, like a structure; 1055 both are containers wrapping subordinate objects. 1056 Any arbitrary object type, like @string@, can be an array element or structure member. 1046 An array, like a structure, wraps subordinate objects. 
1047 Any object type, like @string@, can be an array element or structure member. 1057 1048 A consequence is that the lifetime of the container must match with its subordinate objects: all elements and members must be initialized/uninitialized implicitly as part of the container's allocation/deallocation. 1058 1049 Modern programming languages implicitly perform these operations via a type's constructor and destructor. 1059 1050 Therefore, \CFA must assure that an array's subordinate objects' lifetime operations are called. 1060 1061 1051 Preexisting \CFA mechanisms achieve this requirement, but with poor performance. 1062 1052 Furthermore, advanced array users need an exception to the basic mechanism, which does not occur with other aggregates. 1063 1053 Hence, arrays introduce subtleties in supporting an element's lifecycle. 1064 1054 1065 The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type ( @otype@) pseudo-trait.1066 A type is an @otype@, if it provides a default (parameterless) constructor, copy constructor, assignment operator, and destructor (like \CC).1067 When declaring a structure with @otype@ members, the compiler implicitly generates implementations of the four @otype@functions for the outer structure.1068 Then the generated default constructor for the outer structure calls the default constructor for each member, and the other @otype@functions work similarly.1055 The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (otype) pseudo-trait. 1056 A type is an otype, if it provides a default (parameterless) constructor, copy constructor, assignment operator, and destructor (like \CC). 1057 For a structure with otype members, the compiler implicitly generates implementations of the four otype functions for the outer structure. 1058 Then the generated default constructor for the outer structure calls the default constructor for each member, and the other otype functions work similarly. 1069 1059 For a member that is a C array, these calls occur in a loop for each array element, which even works for VLAs. 1070 This logic works the same, whether the member is a concrete type (that happens to be an @otype@) or if the member is a polymorphic type asserted to be an @otype@(which is implicit in the syntax, @forall(T)@).1060 This logic works the same, whether the member is a concrete type (that happens to be an otype) or if the member is a polymorphic type asserted to be an otype (which is implicit in the syntax, @forall(T)@). 1071 1061 The \CFA array has the simplified form (similar to one seen before): 1072 1062 \begin{cfa} … … 1077 1067 }; 1078 1068 \end{cfa} 1079 Being a structure with a C-array member, using the otype-form declaration for @T@ causes @array5(float)@ to implement @otype@ too. 1080 1081 But this @otype@-recursion pattern has a performance issue. 1082 For example, in a cube of @float@: 1069 Being a structure with a C-array member, the otype-form declaration @T@ causes @array5( float )@ to be an otype too. 1070 But this otype-recursion pattern has a performance issue. 
1071 \VRef[Figure]{f:OtypeRecursionBlowup} shows the steps to find lifecycle functions, under the otype-recursion pattern, for a cube of @float@: 1083 1072 \begin{cfa} 1084 1073 array5( array5( array5( float ) ) ) 1085 1074 \end{cfa} 1086 the first few steps of the compiler's work to find the lifecycle functions, under the @otype@-recursion pattern, are shown in \VRef[Figure]{f:OtypeRecursionBlowup}. 1087 All the work needed for the full @float@-cube would have 256 leaves. 1075 The work needed for the full @float@-cube is 256 leaves. 1076 Then the otype-recursion pattern generates helper functions and thunks that are exponential in the number of dimensions. 1077 Anecdotal experience is annoyingly slow compile time at three dimensions and unusable at four. 1088 1078 1089 1079 %array5(T) offers … … 1111 1101 \begin{figure} 1112 1102 \centering 1113 \setlength{\tabcolsep}{ 15pt}1103 \setlength{\tabcolsep}{20pt} 1114 1104 \begin{tabular}{@{}lll@{}} 1115 1105 \begin{cfa}[deletekeywords={default}] … … 1117 1107 1 default ctor 1118 1108 2 copy ctor 1119 3 as gt1109 3 assignment 1120 1110 4 dtor 1121 1111 … … 1146 1136 & 1147 1137 \begin{cfa}[deletekeywords={default}] 1148 array5( float) has1149 1 default ctor, using float 's1138 array5( float ) has 1139 1 default ctor, using float$'$s 1150 1140 1 default ctor 1151 1141 2 copy ctor 1152 3 as gt1142 3 assignment 1153 1143 4 dtor 1154 2 copy ctor, using float 's1144 2 copy ctor, using float$'$s 1155 1145 1 default ctor 1156 1146 2 copy ctor 1157 3 as gt1147 3 assignment 1158 1148 4 dtor 1159 3 as gt, using float's1149 3 assignment, using float$'$s 1160 1150 1 default ctor 1161 1151 2 copy ctor 1162 3 as gt1152 3 assignment 1163 1153 4 dtor 1164 4 dtor, using float 's1154 4 dtor, using float$'$s 1165 1155 1 default ctor 1166 1156 2 copy ctor 1167 3 as gt1157 3 assignment 1168 1158 4 dtor 1169 1159 … … 1178 1168 & 1179 1169 \begin{cfa}[deletekeywords={default}] 1180 array5( array5(float)) has1181 1 default ctor, using array5( float)'s1182 1 default ctor, using float 's1170 array5( array5( float ) ) has 1171 1 default ctor, using array5( float ) 1172 1 default ctor, using float$'$s 1183 1173 1 default ctor 1184 1174 2 copy ctor 1185 3 as gt1175 3 assignment 1186 1176 4 dtor 1187 2 copy ctor, using float 's1177 2 copy ctor, using float$'$s 1188 1178 1 default ctor 1189 1179 2 copy ctor 1190 3 as gt1180 3 assignment 1191 1181 4 dtor 1192 3 as gt, using float's1182 3 assignment, using float$'$s 1193 1183 1 default ctor 1194 1184 2 copy ctor 1195 3 as gt1185 3 assignment 1196 1186 4 dtor 1197 4 dtor, using float 's1187 4 dtor, using float$'$s 1198 1188 1 default ctor 1199 1189 2 copy ctor 1200 3 as gt1190 3 assignment 1201 1191 4 dtor 1202 2 copy ctor, using array5( float)'s1192 2 copy ctor, using array5( float ) 1203 1193 ... 4 children, 16 leaves 1204 3 as gt, using array5(float)'s1194 3 assignment, using array5( float ) 1205 1195 ... 4 children, 16 leaves 1206 4 dtor, using array5( float)'s1196 4 dtor, using array5( float ) 1207 1197 ... 4 children, 16 leaves 1208 1198 (64 leaves) … … 1210 1200 \end{tabular} 1211 1201 \caption{Exponential thunk generation under the otype-recursion pattern. 
1212 Each time thatone type's function (\eg ctor) uses another type's function, the \CFA compiler generates a thunk, to capture the used function's dependencies, presented according to the using function's need.1202 Each time one type's function (\eg ctor) uses another type's function, the \CFA compiler generates a thunk, to capture the used function's dependencies, presented according to the using function's need. 1213 1203 So, each non-leaf line represents a generated thunk and every line represents a search request for the resolver to find a satisfying function.} 1214 1204 \label{f:OtypeRecursionBlowup} 1215 1205 \end{figure} 1216 1206 1217 So the @otype@-recursion pattern seeks a quantity of helper functions, and generates a quantity of thunks, that are exponential in the number of dimensions.1218 Anecdotal experience with this solution found the resulting compile times annoyingly slow at three dimensions, and unusable at four.1219 1220 1207 The issue is that the otype-recursion pattern uses more assertions than needed. 1221 Consider how @array5(float)@'s default constructor is getting all four lifecycle assertions about the element type, @float@. 1222 It only needs @float@'s default constructor; 1223 the full set of operations is never used. 1208 For example, @array5( float )@'s default constructor has all four lifecycle assertions about the element type, @float@. 1209 However, it only needs @float@'s default constructor, as the other operations are never used. 1224 1210 Current work by the \CFA team aims to improve this situation. 1225 Therefore, a workaround is needed for now.1226 1227 The workaround is to provide a default constructor, copy constructor and destructor, defined with lean, bespoke assertions: 1211 Therefore, I had to construct a workaround. 1212 1213 My workaround moves from otype (value) to dtype (pointer) with a default-constructor assertion, where dtype does not generate any constructors but the assertion claws back the default otype constructor. 1228 1214 \begin{cquote} 1229 \begin{tabular}{@{}l@{\hspace{0.5in}}l@{}} 1215 \setlength{\tabcolsep}{10pt} 1216 \begin{tabular}{@{}ll@{}} 1230 1217 \begin{cfa} 1231 1218 // autogenerated for otype-recursion: 1232 1219 forall( T ) 1233 void ?{}( array5( T) & this ) {1234 for ( i; 5) { ( this[i] ){}; }1220 void ?{}( array5( T ) & this ) { 1221 for ( i; 5 ) { ( this[i] ){}; } 1235 1222 } 1236 1223 forall( T ) 1237 void ?{}( array5( T) & this, array5(T) & src ) {1238 for ( i; 5) { ( this[i] ){ src[i] }; }1224 void ?{}( array5( T ) & this, array5( T ) & src ) { 1225 for ( i; 5 ) { ( this[i] ){ src[i] }; } 1239 1226 } 1240 1227 forall( T ) 1241 void ^?{}( array5( T) & this ) {1242 for ( i; 5) { ^( this[i] ){}; }1228 void ^?{}( array5( T ) & this ) { 1229 for ( i; 5 ) { ^( this[i] ){}; } 1243 1230 } 1244 1231 \end{cfa} … … 1247 1234 // lean, bespoke: 1248 1235 forall( T* | { void @?{}( T & )@; } ) 1249 void ?{}( array5( T) & this ) {1250 for ( i; 5) { ( this[i] ){}; }1236 void ?{}( array5( T ) & this ) { 1237 for ( i; 5 ) { ( this[i] ){}; } 1251 1238 } 1252 1239 forall( T* | { void @?{}( T &, T )@; } ) 1253 void ?{}( array5( T) & this, array5(T) & src ) {1254 for ( i; 5) { ( this[i] ){ src[i] }; }1240 void ?{}( array5( T ) & this, array5( T ) & src ) { 1241 for ( i; 5 ) { ( this[i] ){ src[i] }; } 1255 1242 } 1256 1243 forall( T* | { void @?{}( T & )@; } ) … … 1262 1249 \end{cquote} 1263 1250 Moreover, the assignment operator is skipped, to avoid hitting a lingering growth case. 
1264 Skipping assignment is tolerable because array assignment is not a common operation.1265 With this solution, the critical lifecycle functions are available, with no growth in thunk creation.1251 Temporarily skipping assignment is tolerable because array assignment is not a common operation. 1252 With this solution, the critical lifecycle functions are available, with linear growth in thunk creation for the default constructor. 1266 1253 1267 1254 Finally, the intuition that a programmer using an array always wants the elements' default constructor called \emph{automatically} is simplistic. … … 1273 1260 void ?{}( worker &, int id ); 1274 1261 array( worker, 5 ) ws = @{}@; // rejected; but desire is for no initialization yet 1275 for ( i; 5) (ws[i]){ @i@ }; // explicitly initialize each thread, giving id1262 for ( i; 5 ) (ws[i]){ @i@ }; // explicitly initialize each thread, giving id 1276 1263 \end{cfa} 1277 1264 Note the use of the \CFA explicit constructor call, analogous to \CC's placement-@new@. … … 1283 1270 Therefore, there is a conflict between the implicit actions of the builtin @thread@ type and a user's desire to defer these actions. 1284 1271 1285 Another \CFA feature may, at first, seem viable for initializing the array @ws@, though on closer inspection, itis not.1286 C initialization , \lstinline|array(worker, 5) ws @= {};|,ignores all \CFA lifecycle management and uses C empty initialization.1272 Another \CFA feature for providing C backwards compatibility, at first seem viable for initializing the array @ws@, though on closer inspection is not. 1273 C initialization without a constructor is specified with \lstinline|@=|, \eg \lstinline|array(worker, 5) ws @= {}| ignores all \CFA lifecycle management and uses C empty initialization. 1287 1274 This option does achieve the desired semantics on the construction side. 1288 1275 But on destruction side, the desired semantics is for implicit destructor calls to continue, to keep the join operation tied to lexical scope. … … 1295 1282 \end{cfa} 1296 1283 Both cause the @ws@-construction-time implicit call chain to stop before reaching a @worker@ constructor, while leaving the implicit destruction calls intact. 1297 Two forms are available, to parallel the development of this feature in \uCpp .1284 Two forms are available, to parallel the development of this feature in \uCpp~\cite{uC++}. 1298 1285 Originally \uCpp offered only the @ws1@ form, using the class-template @uNoCtor@ equivalent to \CFA's @uninit@. 1299 1286 More recently, \uCpp was extended with the declaration macro, @uArray@, with usage similar to the @ws2@ case. 1300 Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it mightbe possible to remove the first option.1287 Based on experience piloting @uArray@ as a replacement of @uNoCtor@, it should be possible to remove the first option. 1301 1288 1302 1289 % note to Mike, I have more fragments on some details available in push24\fragments\uNoCtor.txt 1303 1290 1304 \section{ Comparison with Other Arrays}1291 \section{Array Comparison} 1305 1292 1306 1293 1307 1294 \subsection{Rust} 1308 1295 1309 \CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C. 1310 Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference. 
1311 These systems, therefore, ask the programmer to convince the type checker that every pointer dereference is valid. 1312 \CFA imposes the lighter-weight obligation, with the more limited guarantee, that initially-declared bounds are respected thereafter. 1313 1314 \CFA's array is also the first extension of C to use its tracked bounds to generate the pointer arithmetic implied by advanced allocation patterns. 1315 Other bound-tracked extensions of C either forbid certain C patterns entirely, or address the problem of \emph{verifying} that the user's provided pointer arithmetic is self-consistent. 1316 The \CFA array, applied to accordion structures [TOD: cross-reference] \emph{implies} the necessary pointer arithmetic, generated automatically, and not appearing at all in a user's program. 1296 A Rust array is a set of mutable or immutable (@const@) contiguous objects of the same type @T@, where the compile-time dimension(s) is part of the type signature @[T; dimension]@. 1297 \begin{rust} 1298 let val = 42; 1299 let mut ia: [usize; 5] = [val; 5]; $\C{// mutable, repeated initialization [42, 42, 42, 42, 42]}$ 1300 let fval = 42.2; 1301 let fa: [f64; 5] = [1.2, fval, 5.6, 7.3, 9.1]; $\C{// immutable, individual initialization}$ 1302 \end{rust} 1303 Initialization is required even if the array is subsequently initialized. 1304 Rust arrays are not VLAs, but the compile-time length can be queried using member @len()@. 1305 Arrays can be assigned and passed to parameters by value or reference, if and only if, the type and dimension match, meaning a different function is needed for each array size. 1306 1307 Array iteration is by a range or array variable. 1308 \begin{rust} 1309 for i in @0..ia.len()@ { print!("{:?} ", ia[i] ); } $\C{// 42 42 42 42 42}$ 1310 for fv in @fa@ { print!("{:?} ", fv ); } $\C{// 1.2 42.2 5.6 7.3 9.1}$ 1311 for i in 0..=1 { ia[i] = i; } $\C{// mutable changes}$ 1312 for iv in ia { print!("{:?} ", iv ); } $\C{// 0 1 42 42 42}$ 1313 \end{rust} 1314 An array can be assigned and printed as a set. 1315 \begin{rust} 1316 ia = @[5; 5]@; println!( "{:?} {:?}", @ia@, ia.len() ); $\C{// [5, 5, 5, 5, 5] 5}$ 1317 ia = @[1, 2, 3, 4, 5]@; println!( "{:?} {:?}", @ia@, ia.len() ); $\C{// [1, 2, 3, 4, 5] 5}$ 1318 \end{rust} 1319 1320 Multi-dimensional arrays use nested dimensions, resulting in columns before rows. 1321 Here the outer array is a length @ROWS@ array whose elements are @f64@ arrays of length @COLS@ each. 1322 \begin{cquote} 1323 \setlength{\tabcolsep}{10pt} 1324 \begin{tabular}{@{}ll@{}} 1325 \begin{rust} 1326 const ROWS: usize = 4; const COLS: usize = 6; 1327 let mut fmatrix: [[f64; @COLS@]; @ROWS@] = [[0.; @COLS@]; @ROWS@]; 1328 for r in 0 .. ROWS { 1329 for c in 0 .. COLS { fmatrix[r][c] = r as f64 + c as f64; } } 1330 \end{rust} 1331 & 1332 \begin{rust} 1333 [ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0 ] 1334 [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] 1335 [ 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 ] 1336 [ 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 ] 1337 \end{rust} 1338 \end{tabular} 1339 \end{cquote} 1340 1341 Slices borrow a section of an array (subarray) and have type @&[T]@. 1342 A slice has a dynamic size and does not implicitly coerce to an array. 1343 \begin{rust} 1344 println!( "{:?}", @&ia[0 .. 3]@ ); $\C{// fixed bounds, [1, 2, 3]}$ 1345 println!( "{:?}", @&ia[ia.len() - 2 .. ia.len()@] ); $\C{// variable bounds, [4, 5]}$ 1346 println!( "{:?}", @&ia[ia.len() - 2 .. ]@ ); $\C{// drop upper bound, [4, 5]}$ 1347 println!( "{:?}", @&ia[.. 
ia.len() - 2 ]@ ); $\C{// drop lower bound, [1, 2, 3]}$ 1348 println!( "{:?}", @&ia[..]@ ); $\C{// drop both bound, [1, 2, 3, 4, 5]}$ 1349 \end{rust} 1350 A multi-dimensional slice can only borrow subrows because a slice requires contiguous memory. 1351 Here columns 2--4 are sliced from row 3. 1352 \begin{rust} 1353 let mut slice2: &[f64] = &fmatrix[3][@2 ..= 4@]; 1354 println!( "{:?}", slice2 ); $\C{// [5.0, 6.0, 7.0]}$ 1355 \end{rust} 1356 Here row 2 is sliced from the sub-matrix formed from rows 1--3. 1357 \begin{rust} 1358 slice2 = &fmatrix[@1 ..= 3@][1]; 1359 println!( "{:?}", slice2 ); $\C{// [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}$ 1360 \end{rust} 1361 A slice can be explicitly converted into an array or reference to an array. 1362 \begin{rust} 1363 let slice: &[f64]; 1364 slice = &fa[0 ..= 1]; $\C{// create slice, [1.2, 42.2]}$ 1365 let mut fa2: [f64; 2] = @<[f64; 2]>::try_from( slice ).unwrap()@; $\C{// convert slice to array, [1.2, 42.2]}$ 1366 fa2 = @(& fa[2 ..= 3]).try_into().unwrap()@; $\C{// convert slice to array, [5.6, 7.3]}$ 1367 \end{rust} 1368 1369 The @vec@ macro (using @Vec@ type) provides dynamically-size dynamically-growable arrays of arrays only using the heap (\CC @vector@ type). 1370 \begin{cquote} 1371 \setlength{\tabcolsep}{10pt} 1372 \begin{tabular}{@{}ll@{}} 1373 \multicolumn{1}{c}{Rust} &\multicolumn{1}{c}{\CC} \\ 1374 \begin{rust} 1375 let (rows, cols) = (4, 6); 1376 let mut matrix = vec![vec![0; cols]; rows]; 1377 ... matrix[r][c] = r + c; // fill matrix 1378 \end{rust} 1379 & 1380 \begin{c++} 1381 int rows = 4, cols = 6; 1382 vector<vector<int>> matrix( rows, vector<int>(cols) ); 1383 ... matrix[r][c] = r + c; // fill matrix} 1384 \end{c++} 1385 \end{tabular} 1386 \end{cquote} 1387 A dynamically-size array is sliced the same as a fixed-size one. 1388 \begin{rust} 1389 let mut slice3: &[usize] = &matrix[3][2 ..= 4]; $\C{// [5, 6, 7]}$ 1390 slice3 = &matrix[1 ..= 3][1]; $\C{// [2, 3, 4, 5, 6, 7]}$ 1391 \end{rust} 1392 Finally, to mitigate the restriction of matching array dimensions between argument and parameter types, it is possible for a parameter to be a slice. 1393 \begin{rust} 1394 fn zero( arr: @& mut [usize]@ ){ 1395 for i in 0 .. arr.len() { arr[i] = 0; } 1396 } 1397 zero( & mut ia[0..5] ); // arbitrary sized slice 1398 zero( & mut ia[0..3] ); 1399 \end{rust} 1400 Or write a \emph{generic} function taking an array of fixed-element type and any size. 1401 \begin{rust} 1402 fn aprint<@const N: usize@>( arg: [usize; N] ) { 1403 for r in 0 .. arg.len() { print!( "{} ", arg[r] ); } 1404 } 1405 aprint( ia ); 1406 \end{rust} 1317 1407 1318 1408 1319 1409 \subsection{Java} 1320 1410 1321 Java arrays are arrays-of-arrays because all objects are references \see{\VRef{toc:mdimpl}}.1411 Java arrays are references, so multi-dimension arrays are arrays-of-arrays \see{\VRef{toc:mdimpl}}. 1322 1412 For each array, Java implicitly storages the array dimension in a descriptor, supporting array length, subscript checking, and allowing dynamically-sized array-parameter declarations. 1323 1413 \begin{cquote} … … 1327 1417 \end{tabular} 1328 1418 \end{cquote} 1329 However, in the C prototype, the parameters @n@ and @m@ are documentation only as the intended size of the first and second dimension of@x@.1419 However, in the C prototype, the parameters @n@ and @m@ are manually managed, \ie there is no guarantee they match with the argument matrix for parameter @x@. 
1330 1420 \VRef[Figure]{f:JavaVsCTriangularMatrix} compares a triangular matrix (array-of-arrays) in dynamically safe Java to unsafe C. 1331 1421 Each dynamically sized row in Java stores its dimension, while C requires the programmer to manage these sizes explicitly (@rlnth@). … … 1420 1510 \subsection{Levels of Dependently Typed Arrays} 1421 1511 1512 \CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C. 1513 Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference. 1514 These systems, therefore, ask the programmer to convince the type checker that every pointer dereference is valid. 1515 \CFA imposes the lighter-weight obligation, with the more limited guarantee, that initially-declared bounds are respected thereafter. 1516 1517 \CFA's array is also the first extension of C to use its tracked bounds to generate the pointer arithmetic implied by advanced allocation patterns. 1518 Other bound-tracked extensions of C either forbid certain C patterns entirely, or address the problem of \emph{verifying} that the user's provided pointer arithmetic is self-consistent. 1519 The \CFA array, applied to accordion structures [TOD: cross-reference] \emph{implies} the necessary pointer arithmetic, generated automatically, and not appearing at all in a user's program. 1520 1422 1521 The \CFA array and the field of ``array language'' comparators all leverage dependent types to improve on the expressiveness over C and Java, accommodating examples such as: 1423 1522 \begin{itemize}[leftmargin=*] … … 1465 1564 Having no instances means there is no type for a variable @i@ that constrains @i@ to be in the range for @N@, unlike Dex, [TODO: verify], but like Futhark. 
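To ground the dependent-typing comparison above in concrete terms, the following is a minimal sketch of the kind of dimension-parameterized \CFA function this changeset describes (a structure parameterized by @[N]@ and functions similarly parameterized). The @forall( [N] )@ spelling, the @array@ type, and the @for ( i; N )@ loop form are assumed from the surrounding text, so treat the exact syntax as illustrative rather than definitive.

\begin{cfa}
forall( [N] )	// dimension variable: a type-system entity with no runtime representation
float total( array( float, N ) & a ) {	// the length N travels in the argument's type
	float sum = 0.0f;
	for ( i; N ) sum += a[i];	// the bound N is visible to the resolver and to subscript checking
	return sum;
}
int main() {
	array( float, 10 ) x;
	for ( i; 10 ) x[i] = i;	// fill the array
	float t = total( x );	// N is inferred as 10 from x's type; no separate length argument
}
\end{cfa}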
1466 1565 1467 \subsection{Static Safety in C Extensions}1468 1469 1470 \section{Future Work}1471 1472 \subsection{Declaration Syntax}1473 1474 \subsection{Range Slicing}1475 1476 \subsection{With a Module System}1477 1478 \subsection{With Described Enumerations}1479 1480 A project in \CFA's current portfolio will improve enumerations.1481 In the incumbent state, \CFA has C's enumerations, unmodified.1482 I will not discuss the core of this project, which has a tall mission already, to improve type safety, maintain appropriate C compatibility and offer more flexibility about storage use.1483 It also has a candidate stretch goal, to adapt \CFA's @forall@ generic system to communicate generalized enumerations:1484 \begin{cfa}1485 forall( T | is_enum(T) )1486 void show_in_context( T val ) {1487 for( T i ) {1488 string decorator = "";1489 if ( i == val-1 ) decorator = "< ready";1490 if ( i == val ) decorator = "< go" ;1491 sout | i | decorator;1492 }1493 }1494 enum weekday { mon, tue, wed = 500, thu, fri };1495 show_in_context( wed );1496 \end{cfa}1497 with output1498 \begin{cfa}1499 mon1500 tue < ready1501 wed < go1502 thu1503 fri1504 \end{cfa}1505 The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA.1506 But the example shows these abilities:1507 \begin{itemize}[leftmargin=*]1508 \item a built-in way (the @is_enum@ trait) for a generic routine to require enumeration-like information about its instantiating type1509 \item an implicit implementation of the trait whenever a user-written enum occurs (@weekday@'s declaration implies @is_enum@)1510 \item a total order over the enumeration constants, with predecessor/successor (@val-1@) available, and valid across gaps in values (@tue == 1 && wed == 500 && tue == wed - 1@)1511 \item a provision for looping (the @for@ form used) over the values of the type.1512 \end{itemize}1513 1514 1566 If \CFA gets such a system for describing the list of values in a type, then \CFA arrays are poised to move from the Futhark level of expressiveness, up to the Dex level. 1515 1567 … … 1563 1615 In the case of adapting this pattern to \CFA, my current work provides an adapter from ``successively subscripted'' to ``subscripted by tuple,'' so it is likely that generalizing my adapter beyond ``subscripted by @ptrdiff_t@'' is sufficient to make a user-provided adapter unnecessary. 1564 1616 1617 \subsection{Static Safety in C Extensions} 1618 1619 1620 \section{Future Work} 1621 1622 \subsection{Array Syntax} 1623 1624 An important goal is to recast @array(...)@ syntax into C-style @[]@. 1625 The proposal (which is partially implemented) is an alternate dimension and subscript syntax. 1626 C multi-dimension and subscripting syntax uses multiple brackets. 1627 \begin{cfa} 1628 int m@[2][3]@; // dimension 1629 m@[0][1]@ = 3; // subscript 1630 \end{cfa} 1631 Leveraging this syntax, the following (simpler) syntax should be intuitive to C programmers with only a small backwards compatibility issue. 1632 \begin{cfa} 1633 int m@[2, 3]@; // dimension 1634 m@[0, 1]@ = 3; // subscript 1635 \end{cfa} 1636 However, the new subscript syntax is not backwards compatible, as @0, 1@ is a comma expression. 1637 Interestingly, disallowed the comma expression in this context eliminates an unreported error: subscripting a matrix with @m[i, j]@ instead of @m[i][j]@, which selects the @j@th row not the @i, j@ element. 1638 Hence, a comma expression in this context is rare. 
1639 Finally, it is possible to write @m[(i, j)]@ in the new syntax to achieve the equivalent of the old @m[i, j]@. 1640 Note, the new subscript syntax can easily be internally lowered to @[-][-]...@ and handled as regular subscripting. 1641 The only ambiguity with C syntax is for a single dimension array, where the syntax for old and new are the same. 1642 \begin{cfa} 1643 int m[2@,@]; // single dimension 1644 m[0] = 3; // subscript 1645 \end{cfa} 1646 The solution for the dimension is to use a terminating comma to denote a new single-dimension array. 1647 This syntactic form is also used for the (rare) singleton tuple @[y@{\color{red},}@]@. 1648 The extra comma in the dimension is only mildly annoying, and acts as eye-candy differentiating old and new arrays. 1649 The subscript operator is not an issue as overloading selects the correct single-dimension operation for old/new array types. 1650 The ultimately goal is to replace all C arrays with \CFA arrays, establishing a higher level of safety in C programs, and eliminating the need for the terminating comma. 1651 1652 1653 \subsection{Range Slicing} 1654 1655 \subsection{With a Module System} 1656 1657 1565 1658 \subsection{Retire Pointer Arithmetic} 1566 1659 1567 1660 1568 \section{\CFA} 1661 \begin{comment} 1662 \section{\texorpdfstring{\CFA}{Cforall}} 1569 1663 1570 1664 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \\ … … 1575 1669 (For later: That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.) 1576 1670 1577 \subsection{\ CFAFeatures Interacting with Arrays}1671 \subsection{\texorpdfstring{\CFA}{Cforall} Features Interacting with Arrays} 1578 1672 1579 1673 Prior work on \CFA included making C arrays, as used in C code from the wild, … … 1629 1723 1630 1724 TODO: I fixed a bug associated with using an array as a T. I think. Did I really? What was the bug? 1725 \end{comment}
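As a concrete footnote to the comma-expression hazard discussed in the new Array Syntax subsection, the small example below (plain C, shown in the same listing style; values chosen only for illustration) demonstrates why today's @m[i, j]@ silently selects the @j@th row rather than the @i, j@ element, exactly as the text states.

\begin{cfa}
#include <stdio.h>
int main() {
	int m[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };
	int i = 0, j = 1;
	int elem = m[i][j];	// intended element: row 0, column 1, giving 2
	int * row = m[i, j];	// comma expression: (i, j) evaluates to j, so this is row m[1]
	printf( "%d %d\n", elem, row[0] );	// prints "2 4"; the second value is row 1's first element
}
\end{cfa}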