Changeset dbff8ec
- Timestamp:
- Jul 10, 2024, 3:39:37 AM (9 months ago)
- Branches:
- master
- Children:
- 725f777f
- Parents:
- bb336a6 (diff), f3811df (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the(diff)
links above to see all the changes relative to each parent. - Files:
-
- 1 added
- 20 edited
Legend:
- Unmodified
- Added
- Removed
-
TabularUnified INSTALL ¶
rbb336a6 rdbff8ec 40 40 41 41 $ cd ./test 42 $ make -j 8 all- tests42 $ make -j 8 all-local 43 43 44 44 The tests take about 2-5 minutes and can be stopped at any time. -
TabularUnified doc/theses/mike_brooks_MMath/array.tex ¶
rbb336a6 rdbff8ec 2 2 \label{c:Array} 3 3 4 4 5 \section{Introduction} 5 6 6 This chapter describes my contribution of language and library features that provide a length-checked array type, as in: 7 8 \begin{lstlisting} 9 array(float, 99) x; // x contains 99 floats 10 11 void f( array(float, 42) & a ) {} 12 f(x); // statically rejected: types are different 7 Arrays in C are possible the single most misunderstood and incorrectly used features in the language, resulting in the largest proportion of runtime errors and security violations. 8 This chapter describes the new \CFA language and library features that introduce a length-checked array-type to the \CFA standard library~\cite{Cforall}, \eg: 9 \begin{cfa} 10 @array( float, 99 )@ x; $\C{// x contains 99 floats}$ 11 void f( @array( float, 42 )@ & p ) {} $\C{// p accepts 42 floats}$ 12 f( x ); $\C{// statically rejected: types are different, 99 != 42}$ 13 13 14 14 forall( T, [N] ) 15 void g( array(T, N) & a, int i ) { 16 T elem = a[i]; // dynamically checked: requires 0 <= i < N 17 } 18 g(x, 0); // T is float, N is 99, succeeds 19 g(x, 1000); // T is float, N is 99, dynamic check fails 20 \end{lstlisting} 21 22 This example first declares @x@ a variable, whose type is an instantiation of the generic type named @array@, with arguments @float@ and @99@. 23 Next, it declares @f@ as a function that expects a length-42 array; the type system rejects the call's attempt to pass @x@ to @f@, because the lengths do not match. 24 Next, the @forall@ annotation on function @g@ introduces @T@ as a familiar type parameter and @N@ as a \emph{dimension} parameter, a new feature that represents a count of elements, as managed by the type system. 25 Because @g@ accepts any length of array; the type system accepts the calls' passing @x@ to @g@, inferring that this length is 99. 26 Just as the caller's code does not need to explain that @T@ is @float@, the safe capture and communication of the value @99@ occurs without programmer involvement. 27 In the case of the second call (which passes the value 1000 for @i@), within the body of @g@, the attempt to subscript @a@ by @i@ fails with a runtime error, since $@i@ \nless @N@$. 28 29 The type @array@, as seen above, comes from my additions to the \CFA standard library. 30 It is very similar to the built-in array type, which \CFA inherits from C. 15 void g( @array( T, N )@ & p, int i ) { 16 T elem = p[i]; $\C{// dynamically checked: requires 0 <= i < N}$ 17 } 18 g( x, 0 ); $\C{// T is float, N is 99, dynamic subscript check succeeds}$ 19 g( x, 1000 ); $\C{// T is float, N is 99, dynamic subscript check fails}$ 20 \end{cfa} 21 This example declares variable @x@, with generic type @array@ using arguments @float@ and @99@. 22 Function @f@ is declared with an @array@ parameter of length @42@. 23 The call @f( x )@ is invalid because the @array@ lengths @99@ and @42@ do not match. 24 Next, function @g@ introduces a @forall@ prefix on type parameter @T@ and arbitrary \emph{dimension parameter} @N@, the new feature that represents a count of elements managed by the type system. 25 The call @g( x, 0 )@ is valid because @g@ accepts any length of array, where the type system infers @float@ for @T@ and length @99@ for @N@. 26 Inferring values for @T@ and @N@ is implicit without programmer involvement. 27 Furthermore, the runtime subscript @x[0]@ (parameter @i@ is @0@) in @g@ is valid because @0@ is in the dimension range $[0,99)$ of argument @x@. 28 The call @g( x, 1000 )@ is also valid; 29 however, the runtime subscript @x[1000]@ is invalid (generates a subscript error) because @1000@ is outside the dimension range $[0,99)$ of argument @x@. 30 31 The generic @array@ type is similar to the C array type, which \CFA inherits from C. 31 32 Its runtime characteristics are often identical, and some features are available in both. 32 33 \begin{ lstlisting}33 For example, assume a caller can instantiates @N@ with 42 in the following (details to follow). 34 \begin{cfa} 34 35 forall( [N] ) 35 36 void declDemo() { 36 float a1[N]; // built-in type ("C array") 37 array(float, N) a2; // type from library 38 } 39 \end{lstlisting} 40 41 If a caller instantiates @N@ with 42, then both locally-declared array variables, @a1@ and @a2@, become arrays of 42 elements, each element being a @float@. 42 The two variables have identical size and layout; they both encapsulate 42-float stack allocations, no heap allocations, and no further "bookkeeping" allocations/header. 43 Having the @array@ library type (that of @a2@) is a tactical measure, an early implementation that offers full feature support. 44 A future goal (TODO xref) is to port all of its features into the built-in array type (that of @a1@); then, the library type could be removed, and \CFA would have only one array type. 45 In present state, the built-in array has partial support for the new features. 46 The fully-featured library type is used exclusively in introductory examples; feature support and C compatibility are revisited in sec TODO. 47 48 Offering the @array@ type, as a distinct alternative from the the C array, is consistent with \CFA's extension philosophy (TODO xref background) to date. 49 A few compatibility-breaking changes to the behaviour of the C array were also made, both as an implementation convenience, and as justified fixes to C's lax treatment. 50 51 The @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features. 52 A clean slate was an important starting point because it meant not having to deal with every inherited complexity introduced in TODO xref background-array. 53 54 55 My contributions are 56 \begin{itemize} 57 \item a type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@ 37 float x1[N]; $\C{// built-in type ("C array")}$ 38 array(float, N) x2; $\C{// type from library}$ 39 } 40 \end{cfa} 41 Both of the locally-declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@. 42 The two variables have identical size and layout; they both encapsulate 42-float, stack \vs heap allocations with no additional ``bookkeeping'' allocations or headers. 43 Providing this explicit generic approach required a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA. 44 45 Admittedly, the @array@ library type (type for @x2@) is syntactically different from its C counterpart. 46 A future goal (TODO xref) is to provide a built-in array type with syntax approaching C's (type for @x1@); 47 then, the library @array@ type can be removed giving \CFA a largely uniform array type. 48 At present, the built-in array is only partially supported, so the generic @array@ is used exclusively in the discussion; 49 feature support and C compatibility are revisited in Section ? TODO. 50 51 Offering an @array@ type, as a distinct alternative to the C array, is consistent with \CFA's goal of backwards compatibility, \ie virtually all existing C (gcc) programs can be compiled by \CFA with only a small number of changes, similar to \CC (g++). 52 However, a few compatibility-breaking changes to the behaviour of the C array are necessary, both as an implementation convenience and to fix C's lax treatment of arrays. 53 Hence, the @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features, making it unnecessary to deal with every inherited complexity introduced by the C array TODO xref. 54 55 My contributions are: 56 \begin{enumerate} 57 \item A type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@. 58 \item Provide a length-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued. 59 \item Provide argument/parameter passing safety for arrays and subscript safety. 58 60 \item TODO: general parking... 59 \item identify specific abilities brought by @array@60 \item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap 61 \end{ itemize}61 \item Identify the interesting specific abilities available by the new @array@ type. 62 \item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap. 63 \end{enumerate} 62 64 63 65 … … 68 70 69 71 70 \section{Features Added} 71 72 The present work adds a type @array@ to the \CFA standard library~\cite{Cforall}. 73 74 This array's length is statically managed and dynamically valued. 75 This static management achieves argument safety and suggests a path to subscript safety as future work (TODO: cross reference). 76 77 This section presents motivating examples of the new array type's usage and follows up with definitions of the notations that appear. 78 79 The core of the new array management is tracking all array lengths in the type system. 80 Dynamically valued lengths are represented using type variables. 81 The stratification of type variables preceding object declarations makes a length referenceable everywhere that it is needed. 72 \section{Features added} 73 74 This section presents motivating examples for the new array type, demonstrating the syntax and semantics of the generic @array@. 75 As stated, the core capability of the new array is tracking all dimensions in the type system, where dynamic dimensions are represented using type variables. 76 77 The definition of type variables preceding object declarations makes the array dimension lexically referenceable where it is needed. 82 78 For example, a declaration can share one length, @N@, among a pair of parameters and the return. 83 79 \lstinput{10-17}{hello-array.cfa} 84 80 Here, the function @f@ does a pointwise comparison, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array. 85 86 The array type uses the parameterized length information in its @sizeof@ determination, illustrated in the example's call to @alloc@. 87 That call requests an allocation of type @array(bool, N)@, which the type system deduces from the left-hand side of the initialization, into the return type of the @alloc@ call. 88 Preexisting \CFA behaviour is leveraged here, both in the return-type-only polymorphism, and the @sized(T)@-aware standard-library @alloc@ routine. 89 The new @array@ type plugs into this behaviour by implementing the @sized@/@sizeof@ assertion to have the intuitive meaning. 90 As a result, this design avoids an opportunity for programmer error by making the size/length communication to a called routine implicit, compared with C's @calloc@ (or the low-level \CFA analog @aalloc@), which take an explicit length parameter not managed by the type system. 91 92 \VRef[Figure]{f:fHarness} shows the harness to use the @f@ function illustrating how dynamic values are fed into the system. 93 Here, the @a@ array is loaded with decreasing values, and the @b@ array with amounts off by a constant, giving relative differences within tolerance at first and out of tolerance later. 94 The program main is run with two different inputs of sequence length. 81 The dynamic allocation of the @ret@ array by @alloc@ uses the parameterized dimension information in its implicit @_Alignof@ and @sizeof@ determinations, and casting the return type. 82 \begin{cfa} 83 static inline forall( T & | sized(T) ) 84 T * alloc( size_t dim ) { 85 if ( _Alignof(T) <= libAlign() ) return (T *)aalloc( dim, sizeof(T) ); // calloc without zero fill 86 else return (T *)amemalign( _Alignof(T), dim, sizeof(T) ); // array memalign 87 } 88 \end{cfa} 89 Here, the type system deduces from the left-hand side of the assignment the type @array(bool, N)@, and forwards it as the type variable @T@ for the generic @alloc@ function, making it available in its body. 90 Hence, preexisting \CFA behaviour is leveraged here, both in the return-type polymorphism, and the @sized(T)@-aware standard-library @alloc@ routine. 91 This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary @sized@ assertions needed by other types. 92 (@sized@ implies a concrete \vs abstract type with a compile-time size.) 93 As a result, there is significant programming safety by making the size accessible and implicit, compared with C's @calloc@ and non-array supporting @memalign@, which take an explicit length parameter not managed by the type system. 95 94 96 95 \begin{figure} 97 \lstinput{30-49}{hello-array.cfa} 96 \lstinput{30-43}{hello-array.cfa} 97 \lstinput{45-48}{hello-array.cfa} 98 98 \caption{\lstinline{f} Harness} 99 99 \label{f:fHarness} 100 100 \end{figure} 101 101 102 The loops in the program main follow the more familiar pattern of using the ordinary variable @n@ to convey the length. 103 The type system implicitly captures this value at the call site (@main@ calling @f@) and makes it available within the callee (@f@'s loop bound). 104 105 The two parts of the example show @n@ adapting a variable into a type-system managed length (at @main@'s declarations of @a@, @b@, and @result@), @N@ adapting in the opposite direction (at @f@'s loop bound), and a pass-thru use of a managed length (at @f@'s declaration of @ret@). 106 107 The @forall( ...[N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and in the function @b@. 108 The present form is chosen to parallel the existing @forall@ forms: 109 \begin{cfa} 110 forall( @[N]@ ) ... // array kind 111 forall( & T ) ... // reference kind (dtype) 112 forall( T ) ... // value kind (otype) 113 \end{cfa} 114 115 The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance. 102 \VRef[Figure]{f:fHarness} shows a harness that uses the @f@ function illustrating how dynamic values are fed into the @array@ type. 103 Here, the dimension of the @x@, @y@, and @result@ arrays is specified from a command-line value and these arrays are allocated on the stack. 104 Then the @x@ array is initialized with decreasing values, and the @y@ array with amounts offset by constant @0.005@, giving relative differences within tolerance initially and diverging for later values. 105 The program main is run (see figure bottom) with inputs @5@ and @7@ for sequence lengths. 106 The loops follow the familiar pattern of using the variable @n@ to iterate through the arrays. 107 Most importantly, the type system implicitly captures @n@ at the call of @f@ and makes it available throughout @f@ as @N@. 108 The example shows @n@ adapting into a type-system managed length at the declarations of @x@, @y@, and @result@, @N@ adapting in the same way at @f@'s loop bound, and a pass-thru use of @n@ at @f@'s declaration of @ret@. 109 Except for the lifetime-management issue of @result@, \ie explicit @free@, this program has eliminated both the syntactic and semantic problems associated with C arrays and their usage. 110 These benefits cannot be underestimated. 111 112 In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function. 113 The syntactic form is chosen to parallel other @forall@ forms: 114 \begin{cfa} 115 forall( @[N]@ ) ... $\C[1.5in]{// array kind}$ 116 forall( T & ) ... $\C{// reference kind (dtype)}$ 117 forall( T ) ... $\C{// value kind (otype)}\CRT$ 118 \end{cfa} 119 % The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance. 116 120 In summary: 117 121 \begin{itemize} 118 122 \item 119 @[N]@ -- within a forall, declares the type variable @N@ to be a managed length120 \item 121 $e$ -- a type representing the value of $e$ as a managed length, where $e$ is a @size_t@-typed expression 122 \item 123 N -- an expression of type @size_t@, whose value is the managed length @N@ 124 \item 125 @array( thing, N0, N1, ... )@ -- a type wrapping $\prod_i N_i$ adjacent occurrences of @thing@ objects123 @[N]@ within a forall declares the type variable @N@ to be a managed length. 124 \item 125 The type of @N@ within code is @size_t@. 126 \item 127 The value of @N@ within code is the acquired length derived from the usage site, \ie generic declaration or function call. 128 \item 129 @array( thing, N0, N1, ... )@ is a multi-dimensional type wrapping $\prod_i N_i$ adjacent occurrences of @thing@ objects. 126 130 \end{itemize} 127 Unsigned integers have a special status in this type system. 128 Unlike how C++ allows 131 132 \VRef[Figure]{f:TemplateVsGenericType} shows @N@ is not the same as a @size_t@ declaration in a \CC \lstinline[language=C++]{template}. 133 \begin{enumerate}[leftmargin=*] 134 \item 135 The \CC template @N@ is a compile-time value, while the \CFA @N@ is a runtime value. 136 \item 137 The \CC template @N@ must be passed explicitly at the call, unless @N@ has a default value, even when \CC can deduct the type of @T@. 138 The \CFA @N@ is part of the array type and passed implicitly at the call. 139 \item 140 \CC cannot have an array of references, but can have an array of pointers. 141 \CC has a (mistaken) belief that references are not objects, but pointers are objects. 142 In the \CC example, the arrays fall back on C arrays, which have a duality with references with respect to automatic dereferencing. 143 The \CFA array is a contiguous object with an address, which can stored as a reference or pointer. 144 \item 145 C/\CC arrays cannot be copied, while \CFA arrays can be copied, making them a first-class object (although array copy is often avoided for efficiency). 146 \end{enumerate} 147 148 \begin{figure} 149 \begin{tabular}{@{}l@{\hspace{20pt}}l@{}} 129 150 \begin{c++} 130 template< size_t N, char * msg, typename T >... // declarations 151 152 @template< typename T, size_t N >@ 153 void copy( T ret[N], T x[N] ) { 154 for ( int i = 0; i < N; i += 1 ) ret[i] = x[i]; 155 } 156 int main() { 157 int ret[10], x[10]; 158 for ( int i = 0; i < 10; i += 1 ) x[i] = i; 159 @copy<int, 10 >( ret, x );@ 160 for ( int i = 0; i < 10; i += 1 ) 161 cout << ret[i] << ' '; 162 cout << endl; 163 } 131 164 \end{c++} 132 \CFA does not accommodate values of any user-provided type. 133 TODO: discuss connection with dependent types. 134 An example of a type error demonstrates argument safety. 135 The running example has @f@ expecting two arrays of the same length. 136 A compile-time error occurs when attempting to call @f@ with arrays whose lengths may differ. 137 \begin{cfa} 138 forall( [M], [N] ) 139 void bad( array(float, M) &a, array(float, N) &b ) { 140 f( a, a ); // ok 141 f( b, b ); // ok 142 f( a, b ); // error 143 } 144 \end{cfa} 145 %\lstinput{60-65}{hello-array.cfa} 146 As is common practice in C, the programmer is free to cast, to assert knowledge not shared with the type system. 147 \begin{cfa} 148 forall( [M], [N] ) 149 void bad_fixed( array(float, M) & a, array(float, N) & b ) { 150 if ( M == N ) { 151 f( a, (array(float, M) &)b ); // cast b to matching type 165 & 166 \begin{cfa} 167 int main() { 168 @forall( T, [N] )@ // nested function 169 void copy( array( T, N ) & ret, array( T, N ) & x ) { 170 for ( i; 10 ) ret[i] = x[i]; 152 171 } 153 } 154 \end{cfa} 155 %\lstinput{70-75}{hello-array.cfa} 156 157 Argument safety and the associated implicit communication of array length work with \CFA's generic types too. 158 \CFA allows aggregate types to be generalized with multiple type parameters, including parameterized element type, so can it be defined over a parameterized length. 159 Doing so gives a refinement of C's ``flexible array member'' pattern, that allows nesting structures with array members anywhere within other structures. 160 \lstinput{10-16}{hello-accordion.cfa} 161 This structure's layout has the starting offset of @cost_contribs@ varying in @Nclients@, and the offset of @total_cost@ varying in both generic parameters. 162 For a function that operates on a @request@ structure, the type system handles this variation transparently. 163 \lstinput{40-47}{hello-accordion.cfa} 164 In the example, different runs of the program result in different offset values being used. 165 \lstinput{60-76}{hello-accordion.cfa} 172 173 array( int, 10 ) ret, x; 174 for ( i; 10 ) x[i] = i; 175 @copy( ret, x );@ 176 for ( i; 10 ) 177 sout | ret[i] | nonl; 178 sout | nl; 179 } 180 \end{cfa} 181 \end{tabular} 182 \caption{\CC \lstinline[language=C++]{template} \vs \CFA generic type } 183 \label{f:TemplateVsGenericType} 184 \end{figure} 185 186 Continuing the discussion of \VRef[Figure]{f:fHarness}, the example has @f@ expecting two arrays of the same length. 187 A compile-time error occurs when attempting to call @f@ with arrays of differing lengths. 188 % removing leading whitespace 189 \lstinput[tabsize=1]{52-53}{hello-array.cfa} 190 \lstinput[tabsize=1,aboveskip=0pt]{62-64}{hello-array.cfa} 191 As is common practice in C, the programmer is free to cast, \ie to assert knowledge not shared with the type system. 192 \lstinput{70-74}{hello-array.cfa} 193 194 Orthogonally, the new @array@ type works with \CFA's generic types, providing argument safety and the associated implicit communication of array length. 195 Specifically, \CFA allows aggregate types to be generalized with multiple type parameters, including parameterized element types and lengths. 196 Doing so gives a refinement of C's ``flexible array member'' pattern, allowing nesting structures with array members anywhere within other structures. 197 \lstinput{10-15}{hello-accordion.cfa} 198 This structure's layout has the starting offset of @municipalities@ varying in @NprovTerty@, and the offset of @total_pt@ and @total_mun@ varying in both generic parameters. 199 For a function that operates on a @CanadaPop@ structure, the type system handles this variation transparently. 200 \lstinput{40-45}{hello-accordion.cfa} 201 \VRef[Figure]{f:checkHarness} shows program results where different offset values being used. 166 202 The output values show that @summarize@ and its caller agree on both the offsets (where the callee starts reading @cost_contribs@ and where the callee writes @total_cost@). 167 Yet the call site still says just, ``pass the request.'' 168 169 170 \section{Multidimensional implementation} 203 Yet the call site just says, ``pass the request.'' 204 205 \begin{figure} 206 \lstinput{60-68}{hello-accordion.cfa} 207 \lstinput{70-72}{hello-accordion.cfa} 208 \caption{\lstinline{check} Harness} 209 \label{f:checkHarness} 210 \end{figure} 211 212 213 \section{Multidimensional Arrays} 171 214 \label{toc:mdimpl} 172 215 173 TODO: introduce multidimensional array feature and approaches 174 175 The new \CFA standard library @array@ datatype supports multidimensional uses more richly than the C array. 176 The new array's multidimensional interface and implementation, follows an array-of-arrays setup, meaning, like C's @float[n][m]@ type, one contiguous object, with coarsely-strided dimensions directly wrapping finely-strided dimensions. 177 This setup is in contrast with the pattern of array of pointers to other allocations representing a sub-array. 178 Beyond what C's type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order. 179 C and C++ require a programmer with such a need to manage pointer/offset arithmetic manually. 180 181 Examples are shown using a $5 \times 7$ float array, @a@, loaded with increments of $0.1$ when stepping across the length-7 finely-strided dimension shown on columns, and with increments of $1.0$ when stepping across the length-5 coarsely-strided dimension shown on rows. 182 %\lstinput{120-126}{hello-md.cfa} 183 The memory layout of @a@ has strictly increasing numbers along its 35 contiguous positions. 216 % TODO: introduce multidimensional array feature and approaches 217 218 When working with arrays, \eg linear algebra, array dimensions are referred to as ``rows'' and ``columns'' for a matrix, adding ``planes'' for a cube. 219 (There is little terminology for higher dimensional arrays.) 220 For example, an acrostic poem\footnote{A type of poetry where the first, last or other letters in a line spell out a particular word or phrase in a vertical column.} 221 can be treated as a grid of characters, where the rows are the text and the columns are the embedded keyword(s). 222 Within a poem, there is the concept of a \newterm{slice}, \eg a row is a slice for the poem text, a column is a slice for a keyword. 223 In general, the dimensioning and subscripting for multidimensional arrays has two syntactic forms: @m[r,c]@ or @m[r][c]@. 224 225 Commonly, an array, matrix, or cube, is visualized (especially in mathematics) as a contiguous row, rectangle, or block. 226 This conceptualization is reenforced by subscript ordering, \eg $m_{r,c}$ for a matrix and $c_{p,r,c}$ for a cube. 227 Few programming languages differ from the mathematical subscript ordering. 228 However, computer memory is flat, and hence, array forms are structured in memory as appropriate for the runtime system. 229 The closest representation to the conceptual visualization is for an array object to be contiguous, and the language structures this memory using pointer arithmetic to access the values using various subscripts. 230 This approach still has degrees of layout freedom, such as row or column major order, \ie juxtaposed rows or columns in memory, even when the subscript order remains fixed. 231 For example, programming languages like MATLAB, Fortran, Julia and R store matrices in column-major order since they are commonly used for processing column-vectors in tabular data sets but retain row-major subscripting. 232 In general, storage layout is hidden by subscripting, and only appears when passing arrays among different programming languages or accessing specific hardware. 233 234 \VRef[Figure]{f:FixedVariable} shows two C90 approaches for manipulating contiguous arrays. 235 Note, C90 does not support VLAs. 236 The fixed-dimension approach uses the type system; 237 however, it requires all dimensions except the first to be specified at compile time, \eg @m[][6]@, allowing all subscripting stride calculations to be generated with constants. 238 Hence, every matrix passed to @fp1@ must have exactly 6 columns but the row size can vary. 239 The variable-dimension approach ignores (violates) the type system, \ie argument and parameters types do not match, and manually performs pointer arithmetic for subscripting in the macro @sub@. 240 241 \begin{figure} 242 \begin{tabular}{@{}l@{\hspace{40pt}}l@{}} 243 \multicolumn{1}{c}{\textbf{Fixed Dimension}} & \multicolumn{1}{c}{\textbf{Variable Dimension}} \\ 244 \begin{cfa} 245 246 void fp1( int rows, int m[][@6@] ) { 247 ... printf( "%d ", @m[r][c]@ ); ... 248 } 249 int fm1[4][@6@], fm2[6][@6@]; // no VLA 250 // initialize matrixes 251 fp1( 4, fm1 ); // implicit 6 columns 252 fp1( 6, fm2 ); 253 \end{cfa} 254 & 255 \begin{cfa} 256 #define sub( m, r, c ) *(m + r * sizeof( m[0] ) + c) 257 void fp2( int rows, int cols, int *m ) { 258 ... printf( "%d ", @sub( m, r, c )@ ); ... 259 } 260 int vm1[4][4], vm2[6][8]; // no VLA 261 // initialize matrixes 262 fp2( 4, 4, vm1 ); 263 fp2( 6, 8, vm2 ); 264 \end{cfa} 265 \end{tabular} 266 \caption{C90 Fixed \vs Variable Contiguous Matrix Styles} 267 \label{f:FixedVariable} 268 \end{figure} 269 270 Many languages allow multidimensional arrays-of-arrays, \eg in Pascal or \CC. 271 \begin{cquote} 272 \begin{tabular}{@{}ll@{}} 273 \begin{pascal} 274 var m : array[0..4, 0..4] of Integer; (* matrix *) 275 type AT = array[0..4] of Integer; (* array type *) 276 type MT = array[0..4] of AT; (* array of array type *) 277 var aa : MT; (* array of array variable *) 278 m@[1][2]@ := 1; aa@[1][2]@ := 1 (* same subscripting *) 279 \end{pascal} 280 & 281 \begin{c++} 282 int m[5][5]; 283 284 typedef vector< vector<int> > MT; 285 MT vm( 5, vector<int>( 5 ) ); 286 m@[1][2]@ = 1; aa@[1][2]@ = 1; 287 \end{c++} 288 \end{tabular} 289 \end{cquote} 290 The language decides if the matrix and array-of-array are laid out the same or differently. 291 For example, an array-of-array may be an array of row pointers to arrays of columns, so the rows may not be contiguous in memory nor even the same length (triangular matrix). 292 Regardless, there is usually a uniform subscripting syntax masking the memory layout, even though the two array forms could be differentiated at the subscript level, \eg @m[1,2]@ \vs @aa[1][2]@. 293 Nevertheless, controlling memory layout can make a difference in what operations are allowed and in performance (caching/NUMA effects). 294 295 C also provides non-contiguous arrays-of-arrays. 296 \begin{cfa} 297 int m[5][5]; $\C{// contiguous}$ 298 int * aa[5]; $\C{// non-contiguous}$ 299 \end{cfa} 300 both with different memory layout using the same subscripting, and both with different degrees of issues. 301 The focus of this work is on the contiguous multidimensional arrays in C. 302 The reason is that programmers are often forced to use the more complex array-of-array form when a contiguous array would be simpler, faster, and safer. 303 Nevertheless, the C array-of-array form continues to be useful for special circumstances. 304 305 \VRef[Figure]{f:ContiguousNon-contiguous} shows the extensions made in C99 for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.} 306 First, VLAs are supported. 307 Second, for contiguous arrays, C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@. 308 If the declaration of @fc@ is changed to: 309 \begin{cfa} 310 void fc( int rows, int cols, int m[@rows@][@cols@] ) ... 311 \end{cfa} 312 it is possible for C to perform bound checking across all subscripting, but it does not. 313 While this contiguous-array capability is a step forward, it is still the programmer's responsibility to manually manage the number of dimensions and their sizes, both at the function definition and call sites. 314 That is, the array does not automatically carry its structure and sizes for use in computing subscripts. 315 While the non-contiguous style in @faa@ looks very similar to @fc@, the compiler only understands the unknown-sized array of row pointers, and it relies on the programmer to traverse the columns in a row correctly. 316 Specifically, there is no requirement that the rows are the same length, like a poem with different length lines. 317 318 \begin{figure} 319 \begin{tabular}{@{}ll@{}} 320 \multicolumn{1}{c}{\textbf{Contiguous}} & \multicolumn{1}{c}{\textbf{ Non-contiguous}} \\ 321 \begin{cfa} 322 void fc( int rows, @int cols@, int m[ /* rows */ ][@cols@] ) { 323 ... printf( "%d ", @m[r][c]@ ); ... 324 } 325 int m@[5][5]@; 326 for ( int r = 0; r < 5; r += 1 ) { 327 328 for ( int c = 0; c < 5; c += 1 ) 329 m[r][c] = r + c; 330 } 331 fc( 5, 5, m ); 332 \end{cfa} 333 & 334 \begin{cfa} 335 void faa( int rows, int cols, int * m[ @/* cols */@ ] ) { 336 ... printf( "%d ", @m[r][c]@ ); ... 337 } 338 int @* aa[5]@; // row pointers 339 for ( int r = 0; r < 5; r += 1 ) { 340 @aa[r] = malloc( 5 * sizeof(int) );@ // create rows 341 for ( int c = 0; c < 5; c += 1 ) 342 aa[r][c] = r + c; 343 } 344 faa( 5, 5, aa ); 345 \end{cfa} 346 \end{tabular} 347 \caption{C99 Contiguous \vs Non-contiguous Matrix Styles} 348 \label{f:ContiguousNon-contiguous} 349 \end{figure} 350 351 352 \subsection{Multidimensional array implementation} 353 354 A multidimensional array implementation has three relevant levels of abstraction, from highest to lowest, where the array occupies \emph{contiguous memory}. 355 \begin{enumerate} 356 \item 357 Flexible-stride memory: 358 this model has complete independence between subscripting ordering and memory layout, offering the ability to slice by (provide an index for) any dimension, \eg slice a plane, row, or column, \eg @c[3][*][*]@, @c[3][4][*]@, @c[3][*][5]@. 359 \item 360 Fixed-stride memory: 361 this model binds the first subscript and the first memory layout dimension, offering the ability to slice by (provide an index for) only the coarsest dimension, @m[row][*]@ or @c[plane][*][*]@, \eg slice only by row (2D) or plane (3D). 362 After which, subscripting and memory layout are independent. 363 \item 364 Explicit-displacement memory: 365 this model has no awareness of dimensions just the ability to access memory at a distance from a reference point (base-displacement addressing), \eg @x + 23@ or @x[23}@ $\Rightarrow$ 23rd element from the start of @x@. 366 A programmer must manually build any notion of dimensions using other tools; 367 hence, this style is not offering multidimensional arrays \see{\VRef[Figure]{f:FixedVariable}}. 368 \end{enumerate} 369 370 There is some debate as to whether the abstraction ordering goes $\{1, 2\} < 3$, rather than my numerically-ordering. 371 That is, styles 1 and 2 are at the same abstraction level, with 3 offering a limited set of functionality. 372 I chose to build the \CFA style-1 array upon a style-2 abstraction. 373 (Justification of the decision follows, after the description of the design.) 374 375 Style 3 is the inevitable target of any array implementation. 376 The hardware offers this model to the C compiler, with bytes as the unit of displacement. 377 C offers this model to its programmer as pointer arithmetic, with arbitrary sizes as the unit. 378 Casting a multidimensional array as a single-dimensional array/pointer, then using @x[i]@ syntax to access its elements, is still a form of pointer arithmetic. 379 380 Now stepping into the implementation of \CFA's new type-1 multidimensional arrays in terms of C's existing type-2 multidimensional arrays, it helps to clarify that even the interface is quite low-level. 381 A C/\CFA array interface includes the resulting memory layout. 382 The defining requirement of a type-2 system is the ability to slice a column from a column-finest matrix. 383 The required memory shape of such a slice is set, before any discussion of implementation. 384 The implementation presented here is how the \CFA array library wrangles the C type system, to make it do memory steps that are consistent with this layout. 385 TODO: do I have/need a presentation of just this layout, just the semantics of -[all]? 386 387 The new \CFA standard library @array@ datatype supports richer multidimensional features than C. 388 The new array implementation follows C's contiguous approach, \ie @float [r][c]@, with one contiguous object subscripted by coarsely-strided dimensions directly wrapping finely-strided dimensions. 389 Beyond what C's array type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order. 390 391 The following examples use an @array( float, 5, 7) m@, loaded with values incremented by $0.1$, when stepping across the length-7 finely-strided column dimension, and stepping across the length-5 coarsely-strided row dimension. 392 \par\noindent 393 \mbox{\lstinput{121-126}{hello-md.cfa}} 394 \par\noindent 395 The memory layout is 35 contiguous elements with strictly increasing addresses. 184 396 185 397 A trivial form of slicing extracts a contiguous inner array, within an array-of-arrays. 186 Like with the C array, a lesser-dimensional array reference can be bound to the result of subscripting a greater-dimensional array, by a prefix of its dimensions. 187 This action first subscripts away the most coarsely strided dimensions, leaving a result that expects to be be subscripted by the more finely strided dimensions. 188 \lstinput{60-66}{hello-md.cfa} 398 As for the C array, a lesser-dimensional array reference can be bound to the result of subscripting a greater-dimensional array by a prefix of its dimensions, \eg @m[2]@, giving the third row. 399 This action first subscripts away the most coarsely strided dimensions, leaving a result that expects to be subscripted by the more finely strided dimensions, \eg @m[2][3]@, giving the value @2.3@. 400 The following is an example slicing a row. 401 \lstinput{60-64}{hello-md.cfa} 189 402 \lstinput[aboveskip=0pt]{140-140}{hello-md.cfa} 190 403 191 This function declaration is asserting too much knowledge about its parameter @c@, for it to be usable for printing either a row slice or a column slice. 192 Specifically, declaring the parameter @c@ with type @array@ means that @c@ is contiguous. 193 However, the function does not use this fact. 194 For the function to do its job, @c@ need only be of a container type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@), with managed length @N@. 404 However, function @print1d@ is asserting too much knowledge about its parameter @r@ for printing either a row slice or a column slice. 405 Specifically, declaring the parameter @r@ with type @array@ means that @r@ is contiguous, which is unnecessarily restrictive. 406 That is, @r@ need only be of a container type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@) with managed length @N@. 195 407 The new-array library provides the trait @ix@, so-defined. 196 With it, the original declaration can be generalized , while still implemented with the same body, to the latter declaration:197 \lstinput{4 0-44}{hello-md.cfa}408 With it, the original declaration can be generalized with the same body. 409 \lstinput{43-44}{hello-md.cfa} 198 410 \lstinput[aboveskip=0pt]{145-145}{hello-md.cfa} 199 200 Nontrivial slicing, in this example, means passing a noncontiguous slice to @print1d@. 201 The new-array library provides a ``subscript by all'' operation for this purpose. 202 In a multi-dimensional subscript operation, any dimension given as @all@ is left ``not yet subscripted by a value,'' implementing the @ix@ trait, waiting for such a value. 411 The nontrivial slicing in this example now allows passing a \emph{noncontiguous} slice to @print1d@, where the new-array library provides a ``subscript by all'' operation for this purpose. 412 In a multi-dimensional subscript operation, any dimension given as @all@ is a placeholder, \ie ``not yet subscripted by a value'', waiting for such a value, implementing the @ix@ trait. 203 413 \lstinput{150-151}{hello-md.cfa} 204 414 205 The example has shown that @a[2]@ and @a[[2, all]]@ both refer to the same, ``2.*'' slice.206 Indeed, the various @print1d@ calls under discussion access the entry with value 2.3 as @a[2][3]@, @a[[2,all]][3]@, and @a[[all,3]][2]@.207 This design preserves (and extends) C array semantics by defining @ a[[i,j]]@ to be @a[i][j]@ for numeric subscripts, but also for ``subscripting by all''.415 The example shows @x[2]@ and @x[[2, all]]@ both refer to the same, ``2.*'' slice. 416 Indeed, the various @print1d@ calls under discussion access the entry with value @2.3@ as @x[2][3]@, @x[[2,all]][3]@, and @x[[all,3]][2]@. 417 This design preserves (and extends) C array semantics by defining @x[[i,j]]@ to be @x[i][j]@ for numeric subscripts, but also for ``subscripting by all''. 208 418 That is: 209 210 \begin{tabular}{ cccccl}211 @ a[[2,all]][3]@ & $=$ & @a[2][all][3]@ & $=$ & @a[2][3]@ & (here, @all@ is redundant) \\212 @ a[[all,3]][2]@ & $=$ & @a[all][3][2]@ & $=$ & @a[2][3]@ & (here, @all@ is effective)419 \begin{cquote} 420 \begin{tabular}{@{}cccccl@{}} 421 @x[[2,all]][3]@ & $\equiv$ & @x[2][all][3]@ & $\equiv$ & @x[2][3]@ & (here, @all@ is redundant) \\ 422 @x[[all,3]][2]@ & $\equiv$ & @x[all][3][2]@ & $\equiv$ & @x[2][3]@ & (here, @all@ is effective) 213 423 \end{tabular} 214 215 Narrating progress through each of the @-[-][-][-]@ expressions gives, firstly, a definition of @-[all]@, and secondly, a generalization of C's @-[i]@. 216 217 \noindent Where @all@ is redundant: 218 219 \begin{tabular}{ll} 220 @a@ & 2-dimensional, want subscripts for coarse then fine \\ 221 @a[2]@ & 1-dimensional, want subscript for fine; lock coarse = 2 \\ 222 @a[2][all]@ & 1-dimensional, want subscript for fine \\ 223 @a[2][all][3]@ & 0-dimensional; lock fine = 3 424 \end{cquote} 425 426 Narrating progress through each of the @-[-][-][-]@\footnote{ 427 The first ``\lstinline{-}'' is a variable expression and the remaining ``\lstinline{-}'' are subscript expressions.} 428 expressions gives, firstly, a definition of @-[all]@, and secondly, a generalization of C's @-[i]@. 429 Where @all@ is redundant: 430 \begin{cquote} 431 \begin{tabular}{@{}ll@{}} 432 @x@ & 2-dimensional, want subscripts for coarse then fine \\ 433 @x[2]@ & 1-dimensional, want subscript for fine; lock coarse == 2 \\ 434 @x[2][all]@ & 1-dimensional, want subscript for fine \\ 435 @x[2][all][3]@ & 0-dimensional; lock fine == 3 224 436 \end{tabular} 225 226 \noindentWhere @all@ is effective:227 228 \begin{tabular}{ ll}229 @ a@ & 2-dimensional, want subscripts for coarse then fine \\230 @ a[all]@ & 2-dimensional, want subscripts for fine then coarse \\231 @ a[all][3]@ & 1-dimensional, want subscript for coarse; lock fine= 3 \\232 @ a[all][3][2]@ & 0-dimensional; lock coarse= 2437 \end{cquote} 438 Where @all@ is effective: 439 \begin{cquote} 440 \begin{tabular}{@{}ll@{}} 441 @x@ & 2-dimensional, want subscripts for coarse then fine \\ 442 @x[all]@ & 2-dimensional, want subscripts for fine then coarse \\ 443 @x[all][3]@ & 1-dimensional, want subscript for coarse; lock fine == 3 \\ 444 @x[all][3][2]@ & 0-dimensional; lock coarse == 2 233 445 \end{tabular} 234 446 \end{cquote} 235 447 The semantics of @-[all]@ is to dequeue from the front of the ``want subscripts'' list and re-enqueue at its back. 448 For example, in a two dimensional matrix, this semantics conceptually transposes the matrix by reversing the subscripts. 236 449 The semantics of @-[i]@ is to dequeue from the front of the ``want subscripts'' list and lock its value to be @i@. 237 450 238 Contiguous arrays, and slices of them, are all realized by the same underlying parameterized type. 239 It includes stride information in its metatdata. 240 The @-[all]@ operation is a conversion from a reference to one instantiation, to a reference to another instantiation. 451 Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metatdata. 452 \PAB{Do not understand this sentence: The \lstinline{-[all]} operation is a conversion from a reference to one instantiation to a reference to another instantiation.} 241 453 The running example's @all@-effective step, stated more concretely, is: 242 243 \begin{tabular}{ ll}244 @ a@ & : 5 of ( 7 of float each spaced 1 float apart ) each spaced 7 floatsapart \\245 @ a[all]@ & : 7 of ( 5 of float each spaced 7 floats apart ) each spaced 1 floatapart454 \begin{cquote} 455 \begin{tabular}{@{}ll@{}} 456 @x@ & : 5 of ( 7 of @float@ each spaced 1 @float@ apart ) each spaced 7 @floats@ apart \\ 457 @x[all]@ & : 7 of ( 5 of @float@ each spaced 7 @float@s apart ) each spaced 1 @float@ apart 246 458 \end{tabular} 459 \end{cquote} 247 460 248 461 \begin{figure} 249 462 \includegraphics{measuring-like-layout} 250 \caption{Visualization of subscripting, by numeric value, and by \lstinline[language=CFA]{all}. 251 Here \lstinline[language=CFA]{x} has type \lstinline[language=CFA]{array( float, 5, 7 )}, understood as 5 rows by 7 columns. 252 The horizontal layout represents contiguous memory addresses while the vertical layout uses artistic license. 253 The vertical shaded band highlights the location of the targeted element, 2.3. 254 Any such vertical contains various interpretations of a single address.} 463 \caption{Visualization of subscripting by value and by \lstinline[language=CFA]{all}, for \lstinline{x} of type \lstinline{array( float, 5, 7 )} understood as 5 rows by 7 columns. 464 The horizontal layout represents contiguous memory addresses while the vertical layout is conceptual. 465 The vertical shaded band highlights the location of the targeted element, 2.3. 466 Any such vertical slice contains various interpretations of a single address.} 255 467 \label{fig:subscr-all} 256 468 \end{figure} 257 258 \noindent BEGIN: Paste looking for a home259 260 The world of multidimensional array implementation has, or abuts, four relevant levels of abstraction, highest to lowest:261 262 1, purpose:263 If you are doing linear algebra, you might call its dimensions, "column" and "row."264 If you are treating an acrostic poem as a grid of characters, you might say,265 the direction of reading the phrases vs the direction of reading the keyword.266 267 2, flexible-stride memory:268 assuming, from here on, a need to see/use contiguous memory,269 this model offers the ability to slice by (provide an index for) any dimension270 271 3, fixed-stride memory:272 this model offers the ability to slice by (provide an index for) only the coarsest dimension273 274 4, explicit-displacement memory:275 no awareness of dimensions, so no distinguishing them; just the ability to access memory at a distance from a reference point276 277 C offers style-3 arrays. Fortran, Matlab and APL offer style-2 arrays.278 Offering style-2 implies offering style-3 as a sub-case.279 My CFA arrays are style-2.280 281 Some debate is reasonable as to whether the abstraction actually goes $ 1 < \{2, 3\} < 4 $,282 rather than my numerically-ordered chain.283 According to the diamond view, styles 2 and 3 are at the same abstraction level, just with 3 offering a more limited set of functionality.284 The chain view reflects the design decision I made in my mission to offer a style-2 abstraction;285 I chose to build it upon a style-3 abstraction.286 (Justification of the decision follows, after the description of the design.)287 288 The following discussion first dispenses with API styles 1 and 4, then elaborates on my work with styles 2 and 3.289 290 Style 1 is not a concern of array implementations.291 It concerns documentation and identifier choices of the purpose-specific API.292 If one is offering a matrix-multiply function, one must specify which dimension(s) is/are being summed over293 (or rely on the familiar convention of these being the first argument's rows and second argument's columns).294 Some libraries offer a style-1 abstraction that is not directly backed by a single array295 (e.g. make quadrants contiguous, as may help cache coherence during a parallel matrix multiply),296 but such designs are out of scope for a discussion on arrays; they are applications of several arrays.297 I typically include style-1 language with examples to help guide intuition.298 299 It is often said that C has row-major arrays while Fortran has column-major arrays.300 This comparison brings an unhelpful pollution of style-1 thinking into issues of array implementation.301 Unfortunately, ``-major'' has two senses: the program's order of presenting indices and the array's layout in memory.302 (The program's order could be either lexical, as in @x[1,2,3]@ subscripting, or runtime, as in the @x[1][2][3]@ version.)303 Style 2 is concerned with introducing a nontrivial relationship between program order and memory order,304 while style 3 sees program order identical with memory order.305 Both C and (the style-3 subset of) Fortran actually use the same relationship here:306 an earlier subscript in program order controls coarser steps in memory.307 The job of a layer-2/3 system is to implement program-ordered subscripting according to a defined memory layout.308 C and Fortran do not use opposite orders in doing this job.309 Fortran is only ``backward'' in its layer-1 conventions for reading/writing and linear algebra.310 Fortran subscripts as $m(c,r)$. When I use style-1 language, I am following the C/mathematical convention of $m(r,c)$.311 312 Style 4 is the inevitable target of any array implementation.313 The hardware offers this model to the C compiler, with bytes as the unit of displacement.314 C offers this model to its programmer as pointer arithmetic, with arbitrary sizes as the unit.315 I consider casting a multidimensional array as a single-dimensional array/pointer,316 then using @x[i]@ syntax to access its elements, to be a form of pointer arithmetic.317 But style 4 is not offering arrays.318 319 Now stepping into the implementation320 of CFA's new type-3 multidimensional arrays in terms of C's existing type-2 multidimensional arrays,321 it helps to clarify that even the interface is quite low-level.322 A C/CFA array interface includes the resulting memory layout.323 The defining requirement of a type-3 system is the ability to slice a column from a column-finest matrix.324 The required memory shape of such a slice is set, before any discussion of implementation.325 The implementation presented here is how the CFA array library wrangles the C type system,326 to make it do memory steps that are consistent with this layout.327 TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?328 469 329 470 Figure~\ref{fig:subscr-all} shows one element (in the shaded band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@. … … 365 506 The subscript operator presents what's really inside, by casting to the type below the wedge of lie. 366 507 367 % Does x[all] have to lie too? The picture currently glosses over how it it adverti zes a size of 7 floats. I'm leaving that as an edge case benignly misrepresented in the picture. Edge cases only have to be handled right in the code.508 % Does x[all] have to lie too? The picture currently glosses over how it it advertises a size of 7 floats. I'm leaving that as an edge case benignly misrepresented in the picture. Edge cases only have to be handled right in the code. 368 509 369 510 Casting, overlapping and lying are unsafe. … … 392 533 393 534 The @arpk@ structure and its @-[i]@ operator are thus defined as: 394 \begin{lstlisting} 395 forall( ztype(N), // length of current dimension 396 dtype(S) | sized(S), // masquerading-as 397 dtype E_im, // immediate element, often another array 398 dtype E_base // base element, e.g. float, never array 399 ) { 400 struct arpk { 401 S strides[N]; // so that sizeof(this) is N of S 402 }; 403 404 // expose E_im, stride by S 405 E_im & ?[?]( arpk(N, S, E_im, E_base) & a, ptrdiff_t i ) { 406 return (E_im &) a.strides[i]; 407 } 408 } 409 \end{lstlisting} 535 \begin{cfa} 536 forall( ztype(N), $\C{// length of current dimension}$ 537 dtype(S) | sized(S), $\C{// masquerading-as}$ 538 dtype E_im, $\C{// immediate element, often another array}$ 539 dtype E_base $\C{// base element, e.g. float, never array}$ 540 ) { // distribute forall to each element 541 struct arpk { 542 S strides[N]; $\C{// so that sizeof(this) is N of S}$ 543 }; 544 // expose E_im, stride by S 545 E_im & ?[?]( arpk(N, S, E_im, E_base) & a, ptrdiff_t i ) { 546 return (E_im &) a.strides[i]; 547 } 548 } 549 \end{cfa} 410 550 411 551 An instantiation of the @arpk@ generic is given by the @array(E_base, N0, N1, ...)@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array(E_base, N1, ...)@. … … 464 604 465 605 \begin{tabular}{rl} 466 C & @void f( size_t n, size_t m, float a[n][m] );@ \\606 C & @void f( size_t n, size_t m, float x[n][m] );@ \\ 467 607 Java & @void f( float[][] a );@ 468 608 \end{tabular} 469 609 470 Java's safety against undefined behaviour assures the callee that, if @ a@ is non-null, then @a.length@ is a valid access (say, evaluating to the number $\ell$) and if @i@ is in $[0, \ell)$ then @a[i]@ is a valid access.610 Java's safety against undefined behaviour assures the callee that, if @x@ is non-null, then @a.length@ is a valid access (say, evaluating to the number $\ell$) and if @i@ is in $[0, \ell)$ then @x[i]@ is a valid access. 471 611 If a value of @i@ outside this range is used, a runtime error is guaranteed. 472 612 In these respects, C offers no guarantees at all. 473 Notably, the suggestion that @n@ is the intended size of the first dimension of @ a@ is documentation only.474 Indeed, many might prefer the technically equivalent declarations @float a[][m]@ or @float (*a)[m]@ as emphasizing the ``no guarantees'' nature of an infrequently used language feature, over using the opportunity to explain a programmer intention.475 Moreover, even if @ a[0][0]@ is valid for the purpose intended, C's basic infamous feature is the possibility of an @i@, such that @a[i][0]@ is not valid for the same purpose, and yet, its evaluation does not produce an error.613 Notably, the suggestion that @n@ is the intended size of the first dimension of @x@ is documentation only. 614 Indeed, many might prefer the technically equivalent declarations @float x[][m]@ or @float (*a)[m]@ as emphasizing the ``no guarantees'' nature of an infrequently used language feature, over using the opportunity to explain a programmer intention. 615 Moreover, even if @x[0][0]@ is valid for the purpose intended, C's basic infamous feature is the possibility of an @i@, such that @x[i][0]@ is not valid for the same purpose, and yet, its evaluation does not produce an error. 476 616 477 617 Java's lack of expressiveness for more applied properties means these outcomes are possible: 478 618 \begin{itemize} 479 \item @ a[0][17]@ and @a[2][17]@ are valid accesses, yet @a[1][17]@ is a runtime error, because @a[1]@ is a null pointer480 \item the same observation, now because @ a[1]@ refers to an array of length 5481 \item execution times vary, because the @float@ values within @ a@ are sometimes stored nearly contiguously, and other times, not at all619 \item @x[0][17]@ and @x[2][17]@ are valid accesses, yet @x[1][17]@ is a runtime error, because @x[1]@ is a null pointer 620 \item the same observation, now because @x[1]@ refers to an array of length 5 621 \item execution times vary, because the @float@ values within @x@ are sometimes stored nearly contiguously, and other times, not at all 482 622 \end{itemize} 483 623 C's array has none of these limitations, nor do any of the ``array language'' comparators discussed in this section. 484 624 485 625 This Java level of safety and expressiveness is also exemplified in the C family, with the commonly given advice [TODO:cite example], for C++ programmers to use @std::vector@ in place of the C++ language's array, which is essentially the C array. 486 The advice is that, while a vector is also more powerful (and quirky) than an array, its capabilities include options to preallocate with an upfront size, to use an available bound-checked accessor (@a.at(i)@ in place of @ a[i]@), to avoid using @push_back@, and to use a vector of vectors.626 The advice is that, while a vector is also more powerful (and quirky) than an array, its capabilities include options to preallocate with an upfront size, to use an available bound-checked accessor (@a.at(i)@ in place of @x[i]@), to avoid using @push_back@, and to use a vector of vectors. 487 627 Used with these restrictions, out-of-bound accesses are stopped, and in-bound accesses never exercise the vector's ability to grow, which is to say, they never make the program slow to reallocate and copy, and they never invalidate the program's other references to the contained values. 488 628 Allowing this scheme the same referential integrity assumption that \CFA enjoys [TODO:xref], this scheme matches Java's safety and expressiveness exactly. … … 532 672 Futhark restricts these expressions syntactically to variables and constants, while a full-strength dependent system does not. 533 673 534 CFA's hybrid presentation, @forall( [N] )@, has @N@ belonging to the type system, yet has no instances.674 \CFA's hybrid presentation, @forall( [N] )@, has @N@ belonging to the type system, yet has no instances. 535 675 Belonging to the type system means it is inferred at a call site and communicated implicitly, like in Dex and unlike in Futhark. 536 676 Having no instances means there is no type for a variable @i@ that constrains @i@ to be in the range for @N@, unlike Dex, [TODO: verify], but like Futhark. … … 539 679 540 680 541 \section{Future Work}681 \section{Future work} 542 682 543 683 \subsection{Declaration syntax} … … 553 693 I will not discuss the core of this project, which has a tall mission already, to improve type safety, maintain appropriate C compatibility and offer more flexibility about storage use. 554 694 It also has a candidate stretch goal, to adapt \CFA's @forall@ generic system to communicate generalized enumerations: 555 \begin{ lstlisting}695 \begin{cfa} 556 696 forall( T | is_enum(T) ) 557 697 void show_in_context( T val ) { … … 565 705 enum weekday { mon, tue, wed = 500, thu, fri }; 566 706 show_in_context( wed ); 567 \end{ lstlisting}707 \end{cfa} 568 708 with output 569 \begin{ lstlisting}709 \begin{cfa} 570 710 mon 571 711 tue < ready … … 573 713 thu 574 714 fri 575 \end{ lstlisting}715 \end{cfa} 576 716 The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA. 577 717 But the example shows these abilities: … … 599 739 The structural assumptions required for the domain of an array in Dex are given by the trait (there, ``interface'') @Ix@, which says that the parameter @n@ is a type (which could take an argument like @weekday@) that provides two-way conversion with the integers and a report on the number of values. 600 740 Dex's @Ix@ is analogous the @is_enum@ proposed for \CFA above. 601 \begin{ lstlisting}741 \begin{cfa} 602 742 interface Ix n 603 604 605 606 \end{ lstlisting}743 get_size n : Unit -> Int 744 ordinal : n -> Int 745 unsafe_from_ordinal n : Int -> n 746 \end{cfa} 607 747 608 748 Dex uses this foundation of a trait (as an array type's domain) to achieve polymorphism over shapes. … … 616 756 In the photograph instantiation, it's the tuple-type of $ \langle \mathrm{img\_wd}, \mathrm{img\_ht} \rangle $. 617 757 This use of a tuple-as-index is made possible by the built-in rule for implementing @Ix@ on a pair, given @Ix@ implementations for its elements 618 \begin{ lstlisting}758 \begin{cfa} 619 759 instance {a b} [Ix a, Ix b] Ix (a & b) 620 621 622 760 get_size = \(). size a * size b 761 ordinal = \(i, j). (ordinal i * size b) + ordinal j 762 unsafe_from_ordinal = \o. 623 763 bs = size b 624 764 (unsafe_from_ordinal a (idiv o bs), unsafe_from_ordinal b (rem o bs)) 625 \end{ lstlisting}765 \end{cfa} 626 766 and by a user-provided adapter expression at the call site that shows how to indexing with a tuple is backed by indexing each dimension at a time 627 \begin{ lstlisting}767 \begin{cfa} 628 768 img_trans :: (img_wd,img_ht)=>Real 629 769 img_trans.(i,j) = img.i.j 630 770 result = pairwise img_trans 631 \end{ lstlisting}771 \end{cfa} 632 772 [TODO: cite as simplification of example from https://openreview.net/pdf?id=rJxd7vsWPS section 4] 633 773 … … 661 801 void * malloc( size_t ); 662 802 // C, user 663 struct tm * el1 = malloc( 803 struct tm * el1 = malloc( sizeof(struct tm) ); 664 804 struct tm * ar1 = malloc( 10 * sizeof(struct tm) ); 665 805 -
TabularUnified doc/theses/mike_brooks_MMath/background.tex ¶
rbb336a6 rdbff8ec 67 67 68 68 69 \section{Reading Declarations}69 \section{Reading declarations} 70 70 71 71 A significant area of confusion for reading C declarations results from embedding a declared variable in a declaration, mimicking the way the variable is used in executable statements. … … 316 316 317 317 318 \subsection{Multi- Dimensional}318 \subsection{Multi-dimensional} 319 319 320 320 As in the last section, multi-dimensional array declarations are examined. … … 414 414 415 415 416 \subsection{Design Issues}416 \subsection{Design issues} 417 417 \label{toc:lst:issue} 418 418 … … 432 432 433 433 434 \subsection{Preexisting Linked-List Libraries}434 \subsection{Preexisting linked-list libraries} 435 435 436 436 Two preexisting linked-list libraries are used throughout, to show examples of the concepts being defined, … … 445 445 446 446 447 \subsection{Link attachment: Intrusive vs.\ Wrapped}447 \subsection{Link attachment: intrusive vs.\ wrapped} 448 448 \label{toc:lst:issue:attach} 449 449 … … 604 604 605 605 606 \subsection{Simultaneity: Single vs.\ Multi-Static vs.\ Dynamic}606 \subsection{Simultaneity: single vs.\ multi-static vs.\ dynamic} 607 607 \label{toc:lst:issue:simultaneity} 608 608 … … 729 729 730 730 731 \subsection{User integration: Preprocessed vs.\ Type-System Mediated}731 \subsection{User integration: preprocessed vs.\ type-system mediated} 732 732 733 733 % example of poor error message due to LQ's preprocessed integration … … 739 739 740 740 741 \subsection{List identity: Headed vs.\ Ad-hoc}741 \subsection{List identity: headed vs.\ ad-hoc} 742 742 \label{toc:lst:issue:ident} 743 743 … … 788 788 789 789 790 \subsection{End treatment: Cased vs.\ Uniform }790 \subsection{End treatment: cased vs.\ uniform } 791 791 792 792 This issue is implementation-level, relevant to achieving high performance. -
TabularUnified doc/theses/mike_brooks_MMath/conclusion.tex ¶
rbb336a6 rdbff8ec 7 7 \section{Strings} 8 8 9 \section{Future Work}9 \section{Future work} -
TabularUnified doc/theses/mike_brooks_MMath/intro.tex ¶
rbb336a6 rdbff8ec 16 16 17 17 18 \section{Linked List}18 \section{Linked list} 19 19 20 20 A linked-list provides a homogeneous container often with $O(log N)$/$O(N)$ access to elements using successor and predecessor operations. … … 59 59 60 60 61 \subsection{Ill- Typed Expressions}61 \subsection{Ill-typed expressions} 62 62 63 63 C reports many ill-typed expressions as warnings. … … 95 95 \section{Contributions} 96 96 97 \subsection{Linked List}97 \subsection{Linked list} 98 98 99 99 \subsection{Array} -
TabularUnified doc/theses/mike_brooks_MMath/programs/bkgd-carray-arrty.c ¶
rbb336a6 rdbff8ec 148 148 void stx2() { const T x[10]; 149 149 // x[5] = 3.14; // bad 150 150 } 151 151 void stx3() { T const x[10]; 152 152 // x[5] = 3.14; // bad 153 153 } 154 154 155 155 // Local Variables: // -
TabularUnified doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa ¶
rbb336a6 rdbff8ec 2 2 #include <stdlib.hfa> 3 3 #include <array.hfa> 4 #include <locale.h> // setlocale 4 5 5 6 … … 7 8 8 9 9 10 11 forall( T, [Nclients], [Ncosts] ) 12 struct request { 13 unsigned int requestor_id; 14 array( T, Nclients ) impacted_client_ids; // nested VLA 15 array( float, Ncosts ) cost_contribs; // nested VLA 16 float total_cost; 10 forall( T, @[NprovTerty]@, @[Nmunicipalities]@ ) 11 struct CanadaPop { 12 array( T, @NprovTerty@ ) provTerty; $\C{// nested VLA}$ 13 array( T, @Nmunicipalities@ ) municipalities; $\C{// nested VLA}$ 14 int total_pt, total_mun; 17 15 }; 18 16 … … 20 18 // TODO: understand (fix?) why these are needed (autogen seems to be failing ... is typeof as struct member nayok?) 21 19 22 forall( T, [N clients], [Ncosts] )23 void ?{}( T &, request( T, Nclients, Ncosts ) & this ) {}20 forall( T, [NprovTerty], [Nmunicipalities] ) 21 void ?{}( T &, CanadaPop( T, NprovTerty, Nmunicipalities ) & this ) {} 24 22 25 forall( T &, [N clients], [Ncosts] )26 void ^?{}( request( T, Nclients, Ncosts ) & this ) {}23 forall( T &, [NprovTerty], [Nmunicipalities] ) 24 void ^?{}( CanadaPop( T, NprovTerty, Nmunicipalities ) & this ) {} 27 25 28 26 … … 39 37 40 38 41 forall( T, [Nclients], [Ncosts] ) 42 void summarize( request( T, Nclients, Ncosts ) & r ) { 43 r.total_cost = 0; 44 for( i; Ncosts ) 45 r.total_cost += r.cost_contribs[i]; 46 // say the cost is per-client, to make output vary 47 r.total_cost *= Nclients; 39 40 forall( T, [NprovTerty], [Nmunicipalities] ) 41 void check( CanadaPop( T, NprovTerty, Nmunicipalities ) & pop ) with( pop ) { 42 total_pt = total_mun = 0; 43 for ( i; NprovTerty ) total_pt += provTerty[i]; 44 for ( i; Nmunicipalities ) total_mun += municipalities[i]; 48 45 } 49 46 … … 59 56 60 57 58 59 61 60 int main( int argc, char * argv[] ) { 62 const int ncl = ato( argv[1] ); 63 const int nco = 2; 64 65 request( int, ncl, nco ) r; 66 r.cost_contribs[0] = 100; 67 r.cost_contribs[1] = 0.1; 68 69 summarize(r); 70 sout | "Total cost:" | r.total_cost; 61 const int npt = ato( argv[1] ), nmun = ato( argv[2] ); 62 @CanadaPop( int, npt, nmun ) pop;@ 63 // read in population numbers 64 @check( pop );@ 65 sout | setlocale( LC_NUMERIC, getenv( "LANG" ) ); 66 sout | "Total province/territory:" | pop.total_pt; 67 sout | "Total municipalities:" | pop.total_mun; 71 68 } 72 69 /* 73 $\$$ ./a.out 5 74 Total cost: 500.5 75 $\$$ ./a.out 6 76 Total cost: 600.6 70 $\$$ ./a.out 13 3573 71 Total province/territory: 36,991,981 72 Total municipalities: 36,991,981 77 73 */ 74 75 // Local Variables: // 76 // compile-command: "sed -f sedcmd hello-accordion.cfa > ../build/tmp.cfa; cfa ../build/tmp.cfa -Wall -Wextra" // 77 // End: // -
TabularUnified doc/theses/mike_brooks_MMath/programs/hello-array.cfa ¶
rbb336a6 rdbff8ec 8 8 9 9 10 forall( [ N] ) // array bound11 array( bool, N) & f( array(float, N) & a, array(float, N) & b) {12 array( bool, N) & ret = *alloc(); // sizeof used by alloc13 for ( i; N) {14 ret[i] = 0.005 > 2 * (abs( a[i] - b[i])) / (abs(a[i]) + abs(b[i]));10 forall( [@N@] ) $\C{// array dimension}$ 11 array( bool, @N@) & f( array( float, @N@ ) & x, array( float, @N@ ) & y ) { 12 array( bool, @N@ ) & ret = *@alloc@(); $\C{// sizeof ret used by alloc}$ 13 for ( i; @N@ ) { 14 ret[i] = 0.005 > 2 * (abs( x[i] - y[i] )) / (abs( x[i]) + abs(y[i] )); 15 15 } 16 16 return ret; … … 29 29 30 30 int main( int argc, char * argv[] ) { 31 int n = ato( argv[1] );32 array( float, n) a, b; // VLA33 for ( i; n ) { 34 a[i] = 3.14 / (i + 1);35 b[i] = a[i] + 0.005 ;31 const int @n@ = ato( argv[1] ); $\C{// deduce conversion type}$ 32 array( float, @n@ ) x, y; $\C{// VLAs}$ 33 for ( i; n ) { $\C{// initialize arrays}$ 34 x[i] = 3.14 / (i + 1); 35 y[i] = x[i] + 0.005 ; 36 36 } 37 array( bool, n) & result = f( a, b ); // call38 sout | "result: " | nonl; 37 array( bool, @n@ ) & result = @f( x, y )@; $\C{// call}$ 38 sout | "result: " | nonl; $\C{// print result}$ 39 39 for ( i; n ) 40 40 sout | result[i] | nonl; 41 41 sout | nl; 42 free( &result ); // free returned storage42 free( &result ); $\C{// free result storage}$ 43 43 } 44 44 /* … … 50 50 51 51 void fred() { 52 array( float, 10) a;53 array( float, 20) b;54 f( a, a);55 f( b, b);56 f( a, b);52 array( float, @10@ ) x; 53 array( float, @20@ ) y; 54 f( x, x ); 55 f( y, y ); 56 // f( x, y ); 57 57 } 58 58 59 59 #ifdef SHOWERR1 60 60 forall( [M], [N] ) 61 void bad( array(float, M) & a, array(float, N) &b) {62 f( a, a ); // ok63 f( b, b ); // ok64 f( a, b ); // error61 void bad( array(float, M) &x, array(float, N) &y ) { 62 f( x, x ); $\C[1.5in]{// ok}$ 63 f( y, y ); $\C{// ok}$ 64 f( x, y ); $\C{// error}\CRT$ 65 65 } 66 66 #endif … … 69 69 70 70 forall( [M], [N] ) 71 void bad_fixed( array(float, M) & a, array(float, N) & b ) { 72 if ( M == N ) { 73 f( a, (array(float, M) &)b ); // cast b to matching type 74 } 71 void bad_fixed( array( float, M ) & x, array( float, N ) & y ) { 72 if ( M == N ) 73 f( x, @(array( float, M ) &)@y ); $\C{// cast y to matching type}$ 75 74 } 75 76 // Local Variables: // 77 // compile-command: "sed -f sedcmd hello-array.cfa > ../build/tmp.cfa; cfa ../build/tmp.cfa -Wall -Wextra" // 78 // End: // -
TabularUnified doc/theses/mike_brooks_MMath/programs/hello-md.cfa ¶
rbb336a6 rdbff8ec 39 39 40 40 forall( [N] ) 41 void print1d_cstyle( array( float, N) & c );41 void print1d_cstyle( array( float, N ) & r ); $\C{// C style}$ 42 42 43 forall( [N], C & | ar( C, float, N ) )43 forall( [N], C & @| ar( C, float, N )@ ) $\C{// add trait}$ 44 44 void print1d( C & c ); 45 45 … … 59 59 60 60 forall( [N] ) 61 void print1d_cstyle( array(float, N) & c ) { 62 for ( i; N ) { 63 sout | c[i] | nonl; 64 } 61 void print1d_cstyle( array( float, N ) & r ) { $\C{// C style}$ 62 for ( i; N ) sout | r[i] | nonl; 65 63 sout | nl; 66 64 } 65 66 67 67 68 68 … … 98 98 99 99 100 void fill( array( float, 5, 7) & a ) {100 void fill( array( float, 5, 7 ) & a ) { 101 101 for ( i; (ptrdiff_t) 5 ) { 102 102 for ( j; 7 ) { … … 116 116 117 117 118 array( float, 5, 7 ) a;119 fill( a);118 array( float, 5, 7 ) m; 119 fill( m ); 120 120 /* 121 0.0 0.1 0.2 0.3 0.4 0.5 0.6 122 1.0 1.1 1.2 1.3 1.4 1.5 1.6 123 2.0 2.1 2.2 2.3 2.4 2.5 2.6 124 3.0 3.1 3.2 3.3 3.4 3.5 3.6 125 4.0 4.1 4.2 4.3 4.4 4.5 4.6 121 r/c 0 1 2 3 4 5 6 122 0 0.0 0.1 0.2 @0.3@ 0.4 0.5 0.6 123 1 1.0 1.1 1.2 @1.3@ 1.4 1.5 1.6 124 2 @2.0 2.1 2.2 2.3 2.4 2.5 2.6@ 125 3 3.0 3.1 3.2 @3.3@ 3.4 3.5 3.6 126 4 4.0 4.1 4.2 @4.3@ 4.4 4.5 4.6 126 127 */ 127 128 … … 137 138 138 139 139 140 print1d_cstyle( a[ 2 ] ); // 2.0 2.1 2.2 2.3 2.4 2.5 2.6 140 print1d_cstyle( m[ 2 ] ); $\C{// row 2: 2.0 2.1 2.2 2.3 2.4 2.5 2.6}$ 141 141 142 142 143 143 144 144 145 print1d( a[ 2 ] ); // 2.0 2.1 2.2 2.3 2.4 2.5 2.6145 print1d( m[ 2 ] ); $\C{// row: 2.0 2.1 2.2 2.3 2.4 2.5 2.6}$ 146 146 147 147 148 148 149 149 150 print1d( a[ 2, all ] ); // 2.0 2.1 2.2 2.3 2.4 2.5 2.6151 print1d( a[ all, 3 ] ); // 0.3 1.3 2.3 3.3 4.3150 print1d( m[ 2, all ] ); $\C{// row 2: 2.0 2.1 2.2 2.3 2.4 2.5 2.6}$ 151 print1d( m[ all, 3 ] ); $\C{// column 3: 0.3 1.3 2.3 3.3 4.3}$ 152 152 153 153 154 154 155 print1d_cstyle( a[ 2, all ] );155 print1d_cstyle( m[ 2, all ] ); 156 156 157 157 … … 163 163 #ifdef SHOW_ERROR_1 164 164 165 print1d_cstyle( a[ all, 2 ] ); // bad165 print1d_cstyle( m[ all, 2 ] ); $\C{// bad}$ 166 166 167 167 #endif -
TabularUnified doc/theses/mike_brooks_MMath/programs/lst-issues-attach-reduction.hpp ¶
rbb336a6 rdbff8ec 101 101 class list { 102 102 struct node { 103 @LIST_ENTRY(node) links;@104 @El elem;@103 LIST_ENTRY(node) links; 104 El elem; 105 105 }; 106 106 LIST_HEAD(Impl, node); … … 111 111 } 112 112 void push_front( const El & src ) { 113 node * n = @new node()@;113 node * n = new node(); 114 114 n->elem = src; 115 115 LIST_INSERT_HEAD(&impl, n, links); -
TabularUnified doc/theses/mike_brooks_MMath/programs/lst-issues-wrapped-byref.run.cpp ¶
rbb336a6 rdbff8ec 49 49 for (auto const& cur : reqs) 50 50 printf("{%d %d} ", cur->pri, cur->rqr); 51 printf("\n");51 printf("\n"); 52 52 53 53 } -
TabularUnified doc/theses/mike_brooks_MMath/string.tex ¶
rbb336a6 rdbff8ec 3 3 \section{String} 4 4 5 \subsection{Logical Overlap}5 \subsection{Logical overlap} 6 6 7 7 \input{sharing-demo.tex} … … 38 38 39 39 40 \subsection{Memory Management}40 \subsection{Memory management} 41 41 42 42 A centrepriece of the string module is its memory manager. The managment scheme defines a large shared buffer for strings' text. Allocation in this buffer is always bump-pointer; the buffer is compacted and/or relocated with growth when it fills. A string is a smart pointer into this buffer. … … 80 80 81 81 82 \subsection{Avoiding Implicit Sharing}82 \subsection{Avoiding implicit sharing} 83 83 84 84 There are tradeoffs associated with the copy-on-write mechanism. Several quatitative matters are detailed in the [xref: Performance Assessment] section and the qualitiative issue of multi-threaded support is introduced here. The \CFA sting library provides a switch to disable the sharing mechanism for situtations where it is inappropriate. … … 98 98 99 99 100 \subsection{Future Work}100 \subsection{Future work} 101 101 102 102 To discuss: Unicode … … 105 105 106 106 107 \subsection{Performance Assessment}107 \subsection{Performance assessment} 108 108 109 109 I assessed the CFA string library's speed and memory usage. I present these results ineven quivalent cases, due to either micro-optimizations foregone, or fundamental costs of the added functionality. They also show the benefits and tradeoffs, as >100\% effects, of switching to CFA, with the tradeoff points quantified. The final test shows the overall win of the CFA text-sharing mechanism. It exercises several operations together, showing CFA enabling clean user code to achieve performance that STL requires less-clean user code to achieve. -
TabularUnified doc/theses/mike_brooks_MMath/uw-ethesis.tex ¶
rbb336a6 rdbff8ec 104 104 \lstset{language=cfa,belowskip=-1pt} % set default language to CFA 105 105 \lstnewenvironment{c++}[1][]{\lstset{language=[GNU]C++,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{} 106 \lstnewenvironment{pascal}[1][]{\lstset{language=pascal,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{} 106 107 \lstset{inputpath={programs}} 107 108 -
TabularUnified libcfa/src/enum.cfa ¶
rbb336a6 rdbff8ec 1 1 #include "enum.hfa" 2 2 #include "fstream.hfa" 3 #include <stdlib.h> // qsort 3 4 #include <string.h> 4 5 5 6 #pragma GCC visibility push(default) 6 7 8 int scmp( const void * str1, const void * str2 ) { 9 return -strcmp( *(char **)str1, *(char **)str2 ); // dscending order 10 } // scmp 11 7 12 forall( istype & | istream( istype ), E, V | CfaEnum( E, V ) ) 8 13 istype & ?|?( istype & is, E & e ) { 14 // printf( "here0\n" ); 9 15 if ( eof( is ) ) throwResume ExceptionInst( missing_data ); 10 char val[256]; 11 int args = fmt( is, "%255s", val ); 12 if ( ! eof( is ) && args != 1 ) throwResume ExceptionInst( missing_data ); 16 17 // Match input enumerator string to enumerator labels. 18 int len = -1; 19 const char * cpy[ 20 /*countof( E )*/ ]; 20 int i = 0; 13 21 for ( s; E ) { 14 if ( strcmp(val, label( s )) == 0 ) { e = s; break; } 22 cpy[i] = label( s ); 23 printf( "%s\n", cpy[i] ); 24 i += 1; 25 } 26 printf( "%d\n", i ); 27 qsort( cpy, i, sizeof(char*), scmp ); 28 i = 0; 29 for ( s; E ) { 30 printf( "%s\n", cpy[i] ); 31 i += 1; 32 } 33 int j = 0; 34 X : for ( s; E ) { 35 len = strlen( cpy[j] ); 36 printf( "%s %d\n", cpy[j], len ); 37 char fmtstr[len + 16]; 38 fmtstr[0] = ' '; // optional leadig whitespace 39 strcpy( &fmtstr[1], cpy[j] ); // copy format and add %n 40 strcpy( &fmtstr[len + 1], "%n" ); 41 printf( "%s\n", fmtstr ); 42 len = -1; 43 // scanf cursor does not move if no match 44 fmt( is, fmtstr, &len ); 45 printf( "%s %d %d\n", fmtstr, len, j ); 46 if ( eof( is ) ) { break; } 47 if ( len != -1 ) { 48 for ( s; E ) { 49 if ( strcmp( label( s ), cpy[j] ) == 0 ) { e = s; break X; } 50 } 51 } 52 j += 1; 15 53 } else { 16 fprintf( stderr, "invalid enumeration constant\n" ); 17 abort(); // cannot use abort stream 54 //ExceptionInst( missing_data ); 18 55 } // for 56 printf( "X %s %d\n", label( e ), len ); 57 if ( ! eof( is ) && len == -1 ) throwResume ExceptionInst( missing_data ); 58 59 // if ( eof( is ) ) throwResume ExceptionInst( missing_data ); 60 // char val[256]; 61 // int args = fmt( is, "%255s", val ); 62 // if ( ! eof( is ) && args != 1 ) throwResume ExceptionInst( missing_data ); 63 // for ( s; E ) { 64 // if ( strcmp(val, label( s )) == 0 ) { e = s; break; } 65 // } else { 66 // fprintf( stderr, "invalid enumeration constant\n" ); 67 // abort(); // cannot use abort stream 68 // } // for 19 69 return is; 20 70 } -
TabularUnified libcfa/src/iostream.cfa ¶
rbb336a6 rdbff8ec 10 10 // Created On : Wed May 27 17:56:53 2015 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : Mon Feb 12 09:26:05202413 // Update Count : 196612 // Last Modified On : Mon Jul 8 23:45:06 2024 13 // Update Count : 2018 14 14 // 15 15 … … 772 772 773 773 774 #define FALSE "false" 775 #define TRUE "true" 776 774 777 forall( istype & | basic_istream( istype ) ) { 775 778 istype & ?|?( istype & is, bool & b ) { 776 779 if ( eof( is ) ) throwResume ExceptionInst( missing_data ); 777 char val[6]; 778 int args = fmt( is, "%5s", val ); 779 if ( ! eof( is ) && args != 1 ) throwResume ExceptionInst( missing_data ); 780 if ( strcmp( val, "true" ) == 0 ) b = true; 781 else if ( strcmp( val, "false" ) == 0 ) b = false; 782 else { 783 fprintf( stderr, "invalid Boolean constant\n" ); 784 abort(); // cannot use abort stream 780 int len = -1; // len not set if no match 781 // Optional leading whitespace at start of strings. 782 fmt( is, " " FALSE "%n", &len ); // try false 783 if ( len != sizeof( FALSE ) - 1 ) { // remove null terminate 784 fmt( is, " " TRUE "%n", &len ); // try true 785 if ( len != sizeof( TRUE ) - 1 ) throwResume ExceptionInst( missing_data ); 786 b = true; 787 } else { 788 b = false; 785 789 } // if 786 790 return is; … … 940 944 } // ?|? 941 945 942 istype & ?|?( istype & is, const char fmt[] ) { 946 istype & ?|?( istype & is, const char fmt[] ) { // match text 943 947 if ( eof( is ) ) throwResume ExceptionInst( missing_data ); 944 948 size_t len = strlen( fmt ); … … 946 950 strcpy( fmtstr, fmt ); // copy format and add %n 947 951 strcpy( &fmtstr[len], "%n" ); 948 int len2 = -1; 949 fmt( is, fmtstr, &len2 ); 950 if ( ! eof( is ) && len2 == -1 ) throwResume ExceptionInst( missing_data ); 952 len = -1; 953 // scanf cursor does not move if no match 954 fmt( is, fmtstr, &len ); 955 if ( ! eof( is ) && len == -1 ) throwResume ExceptionInst( missing_data ); 951 956 return is; 952 957 } // ?|? -
TabularUnified libcfa/src/iostream.hfa ¶
rbb336a6 rdbff8ec 10 10 // Created On : Wed May 27 17:56:53 2015 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : S un Apr 21 07:32:19202413 // Update Count : 7 4412 // Last Modified On : Sat Jul 6 11:33:31 2024 13 // Update Count : 758 14 14 // 15 15 … … 359 359 istype & ?|?( istype &, long double _Complex & ); 360 360 361 istype & ?|?( istype &, const char [] ); 361 // This is too restrictive as it prevents building a format in a string and using that format. 362 // inline istype & ?|?( istype &, char [] ) { // possible error, too restrictive to change 363 // _Static_assert( false, "reading a character array without a maximum length is unsafe. Use input manipulator \"wdi( N, s )\", where \"char s[N]\" or fmt( s )." ); 364 // } 365 istype & ?|?( istype &, const char [] ); // match text 362 366 363 367 // manipulators -
TabularUnified libcfa/src/parseargs.cfa ¶
rbb336a6 rdbff8ec 10 10 // Author : Thierry Delisle 11 11 // Created On : Wed Oct 12 15:28:01 2022 12 // Last Modified By : 13 // Last Modified On : 14 // Update Count : 12 // Last Modified By : Peter A. Buhr 13 // Last Modified On : Mon Jul 8 18:18:23 2024 14 // Update Count : 7 15 15 // 16 16 … … 32 32 extern FILE * stdout; 33 33 34 extern int fileno( FILE *stream);34 extern int fileno( FILE *stream ); 35 35 36 36 extern int fprintf ( FILE * stream, const char * format, ... ); 37 37 38 extern long long int strtoll ( const char* str, char** endptr, int base);39 extern unsigned long long int strtoull( const char* str, char** endptr, int base);40 extern double strtod ( const char* str, char** endptr);38 extern long long int strtoll ( const char* str, char** endptr, int base ); 39 extern unsigned long long int strtoull( const char* str, char** endptr, int base ); 40 extern double strtod ( const char* str, char** endptr ); 41 41 } 42 42 … … 44 44 #include "limits.hfa" 45 45 46 #pragma GCC visibility push( default)47 48 extern int cfa_args_argc __attribute__(( weak));49 extern char ** cfa_args_argv __attribute__(( weak));50 extern char ** cfa_args_envp __attribute__(( weak));46 #pragma GCC visibility push( default ) 47 48 extern int cfa_args_argc __attribute__(( weak )); 49 extern char ** cfa_args_argv __attribute__(( weak )); 50 extern char ** cfa_args_envp __attribute__(( weak )); 51 51 52 52 forall([N]) 53 static void usage( char * cmd, const array( cfa_option, N ) & options, const char * usage, FILE * out) __attribute__ ((noreturn));53 static void usage( char * cmd, const array( cfa_option, N ) & options, const char * usage, FILE * out ) __attribute__ (( noreturn )); 54 54 //----------------------------------------------------------------------------- 55 55 // checking 56 56 forall([N]) 57 57 static void check_args( const array( cfa_option, N ) & options ) { 58 for(i; N) { 59 for(j; N) { 60 if(i == j) continue; 61 62 if( options[i].short_name != '\0' 63 && options[i].short_name == options[j].short_name) 64 abort("Parse Args error: two options have short name '%c' (%zu & %zu)", options[i].short_name, i, j); 65 66 if(0 == strcmp(options[i].long_name, options[j].long_name)) abort("Parse Args error: two options have long name '%s' (%zu & %zu)", options[i].long_name, i, j); 58 for ( i; N ) { 59 for ( j; N ) { 60 if ( i == j ) continue; 61 62 if ( options[i].short_name != '\0' 63 && options[i].short_name == options[j].short_name ) 64 abort( "Parse Args error: two options have short name '%c' (%zu & %zu)", options[i].short_name, i, j ); 65 66 if (0 == strcmp( options[i].long_name, options[j].long_name )) 67 abort( "Parse Args error: two options have long name '%s' (%zu & %zu)", options[i].long_name, i, j ); 67 68 } 68 69 } … … 74 75 forall([opt_count]) { 75 76 void parse_args( const array( cfa_option, opt_count ) & options, const char * usage, char ** & left ) { 76 if( 0p != &cfa_args_argc ) { 77 parse_args(cfa_args_argc, cfa_args_argv, options, usage, left ); 78 } 79 else { 77 if ( 0p != &cfa_args_argc ) { 78 parse_args( cfa_args_argc, cfa_args_argv, options, usage, left ); 79 } else { 80 80 char * temp[1] = { 0p }; 81 81 parse_args(0, temp, options, usage, left ); … … 90 90 char ** & left 91 91 ) { 92 check_args( options);92 check_args( options ); 93 93 94 94 int maxv = 'h'; … … 96 96 { 97 97 int idx = 0; 98 for (i; opt_count) {99 if ( options[i].short_name) {100 maxv = max( options[i].short_name, maxv);98 for ( i; opt_count ) { 99 if ( options[i].short_name ) { 100 maxv = max( options[i].short_name, maxv ); 101 101 optstring[idx] = options[i].short_name; 102 102 idx++; 103 if ( ((intptr_t)options[i].parse) != ((intptr_t)parse_settrue)104 && ((intptr_t)options[i].parse) != ((intptr_t)parse_setfalse) ) {103 if ( (intptr_t)options[i].parse != (intptr_t)parse_settrue 104 && ((intptr_t)options[i].parse) != ((intptr_t)parse_setfalse) ) { 105 105 optstring[idx] = ':'; 106 106 idx++; … … 115 115 { 116 116 int idx = 0; 117 for (i; opt_count) {118 if (options[i].long_name) {117 for ( i; opt_count ) { 118 if ( options[i].long_name ) { 119 119 // we don't have the mutable keyword here, which is really what we would want 120 120 int & val_ref = (int &)(const int &)options[i].val; … … 124 124 optarr[idx].flag = 0p; 125 125 optarr[idx].val = options[i].val; 126 if (((intptr_t)options[i].parse) == ((intptr_t)parse_settrue)127 || ((intptr_t)options[i].parse) == ((intptr_t)parse_setfalse) ) {126 if ( ((intptr_t)options[i].parse) == ((intptr_t)parse_settrue) 127 || ((intptr_t)options[i].parse) == ((intptr_t)parse_setfalse) ) { 128 128 optarr[idx].has_arg = no_argument; 129 129 } else { … … 139 139 FILE * out = stderr; 140 140 NEXT_ARG: 141 for () {141 for () { 142 142 int idx = 0; 143 int opt = getopt_long( argc, argv, optstring, optarr, &idx);144 switch( opt) {143 int opt = getopt_long( argc, argv, optstring, optarr, &idx ); 144 switch( opt ) { 145 145 case -1: 146 if (&left != 0p) left = argv + optind;146 if ( &left != 0p ) left = argv + optind; 147 147 return; 148 148 case 'h': 149 149 out = stdout; 150 150 case '?': 151 usage( argv[0], options, usage, out);151 usage( argv[0], options, usage, out ); 152 152 default: 153 for (i; opt_count) {154 if (opt == options[i].val) {153 for ( i; opt_count ) { 154 if ( opt == options[i].val ) { 155 155 const char * arg = optarg ? optarg : ""; 156 if ( arg[0] == '=' ) { arg++; }156 if ( arg[0] == '=' ) { arg++; } 157 157 // work around for some weird bug 158 158 void * variable = options[i].variable; 159 159 bool (*parse_func)(const char *, void * ) = options[i].parse; 160 160 bool success = parse_func( arg, variable ); 161 if (success) continue NEXT_ARG;162 163 fprintf( out, "Argument '%s' for option %c could not be parsed\n\n", arg, (char)opt);164 usage( argv[0], options, usage, out);161 if ( success ) continue NEXT_ARG; 162 163 fprintf( out, "Argument '%s' for option %c could not be parsed\n\n", arg, (char)opt ); 164 usage( argv[0], options, usage, out ); 165 165 } 166 166 } 167 abort( "Internal parse arg error\n");167 abort( "Internal parse arg error\n" ); 168 168 } 169 169 … … 172 172 } 173 173 174 static inline int next_newline( const char * str) {174 static inline int next_newline( const char * str ) { 175 175 int ret; 176 const char * ptr = strstr( str, "\n");177 if (!ptr) return MAX;178 179 /* paranoid */ verify( str <= ptr );176 const char * ptr = strstr( str, "\n" ); 177 if ( ! ptr ) return MAX; 178 179 /* paranoid */ verify( str <= ptr ); 180 180 intptr_t low = (intptr_t)str; 181 181 intptr_t hi = (intptr_t)ptr; … … 187 187 //----------------------------------------------------------------------------- 188 188 // Print usage 189 static void printopt( FILE * out, int width, int max, char sn, const char * ln, const char * help) {189 static void printopt( FILE * out, int width, int max, char sn, const char * ln, const char * help ) { 190 190 // check how wide we should be printing 191 191 // this includes all options and the help message 192 192 int hwidth = max - (11 + width); 193 if (hwidth <= 0) hwidth = max;193 if ( hwidth <= 0 ) hwidth = max; 194 194 195 195 // check which pieces we have 196 bool has_ln = ln && strcmp( "", ln);197 bool has_help = help && strcmp( "", help);196 bool has_ln = ln && strcmp( "", ln ); 197 bool has_help = help && strcmp( "", help ); 198 198 199 199 // print the small name if present 200 if (sn != '\0') fprintf(out, " -%c", sn);201 else fprintf( out, " ");200 if ( sn != '\0') fprintf( out, " -%c", sn ); 201 else fprintf( out, " " ); 202 202 203 203 // print a comma if we have both short and long names 204 if (sn != '\0' && has_ln) fprintf(out, ", ");205 else fprintf( out, " ");204 if ( sn != '\0' && has_ln ) fprintf( out, ", " ); 205 else fprintf( out, " " ); 206 206 207 207 // print the long name if present 208 if (has_ln) fprintf(out, "--%-*s", width, ln);209 else if (has_help) fprintf(out, " %-*s", width, "");210 211 if (has_help) {208 if ( has_ln ) fprintf( out, "--%-*s", width, ln ); 209 else if ( has_help ) fprintf( out, " %-*s", width, "" ); 210 211 if ( has_help ) { 212 212 // print the help 213 213 // We need to wrap at the max width, and also indent newlines so everything is nice and pretty 214 214 215 215 // for each line to print 216 for () {216 for () { 217 217 //find out if there is a newline 218 int nextnl = next_newline( help);219 int real = min( min(strlen(help), hwidth), nextnl);220 221 fprintf( out, " %.*s", real, help);222 // printf( "%d %d\n", real, nextnl);218 int nextnl = next_newline( help ); 219 int real = min( min( strlen( help ), hwidth ), nextnl ); 220 221 fprintf( out, " %.*s", real, help ); 222 // printf( "%d %d\n", real, nextnl ); 223 223 help += real; 224 if ( nextnl == real ) help++;225 if ('\0' == *help) break;226 fprintf( out, "\n%*s", width + 8, "");227 } 228 } 229 fprintf( out, "\n");230 } 231 232 void print_args_usage( cfa_option options[], const size_t opt_count, const char * usage, bool error) __attribute__ ((noreturn)) {224 if ( nextnl == real ) help++; 225 if ('\0' == *help ) break; 226 fprintf( out, "\n%*s", width + 8, "" ); 227 } 228 } 229 fprintf( out, "\n" ); 230 } 231 232 void print_args_usage( cfa_option options[], const size_t opt_count, const char * usage, bool error ) __attribute__ ((noreturn )) { 233 233 const array( cfa_option, opt_count ) & arr = (const array( cfa_option, opt_count ) &) *options; 234 usage( cfa_args_argv[0], arr, usage, error ? stderr : stdout);235 } 236 237 void print_args_usage( int , char * argv[], cfa_option options[], const size_t opt_count, const char * usage, bool error) __attribute__ ((noreturn)) {234 usage( cfa_args_argv[0], arr, usage, error ? stderr : stdout ); 235 } 236 237 void print_args_usage( int , char * argv[], cfa_option options[], const size_t opt_count, const char * usage, bool error ) __attribute__ (( noreturn )) { 238 238 const array( cfa_option, opt_count ) & arr = (const array( cfa_option, opt_count ) &) *options; 239 usage( argv[0], arr, usage, error ? stderr : stdout);239 usage( argv[0], arr, usage, error ? stderr : stdout ); 240 240 } 241 241 242 242 forall( [N] ) { 243 void print_args_usage( const array(cfa_option, N ) & options, const char * usage, bool error ) {244 usage( cfa_args_argv[0], options, usage, error ? stderr : stdout);245 } 246 247 void print_args_usage( int argc, char * argv[], const array( cfa_option, N ) & options, const char * usage, bool error) {248 usage( argv[0], options, usage, error ? stderr : stdout);243 void print_args_usage( const array(cfa_option, N ) & options, const char * usage, bool error ) { 244 usage( cfa_args_argv[0], options, usage, error ? stderr : stdout ); 245 } 246 247 void print_args_usage( int argc, char * argv[], const array( cfa_option, N ) & options, const char * usage, bool error ) { 248 usage( argv[0], options, usage, error ? stderr : stdout ); 249 249 } 250 250 } 251 251 252 252 forall([N]) 253 static void usage( char * cmd, const array( cfa_option, N ) & options, const char * help, FILE * out) __attribute__((noreturn)) {253 static void usage( char * cmd, const array( cfa_option, N ) & options, const char * help, FILE * out ) __attribute__(( noreturn )) { 254 254 int width = 0; 255 255 { 256 for (i; N) {257 if (options[i].long_name) {258 int w = strlen( options[i].long_name);259 if (w > width) width = w;256 for ( i; N ) { 257 if ( options[i].long_name ) { 258 int w = strlen( options[i].long_name ); 259 if ( w > width ) width = w; 260 260 } 261 261 } … … 263 263 264 264 int max_width = 1_000_000; 265 int outfd = fileno( out);266 if (isatty(outfd)) {265 int outfd = fileno( out ); 266 if ( isatty( outfd ) ) { 267 267 struct winsize size; 268 int ret = ioctl( outfd, TIOCGWINSZ, &size);269 if (ret < 0) abort( "ioctl error: (%d) %s\n", (int)errno, strerror(errno) );268 int ret = ioctl( outfd, TIOCGWINSZ, &size ); 269 if ( ret < 0 ) abort( "ioctl error: (%d) %s\n", (int)errno, strerror( errno) ); 270 270 max_width = size.ws_col; 271 271 } 272 272 273 fprintf( out, "Usage:\n %s %s\n", cmd, help);274 275 for (i; N) {276 printopt( out, width, max_width, options[i].short_name, options[i].long_name, options[i].help);277 } 278 fprintf( out, " -%c, --%-*s %s\n", 'h', width, "help", "print this help message");279 exit( out == stdout ? 0 : 1);273 fprintf( out, "Usage:\n %s %s\n", cmd, help ); 274 275 for ( i; N ) { 276 printopt( out, width, max_width, options[i].short_name, options[i].long_name, options[i].help ); 277 } 278 fprintf( out, " -%c, --%-*s %s\n", 'h', width, "help", "print this help message" ); 279 exit( out == stdout ? 0 : 1 ); 280 280 } 281 281 282 282 //----------------------------------------------------------------------------- 283 283 // Typed argument parsing 284 bool parse_yesno( const char * arg, bool & value ) {285 if (strcmp(arg, "yes") == 0) {284 bool parse_yesno( const char * arg, bool & value ) { 285 if ( strcmp( arg, "yes" ) == 0 ) { 286 286 value = true; 287 287 return true; 288 288 } 289 290 if(strcmp(arg, "Y") == 0) { 289 if ( strcmp( arg, "Y" ) == 0 ) { 291 290 value = true; 292 291 return true; 293 292 } 294 295 if(strcmp(arg, "y") == 0) { 293 if ( strcmp( arg, "y" ) == 0 ) { 296 294 value = true; 297 295 return true; 298 296 } 299 300 if(strcmp(arg, "no") == 0) { 297 if ( strcmp( arg, "no" ) == 0 ) { 301 298 value = false; 302 299 return true; 303 300 } 304 305 if(strcmp(arg, "N") == 0) { 301 if ( strcmp( arg, "N" ) == 0 ) { 306 302 value = false; 307 303 return true; 308 304 } 309 310 if(strcmp(arg, "n") == 0) { 305 if ( strcmp( arg, "n" ) == 0 ) { 311 306 value = false; 312 307 return true; 313 308 } 314 315 309 return false; 316 310 } 317 311 318 bool parse_truefalse( const char * arg, bool & value) {319 if (strcmp(arg, "true") == 0) {312 bool parse_truefalse( const char * arg, bool & value ) { 313 if ( strcmp( arg, "true" ) == 0 ) { 320 314 value = true; 321 315 return true; 322 316 } 323 324 if(strcmp(arg, "false") == 0) { 317 if ( strcmp( arg, "false" ) == 0 ) { 325 318 value = false; 326 319 return true; 327 320 } 328 329 321 return false; 330 322 } 331 323 332 bool parse_settrue ( const char *, bool & value ) {324 bool parse_settrue ( const char *, bool & value ) { 333 325 value = true; 334 326 return true; 335 327 } 336 328 337 bool parse_setfalse( const char *, bool & value ) {329 bool parse_setfalse( const char *, bool & value ) { 338 330 value = false; 339 331 return true; 340 332 } 341 333 342 bool parse( const char * arg, const char * & value ) {334 bool parse( const char * arg, const char * & value ) { 343 335 value = arg; 344 336 return true; 345 337 } 346 338 347 bool parse( const char * arg, int & value) {339 bool parse( const char * arg, int & value ) { 348 340 char * end; 349 341 350 342 errno = 0; 351 long long int r = strtoll(arg, &end, 0); 352 if(errno) return false; 353 if(*end != '\0') return false; 354 if(r > (int)MAX) return false; 355 if(r < (int)MIN) return false; 356 343 long long int r = strtoll( arg, &end, 0 ); 344 if ( errno ) return false; 345 if (*end != '\0') return false; 346 if ( r > (int)MAX ) return false; 347 if ( r < (int)MIN ) return false; 357 348 value = r; 358 349 return true; 359 350 } 360 351 361 static unsigned long long int strict_strtoull( const char * arg, int base ) {352 static unsigned long long int strict_strtoull( const char * arg, int base ) { 362 353 errno = 0; 363 354 { 364 355 const char * in = arg; 365 for () {366 if ('\0' == *in) {356 for () { 357 if ( '\0' == *in ) { 367 358 errno = EINVAL; 368 359 return 0; 369 360 } 370 if (!isspace(*in)) break;361 if ( ! isspace(*in )) break; 371 362 in++; 372 363 } 373 if (!isdigit(*in)) {364 if ( ! isdigit(*in )) { 374 365 errno = EINVAL; 375 366 return 0; 376 367 } 377 368 } 378 379 369 *char end; 380 unsigned long long int r = strtoull( arg, &end, base);381 if (*end != '\0') errno = EINVAL;382 if (errno) return 0;370 unsigned long long int r = strtoull( arg, &end, base ); 371 if (*end != '\0') errno = EINVAL; 372 if ( errno ) return 0; 383 373 return r; 384 374 } 385 375 386 bool parse(const char * arg, unsigned & value) { 387 unsigned long long int r = strict_strtoull(arg, 0); 388 if(errno) return false; 389 if(r > (unsigned)MAX) return false; 390 376 bool parse( const char * arg, unsigned & value ) { 377 unsigned long long int r = strict_strtoull( arg, 0 ); 378 if ( errno ) return false; 379 if ( r > (unsigned)MAX ) return false; 391 380 value = r; 392 381 return true; 393 382 } 394 383 395 bool parse(const char * arg, unsigned long & value) { 396 unsigned long long int r = strict_strtoull(arg, 0); 397 if(errno) return false; 398 if(r > (unsigned long)MAX) return false; 399 384 bool parse( const char * arg, unsigned long & value ) { 385 unsigned long long int r = strict_strtoull( arg, 0 ); 386 if ( errno ) return false; 387 if ( r > (unsigned long)MAX ) return false; 400 388 value = r; 401 389 return true; 402 390 } 403 391 404 bool parse(const char * arg, unsigned long long & value) { 405 unsigned long long int r = strict_strtoull(arg, 0); 406 if(errno) return false; 407 if(r > (unsigned long long)MAX) return false; 408 392 bool parse( const char * arg, unsigned long long & value ) { 393 unsigned long long int r = strict_strtoull( arg, 0 ); 394 if ( errno ) return false; 395 if ( r > (unsigned long long)MAX ) return false; 409 396 value = r; 410 397 return true; 411 398 } 412 399 413 bool parse( const char * arg, double & value) {400 bool parse( const char * arg, double & value ) { 414 401 char * end; 415 double r = strtod(arg, &end); 416 if(*end != '\0') return false; 417 402 double r = strtod( arg, &end ); 403 if ( *end != '\0') return false; 418 404 value = r; 419 405 return true; -
TabularUnified libcfa/src/parseargs.hfa ¶
rbb336a6 rdbff8ec 10 10 // Author : Thierry Delisle 11 11 // Created On : Wed Oct 12 15:28:01 2022 12 // Last Modified By : 13 // Last Modified On : 14 // Update Count : 12 // Last Modified By : Peter A. Buhr 13 // Last Modified On : Mon Jul 8 18:18:14 2024 14 // Update Count : 2 15 15 // 16 16 #pragma once … … 31 31 static inline void ?{}( cfa_option & this ) {} 32 32 33 forall(T & | { bool parse( const char *, T & ); })33 forall(T & | { bool parse( const char *, T & ); }) 34 34 static inline void ?{}( cfa_option & this, char short_name, const char * long_name, const char * help, T & variable ) { 35 35 this.val = 0; -
TabularUnified src/Parser/ExpressionNode.cpp ¶
rbb336a6 rdbff8ec 483 483 Default: // char default string type 484 484 default: 485 strtype = new ast::BasicType( ast::BasicKind::Char , ast::CV::Const);485 strtype = new ast::BasicType( ast::BasicKind::Char ); 486 486 } // switch 487 487 ast::ArrayType * at = new ast::ArrayType( -
TabularUnified src/Parser/parser.yy ¶
rbb336a6 rdbff8ec 10 10 // Created On : Sat Sep 1 20:22:55 2001 11 11 // Last Modified By : Peter A. Buhr 12 // Last Modified On : T hu Jun 27 14:45:57202413 // Update Count : 67 0512 // Last Modified On : Tue Jul 9 10:29:01 2024 13 // Update Count : 6713 14 14 // 15 15 … … 1998 1998 c_declaration ';' 1999 1999 | cfa_declaration ';' // CFA 2000 | static_assert // C112000 | static_assert ';' // C11 2001 2001 ; 2002 2002 2003 2003 static_assert: 2004 STATICASSERT '(' constant_expression ',' string_literal ')' ';'// C112004 STATICASSERT '(' constant_expression ',' string_literal ')' // C11 2005 2005 { $$ = DeclarationNode::newStaticAssert( $3, maybeMoveBuild( $5 ) ); } 2006 | STATICASSERT '(' constant_expression ')' ';'// CFA2006 | STATICASSERT '(' constant_expression ')' // CFA 2007 2007 { $$ = DeclarationNode::newStaticAssert( $3, build_constantStr( yylloc, *new string( "\"\"" ) ) ); } 2008 2008 … … 2709 2709 { $$ = $2; } // mark all fields in list 2710 2710 | cfa_typedef_declaration ';' // CFA 2711 | static_assert 2711 | static_assert ';' // C11 2712 2712 ; 2713 2713 … … 3364 3364 $$ = $6; 3365 3365 } 3366 | ';' // empty declaration 3367 { $$ = nullptr; } 3366 3368 ; 3367 3369
Note: See TracChangeset
for help on using the changeset viewer.