Changes in / [bb336a6:dbff8ec]


Ignore:
Files:
1 added
20 edited

Legend:

Unmodified
Added
Removed
  • INSTALL

    rbb336a6 rdbff8ec  
    4040
    4141  $ cd ./test
    42   $ make -j 8 all-tests
     42  $ make -j 8 all-local
    4343
    4444The tests take about 2-5 minutes and can be stopped at any time.
  • doc/theses/mike_brooks_MMath/array.tex

    rbb336a6 rdbff8ec  
    22\label{c:Array}
    33
     4
    45\section{Introduction}
    56
    6 This chapter describes my contribution of language and library features that provide a length-checked array type, as in:
    7 
    8 \begin{lstlisting}
    9 array(float, 99) x;    // x contains 99 floats
    10 
    11 void f( array(float, 42) & a ) {}
    12 f(x);                  // statically rejected: types are different
     7Arrays in C are possible the single most misunderstood and incorrectly used features in the language, resulting in the largest proportion of runtime errors and security violations.
     8This chapter describes the new \CFA language and library features that introduce a length-checked array-type to the \CFA standard library~\cite{Cforall}, \eg:
     9\begin{cfa}
     10@array( float, 99 )@ x;                                 $\C{// x contains 99 floats}$
     11void f( @array( float, 42 )@ & p ) {}   $\C{// p accepts 42 floats}$
     12f( x );                                                                 $\C{// statically rejected: types are different, 99 != 42}$
    1313
    1414forall( T, [N] )
    15 void g( array(T, N) & a, int i ) {
    16         T elem = a[i];     // dynamically checked: requires 0 <= i < N
    17 }
    18 g(x, 0);               // T is float, N is 99, succeeds
    19 g(x, 1000);            // T is float, N is 99, dynamic check fails
    20 \end{lstlisting}
    21 
    22 This example first declares @x@ a variable, whose type is an instantiation of the generic type named @array@, with arguments @float@ and @99@.
    23 Next, it declares @f@ as a function that expects a length-42 array; the type system rejects the call's attempt to pass @x@ to @f@, because the lengths do not match.
    24 Next, the @forall@ annotation on function @g@ introduces @T@ as a familiar type parameter and @N@ as a \emph{dimension} parameter, a new feature that represents a count of elements, as managed by the type system.
    25 Because @g@ accepts any length of array; the type system accepts the calls' passing @x@ to @g@, inferring that this length is 99.
    26 Just as the caller's code does not need to explain that @T@ is @float@, the safe capture and communication of the value @99@ occurs without programmer involvement.
    27 In the case of the second call (which passes the value 1000 for @i@), within the body of @g@, the attempt to subscript @a@ by @i@ fails with a runtime error, since $@i@ \nless @N@$.
    28 
    29 The type @array@, as seen above, comes from my additions to the \CFA standard library.
    30 It is very similar to the built-in array type, which \CFA inherits from C.
     15void g( @array( T, N )@ & p, int i ) {
     16        T elem = p[i];                                          $\C{// dynamically checked: requires 0 <= i < N}$
     17}
     18g( x, 0 );                                                              $\C{// T is float, N is 99, dynamic subscript check succeeds}$
     19g( x, 1000 );                                                   $\C{// T is float, N is 99, dynamic subscript check fails}$
     20\end{cfa}
     21This example declares variable @x@, with generic type @array@ using arguments @float@ and @99@.
     22Function @f@ is declared with an @array@ parameter of length @42@.
     23The call @f( x )@ is invalid because the @array@ lengths @99@ and @42@ do not match.
     24Next, function @g@ introduces a @forall@ prefix on type parameter @T@ and arbitrary \emph{dimension parameter} @N@, the new feature that represents a count of elements managed by the type system.
     25The call @g( x, 0 )@ is valid because @g@ accepts any length of array, where the type system infers @float@ for @T@ and length @99@ for @N@.
     26Inferring values for @T@ and @N@ is implicit without programmer involvement.
     27Furthermore, the runtime subscript @x[0]@ (parameter @i@ is @0@) in @g@ is valid because @0@ is in the dimension range $[0,99)$ of argument @x@.
     28The call @g( x, 1000 )@ is also valid;
     29however, the runtime subscript @x[1000]@ is invalid (generates a subscript error) because @1000@ is outside the dimension range $[0,99)$ of argument @x@.
     30
     31The generic @array@ type is similar to the C array type, which \CFA inherits from C.
    3132Its runtime characteristics are often identical, and some features are available in both.
    32 
    33 \begin{lstlisting}
     33For example, assume a caller can instantiates @N@ with 42 in the following (details to follow).
     34\begin{cfa}
    3435forall( [N] )
    3536void declDemo() {
    36         float a1[N];         // built-in type ("C array")
    37         array(float, N) a2;  // type from library
    38 }
    39 \end{lstlisting}
    40 
    41 If a caller instantiates @N@ with 42, then both locally-declared array variables, @a1@ and @a2@, become arrays of 42 elements, each element being a @float@.
    42 The two variables have identical size and layout; they both encapsulate 42-float stack allocations, no heap allocations, and no further "bookkeeping" allocations/header.
    43 Having the @array@ library type (that of @a2@) is a tactical measure, an early implementation that offers full feature support.
    44 A future goal (TODO xref) is to port all of its features into the built-in array type (that of @a1@); then, the library type could be removed, and \CFA would have only one array type.
    45 In present state, the built-in array has partial support for the new features.
    46 The fully-featured library type is used exclusively in introductory examples; feature support and C compatibility are revisited in sec TODO.
    47 
    48 Offering the @array@ type, as a distinct alternative from the the C array, is consistent with \CFA's extension philosophy (TODO xref background) to date.
    49 A few compatibility-breaking changes to the behaviour of the C array were also made, both as an implementation convenience, and as justified fixes to C's lax treatment.
    50 
    51 The @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features.
    52 A clean slate was an important starting point because it meant not having to deal with every inherited complexity introduced in TODO xref background-array.
    53 
    54 
    55 My contributions are
    56 \begin{itemize}
    57 \item a type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@
     37        float x1[N];                            $\C{// built-in type ("C array")}$
     38        array(float, N) x2;                     $\C{// type from library}$
     39}
     40\end{cfa}
     41Both of the locally-declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@.
     42The two variables have identical size and layout; they both encapsulate 42-float, stack \vs heap allocations with no additional ``bookkeeping'' allocations or headers.
     43Providing this explicit generic approach required a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA.
     44
     45Admittedly, the @array@ library type (type for @x2@) is syntactically different from its C counterpart.
     46A future goal (TODO xref) is to provide a built-in array type with syntax approaching C's (type for @x1@);
     47then, the library @array@ type can be removed giving \CFA a largely uniform array type.
     48At present, the built-in array is only partially supported, so the generic @array@ is used exclusively in the discussion;
     49feature support and C compatibility are revisited in Section ? TODO.
     50
     51Offering an @array@ type, as a distinct alternative to the C array, is consistent with \CFA's goal of backwards compatibility, \ie virtually all existing C (gcc) programs can be compiled by \CFA with only a small number of changes, similar to \CC (g++).
     52However, a few compatibility-breaking changes to the behaviour of the C array are necessary, both as an implementation convenience and to fix C's lax treatment of arrays.
     53Hence, the @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features, making it unnecessary to deal with every inherited complexity introduced by the C array TODO xref.
     54
     55My contributions are:
     56\begin{enumerate}
     57\item A type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@.
     58\item Provide a length-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued.
     59\item Provide argument/parameter passing safety for arrays and subscript safety.
    5860\item TODO: general parking...
    59 \item identify specific abilities brought by @array@
    60 \item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap
    61 \end{itemize}
     61\item Identify the interesting specific abilities available by the new @array@ type.
     62\item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap.
     63\end{enumerate}
    6264
    6365
     
    6870
    6971
    70 \section{Features Added}
    71 
    72 The present work adds a type @array@ to the \CFA standard library~\cite{Cforall}.
    73 
    74 This array's length is statically managed and dynamically valued.
    75 This static management achieves argument safety and suggests a path to subscript safety as future work (TODO: cross reference).
    76 
    77 This section presents motivating examples of the new array type's usage and follows up with definitions of the notations that appear.
    78 
    79 The core of the new array management is tracking all array lengths in the type system.
    80 Dynamically valued lengths are represented using type variables.
    81 The stratification of type variables preceding object declarations makes a length referenceable everywhere that it is needed.
     72\section{Features added}
     73
     74This section presents motivating examples for the new array type, demonstrating the syntax and semantics of the generic @array@.
     75As stated, the core capability of the new array is tracking all dimensions in the type system, where dynamic dimensions are represented using type variables.
     76
     77The definition of type variables preceding object declarations makes the array dimension lexically referenceable where it is needed.
    8278For example, a declaration can share one length, @N@, among a pair of parameters and the return.
    8379\lstinput{10-17}{hello-array.cfa}
    8480Here, the function @f@ does a pointwise comparison, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array.
    85 
    86 The array type uses the parameterized length information in its @sizeof@ determination, illustrated in the example's call to @alloc@.
    87 That call requests an allocation of type @array(bool, N)@, which the type system deduces from the left-hand side of the initialization, into the return type of the @alloc@ call.
    88 Preexisting \CFA behaviour is leveraged here, both in the return-type-only polymorphism, and the @sized(T)@-aware standard-library @alloc@ routine.
    89 The new @array@ type plugs into this behaviour by implementing the @sized@/@sizeof@ assertion to have the intuitive meaning.
    90 As a result, this design avoids an opportunity for programmer error by making the size/length communication to a called routine implicit, compared with C's @calloc@ (or the low-level \CFA analog @aalloc@), which take an explicit length parameter not managed by the type system.
    91 
    92 \VRef[Figure]{f:fHarness} shows the harness to use the @f@ function illustrating how dynamic values are fed into the system.
    93 Here, the @a@ array is loaded with decreasing values, and the @b@ array with amounts off by a constant, giving relative differences within tolerance at first and out of tolerance later.
    94 The program main is run with two different inputs of sequence length.
     81The dynamic allocation of the @ret@ array by @alloc@ uses the parameterized dimension information in its implicit @_Alignof@ and @sizeof@ determinations, and casting the return type.
     82\begin{cfa}
     83static inline forall( T & | sized(T) )
     84T * alloc( size_t dim ) {
     85        if ( _Alignof(T) <= libAlign() ) return (T *)aalloc( dim, sizeof(T) ); // calloc without zero fill
     86        else return (T *)amemalign( _Alignof(T), dim, sizeof(T) ); // array memalign
     87}
     88\end{cfa}
     89Here, the type system deduces from the left-hand side of the assignment the type @array(bool, N)@, and forwards it as the type variable @T@ for the generic @alloc@ function, making it available in its body.
     90Hence, preexisting \CFA behaviour is leveraged here, both in the return-type polymorphism, and the @sized(T)@-aware standard-library @alloc@ routine.
     91This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary @sized@ assertions needed by other types.
     92(@sized@ implies a concrete \vs abstract type with a compile-time size.)
     93As a result, there is significant programming safety by making the size accessible and implicit, compared with C's @calloc@ and non-array supporting @memalign@, which take an explicit length parameter not managed by the type system.
    9594
    9695\begin{figure}
    97 \lstinput{30-49}{hello-array.cfa}
     96\lstinput{30-43}{hello-array.cfa}
     97\lstinput{45-48}{hello-array.cfa}
    9898\caption{\lstinline{f} Harness}
    9999\label{f:fHarness}
    100100\end{figure}
    101101
    102 The loops in the program main follow the more familiar pattern of using the ordinary variable @n@ to convey the length.
    103 The type system implicitly captures this value at the call site (@main@ calling @f@) and makes it available within the callee (@f@'s loop bound).
    104 
    105 The two parts of the example show @n@ adapting a variable into a type-system managed length (at @main@'s declarations of @a@, @b@, and @result@), @N@ adapting in the opposite direction (at @f@'s loop bound), and a pass-thru use of a managed length (at @f@'s declaration of @ret@).
    106 
    107 The @forall( ...[N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and in the function @b@.
    108 The present form is chosen to parallel the existing @forall@ forms:
    109 \begin{cfa}
    110 forall( @[N]@ ) ... // array kind
    111 forall( & T  ) ...  // reference kind (dtype)
    112 forall( T  ) ...    // value kind (otype)
    113 \end{cfa}
    114 
    115 The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance.
     102\VRef[Figure]{f:fHarness} shows a harness that uses the @f@ function illustrating how dynamic values are fed into the @array@ type.
     103Here, the dimension of the @x@, @y@, and @result@ arrays is specified from a command-line value and these arrays are allocated on the stack.
     104Then the @x@ array is initialized with decreasing values, and the @y@ array with amounts offset by constant @0.005@, giving relative differences within tolerance initially and diverging for later values.
     105The program main is run (see figure bottom) with inputs @5@ and @7@ for sequence lengths.
     106The loops follow the familiar pattern of using the variable @n@ to iterate through the arrays.
     107Most importantly, the type system implicitly captures @n@ at the call of @f@ and makes it available throughout @f@ as @N@.
     108The example shows @n@ adapting into a type-system managed length at the declarations of @x@, @y@, and @result@, @N@ adapting in the same way at @f@'s loop bound, and a pass-thru use of @n@ at @f@'s declaration of @ret@.
     109Except for the lifetime-management issue of @result@, \ie explicit @free@, this program has eliminated both the syntactic and semantic problems associated with C arrays and their usage.
     110These benefits cannot be underestimated.
     111
     112In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function.
     113The syntactic form is chosen to parallel other @forall@ forms:
     114\begin{cfa}
     115forall( @[N]@ ) ...     $\C[1.5in]{// array kind}$
     116forall( T & ) ...       $\C{// reference kind (dtype)}$
     117forall( T ) ...         $\C{// value kind (otype)}\CRT$
     118\end{cfa}
     119% The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance.
    116120In summary:
    117121\begin{itemize}
    118122\item
    119 @[N]@ -- within a forall, declares the type variable @N@ to be a managed length
    120 \item
    121 $e$ -- a type representing the value of $e$ as a managed length, where $e$ is a @size_t@-typed expression
    122 \item
    123 N -- an expression of type @size_t@, whose value is the managed length @N@
    124 \item
    125 @array( thing, N0, N1, ... )@ -- a type wrapping $\prod_i N_i$ adjacent occurrences of @thing@ objects
     123@[N]@ within a forall declares the type variable @N@ to be a managed length.
     124\item
     125The type of @N@ within code is @size_t@.
     126\item
     127The value of @N@ within code is the acquired length derived from the usage site, \ie generic declaration or function call.
     128\item
     129@array( thing, N0, N1, ... )@ is a multi-dimensional type wrapping $\prod_i N_i$ adjacent occurrences of @thing@ objects.
    126130\end{itemize}
    127 Unsigned integers have a special status in this type system.
    128 Unlike how C++ allows
     131
     132\VRef[Figure]{f:TemplateVsGenericType} shows @N@ is not the same as a @size_t@ declaration in a \CC \lstinline[language=C++]{template}.
     133\begin{enumerate}[leftmargin=*]
     134\item
     135The \CC template @N@ is a compile-time value, while the \CFA @N@ is a runtime value.
     136\item
     137The \CC template @N@ must be passed explicitly at the call, unless @N@ has a default value, even when \CC can deduct the type of @T@.
     138The \CFA @N@ is part of the array type and passed implicitly at the call.
     139\item
     140\CC cannot have an array of references, but can have an array of pointers.
     141\CC has a (mistaken) belief that references are not objects, but pointers are objects.
     142In the \CC example, the arrays fall back on C arrays, which have a duality with references with respect to automatic dereferencing.
     143The \CFA array is a contiguous object with an address, which can stored as a reference or pointer.
     144\item
     145C/\CC arrays cannot be copied, while \CFA arrays can be copied, making them a first-class object (although array copy is often avoided for efficiency).
     146\end{enumerate}
     147
     148\begin{figure}
     149\begin{tabular}{@{}l@{\hspace{20pt}}l@{}}
    129150\begin{c++}
    130 template< size_t N, char * msg, typename T >... // declarations
     151
     152@template< typename T, size_t N >@
     153void copy( T ret[N], T x[N] ) {
     154        for ( int i = 0; i < N; i += 1 ) ret[i] = x[i];
     155}
     156int main() {
     157        int ret[10], x[10];
     158        for ( int i = 0; i < 10; i += 1 ) x[i] = i;
     159        @copy<int, 10 >( ret, x );@
     160        for ( int i = 0; i < 10; i += 1 )
     161                cout << ret[i] << ' ';
     162        cout << endl;
     163}
    131164\end{c++}
    132 \CFA does not accommodate values of any user-provided type.
    133 TODO: discuss connection with dependent types.
    134 An example of a type error demonstrates argument safety.
    135 The running example has @f@ expecting two arrays of the same length.
    136 A compile-time error occurs when attempting to call @f@ with arrays whose lengths may differ.
    137 \begin{cfa}
    138 forall( [M], [N] )
    139 void bad( array(float, M) &a, array(float, N) &b ) {
    140         f( a, a ); // ok
    141         f( b, b ); // ok
    142         f( a, b ); // error
    143 }
    144 \end{cfa}
    145 %\lstinput{60-65}{hello-array.cfa}
    146 As is common practice in C, the programmer is free to cast, to assert knowledge not shared with the type system.
    147 \begin{cfa}
    148 forall( [M], [N] )
    149 void bad_fixed( array(float, M) & a, array(float, N) & b ) {
    150         if ( M == N ) {
    151             f( a, (array(float, M) &)b ); // cast b to matching type
     165&
     166\begin{cfa}
     167int main() {
     168        @forall( T, [N] )@   // nested function
     169        void copy( array( T, N ) & ret, array( T, N ) & x ) {
     170                for ( i; 10 ) ret[i] = x[i];
    152171        }
    153 }
    154 \end{cfa}
    155 %\lstinput{70-75}{hello-array.cfa}
    156 
    157 Argument safety and the associated implicit communication of array length work with \CFA's generic types too.
    158 \CFA allows aggregate types to be generalized with multiple type parameters, including parameterized element type, so can it be defined over a parameterized length.
    159 Doing so gives a refinement of C's ``flexible array member'' pattern, that allows nesting structures with array members anywhere within other structures.
    160 \lstinput{10-16}{hello-accordion.cfa}
    161 This structure's layout has the starting offset of @cost_contribs@ varying in @Nclients@, and the offset of @total_cost@ varying in both generic parameters.
    162 For a function that operates on a @request@ structure, the type system handles this variation transparently.
    163 \lstinput{40-47}{hello-accordion.cfa}
    164 In the example, different runs of the program result in different offset values being used.
    165 \lstinput{60-76}{hello-accordion.cfa}
     172
     173        array( int, 10 ) ret, x;
     174        for ( i; 10 ) x[i] = i;
     175        @copy( ret,  x );@
     176        for ( i; 10 )
     177                sout | ret[i] | nonl;
     178        sout | nl;
     179}
     180\end{cfa}
     181\end{tabular}
     182\caption{\CC \lstinline[language=C++]{template} \vs \CFA generic type }
     183\label{f:TemplateVsGenericType}
     184\end{figure}
     185
     186Continuing the discussion of \VRef[Figure]{f:fHarness}, the example has @f@ expecting two arrays of the same length.
     187A compile-time error occurs when attempting to call @f@ with arrays of differing lengths.
     188% removing leading whitespace
     189\lstinput[tabsize=1]{52-53}{hello-array.cfa}
     190\lstinput[tabsize=1,aboveskip=0pt]{62-64}{hello-array.cfa}
     191As is common practice in C, the programmer is free to cast, \ie to assert knowledge not shared with the type system.
     192\lstinput{70-74}{hello-array.cfa}
     193
     194Orthogonally, the new @array@ type works with \CFA's generic types, providing argument safety and the associated implicit communication of array length.
     195Specifically, \CFA allows aggregate types to be generalized with multiple type parameters, including parameterized element types and lengths.
     196Doing so gives a refinement of C's ``flexible array member'' pattern, allowing nesting structures with array members anywhere within other structures.
     197\lstinput{10-15}{hello-accordion.cfa}
     198This structure's layout has the starting offset of @municipalities@ varying in @NprovTerty@, and the offset of @total_pt@ and @total_mun@ varying in both generic parameters.
     199For a function that operates on a @CanadaPop@ structure, the type system handles this variation transparently.
     200\lstinput{40-45}{hello-accordion.cfa}
     201\VRef[Figure]{f:checkHarness} shows program results where different offset values being used.
    166202The output values show that @summarize@ and its caller agree on both the offsets (where the callee starts reading @cost_contribs@ and where the callee writes @total_cost@).
    167 Yet the call site still says just, ``pass the request.''
    168 
    169 
    170 \section{Multidimensional implementation}
     203Yet the call site just says, ``pass the request.''
     204
     205\begin{figure}
     206\lstinput{60-68}{hello-accordion.cfa}
     207\lstinput{70-72}{hello-accordion.cfa}
     208\caption{\lstinline{check} Harness}
     209\label{f:checkHarness}
     210\end{figure}
     211
     212
     213\section{Multidimensional Arrays}
    171214\label{toc:mdimpl}
    172215
    173 TODO: introduce multidimensional array feature and approaches
    174 
    175 The new \CFA standard library @array@ datatype supports multidimensional uses more richly than the C array.
    176 The new array's multidimensional interface and implementation, follows an array-of-arrays setup, meaning, like C's @float[n][m]@ type, one contiguous object, with coarsely-strided dimensions directly wrapping finely-strided dimensions.
    177 This setup is in contrast with the pattern of array of pointers to other allocations representing a sub-array.
    178 Beyond what C's type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order.
    179 C and C++ require a programmer with such a need to manage pointer/offset arithmetic manually.
    180 
    181 Examples are shown using a $5 \times 7$ float array, @a@, loaded with increments of $0.1$ when stepping across the length-7 finely-strided dimension shown on columns, and with increments of $1.0$ when stepping across the length-5 coarsely-strided dimension shown on rows.
    182 %\lstinput{120-126}{hello-md.cfa}
    183 The memory layout of @a@ has strictly increasing numbers along its 35 contiguous positions.
     216% TODO: introduce multidimensional array feature and approaches
     217
     218When working with arrays, \eg linear algebra, array dimensions are referred to as ``rows'' and ``columns'' for a matrix, adding ``planes'' for a cube.
     219(There is little terminology for higher dimensional arrays.)
     220For example, an acrostic poem\footnote{A type of poetry where the first, last or other letters in a line spell out a particular word or phrase in a vertical column.}
     221can be treated as a grid of characters, where the rows are the text and the columns are the embedded keyword(s).
     222Within a poem, there is the concept of a \newterm{slice}, \eg a row is a slice for the poem text, a column is a slice for a keyword.
     223In general, the dimensioning and subscripting for multidimensional arrays has two syntactic forms: @m[r,c]@ or @m[r][c]@.
     224
     225Commonly, an array, matrix, or cube, is visualized (especially in mathematics) as a contiguous row, rectangle, or block.
     226This conceptualization is reenforced by subscript ordering, \eg $m_{r,c}$ for a matrix and $c_{p,r,c}$ for a cube.
     227Few programming languages differ from the mathematical subscript ordering.
     228However, computer memory is flat, and hence, array forms are structured in memory as appropriate for the runtime system.
     229The closest representation to the conceptual visualization is for an array object to be contiguous, and the language structures this memory using pointer arithmetic to access the values using various subscripts.
     230This approach still has degrees of layout freedom, such as row or column major order, \ie juxtaposed rows or columns in memory, even when the subscript order remains fixed.
     231For example, programming languages like MATLAB, Fortran, Julia and R store matrices in column-major order since they are commonly used for processing column-vectors in tabular data sets but retain row-major subscripting.
     232In general, storage layout is hidden by subscripting, and only appears when passing arrays among different programming languages or accessing specific hardware.
     233
     234\VRef[Figure]{f:FixedVariable} shows two C90 approaches for manipulating contiguous arrays.
     235Note, C90 does not support VLAs.
     236The fixed-dimension approach uses the type system;
     237however, it requires all dimensions except the first to be specified at compile time, \eg @m[][6]@, allowing all subscripting stride calculations to be generated with constants.
     238Hence, every matrix passed to @fp1@ must have exactly 6 columns but the row size can vary.
     239The variable-dimension approach ignores (violates) the type system, \ie argument and parameters types do not match, and manually performs pointer arithmetic for subscripting in the macro @sub@.
     240
     241\begin{figure}
     242\begin{tabular}{@{}l@{\hspace{40pt}}l@{}}
     243\multicolumn{1}{c}{\textbf{Fixed Dimension}} & \multicolumn{1}{c}{\textbf{Variable Dimension}} \\
     244\begin{cfa}
     245
     246void fp1( int rows, int m[][@6@] ) {
     247        ...  printf( "%d ", @m[r][c]@ );  ...
     248}
     249int fm1[4][@6@], fm2[6][@6@]; // no VLA
     250// initialize matrixes
     251fp1( 4, fm1 ); // implicit 6 columns
     252fp1( 6, fm2 );
     253\end{cfa}
     254&
     255\begin{cfa}
     256#define sub( m, r, c ) *(m + r * sizeof( m[0] ) + c)
     257void fp2( int rows, int cols, int *m ) {
     258        ...  printf( "%d ", @sub( m, r, c )@ );  ...
     259}
     260int vm1[4][4], vm2[6][8]; // no VLA
     261// initialize matrixes
     262fp2( 4, 4, vm1 );
     263fp2( 6, 8, vm2 );
     264\end{cfa}
     265\end{tabular}
     266\caption{C90 Fixed \vs Variable Contiguous Matrix Styles}
     267\label{f:FixedVariable}
     268\end{figure}
     269
     270Many languages allow multidimensional arrays-of-arrays, \eg in Pascal or \CC.
     271\begin{cquote}
     272\begin{tabular}{@{}ll@{}}
     273\begin{pascal}
     274var m : array[0..4, 0..4] of Integer;  (* matrix *)
     275type AT = array[0..4] of Integer;  (* array type *)
     276type MT = array[0..4] of AT;  (* array of array type *)
     277var aa : MT;  (* array of array variable *)
     278m@[1][2]@ := 1;   aa@[1][2]@ := 1 (* same subscripting *)
     279\end{pascal}
     280&
     281\begin{c++}
     282int m[5][5];
     283
     284typedef vector< vector<int> > MT;
     285MT vm( 5, vector<int>( 5 ) );
     286m@[1][2]@ = 1;  aa@[1][2]@ = 1;
     287\end{c++}
     288\end{tabular}
     289\end{cquote}
     290The language decides if the matrix and array-of-array are laid out the same or differently.
     291For example, an array-of-array may be an array of row pointers to arrays of columns, so the rows may not be contiguous in memory nor even the same length (triangular matrix).
     292Regardless, there is usually a uniform subscripting syntax masking the memory layout, even though the two array forms could be differentiated at the subscript level, \eg @m[1,2]@ \vs @aa[1][2]@.
     293Nevertheless, controlling memory layout can make a difference in what operations are allowed and in performance (caching/NUMA effects).
     294
     295C also provides non-contiguous arrays-of-arrays.
     296\begin{cfa}
     297int m[5][5];                                                    $\C{// contiguous}$
     298int * aa[5];                                                    $\C{// non-contiguous}$
     299\end{cfa}
     300both with different memory layout using the same subscripting, and both with different degrees of issues.
     301The focus of this work is on the contiguous multidimensional arrays in C.
     302The reason is that programmers are often forced to use the more complex array-of-array form when a contiguous array would be simpler, faster, and safer.
     303Nevertheless, the C array-of-array form continues to be useful for special circumstances.
     304
     305\VRef[Figure]{f:ContiguousNon-contiguous} shows the extensions made in C99 for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.}
     306First, VLAs are supported.
     307Second, for contiguous arrays, C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@.
     308If the declaration of @fc@ is changed to:
     309\begin{cfa}
     310void fc( int rows, int cols, int m[@rows@][@cols@] ) ...
     311\end{cfa}
     312it is possible for C to perform bound checking across all subscripting, but it does not.
     313While this contiguous-array capability is a step forward, it is still the programmer's responsibility to manually manage the number of dimensions and their sizes, both at the function definition and call sites.
     314That is, the array does not automatically carry its structure and sizes for use in computing subscripts.
     315While the non-contiguous style in @faa@ looks very similar to @fc@, the compiler only understands the unknown-sized array of row pointers, and it relies on the programmer to traverse the columns in a row correctly.
     316Specifically, there is no requirement that the rows are the same length, like a poem with different length lines.
     317
     318\begin{figure}
     319\begin{tabular}{@{}ll@{}}
     320\multicolumn{1}{c}{\textbf{Contiguous}} & \multicolumn{1}{c}{\textbf{ Non-contiguous}} \\
     321\begin{cfa}
     322void fc( int rows, @int cols@, int m[ /* rows */ ][@cols@] ) {
     323        ...  printf( "%d ", @m[r][c]@ );  ...
     324}
     325int m@[5][5]@;
     326for ( int r = 0; r < 5; r += 1 ) {
     327
     328        for ( int c = 0; c < 5; c += 1 )
     329                m[r][c] = r + c;
     330}
     331fc( 5, 5, m );
     332\end{cfa}
     333&
     334\begin{cfa}
     335void faa( int rows, int cols, int * m[ @/* cols */@ ] ) {
     336        ...  printf( "%d ", @m[r][c]@ );  ...
     337}
     338int @* aa[5]@;  // row pointers
     339for ( int r = 0; r < 5; r += 1 ) {
     340        @aa[r] = malloc( 5 * sizeof(int) );@ // create rows
     341        for ( int c = 0; c < 5; c += 1 )
     342                aa[r][c] = r + c;
     343}
     344faa( 5, 5, aa );
     345\end{cfa}
     346\end{tabular}
     347\caption{C99 Contiguous \vs Non-contiguous Matrix Styles}
     348\label{f:ContiguousNon-contiguous}
     349\end{figure}
     350
     351
     352\subsection{Multidimensional array implementation}
     353
     354A multidimensional array implementation has three relevant levels of abstraction, from highest to lowest, where the array occupies \emph{contiguous memory}.
     355\begin{enumerate}
     356\item
     357Flexible-stride memory:
     358this model has complete independence between subscripting ordering and memory layout, offering the ability to slice by (provide an index for) any dimension, \eg slice a plane, row, or column, \eg @c[3][*][*]@, @c[3][4][*]@, @c[3][*][5]@.
     359\item
     360Fixed-stride memory:
     361this model binds the first subscript and the first memory layout dimension, offering the ability to slice by (provide an index for) only the coarsest dimension, @m[row][*]@ or @c[plane][*][*]@, \eg slice only by row (2D) or plane (3D).
     362After which, subscripting and memory layout are independent.
     363\item
     364Explicit-displacement memory:
     365this model has no awareness of dimensions just the ability to access memory at a distance from a reference point (base-displacement addressing), \eg @x + 23@ or @x[23}@ $\Rightarrow$ 23rd element from the start of @x@.
     366A programmer must manually build any notion of dimensions using other tools;
     367hence, this style is not offering multidimensional arrays \see{\VRef[Figure]{f:FixedVariable}}.
     368\end{enumerate}
     369
     370There is some debate as to whether the abstraction ordering goes $\{1, 2\} < 3$, rather than my numerically-ordering.
     371That is, styles 1 and 2 are at the same abstraction level, with 3 offering a limited set of functionality.
     372I chose to build the \CFA style-1 array upon a style-2 abstraction.
     373(Justification of the decision follows, after the description of the design.)
     374
     375Style 3 is the inevitable target of any array implementation.
     376The hardware offers this model to the C compiler, with bytes as the unit of displacement.
     377C offers this model to its programmer as pointer arithmetic, with arbitrary sizes as the unit.
     378Casting a multidimensional array as a single-dimensional array/pointer, then using @x[i]@ syntax to access its elements, is still a form of pointer arithmetic.
     379
     380Now stepping into the implementation of \CFA's new type-1 multidimensional arrays in terms of C's existing type-2 multidimensional arrays, it helps to clarify that even the interface is quite low-level.
     381A C/\CFA array interface includes the resulting memory layout.
     382The defining requirement of a type-2 system is the ability to slice a column from a column-finest matrix.
     383The required memory shape of such a slice is set, before any discussion of implementation.
     384The implementation presented here is how the \CFA array library wrangles the C type system, to make it do memory steps that are consistent with this layout.
     385TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
     386
     387The new \CFA standard library @array@ datatype supports richer multidimensional features than C.
     388The new array implementation follows C's contiguous approach, \ie @float [r][c]@, with one contiguous object subscripted by coarsely-strided dimensions directly wrapping finely-strided dimensions.
     389Beyond what C's array type offers, the new array brings direct support for working with a noncontiguous array slice, allowing a program to work with dimension subscripts given in a non-physical order.
     390
     391The following examples use an @array( float, 5, 7) m@, loaded with values incremented by $0.1$, when stepping across the length-7 finely-strided column dimension, and stepping across the length-5 coarsely-strided row dimension.
     392\par\noindent
     393\mbox{\lstinput{121-126}{hello-md.cfa}}
     394\par\noindent
     395The memory layout is 35 contiguous elements with strictly increasing addresses.
    184396
    185397A trivial form of slicing extracts a contiguous inner array, within an array-of-arrays.
    186 Like with the C array, a lesser-dimensional array reference can be bound to the result of subscripting a greater-dimensional array, by a prefix of its dimensions.
    187 This action first subscripts away the most coarsely strided dimensions, leaving a result that expects to be be subscripted by the more finely strided dimensions.
    188 \lstinput{60-66}{hello-md.cfa}
     398As for the C array, a lesser-dimensional array reference can be bound to the result of subscripting a greater-dimensional array by a prefix of its dimensions, \eg @m[2]@, giving the third row.
     399This action first subscripts away the most coarsely strided dimensions, leaving a result that expects to be subscripted by the more finely strided dimensions, \eg @m[2][3]@, giving the value @2.3@.
     400The following is an example slicing a row.
     401\lstinput{60-64}{hello-md.cfa}
    189402\lstinput[aboveskip=0pt]{140-140}{hello-md.cfa}
    190403
    191 This function declaration is asserting too much knowledge about its parameter @c@, for it to be usable for printing either a row slice or a column slice.
    192 Specifically, declaring the parameter @c@ with type @array@ means that @c@ is contiguous.
    193 However, the function does not use this fact.
    194 For the function to do its job, @c@ need only be of a container type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@), with managed length @N@.
     404However, function @print1d@ is asserting too much knowledge about its parameter @r@ for printing either a row slice or a column slice.
     405Specifically, declaring the parameter @r@ with type @array@ means that @r@ is contiguous, which is unnecessarily restrictive.
     406That is, @r@ need only be of a container type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@) with managed length @N@.
    195407The new-array library provides the trait @ix@, so-defined.
    196 With it, the original declaration can be generalized, while still implemented with the same body, to the latter declaration:
    197 \lstinput{40-44}{hello-md.cfa}
     408With it, the original declaration can be generalized with the same body.
     409\lstinput{43-44}{hello-md.cfa}
    198410\lstinput[aboveskip=0pt]{145-145}{hello-md.cfa}
    199 
    200 Nontrivial slicing, in this example, means passing a noncontiguous slice to @print1d@.
    201 The new-array library provides a ``subscript by all'' operation for this purpose.
    202 In a multi-dimensional subscript operation, any dimension given as @all@ is left ``not yet subscripted by a value,'' implementing the @ix@ trait, waiting for such a value.
     411The nontrivial slicing in this example now allows passing a \emph{noncontiguous} slice to @print1d@, where the new-array library provides a ``subscript by all'' operation for this purpose.
     412In a multi-dimensional subscript operation, any dimension given as @all@ is a placeholder, \ie ``not yet subscripted by a value'', waiting for such a value, implementing the @ix@ trait.
    203413\lstinput{150-151}{hello-md.cfa}
    204414
    205 The example has shown that @a[2]@ and @a[[2, all]]@ both refer to the same, ``2.*'' slice.
    206 Indeed, the various @print1d@ calls under discussion access the entry with value 2.3 as @a[2][3]@, @a[[2,all]][3]@, and @a[[all,3]][2]@.
    207 This design preserves (and extends) C array semantics by defining @a[[i,j]]@ to be @a[i][j]@ for numeric subscripts, but also for ``subscripting by all''.
     415The example shows @x[2]@ and @x[[2, all]]@ both refer to the same, ``2.*'' slice.
     416Indeed, the various @print1d@ calls under discussion access the entry with value @2.3@ as @x[2][3]@, @x[[2,all]][3]@, and @x[[all,3]][2]@.
     417This design preserves (and extends) C array semantics by defining @x[[i,j]]@ to be @x[i][j]@ for numeric subscripts, but also for ``subscripting by all''.
    208418That is:
    209 
    210 \begin{tabular}{cccccl}
    211 @a[[2,all]][3]@  &  $=$  &  @a[2][all][3]@  & $=$  &  @a[2][3]@  & (here, @all@ is redundant)  \\
    212 @a[[all,3]][2]@  &  $=$  &  @a[all][3][2]@  & $=$  &  @a[2][3]@  & (here, @all@ is effective)
     419\begin{cquote}
     420\begin{tabular}{@{}cccccl@{}}
     421@x[[2,all]][3]@ & $\equiv$      & @x[2][all][3]@  & $\equiv$    & @x[2][3]@  & (here, @all@ is redundant)  \\
     422@x[[all,3]][2]@ & $\equiv$      & @x[all][3][2]@  & $\equiv$    & @x[2][3]@  & (here, @all@ is effective)
    213423\end{tabular}
    214 
    215 Narrating progress through each of the @-[-][-][-]@ expressions gives, firstly, a definition of @-[all]@, and secondly, a generalization of C's @-[i]@.
    216 
    217 \noindent Where @all@ is redundant:
    218 
    219 \begin{tabular}{ll}
    220 @a@  & 2-dimensional, want subscripts for coarse then fine \\
    221 @a[2]@  & 1-dimensional, want subscript for fine; lock coarse = 2 \\
    222 @a[2][all]@  & 1-dimensional, want subscript for fine \\
    223 @a[2][all][3]@  & 0-dimensional; lock fine = 3
     424\end{cquote}
     425
     426Narrating progress through each of the @-[-][-][-]@\footnote{
     427The first ``\lstinline{-}'' is a variable expression and the remaining ``\lstinline{-}'' are subscript expressions.}
     428expressions gives, firstly, a definition of @-[all]@, and secondly, a generalization of C's @-[i]@.
     429Where @all@ is redundant:
     430\begin{cquote}
     431\begin{tabular}{@{}ll@{}}
     432@x@  & 2-dimensional, want subscripts for coarse then fine \\
     433@x[2]@  & 1-dimensional, want subscript for fine; lock coarse == 2 \\
     434@x[2][all]@  & 1-dimensional, want subscript for fine \\
     435@x[2][all][3]@  & 0-dimensional; lock fine == 3
    224436\end{tabular}
    225 
    226 \noindent Where @all@ is effective:
    227 
    228 \begin{tabular}{ll}
    229 @a@  & 2-dimensional, want subscripts for coarse then fine \\
    230 @a[all]@  & 2-dimensional, want subscripts for fine then coarse \\
    231 @a[all][3]@  & 1-dimensional, want subscript for coarse; lock fine = 3 \\
    232 @a[all][3][2]@  & 0-dimensional; lock coarse = 2
     437\end{cquote}
     438Where @all@ is effective:
     439\begin{cquote}
     440\begin{tabular}{@{}ll@{}}
     441@x@  & 2-dimensional, want subscripts for coarse then fine \\
     442@x[all]@  & 2-dimensional, want subscripts for fine then coarse \\
     443@x[all][3]@  & 1-dimensional, want subscript for coarse; lock fine == 3 \\
     444@x[all][3][2]@  & 0-dimensional; lock coarse == 2
    233445\end{tabular}
    234 
     446\end{cquote}
    235447The semantics of @-[all]@ is to dequeue from the front of the ``want subscripts'' list and re-enqueue at its back.
     448For example, in a two dimensional matrix, this semantics conceptually transposes the matrix by reversing the subscripts.
    236449The semantics of @-[i]@ is to dequeue from the front of the ``want subscripts'' list and lock its value to be @i@.
    237450
    238 Contiguous arrays, and slices of them, are all realized by the same underlying parameterized type.
    239 It includes stride information in its metatdata.
    240 The @-[all]@ operation is a conversion from a reference to one instantiation, to a reference to another instantiation.
     451Contiguous arrays, and slices of them, are all represented by the same underlying parameterized type, which includes stride information in its metatdata.
     452\PAB{Do not understand this sentence: The \lstinline{-[all]} operation is a conversion from a reference to one instantiation to a reference to another instantiation.}
    241453The running example's @all@-effective step, stated more concretely, is:
    242 
    243 \begin{tabular}{ll}
    244 @a@       & : 5 of ( 7 of float each spaced 1 float apart ) each spaced 7 floats apart \\
    245 @a[all]@  & : 7 of ( 5 of float each spaced 7 floats apart ) each spaced 1 float apart
     454\begin{cquote}
     455\begin{tabular}{@{}ll@{}}
     456@x@       & : 5 of ( 7 of @float@ each spaced 1 @float@ apart ) each spaced 7 @floats@ apart \\
     457@x[all]@  & : 7 of ( 5 of @float@ each spaced 7 @float@s apart ) each spaced 1 @float@ apart
    246458\end{tabular}
     459\end{cquote}
    247460
    248461\begin{figure}
    249462\includegraphics{measuring-like-layout}
    250 \caption{Visualization of subscripting, by numeric value, and by \lstinline[language=CFA]{all}.
    251         Here \lstinline[language=CFA]{x} has type \lstinline[language=CFA]{array( float, 5, 7 )}, understood as 5 rows by 7 columns.
    252         The horizontal layout represents contiguous memory addresses while the vertical layout uses artistic license.
    253         The vertical shaded band highlights the location of the targeted element, 2.3.
    254         Any such vertical contains various interpretations of a single address.}
     463\caption{Visualization of subscripting by value and by \lstinline[language=CFA]{all}, for \lstinline{x} of type \lstinline{array( float, 5, 7 )} understood as 5 rows by 7 columns.
     464The horizontal layout represents contiguous memory addresses while the vertical layout is conceptual.
     465The vertical shaded band highlights the location of the targeted element, 2.3.
     466Any such vertical slice contains various interpretations of a single address.}
    255467\label{fig:subscr-all}
    256468\end{figure}
    257 
    258 \noindent BEGIN: Paste looking for a home
    259 
    260 The world of multidimensional array implementation has, or abuts, four relevant levels of abstraction, highest to lowest:
    261 
    262 1, purpose:
    263 If you are doing linear algebra, you might call its dimensions, "column" and "row."
    264 If you are treating an acrostic poem as a grid of characters, you might say,
    265 the direction of reading the phrases vs the direction of reading the keyword.
    266 
    267 2, flexible-stride memory:
    268 assuming, from here on, a need to see/use contiguous memory,
    269 this model offers the ability to slice by (provide an index for) any dimension
    270 
    271 3, fixed-stride memory:
    272 this model offers the ability to slice by (provide an index for) only the coarsest dimension
    273 
    274 4, explicit-displacement memory:
    275 no awareness of dimensions, so no distinguishing them; just the ability to access memory at a distance from a reference point
    276 
    277 C offers style-3 arrays.  Fortran, Matlab and APL offer style-2 arrays.
    278 Offering style-2 implies offering style-3 as a sub-case.
    279 My CFA arrays are style-2.
    280 
    281 Some debate is reasonable as to whether the abstraction actually goes $ 1 < \{2, 3\} < 4 $,
    282 rather than my numerically-ordered chain.
    283 According to the diamond view, styles 2 and 3 are at the same abstraction level, just with 3 offering a more limited set of functionality.
    284 The chain view reflects the design decision I made in my mission to offer a style-2 abstraction;
    285 I chose to build it upon a style-3 abstraction.
    286 (Justification of the decision follows, after the description of the design.)
    287 
    288 The following discussion first dispenses with API styles 1 and 4, then elaborates on my work with styles 2 and 3.
    289 
    290 Style 1 is not a concern of array implementations.
    291 It concerns documentation and identifier choices of the purpose-specific API.
    292 If one is offering a matrix-multiply function, one must specify which dimension(s) is/are being summed over
    293 (or rely on the familiar convention of these being the first argument's rows and second argument's columns).
    294 Some libraries offer a style-1 abstraction that is not directly backed by a single array
    295 (e.g. make quadrants contiguous, as may help cache coherence during a parallel matrix multiply),
    296 but such designs are out of scope for a discussion on arrays; they are applications of several arrays.
    297 I typically include style-1 language with examples to help guide intuition.
    298 
    299 It is often said that C has row-major arrays while Fortran has column-major arrays.
    300 This comparison brings an unhelpful pollution of style-1 thinking into issues of array implementation.
    301 Unfortunately, ``-major'' has two senses: the program's order of presenting indices and the array's layout in memory.
    302 (The program's order could be either lexical, as in @x[1,2,3]@ subscripting, or runtime, as in the @x[1][2][3]@ version.)
    303 Style 2 is concerned with introducing a nontrivial relationship between program order and memory order,
    304 while style 3 sees program order identical with memory order.
    305 Both C and (the style-3 subset of) Fortran actually use the same relationship here:
    306 an earlier subscript in program order controls coarser steps in memory.
    307 The job of a layer-2/3 system is to implement program-ordered subscripting according to a defined memory layout.
    308 C and Fortran do not use opposite orders in doing this job.
    309 Fortran is only ``backward'' in its layer-1 conventions for reading/writing and linear algebra.
    310 Fortran subscripts as $m(c,r)$.  When I use style-1 language, I am following the C/mathematical convention of $m(r,c)$.
    311 
    312 Style 4 is the inevitable target of any array implementation.
    313 The hardware offers this model to the C compiler, with bytes as the unit of displacement.
    314 C offers this model to its programmer as pointer arithmetic, with arbitrary sizes as the unit.
    315 I consider casting a multidimensional array as a single-dimensional array/pointer,
    316 then using @x[i]@ syntax to access its elements, to be a form of pointer arithmetic.
    317 But style 4 is not offering arrays.
    318 
    319 Now stepping into the implementation
    320 of CFA's new type-3 multidimensional arrays in terms of C's existing type-2 multidimensional arrays,
    321 it helps to clarify that even the interface is quite low-level.
    322 A C/CFA array interface includes the resulting memory layout.
    323 The defining requirement of a type-3 system is the ability to slice a column from a column-finest matrix.
    324 The required memory shape of such a slice is set, before any discussion of implementation.
    325 The implementation presented here is how the CFA array library wrangles the C type system,
    326 to make it do memory steps that are consistent with this layout.
    327 TODO: do I have/need a presentation of just this layout, just the semantics of -[all]?
    328469
    329470Figure~\ref{fig:subscr-all} shows one element (in the shaded band) accessed two different ways: as @x[2][3]@ and as @x[all][3][2]@.
     
    365506The subscript operator presents what's really inside, by casting to the type below the wedge of lie.
    366507
    367 %  Does x[all] have to lie too?  The picture currently glosses over how it it advertizes a size of 7 floats.  I'm leaving that as an edge case benignly misrepresented in the picture.  Edge cases only have to be handled right in the code.
     508%  Does x[all] have to lie too?  The picture currently glosses over how it it advertises a size of 7 floats.  I'm leaving that as an edge case benignly misrepresented in the picture.  Edge cases only have to be handled right in the code.
    368509
    369510Casting, overlapping and lying are unsafe.
     
    392533
    393534The @arpk@ structure and its @-[i]@ operator are thus defined as:
    394 \begin{lstlisting}
    395 forall( ztype(N),               // length of current dimension
    396         dtype(S) | sized(S),    // masquerading-as
    397         dtype E_im,             // immediate element, often another array
    398         dtype E_base            // base element, e.g. float, never array
    399  ) {
    400 struct arpk {
    401         S strides[N];           // so that sizeof(this) is N of S
    402 };
    403 
    404 // expose E_im, stride by S
    405 E_im & ?[?]( arpk(N, S, E_im, E_base) & a, ptrdiff_t i ) {
    406         return (E_im &) a.strides[i];
    407 }
    408 }
    409 \end{lstlisting}
     535\begin{cfa}
     536forall( ztype(N),                       $\C{// length of current dimension}$
     537        dtype(S) | sized(S),    $\C{// masquerading-as}$
     538        dtype E_im,                             $\C{// immediate element, often another array}$
     539        dtype E_base                    $\C{// base element, e.g. float, never array}$
     540) { // distribute forall to each element
     541        struct arpk {
     542                S strides[N];           $\C{// so that sizeof(this) is N of S}$
     543        };
     544        // expose E_im, stride by S
     545        E_im & ?[?]( arpk(N, S, E_im, E_base) & a, ptrdiff_t i ) {
     546                return (E_im &) a.strides[i];
     547        }
     548}
     549\end{cfa}
    410550
    411551An instantiation of the @arpk@ generic is given by the @array(E_base, N0, N1, ...)@ expansion, which is @arpk( N0, Rec, Rec, E_base )@, where @Rec@ is @array(E_base, N1, ...)@.
     
    464604
    465605\begin{tabular}{rl}
    466 C      &  @void f( size_t n, size_t m, float a[n][m] );@ \\
     606C      &  @void f( size_t n, size_t m, float x[n][m] );@ \\
    467607Java   &  @void f( float[][] a );@
    468608\end{tabular}
    469609
    470 Java's safety against undefined behaviour assures the callee that, if @a@ is non-null, then @a.length@ is a valid access (say, evaluating to the number $\ell$) and if @i@ is in $[0, \ell)$ then @a[i]@ is a valid access.
     610Java's safety against undefined behaviour assures the callee that, if @x@ is non-null, then @a.length@ is a valid access (say, evaluating to the number $\ell$) and if @i@ is in $[0, \ell)$ then @x[i]@ is a valid access.
    471611If a value of @i@ outside this range is used, a runtime error is guaranteed.
    472612In these respects, C offers no guarantees at all.
    473 Notably, the suggestion that @n@ is the intended size of the first dimension of @a@ is documentation only.
    474 Indeed, many might prefer the technically equivalent declarations @float a[][m]@ or @float (*a)[m]@ as emphasizing the ``no guarantees'' nature of an infrequently used language feature, over using the opportunity to explain a programmer intention.
    475 Moreover, even if @a[0][0]@ is valid for the purpose intended, C's basic infamous feature is the possibility of an @i@, such that @a[i][0]@ is not valid for the same purpose, and yet, its evaluation does not produce an error.
     613Notably, the suggestion that @n@ is the intended size of the first dimension of @x@ is documentation only.
     614Indeed, many might prefer the technically equivalent declarations @float x[][m]@ or @float (*a)[m]@ as emphasizing the ``no guarantees'' nature of an infrequently used language feature, over using the opportunity to explain a programmer intention.
     615Moreover, even if @x[0][0]@ is valid for the purpose intended, C's basic infamous feature is the possibility of an @i@, such that @x[i][0]@ is not valid for the same purpose, and yet, its evaluation does not produce an error.
    476616
    477617Java's lack of expressiveness for more applied properties means these outcomes are possible:
    478618\begin{itemize}
    479 \item @a[0][17]@ and @a[2][17]@ are valid accesses, yet @a[1][17]@ is a runtime error, because @a[1]@ is a null pointer
    480 \item the same observation, now because @a[1]@ refers to an array of length 5
    481 \item execution times vary, because the @float@ values within @a@ are sometimes stored nearly contiguously, and other times, not at all
     619\item @x[0][17]@ and @x[2][17]@ are valid accesses, yet @x[1][17]@ is a runtime error, because @x[1]@ is a null pointer
     620\item the same observation, now because @x[1]@ refers to an array of length 5
     621\item execution times vary, because the @float@ values within @x@ are sometimes stored nearly contiguously, and other times, not at all
    482622\end{itemize}
    483623C's array has none of these limitations, nor do any of the ``array language'' comparators discussed in this section.
    484624
    485625This Java level of safety and expressiveness is also exemplified in the C family, with the commonly given advice [TODO:cite example], for C++ programmers to use @std::vector@ in place of the C++ language's array, which is essentially the C array.
    486 The advice is that, while a vector is also more powerful (and quirky) than an array, its capabilities include options to preallocate with an upfront size, to use an available bound-checked accessor (@a.at(i)@ in place of @a[i]@), to avoid using @push_back@, and to use a vector of vectors.
     626The advice is that, while a vector is also more powerful (and quirky) than an array, its capabilities include options to preallocate with an upfront size, to use an available bound-checked accessor (@a.at(i)@ in place of @x[i]@), to avoid using @push_back@, and to use a vector of vectors.
    487627Used with these restrictions, out-of-bound accesses are stopped, and in-bound accesses never exercise the vector's ability to grow, which is to say, they never make the program slow to reallocate and copy, and they never invalidate the program's other references to the contained values.
    488628Allowing this scheme the same referential integrity assumption that \CFA enjoys [TODO:xref], this scheme matches Java's safety and expressiveness exactly.
     
    532672Futhark restricts these expressions syntactically to variables and constants, while a full-strength dependent system does not.
    533673
    534 CFA's hybrid presentation, @forall( [N] )@, has @N@ belonging to the type system, yet has no instances.
     674\CFA's hybrid presentation, @forall( [N] )@, has @N@ belonging to the type system, yet has no instances.
    535675Belonging to the type system means it is inferred at a call site and communicated implicitly, like in Dex and unlike in Futhark.
    536676Having no instances means there is no type for a variable @i@ that constrains @i@ to be in the range for @N@, unlike Dex, [TODO: verify], but like Futhark.
     
    539679
    540680
    541 \section{Future Work}
     681\section{Future work}
    542682
    543683\subsection{Declaration syntax}
     
    553693I will not discuss the core of this project, which has a tall mission already, to improve type safety, maintain appropriate C compatibility and offer more flexibility about storage use.
    554694It also has a candidate stretch goal, to adapt \CFA's @forall@ generic system to communicate generalized enumerations:
    555 \begin{lstlisting}
     695\begin{cfa}
    556696forall( T | is_enum(T) )
    557697void show_in_context( T val ) {
     
    565705enum weekday { mon, tue, wed = 500, thu, fri };
    566706show_in_context( wed );
    567 \end{lstlisting}
     707\end{cfa}
    568708with output
    569 \begin{lstlisting}
     709\begin{cfa}
    570710mon
    571711tue < ready
     
    573713thu
    574714fri
    575 \end{lstlisting}
     715\end{cfa}
    576716The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA.
    577717But the example shows these abilities:
     
    599739The structural assumptions required for the domain of an array in Dex are given by the trait (there, ``interface'') @Ix@, which says that the parameter @n@ is a type (which could take an argument like @weekday@) that provides two-way conversion with the integers and a report on the number of values.
    600740Dex's @Ix@ is analogous the @is_enum@ proposed for \CFA above.
    601 \begin{lstlisting}
     741\begin{cfa}
    602742interface Ix n
    603  get_size n : Unit -> Int
    604  ordinal : n -> Int
    605  unsafe_from_ordinal n : Int -> n
    606 \end{lstlisting}
     743get_size n : Unit -> Int
     744ordinal : n -> Int
     745unsafe_from_ordinal n : Int -> n
     746\end{cfa}
    607747
    608748Dex uses this foundation of a trait (as an array type's domain) to achieve polymorphism over shapes.
     
    616756In the photograph instantiation, it's the tuple-type of $ \langle \mathrm{img\_wd}, \mathrm{img\_ht} \rangle $.
    617757This use of a tuple-as-index is made possible by the built-in rule for implementing @Ix@ on a pair, given @Ix@ implementations for its elements
    618 \begin{lstlisting}
     758\begin{cfa}
    619759instance {a b} [Ix a, Ix b] Ix (a & b)
    620  get_size = \(). size a * size b
    621  ordinal = \(i, j). (ordinal i * size b) + ordinal j
    622  unsafe_from_ordinal = \o.
     760get_size = \(). size a * size b
     761ordinal = \(i, j). (ordinal i * size b) + ordinal j
     762unsafe_from_ordinal = \o.
    623763bs = size b
    624764(unsafe_from_ordinal a (idiv o bs), unsafe_from_ordinal b (rem o bs))
    625 \end{lstlisting}
     765\end{cfa}
    626766and by a user-provided adapter expression at the call site that shows how to indexing with a tuple is backed by indexing each dimension at a time
    627 \begin{lstlisting}
     767\begin{cfa}
    628768img_trans :: (img_wd,img_ht)=>Real
    629769img_trans.(i,j) = img.i.j
    630770result = pairwise img_trans
    631 \end{lstlisting}
     771\end{cfa}
    632772[TODO: cite as simplification of example from https://openreview.net/pdf?id=rJxd7vsWPS section 4]
    633773
     
    661801void * malloc( size_t );
    662802// C, user
    663 struct tm * el1 = malloc(      sizeof(struct tm) );
     803struct tm * el1 = malloc( sizeof(struct tm) );
    664804struct tm * ar1 = malloc( 10 * sizeof(struct tm) );
    665805
  • doc/theses/mike_brooks_MMath/background.tex

    rbb336a6 rdbff8ec  
    6767
    6868
    69 \section{Reading Declarations}
     69\section{Reading declarations}
    7070
    7171A significant area of confusion for reading C declarations results from embedding a declared variable in a declaration, mimicking the way the variable is used in executable statements.
     
    316316
    317317
    318 \subsection{Multi-Dimensional}
     318\subsection{Multi-dimensional}
    319319
    320320As in the last section, multi-dimensional array declarations are examined.
     
    414414
    415415
    416 \subsection{Design Issues}
     416\subsection{Design issues}
    417417\label{toc:lst:issue}
    418418
     
    432432
    433433
    434 \subsection{Preexisting Linked-List Libraries}
     434\subsection{Preexisting linked-list libraries}
    435435
    436436Two preexisting linked-list libraries are used throughout, to show examples of the concepts being defined,
     
    445445
    446446
    447 \subsection{Link attachment: Intrusive vs.\ Wrapped}
     447\subsection{Link attachment: intrusive vs.\ wrapped}
    448448\label{toc:lst:issue:attach}
    449449
     
    604604
    605605
    606 \subsection{Simultaneity: Single vs.\ Multi-Static vs.\ Dynamic}
     606\subsection{Simultaneity: single vs.\ multi-static vs.\ dynamic}
    607607\label{toc:lst:issue:simultaneity}
    608608
     
    729729
    730730
    731 \subsection{User integration: Preprocessed vs.\ Type-System Mediated}
     731\subsection{User integration: preprocessed vs.\ type-system mediated}
    732732
    733733% example of poor error message due to LQ's preprocessed integration
     
    739739
    740740
    741 \subsection{List identity: Headed vs.\ Ad-hoc}
     741\subsection{List identity: headed vs.\ ad-hoc}
    742742\label{toc:lst:issue:ident}
    743743
     
    788788
    789789
    790 \subsection{End treatment: Cased vs.\ Uniform }
     790\subsection{End treatment: cased vs.\ uniform }
    791791
    792792This issue is implementation-level, relevant to achieving high performance.
  • doc/theses/mike_brooks_MMath/conclusion.tex

    rbb336a6 rdbff8ec  
    77\section{Strings}
    88
    9 \section{Future Work}
     9\section{Future work}
  • doc/theses/mike_brooks_MMath/intro.tex

    rbb336a6 rdbff8ec  
    1616
    1717
    18 \section{Linked List}
     18\section{Linked list}
    1919
    2020A linked-list provides a homogeneous container often with $O(log N)$/$O(N)$ access to elements using successor and predecessor operations.
     
    5959
    6060
    61 \subsection{Ill-Typed Expressions}
     61\subsection{Ill-typed expressions}
    6262
    6363C reports many ill-typed expressions as warnings.
     
    9595\section{Contributions}
    9696
    97 \subsection{Linked List}
     97\subsection{Linked list}
    9898
    9999\subsection{Array}
  • doc/theses/mike_brooks_MMath/programs/bkgd-carray-arrty.c

    rbb336a6 rdbff8ec  
    148148void stx2() { const T x[10];
    149149//            x[5] = 3.14; // bad
    150                         }
     150}
    151151void stx3() { T const x[10];
    152152//            x[5] = 3.14; // bad
    153                         }
     153}
    154154
    155155// Local Variables: //
  • doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa

    rbb336a6 rdbff8ec  
    22#include <stdlib.hfa>
    33#include <array.hfa>
     4#include <locale.h>                                                     // setlocale
    45
    56
     
    78
    89
    9 
    10 
    11 forall( T, [Nclients], [Ncosts] )
    12 struct request {
    13         unsigned int requestor_id;
    14         array( T, Nclients ) impacted_client_ids; // nested VLA
    15         array( float, Ncosts ) cost_contribs; // nested VLA
    16         float total_cost;
     10forall( T, @[NprovTerty]@, @[Nmunicipalities]@ )
     11struct CanadaPop {
     12        array( T, @NprovTerty@ ) provTerty; $\C{// nested VLA}$
     13        array( T, @Nmunicipalities@ ) municipalities; $\C{// nested VLA}$
     14        int total_pt, total_mun;
    1715};
    1816
     
    2018// TODO: understand (fix?) why these are needed (autogen seems to be failing ... is typeof as struct member nayok?)
    2119
    22 forall( T, [Nclients], [Ncosts] )
    23         void ?{}( T &, request( T, Nclients, Ncosts ) & this ) {}
     20forall( T, [NprovTerty], [Nmunicipalities] )
     21        void ?{}( T &, CanadaPop( T, NprovTerty, Nmunicipalities ) & this ) {}
    2422
    25 forall( T &, [Nclients], [Ncosts] )
    26         void ^?{}( request( T, Nclients, Ncosts ) & this ) {}
     23forall( T &, [NprovTerty], [Nmunicipalities] )
     24        void ^?{}( CanadaPop( T, NprovTerty, Nmunicipalities ) & this ) {}
    2725
    2826
     
    3937
    4038
    41 forall( T, [Nclients], [Ncosts] )
    42 void summarize( request( T, Nclients, Ncosts ) & r ) {
    43         r.total_cost = 0;
    44         for( i; Ncosts )
    45                 r.total_cost += r.cost_contribs[i];
    46         // say the cost is per-client, to make output vary
    47         r.total_cost *= Nclients;
     39
     40forall( T, [NprovTerty], [Nmunicipalities] )
     41void check( CanadaPop( T, NprovTerty, Nmunicipalities ) & pop ) with( pop ) {
     42        total_pt = total_mun = 0;
     43        for ( i; NprovTerty ) total_pt += provTerty[i];
     44        for ( i; Nmunicipalities ) total_mun += municipalities[i];
    4845}
    4946
     
    5956
    6057
     58
     59
    6160int main( int argc, char * argv[] ) {
    62         const int ncl = ato( argv[1] );
    63         const int nco = 2;
    64 
    65         request( int, ncl, nco ) r;
    66         r.cost_contribs[0] = 100;
    67         r.cost_contribs[1] = 0.1;
    68 
    69         summarize(r);
    70         sout | "Total cost:" | r.total_cost;
     61        const int npt = ato( argv[1] ), nmun = ato( argv[2] );
     62        @CanadaPop( int, npt, nmun ) pop;@
     63        // read in population numbers
     64        @check( pop );@
     65        sout | setlocale( LC_NUMERIC, getenv( "LANG" ) );
     66        sout | "Total province/territory:" | pop.total_pt;
     67        sout | "Total municipalities:" | pop.total_mun;
    7168}
    7269/*
    73 $\$$ ./a.out 5
    74 Total cost: 500.5
    75 $\$$ ./a.out 6
    76 Total cost: 600.6
     70$\$$ ./a.out  13  3573
     71Total province/territory: 36,991,981
     72Total municipalities: 36,991,981
    7773*/
     74
     75// Local Variables: //
     76// compile-command: "sed -f sedcmd hello-accordion.cfa > ../build/tmp.cfa; cfa ../build/tmp.cfa -Wall -Wextra" //
     77// End: //
  • doc/theses/mike_brooks_MMath/programs/hello-array.cfa

    rbb336a6 rdbff8ec  
    88
    99
    10 forall( [N] ) // array bound
    11 array(bool, N) & f( array(float, N) & a, array(float, N) & b ) {
    12         array(bool, N) & ret = *alloc(); // sizeof used by alloc
    13         for( i; N ) {
    14                 ret[i] = 0.005 > 2 * (abs(a[i] - b[i])) / (abs(a[i]) + abs(b[i]));
     10forall( [@N@] )                                                         $\C{// array dimension}$
     11array( bool, @N@) & f( array( float, @N@ ) & x, array( float, @N@ ) & y ) {
     12        array( bool, @N@ ) & ret = *@alloc@();  $\C{// sizeof ret  used by alloc}$
     13        for ( i; @N@ ) {
     14                ret[i] = 0.005 > 2 * (abs( x[i] - y[i] )) / (abs( x[i]) + abs(y[i] ));
    1515        }
    1616        return ret;
     
    2929
    3030int main( int argc, char * argv[] ) {
    31         int n = ato( argv[1] );
    32         array(float, n) a, b; // VLA
    33         for ( i; n ) {
    34                 a[i] = 3.14 / (i + 1);
    35                 b[i] = a[i] + 0.005 ;
     31        const int @n@ = ato( argv[1] );                 $\C{// deduce conversion type}$
     32        array( float, @n@ ) x, y;                               $\C{// VLAs}$
     33        for ( i; n ) {                                                  $\C{// initialize arrays}$
     34                x[i] = 3.14 / (i + 1);
     35                y[i] = x[i] + 0.005 ;
    3636        }
    37         array(bool, n) & result = f( a, b ); // call
    38         sout | "result: " | nonl;
     37        array( bool, @n@ ) & result = @f( x, y )@; $\C{// call}$
     38        sout | "result: " | nonl;                               $\C{// print result}$
    3939        for ( i; n )
    4040                sout | result[i] | nonl;
    4141        sout | nl;
    42         free( &result ); // free returned storage
     42        free( &result );                                                $\C{// free result storage}$
    4343}
    4444/*
     
    5050
    5151void fred() {
    52         array(float, 10) a;
    53         array(float, 20) b;
    54         f( a, a );
    55         f( b, b );
    56         f( a, b );
     52        array( float, @10@ ) x;
     53        array( float, @20@ ) y;
     54        f( x, x );
     55        f( y, y );
     56//      f( x, y );
    5757}
    5858
    5959#ifdef SHOWERR1
    6060forall( [M], [N] )
    61 void bad( array(float, M) &a, array(float, N) &b ) {
    62         f( a, a ); // ok
    63         f( b, b ); // ok
    64         f( a, b ); // error
     61void bad( array(float, M) &x, array(float, N) &y ) {
     62        f( x, x );              $\C[1.5in]{// ok}$
     63        f( y, y );              $\C{// ok}$
     64        f( x, y );              $\C{// error}\CRT$
    6565}
    6666#endif
     
    6969
    7070forall( [M], [N] )
    71 void bad_fixed( array(float, M) & a, array(float, N) & b ) {
    72         if ( M == N ) {
    73                 f( a, (array(float, M) &)b ); // cast b to matching type
    74         }
     71void bad_fixed( array( float, M ) & x, array( float, N ) & y ) {
     72        if ( M == N )
     73                f( x, @(array( float, M ) &)@y ); $\C{// cast y to matching type}$
    7574}
     75
     76// Local Variables: //
     77// compile-command: "sed -f sedcmd hello-array.cfa > ../build/tmp.cfa; cfa ../build/tmp.cfa -Wall -Wextra" //
     78// End: //
  • doc/theses/mike_brooks_MMath/programs/hello-md.cfa

    rbb336a6 rdbff8ec  
    3939
    4040forall( [N] )
    41 void print1d_cstyle( array(float, N) & c );
     41void print1d_cstyle( array( float, N ) & r ); $\C{// C style}$
    4242
    43 forall( [N], C & | ar( C, float, N ) )
     43forall( [N], C & @| ar( C, float, N )@ ) $\C{// add trait}$
    4444void print1d( C & c );
    4545
     
    5959
    6060forall( [N] )
    61 void print1d_cstyle( array(float, N) & c ) {
    62         for ( i; N ) {
    63                 sout | c[i] | nonl;
    64         }
     61void print1d_cstyle( array( float, N ) & r ) { $\C{// C style}$
     62        for ( i; N ) sout | r[i] | nonl;
    6563        sout | nl;
    6664}
     65
     66
    6767
    6868
     
    9898
    9999
    100 void fill( array(float, 5, 7) & a ) {
     100void fill( array( float, 5, 7 ) & a ) {
    101101        for ( i; (ptrdiff_t) 5 ) {
    102102                for ( j; 7 ) {
     
    116116
    117117
    118 array( float, 5, 7 ) a;
    119 fill(a);
     118array( float, 5, 7 ) m;
     119fill( m );
    120120/*
    121 0.0  0.1  0.2  0.3  0.4  0.5  0.6 
    122 1.0  1.1  1.2  1.3  1.4  1.5  1.6 
    123 2.0  2.1  2.2  2.3  2.4  2.5  2.6 
    124 3.0  3.1  3.2  3.3  3.4  3.5  3.6 
    125 4.0  4.1  4.2  4.3  4.4  4.5  4.6
     121r/c   0     1     2     3     4     5     6
     1220  0.0  0.1  0.2  @0.3@  0.4  0.5  0.6 
     1231  1.0  1.1  1.2  @1.3@  1.4  1.5  1.6 
     1242  @2.0  2.1  2.2  2.3  2.4  2.5  2.6@ 
     1253  3.0  3.1  3.2  @3.3@  3.4  3.5  3.6 
     1264  4.0  4.1  4.2  @4.3@  4.4  4.5  4.6
    126127*/
    127128
     
    137138
    138139
    139 
    140 print1d_cstyle( a[ 2 ] );  // 2.0  2.1  2.2  2.3  2.4  2.5  2.6
     140print1d_cstyle( m[ 2 ] );  $\C{// row 2:  2.0  2.1  2.2  2.3  2.4  2.5  2.6}$
    141141
    142142
    143143
    144144
    145 print1d( a[ 2 ] );  // 2.0  2.1  2.2  2.3  2.4  2.5  2.6
     145print1d( m[ 2 ] );  $\C{// row:  2.0  2.1  2.2  2.3  2.4  2.5  2.6}$
    146146
    147147
    148148
    149149
    150 print1d( a[ 2, all ] );  // 2.0  2.1  2.2  2.3  2.4  2.5  2.6
    151 print1d( a[ all, 3 ] );  // 0.3  1.3  2.3  3.3  4.3
     150print1d( m[ 2, all ] );  $\C{// row 2:  2.0  2.1  2.2  2.3  2.4  2.5  2.6}$
     151print1d( m[ all, 3 ] );  $\C{// column 3:  0.3  1.3  2.3  3.3  4.3}$
    152152
    153153
    154154
    155 print1d_cstyle( a[ 2, all ] );
     155print1d_cstyle( m[ 2, all ] );
    156156
    157157
     
    163163#ifdef SHOW_ERROR_1
    164164
    165 print1d_cstyle( a[ all, 2 ] );  // bad
     165print1d_cstyle( m[ all, 2 ] );  $\C{// bad}$
    166166
    167167#endif
  • doc/theses/mike_brooks_MMath/programs/lst-issues-attach-reduction.hpp

    rbb336a6 rdbff8ec  
    101101class list {
    102102        struct node {
    103                 @LIST_ENTRY(node) links;@
    104                 @El elem;@
     103                LIST_ENTRY(node) links;
     104                El elem;
    105105        };
    106106        LIST_HEAD(Impl, node);
     
    111111        }
    112112        void push_front( const El & src ) {
    113                 node * n = @new node()@;
     113                node * n = new node();
    114114                n->elem = src;
    115115                LIST_INSERT_HEAD(&impl, n, links);
  • doc/theses/mike_brooks_MMath/programs/lst-issues-wrapped-byref.run.cpp

    rbb336a6 rdbff8ec  
    4949for (auto const& cur : reqs)
    5050        printf("{%d %d} ", cur->pri, cur->rqr);
    51 printf("\n");
     51        printf("\n");
    5252
    5353}
  • doc/theses/mike_brooks_MMath/string.tex

    rbb336a6 rdbff8ec  
    33\section{String}
    44
    5 \subsection{Logical Overlap}
     5\subsection{Logical overlap}
    66
    77\input{sharing-demo.tex}
     
    3838
    3939
    40 \subsection{Memory Management}
     40\subsection{Memory management}
    4141
    4242A centrepriece of the string module is its memory manager.  The managment scheme defines a large shared buffer for strings' text.  Allocation in this buffer is always bump-pointer; the buffer is compacted and/or relocated with growth when it fills.  A string is a smart pointer into this buffer.
     
    8080
    8181
    82 \subsection{Avoiding Implicit Sharing}
     82\subsection{Avoiding implicit sharing}
    8383
    8484There are tradeoffs associated with the copy-on-write mechanism.  Several quatitative matters are detailed in the [xref: Performance Assessment] section and the qualitiative issue of multi-threaded support is introduced here.  The \CFA sting library provides a switch to disable the sharing mechanism for situtations where it is inappropriate.
     
    9898
    9999
    100 \subsection{Future Work}
     100\subsection{Future work}
    101101
    102102To discuss: Unicode
     
    105105
    106106
    107 \subsection{Performance Assessment}
     107\subsection{Performance assessment}
    108108
    109109I assessed the CFA string library's speed and memory usage.  I present these results ineven quivalent cases, due to either micro-optimizations foregone, or fundamental costs of the added functionality.  They also show the benefits and tradeoffs, as >100\% effects, of switching to CFA, with the tradeoff points quantified.  The final test shows the overall win of the CFA text-sharing mechanism.  It exercises several operations together, showing CFA enabling clean user code to achieve performance that STL requires less-clean user code to achieve.
  • doc/theses/mike_brooks_MMath/uw-ethesis.tex

    rbb336a6 rdbff8ec  
    104104\lstset{language=cfa,belowskip=-1pt} % set default language to CFA
    105105\lstnewenvironment{c++}[1][]{\lstset{language=[GNU]C++,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
     106\lstnewenvironment{pascal}[1][]{\lstset{language=pascal,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
    106107\lstset{inputpath={programs}}
    107108
  • libcfa/src/enum.cfa

    rbb336a6 rdbff8ec  
    11#include "enum.hfa"
    22#include "fstream.hfa"
     3#include <stdlib.h>                                                                             // qsort
    34#include <string.h>
    45
    56#pragma GCC visibility push(default)
    67
     8int scmp( const void * str1, const void * str2 ) {
     9    return -strcmp( *(char **)str1, *(char **)str2 );   // dscending order
     10} // scmp
     11
    712forall( istype & | istream( istype ), E, V | CfaEnum( E, V ) )
    813istype & ?|?( istype & is, E & e ) {
     14//      printf( "here0\n" );
    915        if ( eof( is ) ) throwResume ExceptionInst( missing_data );
    10         char val[256];
    11         int args = fmt( is, "%255s", val );
    12         if ( ! eof( is ) && args != 1 ) throwResume ExceptionInst( missing_data );
     16
     17        // Match input enumerator string to enumerator labels.
     18        int len = -1;
     19        const char * cpy[ 20 /*countof( E )*/ ];
     20        int i = 0;
    1321        for ( s; E ) {
    14                 if ( strcmp(val, label( s )) == 0 ) { e = s; break; }
     22                cpy[i] = label( s );
     23                printf( "%s\n", cpy[i] );
     24                i += 1;
     25        }
     26        printf( "%d\n", i );
     27        qsort( cpy, i, sizeof(char*), scmp );
     28        i = 0;
     29        for ( s; E ) {
     30                printf( "%s\n", cpy[i] );
     31                i += 1;
     32        }
     33        int j = 0;
     34  X : for ( s; E ) {
     35                len = strlen( cpy[j] );
     36                printf( "%s %d\n", cpy[j], len );
     37                char fmtstr[len + 16];
     38                fmtstr[0] = ' ';                                                                // optional leadig whitespace
     39                strcpy( &fmtstr[1], cpy[j] );                                   // copy format and add %n
     40                strcpy( &fmtstr[len + 1], "%n" );
     41                printf( "%s\n", fmtstr );
     42                len = -1;
     43                // scanf cursor does not move if no match
     44                fmt( is, fmtstr, &len );
     45                printf( "%s %d %d\n", fmtstr, len, j );
     46          if ( eof( is ) ) { break; }
     47          if ( len != -1 ) {
     48                  for ( s; E ) {
     49                          if ( strcmp( label( s ), cpy[j] ) == 0 ) { e = s; break X; }
     50                  }
     51          }
     52                j += 1;
    1553        } else {
    16                 fprintf( stderr, "invalid enumeration constant\n" );
    17                 abort();                                                                        // cannot use abort stream
     54                //ExceptionInst( missing_data );
    1855        } // for
     56        printf( "X %s %d\n", label( e ), len );
     57        if ( ! eof( is ) && len == -1 ) throwResume ExceptionInst( missing_data );
     58
     59        // if ( eof( is ) ) throwResume ExceptionInst( missing_data );
     60        // char val[256];
     61        // int args = fmt( is, "%255s", val );
     62        // if ( ! eof( is ) && args != 1 ) throwResume ExceptionInst( missing_data );
     63        // for ( s; E ) {
     64        //      if ( strcmp(val, label( s )) == 0 ) { e = s; break; }
     65        // } else {
     66        //      fprintf( stderr, "invalid enumeration constant\n" );
     67        //      abort();                                                                        // cannot use abort stream
     68        // } // for
    1969        return is;
    2070}
  • libcfa/src/iostream.cfa

    rbb336a6 rdbff8ec  
    1010// Created On       : Wed May 27 17:56:53 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Mon Feb 12 09:26:05 2024
    13 // Update Count     : 1966
     12// Last Modified On : Mon Jul  8 23:45:06 2024
     13// Update Count     : 2018
    1414//
    1515
     
    772772
    773773
     774#define FALSE "false"
     775#define TRUE "true"
     776
    774777forall( istype & | basic_istream( istype ) ) {
    775778        istype & ?|?( istype & is, bool & b ) {
    776779                if ( eof( is ) ) throwResume ExceptionInst( missing_data );
    777                 char val[6];
    778                 int args = fmt( is, "%5s", val );
    779                 if ( ! eof( is ) && args != 1 ) throwResume ExceptionInst( missing_data );
    780                 if ( strcmp( val, "true" ) == 0 ) b = true;
    781                 else if ( strcmp( val, "false" ) == 0 ) b = false;
    782                 else {
    783                         fprintf( stderr, "invalid Boolean constant\n" );
    784                         abort();                                                                        // cannot use abort stream
     780                int len = -1;                                                                   // len not set if no match
     781                // Optional leading whitespace at start of strings.
     782                fmt( is, " " FALSE "%n", &len );                                // try false
     783                if ( len != sizeof( FALSE ) - 1 ) {                             // remove null terminate
     784                        fmt( is, " " TRUE "%n", &len );                         // try true
     785                        if ( len != sizeof( TRUE ) - 1 ) throwResume ExceptionInst( missing_data );
     786                        b = true;
     787                } else {
     788                        b = false;
    785789                } // if
    786790                return is;
     
    940944        } // ?|?
    941945
    942         istype & ?|?( istype & is, const char fmt[] ) {
     946        istype & ?|?( istype & is, const char fmt[] ) {         // match text
    943947                if ( eof( is ) ) throwResume ExceptionInst( missing_data );
    944948                size_t len = strlen( fmt );
     
    946950                strcpy( fmtstr, fmt );                                                  // copy format and add %n
    947951                strcpy( &fmtstr[len], "%n" );
    948                 int len2 = -1;
    949                 fmt( is, fmtstr, &len2 );
    950                 if ( ! eof( is ) && len2 == -1 ) throwResume ExceptionInst( missing_data );
     952                len = -1;
     953                // scanf cursor does not move if no match
     954                fmt( is, fmtstr, &len );
     955                if ( ! eof( is ) && len == -1 ) throwResume ExceptionInst( missing_data );
    951956                return is;
    952957        } // ?|?
  • libcfa/src/iostream.hfa

    rbb336a6 rdbff8ec  
    1010// Created On       : Wed May 27 17:56:53 2015
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Sun Apr 21 07:32:19 2024
    13 // Update Count     : 744
     12// Last Modified On : Sat Jul  6 11:33:31 2024
     13// Update Count     : 758
    1414//
    1515
     
    359359        istype & ?|?( istype &, long double _Complex & );
    360360
    361         istype & ?|?( istype &, const char [] );
     361        // This is too restrictive as it prevents building a format in a string and using that format.
     362        // inline istype & ?|?( istype &, char [] ) {                   // possible error, too restrictive to change
     363        //      _Static_assert( false, "reading a character array without a maximum length is unsafe. Use input manipulator \"wdi( N, s )\", where \"char s[N]\" or fmt( s )." );
     364        // }
     365        istype & ?|?( istype &, const char [] );                        // match text
    362366
    363367        // manipulators
  • libcfa/src/parseargs.cfa

    rbb336a6 rdbff8ec  
    1010// Author           : Thierry Delisle
    1111// Created On       : Wed Oct 12 15:28:01 2022
    12 // Last Modified By :
    13 // Last Modified On :
    14 // Update Count     :
     12// Last Modified By : Peter A. Buhr
     13// Last Modified On : Mon Jul  8 18:18:23 2024
     14// Update Count     : 7
    1515//
    1616
     
    3232        extern FILE * stdout;
    3333
    34         extern int fileno(FILE *stream);
     34        extern int fileno( FILE *stream );
    3535
    3636        extern int fprintf ( FILE * stream, const char * format, ... );
    3737
    38         extern          long long int strtoll (const char* str, char** endptr, int base);
    39         extern unsigned long long int strtoull(const char* str, char** endptr, int base);
    40         extern                 double strtod  (const char* str, char** endptr);
     38        extern          long long int strtoll ( const char* str, char** endptr, int base );
     39        extern unsigned long long int strtoull( const char* str, char** endptr, int base );
     40        extern                 double strtod  ( const char* str, char** endptr );
    4141}
    4242
     
    4444#include "limits.hfa"
    4545
    46 #pragma GCC visibility push(default)
    47 
    48 extern int cfa_args_argc __attribute__((weak));
    49 extern char ** cfa_args_argv __attribute__((weak));
    50 extern char ** cfa_args_envp __attribute__((weak));
     46#pragma GCC visibility push( default )
     47
     48extern int cfa_args_argc __attribute__(( weak ));
     49extern char ** cfa_args_argv __attribute__(( weak ));
     50extern char ** cfa_args_envp __attribute__(( weak ));
    5151
    5252forall([N])
    53 static void usage(char * cmd, const array( cfa_option, N ) & options, const char * usage, FILE * out)  __attribute__ ((noreturn));
     53static void usage( char * cmd, const array( cfa_option, N ) & options, const char * usage, FILE * out )  __attribute__ (( noreturn ));
    5454//-----------------------------------------------------------------------------
    5555// checking
    5656forall([N])
    5757static void check_args( const array( cfa_option, N ) & options ) {
    58         for(i; N) {
    59                 for(j; N) {
    60                         if(i == j) continue;
    61 
    62                         if( options[i].short_name != '\0'
    63                         && options[i].short_name == options[j].short_name)
    64                                 abort("Parse Args error: two options have short name '%c' (%zu & %zu)", options[i].short_name, i, j);
    65 
    66                         if(0 == strcmp(options[i].long_name, options[j].long_name)) abort("Parse Args error: two options have long name '%s' (%zu & %zu)", options[i].long_name, i, j);
     58        for ( i; N ) {
     59                for ( j; N ) {
     60                        if ( i == j ) continue;
     61
     62                        if ( options[i].short_name != '\0'
     63                        && options[i].short_name == options[j].short_name )
     64                                abort( "Parse Args error: two options have short name '%c' (%zu & %zu)", options[i].short_name, i, j );
     65
     66                        if (0 == strcmp( options[i].long_name, options[j].long_name ))
     67                                abort( "Parse Args error: two options have long name '%s' (%zu & %zu)", options[i].long_name, i, j );
    6768                }
    6869        }
     
    7475forall([opt_count]) {
    7576        void parse_args( const array( cfa_option, opt_count ) & options, const char * usage, char ** & left ) {
    76                 if( 0p != &cfa_args_argc ) {
    77                         parse_args(cfa_args_argc, cfa_args_argv, options, usage, left );
    78                 }
    79                 else {
     77                if ( 0p != &cfa_args_argc ) {
     78                        parse_args( cfa_args_argc, cfa_args_argv, options, usage, left );
     79                } else {
    8080                        char * temp[1] = { 0p };
    8181                        parse_args(0, temp, options, usage, left );
     
    9090                char ** & left
    9191        ) {
    92                 check_args(options);
     92                check_args( options );
    9393
    9494                int maxv = 'h';
     
    9696                {
    9797                        int idx = 0;
    98                         for(i; opt_count) {
    99                                 if (options[i].short_name) {
    100                                         maxv = max(options[i].short_name, maxv);
     98                        for ( i; opt_count ) {
     99                                if ( options[i].short_name ) {
     100                                        maxv = max( options[i].short_name, maxv );
    101101                                        optstring[idx] = options[i].short_name;
    102102                                        idx++;
    103                                         if(    ((intptr_t)options[i].parse) != ((intptr_t)parse_settrue)
    104                                         && ((intptr_t)options[i].parse) != ((intptr_t)parse_setfalse) ) {
     103                                        if ( (intptr_t)options[i].parse != (intptr_t)parse_settrue
     104                                                 && ((intptr_t)options[i].parse) != ((intptr_t)parse_setfalse) ) {
    105105                                                optstring[idx] = ':';
    106106                                                idx++;
     
    115115                {
    116116                        int idx = 0;
    117                         for(i; opt_count) {
    118                                 if(options[i].long_name) {
     117                        for ( i; opt_count ) {
     118                                if ( options[i].long_name ) {
    119119                                        // we don't have the mutable keyword here, which is really what we would want
    120120                                        int & val_ref = (int &)(const int &)options[i].val;
     
    124124                                        optarr[idx].flag = 0p;
    125125                                        optarr[idx].val  = options[i].val;
    126                                         if(    ((intptr_t)options[i].parse) == ((intptr_t)parse_settrue)
    127                                         || ((intptr_t)options[i].parse) == ((intptr_t)parse_setfalse) ) {
     126                                        if ( ((intptr_t)options[i].parse) == ((intptr_t)parse_settrue)
     127                                                 || ((intptr_t)options[i].parse) == ((intptr_t)parse_setfalse) ) {
    128128                                                optarr[idx].has_arg = no_argument;
    129129                                        } else {
     
    139139                FILE * out = stderr;
    140140                NEXT_ARG:
    141                 for() {
     141                for () {
    142142                        int idx = 0;
    143                         int opt = getopt_long(argc, argv, optstring, optarr, &idx);
    144                         switch(opt) {
     143                        int opt = getopt_long( argc, argv, optstring, optarr, &idx );
     144                        switch( opt ) {
    145145                                case -1:
    146                                         if(&left != 0p) left = argv + optind;
     146                                        if ( &left != 0p ) left = argv + optind;
    147147                                        return;
    148148                                case 'h':
    149149                                        out = stdout;
    150150                                case '?':
    151                                         usage(argv[0], options, usage, out);
     151                                        usage( argv[0], options, usage, out );
    152152                                default:
    153                                         for(i; opt_count) {
    154                                                 if(opt == options[i].val) {
     153                                        for ( i; opt_count ) {
     154                                                if ( opt == options[i].val ) {
    155155                                                        const char * arg = optarg ? optarg : "";
    156                                                         if( arg[0] == '=' ) { arg++; }
     156                                                        if ( arg[0] == '=' ) { arg++; }
    157157                                                        // work around for some weird bug
    158158                                                        void * variable = options[i].variable;
    159159                                                        bool (*parse_func)(const char *, void * ) = options[i].parse;
    160160                                                        bool success = parse_func( arg, variable );
    161                                                         if(success) continue NEXT_ARG;
    162 
    163                                                         fprintf(out, "Argument '%s' for option %c could not be parsed\n\n", arg, (char)opt);
    164                                                         usage(argv[0], options, usage, out);
     161                                                        if ( success ) continue NEXT_ARG;
     162
     163                                                        fprintf( out, "Argument '%s' for option %c could not be parsed\n\n", arg, (char)opt );
     164                                                        usage( argv[0], options, usage, out );
    165165                                                }
    166166                                        }
    167                                         abort("Internal parse arg error\n");
     167                                        abort( "Internal parse arg error\n" );
    168168                        }
    169169
     
    172172}
    173173
    174 static inline int next_newline(const char * str) {
     174static inline int next_newline( const char * str ) {
    175175        int ret;
    176         const char * ptr = strstr(str, "\n");
    177         if(!ptr) return MAX;
    178 
    179         /* paranoid */ verify( str <= ptr);
     176        const char * ptr = strstr( str, "\n" );
     177        if ( ! ptr ) return MAX;
     178
     179        /* paranoid */ verify( str <= ptr );
    180180        intptr_t low = (intptr_t)str;
    181181        intptr_t hi  = (intptr_t)ptr;
     
    187187//-----------------------------------------------------------------------------
    188188// Print usage
    189 static void printopt(FILE * out, int width, int max, char sn, const char * ln, const char * help) {
     189static void printopt( FILE * out, int width, int max, char sn, const char * ln, const char * help ) {
    190190        // check how wide we should be printing
    191191        // this includes all options and the help message
    192192        int hwidth = max - (11 + width);
    193         if(hwidth <= 0) hwidth = max;
     193        if ( hwidth <= 0 ) hwidth = max;
    194194
    195195        // check which pieces we have
    196         bool has_ln = ln && strcmp("", ln);
    197         bool has_help = help && strcmp("", help);
     196        bool has_ln = ln && strcmp( "", ln );
     197        bool has_help = help && strcmp( "", help );
    198198
    199199        // print the small name if present
    200         if(sn != '\0') fprintf(out, "  -%c", sn);
    201         else fprintf(out, "    ");
     200        if ( sn != '\0') fprintf( out, "  -%c", sn );
     201        else fprintf( out, "    " );
    202202
    203203        // print a comma if we have both short and long names
    204         if(sn != '\0' && has_ln) fprintf(out, ", ");
    205         else fprintf(out, "  ");
     204        if ( sn != '\0' && has_ln ) fprintf( out, ", " );
     205        else fprintf( out, "  " );
    206206
    207207        // print the long name if present
    208         if(has_ln)        fprintf(out, "--%-*s", width, ln);
    209         else if(has_help) fprintf(out, "  %-*s", width, "");
    210 
    211         if(has_help) {
     208        if ( has_ln ) fprintf( out, "--%-*s", width, ln );
     209        else if ( has_help ) fprintf( out, "  %-*s", width, "" );
     210
     211        if ( has_help ) {
    212212                // print the help
    213213                // We need to wrap at the max width, and also indent newlines so everything is nice and pretty
    214214
    215215                // for each line to print
    216                 for() {
     216                for () {
    217217                        //find out if there is a newline
    218                         int nextnl = next_newline(help);
    219                         int real = min(min(strlen(help), hwidth), nextnl);
    220 
    221                         fprintf(out, "   %.*s", real, help);
    222                         // printf("%d %d\n", real, nextnl);
     218                        int nextnl = next_newline( help );
     219                        int real = min( min( strlen( help ), hwidth ), nextnl );
     220
     221                        fprintf( out, "   %.*s", real, help );
     222                        // printf( "%d %d\n", real, nextnl );
    223223                        help += real;
    224                         if( nextnl == real ) help++;
    225                         if('\0' == *help) break;
    226                         fprintf(out, "\n%*s", width + 8, "");
    227                 }
    228         }
    229         fprintf(out, "\n");
    230 }
    231 
    232 void print_args_usage(cfa_option options[], const size_t opt_count, const char * usage, bool error)  __attribute__ ((noreturn)) {
     224                        if ( nextnl == real ) help++;
     225                        if ('\0' == *help ) break;
     226                        fprintf( out, "\n%*s", width + 8, "" );
     227                }
     228        }
     229        fprintf( out, "\n" );
     230}
     231
     232void print_args_usage( cfa_option options[], const size_t opt_count, const char * usage, bool error )  __attribute__ ((noreturn )) {
    233233        const array( cfa_option, opt_count ) & arr = (const array( cfa_option, opt_count ) &) *options;
    234         usage(cfa_args_argv[0], arr, usage, error ? stderr : stdout);
    235 }
    236 
    237 void print_args_usage(int , char * argv[], cfa_option options[], const size_t opt_count, const char * usage, bool error)  __attribute__ ((noreturn)) {
     234        usage( cfa_args_argv[0], arr, usage, error ? stderr : stdout );
     235}
     236
     237void print_args_usage( int , char * argv[], cfa_option options[], const size_t opt_count, const char * usage, bool error )  __attribute__ (( noreturn )) {
    238238        const array( cfa_option, opt_count ) & arr = (const array( cfa_option, opt_count ) &) *options;
    239         usage(argv[0], arr, usage, error ? stderr : stdout);
     239        usage( argv[0], arr, usage, error ? stderr : stdout );
    240240}
    241241
    242242forall( [N] ) {
    243         void print_args_usage( const array(cfa_option, N ) & options, const char * usage, bool error) {
    244                 usage(cfa_args_argv[0], options, usage, error ? stderr : stdout);
    245         }
    246 
    247         void print_args_usage(int argc, char * argv[], const array( cfa_option, N ) & options, const char * usage, bool error) {
    248                 usage(argv[0], options, usage, error ? stderr : stdout);
     243        void print_args_usage( const array(cfa_option, N ) & options, const char * usage, bool error ) {
     244                usage( cfa_args_argv[0], options, usage, error ? stderr : stdout );
     245        }
     246
     247        void print_args_usage( int argc, char * argv[], const array( cfa_option, N ) & options, const char * usage, bool error ) {
     248                usage( argv[0], options, usage, error ? stderr : stdout );
    249249        }
    250250}
    251251
    252252forall([N])
    253 static void usage(char * cmd, const array( cfa_option, N ) & options, const char * help, FILE * out) __attribute__((noreturn)) {
     253static void usage( char * cmd, const array( cfa_option, N ) & options, const char * help, FILE * out ) __attribute__(( noreturn )) {
    254254        int width = 0;
    255255        {
    256                 for(i; N) {
    257                         if(options[i].long_name) {
    258                                 int w = strlen(options[i].long_name);
    259                                 if(w > width) width = w;
     256                for ( i; N ) {
     257                        if ( options[i].long_name ) {
     258                                int w = strlen( options[i].long_name );
     259                                if ( w > width ) width = w;
    260260                        }
    261261                }
     
    263263
    264264        int max_width = 1_000_000;
    265         int outfd = fileno(out);
    266         if(isatty(outfd)) {
     265        int outfd = fileno( out );
     266        if ( isatty( outfd ) ) {
    267267                struct winsize size;
    268                 int ret = ioctl(outfd, TIOCGWINSZ, &size);
    269                 if(ret < 0) abort( "ioctl error: (%d) %s\n", (int)errno, strerror(errno) );
     268                int ret = ioctl( outfd, TIOCGWINSZ, &size );
     269                if ( ret < 0 ) abort( "ioctl error: (%d) %s\n", (int)errno, strerror( errno) );
    270270                max_width = size.ws_col;
    271271        }
    272272
    273         fprintf(out, "Usage:\n  %s %s\n", cmd, help);
    274 
    275         for(i; N) {
    276                 printopt(out, width, max_width, options[i].short_name, options[i].long_name, options[i].help);
    277         }
    278         fprintf(out, "  -%c, --%-*s   %s\n", 'h', width, "help", "print this help message");
    279         exit(out == stdout ? 0 : 1);
     273        fprintf( out, "Usage:\n  %s %s\n", cmd, help );
     274
     275        for ( i; N ) {
     276                printopt( out, width, max_width, options[i].short_name, options[i].long_name, options[i].help );
     277        }
     278        fprintf( out, "  -%c, --%-*s   %s\n", 'h', width, "help", "print this help message" );
     279        exit( out == stdout ? 0 : 1 );
    280280}
    281281
    282282//-----------------------------------------------------------------------------
    283283// Typed argument parsing
    284 bool parse_yesno(const char * arg, bool & value ) {
    285         if(strcmp(arg, "yes") == 0) {
     284bool parse_yesno( const char * arg, bool & value ) {
     285        if ( strcmp( arg, "yes" ) == 0 ) {
    286286                value = true;
    287287                return true;
    288288        }
    289 
    290         if(strcmp(arg, "Y") == 0) {
     289        if ( strcmp( arg, "Y" ) == 0 ) {
    291290                value = true;
    292291                return true;
    293292        }
    294 
    295         if(strcmp(arg, "y") == 0) {
     293        if ( strcmp( arg, "y" ) == 0 ) {
    296294                value = true;
    297295                return true;
    298296        }
    299 
    300         if(strcmp(arg, "no") == 0) {
     297        if ( strcmp( arg, "no" ) == 0 ) {
    301298                value = false;
    302299                return true;
    303300        }
    304 
    305         if(strcmp(arg, "N") == 0) {
     301        if ( strcmp( arg, "N" ) == 0 ) {
    306302                value = false;
    307303                return true;
    308304        }
    309 
    310         if(strcmp(arg, "n") == 0) {
     305        if ( strcmp( arg, "n" ) == 0 ) {
    311306                value = false;
    312307                return true;
    313308        }
    314 
    315309        return false;
    316310}
    317311
    318 bool parse_truefalse(const char * arg, bool & value) {
    319         if(strcmp(arg, "true") == 0) {
     312bool parse_truefalse( const char * arg, bool & value ) {
     313        if ( strcmp( arg, "true" ) == 0 ) {
    320314                value = true;
    321315                return true;
    322316        }
    323 
    324         if(strcmp(arg, "false") == 0) {
     317        if ( strcmp( arg, "false" )  == 0 ) {
    325318                value = false;
    326319                return true;
    327320        }
    328 
    329321        return false;
    330322}
    331323
    332 bool parse_settrue (const char *, bool & value ) {
     324bool parse_settrue ( const char *, bool & value ) {
    333325        value = true;
    334326        return true;
    335327}
    336328
    337 bool parse_setfalse(const char *, bool & value )  {
     329bool parse_setfalse( const char *, bool & value )  {
    338330        value = false;
    339331        return true;
    340332}
    341333
    342 bool parse(const char * arg, const char * & value ) {
     334bool parse( const char * arg, const char * & value ) {
    343335        value = arg;
    344336        return true;
    345337}
    346338
    347 bool parse(const char * arg, int & value) {
     339bool parse( const char * arg, int & value ) {
    348340        char * end;
    349341
    350342        errno = 0;
    351         long long int r = strtoll(arg, &end, 0);
    352         if(errno) return false;
    353         if(*end != '\0') return false;
    354         if(r > (int)MAX) return false;
    355         if(r < (int)MIN) return false;
    356 
     343        long long int r = strtoll( arg, &end, 0 );
     344        if ( errno ) return false;
     345        if (*end != '\0') return false;
     346        if ( r > (int)MAX ) return false;
     347        if ( r < (int)MIN ) return false;
    357348        value = r;
    358349        return true;
    359350}
    360351
    361 static unsigned long long int strict_strtoull( const char * arg, int base) {
     352static unsigned long long int strict_strtoull( const char * arg, int base ) {
    362353        errno = 0;
    363354        {
    364355                const char * in = arg;
    365                 for() {
    366                         if('\0' == *in) {
     356                for () {
     357                        if ( '\0' == *in ) {
    367358                                errno = EINVAL;
    368359                                return 0;
    369360                        }
    370                         if(!isspace(*in)) break;
     361                        if ( ! isspace(*in )) break;
    371362                        in++;
    372363                }
    373                 if(!isdigit(*in)) {
     364                if ( ! isdigit(*in )) {
    374365                        errno = EINVAL;
    375366                        return 0;
    376367                }
    377368        }
    378 
    379369        *char end;
    380         unsigned long long int r = strtoull(arg, &end, base);
    381         if(*end != '\0') errno = EINVAL;
    382         if(errno) return 0;
     370        unsigned long long int r = strtoull( arg, &end, base );
     371        if (*end != '\0') errno = EINVAL;
     372        if ( errno ) return 0;
    383373        return r;
    384374}
    385375
    386 bool parse(const char * arg, unsigned & value) {
    387         unsigned long long int r = strict_strtoull(arg, 0);
    388         if(errno) return false;
    389         if(r > (unsigned)MAX) return false;
    390 
     376bool parse( const char * arg, unsigned & value ) {
     377        unsigned long long int r = strict_strtoull( arg, 0 );
     378        if ( errno ) return false;
     379        if ( r > (unsigned)MAX ) return false;
    391380        value = r;
    392381        return true;
    393382}
    394383
    395 bool parse(const char * arg, unsigned long & value) {
    396         unsigned long long int r = strict_strtoull(arg, 0);
    397         if(errno) return false;
    398         if(r > (unsigned long)MAX) return false;
    399 
     384bool parse( const char * arg, unsigned long & value ) {
     385        unsigned long long int r = strict_strtoull( arg, 0 );
     386        if ( errno ) return false;
     387        if ( r > (unsigned long)MAX ) return false;
    400388        value = r;
    401389        return true;
    402390}
    403391
    404 bool parse(const char * arg, unsigned long long & value) {
    405         unsigned long long int r = strict_strtoull(arg, 0);
    406         if(errno) return false;
    407         if(r > (unsigned long long)MAX) return false;
    408 
     392bool parse( const char * arg, unsigned long long & value ) {
     393        unsigned long long int r = strict_strtoull( arg, 0 );
     394        if ( errno ) return false;
     395        if ( r > (unsigned long long)MAX ) return false;
    409396        value = r;
    410397        return true;
    411398}
    412399
    413 bool parse(const char * arg, double & value) {
     400bool parse( const char * arg, double & value ) {
    414401        char * end;
    415         double r = strtod(arg, &end);
    416         if(*end != '\0') return false;
    417 
     402        double r = strtod( arg, &end );
     403        if ( *end != '\0') return false;
    418404        value = r;
    419405        return true;
  • libcfa/src/parseargs.hfa

    rbb336a6 rdbff8ec  
    1010// Author           : Thierry Delisle
    1111// Created On       : Wed Oct 12 15:28:01 2022
    12 // Last Modified By :
    13 // Last Modified On :
    14 // Update Count     :
     12// Last Modified By : Peter A. Buhr
     13// Last Modified On : Mon Jul  8 18:18:14 2024
     14// Update Count     : 2
    1515//
    1616#pragma once
     
    3131static inline void ?{}( cfa_option & this ) {}
    3232
    33 forall(T & | { bool parse(const char *, T & ); })
     33forall(T & | { bool parse( const char *, T & ); })
    3434static inline void ?{}( cfa_option & this, char short_name, const char * long_name, const char * help, T & variable ) {
    3535        this.val        = 0;
  • src/Parser/ExpressionNode.cpp

    rbb336a6 rdbff8ec  
    483483        Default:                                                                                        // char default string type
    484484        default:
    485                 strtype = new ast::BasicType( ast::BasicKind::Char, ast::CV::Const );
     485                strtype = new ast::BasicType( ast::BasicKind::Char );
    486486        } // switch
    487487        ast::ArrayType * at = new ast::ArrayType(
  • src/Parser/parser.yy

    rbb336a6 rdbff8ec  
    1010// Created On       : Sat Sep  1 20:22:55 2001
    1111// Last Modified By : Peter A. Buhr
    12 // Last Modified On : Thu Jun 27 14:45:57 2024
    13 // Update Count     : 6705
     12// Last Modified On : Tue Jul  9 10:29:01 2024
     13// Update Count     : 6713
    1414//
    1515
     
    19981998        c_declaration ';'
    19991999        | cfa_declaration ';'                                                           // CFA
    2000         | static_assert                                                                         // C11
     2000        | static_assert ';'                                                                     // C11
    20012001        ;
    20022002
    20032003static_assert:
    2004         STATICASSERT '(' constant_expression ',' string_literal ')' ';' // C11
     2004        STATICASSERT '(' constant_expression ',' string_literal ')' // C11
    20052005                { $$ = DeclarationNode::newStaticAssert( $3, maybeMoveBuild( $5 ) ); }
    2006         | STATICASSERT '(' constant_expression ')' ';'          // CFA
     2006        | STATICASSERT '(' constant_expression ')'                      // CFA
    20072007                { $$ = DeclarationNode::newStaticAssert( $3, build_constantStr( yylloc, *new string( "\"\"" ) ) ); }
    20082008
     
    27092709                { $$ = $2; }                                                                    // mark all fields in list
    27102710        | cfa_typedef_declaration ';'                                           // CFA
    2711         | static_assert                                                                         // C11
     2711        | static_assert ';'                                                                     // C11
    27122712        ;
    27132713
     
    33643364                        $$ = $6;
    33653365                }
     3366        | ';'                                                                                           // empty declaration
     3367                { $$ = nullptr; }
    33663368        ;
    33673369
Note: See TracChangeset for help on using the changeset viewer.