Changeset c3e41cda


Ignore:
Timestamp:
Nov 3, 2024, 9:13:16 PM (4 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
7d95bce9
Parents:
63a7394
Message:

proofread new material in background and array chapters

Location:
doc/theses/mike_brooks_MMath
Files:
6 edited

Legend:

Unmodified
Added
Removed
  • TabularUnified doc/theses/mike_brooks_MMath/array.tex

    r63a7394 rc3e41cda  
    22\label{c:Array}
    33
     4Arrays in C are possibly the single most misunderstood and incorrectly used feature in the language, resulting in the largest proportion of runtime errors and security violations.
     5This chapter describes the new \CFA language and library features that introduce a length-checked array type, @array@, to the \CFA standard library~\cite{Cforall}.
     6
     7Offering the @array@ type, as a distinct alternative to the C array, is consistent with \CFA's goal of backwards compatibility, \ie virtually all existing C (@gcc@) programs can be compiled by \CFA with only a small number of changes, similar to \CC (@g++@).
     8However, a few compatibility-breaking changes to the behaviour of the C array are necessary, both as an implementation convenience and to fix C's lax treatment of arrays.
     9Hence, the @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features, making it unnecessary to deal with every inherited complexity of the C array.
     10
    411
    512\section{Introduction}
    613\label{s:ArrayIntro}
    714
    8 Arrays in C are possibly the single most misunderstood and incorrectly used feature in the language, resulting in the largest proportion of runtime errors and security violations.
    9 This chapter describes the new \CFA language and library features that introduce a length-checked array type to the \CFA standard library~\cite{Cforall}.
    10 
    11 Specifically, a new \CFA array is declared by instantiating the generic @array@ type,
    12 much like instantiating any other standard-library generic type (such as @dlist@),
     15The new \CFA array is declared by instantiating the generic @array@ type,
     16much like instantiating any other standard-library generic type (such as \CC @vector@),
    1317though using a new style of generic parameter.
    1418\begin{cfa}
     
    1620\end{cfa}
    1721Here, the arguments to the @array@ type are @float@ (element type) and @99@ (length).
    18 When this type is used as a function parameter, the type-system requires that a call's argument matches, down to the length.
     22When this type is used as a function parameter, the type-system requires that a call's argument is a perfect match.
    1923\begin{cfa}
    2024void f( @array( float, 42 )@ & p ) {}   $\C{// p accepts 42 floats}$
    21 f( x );                                                                 $\C{// statically rejected: types are different, 99 != 42}$
     25f( x );                                                                 $\C{// statically rejected: type lengths are different, 99 != 42}$
    2226
    2327test2.cfa:3:1 error: Invalid application of existing declaration(s) in expression.
     
    2529\end{cfa}
    2630Here, the function @f@'s parameter @p@ is declared with length 42.
    27 The call @f( x )@, with the argument being the previously-declared object, is invalid, because the @array@ lengths @99@ and @42@ do not match.
    28 
    29 A function declaration can be polymorphic over these @array@ arguments by using the @forall@ declaration prefix.
    30 This function @g@'s takes arbitrary type parameter @T@ (familiar) and \emph{dimension parameter} @N@ (new).
    31 A dimension paramter represents a to-be-determined count of elements, managed by the type system.
     31However, the call @f( x )@ is invalid, because @x@'s length is @99@, which does not match @42@.
     32
     33A function declaration can be polymorphic over these @array@ arguments by using the \CFA @forall@ declaration prefix.
    3234\begin{cfa}
    3335forall( T, @[N]@ )
     
    4042Cforall Runtime error: subscript 1000 exceeds dimension range [0,99) $for$ array 0x555555558020.
    4143\end{cfa}
     44Function @g@ takes an arbitrary type parameter @T@ and a \emph{dimension parameter} @N@.
     45A dimension parameter represents a to-be-determined count of elements, managed by the type system.
    4246The call @g( x, 0 )@ is valid because @g@ accepts any length of array, where the type system infers @float@ for @T@ and length @99@ for @N@.
    43 Inferring values for @T@ and @N@ is implicit, without programmer involvement.
     47Inferring values for @T@ and @N@ is implicit.
    4448Furthermore, in this case, the runtime subscript @x[0]@ (parameter @i@ being @0@) in @g@ is valid because 0 is in the dimension range $[0,99)$ of argument @x@.
    45 The call @g( x, 1000 )@ is also accepted through compile time;
     49However, the call @g( x, 1000 )@ is also accepted through compile time;
    4650however, this case's subscript, @x[1000]@, generates an error, because @1000@ is outside the dimension range $[0,99)$ of argument @x@.
     51In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function.
     52The syntactic form is chosen to parallel other @forall@ forms:
     53\begin{cfa}
     54forall( @[N]@ ) ...     $\C[1.5in]{// dimension}$
     55forall( T ) ...         $\C{// value datatype (formerly, "otype")}$
     56forall( T & ) ...       $\C{// opaque datatype (formerly, "dtype")}\CRT$
     57\end{cfa}
     58% The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance.
    4759
    4860The generic @array@ type is comparable to the C array type, which \CFA inherits from C.
    4961Their runtime characteristics are often identical, and some features are available in both.
    50 For example, assume a caller instantiates @N@ with 42 (discussion about how to follow) in:
     62For example, assume a caller has an argument that instantiates @N@ with 42.
    5163\begin{cfa}
    5264forall( [N] )
    53 void declDemo() {
     65void declDemo( ... ) {
    5466        float x1[N];                                            $\C{// built-in type ("C array")}$
    5567        array(float, N) x2;                                     $\C{// type from library}$
     
    5971The two variables have identical size and layout; they both encapsulate 42-float stack allocations, with no additional ``bookkeeping'' allocations or headers.
    6072Providing this explicit generic approach requires a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA.
    61 
    62 Admittedly, the @array@ library type (type for @x2@) is syntactically different from its C counterpart.
    63 A future goal (TODO xref) is to provide the new features upon a built-in type whose syntax approaches C's (declaration style of @x1@).
     73In all following discussion, ``C array'' means the types like that of @x@ and ``\CFA array'' means the standard-library @array@ type (instantiations), like the type of @x2@.
     74
     75Admittedly, the @array@ library type for @x2@ is syntactically different from its C counterpart.
     76A future goal (TODO xref) is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@).
    6477Then, the library @array@ type could be removed, giving \CFA a largely uniform array type.
    65 At present, the C-syntax array gets partial support for the new features, so the generic @array@ is used exclusively when introducing features;
     78At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis;
    6679feature support and C compatibility are revisited in Section ? TODO.
    67 
    68 Offering the @array@ type, as a distinct alternative to the C array, is consistent with \CFA's goal of backwards compatibility, \ie virtually all existing C (@gcc@) programs can be compiled by \CFA with only a small number of changes, similar to \CC (@g++@).
    69 However, a few compatibility-breaking changes to the behaviour of the C array are necessary, both as an implementation convenience and to fix C's lax treatment of arrays.
    70 Hence, the @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features, making it unnecessary to deal with every inherited complexity of the C array.
    71 
    72 In all discussion following, ``C array'' means the types like that of @x@ and ``\CFA array'' means the standard-library @array@ type (instantiations), like the type of @x2@.
    7380
    7481My contributions in this chapter are:
     
    8390
    8491
    85 \section{Definitions and design considerations}
    86 
    87 
    88 \subsection{Dependent typing}
    89 
    90 
    91 
    92 General dependent typing allows the type system to encode arbitrary predicates (e.g. behavioural specifications for functions),
     92\section{Dependent typing}
     93
     94General dependent typing allows the type system to encode arbitrary predicates (\eg behavioural specifications for functions),
    9395which is an anti-goal for my work.
    9496Firstly, this application is strongly associated with pure functional languages,
     
    101103
    102104
    103 
    104105\section{Features added}
    105106
     
    109110By declaring type variables at the front of object declarations, an array dimension is lexically referenceable where it is needed.
    110111For example, a declaration can share one length, @N@, among a pair of parameters and the return,
    111 meaning that it requires both input arrays to be of the same length, and guarantees that the result with be of that length as well.
     112meaning that it requires both input arrays to be of the same length, and guarantees that the result is of that length as well.
    112113\lstinput{10-17}{hello-array.cfa}
    113 This function @f@ does a pointwise comparison of its two input arrays, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array.
    114 The dynamic allocation of the @ret@ array by preexisting @alloc@ uses the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type.
    115 Note that alloc only sees one whole type for its @T@ (which is @f@'s @array(bool, N)@); this type's size is a computation based on @N@.
    116 \begin{cfa}
    117 // simplification
    118 static inline forall( T & | sized(T) )
     114Function @f@ does a pointwise comparison of its two input arrays, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array.
     115The dynamic allocation of the @ret@ array, by the library @alloc@ function,
     116\begin{cfa}
     117forall( T & | sized(T) )
    119118T * alloc() {
    120         return (T *)malloc( sizeof(T) );
    121 }
    122 \end{cfa}
    123 This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary @sized@ assertions needed by other types.
    124 (@sized@ implies a concrete \vs abstract type with a runtime-available size, exposed as @sizeof@.)
     119        return @(T *)@malloc( @sizeof(T)@ );
     120}
     121\end{cfa}
     122uses the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type.
     123Note that @alloc@ only sees one whole type for its @T@ (which is @f@'s @array(bool, N)@); this type's size is a computation based on @N@.
     124This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary \emph{sized} assertions needed by other types.
     125(\emph{sized} implies a concrete \vs abstract type with a runtime-available size, exposed as @sizeof@.)
    125126As a result, there is significant programming safety by making the size accessible and implicit, compared with C's @calloc@ and non-array supporting @memalign@, which take an explicit length parameter not managed by the type system.
    126127
     
    142143The result is a significant improvement in safety and usability.
    143144
    144 In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function.
    145 The syntactic form is chosen to parallel other @forall@ forms:
    146 \begin{cfa}
    147 forall( @[N]@ ) ...     $\C[1.5in]{// dimension}$
    148 forall( T & ) ...       $\C{// opaque datatype (formerly, "dtype")}$
    149 forall( T ) ...         $\C{// value datatype (formerly, "otype")}\CRT$
    150 \end{cfa}
    151 % The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance.
    152145In summary:
    153146\begin{itemize}
     
    168161% agreed, though already said
    169162\item
    170 \CC does not allow a template function to be nested, while \CFA lests its polymorphic functions to be nested.
     163\CC does not allow a template function to be nested, while \CFA lets its polymorphic functions to be nested.
    171164% why is this important?
    172165\item
     
    227220\end{cfa}
    228221\end{tabular}
    229 \caption{\lstinline{N}-style paramters, for \CC template \vs \CFA generic type }
     222\caption{\lstinline{N}-style parameters, for \CC template \vs \CFA generic type }
    230223\label{f:TemplateVsGenericType}
    231224\end{figure}
    232225
    233226Just as the first example in \VRef[Section]{s:ArrayIntro} shows a compile-time rejection of a length mismatch,
    234 so are length mismatches stopped when they invlove dimension parameters.
     227so are length mismatches stopped when they involve dimension parameters.
    235228While \VRef[Figure]{f:fHarness} shows successfully calling a function @f@ expecting two arrays of the same length,
    236229\begin{cfa}
    237230array( bool, N ) & f( array( float, N ) &, array( float, N ) & );
    238231\end{cfa}
    239 a static rejection occurs when attempting to call @f@ with arrays of potentially differing lengths.
     232a static rejection occurs when attempting to call @f@ with arrays of differing lengths.
    240233\lstinput[tabsize=1]{70-74}{hello-array.cfa}
    241234When the argument lengths themselves are statically unknown,
     
    252245Orthogonally, the \CFA array type works within generic \emph{types}, \ie @forall@-on-@struct@.
    253246The same argument safety and the associated implicit communication of array length occurs.
    254 Preexisting \CFA allowed aggregate types to be generalized with type parameters, enabling parameterizing for element types.
     247Preexisting \CFA allowed aggregate types to be generalized with type parameters, enabling parameterizing of element types.
    255248Now, \CFA also allows parameterizing them by length.
    256249Doing so gives a refinement of C's ``flexible array member'' pattern[TODO: cite ARM 6.7.2.1 pp18]\cite{arr:gnu-flex-mbr}.
     
    259252This flexibility, in turn, allows for multiple array members.
    260253\lstinput{10-15}{hello-accordion.cfa}
    261 This structure's layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters.
    262 
    263 The school example has the data structure capturing many students' course-preference forms.
    264 It has course- and student-level metadata (their respective display names) and a position-based preferecens' matrix.
    265 The input files in \VRef[Figure]{f:checkHarness} give example data.
    266 
    267 When a function operates on a @School@ structure, the type system handles its memory layout transparently.
    268 \lstinput{30-37}{hello-accordion.cfa}
    269 In the running example, this @getPref@ function answers,
    270 for the student at position @sIx@, what is the position of its @pref@\textsuperscript{th}-favoured class?
    271 
    272 \VRef[Figure]{f:checkHarness} shows the @School@ harness and results with different array sizes.
    273 This example program prints the courses in each student's preferred order, all using the looked-up display names.
    274 Note the declaration of the @school@ variable.
     254The structure has course- and student-level metatdata (their respective field names) and a position-based preferences' matrix.
     255Its layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters.
     256
     257\VRef[Figure]{f:checkHarness} shows a program main using @School@ and results with different array sizes.
     258The @school@ variable holds many students' course-preference forms.
    275259It is on the stack and its initialization does not use any casting or size arithmetic.
    276260Both of these points are impossible with a C flexible array member.
     
    280264\end{cfa}
    281265This ability to avoid casting and size arithmetic improves safety and usability over C flexible array members.
    282 
     266Finally, inputs and outputs are given at the bottom for different sized schools.
     267The example program prints the courses in each student's preferred order, all using the looked-up display names.
    283268
    284269\begin{figure}
    285 % super hack to get this to line up
    286 \begin{tabular}{@{}ll@{\hspace{25pt}}l@{}}
    287 \begin{tabular}{@{}p{3.25in}@{}}
     270\begin{cquote}
    288271\lstinput{50-55}{hello-accordion.cfa}
    289 \vspace*{-3pt}
    290272\lstinput{90-98}{hello-accordion.cfa}
    291 \end{tabular}
    292 &
    293 \raisebox{0.32\totalheight}{%
    294 }%
    295 &
    296 \end{tabular}
    297 
    298 TODO: Get Peter's layout help
    299 
    300 \$ cat school1
    301 
     273\ \\
     274@$ cat school1@
    302275\lstinput{}{school1}
    303276
    304 \$ ./a.out < school1
    305 
     277@$ ./a.out < school1@
    306278\lstinput{}{school1.out}
    307279
    308 \$ cat school2
    309 
     280@$ cat school2@
    310281\lstinput{}{school2}
    311282
    312 \$ ./a.out < school2
    313 
     283@$ ./a.out < school2@
    314284\lstinput{}{school2.out}
     285\end{cquote}
    315286
    316287\caption{\lstinline{School} harness, input and output}
    317288\label{f:checkHarness}
    318289\end{figure}
     290
     291When a function operates on a @School@ structure, the type system handles its memory layout transparently.
     292\lstinput{30-37}{hello-accordion.cfa}
     293In the example, this @getPref@ function answers, for the student at position @is@, what is the position of its @pref@\textsuperscript{th}-favoured class?
    319294
    320295
     
    333308But simplifications close enough for the present discussion are:
    334309\begin{cfa}
    335         forall( [N] )
    336         struct array_1d_float {
    337                 float items[N];
    338         };
    339         forall( T, [N] )
    340         struct array_1d {
    341                 T items[N];
    342         };
    343 \end{cfa}
    344 This structure pattern, plus a subscript operator, is all that @array@ provides.
     310forall( [N] )
     311struct array_1d_float {
     312        float items[N];
     313};
     314forall( T, [N] )
     315struct array_1d_T {
     316        T items[N];
     317};
     318\end{cfa}
     319These two structure patterns, plus a subscript operator, is all that @array@ provides.
    345320
    346321My main work is letting a programmer define
    347 such a structre (one whose type is parameterized by @[N]@)
     322such a structure (one whose type is parameterized by @[N]@)
    348323and functions that operate on it (these being similarly parameterized).
    349324
     
    351326\begin{itemize}
    352327\item
    353         The resolver, providing values for a used declaration's type-system variables,
     328        Resolver provided values for a used declaration's type-system variables,
    354329        gathered from type information in scope at the usage site.
    355 
    356330\item
    357331        The box pass, encoding information about type parameters
    358332        into ``extra'' regular parameters/arguments on declarations and calls.
    359333        Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter,
    360         and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, i.e. a use of the parameter.
     334        and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter.
    361335\end{itemize}
    362336
     
    364338This work is detailed in \VRef[Section]{s:ArrayTypingC}.
    365339However, the resolution--boxing scheme, in its preexisting state, was already equipped to work on (desugared) dimension parameters.
    366 The discussion following explains the desugaring and how correct lowered code results.
    367 
    368 An even simpler structure, and a toy function on it, demonstrate what's needed for the encoding.
    369 \begin{cfa}
    370         forall( [@N@] ) { // [1]
    371                 struct thing {};
    372                 void f( thing(@N@) ) { sout | @N@; } // [2], [3]
    373         }
    374         int main() {
    375                 thing( @10@ ) x;  f(x);  // prints 10, [4]
    376                 thing( 100 ) y;  f(y);  // prints 100
    377                 return 0;
    378         }
     340The following discussion explains the desugaring and how correctly lowered code results.
     341
     342A simpler structure, and a toy function on it, demonstrate what is needed for the encoding.
     343\begin{cfa}
     344forall( [@N@] ) { $\C{// [1]}$
     345        struct thing {};
     346        void f( thing(@N@) ) { sout | @N@; } $\C{// [2], [3]}$
     347}
     348int main() {
     349        thing( @10@ ) x;  f( x );  $\C{// prints 10, [4]}$
     350        thing( 100 ) y;  f( y );  $\C{// prints 100}$
     351        return 0;
     352}
    379353\end{cfa}
    380354This example has:
     
    389363        A value like 10 being used as an argument to the parameter @N@.
    390364\end{enumerate}
    391 The chosen solution being to encode the value @N@ \emph{as a type}, items 1 and 2 are immediately available for free.
     365The chosen solution is to encode the value @N@ \emph{as a type}, so items 1 and 2 are immediately available for free.
    392366Item 3 needs a way to recover the encoded value from a (valid) type (and to reject invalid types occurring here).
    393367Item 4 needs a way to produce a type that encodes the given value.
     
    400374\item
    401375        Given a dimension expression $e$, produce type @char[@$e$@]@ to represent it.
    402         If $e$ evaluates to $n$ then the endoded type has size $n$.
     376        If $e$ evaluates to $n$ then the encoded type has size $n$.
    403377\item
    404378        Given a type $T$ (produced by these rules), recover the value that it represents with the expression @sizeof(@$T$@)@.
     
    410384The running example is lowered to:
    411385\begin{cfa}
    412         forall( @N*@ ) { // [1]
    413                 struct thing {};
    414                 void f( thing(@N@) ) { sout | @sizeof(N)@; } // [2], [3]
    415         }
    416         int main() {
    417                 thing( char[@10@] ) x;  f(x);  // prints 10, [4]
    418                 thing( char[100] ) y;  f(y);  // prints 100
    419                 return 0;
    420         }
     386forall( @N *@ ) { $\C{// [1]}$
     387        struct thing {};
     388        void f( thing(@N@) ) { sout | @sizeof(N)@; } $\C{// [2], [3]}$
     389}
     390int main() {
     391        thing( char[@10@] ) x;  f( x );  $\C{// prints 10, [4]}$
     392        thing( char[100] ) y;  f( y );  $\C{// prints 100}$
     393        return 0;
     394}
    421395\end{cfa}
    422396Observe:
     
    430404        The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is an ordinary expression.
    431405\item
    432         The type of variable @x@ is another @thing(-)@ type;  the argument to the generic @thing@ is a type (array type).
     406        The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type of bytes, @char@).
    433407\end{enumerate}
    434408
    435409From this point, preexisting \CFA compilation takes over lowering it the rest of the way to C.
    436 Inspecting the result shows what the above translation achieves.
    437 A form that shows only the relevant changes of the box pass (as informed by the resolver), leaving the rest unadulterated, is:
    438 \begin{cfa}
    439         // [1]
    440         void f( size_t __sizeof_N, @void *@ ) { sout | @__sizeof_N@; } // [2], [3]
    441         int main() {
    442                 struct __conc_thing_10 {} x;  f(@10@, &x);  // prints 10, [4]
    443                 struct __conc_thing_100 {} y;  f(@100@, &y);  // prints 100
    444                 return 0;
    445         }
     410Here the result shows only the relevant changes of the box pass (as informed by the resolver), leaving the rest unadulterated:
     411\begin{cfa}
     412// [1]
     413void f( size_t __sizeof_N, @void *@ ) { sout | @__sizeof_N@; } $\C{// [2], [3]}$
     414int main() {
     415        struct __conc_thing_10 {} x;  f( @10@, &x );  $\C{// prints 10, [4]}$
     416        struct __conc_thing_100 {} y;  f( @100@, &y );  $\C{// prints 100}$
     417        return 0;
     418}
    446419\end{cfa}
    447420Observe:
     
    452425        The type @thing(N)@ is (replaced by @void *@, but thereby effectively) gone.
    453426\item
    454         The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is a regular variable (parameter) usage.
     427        The @sout...@ expression (being an application of the @?|?@ operator) has a regular variable (parameter) usage for its second argument.
    455428\item
    456429        Information about the particular @thing@ instantiation (value 10) has moved, from the type, to a regular function-call argument.
    457430\end{enumerate}
    458 At the end of the desugaring and downstream processing, the original C idiom of ``pass both a pointer and a length parameter'' has been reconstructed.
    459 In the programmer-written form, only the thing is passed.
     431At the end of the desugaring and downstream processing, the original C idiom of ``pass both a length parameter and a pointer'' has been reconstructed.
     432In the programmer-written form, only the @thing@ is passed.
    460433The compiler's action produces the more complex form, which if handwritten, would be error-prone.
    461434
    462 Back at the very front end, the parsing changes, AST schema extensions, and validation rules for enabling the sugared user input are:
     435Back at the compiler front end, the parsing changes AST schema extensions and validation rules for enabling the sugared user input.
    463436\begin{itemize}
    464437\item
     
    467440        Have the new brand of type-variable, \emph{Dimension}, in the AST form of a type-variable, to represent one parsed from @[-]@.
    468441\item
    469         Allow a type variable to occur in expression position.  Validate (after parsing) that only dimension-branded type variables are used here.
     442        Allow a type variable to occur in an expression.  Validate (after parsing) that only dimension-branded type variables are used here.
    470443\item
    471444        Allow an expression to occur in type-argument position.  Brand the resulting type argument as a dimension.
     
    473446        Validate (after parsing), on a generic-type usage, \eg the type part of the declaration
    474447        \begin{cfa}
    475                 @array_1d( foo, bar ) x;@
     448        array_1d( foo, bar ) x;
    476449        \end{cfa}
     450        \vspace*{-10pt}
    477451        that the brands on the generic arguments match the brands of the declared type variables.
    478452        Here, that @foo@ is a type and @bar@ is a dimension.
     
    488462from one party who knows it, to another who is willing to work with any given length.
    489463For scenarios where the concern is a mishandled length,
    490 the interaction is between two parties who both claim to know (something about) it.
    491 Such a scenario occurs in this pure C fragment, wich today's C compilers accept:
    492 \begin{cfa}
    493         int n = @42@;
    494         float x[n];
    495         float (*xp)[@999@] = &x;
    496         (*xp)[@500@];  // in "bound"?
    497 \end{cfa}
    498 
     464the interaction is between two parties who both claim to know something about it.
     465Such a scenario occurs in this pure C fragment, which today's C compilers accept:
     466\begin{cfa}
     467int n = @42@;
     468float x[n];
     469float (*xp)[@999@] = &x;
     470(*xp)[@500@]; $\C{// in "bound"?}$
     471\end{cfa}
    499472Here, the array @x@ has length 42, while a pointer to it (@xp@) claims length 999.
    500473So, while the subscript of @xp@ at position 500 is out of bound of its referent @x@,
     
    505478The \CFA new-array rejects the analogous case:
    506479\begin{cfa}
    507         int n = @42@;
    508         array(float, n) x;
    509         array(float, 999) * xp = x; // static rejection here
    510         (*xp)[@500@]; // runtime check vs len 999
    511 \end{cfa}
    512 
    513 % TODO: kill the vertical whitespace around these lists
    514 % nothing from https://stackoverflow.com/questions/1061112/eliminate-space-before-beginitemize is working
    515 
    516 The way the \CFA array is implemented,
    517 the type analysis of this \CFA case reduces to a case similar to the earlier C version.
     480int n = @42@;
     481array(float, n) x;
     482array(float, 999) * xp = x; $\C{// static rejection here}$
     483(*xp)[@500@]; $\C{// runtime check vs len 999}$
     484\end{cfa}
     485The way the \CFA array is implemented, the type analysis of this case reduces to a case similar to the earlier C version.
    518486The \CFA compiler's compatibility analysis proceeds as:
    519 \begin{itemize}[noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]
     487\begin{itemize}[parsep=0pt]
    520488\item
    521489        Is @array(float, 999)@ type-compatible with @array(float, n)@?
    522490\item
    523         Is @arrayX(float, char[999])@ type-compatible with @arrayX(float, char[n])@?
    524         \footnote{Here, \lstinline{arrayX} represents the type that results
     491        Is @arrayX(float, char[999])@ type-compatible with @arrayX(float, char[n])@?\footnote{
     492                Here, \lstinline{arrayX} represents the type that results
    525493                from desugaring the \lstinline{array} type
    526494                into a type whose generic parameters are all types.
     
    531499        Is @char[999]@ type-compatible with @char[n]@?
    532500\end{itemize}
    533 
    534 I chose to achieve the necessary rejection of the \CFA case
    535 by adding a rejection of the corresponding C case.
    536 
    537 This decision is not backward compatible.
     501To achieve the necessary \CFA rejections meant rejecting the corresponding C case, which is not backward compatible.
    538502There are two complementary mitigations for this incompatibility.
    539503
     
    542506This situation might arise if @n@ were known to be 999,
    543507rather than 42, as in the introductory examples.
    544 The programmer can add a cast, as in:
    545 \begin{cfa}
    546         xp = ( float (*)[999] ) & x;
    547 \end{cfa}
    548 This addition causes \CFA to accept, beacause now, the programmer has accepted blame.
     508The programmer can add a cast in the \CFA code.
     509\begin{cfa}
     510xp = @(float (*)[999])@ &x;
     511\end{cfa}
     512This addition causes \CFA to accept, because now, the programmer has accepted blame.
    549513This addition is benign in plain C, because the cast is valid, just unnecessary there.
    550514Moreover, the addition can even be seen as appropriate ``eye candy,''
     
    556520Second, the incompatibility only affects types like pointer-to-array,
    557521which are are infrequently used in C.
    558 The more common C idiom for aliasing an array is to use the pointer-to-first-element type,
    559 which does not participate in the \CFA array's length checking.
    560 \footnote{Notably, the desugaring of the \lstinline{array@} type,
    561         avoids letting any \lstinline{-[-]} type decay,
    562         in order to preserve the length information that powers runtime bound checking.}
    563 Therefore, the frequency of needing to upgrade wild C code (as discussed in the first mitigation)
     522The more common C idiom for aliasing an array is to use a pointer-to-first-element type,
     523which does not participate in the \CFA array's length checking.\footnote{
     524        Notably, the desugaring of the \lstinline{array} type avoids letting any \lstinline{-[-]} type decay,
     525        in order to preserve the length information that powers runtime bound-checking.}
     526Therefore, the frequency of needing to upgrade legacy C code (as discussed in the first mitigation)
    564527is anticipated to be low.
    565528
    566529Because the incompatibility represents a low cost to a \CFA onboarding effort
    567530(with a plausible side benefit of linting the original code for a missing annotation),
    568 I elected not to add special measures to retain the compatibility.
     531no special measures were added to retain the compatibility.
    569532It would be possible to flag occurrences of @-[-]@ types that come from @array@ desugaring,
    570533treating those with stricter \CFA rules, while treating others with classic C rules.
     
    573536
    574537Having allowed that both the initial C example's check
    575 \begin{itemize}[noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]
     538\begin{itemize}
    576539        \item
    577540                Is @float[999]@ type-compatible with @float[n]@?
    578541\end{itemize}
    579 and the second \CFA exmple's induced check
    580 \begin{itemize}[noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]
     542and the second \CFA example's induced check
     543\begin{itemize}
    581544        \item
    582545                Is @char[999]@ type-compatible with @char[n]@?
     
    587550To implement the new \CFA rules, I took the syntactic recursion a step further, obtaining,
    588551in both cases:
    589 \begin{itemize}[noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]
     552\begin{itemize}
    590553        \item
    591                 Is @999@ TBD-compatible with @n@?
     554                Is @999@ compatible with @n@?
    592555\end{itemize}
    593 This compatibility question applies to a pair of expressions, where the earlier ones were to types.
     556This compatibility question applies to a pair of expressions, where the earlier implementation were to types.
    594557Such an expression-compatibility question is a new addition to the \CFA compiler.
    595 These questions only arise in the context of dimension expressions on (C) array types.
     558Note, these questions only arise in the context of dimension expressions on (C) array types.
    596559
    597560TODO: ensure these compiler implementation matters are treated under \CFA compiler background:
     
    600563GenPoly.
    601564
    602 The relevant technical component of the \CFA compiler is,
    603 within the type resolver, the type unification procedure.
     565The relevant technical component of the \CFA compiler is the type unification procedure within the type resolver.
    604566I added rules for continuing this unification into expressions that occur within types.
    605567It is still fundamentally doing \emph{type} unification
     
    607569and not participating in binding any variables that stand in for expression fragments
    608570(for there is no such sort of variable in \CFA's analysis.)
    609 
    610571An unfortunate fact about the \CFA compiler's preexisting implementation is that
    611572type unification suffers from two forms of duplication.
     
    613574The first duplication has (many of) the unification rules stated twice.
    614575As a result, my additions for dimension expressions are stated twice.
    615 The extra statement of the rules occurs in the GenPoly module,
     576The extra statement of the rules occurs in the @GenPoly@ module,
    616577where concrete types like @array(int, 5)@\footnote{
    617578        Again, the presentation is simplified
    618         by leaving the \lstinline{array} macro unexpanded}
     579        by leaving the \lstinline{array} macro unexpanded.}
    619580are lowered into corresponding C types @struct __conc_array_1234@ (the suffix being a generated index).
    620 In this case, the struct's definition gives fields that hardcode the argument values of @float@ and @5@.
    621 The next time an @array(-,-)@ concrete instance is encountered,
    622 is the previous @struct __conc_array_1234@ suitable for it?
    623 Yes, for another occurrance of @array(int, 5)@;
     581In this case, the struct's definition contains fields that hardcode the argument values of @float@ and @5@.
     582The next time an @array(-,-)@ concrete instance is encountered, it checks if the previous @struct __conc_array_1234@ is suitable for it.
     583Yes, for another occurrence of @array(int, 5)@;
    624584no, for either @array(rational(int), 5)@ or @array(int, 42)@.
    625585By the last example, this phase must ``reject''
     
    630590In the program
    631591\begin{cfa}
    632         void f( double );
    633         forall( T & ) void f( T & );
    634         void g( int n ) {
    635                 array( float, n + 1 ) x;
    636                 f(x);
    637         }
    638 \end{cfa}
    639 when resolving the function call, the first unification stage
    640 compares the types @T@, of the parameter, with @array( float, n + 1 )@, of the argument.
     592void @f@( double );
     593forall( T & ) void @f@( T & );
     594void g( int n ) {
     595        array( float, n + 1 ) x;
     596        f(x);   // overloaded
     597}
     598\end{cfa}
     599when resolving the function call, @g@, the first unification stage
     600compares the type @T@ of the parameter with @array( float, n + 1 )@, of the argument.
    641601TODO: finish.
    642602
     
    647607        TODO: Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping.
    648608        Should it be an earlier scoping principle?  Feels like it should matter in more places than here.}
    649 So, a ``yes'' answer must represent a guarantee that both expressions will evaluate the
    650 same result, while a ``no'' can tolerate ``they might, but we're not sure,'
     609So, a ``yes'' answer must represent a guarantee that both expressions evaluate the
     610same result, while a ``no'' can tolerate ``they might, but we're not sure'',
    651611provided that practical recourses are available
    652 to let programmers express their better knowledge.
    653 The specific rule-set that I offer with the current release is, in fact, extremely conservative.
     612to let programmers express better knowledge.
     613The new rule-set in the current release is, in fact, extremely conservative.
    654614I chose to keep things simple,
    655 and allow real future needs do drive adding additional complexity,
    656 within the framework that I laid out.
     615and allow future needs to drive adding additional complexity, within the new framework.
    657616
    658617For starters, the original motivating example's rejection
     
    662621Rather, the analysis assumes a variable's value can be anything,
    663622and so there can be no guarantee that its value is 999.
    664 So, a variable use and a literal can never match.
     623So, a variable and a literal can never match.
    665624
    666625Two occurrences of the same literal value are obviously a fine match.
    667 For two occurrences of the same varialbe, more information is needed.
     626For two occurrences of the same variable, more information is needed.
    668627For example, this one is fine
    669628\begin{cfa}
    670         void f( const int n ) {
    671                 float x[n];
    672                 float (*xp)[n] = x; // accept
    673         }
     629void f( const int n ) {
     630        float x[n];
     631        float (*xp)[n] = x;  // accept
     632}
    674633\end{cfa}
    675634while this one is not:
    676635\begin{cfa}
    677         void f() {
    678                 int n = 42;
    679                 float x[n];
    680                 n = 999;
    681                 float (*xp)[n] = x; // reject
    682         }
     636void f() {
     637        int n = 42;
     638        float x[n];
     639        n = 999;
     640        float (*xp)[n] = x;  // reject
     641}
    683642\end{cfa}
    684643Furthermore, the fact that the first example sees @n@ as @const@
    685 is not actually a sufficent basis.
     644is not actually sufficient.
    686645In this example, @f@'s length expression's declaration is as @const@ as it can be,
    687646yet its value still changes between the two invocations:
    688 \begin{cfa}
    689         // compile unit 1
    690         void g();
    691         void f( const int & const nr ) {
    692                 float x[nr];
    693                 g();
    694                 float (*xp)[nr] = x; // reject
    695         }
    696         // compile unit 2
    697         static int n = 42;
    698         void g() {
    699                 n = 99;
    700         }
    701         void f( const int & );
    702         int main () {
    703                 f(n);
    704                 return 0;
    705         }
    706 \end{cfa}
    707 The issue in this last case is,
    708 just because you aren't able to change something doesn't mean someone else can't.
    709 
    710 My rule set also respects a feature of the C tradition.
    711 In spite of the several limitations of the C rules
     647\begin{cquote}
     648\setlength{\tabcolsep}{15pt}
     649\begin{tabular}{@{}ll@{}}
     650\begin{cfa}
     651// compile unit 1
     652void g();
     653void f( const int & const nr ) {
     654        float x[nr];
     655        g();    // change n
     656        @float (*xp)[nr] = x;@   // reject
     657}
     658\end{cfa}
     659&
     660\begin{cfa}
     661// compile unit 2
     662static int n = 42;
     663void g() {
     664        n = 99;
     665}
     666
     667f( n );
     668\end{cfa}
     669\end{tabular}
     670\end{cquote}
     671The issue here is that knowledge needed to make a correct decision is hidden by separate compilation.
     672Even within a translation unit, static analysis might not be able to provide all the information.
     673
     674My rule set also respects a traditional C feature: In spite of the several limitations of the C rules
    712675accepting cases that produce different values, there are a few mismatches that C stops.
    713 C is quite precise when working with two static values:
    714 \begin{cfa}
    715         enum { fortytwo = 42 };
    716         float x[fortytwo];
    717         float (*xp1)[42] = &x; // accept
    718         float (*xp2)[999] = &x; // reject
     676C is quite precise when working with two static values.
     677\begin{cfa}
     678enum { fortytwo = 42 };
     679float x[fortytwo];
     680float (*xp1)[42] = &x;    // accept
     681float (*xp2)[999] = &x;  // reject
    719682\end{cfa}
    720683My \CFA rules agree with C's on these cases.
    721684
    722 My rules classify expressions into three groups:
     685In summary, the new rules classify expressions into three groups:
    723686\begin{description}
    724687\item[Statically Evaluable]
    725688        Expressions for which a specific value can be calculated (conservatively)
    726689        at compile-time.
    727         A preexisting \CFA compiler module defines which expressions qualify,
     690        A preexisting \CFA compiler module defines which literals, enumerators, and expressions qualify,
    728691        and evaluates them.
    729         Includes literals and enumeration values.
    730692\item[Dynamic but Stable]
    731         The value of a variable declared as @const@.
    732         Includes a @const@ parameter.
     693        The value of a variable declared as @const@, including a @const@ parameter.
    733694\item[Potentially Unstable]
    734695        The catch-all category.  Notable examples include:
    735         any function-call result (@float x[foo()];@),
    736         the particular function-call result that is a pointer dereference (@void f(const int * n) { float x[*n]; }@), and
     696        any function-call result, @float x[foo()];@,
     697        the particular function-call result that is a pointer dereference, @void f(const int * n)@ @{ float x[*n]; }@, and
    737698        any use of a reference-typed variable.
    738699\end{description}
    739 
    740 My \CFA rules are:
     700Within these groups, my \CFA rules are:
    741701\begin{itemize}
    742702\item
     
    744704        Notably, this rule allows a literal to match with an enumeration value, based on the value.
    745705\item
    746         Accept a Dynamic but Stable pair, if both expressions are written out the same, e.g. refers to same variable declaration.
     706        Accept a Dynamic but Stable pair, if both expressions are written out the same, \eg refers to the same variable declaration.
    747707\item
    748708        Otherwise, reject.
    749         Notably, reject all pairs from the Potentially Unstable group.
    750         Notably, reject all pairs that cross groups.
     709        Notably, reject all pairs from the Potentially Unstable group and all pairs that cross groups.
    751710\end{itemize}
    752 
    753711The traditional C rules are:
    754712\begin{itemize}
     
    759717\end{itemize}
    760718
    761 
    762 \newcommand{\falsealarm}{{\color{orange}\small{*}}}
    763 \newcommand{\allowmisuse}{{\color{red}\textbf{!}}}
    764 \newcommand{\cmark}{\ding{51}} % from pifont
    765 \newcommand{\xmark}{\ding{55}}
    766719\begin{figure}
     720        \newcommand{\falsealarm}{{\color{blue}\small{*}}}
     721        \newcommand{\allowmisuse}{{\color{red}\textbf{!}}}
     722        \newcommand{\cmark}{\ding{51}} % from pifont
     723        \newcommand{\xmark}{\ding{55}}
     724
    767725        \begin{tabular}{@{}l@{\hspace{16pt}}c@{\hspace{8pt}}c@{\hspace{16pt}}c@{\hspace{8pt}}c@{\hspace{16pt}}c}
    768726         & \multicolumn{2}{c}{\underline{Values Equal}}
     
    778736        \end{tabular}
    779737
    780         \vspace{12pt}
    781         \noindent\textbf{Legend:}
    782         \begin{itemize}
     738        \medskip
     739        \noindent\textbf{Legend}
     740        \begin{itemize}[leftmargin=*]
    783741        \item
    784742                Each row gives the treatment of a test harness of the form
    785743                \begin{cfa}
    786                         float x[ expr1 ];
    787                         float (*xp)[ expr2 ] = &x;
     744                float x[ expr1 ];
     745                float (*xp)[ expr2 ] = &x;
    788746                \end{cfa}
    789                 where \lstinline{expr1} and \lstinline{expr2} are metavariables varying according to the row's Case.
     747                \vspace*{-10pt}
     748                where \lstinline{expr1} and \lstinline{expr2} are meta-variables varying according to the row's Case.
    790749                Each row's claim applies to other harnesses too, including,
    791750                \begin{itemize}
    792751                \item
    793                         calling a function with a paramter like \lstinline{x} and an argument of the \lstinline{xp} type,
     752                        calling a function with a parameter like \lstinline{x} and an argument of the \lstinline{xp} type,
    794753                \item
    795754                        assignment in place of initialization,
     
    801760        \item
    802761                Each case's claim is symmetric (swapping \lstinline{expr1} with \lstinline{expr2} has no effect),
    803                 even though most test harnesses are asymetric.
     762                even though most test harnesses are asymmetric.
    804763        \item
    805764                The table treats symbolic identity (Same/Different on rows)
    806                 apart from value eqality (Equal/Unequal on columns).
     765                apart from value equality (Equal/Unequal on columns).
    807766                \begin{itemize}
    808767                \item
     
    819778                while every Accept under Values Unequal is an allowed misuse (\allowmisuse).
    820779        \end{itemize}
    821         \caption{Case comparison for array type compatibility, given pairs of dimension expressions.
    822                 TODO: get Peter's LaTeX help on overall appearance, probably including column spacing/centering and bullet indentation.}
     780
     781        \caption{Case comparison for array type compatibility, given pairs of dimension expressions.}
    823782        \label{f:DimexprRuleCompare}
    824783\end{figure}
     
    826785
    827786Figure~\ref{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets.
    828 It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafely.
    829 It also shows that C-incompatibilities only occur in cases that C treats unsafely.
     787It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafe.
     788It also shows that C-incompatibilities only occur in cases that C treats unsafe.
    830789
    831790
     
    837796whose reuses are rejected by the blunt current-state rules:
    838797\begin{cfa}
    839         void f( int & nr, const int nv ) {
    840                 float x[nr];
    841                 float (*xp)[nr] = & x; // reject: nr varying (no references)
    842                 float y[nv + 1];
    843                 float (*yp)[nv + 1] = & y; // reject: ?+? unpredicable (no functions)
    844         }
     798void f( int & nr, const int nv ) {
     799        float x[nr];
     800        float (*xp)[nr] = &x;  // reject: nr varying (no references)
     801        float y[nv + 1];
     802        float (*yp)[nv + 1] = &y;   // reject: ?+? unpredictable (no functions)
     803}
    845804\end{cfa}
    846805Yet, both dimension expressions are reused safely.
    847 (The @nr@ reference is never written, not volatile
     806The @nr@ reference is never written, not volatile
    848807and control does not leave the function between the uses.
    849 The name @?+?@ resolves to a function that is quite predictable.)
    850 The programmer here can add the constant declarations:
    851 \begin{cfa}
    852         void f( int & nr, const int nv ) {
    853                 @const int nx@ = nr;
    854                 float x[nx];
    855                 float (*xp)[nx] = & x; // acept
    856                 @const int ny@ = nv + 1;
    857                 float y[ny];
    858                 float (*yp)[ny] = & y; // accept
    859         }
     808The name @?+?@ resolves to a function that is quite predictable.
     809Here, the programmer can add the constant declarations (cast does not work):
     810\begin{cfa}
     811void f( int & nr, const int nv ) {
     812        @const int nx@ = nr;
     813        float x[nx];
     814        float (*xp)[nx] = & x;   // accept
     815        @const int ny@ = nv + 1;
     816        float y[ny];
     817        float (*yp)[ny] = & y;  // accept
     818}
    860819\end{cfa}
    861820The result is the originally intended semantics,
     
    863822
    864823The snapshotting trick is also used by the translation, though to achieve a different outcome.
    865 Rather obviously, every array must be subscriptable, even a bizzarre one:
    866 \begin{cfa}
    867         array( float, rand(10) ) x;
    868         x[0];  // 10% chance of bound-check failure
     824Rather obviously, every array must be subscriptable, even a bizarre one:
     825\begin{cfa}
     826array( float, rand(10) ) x;
     827x[0];  // 10% chance of bound-check failure
    869828\end{cfa}
    870829Less obvious is that the mechanism of subscripting is a function call,
     
    874833Adjusting the example to make the function's use of length more explicit:
    875834\begin{cfa}
    876         forall ( T * )
    877         void f( T * x ) { sout | sizeof(*x); }
    878         float x[ rand(10) ];
    879         f( x );
     835forall ( T * )
     836void f( T * x ) { sout | sizeof(*x); }
     837float x[ rand(10) ];
     838f( x );
    880839\end{cfa}
    881840Considering that the partly translated function declaration is, loosely,
    882841\begin{cfa}
    883         void f( size_t __sizeof_T, void * x ) { sout | __sizeof_T; }
    884 \end{cfa}
    885 the translated call must not go like:
    886 \begin{cfa}
    887         float x[ rand(10) ];
    888         f( rand(10), &x );
    889 \end{cfa}
    890 Rather, its actual translation is like:
    891 \begin{cfa}
    892         size_t __dim_x = rand(10);
    893         float x[ __dim_x ];
    894         f( __dim_x, &x );
    895 \end{cfa}
    896 The occurrence of this dimension hoisting during translation was present in the preexisting \CFA compiler.
    897 But its cases were buggy, particularly with determining, ``Can hoisting be skipped here?''
    898 For skipping this hoisting is clearly desirable in some cases,
    899 not the least of which is when the programmer has already done so manually.
    900 My work includes getting these cases right, harmonized with the accept/reject criteria, and tested.
    901 
    902 
     842void f( size_t __sizeof_T, void * x ) { sout | __sizeof_T; }
     843\end{cfa}
     844the translation must call the dimension argument twice:
     845\begin{cfa}
     846float x[ rand(10) ];
     847f( rand(10), &x );
     848\end{cfa}
     849Rather, the translation is:
     850\begin{cfa}
     851size_t __dim_x = rand(10);
     852float x[ __dim_x ];
     853f( __dim_x, &x );
     854\end{cfa}
     855The occurrence of this dimension hoisting during translation was in the preexisting \CFA compiler.
     856But its cases were buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.
     857For example, when the programmer has already done so manually. \PAB{I don't know what this means.}
     858In the new implementation, these cases are correct, harmonized with the accept/reject criteria.
    903859
    904860TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation
    905861
    906862
    907 \section{Multidimensional Arrays}
    908 \label{toc:mdimpl}
    909 
    910 % TODO: introduce multidimensional array feature and approaches
    911 
    912 When working with arrays, \eg linear algebra, array dimensions are referred to as ``rows'' and ``columns'' for a matrix, adding ``planes'' for a cube.
    913 (There is little terminology for higher dimensional arrays.)
    914 For example, an acrostic poem\footnote{A type of poetry where the first, last or other letters in a line spell out a particular word or phrase in a vertical column.}
    915 can be treated as a grid of characters, where the rows are the text and the columns are the embedded keyword(s).
    916 Within a poem, there is the concept of a \newterm{slice}, \eg a row is a slice for the poem text, a column is a slice for a keyword.
    917 In general, the dimensioning and subscripting for multidimensional arrays has two syntactic forms: @m[r,c]@ or @m[r][c]@.
    918 
    919 Commonly, an array, matrix, or cube, is visualized (especially in mathematics) as a contiguous row, rectangle, or block.
    920 This conceptualization is reenforced by subscript ordering, \eg $m_{r,c}$ for a matrix and $c_{p,r,c}$ for a cube.
    921 Few programming languages differ from the mathematical subscript ordering.
    922 However, computer memory is flat, and hence, array forms are structured in memory as appropriate for the runtime system.
    923 The closest representation to the conceptual visualization is for an array object to be contiguous, and the language structures this memory using pointer arithmetic to access the values using various subscripts.
    924 This approach still has degrees of layout freedom, such as row or column major order, \ie juxtaposed rows or columns in memory, even when the subscript order remains fixed.
    925 For example, programming languages like MATLAB, Fortran, Julia and R store matrices in column-major order since they are commonly used for processing column-vectors in tabular data sets but retain row-major subscripting.
    926 In general, storage layout is hidden by subscripting, and only appears when passing arrays among different programming languages or accessing specific hardware.
    927 
    928 \VRef[Figure]{f:FixedVariable} shows two C90 approaches for manipulating a contiguous matrix.
    929 Note, C90 does not support VLAs.
    930 The fixed-dimension approach (left) uses the type system;
    931 however, it requires all dimensions except the first to be specified at compile time, \eg @m[][6]@, allowing all subscripting stride calculations to be generated with constants.
    932 Hence, every matrix passed to @fp1@ must have exactly 6 columns but the row size can vary.
    933 The variable-dimension approach (right) ignores (violates) the type system, \ie argument and parameters types do not match, and subscripting is performed manually using pointer arithmetic in the macro @sub@.
    934 
    935 \begin{figure}
    936 \begin{tabular}{@{}l@{\hspace{40pt}}l@{}}
    937 \multicolumn{1}{c}{\textbf{Fixed Dimension}} & \multicolumn{1}{c}{\textbf{Variable Dimension}} \\
    938 \begin{cfa}
    939 
    940 void fp1( int rows, int m[][@6@] ) {
    941         ...  printf( "%d ", @m[r][c]@ );  ...
    942 }
    943 int fm1[4][@6@], fm2[6][@6@]; // no VLA
    944 // initialize matrixes
    945 fp1( 4, fm1 ); // implicit 6 columns
    946 fp1( 6, fm2 );
    947 \end{cfa}
    948 &
    949 \begin{cfa}
    950 #define sub( m, r, c ) *(m + r * sizeof( m[0] ) + c)
    951 void fp2( int rows, int cols, int *m ) {
    952         ...  printf( "%d ", @sub( m, r, c )@ );  ...
    953 }
    954 int vm1[@4@][@4@], vm2[@6@][@8@]; // no VLA
    955 // initialize matrixes
    956 fp2( 4, 4, vm1 );
    957 fp2( 6, 8, vm2 );
    958 \end{cfa}
    959 \end{tabular}
    960 \caption{C90 Fixed \vs Variable Contiguous Matrix Styles}
    961 \label{f:FixedVariable}
    962 \end{figure}
    963 
    964 Many languages allow multidimensional arrays-of-arrays, \eg in Pascal or \CC.
    965 \begin{cquote}
    966 \begin{tabular}{@{}ll@{}}
    967 \begin{pascal}
    968 var m : array[0..4, 0..4] of Integer;  (* matrix *)
    969 type AT = array[0..4] of Integer;  (* array type *)
    970 type MT = array[0..4] of AT;  (* array of array type *)
    971 var aa : MT;  (* array of array variable *)
    972 m@[1][2]@ := 1;   aa@[1][2]@ := 1 (* same subscripting *)
    973 \end{pascal}
    974 &
    975 \begin{c++}
    976 int m[5][5];
    977 
    978 typedef vector< vector<int> > MT;
    979 MT vm( 5, vector<int>( 5 ) );
    980 m@[1][2]@ = 1;  aa@[1][2]@ = 1;
    981 \end{c++}
    982 \end{tabular}
    983 \end{cquote}
    984 The language decides if the matrix and array-of-array are laid out the same or differently.
    985 For example, an array-of-array may be an array of row pointers to arrays of columns, so the rows may not be contiguous in memory nor even the same length (triangular matrix).
    986 Regardless, there is usually a uniform subscripting syntax masking the memory layout, even though a language could differentiated between the two forms using subscript syntax, \eg @m[1,2]@ \vs @aa[1][2]@.
    987 Nevertheless, controlling memory layout can make a difference in what operations are allowed and in performance (caching/NUMA effects).
    988 
    989 C also provides non-contiguous arrays-of-arrays.
    990 \begin{cfa}
    991 int m[5][5];                                                    $\C{// contiguous}$
    992 int * aa[5];                                                    $\C{// non-contiguous}$
    993 \end{cfa}
    994 both with different memory layout using the same subscripting, and both with different degrees of issues.
    995 The focus of this work is on the contiguous multidimensional arrays in C.
    996 The reason is that programmers are often forced to use the more complex array-of-array form when a contiguous array would be simpler, faster, and safer.
    997 Nevertheless, the C array-of-array form is still important for special circumstances.
    998 
    999 \VRef[Figure]{f:ContiguousNon-contiguous} shows the extensions made in C99 for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.}
    1000 First, VLAs are supported.
    1001 Second, for contiguous arrays, C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@.
    1002 If the declaration of @fc@ is changed to:
    1003 \begin{cfa}
    1004 void fc( int rows, int cols, int m[@rows@][@cols@] ) ...
    1005 \end{cfa}
    1006 it is possible for C to perform bound checking across all subscripting, but it does not.
    1007 While this contiguous-array capability is a step forward, it is still the programmer's responsibility to manually manage the number of dimensions and their sizes, both at the function definition and call sites.
    1008 That is, the array does not automatically carry its structure and sizes for use in computing subscripts.
    1009 While the non-contiguous style in @faa@ looks very similar to @fc@, the compiler only understands the unknown-sized array of row pointers, and it relies on the programmer to traverse the columns in a row correctly with a correctly bounded loop index.
    1010 Specifically, there is no requirement that the rows are the same length, like a poem with different length lines.
    1011 
    1012 \begin{figure}
    1013 \begin{tabular}{@{}ll@{}}
    1014 \multicolumn{1}{c}{\textbf{Contiguous}} & \multicolumn{1}{c}{\textbf{ Non-contiguous}} \\
    1015 \begin{cfa}
    1016 void fc( int rows, @int cols@, int m[ /* rows */ ][@cols@] ) {
    1017         ...  printf( "%d ", @m[r][c]@ );  ...
    1018 }
    1019 int m@[5][5]@;
    1020 for ( int r = 0; r < 5; r += 1 ) {
    1021 
    1022         for ( int c = 0; c < 5; c += 1 )
    1023                 m[r][c] = r + c;
    1024 }
    1025 fc( 5, 5, m );
    1026 \end{cfa}
    1027 &
    1028 \begin{cfa}
    1029 void faa( int rows, int cols, int * m[ @/* cols */@ ] ) {
    1030         ...  printf( "%d ", @m[r][c]@ );  ...
    1031 }
    1032 int @* aa[5]@;  // row pointers
    1033 for ( int r = 0; r < 5; r += 1 ) {
    1034         @aa[r] = malloc( 5 * sizeof(int) );@ // create rows
    1035         for ( int c = 0; c < 5; c += 1 )
    1036                 aa[r][c] = r + c;
    1037 }
    1038 faa( 5, 5, aa );
    1039 \end{cfa}
    1040 \end{tabular}
    1041 \caption{C99 Contiguous \vs Non-contiguous Matrix Styles}
    1042 \label{f:ContiguousNon-contiguous}
    1043 \end{figure}
    1044 
    1045 
    1046 \subsection{Multidimensional array implementation}
     863\section{Multidimensional array implementation}
    1047864\label{s:ArrayMdImpl}
    1048865
     
    12211038        S & | sized(S),                 $\C{// masquerading-as}$
    12221039        Timmed &,                               $\C{// immediate element, often another array}$
    1223         Tbase &                                 $\C{// base element, e.g. float, never array}$
     1040        Tbase &                                 $\C{// base element, \eg float, never array}$
    12241041) { // distribute forall to each element
    12251042        struct arpk {
  • TabularUnified doc/theses/mike_brooks_MMath/background.tex

    r63a7394 rc3e41cda  
    178178\lstinput{34-34}{bkgd-carray-arrty.c}
    179179The inspection begins by using @sizeof@ to provide program semantics for the intuition of an expression's type.
    180 An architecture with 64-bit pointer size is used, to keep irrelevant details fixed.
     180An architecture with 64-bit pointer size is used, to remove irrelevant details.
    181181\lstinput{35-36}{bkgd-carray-arrty.c}
    182182Now consider the @sizeof@ expressions derived from @ar@, modified by adding pointer-to and first-element (and including unnecessary parentheses to avoid any confusion about precedence).
     
    215215
    216216My observation is recognizing:
    217 \begin{itemize}[leftmargin=*,topsep=0pt,itemsep=0pt]
     217\begin{itemize}[leftmargin=*,itemsep=0pt]
    218218        \item There is value in using a type that knows its size.
    219219        \item The type pointer to the (first) element does not.
     
    302302
    303303In summary, when a function is written with an array-typed parameter,
    304 \begin{itemize}[leftmargin=*,topsep=0pt]
     304\begin{itemize}[leftmargin=*]
    305305        \item an appearance of passing an array by value is always an incorrect understanding,
    306306        \item a dimension value, if any is present, is ignored,
     
    532532\subsection{Array Parameter Declaration}
    533533
    534 Passing an array along with a function call is obviously useful.
    535 Let us say that a parameter is an array parameter when the called function intends to subscript it.
    536 This section asserts that a more satisfactory/formal characterization does not exist in C, surveys the ways that C API authors communicate ``@p@ has zero or more @T@s,'' and calls out the minority cases where the C type system is using or verifying such claims.
    537 
    538 A C function's parameter declarations look different, from the caller's and callee's perspectives.
     534Passing an array as an argument to a function is necessary.
     535Assume a parameter is an array when the function intends to subscript it.
     536This section asserts that a more satisfactory/formal characterization does not exist in C, surveys the ways that C API authors communicate ``@p@ has zero or more dimensions'' and calls out the minority cases where the C type system is using or verifying such claims.
     537
     538A C parameter declarations look different, from the caller's and callee's perspectives.
    539539Both perspectives consist of the text read by a programmer and the semantics enforced by the type system.
    540 The caller's perspecitve is available from a mere function declaration (which allow definition-before-use and separate compilation), but can also be read from (the non-body part of) a function definition.
     540The caller's perspective is available from a function declaration, which allow definition-before-use and separate compilation, but can also be read from (the non-body part of) a function definition.
    541541The callee's perspective is what is available inside the function.
    542542\begin{cfa}
    543         int foo( int, float, char );                            $\C{// declaration, names optional}$
    544         int bar( int i, float f, char c ) {             $\C{// definition, names mandatory}$
    545                 $/* caller's perspective of foo's; callee's perspective of bar's */$
    546                 ...
    547         }
    548         $/* caller's persepectives of foo's and bar's */$
    549 \end{cfa}
    550 The caller's perspective is more limited.
    551 The example shows, so far, that parameter names (by virtue of being optional) are really comments in the caller's perspective, while they are semantically significant in the callee's perspective.
     543int foo( int, float, char );                            $\C{// declaration, names optional}$
     544int bar( int i, float f, char c ) {             $\C{// definition, names mandatory}$
     545        // caller's perspective of foo; callee's perspective of bar
     546}
     547// caller's perspectives of foo's and bar's
     548\end{cfa}
     549In caller's perspective, the parameter names (by virtue of being optional) are really comments;
     550in the callee's perspective, parameter names are semantically significant.
    552551Array parameters introduce a further, subtle, semantic difference and considerable freedom to comment.
    553552
    554 At the semantic level, there is no such thing as an array parameter, except for one case (@T[static 5]@) discussed shortly.
     553At the semantic level, there is no such thing as an array parameter, except for one case (@T [static 5]@) discussed shortly.
    555554Rather, there are only pointer parameters.
    556 This fact probably shares considerable responsibility for the common sense of ``an array is just a pointer,'' wich has been refuted in non-parameter contexts.
     555This fact probably shares considerable responsibility for the common sense of ``an array is just a pointer,'' which has been refuted in non-parameter contexts.
    557556This fact holds in both the caller's and callee's perspectives.
    558 However, a parameter's type can include ``array of.''
    559 For example, the type ``pointer to array of 5 ints'' (@T(*)[5]@) is a pointer type, a fully meaningful parameter type (in the sense that this description does not contain any information that the type system ignores), and a type that appears the same in the caller's \vs callee's perspectives.
    560 The outermost type constructor (syntactically first dimension) is really the one that determines the flavour of parameter.
     557However, a parameter's type can include ``array of.'', \eg the type ``pointer to array of 5 ints'' (@T (*)[5]@) is a pointer type.
     558This type is fully meaningful in the sense that its description does not contain any information that the type system ignores, and the type appears the same in the caller's \vs callee's perspectives.
     559In fact, the outermost type constructor (syntactically first dimension) is really the one that determines the flavour of parameter.
     560
     561Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment.
     562An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}.
     563The rationale for rejecting the first ``invalid'' row follows shortly, while the second ``invalid'' row is simple nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
    561564
    562565\begin{figure}
     
    596599\end{tabular}
    597600\end{cquote}
    598 \caption{Multiple ways to declare an arrray parameter.  Across a valid row, every declaration is equivalent.  Each column gives a declaration style.  Really, the style can be read from the first row only.  The second row shows how the style extends to multiple dimensions, with the rows thereafter providing context for the choice of which second-row \lstinline{[]}receives the column-style variation.}
     601\caption{Multiple ways to declare an array parameter.
     602Across a valid row, every declaration is equivalent.
     603Each column gives a declaration style, where the style for that column is read from the first row.
     604The second row begins the style for multiple dimensions, with the rows thereafter providing context for the choice of which second-row \lstinline{[]} receives the column-style variation.}
    599605\label{f:ArParmEquivDecl}
    600606\end{figure}
    601607
    602 Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment.
    603 An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}.  The rationale for rejecting the first ``invalid'' row follows shortly, while the second ``invalid'' row is simple nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves.
    604 
    605 In the lefmost style, the typechecker ignores the actual value in most practical cases.
    606 This value is allowed to be a dynamic expression, so it is \emph{possible} to use the leftmost style in many practical cases.
     608In the leftmost style, the typechecker ignores the actual value in most practical cases.
     609This value is allowed to be a dynamic expression, and then it has practical cases.
     610\begin{cfa}
     611void foo( int @n@ ) {
     612        float _42( float @a[n]@ ) {    // nested function
     613                a[0] = 42;
     614        }
     615        float b[n];
     616        _42( b );
     617}
     618\end{cfa}
     619
    607620
    608621% To help contextualize the matrix part of this example, the syntaxes @float [5][]@, @float [][]@ and @float (*)[]@ are all rejected, for reasons discussed shortly.
    609622% So are @float[5]*@, @float[]*@ and @float (*)*@.  These latter ones are simply nonsense, though they hint at ``1d array of pointers'', whose equivalent syntax options are, @float *[5]@, @float *[]@, and @float **@.
    610623
    611 It is a matter of taste as to whether a programmer should use a form as far left as possible (getting the most out of syntactically integrated comments), sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript this one'').
    612 
    613 Note that this equivalence of pointer and array declarations is special to paramters.
     624It is a matter of taste as to whether a programmer should use a form as far left as possible (getting the most out of possible subscripting and dimension sizes), sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript).
     625
     626Note that this equivalence of pointer and array declarations is special to parameters.
    614627It does not apply to local variables, where true array declarations are possible.
    615628\begin{cfa}
     
    626639float sum( float v[] );
    627640float arg = 3.14;
    628 sum( &arg );                                                            $\C{// accepted, v := \&arg}$
     641sum( &arg );                                                            $\C{// accepted, v = \&arg}$
    629642\end{cfa}
    630643
     
    674687The last observation is a fact of the callee's perspective.
    675688There is little type-system checking, in the caller's perspective, that what is being passed, matches.
     689\PAB{I do not understand}
    676690\begin{cfa}
    677691void f( float [][10] );
     
    680694f(&a); // reject
    681695f(&b); // accept
     696
     697void foo() {
     698        void f( float [][10] );
     699        int n = 100;
     700        float a[100], b[3][12], c[n], d[n][n];
     701        f( a );    // reject
     702        f( b );    // reject
     703        f( c );    // reject
     704        f( d );    // accept
     705        f( &a );   // reject
     706        f( &b );   // reject
     707        f( &c );   // accept
     708        f( &d );   // reject
     709}
    682710\end{cfa}
    683711This size is therefore, a callee's assumption.
    684712
    685 Finally, to handle higher-dimensional VLAs, C repurposed the @*@ \emph{within} the dimension in a declaration to mean that the callee will have make an assumption about the size here, but no (unchecked, possibly wrong) information about this assumption is included for the caller-programmer's benefit/overconfidence.
     713Finally, to handle higher-dimensional VLAs, C repurposed the @*@ \emph{within} the dimension in a declaration to mean that the callee has make an assumption about the size, but no (unchecked, possibly wrong) information about this assumption is included for the caller-programmer's benefit/over-confidence.
    686714\begin{cquote}
    687715@[@ \textit{type-qualifier-list$_{opt}$} @* ]@
  • TabularUnified doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa

    r63a7394 rc3e41cda  
    3030forall( [C], [S] )
    3131int getPref( @School( C, S ) & school@, int is, int pref ) {
    32     for ( ic; C ) {
    33         int curPref = @school.preferences@[ic][is];   $\C{// offset calculation implicit}$
     32        for ( ic; C ) {
     33                int curPref = @school.preferences@[ic][is];   $\C{// offset calculation implicit}$
    3434                if ( curPref == pref ) return ic;
    3535        }
    36     assert( false );
     36        assert( false );
    3737}
    3838
     
    5959
    6060        {       string sv;
    61         int iv;
    62         // headers' row
    63         sin | "\nc\\s";
    64         for ( is ; ns ) {
    65             // column label
    66             sin | sv;
    67             school.student_ids[is] = sv;
    68         }
    69         // body rows
    70         for ( ic ; nc ) {
    71             // row label
    72             sin | sv;
    73             school.course_codes[ic] = sv;
    74             for ( is ; ns ) {
    75                 // matrix item
    76                 sin | iv;
    77                 school.preferences[ic][is] = iv;
    78             }
    79         }
    80     }
     61                int iv;
     62                // headers' row
     63                sin | "\nc\\s";
     64                for ( is ; ns ) {
     65                        // column label
     66                        sin | sv;
     67                        school.student_ids[is] = sv;
     68                }
     69                // body rows
     70                for ( ic ; nc ) {
     71                        // row label
     72                        sin | sv;
     73                        school.course_codes[ic] = sv;
     74                        for ( is ; ns ) {
     75                                // matrix item
     76                                sin | iv;
     77                                school.preferences[ic][is] = iv;
     78                        }
     79                }
     80        }
    8181
    8282
     
    9191                sout | school.student_ids[is] | ": " | nonl;
    9292                for ( pref; 1 ~= nc ) {
    93             int ic = getPref(school, is, pref);
    94             sout | school.course_codes[ ic ] | nonl;
     93                        int ic = getPref(school, is, pref);
     94                        sout | school.course_codes[ ic ] | nonl;
    9595                }
    9696                sout | nl;
  • TabularUnified doc/theses/mike_brooks_MMath/programs/hello-array.cfa

    r63a7394 rc3e41cda  
    114114        f( y, y );              $\C{// ok}$
    115115        if ( M == N )
    116                 f( x, @(array( float, M ) &)@y ); $\C{// ok}$
     116                f( x, @(array( float, M ) &)@y ); $\C{// ok}\CRT$
    117117}
    118118
  • TabularUnified doc/theses/mike_brooks_MMath/programs/sharing-demo.cfa

    r63a7394 rc3e41cda  
    108108        assert( s2 == "bbb" );
    109109        sout | xstr(S2bbb) | "\t& " | s1 | "\t& " | s1a | "\t& " | s2 | "\t\\\\";
    110      
    111     #define S2S1a s2  = s1a
     110
     111        #define S2S1a s2  = s1a
    112112        S2S1a;
    113113        assert( s1 == "aaa" );
     
    122122        assert( s2 == "ccc" );
    123123        sout | xstr(S2ccc) | "\t& " | s1 | "\t& " | s1a | "\t& " | s2 | "\t\\\\";
    124      
     124
    125125        #define S1xxx s1  = "xxx"
    126126        S1xxx;
     
    149149
    150150        #define D2_s2_s1 string s2     = s1(1,2)
    151         D2_s2_s1;     
     151        D2_s2_s1;
    152152        assert( s1 == "abcd" );
    153153        assert( s1_mid == "bc" );
     
    157157        sout | "\\par\\noindent";
    158158
    159     sout | "Again, @`shareEdits@ passes changes in both directions; copy does not.  Note the difference in index values, with the \\emph{b} position being 1 in the longer string and 0 in the shorter strings.  In the case of s1 aliasing with @s1_mid@, the very same character is being accessed by different postitions.";
     159        sout | "Again, @`shareEdits@ passes changes in both directions; copy does not.  Note the difference in index values, with the \\emph{b} position being 1 in the longer string and 0 in the shorter strings.  In the case of s1 aliasing with @s1_mid@, the very same character is being accessed by different postitions.";
    160160        sout | "\\par\\noindent";
    161161        sout | "\\begin{tabular}{llll}";
     
    176176        assert( s2 == "bc" );
    177177        sout | xstr(D2_s1mid_minus) | "\t& " | s1 | "\t& " | s1_mid | "\t& " | s2 | "\t\\\\";
    178      
    179     #define D2_s2_pipe s2    [0] = '|'
     178
     179        #define D2_s2_pipe s2    [0] = '|'
    180180        D2_s2_pipe;
    181181        assert( s1 == "a-cd" );
     
    186186        sout | "\\par\\noindent";
    187187
    188     sout | "Once again, assignment of a value is a modificiation that flows through the aliasing relationship, without affecting its structure.";
     188        sout | "Once again, assignment of a value is a modificiation that flows through the aliasing relationship, without affecting its structure.";
    189189        sout | "\\par\\noindent";
    190190        sout | "\\begin{tabular}{llll}";
     
    198198        assert( s2 == "|c" );
    199199        sout | xstr(D2_s1mid_ff) | "\t& " | s1 | "\t& " | s1_mid | "\t& " | s2 | "\t\\\\";
    200      
     200
    201201        #define D2_s2_gg s2     = "gg"
    202202        D2_s2_gg;
     
    208208        sout | "\\par\\noindent";
    209209
    210     sout | "In the \\emph{ff} step, which is a positive example of flow across an aliasing relationship, the result is straightforward to accept because the flow direction is from contained (small) to containing (large).  The following rules for edits through aliasing substrings will guide how to flow in the opposite direction.";
     210        sout | "In the \\emph{ff} step, which is a positive example of flow across an aliasing relationship, the result is straightforward to accept because the flow direction is from contained (small) to containing (large).  The following rules for edits through aliasing substrings will guide how to flow in the opposite direction.";
    211211        sout | "\\par";
    212212
    213213
    214     sout | "Growth and shrinkage are natural extensions.  An empty substring is a real thing, at a well-defined location, whose meaning is extrapolated from the examples so far.  The intended metaphor is to operating a GUI text editor.  Having an aliasing substring is like using the mouse to select a few words.  Assigning onto an aliasign substring is like typing from having a few words selected:  depending how much you type, the file being edited can get shorter or longer.";
     214        sout | "Growth and shrinkage are natural extensions.  An empty substring is a real thing, at a well-defined location, whose meaning is extrapolated from the examples so far.  The intended metaphor is to operating a GUI text editor.  Having an aliasing substring is like using the mouse to select a few words.  Assigning onto an aliasign substring is like typing from having a few words selected:  depending how much you type, the file being edited can get shorter or longer.";
    215215        sout | "\\par\\noindent";
    216216        sout | "\\begin{tabular}{lll}";
     
    219219
    220220        assert( s1 == "affd" );
    221 //      assert( s1_mid == "fc" );                                                    // ????????? bug?
     221//      assert( s1_mid == "fc" );                                                                                                        // ????????? bug?
    222222        sout | xstr(D2_s2_gg) | "\t& " | s1 | "\t& " | s1_mid | "\t\\\\";
    223223
     
    227227        assert( s1_mid == "hhhh" );
    228228        sout  | xstr(D2_s1mid_hhhh)  | "\t& " | s1 | "\t& " | s1_mid | "\t\\\\";
    229      
     229
    230230        #define D2_s1mid_i s1_mid = "i"
    231231        D2_s1mid_i;
     
    248248        sout | "\\par\\noindent";
    249249
    250     sout | "Multiple portions can be aliased.  When there are several aliasing substrings at once, the text editor analogy becomes an online multi-user editor.  I should be able to edit a paragraph in one place (changing the document's length), without my edits affecting which letters are within a mouse-selection that you had made previously, somewhere else.";
     250        sout | "Multiple portions can be aliased.  When there are several aliasing substrings at once, the text editor analogy becomes an online multi-user editor.  I should be able to edit a paragraph in one place (changing the document's length), without my edits affecting which letters are within a mouse-selection that you had made previously, somewhere else.";
    251251        sout | "\\par\\noindent";
    252252        sout | "\\begin{tabular}{lllll}";
     
    264264        assert( s1_end == "d" );
    265265        sout  | xstr(D2_s1end_s1)  | "\t& " | s1 | "\t& " | s1_bgn | "\t& " | s1_mid | "\t& " | s1_end | "\t\\\\";
    266      
     266
    267267        #define D1_s1bgn_zzz s1_bgn = "zzzz"
    268268        D1_s1bgn_zzz;
     
    275275        sout | "\\par\\noindent";
    276276
    277     sout | "When an edit happens on an aliasing substring that overlaps another, an effect is unavoidable.  Here, the passive party sees its selection shortened, to exclude the characters that were not part of the original selection.";
     277        sout | "When an edit happens on an aliasing substring that overlaps another, an effect is unavoidable.  Here, the passive party sees its selection shortened, to exclude the characters that were not part of the original selection.";
    278278        sout | "\\par\\noindent";
    279279        sout | "\\begin{tabular}{llllll}";
     
    286286        assert( s1_crs == "zj" );
    287287        assert( s1_mid == "jj" );
    288         assert( s1_end == "d" ); 
     288        assert( s1_end == "d" );
    289289        sout  | xstr(D2_s1crs_s1)  | "\t& " | s1 | "\t& " | s1_bgn | "\t& " | s1_crs | "\t& " | s1_mid | "\t& " | s1_end | "\t\\\\";
    290290
     
    301301        sout | "TODO: finish typesetting the demo";
    302302
    303     // "This shortening behaviour means that a modification has to occur entirely inside a substring, to show up in that substring.  Sharing changes through the intersection of partially overlapping aliases is still possible, so long as the receiver's boundary is not inside the edit."
     303        // "This shortening behaviour means that a modification has to occur entirely inside a substring, to show up in that substring.  Sharing changes through the intersection of partially overlapping aliases is still possible, so long as the receiver's boundary is not inside the edit."
    304304
    305305        string word = "Phi";
     
    309309        assert( consonants == "Ph" );
    310310        assert( miniscules == "hi" );
    311      
     311
    312312        consonants[1] = 's';
    313313        assert( word == "Psi" );
     
    321321        string greet_bgn = all(10,1)`shareEdits;
    322322        string greet_end = all(14,1)`shareEdits;
    323      
     323
    324324        assert( all == "They said hello again" );
    325325        assert( greet == "hello" );
    326326        assert( greet_bgn == "h" );
    327327        assert( greet_end == "o" );
    328      
     328
    329329
    330330        greet = "sup";
     
    333333        // assert( greet_bgn == "" );    ------ Should be so, but fails
    334334        // assert( greet_end == "" );
    335      
    336 
    337  
    338 
    339  
    340     /* As in the earlier step where \emph{aj} becomes \emph{ajjd}, such empty substrings maintain their places in the total string, and can be used for filling it.  Because @greet_bgn@ was orginally at the start of the edit, in the outcome, the empty @greet_bgn@ sits just before the written value.  Similarly @greed_end@ goes after.  Though not shown, an overwritten substring at neither side goes arbitrarily to the before side. */
    341  
    342 
    343  
    344 
    345         greet_bgn = "what"; 
    346      
    347      
     335
     336
     337
     338
     339
     340        /* As in the earlier step where \emph{aj} becomes \emph{ajjd}, such empty substrings maintain their places in the total string, and can be used for filling it.  Because @greet_bgn@ was orginally at the start of the edit, in the outcome, the empty @greet_bgn@ sits just before the written value.  Similarly @greed_end@ goes after.  Though not shown, an overwritten substring at neither side goes arbitrarily to the before side. */
     341
     342
     343
     344
     345        greet_bgn = "what";
     346
     347
    348348        assert( all == "They said whatsup again" );
    349      
     349
    350350        assert( greet == "sup" );
    351      
     351
    352352        assert( greet_bgn == "what" );
    353353        // assert( greet_end == "" );    ------ Should be so, but fails
    354      
    355 
    356         greet_end = "..."; 
    357      
    358      
     354
     355
     356        greet_end = "...";
     357
     358
    359359        assert( all == "They said whatsup... again" );
    360      
     360
    361361        assert( greet == "sup" );
    362      
     362
    363363        assert( greet_bgn == "what" );
    364      
     364
    365365        assert( greet_end == "..." );
    366      
    367 
    368  
    369 
    370  
    371     /* Though these empty substrings hold their places in the total string, an empty string only belongs to bigger strings when it occurs completely inside them.  There is no such state as including an empty substring at an edge.  For this reason, @word@ gains the characters added by assigning to @greet_bgn@ and @greet_end@, but the string @greet@ does not. */
    372  
     366
     367
     368
     369
     370
     371        /* Though these empty substrings hold their places in the total string, an empty string only belongs to bigger strings when it occurs completely inside them.  There is no such state as including an empty substring at an edge.  For this reason, @word@ gains the characters added by assigning to @greet_bgn@ and @greet_end@, but the string @greet@ does not. */
     372
    373373
    374374}
     
    377377int main(int argc, char ** argv) {
    378378
    379     demo1();
    380     demo2();
    381     printf("%% %s done running\n", argv[0]);
     379        demo1();
     380        demo2();
     381        printf("%% %s done running\n", argv[0]);
    382382}
  • TabularUnified doc/theses/mike_brooks_MMath/uw-ethesis.tex

    r63a7394 rc3e41cda  
    102102\input{common}
    103103%\usepackage{common}
     104
    104105\CFAStyle                                               % CFA code-style
    105106\lstset{language=cfa,belowskip=-1pt} % set default language to CFA
Note: See TracChangeset for help on using the changeset viewer.