Changeset c3e41cda
- Timestamp:
- Nov 3, 2024, 9:13:16 PM (4 months ago)
- Branches:
- master
- Children:
- 7d95bce9
- Parents:
- 63a7394
- Location:
- doc/theses/mike_brooks_MMath
- Files:
-
- 6 edited
Legend:
- Unmodified
- Added
- Removed
-
TabularUnified doc/theses/mike_brooks_MMath/array.tex ¶
r63a7394 rc3e41cda 2 2 \label{c:Array} 3 3 4 Arrays in C are possibly the single most misunderstood and incorrectly used feature in the language, resulting in the largest proportion of runtime errors and security violations. 5 This chapter describes the new \CFA language and library features that introduce a length-checked array type, @array@, to the \CFA standard library~\cite{Cforall}. 6 7 Offering the @array@ type, as a distinct alternative to the C array, is consistent with \CFA's goal of backwards compatibility, \ie virtually all existing C (@gcc@) programs can be compiled by \CFA with only a small number of changes, similar to \CC (@g++@). 8 However, a few compatibility-breaking changes to the behaviour of the C array are necessary, both as an implementation convenience and to fix C's lax treatment of arrays. 9 Hence, the @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features, making it unnecessary to deal with every inherited complexity of the C array. 10 4 11 5 12 \section{Introduction} 6 13 \label{s:ArrayIntro} 7 14 8 Arrays in C are possibly the single most misunderstood and incorrectly used feature in the language, resulting in the largest proportion of runtime errors and security violations. 9 This chapter describes the new \CFA language and library features that introduce a length-checked array type to the \CFA standard library~\cite{Cforall}. 10 11 Specifically, a new \CFA array is declared by instantiating the generic @array@ type, 12 much like instantiating any other standard-library generic type (such as @dlist@), 15 The new \CFA array is declared by instantiating the generic @array@ type, 16 much like instantiating any other standard-library generic type (such as \CC @vector@), 13 17 though using a new style of generic parameter. 14 18 \begin{cfa} … … 16 20 \end{cfa} 17 21 Here, the arguments to the @array@ type are @float@ (element type) and @99@ (length). 18 When this type is used as a function parameter, the type-system requires that a call's argument matches, down to the length.22 When this type is used as a function parameter, the type-system requires that a call's argument is a perfect match. 19 23 \begin{cfa} 20 24 void f( @array( float, 42 )@ & p ) {} $\C{// p accepts 42 floats}$ 21 f( x ); $\C{// statically rejected: type s are different, 99 != 42}$25 f( x ); $\C{// statically rejected: type lengths are different, 99 != 42}$ 22 26 23 27 test2.cfa:3:1 error: Invalid application of existing declaration(s) in expression. … … 25 29 \end{cfa} 26 30 Here, the function @f@'s parameter @p@ is declared with length 42. 27 The call @f( x )@, with the argument being the previously-declared object, is invalid, because the @array@ lengths @99@ and @42@ do not match. 28 29 A function declaration can be polymorphic over these @array@ arguments by using the @forall@ declaration prefix. 30 This function @g@'s takes arbitrary type parameter @T@ (familiar) and \emph{dimension parameter} @N@ (new). 31 A dimension paramter represents a to-be-determined count of elements, managed by the type system. 31 However, the call @f( x )@ is invalid, because @x@'s length is @99@, which does not match @42@. 32 33 A function declaration can be polymorphic over these @array@ arguments by using the \CFA @forall@ declaration prefix. 32 34 \begin{cfa} 33 35 forall( T, @[N]@ ) … … 40 42 Cforall Runtime error: subscript 1000 exceeds dimension range [0,99) $for$ array 0x555555558020. 41 43 \end{cfa} 44 Function @g@ takes an arbitrary type parameter @T@ and a \emph{dimension parameter} @N@. 45 A dimension parameter represents a to-be-determined count of elements, managed by the type system. 42 46 The call @g( x, 0 )@ is valid because @g@ accepts any length of array, where the type system infers @float@ for @T@ and length @99@ for @N@. 43 Inferring values for @T@ and @N@ is implicit , without programmer involvement.47 Inferring values for @T@ and @N@ is implicit. 44 48 Furthermore, in this case, the runtime subscript @x[0]@ (parameter @i@ being @0@) in @g@ is valid because 0 is in the dimension range $[0,99)$ of argument @x@. 45 The call @g( x, 1000 )@ is also accepted through compile time;49 However, the call @g( x, 1000 )@ is also accepted through compile time; 46 50 however, this case's subscript, @x[1000]@, generates an error, because @1000@ is outside the dimension range $[0,99)$ of argument @x@. 51 In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function. 52 The syntactic form is chosen to parallel other @forall@ forms: 53 \begin{cfa} 54 forall( @[N]@ ) ... $\C[1.5in]{// dimension}$ 55 forall( T ) ... $\C{// value datatype (formerly, "otype")}$ 56 forall( T & ) ... $\C{// opaque datatype (formerly, "dtype")}\CRT$ 57 \end{cfa} 58 % The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance. 47 59 48 60 The generic @array@ type is comparable to the C array type, which \CFA inherits from C. 49 61 Their runtime characteristics are often identical, and some features are available in both. 50 For example, assume a caller instantiates @N@ with 42 (discussion about how to follow) in:62 For example, assume a caller has an argument that instantiates @N@ with 42. 51 63 \begin{cfa} 52 64 forall( [N] ) 53 void declDemo( ) {65 void declDemo( ... ) { 54 66 float x1[N]; $\C{// built-in type ("C array")}$ 55 67 array(float, N) x2; $\C{// type from library}$ … … 59 71 The two variables have identical size and layout; they both encapsulate 42-float stack allocations, with no additional ``bookkeeping'' allocations or headers. 60 72 Providing this explicit generic approach requires a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA. 61 62 Admittedly, the @array@ library type (type for @x2@) is syntactically different from its C counterpart. 63 A future goal (TODO xref) is to provide the new features upon a built-in type whose syntax approaches C's (declaration style of @x1@). 73 In all following discussion, ``C array'' means the types like that of @x@ and ``\CFA array'' means the standard-library @array@ type (instantiations), like the type of @x2@. 74 75 Admittedly, the @array@ library type for @x2@ is syntactically different from its C counterpart. 76 A future goal (TODO xref) is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@). 64 77 Then, the library @array@ type could be removed, giving \CFA a largely uniform array type. 65 At present, the C-syntax array gets partial support for the new features, so the generic @array@ is used exclusively when introducing features;78 At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis; 66 79 feature support and C compatibility are revisited in Section ? TODO. 67 68 Offering the @array@ type, as a distinct alternative to the C array, is consistent with \CFA's goal of backwards compatibility, \ie virtually all existing C (@gcc@) programs can be compiled by \CFA with only a small number of changes, similar to \CC (@g++@).69 However, a few compatibility-breaking changes to the behaviour of the C array are necessary, both as an implementation convenience and to fix C's lax treatment of arrays.70 Hence, the @array@ type is an opportunity to start from a clean slate and show a cohesive selection of features, making it unnecessary to deal with every inherited complexity of the C array.71 72 In all discussion following, ``C array'' means the types like that of @x@ and ``\CFA array'' means the standard-library @array@ type (instantiations), like the type of @x2@.73 80 74 81 My contributions in this chapter are: … … 83 90 84 91 85 \section{Definitions and design considerations} 86 87 88 \subsection{Dependent typing} 89 90 91 92 General dependent typing allows the type system to encode arbitrary predicates (e.g. behavioural specifications for functions), 92 \section{Dependent typing} 93 94 General dependent typing allows the type system to encode arbitrary predicates (\eg behavioural specifications for functions), 93 95 which is an anti-goal for my work. 94 96 Firstly, this application is strongly associated with pure functional languages, … … 101 103 102 104 103 104 105 \section{Features added} 105 106 … … 109 110 By declaring type variables at the front of object declarations, an array dimension is lexically referenceable where it is needed. 110 111 For example, a declaration can share one length, @N@, among a pair of parameters and the return, 111 meaning that it requires both input arrays to be of the same length, and guarantees that the result with beof that length as well.112 meaning that it requires both input arrays to be of the same length, and guarantees that the result is of that length as well. 112 113 \lstinput{10-17}{hello-array.cfa} 113 This function @f@ does a pointwise comparison of its two input arrays, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array. 114 The dynamic allocation of the @ret@ array by preexisting @alloc@ uses the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type. 115 Note that alloc only sees one whole type for its @T@ (which is @f@'s @array(bool, N)@); this type's size is a computation based on @N@. 116 \begin{cfa} 117 // simplification 118 static inline forall( T & | sized(T) ) 114 Function @f@ does a pointwise comparison of its two input arrays, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array. 115 The dynamic allocation of the @ret@ array, by the library @alloc@ function, 116 \begin{cfa} 117 forall( T & | sized(T) ) 119 118 T * alloc() { 120 return (T *)malloc( sizeof(T) ); 121 } 122 \end{cfa} 123 This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary @sized@ assertions needed by other types. 124 (@sized@ implies a concrete \vs abstract type with a runtime-available size, exposed as @sizeof@.) 119 return @(T *)@malloc( @sizeof(T)@ ); 120 } 121 \end{cfa} 122 uses the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type. 123 Note that @alloc@ only sees one whole type for its @T@ (which is @f@'s @array(bool, N)@); this type's size is a computation based on @N@. 124 This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary \emph{sized} assertions needed by other types. 125 (\emph{sized} implies a concrete \vs abstract type with a runtime-available size, exposed as @sizeof@.) 125 126 As a result, there is significant programming safety by making the size accessible and implicit, compared with C's @calloc@ and non-array supporting @memalign@, which take an explicit length parameter not managed by the type system. 126 127 … … 142 143 The result is a significant improvement in safety and usability. 143 144 144 In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function.145 The syntactic form is chosen to parallel other @forall@ forms:146 \begin{cfa}147 forall( @[N]@ ) ... $\C[1.5in]{// dimension}$148 forall( T & ) ... $\C{// opaque datatype (formerly, "dtype")}$149 forall( T ) ... $\C{// value datatype (formerly, "otype")}\CRT$150 \end{cfa}151 % The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance.152 145 In summary: 153 146 \begin{itemize} … … 168 161 % agreed, though already said 169 162 \item 170 \CC does not allow a template function to be nested, while \CFA le sts its polymorphic functions to be nested.163 \CC does not allow a template function to be nested, while \CFA lets its polymorphic functions to be nested. 171 164 % why is this important? 172 165 \item … … 227 220 \end{cfa} 228 221 \end{tabular} 229 \caption{\lstinline{N}-style param ters, for \CC template \vs \CFA generic type }222 \caption{\lstinline{N}-style parameters, for \CC template \vs \CFA generic type } 230 223 \label{f:TemplateVsGenericType} 231 224 \end{figure} 232 225 233 226 Just as the first example in \VRef[Section]{s:ArrayIntro} shows a compile-time rejection of a length mismatch, 234 so are length mismatches stopped when they inv love dimension parameters.227 so are length mismatches stopped when they involve dimension parameters. 235 228 While \VRef[Figure]{f:fHarness} shows successfully calling a function @f@ expecting two arrays of the same length, 236 229 \begin{cfa} 237 230 array( bool, N ) & f( array( float, N ) &, array( float, N ) & ); 238 231 \end{cfa} 239 a static rejection occurs when attempting to call @f@ with arrays of potentiallydiffering lengths.232 a static rejection occurs when attempting to call @f@ with arrays of differing lengths. 240 233 \lstinput[tabsize=1]{70-74}{hello-array.cfa} 241 234 When the argument lengths themselves are statically unknown, … … 252 245 Orthogonally, the \CFA array type works within generic \emph{types}, \ie @forall@-on-@struct@. 253 246 The same argument safety and the associated implicit communication of array length occurs. 254 Preexisting \CFA allowed aggregate types to be generalized with type parameters, enabling parameterizing forelement types.247 Preexisting \CFA allowed aggregate types to be generalized with type parameters, enabling parameterizing of element types. 255 248 Now, \CFA also allows parameterizing them by length. 256 249 Doing so gives a refinement of C's ``flexible array member'' pattern[TODO: cite ARM 6.7.2.1 pp18]\cite{arr:gnu-flex-mbr}. … … 259 252 This flexibility, in turn, allows for multiple array members. 260 253 \lstinput{10-15}{hello-accordion.cfa} 261 This structure's layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters. 262 263 The school example has the data structure capturing many students' course-preference forms. 264 It has course- and student-level metadata (their respective display names) and a position-based preferecens' matrix. 265 The input files in \VRef[Figure]{f:checkHarness} give example data. 266 267 When a function operates on a @School@ structure, the type system handles its memory layout transparently. 268 \lstinput{30-37}{hello-accordion.cfa} 269 In the running example, this @getPref@ function answers, 270 for the student at position @sIx@, what is the position of its @pref@\textsuperscript{th}-favoured class? 271 272 \VRef[Figure]{f:checkHarness} shows the @School@ harness and results with different array sizes. 273 This example program prints the courses in each student's preferred order, all using the looked-up display names. 274 Note the declaration of the @school@ variable. 254 The structure has course- and student-level metatdata (their respective field names) and a position-based preferences' matrix. 255 Its layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters. 256 257 \VRef[Figure]{f:checkHarness} shows a program main using @School@ and results with different array sizes. 258 The @school@ variable holds many students' course-preference forms. 275 259 It is on the stack and its initialization does not use any casting or size arithmetic. 276 260 Both of these points are impossible with a C flexible array member. … … 280 264 \end{cfa} 281 265 This ability to avoid casting and size arithmetic improves safety and usability over C flexible array members. 282 266 Finally, inputs and outputs are given at the bottom for different sized schools. 267 The example program prints the courses in each student's preferred order, all using the looked-up display names. 283 268 284 269 \begin{figure} 285 % super hack to get this to line up 286 \begin{tabular}{@{}ll@{\hspace{25pt}}l@{}} 287 \begin{tabular}{@{}p{3.25in}@{}} 270 \begin{cquote} 288 271 \lstinput{50-55}{hello-accordion.cfa} 289 \vspace*{-3pt}290 272 \lstinput{90-98}{hello-accordion.cfa} 291 \end{tabular} 292 & 293 \raisebox{0.32\totalheight}{% 294 }% 295 & 296 \end{tabular} 297 298 TODO: Get Peter's layout help 299 300 \$ cat school1 301 273 \ \\ 274 @$ cat school1@ 302 275 \lstinput{}{school1} 303 276 304 \$ ./a.out < school1 305 277 @$ ./a.out < school1@ 306 278 \lstinput{}{school1.out} 307 279 308 \$ cat school2 309 280 @$ cat school2@ 310 281 \lstinput{}{school2} 311 282 312 \$ ./a.out < school2 313 283 @$ ./a.out < school2@ 314 284 \lstinput{}{school2.out} 285 \end{cquote} 315 286 316 287 \caption{\lstinline{School} harness, input and output} 317 288 \label{f:checkHarness} 318 289 \end{figure} 290 291 When a function operates on a @School@ structure, the type system handles its memory layout transparently. 292 \lstinput{30-37}{hello-accordion.cfa} 293 In the example, this @getPref@ function answers, for the student at position @is@, what is the position of its @pref@\textsuperscript{th}-favoured class? 319 294 320 295 … … 333 308 But simplifications close enough for the present discussion are: 334 309 \begin{cfa} 335 336 337 338 339 340 struct array_1d{341 342 343 \end{cfa} 344 Th is structure pattern, plus a subscript operator, is all that @array@ provides.310 forall( [N] ) 311 struct array_1d_float { 312 float items[N]; 313 }; 314 forall( T, [N] ) 315 struct array_1d_T { 316 T items[N]; 317 }; 318 \end{cfa} 319 These two structure patterns, plus a subscript operator, is all that @array@ provides. 345 320 346 321 My main work is letting a programmer define 347 such a struct re (one whose type is parameterized by @[N]@)322 such a structure (one whose type is parameterized by @[N]@) 348 323 and functions that operate on it (these being similarly parameterized). 349 324 … … 351 326 \begin{itemize} 352 327 \item 353 The resolver, providingvalues for a used declaration's type-system variables,328 Resolver provided values for a used declaration's type-system variables, 354 329 gathered from type information in scope at the usage site. 355 356 330 \item 357 331 The box pass, encoding information about type parameters 358 332 into ``extra'' regular parameters/arguments on declarations and calls. 359 333 Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter, 360 and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, i.e.a use of the parameter.334 and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, \ie a use of the parameter. 361 335 \end{itemize} 362 336 … … 364 338 This work is detailed in \VRef[Section]{s:ArrayTypingC}. 365 339 However, the resolution--boxing scheme, in its preexisting state, was already equipped to work on (desugared) dimension parameters. 366 The discussion following explains the desugaring and how correctlowered code results.367 368 A n even simpler structure, and a toy function on it, demonstrate what's needed for the encoding.369 \begin{cfa} 370 forall( [@N@] ) { // [1] 371 372 void f( thing(@N@) ) { sout | @N@; } // [2], [3]373 374 375 thing( @10@ ) x; f(x); // prints 10, [4]376 thing( 100 ) y; f(y); // prints 100377 378 340 The following discussion explains the desugaring and how correctly lowered code results. 341 342 A simpler structure, and a toy function on it, demonstrate what is needed for the encoding. 343 \begin{cfa} 344 forall( [@N@] ) { $\C{// [1]}$ 345 struct thing {}; 346 void f( thing(@N@) ) { sout | @N@; } $\C{// [2], [3]}$ 347 } 348 int main() { 349 thing( @10@ ) x; f( x ); $\C{// prints 10, [4]}$ 350 thing( 100 ) y; f( y ); $\C{// prints 100}$ 351 return 0; 352 } 379 353 \end{cfa} 380 354 This example has: … … 389 363 A value like 10 being used as an argument to the parameter @N@. 390 364 \end{enumerate} 391 The chosen solution being to encode the value @N@ \emph{as a type},items 1 and 2 are immediately available for free.365 The chosen solution is to encode the value @N@ \emph{as a type}, so items 1 and 2 are immediately available for free. 392 366 Item 3 needs a way to recover the encoded value from a (valid) type (and to reject invalid types occurring here). 393 367 Item 4 needs a way to produce a type that encodes the given value. … … 400 374 \item 401 375 Given a dimension expression $e$, produce type @char[@$e$@]@ to represent it. 402 If $e$ evaluates to $n$ then the en doded type has size $n$.376 If $e$ evaluates to $n$ then the encoded type has size $n$. 403 377 \item 404 378 Given a type $T$ (produced by these rules), recover the value that it represents with the expression @sizeof(@$T$@)@. … … 410 384 The running example is lowered to: 411 385 \begin{cfa} 412 forall( @N*@ ) { // [1] 413 414 void f( thing(@N@) ) { sout | @sizeof(N)@; } // [2], [3]415 416 417 thing( char[@10@] ) x; f(x); // prints 10, [4]418 thing( char[100] ) y; f(y); // prints 100419 420 386 forall( @N *@ ) { $\C{// [1]}$ 387 struct thing {}; 388 void f( thing(@N@) ) { sout | @sizeof(N)@; } $\C{// [2], [3]}$ 389 } 390 int main() { 391 thing( char[@10@] ) x; f( x ); $\C{// prints 10, [4]}$ 392 thing( char[100] ) y; f( y ); $\C{// prints 100}$ 393 return 0; 394 } 421 395 \end{cfa} 422 396 Observe: … … 430 404 The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is an ordinary expression. 431 405 \item 432 The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type).406 The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type of bytes, @char@). 433 407 \end{enumerate} 434 408 435 409 From this point, preexisting \CFA compilation takes over lowering it the rest of the way to C. 436 Inspecting the result shows what the above translation achieves. 437 A form that shows only the relevant changes of the box pass (as informed by the resolver), leaving the rest unadulterated, is: 438 \begin{cfa} 439 // [1] 440 void f( size_t __sizeof_N, @void *@ ) { sout | @__sizeof_N@; } // [2], [3] 441 int main() { 442 struct __conc_thing_10 {} x; f(@10@, &x); // prints 10, [4] 443 struct __conc_thing_100 {} y; f(@100@, &y); // prints 100 444 return 0; 445 } 410 Here the result shows only the relevant changes of the box pass (as informed by the resolver), leaving the rest unadulterated: 411 \begin{cfa} 412 // [1] 413 void f( size_t __sizeof_N, @void *@ ) { sout | @__sizeof_N@; } $\C{// [2], [3]}$ 414 int main() { 415 struct __conc_thing_10 {} x; f( @10@, &x ); $\C{// prints 10, [4]}$ 416 struct __conc_thing_100 {} y; f( @100@, &y ); $\C{// prints 100}$ 417 return 0; 418 } 446 419 \end{cfa} 447 420 Observe: … … 452 425 The type @thing(N)@ is (replaced by @void *@, but thereby effectively) gone. 453 426 \item 454 The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is a regular variable (parameter) usage.427 The @sout...@ expression (being an application of the @?|?@ operator) has a regular variable (parameter) usage for its second argument. 455 428 \item 456 429 Information about the particular @thing@ instantiation (value 10) has moved, from the type, to a regular function-call argument. 457 430 \end{enumerate} 458 At the end of the desugaring and downstream processing, the original C idiom of ``pass both a pointer and a length parameter'' has been reconstructed.459 In the programmer-written form, only the thingis passed.431 At the end of the desugaring and downstream processing, the original C idiom of ``pass both a length parameter and a pointer'' has been reconstructed. 432 In the programmer-written form, only the @thing@ is passed. 460 433 The compiler's action produces the more complex form, which if handwritten, would be error-prone. 461 434 462 Back at the very front end, the parsing changes, AST schema extensions, and validation rules for enabling the sugared user input are:435 Back at the compiler front end, the parsing changes AST schema extensions and validation rules for enabling the sugared user input. 463 436 \begin{itemize} 464 437 \item … … 467 440 Have the new brand of type-variable, \emph{Dimension}, in the AST form of a type-variable, to represent one parsed from @[-]@. 468 441 \item 469 Allow a type variable to occur in expression position. Validate (after parsing) that only dimension-branded type variables are used here.442 Allow a type variable to occur in an expression. Validate (after parsing) that only dimension-branded type variables are used here. 470 443 \item 471 444 Allow an expression to occur in type-argument position. Brand the resulting type argument as a dimension. … … 473 446 Validate (after parsing), on a generic-type usage, \eg the type part of the declaration 474 447 \begin{cfa} 475 @array_1d( foo, bar ) x;@448 array_1d( foo, bar ) x; 476 449 \end{cfa} 450 \vspace*{-10pt} 477 451 that the brands on the generic arguments match the brands of the declared type variables. 478 452 Here, that @foo@ is a type and @bar@ is a dimension. … … 488 462 from one party who knows it, to another who is willing to work with any given length. 489 463 For scenarios where the concern is a mishandled length, 490 the interaction is between two parties who both claim to know (something about) it. 491 Such a scenario occurs in this pure C fragment, wich today's C compilers accept: 492 \begin{cfa} 493 int n = @42@; 494 float x[n]; 495 float (*xp)[@999@] = &x; 496 (*xp)[@500@]; // in "bound"? 497 \end{cfa} 498 464 the interaction is between two parties who both claim to know something about it. 465 Such a scenario occurs in this pure C fragment, which today's C compilers accept: 466 \begin{cfa} 467 int n = @42@; 468 float x[n]; 469 float (*xp)[@999@] = &x; 470 (*xp)[@500@]; $\C{// in "bound"?}$ 471 \end{cfa} 499 472 Here, the array @x@ has length 42, while a pointer to it (@xp@) claims length 999. 500 473 So, while the subscript of @xp@ at position 500 is out of bound of its referent @x@, … … 505 478 The \CFA new-array rejects the analogous case: 506 479 \begin{cfa} 507 int n = @42@; 508 array(float, n) x; 509 array(float, 999) * xp = x; // static rejection here 510 (*xp)[@500@]; // runtime check vs len 999 511 \end{cfa} 512 513 % TODO: kill the vertical whitespace around these lists 514 % nothing from https://stackoverflow.com/questions/1061112/eliminate-space-before-beginitemize is working 515 516 The way the \CFA array is implemented, 517 the type analysis of this \CFA case reduces to a case similar to the earlier C version. 480 int n = @42@; 481 array(float, n) x; 482 array(float, 999) * xp = x; $\C{// static rejection here}$ 483 (*xp)[@500@]; $\C{// runtime check vs len 999}$ 484 \end{cfa} 485 The way the \CFA array is implemented, the type analysis of this case reduces to a case similar to the earlier C version. 518 486 The \CFA compiler's compatibility analysis proceeds as: 519 \begin{itemize}[ noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]487 \begin{itemize}[parsep=0pt] 520 488 \item 521 489 Is @array(float, 999)@ type-compatible with @array(float, n)@? 522 490 \item 523 Is @arrayX(float, char[999])@ type-compatible with @arrayX(float, char[n])@? 524 \footnote{Here, \lstinline{arrayX} represents the type that results491 Is @arrayX(float, char[999])@ type-compatible with @arrayX(float, char[n])@?\footnote{ 492 Here, \lstinline{arrayX} represents the type that results 525 493 from desugaring the \lstinline{array} type 526 494 into a type whose generic parameters are all types. … … 531 499 Is @char[999]@ type-compatible with @char[n]@? 532 500 \end{itemize} 533 534 I chose to achieve the necessary rejection of the \CFA case 535 by adding a rejection of the corresponding C case. 536 537 This decision is not backward compatible. 501 To achieve the necessary \CFA rejections meant rejecting the corresponding C case, which is not backward compatible. 538 502 There are two complementary mitigations for this incompatibility. 539 503 … … 542 506 This situation might arise if @n@ were known to be 999, 543 507 rather than 42, as in the introductory examples. 544 The programmer can add a cast , as in:545 \begin{cfa} 546 xp = ( float (*)[999] ) &x;547 \end{cfa} 548 This addition causes \CFA to accept, be acause now, the programmer has accepted blame.508 The programmer can add a cast in the \CFA code. 509 \begin{cfa} 510 xp = @(float (*)[999])@ &x; 511 \end{cfa} 512 This addition causes \CFA to accept, because now, the programmer has accepted blame. 549 513 This addition is benign in plain C, because the cast is valid, just unnecessary there. 550 514 Moreover, the addition can even be seen as appropriate ``eye candy,'' … … 556 520 Second, the incompatibility only affects types like pointer-to-array, 557 521 which are are infrequently used in C. 558 The more common C idiom for aliasing an array is to use the pointer-to-first-element type, 559 which does not participate in the \CFA array's length checking. 560 \footnote{Notably, the desugaring of the \lstinline{array@} type, 561 avoids letting any \lstinline{-[-]} type decay, 562 in order to preserve the length information that powers runtime bound checking.} 563 Therefore, the frequency of needing to upgrade wild C code (as discussed in the first mitigation) 522 The more common C idiom for aliasing an array is to use a pointer-to-first-element type, 523 which does not participate in the \CFA array's length checking.\footnote{ 524 Notably, the desugaring of the \lstinline{array} type avoids letting any \lstinline{-[-]} type decay, 525 in order to preserve the length information that powers runtime bound-checking.} 526 Therefore, the frequency of needing to upgrade legacy C code (as discussed in the first mitigation) 564 527 is anticipated to be low. 565 528 566 529 Because the incompatibility represents a low cost to a \CFA onboarding effort 567 530 (with a plausible side benefit of linting the original code for a missing annotation), 568 I elected not to add special measuresto retain the compatibility.531 no special measures were added to retain the compatibility. 569 532 It would be possible to flag occurrences of @-[-]@ types that come from @array@ desugaring, 570 533 treating those with stricter \CFA rules, while treating others with classic C rules. … … 573 536 574 537 Having allowed that both the initial C example's check 575 \begin{itemize} [noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]538 \begin{itemize} 576 539 \item 577 540 Is @float[999]@ type-compatible with @float[n]@? 578 541 \end{itemize} 579 and the second \CFA ex mple's induced check580 \begin{itemize} [noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]542 and the second \CFA example's induced check 543 \begin{itemize} 581 544 \item 582 545 Is @char[999]@ type-compatible with @char[n]@? … … 587 550 To implement the new \CFA rules, I took the syntactic recursion a step further, obtaining, 588 551 in both cases: 589 \begin{itemize} [noitemsep,partopsep=-\parskip,parsep=0pt,leftmargin=4em]552 \begin{itemize} 590 553 \item 591 Is @999@ TBD-compatible with @n@?554 Is @999@ compatible with @n@? 592 555 \end{itemize} 593 This compatibility question applies to a pair of expressions, where the earlier oneswere to types.556 This compatibility question applies to a pair of expressions, where the earlier implementation were to types. 594 557 Such an expression-compatibility question is a new addition to the \CFA compiler. 595 These questions only arise in the context of dimension expressions on (C) array types.558 Note, these questions only arise in the context of dimension expressions on (C) array types. 596 559 597 560 TODO: ensure these compiler implementation matters are treated under \CFA compiler background: … … 600 563 GenPoly. 601 564 602 The relevant technical component of the \CFA compiler is, 603 within the type resolver, the type unification procedure. 565 The relevant technical component of the \CFA compiler is the type unification procedure within the type resolver. 604 566 I added rules for continuing this unification into expressions that occur within types. 605 567 It is still fundamentally doing \emph{type} unification … … 607 569 and not participating in binding any variables that stand in for expression fragments 608 570 (for there is no such sort of variable in \CFA's analysis.) 609 610 571 An unfortunate fact about the \CFA compiler's preexisting implementation is that 611 572 type unification suffers from two forms of duplication. … … 613 574 The first duplication has (many of) the unification rules stated twice. 614 575 As a result, my additions for dimension expressions are stated twice. 615 The extra statement of the rules occurs in the GenPolymodule,576 The extra statement of the rules occurs in the @GenPoly@ module, 616 577 where concrete types like @array(int, 5)@\footnote{ 617 578 Again, the presentation is simplified 618 by leaving the \lstinline{array} macro unexpanded }579 by leaving the \lstinline{array} macro unexpanded.} 619 580 are lowered into corresponding C types @struct __conc_array_1234@ (the suffix being a generated index). 620 In this case, the struct's definition gives fields that hardcode the argument values of @float@ and @5@. 621 The next time an @array(-,-)@ concrete instance is encountered, 622 is the previous @struct __conc_array_1234@ suitable for it? 623 Yes, for another occurrance of @array(int, 5)@; 581 In this case, the struct's definition contains fields that hardcode the argument values of @float@ and @5@. 582 The next time an @array(-,-)@ concrete instance is encountered, it checks if the previous @struct __conc_array_1234@ is suitable for it. 583 Yes, for another occurrence of @array(int, 5)@; 624 584 no, for either @array(rational(int), 5)@ or @array(int, 42)@. 625 585 By the last example, this phase must ``reject'' … … 630 590 In the program 631 591 \begin{cfa} 632 void f( double );633 forall( T & ) void f( T & );634 635 636 f(x);637 638 \end{cfa} 639 when resolving the function call, the first unification stage640 compares the type s @T@, of the parameter,with @array( float, n + 1 )@, of the argument.592 void @f@( double ); 593 forall( T & ) void @f@( T & ); 594 void g( int n ) { 595 array( float, n + 1 ) x; 596 f(x); // overloaded 597 } 598 \end{cfa} 599 when resolving the function call, @g@, the first unification stage 600 compares the type @T@ of the parameter with @array( float, n + 1 )@, of the argument. 641 601 TODO: finish. 642 602 … … 647 607 TODO: Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping. 648 608 Should it be an earlier scoping principle? Feels like it should matter in more places than here.} 649 So, a ``yes'' answer must represent a guarantee that both expressions willevaluate the650 same result, while a ``no'' can tolerate ``they might, but we're not sure ,'609 So, a ``yes'' answer must represent a guarantee that both expressions evaluate the 610 same result, while a ``no'' can tolerate ``they might, but we're not sure'', 651 611 provided that practical recourses are available 652 to let programmers express theirbetter knowledge.653 The specific rule-set that I offer withthe current release is, in fact, extremely conservative.612 to let programmers express better knowledge. 613 The new rule-set in the current release is, in fact, extremely conservative. 654 614 I chose to keep things simple, 655 and allow real future needs do drive adding additional complexity, 656 within the framework that I laid out. 615 and allow future needs to drive adding additional complexity, within the new framework. 657 616 658 617 For starters, the original motivating example's rejection … … 662 621 Rather, the analysis assumes a variable's value can be anything, 663 622 and so there can be no guarantee that its value is 999. 664 So, a variable useand a literal can never match.623 So, a variable and a literal can never match. 665 624 666 625 Two occurrences of the same literal value are obviously a fine match. 667 For two occurrences of the same varia lbe, more information is needed.626 For two occurrences of the same variable, more information is needed. 668 627 For example, this one is fine 669 628 \begin{cfa} 670 671 672 float (*xp)[n] = x;// accept673 629 void f( const int n ) { 630 float x[n]; 631 float (*xp)[n] = x; // accept 632 } 674 633 \end{cfa} 675 634 while this one is not: 676 635 \begin{cfa} 677 678 679 680 681 float (*xp)[n] = x;// reject682 636 void f() { 637 int n = 42; 638 float x[n]; 639 n = 999; 640 float (*xp)[n] = x; // reject 641 } 683 642 \end{cfa} 684 643 Furthermore, the fact that the first example sees @n@ as @const@ 685 is not actually a sufficent basis.644 is not actually sufficient. 686 645 In this example, @f@'s length expression's declaration is as @const@ as it can be, 687 646 yet its value still changes between the two invocations: 688 \begin{cfa} 689 // compile unit 1 690 void g(); 691 void f( const int & const nr ) { 692 float x[nr]; 693 g(); 694 float (*xp)[nr] = x; // reject 695 } 696 // compile unit 2 697 static int n = 42; 698 void g() { 699 n = 99; 700 } 701 void f( const int & ); 702 int main () { 703 f(n); 704 return 0; 705 } 706 \end{cfa} 707 The issue in this last case is, 708 just because you aren't able to change something doesn't mean someone else can't. 709 710 My rule set also respects a feature of the C tradition. 711 In spite of the several limitations of the C rules 647 \begin{cquote} 648 \setlength{\tabcolsep}{15pt} 649 \begin{tabular}{@{}ll@{}} 650 \begin{cfa} 651 // compile unit 1 652 void g(); 653 void f( const int & const nr ) { 654 float x[nr]; 655 g(); // change n 656 @float (*xp)[nr] = x;@ // reject 657 } 658 \end{cfa} 659 & 660 \begin{cfa} 661 // compile unit 2 662 static int n = 42; 663 void g() { 664 n = 99; 665 } 666 667 f( n ); 668 \end{cfa} 669 \end{tabular} 670 \end{cquote} 671 The issue here is that knowledge needed to make a correct decision is hidden by separate compilation. 672 Even within a translation unit, static analysis might not be able to provide all the information. 673 674 My rule set also respects a traditional C feature: In spite of the several limitations of the C rules 712 675 accepting cases that produce different values, there are a few mismatches that C stops. 713 C is quite precise when working with two static values :714 \begin{cfa} 715 716 717 float (*xp1)[42] = &x;// accept718 float (*xp2)[999] = &x;// reject676 C is quite precise when working with two static values. 677 \begin{cfa} 678 enum { fortytwo = 42 }; 679 float x[fortytwo]; 680 float (*xp1)[42] = &x; // accept 681 float (*xp2)[999] = &x; // reject 719 682 \end{cfa} 720 683 My \CFA rules agree with C's on these cases. 721 684 722 Myrules classify expressions into three groups:685 In summary, the new rules classify expressions into three groups: 723 686 \begin{description} 724 687 \item[Statically Evaluable] 725 688 Expressions for which a specific value can be calculated (conservatively) 726 689 at compile-time. 727 A preexisting \CFA compiler module defines which expressions qualify,690 A preexisting \CFA compiler module defines which literals, enumerators, and expressions qualify, 728 691 and evaluates them. 729 Includes literals and enumeration values.730 692 \item[Dynamic but Stable] 731 The value of a variable declared as @const@. 732 Includes a @const@ parameter. 693 The value of a variable declared as @const@, including a @const@ parameter. 733 694 \item[Potentially Unstable] 734 695 The catch-all category. Notable examples include: 735 any function-call result (@float x[foo()];@),736 the particular function-call result that is a pointer dereference (@void f(const int * n) { float x[*n]; }@), and696 any function-call result, @float x[foo()];@, 697 the particular function-call result that is a pointer dereference, @void f(const int * n)@ @{ float x[*n]; }@, and 737 698 any use of a reference-typed variable. 738 699 \end{description} 739 740 My \CFA rules are: 700 Within these groups, my \CFA rules are: 741 701 \begin{itemize} 742 702 \item … … 744 704 Notably, this rule allows a literal to match with an enumeration value, based on the value. 745 705 \item 746 Accept a Dynamic but Stable pair, if both expressions are written out the same, e.g. refers tosame variable declaration.706 Accept a Dynamic but Stable pair, if both expressions are written out the same, \eg refers to the same variable declaration. 747 707 \item 748 708 Otherwise, reject. 749 Notably, reject all pairs from the Potentially Unstable group. 750 Notably, reject all pairs that cross groups. 709 Notably, reject all pairs from the Potentially Unstable group and all pairs that cross groups. 751 710 \end{itemize} 752 753 711 The traditional C rules are: 754 712 \begin{itemize} … … 759 717 \end{itemize} 760 718 761 762 \newcommand{\falsealarm}{{\color{orange}\small{*}}}763 \newcommand{\allowmisuse}{{\color{red}\textbf{!}}}764 \newcommand{\cmark}{\ding{51}} % from pifont765 \newcommand{\xmark}{\ding{55}}766 719 \begin{figure} 720 \newcommand{\falsealarm}{{\color{blue}\small{*}}} 721 \newcommand{\allowmisuse}{{\color{red}\textbf{!}}} 722 \newcommand{\cmark}{\ding{51}} % from pifont 723 \newcommand{\xmark}{\ding{55}} 724 767 725 \begin{tabular}{@{}l@{\hspace{16pt}}c@{\hspace{8pt}}c@{\hspace{16pt}}c@{\hspace{8pt}}c@{\hspace{16pt}}c} 768 726 & \multicolumn{2}{c}{\underline{Values Equal}} … … 778 736 \end{tabular} 779 737 780 \ vspace{12pt}781 \noindent\textbf{Legend :}782 \begin{itemize} 738 \medskip 739 \noindent\textbf{Legend} 740 \begin{itemize}[leftmargin=*] 783 741 \item 784 742 Each row gives the treatment of a test harness of the form 785 743 \begin{cfa} 786 787 744 float x[ expr1 ]; 745 float (*xp)[ expr2 ] = &x; 788 746 \end{cfa} 789 where \lstinline{expr1} and \lstinline{expr2} are metavariables varying according to the row's Case. 747 \vspace*{-10pt} 748 where \lstinline{expr1} and \lstinline{expr2} are meta-variables varying according to the row's Case. 790 749 Each row's claim applies to other harnesses too, including, 791 750 \begin{itemize} 792 751 \item 793 calling a function with a param ter like \lstinline{x} and an argument of the \lstinline{xp} type,752 calling a function with a parameter like \lstinline{x} and an argument of the \lstinline{xp} type, 794 753 \item 795 754 assignment in place of initialization, … … 801 760 \item 802 761 Each case's claim is symmetric (swapping \lstinline{expr1} with \lstinline{expr2} has no effect), 803 even though most test harnesses are asym etric.762 even though most test harnesses are asymmetric. 804 763 \item 805 764 The table treats symbolic identity (Same/Different on rows) 806 apart from value eq ality (Equal/Unequal on columns).765 apart from value equality (Equal/Unequal on columns). 807 766 \begin{itemize} 808 767 \item … … 819 778 while every Accept under Values Unequal is an allowed misuse (\allowmisuse). 820 779 \end{itemize} 821 \caption{Case comparison for array type compatibility, given pairs of dimension expressions. 822 TODO: get Peter's LaTeX help on overall appearance, probably including column spacing/centering and bullet indentation.}780 781 \caption{Case comparison for array type compatibility, given pairs of dimension expressions.} 823 782 \label{f:DimexprRuleCompare} 824 783 \end{figure} … … 826 785 827 786 Figure~\ref{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets. 828 It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafe ly.829 It also shows that C-incompatibilities only occur in cases that C treats unsafe ly.787 It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafe. 788 It also shows that C-incompatibilities only occur in cases that C treats unsafe. 830 789 831 790 … … 837 796 whose reuses are rejected by the blunt current-state rules: 838 797 \begin{cfa} 839 840 841 float (*xp)[nr] = & x;// reject: nr varying (no references)842 843 float (*yp)[nv + 1] = & y; // reject: ?+? unpredicable (no functions)844 798 void f( int & nr, const int nv ) { 799 float x[nr]; 800 float (*xp)[nr] = &x; // reject: nr varying (no references) 801 float y[nv + 1]; 802 float (*yp)[nv + 1] = &y; // reject: ?+? unpredictable (no functions) 803 } 845 804 \end{cfa} 846 805 Yet, both dimension expressions are reused safely. 847 (The @nr@ reference is never written, not volatile806 The @nr@ reference is never written, not volatile 848 807 and control does not leave the function between the uses. 849 The name @?+?@ resolves to a function that is quite predictable. )850 The programmer here can add the constant declarations:851 \begin{cfa} 852 853 854 855 float (*xp)[nx] = & x; // acept856 857 858 float (*yp)[ny] = & y;// accept859 808 The name @?+?@ resolves to a function that is quite predictable. 809 Here, the programmer can add the constant declarations (cast does not work): 810 \begin{cfa} 811 void f( int & nr, const int nv ) { 812 @const int nx@ = nr; 813 float x[nx]; 814 float (*xp)[nx] = & x; // accept 815 @const int ny@ = nv + 1; 816 float y[ny]; 817 float (*yp)[ny] = & y; // accept 818 } 860 819 \end{cfa} 861 820 The result is the originally intended semantics, … … 863 822 864 823 The snapshotting trick is also used by the translation, though to achieve a different outcome. 865 Rather obviously, every array must be subscriptable, even a biz zarre one:866 \begin{cfa} 867 868 824 Rather obviously, every array must be subscriptable, even a bizarre one: 825 \begin{cfa} 826 array( float, rand(10) ) x; 827 x[0]; // 10% chance of bound-check failure 869 828 \end{cfa} 870 829 Less obvious is that the mechanism of subscripting is a function call, … … 874 833 Adjusting the example to make the function's use of length more explicit: 875 834 \begin{cfa} 876 877 878 879 835 forall ( T * ) 836 void f( T * x ) { sout | sizeof(*x); } 837 float x[ rand(10) ]; 838 f( x ); 880 839 \end{cfa} 881 840 Considering that the partly translated function declaration is, loosely, 882 841 \begin{cfa} 883 void f( size_t __sizeof_T, void * x ) { sout | __sizeof_T; } 884 \end{cfa} 885 the translated call must not go like: 886 \begin{cfa} 887 float x[ rand(10) ]; 888 f( rand(10), &x ); 889 \end{cfa} 890 Rather, its actual translation is like: 891 \begin{cfa} 892 size_t __dim_x = rand(10); 893 float x[ __dim_x ]; 894 f( __dim_x, &x ); 895 \end{cfa} 896 The occurrence of this dimension hoisting during translation was present in the preexisting \CFA compiler. 897 But its cases were buggy, particularly with determining, ``Can hoisting be skipped here?'' 898 For skipping this hoisting is clearly desirable in some cases, 899 not the least of which is when the programmer has already done so manually. 900 My work includes getting these cases right, harmonized with the accept/reject criteria, and tested. 901 902 842 void f( size_t __sizeof_T, void * x ) { sout | __sizeof_T; } 843 \end{cfa} 844 the translation must call the dimension argument twice: 845 \begin{cfa} 846 float x[ rand(10) ]; 847 f( rand(10), &x ); 848 \end{cfa} 849 Rather, the translation is: 850 \begin{cfa} 851 size_t __dim_x = rand(10); 852 float x[ __dim_x ]; 853 f( __dim_x, &x ); 854 \end{cfa} 855 The occurrence of this dimension hoisting during translation was in the preexisting \CFA compiler. 856 But its cases were buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases. 857 For example, when the programmer has already done so manually. \PAB{I don't know what this means.} 858 In the new implementation, these cases are correct, harmonized with the accept/reject criteria. 903 859 904 860 TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation 905 861 906 862 907 \section{Multidimensional Arrays} 908 \label{toc:mdimpl} 909 910 % TODO: introduce multidimensional array feature and approaches 911 912 When working with arrays, \eg linear algebra, array dimensions are referred to as ``rows'' and ``columns'' for a matrix, adding ``planes'' for a cube. 913 (There is little terminology for higher dimensional arrays.) 914 For example, an acrostic poem\footnote{A type of poetry where the first, last or other letters in a line spell out a particular word or phrase in a vertical column.} 915 can be treated as a grid of characters, where the rows are the text and the columns are the embedded keyword(s). 916 Within a poem, there is the concept of a \newterm{slice}, \eg a row is a slice for the poem text, a column is a slice for a keyword. 917 In general, the dimensioning and subscripting for multidimensional arrays has two syntactic forms: @m[r,c]@ or @m[r][c]@. 918 919 Commonly, an array, matrix, or cube, is visualized (especially in mathematics) as a contiguous row, rectangle, or block. 920 This conceptualization is reenforced by subscript ordering, \eg $m_{r,c}$ for a matrix and $c_{p,r,c}$ for a cube. 921 Few programming languages differ from the mathematical subscript ordering. 922 However, computer memory is flat, and hence, array forms are structured in memory as appropriate for the runtime system. 923 The closest representation to the conceptual visualization is for an array object to be contiguous, and the language structures this memory using pointer arithmetic to access the values using various subscripts. 924 This approach still has degrees of layout freedom, such as row or column major order, \ie juxtaposed rows or columns in memory, even when the subscript order remains fixed. 925 For example, programming languages like MATLAB, Fortran, Julia and R store matrices in column-major order since they are commonly used for processing column-vectors in tabular data sets but retain row-major subscripting. 926 In general, storage layout is hidden by subscripting, and only appears when passing arrays among different programming languages or accessing specific hardware. 927 928 \VRef[Figure]{f:FixedVariable} shows two C90 approaches for manipulating a contiguous matrix. 929 Note, C90 does not support VLAs. 930 The fixed-dimension approach (left) uses the type system; 931 however, it requires all dimensions except the first to be specified at compile time, \eg @m[][6]@, allowing all subscripting stride calculations to be generated with constants. 932 Hence, every matrix passed to @fp1@ must have exactly 6 columns but the row size can vary. 933 The variable-dimension approach (right) ignores (violates) the type system, \ie argument and parameters types do not match, and subscripting is performed manually using pointer arithmetic in the macro @sub@. 934 935 \begin{figure} 936 \begin{tabular}{@{}l@{\hspace{40pt}}l@{}} 937 \multicolumn{1}{c}{\textbf{Fixed Dimension}} & \multicolumn{1}{c}{\textbf{Variable Dimension}} \\ 938 \begin{cfa} 939 940 void fp1( int rows, int m[][@6@] ) { 941 ... printf( "%d ", @m[r][c]@ ); ... 942 } 943 int fm1[4][@6@], fm2[6][@6@]; // no VLA 944 // initialize matrixes 945 fp1( 4, fm1 ); // implicit 6 columns 946 fp1( 6, fm2 ); 947 \end{cfa} 948 & 949 \begin{cfa} 950 #define sub( m, r, c ) *(m + r * sizeof( m[0] ) + c) 951 void fp2( int rows, int cols, int *m ) { 952 ... printf( "%d ", @sub( m, r, c )@ ); ... 953 } 954 int vm1[@4@][@4@], vm2[@6@][@8@]; // no VLA 955 // initialize matrixes 956 fp2( 4, 4, vm1 ); 957 fp2( 6, 8, vm2 ); 958 \end{cfa} 959 \end{tabular} 960 \caption{C90 Fixed \vs Variable Contiguous Matrix Styles} 961 \label{f:FixedVariable} 962 \end{figure} 963 964 Many languages allow multidimensional arrays-of-arrays, \eg in Pascal or \CC. 965 \begin{cquote} 966 \begin{tabular}{@{}ll@{}} 967 \begin{pascal} 968 var m : array[0..4, 0..4] of Integer; (* matrix *) 969 type AT = array[0..4] of Integer; (* array type *) 970 type MT = array[0..4] of AT; (* array of array type *) 971 var aa : MT; (* array of array variable *) 972 m@[1][2]@ := 1; aa@[1][2]@ := 1 (* same subscripting *) 973 \end{pascal} 974 & 975 \begin{c++} 976 int m[5][5]; 977 978 typedef vector< vector<int> > MT; 979 MT vm( 5, vector<int>( 5 ) ); 980 m@[1][2]@ = 1; aa@[1][2]@ = 1; 981 \end{c++} 982 \end{tabular} 983 \end{cquote} 984 The language decides if the matrix and array-of-array are laid out the same or differently. 985 For example, an array-of-array may be an array of row pointers to arrays of columns, so the rows may not be contiguous in memory nor even the same length (triangular matrix). 986 Regardless, there is usually a uniform subscripting syntax masking the memory layout, even though a language could differentiated between the two forms using subscript syntax, \eg @m[1,2]@ \vs @aa[1][2]@. 987 Nevertheless, controlling memory layout can make a difference in what operations are allowed and in performance (caching/NUMA effects). 988 989 C also provides non-contiguous arrays-of-arrays. 990 \begin{cfa} 991 int m[5][5]; $\C{// contiguous}$ 992 int * aa[5]; $\C{// non-contiguous}$ 993 \end{cfa} 994 both with different memory layout using the same subscripting, and both with different degrees of issues. 995 The focus of this work is on the contiguous multidimensional arrays in C. 996 The reason is that programmers are often forced to use the more complex array-of-array form when a contiguous array would be simpler, faster, and safer. 997 Nevertheless, the C array-of-array form is still important for special circumstances. 998 999 \VRef[Figure]{f:ContiguousNon-contiguous} shows the extensions made in C99 for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.} 1000 First, VLAs are supported. 1001 Second, for contiguous arrays, C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@. 1002 If the declaration of @fc@ is changed to: 1003 \begin{cfa} 1004 void fc( int rows, int cols, int m[@rows@][@cols@] ) ... 1005 \end{cfa} 1006 it is possible for C to perform bound checking across all subscripting, but it does not. 1007 While this contiguous-array capability is a step forward, it is still the programmer's responsibility to manually manage the number of dimensions and their sizes, both at the function definition and call sites. 1008 That is, the array does not automatically carry its structure and sizes for use in computing subscripts. 1009 While the non-contiguous style in @faa@ looks very similar to @fc@, the compiler only understands the unknown-sized array of row pointers, and it relies on the programmer to traverse the columns in a row correctly with a correctly bounded loop index. 1010 Specifically, there is no requirement that the rows are the same length, like a poem with different length lines. 1011 1012 \begin{figure} 1013 \begin{tabular}{@{}ll@{}} 1014 \multicolumn{1}{c}{\textbf{Contiguous}} & \multicolumn{1}{c}{\textbf{ Non-contiguous}} \\ 1015 \begin{cfa} 1016 void fc( int rows, @int cols@, int m[ /* rows */ ][@cols@] ) { 1017 ... printf( "%d ", @m[r][c]@ ); ... 1018 } 1019 int m@[5][5]@; 1020 for ( int r = 0; r < 5; r += 1 ) { 1021 1022 for ( int c = 0; c < 5; c += 1 ) 1023 m[r][c] = r + c; 1024 } 1025 fc( 5, 5, m ); 1026 \end{cfa} 1027 & 1028 \begin{cfa} 1029 void faa( int rows, int cols, int * m[ @/* cols */@ ] ) { 1030 ... printf( "%d ", @m[r][c]@ ); ... 1031 } 1032 int @* aa[5]@; // row pointers 1033 for ( int r = 0; r < 5; r += 1 ) { 1034 @aa[r] = malloc( 5 * sizeof(int) );@ // create rows 1035 for ( int c = 0; c < 5; c += 1 ) 1036 aa[r][c] = r + c; 1037 } 1038 faa( 5, 5, aa ); 1039 \end{cfa} 1040 \end{tabular} 1041 \caption{C99 Contiguous \vs Non-contiguous Matrix Styles} 1042 \label{f:ContiguousNon-contiguous} 1043 \end{figure} 1044 1045 1046 \subsection{Multidimensional array implementation} 863 \section{Multidimensional array implementation} 1047 864 \label{s:ArrayMdImpl} 1048 865 … … 1221 1038 S & | sized(S), $\C{// masquerading-as}$ 1222 1039 Timmed &, $\C{// immediate element, often another array}$ 1223 Tbase & $\C{// base element, e.g.float, never array}$1040 Tbase & $\C{// base element, \eg float, never array}$ 1224 1041 ) { // distribute forall to each element 1225 1042 struct arpk { -
TabularUnified doc/theses/mike_brooks_MMath/background.tex ¶
r63a7394 rc3e41cda 178 178 \lstinput{34-34}{bkgd-carray-arrty.c} 179 179 The inspection begins by using @sizeof@ to provide program semantics for the intuition of an expression's type. 180 An architecture with 64-bit pointer size is used, to keep irrelevant details fixed.180 An architecture with 64-bit pointer size is used, to remove irrelevant details. 181 181 \lstinput{35-36}{bkgd-carray-arrty.c} 182 182 Now consider the @sizeof@ expressions derived from @ar@, modified by adding pointer-to and first-element (and including unnecessary parentheses to avoid any confusion about precedence). … … 215 215 216 216 My observation is recognizing: 217 \begin{itemize}[leftmargin=*, topsep=0pt,itemsep=0pt]217 \begin{itemize}[leftmargin=*,itemsep=0pt] 218 218 \item There is value in using a type that knows its size. 219 219 \item The type pointer to the (first) element does not. … … 302 302 303 303 In summary, when a function is written with an array-typed parameter, 304 \begin{itemize}[leftmargin=* ,topsep=0pt]304 \begin{itemize}[leftmargin=*] 305 305 \item an appearance of passing an array by value is always an incorrect understanding, 306 306 \item a dimension value, if any is present, is ignored, … … 532 532 \subsection{Array Parameter Declaration} 533 533 534 Passing an array a long with a function call is obviously useful.535 Let us say that a parameter is an array parameter when the calledfunction intends to subscript it.536 This section asserts that a more satisfactory/formal characterization does not exist in C, surveys the ways that C API authors communicate ``@p@ has zero or more @T@s,'' and calls out the minority cases where the C type system is using or verifying such claims.537 538 A C function'sparameter declarations look different, from the caller's and callee's perspectives.534 Passing an array as an argument to a function is necessary. 535 Assume a parameter is an array when the function intends to subscript it. 536 This section asserts that a more satisfactory/formal characterization does not exist in C, surveys the ways that C API authors communicate ``@p@ has zero or more dimensions'' and calls out the minority cases where the C type system is using or verifying such claims. 537 538 A C parameter declarations look different, from the caller's and callee's perspectives. 539 539 Both perspectives consist of the text read by a programmer and the semantics enforced by the type system. 540 The caller's perspec itve is available from a mere function declaration (which allow definition-before-use and separate compilation), but can also be read from (the non-body part of) a function definition.540 The caller's perspective is available from a function declaration, which allow definition-before-use and separate compilation, but can also be read from (the non-body part of) a function definition. 541 541 The callee's perspective is what is available inside the function. 542 542 \begin{cfa} 543 int foo( int, float, char ); $\C{// declaration, names optional}$ 544 int bar( int i, float f, char c ) { $\C{// definition, names mandatory}$ 545 $/* caller's perspective of foo's; callee's perspective of bar's */$ 546 ... 547 } 548 $/* caller's persepectives of foo's and bar's */$ 549 \end{cfa} 550 The caller's perspective is more limited. 551 The example shows, so far, that parameter names (by virtue of being optional) are really comments in the caller's perspective, while they are semantically significant in the callee's perspective. 543 int foo( int, float, char ); $\C{// declaration, names optional}$ 544 int bar( int i, float f, char c ) { $\C{// definition, names mandatory}$ 545 // caller's perspective of foo; callee's perspective of bar 546 } 547 // caller's perspectives of foo's and bar's 548 \end{cfa} 549 In caller's perspective, the parameter names (by virtue of being optional) are really comments; 550 in the callee's perspective, parameter names are semantically significant. 552 551 Array parameters introduce a further, subtle, semantic difference and considerable freedom to comment. 553 552 554 At the semantic level, there is no such thing as an array parameter, except for one case (@T [static 5]@) discussed shortly.553 At the semantic level, there is no such thing as an array parameter, except for one case (@T [static 5]@) discussed shortly. 555 554 Rather, there are only pointer parameters. 556 This fact probably shares considerable responsibility for the common sense of ``an array is just a pointer,'' w ich has been refuted in non-parameter contexts.555 This fact probably shares considerable responsibility for the common sense of ``an array is just a pointer,'' which has been refuted in non-parameter contexts. 557 556 This fact holds in both the caller's and callee's perspectives. 558 However, a parameter's type can include ``array of.'' 559 For example, the type ``pointer to array of 5 ints'' (@T(*)[5]@) is a pointer type, a fully meaningful parameter type (in the sense that this description does not contain any information that the type system ignores), and a type that appears the same in the caller's \vs callee's perspectives. 560 The outermost type constructor (syntactically first dimension) is really the one that determines the flavour of parameter. 557 However, a parameter's type can include ``array of.'', \eg the type ``pointer to array of 5 ints'' (@T (*)[5]@) is a pointer type. 558 This type is fully meaningful in the sense that its description does not contain any information that the type system ignores, and the type appears the same in the caller's \vs callee's perspectives. 559 In fact, the outermost type constructor (syntactically first dimension) is really the one that determines the flavour of parameter. 560 561 Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment. 562 An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}. 563 The rationale for rejecting the first ``invalid'' row follows shortly, while the second ``invalid'' row is simple nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves. 561 564 562 565 \begin{figure} … … 596 599 \end{tabular} 597 600 \end{cquote} 598 \caption{Multiple ways to declare an arrray parameter. Across a valid row, every declaration is equivalent. Each column gives a declaration style. Really, the style can be read from the first row only. The second row shows how the style extends to multiple dimensions, with the rows thereafter providing context for the choice of which second-row \lstinline{[]}receives the column-style variation.} 601 \caption{Multiple ways to declare an array parameter. 602 Across a valid row, every declaration is equivalent. 603 Each column gives a declaration style, where the style for that column is read from the first row. 604 The second row begins the style for multiple dimensions, with the rows thereafter providing context for the choice of which second-row \lstinline{[]} receives the column-style variation.} 599 605 \label{f:ArParmEquivDecl} 600 606 \end{figure} 601 607 602 Yet, C allows array syntax for the outermost type constructor, from which comes the freedom to comment. 603 An array parameter declaration can specify the outermost dimension with a dimension value, @[10]@ (which is ignored), an empty dimension list, @[ ]@, or a pointer, @*@, as seen in \VRef[Figure]{f:ArParmEquivDecl}. The rationale for rejecting the first ``invalid'' row follows shortly, while the second ``invalid'' row is simple nonsense, included to complete the pattern; its syntax hints at what the final row actually achieves. 604 605 In the lefmost style, the typechecker ignores the actual value in most practical cases. 606 This value is allowed to be a dynamic expression, so it is \emph{possible} to use the leftmost style in many practical cases. 608 In the leftmost style, the typechecker ignores the actual value in most practical cases. 609 This value is allowed to be a dynamic expression, and then it has practical cases. 610 \begin{cfa} 611 void foo( int @n@ ) { 612 float _42( float @a[n]@ ) { // nested function 613 a[0] = 42; 614 } 615 float b[n]; 616 _42( b ); 617 } 618 \end{cfa} 619 607 620 608 621 % To help contextualize the matrix part of this example, the syntaxes @float [5][]@, @float [][]@ and @float (*)[]@ are all rejected, for reasons discussed shortly. 609 622 % So are @float[5]*@, @float[]*@ and @float (*)*@. These latter ones are simply nonsense, though they hint at ``1d array of pointers'', whose equivalent syntax options are, @float *[5]@, @float *[]@, and @float **@. 610 623 611 It is a matter of taste as to whether a programmer should use a form as far left as possible (getting the most out of syntactically integrated comments), sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript this one'').612 613 Note that this equivalence of pointer and array declarations is special to param ters.624 It is a matter of taste as to whether a programmer should use a form as far left as possible (getting the most out of possible subscripting and dimension sizes), sticking to the right (avoiding false comfort from suggesting the typechecker is checking more than it is), or compromising in the middle (reducing unchecked information, yet clearly stating, ``I will subscript). 625 626 Note that this equivalence of pointer and array declarations is special to parameters. 614 627 It does not apply to local variables, where true array declarations are possible. 615 628 \begin{cfa} … … 626 639 float sum( float v[] ); 627 640 float arg = 3.14; 628 sum( &arg ); $\C{// accepted, v := \&arg}$641 sum( &arg ); $\C{// accepted, v = \&arg}$ 629 642 \end{cfa} 630 643 … … 674 687 The last observation is a fact of the callee's perspective. 675 688 There is little type-system checking, in the caller's perspective, that what is being passed, matches. 689 \PAB{I do not understand} 676 690 \begin{cfa} 677 691 void f( float [][10] ); … … 680 694 f(&a); // reject 681 695 f(&b); // accept 696 697 void foo() { 698 void f( float [][10] ); 699 int n = 100; 700 float a[100], b[3][12], c[n], d[n][n]; 701 f( a ); // reject 702 f( b ); // reject 703 f( c ); // reject 704 f( d ); // accept 705 f( &a ); // reject 706 f( &b ); // reject 707 f( &c ); // accept 708 f( &d ); // reject 709 } 682 710 \end{cfa} 683 711 This size is therefore, a callee's assumption. 684 712 685 Finally, to handle higher-dimensional VLAs, C repurposed the @*@ \emph{within} the dimension in a declaration to mean that the callee will have make an assumption about the size here, but no (unchecked, possibly wrong) information about this assumption is included for the caller-programmer's benefit/overconfidence.713 Finally, to handle higher-dimensional VLAs, C repurposed the @*@ \emph{within} the dimension in a declaration to mean that the callee has make an assumption about the size, but no (unchecked, possibly wrong) information about this assumption is included for the caller-programmer's benefit/over-confidence. 686 714 \begin{cquote} 687 715 @[@ \textit{type-qualifier-list$_{opt}$} @* ]@ -
TabularUnified doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa ¶
r63a7394 rc3e41cda 30 30 forall( [C], [S] ) 31 31 int getPref( @School( C, S ) & school@, int is, int pref ) { 32 33 32 for ( ic; C ) { 33 int curPref = @school.preferences@[ic][is]; $\C{// offset calculation implicit}$ 34 34 if ( curPref == pref ) return ic; 35 35 } 36 36 assert( false ); 37 37 } 38 38 … … 59 59 60 60 { string sv; 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 61 int iv; 62 // headers' row 63 sin | "\nc\\s"; 64 for ( is ; ns ) { 65 // column label 66 sin | sv; 67 school.student_ids[is] = sv; 68 } 69 // body rows 70 for ( ic ; nc ) { 71 // row label 72 sin | sv; 73 school.course_codes[ic] = sv; 74 for ( is ; ns ) { 75 // matrix item 76 sin | iv; 77 school.preferences[ic][is] = iv; 78 } 79 } 80 } 81 81 82 82 … … 91 91 sout | school.student_ids[is] | ": " | nonl; 92 92 for ( pref; 1 ~= nc ) { 93 94 93 int ic = getPref(school, is, pref); 94 sout | school.course_codes[ ic ] | nonl; 95 95 } 96 96 sout | nl; -
TabularUnified doc/theses/mike_brooks_MMath/programs/hello-array.cfa ¶
r63a7394 rc3e41cda 114 114 f( y, y ); $\C{// ok}$ 115 115 if ( M == N ) 116 f( x, @(array( float, M ) &)@y ); $\C{// ok} $116 f( x, @(array( float, M ) &)@y ); $\C{// ok}\CRT$ 117 117 } 118 118 -
TabularUnified doc/theses/mike_brooks_MMath/programs/sharing-demo.cfa ¶
r63a7394 rc3e41cda 108 108 assert( s2 == "bbb" ); 109 109 sout | xstr(S2bbb) | "\t& " | s1 | "\t& " | s1a | "\t& " | s2 | "\t\\\\"; 110 111 110 111 #define S2S1a s2 = s1a 112 112 S2S1a; 113 113 assert( s1 == "aaa" ); … … 122 122 assert( s2 == "ccc" ); 123 123 sout | xstr(S2ccc) | "\t& " | s1 | "\t& " | s1a | "\t& " | s2 | "\t\\\\"; 124 124 125 125 #define S1xxx s1 = "xxx" 126 126 S1xxx; … … 149 149 150 150 #define D2_s2_s1 string s2 = s1(1,2) 151 D2_s2_s1; 151 D2_s2_s1; 152 152 assert( s1 == "abcd" ); 153 153 assert( s1_mid == "bc" ); … … 157 157 sout | "\\par\\noindent"; 158 158 159 159 sout | "Again, @`shareEdits@ passes changes in both directions; copy does not. Note the difference in index values, with the \\emph{b} position being 1 in the longer string and 0 in the shorter strings. In the case of s1 aliasing with @s1_mid@, the very same character is being accessed by different postitions."; 160 160 sout | "\\par\\noindent"; 161 161 sout | "\\begin{tabular}{llll}"; … … 176 176 assert( s2 == "bc" ); 177 177 sout | xstr(D2_s1mid_minus) | "\t& " | s1 | "\t& " | s1_mid | "\t& " | s2 | "\t\\\\"; 178 179 178 179 #define D2_s2_pipe s2 [0] = '|' 180 180 D2_s2_pipe; 181 181 assert( s1 == "a-cd" ); … … 186 186 sout | "\\par\\noindent"; 187 187 188 188 sout | "Once again, assignment of a value is a modificiation that flows through the aliasing relationship, without affecting its structure."; 189 189 sout | "\\par\\noindent"; 190 190 sout | "\\begin{tabular}{llll}"; … … 198 198 assert( s2 == "|c" ); 199 199 sout | xstr(D2_s1mid_ff) | "\t& " | s1 | "\t& " | s1_mid | "\t& " | s2 | "\t\\\\"; 200 200 201 201 #define D2_s2_gg s2 = "gg" 202 202 D2_s2_gg; … … 208 208 sout | "\\par\\noindent"; 209 209 210 210 sout | "In the \\emph{ff} step, which is a positive example of flow across an aliasing relationship, the result is straightforward to accept because the flow direction is from contained (small) to containing (large). The following rules for edits through aliasing substrings will guide how to flow in the opposite direction."; 211 211 sout | "\\par"; 212 212 213 213 214 214 sout | "Growth and shrinkage are natural extensions. An empty substring is a real thing, at a well-defined location, whose meaning is extrapolated from the examples so far. The intended metaphor is to operating a GUI text editor. Having an aliasing substring is like using the mouse to select a few words. Assigning onto an aliasign substring is like typing from having a few words selected: depending how much you type, the file being edited can get shorter or longer."; 215 215 sout | "\\par\\noindent"; 216 216 sout | "\\begin{tabular}{lll}"; … … 219 219 220 220 assert( s1 == "affd" ); 221 // assert( s1_mid == "fc" ); 221 // assert( s1_mid == "fc" ); // ????????? bug? 222 222 sout | xstr(D2_s2_gg) | "\t& " | s1 | "\t& " | s1_mid | "\t\\\\"; 223 223 … … 227 227 assert( s1_mid == "hhhh" ); 228 228 sout | xstr(D2_s1mid_hhhh) | "\t& " | s1 | "\t& " | s1_mid | "\t\\\\"; 229 229 230 230 #define D2_s1mid_i s1_mid = "i" 231 231 D2_s1mid_i; … … 248 248 sout | "\\par\\noindent"; 249 249 250 250 sout | "Multiple portions can be aliased. When there are several aliasing substrings at once, the text editor analogy becomes an online multi-user editor. I should be able to edit a paragraph in one place (changing the document's length), without my edits affecting which letters are within a mouse-selection that you had made previously, somewhere else."; 251 251 sout | "\\par\\noindent"; 252 252 sout | "\\begin{tabular}{lllll}"; … … 264 264 assert( s1_end == "d" ); 265 265 sout | xstr(D2_s1end_s1) | "\t& " | s1 | "\t& " | s1_bgn | "\t& " | s1_mid | "\t& " | s1_end | "\t\\\\"; 266 266 267 267 #define D1_s1bgn_zzz s1_bgn = "zzzz" 268 268 D1_s1bgn_zzz; … … 275 275 sout | "\\par\\noindent"; 276 276 277 277 sout | "When an edit happens on an aliasing substring that overlaps another, an effect is unavoidable. Here, the passive party sees its selection shortened, to exclude the characters that were not part of the original selection."; 278 278 sout | "\\par\\noindent"; 279 279 sout | "\\begin{tabular}{llllll}"; … … 286 286 assert( s1_crs == "zj" ); 287 287 assert( s1_mid == "jj" ); 288 assert( s1_end == "d" ); 288 assert( s1_end == "d" ); 289 289 sout | xstr(D2_s1crs_s1) | "\t& " | s1 | "\t& " | s1_bgn | "\t& " | s1_crs | "\t& " | s1_mid | "\t& " | s1_end | "\t\\\\"; 290 290 … … 301 301 sout | "TODO: finish typesetting the demo"; 302 302 303 303 // "This shortening behaviour means that a modification has to occur entirely inside a substring, to show up in that substring. Sharing changes through the intersection of partially overlapping aliases is still possible, so long as the receiver's boundary is not inside the edit." 304 304 305 305 string word = "Phi"; … … 309 309 assert( consonants == "Ph" ); 310 310 assert( miniscules == "hi" ); 311 311 312 312 consonants[1] = 's'; 313 313 assert( word == "Psi" ); … … 321 321 string greet_bgn = all(10,1)`shareEdits; 322 322 string greet_end = all(14,1)`shareEdits; 323 323 324 324 assert( all == "They said hello again" ); 325 325 assert( greet == "hello" ); 326 326 assert( greet_bgn == "h" ); 327 327 assert( greet_end == "o" ); 328 328 329 329 330 330 greet = "sup"; … … 333 333 // assert( greet_bgn == "" ); ------ Should be so, but fails 334 334 // assert( greet_end == "" ); 335 336 337 338 339 340 341 342 343 344 345 greet_bgn = "what"; 346 347 335 336 337 338 339 340 /* As in the earlier step where \emph{aj} becomes \emph{ajjd}, such empty substrings maintain their places in the total string, and can be used for filling it. Because @greet_bgn@ was orginally at the start of the edit, in the outcome, the empty @greet_bgn@ sits just before the written value. Similarly @greed_end@ goes after. Though not shown, an overwritten substring at neither side goes arbitrarily to the before side. */ 341 342 343 344 345 greet_bgn = "what"; 346 347 348 348 assert( all == "They said whatsup again" ); 349 349 350 350 assert( greet == "sup" ); 351 351 352 352 assert( greet_bgn == "what" ); 353 353 // assert( greet_end == "" ); ------ Should be so, but fails 354 355 356 greet_end = "..."; 357 358 354 355 356 greet_end = "..."; 357 358 359 359 assert( all == "They said whatsup... again" ); 360 360 361 361 assert( greet == "sup" ); 362 362 363 363 assert( greet_bgn == "what" ); 364 364 365 365 assert( greet_end == "..." ); 366 367 368 369 370 371 372 366 367 368 369 370 371 /* Though these empty substrings hold their places in the total string, an empty string only belongs to bigger strings when it occurs completely inside them. There is no such state as including an empty substring at an edge. For this reason, @word@ gains the characters added by assigning to @greet_bgn@ and @greet_end@, but the string @greet@ does not. */ 372 373 373 374 374 } … … 377 377 int main(int argc, char ** argv) { 378 378 379 380 381 379 demo1(); 380 demo2(); 381 printf("%% %s done running\n", argv[0]); 382 382 } -
TabularUnified doc/theses/mike_brooks_MMath/uw-ethesis.tex ¶
r63a7394 rc3e41cda 102 102 \input{common} 103 103 %\usepackage{common} 104 104 105 \CFAStyle % CFA code-style 105 106 \lstset{language=cfa,belowskip=-1pt} % set default language to CFA
Note: See TracChangeset
for help on using the changeset viewer.