Changeset 7c80a86 for doc/theses/fangren_yu_MMath/content1.tex
- Timestamp:
- Nov 20, 2024, 9:32:44 AM (4 weeks ago)
- Branches:
- master
- Children:
- d945be9
- Parents:
- 29075d1
- File:
-
- 1 edited
doc/theses/fangren_yu_MMath/content1.tex
r29075d1 r7c80a86 1 \chapter{ Recent Features Introduced to \CFA} 1 \chapter{\CFA Features and Type System Interactions} 2 2 \label{c:content1} 3 3 4 This chapter discusses some recent additions to the \CFA language and their interactions with the type system. 4 This chapter discusses \CFA features introduced over time by multiple people and their interactions with the type system. 5 5 6 6 … … 17 17 Succinctly, if the address changes often, use a pointer; 18 18 if the value changes often, use a reference. 19 Note, \CC made its reference address immutable starting a \emph{belief} that immutability is a fundamental aspect of a reference's pointer. 20 The results is asymmetry semantics between the pointer and reference. 19 Java has mutable references but no pointers. 20 \CC has mutable pointers but immutable references; 21 hence, references match with functional programming. 22 However, the consequence is asymmetric semantics between the pointer and reference. 21 23 \CFA adopts a uniform policy between pointers and references where mutability is a separate property made at the declaration. 22 24 … … 36 38 Like pointers, references can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{ 37 39 \CC uses \lstinline{&&} for rvalue reference, a feature for move semantics and handling the \lstinline{const} Hell problem.} 38 Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r 3@ becomes @***r3@. 40 Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r2@ becomes @**r2@. 39 41 Finally, reassigning a reference's address needs a mechanism to stop the auto-dereferencing, which is accomplished by using a single reference to cancel all the auto-dereferencing, \eg @&r3 = &y@ resets @r3@'s address to point to @y@. 
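For comparison with the \CC behaviour described above, the following is a minimal \CC sketch (not \CFA) of the asymmetry between a rebindable pointer and a reference that stays bound to its initializer. The function names @pointer_rebind@ and @reference_assign@ are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical helper: a pointer's address can be changed after initialization.
int pointer_rebind() {
    int x = 1, y = 2;
    int *p = &x;          // p holds the address of x
    p = &y;               // rebinding: p now holds the address of y, x is untouched
    *p = 7;               // writes y
    return x * 10 + y;    // x == 1, y == 7
}

// Hypothetical helper: a C++ reference cannot be rebound after initialization.
int reference_assign() {
    int x = 1, y = 2;
    int &r = x;           // r is bound to x for its whole lifetime
    r = y;                // copies the value of y into x; r still refers to x
    r = 7;                // writes x again
    return x * 10 + y;    // x == 7, y == 2
}
```

In this sketch, assignment through @r@ always writes @x@, which is the immutability of the reference's address that the \CFA form @&r3 = &y@ is designed to relax.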
40 42 \CFA's reference type (including multi-de/references) is powerful enough to describe the lvalue rules in C by types only. … … 66 68 int x = 3; $\C{// mutable}$ 67 69 const int cx = 5; $\C{// immutable}$ 68 int * const cp = &x, $\C{// immutable pointer }$70 int * const cp = &x, $\C{// immutable pointer pointer/reference}$ 69 71 & const cr = cx; 70 const int * const ccp = &cx, $\C{// immutable value and pointer }$72 const int * const ccp = &cx, $\C{// immutable value and pointer/reference}$ 71 73 & const ccr = cx; 72 // pointer 74 \end{cfa} 75 \begin{cquote} 76 \setlength{\tabcolsep}{26pt} 77 \begin{tabular}{@{}lll@{}} 78 pointer & reference & \\ 79 \begin{cfa} 73 80 *cp = 7; 74 cp = &x; $\C{// error, assignment of read-only variable}$ 75 *ccp = 7; $\C{// error, assignment of read-only location}$ 76 ccp = &cx; $\C{// error, assignment of read-only variable}$ 77 // reference 81 cp = &x; 82 *ccp = 7; 83 ccp = &cx; 84 \end{cfa} 85 & 86 \begin{cfa} 78 87 cr = 7; 79 cr = &x; $\C{// error, assignment of read-only variable}$ 80 *ccr = 7; $\C{// error, assignment of read-only location}$ 81 ccr = &cx; $\C{// error, assignment of read-only variable}$ 82 \end{cfa} 88 cr = &x; 89 *ccr = 7; 90 ccr = &cx; 91 \end{cfa} 92 & 93 \begin{cfa} 94 // allowed 95 // error, assignment of read-only variable 96 // error, assignment of read-only location 97 // error, assignment of read-only variable 98 \end{cfa} 99 \end{tabular} 100 \end{cquote} 83 101 Interestingly, C does not give a warning/error if a @const@ pointer is not initialized, while \CC does. 84 Hence, type @& const@ is similar to \CC reference, but \CFA does not preclude initialization with a non-variable address.102 Hence, type @& const@ is similar to a \CC reference, but \CFA does not preclude initialization with a non-variable address. 85 103 For example, in system's programming, there are cases where an immutable address is initialized to a specific memory location. 
86 104 \begin{cfa} … … 96 114 However, there is an inherent ambiguity for auto-dereferencing: every argument expression involving a reference variable can potentially mean passing the reference's value or address. 97 115 Without any restrictions, this ambiguity limits the behaviour of reference types in \CFA polymorphic functions, where a type @T@ can bind to a reference or non-reference type. 98 This ambiguity prevents the type system treating reference types the same way as other types in many caseseven if type variables could be bound to reference types.116 This ambiguity prevents the type system treating reference types the same way as other types, even if type variables could be bound to reference types. 99 117 The reason is that \CFA uses a common \emph{object trait}\label{p:objecttrait} (constructor, destructor and assignment operators) to handle passing dynamic concrete type arguments into polymorphic functions, and the reference types are handled differently in these contexts so they do not satisfy this common interface. 100 118 101 119 Moreover, there is also some discrepancy in how the reference types are treated in initialization and assignment expressions. 
102 For example, in line 3 of the previous example code \see{\VPageref{p:refexamples}}:120 For example, in line 3 of the example code on \VPageref{p:refexamples}: 103 121 \begin{cfa} 104 122 int @&@ r1 = x, @&&@ r2 = r1, @&&&@ r3 = r2; $\C{// references to x}$ … … 129 147 vector( int @&@ ) vec; $\C{// vector of references to ints}$ 130 148 \end{cfa} 131 While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument (a fairly common use case).149 While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument, a fairly common use case. 132 150 Even if the object trait can be made optional, the current type system often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended. 133 151 Some tweaks are necessary to accommodate reference types in polymorphic contexts and it is unclear what can or cannot be achieved. 134 Currently, there are contexts where \CFA programmer must use pointer types, giving up the benefits of auto-dereference operations and better syntax with reference types.152 Currently, there are contexts where \CFA programmer is forced to use a pointer type, giving up the benefits of auto-dereference operations and better syntax with reference types. 135 153 136 154 … … 165 183 Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common context that normally require multiple statements and/or additional declarations, all of which reduces coding time and errors. 
166 184 \begin{cfa} 167 [x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types are different}$ 185 [x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types may be different}$ 168 186 [x, y] = [y, x]; $\C{// int tmp = x; x = y; y = tmp;}$ 169 187 void bar( int, int, int ); … … 212 230 bar( t2 ); $\C{// bar defined above}$ 213 231 \end{cfa} 214 \VRef[Figure]{f:Nesting} shows The difference is nesting of structures and tuples. 232 \VRef[Figure]{f:Nesting} shows the difference in nesting of structures and tuples. 215 233 The left \CC nested-structure is named so it is not flattened. 216 234 The middle C/\CC nested-structure is unnamed and flattened, causing an error because @i@ and @j@ are duplicate names. … … 220 238 221 239 \begin{figure} 222 \setlength{\tabcolsep}{ 15pt} 240 \setlength{\tabcolsep}{20pt} 223 241 \begin{tabular}{@{}ll@{\hspace{90pt}}l@{}} 224 242 \multicolumn{1}{c}{\CC} & \multicolumn{1}{c}{C/\CC} & \multicolumn{1}{c}{tuple} \\ … … 273 291 As noted, traditional languages manipulate multiple values by in/out parameters and/or structures. 274 292 K-W C adopted the structure for tuple values or variables, and as needed, the fields are extracted by field access operations. 275 As well, For the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects. 276 For example, the tuple value returned from @foo@ is a structure, and its fields are individually assigned to a left-hand tuple, @x@, @y@, @z@, orcopied directly into a corresponding tuple variable. 293 As well, for the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects. 
294 For example, the tuple value returned from @foo@ is a structure, and its fields are individually assigned to a left-hand tuple, @x@, @y@, @z@, \emph{or} copied directly into a corresponding tuple variable. 277 295 278 296 In the second implementation of \CFA tuples by Rodolfo Gabriel Esteves~\cite{Esteves04}, a different strategy is taken to handle MVR functions. … … 286 304 [x, y] = gives_two(); 287 305 \end{cfa} 288 The Till K-W C implementation translates the program to: 306 \VRef[Figure]{f:AlternateTupleImplementation} shows the two implementation approaches. 307 In the left approach, the return statement is rewritten to pack the return values into a structure, which is returned by value, and the structure fields are individually assigned to the left-hand side of the assignment. 308 In the right approach, the return statement is rewritten as direct assignments into the passed-in argument addresses. 309 The right implementation looks more concise and saves unnecessary copying. 310 The downside is indirection within @gives_two@ to access values, unless values get hoisted into registers for some period of time, which is common. 311 312 \begin{figure} 313 \begin{cquote} 314 \setlength{\tabcolsep}{20pt} 315 \begin{tabular}{@{}ll@{}} 316 Till K-W C implementation & Rodolfo \CFA implementation \\ 289 317 \begin{cfa} 290 318 struct _tuple2 { int _0; int _1; }; 291 struct _tuple2 gives_two() { ... struct _tuple2 ret = { r1, r2 }, return ret; } 319 struct _tuple2 gives_two() { 320 ... struct _tuple2 ret = { r1, r2 }; 321 return ret; 322 } 292 323 int x, y; 293 324 struct _tuple2 _tmp = gives_two(); 294 325 x = _tmp._0; y = _tmp._1; 295 326 \end{cfa} 296 while the Rodolfo implementation translates it to: 297 \begin{cfa} 298 void gives_two( int * r1, int * r2 ) { ... *r1 = ...; *r2 = ...; return; } 327 & 328 \begin{cfa} 329 330 void gives_two( int * r1, int * r2 ) { 331 ... 
*r1 = ...; *r2 = ...; 332 return; 333 } 299 334 int x, y; 335 300 336 gives_two( &x, &y ); 301 337 \end{cfa} 302 and inside the body of the function @gives_two@, the return statement is rewritten as assignments into the passed-in argument addresses. 303 This implementation looks more concise, and in the case of returning values having nontrivial types, \eg aggregates, this implementation saves unnecessary copying. 304 For example, 305 \begin{cfa} 306 [ x, y ] gives_two(); 307 int x, y; 308 [ x, y ] = gives_two(); 309 \end{cfa} 310 becomes 311 \begin{cfa} 312 void gives_two( int &, int & ); 313 int x, y; 314 gives_two( x, y ); 315 \end{cfa} 316 eliminiating any copying in or out of the call. 317 The downside is indirection within @gives_two@ to access values, unless values get hoisted into registers for some period of time, which is common. 338 \end{tabular} 339 \end{cquote} 340 \caption{Alternate Tuple Implementation} 341 \label{f:AlternateTupleImplementation} 342 \end{figure} 318 343 319 344 Interestingly, in the third implementation of \CFA tuples by Robert Schluntz~\cite[\S~3]{Schluntz17}, the MVR functions revert back to structure based, where it remains in the current version of \CFA. 320 345 The reason for the reversion was to have a uniform approach for tuple values/variables making tuples first-class types in \CFA, \ie allow tuples with corresponding tuple variables. 321 This extension was possible, because in parallel with Schluntz's work, generic types were beingadded independently by Moss~\cite{Moss19}, and the tuple variables leveraged the same implementation techniques as the generic variables.346 This extension was possible, because in parallel with Schluntz's work, generic types were added independently by Moss~\cite{Moss19}, and the tuple variables leveraged the same implementation techniques as the generic variables. 322 347 \PAB{I'm not sure about the connection here. 
Do you have an example of what you mean?} 323 348 … … 339 364 \begin{cfa} 340 365 void f( int, int ); 341 void f( [int, int]);366 void f( @[@ int, int @]@ ); 342 367 f( 3, 4 ); // ambiguous call 343 368 \end{cfa} … … 358 383 the call to @f@ can be interpreted as @T = [1]@ and @U = [2, 3, 4, 5]@, or @T = [1, 2]@ and @U = [3, 4, 5]@, and so on. 359 384 The restriction ensures type checking remains tractable and does not take too long to compute. 360 Therefore, tuple types are never present in any fixed-argument function calls .361 362 Finally, a type-safe variadic argument signature was added by Robert Schluntz~\cite[\S~4.1.2]{Schluntz17} using @forall@ and a new tuple parameter-type, denoted by the keyword @ttype 385 Therefore, tuple types are never present in any fixed-argument function calls, because of the flattening. 386 387 Finally, a type-safe variadic argument signature was added by Robert Schluntz~\cite[\S~4.1.2]{Schluntz17} using @forall@ and a new tuple parameter-type, denoted by the keyword @ttype@ in Schluntz's implementation, but changed to the ellipsis syntax similar to \CC's template parameter pack. 363 388 For C variadics, \eg @va_list@, the number and types of the arguments must be conveyed in some way, \eg @printf@ uses a format string indicating the number and types of the arguments. 364 389 \VRef[Figure]{f:CVariadicMaxFunction} shows an $N$ argument @maxd@ function using the C untyped @va_list@ interface. … … 370 395 \begin{figure} 371 396 \begin{cfa} 372 double maxd( int @count@, ... ) {397 double maxd( int @count@, @...@ ) { // ellipse parameter 373 398 double max = 0; 374 399 va_list args; … … 566 591 struct U u; u.k; u.l; 567 592 \end{cfa} 568 and the hoisted type names can clash with global type snames.593 and the hoisted type names can clash with global type names. 
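The clash described above comes from C giving a nested structure tag the same scope as its enclosing declaration. As a contrast, the following minimal \CC sketch shows a nested type that stays inside its enclosing structure's scope, so it coexists with a file-scope type of the same name. The names @S@, @T@ and @nested_scope_demo@ are hypothetical, and the same declarations are expected to be rejected by a C compiler as a redefinition of @struct T@.

```cpp
#include <cassert>
#include <type_traits>

struct T { double d; };            // file-scope type named T

struct S {
    struct T { int i; };           // nested type: S::T in C++, a second "struct T" in C
    T t;                           // inside S, the unqualified name T means S::T
};

static_assert(!std::is_same<S::T, ::T>::value,
              "S::T and ::T are distinct types in C++");

// Hypothetical helper exercising both types side by side.
int nested_scope_demo() {
    S s{};
    s.t.i = 3;                     // field of the nested S::T
    ::T outer{2.5};                // the file-scope T remains usable
    return s.t.i + static_cast<int>(outer.d);   // 3 + 2
}
```

Outside @S@, the nested type must be written @S::T@, which is the qualification cost that accompanies the scoped semantics.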
569 594 For good reasons, \CC chose to change this semantics: 570 595 \begin{cquote} … … 584 609 \end{cfa} 585 610 \CFA chose to adopt the \CC non-compatible change for nested types, since \CC's change has already forced certain coding changes in C libraries that must be parsed by \CC. 611 \CFA also added the ability to access from a variable through a type to a field. 612 \begin{cfa} 613 struct S s; @s.T@.i; @s.U@.k; 614 \end{cfa} 586 615 587 616 % https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html … … 604 633 \end{cfa} 605 634 Note, the position of the substructure is normally unimportant, unless there is some form of memory or @union@ overlay. 606 Like the anonymous nested types, the aggregate field names are hoisted into @struct S@, so there is direct access, \eg @s.x@ and @s.i@. 607 However, like the implicit C hoisting of nested structures, the field names must be unique and the type names are now at a different scope level, unlike type nesting in \CC. 608 In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $\subset$ @S@ and @W@ $\subset$ @S@. 609 For example: 635 Like an anonymous nested type, a named nested Plan-9 type has its field names hoisted into @struct S@, so there is direct access, \eg @s.x@ and @s.i@. 636 Hence, the field names must be unique, unlike \CC nested types, but the type names are at a nested scope level, unlike type nesting in C. 
637 In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $\subset$ @S@ and @W@ $\subset$ @S@, \eg: 610 638 \begin{cfa} 611 639 void f( union U * u ); 612 640 void g( struct W * ); 613 union U * up; 614 struct W * wp; 615 struct S * sp; 616 up = sp; $\C{// assign pointer to U in S}$ 617 wp = sp; $\C{// assign pointer to W in S}$ 641 union U * up; struct W * wp; struct S * sp; 642 up = &s; $\C{// assign pointer to U in S}$ 643 wp = &s; $\C{// assign pointer to W in S}$ 618 644 f( &s ); $\C{// pass pointer to U in S}$ 619 645 g( &s ); $\C{// pass pointer to W in S}$ 620 646 \end{cfa} 621 622 \CFA extends the Plan-9 substructure by allowing polymorphism for values and pointers. 623 The extended substructure is denoted using @inline@, allowing backwards compatibility to existing Plan-9 features. 647 Note, there is no value assignment, such as @w = s@, to copy the @W@ field from @S@. 648 649 Unfortunately, the Plan-9 designers did not look ahead to other useful features, specifically nested types. 650 This nested type compiles in \CC and \CFA. 651 \begin{cfa} 652 struct R { 653 @struct T;@ $\C[2in]{// forward declaration, conflicts with Plan-9 syntax}$ 654 struct S { $\C{// nested types, mutually recursive reference}\CRT$ 655 S * sp; T * tp; ... 656 }; 657 struct T { 658 S * sp; T * tp; ... 659 }; 660 }; 661 \end{cfa} 662 Note, the syntax for the forward declaration conflicts with the Plan-9 declaration syntax. 663 664 \CFA extends the Plan-9 substructure by allowing polymorphism for values and pointers, where the extended substructure is denoted using @inline@. 
624 665 \begin{cfa} 625 666 struct S { 626 @inline@ W; $\C{// extended Plan-9 substructure}$ 667 @inline@ struct W; $\C{// extended Plan-9 substructure}$ 627 668 unsigned int tag; 628 669 @inline@ U; $\C{// extended Plan-9 substructure}$ 629 670 } s; 630 671 \end{cfa} 631 Note, like \CC, \CFA allows optional prefixing of type names with their kind, \eg @struct@, @union@, and @enum@, unless there is ambiguity with variable names in the same scope. 632 The following shows both value and pointer polymorphism. 672 Note, the declaration of @U@ is not prefixed with @union@. 673 Like \CC, \CFA allows optional prefixing of type names with their kind, \eg @struct@, @union@, and @enum@, unless there is ambiguity with variable names in the same scope. 674 In addition, a semi-non-compatible change is made so that Plan-9 syntax means a forward declaration in a nested type. 675 Since the Plan-9 extension is not part of C and is rarely used, this change has minimal impact. 676 Hence, all Plan-9 semantics are denoted by the @inline@ qualifier, which is good ``eye-candy'' when reading a structure definition to spot Plan-9 definitions. 677 Finally, the following code shows the value and pointer polymorphism. 633 678 \begin{cfa} 634 679 void f( U, U * ); $\C{// value, pointer}$ 635 680 void g( W, W * ); $\C{// value, pointer}$ 636 U u, * up; 637 S s, * sp; 638 W w, * wp; 639 u = s; up = sp; $\C{// value, pointer}$ 640 w = s; wp = sp; $\C{// value, pointer}$ 681 U u, * up; S s, * sp; W w, * wp; 682 u = s; up = sp; $\C{// value, pointer}$ 683 w = s; wp = sp; $\C{// value, pointer}$ 641 684 f( s, &s ); $\C{// value, pointer}$ 642 685 g( s, &s ); $\C{// value, pointer}$ … … 645 688 In general, non-standard C features (@gcc@) do not need any special treatment, as they are directly passed through to the C compiler. 646 689 However, the Plan-9 semantics allow implicit conversions from the outer type to the inner type, which means the \CFA type resolver must take this information into account. 
647 Therefore, the \CFA translator must implement the Plan-9 features and insert necessary type conversions into the translated code output.690 Therefore, the \CFA resolver must implement the Plan-9 features and insert necessary type conversions into the translated code output. 648 691 In the current version of \CFA, this is the only kind of implicit type conversion other than the standard C conversions. 649 692 650 Since variable overloading is possible in \CFA, \CFA's implementation of Plan-9 polymorphism allows duplicate field names. 651 When an outer field and an embedded field have the same name and type, the inner field is shadowed and cannot be accessed directly by name. 652 While such definitions are allowed, duplicate field names is not good practice in general and should be avoided if possible. 653 Plan-9 fields can be nested, and a struct definition can contain multiple Plan-9 embedded fields. 654 In particular, the \newterm{diamond pattern}~\cite[\S~6.1]{Stroustrup89}\cite[\S~4]{Cargill91} can occur and result in a nested field to be embedded twice. 693 Plan-9 polymorphism can result in duplicate field names. 694 For example, the \newterm{diamond pattern}~\cite[\S~6.1]{Stroustrup89}\cite[\S~4]{Cargill91} can result in nested fields being embedded twice. 655 695 \begin{cfa} 656 696 struct A { int x; }; … … 658 698 struct C { inline A; }; 659 699 struct D { 660 inline B; 661 inline C; 662 }; 663 D d; 664 \end{cfa} 665 In the above example, the expression @d.x@ becomes ambiguous, since it can refer to the indirectly embedded field either from @B@ or from @C@. 666 It is still possible to disambiguate the expression by first casting the outer struct to one of the directly embedded type, such as @((B)d).x@. 700 inline B; // B.x 701 inline C; // C.x 702 } d; 703 \end{cfa} 704 Because the @inline@ structures are flattened, the expression @d.x@ is ambiguous, as it can refer to the embedded field either from @B@ or @C@. 
705 @gcc@ generates a syntax error about the duplicate member @x@. 706 The equivalent \CC definition compiles: 707 \begin{c++} 708 struct A { int x; }; 709 struct B : public A {}; 710 struct C : public A {}; 711 struct D : @public B, C@ { // multiple inheritance 712 } d; 713 \end{c++} 714 and again the expression @d.x@ is ambiguous. 715 While \CC has no dot syntax to disambiguate @x@, \ie @d.B.x@ or @d.C.x@, it is possible with qualified names, @d.B::x@ or @d.C::x@, or with casts, @((B)d).x@ or @((C)d).x@. 716 Like \CC, \CFA compiles the Plan-9 version and provides direct syntax and casts to disambiguate @x@. 717 While ambiguous definitions are allowed, duplicate field names are poor practice and should be avoided if possible. 718 However, when a programmer does not control all code, this problem can occur and a naming workaround should exist.
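The \CC side of the diamond example can be exercised with the following minimal sketch. It repeats the @A@, @B@, @C@, @D@ hierarchy from the text; the function name @diamond_demo@ is hypothetical. Beyond the casts mentioned in the text, the sketch also uses the \CC qualified member names @d.B::x@ and @d.C::x@, and it illustrates one caveat: a value cast such as @((C)d).x@ reads from a copy of the @C@ subobject, whereas a reference cast refers to the subobject itself.

```cpp
#include <cassert>

struct A { int x; };
struct B : public A {};
struct C : public A {};
struct D : public B, public C {};     // contains two distinct A subobjects

// Hypothetical helper showing ways to select one of the two x fields.
int diamond_demo() {
    D d{};
    // d.x = 1;                       // rejected by the compiler: member x is ambiguous
    d.B::x = 1;                       // qualified name selects the A inside B
    d.C::x = 2;                       // qualified name selects the A inside C
    static_cast<B &>(d).x += 10;      // reference cast updates the subobject in place
    int from_copy = ((C)d).x;         // value cast reads x from a sliced copy of the C part
    return d.B::x * 100 + from_copy;  // 11 * 100 + 2
}
```

Because the value cast produces a temporary, it is suited to reading @x@; writing through the selected subobject needs the qualified name or a reference cast.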