Changeset 7c80a86 for doc


Timestamp:
Nov 20, 2024, 9:32:44 AM
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
d945be9
Parents:
29075d1
Message:

proofread chapter 3

File:
1 edited

  • doc/theses/fangren_yu_MMath/content1.tex

    r29075d1 r7c80a86  
    1 \chapter{Recent Features Introduced to \CFA}
     1\chapter{\CFA Features and Type System Interactions}
    22\label{c:content1}
    33
    4 This chapter discusses some recent additions to the \CFA language and their interactions with the type system.
     4This chapter discusses \CFA features introduced over time by multiple people and their interactions with the type system.
    55
    66
     
    1717Succinctly, if the address changes often, use a pointer;
    1818if the value changes often, use a reference.
    19 Note, \CC made its reference address immutable starting a \emph{belief} that immutability is a fundamental aspect of a reference's pointer.
    20 The results is asymmetry semantics between the pointer and reference.
     19Java has mutable references but no pointers.
     20\CC has mutable pointers but immutable references;
     21hence, references match with functional programming.
     22However, the consequence is asymmetric semantics between the pointer and reference.
    2123\CFA adopts a uniform policy between pointers and references where mutability is a separate property made at the declaration.
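The \CC asymmetry described above can be sketched in a few lines. This is only an illustration, not code from the thesis; the helper names @reseat_pointer@ and @assign_through_reference@ are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: a C++ pointer's address can change after initialization.
int reseat_pointer() {
    int x = 3, y = 5;
    int * p = &x;        // p points to x
    p = &y;              // mutable address: p now points to y
    *p = 7;              // writes y, x is untouched
    return x * 10 + y;   // x == 3, y == 7
}

// Hypothetical helper: a C++ reference's address is fixed at initialization.
int assign_through_reference() {
    int x = 3, y = 5;
    int & r = x;         // r is bound to x for its entire lifetime
    r = y;               // copies y's value into x; does NOT rebind r
    y = 9;               // a later change to y is not visible through r
    return x * 10 + r;   // x == 5 and r == 5
}
```

In the second helper there is no \CC syntax that makes @r@ refer to @y@, which is the immutability the text contrasts with \CFA's per-declaration choice.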
    2224
     
    3638Like pointers, references can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{
    3739\CC uses \lstinline{&&} for rvalue reference, a feature for move semantics and handling the \lstinline{const} Hell problem.}
    38 Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r3@ becomes @***r3@.
     40Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r2@ becomes @**r2@.
    3941Finally, reassigning a reference's address needs a mechanism to stop the auto-dereferencing, which is accomplished by using a single reference to cancel all the auto-dereferencing, \eg @&r3 = &y@ resets @r3@'s address to point to @y@.
    4042\CFA's reference type (including multi-de/references) is powerful enough to describe the lvalue rules in C by types only.
     
    6668int x = 3; $\C{// mutable}$
    6769const int cx = 5; $\C{// immutable}$
    68 int * const cp = &x, $\C{// immutable pointer}$
     70int * const cp = &x, $\C{// immutable pointer/reference}$
    6971        & const cr = cx;
    70 const int * const ccp = &cx, $\C{// immutable value and pointer}$
     72const int * const ccp = &cx, $\C{// immutable value and pointer/reference}$
    7173                        & const ccr = cx;
    72 // pointer
     74\end{cfa}
     75\begin{cquote}
     76\setlength{\tabcolsep}{26pt}
     77\begin{tabular}{@{}lll@{}}
     78pointer & reference & \\
     79\begin{cfa}
    7380*cp = 7;
    74 cp = &x; $\C{// error, assignment of read-only variable}$
    75 *ccp = 7; $\C{// error, assignment of read-only location}$
    76 ccp = &cx; $\C{// error, assignment of read-only variable}$
    77 // reference
     81cp = &x;
     82*ccp = 7;
     83ccp = &cx;
     84\end{cfa}
     85&
     86\begin{cfa}
    7887cr = 7;
    79 cr = &x; $\C{// error, assignment of read-only variable}$
    80 *ccr = 7; $\C{// error, assignment of read-only location}$
    81 ccr = &cx; $\C{// error, assignment of read-only variable}$
    82 \end{cfa}
     88cr = &x;
     89*ccr = 7;
     90ccr = &cx;
     91\end{cfa}
     92&
     93\begin{cfa}
     94// allowed
     95// error, assignment of read-only variable
     96// error, assignment of read-only location
     97// error, assignment of read-only variable
     98\end{cfa}
     99\end{tabular}
     100\end{cquote}
    83101Interestingly, C does not give a warning/error if a @const@ pointer is not initialized, while \CC does.
    84 Hence, type @& const@ is similar to \CC reference, but \CFA does not preclude initialization with a non-variable address.
     102Hence, type @& const@ is similar to a \CC reference, but \CFA does not preclude initialization with a non-variable address.
    85103For example, in systems programming, there are cases where an immutable address is initialized to a specific memory location.
    86104\begin{cfa}
     
    96114However, there is an inherent ambiguity for auto-dereferencing: every argument expression involving a reference variable can potentially mean passing the reference's value or address.
    97115Without any restrictions, this ambiguity limits the behaviour of reference types in \CFA polymorphic functions, where a type @T@ can bind to a reference or non-reference type.
    98 This ambiguity prevents the type system treating reference types the same way as other types in many cases even if type variables could be bound to reference types.
     116This ambiguity prevents the type system treating reference types the same way as other types, even if type variables could be bound to reference types.
    99117The reason is that \CFA uses a common \emph{object trait}\label{p:objecttrait} (constructor, destructor and assignment operators) to handle passing dynamic concrete type arguments into polymorphic functions, and the reference types are handled differently in these contexts so they do not satisfy this common interface.
    100118
    101119Moreover, there is also some discrepancy in how the reference types are treated in initialization and assignment expressions.
    102 For example, in line 3 of the previous example code \see{\VPageref{p:refexamples}}:
     120For example, in line 3 of the example code on \VPageref{p:refexamples}:
    103121\begin{cfa}
    104122int @&@ r1 = x,  @&&@ r2 = r1,   @&&&@ r3 = r2; $\C{// references to x}$
     
    129147vector( int @&@ ) vec; $\C{// vector of references to ints}$
    130148\end{cfa}
    131 While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument (a fairly common use case).
     149While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument, a fairly common use case.
    132150Even if the object trait can be made optional, the current type system often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended.
    133151Some tweaks are necessary to accommodate reference types in polymorphic contexts and it is unclear what can or cannot be achieved.
    134 Currently, there are contexts where \CFA programmer must use pointer types, giving up the benefits of auto-dereference operations and better syntax with reference types.
     152Currently, there are contexts where a \CFA programmer is forced to use a pointer type, giving up the benefits of auto-dereference operations and better syntax with reference types.
    135153
    136154
     
    165183Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common contexts that normally require multiple statements and/or additional declarations, all of which reduces coding time and errors.
    166184\begin{cfa}
    167 [x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types are different}$
     185[x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types may be different}$
    168186[x, y] = [y, x]; $\C{// int tmp = x; x = y; y = tmp;}$
    169187void bar( int, int, int );
     
    212230bar( t2 );                      $\C{// bar defined above}$
    213231\end{cfa}
    214 \VRef[Figure]{f:Nesting} shows The difference is nesting of structures and tuples.
     232\VRef[Figure]{f:Nesting} shows the difference in nesting of structures and tuples.
    215233The left \CC nested-structure is named so it is not flattened.
    216234The middle C/\CC nested-structure is unnamed and flattened, causing an error because @i@ and @j@ are duplicate names.
     
    220238
    221239\begin{figure}
    222 \setlength{\tabcolsep}{15pt}
     240\setlength{\tabcolsep}{20pt}
    223241\begin{tabular}{@{}ll@{\hspace{90pt}}l@{}}
    224242\multicolumn{1}{c}{\CC} & \multicolumn{1}{c}{C/\CC} & \multicolumn{1}{c}{tuple} \\
     
    273291As noted, traditional languages manipulate multiple values by in/out parameters and/or structures.
    274292K-W C adopted the structure for tuple values or variables, and as needed, the fields are extracted by field access operations.
    275 As well, For the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects.
    276 For example, the tuple value returned from @foo@ is a structure, and its fields are individually assigned to a left-hand tuple, @x@, @y@, @z@, or copied directly into a corresponding tuple variable.
     293As well, for the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects.
     294For example, the tuple value returned from @foo@ is a structure, and its fields are individually assigned to a left-hand tuple, @x@, @y@, @z@, \emph{or} copied directly into a corresponding tuple variable.
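Why the expansion needs temporaries can be seen with the swap @[x, y] = [y, x]@. The following \CC sketch is an analogy only: it uses @std::tuple@ rather than \CFA tuples, and the three function names are hypothetical.

```cpp
#include <cassert>
#include <tuple>

// Naive component-wise expansion of [x, y] = [y, x] without temporaries.
int naive_swap() {
    int x = 1, y = 2;
    x = y;               // x is overwritten before its old value is read
    y = x;               // y receives the new x, so the original x is lost
    return x * 10 + y;
}

// Expansion with temporaries: evaluate the whole right-hand side first.
int temporary_swap() {
    int x = 1, y = 2;
    int t0 = y, t1 = x;  // snapshot of the right-hand tuple
    x = t0;  y = t1;     // component-wise assignment from the snapshot
    return x * 10 + y;
}

// std::tuple analogue: make_tuple copies the values before std::tie assigns.
int tuple_swap() {
    int x = 1, y = 2;
    std::tie( x, y ) = std::make_tuple( y, x );
    return x * 10 + y;
}
```

The naive version leaves both variables equal to 2, while the two versions that materialize the right-hand side first perform the intended swap.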
    277295
    278296In the second implementation of \CFA tuples by Rodolfo Gabriel Esteves~\cite{Esteves04}, a different strategy is taken to handle MVR functions.
     
    286304[x, y] = gives_two();
    287305\end{cfa}
    288 The Till K-W C implementation translates the program to:
     306\VRef[Figure]{f:AlternateTupleImplementation} shows the two implementation approaches.
     307In the left approach, the return statement is rewritten to pack the return values into a structure, which is returned by value, and the structure fields are individually assigned to the left-hand side of the assignment.
     308In the right approach, the return statement is rewritten as direct assignments into the passed-in argument addresses.
     309The right implementation looks more concise and saves unnecessary copying.
     310The downside is indirection within @gives_two@ to access values, unless values get hoisted into registers for some period of time, which is common.
     311
     312\begin{figure}
     313\begin{cquote}
     314\setlength{\tabcolsep}{20pt}
     315\begin{tabular}{@{}ll@{}}
     316Till K-W C implementation & Rodolfo \CFA implementation \\
    289317\begin{cfa}
    290318struct _tuple2 { int _0; int _1; }
    291 struct _tuple2 gives_two() { ... struct _tuple2 ret = { r1, r2 }, return ret; }
     319struct _tuple2 gives_two() {
     320        ... struct _tuple2 ret = { r1, r2 };
     321        return ret;
     322}
    292323int x, y;
    293324struct _tuple2 _tmp = gives_two();
    294325x = _tmp._0; y = _tmp._1;
    295326\end{cfa}
    296 while the Rodolfo implementation translates it to:
    297 \begin{cfa}
    298 void gives_two( int * r1, int * r2 ) { ... *r1 = ...; *r2 = ...; return; }
     327&
     328\begin{cfa}
     329
     330void gives_two( int * r1, int * r2 ) {
     331        ... *r1 = ...; *r2 = ...;
     332        return;
     333}
    299334int x, y;
     335
    300336gives_two( &x, &y );
    301337\end{cfa}
    302 and inside the body of the function @gives_two@, the return statement is rewritten as assignments into the passed-in argument addresses.
    303 This implementation looks more concise, and in the case of returning values having nontrivial types, \eg aggregates, this implementation saves unnecessary copying.
    304 For example,
    305 \begin{cfa}
    306 [ x, y ] gives_two();
    307 int x, y;
    308 [ x, y ] = gives_two();
    309 \end{cfa}
    310 becomes
    311 \begin{cfa}
    312 void gives_two( int &, int & );
    313 int x, y;
    314 gives_two( x, y );
    315 \end{cfa}
    316 eliminiating any copying in or out of the call.
    317 The downside is indirection within @gives_two@ to access values, unless values get hoisted into registers for some period of time, which is common.
     338\end{tabular}
     339\end{cquote}
     340\caption{Alternate Tuple Implementation}
     341\label{f:AlternateTupleImplementation}
     342\end{figure}
    318343
    319344Interestingly, in the third implementation of \CFA tuples by Robert Schluntz~\cite[\S~3]{Schluntz17}, the MVR functions revert back to structure based, where it remains in the current version of \CFA.
    320345The reason for the reversion was to have a uniform approach for tuple values/variables making tuples first-class types in \CFA, \ie allow tuples with corresponding tuple variables.
    321 This extension was possible, because in parallel with Schluntz's work, generic types were being added independently by Moss~\cite{Moss19}, and the tuple variables leveraged the same implementation techniques as the generic variables.
     346This extension was possible, because in parallel with Schluntz's work, generic types were added independently by Moss~\cite{Moss19}, and the tuple variables leveraged the same implementation techniques as the generic variables.
    322347\PAB{I'm not sure about the connection here. Do you have an example of what you mean?}
    323348
     
    339364\begin{cfa}
    340365void f( int, int );
    341 void f( [int, int] );
     366void f( @[@ int, int @]@ );
    342367f( 3, 4 );  // ambiguous call
    343368\end{cfa}
     
    358383the call to @f@ can be interpreted as @T = [1]@ and @U = [2, 3, 4, 5]@, or @T = [1, 2]@ and @U = [3, 4, 5]@, and so on.
    359384The restriction ensures type checking remains tractable and does not take too long to compute.
    360 Therefore, tuple types are never present in any fixed-argument function calls.
    361 
    362 Finally, a type-safe variadic argument signature was added by Robert Schluntz~\cite[\S~4.1.2]{Schluntz17} using @forall@ and a new tuple parameter-type, denoted by the keyword @ttype @ in Schluntz's implementation, but changed to the ellipsis syntax similar to \CC's template parameter pack.
     385Therefore, tuple types are never present in any fixed-argument function calls, because of the flattening.
     386
     387Finally, a type-safe variadic argument signature was added by Robert Schluntz~\cite[\S~4.1.2]{Schluntz17} using @forall@ and a new tuple parameter-type, denoted by the keyword @ttype@ in Schluntz's implementation, but changed to the ellipsis syntax similar to \CC's template parameter pack.
    363388For C variadics, \eg @va_list@, the number and types of the arguments must be conveyed in some way, \eg @printf@ uses a format string indicating the number and types of the arguments.
    364389\VRef[Figure]{f:CVariadicMaxFunction} shows an $N$ argument @maxd@ function using the C untyped @va_list@ interface.
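For contrast with the untyped @va_list@ interface, the \CC template parameter pack mentioned above lets the compiler track the number and types of the arguments. The sketch below is plain \CC, not the \CFA @ttype@ or ellipsis syntax, and the name @max_of@ is hypothetical.

```cpp
#include <cassert>

// Base case: a single argument is its own maximum.
template< typename T >
T max_of( T v ) { return v; }

// Recursive case: compare the first argument with the maximum of the rest.
// No count or format string is needed; the pack carries the arity and types.
template< typename T, typename... Rest >
T max_of( T first, Rest... rest ) {
    T m = max_of( rest... );          // peel one argument per instantiation
    return first > m ? first : m;
}
```

A call such as @max_of( 1.5, 4.0, 2.5 )@ is checked at compile time, so passing an argument that cannot be compared is rejected rather than misread at run time.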
     
    370395\begin{figure}
    371396\begin{cfa}
    372 double maxd( int @count@, ... ) {
     397double maxd( int @count@, @...@ ) { // ellipsis parameter
    373398    double max = 0;
    374399    va_list args;
     
    566591struct U u;  u.k;  u.l;
    567592\end{cfa}
    568 and the hoisted type names can clash with global types names.
     593and the hoisted type names can clash with global type names.
    569594For good reasons, \CC chose to change this semantics:
    570595\begin{cquote}
     
    584609\end{cfa}
    585610\CFA chose to adopt the \CC non-compatible change for nested types, since \CC's change has already forced certain coding changes in C libraries that must be parsed by \CC.
     611\CFA also added the ability to access from a variable through a type to a field.
     612\begin{cfa}
     613struct S s;  @s.T@.i;  @s.U@.k;
     614\end{cfa}
    586615
    587616% https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
     
    604633\end{cfa}
    605634Note, the position of the substructure is normally unimportant, unless there is some form of memory or @union@ overlay.
    606 Like the anonymous nested types, the aggregate field names are hoisted into @struct S@, so there is direct access, \eg @s.x@ and @s.i@.
    607 However, like the implicit C hoisting of nested structures, the field names must be unique and the type names are now at a different scope level, unlike type nesting in \CC.
    608 In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $\subset$ @S@ and @W@ $\subset$ @S@.
    609 For example:
     635Like an anonymous nested type, a named nested Plan-9 type has its field names hoisted into @struct S@, so there is direct access, \eg @s.x@ and @s.i@.
     636Hence, the field names must be unique, unlike \CC nested types, but the type names are at a nested scope level, unlike type nesting in C.
     637In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $\subset$ @S@ and @W@ $\subset$ @S@, \eg:
    610638\begin{cfa}
    611639void f( union U * u );
    612640void g( struct W * );
    613 union U * up;
    614 struct W * wp;
    615 struct S * sp;
    616 up = sp; $\C{// assign pointer to U in S}$
    617 wp = sp; $\C{// assign pointer to W in S}$
     641union U * up;   struct W * wp;   struct S * sp;
     642up = &s; $\C{// assign pointer to U in S}$
     643wp = &s; $\C{// assign pointer to W in S}$
    618644f( &s ); $\C{// pass pointer to U in S}$
    619645g( &s ); $\C{// pass pointer to W in S}$
    620646\end{cfa}
    621 
    622 \CFA extends the Plan-9 substructure by allowing polymorphism for values and pointers.
    623 The extended substructure is denoted using @inline@, allowing backwards compatibility to existing Plan-9 features.
     647Note, there is no value assignment, such as, @w = s@, to copy the @W@ field from @S@.
     648
     649Unfortunately, the Plan-9 designers did not look ahead to other useful features, specifically nested types.
     650This nested type compiles in \CC and \CFA.
     651\begin{cfa}
     652struct R {
     653        @struct T;@             $\C[2in]{// forward declaration, conflicts with Plan-9 syntax}$
     654        struct S {              $\C{// nested types, mutually recursive reference}\CRT$
     655                S * sp;   T * tp;  ...
     656        };
     657        struct T {
     658                S * sp;   T * tp;  ...
     659        };
     660};
     661\end{cfa}
     662Note, the syntax for the forward declaration conflicts with the Plan-9 declaration syntax.
     663
     664\CFA extends the Plan-9 substructure by allowing polymorphism for values and pointers, where the extended substructure is denoted using @inline@.
    624665\begin{cfa}
    625666struct S {
    626         @inline@ W;  $\C{// extended Plan-9 substructure}$
     667        @inline@ struct W;  $\C{// extended Plan-9 substructure}$
    627668        unsigned int tag;
    628669        @inline@ U;  $\C{// extended Plan-9 substructure}$
    629670} s;
    630671\end{cfa}
    631 Note, like \CC, \CFA allows optional prefixing of type names with their kind, \eg @struct@, @union@, and @enum@, unless there is ambiguity with variable names in the same scope.
    632 The following shows both value and pointer polymorphism.
     672Note, the declaration of @U@ is not prefixed with @union@.
     673Like \CC, \CFA allows optional prefixing of type names with their kind, \eg @struct@, @union@, and @enum@, unless there is ambiguity with variable names in the same scope.
     674In addition, a semi-non-compatible change is made so that Plan-9 syntax means a forward declaration in a nested type.
     675Since the Plan-9 extension is not part of C and rarely used, this change has minimal impact.
     676Hence, all Plan-9 semantics are denoted by the @inline@ qualifier, which is good ``eye-candy'' when reading a structure definition to spot Plan-9 definitions.
     677Finally, the following code shows the value and pointer polymorphism.
    633678\begin{cfa}
    634679void f( U, U * ); $\C{// value, pointer}$
    635680void g( W, W * ); $\C{// value, pointer}$
    636 U u, * up;
    637 S s, * sp;
    638 W w, * wp;
    639 u = s;  up = sp; $\C{// value, pointer}$
    640 w = s;  wp = sp; $\C{// value, pointer}$
     681U u, * up;   S s, * sp;   W w, * wp;
     682u = s;   up = sp; $\C{// value, pointer}$
     683w = s;   wp = sp; $\C{// value, pointer}$
    641684f( s, &s ); $\C{// value, pointer}$
    642685g( s, &s ); $\C{// value, pointer}$
     
    645688In general, non-standard C features (@gcc@) do not need any special treatment, as they are directly passed through to the C compiler.
    646689However, the Plan-9 semantics allow implicit conversions from the outer type to the inner type, which means the \CFA type resolver must take this information into account.
    647 Therefore, the \CFA translator must implement the Plan-9 features and insert necessary type conversions into the translated code output.
     690Therefore, the \CFA resolver must implement the Plan-9 features and insert necessary type conversions into the translated code output.
    648691In the current version of \CFA, this is the only kind of implicit type conversion other than the standard C conversions.
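The outer-to-inner conversion that the resolver must insert resembles the derived-to-base conversion in \CC. The following sketch uses \CC inheritance as a stand-in for an @inline@ field; it is an analogy under that assumption, not the code \CFA generates, and the function names are hypothetical.

```cpp
#include <cassert>

struct W { int i; };
struct S : W { unsigned int tag; };   // stand-in for "struct S { inline W; ... }"

int read_value( W w ) { return w.i; }          // value: the W part of S is copied
int read_pointer( W * wp ) { return wp->i; }   // pointer: S * converts to W *

// Hypothetical driver exercising both implicit conversions.
bool subtyping_demo() {
    S s;
    s.i = 42;  s.tag = 0;
    W w = s;                 // value conversion from outer to inner type
    W * wp = &s;             // pointer conversion from outer to inner type
    return w.i == 42 && wp->i == 42
        && read_value( s ) == 42 && read_pointer( &s ) == 42;
}
```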
    649692
    650 Since variable overloading is possible in \CFA, \CFA's implementation of Plan-9 polymorphism allows duplicate field names.
    651 When an outer field and an embedded field have the same name and type, the inner field is shadowed and cannot be accessed directly by name.
    652 While such definitions are allowed, duplicate field names is not good practice in general and should be avoided if possible.
    653 Plan-9 fields can be nested, and a struct definition can contain multiple Plan-9 embedded fields.
    654 In particular, the \newterm{diamond pattern}~\cite[\S~6.1]{Stroustrup89}\cite[\S~4]{Cargill91}  can occur and result in a nested field to be embedded twice.
     693Plan-9 polymorphism can result in duplicate field names.
     694For example, the \newterm{diamond pattern}~\cite[\S~6.1]{Stroustrup89}\cite[\S~4]{Cargill91} can result in nested fields being embedded twice.
    655695\begin{cfa}
    656696struct A { int x; };
     
    658698struct C { inline A; };
    659699struct D {
    660         inline B;
    661         inline C;
    662 };
    663 D d;
    664 \end{cfa}
    665 In the above example, the expression @d.x@ becomes ambiguous, since it can refer to the indirectly embedded field either from @B@ or from @C@.
    666 It is still possible to disambiguate the expression by first casting the outer struct to one of the directly embedded type, such as @((B)d).x@.
     700        inline B;  // B.x
     701        inline C;  // C.x
     702} d;
     703\end{cfa}
     704Because the @inline@ structures are flattened, the expression @d.x@ is ambiguous, as it can refer to the embedded field either from @B@ or @C@.
     705@gcc@ generates a syntax error about the duplicate member @x@.
     706The equivalent \CC definition compiles:
     707\begin{c++}
     708struct A { int x; };
     709struct B : public A {};
     710struct C : public A {};
     711struct D : @public B, C@ {  // multiple inheritance
     712} d;
     713\end{c++}
     714and again the expression @d.x@ is ambiguous.
     715While \CC has no direct syntax to disambiguate @x@, \ie @d.B.x@ or @d.C.x@, it is possible with casts, @((B)d).x@ or @((C)d).x@.
     716Like \CC, \CFA compiles the Plan-9 version and provides direct syntax and casts to disambiguate @x@.
     717While ambiguous definitions are allowed, duplicate field names are poor practice and should be avoided if possible.
     718However, when a programmer does not control all code, this problem can occur and a naming workaround should exist.
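The cast-based disambiguation for the \CC diamond can be sketched as follows. The function name @diamond_demo@ is hypothetical, and reference casts are used instead of the value casts in the text so that the fields can also be written.

```cpp
#include <cassert>

struct A { int x; };
struct B : public A {};
struct C : public A {};
struct D : public B, public C {};     // two distinct A subobjects

int diamond_demo() {
    D d;
    // d.x = 1;                       // rejected by the compiler: ambiguous
    static_cast<B &>( d ).x = 1;      // selects the A embedded through B
    static_cast<C &>( d ).x = 2;      // selects the A embedded through C
    return static_cast<B &>( d ).x * 10 + static_cast<C &>( d ).x;
}
```

The two writes land in different subobjects, which shows that the flattened @x@ fields are separate storage rather than aliases.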