# Changeset eb182b0

Timestamp: May 23, 2017, 9:55:37 AM
Branches: aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children: 27dde72
Parents: 547e9b7 (diff), 935315d (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message: Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Files: 15 edited, 7 moved

• ## doc/user/pointer2.fig

 r547e9b7 -2 1200 2 6 1125 2100 3525 2400 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5 1500 1950 1950 1950 1950 2250 1500 2250 1500 1950 1500 2100 1950 2100 1950 2400 1500 2400 1500 2100 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5 2700 2100 3150 2100 3150 2400 2700 2400 2700 2100 4 2 0 100 0 4 10 0.0000 2 120 270 1425 2400 104\001 4 2 0 100 0 4 10 0.0000 2 120 90 1425 2225 y\001 4 0 0 100 0 4 10 0.0000 2 120 165 2025 2300 int\001 4 2 0 100 0 4 10 0.0000 2 120 270 2625 2400 112\001 4 2 0 100 0 4 10 0.0000 2 150 180 2625 2225 p2\001 4 1 0 100 0 4 10 0.0000 2 120 90 1725 2300 3\001 4 0 0 100 0 4 10 0.0000 2 120 270 3225 2300 int *\001 4 1 0 100 0 4 10 0.0000 2 120 270 2925 2300 100\001 -6 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5 1500 1500 1950 1500 1950 1800 1500 1800 1500 1500 2 1 0 1 4 7 100 -1 -1 0.000 0 0 -1 1 0 2 1 1 1.00 45.00 90.00 2700 1800 1950 1950 2700 1800 1950 2100 2 1 0 1 4 7 50 -1 -1 0.000 0 0 -1 1 0 2 1 1 1.00 45.00 90.00 2700 1950 1950 1800 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5 2700 1950 3150 1950 3150 2250 2700 2250 2700 1950 2700 2100 1950 1800 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5 2700 1500 3150 1500 3150 1800 2700 1800 2700 1500 2 1 0 1 4 7 100 -1 -1 0.000 0 0 -1 1 0 2 1 1 1.00 45.00 90.00 3900 1800 3150 1950 3900 1800 3150 2100 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5 3900 1500 4350 1500 4350 1800 3900 1800 3900 1500 4 2 0 100 0 4 10 0.0000 2 120 270 1425 2250 104\001 4 2 0 100 0 4 10 0.0000 2 120 270 1425 1800 100\001 4 2 0 100 0 4 10 0.0000 2 90 90 1425 1625 x\001 4 2 0 100 0 4 10 0.0000 2 120 90 1425 2075 y\001 4 0 0 100 0 4 10 0.0000 2 120 165 2025 2150 int\001 4 0 0 100 0 4 10 0.0000 2 120 165 2025 1700 int\001 4 2 0 100 0 4 10 0.0000 2 120 270 2625 2250 112\001 4 2 0 100 0 4 10 0.0000 2 150 180 2625 2075 p2\001 4 2 0 100 0 4 10 0.0000 2 120 270 2625 1800 108\001 4 2 0 100 0 4 10 0.0000 2 150 180 2625 1625 p1\001 4 1 0 100 0 4 10 0.0000 2 120 90 1725 2150 3\001 4 1 0 100 0 4 10 0.0000 2 120 90 1725 1700 3\001 4 0 0 100 0 4 10 0.0000 2 120 270 3225 
2150 int *\001 4 0 0 100 0 4 10 0.0000 2 120 270 3225 1700 int *\001 4 2 0 100 0 4 10 0.0000 2 120 270 3825 1800 116\001 4 2 0 100 0 4 10 0.0000 2 150 180 3825 1625 p3\001 4 1 0 100 0 4 10 0.0000 2 120 270 2925 2150 100\001 4 1 0 100 0 4 10 0.0000 2 120 270 2925 1700 104\001 4 1 0 100 0 4 10 0.0000 2 120 270 4125 1700 112\001
• ## doc/user/user.tex

 r547e9b7 %% Created On       : Wed Apr  6 14:53:29 2016 %% Last Modified By : Peter A. Buhr %% Last Modified On : Mon May 15 18:29:58 2017 %% Update Count     : 1598 %% Last Modified On : Sun May 21 23:36:42 2017 %% Update Count     : 1822 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \author{ \huge \CFA Team \medskip \\ \Large Peter A. Buhr, Richard Bilson, Thierry Delisle, \smallskip \\ \Large Andrew Beach, Richard Bilson, Peter A. Buhr, Thierry Delisle, \smallskip \\ \Large Glen Ditchfield, Rodolfo G. Esteves, Aaron Moss, Rob Schluntz }% author As stated, the goal of the \CFA project is to engineer modern language features into C in an evolutionary rather than revolutionary way. \CC~\cite{c++,ANSI14:C++} is an example of a similar project; \CC~\cite{C++14,C++} is an example of a similar project; however, it largely extended the language, and did not address many existing problems.\footnote{% Two important existing problems addressed were changing the type of character literals from ©int© to ©char© and enumerator from ©int© to the type of its enumerators.} The new declarations place qualifiers to the left of the base type, while C declarations place qualifiers to the right of the base type. In the following example, \R{red} is for the base type and \B{blue} is for the qualifiers. The \CFA declarations move the qualifiers to the left of the base type, i.e., move the blue to the left of the red, while the qualifiers have the same meaning but are ordered left to right to specify a variable's type. The \CFA declarations move the qualifiers to the left of the base type, \ie move the blue to the left of the red, while the qualifiers have the same meaning but are ordered left to right to specify a variable's type. 
\begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{}} \end{quote2} Unsupported are K\&R C declarations where the base type defaults to ©int©, if no type is specified,\footnote{ At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each structure declaration and type name~\cite[\S~6.7.2(2)]{C11}} \eg: \begin{cfa} x;                                                              §\C{// int x}§ *y;                                                             §\C{// int *y}§ f( p1, p2 );                                    §\C{// int f( int p1, int p2 );}§ f( p1, p2 ) {}                                  §\C{// int f( int p1, int p2 ) {}}§ \end{cfa} The new declaration syntax can be used in other contexts where types are required, \eg casts and the pseudo-routine ©sizeof©: \begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{}} \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}}        & \multicolumn{1}{c}{\textbf{C}}        \\ \begin{cfa} y = (®* int®)x; i = sizeof(®[ 5 ] * int®); \end{cfa} & \begin{cfa} y = (®int *®)x; i = sizeof(®int *[ 5 ]®); \end{cfa} \end{tabular} \end{quote2} Finally, new \CFA declarations may appear together with C declarations in the same program block, but cannot be mixed within a specific declaration. \section{Pointer / Reference} \section{Pointer/Reference} C provides a \newterm{pointer type}; \CFA adds a \newterm{reference type}. Both types contain an \newterm{address}, which is normally a location in memory. Special addresses are used to denote certain states or access co-processor memory. By convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value or other special states. Often dereferencing a special state causes a \Index{memory fault}, so checking is necessary during execution. 
If the programming language assigns addresses, a program's execution is \Index{sound}, i.e., all addresses are to valid memory locations. C allows programmers to assign addresses, so there is the potential for incorrect addresses, both inside and outside of the computer address-space. Program variables are implicit pointers to memory locations generated by the compiler and automatically dereferenced, as in: These types may be derived from a object or routine type, called the \newterm{referenced type}. Objects of these types contain an \newterm{address}, which is normally a location in memory, but may also address memory-mapped registers in hardware devices. An integer constant expression with the value 0, or such an expression cast to type ©void *©, is called a \newterm{null-pointer constant}.\footnote{ One way to conceptualize the null pointer is that no variable is placed at this address, so the null-pointer address can be used to denote an uninitialized pointer/reference object; \ie the null pointer is guaranteed to compare unequal to a pointer to any object or routine.} An address is \newterm{sound}, if it points to a valid memory location in scope, \ie within the program's execution-environment and has not been freed. Dereferencing an \newterm{unsound} address, including the null pointer, is \Index{undefined}, often resulting in a \Index{memory fault}. A program \newterm{object} is a region of data storage in the execution environment, the contents of which can represent values. In most cases, objects are located in memory at an address, and the variable name for an object is an implicit address to the object generated by the compiler and automatically dereferenced, as in: \begin{quote2} \begin{tabular}{@{}lll@{}} \begin{tabular}{@{}ll@{\hspace{2em}}l@{}} \begin{cfa} int x; \end{quote2} where the right example is how the compiler logically interprets the variables in the left example. 
Since a variable name only points to one location during its lifetime, it is an \Index{immutable} \Index{pointer}; hence, variables ©x© and ©y© are constant pointers in the compiler interpretation. In general, variable addresses are stored in instructions instead of loaded independently, so an instruction fetch implicitly loads a variable's address. Since a variable name only points to one address during its lifetime, it is an \Index{immutable} \Index{pointer}; hence, the implicit type of pointer variables ©x© and ©y© are constant pointers in the compiler interpretation. In general, variable addresses are stored in instructions instead of loaded from memory, and hence may not occupy storage. These approaches are contrasted in the following: \begin{quote2} \begin{tabular}{@{}l|l@{}} \multicolumn{1}{c|}{explicit variable address} & \multicolumn{1}{c}{implicit variable address} \\ \hline \begin{cfa} lda             r1,100                  // load address of x ld              r2,(r1)                   // load value of x ld               r2,(r1)                  // load value of x lda             r3,104                  // load address of y st              r2,(r3)                   // store x into y st               r2,(r3)                  // store x into y \end{cfa} & \end{tabular} \end{quote2} Finally, the immutable nature of a variable's address and the fact that there is no storage for a variable address means pointer assignment\index{pointer!assignment}\index{assignment!pointer} is impossible. Therefore, the expression ©x = y© only has one meaning, ©*x = *y©, i.e., manipulate values, which is why explicitly writing the dereferences is unnecessary even though it occurs implicitly as part of instruction decoding. A \Index{pointer}/\Index{reference} is a generalization of a variable name, i.e., a mutable address that can point to more than one memory location during its lifetime. 
(Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime and may not occupy storage as the literal is embedded directly into instructions.) Finally, the immutable nature of a variable's address and the fact that there is no storage for the variable pointer means pointer assignment\index{pointer!assignment}\index{assignment!pointer} is impossible. Therefore, the expression ©x = y© has only one meaning, ©*x = *y©, \ie manipulate values, which is why explicitly writing the dereferences is unnecessary even though it occurs implicitly as part of \Index{instruction decoding}. A \Index{pointer}/\Index{reference} object is a generalization of an object variable-name, \ie a mutable address that can point to more than one memory location during its lifetime. (Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime, and like a variable name, may not occupy storage as the literal is embedded directly into instructions.) Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, \eg: \begin{quote2} \begin{tabular}{@{}ll@{}} \begin{tabular}{@{}l@{\hspace{2em}}l@{}} \begin{cfa} int x, y, ®*® p1, ®*® p2, ®**® p3; \end{cfa} & \raisebox{-0.45\totalheight}{\input{pointer2.pstex_t}} \raisebox{-0.5\totalheight}{\input{pointer2.pstex_t}} \end{tabular} \end{quote2} Notice, an address has a duality\index{address!duality}: a location in memory or the value at that location. In many cases, a compiler might be able to infer the meaning: Notice, an address has a \Index{duality}\index{address!duality}: a location in memory or the value at that location. In many cases, a compiler might be able to infer the best meaning for these two cases. 
For example, \Index*{Algol68}~\cite{Algol68} infers pointer dereferencing to select the best meaning for each pointer usage \begin{cfa} p2 = p1 + x;                                    §\C{// compiler infers *p2 = *p1 + x;}§ \end{cfa} because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation. \Index*{Algol68}~\cite{Algol68} inferences pointer dereferencing to select the best meaning for each pointer usage. However, in C, the following cases are ambiguous, especially with pointer arithmetic: \begin{cfa} p1 = p2;                                                §\C{// p1 = p2\ \ or\ \ *p1 = *p2}§ p1 = p1 + 1;                                    §\C{// p1 = p1 + 1\ \ or\ \ *p1 = *p1 + 1}§ \end{cfa} Most languages pick one meaning as the default and the programmer explicitly indicates the other meaning to resolve the address-duality ambiguity\index{address! ambiguity}. In C, the default meaning for pointers is to manipulate the pointer's address and the pointed-to value is explicitly accessed by the dereference operator ©*©. Algol68 infers the following deferencing ©*p2 = *p1 + x©, because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation. Unfortunately, automatic dereferencing does not work in all cases, and so some mechanism is necessary to fix incorrect choices. Rather than inferring dereference, most programming languages pick one implicit dereferencing semantics, and the programmer explicitly indicates the other to resolve address-duality. 
In C, objects of pointer type always manipulate the pointer object's address: \begin{cfa} p1 = p2;                                                §\C{// p1 = p2\ \ rather than\ \ *p1 = *p2}§ p2 = p1 + x;                                    §\C{// p2 = p1 + x\ \ rather than\ \ *p1 = *p1 + x}§ \end{cfa} even though the assignment to ©p2© is likely incorrect, and the programmer probably meant: \begin{cfa} p1 = p2;                                                §\C{// pointer address assignment}§ *p1 = *p1 + 1;                                  §\C{// pointed-to value assignment / operation}§ \end{cfa} which works well for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©). ®*®p2 = ®*®p1 + x;                              §\C{// pointed-to value assignment / operation}§ \end{cfa} The C semantics works well for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©). However, in most other situations, the pointed-to value is requested more often than the pointer address. \end{cfa} To switch the default meaning for an address requires a new kind of pointer, called a \newterm{reference} denoted by ©&©. To support this common case, a reference type is introduced in \CFA, denoted by ©&©, which is the opposite dereference semantics to a pointer type, making the value at the pointed-to location the implicit semantics for dereferencing (similar but not the same as \CC \Index{reference type}s). \begin{cfa} int x, y, ®&® r1, ®&® r2, ®&&® r3; Except for auto-dereferencing by the compiler, this reference example is the same as the previous pointer example. Hence, a reference behaves like the variable name for the current variable it is pointing-to. 
The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, \eg: \begin{cfa} r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15); \end{cfa} is rewritten as: One way to conceptualize a reference is via a rewrite rule, where the compiler inserts a dereference operator before the reference variable for each reference qualifier in a declaration, so the previous example becomes: \begin{cfa} ®*®r2 = ((®*®r1 + ®*®r2) ®*® (®**®r3 - ®*®r1)) / (®**®r3 - 15); \end{cfa} When a reference operation appears beside a dereference operation, \eg ©&*©, they cancel out.\footnote{ When a reference operation appears beside a dereference operation, \eg ©&*©, they cancel out. However, in C, the cancellation always yields a value (\Index{rvalue}).\footnote{ The unary ©&© operator yields the address of its operand. If the operand has type type'', the result has type pointer to type''. If the operand is the result of a unary ©*© operator, neither that operator nor the ©&© operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue.~\cite[\S~6.5.3.2--3]{C11}} Hence, assigning to a reference requires the address of the reference variable (\Index{lvalue}): \begin{cfa} (&®*®)r1 = &x;                                  §\C{// (\&*) cancel giving variable r1 not variable pointed-to by r1}§ For a \CFA reference type, the cancellation on the left-hand side of assignment leaves the reference as an address (\Index{lvalue}): \begin{cfa} (&®*®)r1 = &x;                                  §\C{// (\&*) cancel giving address of r1 not variable pointed-to by r1}§ \end{cfa} Similarly, the address of a reference can be obtained for assignment or computation (\Index{rvalue}): \begin{cfa} (&(&®*®)®*®)r3 = &(&®*®)r2;             §\C{// (\&*) cancel giving address of r2, (\&(\&*)*) cancel giving variable r3}§ \end{cfa} 
Cancellation\index{cancellation!pointer/reference}\index{pointer!cancellation} works to arbitrary depth, and pointer and reference values are interchangeable because both contain addresses. (&(&®*®)®*®)r3 = &(&®*®)r2;             §\C{// (\&*) cancel giving address of r2, (\&(\&*)*) cancel giving address of r3}§ \end{cfa} Cancellation\index{cancellation!pointer/reference}\index{pointer!cancellation} works to arbitrary depth. Fundamentally, pointer and reference objects are functionally interchangeable because both contain addresses. \begin{cfa} int x, *p1 = &x, **p2 = &p1, ***p3 = &p2, &&&r3 = p3;                                             §\C{// change r3 to p3, (\&(\&(\&*)*)*)r3, 3 cancellations}§ \end{cfa} Finally, implicit dereferencing and cancellation are a static (compilation) phenomenon not a dynamic one. That is, all implicit dereferencing and any cancellation is carried out prior to the start of the program, so reference performance is equivalent to pointer performance. A programmer selects a pointer or reference type solely on whether the address is dereferenced frequently or infrequently, which dictates the amount of direct aid from the compiler; otherwise, everything else is equal. Interestingly, \Index*[C++]{\CC} deals with the address duality by making the pointed-to value the default, and prevent\-ing changes to the reference address, which eliminates half of the duality. \Index*{Java} deals with the address duality by making address assignment the default and requiring field assignment (direct or indirect via methods), i.e., there is no builtin bit-wise or method-wise assignment, which eliminates half of the duality. As for a pointer, a reference may have qualifiers: Furthermore, both types are equally performant, as the same amount of dereferencing occurs for both types. 
Therefore, the choice between them is based solely on whether the address is dereferenced frequently or infrequently, which dictates the amount of implicit dereferencing aid from the compiler. As for a pointer type, a reference type may have qualifiers: \begin{cfa} const int cx = 5;                               §\C{// cannot change cx;}§ ®&®cr = &cx;                                    §\C{// can change cr}§ cr = 7;                                                 §\C{// error, cannot change cx}§ int & const rc = x;                             §\C{// must be initialized, \CC reference}§ int & const rc = x;                             §\C{// must be initialized}§ ®&®rc = &x;                                             §\C{// error, cannot change rc}§ const int & const crc = cx;             §\C{// must be initialized, \CC reference}§ const int & const crc = cx;             §\C{// must be initialized}§ crc = 7;                                                §\C{// error, cannot change cx}§ ®&®crc = &cx;                                   §\C{// error, cannot change crc}§ \end{cfa} Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, \eg: \begin{cfa} int & const r = *0;                             §\C{// where 0 is the int * zero}§ \end{cfa} Otherwise, the compiler is managing the addresses for type ©& const© not the programmer, and by a programming discipline of only using references with references, address errors can be prevented. 
Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be the null pointer unless an arbitrary pointer is coerced into the reference}: \begin{cfa} int & const cr = *0;                    §\C{// where 0 is the int * zero}§ \end{cfa} Note, constant reference-types do not prevent addressing errors because of explicit storage-management: \begin{cfa} int & const cr = *malloc(); cr = 5; delete &cr; cr = 7;                                                 §\C{// unsound pointer dereference}§ \end{cfa} Finally, the position of the ©const© qualifier \emph{after} the pointer/reference qualifier causes confuse for C programmers. The ©const© qualifier cannot be moved before the pointer/reference qualifier for C style-declarations; where the \CFA declaration is read left-to-right (see \VRef{s:Declarations}). In contrast to \CFA reference types, \Index*[C++]{\CC{}}'s reference types are all ©const© references, preventing changes to the reference address, so only value assignment is possible, which eliminates half of the \Index{address duality}. \Index*{Java}'s reference types to objects (all Java objects are on the heap) are like C pointers, which always manipulate the address, and there is no (bit-wise) object assignment, so objects are explicitly cloned by shallow or deep copying, which eliminates half of the address duality. \Index{Initialization} is different than \Index{assignment} because initialization occurs on the empty (uninitialized) storage on an object, while assignment occurs on possibly initialized storage of an object. There are three initialization contexts in \CFA: declaration initialization, argument/parameter binding, return/temporary binding. For reference initialization (like pointer), the initializing value must be an address (\Index{lvalue}) not a value (\Index{rvalue}). 
\begin{cfa} int * p = &x;                                   §\C{// both \&x and x are possible interpretations}§ int & r = x;                                    §\C{// x unlikely interpretation, because of auto-dereferencing}§ \end{cfa} Hence, the compiler implicitly inserts a reference operator, ©&©, before the initialization expression. Similarly, when a reference is used for a parameter/return type, the call-site argument does not require a reference operator. \begin{cfa} int & f( int & rp );                    §\C{// reference parameter and return}§ Because the object being initialized has no value, there is only one meaningful semantics with respect to address duality: it must mean address as there is no pointed-to value. In contrast, the left-hand side of assignment has an address that has a duality. Therefore, for pointer/reference initialization, the initializing value must be an address (\Index{lvalue}) not a value (\Index{rvalue}). \begin{cfa} int * p = &x;                           §\C{// must have address of x}§ int & r = x;                            §\C{// must have address of x}§ \end{cfa} Therefore, it is superfluous to require explicitly taking the address of the initialization object, even though the type is incorrect. Hence, \CFA allows ©r© to be assigned ©x© because it infers a reference for ©x©, by implicitly inserting a address-of operator, ©&©, and it is an error to put an ©&© because the types no longer match. Unfortunately, C allows ©p© to be assigned with ©&x© or ©x©, by value, but most compilers warn about the latter assignment as being potentially incorrect. (\CFA extends pointer initialization so a variable name is automatically referenced, eliminating the unsafe assignment.) Similarly, when a reference type is used for a parameter/return type, the call-site argument does not require a reference operator for the same reason. 
\begin{cfa} int & f( int & r );                             §\C{// reference parameter and return}§ z = f( x ) + f( y );                    §\C{// reference operator added, temporaries needed for call results}§ \end{cfa} Within routine ©f©, it is possible to change the argument by changing the corresponding parameter, and parameter ©rp© can be locally reassigned within ©f©. Since ©?+?© takes its arguments by value, the references returned from ©f© are used to initialize compiler generated temporaries with value semantics that copy from the references. Within routine ©f©, it is possible to change the argument by changing the corresponding parameter, and parameter ©r© can be locally reassigned within ©f©. Since operator routine ©?+?© takes its arguments by value, the references returned from ©f© are used to initialize compiler generated temporaries with value semantics that copy from the references. \begin{cfa} int temp1 = f( x ), temp2 = f( y ); z = temp1 + temp2; \end{cfa} This implicit referencing is crucial for reducing the syntactic burden for programmers when using references; otherwise references have the same syntactic  burden as pointers in these contexts. When a pointer/reference parameter has a ©const© value (immutable), it is possible to pass literals and expressions. \begin{cfa} void f( ®const® int & crp ); void g( ®const® int * cpp ); void f( ®const® int & cr ); void g( ®const® int * cp ); f( 3 );                   g( &3 ); f( x + y );             g( &(x + y) ); \end{cfa} Here, the compiler passes the address to the literal 3 or the temporary for the expression ©x + y©, knowing the argument cannot be changed through the parameter. (The ©&© is necessary for the pointer parameter to make the types match, and is a common requirement for a C programmer.) 
\CFA \emph{extends} this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed-to by the reference parameter and can be changed. \begin{cfa} void f( int & rp ); void g( int * pp ); (The ©&© is necessary for the pointer-type parameter to make the types match, and is a common requirement for a C programmer.) \CFA \emph{extends} this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed-to by the reference parameter and can be changed.\footnote{ If whole program analysis is possible, and shows the parameter is not assigned, \ie it is ©const©, the temporary is unnecessary.} \begin{cfa} void f( int & r ); void g( int * p ); f( 3 );                   g( &3 );              §\C{// compiler implicit generates temporaries}§ f( x + y );             g( &(x + y) );  §\C{// compiler implicit generates temporaries}§ The implicit conversion allows seamless calls to any routine without having to explicitly name/copy the literal/expression to allow the call. While \CFA attempts to handle pointers and references in a uniform, symmetric manner, C handles routine variables in an inconsistent way: a routine variable is both a pointer and a reference (particle and wave). \begin{cfa} void f( int p ) {...} void (*fp)( int ) = &f;                 §\C{// pointer initialization}§ void (*fp)( int ) = f;                  §\C{// reference initialization}§ %\CFA attempts to handle pointers and references in a uniform, symmetric manner. However, C handles routine objects in an inconsistent way. A routine object is both a pointer and a reference (particle and wave). 
\begin{cfa} void f( int i ); void (*fp)( int ); fp = f;                                                 §\C{// reference initialization}§ fp = &f;                                                §\C{// pointer initialization}§ fp = *f;                                                §\C{// reference initialization}§ fp(3);                                                  §\C{// reference invocation}§ (*fp)(3);                                               §\C{// pointer invocation}§ fp(3);                                                  §\C{// reference invocation}§ \end{cfa} A routine variable is best described by a ©const© reference: \begin{cfa} const void (&fp)( int ) = f; fp( 3 ); fp = ...                                                §\C{// error, cannot change code}§ &fp = ...;                                              §\C{// changing routine reference}§ \end{cfa} because the value of the routine variable is a routine literal, i.e., the routine code is normally immutable during execution.\footnote{ \end{cfa} A routine object is best described by a ©const© reference: \begin{cfa} const void (&fr)( int ) = f; fr = ...                                                §\C{// error, cannot change code}§ &fr = ...;                                              §\C{// changing routine reference}§ fr( 3 );                                                §\C{// reference call to f}§ (*fr)(3);                                               §\C{// error, incorrect type}§ \end{cfa} because the value of the routine object is a routine literal, \ie the routine code is normally immutable during execution.\footnote{ Dynamic code rewriting is possible but only in special circumstances.} \CFA allows this additional use of references for routine variables in an attempt to give a more consistent meaning for them. 
\section{Type Operators} The new declaration syntax can be used in other contexts where types are required, \eg casts and the pseudo-routine ©sizeof©: \begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{}} \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}}        & \multicolumn{1}{c}{\textbf{C}}        \\ \begin{cfa} y = (®* int®)x; i = sizeof(®[ 5 ] * int®); \end{cfa} & \begin{cfa} y = (®int *®)x; i = sizeof(®int *[ 5 ]®); \end{cfa} \end{tabular} \end{quote2} \CFA allows this additional use of references for routine objects in an attempt to give a more consistent meaning for them. This situation is different from inferring with reference type being used ... \begin{comment} \section{References} By introducing references in parameter types, users are given an easy way to pass a value by reference, without the need for NULL pointer checks. In structures, a reference can replace a pointer to an object that should always have a valid value. When a structure contains a reference, all of its constructors must initialize the reference and all instances of this structure must initialize it upon definition. The syntax for using references in \CFA is the same as \CC with the exception of reference initialization. Use ©&© to specify a reference, and access references just like regular objects, not like pointers (use dot notation to access fields). When initializing a reference, \CFA uses a different syntax which differentiates reference initialization from assignment to a reference. The ©&© is used on both sides of the expression to clarify that the address of the reference is being set to the address of the variable to which it refers. From: Richard Bilson Date: Wed, 13 Jul 2016 01:58:58 +0000 Subject: Re: pointers / references To: "Peter A. Buhr" As a general comment I would say that I found the section confusing, as you move back and forth between various real and imagined programming languages. 
If it were me I would rewrite into two subsections, one that specifies precisely the syntax and semantics of reference variables and another that provides the rationale. I don't see any obvious problems with the syntax or semantics so far as I understand them. It's not obvious that the description you're giving is complete, but I'm sure you'll find the special cases as you do the implementation. My big gripes are mostly that you're not being as precise as you need to be in your terminology, and that you say a few things that aren't actually true even though I generally know what you mean. 20 C provides a pointer type; CFA adds a reference type. Both types contain an address, which is normally a 21 location in memory. An address is not a location in memory; an address refers to a location in memory. Furthermore it seems weird to me to say that a type "contains" an address; rather, objects of that type do. 21 Special addresses are used to denote certain states or access co-processor memory. By 22 convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value 23 or other special states. This isn't standard C at all. There has to be one null pointer representation, but it doesn't have to be a literal zero representation and there doesn't have to be more than one such representation. 23 Often dereferencing a special state causes a memory fault, so checking is necessary 24 during execution. I don't see the connection between the two clauses here. I feel like if a bad pointer will not cause a memory fault then I need to do more checking, not less. 24 If the programming language assigns addresses, a program's execution is sound, \ie all 25 addresses are to valid memory locations. You haven't said what it means to "assign" an address, but if I use my intuitive understanding of the term I don't see how this can be true unless you're assuming automatic storage management. 
1 Program variables are implicit pointers to memory locations generated by the compiler and automatically 2 dereferenced, as in: There is no reason why a variable needs to have a location in memory, and indeed in a typical program many variables will not. In standard terminology an object identifier refers to data in the execution environment, but not necessarily in memory. 13 A pointer/reference is a generalization of a variable name, \ie a mutable address that can point to more 14 than one memory location during its lifetime. I feel like you're off the reservation here. In my world there are objects of pointer type, which seem to be what you're describing here, but also pointer values, which can be stored in an object of pointer type but don't necessarily have to be. For example, how would you describe the value denoted by "&main" in a C program? I would call it a (function) pointer, but that doesn't satisfy your definition. 16 not occupy storage as the literal is embedded directly into instructions.) Hence, a pointer occupies memory 17 to store its current address, and the pointer's value is loaded by dereferencing, e.g.: As with my general objection regarding your definition of variables, there is no reason why a pointer variable (object of pointer type) needs to occupy memory. 21 p2 = p1 + x; // compiler infers *p2 = *p1 + x; What language are we in now? 24 pointer usage. However, in C, the following cases are ambiguous, especially with pointer arithmetic: 25 p1 = p2; // p1 = p2 or *p1 = *p2 This isn't ambiguous. it's defined to be the first option. 26 p1 = p1 + 1; // p1 = p1 + 1 or *p1 = *p1 + 1 Again, this statement is not ambiguous. 13 example. Hence, a reference behaves like the variable name for the current variable it is pointing-to. 
The 14 simplest way to understand a reference is to imagine the compiler inserting a dereference operator before 15 the reference variable for each reference qualifier in a declaration, e.g.: It's hard for me to understand who the audience for this part is. I think a practical programmer is likely to be satisfied with "a reference behaves like the variable name for the current variable it is pointing-to," maybe with some examples. Your "simplest way" doesn't strike me as simpler than that. It feels like you're trying to provide a more precise definition for the semantics of references, but it isn't actually precise enough to be a formal specification. If you want to express the semantics of references using rewrite rules that's a great way to do it, but lay the rules out clearly, and when you're showing an example of rewriting keep your references/pointers/values separate (right now, you use \eg "r3" to mean a reference, a pointer, and a value). 24 Cancellation works to arbitrary depth, and pointer and reference values are interchangeable because both 25 contain addresses. Except they're not interchangeable, because they have different and incompatible types. 40 Interestingly, C++ deals with the address duality by making the pointed-to value the default, and prevent- 41 ing changes to the reference address, which eliminates half of the duality. Java deals with the address duality 42 by making address assignment the default and requiring field assignment (direct or indirect via methods), 43 \ie there is no builtin bit-wise or method-wise assignment, which eliminates half of the duality. I can follow this but I think that's mostly because I already understand what you're trying to say. I don't think I've ever heard the term "method-wise assignment" and I don't see you defining it. Furthermore Java does have value assignment of basic (non-class) types, so your summary here feels incomplete. (If it were me I'd drop this paragraph rather than try to save it.) 
11 Hence, for type & const, there is no pointer assignment, so &rc = &x is disallowed, and the address value 12 cannot be 0 unless an arbitrary pointer is assigned to the reference. Given the pains you've taken to motivate every little bit of the semantics up until now, this last clause ("the address value cannot be 0") comes out of the blue. It seems like you could have perfectly reasonable semantics that allowed the initialization of null references. 12 In effect, the compiler is managing the 13 addresses for type & const not the programmer, and by a programming discipline of only using references 14 with references, address errors can be prevented. Again, is this assuming automatic storage management? 18 rary binding. For reference initialization (like pointer), the initializing value must be an address (lvalue) not 19 a value (rvalue). This sentence appears to suggest that an address and an lvalue are the same thing. 20 int * p = &x; // both &x and x are possible interpretations Are you saying that we should be considering "x" as a possible interpretation of the initializer "&x"? It seems to me that this expression has only one legitimate interpretation in context. 21 int & r = x; // x unlikely interpretation, because of auto-dereferencing You mean, we can initialize a reference using an integer value? Surely we would need some sort of cast to induce that interpretation, no? 22 Hence, the compiler implicitly inserts a reference operator, &, before the initialization expression. But then the expression would have pointer type, which wouldn't be compatible with the type of r. 22 Similarly, 23 when a reference is used for a parameter/return type, the call-site argument does not require a reference 24 operator. Furthermore, it would not be correct to use a reference operator. 45 The implicit conversion allows 1 seamless calls to any routine without having to explicitly name/copy the literal/expression to allow the call. 
2 While C' attempts to handle pointers and references in a uniform, symmetric manner, C handles routine 3 variables in an inconsistent way: a routine variable is both a pointer and a reference (particle and wave). After all this talk of how expressions can have both pointer and value interpretations, you're disparaging C because it has expressions that have both pointer and value interpretations? On Sat, Jul 9, 2016 at 4:18 PM Peter A. Buhr wrote: > Aaron discovered a few places where "&"s are missing and where there are too many "&", which are > corrected in the attached updated. None of the text has changed, if you have started reading > already. \end{comment} int main() { * [int](int) fp = foo();        §\C{// int (*fp)(int)}§ sout | fp( 3 ) | endl; sout | fp( 3 ) | endl; } \end{cfa} because Currently, there are no \Index{lambda} expressions, i.e., unnamed routines because routine names are very important to properly select the correct routine. \section{Lexical List} Currently, there are no \Index{lambda} expressions, \ie unnamed routines because routine names are very important to properly select the correct routine. \section{Tuples} In C and \CFA, lists of elements appear in several contexts, such as the parameter list for a routine call. [ v+w, x*y, 3.14159, f() ] \end{cfa} Tuples are permitted to contain sub-tuples (i.e., nesting), such as ©[ [ 14, 21 ], 9 ]©, which is a 2-element tuple whose first element is itself a tuple. Tuples are permitted to contain sub-tuples (\ie nesting), such as ©[ [ 14, 21 ], 9 ]©, which is a 2-element tuple whose first element is itself a tuple. Note, a tuple is not a record (structure); a record denotes a single value with substructure, whereas a tuple is multiple values with no substructure (see flattening coercion in Section 12.1). tuple does not have structure like a record; a tuple is simply converted into a list of components. 
\begin{rationale} The present implementation of \CFA does not support nested routine calls when the inner routine returns multiple values; i.e., a statement such as ©g( f() )© is not supported. The present implementation of \CFA does not support nested routine calls when the inner routine returns multiple values; \ie a statement such as ©g( f() )© is not supported. Using a temporary variable to store the  results of the inner routine and then passing this variable to the outer routine works, however. \end{rationale} This requirement is the same as for comma expressions in argument lists. Type qualifiers, i.e., const and volatile, may modify a tuple type. The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e., the qualifier is distributed across all of the types in the tuple, \eg: Type qualifiers, \ie const and volatile, may modify a tuple type. The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], \ie the qualifier is distributed across all of the types in the tuple, \eg: \begin{cfa} const volatile [ int, float, const int ] x; ©w© is implicitly opened to yield a tuple of four values, which are then assigned individually. A \newterm{flattening coercion} coerces a nested tuple, i.e., a tuple with one or more components, which are themselves tuples, into a flattened tuple, which is a tuple whose components are not tuples, as in: A \newterm{flattening coercion} coerces a nested tuple, \ie a tuple with one or more components, which are themselves tuples, into a flattened tuple, which is a tuple whose components are not tuples, as in: \begin{cfa} [ a, b, c, d ] = [ 1, [ 2, 3 ], 4 ]; \end{cfa} \index{lvalue} The left-hand side is a tuple of \emph{lvalues}, which is a list of expressions each yielding an address, i.e., any data object that can appear on the left-hand side of a conventional assignment statement. 
The left-hand side is a tuple of \emph{lvalues}, which is a list of expressions each yielding an address, \ie any data object that can appear on the left-hand side of a conventional assignment statement. ©$\emph{expr}$© is any standard arithmetic expression. Clearly, the types of the entities being assigned must be type compatible with the value of the expression. [ x1, y1 ] = z = 0; \end{cfa} As in C, the rightmost assignment is performed first, i.e., assignment parses right to left. As in C, the rightmost assignment is performed first, \ie assignment parses right to left. \section{Labelled Continue / Break} \section{Labelled Continue/Break} While C provides ©continue© and ©break© statements for altering control flow, both are restricted to one level of nesting for a particular control structure. With ©goto©, the label is at the end of the control structure, which fails to convey this important clue early enough to the reader. Finally, using an explicit target for the transfer instead of an implicit target allows new constructs to be added or removed without affecting existing constructs. The implicit targets of the current ©continue© and ©break©, i.e., the closest enclosing loop or ©switch©, change as certain constructs are added or removed. The implicit targets of the current ©continue© and ©break©, \ie the closest enclosing loop or ©switch©, change as certain constructs are added or removed. Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©, both of which are problematic. As well, the declaration of ©z© cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized. The key observation is that the ©switch© statement branches into control structure, i.e., there are multiple entry points into its statement body. 
The key observation is that the ©switch© statement branches into control structure, \ie there are multiple entry points into its statement body. \end{enumerate} the number of ©switch© statements is small, \item most ©switch© statements are well formed (i.e., no \Index*{Duff's device}), most ©switch© statements are well formed (\ie no \Index*{Duff's device}), \item the ©default© clause is usually written as the last case-clause, \item Eliminating default fall-through has the greatest potential for affecting existing code. However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e., a list of ©case© clauses executing common code, \eg: However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, \ie a list of ©case© clauses executing common code, \eg: \begin{cfa} case 1:  case 2:  case 3: ... ®int j = 0;®                            §\C{// disallowed}§ case 1: { { ®int k = 0;®                    §\C{// allowed at different nesting levels}§ ... The following \CC-style \Index{manipulator}s allow control over implicit separation. Manipulators \Indexc{sepOn}\index{manipulator!sepOn@©sepOn©} and \Indexc{sepOff}\index{manipulator!sepOff@©sepOff©} \emph{locally} toggle printing the separator, i.e., the separator is adjusted only with respect to the next printed item. Manipulators \Indexc{sepOn}\index{manipulator!sepOn@©sepOn©} and \Indexc{sepOff}\index{manipulator!sepOff@©sepOff©} \emph{locally} toggle printing the separator, \ie the separator is adjusted only with respect to the next printed item. 
\begin{cfa}[mathescape=off,belowskip=0pt] sout | sepOn | 1 | 2 | 3 | sepOn | endl;        §\C{// separator at start of line}§ 12 3 \end{cfa} Manipulators \Indexc{sepDisable}\index{manipulator!sepDisable@©sepDisable©} and \Indexc{sepEnable}\index{manipulator!sepEnable@©sepEnable©} \emph{globally} toggle printing the separator, i.e., the separator is adjusted with respect to all subsequent printed items, unless locally adjusted. Manipulators \Indexc{sepDisable}\index{manipulator!sepDisable@©sepDisable©} and \Indexc{sepEnable}\index{manipulator!sepEnable@©sepEnable©} \emph{globally} toggle printing the separator, \ie the separator is adjusted with respect to all subsequent printed items, unless locally adjusted. \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] sout | sepDisable | 1 | 2 | 3 | endl;           §\C{// globally turn off implicit separation}§ \caption{Constructors and Destructors} \end{figure} \begin{comment} \section{References} By introducing references in parameter types, users are given an easy way to pass a value by reference, without the need for NULL pointer checks. In structures, a reference can replace a pointer to an object that should always have a valid value. When a structure contains a reference, all of its constructors must initialize the reference and all instances of this structure must initialize it upon definition. The syntax for using references in \CFA is the same as \CC with the exception of reference initialization. Use ©&© to specify a reference, and access references just like regular objects, not like pointers (use dot notation to access fields). When initializing a reference, \CFA uses a different syntax which differentiates reference initialization from assignment to a reference. The ©&© is used on both sides of the expression to clarify that the address of the reference is being set to the address of the variable to which it refers. 
\end{comment} \section{Syntactic Anomalies} There are several ambiguous cases with operator identifiers, \eg ©int *?*?()©, where the string ©*?*?© can be lexed as ©*©~\R{/}~©?*?© or ©*?©~\R{/}~©*?©. Since it is common practice to put a unary operator juxtaposed to an identifier, \eg ©*i©, users will be annoyed if they cannot do this with respect to operator identifiers. Even with this special hack, there are 5 general cases that cannot be handled. The first case is for the function-call identifier ©?()©: \begin{cfa} int *§\textvisiblespace§?()();  // declaration: space required after '*' *§\textvisiblespace§?()();              // expression: space required after '*' \end{cfa} Without the space, the string ©*?()© is ambiguous without N character look ahead; it requires scanning ahead to determine if there is a ©'('©, which is the start of an argument/parameter list. The 4 remaining cases occur in expressions: \begin{cfa} i++§\textvisiblespace§?i:0;             // space required before '?' i--§\textvisiblespace§?i:0;             // space required before '?' i§\textvisiblespace§?++i:0;             // space required after '?' i§\textvisiblespace§?--i:0;             // space required after '?' \end{cfa} In the first two cases, the string ©i++?© is ambiguous, where this string can be lexed as ©i© / ©++?© or ©i++© / ©?©; it requires scanning ahead to determine if there is a ©'('©, which is the start of an argument list. In the second two cases, the string ©?++x© is ambiguous, where this string can be lexed as ©?++© / ©x© or ©?© / ©++x©; it requires scanning ahead to determine if there is a ©'('©, which is the start of an argument list. \section{Incompatible} The following incompatibles exist between \CFA and C, and are similar to Annex C for \CC~\cite{ANSI14:C++}. \begin{enumerate} \item \begin{description} \item[Change:] add new keywords \\ New keywords are added to \CFA (see~\VRef{s:NewKeywords}). \item[Rationale:] keywords added to implement new semantics of \CFA. 
\item[Effect on original feature:] change to semantics of well-defined feature. \\ Any ISO C programs using these keywords as identifiers are invalid \CFA programs. \item[Difficulty of converting:] keyword clashes are accommodated by syntactic transformations using the \CFA backquote escape-mechanism (see~\VRef{s:BackquoteIdentifiers}): \item[How widely used:] clashes among new \CFA keywords and existing identifiers are rare. \end{description} \item \begin{description} \item[Change:] type of character literal changes from ©int© to ©char© to allow more intuitive overloading: \begin{cfa} int rtn( int i ); int rtn( char c ); rtn( 'x' );                                             §\C{// programmer expects 2nd rtn to be called}§ \end{cfa} \item[Rationale:] it is more intuitive for the call to ©rtn© to match the second version of definition of ©rtn© rather than the first. In particular, output of a ©char© variable now prints a character rather than the decimal ASCII value of the character. \begin{cfa} sout | 'x' | " " | (int)'x' | endl; x 120 \end{cfa} Having to cast ©'x'© to ©char© is non-intuitive. \item[Effect on original feature:] change to semantics of well-defined feature; code that depends on: \begin{cfa} sizeof( 'x' ) == sizeof( int ) \end{cfa} no longer works the same in \CFA programs. \item[Difficulty of converting:] simple \item[How widely used:] programs that depend upon ©sizeof( 'x' )© are rare and can be changed to ©sizeof(char)©. \end{description} \item \begin{description} \item[Change:] make string literals ©const©: \begin{cfa} char * p = "abc";                               §\C{// valid in C, deprecated in \CFA}§ char * q = expr ? "abc" : "de"; §\C{// valid in C, invalid in \CFA}§ \end{cfa} The type of a string literal is changed from ©[] char© to ©const [] char©. Similarly, the type of a wide string literal is changed from ©[] wchar_t© to ©const [] wchar_t©. 
\item[Rationale:] This change is a safety issue: \begin{cfa} char * p = "abc"; p[0] = 'w';                                             §\C{// segment fault or change constant literal}§ \end{cfa} The same problem occurs when passing a string literal to a routine that changes its argument. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] simple syntactic transformation, because string literals can be converted to ©char *©. \item[How widely used:] programs that have a legitimate reason to treat string literals as pointers to potentially modifiable memory are rare. \end{description} \item \begin{description} \item[Change:] remove \newterm{tentative definitions}, which only occurs at file scope: \begin{cfa} int i;                                                  §\C{// forward definition}§ int *j = ®&i®;                                  §\C{// forward reference, valid in C, invalid in \CFA}§ int i = 0;                                              §\C{// definition}§ \end{cfa} is valid in C, and invalid in \CFA because duplicate overloaded object definitions at the same scope level are disallowed. This change makes it impossible to define mutually referential file-local static objects, if initializers are restricted to the syntactic forms of C. For example, \begin{cfa} struct X { int i; struct X *next; }; static struct X a;                              §\C{// forward definition}§ static struct X b = { 0, ®&a® };        §\C{// forward reference, valid in C, invalid in \CFA}§ static struct X a = { 1, &b };  §\C{// definition}§ \end{cfa} \item[Rationale:] avoids having different initialization rules for builtin types and userdefined types. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] the initializer for one of a set of mutually-referential file-local static objects must invoke a routine call to achieve the initialization. 
\item[How widely used:] seldom \end{description} \item \begin{description} \item[Change:] have ©struct© introduce a scope for nested types: \begin{cfa} enum ®Colour® { R, G, B, Y, C, M }; struct Person { enum ®Colour® { R, G, B };      §\C{// nested type}§ struct Face {                           §\C{// nested type}§ ®Colour® Eyes, Hair;    §\C{// type defined outside (1 level)}§ }; ß.ß®Colour® shirt;                      §\C{// type defined outside (top level)}§ ®Colour® pants;                         §\C{// type defined same level}§ Face looks[10];                         §\C{// type defined same level}§ }; ®Colour® c = R;                                 §\C{// type/enum defined same level}§ Personß.ß®Colour® pc = Personß.ßR;      §\C{// type/enum defined inside}§ Personß.ßFace pretty;                   §\C{// type defined inside}§ \end{cfa} In C, the name of the nested types belongs to the same scope as the name of the outermost enclosing structure, i.e., the nested types are hoisted to the scope of the outer-most type, which is confusing and not useful. \CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC}. Nested types are not hoisted and can be referenced using the field selection operator ``©.©'', unlike the \CC scope-resolution operator ``©::©''. \item[Rationale:] ©struct© scope is crucial to \CFA as an information structuring and hiding mechanism. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] Semantic transformation. \item[How widely used:] C programs rarely have nested types because they are equivalent to the hoisted version. \end{description} \item \begin{description} \item[Change:] In C++, the name of a nested class is local to its enclosing class. \item[Rationale:] C++ classes have member functions which require that classes establish scopes. \item[Difficulty of converting:] Semantic transformation. 
To make the struct type name visible in the scope of the enclosing struct, the struct tag could be declared in the scope of the enclosing struct, before the enclosing struct is defined. Example: \begin{cfa} struct Y;                                               §\C{// struct Y and struct X are at the same scope}§ struct X { struct Y { /* ... */ } y; }; \end{cfa} All the definitions of C struct types enclosed in other struct definitions and accessed outside the scope of the enclosing struct could be exported to the scope of the enclosing struct. Note: this is a consequence of the difference in scope rules, which is documented in 3.3. \item[How widely used:] Seldom. \end{description} \item \begin{description} \item[Change:] comma expression is disallowed as subscript \item[Rationale:] safety issue to prevent subscripting error for multidimensional arrays: ©x[i,j]© instead of ©x[i][j]©, and this syntactic form then taken by \CFA for new style arrays. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] semantic transformation of ©x[i,j]© to ©x[(i,j)]© \item[How widely used:] seldom. \end{description} \end{enumerate} \section{Syntax Ambiguities} C has a number of syntax ambiguities, which are resolved by taking the longest sequence of overlapping characters that constitute a token. For example, the program fragment ©x+++++y© is parsed as \lstinline[showspaces=true]@x ++ ++ + y@ because operator tokens ©++© and ©+© overlap. Unfortunately, the longest sequence violates a constraint on increment operators, even though the parse \lstinline[showspaces=true]@x ++ + ++ y@ might yield a correct expression. Hence, C programmers are aware that spaces have to be added to disambiguate certain syntactic cases. In \CFA, there are ambiguous cases with dereference and operator identifiers, \eg ©int *?*?()©, where the string ©*?*?© can be interpreted as: \begin{cfa} *?§\color{red}\textvisiblespace§*?              
§\C{// dereference operator, dereference operator}§ *§\color{red}\textvisiblespace§?*?              §\C{// dereference, multiplication operator}§ \end{cfa} By default, the first interpretation is selected, which does not yield a meaningful parse. Therefore, \CFA does a lexical look-ahead for the second case, and backtracks to return the leading unary operator and reparses the trailing operator identifier. Otherwise a space is needed between the unary operator and operator identifier to disambiguate this common case. A similar issue occurs with the dereference, ©*?(...)©, and routine-call, ©?()(...)© identifiers. The ambiguity occurs when the dereference operator has no parameters: \begin{cfa} *?()§\color{red}\textvisiblespace...§ ; *?()§\color{red}\textvisiblespace...§(...) ; \end{cfa} requiring arbitrary whitespace look-ahead for the routine-call parameter-list to disambiguate. However, the dereference operator \emph{must} have a parameter/argument to dereference ©*?(...)©. Hence, always interpreting the string ©*?()© as \lstinline[showspaces=true]@* ?()@ does not preclude any meaningful program. The remaining cases are with the increment/decrement operators and conditional expression, \eg: \begin{cfa} i++?§\color{red}\textvisiblespace...§(...); i?++§\color{red}\textvisiblespace...§(...); \end{cfa} requiring arbitrary whitespace look-ahead for the operator parameter-list, even though that interpretation is an incorrect expression (juxtaposed identifiers). Therefore, it is necessary to disambiguate these cases with a space: \begin{cfa} i++§\color{red}\textvisiblespace§? 
i : 0; i?§\color{red}\textvisiblespace§++i : 0; \end{cfa} \begin{quote2} \begin{tabular}{lll} \begin{tabular}{llll} \begin{tabular}{@{}l@{}} ©_AT©                   \\ ©coroutine©             \\ ©disable©               \\ ©dtype©                 \\ ©enable©                \\ \end{tabular} & \begin{tabular}{@{}l@{}} ©dtype©                 \\ ©enable©                \\ ©fallthrough©   \\ ©fallthru©              \\ ©finally©               \\ ©forall©                \\ \end{tabular} & \begin{tabular}{@{}l@{}} ©ftype©                 \\ ©lvalue©                \\ ©monitor©               \\ ©mutex©                 \\ ©one_t©                 \\ ©otype©                 \\ \end{tabular} & \begin{tabular}{@{}l@{}} ©one_t©                 \\ ©otype©                 \\ ©throw©                 \\ ©throwResume©   \\ \end{tabular} \end{quote2} \section{Incompatible} The following incompatibles exist between \CFA and C, and are similar to Annex C for \CC~\cite{C++14}. \begin{enumerate} \item \begin{description} \item[Change:] add new keywords \\ New keywords are added to \CFA (see~\VRef{s:CFAKeywords}). \item[Rationale:] keywords added to implement new semantics of \CFA. \item[Effect on original feature:] change to semantics of well-defined feature. \\ Any ISO C programs using these keywords as identifiers are invalid \CFA programs. \item[Difficulty of converting:] keyword clashes are accommodated by syntactic transformations using the \CFA backquote escape-mechanism (see~\VRef{s:BackquoteIdentifiers}). \item[How widely used:] clashes among new \CFA keywords and existing identifiers are rare. \end{description} \item \begin{description} \item[Change:] drop K\&R C declarations \\ K\&R declarations allow an implicit base-type of ©int©, if no type is specified, plus an alternate syntax for declaring parameters. 
\eg: \begin{cfa} x;                                                              §\C{// int x}§ *y;                                                             §\C{// int *y}§ f( p1, p2 );                                    §\C{// int f( int p1, int p2 );}§ g( p1, p2 ) int p1, p2;                 §\C{// int g( int p1, int p2 );}§ \end{cfa} \CFA supports K\&R routine definitions: \begin{cfa} f( a, b, c )                                    §\C{// default int return}§ int a, b; char c                        §\C{// K\&R parameter declarations}§ { ... } \end{cfa} \item[Rationale:] dropped from C11 standard.\footnote{ At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each structure declaration and type name~\cite[\S~6.7.2(2)]{C11}} \item[Effect on original feature:] original feature is deprecated. \\ Any old C programs using these K\&R declarations are invalid \CFA programs. \item[Difficulty of converting:] trivial to convert to \CFA. \item[How widely used:] existing usages are rare. \end{description} \item \begin{description} \item[Change:] type of character literal changes from ©int© to ©char© to allow more intuitive overloading: \begin{cfa} int rtn( int i ); int rtn( char c ); rtn( 'x' );                                             §\C{// programmer expects 2nd rtn to be called}§ \end{cfa} \item[Rationale:] it is more intuitive for the call to ©rtn© to match the second version of definition of ©rtn© rather than the first. In particular, output of a ©char© variable now prints a character rather than the decimal ASCII value of the character. \begin{cfa} sout | 'x' | " " | (int)'x' | endl; x 120 \end{cfa} Having to cast ©'x'© to ©char© is non-intuitive. \item[Effect on original feature:] change to semantics of well-defined feature; code that depends on: \begin{cfa} sizeof( 'x' ) == sizeof( int ) \end{cfa} no longer works the same in \CFA programs. 
\item[Difficulty of converting:] simple \item[How widely used:] programs that depend upon ©sizeof( 'x' )© are rare and can be changed to ©sizeof(char)©. \end{description} \item \begin{description} \item[Change:] make string literals ©const©: \begin{cfa} char * p = "abc";                               §\C{// valid in C, deprecated in \CFA}§ char * q = expr ? "abc" : "de"; §\C{// valid in C, invalid in \CFA}§ \end{cfa} The type of a string literal is changed from ©[] char© to ©const [] char©. Similarly, the type of a wide string literal is changed from ©[] wchar_t© to ©const [] wchar_t©. \item[Rationale:] This change is a safety issue: \begin{cfa} char * p = "abc"; p[0] = 'w';                                             §\C{// segment fault or change constant literal}§ \end{cfa} The same problem occurs when passing a string literal to a routine that changes its argument. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] simple syntactic transformation, because string literals can be converted to ©char *©. \item[How widely used:] programs that have a legitimate reason to treat string literals as pointers to potentially modifiable memory are rare. \end{description} \item \begin{description} \item[Change:] remove \newterm{tentative definitions}, which only occurs at file scope: \begin{cfa} int i;                                                  §\C{// forward definition}§ int *j = ®&i®;                                  §\C{// forward reference, valid in C, invalid in \CFA}§ int i = 0;                                              §\C{// definition}§ \end{cfa} is valid in C, and invalid in \CFA because duplicate overloaded object definitions at the same scope level are disallowed. This change makes it impossible to define mutually referential file-local static objects, if initializers are restricted to the syntactic forms of C. 
For example, \begin{cfa} struct X { int i; struct X *next; }; static struct X a;                              §\C{// forward definition}§ static struct X b = { 0, ®&a® };        §\C{// forward reference, valid in C, invalid in \CFA}§ static struct X a = { 1, &b };  §\C{// definition}§ \end{cfa} \item[Rationale:] avoids having different initialization rules for builtin types and userdefined types. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] the initializer for one of a set of mutually-referential file-local static objects must invoke a routine call to achieve the initialization. \item[How widely used:] seldom \end{description} \item \begin{description} \item[Change:] have ©struct© introduce a scope for nested types: \begin{cfa} enum ®Colour® { R, G, B, Y, C, M }; struct Person { enum ®Colour® { R, G, B };      §\C{// nested type}§ struct Face {                           §\C{// nested type}§ ®Colour® Eyes, Hair;    §\C{// type defined outside (1 level)}§ }; ®.Colour® shirt;                        §\C{// type defined outside (top level)}§ ®Colour® pants;                         §\C{// type defined same level}§ Face looks[10];                         §\C{// type defined same level}§ }; ®Colour® c = R;                                 §\C{// type/enum defined same level}§ Person®.Colour® pc = Person®.®R;        §\C{// type/enum defined inside}§ Person®.®Face pretty;                   §\C{// type defined inside}§ \end{cfa} In C, the name of the nested types belongs to the same scope as the name of the outermost enclosing structure, \ie the nested types are hoisted to the scope of the outer-most type, which is not useful and confusing. \CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC}. Nested types are not hoisted and can be referenced using the field selection operator ©.©'', unlike the \CC scope-resolution operator ©::©''. 
\item[Rationale:] ©struct© scope is crucial to \CFA as an information structuring and hiding mechanism. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] Semantic transformation. \item[How widely used:] C programs rarely have nest types because they are equivalent to the hoisted version. \end{description} \item \begin{description} \item[Change:] In C++, the name of a nested class is local to its enclosing class. \item[Rationale:] C++ classes have member functions which require that classes establish scopes. \item[Difficulty of converting:] Semantic transformation. To make the struct type name visible in the scope of the enclosing struct, the struct tag could be declared in the scope of the enclosing struct, before the enclosing struct is defined. Example: \begin{cfa} struct Y;                                               §\C{// struct Y and struct X are at the same scope}§ struct X { struct Y { /* ... */ } y; }; \end{cfa} All the definitions of C struct types enclosed in other struct definitions and accessed outside the scope of the enclosing struct could be exported to the scope of the enclosing struct. Note: this is a consequence of the difference in scope rules, which is documented in 3.3. \item[How widely used:] Seldom. \end{description} \item \begin{description} \item[Change:] comma expression is disallowed as subscript \item[Rationale:] safety issue to prevent subscripting error for multidimensional arrays: ©x[i,j]© instead of ©x[i][j]©, and this syntactic form then taken by \CFA for new style arrays. \item[Effect on original feature:] change to semantics of well-defined feature. \item[Difficulty of converting:] semantic transformation of ©x[i,j]© to ©x[(i,j)]© \item[How widely used:] seldom. 
\end{description} \end{enumerate} \end{tabular} \end{quote2} For the prescribed head-files, \CFA implicitly wraps their includes in an ©extern "C"©; For the prescribed head-files, \CFA uses header interposition to wraps these includes in an ©extern "C"©; hence, names in these include files are not mangled\index{mangling!name} (see~\VRef{s:Interoperability}). All other C header files must be explicitly wrapped in ©extern "C"© to prevent name mangling. \label{s:StandardLibrary} The goal of the \CFA standard-library is to wrap many of the existing C library-routines that are explicitly polymorphic into implicitly polymorphic versions. The \CFA standard-library wraps many existing explicitly-polymorphic C general-routines into implicitly-polymorphic versions. \leavevmode \begin{cfa}[aboveskip=0pt,belowskip=0pt] forall( otype T ) T * malloc( void );§\indexc{malloc}§ forall( otype T ) T * malloc( char fill ); forall( otype T ) T * malloc( T * ptr, size_t size ); forall( otype T ) T * malloc( T * ptr, size_t size, unsigned char fill ); forall( otype T ) T * calloc( size_t nmemb );§\indexc{calloc}§ forall( otype T ) T * realloc( T * ptr, size_t size );§\indexc{ato}§ forall( otype T ) T * realloc( T * ptr, size_t size, unsigned char fill ); forall( otype T ) T * aligned_alloc( size_t alignment );§\indexc{ato}§ forall( otype T ) T * memalign( size_t alignment );             // deprecated forall( otype T ) int posix_memalign( T ** ptr, size_t alignment ); forall( dtype T | sized(T) ) T * malloc( void );§\indexc{malloc}§ forall( dtype T | sized(T) ) T * malloc( char fill ); forall( dtype T | sized(T) ) T * malloc( T * ptr, size_t size ); forall( dtype T | sized(T) ) T * malloc( T * ptr, size_t size, unsigned char fill ); forall( dtype T | sized(T) ) T * calloc( size_t nmemb );§\indexc{calloc}§ forall( dtype T | sized(T) ) T * realloc( T * ptr, size_t size );§\indexc{ato}§ forall( dtype T | sized(T) ) T * realloc( T * ptr, size_t size, unsigned char fill ); forall( dtype T | 
sized(T) ) T * aligned_alloc( size_t alignment );§\indexc{ato}§ forall( dtype T | sized(T) ) T * memalign( size_t alignment );          // deprecated forall( dtype T | sized(T) ) int posix_memalign( T ** ptr, size_t alignment ); forall( otype T ) T * memset( T * ptr, unsigned char fill ); // use default value '\0' for fill forall( otype T ) T * memset( T * ptr );                                // remove when default value available forall( dtype T, ttype Params | sized(T) | { void ?{}(T *, Params); } ) T * new( Params p ); forall( dtype T | { void ^?{}(T *); } ) void delete( T * ptr ); forall( dtype T, ttype Params | { void ^?{}(T *); void delete(Params); } ) void delete( T * ptr, Params rest ); \end{cfa} \label{s:Math Library} The goal of the \CFA math-library is to wrap many of the existing C math library-routines that are explicitly polymorphic into implicitly polymorphic versions. The \CFA math-library wraps many existing explicitly-polymorphic C math-routines into implicitly-polymorphic versions.
• ## doc/working/exception/impl/main.c

 r547e9b7
  extern int this_exception;
  _Unwind_Reason_Code foo_try_match() {
- return this_exception == 2 ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
+ return this_exception == 3 ? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
  }
• ## src/CodeGen/GenType.cc

 r547e9b7 void GenType::visit( TupleType * tupleType ) { assertf( ! genC, "Tuple types should not reach code generation." ); Visitor::visit( tupleType ); unsigned int i = 0; std::ostringstream os; os << genType( t, "", pretty, genC, lineMarks ) << (i == tupleType->size() ? "" : ", "); } os << "]"; os << "] "; typeString = os.str() + typeString; }
• ## src/GenPoly/InstantiateGeneric.cc

 r547e9b7 concDecl->set_body( inst->get_baseStruct()->has_body() ); substituteMembers( inst->get_baseStruct()->get_members(), *inst->get_baseParameters(), typeSubs, concDecl->get_members() ); DeclMutator::addDeclaration( concDecl ); insert( inst, typeSubs, concDecl ); insert( inst, typeSubs, concDecl ); // must insert before recursion concDecl->acceptMutator( *this ); // recursively instantiate members DeclMutator::addDeclaration( concDecl ); // must occur before declaration is added so that member instantiations appear first } StructInstType *newInst = new StructInstType( inst->get_qualifiers(), concDecl->get_name() ); concDecl->set_body( inst->get_baseUnion()->has_body() ); substituteMembers( inst->get_baseUnion()->get_members(), *inst->get_baseParameters(), typeSubs, concDecl->get_members() ); DeclMutator::addDeclaration( concDecl ); insert( inst, typeSubs, concDecl ); insert( inst, typeSubs, concDecl ); // must insert before recursion concDecl->acceptMutator( *this ); // recursively instantiate members DeclMutator::addDeclaration( concDecl ); // must occur before declaration is added so that member instantiations appear first } UnionInstType *newInst = new UnionInstType( inst->get_qualifiers(), concDecl->get_name() );
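Reviewer sketch: the reordering in this hunk registers the concrete declaration in the instantiation table before recursing into its members, and only adds the declaration afterwards. The C++ below is a simplified illustration of why that order matters for self-referential generic types. `Generic`, `Instantiator`, and the `_conc_` prefix are hypothetical names made up for this sketch, not the cfa-cc classes.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a generic aggregate: its name plus the names of
// the generic types that its members refer to.
struct Generic {
    std::string name;
    std::vector<std::string> memberTypes;
};

class Instantiator {
public:
    explicit Instantiator(std::map<std::string, Generic> decls)
        : decls_(std::move(decls)) {}

    // Returns the concrete name for `name`, instantiating it on first use.
    std::string instantiate(const std::string& name) {
        auto hit = cache_.find(name);
        if (hit != cache_.end()) return hit->second;  // done, or in progress

        std::string concrete = "_conc_" + name;
        cache_.emplace(name, concrete);   // must insert before recursion
        for (const std::string& member : decls_.at(name).memberTypes) {
            instantiate(member);          // recursively instantiate members
        }
        emitted_.push_back(concrete);     // declaration is added after its members
        return concrete;
    }

    // Order in which concrete declarations were added.
    const std::vector<std::string>& emitted() const { return emitted_; }

private:
    std::map<std::string, Generic> decls_;
    std::map<std::string, std::string> cache_;
    std::vector<std::string> emitted_;
};
```

If the cache entry were inserted after the loop, a type whose member refers back to the type itself would recurse without terminating. Adding the declaration last means member instantiations appear ahead of the aggregate that uses them.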
• ## src/InitTweak/FixInit.cc

 r547e9b7 Expression * FixCopyCtors::mutate( StmtExpr * stmtExpr ) { stmtExpr = safe_dynamic_cast< StmtExpr * >( Parent::mutate( stmtExpr ) ); // function call temporaries should be placed at statement-level, rather than nested inside of a new statement expression, // since temporaries can be shared across sub-expressions, e.g. //   [A, A] f(); //   g([A] x, [A] y); //   f(g()); // f is executed once, so the return temporary is shared across the tuple constructors for x and y. std::list< Statement * > & stmts = stmtExpr->get_statements()->get_kids(); for ( Statement *& stmt : stmts ) { stmt = stmt->acceptMutator( *this ); } // for // stmtExpr = safe_dynamic_cast< StmtExpr * >( Parent::mutate( stmtExpr ) ); assert( stmtExpr->get_result() ); Type * result = stmtExpr->get_result(); Parent::visit( compoundStmt ); // add destructors for the current scope that we're exiting // add destructors for the current scope that we're exiting, unless the last statement is a return, which // causes unreachable code warnings std::list< Statement * > & statements = compoundStmt->get_kids(); insertDtors( reverseDeclOrder.front().begin(), reverseDeclOrder.front().end(), back_inserter( statements ) ); if ( ! statements.empty() && ! dynamic_cast< ReturnStmt * >( statements.back() ) ) { insertDtors( reverseDeclOrder.front().begin(), reverseDeclOrder.front().end(), back_inserter( statements ) ); } reverseDeclOrder.pop_front(); }
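Reviewer sketch: the second hunk guards the destructor insertion so that nothing is appended to a block that is empty or already ends in a return. A minimal C++ illustration of that guard follows. `Stmt` and `appendScopeDtors` are hypothetical stand-ins for the compiler's statement nodes and `insertDtors`, used only to show the condition.

```cpp
#include <string>
#include <vector>

// Hypothetical, flattened stand-in for a statement node.
struct Stmt {
    bool isReturn;
    std::string text;
};

// Appends the scope's destructor calls to the end of a block, unless the block
// is empty or its last statement is a return (the calls would be unreachable).
void appendScopeDtors(std::vector<Stmt>& statements,
                      const std::vector<Stmt>& dtorCalls) {
    if (statements.empty() || statements.back().isReturn) return;
    statements.insert(statements.end(), dtorCalls.begin(), dtorCalls.end());
}
```

The sketch mirrors the condition as it appears in the hunk, including the empty-block case. How destructors are run on the return path itself is outside what this hunk shows.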
• ## src/Parser/ExpressionNode.cc

 r547e9b7 // Created On       : Sat May 16 13:17:07 2015 // Last Modified By : Peter A. Buhr // Last Modified On : Thu Mar 30 17:02:46 2017 // Update Count     : 515 // Last Modified On : Wed May 17 21:31:01 2017 // Update Count     : 527 // } // build_field_name_fraction_constants Expression * build_field_name_REALFRACTIONconstant( const std::string & str ) { assert( str[0] == '.' ); if ( str.find_first_not_of( "0123456789", 1 ) != string::npos ) throw SemanticError( "invalid tuple index " + str ); Expression * ret = build_constantInteger( *new std::string( str.substr(1) ) ); delete &str; Expression * build_field_name_REALDECIMALconstant( const std::string & str ) { assert( str[str.size()-1] == '.' ); if ( str[str.size()-1] != '.' ) throw SemanticError( "invalid tuple index " + str ); Expression * ret = build_constantInteger( *new std::string( str.substr( 0, str.size()-1 ) ) ); delete &str;
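Reviewer sketch: the two builders above turn a lexed token such as `.3` or `3.` into a tuple index, and the added checks throw `SemanticError( "invalid tuple index " + str )` for tokens that are not a plain index. The standalone C++ below illustrates that validation. `fractionIndex` and `decimalIndex` are hypothetical helpers, `std::invalid_argument` stands in for `SemanticError`, and `decimalIndex` is deliberately stricter than the hunk, which only tests the trailing dot.

```cpp
#include <stdexcept>
#include <string>

// ".12" -> 12; anything other than digits after the leading dot is rejected.
unsigned long fractionIndex(const std::string& text) {
    if (text.size() < 2 || text[0] != '.' ||
        text.find_first_not_of("0123456789", 1) != std::string::npos) {
        throw std::invalid_argument("invalid tuple index " + text);
    }
    return std::stoul(text.substr(1));
}

// "12." -> 12; the only non-digit allowed is the trailing dot.
unsigned long decimalIndex(const std::string& text) {
    if (text.size() < 2 || text.back() != '.' ||
        text.find_first_not_of("0123456789") != text.size() - 1) {
        throw std::invalid_argument("invalid tuple index " + text);
    }
    return std::stoul(text.substr(0, text.size() - 1));
}
```

A token such as `.5e3` is rejected here because the exponent characters are not digits, which is the kind of input the fraction check in the hunk appears to guard against.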
• ## src/Parser/lex.ll

 r547e9b7 * Created On       : Sat Sep 22 08:58:10 2001 * Last Modified By : Peter A. Buhr * Last Modified On : Mon Mar 13 08:36:17 2017 * Update Count     : 506 * Last Modified On : Thu May 18 09:03:49 2017 * Update Count     : 513 */ // numeric constants, CFA: '_' in constant hex_quad {hex}("_"?{hex}){3} integer_suffix "_"?(([uU][lL]?)|([uU]("ll"|"LL")?)|([lL][uU]?)|("ll"|"LL")[uU]?) integer_suffix "_"?(([uU](("ll"|"LL"|[lL])[iI]|[iI]?("ll"|"LL"|[lL])?))|([iI](("ll"|"LL"|[lL])[uU]|[uU]?("ll"|"LL"|[lL])?))|(("ll"|"LL"|[lL])([iI][uU]|[uU]?[iI]?))) octal_digits ({octal})|({octal}({octal}|"_")*{octal}) decimal_digits ({decimal})|({decimal}({decimal}|"_")*{decimal}) real_decimal {decimal_digits}"." real_fraction "."{decimal_digits} real_constant {decimal_digits}?{real_fraction} real_decimal {decimal_digits}"."{exponent}?{floating_suffix}? real_fraction "."{decimal_digits}{exponent}?{floating_suffix}? real_constant {decimal_digits}{real_fraction} exponent "_"?[eE]"_"?[+-]?{decimal_digits} // GCC: D (double), DL (long double) and iI (imaginary) suffixes floating_suffix "_"?([fFdDlL][iI]?|"DL"|[iI][lLfFdD]?) //floating_suffix "_"?([fFdD]|[lL]|[D][L])|([iI][lLfFdD])|([lLfFdD][iI])) // GCC: D (double) and iI (imaginary) suffixes, and DL (long double) floating_suffix "_"?([fFdDlL][iI]?|[iI][lLfFdD]?|"DL") floating_constant (({real_constant}{exponent}?)|({decimal_digits}{exponent})){floating_suffix}?
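Reviewer sketch: the new `integer_suffix` pattern is dense, so it can help to try it outside flex. The C++ below transcribes the pattern as it appears in this view into `std::regex` (ECMAScript) syntax, with flex's quoted literals written as plain alternatives. `isIntegerSuffix` is a hypothetical helper for experimentation, and the transcription is an assumption to be checked against `lex.ll`.

```cpp
#include <regex>
#include <string>

// True when `text` is, in full, an integer suffix accepted by the transcribed
// pattern: an optional '_' followed by a combination of u/U, l/L/ll/LL, i/I.
bool isIntegerSuffix(const std::string& text) {
    static const std::regex pattern(
        R"(_?(?:)"
        R"([uU](?:(?:ll|LL|[lL])[iI]|[iI]?(?:ll|LL|[lL])?))"   // starts with u/U
        R"(|[iI](?:(?:ll|LL|[lL])[uU]|[uU]?(?:ll|LL|[lL])?))"  // starts with i/I
        R"(|(?:ll|LL|[lL])(?:[iI][uU]|[uU]?[iI]?))"            // starts with a length
        R"())");
    return std::regex_match(text, pattern);
}
```

With this transcription, `u`, `UL`, `ull`, `LLu`, `uLi`, `iu`, and `_ul` match, while `uu`, the mixed-case `lL`, and the empty string do not.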
• ## src/Parser/parser.yy

 r547e9b7 // Created On       : Sat Sep  1 20:22:55 2001 // Last Modified By : Peter A. Buhr // Last Modified On : Thu Mar 30 15:42:32 2017 // Update Count     : 2318 // Last Modified On : Thu May 18 18:06:17 2017 // Update Count     : 2338 // } // for } // distExt bool forall = false;                                                                    // aggregate have one or more forall qualifiers ? %} sue_type_specifier:                                                                             // struct, union, enum + type specifier elaborated_type | type_qualifier_list elaborated_type { $$= 2->addQualifiers( 1 ); } | type_qualifier_list { if ( 1->type != nullptr && 1->type->forall ) forall = true; } // remember generic type elaborated_type {$$ = $3->addQualifiers($1 ); } | sue_type_specifier type_qualifier { $$= 1->addQualifiers( 2 ); } {$$ = DeclarationNode::newAggregate( $1, new string( DeclarationNode::anonymous.newName() ), nullptr,$4, true )->addQualifiers( $2 ); } | aggregate_key attribute_list_opt no_attr_identifier_or_type_name { typedefTable.makeTypedef( *$3 ); } { typedefTable.makeTypedef( *$3 ); // create typedef if ( forall ) typedefTable.changeKind( *$3, TypedefTable::TG ); // possibly update forall = false;                                                         // reset } '{' field_declaration_list '}' {  = DeclarationNode::newAggregate( $1,$3, nullptr, $6, true )->addQualifiers($2 ); }
• ## src/SymTab/Validate.cc

 r547e9b7 }; /// ensure that generic types have the correct number of type arguments class ValidateGenericParameters : public Visitor { public: typedef Visitor Parent; virtual void visit( StructInstType * inst ) final override; virtual void visit( UnionInstType * inst ) final override; }; class ArrayLength : public Visitor { public: Pass3 pass3( 0 ); CompoundLiteral compoundliteral; HoistStruct::hoistStruct( translationUnit ); ValidateGenericParameters genericParams; EliminateTypedef::eliminateTypedef( translationUnit ); HoistStruct::hoistStruct( translationUnit ); // must happen after EliminateTypedef, so that aggregate typedefs occur in the correct order ReturnTypeFixer::fix( translationUnit ); // must happen before autogen acceptAll( translationUnit, lrt ); // must happen before autogen, because sized flag needs to propagate to generated functions acceptAll( translationUnit, genericParams );  // check as early as possible - can't happen before LinkReferenceToTypes acceptAll( translationUnit, epc ); // must happen before VerifyCtorDtorAssign, because void return objects should not exist VerifyCtorDtorAssign::verify( translationUnit );  // must happen before autogen, because autogen examines existing ctor/dtors } template< typename Aggr > void validateGeneric( Aggr * inst ) { std::list< TypeDecl * > * params = inst->get_baseParameters(); if ( params != NULL ) { std::list< Expression * > & args = inst->get_parameters(); if ( args.size() < params->size() ) throw SemanticError( "Too few type arguments in generic type ", inst ); if ( args.size() > params->size() ) throw SemanticError( "Too many type arguments in generic type ", inst ); } } void ValidateGenericParameters::visit( StructInstType * inst ) { validateGeneric( inst ); Parent::visit( inst ); } void ValidateGenericParameters::visit( UnionInstType * inst ) { validateGeneric( inst ); Parent::visit( inst ); } DeclarationWithType * CompoundLiteral::mutate( ObjectDecl *objectDecl ) { storageClasses = 
objectDecl->get_storageClasses();
• ## src/SynTree/TypeSubstitution.cc

 r547e9b7 boundVars.insert( (*tyvar )->get_name() ); } // for } // if // bind type variables from generic type instantiations std::list< TypeDecl* > *baseParameters = type->get_baseParameters(); if ( baseParameters && ! type->get_parameters().empty() ) { for ( std::list< TypeDecl* >::const_iterator tyvar = baseParameters->begin(); tyvar != baseParameters->end(); ++tyvar ) { boundVars.insert( (*tyvar)->get_name() ); } // for // bind type variables from generic type instantiations std::list< TypeDecl* > *baseParameters = type->get_baseParameters(); if ( baseParameters && ! type->get_parameters().empty() ) { for ( std::list< TypeDecl* >::const_iterator tyvar = baseParameters->begin(); tyvar != baseParameters->end(); ++tyvar ) { boundVars.insert( (*tyvar)->get_name() ); } // for } // if } // if Type *ret = Mutator::mutate( type );
• ## src/libcfa/gmp

 r547e9b7 // Created On       : Tue Apr 19 08:43:43 2016 // Last Modified By : Peter A. Buhr // Last Modified On : Sun May 14 23:47:36 2017 // Update Count     : 9 // Last Modified On : Mon May 22 08:32:39 2017 // Update Count     : 13 // Int ?=?( Int * lhs, long int rhs ) { mpz_set_si( lhs->mpz, rhs ); return *lhs; } Int ?=?( Int * lhs, unsigned long int rhs ) { mpz_set_ui( lhs->mpz, rhs ); return *lhs; } //Int ?=?( Int * lhs, const char * rhs ) { if ( mpq_set_str( lhs->mpz, rhs, 0 ) ) abort(); return *lhs; } Int ?=?( Int * lhs, const char * rhs ) { if ( mpz_set_str( lhs->mpz, rhs, 0 ) ) { printf( "invalid string conversion\n" ); abort(); } return *lhs; } char ?=?( char * lhs, Int rhs ) { char val = mpz_get_si( rhs.mpz ); *lhs = val; return val; }
• ## src/main.cc

 r547e9b7 bresolvep = false, bboxp = false, bcodegenp = false, ctorinitp = false, declstatsp = false, OPTPRINT( "box" ) GenPoly::box( translationUnit ); if ( bcodegenp ) { dump( translationUnit ); return 0; } if ( optind < argc ) {                                                  // any commands after the flags and input file ? => output file name int c; while ( (c = getopt_long( argc, argv, "abBcdefglLmnpqrstTvyzZD:F:", long_opts, &long_index )) != -1 ) { while ( (c = getopt_long( argc, argv, "abBcCdefglLmnpqrstTvyzZD:F:", long_opts, &long_index )) != -1 ) { switch ( c ) { case Ast: case 'c':                                                                             // print after constructors and destructors are replaced ctorinitp = true; break; case 'C':                                                                             // print before code generation bcodegenp = true; break; case DeclStats:
• ## src/tests/.expect/64/gmp.txt

 r547e9b7 conversions y:97 y:12345678901234567890123456789 y:3 y:-3 z:150000000000000000000 z:16666666666666666666 16666666666666666666, 2 16666666666666666666, 2 x:16666666666666666666 y:2
• ## src/tests/gmp.c

 r547e9b7 // Created On       : Tue Apr 19 08:55:51 2016 // Last Modified By : Peter A. Buhr // Last Modified On : Sun May 14 14:46:50 2017 // Update Count     : 530 // Last Modified On : Mon May 22 09:05:09 2017 // Update Count     : 538 // #include int main() { int main( void ) { sout | "constructors" | endl; short int si = 3; sout | "conversions" | endl; y = 'a'; sout | "y:" | y | endl; y = "12345678901234567890123456789"; sout | "y:" | y | endl; y = si; z = x / 3; sout | "z:" | z | endl; sout | div( x, 3 ) | x / 3 | "," | x % 3 | endl; [ x, y ] = div( x, 3 ); sout | "x:" | x | "y:" | y | endl; //      sout | div( x, 3 ) | x / 3 | "," | x % 3 | endl; sout | endl; fn = (Int){0}; fn1 = fn;                                                        // 1st case sout | (int)0 | fn | endl; fn = (Int){1}; fn2 = fn1; fn1 = fn;                                     // 2nd case fn = 1; fn2 = fn1; fn1 = fn;                                            // 2nd case sout | 1 | fn | endl; for ( int i = 2; i <= 200; i += 1 ) { for ( unsigned int i = 2; i <= 200; i += 1 ) { fn = fn1 + fn2; fn2 = fn1; fn1 = fn;                    // general case sout | i | fn | endl; sout | "Factorial Numbers" | endl; Int fact; fact = (Int){1};                                                                        // 1st case fact = 1;                                                                                       // 1st case sout | (int)0 | fact | endl; for ( int i = 1; i <= 40; i += 1 ) { for ( unsigned int i = 1; i <= 40; i += 1 ) { fact = fact * i;                                                                // general case sout | i | fact | endl;
• ## src/tests/tuplePolymorphism.c

 r547e9b7 // Author           : Rob Schluntz // Created On       : Tue Nov 16 10:38:00 2016 // Last Modified By : Rob Schluntz // Last Modified On : Tue Nov 16 10:39:18 2016 // Update Count     : 2 // Last Modified By : Peter A. Buhr // Last Modified On : Thu May 18 18:05:12 2017 // Update Count     : 4 // // packed is needed so that structs are not passed with the same alignment as function arguments __attribute__((packed)) struct A { double x; char y; double z; double x; char y; double z; }; __attribute__((packed)) struct B { long long x; char y; long long z; long long x; char y; long long z; }; int main() { int x1 = 123, x3 = 456; double x2 = 999.123; int x1 = 123, x3 = 456; double x2 = 999.123; int i1 = 111, i3 = 222; double i2 = 333; int i1 = 111, i3 = 222; double i2 = 333; int d1 = 555, d3 = 444; double d2 = 666; int d1 = 555, d3 = 444; double d2 = 666; [i1, i2, i3] = ([x1, (int)x2, x3]) + ([9, 2, 3]); [d1, d2, d3] = ([x1, x2, x3]) + ([9, 2, 3]); printf("%d %g %d\n", i1, i2, i3); printf("%d %g %d\n", d1, d2, d3); [i1, i2, i3] = ([x1, (int)x2, x3]) + ([9, 2, 3]); [d1, d2, d3] = ([x1, x2, x3]) + ([9, 2, 3]); printf("%d %g %d\n", i1, i2, i3); printf("%d %g %d\n", d1, d2, d3); [double, double, double] zzz; zzz = [x1, x2, x3]; printf("%g %g %g\n", zzz); [x1, x2, x3] = zzz+zzz; printf("%d %g %d\n", x1, x2, x3); [double, double, double] zzz; zzz = [x1, x2, x3]; printf("%g %g %g\n", zzz); [x1, x2, x3] = zzz+zzz; printf("%d %g %d\n", x1, x2, x3); // ensure non-matching assertions are specialized correctly g((A){ 1.21, 'x', 10.21}, (B){ 1111LL, 'v', 54385938LL }); // ensure non-matching assertions are specialized correctly g((A){ 1.21, 'x', 10.21}, (B){ 1111LL, 'v', 54385938LL }); } // tab-width: 4 // // End: //