# Changeset e7b2559 for doc/user

Timestamp:
Aug 2, 2016, 9:30:34 AM
Branches:
aaron-thesis, arm-eh, cleanup-dtors, ctor, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, memory, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
155cce0f
Parents:
e21c72d, 79f64f1
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Conflicts:

src/tests/test.py

File:
1 edited

 re21c72d %% Created On       : Wed Apr  6 14:53:29 2016 %% Last Modified By : Peter A. Buhr %% Last Modified On : Wed Jul 13 08:14:39 2016 %% Update Count     : 1247 %% Last Modified On : Mon Aug  1 09:11:24 2016 %% Update Count     : 1271 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% however, it largely extended the language, and did not address many existing problems.\footnote{% Two important existing problems addressed were changing the type of character literals from ©int© to ©char© and enumerator from ©int© to the type of its enumerators.} \Index*{Fortran}~\cite{Fortran08}, \Index*{Ada}~\cite{Ada12}, and \Index*{Cobol}~\cite{Cobol14} are examples of programming languages that took an evolutionary approach, where modern language features (e.g., objects, concurrency) are added and problems fixed within the framework of the existing language. \Index*{Fortran}~\cite{Fortran08}, \Index*{Ada}~\cite{Ada12}, and \Index*{Cobol}~\cite{Cobol14} are examples of programming languages that took an evolutionary approach, where modern language features (\eg objects, concurrency) are added and problems fixed within the framework of the existing language. \Index*{Java}~\cite{Java8}, \Index*{Go}~\cite{Go}, \Index*{Rust}~\cite{Rust} and \Index*{D}~\cite{D} are examples of the revolutionary approach for modernizing C/\CC, resulting in a new language rather than an extension of the descendent. These languages have different syntax and semantics from C, and do not interoperate directly with C, largely because of garbage collection. 
\section[Compiling CFA Program]{Compiling \CFA Program} The command ©cfa© is used to compile \CFA program(s), and is based on the GNU \Indexc{gcc} command, e.g.: The command ©cfa© is used to compile \CFA program(s), and is based on the GNU \Indexc{gcc} command, \eg: \begin{lstlisting} cfa§\indexc{cfa}\index{compilation!cfa@©cfa©}§ [ gcc-options ] C/§\CFA§-files [ assembler/loader-files ] \section{Underscores in Constants} Numeric constants are extended to allow \Index{underscore}s within constants\index{constant!underscore}, e.g.: Numeric constants are extended to allow \Index{underscore}s within constants\index{constant!underscore}, \eg: \begin{lstlisting} 2®_®147®_®483®_®648;                    §\C{// decimal constant}§ \begin{enumerate} \item A sequence of underscores is disallowed, e.g., ©12__34© is invalid. A sequence of underscores is disallowed, \eg ©12__34© is invalid. \item Underscores may only appear within a sequence of digits (regardless of the digit radix). In other words, an underscore cannot start or end a sequence of digits, e.g., ©_1©, ©1_© and ©_1_© are invalid (actually, the 1st and 3rd examples are identifier names). In other words, an underscore cannot start or end a sequence of digits, \eg ©_1©, ©1_© and ©_1_© are invalid (actually, the 1st and 3rd examples are identifier names). 
\item A numeric prefix may end with an underscore; \end{quote2} All type qualifiers, e.g., ©const©, ©volatile©, etc., are used in the normal way with the new declarations and also appear left to right, e.g.: All type qualifiers, \eg ©const©, ©volatile©, etc., are used in the normal way with the new declarations and also appear left to right, \eg: \begin{quote2} \begin{tabular}{@{}l@{\hspace{1em}}l@{\hspace{1em}}l@{}} \end{tabular} \end{quote2} All declaration qualifiers, e.g., ©extern©, ©static©, etc., are used in the normal way with the new declarations but can only appear at the start of a \CFA routine declaration,\footnote{\label{StorageClassSpecifier} The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature.~\cite[\S~6.11.5(1)]{C11}} e.g.: All declaration qualifiers, \eg ©extern©, ©static©, etc., are used in the normal way with the new declarations but can only appear at the start of a \CFA routine declaration,\footnote{\label{StorageClassSpecifier} The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature.~\cite[\S~6.11.5(1)]{C11}} \eg: \begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{2em}}l@{}} Unsupported are K\&R C declarations where the base type defaults to ©int©, if no type is specified,\footnote{ At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each structure declaration and type name~\cite[\S~6.7.2(2)]{C11}} e.g.: \eg: \begin{lstlisting} x;                                                              §\C{// int x}§ A \Index{pointer}/\Index{reference} is a generalization of a variable name, i.e., a mutable address that can point to more than one memory location during its lifetime. 
(Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime and may not occupy storage as the literal is embedded directly into instructions.) Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, e.g.: Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, \eg: \begin{quote2} \begin{tabular}{@{}ll@{}} Except for auto-dereferencing by the compiler, this reference example is the same as the previous pointer example. Hence, a reference behaves like the variable name for the current variable it is pointing-to. The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, e.g.: The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, \eg: \begin{lstlisting} r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15); ®*®r2 = ((®*®r1 + ®*®r2) ®*® (®**®r3 - ®*®r1)) / (®**®r3 - 15); \end{lstlisting} When a reference operation appears beside a dereference operation, e.g., ©&*©, they cancel out.\footnote{ When a reference operation appears beside a dereference operation, \eg ©&*©, they cancel out.\footnote{ The unary ©&© operator yields the address of its operand. If the operand has type type'', the result has type pointer to type''. 
®&®crc = &cx;                                   §\C{// error, cannot change crc}§ \end{lstlisting} Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, e.g.: Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, \eg: \begin{lstlisting} int & const r = *0;                             §\C{// where 0 is the int * zero}§ \end{lstlisting} Otherwise, the compiler is managing the addresses for type ©& const© not the programmer, and by a programming discipline of only using references with references, address errors can be prevented. Finally, the position of the ©const© qualifier \emph{after} the pointer/reference qualifier causes confuse for C programmers. The ©const© qualifier cannot be moved before the pointer/reference qualifier for C style-declarations; \CFA-style declarations attempt to address this issue: \begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{}} \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}}        & \multicolumn{1}{c}{\textbf{C}}        \\ \begin{lstlisting} ®const® * ®const® * const int ccp; ®const® & ®const® & const int ccr; \end{lstlisting} & \begin{lstlisting} const int * ®const® * ®const® ccp; \end{lstlisting} \end{tabular} \end{quote2} where the \CFA declaration is read left-to-right (see \VRef{s:Declarations}). \Index{Initialization} is different than \Index{assignment} because initialization occurs on the empty (uninitialized) storage on an object, while assignment occurs on possibly initialized storage of an object. 
\section{Type Operators} The new declaration syntax can be used in other contexts where types are required, e.g., casts and the pseudo-routine ©sizeof©: The new declaration syntax can be used in other contexts where types are required, \eg casts and the pseudo-routine ©sizeof©: \begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{}} \CFA also supports a new syntax for routine definition, as well as ISO C and K\&R routine syntax. The point of the new syntax is to allow returning multiple values from a routine~\cite{Galletly96,CLU}, e.g.: The point of the new syntax is to allow returning multiple values from a routine~\cite{Galletly96,CLU}, \eg: \begin{lstlisting} ®[ int o1, int o2, char o3 ]® f( int i1, char i2, char i3 ) { \Index*{Michael Tiemann}, with help from \Index*{Doug Lea}, provided named return values in g++, circa 1989.} The value of each local return variable is automatically returned at routine termination. Declaration qualifiers can only appear at the start of a routine definition, e.g.: Declaration qualifiers can only appear at the start of a routine definition, \eg: \begin{lstlisting} ®extern® [ int x ] g( int y ) {§\,§} The inability to use \CFA declarations in these two contexts is probably a blessing because it precludes programmers from arbitrarily switching between declarations forms within a declaration contexts. 
C-style declarations can be used to declare parameters for \CFA style routine definitions, e.g.: C-style declarations can be used to declare parameters for \CFA style routine definitions, \eg: \begin{lstlisting} [ int ] f( * int, int * );              §\C{// returns an integer, accepts 2 pointers to integers}§ The syntax of the new routine prototype declaration follows directly from the new routine definition syntax; as well, parameter names are optional, e.g.: as well, parameter names are optional, \eg: \begin{lstlisting} [ int x ] f ();                                 §\C{// returning int with no parameters}§ \end{lstlisting} This syntax allows a prototype declaration to be created by cutting and pasting source text from the routine definition header (or vice versa). It is possible to declare multiple routine-prototypes in a single declaration, but the entire type specification is distributed across \emph{all} routine names in the declaration list (see~\VRef{s:Declarations}), e.g.: It is possible to declare multiple routine-prototypes in a single declaration, but the entire type specification is distributed across \emph{all} routine names in the declaration list (see~\VRef{s:Declarations}), \eg: \begin{quote2} \begin{tabular}{@{}l@{\hspace{3em}}l@{}} \end{tabular} \end{quote2} Declaration qualifiers can only appear at the start of a \CFA routine declaration,\footref{StorageClassSpecifier} e.g.: Declaration qualifiers can only appear at the start of a \CFA routine declaration,\footref{StorageClassSpecifier} \eg: \begin{lstlisting} extern [ int ] f (int); \section{Routine Pointers} The syntax for pointers to \CFA routines specifies the pointer name on the right, e.g.: The syntax for pointers to \CFA routines specifies the pointer name on the right, \eg: \begin{lstlisting} * [ int x ] () fp;                      §\C{// pointer to routine returning int with no parameters}§ p( /* positional */, /* named */, . . . 
); \end{lstlisting} While it is possible to implement both approaches, the first possibly is more complex than the second, e.g.: While it is possible to implement both approaches, the first possibly is more complex than the second, \eg: \begin{lstlisting} p( int x, int y, int z, . . . ); In the second call, the named arguments separate the positional and ellipse arguments, making it trivial to read the call. The problem is exacerbated with default arguments, e.g.: The problem is exacerbated with default arguments, \eg: \begin{lstlisting} void p( int x, int y = 2, int z = 3. . . ); As mentioned, tuples can appear in contexts requiring a list of value, such as an argument list of a routine call. In unambiguous situations, the tuple brackets may be omitted, e.g., a tuple that appears as an argument may have its In unambiguous situations, the tuple brackets may be omitted, \eg a tuple that appears as an argument may have its square brackets omitted for convenience; therefore, the following routine invocations are equivalent: \begin{lstlisting} Type qualifiers, i.e., const and volatile, may modify a tuple type. The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e., the qualifier is distributed across all of the types in the tuple, e.g.: The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e., the qualifier is distributed across all of the types in the tuple, \eg: \begin{lstlisting} const volatile [ int, float, const int ] x; [ const volatile int, const volatile float, const volatile int ] x; \end{lstlisting} Declaration qualifiers can only appear at the start of a \CFA tuple declaration4, e.g.: Declaration qualifiers can only appear at the start of a \CFA tuple declaration4, \eg: \begin{lstlisting} extern [ int, int ] w1; Unfortunately, C's syntax for subscripts precluded treating them as tuples. 
The C subscript list has the form ©[i][j]...© and not ©[i, j, ...]©. Therefore, there is no syntactic way for a routine returning multiple values to specify the different subscript values, e.g., ©f[g()]© always means a single subscript value because there is only one set of brackets. Therefore, there is no syntactic way for a routine returning multiple values to specify the different subscript values, \eg ©f[g()]© always means a single subscript value because there is only one set of brackets. Fixing this requires a major change to C because the syntactic form ©M[i, j, k]© already has a particular meaning: ©i, j, k© is a comma expression. \end{rationale} Clearly, the types of the entities being assigned must be type compatible with the value of the expression. Mass assignment has parallel semantics, e.g., the statement: Mass assignment has parallel semantics, \eg the statement: \begin{lstlisting} [ x, y, z ] = 1.5; \section{Unnamed Structure Fields} C requires each field of a structure to have a name, except for a bit field associated with a basic type, e.g.: C requires each field of a structure to have a name, except for a bit field associated with a basic type, \eg: \begin{lstlisting} struct { int f1;                 // named field int f2 : 4;             // named field with bit field size int : 3;                // unnamed field for basic type with bit field size int ;                   // disallowed, unnamed field int *;                  // disallowed, unnamed field int (*)(int);   // disallowed, unnamed field int f1;                                 §\C{// named field}§ int f2 : 4;                             §\C{// named field with bit field size}§ int : 3;                                §\C{// unnamed field for basic type with bit field size}§ int ;                                   §\C{// disallowed, unnamed field}§ int *;                                  §\C{// disallowed, unnamed field}§ int (*)(int);                   §\C{// disallowed, unnamed field}§ }; 
\end{lstlisting} This requirement is relaxed by making the field name optional for all field declarations; therefore, all the field declarations in the example are allowed. As for unnamed bit fields, an unnamed field is used for padding a structure to a particular size. A list of unnamed fields is also supported, e.g.: A list of unnamed fields is also supported, \eg: \begin{lstlisting} struct { int , , ;               // 3 unnamed fields int , , ;                               §\C{// 3 unnamed fields}§ } \end{lstlisting} §\emph{expr}§ -> [ §\emph{fieldlist}§ ] \end{lstlisting} \emph{expr} is any expression yielding a value of type record, e.g., ©struct©, ©union©. \emph{expr} is any expression yielding a value of type record, \eg ©struct©, ©union©. Each element of \emph{ fieldlist} is an element of the record specified by \emph{expr}. A record-field tuple may be used anywhere a tuple can be used. An example of the use of a record-field tuple is } \end{lstlisting} While the declaration of the local variable ©y© is useful and its scope is across all ©case© clauses, the initialization for such a variable is defined to never be executed because control always transfers over it. Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©. As mentioned, transfer into control structures should be forbidden; transfers from within the ©switch© body using a ©goto© are equally unpalatable. As well, the declaration of ©z© is cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized. While the declaration of the local variable ©y© is useful with a scope across all ©case© clauses, the initialization for such a variable is defined to never be executed because control always transfers over it. 
Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©, both of which are problematic. As well, the declaration of ©z© cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized. The key observation is that the ©switch© statement branches into control structure, i.e., there are multiple entry points into its statement body. \end{enumerate} and there is only a medium amount of fall-through from one ©case© clause to the next, and most of these result from a list of case values executing common code, rather than a sequence of case actions that compound. \end{itemize} These observations help to put the suggested changes to the ©switch© into perspective. These observations help to put the \CFA changes to the ©switch© into perspective. \begin{enumerate} \item Eliminating default fall-through has the greatest potential for affecting existing code. However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e., a list of ©case© clauses executing common code, e.g.: \begin{lstlisting} However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e., a list of ©case© clauses executing common code, \eg: \begin{lstlisting} case 1:  case 2:  case 3: ... \end{lstlisting} still work. 
Nevertheless, reversing the default action would have a non-trivial effect on case actions that compound, such as the above example of processing shell arguments. Therefore, to preserve backwards compatibility, it is necessary to introduce a new kind of ©switch© statement, called ©choose©, with no implicit fall-through semantics and an explicit fall-through if the last statement of a case-clause ends with the new keyword ©fallthru©, e.g.: <<<<<<< HEAD Therefore, to preserve backwards compatibility, it is necessary to introduce a new kind of ©switch© statement, called ©choose©, with no implicit fall-through semantics and an explicit fall-through if the last statement of a case-clause ends with the new keyword ©fallthru©, \eg: ======= Therefore, to preserve backwards compatibility, it is necessary to introduce a new kind of ©switch© statement, called ©choose©, with no implicit fall-through semantics and an explicit fall-through if the last statement of a case-clause ends with the new keyword ©fallthrough©/©fallthru©, e.g.: >>>>>>> 080615890f586cb9954c252b55cab47f52c25758 \begin{lstlisting} ®choose® ( i ) { Therefore, no change is made for this issue. \item Dealing with unreachable code in a ©switch©/©choose© body is solved by restricting declarations and associated initialization to the start of statement body, which is executed \emph{before} the transfer to the appropriate ©case© clause.\footnote{ Essentially, these declarations are hoisted before the statement and both declarations and statement are surrounded by a compound statement.} and precluding statements before the first ©case© clause. Further declaration in the statement body are disallowed. 
Dealing with unreachable code in a ©switch©/©choose© body is solved by restricting declarations and associated initialization to the start of statement body, which is executed \emph{before} the transfer to the appropriate ©case© clause\footnote{ Essentially, these declarations are hoisted before the ©switch©/©choose© statement and both declarations and statement are surrounded by a compound statement.} and precluding statements before the first ©case© clause. Further declarations at the same nesting level as the statement body are disallowed to ensure every transfer into the body is sound. \begin{lstlisting} switch ( x ) { ®int i = 0;®                            §\C{// allowed}§ ®int i = 0;®                            §\C{// allowed only at start}§ case 0: ... ®int i = 0;®                            §\C{// disallowed}§ ®int j = 0;®                            §\C{// disallowed}§ case 1: { ®int i = 0;®                    §\C{// allowed in any compound statement}§ ®int k = 0;®                    §\C{// allowed at different nesting levels}§ ... } Like the \Index*[C++]{\CC} lexical problem with closing template-syntax, e.g, ©Foo>®©, this issue can be solved with a more powerful lexer/parser. There are several ambiguous cases with operator identifiers, e.g., ©int *?*?()©, where the string ©*?*?© can be lexed as ©*©/©?*?© or ©*?©/©*?©. Since it is common practise to put a unary operator juxtaposed to an identifier, e.g., ©*i©, users will be annoyed if they cannot do this with respect to operator identifiers. There are several ambiguous cases with operator identifiers, \eg ©int *?*?()©, where the string ©*?*?© can be lexed as ©*©/©?*?© or ©*?©/©*?©. Since it is common practise to put a unary operator juxtaposed to an identifier, \eg ©*i©, users will be annoyed if they cannot do this with respect to operator identifiers. Even with this special hack, there are 5 general cases that cannot be handled. 
The first case is for the function-call identifier ©?()©: This means that a function requiring mutual exclusion could block if the lock is already held by another thread. Blocking on a monitor lock does not block the kernel thread, it simply blocks the user thread, which yields its kernel thread while waiting to obtain the lock. If multiple mutex parameters are specified, they will be locked in parameter order (i.e. first parameter is locked first) and unlocked in the If multiple mutex parameters are specified, they will be locked in parameter order (\ie first parameter is locked first) and unlocked in the reverse order. \begin{lstlisting} \section{New Keywowrds} ©catch©, ©catchResume©, ©choose©, \quad ©disable©, ©dtype©, \quad ©enable©, \quad ©fallthrough©, ©fallthru©, ©finally©, ©forall©, ©ftype©, \quad ©lvalue©, \quad ©otype©, \quad ©throw©, ©throwResume©, ©trait©, ©try© \section{Incompatible} \CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC}. Nested types are not hoisted and can be referenced using the field selection operator ©.©'', unlike the \CC scope-resolution operator ©::©''. Given that nested types in C are equivalent to not using them, i.e., they are essentially useless, it is unlikely there are any realistic usages that break because of this incompatibility. Given that nested types in C are equivalent to not using them, \ie they are essentially useless, it is unlikely there are any realistic usages that break because of this incompatibility. \end{description} \label{s:RationalNumbers} Rational numbers are numbers written as a ratio, i.e., as a fraction, where the numerator (top number) and the denominator (bottom number) are whole numbers. Rational numbers are numbers written as a ratio, \ie as a fraction, where the numerator (top number) and the denominator (bottom number) are whole numbers. 
When creating and computing with rational numbers, results are constantly reduced to keep the numerator and denominator as small as possible.