Changeset b1e63ac5 for doc/user/user.tex
- Timestamp:
- Jul 4, 2017, 9:40:16 AM (8 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
- Children:
- 208e5be
- Parents:
- 9c951e3 (diff), f7cb0bc (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
- File:
- 1 edited
doc/user/user.tex (modified) (116 diffs)
doc/user/user.tex
r9c951e3 rb1e63ac5 11 11 %% Created On : Wed Apr 6 14:53:29 2016 12 12 %% Last Modified By : Peter A. Buhr 13 %% Last Modified On : Mon May 15 18:29:58 2017 14 %% Update Count : 1598 13 %% Last Modified On : Fri Jun 16 12:00:01 2017 14 %% Update Count : 2433 15 15 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 16 16 … … 43 43 \usepackage[pagewise]{lineno} 44 44 \renewcommand{\linenumberfont}{\scriptsize\sffamily} 45 \input{common} % bespoke macros used in the document 45 \input{common} % common CFA document macros 46 46 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref} 47 47 \usepackage{breakurl} … … 94 94 \author{ 95 95 \huge \CFA Team \medskip \\ 96 \Large Peter A. Buhr, Richard Bilson, Thierry Delisle, \smallskip \\ 96 \Large Andrew Beach, Richard Bilson, Peter A. Buhr, Thierry Delisle, \smallskip \\ 97 97 \Large Glen Ditchfield, Rodolfo G. Esteves, Aaron Moss, Rob Schluntz 98 98 }% author … … 110 110 \renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}} 111 111 \pagenumbering{roman} 112 \linenumbers % comment out to turn off line numbering 112 %\linenumbers % comment out to turn off line numbering 113 113 114 114 \maketitle … … 135 135 136 136 \CFA{}\index{cforall@\CFA}\footnote{Pronounced ``\Index*{C-for-all}'', and written \CFA, CFA, or \CFL.} is a modern general-purpose programming-language, designed as an evolutionary step forward for the C programming language. 137 The syntax of the \CFA language builds from C, and should look immediately familiar to C/\Index*[C++]{\CC } programmers. 138 % Any language feature that is not described here can be assumed to be using the standard C11 syntax. 137 The syntax of the \CFA language builds from C, and should look immediately familiar to C/\Index*[C++]{\CC{}} programmers. 
138 % Any language feature that is not described here can be assumed to be using the standard \Celeven syntax. 139 139 \CFA adds many modern programming-language features that directly lead to increased \emph{\Index{safety}} and \emph{\Index{productivity}}, while maintaining interoperability with existing C programs and achieving C performance. 140 Like C, \CFA is a statically typed, procedural language with a low-overhead runtime, meaning there is no global \Index{garbage-collection}, but \Index{regional garbage-collection}\index{garbage collection!regional} is possible.140 Like C, \CFA is a statically typed, procedural language with a low-overhead runtime, meaning there is no global \Index{garbage-collection}, but \Index{regional garbage-collection}\index{garbage-collection!regional} is possible. 141 141 The primary new features include parametric-polymorphic routines and types, exceptions, concurrency, and modules. 142 142 … … 147 147 instead, a programmer evolves an existing C program into \CFA by incrementally incorporating \CFA features. 148 148 New programs can be written in \CFA using a combination of C and \CFA features. 149 \Index*[C++]{\CC } had a similar goal 30 years ago, but currently has the disadvantages of multiple legacy design-choices that cannot be updated and active divergence of the language model from C, requiring significant effort and training to incrementally add \CC to a C-based project.149 \Index*[C++]{\CC{}} had a similar goal 30 years ago, but currently has the disadvantages of multiple legacy design-choices that cannot be updated and active divergence of the language model from C, requiring significant effort and training to incrementally add \CC to a C-based project. 150 150 In contrast, \CFA has 30 years of hindsight and a clean starting point. 
151 151 152 Like \Index*[C++]{\CC }, there may be both an old and new ways to achieve the same effect.153 For example, the following programs compare the \CFA and C I/O mechanisms.152 Like \Index*[C++]{\CC{}}, there may be both an old and new ways to achieve the same effect. 153 For example, the following programs compare the \CFA, C, and \CC I/O mechanisms, where the programs output the same result. 154 154 \begin{quote2} 155 155 \begin{tabular}{@{}l@{\hspace{1.5em}}l@{\hspace{1.5em}}l@{}} 156 156 \multicolumn{1}{c@{\hspace{1.5em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} & \multicolumn{1}{c}{\textbf{\CC}} \\ 157 157 \begin{cfa} 158 #include <fstream> 158 #include <fstream>§\indexc{fstream}§ 159 159 160 160 int main( void ) { … … 165 165 & 166 166 \begin{lstlisting} 167 #include <stdio.h> 167 #include <stdio.h>§\indexc{stdio.h}§ 168 168 169 169 int main( void ) { … … 174 174 & 175 175 \begin{lstlisting} 176 #include <iostream> 176 #include <iostream>§\indexc{iostream}§ 177 177 using namespace std; 178 178 int main() { … … 183 183 \end{tabular} 184 184 \end{quote2} 185 The programs output the same result. 186 While the \CFA I/O looks similar to the \Index*[C++]{\CC} output style, there are important differences, such as automatic spacing between variables as in \Index*{Python} (see~\VRef{s:IOLibrary}). 187 188 This document is a user manual for the \CFA programming language, targeted at \CFA programmers. 185 While the \CFA I/O looks similar to the \Index*[C++]{\CC{}} output style, there are important differences, such as automatic spacing between variables as in \Index*{Python} (see~\VRef{s:IOLibrary}). 186 187 This document is a programmer reference-manual for the \CFA programming language. 188 The manual covers the core features of the language and runtime-system, with simple examples illustrating syntax and semantics of each feature. 189 The manual does not teach programming, i.e., how to combine the new constructs to build complex programs. 
190 A reader should already have an intermediate knowledge of control flow, data structures, and concurrency issues to understand the ideas presented as well as some experience programming in C/\CC. 189 191 Implementers may refer to the \CFA Programming Language Specification for details about the language syntax and semantics. 190 In its current state, this document covers the intended core features of the language.191 192 Changes to the syntax and additional features are expected to be included in later revisions. 192 193 … … 198 199 Even with all its problems, C continues to be popular because it allows writing software at virtually any level in a computer system without restriction. 199 200 For system programming, where direct access to hardware and dealing with real-time issues is a requirement, C is usually the language of choice. 200 The TIOBE index~\cite{TIOBE} for March 2016 showed the following programming-language popularity: \Index*{Java} 20.5\%, C 14.5\%, \Index*[C++]{\CC } 6.7\%, \Csharp 4.3\%, \Index*{Python} 4.3\%, where the next 50 languages are less than 3\% each with a long tail.201 The TIOBE index~\cite{TIOBE} for March 2016 showed the following programming-language popularity: \Index*{Java} 20.5\%, C 14.5\%, \Index*[C++]{\CC{}} 6.7\%, \Csharp 4.3\%, \Index*{Python} 4.3\%, where the next 50 languages are less than 3\% each with a long tail. 201 202 As well, for 30 years, C has been the number 1 and 2 most popular programming language: 202 203 \begin{center} … … 217 218 218 219 As stated, the goal of the \CFA project is to engineer modern language features into C in an evolutionary rather than revolutionary way. 
219 \CC~\cite{ c++,ANSI14:C++} is an example of a similar project;220 \CC~\cite{C++14,C++} is an example of a similar project; 220 221 however, it largely extended the language, and did not address many existing problems.\footnote{% 221 222 Two important existing problems addressed were changing the type of character literals from ©int© to ©char© and enumerator from ©int© to the type of its enumerators.} … … 226 227 These costs can be prohibitive for many companies with a large software base in C/\CC, and a significant number of programmers requiring retraining to a new programming language. 227 228 228 The result of this project is a language that is largely backwards compatible with \Index* {C11}~\cite{C11}, but fixing some of the well known C problems and containing many modern language features.229 The result of this project is a language that is largely backwards compatible with \Index*[C11]{\Celeven{}}~\cite{C11}, but fixing some of the well known C problems and containing many modern language features. 229 230 Without significant extension to the C programming language, it is becoming unable to cope with the needs of modern programming problems and programmers; 230 231 as a result, it will fade into disuse. 231 232 Considering the large body of existing C code and programmers, there is significant impetus to ensure C is transformed into a modern programming language. 232 While \Index* {C11} made a few simple extensions to the language, nothing was added to address existing problems in the language or to augment the language with modern language features.233 While \Index*[C11]{\Celeven{}} made a few simple extensions to the language, nothing was added to address existing problems in the language or to augment the language with modern language features. 
233 234 While some may argue that modern language features may make C complex and inefficient, it is clear a language without modern capabilities is insufficient for the advanced programming problems existing today. 234 235 … … 244 245 int forty_two = identity( 42 ); §\C{// T is bound to int, forty\_two == 42}§ 245 246 \end{lstlisting} 246 % extending the C type system with parametric polymorphism and overloading, as opposed to the \Index*[C++]{\CC } approach of object-oriented extensions. 247 % extending the C type system with parametric polymorphism and overloading, as opposed to the \Index*[C++]{\CC{}} approach of object-oriented extensions. 247 248 \CFA{}\hspace{1pt}'s polymorphism was originally formalized by Ditchfield~\cite{Ditchfield92}, and first implemented by Bilson~\cite{Bilson03}. 248 249 However, at that time, there was little interest in extending C, so work did not continue. … … 263 264 A simple example is leveraging the existing type-unsafe (©void *©) C ©bsearch© to binary search a sorted floating-point array: 264 265 \begin{lstlisting} 265 void * bsearch( const void * key, const void * base, size_t nmemb, size_t size, 266 void * bsearch( const void * key, const void * base, size_t dim, size_t size, 266 267 int (* compar)( const void *, const void * )); 267 268 … … 341 342 The 1999 C standard plus GNU extensions. 342 343 \item 344 {\lstset{deletekeywords={inline}} 343 345 \Indexc{-fgnu89-inline}\index{compilation option!-fgnu89-inline@{©-fgnu89-inline©}} 344 346 Use the traditional GNU semantics for inline routines in C99 mode, which allows inline routines in header files. 
347 }% 345 348 \end{description} 346 349 The following new \CFA options are available: … … 413 416 \begin{cfa} 414 417 #ifndef __CFORALL__ 415 #include <stdio.h> §\C{// C header file}§418 #include <stdio.h>§\indexc{stdio.h}§ §\C{// C header file}§ 416 419 #else 417 #include <fstream> §\C{// \CFA header file}§420 #include <fstream>§\indexc{fstream}§ §\C{// \CFA header file}§ 418 421 #endif 419 422 \end{cfa} … … 451 454 the type suffixes ©U©, ©L©, etc. may start with an underscore ©1_U©, ©1_ll© or ©1.0E10_f©. 452 455 \end{enumerate} 453 It is significantly easier to read and enter long constants when they are broken up into smaller groupings (m ost cultures use commaor period among digits for the same purpose).456 It is significantly easier to read and enter long constants when they are broken up into smaller groupings (many cultures use comma and/or period among digits for the same purpose). 454 457 This extension is backwards compatible, matches with the use of underscore in variable names, and appears in \Index*{Ada} and \Index*{Java} 8. 455 458 … … 461 464 \begin{cfa} 462 465 int ®`®otype®`® = 3; §\C{// make keyword an identifier}§ 463 double ®`® choose®`® = 3.5;464 \end{cfa} 465 Programs can be converted easily by enclosing keyword identifiers in backquotes, and the backquotes can be removed later when the identifier name is changed to anon-keyword name.466 double ®`®forall®`® = 3.5; 467 \end{cfa} 468 Existing C programs with keyword clashes can be converted by enclosing keyword identifiers in backquotes, and eventually the identifier name can be changed to a non-keyword name. 466 469 \VRef[Figure]{f:InterpositionHeaderFile} shows how clashes in C header files (see~\VRef{s:StandardHeaders}) can be handled using preprocessor \newterm{interposition}: ©#include_next© and ©-I filename©: 467 470 … … 470 473 // include file uses the CFA keyword "otype". 471 474 #if ! 
defined( otype ) §\C{// nesting ?}§ 472 #define otype `otype`475 #define otype ®`®otype®`® §\C{// make keyword an identifier}§ 473 476 #define __CFA_BFD_H__ 474 477 #endif // ! otype … … 494 497 \begin{tabular}{@{}ll@{}} 495 498 \begin{cfa} 496 int * x[5]499 int * x[5] 497 500 \end{cfa} 498 501 & … … 505 508 For example, a routine returning a \Index{pointer} to an array of integers is defined and used in the following way: 506 509 \begin{cfa} 507 int (*f())[5] {...}; §\C{}§508 ... (*f())[3] += 1; 510 int ®(*®f®())[®5®]® {...}; §\C{definition}§ 511 ... ®(*®f®())[®3®]® += 1; §\C{usage}§ 509 512 \end{cfa} 510 513 Essentially, the return type is wrapped around the routine name in successive layers (like an \Index{onion}). … … 513 516 \CFA provides its own type, variable and routine declarations, using a different syntax. 514 517 The new declarations place qualifiers to the left of the base type, while C declarations place qualifiers to the right of the base type. 515 In the following example, \R{red} is for the base type and \B{blue} is for thequalifiers.516 The \CFA declarations move the qualifiers to the left of the base type, i.e.,move the blue to the left of the red, while the qualifiers have the same meaning but are ordered left to right to specify a variable's type.518 In the following example, \R{red} is the base type and \B{blue} is qualifiers. 519 The \CFA declarations move the qualifiers to the left of the base type, \ie move the blue to the left of the red, while the qualifiers have the same meaning but are ordered left to right to specify a variable's type. 517 520 \begin{quote2} 518 521 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} … … 531 534 \end{tabular} 532 535 \end{quote2} 533 The only exception is bit fieldspecification, which always appear to the right of the base type.536 The only exception is \Index{bit field} specification, which always appear to the right of the base type. 
534 537 % Specifically, the character ©*© is used to indicate a pointer, square brackets ©[©\,©]© are used to represent an array or function return value, and parentheses ©()© are used to indicate a routine parameter. 535 538 However, unlike C, \CFA type declaration tokens are distributed across all variables in the declaration list. … … 580 583 \begin{cfa} 581 584 int z[ 5 ]; 582 char * w[ 5 ];583 double (* v)[ 5 ];585 char * w[ 5 ]; 586 double (* v)[ 5 ]; 584 587 struct s { 585 588 int f0:3; 586 int * f1;587 int * f2[ 5 ]589 int * f1; 590 int * f2[ 5 ] 588 591 }; 589 592 \end{cfa} … … 634 637 \begin{cfa} 635 638 int extern x[ 5 ]; 636 const int static * y;639 const int static * y; 637 640 \end{cfa} 638 641 & … … 644 647 \end{quote2} 645 648 646 Unsupported are K\&R C declarations where the base type defaults to ©int©, if no type is specified,\footnote{ 647 At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each structure declaration and type name~\cite[\S~6.7.2(2)]{C11}} 648 \eg: 649 \begin{cfa} 650 x; §\C{// int x}§ 651 *y; §\C{// int *y}§ 652 f( p1, p2 ); §\C{// int f( int p1, int p2 );}§ 653 f( p1, p2 ) {} §\C{// int f( int p1, int p2 ) {}}§ 654 \end{cfa} 649 The new declaration syntax can be used in other contexts where types are required, \eg casts and the pseudo-routine ©sizeof©: 650 \begin{quote2} 651 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 652 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 653 \begin{cfa} 654 y = (®* int®)x; 655 i = sizeof(®[ 5 ] * int®); 656 \end{cfa} 657 & 658 \begin{cfa} 659 y = (®int *®)x; 660 i = sizeof(®int * [ 5 ]®); 661 \end{cfa} 662 \end{tabular} 663 \end{quote2} 655 664 656 665 Finally, new \CFA declarations may appear together with C declarations in the same program block, but cannot be mixed within a specific declaration. 
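For comparison with the C column of the declaration tables in this changeset, the following is a minimal plain-C sketch of the "onion" declaration style: a routine returning a pointer to an array of five integers, and the C type name for an array of five pointers used inside ©sizeof©. The names table, f, bump_element and size_of_pointer_array are hypothetical and chosen only for illustration; they do not appear in user.tex.

```c
#include <assert.h>
#include <stddef.h>

static int table[5] = { 10, 20, 30, 40, 50 };

/* f is a routine returning a pointer to an array of 5 ints; the return
   type is wrapped around the routine name in layers. */
int (*f(void))[5] {
    return &table;
}

/* Usage mirrors the declaration: call f, dereference, then subscript. */
int bump_element(size_t i) {
    assert(i < 5);              /* stay inside the 5-element array */
    (*f())[i] += 1;
    return (*f())[i];
}

/* The C type name for "array of 5 pointers to int" is written int *[5]. */
size_t size_of_pointer_array(void) {
    return sizeof(int *[5]);
}
```

Calling bump_element(3) increments the fourth element through the returned array pointer, which is the same shape as the ©(*f())[3] += 1© usage shown in the manual text.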
… … 659 668 660 669 661 \section{Pointer /Reference}670 \section{Pointer/Reference} 662 671 663 672 C provides a \newterm{pointer type}; 664 673 \CFA adds a \newterm{reference type}. 665 Both types contain an \newterm{address}, which is normally a location in memory. 666 Special addresses are used to denote certain states or access co-processor memory. 667 By convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value or other special states. 668 Often dereferencing a special state causes a \Index{memory fault}, so checking is necessary during execution. 669 If the programming language assigns addresses, a program's execution is \Index{sound}, i.e., all addresses are to valid memory locations. 670 C allows programmers to assign addresses, so there is the potential for incorrect addresses, both inside and outside of the computer address-space. 671 672 Program variables are implicit pointers to memory locations generated by the compiler and automatically dereferenced, as in: 674 These types may be derived from an object or routine type, called the \newterm{referenced type}. 675 Objects of these types contain an \newterm{address}, which is normally a location in memory, but may also address memory-mapped registers in hardware devices. 676 An integer constant expression with the value 0, or such an expression cast to type ©void *©, is called a \newterm{null-pointer constant}.\footnote{ 677 One way to conceptualize the null pointer is that no variable is placed at this address, so the null-pointer address can be used to denote an uninitialized pointer/reference object; 678 \ie the null pointer is guaranteed to compare unequal to a pointer to any object or routine.} 679 An address is \newterm{sound}, if it points to a valid memory location in scope, \ie within the program's execution-environment and has not been freed. 
680 Dereferencing an \newterm{unsound} address, including the null pointer, is \Index{undefined}, often resulting in a \Index{memory fault}. 681 682 A program \newterm{object} is a region of data storage in the execution environment, the contents of which can represent values. 683 In most cases, objects are located in memory at an address, and the variable name for an object is an implicit address to the object generated by the compiler and automatically dereferenced, as in: 673 684 \begin{quote2} 674 \begin{tabular}{@{}ll l@{}}685 \begin{tabular}{@{}ll@{\hspace{2em}}l@{}} 675 686 \begin{cfa} 676 687 int x; … … 691 702 \end{quote2} 692 703 where the right example is how the compiler logically interprets the variables in the left example. 693 Since a variable name only points to one location during its lifetime, it is an \Index{immutable} \Index{pointer}; 694 hence, variables ©x© and ©y© are constant pointers in the compiler interpretation. 695 In general, variable addresses are stored in instructions instead of loaded independently, so an instruction fetch implicitly loads a variable's address. 704 Since a variable name only points to one address during its lifetime, it is an \Index{immutable} \Index{pointer}; 705 hence, the implicit type of pointer variables ©x© and ©y© are constant pointers in the compiler interpretation. 706 In general, variable addresses are stored in instructions instead of loaded from memory, and hence may not occupy storage. 
707 These approaches are contrasted in the following: 696 708 \begin{quote2} 697 709 \begin{tabular}{@{}l|l@{}} 710 \multicolumn{1}{c|}{explicit variable address} & \multicolumn{1}{c}{implicit variable address} \\ 711 \hline 698 712 \begin{cfa} 699 713 lda r1,100 // load address of x 700 ld r2,(r1)// load value of x714 ld r2,(r1) // load value of x 701 715 lda r3,104 // load address of y 702 st r2,(r3)// store x into y716 st r2,(r3) // store x into y 703 717 \end{cfa} 704 718 & … … 711 725 \end{tabular} 712 726 \end{quote2} 713 Finally, the immutable nature of a variable's address and the fact that there is no storage for a variable addressmeans pointer assignment\index{pointer!assignment}\index{assignment!pointer} is impossible.714 Therefore, the expression ©x = y© only has one meaning, ©*x = *y©, i.e., manipulate values, which is why explicitly writing the dereferences is unnecessary even though it occurs implicitly as part of instruction decoding.715 716 A \Index{pointer}/\Index{reference} is a generalization of a variable name, i.e.,a mutable address that can point to more than one memory location during its lifetime.717 (Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime and may not occupy storage asthe literal is embedded directly into instructions.)727 Finally, the immutable nature of a variable's address and the fact that there is no storage for the variable pointer means pointer assignment\index{pointer!assignment}\index{assignment!pointer} is impossible. 728 Therefore, the expression ©x = y© has only one meaning, ©*x = *y©, \ie manipulate values, which is why explicitly writing the dereferences is unnecessary even though it occurs implicitly as part of \Index{instruction decoding}. 
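The compiler interpretation described above, where a variable name is an immutable pointer that is dereferenced automatically, can be modelled in plain C. This is only a sketch under stated assumptions: cell_x and cell_y are hypothetical stand-ins for the storage a compiler would allocate, and assign_x_from_y is an illustrative helper, not something taken from user.tex.

```c
#include <assert.h>

/* Storage cells standing in for the memory the compiler would allocate. */
static int cell_x, cell_y;

/* Model of "int x, y;": each name is an immutable pointer to its cell. */
static int * const x = &cell_x;
static int * const y = &cell_y;

/* Model of the source statement "x = y": only values can move, because
   x and y themselves cannot be made to point elsewhere. */
int assign_x_from_y(int y_value) {
    *y = y_value;
    *x = *y;
    assert(x == &cell_x);       /* the address bound to x never changes */
    return *x;
}
```

Writing ©x = y© against this model is rejected by a C compiler because x is a const pointer, which matches the statement that pointer assignment is impossible for ordinary variable names.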
729 730 A \Index{pointer}/\Index{reference} object is a generalization of an object variable-name, \ie a mutable address that can point to more than one memory location during its lifetime. 731 (Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime, and like a variable name, may not occupy storage if the literal is embedded directly into instructions.) 718 732 Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, \eg: 719 733 \begin{quote2} 720 \begin{tabular}{@{}l l@{}}734 \begin{tabular}{@{}l@{\hspace{2em}}l@{}} 721 735 \begin{cfa} 722 736 int x, y, ®*® p1, ®*® p2, ®**® p3; … … 727 741 \end{cfa} 728 742 & 729 \raisebox{-0. 45\totalheight}{\input{pointer2.pstex_t}}743 \raisebox{-0.5\totalheight}{\input{pointer2.pstex_t}} 730 744 \end{tabular} 731 745 \end{quote2} 732 746 733 Notice, an address has a duality\index{address!duality}: a location in memory or the value at that location. 734 In many cases, a compiler might be able to infer the meaning: 747 Notice, an address has a \Index{duality}\index{address!duality}: a location in memory or the value at that location. 748 In many cases, a compiler might be able to infer the best meaning for these two cases. 749 For example, \Index*{Algol68}~\cite{Algol68} infers pointer dereferencing to select the best meaning for each pointer usage 735 750 \begin{cfa} 736 751 p2 = p1 + x; §\C{// compiler infers *p2 = *p1 + x;}§ 737 752 \end{cfa} 738 because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation.739 \Index*{Algol68}~\cite{Algol68} inferences pointer dereferencing to select the best meaning for each pointer usage. 
740 However, in C, the following cases are ambiguous, especially with pointer arithmetic: 741 \begin{cfa} 742 p1 = p2; §\C{// p1 = p2\ \ or\ \ *p1 = *p2}§ 743 p1 = p1 + 1; §\C{// p1 = p1 + 1\ \ or\ \ *p1 = *p1 + 1}§ 744 \end{cfa} 745 746 Most languages pick one meaning as the default and the programmer explicitly indicates the other meaning to resolve the address-duality ambiguity\index{address! ambiguity}. 747 In C, the default meaning for pointers is to manipulate the pointer's address and the pointed-to value is explicitly accessed by the dereference operator ©*©. 753 Algol68 infers the following dereferencing ©*p2 = *p1 + x©, because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation. 754 Unfortunately, automatic dereferencing does not work in all cases, and so some mechanism is necessary to fix incorrect choices. 755 756 Rather than inferring dereference, most programming languages pick one implicit dereferencing semantics, and the programmer explicitly indicates the other to resolve address-duality. 
757 In C, objects of pointer type always manipulate the pointer object's address: 758 \begin{cfa} 759 p1 = p2; §\C{// p1 = p2\ \ rather than\ \ *p1 = *p2}§ 760 p2 = p1 + x; §\C{// p2 = p1 + x\ \ rather than\ \ *p2 = *p1 + x}§ 761 \end{cfa} 762 even though the assignment to ©p2© is likely incorrect, and the programmer probably meant: 748 763 \begin{cfa} 749 764 p1 = p2; §\C{// pointer address assignment}§ 750 *p1 = *p1 + 1;§\C{// pointed-to value assignment / operation}§751 \end{cfa} 752 which workswell for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©).765 ®*®p2 = ®*®p1 + x; §\C{// pointed-to value assignment / operation}§ 766 \end{cfa} 767 The C semantics work well for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©). 753 768 754 769 However, in most other situations, the pointed-to value is requested more often than the pointer address. … … 762 777 \end{cfa} 763 778 764 To s witch the default meaning for an address requires a new kind of pointer, called a \newterm{reference} denoted by ©&©.779 To support this common case, a reference type is introduced in \CFA, denoted by ©&©, which is the opposite dereference semantics to a pointer type, making the value at the pointed-to location the implicit semantics for dereferencing (similar but not the same as \CC \Index{reference type}s). 765 780 \begin{cfa} 766 781 int x, y, ®&® r1, ®&® r2, ®&&® r3; … … 773 788 Except for auto-dereferencing by the compiler, this reference example is the same as the previous pointer example. 774 789 Hence, a reference behaves like the variable name for the current variable it is pointing-to. 
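The C rule stated above (a pointer expression manipulates the address unless ©*© is written) can be checked with a small plain-C sketch. The function name address_then_value and its parameters are hypothetical, introduced only to make the two statements from the example executable.

```c
#include <assert.h>

/* Runs the two statements from the text on local objects a and b and
   returns the final value of b. */
int address_then_value(int a, int b, int x) {
    const int a_before = a;
    int *p1 = &a, *p2 = &b;

    p1 = p2;                    /* address assignment: p1 now also points at b */
    *p2 = *p1 + x;              /* value operation: b = b + x */

    assert(p1 == &b);           /* p1 was re-pointed, not copied through */
    assert(a == a_before);      /* a was never written */
    return b;
}
```

With a = 10, b = 20 and x = 3, the address assignment leaves both integers untouched and the explicit dereference then updates b alone.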
775 The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, \eg: 776 \begin{cfa} 777 r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15); 778 \end{cfa} 779 is rewritten as: 790 One way to conceptualize a reference is via a rewrite rule, where the compiler inserts a dereference operator before the reference variable for each reference qualifier in a declaration, so the previous example becomes: 780 791 \begin{cfa} 781 792 ®*®r2 = ((®*®r1 + ®*®r2) ®*® (®**®r3 - ®*®r1)) / (®**®r3 - 15); 782 793 \end{cfa} 783 When a reference operation appears beside a dereference operation, \eg ©&*©, they cancel out.\footnote{ 794 When a reference operation appears beside a dereference operation, \eg ©&*©, they cancel out. 795 However, in C, the cancellation always yields a value (\Index{rvalue}).\footnote{ 784 796 The unary ©&© operator yields the address of its operand. 785 797 If the operand has type ``type'', the result has type ``pointer to type''. 
786 798 If the operand is the result of a unary ©*© operator, neither that operator nor the ©&© operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue.~\cite[\S~6.5.3.2--3]{C11}} 787 Hence, assigning to a reference requires the address of the reference variable(\Index{lvalue}):788 \begin{cfa} 789 (&®*®)r1 = &x; §\C{// (\&*) cancel giving variabler1 not variable pointed-to by r1}§799 For a \CFA reference type, the cancellation on the left-hand side of assignment leaves the reference as an address (\Index{lvalue}): 800 \begin{cfa} 801 (&®*®)r1 = &x; §\C{// (\&*) cancel giving address in r1 not variable pointed-to by r1}§ 790 802 \end{cfa} 791 803 Similarly, the address of a reference can be obtained for assignment or computation (\Index{rvalue}): 792 804 \begin{cfa} 793 (&(&®*®)®*®)r3 = &(&®*®)r2; §\C{// (\&*) cancel giving address of r2, (\&(\&*)*) cancel giving variable r3}§ 794 \end{cfa} 795 Cancellation\index{cancellation!pointer/reference}\index{pointer!cancellation} works to arbitrary depth, and pointer and reference values are interchangeable because both contain addresses. 805 (&(&®*®)®*®)r3 = &(&®*®)r2; §\C{// (\&*) cancel giving address in r2, (\&(\&*)*) cancel giving address in r3}§ 806 \end{cfa} 807 Cancellation\index{cancellation!pointer/reference}\index{pointer!cancellation} works to arbitrary depth. 808 809 Fundamentally, pointer and reference objects are functionally interchangeable because both contain addresses. 796 810 \begin{cfa} 797 811 int x, *p1 = &x, **p2 = &p1, ***p3 = &p2, … … 805 819 &&&r3 = p3; §\C{// change r3 to p3, (\&(\&(\&*)*)*)r3, 3 cancellations}§ 806 820 \end{cfa} 807 Finally, implicit dereferencing and cancellation are a static (compilation) phenomenon not a dynamic one. 
808 That is, all implicit dereferencing and any cancellation is carried out prior to the start of the program, so reference performance is equivalent to pointer performance. 809 A programmer selects a pointer or reference type solely on whether the address is dereferenced frequently or infrequently, which dictates the amount of direct aid from the compiler; 810 otherwise, everything else is equal. 811 812 Interestingly, \Index*[C++]{\CC} deals with the address duality by making the pointed-to value the default, and prevent\-ing changes to the reference address, which eliminates half of the duality. 813 \Index*{Java} deals with the address duality by making address assignment the default and requiring field assignment (direct or indirect via methods), i.e., there is no builtin bit-wise or method-wise assignment, which eliminates half of the duality. 814 815 As for a pointer, a reference may have qualifiers: 816 \begin{cfa} 817 const int cx = 5; §\C{// cannot change cx;}§ 818 const int & cr = cx; §\C{// cannot change what cr points to}§ 819 ®&®cr = &cx; §\C{// can change cr}§ 820 cr = 7; §\C{// error, cannot change cx}§ 821 int & const rc = x; §\C{// must be initialized, \CC reference}§ 822 ®&®rc = &x; §\C{// error, cannot change rc}§ 823 const int & const crc = cx; §\C{// must be initialized, \CC reference}§ 824 crc = 7; §\C{// error, cannot change cx}§ 825 ®&®crc = &cx; §\C{// error, cannot change crc}§ 826 \end{cfa} 827 Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, \eg: 828 \begin{cfa} 829 int & const r = *0; §\C{// where 0 is the int * zero}§ 830 \end{cfa} 831 Otherwise, the compiler is managing the addresses for type ©& const© not the programmer, and by a programming discipline of only using references with references, address errors can be prevented. 
832 Finally, the position of the ©const© qualifier \emph{after} the pointer/reference qualifier causes confusion for C programmers. 821 Furthermore, both types are equally performant, as the same amount of dereferencing occurs for both types. 822 Therefore, the choice between them is based solely on whether the address is dereferenced frequently or infrequently, which dictates the amount of implicit dereferencing aid from the compiler. 823 824 As for a pointer type, a reference type may have qualifiers: 825 \begin{cfa} 826 const int cx = 5; §\C{// cannot change cx;}§ 827 const int & cr = cx; §\C{// cannot change what cr points to}§ 828 ®&®cr = &cx; §\C{// can change cr}§ 829 cr = 7; §\C{// error, cannot change cx}§ 830 int & const rc = x; §\C{// must be initialized}§ 831 ®&®rc = &x; §\C{// error, cannot change rc}§ 832 const int & const crc = cx; §\C{// must be initialized}§ 833 crc = 7; §\C{// error, cannot change cx}§ 834 ®&®crc = &cx; §\C{// error, cannot change crc}§ 835 \end{cfa} 836 Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be the null pointer unless an arbitrary pointer is coerced\index{coercion} into the reference}: 837 \begin{cfa} 838 int & const cr = *0; §\C{// where 0 is the int * zero}§ 839 \end{cfa} 840 Note, constant reference-types do not prevent \Index{addressing errors} because of explicit storage-management: 841 \begin{cfa} 842 int & const cr = *malloc(); 843 cr = 5; 844 free( &cr ); 845 cr = 7; §\C{// unsound pointer dereference}§ 846 \end{cfa} 847 848 The position of the ©const© qualifier \emph{after} the pointer/reference qualifier causes confusion for C programmers. 
833 849 The ©const© qualifier cannot be moved before the pointer/reference qualifier for C style-declarations; 834 \CFA-style declarations attempt to address this issue: 850 \CFA-style declarations (see \VRef{s:Declarations}) attempt to address this issue: 835 851 \begin{quote2} 836 852 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} … … 847 863 \end{tabular} 848 864 \end{quote2} 849 where the \CFA declaration is read left-to-right (see \VRef{s:Declarations}). 865 where the \CFA declaration is read left-to-right. 866 867 Finally, like pointers, references are usable and composable with other type operators and generators. 868 \begin{cfa} 869 int w, x, y, z, & ar[3] = { x, y, z }; §\C{// initialize array of references}§ 870 &ar[1] = &w; §\C{// change reference array element}§ 871 typeof( ar[1] ) p; §\C{// (gcc) is int, i.e., the type of referenced object}§ 872 typeof( &ar[1] ) q; §\C{// (gcc) is int \&, i.e., the type of reference}§ 873 sizeof( ar[1] ) == sizeof( int ); §\C{// is true, i.e., the size of referenced object}§ 874 sizeof( &ar[1] ) == sizeof( int *) §\C{// is true, i.e., the size of a reference}§ 875 \end{cfa} 876 877 In contrast to \CFA reference types, \Index*[C++]{\CC{}}'s reference types are all ©const© references, preventing changes to the reference address, so only value assignment is possible, which eliminates half of the \Index{address duality}. 878 Also, \CC does not allow \Index{array}s\index{array!reference} of references\footnote{ 879 The reason for disallowing arrays of references is unknown, but possibly comes from references being ethereal (like a textual macro), and hence, replaceable by the referent object.}. 880 \Index*{Java}'s reference types to objects (all Java objects are on the heap) are like C pointers, which always manipulate the address, and there is no (bit-wise) object assignment, so objects are explicitly cloned by shallow or deep copying, which eliminates half of the address duality. 
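As a rough illustration of the address duality, the two kinds of assignment that a \CFA reference writes as ©r = v© (value) and ©&r = &target© (address) correspond to the pointer operations below. This is a sketch in plain C, not \CFA code, and the routine names ©assign_value©, ©assign_address©, and ©duality_demo© are hypothetical.

```c
#include <assert.h>

/* Value assignment through a pointer: roughly the CFA form  r = v;  */
void assign_value( int * p, int v ) {
	*p = v;
}

/* Address assignment: roughly the CFA form  &r = &target;  */
void assign_address( int ** p, int * target ) {
	*p = target;
}

/* Returns 1 when both halves of the duality behave as described. */
int duality_demo( void ) {
	int x = 1, y = 2;
	int * p = &x;
	assign_value( p, 5 );			/* x becomes 5, p still points at x */
	if ( x != 5 || p != &x ) return 0;
	assign_address( &p, &y );		/* p now points at y, x and y are unchanged */
	if ( p != &y || x != 5 || y != 2 ) return 0;
	assign_value( p, 7 );			/* y becomes 7 through the rebound pointer */
	return x == 5 && y == 7;
}
```

With a pointer, the programmer writes the dereference in ©assign_value© explicitly; with a reference, the compiler supplies it and the programmer instead marks the less common address assignment.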
881 882 883 \subsection{Initialization} 850 884 851 885 \Index{Initialization} is different than \Index{assignment} because initialization occurs on the empty (uninitialized) storage on an object, while assignment occurs on possibly initialized storage of an object. 852 886 There are three initialization contexts in \CFA: declaration initialization, argument/parameter binding, return/temporary binding. 853 For reference initialization (like pointer), the initializing value must be an address (\Index{lvalue}) not a value (\Index{rvalue}). 854 \begin{cfa} 855 int * p = &x; §\C{// both \&x and x are possible interpretations}§ 856 int & r = x; §\C{// x unlikely interpretation, because of auto-dereferencing}§ 857 \end{cfa} 858 Hence, the compiler implicitly inserts a reference operator, ©&©, before the initialization expression. 859 Similarly, when a reference is used for a parameter/return type, the call-site argument does not require a reference operator. 860 \begin{cfa} 861 int & f( int & rp ); §\C{// reference parameter and return}§ 862 z = f( x ) + f( y ); §\C{// reference operator added, temporaries needed for call results}§ 863 \end{cfa} 864 Within routine ©f©, it is possible to change the argument by changing the corresponding parameter, and parameter ©rp© can be locally reassigned within ©f©. 865 Since ©?+?© takes its arguments by value, the references returned from ©f© are used to initialize compiler generated temporaries with value semantics that copy from the references. 887 Because the object being initialized has no value, there is only one meaningful semantics with respect to address duality: it must mean address as there is no pointed-to value. 888 In contrast, the left-hand side of assignment has an address that has a duality. 889 Therefore, for pointer/reference initialization, the initializing value must be an address not a value. 
890 \begin{cfa} 891 int * p = &x; §\C{// assign address of x}§ 892 ®int * p = x;® §\C{// assign value of x}§ 893 int & r = x; §\C{// must have address of x}§ 894 \end{cfa} 895 Like the previous example with C pointer-arithmetic, it is unlikely assigning the value of ©x© into a pointer is meaningful (again, a warning is usually given). 896 Therefore, for safety, this context requires an address, so it is superfluous to require explicitly taking the address of the initialization object, even though the type is incorrect. 897 Note, this is strictly a convenience and safety feature for a programmer. 898 Hence, \CFA allows ©r© to be assigned ©x© because it infers a reference for ©x©, by implicitly inserting an address-of operator, ©&©, and it is an error to put an ©&© because the types no longer match due to the implicit dereference. 899 Unfortunately, C allows ©p© to be assigned with ©&x© (address) or ©x© (value), but most compilers warn about the latter assignment as being potentially incorrect. 900 Similarly, when a reference type is used for a parameter/return type, the call-site argument does not require a reference operator for the same reason. 901 \begin{cfa} 902 int & f( int & r ); §\C{// reference parameter and return}§ 903 z = f( x ) + f( y ); §\C{// reference operator added, temporaries needed for call results}§ 904 \end{cfa} 905 Within routine ©f©, it is possible to change the argument by changing the corresponding parameter, and parameter ©r© can be locally reassigned within ©f©. 906 Since operator routine ©?+?© takes its arguments by value, the references returned from ©f© are used to initialize compiler-generated temporaries with value semantics that copy from the references. 
907 \begin{cfa} 908 int temp1 = f( x ), temp2 = f( y ); 909 z = temp1 + temp2; 910 \end{cfa} 911 This \Index{implicit referencing} is crucial for reducing the syntactic burden for programmers when using references; 912 otherwise references have the same syntactic burden as pointers in these contexts. 866 913 867 914 When a pointer/reference parameter has a ©const© value (immutable), it is possible to pass literals and expressions. 868 915 \begin{cfa} 869 void f( ®const® int & cr p);870 void g( ®const® int * cp p);871 f( 3 ); g( &3 );872 f( x + y ); g( &(x + y) );916 void f( ®const® int & cr ); 917 void g( ®const® int * cp ); 918 f( 3 ); g( ®&®3 ); 919 f( x + y ); g( ®&®(x + y) ); 873 920 \end{cfa} 874 921 Here, the compiler passes the address to the literal 3 or the temporary for the expression ©x + y©, knowing the argument cannot be changed through the parameter. 875 (The ©&© is necessary for the pointer parameter to make the types match, and is a common requirement for a C programmer.) 876 \CFA \emph{extends} this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed-to by the reference parameter and can be changed. 877 \begin{cfa} 878 void f( int & rp ); 879 void g( int * pp ); 880 f( 3 ); g( &3 ); §\C{// compiler implicit generates temporaries}§ 881 f( x + y ); g( &(x + y) ); §\C{// compiler implicit generates temporaries}§ 922 The ©&© before the constant/expression for the pointer-type parameter (©g©) is a \CFA extension necessary to type match and is a common requirement before a variable in C (\eg ©scanf©). 923 Importantly, ©&3© may not be equal to ©&3©, where the references occur across calls because the temporaries may be different on each call. 
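In standard C, where ©&3© is not accepted, a comparable effect can be sketched with a C99 compound literal or an explicitly named temporary. The following is plain C rather than \CFA, and the routine names ©twice© and ©temporaries_demo© are hypothetical.

```c
#include <assert.h>

/* Reads through a pointer to an immutable int, like  g( const int * cp ). */
int twice( const int * cp ) {
	return 2 * *cp;
}

int temporaries_demo( void ) {
	int x = 4, y = 6;
	/* A C99 compound literal creates an unnamed int object whose address can be taken. */
	int a = twice( &(int){ 3 } );
	int b = twice( &(int){ x + y } );
	/* The same effect with an explicitly named temporary. */
	int temp = x + y;
	int c = twice( &temp );
	assert( b == c );			/* both forms pass the value 10 */
	return a + b;				/* 6 + 20 */
}
```

Either form makes the programmer spell out the temporary that the \CFA call ©g( &3 )© leaves to the compiler.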
924 925 \CFA \emph{extends} this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed-to by the reference parameter and can be changed.\footnote{ 926 If whole program analysis is possible, and shows the parameter is not assigned, \ie it is ©const©, the temporary is unnecessary.} 927 \begin{cfa} 928 void f( int & r ); 929 void g( int * p ); 930 f( 3 ); g( ®&®3 ); §\C{// compiler implicitly generates temporaries}§ 931 f( x + y ); g( ®&®(x + y) ); §\C{// compiler implicitly generates temporaries}§ 882 932 \end{cfa} 883 933 Essentially, there is an implicit \Index{rvalue} to \Index{lvalue} conversion in this case.\footnote{ … … 885 935 The implicit conversion allows seamless calls to any routine without having to explicitly name/copy the literal/expression to allow the call. 886 936 887 While \CFA attempts to handle pointers and references in a uniform, symmetric manner, C handles routine variables in an inconsistent way: a routine variable is both a pointer and a reference (particle and wave). 888 \begin{cfa} 889 void f( int p ) {...} 890 void (*fp)( int ) = &f; §\C{// pointer initialization}§ 891 void (*fp)( int ) = f; §\C{// reference initialization}§ 892 (*fp)(3); §\C{// pointer invocation}§ 893 fp(3); §\C{// reference invocation}§ 894 \end{cfa} 895 A routine variable is best described by a ©const© reference: 896 \begin{cfa} 897 const void (&fp)( int ) = f; 898 fp( 3 ); 899 fp = ... §\C{// error, cannot change code}§ 900 &fp = ...; §\C{// changing routine reference}§ 901 \end{cfa} 902 because the value of the routine variable is a routine literal, i.e., the routine code is normally immutable during execution.\footnote{ 937 %\CFA attempts to handle pointers and references in a uniform, symmetric manner. 938 Finally, C handles \Index{routine object}s in an inconsistent way. 939 A routine object is both a pointer and a reference (\Index{particle and wave}). 
940 \begin{cfa} 941 void f( int i ); 942 void (*fp)( int ); §\C{// routine pointer}§ 943 fp = f; §\C{// reference initialization}§ 944 fp = &f; §\C{// pointer initialization}§ 945 fp = *f; §\C{// reference initialization}§ 946 fp(3); §\C{// reference invocation}§ 947 (*fp)(3); §\C{// pointer invocation}§ 948 \end{cfa} 949 While C's treatment of routine objects has similarity to inferring a reference type in initialization contexts, the examples are assignment not initialization, and all possible forms of assignment are possible (©f©, ©&f©, ©*f©) without regard for type. 950 Instead, a routine object should be referenced by a ©const© reference: 951 \begin{cfa} 952 ®const® void (®&® fr)( int ) = f; §\C{// routine reference}§ 953 fr = ... §\C{// error, cannot change code}§ 954 &fr = ...; §\C{// changing routine reference}§ 955 fr( 3 ); §\C{// reference call to f}§ 956 (*fr)(3); §\C{// error, incorrect type}§ 957 \end{cfa} 958 because the value of the routine object is a routine literal, \ie the routine code is normally immutable during execution.\footnote{ 903 959 Dynamic code rewriting is possible but only in special circumstances.} 904 \CFA allows this additional use of references for routine variables in an attempt to give a more consistent meaning for them. 905 906 907 \section{Type Operators} 908 909 The new declaration syntax can be used in other contexts where types are required, \eg casts and the pseudo-routine ©sizeof©: 910 \begin{quote2} 911 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 912 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 913 \begin{cfa} 914 y = (®* int®)x; 915 i = sizeof(®[ 5 ] * int®); 916 \end{cfa} 917 & 918 \begin{cfa} 919 y = (®int *®)x; 920 i = sizeof(®int *[ 5 ]®); 921 \end{cfa} 922 \end{tabular} 923 \end{quote2} 960 \CFA allows this additional use of references for routine objects in an attempt to give a more consistent meaning for them. 
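The inconsistency in C's treatment of routine objects can be checked directly. The sketch below is plain C with hypothetical names ©increment© and ©routine_forms_demo©; it assigns the same routine in all three forms and calls it in both forms.

```c
#include <assert.h>

int increment( int i ) { return i + 1; }

/* Returns the call result if all assignment forms yield the same pointer, otherwise -1. */
int routine_forms_demo( void ) {
	int (*fp)( int );
	int same = 1;
	fp = increment;   same = same && ( fp == &increment );	/* routine name decays to a pointer */
	fp = &increment;  same = same && ( fp == increment );	/* explicit address-of */
	fp = *increment;  same = same && ( fp == &increment );	/* dereference decays back to a pointer */
	assert( fp( 3 ) == (*fp)( 3 ) );			/* both call forms invoke the same routine */
	return same ? fp( 3 ) : -1;
}
```

All three assignments store the same address, which is why C accepts them without regard for the apparent difference in type.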
961 962 963 \subsection{Address-of Semantics} 964 965 In C, ©&E© is an rvalue for any expression ©E©. 966 \CFA extends the ©&© (address-of) operator as follows: 967 \begin{itemize} 968 \item 969 if ©R© is an \Index{rvalue} of type ©T &$_1$...&$_r$© where $r \ge 1$ references (©&© symbols) then ©&R© has type ©T ®*®&$_{\color{red}2}$...&$_{\color{red}r}$©, \ie ©T© pointer with $r-1$ references (©&© symbols). 970 971 \item 972 if ©L© is an \Index{lvalue} of type ©T &$_1$...&$_l$© where $l \ge 0$ references (©&© symbols) then ©&L© has type ©T ®*®&$_{\color{red}1}$...&$_{\color{red}l}$©, \ie ©T© pointer with $l$ references (©&© symbols). 973 \end{itemize} 974 The following example shows the first rule applied to different \Index{rvalue} contexts: 975 \begin{cfa} 976 int x, * px, ** ppx, *** pppx, **** ppppx; 977 int & rx = x, && rrx = rx, &&& rrrx = rrx ; 978 x = rrrx; // rrrx is an lvalue with type int &&& (equivalent to x) 979 px = &rrrx; // starting from rrrx, &rrrx is an rvalue with type int *&&& (&x) 980 ppx = &&rrrx; // starting from &rrrx, &&rrrx is an rvalue with type int **&& (&rx) 981 pppx = &&&rrrx; // starting from &&rrrx, &&&rrrx is an rvalue with type int ***& (&rrx) 982 ppppx = &&&&rrrx; // starting from &&&rrrx, &&&&rrrx is an rvalue with type int **** (&rrrx) 983 \end{cfa} 984 The following example shows the second rule applied to different \Index{lvalue} contexts: 985 \begin{cfa} 986 int x, * px, ** ppx, *** pppx; 987 int & rx = x, && rrx = rx, &&& rrrx = rrx ; 988 rrrx = 2; // rrrx is an lvalue with type int &&& (equivalent to x) 989 &rrrx = px; // starting from rrrx, &rrrx is an rvalue with type int *&&& (rx) 990 &&rrrx = ppx; // starting from &rrrx, &&rrrx is an rvalue with type int **&& (rrx) 991 &&&rrrx = pppx; // starting from &&rrrx, &&&rrrx is an rvalue with type int ***& (rrrx) 992 \end{cfa} 993 994 995 \subsection{Conversions} 996 997 C provides a basic implicit conversion to simplify variable usage: 998 \begin{enumerate} 999 
\setcounter{enumi}{-1} 1000 \item 1001 lvalue to rvalue conversion: ©cv T© converts to ©T©, which allows implicit variable dereferencing. 1002 \begin{cfa} 1003 int x; 1004 x + 1; // lvalue variable (int) converts to rvalue for expression 1005 \end{cfa} 1006 An rvalue has no type qualifiers (©cv©), so the lvalue qualifiers are dropped. 1007 \end{enumerate} 1008 \CFA provides three new implicit conversions for reference types to simplify reference usage. 1009 \begin{enumerate} 1010 \item 1011 reference to rvalue conversion: ©cv T &© converts to ©T©, which allows implicit reference dereferencing. 1012 \begin{cfa} 1013 int x, &r = x, f( int p ); 1014 x = ®r® + f( ®r® ); // lvalue reference converts to rvalue 1015 \end{cfa} 1016 An rvalue has no type qualifiers (©cv©), so the reference qualifiers are dropped. 1017 1018 \item 1019 lvalue to reference conversion: \lstinline[deletekeywords={lvalue}]@lvalue-type cv1 T@ converts to ©cv2 T &©, which allows implicitly converting variables to references. 1020 \begin{cfa} 1021 int x, &r = ®x®, f( int & p ); // lvalue variable (int) converts to reference (int &) 1022 f( ®x® ); // lvalue variable (int) converts to reference (int &) 1023 \end{cfa} 1024 Conversion can restrict a type, where ©cv1© $\le$ ©cv2©, \eg passing an ©int© to a ©const volatile int &©, which has low cost. 1025 Conversion can expand a type, where ©cv1© $>$ ©cv2©, \eg passing a ©const volatile int© to an ©int &©, which has high cost (\Index{warning}); 1026 furthermore, if ©cv1© has ©const© but not ©cv2©, a temporary variable is created to preserve the immutable lvalue. 1027 1028 \item 1029 rvalue to reference conversion: ©T© converts to ©cv T &©, which allows binding references to temporaries. 1030 \begin{cfa} 1031 int x, & f( int & p ); 1032 f( ®x + 3® ); // rvalue parameter (int) implicitly converts to lvalue temporary reference (int &) 1033 ®&f®(...) 
= &x; // rvalue result (int &) implicitly converts to lvalue temporary reference (int &) 1034 \end{cfa} 1035 In both cases, modifications to the temporary are inaccessible (\Index{warning}). 1036 Conversion expands the temporary-type with ©cv©, which is low cost since the temporary is inaccessible. 1037 \end{enumerate} 1038 1039 1040 \begin{comment} 1041 From: Richard Bilson <rcbilson@gmail.com> 1042 Date: Wed, 13 Jul 2016 01:58:58 +0000 1043 Subject: Re: pointers / references 1044 To: "Peter A. Buhr" <pabuhr@plg2.cs.uwaterloo.ca> 1045 1046 As a general comment I would say that I found the section confusing, as you move back and forth 1047 between various real and imagined programming languages. If it were me I would rewrite into two 1048 subsections, one that specifies precisely the syntax and semantics of reference variables and 1049 another that provides the rationale. 1050 1051 I don't see any obvious problems with the syntax or semantics so far as I understand them. It's not 1052 obvious that the description you're giving is complete, but I'm sure you'll find the special cases 1053 as you do the implementation. 1054 1055 My big gripes are mostly that you're not being as precise as you need to be in your terminology, and 1056 that you say a few things that aren't actually true even though I generally know what you mean. 1057 1058 20 C provides a pointer type; CFA adds a reference type. Both types contain an address, which is normally a 1059 21 location in memory. 1060 1061 An address is not a location in memory; an address refers to a location in memory. Furthermore it 1062 seems weird to me to say that a type "contains" an address; rather, objects of that type do. 1063 1064 21 Special addresses are used to denote certain states or access co-processor memory. By 1065 22 convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value 1066 23 or other special states. 1067 1068 This isn't standard C at all. 
There has to be one null pointer representation, but it doesn't have 1069 to be a literal zero representation and there doesn't have to be more than one such representation. 1070 1071 23 Often dereferencing a special state causes a memory fault, so checking is necessary 1072 24 during execution. 1073 1074 I don't see the connection between the two clauses here. I feel like if a bad pointer will not cause 1075 a memory fault then I need to do more checking, not less. 1076 1077 24 If the programming language assigns addresses, a program's execution is sound, \ie all 1078 25 addresses are to valid memory locations. 1079 1080 You haven't said what it means to "assign" an address, but if I use my intuitive understanding of 1081 the term I don't see how this can be true unless you're assuming automatic storage management. 1082 1083 1 Program variables are implicit pointers to memory locations generated by the compiler and automatically 1084 2 dereferenced, as in: 1085 1086 There is no reason why a variable needs to have a location in memory, and indeed in a typical 1087 program many variables will not. In standard terminology an object identifier refers to data in the 1088 execution environment, but not necessarily in memory. 1089 1090 13 A pointer/reference is a generalization of a variable name, \ie a mutable address that can point to more 1091 14 than one memory location during its lifetime. 1092 1093 I feel like you're off the reservation here. In my world there are objects of pointer type, which 1094 seem to be what you're describing here, but also pointer values, which can be stored in an object of 1095 pointer type but don't necessarily have to be. For example, how would you describe the value denoted 1096 by "&main" in a C program? I would call it a (function) pointer, but that doesn't satisfy your 1097 definition. 1098 1099 16 not occupy storage as the literal is embedded directly into instructions.) 
Hence, a pointer occupies memory 1100 17 to store its current address, and the pointer's value is loaded by dereferencing, e.g.: 1101 1102 As with my general objection regarding your definition of variables, there is no reason why a 1103 pointer variable (object of pointer type) needs to occupy memory. 1104 1105 21 p2 = p1 + x; // compiler infers *p2 = *p1 + x; 1106 1107 What language are we in now? 1108 1109 24 pointer usage. However, in C, the following cases are ambiguous, especially with pointer arithmetic: 1110 25 p1 = p2; // p1 = p2 or *p1 = *p2 1111 1112 This isn't ambiguous. it's defined to be the first option. 1113 1114 26 p1 = p1 + 1; // p1 = p1 + 1 or *p1 = *p1 + 1 1115 1116 Again, this statement is not ambiguous. 1117 1118 13 example. Hence, a reference behaves like the variable name for the current variable it is pointing-to. The 1119 14 simplest way to understand a reference is to imagine the compiler inserting a dereference operator before 1120 15 the reference variable for each reference qualifier in a declaration, e.g.: 1121 1122 It's hard for me to understand who the audience for this part is. I think a practical programmer is 1123 likely to be satisfied with "a reference behaves like the variable name for the current variable it 1124 is pointing-to," maybe with some examples. Your "simplest way" doesn't strike me as simpler than 1125 that. It feels like you're trying to provide a more precise definition for the semantics of 1126 references, but it isn't actually precise enough to be a formal specification. If you want to 1127 express the semantics of references using rewrite rules that's a great way to do it, but lay the 1128 rules out clearly, and when you're showing an example of rewriting keep your 1129 references/pointers/values separate (right now, you use \eg "r3" to mean a reference, a pointer, 1130 and a value). 
1131 1132 24 Cancellation works to arbitrary depth, and pointer and reference values are interchangeable because both 1133 25 contain addresses. 1134 1135 Except they're not interchangeable, because they have different and incompatible types. 1136 1137 40 Interestingly, C++ deals with the address duality by making the pointed-to value the default, and prevent- 1138 41 ing changes to the reference address, which eliminates half of the duality. Java deals with the address duality 1139 42 by making address assignment the default and requiring field assignment (direct or indirect via methods), 1140 43 \ie there is no builtin bit-wise or method-wise assignment, which eliminates half of the duality. 1141 1142 I can follow this but I think that's mostly because I already understand what you're trying to 1143 say. I don't think I've ever heard the term "method-wise assignment" and I don't see you defining 1144 it. Furthermore Java does have value assignment of basic (non-class) types, so your summary here 1145 feels incomplete. (If it were me I'd drop this paragraph rather than try to save it.) 1146 1147 11 Hence, for type & const, there is no pointer assignment, so &rc = &x is disallowed, and the address value 1148 12 cannot be 0 unless an arbitrary pointer is assigned to the reference. 1149 1150 Given the pains you've taken to motivate every little bit of the semantics up until now, this last 1151 clause ("the address value cannot be 0") comes out of the blue. It seems like you could have 1152 perfectly reasonable semantics that allowed the initialization of null references. 1153 1154 12 In effect, the compiler is managing the 1155 13 addresses for type & const not the programmer, and by a programming discipline of only using references 1156 14 with references, address errors can be prevented. 1157 1158 Again, is this assuming automatic storage management? 1159 1160 18 rary binding. 
For reference initialization (like pointer), the initializing value must be an address (lvalue) not 1161 19 a value (rvalue). 1162 1163 This sentence appears to suggest that an address and an lvalue are the same thing. 1164 1165 20 int * p = &x; // both &x and x are possible interpretations 1166 1167 Are you saying that we should be considering "x" as a possible interpretation of the initializer 1168 "&x"? It seems to me that this expression has only one legitimate interpretation in context. 1169 1170 21 int & r = x; // x unlikely interpretation, because of auto-dereferencing 1171 1172 You mean, we can initialize a reference using an integer value? Surely we would need some sort of 1173 cast to induce that interpretation, no? 1174 1175 22 Hence, the compiler implicitly inserts a reference operator, &, before the initialization expression. 1176 1177 But then the expression would have pointer type, which wouldn't be compatible with the type of r. 1178 1179 22 Similarly, 1180 23 when a reference is used for a parameter/return type, the call-site argument does not require a reference 1181 24 operator. 1182 1183 Furthermore, it would not be correct to use a reference operator. 1184 1185 45 The implicit conversion allows 1186 1 seamless calls to any routine without having to explicitly name/copy the literal/expression to allow the call. 1187 2 While C' attempts to handle pointers and references in a uniform, symmetric manner, C handles routine 1188 3 variables in an inconsistent way: a routine variable is both a pointer and a reference (particle and wave). 1189 1190 After all this talk of how expressions can have both pointer and value interpretations, you're 1191 disparaging C because it has expressions that have both pointer and value interpretations? 1192 1193 On Sat, Jul 9, 2016 at 4:18 PM Peter A. 
Buhr <pabuhr@plg.uwaterloo.ca> wrote: 1194 > Aaron discovered a few places where "&"s are missing and where there are too many "&", which are 1195 > corrected in the attached updated. None of the text has changed, if you have started reading 1196 > already. 1197 \end{comment} 924 1198 925 1199 926 1200 \section{Routine Definition} 927 1201 928 \CFA also supports a new syntax for routine definition, as well as ISO Cand K\&R routine syntax.1202 \CFA also supports a new syntax for routine definition, as well as \Celeven and K\&R routine syntax. 929 1203 The point of the new syntax is to allow returning multiple values from a routine~\cite{Galletly96,CLU}, \eg: 930 1204 \begin{cfa} … … 946 1220 in both cases the type is assumed to be void as opposed to old style C defaults of int return type and unknown parameter types, respectively, as in: 947 1221 \begin{cfa} 948 [§\,§] g(); §\C{// no input or output parameters}§949 [ void ] g( void ); §\C{// no input or output parameters}§1222 [§\,§] g(); §\C{// no input or output parameters}§ 1223 [ void ] g( void ); §\C{// no input or output parameters}§ 950 1224 \end{cfa} 951 1225 … … 965 1239 \begin{cfa} 966 1240 typedef int foo; 967 int f( int (* foo) ); §\C{// foo is redefined as a parameter name}§1241 int f( int (* foo) ); §\C{// foo is redefined as a parameter name}§ 968 1242 \end{cfa} 969 1243 The string ``©int (* foo)©'' declares a C-style named-parameter of type pointer to an integer (the parenthesis are superfluous), while the same string declares a \CFA style unnamed parameter of type routine returning integer with unnamed parameter of type pointer to foo. 
… … 973 1247 C-style declarations can be used to declare parameters for \CFA style routine definitions, \eg: 974 1248 \begin{cfa} 975 [ int ] f( * int, int * ); §\C{// returns an integer, accepts 2 pointers to integers}§976 [ * int, int * ] f( int ); §\C{// returns 2 pointers to integers, accepts an integer}§1249 [ int ] f( * int, int * ); §\C{// returns an integer, accepts 2 pointers to integers}§ 1250 [ * int, int * ] f( int ); §\C{// returns 2 pointers to integers, accepts an integer}§ 977 1251 \end{cfa} 978 1252 The reason for allowing both declaration styles in the new context is for backwards compatibility with existing preprocessor macros that generate C-style declaration-syntax, as in: 979 1253 \begin{cfa} 980 1254 #define ptoa( n, d ) int (*n)[ d ] 981 int f( ptoa( p, 5 ) ) ... §\C{// expands to int f( int (*p)[ 5 ] )}§982 [ int ] f( ptoa( p, 5 ) ) ... §\C{// expands to [ int ] f( int (*p)[ 5 ] )}§1255 int f( ptoa( p, 5 ) ) ... §\C{// expands to int f( int (*p)[ 5 ] )}§ 1256 [ int ] f( ptoa( p, 5 ) ) ... §\C{// expands to [ int ] f( int (*p)[ 5 ] )}§ 983 1257 \end{cfa} 984 1258 Again, programmers are highly encouraged to use one declaration form or the other, rather than mixing the forms. … … 1002 1276 int z; 1003 1277 ... x = 0; ... y = z; ... 1004 ®return;® §\C{// implicitly return x, y}§1278 ®return;® §\C{// implicitly return x, y}§ 1005 1279 } 1006 1280 \end{cfa} … … 1012 1286 [ int x, int y ] f() { 1013 1287 ... 1014 } §\C{// implicitly return x, y}§1288 } §\C{// implicitly return x, y}§ 1015 1289 \end{cfa} 1016 1290 In this case, the current values of ©x© and ©y© are returned to the calling routine just as if a ©return© had been encountered. 1291 1292 Named return values may be used in conjunction with named parameter values; 1293 specifically, a return and parameter can have the same name. 1294 \begin{cfa} 1295 [ int x, int y ] f( int, x, int y ) { 1296 ... 
1297 } §\C{// implicitly return x, y}§ 1298 \end{cfa} 1299 This notation allows the compiler to eliminate temporary variables in nested routine calls. 1300 \begin{cfa} 1301 [ int x, int y ] f( int, x, int y ); §\C{// prototype declaration}§ 1302 int a, b; 1303 [a, b] = f( f( f( a, b ) ) ); 1304 \end{cfa} 1305 While the compiler normally ignores parameters names in prototype declarations, here they are used to eliminate temporary return-values by inferring that the results of each call are the inputs of the next call, and ultimately, the left-hand side of the assignment. 1306 Hence, even without the body of routine ©f© (separate compilation), it is possible to perform a global optimization across routine calls. 1307 The compiler warns about naming inconsistencies between routine prototype and definition in this case, and behaviour is \Index{undefined} if the programmer is inconsistent. 1017 1308 1018 1309 … … 1022 1313 as well, parameter names are optional, \eg: 1023 1314 \begin{cfa} 1024 [ int x ] f (); §\C{// returning int with no parameters}§1025 [ * int ] g (int y); §\C{// returning pointer to int with int parameter}§1026 [ ] h ( int,char);§\C{// returning no result with int and char parameters}§1027 [ * int, int ] j (int);§\C{// returning pointer to int and int, with int parameter}§1315 [ int x ] f (); §\C{// returning int with no parameters}§ 1316 [ * int ] g (int y); §\C{// returning pointer to int with int parameter}§ 1317 [ ] h ( int, char ); §\C{// returning no result with int and char parameters}§ 1318 [ * int, int ] j ( int ); §\C{// returning pointer to int and int, with int parameter}§ 1028 1319 \end{cfa} 1029 1320 This syntax allows a prototype declaration to be created by cutting and pasting source text from the routine definition header (or vice versa). 
… … 1033 1324 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1034 1325 \begin{cfa} 1035 [ int ] f( int), g;1326 [ int ] f( int ), g; 1036 1327 \end{cfa} 1037 1328 & 1038 1329 \begin{cfa} 1039 int f( int), g(int);1330 int f( int ), g( int ); 1040 1331 \end{cfa} 1041 1332 \end{tabular} … … 1043 1334 Declaration qualifiers can only appear at the start of a \CFA routine declaration,\footref{StorageClassSpecifier} \eg: 1044 1335 \begin{cfa} 1045 extern [ int ] f ( int);1046 static [ int ] g ( int);1336 extern [ int ] f ( int ); 1337 static [ int ] g ( int ); 1047 1338 \end{cfa} 1048 1339 … … 1052 1343 The syntax for pointers to \CFA routines specifies the pointer name on the right, \eg: 1053 1344 \begin{cfa} 1054 * [ int x ] () fp; §\C{// pointer to routine returning int with no parameters}§1055 * [ * int ] (int y) gp; §\C{// pointer to routine returning pointer to int with int parameter}§1056 * [ ] (int,char) hp; §\C{// pointer to routine returning no result with int and char parameters}§1057 * [ * int,int ] ( int) jp;§\C{// pointer to routine returning pointer to int and int, with int parameter}§1345 * [ int x ] () fp; §\C{// pointer to routine returning int with no parameters}§ 1346 * [ * int ] (int y) gp; §\C{// pointer to routine returning pointer to int with int parameter}§ 1347 * [ ] (int,char) hp; §\C{// pointer to routine returning no result with int and char parameters}§ 1348 * [ * int,int ] ( int ) jp; §\C{// pointer to routine returning pointer to int and int, with int parameter}§ 1058 1349 \end{cfa} 1059 1350 While parameter names are optional, \emph{a routine name cannot be specified}; 1060 1351 for example, the following is incorrect: 1061 1352 \begin{cfa} 1062 * [ int x ] f () fp; §\C{// routine name "f" is not allowed}§1353 * [ int x ] f () fp; §\C{// routine name "f" is not allowed}§ 1063 1354 \end{cfa} 1064 1355 … … 1066 1357 \section{Named and Default Arguments} 1067 1358 1068 Named and 
defaultarguments~\cite{Hardgrave76}\footnote{1359 Named\index{named arguments}\index{arguments!named} and default\index{default arguments}\index{arguments!default} arguments~\cite{Hardgrave76}\footnote{ 1069 1360 Francez~\cite{Francez77} proposed a further extension to the named-parameter passing style, which specifies what type of communication (by value, by reference, by name) the argument is passed to the routine.} 1070 1361 are two mechanisms to simplify routine call. … … 1233 1524 1234 1525 Given the \CFA restrictions above, both named and default arguments are backwards compatible. 1235 \Index*[C++]{\CC } only supports default arguments;1526 \Index*[C++]{\CC{}} only supports default arguments; 1236 1527 \Index*{Ada} supports both named and default arguments. 1237 1528 1238 1529 1239 \section{Type/Routine Nesting} 1530 \section{Unnamed Structure Fields} 1531 1532 C requires each field of a structure to have a name, except for a bit field associated with a basic type, \eg: 1533 \begin{cfa} 1534 struct { 1535 int f1; §\C{// named field}§ 1536 int f2 : 4; §\C{// named field with bit field size}§ 1537 int : 3; §\C{// unnamed field for basic type with bit field size}§ 1538 int ; §\C{// disallowed, unnamed field}§ 1539 int *; §\C{// disallowed, unnamed field}§ 1540 int (*)( int ); §\C{// disallowed, unnamed field}§ 1541 }; 1542 \end{cfa} 1543 This requirement is relaxed by making the field name optional for all field declarations; therefore, all the field declarations in the example are allowed. 1544 As for unnamed bit fields, an unnamed field is used for padding a structure to a particular size. 1545 A list of unnamed fields is also supported, \eg: 1546 \begin{cfa} 1547 struct { 1548 int , , ; §\C{// 3 unnamed fields}§ 1549 } 1550 \end{cfa} 1551 1552 1553 \section{Nesting} 1240 1554 1241 1555 Nesting of types and routines is useful for controlling name visibility (\newterm{name hiding}). 
… … 1244 1558 \subsection{Type Nesting} 1245 1559 1246 \CFA allows \Index{type nesting}, and type qualification of the nested typ res (see \VRef[Figure]{f:TypeNestingQualification}), where as C hoists\index{type hoisting} (refactors) nested types into the enclosing scope and has no type qualification.1560 \CFA allows \Index{type nesting}, and type qualification of the nested types (see \VRef[Figure]{f:TypeNestingQualification}), where as C hoists\index{type hoisting} (refactors) nested types into the enclosing scope and has no type qualification. 1247 1561 \begin{figure} 1248 1562 \centering … … 1347 1661 } 1348 1662 int main() { 1349 * [int]( int) fp = foo(); §\C{// int (*fp)(int)}§1350 sout | fp( 3 ) | endl;1663 * [int]( int ) fp = foo(); §\C{// int (*fp)( int )}§ 1664 sout | fp( 3 ) | endl; 1351 1665 } 1352 1666 \end{cfa} 1353 1667 because 1354 1668 1355 Currently, there are no \Index{lambda} expressions, i.e.,unnamed routines because routine names are very important to properly select the correct routine.1356 1357 1358 \section{ Lexical List}1669 Currently, there are no \Index{lambda} expressions, \ie unnamed routines because routine names are very important to properly select the correct routine. 1670 1671 1672 \section{Tuples} 1359 1673 1360 1674 In C and \CFA, lists of elements appear in several contexts, such as the parameter list for a routine call. … … 1373 1687 [ v+w, x*y, 3.14159, f() ] 1374 1688 \end{cfa} 1375 Tuples are permitted to contain sub-tuples ( i.e.,nesting), such as ©[ [ 14, 21 ], 9 ]©, which is a 2-element tuple whose first element is itself a tuple.1689 Tuples are permitted to contain sub-tuples (\ie nesting), such as ©[ [ 14, 21 ], 9 ]©, which is a 2-element tuple whose first element is itself a tuple. 1376 1690 Note, a tuple is not a record (structure); 1377 1691 a record denotes a single value with substructure, whereas a tuple is multiple values with no substructure (see flattening coercion in Section 12.1). 
… … 1429 1743 tuple does not have structure like a record; a tuple is simply converted into a list of components. 1430 1744 \begin{rationale} 1431 The present implementation of \CFA does not support nested routine calls when the inner routine returns multiple values; i.e.,a statement such as ©g( f() )© is not supported.1745 The present implementation of \CFA does not support nested routine calls when the inner routine returns multiple values; \ie a statement such as ©g( f() )© is not supported. 1432 1746 Using a temporary variable to store the results of the inner routine and then passing this variable to the outer routine works, however. 1433 1747 \end{rationale} … … 1442 1756 This requirement is the same as for comma expressions in argument lists. 1443 1757 1444 Type qualifiers, i.e.,const and volatile, may modify a tuple type.1445 The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e.,the qualifier is distributed across all of the types in the tuple, \eg:1758 Type qualifiers, \ie const and volatile, may modify a tuple type. 1759 The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], \ie the qualifier is distributed across all of the types in the tuple, \eg: 1446 1760 \begin{cfa} 1447 1761 const volatile [ int, float, const int ] x; … … 1481 1795 ©w© is implicitly opened to yield a tuple of four values, which are then assigned individually. 
1482 1796 1483 A \newterm{flattening coercion} coerces a nested tuple, i.e.,a tuple with one or more components, which are themselves tuples, into a flattened tuple, which is a tuple whose components are not tuples, as in:1797 A \newterm{flattening coercion} coerces a nested tuple, \ie a tuple with one or more components, which are themselves tuples, into a flattened tuple, which is a tuple whose components are not tuples, as in: 1484 1798 \begin{cfa} 1485 1799 [ a, b, c, d ] = [ 1, [ 2, 3 ], 4 ]; … … 1516 1830 \end{cfa} 1517 1831 \index{lvalue} 1518 The left-hand side is a tuple of \emph{lvalues}, which is a list of expressions each yielding an address, i.e.,any data object that can appear on the left-hand side of a conventional assignment statement.1832 The left-hand side is a tuple of \emph{lvalues}, which is a list of expressions each yielding an address, \ie any data object that can appear on the left-hand side of a conventional assignment statement. 1519 1833 ©$\emph{expr}$© is any standard arithmetic expression. 1520 1834 Clearly, the types of the entities being assigned must be type compatible with the value of the expression. … … 1557 1871 \index{lvalue} 1558 1872 The left-hand side is a tuple of \emph{lvalues}, and the right-hand side is a tuple of \emph{expr}s. 1559 Each \emph{expr} appearing on the right hand side of a multiple assignment statement is assigned to the corresponding \emph{lvalues} on the left-hand side of the statement using parallel semantics for each assignment.1873 Each \emph{expr} appearing on the right-hand side of a multiple assignment statement is assigned to the corresponding \emph{lvalues} on the left-hand side of the statement using parallel semantics for each assignment. 1560 1874 An example of multiple assignment is: 1561 1875 \begin{cfa} … … 1604 1918 [ x1, y1 ] = z = 0; 1605 1919 \end{cfa} 1606 As in C, the rightmost assignment is performed first, i.e., assignment parses right to left. 
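The parallel semantics of multiple assignment and the right-to-left evaluation of cascade assignment can be sketched in plain C. This is only an illustration under stated assumptions: C has no tuple syntax, so the parallel assignment is emulated with temporaries, and the helper names `assign_parallel` and `cascade` are hypothetical and do not come from the manual.

```c
/* Hypothetical helper: emulate the parallel semantics of a two-element
   multiple assignment, i.e., the effect of [ *x, *y ] = [ *y, *x ]. */
void assign_parallel( int *x, int *y ) {
    int t0 = *y, t1 = *x;   /* evaluate every right-hand side first */
    *x = t0;                /* only then perform the stores */
    *y = t1;
}

/* Cascade assignment associates right to left: d = ( *i = v ).
   The value is truncated when stored in *i, and d receives that
   truncated value, which shows the rightmost assignment runs first. */
double cascade( double v, int *i ) {
    double d;
    d = *i = (int)v;
    return d;
}
```

Calling `assign_parallel( &a, &b )` with `a == 1` and `b == 2` exchanges the two values, which a sequential pair of assignments would not do. Calling `cascade( 3.7, &i )` stores 3 in `i` and returns 3.0.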
1607 1608 1609 \section{Unnamed Structure Fields} 1610 1611 C requires each field of a structure to have a name, except for a bit field associated with a basic type, \eg: 1612 \begin{cfa} 1613 struct { 1614 int f1; §\C{// named field}§ 1615 int f2 : 4; §\C{// named field with bit field size}§ 1616 int : 3; §\C{// unnamed field for basic type with bit field size}§ 1617 int ; §\C{// disallowed, unnamed field}§ 1618 int *; §\C{// disallowed, unnamed field}§ 1619 int (*)(int); §\C{// disallowed, unnamed field}§ 1620 }; 1621 \end{cfa} 1622 This requirement is relaxed by making the field name optional for all field declarations; therefore, all the field declarations in the example are allowed. 1623 As for unnamed bit fields, an unnamed field is used for padding a structure to a particular size. 1624 A list of unnamed fields is also supported, \eg: 1625 \begin{cfa} 1626 struct { 1627 int , , ; §\C{// 3 unnamed fields}§ 1628 } 1629 \end{cfa} 1920 As in C, the rightmost assignment is performed first, \ie assignment parses right to left. 1630 1921 1631 1922 … … 1669 1960 1670 1961 1671 \section{Labelled Continue /Break}1962 \section{Labelled Continue/Break} 1672 1963 1673 1964 While C provides ©continue© and ©break© statements for altering control flow, both are restricted to one level of nesting for a particular control structure. 
1674 Unfortunately, this restriction forces programmers to use ©goto©to achieve the equivalent control-flow for more than one level of nesting.1675 To prevent having to switch to the ©goto©, \CFA extends the ©continue©\index{continue@©continue©}\index{continue@©continue©!labelled}\index{labelled!continue@©continue©} and ©break©\index{break@©break©}\index{break@©break©!labelled}\index{labelled!break@©break©} with a target label to support static multi-level exit\index{multi-level exit}\index{static multi-level exit}~\cite{Buhr85,Java}.1965 Unfortunately, this restriction forces programmers to use \Indexc{goto} to achieve the equivalent control-flow for more than one level of nesting. 1966 To prevent having to switch to the ©goto©, \CFA extends the \Indexc{continue}\index{continue@\lstinline $continue$!labelled}\index{labelled!continue@©continue©} and \Indexc{break}\index{break@\lstinline $break$!labelled}\index{labelled!break@©break©} with a target label to support static multi-level exit\index{multi-level exit}\index{static multi-level exit}~\cite{Buhr85,Java}. 1676 1967 For both ©continue© and ©break©, the target label must be directly associated with a ©for©, ©while© or ©do© statement; 1677 1968 for ©break©, the target label can also be associated with a ©switch©, ©if© or compound (©{}©) statement. 1678 1679 The following example shows the labelled ©continue© specifying which control structure is the target for the next loop iteration: 1680 \begin{quote2} 1681 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 1682 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1683 \begin{cfa} 1684 ®L1:® do { 1685 ®L2:® while ( ... ) { 1686 ®L3:® for ( ... ) { 1687 ... continue ®L1®; ... // continue do 1688 ... continue ®L2®; ... // continue while 1689 ... continue ®L3®; ... // continue for 1690 } // for 1691 } // while 1692 } while ( ... ); 1693 \end{cfa} 1694 & 1695 \begin{cfa} 1696 do { 1697 while ( ... ) { 1698 for ( ... ) { 1699 ... goto L1; ... 
1700 ... goto L2; ... 1701 ... goto L3; ... 1702 L3: ; } 1703 L2: ; } 1704 L1: ; } while ( ... ); 1705 \end{cfa} 1706 \end{tabular} 1707 \end{quote2} 1708 The innermost loop has three restart points, which cause the next loop iteration to begin. 1709 1710 The following example shows the labelled ©break© specifying which control structure is the target for exit: 1711 \begin{quote2} 1712 \begin{tabular}{@{}l@{\hspace{3em}}l@{}} 1713 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1714 \begin{cfa} 1715 ®L1:® { 1969 \VRef[Figure]{f:MultiLevelResumeTermination} shows the labelled ©continue© and ©break©, specifying which control structure is the target for exit, and the corresponding C program using only ©goto©. 1970 The innermost loop has 7 exit points, which cause resumption or termination of one or more of the 7 \Index{nested control-structure}s. 1971 1972 \begin{figure} 1973 \begin{tabular}{@{\hspace{\parindentlnth}}l@{\hspace{1.5em}}l@{}} 1974 \multicolumn{1}{c@{\hspace{1.5em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{C}} \\ 1975 \begin{cfa} 1976 ®LC:® { 1716 1977 ... §declarations§ ... 1717 ®L 2:® switch ( ... ) {1978 ®LS:® switch ( ... ) { 1718 1979 case 3: 1719 ®L3:® if ( ... ) { 1720 ®L4:® for ( ... ) { 1721 ... break ®L1®; ... // exit compound statement 1722 ... break ®L2®; ... // exit switch 1723 ... break ®L3®; ... // exit if 1724 ... break ®L4®; ... // exit loop 1980 ®LIF:® if ( ... ) { 1981 ®LF:® for ( ... ) { 1982 ®LW:® while ( ... ) { 1983 ... break ®LC®; ... // terminate compound 1984 ... break ®LS®; ... // terminate switch 1985 ... break ®LIF®; ... // terminate if 1986 ... continue ®LF;® ... // resume loop 1987 ... break ®LF®; ... // terminate loop 1988 ... continue ®LW®; ... // resume loop 1989 ... break ®LW®; ... // terminate loop 1990 } // while 1725 1991 } // for 1726 1992 } else { 1727 ... break ®L 3®; ... // exitif1993 ... break ®LIF®; ... 
// terminate if 1728 1994 } // if 1729 1995 } // switch … … 1736 2002 switch ( ... ) { 1737 2003 case 3: 1738 if ( ... ) {2004 if ( ... ) { 1739 2005 for ( ... ) { 1740 ... goto L1; ... 1741 ... goto L2; ... 1742 ... goto L3; ... 1743 ... goto L4; ... 1744 } L4: ; 2006 while ( ... ) { 2007 ... goto ®LC®; ... 2008 ... goto ®LS®; ... 2009 ... goto ®LIF®; ... 2010 ... goto ®LFC®; ... 2011 ... goto ®LFB®; ... 2012 ... goto ®LWC®; ... 2013 ... goto ®LWB®; ... 2014 ®LWC®: ; } ®LWB:® ; 2015 ®LFC:® ; } ®LFB:® ; 1745 2016 } else { 1746 ... goto L3; ...1747 } L3:;1748 } L2:;1749 } L1:;2017 ... goto ®LIF®; ... 2018 } ®LIF:® ; 2019 } ®LS:® ; 2020 } ®LC:® ; 1750 2021 \end{cfa} 1751 2022 \end{tabular} 1752 \end{quote2} 1753 The innermost loop has four exit points, which cause termination of one or more of the four \Index{nested control structure}s. 1754 1755 Both ©continue© and ©break© with target labels are simply a ©goto©\index{goto@©goto©!restricted} restricted in the following ways: 2023 \caption{Multi-level Resume/Termination} 2024 \label{f:MultiLevelResumeTermination} 2025 \end{figure} 2026 2027 \begin{comment} 2028 int main() { 2029 LC: { 2030 LS: switch ( 1 ) { 2031 case 3: 2032 LIF: if ( 1 ) { 2033 LF: for ( ;; ) { 2034 LW: while ( 1 ) { 2035 break LC; // terminate compound 2036 break LS; // terminate switch 2037 break LIF; // terminate if 2038 continue LF; // resume loop 2039 break LF; // terminate loop 2040 continue LW; // resume loop 2041 break LW; // terminate loop 2042 } // while 2043 } // for 2044 } else { 2045 break LIF; // terminate if 2046 } // if 2047 } // switch 2048 } // compound 2049 { 2050 switch ( 1 ) { 2051 case 3: 2052 if ( 1 ) { 2053 for ( ;; ) { 2054 while ( 1 ) { 2055 goto LCx; 2056 goto LSx; 2057 goto LIF; 2058 goto LFC; 2059 goto LFB; 2060 goto LWC; 2061 goto LWB; 2062 LWC: ; } LWB: ; 2063 LFC: ; } LFB: ; 2064 } else { 2065 goto LIF; 2066 } L3: ; 2067 } LSx: ; 2068 } LCx: ; 2069 } 2070 2071 // Local Variables: // 2072 // tab-width: 4 // 2073 // End: 
// 2074 \end{comment} 2075 2076 2077 Both labelled ©continue© and ©break© are a ©goto©\index{goto@\lstinline $goto$!restricted} restricted in the following ways: 1756 2078 \begin{itemize} 1757 2079 \item 1758 They cannot be used to create a loop.1759 This means that only the looping construct can be used to create a loop.1760 This restriction is important since all situations that can result in repeated execution of statements in a program are clearly delineated. 1761 \item 1762 Since they always transfer out of containing control structures, they cannot be used to branch into a control structure.2080 They cannot create a loop, which means only the looping constructs cause looping. 2081 This restriction means all situations resulting in repeated execution are clearly delineated. 2082 \item 2083 They cannot branch into a control structure. 2084 This restriction prevents missing initialization at the start of a control structure resulting in undefined behaviour. 1763 2085 \end{itemize} 1764 The advantage of the labelled ©continue©/©break© is allowing static multi-level exits without having to use the ©goto© statement and tying control flow to the target control structure rather than an arbitrary point in a program.2086 The advantage of the labelled ©continue©/©break© is allowing static multi-level exits without having to use the ©goto© statement, and tying control flow to the target control structure rather than an arbitrary point in a program. 1765 2087 Furthermore, the location of the label at the \emph{beginning} of the target control structure informs the reader that complex control-flow is occurring in the body of the control structure. 1766 2088 With ©goto©, the label is at the end of the control structure, which fails to convey this important clue early enough to the reader. 1767 2089 Finally, using an explicit target for the transfer instead of an implicit target allows new constructs to be added or removed without affecting existing constructs. 
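A compilable C sketch of the same idea follows, emulating a labelled ©continue© and ©break© on an outer loop with forward-only ©goto©s. It is an illustration, not code from the manual: the routine name `count_clean_rows`, its flat-array parameter, and the label names are assumptions made for this example. The comments show the \CFA statement each ©goto© stands in for.

```c
/* Hypothetical example: count the rows of a rows x cols matrix (stored
   row-major in m) that contain no negative entry, and stop scanning
   entirely at the first zero entry. */
int count_clean_rows( const int *m, int rows, int cols ) {
    int clean = 0;
    for ( int r = 0; r < rows; r += 1 ) {           /* CFA: Rows: for ( ... ) */
        for ( int c = 0; c < cols; c += 1 ) {
            int v = m[r * cols + c];
            if ( v == 0 ) goto rows_break;          /* CFA: break Rows; */
            if ( v < 0 ) goto rows_continue;        /* CFA: continue Rows; */
        }
        clean += 1;                                 /* reached only when the row has no negative entry */
      rows_continue: ;                              /* resume point: end of the outer loop body */
    }
  rows_break: ;                                     /* termination point: just after the outer loop */
    return clean;
}
```

Both labels sit at the end of the construct they control, so a reader only discovers the multi-level exits after reading the loop body; with the labelled form the label `Rows` would appear at the top of the outer loop.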
1768 The implicit targets of the current ©continue© and ©break©, i.e.,the closest enclosing loop or ©switch©, change as certain constructs are added or removed.2090 The implicit targets of the current ©continue© and ©break©, \ie the closest enclosing loop or ©switch©, change as certain constructs are added or removed. 1769 2091 1770 2092 … … 1903 2225 Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©, both of which are problematic. 1904 2226 As well, the declaration of ©z© cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized. 1905 The key observation is that the ©switch© statement branches into control structure, i.e.,there are multiple entry points into its statement body.2227 The key observation is that the ©switch© statement branches into control structure, \ie there are multiple entry points into its statement body. 1906 2228 \end{enumerate} 1907 2229 … … 1911 2233 the number of ©switch© statements is small, 1912 2234 \item 1913 most ©switch© statements are well formed ( i.e.,no \Index*{Duff's device}),2235 most ©switch© statements are well formed (\ie no \Index*{Duff's device}), 1914 2236 \item 1915 2237 the ©default© clause is usually written as the last case-clause, … … 1921 2243 \item 1922 2244 Eliminating default fall-through has the greatest potential for affecting existing code. 
1923 However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e.,a list of ©case© clauses executing common code, \eg:2245 However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, \ie a list of ©case© clauses executing common code, \eg: 1924 2246 \begin{cfa} 1925 2247 case 1: case 2: case 3: ... … … 1964 2286 ®int j = 0;® §\C{// disallowed}§ 1965 2287 case 1: 1966 {2288 { 1967 2289 ®int k = 0;® §\C{// allowed at different nesting levels}§ 1968 2290 ... … … 2079 2401 \index{input/output library} 2080 2402 2081 The goal for the \CFA I/O is to make I/O as simple as possible in the common cases, while fully supporting polymorphism and user defined types in a consistent way. 2403 The goal of \CFA I/O is to simplify the common cases\index{I/O!common case}, while fully supporting polymorphism and user defined types in a consistent way. 2404 The \CFA header file for the I/O library is \Indexc{fstream}. 2405 2082 2406 The common case is printing out a sequence of variables separated by whitespace. 
2083 2407 \begin{quote2} … … 2085 2409 \multicolumn{1}{c@{\hspace{3em}}}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{\CC}} \\ 2086 2410 \begin{cfa} 2087 int x = 0, y = 1, z = 2;2411 int x = 1, y = 2, z = 3; 2088 2412 sout | x ®|® y ®|® z | endl; 2089 2413 \end{cfa} … … 2092 2416 2093 2417 cout << x ®<< " "® << y ®<< " "® << z << endl; 2418 \end{cfa} 2419 \\ 2420 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt,belowskip=0pt] 2421 1 2 3 2422 \end{cfa} 2423 & 2424 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt,belowskip=0pt] 2425 1 2 3 2094 2426 \end{cfa} 2095 2427 \end{tabular} 2096 2428 \end{quote2} 2097 2429 The \CFA form has half as many characters as the \CC form, and is similar to \Index*{Python} I/O with respect to implicit separators. 2098 2099 The logical-or operator is used because it is the lowest-priority overloadable operator, other than assignment. 2430 A tuple prints all the tuple's values, each separated by ©", "©. 2431 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2432 [int, int] t1 = [1, 2], t2 = [3, 4]; 2433 sout | t1 | t2 | endl; §\C{// print tuples}§ 2434 \end{cfa} 2435 \begin{cfa}[mathescape=off,showspaces=true,belowskip=0pt] 2436 1, 2, 3, 4 2437 \end{cfa} 2438 \CFA uses the logical-or operator for I/O because it is the lowest-priority overloadable operator, other than assignment. 2100 2439 Therefore, fewer output expressions require parentheses. 2101 2440 \begin{quote2} … … 2110 2449 & 2111 2450 \begin{cfa} 2112 cout << x * 3 << y + 1 << (z << 2) << (x == y) << (x | y) << (x || y) << (x > z ? 1 : 2) << endl; 2451 cout << x * 3 << y + 1 << ®(®z << 2®)® << ®(®x == y®)® << (x | y) << (x || y) << (x > z ? 
1 : 2) << endl; 2452 \end{cfa} 2453 \\ 2454 & 2455 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt,belowskip=0pt] 2456 3 3 12 0 3 1 2 2113 2457 \end{cfa} 2114 2458 \end{tabular} … … 2116 2460 Finally, the logical-or operator has a link with the Shell pipe-operator for moving data, where data flows in the correct direction for input but the opposite direction for output. 2117 2461 2118 The implicit separator\index{I/O separator} character (space/blank) is a separator not a terminator. 2462 2463 The implicit separator\index{I/O!separator} character (space/blank) is a separator not a terminator. 2119 2464 The rules for implicitly adding the separator are: 2120 2465 \begin{enumerate} … … 2127 2472 1 2 3 2128 2473 \end{cfa} 2474 2129 2475 \item 2130 2476 A separator does not appear before or after a character literal or variable. … … 2133 2479 123 2134 2480 \end{cfa} 2135 \item 2136 A separator does not appear before or after a null (empty) C string 2481 2482 \item 2483 A separator does not appear before or after a null (empty) C string. 2137 2484 \begin{cfa} 2138 2485 sout | 1 | "" | 2 | "" | 3 | endl; … … 2140 2487 \end{cfa} 2141 2488 which is a local mechanism to disable insertion of the separator character. 
2142 \item 2143 A separator does not appear after a C string ending with the (extended) \Index{ASCII}\index{ASCII!extended} characters: \lstinline[mathescape=off]@([{=$£¥¡¿«@ 2489 2490 \item 2491 A separator does not appear after a C string ending with the (extended) \Index*{ASCII}\index{ASCII!extended} characters: \lstinline[mathescape=off,basicstyle=\tt]@([{=$£¥¡¿«@ 2144 2492 %$ 2145 2493 \begin{cfa}[mathescape=off] … … 2148 2496 \end{cfa} 2149 2497 %$ 2150 \begin{cfa}[mathescape=off, showspaces=true,aboveskip=0pt,belowskip=0pt]2151 x (1 x [2 x {3 x =4 x $5 x £6 x ¥7 x ¡8 x ¿9 x «102498 \begin{cfa}[mathescape=off,basicstyle=\tt,showspaces=true,aboveskip=0pt,belowskip=0pt] 2499 x ®(®1 x ®[®2 x ®{®3 x ®=®4 x ®$®5 x ®£®6 x ®¥®7 x ®¡®8 x ®¿®9 x ®«®10 2152 2500 \end{cfa} 2153 2501 %$ 2502 where \lstinline[basicstyle=\tt]@¡¿@ are inverted opening exclamation and question marks, and \lstinline[basicstyle=\tt]@«@ is an opening citation mark. 2503 2154 2504 \item 2155 2505 {\lstset{language=CFA,deletedelim=**[is][]{¢}{¢}} 2156 A separator does not appear before a C string starting with the (extended) \Index {ASCII}\index{ASCII!extended} characters: ©,.;!?)]}%¢»©2506 A separator does not appear before a C string starting with the (extended) \Index*{ASCII}\index{ASCII!extended} characters: \lstinline[basicstyle=\tt]@,.;!?)]}%¢»@ 2157 2507 \begin{cfa}[belowskip=0pt] 2158 2508 sout | 1 | ", x" | 2 | ". x" | 3 | "; x" | 4 | "! x" | 5 | "? x" | 6 | "% x" 2159 2509 | 7 | "¢ x" | 8 | "» x" | 9 | ") x" | 10 | "] x" | 11 | "} x" | endl; 2160 2510 \end{cfa} 2161 \begin{cfa}[ mathescape=off,showspaces=true,aboveskip=0pt,belowskip=0pt]2162 1 , x 2. x 3; x 4! x 5? 
x 6% x 7§\textcent§ x 8» x 9) x 10] x 11}x2511 \begin{cfa}[basicstyle=\tt,showspaces=true,aboveskip=0pt,belowskip=0pt] 2512 1®,® x 2®.® x 3®;® x 4®!® x 5®?® x 6®%® x 7§\color{red}\textcent§ x 8®»® x 9®)® x 10®]® x 11®}® x 2163 2513 \end{cfa}}% 2164 \item 2165 A separator does not appear before or after a C string beginning/ending with the \Index{ASCII} quote or whitespace characters: \lstinline[showspaces=true]@`'": \t\v\f\r\n@ 2514 where \lstinline[basicstyle=\tt]@»@ is a closing citation mark. 2515 2516 \item 2517 A separator does not appear before or after a C string beginning/ending with the \Index*{ASCII} quote or whitespace characters: \lstinline[basicstyle=\tt,showspaces=true]@`'": \t\v\f\r\n@ 2166 2518 \begin{cfa}[belowskip=0pt] 2167 2519 sout | "x`" | 1 | "`x'" | 2 | "'x\"" | 3 | "\"x:" | 4 | ":x " | 5 | " x\t" | 6 | "\tx" | endl; 2168 2520 \end{cfa} 2169 \begin{cfa}[mathescape=off,showspaces=true,showtabs=true,aboveskip=0pt,belowskip=0pt] 2170 x`1`x'2'x"3"x:4:x 5 x 6 x 2521 \begin{cfa}[basicstyle=\tt,showspaces=true,showtabs=true,aboveskip=0pt,belowskip=0pt] 2522 x®`®1®`®x§\color{red}\texttt{'}§2§\color{red}\texttt{'}§x§\color{red}\texttt{"}§3§\color{red}\texttt{"}§x®:®4®:®x® ®5® ®x® ®6® ®x 2523 \end{cfa} 2524 2525 \item 2526 If a space is desired before or after one of the special string start/end characters, simply insert a space. 2527 \begin{cfa}[belowskip=0pt] 2528 sout | "x (§\color{red}\texttt{\textvisiblespace}§" | 1 | "§\color{red}\texttt{\textvisiblespace}§) x" | 2 | "§\color{red}\texttt{\textvisiblespace}§, x" | 3 | "§\color{red}\texttt{\textvisiblespace}§:x:§\color{red}\texttt{\textvisiblespace}§" | 4 | endl; 2529 \end{cfa} 2530 \begin{cfa}[basicstyle=\tt,showspaces=true,showtabs=true,aboveskip=0pt,belowskip=0pt] 2531 x (® ®1® ®) x 2® ®, x 3® ®:x:® ®4 2171 2532 \end{cfa} 2172 2533 \end{enumerate} 2173 2534 2174 The following \CC-style \Index{manipulator}s allow control over implicit separation. 
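The string-related separator rules can be summarized as a small C predicate. This is a hedged sketch of the behaviour visible in the example outputs above, not the \CFA library's implementation: `needs_separator` and `join_items` are hypothetical names, every item is assumed to be already rendered as a C string, and only the ASCII characters from the rules are handled (the extended characters are omitted).

```c
#include <string.h>

/* Hypothetical predicate: is the implicit separator printed between two
   adjacent items, prev and next, both rendered as C strings? */
int needs_separator( const char *prev, const char *next ) {
    static const char opening[] = "([{=$";            /* suppress after a string ending with these */
    static const char closing[] = ",.;!?)]}%";        /* suppress before a string starting with these */
    static const char either[]  = "`'\": \t\v\f\r\n"; /* quote/whitespace: suppress on that side */

    size_t plen = strlen( prev ), nlen = strlen( next );
    if ( plen == 0 || nlen == 0 ) return 0;           /* null (empty) string disables the separator */
    char last = prev[plen - 1], first = next[0];
    if ( strchr( opening, last ) || strchr( either, last ) ) return 0;
    if ( strchr( closing, first ) || strchr( either, first ) ) return 0;
    return 1;
}

/* Hypothetical driver: concatenate n items into out (capacity cap >= 1),
   inserting sep only where needs_separator allows it. */
void join_items( char *out, size_t cap, const char *items[], int n, const char *sep ) {
    out[0] = '\0';
    for ( int i = 0; i < n; i += 1 ) {
        if ( i > 0 && needs_separator( items[i - 1], items[i] ) )
            strncat( out, sep, cap - strlen( out ) - 1 );
        strncat( out, items[i], cap - strlen( out ) - 1 );
    }
}
```

Under these assumptions, joining the items `"x ("`, `"1"`, `"x ["`, `"2"` with a single-space separator yields `x (1 x [2`, matching the shape of the output shown for the opening characters.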
2175 Manipulators \Indexc{sepOn}\index{manipulator!sepOn@©sepOn©} and \Indexc{sepOff}\index{manipulator!sepOff@©sepOff©} \emph{locally} toggle printing the separator, i.e., the separator is adjusted only with respect to the next printed item. 2535 The following routines and \CC-style \Index{manipulator}s control implicit separation. 2536 \begin{enumerate} 2537 \item 2538 Routines \Indexc{sepSet}\index{manipulator!sepSet@©sepSet©} and \Indexc{sepGet}\index{manipulator!sepGet@©sepGet©} set and get the separator string. 2539 The separator string can be at most 16 characters including the ©'\0'© string terminator (15 printable characters). 2540 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2541 sepSet( sout, ", $" ); §\C{// set separator from " " to ", \$"}§ 2542 sout | 1 | 2 | 3 | " \"" | ®sepGet( sout )® | "\"" | endl; 2543 \end{cfa} 2544 %$ 2545 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2546 1, $2, $3 ®", $"® 2547 \end{cfa} 2548 %$ 2549 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2550 sepSet( sout, " " ); §\C{// reset separator to " "}§ 2551 sout | 1 | 2 | 3 | " \"" | ®sepGet( sout )® | "\"" | endl; 2552 \end{cfa} 2553 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2554 1 2 3 ®" "® 2555 \end{cfa} 2556 2557 \item 2558 Manipulators \Indexc{sepOn}\index{manipulator!sepOn@©sepOn©} and \Indexc{sepOff}\index{manipulator!sepOff@©sepOff©} \emph{locally} toggle printing the separator, \ie the separator is adjusted only with respect to the next printed item. 
2176 2559 \begin{cfa}[mathescape=off,belowskip=0pt] 2177 2560 sout | sepOn | 1 | 2 | 3 | sepOn | endl; §\C{// separator at start of line}§ … … 2186 2569 12 3 2187 2570 \end{cfa} 2188 Manipulators \Indexc{sepDisable}\index{manipulator!sepDisable@©sepDisable©} and \Indexc{sepEnable}\index{manipulator!sepEnable@©sepEnable©} \emph{globally} toggle printing the separator, i.e., the separator is adjusted with respect to all subsequent printed items, unless locally adjusted. 
2572 \item 2573 Manipulators \Indexc{sepDisable}\index{manipulator!sepDisable@©sepDisable©} and \Indexc{sepEnable}\index{manipulator!sepEnable@©sepEnable©} \emph{globally} toggle printing the separator, \ie the separator is adjusted with respect to all subsequent printed items, unless locally adjusted. 2189 2574 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2190 2575 sout | sepDisable | 1 | 2 | 3 | endl; §\C{// globally turn off implicit separation}§ … … 2205 2590 1 2 3 2206 2591 \end{cfa} 2207 Printing a tuple outputs all the tuple's values separated by ©", "©: 2592 2593 \item 2594 Routines \Indexc{sepSetTuple}\index{manipulator!sepSetTuple@©sepSetTuple©} and \Indexc{sepGetTuple}\index{manipulator!sepGetTuple@©sepGetTuple©} set and get the tuple separator-string. 2595 The tuple separator-string can be at most 16 characters including the ©'\0'© string terminator (15 printable characters). 2208 2596 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2209 sout | [2, 3] | [4, 5] | endl; §\C{// print tuple}§ 2597 sepSetTuple( sout, " " ); §\C{// set tuple separator from ", " to " "}§ 2598 sout | t1 | t2 | " \"" | ®sepGetTuple( sout )® | "\"" | endl; 2599 \end{cfa} 2600 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2601 1 2 3 4 ®" "® 2602 \end{cfa} 2603 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2604 sepSetTuple( sout, ", " ); §\C{// reset tuple separator to ", "}§ 2605 sout | t1 | t2 | " \"" | ®sepGetTuple( sout )® | "\"" | endl; 2606 \end{cfa} 2607 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2608 1, 2, 3, 4 ®", "® 2609 \end{cfa} 2610 2611 \item 2612 The tuple separator can also be turned on and off. 
2613 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2614 sout | sepOn | t1 | sepOff | t2 | endl; §\C{// locally turn on/off implicit separation}§ 2210 2615 \end{cfa} 2211 2616 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt,belowskip=0pt] 2212 2, 3, 4, 5 2213 \end{cfa} 2214 The tuple separator can also be turned on and off: 2215 \begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt] 2216 sout | sepOn | [2, 3] | sepOff | [4, 5] | endl; §\C{// locally turn on/off implicit separation}§ 2217 \end{cfa} 2218 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt,belowskip=0pt] 2219 , 2, 34, 5 2617 , 1, 23, 4 2220 2618 \end{cfa} 2221 2619 Notice a tuple separator starts the line because the next item is a tuple. 2222 Finally, the stream routines \Indexc{sepGet}\index{manipulator!sepGet@©sepGet©} and \Indexc{sepSet}\index{manipulator!sepSet@©sepSet©} get and set the basic separator-string. 2223 \begin{cfa}[mathescape=off,aboveskip=0pt,aboveskip=0pt,belowskip=0pt] 2224 sepSet( sout, ", $" ); §\C{// set separator from " " to ", \$"}§ 2225 sout | 1 | 2 | 3 | " \"" | sepGet( sout ) | "\"" | endl; 2226 \end{cfa} 2227 %$ 2228 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2229 1, $2, $3 ", $" 2230 \end{cfa} 2231 %$ 2232 \begin{cfa}[mathescape=off,aboveskip=0pt,aboveskip=0pt,belowskip=0pt] 2233 sepSet( sout, " " ); §\C{// reset separator to " "}§ 2234 sout | 1 | 2 | 3 | " \"" | sepGet( sout ) | "\"" | endl; 2235 \end{cfa} 2236 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2237 1 2 3 " " 2238 \end{cfa} 2239 and the stream routines \Indexc{sepGetTuple}\index{manipulator!sepGetTuple@©sepGetTuple©} and \Indexc{sepSetTuple}\index{manipulator!sepSetTuple@©sepSetTuple©} get and set the tuple separator-string. 
2240 \begin{cfa}[mathescape=off,aboveskip=0pt,aboveskip=0pt,belowskip=0pt] 2241 sepSetTuple( sout, " " ); §\C{// set tuple separator from ", " to " "}§ 2242 sout | [2, 3] | [4, 5] | " \"" | sepGetTuple( sout ) | "\"" | endl; 2243 \end{cfa} 2244 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2245 2 3 4 5 " " 2246 \end{cfa} 2247 \begin{cfa}[mathescape=off,aboveskip=0pt,aboveskip=0pt,belowskip=0pt] 2248 sepSetTuple( sout, ", " ); §\C{// reset tuple separator to ", "}§ 2249 sout | [2, 3] | [4, 5] | " \"" | sepGetTuple( sout ) | "\"" | endl; 2250 \end{cfa} 2251 \begin{cfa}[mathescape=off,showspaces=true,aboveskip=0pt] 2252 2, 3, 4, 5 ", " 2253 \end{cfa} 2620 \end{enumerate} 2254 2621 2255 2622 \begin{comment} … … 2257 2624 2258 2625 int main( void ) { 2259 int x = 0, y = 1, z = 2; 2260 sout | x * 3 | y + 1 | z << 2 | x == y | (x | y) | (x || y) | (x > z ? 1 : 2) | endl | endl; 2626 int x = 1, y = 2, z = 3; 2627 sout | x | y | z | endl; 2628 [int, int] t1 = [1, 2], t2 = [3, 4]; 2629 sout | t1 | t2 | endl; // print tuple 2630 sout | x * 3 | y + 1 | z << 2 | x == y | (x | y) | (x || y) | (x > z ? 
1 : 2) | endl; 2261 2631 sout | 1 | 2 | 3 | endl; 2262 2632 sout | '1' | '2' | '3' | endl; … … 2267 2637 | 7 | "¢ x" | 8 | "» x" | 9 | ") x" | 10 | "] x" | 11 | "} x" | endl; 2268 2638 sout | "x`" | 1 | "`x'" | 2 | "'x\"" | 3 | "\"x:" | 4 | ":x " | 5 | " x\t" | 6 | "\tx" | endl; 2269 2270 sout | sepOn | 1 | 2 | 3 | sepOn | endl; // separator at start of line 2271 sout | 1 | sepOff | 2 | 3 | endl; // locally turn off implicit separator 2272 sout | sepDisable | 1 | 2 | 3 | endl; // globally turn off implicit separation 2273 sout | 1 | sepOn | 2 | 3 | endl; // locally turn on implicit separator 2274 sout | sepEnable | 1 | 2 | 3 | endl; // globally turn on implicit separation 2275 2276 sout | [2, 3] | [4, 5] | endl; // print tuple 2277 sout | sepOn | [2, 3] | sepOff | [4, 5] | endl; // locally turn on/off implicit separation 2639 sout | "x ( " | 1 | " ) x" | 2 | " , x" | 3 | " :x: " | 4 | endl; 2278 2640 2279 2641 sepSet( sout, ", $" ); // set separator from " " to ", $" … … 2282 2644 sout | 1 | 2 | 3 | " \"" | sepGet( sout ) | "\"" | endl; 2283 2645 2646 sout | sepOn | 1 | 2 | 3 | sepOn | endl; // separator at start of line 2647 sout | 1 | sepOff | 2 | 3 | endl; // locally turn off implicit separator 2648 2649 sout | sepDisable | 1 | 2 | 3 | endl; // globally turn off implicit separation 2650 sout | 1 | sepOn | 2 | 3 | endl; // locally turn on implicit separator 2651 sout | sepEnable | 1 | 2 | 3 | endl; // globally turn on implicit separation 2652 2284 2653 sepSetTuple( sout, " " ); // set tuple separator from ", " to " " 2285 sout | [2, 3] | [4, 5]| " \"" | sepGetTuple( sout ) | "\"" | endl;2654 sout | t1 | t2 | " \"" | sepGetTuple( sout ) | "\"" | endl; 2286 2655 sepSetTuple( sout, ", " ); // reset tuple separator to ", " 2287 sout | [2, 3] | [4, 5] | " \"" | sepGetTuple( sout ) | "\"" | endl; 2656 sout | t1 | t2 | " \"" | sepGetTuple( sout ) | "\"" | endl; 2657 2658 sout | t1 | t2 | endl; // print tuple 2659 sout | sepOn | t1 | sepOff | t2 | endl; // locally turn 
on/off implicit separation 2288 2660 } 2289 2661 2290 2662 // Local Variables: // 2291 2663 // tab-width: 4 // 2664 // fill-column: 100 // 2292 2665 // End: // 2293 2666 \end{comment} … … 2409 2782 2410 2783 2411 \s ubsection{Constructors and Destructors}2784 \section{Constructors and Destructors} 2412 2785 2413 2786 \CFA supports C initialization of structures, but it also adds constructors for more advanced initialization. 2414 Additionally, \CFA adds destructors that are called when a variable is de -allocated (variable goes out of scope or object is deleted).2787 Additionally, \CFA adds destructors that are called when a variable is deallocated (variable goes out of scope or object is deleted). 2415 2788 These functions take a reference to the structure as a parameter (see References for more information). 2416 2789 … … 2461 2834 \caption{Constructors and Destructors} 2462 2835 \end{figure} 2463 2464 2465 \begin{comment}2466 \section{References}2467 2468 2469 By introducing references in parameter types, users are given an easy way to pass a value by reference, without the need for NULL pointer checks.2470 In structures, a reference can replace a pointer to an object that should always have a valid value.2471 When a structure contains a reference, all of its constructors must initialize the reference and all instances of this structure must initialize it upon definition.2472 2473 The syntax for using references in \CFA is the same as \CC with the exception of reference initialization.2474 Use ©&© to specify a reference, and access references just like regular objects, not like pointers (use dot notation to access fields).2475 When initializing a reference, \CFA uses a different syntax which differentiates reference initialization from assignment to a reference.2476 The ©&© is used on both sides of the expression to clarify that the address of the reference is being set to the address of the variable to which it refers.2477 \end{comment}2478 2836 2479 2837 … … 
2755 3113 2756 3114 3115 \begin{comment} 2757 3116 \section{Generics} 2758 3117 … … 2760 3119 Generics allow programmers to use type variables in place of concrete types so that the code can be reused with multiple types. 2761 3120 The type parameters can be restricted to satisfy a set of constraints. 2762 This enables \CFA to build fully compiled generic functions and types, unlike other languages like \Index*[C++]{\CC } where templates are expanded or must be explicitly instantiated.3121 This enables \CFA to build fully compiled generic functions and types, unlike other languages like \Index*[C++]{\CC{}} where templates are expanded or must be explicitly instantiated. 2763 3122 2764 3123 2765 3124 \subsection{Generic Functions} 2766 3125 2767 Generic functions in \CFA are similar to template functions in \Index*[C++]{\CC }, and will sometimes be expanded into specialized versions, just like in \CC.3126 Generic functions in \CFA are similar to template functions in \Index*[C++]{\CC{}}, and will sometimes be expanded into specialized versions, just like in \CC. 2768 3127 The difference, however, is that generic functions in \CFA can also be separately compiled, using function pointers for callers to pass in all needed functionality for the given type. 2769 3128 This means that compiled libraries can contain generic functions that can be used by programs linked with them (statically or dynamically). … … 2798 3157 generic (type T | bool ?<?(T, T) ) 2799 3158 2800 T min( T a, T b) {3159 T min( T a, T b ) { 2801 3160 return a < b ? a : b; 2802 3161 } … … 2837 3196 2838 3197 generic (type T | Orderable(T)) 2839 T min( T a, T b) {3198 T min( T a, T b ) { 2840 3199 return a < b ? a : b; 2841 3200 } … … 2884 3243 2885 3244 Generic types are defined using the same mechanisms as those described above for generic functions. 
2886 This feature allows users to create types that have one or more fields that use generic parameters as types, similar to a template classes in \Index*[C++]{\CC }.3245 This feature allows users to create types that have one or more fields that use generic parameters as types, similar to a template classes in \Index*[C++]{\CC{}}. 2887 3246 For example, to make a generic linked list, a placeholder is created for the type of the elements, so that the specific type of the elements in the list need not be specified when defining the list. 2888 3247 In C, something like this would have to be done using void pointers and unsafe casting. … … 2936 3295 Throwing an exception terminates execution of the current block, invokes the destructors of variables that are local to the block, and propagates the exception to the parent block. 2937 3296 The exception is immediately re-thrown from the parent block unless it is caught as described below. 2938 \CFA uses keywords similar to \Index*[C++]{\CC } for exception handling.3297 \CFA uses keywords similar to \Index*[C++]{\CC{}} for exception handling. 2939 3298 An exception is thrown using a throw statement, which accepts one argument. 2940 3299 … … 2961 3320 } 2962 3321 \end{cfa} 3322 \end{comment} 2963 3323 2964 3324 … … 3020 3380 Complex *p3 = new(0.5, 1.0); // allocate + 2 param constructor 3021 3381 } 3022 3023 3382 \end{cfa} 3024 3383 … … 3032 3391 3033 3392 3393 \begin{comment} 3034 3394 \subsection{Unsafe C Constructs} 3035 3395 … … 3042 3402 The exact set of unsafe C constructs that will be disallowed in \CFA has not yet been decided, but is sure to include pointer arithmetic, pointer casting, etc. 3043 3403 Once the full set is decided, the rules will be listed here. 
3404 \end{comment} 3044 3405 3045 3406 3046 3407 \section{Concurrency} 3047 3048 Today's processors for nearly all use cases, ranging from embedded systems to large cloud computing servers, are composed of multiple cores, often heterogeneous.3049 As machines grow in complexity, it becomes more difficult for a program to make the most use of the hardware available.3050 \CFA includes built-in concurrency features to enable high performance and improve programmer productivity on these multi-/many-core machines.3051 3408 3052 3409 Concurrency support in \CFA is implemented on top of a highly efficient runtime system of light-weight, M:N, user level threads. … … 3055 3412 This enables a very familiar interface to all programmers, even those with no parallel programming experience. 3056 3413 It also allows the compiler to do static type checking of all communication, a very important safety feature. 3057 This controlled communication with type safety has some similarities with channels in \Index*{Go}, and can actually implement 3058 channels exactly, as well as create additional communication patterns that channels cannot. 3414 This controlled communication with type safety has some similarities with channels in \Index*{Go}, and can actually implement channels exactly, as well as create additional communication patterns that channels cannot. 3059 3415 Mutex objects, monitors, are used to contain mutual exclusion within an object and synchronization across concurrent threads. 
3060 3416 3061 Three new keywords are added to support these features: 3062 3063 monitor creates a structure with implicit locking when accessing fields 3064 3065 mutex implies use of a monitor requiring the implicit locking 3066 3067 task creates a type with implicit locking, separate stack, and a thread 3417 \begin{figure} 3418 \begin{cfa} 3419 #include <fstream> 3420 #include <coroutine> 3421 3422 coroutine Fibonacci { 3423 int fn; §\C{// used for communication}§ 3424 }; 3425 void ?{}( Fibonacci * this ) { 3426 this->fn = 0; 3427 } 3428 void main( Fibonacci * this ) { 3429 int fn1, fn2; §\C{// retained between resumes}§ 3430 this->fn = 0; §\C{// case 0}§ 3431 fn1 = this->fn; 3432 suspend(); §\C{// return to last resume}§ 3433 3434 this->fn = 1; §\C{// case 1}§ 3435 fn2 = fn1; 3436 fn1 = this->fn; 3437 suspend(); §\C{// return to last resume}§ 3438 3439 for ( ;; ) { §\C{// general case}§ 3440 this->fn = fn1 + fn2; 3441 fn2 = fn1; 3442 fn1 = this->fn; 3443 suspend(); §\C{// return to last resume}§ 3444 } // for 3445 } 3446 int next( Fibonacci * this ) { 3447 resume( this ); §\C{// transfer to last suspend}§ 3448 return this->fn; 3449 } 3450 int main() { 3451 Fibonacci f1, f2; 3452 for ( int i = 1; i <= 10; i += 1 ) { 3453 sout | next( &f1 ) | ' ' | next( &f2 ) | endl; 3454 } // for 3455 } 3456 \end{cfa} 3457 \caption{Fibonacci Coroutine} 3458 \label{f:FibonacciCoroutine} 3459 \end{figure} 3460 3461 3462 \subsection{Coroutine} 3463 3464 \Index{Coroutines} are the precursor to tasks. 3465 \VRef[Figure]{f:FibonacciCoroutine} shows a coroutine that computes the \Index*{Fibonacci} numbers. 
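For readers more familiar with plain C, the control flow of the Fibonacci coroutine can be approximated with an explicit state machine. The sketch below is only an illustration under stated assumptions: the names `FibState`, `fib_init`, and `fib_next` are hypothetical and not part of \CFA, and the `state` field stands in for the position that `suspend()`/`resume()` would otherwise remember automatically.

```c
#include <assert.h>

/* Hypothetical C emulation of the Fibonacci coroutine (names are assumptions). */
typedef struct {
    int fn;        /* value communicated to the caller */
    int fn1, fn2;  /* retained between calls, like coroutine locals */
    int state;     /* which "suspend point" to continue from */
} FibState;

void fib_init(FibState *f) {
    f->fn = 0;
    f->fn1 = 0;
    f->fn2 = 0;
    f->state = 0;
}

int fib_next(FibState *f) {
    switch (f->state) {
    case 0:                        /* case 0 */
        f->fn = 0;
        f->fn1 = f->fn;
        f->state = 1;
        break;
    case 1:                        /* case 1 */
        f->fn = 1;
        f->fn2 = f->fn1;
        f->fn1 = f->fn;
        f->state = 2;
        break;
    default:                       /* general case */
        f->fn = f->fn1 + f->fn2;
        f->fn2 = f->fn1;
        f->fn1 = f->fn;
        break;
    }
    return f->fn;
}

/* Small self-check: the first six values are 0 1 1 2 3 5. */
void fib_demo_check(void) {
    FibState f;
    int expected[6] = { 0, 1, 1, 2, 3, 5 };
    fib_init(&f);
    for (int i = 0; i < 6; i += 1) {
        assert(fib_next(&f) == expected[i]);
    }
}
```

The coroutine version avoids the explicit `state` field because each `suspend()` implicitly records where execution continues on the next `resume()`.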
3068 3466 3069 3467 … … 3080 3478 \end{cfa} 3081 3479 3480 \begin{figure} 3481 \begin{cfa} 3482 #include <fstream> 3483 #include <kernel> 3484 #include <monitor> 3485 #include <thread> 3486 3487 monitor global_t { 3488 int value; 3489 }; 3490 3491 void ?{}(global_t * this) { 3492 this->value = 0; 3493 } 3494 3495 static global_t global; 3496 3497 void increment3( global_t * mutex this ) { 3498 this->value += 1; 3499 } 3500 void increment2( global_t * mutex this ) { 3501 increment3( this ); 3502 } 3503 void increment( global_t * mutex this ) { 3504 increment2( this ); 3505 } 3506 3507 thread MyThread {}; 3508 3509 void main( MyThread* this ) { 3510 for(int i = 0; i < 1_000_000; i++) { 3511 increment( &global ); 3512 } 3513 } 3514 int main(int argc, char* argv[]) { 3515 processor p; 3516 { 3517 MyThread f[4]; 3518 } 3519 sout | global.value | endl; 3520 } 3521 \end{cfa} 3522 \caption{Atomic-Counter Monitor} 3523 \label{f:AtomicCounterMonitor} 3524 \end{figure} 3525 3526 \begin{comment} 3082 3527 Since a monitor structure includes an implicit locking mechanism, it does not make sense to copy a monitor; 3083 3528 it is always passed by reference. … … 3126 3571 } 3127 3572 \end{cfa} 3573 \end{comment} 3128 3574 3129 3575 … … 3133 3579 A task provides mutual exclusion like a monitor, and also has its own execution state and a thread of control. 3134 3580 Similar to a monitor, a task is defined like a structure: 3581 3582 \begin{figure} 3583 \begin{cfa} 3584 #include <fstream> 3585 #include <kernel> 3586 #include <stdlib> 3587 #include <thread> 3588 3589 thread First { signal_once * lock; }; 3590 thread Second { signal_once * lock; }; 3591 3592 void ?{}( First * this, signal_once* lock ) { this->lock = lock; } 3593 void ?{}( Second * this, signal_once* lock ) { this->lock = lock; } 3594 3595 void main( First * this ) { 3596 for ( int i = 0; i < 10; i += 1 ) { 3597 sout | "First : Suspend No." 
| i + 1 | endl; 3598 yield(); 3599 } 3600 signal( this->lock ); 3601 } 3602 3603 void main( Second * this ) { 3604 wait( this->lock ); 3605 for ( int i = 0; i < 10; i += 1 ) { 3606 sout | "Second : Suspend No." | i + 1 | endl; 3607 yield(); 3608 } 3609 } 3610 3611 int main( void ) { 3612 signal_once lock; 3613 sout | "User main begin" | endl; 3614 { 3615 processor p; 3616 { 3617 First f = { &lock }; 3618 Second s = { &lock }; 3619 } 3620 } 3621 sout | "User main end" | endl; 3622 } 3623 \end{cfa} 3624 \caption{Simple Tasks} 3625 \label{f:SimpleTasks} 3626 \end{figure} 3627 3628 3629 \begin{comment} 3135 3630 \begin{cfa} 3136 3631 type Adder = task { … … 3142 3637 3143 3638 A task may define a constructor, which will be called upon allocation and run on the caller.s thread. 3144 A destructor may also be defined, which is called at de -allocation (when a dynamic object is deleted or when a local object goes out of scope).3639 A destructor may also be defined, which is called at deallocation (when a dynamic object is deleted or when a local object goes out of scope). 3145 3640 After a task is allocated and initialized, its thread is spawned implicitly and begins executing in its function call method. 3146 3641 All tasks must define this function call method, with a void return value and no additional parameters, or the compiler will report an error. … … 3186 3681 \end{cfa} 3187 3682 3188 3189 3683 \subsection{Cooperative Scheduling} 3190 3684 … … 3299 3793 } 3300 3794 \end{cfa} 3301 3302 3795 \end{comment} 3796 3797 3798 \begin{comment} 3303 3799 \section{Modules and Packages } 3304 3800 3305 \begin{comment}3306 3801 High-level encapsulation is useful for organizing code into reusable units, and accelerating compilation speed. 3307 3802 \CFA provides a convenient mechanism for creating, building and sharing groups of functionality that enhances productivity and improves compile time. 
… … 3319 3814 \subsection{No Declarations, No Header Files} 3320 3815 3321 In C and \Index*[C++]{\CC }, it is necessary to declare or define every global variable, global function, and type before it is used in each file.3816 In C and \Index*[C++]{\CC{}}, it is necessary to declare or define every global variable, global function, and type before it is used in each file. 3322 3817 Header files and a preprocessor are normally used to avoid repeating code. 3323 3818 Thus, many variables, functions, and types are described twice, which exposes an opportunity for errors and causes additional maintenance work. … … 3964 4459 In developing \CFA, many other languages were consulted for ideas, constructs, and syntax. 3965 4460 Therefore, it is important to show how these languages each compare with Do. 3966 In this section, \CFA is compared with what the writers of this document consider to be the closest competitors of Do: \Index*[C++]{\CC}, \Index*{Go}, \Index*{Rust}, and \Index*{D}. 3967 3968 4461 In this section, \CFA is compared with what the writers of this document consider to be the closest competitors of Do: \Index*[C++]{\CC{}}, \Index*{Go}, \Index*{Rust}, and \Index*{D}. 4462 4463 4464 \begin{comment} 3969 4465 \subsection[Comparing Key Features of CFA]{Comparing Key Features of \CFA} 3970 4466 … … 4344 4840 4345 4841 4346 \begin{comment}4347 4842 \subsubsection{Modules / Packages} 4348 4843 … … 4424 4919 } 4425 4920 \end{cfa} 4426 \end{comment}4427 4921 4428 4922 … … 4585 5079 4586 5080 \subsection{Summary of Language Comparison} 4587 4588 4589 \subsubsection[C++]{\CC} 4590 4591 \Index*[C++]{\CC} is a general-purpose programming language. 5081 \end{comment} 5082 5083 5084 \subsection[C++]{\CC} 5085 5086 \Index*[C++]{\CC{}} is a general-purpose programming language. 4592 5087 It has imperative, object-oriented and generic programming features, while also providing facilities for low-level memory manipulation. 
(Wikipedia) 4593 5088 … … 4608 5103 4609 5104 4610 \subs ubsection{Go}5105 \subsection{Go} 4611 5106 4612 5107 \Index*{Go}, also commonly referred to as golang, is a programming language developed at Google in 2007 [.]. … … 4624 5119 4625 5120 4626 \subs ubsection{Rust}5121 \subsection{Rust} 4627 5122 4628 5123 \Index*{Rust} is a general-purpose, multi-paradigm, compiled programming language developed by Mozilla Research. … … 4638 5133 4639 5134 4640 \subs ubsection{D}5135 \subsection{D} 4641 5136 4642 5137 The \Index*{D} programming language is an object-oriented, imperative, multi-paradigm system programming … … 4655 5150 4656 5151 4657 \section{Syntactic Anomalies} 4658 4659 There are several ambiguous cases with operator identifiers, \eg ©int *?*?()©, where the string ©*?*?© can be lexed as ©*©~\R{/}~©?*?© or ©*?©~\R{/}~©*?©. 4660 Since it is common practise to put a unary operator juxtaposed to an identifier, \eg ©*i©, users will be annoyed if they cannot do this with respect to operator identifiers. 4661 Even with this special hack, there are 5 general cases that cannot be handled. 4662 The first case is for the function-call identifier ©?()©: 4663 \begin{cfa} 4664 int *§\textvisiblespace§?()(); // declaration: space required after '*' 4665 *§\textvisiblespace§?()(); // expression: space required after '*' 4666 \end{cfa} 4667 Without the space, the string ©*?()© is ambiguous without N character look ahead; 4668 it requires scanning ahead to determine if there is a ©'('©, which is the start of an argument/parameter list. 4669 4670 The 4 remaining cases occur in expressions: 4671 \begin{cfa} 4672 i++§\textvisiblespace§?i:0; // space required before '?' 4673 i--§\textvisiblespace§?i:0; // space required before '?' 4674 i§\textvisiblespace§?++i:0; // space required after '?' 4675 i§\textvisiblespace§?--i:0; // space required after '?' 
4676 \end{cfa} 4677 In the first two cases, the string ©i++?© is ambiguous, where this string can be lexed as ©i© / ©++?© or ©i++© / ©?©; 4678 it requires scanning ahead to determine if there is a ©'('©, which is the start of an argument list. 4679 In the second two cases, the string ©?++x© is ambiguous, where this string can be lexed as ©?++© / ©x© or ©?© / y©++x©; 4680 it requires scanning ahead to determine if there is a ©'('©, which is the start of an argument list. 4681 4682 4683 \section{Incompatible} 4684 4685 The following incompatibles exist between \CFA and C, and are similar to Annex C for \CC~\cite{ANSI14:C++}. 4686 4687 \begin{enumerate} 4688 \item 4689 \begin{description} 4690 \item[Change:] add new keywords \\ 4691 New keywords are added to \CFA (see~\VRef{s:NewKeywords}). 4692 \item[Rationale:] keywords added to implement new semantics of \CFA. 4693 \item[Effect on original feature:] change to semantics of well-defined feature. \\ 4694 Any ISO C programs using these keywords as identifiers are invalid \CFA programs. 4695 \item[Difficulty of converting:] keyword clashes are accommodated by syntactic transformations using the \CFA backquote escape-mechanism (see~\VRef{s:BackquoteIdentifiers}): 4696 \item[How widely used:] clashes among new \CFA keywords and existing identifiers are rare. 4697 \end{description} 4698 4699 \item 4700 \begin{description} 4701 \item[Change:] type of character literal ©int© to ©char© to allow more intuitive overloading: 4702 \begin{cfa} 4703 int rtn( int i ); 4704 int rtn( char c ); 4705 rtn( 'x' ); §\C{// programmer expects 2nd rtn to be called}§ 4706 \end{cfa} 4707 \item[Rationale:] it is more intuitive for the call to ©rtn© to match the second version of definition of ©rtn© rather than the first. 4708 In particular, output of ©char© variable now print a character rather than the decimal ASCII value of the character. 
4709 \begin{cfa} 4710 sout | 'x' | " " | (int)'x' | endl; 4711 x 120 4712 \end{cfa} 4713 Having to cast ©'x'© to ©char© is non-intuitive. 4714 \item[Effect on original feature:] change to semantics of well-defined feature that depend on: 4715 \begin{cfa} 4716 sizeof( 'x' ) == sizeof( int ) 4717 \end{cfa} 4718 no long work the same in \CFA programs. 4719 \item[Difficulty of converting:] simple 4720 \item[How widely used:] programs that depend upon ©sizeof( 'x' )© are rare and can be changed to ©sizeof(char)©. 4721 \end{description} 4722 4723 \item 4724 \begin{description} 4725 \item[Change:] make string literals ©const©: 4726 \begin{cfa} 4727 char * p = "abc"; §\C{// valid in C, deprecated in \CFA}§ 4728 char * q = expr ? "abc" : "de"; §\C{// valid in C, invalid in \CFA}§ 4729 \end{cfa} 4730 The type of a string literal is changed from ©[] char© to ©const [] char©. 4731 Similarly, the type of a wide string literal is changed from ©[] wchar_t© to ©const [] wchar_t©. 4732 \item[Rationale:] This change is a safety issue: 4733 \begin{cfa} 4734 char * p = "abc"; 4735 p[0] = 'w'; §\C{// segment fault or change constant literal}§ 4736 \end{cfa} 4737 The same problem occurs when passing a string literal to a routine that changes its argument. 4738 \item[Effect on original feature:] change to semantics of well-defined feature. 4739 \item[Difficulty of converting:] simple syntactic transformation, because string literals can be converted to ©char *©. 4740 \item[How widely used:] programs that have a legitimate reason to treat string literals as pointers to potentially modifiable memory are rare. 
4741 \end{description} 4742 4743 \item 4744 \begin{description} 4745 \item[Change:] remove \newterm{tentative definitions}, which only occurs at file scope: 4746 \begin{cfa} 4747 int i; §\C{// forward definition}§ 4748 int *j = ®&i®; §\C{// forward reference, valid in C, invalid in \CFA}§ 4749 int i = 0; §\C{// definition}§ 4750 \end{cfa} 4751 is valid in C, and invalid in \CFA because duplicate overloaded object definitions at the same scope level are disallowed. 4752 This change makes it impossible to define mutually referential file-local static objects, if initializers are restricted to the syntactic forms of C. For example, 4753 \begin{cfa} 4754 struct X { int i; struct X *next; }; 4755 static struct X a; §\C{// forward definition}§ 4756 static struct X b = { 0, ®&a® }; §\C{// forward reference, valid in C, invalid in \CFA}§ 4757 static struct X a = { 1, &b }; §\C{// definition}§ 4758 \end{cfa} 4759 \item[Rationale:] avoids having different initialization rules for builtin types and userdefined types. 4760 \item[Effect on original feature:] change to semantics of well-defined feature. 4761 \item[Difficulty of converting:] the initializer for one of a set of mutually-referential file-local static objects must invoke a routine call to achieve the initialization. 
4762 \item[How widely used:] seldom 4763 \end{description} 4764 4765 \item 4766 \begin{description} 4767 \item[Change:] have ©struct© introduce a scope for nested types: 4768 \begin{cfa} 4769 enum ®Colour® { R, G, B, Y, C, M }; 4770 struct Person { 4771 enum ®Colour® { R, G, B }; §\C{// nested type}§ 4772 struct Face { §\C{// nested type}§ 4773 ®Colour® Eyes, Hair; §\C{// type defined outside (1 level)}§ 4774 }; 4775 ß.ß®Colour® shirt; §\C{// type defined outside (top level)}§ 4776 ®Colour® pants; §\C{// type defined same level}§ 4777 Face looks[10]; §\C{// type defined same level}§ 4778 }; 4779 ®Colour® c = R; §\C{// type/enum defined same level}§ 4780 Personß.ß®Colour® pc = Personß.ßR; §\C{// type/enum defined inside}§ 4781 Personß.ßFace pretty; §\C{// type defined inside}§ 4782 \end{cfa} 4783 In C, the name of the nested types belongs to the same scope as the name of the outermost enclosing structure, i.e., the nested types are hoisted to the scope of the outer-most type, which is not useful and confusing. 4784 \CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC}. 4785 Nested types are not hoisted and can be referenced using the field selection operator ``©.©'', unlike the \CC scope-resolution operator ``©::©''. 4786 \item[Rationale:] ©struct© scope is crucial to \CFA as an information structuring and hiding mechanism. 4787 \item[Effect on original feature:] change to semantics of well-defined feature. 4788 \item[Difficulty of converting:] Semantic transformation. 4789 \item[How widely used:] C programs rarely have nest types because they are equivalent to the hoisted version. 4790 \end{description} 4791 4792 \item 4793 \begin{description} 4794 \item[Change:] In C++, the name of a nested class is local to its enclosing class. 4795 \item[Rationale:] C++ classes have member functions which require that classes establish scopes. 4796 \item[Difficulty of converting:] Semantic transformation. 
To make the struct type name visible in the scope of the enclosing struct, the struct tag could be declared in the scope of the enclosing struct, before the enclosing struct is defined. Example: 4797 \begin{cfa} 4798 struct Y; §\C{// struct Y and struct X are at the same scope}§ 4799 struct X { 4800 struct Y { /* ... */ } y; 4801 }; 4802 \end{cfa} 4803 All the definitions of C struct types enclosed in other struct definitions and accessed outside the scope of the enclosing struct could be exported to the scope of the enclosing struct. 4804 Note: this is a consequence of the difference in scope rules, which is documented in 3.3. 4805 \item[How widely used:] Seldom. 4806 \end{description} 4807 4808 \item 4809 \begin{description} 4810 \item[Change:] comma expression is disallowed as subscript 4811 \item[Rationale:] safety issue to prevent subscripting error for multidimensional arrays: ©x[i,j]© instead of ©x[i][j]©, and this syntactic form then taken by \CFA for new style arrays. 4812 \item[Effect on original feature:] change to semantics of well-defined feature. 4813 \item[Difficulty of converting:] semantic transformation of ©x[i,j]© to ©x[(i,j)]© 4814 \item[How widely used:] seldom. 4815 \end{description} 4816 \end{enumerate} 5152 \section{Syntax Ambiguities} 5153 5154 C has a number of syntax ambiguities, which are resolved by taking the longest sequence of overlapping characters that constitute a token. 5155 For example, the program fragment ©x+++++y© is parsed as \lstinline[showspaces=true]@x ++ ++ + y@ because operator tokens ©++© and ©+© overlap. 5156 Unfortunately, the longest sequence violates a constraint on increment operators, even though the parse \lstinline[showspaces=true]@x ++ + ++ y@ might yield a correct expression. 5157 Hence, C programmers are aware that spaces have to be added to disambiguate certain syntactic cases. 
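The longest-token rule described above can be observed directly in standard C. The following is a minimal sketch, not taken from the manual: the function names `munch_three`, `spaced_five`, and `munch_demo_check` are hypothetical. It shows that `x+++y` is lexed as `x ++ + y`, and that inserting spaces selects the parse `x ++ + ++ y`, which the unspaced form `x+++++y` cannot express.

```c
#include <assert.h>

/* Lexed as: x ++ + y  (longest token first), not x + ++ y. */
int munch_three(int x, int y) {
    return x+++y;
}

/* Spaces force the tokens: x ++ + ++ y. */
int spaced_five(int x, int y) {
    return x++ + ++y;
}

/* Small self-check of the two tokenizations. */
void munch_demo_check(void) {
    assert(munch_three(1, 2) == 3);  /* old x (1) + y (2); x + ++y would give 4 */
    assert(spaced_five(1, 2) == 4);  /* old x (1) + incremented y (3) */
}
```

Writing the second function without spaces, as `x+++++y`, is rejected by a conforming C compiler because the lexer produces `x ++ ++ + y`, and the second `++` is applied to a non-lvalue.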
5158 5159 In \CFA, there are ambiguous cases with dereference and operator identifiers, \eg ©int *?*?()©, where the string ©*?*?© can be interpreted as: 5160 \begin{cfa} 5161 *?§\color{red}\textvisiblespace§*? §\C{// dereference operator, dereference operator}§ 5162 *§\color{red}\textvisiblespace§?*? §\C{// dereference, multiplication operator}§ 5163 \end{cfa} 5164 By default, the first interpretation is selected, which does not yield a meaningful parse. 5165 Therefore, \CFA does a lexical look-ahead for the second case, and backtracks to return the leading unary operator and reparses the trailing operator identifier. 5166 Otherwise a space is needed between the unary operator and operator identifier to disambiguate this common case. 5167 5168 A similar issue occurs with the dereference, ©*?(...)©, and routine-call, ©?()(...)© identifiers. 5169 The ambiguity occurs when the dereference operator has no parameters: 5170 \begin{cfa} 5171 *?()§\color{red}\textvisiblespace...§ ; 5172 *?()§\color{red}\textvisiblespace...§(...) ; 5173 \end{cfa} 5174 requiring arbitrary whitespace look-ahead for the routine-call parameter-list to disambiguate. 5175 However, the dereference operator \emph{must} have a parameter/argument to dereference ©*?(...)©. 5176 Hence, always interpreting the string ©*?()© as \lstinline[showspaces=true]@* ?()@ does not preclude any meaningful program. 5177 5178 The remaining cases are with the increment/decrement operators and conditional expression, \eg: 5179 \begin{cfa} 5180 i++?§\color{red}\textvisiblespace...§(...); 5181 i?++§\color{red}\textvisiblespace...§(...); 5182 \end{cfa} 5183 requiring arbitrary whitespace look-ahead for the operator parameter-list, even though that interpretation is an incorrect expression (juxtaposed identifiers). 5184 Therefore, it is necessary to disambiguate these cases with a space: 5185 \begin{cfa} 5186 i++§\color{red}\textvisiblespace§? 
i : 0; 5187 i?§\color{red}\textvisiblespace§++i : 0; 5188 \end{cfa} 4817 5189 4818 5190 … … 4821 5193 4822 5194 \begin{quote2} 4823 \begin{tabular}{lll }5195 \begin{tabular}{llll} 4824 5196 \begin{tabular}{@{}l@{}} 4825 5197 ©_AT© \\ … … 4829 5201 ©coroutine© \\ 4830 5202 ©disable© \\ 4831 ©dtype© \\4832 ©enable© \\4833 5203 \end{tabular} 4834 5204 & 4835 5205 \begin{tabular}{@{}l@{}} 5206 ©dtype© \\ 5207 ©enable© \\ 4836 5208 ©fallthrough© \\ 4837 5209 ©fallthru© \\ 4838 5210 ©finally© \\ 4839 5211 ©forall© \\ 5212 \end{tabular} 5213 & 5214 \begin{tabular}{@{}l@{}} 4840 5215 ©ftype© \\ 4841 5216 ©lvalue© \\ 4842 5217 ©monitor© \\ 4843 5218 ©mutex© \\ 5219 ©one_t© \\ 5220 ©otype© \\ 4844 5221 \end{tabular} 4845 5222 & 4846 5223 \begin{tabular}{@{}l@{}} 4847 ©one_t© \\4848 ©otype© \\4849 5224 ©throw© \\ 4850 5225 ©throwResume© \\ … … 4858 5233 4859 5234 5235 \section{Incompatible} 5236 5237 The following incompatibles exist between \CFA and C, and are similar to Annex C for \CC~\cite{C++14}. 5238 5239 5240 \begin{enumerate} 5241 \item 5242 \begin{description} 5243 \item[Change:] add new keywords \\ 5244 New keywords are added to \CFA (see~\VRef{s:CFAKeywords}). 5245 \item[Rationale:] keywords added to implement new semantics of \CFA. 5246 \item[Effect on original feature:] change to semantics of well-defined feature. \\ 5247 Any \Celeven programs using these keywords as identifiers are invalid \CFA programs. 5248 \item[Difficulty of converting:] keyword clashes are accommodated by syntactic transformations using the \CFA backquote escape-mechanism (see~\VRef{s:BackquoteIdentifiers}). 5249 \item[How widely used:] clashes among new \CFA keywords and existing identifiers are rare. 5250 \end{description} 5251 5252 \item 5253 \begin{description} 5254 \item[Change:] drop K\&R C declarations \\ 5255 K\&R declarations allow an implicit base-type of ©int©, if no type is specified, plus an alternate syntax for declaring parameters. 
5256 \eg: 5257 \begin{cfa} 5258 x; §\C{// int x}§ 5259 *y; §\C{// int *y}§ 5260 f( p1, p2 ); §\C{// int f( int p1, int p2 );}§ 5261 g( p1, p2 ) int p1, p2; §\C{// int g( int p1, int p2 );}§ 5262 \end{cfa} 5263 \CFA supports K\&R routine definitions: 5264 \begin{cfa} 5265 f( a, b, c ) §\C{// default int return}§ 5266 int a, b; char c §\C{// K\&R parameter declarations}§ 5267 { 5268 ... 5269 } 5270 \end{cfa} 5271 \item[Rationale:] dropped from \Celeven standard.\footnote{ 5272 At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each structure declaration and type name~\cite[\S~6.7.2(2)]{C11}} 5273 \item[Effect on original feature:] original feature is deprecated. \\ 5274 Any old C programs using these K\&R declarations are invalid \CFA programs. 5275 \item[Difficulty of converting:] trivial to convert to \CFA. 5276 \item[How widely used:] existing usages are rare. 5277 \end{description} 5278 5279 \item 5280 \begin{description} 5281 \item[Change:] type of character literal ©int© to ©char© to allow more intuitive overloading: 5282 \begin{cfa} 5283 int rtn( int i ); 5284 int rtn( char c ); 5285 rtn( 'x' ); §\C{// programmer expects 2nd rtn to be called}§ 5286 \end{cfa} 5287 \item[Rationale:] it is more intuitive for the call to ©rtn© to match the second version of definition of ©rtn© rather than the first. 5288 In particular, output of a ©char© variable now prints a character rather than the decimal ASCII value of the character. 5289 \begin{cfa} 5290 sout | 'x' | " " | (int)'x' | endl; 5291 x 120 5292 \end{cfa} 5293 Having to cast ©'x'© to ©char© is non-intuitive. 5294 \item[Effect on original feature:] change to semantics of well-defined feature; programs that depend on: 5295 \begin{cfa} 5296 sizeof( 'x' ) == sizeof( int ) 5297 \end{cfa} 5298 no longer work the same in \CFA programs. 
5299 \item[Difficulty of converting:] simple 5300 \item[How widely used:] programs that depend upon ©sizeof( 'x' )© are rare and can be changed to ©sizeof(char)©. 5301 \end{description} 5302 5303 \item 5304 \begin{description} 5305 \item[Change:] make string literals ©const©: 5306 \begin{cfa} 5307 char * p = "abc"; §\C{// valid in C, deprecated in \CFA}§ 5308 char * q = expr ? "abc" : "de"; §\C{// valid in C, invalid in \CFA}§ 5309 \end{cfa} 5310 The type of a string literal is changed from ©[] char© to ©const [] char©. 5311 Similarly, the type of a wide string literal is changed from ©[] wchar_t© to ©const [] wchar_t©. 5312 \item[Rationale:] This change is a safety issue: 5313 \begin{cfa} 5314 char * p = "abc"; 5315 p[0] = 'w'; §\C{// segmentation fault or change constant literal}§ 5316 \end{cfa} 5317 The same problem occurs when passing a string literal to a routine that changes its argument. 5318 \item[Effect on original feature:] change to semantics of well-defined feature. 5319 \item[Difficulty of converting:] simple syntactic transformation, because string literals can be converted to ©char *©. 5320 \item[How widely used:] programs that have a legitimate reason to treat string literals as pointers to potentially modifiable memory are rare. 5321 \end{description} 5322 5323 \item 5324 \begin{description} 5325 \item[Change:] remove \newterm{tentative definitions}, which only occur at file scope: 5326 \begin{cfa} 5327 int i; §\C{// forward definition}§ 5328 int *j = ®&i®; §\C{// forward reference, valid in C, invalid in \CFA}§ 5329 int i = 0; §\C{// definition}§ 5330 \end{cfa} 5331 is valid in C, and invalid in \CFA because duplicate overloaded object definitions at the same scope level are disallowed. 5332 This change makes it impossible to define mutually referential file-local static objects, if initializers are restricted to the syntactic forms of C. 
For example, 5333 \begin{cfa} 5334 struct X { int i; struct X *next; }; 5335 static struct X a; §\C{// forward definition}§ 5336 static struct X b = { 0, ®&a® }; §\C{// forward reference, valid in C, invalid in \CFA}§ 5337 static struct X a = { 1, &b }; §\C{// definition}§ 5338 \end{cfa} 5339 \item[Rationale:] avoids having different initialization rules for builtin types and user-defined types. 5340 \item[Effect on original feature:] change to semantics of well-defined feature. 5341 \item[Difficulty of converting:] the initializer for one of a set of mutually-referential file-local static objects must invoke a routine call to achieve the initialization. 5342 \item[How widely used:] seldom 5343 \end{description} 5344 5345 \item 5346 \begin{description} 5347 \item[Change:] have ©struct© introduce a scope for nested types: 5348 \begin{cfa} 5349 enum ®Colour® { R, G, B, Y, C, M }; 5350 struct Person { 5351 enum ®Colour® { R, G, B }; §\C{// nested type}§ 5352 struct Face { §\C{// nested type}§ 5353 ®Colour® Eyes, Hair; §\C{// type defined outside (1 level)}§ 5354 }; 5355 ®.Colour® shirt; §\C{// type defined outside (top level)}§ 5356 ®Colour® pants; §\C{// type defined same level}§ 5357 Face looks[10]; §\C{// type defined same level}§ 5358 }; 5359 ®Colour® c = R; §\C{// type/enum defined same level}§ 5360 Person®.Colour® pc = Person®.®R; §\C{// type/enum defined inside}§ 5361 Person®.®Face pretty; §\C{// type defined inside}§ 5362 \end{cfa} 5363 In C, the name of the nested types belongs to the same scope as the name of the outermost enclosing structure, \ie the nested types are hoisted to the scope of the outer-most type, which is not useful and confusing. 5364 \CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC{}}. 5365 Nested types are not hoisted and can be referenced using the field selection operator ``©.©'', unlike the \CC scope-resolution operator ``©::©''. 
5366 \item[Rationale:] ©struct© scope is crucial to \CFA as an information structuring and hiding mechanism. 5367 \item[Effect on original feature:] change to semantics of well-defined feature. 5368 \item[Difficulty of converting:] Semantic transformation. 5369 \item[How widely used:] C programs rarely have nested types because they are equivalent to the hoisted version. 5370 \end{description} 5371 5372 \item 5373 \begin{description} 5374 \item[Change:] In C++, the name of a nested class is local to its enclosing class. 5375 \item[Rationale:] C++ classes have member functions which require that classes establish scopes. 5376 \item[Difficulty of converting:] Semantic transformation. To make the struct type name visible in the scope of the enclosing struct, the struct tag could be declared in the scope of the enclosing struct, before the enclosing struct is defined. Example: 5377 \begin{cfa} 5378 struct Y; §\C{// struct Y and struct X are at the same scope}§ 5379 struct X { 5380 struct Y { /* ... */ } y; 5381 }; 5382 \end{cfa} 5383 All the definitions of C struct types enclosed in other struct definitions and accessed outside the scope of the enclosing struct could be exported to the scope of the enclosing struct. 5384 Note: this is a consequence of the difference in scope rules, which is documented in 3.3. 5385 \item[How widely used:] Seldom. 5386 \end{description} 5387 5388 \item 5389 \begin{description} 5390 \item[Change:] comma expression is disallowed as subscript 5391 \item[Rationale:] safety issue to prevent a subscripting error for multidimensional arrays: ©x[i,j]© instead of ©x[i][j]©; this syntactic form is then taken by \CFA for new-style arrays. 5392 \item[Effect on original feature:] change to semantics of well-defined feature. 5393 \item[Difficulty of converting:] semantic transformation of ©x[i,j]© to ©x[(i,j)]© 5394 \item[How widely used:] seldom. 
5395 \end{description} 5396 \end{enumerate} 5397 5398 4860 5399 \section{Standard Headers} 4861 5400 \label{s:StandardHeaders} 4862 5401 4863 C11prescribes the following standard header-files~\cite[\S~7.1.2]{C11} and \CFA adds to this list:5402 \Celeven prescribes the following standard header-files~\cite[\S~7.1.2]{C11} and \CFA adds to this list: 4864 5403 \begin{quote2} 4865 \begin{tabular}{lll|l} 4866 \multicolumn{3}{c|}{C11} & \multicolumn{1}{c}{\CFA} \\ 5404 \lstset{deletekeywords={float}} 5405 \begin{tabular}{@{}llll|l@{}} 5406 \multicolumn{4}{c|}{C11} & \multicolumn{1}{c}{\CFA} \\ 4867 5407 \hline 4868 assert.h & math.h & stdlib.h & unistd.h \\ 4869 complex.h & setjmp.h & stdnoreturn.h & gmp.h \\ 4870 ctype.h & signal.h & string.h \\ 4871 errno.h & stdalign.h & tgmath.h \\ 4872 fenv.h & stdarg.h & threads.h \\ 4873 float.h & stdatomic.h & time.h \\ 4874 inttypes.h & stdbool.h & uchar.h \\ 4875 iso646.h & stddef.h & wchar.h \\ 4876 limits.h & stdint.h & wctype.h \\ 4877 locale.h & stdio.h & \\ 5408 \begin{tabular}{@{}l@{}} 5409 \Indexc{assert.h} \\ 5410 \Indexc{complex.h} \\ 5411 \Indexc{ctype.h} \\ 5412 \Indexc{errno.h} \\ 5413 \Indexc{fenv.h} \\ 5414 \Indexc{float.h} \\ 5415 \Indexc{inttypes.h} \\ 5416 \Indexc{iso646.h} \\ 5417 \end{tabular} 5418 & 5419 \begin{tabular}{@{}l@{}} 5420 \Indexc{limits.h} \\ 5421 \Indexc{locale.h} \\ 5422 \Indexc{math.h} \\ 5423 \Indexc{setjmp.h} \\ 5424 \Indexc{signal.h} \\ 5425 \Indexc{stdalign.h} \\ 5426 \Indexc{stdarg.h} \\ 5427 \Indexc{stdatomic.h} \\ 5428 \end{tabular} 5429 & 5430 \begin{tabular}{@{}l@{}} 5431 \Indexc{stdbool.h} \\ 5432 \Indexc{stddef.h} \\ 5433 \Indexc{stdint.h} \\ 5434 \Indexc{stdio.h} \\ 5435 \Indexc{stdlib.h} \\ 5436 \Indexc{stdnoreturn.h} \\ 5437 \Indexc{string.h} \\ 5438 \Indexc{tgmath.h} \\ 5439 \end{tabular} 5440 & 5441 \begin{tabular}{@{}l@{}} 5442 \Indexc{threads.h} \\ 5443 \Indexc{time.h} \\ 5444 \Indexc{uchar.h} \\ 5445 \Indexc{wchar.h} \\ 5446 \Indexc{wctype.h} \\ 5447 \\ 5448 \\ 5449 \\ 5450 
\end{tabular} 5451 & 5452 \begin{tabular}{@{}l@{}} 5453 \Indexc{unistd.h} \\ 5454 \Indexc{gmp.h} \\ 5455 \\ 5456 \\ 5457 \\ 5458 \\ 5459 \\ 5460 \\ 5461 \end{tabular} 4878 5462 \end{tabular} 4879 5463 \end{quote2} 4880 For the prescribed head-files, \CFA implicitly wraps their includes in an ©extern "C"©; 5464 For the prescribed header-files, \CFA uses header interposition to wrap these includes in an ©extern "C"©; 4881 5465 hence, names in these include files are not mangled\index{mangling!name} (see~\VRef{s:Interoperability}). 4882 5466 All other C header files must be explicitly wrapped in ©extern "C"© to prevent name mangling. 5467 For \Index*[C++]{\CC{}}, the name-mangling issue is handled implicitly because most C header-files are augmented with checks for preprocessor variable ©__cplusplus©, which add appropriate ©extern "C"© qualifiers. 4883 5468 4884 5469 … … 4886 5471 \label{s:StandardLibrary} 4887 5472 4888 The goal of the \CFA standard-library is to wrap many of the existing C library-routines that are explicitly polymorphic into implicitly polymorphic versions. 4889 4890 4891 \subsection{malloc} 5473 The \CFA standard-library wraps explicitly-polymorphic C routines into implicitly-polymorphic versions. 5474 5475 5476 \subsection{Storage Management} 5477 5478 The storage-management routines extend their C equivalents by overloading, providing alternate names and shallow type-safety, and removing the need to specify the allocation size for non-array types. 5479 5480 Storage management provides the following capabilities: 5481 \begin{description} 5482 \item[fill] 5483 after allocation the storage is filled with a specified character. 5484 \item[resize] 5485 an existing allocation is decreased or increased in size. 5486 In either case, new storage may or may not be allocated and, if there is a new allocation, as much data as possible from the existing allocation is copied. 5487 For an increase in storage size, new storage after the copied data may be filled. 
5488 \item[alignment] 5489 an allocation starts on a specified memory boundary, e.g., an address multiple of 64 or 128 for cache-line purposes. 5490 \item[array] 5491 the allocation size is scaled to the specified number of array elements. 5492 An array may be filled, resized, or aligned. 5493 \end{description} 5494 The table shows allocation routines supporting different combinations of storage-management capabilities: 5495 \begin{center} 5496 \begin{tabular}{@{}lr|l|l|l|l@{}} 5497 & & \multicolumn{1}{c|}{fill} & resize & alignment & array \\ 5498 \hline 5499 C & ©malloc© & no & no & no & no \\ 5500 & ©calloc© & yes (0 only) & no & no & yes \\ 5501 & ©realloc© & no/copy & yes & no & no \\ 5502 & ©memalign© & no & no & yes & no \\ 5503 & ©posix_memalign© & no & no & yes & no \\ 5504 C11 & ©aligned_alloc© & no & no & yes & no \\ 5505 \CFA & ©alloc© & no/copy/yes & no/yes & no & yes \\ 5506 & ©align_alloc© & no/yes & no & yes & yes \\ 5507 \end{tabular} 5508 \end{center} 5509 It is impossible to resize with alignment because the underlying ©realloc© allocates storage if more space is needed, and it does not honour alignment from the original allocation. 
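The fill, resize, and array capabilities described above can be sketched in plain C to show the bookkeeping that the \CFA routines are meant to hide. This is only an illustrative sketch, not the \CFA implementation: the helper names ©array_alloc_fill© and ©array_resize_fill© are hypothetical, and the element size must be passed explicitly because C cannot infer it from the pointer type.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a CFA routine): array + fill.
   Allocates dim elements of elem_size bytes and sets every byte to fill. */
void * array_alloc_fill( size_t dim, size_t elem_size, char fill ) {
    assert( dim > 0 && elem_size > 0 );
    if ( dim > SIZE_MAX / elem_size ) return NULL;      /* dim * elem_size would overflow */
    void * ptr = malloc( dim * elem_size );
    if ( ptr != NULL ) memset( ptr, fill, dim * elem_size );
    return ptr;
}

/* Hypothetical helper (not a CFA routine): array + resize + fill.
   Resizes from old_dim to new_dim elements; realloc copies the existing data,
   and any new storage after the copied data is set to fill.
   On failure, NULL is returned and the original allocation is left untouched. */
void * array_resize_fill( void * ptr, size_t old_dim, size_t new_dim, size_t elem_size, char fill ) {
    assert( new_dim > 0 && elem_size > 0 );
    if ( new_dim > SIZE_MAX / elem_size ) return NULL;  /* new_dim * elem_size would overflow */
    void * new_ptr = realloc( ptr, new_dim * elem_size );
    if ( new_ptr != NULL && new_dim > old_dim ) {
        memset( (char *)new_ptr + old_dim * elem_size, fill, (new_dim - old_dim) * elem_size );
    }
    return new_ptr;
}
```

The caller has to track both the element size and the old dimension by hand, which is the kind of error-prone detail the implicit size specification in the \CFA versions removes. Alignment is deliberately left out of the sketch, because ©realloc© gives no alignment guarantee beyond the default.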
4892 5510 4893 5511 \leavevmode 4894 5512 \begin{cfa}[aboveskip=0pt,belowskip=0pt] 4895 forall( otype T ) T * malloc( void );§\indexc{malloc}§ 4896 forall( otype T ) T * malloc( char fill ); 4897 forall( otype T ) T * malloc( T * ptr, size_t size ); 4898 forall( otype T ) T * malloc( T * ptr, size_t size, unsigned char fill ); 4899 forall( otype T ) T * calloc( size_t nmemb );§\indexc{calloc}§ 4900 forall( otype T ) T * realloc( T * ptr, size_t size );§\indexc{ato}§ 4901 forall( otype T ) T * realloc( T * ptr, size_t size, unsigned char fill ); 4902 4903 forall( otype T ) T * aligned_alloc( size_t alignment );§\indexc{ato}§ 4904 forall( otype T ) T * memalign( size_t alignment ); // deprecated 4905 forall( otype T ) int posix_memalign( T ** ptr, size_t alignment ); 4906 4907 forall( otype T ) T * memset( T * ptr, unsigned char fill ); // use default value '\0' for fill 4908 forall( otype T ) T * memset( T * ptr ); // remove when default value available 4909 \end{cfa} 4910 4911 4912 \subsection{ato / strto} 5513 // C unsafe allocation 5514 extern "C" { 5515 void * malloc( size_t size );§\indexc{malloc}§ 5516 void * calloc( size_t dim, size_t size );§\indexc{calloc}§ 5517 void * realloc( void * ptr, size_t size );§\indexc{realloc}§ 5518 void * memalign( size_t align, size_t size );§\indexc{memalign}§ 5519 int posix_memalign( void ** ptr, size_t align, size_t size );§\indexc{posix_memalign}§ 5520 } 5521 5522 // §\CFA§ safe equivalents, i.e., implicit size specification 5523 forall( dtype T | sized(T) ) T * malloc( void ); 5524 forall( dtype T | sized(T) ) T * calloc( size_t dim ); 5525 forall( dtype T | sized(T) ) T * realloc( T * ptr, size_t size ); 5526 forall( dtype T | sized(T) ) T * memalign( size_t align ); 5527 forall( dtype T | sized(T) ) T * aligned_alloc( size_t align ); 5528 forall( dtype T | sized(T) ) int posix_memalign( T ** ptr, size_t align ); 5529 5530 // §\CFA§ safe general allocation, fill, resize, array 5531 forall( dtype T | sized(T) ) T * alloc( 
void );§\indexc{alloc}§ 5532 forall( dtype T | sized(T) ) T * alloc( char fill ); 5533 forall( dtype T | sized(T) ) T * alloc( size_t dim ); 5534 forall( dtype T | sized(T) ) T * alloc( size_t dim, char fill ); 5535 forall( dtype T | sized(T) ) T * alloc( T ptr[], size_t dim ); 5536 forall( dtype T | sized(T) ) T * alloc( T ptr[], size_t dim, char fill ); 5537 5538 // §\CFA§ safe general allocation, align, fill, array 5539 forall( dtype T | sized(T) ) T * align_alloc( size_t align ); 5540 forall( dtype T | sized(T) ) T * align_alloc( size_t align, char fill ); 5541 forall( dtype T | sized(T) ) T * align_alloc( size_t align, size_t dim ); 5542 forall( dtype T | sized(T) ) T * align_alloc( size_t align, size_t dim, char fill ); 5543 5544 // C unsafe initialization/copy 5545 extern "C" { 5546 void * memset( void * dest, int c, size_t size ); 5547 void * memcpy( void * dest, const void * src, size_t size ); 5548 } 5549 5550 // §\CFA§ safe initialization/copy, i.e., implicit size specification 5551 forall( dtype T | sized(T) ) T * memset( T * dest, char c );§\indexc{memset}§ 5552 forall( dtype T | sized(T) ) T * memcpy( T * dest, const T * src );§\indexc{memcpy}§ 5553 5554 // §\CFA§ safe initialization/copy array 5555 forall( dtype T | sized(T) ) T * memset( T dest[], size_t dim, char c ); 5556 forall( dtype T | sized(T) ) T * memcpy( T dest[], const T src[], size_t dim ); 5557 5558 // §\CFA§ allocation/deallocation and constructor/destructor 5559 forall( dtype T | sized(T), ttype Params | { void ?{}( T *, Params ); } ) T * new( Params p );§\indexc{new}§ 5560 forall( dtype T | { void ^?{}( T * ); } ) void delete( T * ptr );§\indexc{delete}§ 5561 forall( dtype T, ttype Params | { void ^?{}( T * ); void delete( Params ); } ) 5562 void delete( T * ptr, Params rest ); 5563 5564 // §\CFA§ allocation/deallocation and constructor/destructor, array 5565 forall( dtype T | sized(T), ttype Params | { void ?{}( T *, Params ); } ) T * anew( size_t dim, Params p );§\indexc{anew}§ 
5566 forall( dtype T | sized(T) | { void ^?{}( T * ); } ) void adelete( size_t dim, T arr[] );§\indexc{adelete}§ 5567 forall( dtype T | sized(T) | { void ^?{}( T * ); }, ttype Params | { void adelete( Params ); } ) 5568 void adelete( size_t dim, T arr[], Params rest ); 5569 \end{cfa} 5570 5571 5572 \subsection{Conversion} 4913 5573 4914 5574 \leavevmode … … 4942 5602 4943 5603 4944 \subsection{ bsearch / qsort}5604 \subsection{Search / Sort} 4945 5605 4946 5606 \leavevmode 4947 5607 \begin{cfa}[aboveskip=0pt,belowskip=0pt] 5608 forall( otype T | { int ?<?( T, T ); } ) §\C{// location}§ 5609 T * bsearch( T key, const T * arr, size_t dim );§\indexc{bsearch}§ 5610 5611 forall( otype T | { int ?<?( T, T ); } ) §\C{// position}§ 5612 unsigned int bsearch( T key, const T * arr, size_t dim ); 5613 4948 5614 forall( otype T | { int ?<?( T, T ); } ) 4949 T * bsearch( const T key, const T * arr, size_t dimension );§\indexc{bsearch}§ 4950 4951 forall( otype T | { int ?<?( T, T ); } ) 4952 void qsort( const T * arr, size_t dimension );§\indexc{qsort}§ 4953 \end{cfa} 4954 4955 4956 \subsection{abs} 5615 void qsort( const T * arr, size_t dim );§\indexc{qsort}§ 5616 \end{cfa} 5617 5618 5619 \subsection{Absolute Value} 4957 5620 4958 5621 \leavevmode 4959 5622 \begin{cfa}[aboveskip=0pt,belowskip=0pt] 4960 char abs(char );§\indexc{abs}§5623 unsigned char abs( signed char );§\indexc{abs}§ 4961 5624 int abs( int ); 4962 long int abs( long int );4963 long long int abs( long long int );5625 unsigned long int abs( long int ); 5626 unsigned long long int abs( long long int ); 4964 5627 float abs( float ); 4965 5628 double abs( double ); … … 4968 5631 double abs( double _Complex ); 4969 5632 long double abs( long double _Complex ); 4970 \end{cfa} 4971 4972 4973 \subsection{random} 5633 forall( otype T | { void ?{}( T *, zero_t ); int ?<?( T, T ); T -?( T ); } ) 5634 T abs( T ); 5635 \end{cfa} 5636 5637 5638 \subsection{Random Numbers} 4974 5639 4975 5640 \leavevmode … … 4989 5654 4990 
5655 4991 \subsection{ min / max / clamp / swap}5656 \subsection{Algorithms} 4992 5657 4993 5658 \leavevmode 4994 5659 \begin{cfa}[aboveskip=0pt,belowskip=0pt] 4995 forall( otype T | { int ?<?( T, T ); } ) 4996 T min( const T t1, const T t2 );§\indexc{min}§ 4997 4998 forall( otype T | { int ?>?( T, T ); } ) 4999 T max( const T t1, const T t2 );§\indexc{max}§ 5000 5001 forall( otype T | { T min( T, T ); T max( T, T ); } ) 5002 T clamp( T value, T min_val, T max_val );§\indexc{clamp}§ 5003 5004 forall( otype T ) 5005 void swap( T * t1, T * t2 );§\indexc{swap}§ 5660 forall( otype T | { int ?<?( T, T ); } ) T min( T t1, T t2 );§\indexc{min}§ 5661 forall( otype T | { int ?>?( T, T ); } ) T max( T t1, T t2 );§\indexc{max}§ 5662 forall( otype T | { T min( T, T ); T max( T, T ); } ) T clamp( T value, T min_val, T max_val );§\indexc{clamp}§ 5663 forall( otype T ) void swap( T * t1, T * t2 );§\indexc{swap}§ 5006 5664 \end{cfa} 5007 5665 … … 5010 5668 \label{s:Math Library} 5011 5669 5012 The goal of the \CFA math-library is to wrap many of the existing C math library-routines that are explicitly polymorphic into implicitlypolymorphic versions.5670 The \CFA math-library wraps explicitly-polymorphic C math-routines into implicitly-polymorphic versions. 5013 5671 5014 5672 … … 5017 5675 \leavevmode 5018 5676 \begin{cfa}[aboveskip=0pt,belowskip=0pt] 5019 float fabs( float );§\indexc{fabs}§5020 double fabs( double );5021 long double fabs( long double );5022 float cabs( float _Complex );5023 double cabs( double _Complex );5024 long double cabs( long double _Complex );5025 5026 5677 float ?%?( float, float );§\indexc{fmod}§ 5027 5678 float fmod( float, float ); … … 5378 6029 5379 6030 6031 \section{Multi-precision Integers} 6032 \label{s:MultiPrecisionIntegers} 6033 6034 \CFA has an interface to the GMP \Index{multi-precision} signed-integers~\cite{GMP}, similar to the \CC interface provided by GMP. 
6035 The \CFA interface wraps GMP routines into operator routines to make programming with multi-precision integers identical to using fixed-sized integers. 6036 The \CFA type name for multi-precision signed-integers is \Indexc{Int} and the header file is \Indexc{gmp}. 6037 6038 \begin{cfa} 6039 void ?{}( Int * this ); §\C{// constructor}§ 6040 void ?{}( Int * this, Int init ); 6041 void ?{}( Int * this, zero_t ); 6042 void ?{}( Int * this, one_t ); 6043 void ?{}( Int * this, signed long int init ); 6044 void ?{}( Int * this, unsigned long int init ); 6045 void ?{}( Int * this, const char * val ); 6046 void ^?{}( Int * this ); 6047 6048 Int ?=?( Int * lhs, Int rhs ); §\C{// assignment}§ 6049 Int ?=?( Int * lhs, long int rhs ); 6050 Int ?=?( Int * lhs, unsigned long int rhs ); 6051 Int ?=?( Int * lhs, const char * rhs ); 6052 6053 char ?=?( char * lhs, Int rhs ); 6054 short int ?=?( short int * lhs, Int rhs ); 6055 int ?=?( int * lhs, Int rhs ); 6056 long int ?=?( long int * lhs, Int rhs ); 6057 unsigned char ?=?( unsigned char * lhs, Int rhs ); 6058 unsigned short int ?=?( unsigned short int * lhs, Int rhs ); 6059 unsigned int ?=?( unsigned int * lhs, Int rhs ); 6060 unsigned long int ?=?( unsigned long int * lhs, Int rhs ); 6061 6062 long int narrow( Int val ); 6063 unsigned long int narrow( Int val ); 6064 6065 int ?==?( Int oper1, Int oper2 ); §\C{// comparison}§ 6066 int ?==?( Int oper1, long int oper2 ); 6067 int ?==?( long int oper2, Int oper1 ); 6068 int ?==?( Int oper1, unsigned long int oper2 ); 6069 int ?==?( unsigned long int oper2, Int oper1 ); 6070 6071 int ?!=?( Int oper1, Int oper2 ); 6072 int ?!=?( Int oper1, long int oper2 ); 6073 int ?!=?( long int oper1, Int oper2 ); 6074 int ?!=?( Int oper1, unsigned long int oper2 ); 6075 int ?!=?( unsigned long int oper1, Int oper2 ); 6076 6077 int ?<?( Int oper1, Int oper2 ); 6078 int ?<?( Int oper1, long int oper2 ); 6079 int ?<?( long int oper2, Int oper1 ); 6080 int ?<?( Int oper1, unsigned long int oper2 
); 6081 int ?<?( unsigned long int oper2, Int oper1 ); 6082 6083 int ?<=?( Int oper1, Int oper2 ); 6084 int ?<=?( Int oper1, long int oper2 ); 6085 int ?<=?( long int oper2, Int oper1 ); 6086 int ?<=?( Int oper1, unsigned long int oper2 ); 6087 int ?<=?( unsigned long int oper2, Int oper1 ); 6088 6089 int ?>?( Int oper1, Int oper2 ); 6090 int ?>?( Int oper1, long int oper2 ); 6091 int ?>?( long int oper1, Int oper2 ); 6092 int ?>?( Int oper1, unsigned long int oper2 ); 6093 int ?>?( unsigned long int oper1, Int oper2 ); 6094 6095 int ?>=?( Int oper1, Int oper2 ); 6096 int ?>=?( Int oper1, long int oper2 ); 6097 int ?>=?( long int oper1, Int oper2 ); 6098 int ?>=?( Int oper1, unsigned long int oper2 ); 6099 int ?>=?( unsigned long int oper1, Int oper2 ); 6100 6101 Int +?( Int oper ); §\C{// arithmetic}§ 6102 Int -?( Int oper ); 6103 Int ~?( Int oper ); 6104 6105 Int ?&?( Int oper1, Int oper2 ); 6106 Int ?&?( Int oper1, long int oper2 ); 6107 Int ?&?( long int oper1, Int oper2 ); 6108 Int ?&?( Int oper1, unsigned long int oper2 ); 6109 Int ?&?( unsigned long int oper1, Int oper2 ); 6110 Int ?&=?( Int * lhs, Int rhs ); 6111 6112 Int ?|?( Int oper1, Int oper2 ); 6113 Int ?|?( Int oper1, long int oper2 ); 6114 Int ?|?( long int oper1, Int oper2 ); 6115 Int ?|?( Int oper1, unsigned long int oper2 ); 6116 Int ?|?( unsigned long int oper1, Int oper2 ); 6117 Int ?|=?( Int * lhs, Int rhs ); 6118 6119 Int ?^?( Int oper1, Int oper2 ); 6120 Int ?^?( Int oper1, long int oper2 ); 6121 Int ?^?( long int oper1, Int oper2 ); 6122 Int ?^?( Int oper1, unsigned long int oper2 ); 6123 Int ?^?( unsigned long int oper1, Int oper2 ); 6124 Int ?^=?( Int * lhs, Int rhs ); 6125 6126 Int ?+?( Int addend1, Int addend2 ); 6127 Int ?+?( Int addend1, long int addend2 ); 6128 Int ?+?( long int addend2, Int addend1 ); 6129 Int ?+?( Int addend1, unsigned long int addend2 ); 6130 Int ?+?( unsigned long int addend2, Int addend1 ); 6131 Int ?+=?( Int * lhs, Int rhs ); 6132 Int ?+=?( Int * lhs, long int 
rhs ); 6133 Int ?+=?( Int * lhs, unsigned long int rhs ); 6134 Int ++?( Int * lhs ); 6135 Int ?++( Int * lhs ); 6136 6137 Int ?-?( Int minuend, Int subtrahend ); 6138 Int ?-?( Int minuend, long int subtrahend ); 6139 Int ?-?( long int minuend, Int subtrahend ); 6140 Int ?-?( Int minuend, unsigned long int subtrahend ); 6141 Int ?-?( unsigned long int minuend, Int subtrahend ); 6142 Int ?-=?( Int * lhs, Int rhs ); 6143 Int ?-=?( Int * lhs, long int rhs ); 6144 Int ?-=?( Int * lhs, unsigned long int rhs ); 6145 Int --?( Int * lhs ); 6146 Int ?--( Int * lhs ); 6147 6148 Int ?*?( Int multiplicator, Int multiplicand ); 6149 Int ?*?( Int multiplicator, long int multiplicand ); 6150 Int ?*?( long int multiplicand, Int multiplicator ); 6151 Int ?*?( Int multiplicator, unsigned long int multiplicand ); 6152 Int ?*?( unsigned long int multiplicand, Int multiplicator ); 6153 Int ?*=?( Int * lhs, Int rhs ); 6154 Int ?*=?( Int * lhs, long int rhs ); 6155 Int ?*=?( Int * lhs, unsigned long int rhs ); 6156 6157 Int ?/?( Int dividend, Int divisor ); 6158 Int ?/?( Int dividend, unsigned long int divisor ); 6159 Int ?/?( unsigned long int dividend, Int divisor ); 6160 Int ?/?( Int dividend, long int divisor ); 6161 Int ?/?( long int dividend, Int divisor ); 6162 Int ?/=?( Int * lhs, Int rhs ); 6163 Int ?/=?( Int * lhs, long int rhs ); 6164 Int ?/=?( Int * lhs, unsigned long int rhs ); 6165 6166 [ Int, Int ] div( Int dividend, Int divisor ); 6167 [ Int, Int ] div( Int dividend, unsigned long int divisor ); 6168 6169 Int ?%?( Int dividend, Int divisor ); 6170 Int ?%?( Int dividend, unsigned long int divisor ); 6171 Int ?%?( unsigned long int dividend, Int divisor ); 6172 Int ?%?( Int dividend, long int divisor ); 6173 Int ?%?( long int dividend, Int divisor ); 6174 Int ?%=?( Int * lhs, Int rhs ); 6175 Int ?%=?( Int * lhs, long int rhs ); 6176 Int ?%=?( Int * lhs, unsigned long int rhs ); 6177 6178 Int ?<<?( Int shiften, mp_bitcnt_t shift ); 6179 Int ?<<=?( Int * lhs, mp_bitcnt_t shift 
); 6180 Int ?>>?( Int shiften, mp_bitcnt_t shift ); 6181 Int ?>>=?( Int * lhs, mp_bitcnt_t shift ); 6182 6183 Int abs( Int oper ); §\C{// number functions}§ 6184 Int fact( unsigned long int N ); 6185 Int gcd( Int oper1, Int oper2 ); 6186 Int pow( Int base, unsigned long int exponent ); 6187 Int pow( unsigned long int base, unsigned long int exponent ); 6188 void srandom( gmp_randstate_t state ); 6189 Int random( gmp_randstate_t state, mp_bitcnt_t n ); 6190 Int random( gmp_randstate_t state, Int n ); 6191 Int random( gmp_randstate_t state, mp_size_t max_size ); 6192 int sgn( Int oper ); 6193 Int sqrt( Int oper ); 6194 6195 forall( dtype istype | istream( istype ) ) istype * ?|?( istype * is, Int * mp ); §\C{// I/O}§ 6196 forall( dtype ostype | ostream( ostype ) ) ostype * ?|?( ostype * os, Int mp ); 6197 \end{cfa} 6198 6199 The following factorial programs contrast using GMP with the \CFA and C interfaces, where the output from these programs appears in \VRef[Figure]{f:MultiPrecisionFactorials}. 6200 (Compile with flag \Indexc{-lgmp} to link with the GMP library.) 
6201 \begin{quote2} 6202 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}|@{\hspace{\parindentlnth}}l@{}} 6203 \multicolumn{1}{c|@{\hspace{\parindentlnth}}}{\textbf{\CFA}} & \multicolumn{1}{@{\hspace{\parindentlnth}}c}{\textbf{C}} \\ 6204 \hline 6205 \begin{cfa} 6206 #include <gmp>§\indexc{gmp}§ 6207 int main( void ) { 6208 sout | "Factorial Numbers" | endl; 6209 Int fact = 1; 6210 6211 sout | 0 | fact | endl; 6212 for ( unsigned int i = 1; i <= 40; i += 1 ) { 6213 fact *= i; 6214 sout | i | fact | endl; 6215 } 6216 } 6217 \end{cfa} 6218 & 6219 \begin{cfa} 6220 #include <gmp.h>§\indexc{gmp.h}§ 6221 int main( void ) { 6222 ®gmp_printf®( "Factorial Numbers\n" ); 6223 ®mpz_t® fact; 6224 ®mpz_init_set_ui®( fact, 1 ); 6225 ®gmp_printf®( "%d %Zd\n", 0, fact ); 6226 for ( unsigned int i = 1; i <= 40; i += 1 ) { 6227 ®mpz_mul_ui®( fact, fact, i ); 6228 ®gmp_printf®( "%d %Zd\n", i, fact ); 6229 } 6230 } 6231 \end{cfa} 6232 \end{tabular} 6233 \end{quote2} 6234 6235 \begin{figure} 6236 \begin{cfa} 6237 Factorial Numbers 6238 0 1 6239 1 1 6240 2 2 6241 3 6 6242 4 24 6243 5 120 6244 6 720 6245 7 5040 6246 8 40320 6247 9 362880 6248 10 3628800 6249 11 39916800 6250 12 479001600 6251 13 6227020800 6252 14 87178291200 6253 15 1307674368000 6254 16 20922789888000 6255 17 355687428096000 6256 18 6402373705728000 6257 19 121645100408832000 6258 20 2432902008176640000 6259 21 51090942171709440000 6260 22 1124000727777607680000 6261 23 25852016738884976640000 6262 24 620448401733239439360000 6263 25 15511210043330985984000000 6264 26 403291461126605635584000000 6265 27 10888869450418352160768000000 6266 28 304888344611713860501504000000 6267 29 8841761993739701954543616000000 6268 30 265252859812191058636308480000000 6269 31 8222838654177922817725562880000000 6270 32 263130836933693530167218012160000000 6271 33 8683317618811886495518194401280000000 6272 34 295232799039604140847618609643520000000 6273 35 10333147966386144929666651337523200000000 6274 36 
371993326789901217467999448150835200000000 6275 37 13763753091226345046315979581580902400000000 6276 38 523022617466601111760007224100074291200000000 6277 39 20397882081197443358640281739902897356800000000 6278 40 815915283247897734345611269596115894272000000000 6279 \end{cfa} 6280 \caption{Multi-precision Factorials} 6281 \label{f:MultiPrecisionFactorials} 6282 \end{figure} 6283 6284 5380 6285 \section{Rational Numbers} 5381 6286 \label{s:RationalNumbers} … … 5390 6295 }; // Rational 5391 6296 5392 // constants 5393 extern struct Rational 0; 5394 extern struct Rational 1; 5395 5396 // constructors 5397 Rational rational(); 6297 Rational rational(); §\C{// constructors}§ 5398 6298 Rational rational( long int n ); 5399 6299 Rational rational( long int n, long int d ); 5400 5401 // getter/setter for numerator/denominator 5402 long int numerator( Rational r ); 6300 void ?{}( Rational * r, zero_t ); 6301 void ?{}( Rational * r, one_t ); 6302 6303 long int numerator( Rational r ); §\C{// numerator/denominator getter/setter}§ 5403 6304 long int numerator( Rational r, long int n ); 5404 6305 long int denominator( Rational r ); 5405 6306 long int denominator( Rational r, long int d ); 5406 6307 5407 // comparison 5408 int ?==?( Rational l, Rational r ); 6308 int ?==?( Rational l, Rational r ); §\C{// comparison}§ 5409 6309 int ?!=?( Rational l, Rational r ); 5410 6310 int ?<?( Rational l, Rational r ); … … 5413 6313 int ?>=?( Rational l, Rational r ); 5414 6314 5415 // arithmetic 5416 Rational -?( Rational r ); 6315 Rational -?( Rational r ); §\C{// arithmetic}§ 5417 6316 Rational ?+?( Rational l, Rational r ); 5418 6317 Rational ?-?( Rational l, Rational r ); … … 5420 6319 Rational ?/?( Rational l, Rational r ); 5421 6320 5422 // conversion 5423 double widen( Rational r ); 6321 double widen( Rational r ); §\C{// conversion}§ 5424 6322 Rational narrow( double f, long int md ); 5425 6323 5426 // I/O 5427 forall( dtype istype | istream( istype ) ) istype * ?|?( istype *, 
Rational * ); 6324 forall( dtype istype | istream( istype ) ) istype * ?|?( istype *, Rational * ); // I/O 5428 6325 forall( dtype ostype | ostream( ostype ) ) ostype * ?|?( ostype *, Rational ); 5429 6326 \end{cfa}
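To make the shape of the rational-number interface above more concrete, the following is a minimal C sketch of a rational type with a normalizing constructor, addition, equality, and ©widen©. It is not the \CFA library code: the field names, the helper names ©rat_add© and ©rat_equal©, and the choice to store values in lowest terms with a positive denominator are all assumptions made for illustration, and overflow of ©long int© is ignored.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed representation: field names are hypothetical. */
typedef struct { long int numerator, denominator; } Rational;

/* Greatest common divisor of |a| and |b| using Euclid's algorithm. */
static long int gcd( long int a, long int b ) {
    a = labs( a ); b = labs( b );
    while ( b != 0 ) { long int t = a % b; a = b; b = t; }
    return a;
}

/* Assumed normalization: lowest terms, denominator always positive. */
Rational rational( long int n, long int d ) {
    assert( d != 0 );
    if ( d < 0 ) { n = -n; d = -d; }
    long int g = gcd( n, d );                   /* g >= 1 because d != 0 */
    Rational r = { n / g, d / g };
    return r;
}

/* Hypothetical stand-in for the ?+? operator. */
Rational rat_add( Rational l, Rational r ) {
    return rational( l.numerator * r.denominator + r.numerator * l.denominator,
                     l.denominator * r.denominator );
}

/* Hypothetical stand-in for the ?==? operator; valid because values are normalized. */
int rat_equal( Rational l, Rational r ) {
    return l.numerator == r.numerator && l.denominator == r.denominator;
}

/* Conversion to floating point, mirroring the widen routine in the interface. */
double widen( Rational r ) {
    return (double)r.numerator / (double)r.denominator;
}
```

Normalizing in the constructor keeps equality a simple field-by-field comparison; in \CFA the named helpers would instead be written as overloaded operators, as in the interface listing.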