Changeset 0642216


Timestamp:
May 17, 2017, 11:00:30 PM (5 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
8b7124e
Parents:
0213af6
Message:

rewrite Pointer/Reference section

Location:
doc/user
Files:
2 edited

  • doc/user/pointer2.fig

    r0213af6 r0642216  
    88-2
    991200 2
     106 1125 2100 3525 2400
    10112 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5
    11          1500 1950 1950 1950 1950 2250 1500 2250 1500 1950
     12         1500 2100 1950 2100 1950 2400 1500 2400 1500 2100
     132 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5
     14         2700 2100 3150 2100 3150 2400 2700 2400 2700 2100
     154 2 0 100 0 4 10 0.0000 2 120 270 1425 2400 104\001
     164 2 0 100 0 4 10 0.0000 2 120 90 1425 2225 y\001
     174 0 0 100 0 4 10 0.0000 2 120 165 2025 2300 int\001
     184 2 0 100 0 4 10 0.0000 2 120 270 2625 2400 112\001
     194 2 0 100 0 4 10 0.0000 2 150 180 2625 2225 p2\001
     204 1 0 100 0 4 10 0.0000 2 120 90 1725 2300 3\001
     214 0 0 100 0 4 10 0.0000 2 120 270 3225 2300 int *\001
     224 1 0 100 0 4 10 0.0000 2 120 270 2925 2300 100\001
     23-6
    12242 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5
    1325         1500 1500 1950 1500 1950 1800 1500 1800 1500 1500
    14262 1 0 1 4 7 100 -1 -1 0.000 0 0 -1 1 0 2
    1527        1 1 1.00 45.00 90.00
    16          2700 1800 1950 1950
     28         2700 1800 1950 2100
    17292 1 0 1 4 7 50 -1 -1 0.000 0 0 -1 1 0 2
    1830        1 1 1.00 45.00 90.00
    19          2700 1950 1950 1800
    20 2 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5
    21          2700 1950 3150 1950 3150 2250 2700 2250 2700 1950
     31         2700 2100 1950 1800
    22322 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5
    2333         2700 1500 3150 1500 3150 1800 2700 1800 2700 1500
    24342 1 0 1 4 7 100 -1 -1 0.000 0 0 -1 1 0 2
    2535        1 1 1.00 45.00 90.00
    26          3900 1800 3150 1950
     36         3900 1800 3150 2100
    27372 2 0 1 0 7 100 0 -1 0.000 0 0 -1 0 0 5
    2838         3900 1500 4350 1500 4350 1800 3900 1800 3900 1500
    29 4 2 0 100 0 4 10 0.0000 2 120 270 1425 2250 104\001
    30394 2 0 100 0 4 10 0.0000 2 120 270 1425 1800 100\001
    31404 2 0 100 0 4 10 0.0000 2 90 90 1425 1625 x\001
    32 4 2 0 100 0 4 10 0.0000 2 120 90 1425 2075 y\001
    33 4 0 0 100 0 4 10 0.0000 2 120 165 2025 2150 int\001
    34414 0 0 100 0 4 10 0.0000 2 120 165 2025 1700 int\001
    35 4 2 0 100 0 4 10 0.0000 2 120 270 2625 2250 112\001
    36 4 2 0 100 0 4 10 0.0000 2 150 180 2625 2075 p2\001
    37424 2 0 100 0 4 10 0.0000 2 120 270 2625 1800 108\001
    38434 2 0 100 0 4 10 0.0000 2 150 180 2625 1625 p1\001
    39 4 1 0 100 0 4 10 0.0000 2 120 90 1725 2150 3\001
    40444 1 0 100 0 4 10 0.0000 2 120 90 1725 1700 3\001
    41 4 0 0 100 0 4 10 0.0000 2 120 270 3225 2150 int *\001
    42454 0 0 100 0 4 10 0.0000 2 120 270 3225 1700 int *\001
    43464 2 0 100 0 4 10 0.0000 2 120 270 3825 1800 116\001
    44474 2 0 100 0 4 10 0.0000 2 150 180 3825 1625 p3\001
    45 4 1 0 100 0 4 10 0.0000 2 120 270 2925 2150 100\001
    46484 1 0 100 0 4 10 0.0000 2 120 270 2925 1700 104\001
    47494 1 0 100 0 4 10 0.0000 2 120 270 4125 1700 112\001
  • doc/user/user.tex

    r0213af6 r0642216  
    1111%% Created On       : Wed Apr  6 14:53:29 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Mon May 15 18:29:58 2017
    14 %% Update Count     : 1598
     13%% Last Modified On : Wed May 17 22:42:11 2017
     14%% Update Count     : 1685
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1616
     
    9494\author{
    9595\huge \CFA Team \medskip \\
    96 \Large Peter A. Buhr, Richard Bilson, Thierry Delisle, \smallskip \\
     96\Large Richard Bilson, Peter A. Buhr, Thierry Delisle, \smallskip \\
    9797\Large Glen Ditchfield, Rodolfo G. Esteves, Aaron Moss, Rob Schluntz
    9898}% author
     
    217217
    218218As stated, the goal of the \CFA project is to engineer modern language features into C in an evolutionary rather than revolutionary way.
    219 \CC~\cite{c++,ANSI14:C++} is an example of a similar project;
     219\CC~\cite{C++14,C++} is an example of a similar project;
    220220however, it largely extended the language, and did not address many existing problems.\footnote{%
    221221Two important existing problems addressed were changing the type of character literals from ©int© to ©char© and enumerator from ©int© to the type of its enumerators.}
     
    514514The new declarations place qualifiers to the left of the base type, while C declarations place qualifiers to the right of the base type.
    515515In the following example, \R{red} is for the base type and \B{blue} is for the qualifiers.
    516 The \CFA declarations move the qualifiers to the left of the base type, i.e., move the blue to the left of the red, while the qualifiers have the same meaning but are ordered left to right to specify a variable's type.
     516The \CFA declarations move the qualifiers to the left of the base type, \ie move the blue to the left of the red, while the qualifiers have the same meaning but are ordered left to right to specify a variable's type.
    517517\begin{quote2}
    518518\begin{tabular}{@{}l@{\hspace{3em}}l@{}}
     
    659659
    660660
    661 \section{Pointer / Reference}
     661\section{Pointer/Reference}
    662662
    663663C provides a \newterm{pointer type};
    664664\CFA adds a \newterm{reference type}.
    665 Both types contain an \newterm{address}, which is normally a location in memory.
    666 Special addresses are used to denote certain states or access co-processor memory.
    667 By convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value or other special states.
    668 Often dereferencing a special state causes a \Index{memory fault}, so checking is necessary during execution.
    669 If the programming language assigns addresses, a program's execution is \Index{sound}, i.e., all addresses are to valid memory locations.
    670 C allows programmers to assign addresses, so there is the potential for incorrect addresses, both inside and outside of the computer address-space.
    671 
    672 Program variables are implicit pointers to memory locations generated by the compiler and automatically dereferenced, as in:
     665These types may be derived from an object or routine type, called the \newterm{referenced type}.
     666Objects of these types contain an \newterm{address}, which is normally a location in memory, but may also address memory-mapped registers in hardware devices.
     667An integer constant expression with the value 0, or such an expression cast to type ©void *©, is called a \newterm{null-pointer constant}.\footnote{
     668One way to conceptualize the null pointer is that no variable is placed at this address, so the null-pointer address can be used to denote an uninitialized pointer/reference object;
     669\ie the null pointer is guaranteed to compare unequal to a pointer to any object or routine.}
     670An address is \newterm{sound}, if it points to a valid memory location in scope, \ie has not been freed.
     671Dereferencing an \newterm{unsound} address, including the null pointer, is \Index{undefined}, often resulting in a \Index{memory fault}.
     672
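The null-pointer definitions above can be sketched in plain C (not \CFA). This is only an illustration: the function name is hypothetical, and the check relies solely on standard C guarantees, namely that a null-pointer constant converts to a null pointer and that a null pointer compares unequal to a pointer to any object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when the null-pointer guarantees hold. */
int null_pointer_is_distinct(void) {
    int x = 0;
    int *p = 0;              /* integer constant 0 converts to a null pointer */
    void *q = (void *)0;     /* the same constant cast to void * */
    assert(p == NULL && q == NULL);
    return p != &x;          /* a null pointer never equals the address of an object */
}
```

Dereferencing `p` here would be undefined, which is why the sketch only compares addresses.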
     673A program \newterm{object} is a region of data storage in the execution environment, the contents of which can represent values.
     674In most cases, objects are located in memory at an address, and the variable name for an object is an implicit address to the object generated by the compiler and automatically dereferenced, as in:
    673675\begin{quote2}
    674 \begin{tabular}{@{}lll@{}}
     676\begin{tabular}{@{}ll@{\hspace{2em}}l@{}}
    675677\begin{cfa}
    676678int x;
     
    691693\end{quote2}
    692694where the right example is how the compiler logically interprets the variables in the left example.
    693 Since a variable name only points to one location during its lifetime, it is an \Index{immutable} \Index{pointer};
    694 hence, variables ©x© and ©y© are constant pointers in the compiler interpretation.
    695 In general, variable addresses are stored in instructions instead of loaded independently, so an instruction fetch implicitly loads a variable's address.
     695Since a variable name only points to one address during its lifetime, it is an \Index{immutable} \Index{pointer};
     696hence, the implicit types of variables ©x© and ©y© are constant pointers in the compiler interpretation.
     697In general, variable addresses are stored in instructions instead of loaded from memory, and hence may not occupy storage.
     698These approaches are contrasted in the following:
    696699\begin{quote2}
    697700\begin{tabular}{@{}l|l@{}}
     701\multicolumn{1}{c|}{explicit variable address} & \multicolumn{1}{c}{implicit variable address} \\
     702\hline
    698703\begin{cfa}
    699704lda             r1,100                  // load address of x
    700 ld              r2,(r1)                   // load value of x
     705ld               r2,(r1)                  // load value of x
    701706lda             r3,104                  // load address of y
    702 st              r2,(r3)                   // store x into y
     707st               r2,(r3)                  // store x into y
    703708\end{cfa}
    704709&
     
    711716\end{tabular}
    712717\end{quote2}
    713 Finally, the immutable nature of a variable's address and the fact that there is no storage for a variable address means pointer assignment\index{pointer!assignment}\index{assignment!pointer} is impossible.
    714 Therefore, the expression ©x = y© only has one meaning, ©*x = *y©, i.e., manipulate values, which is why explicitly writing the dereferences is unnecessary even though it occurs implicitly as part of instruction decoding.
    715 
    716 A \Index{pointer}/\Index{reference} is a generalization of a variable name, i.e., a mutable address that can point to more than one memory location during its lifetime.
    717 (Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime and may not occupy storage as the literal is embedded directly into instructions.)
     718Finally, the immutable nature of a variable's address and the fact that there is no storage for the variable pointer means pointer assignment\index{pointer!assignment}\index{assignment!pointer} is impossible.
     719Therefore, the expression ©x = y© has only one meaning, ©*x = *y©, \ie manipulate values, which is why explicitly writing the dereferences is unnecessary even though it occurs implicitly as part of instruction decoding.
     720
     721A \Index{pointer}/\Index{reference} object is a generalization of an object variable-name, \ie a mutable address that can point to more than one memory location during its lifetime.
     722(Similarly, an integer variable can contain multiple integer literals during its lifetime versus an integer constant representing a single literal during its lifetime and, like a variable name, may not occupy storage as the literal is embedded directly into instructions.)
    718723Hence, a pointer occupies memory to store its current address, and the pointer's value is loaded by dereferencing, \eg:
    719724\begin{quote2}
    720 \begin{tabular}{@{}ll@{}}
     725\begin{tabular}{@{}l@{\hspace{2em}}l@{}}
    721726\begin{cfa}
    722727int x, y, ®*® p1, ®*® p2, ®**® p3;
     
    727732\end{cfa}
    728733&
    729 \raisebox{-0.45\totalheight}{\input{pointer2.pstex_t}}
     734\raisebox{-0.5\totalheight}{\input{pointer2.pstex_t}}
    730735\end{tabular}
    731736\end{quote2}
    732737
    733738Notice, an address has a duality\index{address!duality}: a location in memory or the value at that location.
    734 In many cases, a compiler might be able to infer the meaning:
     739In many cases, a compiler might be able to infer the best meaning for these two cases.
     740For example, \Index*{Algol68}~\cite{Algol68} infers pointer dereferencing to select the best meaning for each pointer usage:
    735741\begin{cfa}
    736742p2 = p1 + x;                                    §\C{// compiler infers *p2 = *p1 + x;}§
    737743\end{cfa}
    738 because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation.
    739 \Index*{Algol68}~\cite{Algol68} inferences pointer dereferencing to select the best meaning for each pointer usage.
    740 However, in C, the following cases are ambiguous, especially with pointer arithmetic:
    741 \begin{cfa}
    742 p1 = p2;                                                §\C{// p1 = p2\ \ or\ \ *p1 = *p2}§
    743 p1 = p1 + 1;                                    §\C{// p1 = p1 + 1\ \ or\ \ *p1 = *p1 + 1}§
    744 \end{cfa}
    745 
    746 Most languages pick one meaning as the default and the programmer explicitly indicates the other meaning to resolve the address-duality ambiguity\index{address! ambiguity}.
    747 In C, the default meaning for pointers is to manipulate the pointer's address and the pointed-to value is explicitly accessed by the dereference operator ©*©.
     744Algol68 infers the following dereferencing ©*p2 = *p1 + x©, because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation.
     745Unfortunately, automatic dereferencing does not work in all cases, and so some mechanism is necessary to fix incorrect choices.
     746
     747Rather than dereference inferencing, most programming languages pick one implicit dereferencing semantics, and the programmer explicitly indicates the other to resolve address-duality.
     748In C, objects of pointer type always manipulate the pointer object's address:
     749\begin{cfa}
     750p1 = p2;                                                §\C{// p1 = p2\ \ rather than\ \ *p1 = *p2}§
     751p2 = p1 + x;                                    §\C{// p2 = p1 + x\ \ rather than\ \ *p2 = *p1 + x}§
     752\end{cfa}
     753even though the assignment to ©p2© is likely incorrect, and the programmer probably meant:
    748754\begin{cfa}
    749755p1 = p2;                                                §\C{// pointer address assignment}§
    750 *p1 = *p1 + 1;                                  §\C{// pointed-to value assignment / operation}§
    751 \end{cfa}
    752 which works well for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©).
     756®*®p2 = ®*®p1 + x;                              §\C{// pointed-to value assignment / operation}§
     757\end{cfa}
     758The C semantics works well for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©).
    753759
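The difference between the two meanings in C can be made concrete with a small sketch. It is written in plain C rather than \CFA, and the function names are hypothetical; the variable names follow the examples above.

```c
#include <assert.h>

/* Pointer assignment: copies the address held in p2 into p1. */
int address_assignment(void) {
    int x = 3, y = 4;
    int *p1 = &x, *p2 = &y;
    p1 = p2;                          /* p1 now points at y; x and y are unchanged */
    assert(x == 3 && y == 4);
    return p1 == &y;
}

/* Value assignment: explicit dereferences manipulate the pointed-to values. */
int value_assignment(void) {
    int x = 3, y = 4;
    int *p1 = &x, *p2 = &y;
    *p2 = *p1 + x;                    /* y = x + x; both pointers keep their targets */
    assert(p1 == &x && p2 == &y);
    return y;
}
```

The first routine changes only where `p1` points, while the second changes only the value stored in `y`.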
    754760However, in most other situations, the pointed-to value is requested more often than the pointer address.
     
    762768\end{cfa}
    763769
    764 To switch the default meaning for an address requires a new kind of pointer, called a \newterm{reference} denoted by ©&©.
     770To support this common case, a reference type is introduced in \CFA, denoted by ©&©, which has the opposite dereference semantics to a pointer type, making the value at the pointed-to location the implicit semantics for dereferencing.
    765771\begin{cfa}
    766772int x, y, ®&® r1, ®&® r2, ®&&® r3;
     
    773779Except for auto-dereferencing by the compiler, this reference example is the same as the previous pointer example.
    774780Hence, a reference behaves like the variable name for the current variable it is pointing-to.
    775 The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, \eg:
    776 \begin{cfa}
    777 r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15);
    778 \end{cfa}
    779 is rewritten as:
     781One way to conceptualize a reference is via a rewrite rule, where the compiler inserts a dereference operator before the reference variable for each reference qualifier in a declaration, so the previous example becomes:
    780782\begin{cfa}
    781783®*®r2 = ((®*®r1 + ®*®r2) ®*® (®**®r3 - ®*®r1)) / (®**®r3 - 15);
     
    793795(&(&®*®)®*®)r3 = &(&®*®)r2;             §\C{// (\&*) cancel giving address of r2, (\&(\&*)*) cancel giving variable r3}§
    794796\end{cfa}
    795 Cancellation\index{cancellation!pointer/reference}\index{pointer!cancellation} works to arbitrary depth, and pointer and reference values are interchangeable because both contain addresses.
     797Cancellation\index{cancellation!pointer/reference}\index{pointer!cancellation} works to arbitrary depth.
     798
     799Fundamentally, pointer and reference objects are functionally interchangeable because both contain addresses.
    796800\begin{cfa}
    797801int x, *p1 = &x, **p2 = &p1, ***p3 = &p2,
     
    805809&&&r3 = p3;                                             §\C{// change r3 to p3, (\&(\&(\&*)*)*)r3, 3 cancellations}§
    806810\end{cfa}
    807 Finally, implicit dereferencing and cancellation are a static (compilation) phenomenon not a dynamic one.
    808 That is, all implicit dereferencing and any cancellation is carried out prior to the start of the program, so reference performance is equivalent to pointer performance.
    809 A programmer selects a pointer or reference type solely on whether the address is dereferenced frequently or infrequently, which dictates the amount of direct aid from the compiler;
    810 otherwise, everything else is equal.
    811 
    812 Interestingly, \Index*[C++]{\CC} deals with the address duality by making the pointed-to value the default, and prevent\-ing changes to the reference address, which eliminates half of the duality.
    813 \Index*{Java} deals with the address duality by making address assignment the default and requiring field assignment (direct or indirect via methods), i.e., there is no builtin bit-wise or method-wise assignment, which eliminates half of the duality.
    814 
    815 As for a pointer, a reference may have qualifiers:
     811Furthermore, both types are equally performant, as the same amount of dereferencing occurs for both types.
     812Therefore, the choice between them is based solely on whether the address is dereferenced frequently or infrequently, which dictates the amount of dereferencing aid from the compiler.
     813
     814As for a pointer type, a reference type may have qualifiers:
    816815\begin{cfa}
    817816const int cx = 5;                               §\C{// cannot change cx;}§
     
    819818®&®cr = &cx;                                    §\C{// can change cr}§
    820819cr = 7;                                                 §\C{// error, cannot change cx}§
    821 int & const rc = x;                             §\C{// must be initialized, \CC reference}§
     820int & const rc = x;                             §\C{// must be initialized}§
    822821®&®rc = &x;                                             §\C{// error, cannot change rc}§
    823 const int & const crc = cx;             §\C{// must be initialized, \CC reference}§
     822const int & const crc = cx;             §\C{// must be initialized}§
    824823crc = 7;                                                §\C{// error, cannot change cx}§
    825824®&®crc = &cx;                                   §\C{// error, cannot change crc}§
    826825\end{cfa}
    827 Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be ©0© unless an arbitrary pointer is assigned to the reference}, \eg:
    828 \begin{cfa}
    829 int & const r = *0;                             §\C{// where 0 is the int * zero}§
    830 \end{cfa}
    831 Otherwise, the compiler is managing the addresses for type ©& const© not the programmer, and by a programming discipline of only using references with references, address errors can be prevented.
     826Hence, for type ©& const©, there is no pointer assignment, so ©&rc = &x© is disallowed, and \emph{the address value cannot be the null pointer unless an arbitrary pointer is coerced into the reference}:
     827\begin{cfa}
     828int & const cr = *0;                    §\C{// where 0 is the int * zero}§
     829\end{cfa}
     830Note, constant reference types do not prevent addressing errors because of explicit storage-management:
     831\begin{cfa}
     832int & const cr = *malloc();
     833delete &cr;
     834cr = 7;                                                 §\C{// unsound pointer dereference}§
     835\end{cfa}
     836
    832837Finally, the position of the ©const© qualifier \emph{after} the pointer/reference qualifier causes confusion for C programmers.
    833838The ©const© qualifier cannot be moved before the pointer/reference qualifier for C style-declarations;
     
    849854where the \CFA declaration is read left-to-right (see \VRef{s:Declarations}).
    850855
     856In contrast to \CFA reference types, \Index*[C++]{\CC{}}'s reference types are all ©const© references, preventing changes to the reference address, so only value assignment is possible, which eliminates half of the \Index{address duality}.
     857\Index*{Java}'s reference types to objects (because all Java objects are on the heap) are like C pointers, which always manipulate the address and there is no (bit-wise) object assignment, so objects are explicitly cloned by shallow or deep copying, which eliminates half of the address duality.
     858
    851859\Index{Initialization} is different than \Index{assignment} because initialization occurs on the empty (uninitialized) storage of an object, while assignment occurs on possibly initialized storage of an object.
    852860There are three initialization contexts in \CFA: declaration initialization, argument/parameter binding, return/temporary binding.
    853861For reference initialization (like pointer), the initializing value must be an address (\Index{lvalue}) not a value (\Index{rvalue}).
    854862\begin{cfa}
    855 int * p = &x;                                   §\C{// both \&x and x are possible interpretations}§
     863int * p = &x;                                   §\C{// both \&x and x are possible interpretations in C}§
    856864int & r = x;                                    §\C{// x unlikely interpretation, because of auto-dereferencing}§
    857865\end{cfa}
    858 Hence, the compiler implicitly inserts a reference operator, ©&©, before the initialization expression.
    859 Similarly, when a reference is used for a parameter/return type, the call-site argument does not require a reference operator.
     866C allows ©p© to be assigned with ©&x© or ©x© (many compilers warn about the latter assignment).
     867\CFA allows ©r© to be assigned ©x© only because it infers a dereference for ©x©, by implicitly inserting an address-of operator, ©&©, before the initialization expression because a reference behaves like the variable name it is pointing-to.
     868Similarly, when a reference is used for a parameter/return type, the call-site argument does not require a reference operator for the same reason.
    860869\begin{cfa}
    861870int & f( int & rp );                    §\C{// reference parameter and return}§
     
    863872\end{cfa}
    864873Within routine ©f©, it is possible to change the argument by changing the corresponding parameter, and parameter ©rp© can be locally reassigned within ©f©.
    865 Since ©?+?© takes its arguments by value, the references returned from ©f© are used to initialize compiler generated temporaries with value semantics that copy from the references.
     874Since operator routine ©?+?© takes its arguments by value, the references returned from ©f© are used to initialize compiler generated temporaries with value semantics that copy from the references.
    866875
    867876When a pointer/reference parameter has a ©const© value (immutable), it is possible to pass literals and expressions.
     
    873882\end{cfa}
    874883Here, the compiler passes the address to the literal 3 or the temporary for the expression ©x + y©, knowing the argument cannot be changed through the parameter.
    875 (The ©&© is necessary for the pointer parameter to make the types match, and is a common requirement for a C programmer.)
     884(The ©&© is necessary for the pointer-type parameter to make the types match, and is a common requirement for a C programmer.)
    876885\CFA \emph{extends} this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed-to by the reference parameter and can be changed.
    877886\begin{cfa}
     
    885894The implicit conversion allows seamless calls to any routine without having to explicitly name/copy the literal/expression to allow the call.
    886895
    887 While \CFA attempts to handle pointers and references in a uniform, symmetric manner, C handles routine variables in an inconsistent way: a routine variable is both a pointer and a reference (particle and wave).
     896While \CFA attempts to handle pointers and references in a uniform, symmetric manner, C handles routine objects in an inconsistent way: a routine object is both a pointer and a reference (particle and wave).
    888897\begin{cfa}
    889898void f( int p ) {...}
     
    893902fp(3);                                                  §\C{// reference invocation}§
    894903\end{cfa}
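The dual nature of a routine object can be checked in plain C with the following sketch. The routine `twice` and the wrapper `call_both_ways` are hypothetical names chosen for illustration; the two call forms are standard C.

```c
#include <assert.h>

static int twice(int p) { return 2 * p; }

/* Calls the same routine through both invocation forms and returns the result. */
int call_both_ways(void) {
    int (*fp)(int) = twice;   /* the routine name converts to a pointer to the routine */
    int a = (*fp)(3);         /* pointer invocation: explicit dereference */
    int b = fp(3);            /* reference invocation: no dereference written */
    assert(a == b);           /* both forms call the same routine */
    return a;
}
```

Both forms compile to the same call, which is the inconsistency the surrounding text describes.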
    895 A routine variable is best described by a ©const© reference:
     904A routine object is best described by a ©const© reference:
    896905\begin{cfa}
    897906const void (&fp)( int ) = f;
     
    900909&fp = ...;                                              §\C{// changing routine reference}§
    901910\end{cfa}
    902 because the value of the routine variable is a routine literal, i.e., the routine code is normally immutable during execution.\footnote{
     911because the value of the routine object is a routine literal, \ie the routine code is normally immutable during execution.\footnote{
    903912Dynamic code rewriting is possible but only in special circumstances.}
    904 \CFA allows this additional use of references for routine variables in an attempt to give a more consistent meaning for them.
     913\CFA allows this additional use of references for routine objects in an attempt to give a more consistent meaning for them.
     914
     915
     916\begin{comment}
     917\section{References}
     918
     919By introducing references in parameter types, users are given an easy way to pass a value by reference, without the need for NULL pointer checks.
     920In structures, a reference can replace a pointer to an object that should always have a valid value.
     921When a structure contains a reference, all of its constructors must initialize the reference and all instances of this structure must initialize it upon definition.
     922
     923The syntax for using references in \CFA is the same as \CC with the exception of reference initialization.
     924Use ©&© to specify a reference, and access references just like regular objects, not like pointers (use dot notation to access fields).
     925When initializing a reference, \CFA uses a different syntax which differentiates reference initialization from assignment to a reference.
     926The ©&© is used on both sides of the expression to clarify that the address of the reference is being set to the address of the variable to which it refers.
     927
     928
     929From: Richard Bilson <rcbilson@gmail.com>
     930Date: Wed, 13 Jul 2016 01:58:58 +0000
     931Subject: Re: pointers / references
     932To: "Peter A. Buhr" <pabuhr@plg2.cs.uwaterloo.ca>
     933
     934As a general comment I would say that I found the section confusing, as you move back and forth
     935between various real and imagined programming languages. If it were me I would rewrite into two
     936subsections, one that specifies precisely the syntax and semantics of reference variables and
     937another that provides the rationale.
     938
     939I don't see any obvious problems with the syntax or semantics so far as I understand them. It's not
     940obvious that the description you're giving is complete, but I'm sure you'll find the special cases
     941as you do the implementation.
     942
     943My big gripes are mostly that you're not being as precise as you need to be in your terminology, and
     944that you say a few things that aren't actually true even though I generally know what you mean.
     945
     94620 C provides a pointer type; CFA adds a reference type. Both types contain an address, which is normally a
     94721 location in memory.
     948
     949An address is not a location in memory; an address refers to a location in memory. Furthermore it
     950seems weird to me to say that a type "contains" an address; rather, objects of that type do.
     951
     95221 Special addresses are used to denote certain states or access co-processor memory. By
     95322 convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value
     95423 or other special states.
     955
     956This isn't standard C at all. There has to be one null pointer representation, but it doesn't have
     957to be a literal zero representation and there doesn't have to be more than one such representation.
     958
     95923 Often dereferencing a special state causes a memory fault, so checking is necessary
     96024 during execution.
     961
     962I don't see the connection between the two clauses here. I feel like if a bad pointer will not cause
     963a memory fault then I need to do more checking, not less.
     964
     96524 If the programming language assigns addresses, a program's execution is sound, \ie all
     96625 addresses are to valid memory locations.
     967
     968You haven't said what it means to "assign" an address, but if I use my intuitive understanding of
     969the term I don't see how this can be true unless you're assuming automatic storage management.
     970
     9711 Program variables are implicit pointers to memory locations generated by the compiler and automatically
     9722 dereferenced, as in:
     973
     974There is no reason why a variable needs to have a location in memory, and indeed in a typical
     975program many variables will not. In standard terminology an object identifier refers to data in the
     976execution environment, but not necessarily in memory.
     977
     97813 A pointer/reference is a generalization of a variable name, \ie a mutable address that can point to more
     97914 than one memory location during its lifetime.
     980
     981I feel like you're off the reservation here. In my world there are objects of pointer type, which
     982seem to be what you're describing here, but also pointer values, which can be stored in an object of
     983pointer type but don't necessarily have to be. For example, how would you describe the value denoted
     984by "&main" in a C program? I would call it a (function) pointer, but that doesn't satisfy your
     985definition.
     986
     98716 not occupy storage as the literal is embedded directly into instructions.) Hence, a pointer occupies memory
     98817 to store its current address, and the pointer's value is loaded by dereferencing, e.g.:
     989
     990As with my general objection regarding your definition of variables, there is no reason why a
     991pointer variable (object of pointer type) needs to occupy memory.
     992
     99321 p2 = p1 + x; // compiler infers *p2 = *p1 + x;
     994
     995What language are we in now?
     996
     99724 pointer usage. However, in C, the following cases are ambiguous, especially with pointer arithmetic:
     99825 p1 = p2; // p1 = p2 or *p1 = *p2
     999
     1000This isn't ambiguous. It's defined to be the first option.
     1001
     100226 p1 = p1 + 1; // p1 = p1 + 1 or *p1 = *p1 + 1
     1003
     1004Again, this statement is not ambiguous.
     1005
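
A minimal C sketch of the point made in the two comments above, assuming standard C semantics. The helper names are hypothetical and exist only for illustration. It shows that "p1 = p2" and "p1 = p1 + 1" act on the addresses, and that changing the pointed-to values requires an explicit dereference.

```c
#include <assert.h>

/* Hypothetical helper: "p1 = p2" copies the address, not the value. */
int pointer_assignment_copies_address(void) {
    int x = 3, y = 7;
    int *p1 = &x, *p2 = &y;
    p1 = p2;                        /* p1 now points at y; x is untouched */
    return p1 == &y && x == 3;
}

/* Hypothetical helper: "*p1 = *p2" copies the pointed-to value. */
int deref_assignment_copies_value(void) {
    int x = 3, y = 7;
    int *p1 = &x, *p2 = &y;
    *p1 = *p2;                      /* x becomes 7; p1 still points at x */
    return p1 == &x && x == 7;
}

/* Hypothetical helper: "p = p + 1" is address arithmetic. */
int pointer_increment_moves_address(void) {
    int a[2] = { 10, 20 };
    int *p = a;
    p = p + 1;                      /* p now points at a[1]; a[0] is unchanged */
    return *p == 20 && a[0] == 10;
}
```

Each helper returns 1 when C behaves as the comments state, so the three cases can be checked independently.
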
     100613 example. Hence, a reference behaves like the variable name for the current variable it is pointing-to. The
     100714 simplest way to understand a reference is to imagine the compiler inserting a dereference operator before
     100815 the reference variable for each reference qualifier in a declaration, e.g.:
     1009
     1010It's hard for me to understand who the audience for this part is. I think a practical programmer is
     1011likely to be satisfied with "a reference behaves like the variable name for the current variable it
     1012is pointing-to," maybe with some examples. Your "simplest way" doesn't strike me as simpler than
     1013that. It feels like you're trying to provide a more precise definition for the semantics of
     1014references, but it isn't actually precise enough to be a formal specification. If you want to
     1015express the semantics of references using rewrite rules that's a great way to do it, but lay the
     1016rules out clearly, and when you're showing an example of rewriting keep your
     1017references/pointers/values separate (right now, you use \eg "r3" to mean a reference, a pointer,
     1018and a value).
     1019
     102024 Cancellation works to arbitrary depth, and pointer and reference values are interchangeable because both
     102125 contain addresses.
     1022
     1023Except they're not interchangeable, because they have different and incompatible types.
     1024
     102540 Interestingly, C++ deals with the address duality by making the pointed-to value the default, and prevent-
     102641 ing changes to the reference address, which eliminates half of the duality. Java deals with the address duality
     102742 by making address assignment the default and requiring field assignment (direct or indirect via methods),
     102843 \ie there is no builtin bit-wise or method-wise assignment, which eliminates half of the duality.
     1029
     1030I can follow this but I think that's mostly because I already understand what you're trying to
     1031say. I don't think I've ever heard the term "method-wise assignment" and I don't see you defining
     1032it. Furthermore Java does have value assignment of basic (non-class) types, so your summary here
     1033feels incomplete. (If it were me I'd drop this paragraph rather than try to save it.)
     1034
     103511 Hence, for type & const, there is no pointer assignment, so &rc = &x is disallowed, and the address value
     103612 cannot be 0 unless an arbitrary pointer is assigned to the reference.
     1037
     1038Given the pains you've taken to motivate every little bit of the semantics up until now, this last
     1039clause ("the address value cannot be 0") comes out of the blue. It seems like you could have
     1040perfectly reasonable semantics that allowed the initialization of null references.
     1041
     104212 In effect, the compiler is managing the
     104313 addresses for type & const not the programmer, and by a programming discipline of only using references
     104414 with references, address errors can be prevented.
     1045
     1046Again, is this assuming automatic storage management?
     1047
     104818 rary binding. For reference initialization (like pointer), the initializing value must be an address (lvalue) not
     104919 a value (rvalue).
     1050
     1051This sentence appears to suggest that an address and an lvalue are the same thing.
     1052
     105320 int * p = &x; // both &x and x are possible interpretations
     1054
     1055Are you saying that we should be considering "x" as a possible interpretation of the initializer
     1056"&x"? It seems to me that this expression has only one legitimate interpretation in context.
     1057
     105821 int & r = x; // x unlikely interpretation, because of auto-dereferencing
     1059
     1060You mean, we can initialize a reference using an integer value? Surely we would need some sort of
     1061cast to induce that interpretation, no?
     1062
     106322 Hence, the compiler implicitly inserts a reference operator, &, before the initialization expression.
     1064
     1065But then the expression would have pointer type, which wouldn't be compatible with the type of r.
     1066
     106722 Similarly,
     106823 when a reference is used for a parameter/return type, the call-site argument does not require a reference
     106924 operator.
     1070
     1071Furthermore, it would not be correct to use a reference operator.
     1072
     107345 The implicit conversion allows
     10741 seamless calls to any routine without having to explicitly name/copy the literal/expression to allow the call.
     10752 While C' attempts to handle pointers and references in a uniform, symmetric manner, C handles routine
     10763 variables in an inconsistent way: a routine variable is both a pointer and a reference (particle and wave).
     1077
     1078After all this talk of how expressions can have both pointer and value interpretations, you're
     1079disparaging C because it has expressions that have both pointer and value interpretations?
     1080
     1081On Sat, Jul 9, 2016 at 4:18 PM Peter A. Buhr <pabuhr@plg.uwaterloo.ca> wrote:
     1082> Aaron discovered a few places where "&"s are missing and where there are too many "&", which are
     1083> corrected in the attached updated. None of the text has changed, if you have started reading
     1084> already.
     1085\end{comment}
    9051086
    9061087
     
    13531534because
    13541535
    1355 Currently, there are no \Index{lambda} expressions, i.e., unnamed routines because routine names are very important to properly select the correct routine.
    1356 
    1357 
    1358 \section{Lexical List}
     1536Currently, there are no \Index{lambda} expressions, \ie unnamed routines because routine names are very important to properly select the correct routine.
     1537
     1538
     1539\section{Tuples}
    13591540
    13601541In C and \CFA, lists of elements appear in several contexts, such as the parameter list for a routine call.
     
    13731554[ v+w, x*y, 3.14159, f() ]
    13741555\end{cfa}
    1375 Tuples are permitted to contain sub-tuples (i.e., nesting), such as ©[ [ 14, 21 ], 9 ]©, which is a 2-element tuple whose first element is itself a tuple.
     1556Tuples are permitted to contain sub-tuples (\ie nesting), such as ©[ [ 14, 21 ], 9 ]©, which is a 2-element tuple whose first element is itself a tuple.
    13761557Note, a tuple is not a record (structure);
    13771558a record denotes a single value with substructure, whereas a tuple is multiple values with no substructure (see flattening coercion in Section 12.1).
     
    14291610tuple does not have structure like a record; a tuple is simply converted into a list of components.
    14301611\begin{rationale}
    1431 The present implementation of \CFA does not support nested routine calls when the inner routine returns multiple values; i.e., a statement such as ©g( f() )© is not supported.
     1612The present implementation of \CFA does not support nested routine calls when the inner routine returns multiple values; \ie a statement such as ©g( f() )© is not supported.
    14321613Using a temporary variable to store the results of the inner routine and then passing this variable to the outer routine works, however.
    14331614\end{rationale}
     
    14421623This requirement is the same as for comma expressions in argument lists.
    14431624
    1444 Type qualifiers, i.e., const and volatile, may modify a tuple type.
    1445 The meaning is the same as for a type qualifier modifying an aggregate type [Int99, x 6.5.2.3(7),x 6.7.3(11)], i.e., the qualifier is distributed across all of the types in the tuple, \eg:
     1625Type qualifiers, \ie const and volatile, may modify a tuple type.
     1626The meaning is the same as for a type qualifier modifying an aggregate type [Int99, \S~6.5.2.3(7), \S~6.7.3(11)], \ie the qualifier is distributed across all of the types in the tuple, \eg:
    14461627\begin{cfa}
    14471628const volatile [ int, float, const int ] x;
     
    14811662©w© is implicitly opened to yield a tuple of four values, which are then assigned individually.
    14821663
    1483 A \newterm{flattening coercion} coerces a nested tuple, i.e., a tuple with one or more components, which are themselves tuples, into a flattened tuple, which is a tuple whose components are not tuples, as in:
     1664A \newterm{flattening coercion} coerces a nested tuple, \ie a tuple with one or more components, which are themselves tuples, into a flattened tuple, which is a tuple whose components are not tuples, as in:
    14841665\begin{cfa}
    14851666[ a, b, c, d ] = [ 1, [ 2, 3 ], 4 ];
     
    15161697\end{cfa}
    15171698\index{lvalue}
    1518 The left-hand side is a tuple of \emph{lvalues}, which is a list of expressions each yielding an address, i.e., any data object that can appear on the left-hand side of a conventional assignment statement.
     1699The left-hand side is a tuple of \emph{lvalues}, which is a list of expressions each yielding an address, \ie any data object that can appear on the left-hand side of a conventional assignment statement.
    15191700©$\emph{expr}$© is any standard arithmetic expression.
    15201701Clearly, the types of the entities being assigned must be type compatible with the value of the expression.
     
    16041785[ x1, y1 ] = z = 0;
    16051786\end{cfa}
    1606 As in C, the rightmost assignment is performed first, i.e., assignment parses right to left.
     1787As in C, the rightmost assignment is performed first, \ie assignment parses right to left.
    16071788
    16081789
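
The right-to-left grouping that this hunk attributes to C can be illustrated with a small plain-C sketch. It covers only the C behaviour, not \CFA tuple assignment, and the function name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: chained assignment groups as d = (i = 3.7). */
int chained_assignment_groups_right(void) {
    int i;
    double d;
    d = i = 3.7;    /* i receives 3 (truncated); d then receives the int value 3 */
    return i == 3 && d == 3.0;
}
```

Because the inner assignment happens first and yields an int, the fractional part is already lost before d is assigned.
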
     
    16691850
    16701851
    1671 \section{Labelled Continue / Break}
     1852\section{Labelled Continue/Break}
    16721853
    16731854While C provides ©continue© and ©break© statements for altering control flow, both are restricted to one level of nesting for a particular control structure.
     
    17661947With ©goto©, the label is at the end of the control structure, which fails to convey this important clue early enough to the reader.
    17671948Finally, using an explicit target for the transfer instead of an implicit target allows new constructs to be added or removed without affecting existing constructs.
    1768 The implicit targets of the current ©continue© and ©break©, i.e., the closest enclosing loop or ©switch©, change as certain constructs are added or removed.
     1949The implicit targets of the current ©continue© and ©break©, \ie the closest enclosing loop or ©switch©, change as certain constructs are added or removed.
    17691950
    17701951
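
For comparison, a minimal sketch of how a multi-level exit is usually written in plain C, where an unlabelled break leaves only the innermost loop. The function name, the fixed 3x3 size, and the label name are assumptions made for illustration; this is the goto idiom that the section contrasts with labelled break, not \CFA code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: locate the first occurrence of target in a 3x3 matrix. */
int find_first(int m[3][3], int target, size_t *row, size_t *col) {
    int found = 0;
    for (size_t i = 0; i < 3; i++) {
        for (size_t j = 0; j < 3; j++) {
            if (m[i][j] == target) {
                *row = i;
                *col = j;
                found = 1;
                goto done;      /* a plain break would exit only the inner loop */
            }
        }
    }
done:                           /* the target label sits after the loops, far from the exit */
    return found;
}
```

The label appears only at the end of the construct, which is the readability cost the text describes.
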
     
    19032084Furthermore, any statements before the first ©case© clause can only be executed if labelled and transferred to using a ©goto©, either from outside or inside of the ©switch©, both of which are problematic.
    19042085As well, the declaration of ©z© cannot occur after the ©case© because a label can only be attached to a statement, and without a fall through to case 3, ©z© is uninitialized.
    1905 The key observation is that the ©switch© statement branches into control structure, i.e., there are multiple entry points into its statement body.
     2086The key observation is that the ©switch© statement branches into control structure, \ie there are multiple entry points into its statement body.
    19062087\end{enumerate}
    19072088
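
The observation that a switch branches into a control structure can be seen in a small variant of Duff's device. This is a sketch in plain C, unrolled four times rather than the traditional eight; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy count ints, entering the loop body part-way through. */
void duff_copy(int *to, const int *from, size_t count) {
    if (count == 0) return;
    size_t n = (count + 3) / 4;         /* number of passes through the do-while */
    switch (count % 4) {                /* each case label is an entry point into the loop */
    case 0: do { *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The case labels sit inside the do-while body, so the first pass may start in the middle of the loop.
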
     
    19112092the number of ©switch© statements is small,
    19122093\item
    1913 most ©switch© statements are well formed (i.e., no \Index*{Duff's device}),
     2094most ©switch© statements are well formed (\ie no \Index*{Duff's device}),
    19142095\item
    19152096the ©default© clause is usually written as the last case-clause,
     
    19212102\item
    19222103Eliminating default fall-through has the greatest potential for affecting existing code.
    1923 However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, i.e., a list of ©case© clauses executing common code, \eg:
     2104However, even if fall-through is removed, most ©switch© statements would continue to work because of the explicit transfers already present at the end of each ©case© clause, the common placement of the ©default© clause at the end of the case list, and the most common use of fall-through, \ie a list of ©case© clauses executing common code, \eg:
    19242105\begin{cfa}
    19252106case 1:  case 2:  case 3: ...
     
    21732354
    21742355The following \CC-style \Index{manipulator}s allow control over implicit separation.
    2175 Manipulators \Indexc{sepOn}\index{manipulator!sepOn@©sepOn©} and \Indexc{sepOff}\index{manipulator!sepOff@©sepOff©} \emph{locally} toggle printing the separator, i.e., the seperator is adjusted only with respect to the next printed item.
     2356Manipulators \Indexc{sepOn}\index{manipulator!sepOn@©sepOn©} and \Indexc{sepOff}\index{manipulator!sepOff@©sepOff©} \emph{locally} toggle printing the separator, \ie the separator is adjusted only with respect to the next printed item.
    21762357\begin{cfa}[mathescape=off,belowskip=0pt]
    21772358sout | sepOn | 1 | 2 | 3 | sepOn | endl;        §\C{// separator at start of line}§
     
    2186236712 3
    21872368\end{cfa}
    2188 Manipulators \Indexc{sepDisable}\index{manipulator!sepDisable@©sepDisable©} and \Indexc{sepEnable}\index{manipulator!sepEnable@©sepEnable©} \emph{globally} toggle printing the separator, i.e., the seperator is adjusted with respect to all subsequent printed items, unless locally adjusted.
     2369Manipulators \Indexc{sepDisable}\index{manipulator!sepDisable@©sepDisable©} and \Indexc{sepEnable}\index{manipulator!sepEnable@©sepEnable©} \emph{globally} toggle printing the separator, \ie the separator is adjusted with respect to all subsequent printed items, unless locally adjusted.
    21892370\begin{cfa}[mathescape=off,aboveskip=0pt,belowskip=0pt]
    21902371sout | sepDisable | 1 | 2 | 3 | endl;           §\C{// globally turn off implicit separation}§
     
    24612642\caption{Constructors and Destructors}
    24622643\end{figure}
    2463 
    2464 
    2465 \begin{comment}
    2466 \section{References}
    2467 
    2468 
    2469 By introducing references in parameter types, users are given an easy way to pass a value by reference, without the need for NULL pointer checks.
    2470 In structures, a reference can replace a pointer to an object that should always have a valid value.
    2471 When a structure contains a reference, all of its constructors must initialize the reference and all instances of this structure must initialize it upon definition.
    2472 
    2473 The syntax for using references in \CFA is the same as \CC with the exception of reference initialization.
    2474 Use ©&© to specify a reference, and access references just like regular objects, not like pointers (use dot notation to access fields).
    2475 When initializing a reference, \CFA uses a different syntax which differentiates reference initialization from assignment to a reference.
    2476 The ©&© is used on both sides of the expression to clarify that the address of the reference is being set to the address of the variable to which it refers.
    2477 \end{comment}
    24782644
    24792645
     
    46834849\section{Incompatible}
    46844850
    4685 The following incompatibles exist between \CFA and C, and are similar to Annex C for \CC~\cite{ANSI14:C++}.
     4851The following incompatibles exist between \CFA and C, and are similar to Annex C for \CC~\cite{C++14}.
    46864852
    46874853\begin{enumerate}
     
    47814947Personß.ßFace pretty;                   §\C{// type defined inside}§
    47824948\end{cfa}
    4783 In C, the name of the nested types belongs to the same scope as the name of the outermost enclosing structure, i.e., the nested types are hoisted to the scope of the outer-most type, which is not useful and confusing.
     4949In C, the name of the nested types belongs to the same scope as the name of the outermost enclosing structure, \ie the nested types are hoisted to the scope of the outermost type, which is confusing and not useful.
    47844950\CFA is C \emph{incompatible} on this issue, and provides semantics similar to \Index*[C++]{\CC}.
    47854951Nested types are not hoisted and can be referenced using the field selection operator ``©.©'', unlike the \CC scope-resolution operator ``©::©''.
     
    48785044\end{tabular}
    48795045\end{quote2}
    4880 For the prescribed head-files, \CFA implicitly wraps their includes in an ©extern "C"©;
     5046For the prescribed head-files, \CFA uses header interposition to wrap these includes in an ©extern "C"©;
    48815047hence, names in these include files are not mangled\index{mangling!name} (see~\VRef{s:Interoperability}).
    48825048All other C header files must be explicitly wrapped in ©extern "C"© to prevent name mangling.
     
    48865052\label{s:StandardLibrary}
    48875053
    4888 The goal of the \CFA standard-library is to wrap many of the existing C library-routines that are explicitly polymorphic into implicitly polymorphic versions.
     5054The \CFA standard-library wraps many existing explicitly-polymorphic C general-routines into implicitly-polymorphic versions.
    48895055
    48905056
     
    50105176\label{s:Math Library}
    50115177
    5012 The goal of the \CFA math-library is to wrap many of the existing C math library-routines that are explicitly polymorphic into implicitly polymorphic versions.
     5178The \CFA math-library wraps many existing explicitly-polymorphic C math-routines into implicitly-polymorphic versions.
    50135179
    50145180