Changeset a17e7b8 for doc/user/user.tex


Timestamp:
Jul 7, 2016, 8:27:40 AM (8 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, ctor, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, memory, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
540b275
Parents:
4096de0
Message:

update section on pointer/reference

File:
1 edited

  • doc/user/user.tex

    r4096de0 ra17e7b8  
    1111%% Created On       : Wed Apr  6 14:53:29 2016
    1212%% Last Modified By : Peter A. Buhr
    13 %% Last Modified On : Wed Jul  6 21:08:24 2016
    14 %% Update Count     : 1070
     13%% Last Modified On : Thu Jul  7 08:25:37 2016
     14%% Update Count     : 1099
    1515%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    1616
     
    525525Special addresses are used to denote certain states or access co-processor memory.
    526526By convention, no variable is placed at address 0, so addresses like 0, 1, 2, 3 are often used to denote no-value or other special states.
    527 Often dereferencing a special state causes a memory fault, so checking is necessary during execution.
     527Often dereferencing a special state causes a \Index{memory fault}, so checking is necessary during execution.
    528528If the programming language assigns addresses, a program's execution is \Index{sound}, i.e., all addresses are to valid memory locations.
    529529C allows programmers to assign addresses, so there is the potential for incorrect addresses, both inside and outside of the computer address-space.
     
    542542&
    543543\begin{lstlisting}
    544 int * const x = (int *)100
     544int * ®const® x = (int *)100;
    545545*x = 3;                 // implicit dereference
    546 int * const y = (int *)104;
     546int * ®const® y = (int *)104;
    547547*y = *x;                // implicit dereference
    548548\end{lstlisting}
    549549\end{tabular}
    550550\end{quote2}
    551 where the right example is how the compiler logically interpreters variables.
    552 Since a variable name only points to one location during its lifetime, it is a \Index{immutable} pointer;
     551where the right example is how the compiler logically interprets the variables in the left example.
     552Since a variable name only points to one location during its lifetime, it is an \Index{immutable} pointer;
    553553hence, variables ©x© and ©y© are constant pointers in the compiler interpretation.
    554554In general, variable addresses are stored in instructions instead of loaded independently, so an instruction fetch implicitly loads a variable's address.
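The constant-pointer interpretation can be sketched in plain C. This is only an illustration: the names `storage_x` and `storage_y` are hypothetical stand-ins for the fixed addresses 100 and 104 used in the listing, since writing through arbitrary integer addresses is not portable.

```c
/* Sketch: each "variable" is modelled as a constant pointer to its storage.
   storage_x / storage_y are assumed names standing in for addresses 100, 104. */
int const_pointer_model(void) {
    int storage_x, storage_y;
    int * const x = &storage_x;   /* x can never be made to point elsewhere */
    int * const y = &storage_y;
    *x = 3;                       /* what the programmer writes as: x = 3; */
    *y = *x;                      /* what the programmer writes as: y = x; */
    return *y;
}
```

Attempting `x = &storage_y;` inside this function is rejected by a C compiler, which mirrors the claim that a variable name denotes one location for its whole lifetime.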
     
    594594In many cases, the compiler can infer the meaning:
    595595\begin{lstlisting}
    596 p2 = p1 + x;                            §\C{// compiler infers *p2 = *p1 + x;}§
     596p2 = p1 + x;                                    §\C{// compiler infers *p2 = *p1 + x;}§
    597597\end{lstlisting}
    598598because adding the arbitrary integer value in ©x© to the address of ©p1© and storing the resulting address into ©p2© is an unlikely operation.
     
    600600However, there are ambiguous cases, especially when \Index{pointer arithmetic} is possible, as in C:
    601601\begin{lstlisting}
    602 p1 = p2;                                        §\C{// p1 = p2 or *p1 = *p2}§
    603 p1 = p1 + 1;                            §\C{// p1 = p1 + 1 or *p1 = *p1 + 1}§
     602p1 = p2;                                                §\C{// p1 = p2\ \ or\ \ *p1 = *p2}§
     603p1 = p1 + 1;                                    §\C{// p1 = p1 + 1\ \ or\ \ *p1 = *p1 + 1}§
    604604\end{lstlisting}
    605605
     
    607607In C, the default meaning for pointers is to manipulate the pointer's address and the pointed-to value is explicitly accessed by the dereference operator ©*©.
    608608\begin{lstlisting}
    609 p1 = p2;                                        §\C{// pointer address assignment}§
    610 *p1 = *p1 + 1;                          §\C{// pointed-to value assignment / operation}§
     609p1 = p2;                                                §\C{// pointer address assignment}§
     610*p1 = *p1 + 1;                                  §\C{// pointed-to value assignment / operation}§
    611611\end{lstlisting}
    612612which works well for situations where manipulation of addresses is the primary meaning and data is rarely accessed, such as storage management (©malloc©/©free©).
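A small C sketch of the two meanings, assuming two hypothetical integers `a` and `b`, shows that the unadorned assignment moves the pointer while the starred form changes the data:

```c
/* Sketch: address assignment versus pointed-to value operation in C. */
int address_vs_value(void) {
    int a = 1, b = 10;
    int *p1 = &a, *p2 = &b;
    p1 = p2;               /* address assignment: p1 now points to b */
    *p1 = *p1 + 1;         /* value operation: increments b through p1 */
    return a * 100 + b;    /* a is untouched, b has been incremented */
}
```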
     
    622622\end{lstlisting}
    623623
    624 To switch the default meaning for an address requires a new kind of pointer, called a \newterm{reference} and denoted by ©&©.
     624To switch the default meaning for an address requires a new kind of pointer, called a \newterm{reference} denoted by ©&©.
    625625\begin{lstlisting}
    626626int x, y, ®&® r1, ®&® r2, ®&&® r3;
    627 ®&®r1 = &x;                                     §\C{// r1 points to x}§
    628 ®&®r2 = &r1;                            §\C{// r2 also points to x}§
    629 ®&®r1 = &y;                                     §\C{// r2 also points to x}§
    630 ®&®r1 = &r2 + 1;                        §\C{// r1 points to y, pointer arithmetic}§
    631 ®&®r3 = ®&®&r2;                         §\C{// r3 points to r2}§
     627®&®r1 = &x;                                             §\C{// r1 points to x}§
     628®&®r2 = &r1;                                    §\C{// r2 also points to x}§
     629®&®r1 = &y;                                             §\C{// r2 also points to x}§
     630®&®r1 = &r2 + 1;                                §\C{// r1 points to y, pointer arithmetic}§
     631®&®r3 = ®&®&r2;                                 §\C{// r3 points to r2}§
    632632r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15); §\C{// implicit dereferencing}§
    633633\end{lstlisting}
    634 Except for auto-dereferencing by the compiler, this reference example is the same as the pointer example.
     634Except for auto-dereferencing by the compiler, this reference example is the same as the previous pointer example.
    635635Hence, a reference behaves like the variable name for the current variable it is pointing-to.
    636636The simplest way to understand a reference is to imagine the compiler inserting a dereference operator before the reference variable for each reference qualifier in a declaration, e.g.:
     
    642642®*®r2 = ((®*®r1 + ®*®r2) ®*® (®**®r3 - ®*®r1)) / (®**®r3 - 15);
    643643\end{lstlisting}
    644 When a reference appears beside a dereference, e.g., ©&*©, they cancel out.\footnote{
     644When a reference operation appears beside a dereference operation, e.g., ©&*©, they cancel out.\footnote{
    645645The unary ©&© operator yields the address of its operand.
    646646If the operand has type ``type'', the result has type ``pointer to type''.
     
    648648Hence, assigning to a reference requires the address of the reference variable (\Index{lvalue}):
    649649\begin{lstlisting}
    650 (&®*®)r1 = &x;                          §\C{// (\&*) cancel out giving variable r1 not the variable pointed-to by r1}§
     650(&®*®)r1 = &x;                                  §\C{// (\&*) cancel out giving variable r1 not the variable pointed-to by r1}§
    651651\end{lstlisting}
    652652Similarly, the address of a reference can be obtained for assignment or computation (\Index{rvalue}):
    653653\begin{lstlisting}
    654 (&®*®)r3 = &(&®*®)r2;           §\C{// (\&*) cancel out giving the address of variable r2}§
     654(&®*®)r3 = &(&®*®)r2;                   §\C{// (\&*) cancel out giving the address of variable r2}§
    655655\end{lstlisting}
    656656\Index{Cancellation}\index{pointer!cancellation rule} works to arbitrary depth, and pointer and reference values are interchangeable because both contain addresses.
     
    658658int x, *p1 = &x, **p2 = &p1, ***p3 = &p2,
    659659                 &r1 = &x, &&r2 = &&r1, &&&r3 = &&r2;
    660 ***p3 = 3;                                      §\C{// change x}§
    661 r3 = 3;                                         §\C{// change x, ***r3}§
    662 **p3 = ...;                                     §\C{// change p1}§
    663 &r3 = ...;                                      §\C{// change r1, (\&*)**r3, 1 cancellation}§
    664 *p3 = ...;                                      §\C{// change p2}§
    665 &&r3 = ...;                                     §\C{// change r2, (\&(\&*)*)*r3, 2 cancellations}§
    666 &&&r3 = p3;                                     §\C{// change r3 to p3, (\&(\&(\&*)*)*)r3, 3 cancellations}§
     660***p3 = 3;                                              §\C{// change x}§
     661r3 = 3;                                                 §\C{// change x, ***r3}§
     662**p3 = ...;                                             §\C{// change p1}§
     663&r3 = ...;                                              §\C{// change r1, (\&*)**r3, 1 cancellation}§
     664*p3 = ...;                                              §\C{// change p2}§
     665&&r3 = ...;                                             §\C{// change r2, (\&(\&*)*)*r3, 2 cancellations}§
     666&&&r3 = p3;                                             §\C{// change r3 to p3, (\&(\&(\&*)*)*)r3, 3 cancellations}§
    667667\end{lstlisting}
    668668Finally, implicit dereferencing and cancellation are a static (compilation) phenomenon not a dynamic one.
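The pointer half of the listing can be checked in standard C. The sketch below adds a second hypothetical variable `y` so that retargeting `p1` through `**p3` has a visible effect; the reference half has no direct C equivalent and is not reproduced.

```c
/* Sketch: each level of dereference selects a different object in the chain. */
int pointer_chain(void) {
    int x = 0, y = 0;
    int *p1 = &x, **p2 = &p1, ***p3 = &p2;
    ***p3 = 3;             /* change x */
    **p3 = &y;             /* change p1: it now points to y */
    ***p3 = 7;             /* the same expression now changes y */
    return x * 10 + y;
}
```

Which object each expression names is decided by the declared types at compile time; only the addresses stored in the chain vary at run time.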
     
    674674Java deals with the address duality by making address assignment the default and providing a \Index{clone} mechanism to change the pointed-to value.
    675675
    676 As for pointers, references may have qualifiers:
    677 \begin{lstlisting}
    678 const int cx = 5;                       §\C{// cannot change cx;}§
    679 const int & r3 = &cx;           §\C{// cannot change what r3 is pointing to}§
    680 ®&®r3 = &cx;                            §\C{// can change r3}§
    681 r3 = 7;                                         §\C{// error, cannot change cx}§
    682 int & const r4 = &x;            §\C{// must be initialized, \CC reference}§
    683 ®&®r4 = &x;                                     §\C{// error, cannot change r4}§
    684 const int & const r5 = &cx;     §\C{// must be initialized, \CC reference}§
    685 r5 = 7;                                         §\C{// error, cannot change cx}§
    686 ®&®r5 = &cx;                            §\C{// error, cannot change r5}§
     676As for a pointer, a reference may have qualifiers:
     677\begin{lstlisting}
     678const int cx = 5;                               §\C{// cannot change cx;}§
     679const int & r3 = &cx;                   §\C{// cannot change what r3 is pointing to}§
     680®&®r3 = &cx;                                    §\C{// can change r3}§
     681r3 = 7;                                                 §\C{// error, cannot change cx}§
     682int & const r4 = &x;                    §\C{// must be initialized, \CC reference}§
     683®&®r4 = &x;                                             §\C{// error, cannot change r4}§
     684const int & const r5 = &cx;             §\C{// must be initialized, \CC reference}§
     685r5 = 7;                                                 §\C{// error, cannot change cx}§
     686®&®r5 = &cx;                                    §\C{// error, cannot change r5}§
    687687\end{lstlisting}
    688688Hence, for type ©& const©, there is no pointer assignment, so ©&r4 = &x© is disallowed, and \emph{the address value cannot be ©0©}.
    689 in effect, the compiler is managing the addresses not the programmer.
    690 
    691 \Index{Initialization} is different than \Index{assignment} because initialization occurs on an empty (uninitialized) storage on an object, while assignment occurs on possible initialized storage for an object.
     689In effect, the compiler, not the programmer, manages the addresses for type ©& const©.
     690
     691\Index{Initialization} is different from \Index{assignment} because initialization occurs on the empty (uninitialized) storage of an object, while assignment occurs on possibly initialized storage of an object.
    692692There are three initialization contexts in \CFA: declaration initialization, argument/parameter binding, return/temporary binding.
    693 For reference (like pointer) initialization, the initializing value must be an address (lvalue) not a value (rvalue).
    694 \begin{lstlisting}
    695 int * p = &x;                           §\C{// both \&x and x are possible interpretations}§
    696 int & r = x;                            §\C{// x unlikely interpretation, because of auto-dereferencing}§
    697 \end{lstlisting}
    698 Hence, the compiler implicitly inserts a reference, ©&©, before the initialization expression:
    699 Similarly, when a reference is used for a parameter/return type, the call-site argument does not require a reference.
    700 \begin{lstlisting}
    701 int & f( int & ri );            §\C{// reference parameter and return}§
    702 z = f( x ) + f( y );            §\C{// reference not required}§
     693For reference initialization (like pointer), the initializing value must be an address (lvalue) not a value (rvalue).
     694\begin{lstlisting}
     695int * p = &x;                                   §\C{// both \&x and x are possible interpretations}§
     696int & r = x;                                    §\C{// x unlikely interpretation, because of auto-dereferencing}§
     697\end{lstlisting}
     698Hence, the compiler implicitly inserts a reference operator, ©&©, before the initialization expression:
     699Similarly, when a reference is used for a parameter/return type, the call-site argument does not require a reference operator.
     700\begin{lstlisting}
     701int & f( int & ri );                    §\C{// reference parameter and return}§
     702z = f( x ) + f( y );                    §\C{// reference operator not required}§
    703703\end{lstlisting}
    704704Within routine ©f©, it is possible to change the argument by changing the corresponding parameter, and parameter ©ri© can be locally reassigned within ©f©.
    705 The return reference from ©f© is copied into a compiler generated temporary, which is logically treated as an initialization.
     705The return reference from ©f© is copied into a compiler generated temporary, which is treated as an initialization.
    706706
    707707When a pointer/reference parameter has a ©const© value (immutable), it is possible to pass literals and expressions.
    708708\begin{lstlisting}
    709 void f( const int & cri );
    710 void g( const int * cri );
    711 f( 3 );                  g( &3 );
     709void f( ®const® int & cri );
     710void g( ®const® int * cri );
     711f( 3 );                   g( &3 );
    712712f( x + y );             g( &(x + y) );
    713713\end{lstlisting}
    714714Here, the compiler passes the address to the literal 3 or the temporary for the expression ©x + y©, knowing the argument cannot be changed through the parameter.
    715 (The ©&© is necessary for the pointer parameter to make the types match, and is common for a C programmer.)
    716 \CFA extends this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed to by the reference parameter, which can be changed.
     715(The ©&© is necessary for the pointer parameter to make the types match, and is a common requirement for a C programmer.)
     716\CFA extends this semantics to a mutable pointer/reference parameter, and the compiler implicitly creates the necessary temporary (copying the argument), which is subsequently pointed to by the reference parameter and can be changed.
     717\begin{lstlisting}
     718void f( int & cri );
     719void g( int * cri );
     720f( 3 );                   g( &3 );              §\C{// compiler implicitly generates temporaries}§
     721f( x + y );             g( &(x + y) );  §\C{// compiler implicitly generates temporaries}§
     722\end{lstlisting}
    717723Essentially, there is an implicit rvalue to lvalue conversion in this case.\footnote{
    718724This conversion attempts to address the \newterm{const Hell} problem, when the innocent addition of a ©const© qualifier causes a cascade of type failures, requiring an unknown number of additional ©const© qualifiers, until it is discovered a ©const© qualifier cannot be added and all the ©const© qualifiers must be removed.}
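For comparison, standard C does not accept `&3` or `&(x + y)`, because the operand of unary `&` must be an lvalue. The closest hand-written equivalent of the compiler-generated temporary is a C99 compound literal, sketched below; the body of `g` and the wrapper `pass_temporaries` are hypothetical and exist only to make the example observable.

```c
/* Sketch: creating the temporaries by hand with C99 compound literals. */
static int g(const int *cri) {
    return *cri * 2;               /* reads the argument through the pointer */
}

int pass_temporaries(int x, int y) {
    int a = g(&(int){3});          /* addressable temporary holding the literal 3 */
    int b = g(&(int){x + y});      /* addressable temporary holding x + y */
    return a + b;
}
```

The compound literal makes the rvalue-to-lvalue conversion explicit, which is the step the text describes the \CFA compiler as performing implicitly.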
     
    722728\begin{lstlisting}
    723729void f( int p ) {...}
    724 void (*fp)( int ) = &f;         §\C{// pointer initialization}§
    725 void (*fp)( int ) = f;          §\C{// reference initialization}§
    726 (*fp)(3);                                       §\C{// pointer invocation}§
    727 fp(3);                                          §\C{// reference invocation}§
     730void (*fp)( int ) = &f;                 §\C{// pointer initialization}§
     731void (*fp)( int ) = f;                  §\C{// reference initialization}§
     732(*fp)(3);                                               §\C{// pointer invocation}§
     733fp(3);                                                  §\C{// reference invocation}§
    728734\end{lstlisting}
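Both forms of initialization and invocation are accepted by standard C, as the following sketch suggests. It uses two distinct pointer names, `fp` and `fq`, to avoid declaring the same variable twice, and a hypothetical global `last_arg` to record which call ran.

```c
/* Sketch: pointer-style and reference-style use of a routine pointer in C. */
static int last_arg;

static void f(int p) { last_arg = p; }

int routine_pointer_demo(void) {
    void (*fp)(int) = &f;          /* pointer initialization */
    void (*fq)(int) = f;           /* reference-style initialization */
    (*fp)(3);                      /* pointer invocation */
    int first = last_arg;
    fq(4);                         /* reference-style invocation */
    return first * 10 + last_arg;
}
```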
    729735A routine variable is best described by a ©const© reference:
     
    731737const void (&fp)( int ) = f;
    732738fp( 3 );
    733 fp = ...                                        §\C{// change code not allowed}§
    734 &fp = ...;                                      §\C{// change routine refernce allowed
     739fp = ...                                                §\C{// change code not allowed}§
     740&fp = ...;                                              §\C{// change routine reference}§
    735741\end{lstlisting}
    736742because the value of the routine variable is a routine literal, i.e., the routine code is normally immutable during execution.\footnote{