-                      r51b5a02
+                      rcbe477e
 The implicit targets of the current @continue@ and @break@, \ie the closest enclosing loop or @switch@, change as certain constructs are added or removed.
+\TODO{choose and fallthrough here as well?}
 \subsection{\texorpdfstring{\LstKeywordStyle{with} Clause / Statement}{with Clause / Statement}}
 …
 \subsection{References}
+\TODO{Pull draft text from user manual; make sure to discuss nested references and rebind operator drawn from lvalue-addressof operator}
+All variables in C have an \emph{address}, a \emph{value}, and a \emph{type}; at the position in the program's memory denoted by the address, there exists a sequence of bits (the value), with the length and semantic meaning of this bit sequence defined by the type.
+The C type system does not always track the relationship between a value and its address; a value that does not have a corresponding address is called an \emph{rvalue} (for ``right-hand value''), while a value that does have an address is called an \emph{lvalue} (for ``left-hand value'').
+For example, in @int x; x = 42;@ the variable expression @x@ on the left-hand side of the assignment is an lvalue, while the constant expression @42@ on the right-hand side of the assignment is an rvalue.
+Which address a value is located at is sometimes significant; the imperative programming paradigm of C relies on the mutation of values at specific addresses.
+Within a lexical scope, lvalue expressions can be used either in their \emph{address interpretation}, to determine where a mutated value should be stored, or in their \emph{value interpretation}, to refer to their stored value; in the assignment @x = y;@ within @{ int x, y = 7; x = y; }@, @x@ is used in its address interpretation, while @y@ is used in its value interpretation.
+Though this duality of interpretation is useful, C lacks a direct mechanism to pass lvalues between contexts, instead relying on \emph{pointer types} to serve a similar purpose.
+In C, for any type @T@ there is a pointer type @T*@, the value of which is the address of a value of type @T@; a pointer rvalue can be explicitly \emph{dereferenced} to the pointed-to lvalue with the dereference operator @*?@, while the rvalue representing the address of an lvalue can be obtained with the address-of operator @&?@.
+\begin{cfa}
+int x = 1, y = 2, * p1, * p2, ** p3;
+p1 = &x;  $\C{// p1 points to x}$
+p2 = &y;  $\C{// p2 points to y}$
+p3 = &p1;  $\C{// p3 points to p1}$
+\end{cfa}
+Unfortunately, the dereference and address-of operators introduce a great deal of syntactic noise when dealing with pointed-to values rather than pointers, as well as the potential for subtle bugs.
+It would be desirable to have the compiler figure out how to elide the dereference operators in a complex expression such as @*p2 = ((*p1 + *p2) * (**p3 - *p1)) / (**p3 - 15);@, for both brevity and clarity.
+However, since C defines a number of forms of \emph{pointer arithmetic}, two similar expressions involving pointers to arithmetic types (\eg @*p1 + x@ and @p1 + x@) may each have well-defined but distinct semantics, introducing the possibility that a programmer may write one when they mean the other, and precluding any simple algorithm for elision of dereference operators.
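As a minimal C sketch of this ambiguity (the function names `add_to_pointee` and `advance_pointer` are hypothetical, chosen only for illustration), the two expressions @*p1 + x@ and @p1 + x@ both compile for an `int *`, but one computes an integer sum and the other a new address:

```c
#include <assert.h>
#include <stddef.h>

/* Value interpretation: add x to the int that p1 points at. */
int add_to_pointee(const int *p1, int x) {
    assert(p1 != NULL);
    return *p1 + x;
}

/* Address interpretation: pointer arithmetic, yielding the address
   x elements past p1 (the caller must keep this within one array). */
const int *advance_pointer(const int *p1, int x) {
    assert(p1 != NULL);
    return p1 + x;
}
```

Because both forms are well-typed, a compiler cannot tell from the expression alone which one the programmer intended.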
+To solve these problems, \CFA introduces reference types @T&@; a @T&@ has exactly the same value as a @T*@, but where the @T*@ takes the address interpretation by default, a @T&@ takes the value interpretation by default, as below:
+\begin{cfa}
+int x = 1, y = 2, & r1, & r2, && r3;
+&r1 = &x;  $\C{// r1 points to x}$
+&r2 = &y;  $\C{// r2 points to y}$
+&&r3 = &&r1;  $\C{// r3 points to r1}$
+r2 = ((r1 + r2) * (r3 - r1)) / (r3 - 15);  $\C{// implicit dereferencing}$
+\end{cfa}
+Except for auto-dereferencing by the compiler, this reference example is exactly the same as the previous pointer example.
+Hence, a reference behaves like a variable name: an lvalue expression that is interpreted as a value, while the type system still tracks the address of that value.
+One way to conceptualize a reference is via a rewrite rule, where the compiler inserts a dereference operator before the reference variable for each reference qualifier in the reference variable declaration, so the previous example implicitly acts like:
+\begin{cfa}
+`*`r2 = ((`*`r1 + `*`r2) * (`**`r3 - `*`r1)) / (`**`r3 - 15);
+\end{cfa}
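The rewrite rule can be checked by hand in plain C. The sketch below assumes, as the text states later, that references are lowered to pointers; the function name `desugared_example` is hypothetical, and the CFA statements appear only in comments.

```c
#include <assert.h>

/* Hand-written C equivalent of the reference example: each reference
   becomes a pointer, and each use gains one dereference per '&'. */
int desugared_example(void) {
    int x = 1, y = 2;
    int *r1, *r2, **r3;
    r1 = &x;                 /* CFA: &r1 = &x;    */
    r2 = &y;                 /* CFA: &r2 = &y;    */
    r3 = &r1;                /* CFA: &&r3 = &&r1; */
    assert(**r3 != 15);      /* guard the divisor below */
    *r2 = ((*r1 + *r2) * (**r3 - *r1)) / (**r3 - 15);
    return y;                /* the assignment went through r2 into y */
}
```

With these bindings @r3@ and @r1@ both denote @x@, so the factor `**r3 - *r1` is zero and the value stored into `y` is 0.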
+References in \CFA are similar to those in \CC, but with a couple of important improvements, both of which can be seen in the example above.
+Firstly, \CFA does not forbid references to references, unlike \CC.
+This provides a much more orthogonal design for library implementors, obviating the need for workarounds such as @std::reference_wrapper@.
+Secondly, unlike the references in \CC, which always point to a fixed address, \CFA references are rebindable.
+This allows \CFA references to be default-initialized (to a null pointer), and also to point to different addresses throughout their lifetime.
+This rebinding is accomplished without adding any new syntax to \CFA, but simply by extending the existing semantics of the address-of operator in C.
+In C, the address of an lvalue is always an rvalue, as in general that address is not stored anywhere in memory, and does not itself have an address.
+In \CFA, the address of a @T&@ is an lvalue @T*@, as the address of the underlying @T@ is stored in the reference, and can thus be mutated there.
+The result of this rule is that any reference can be rebound using the existing pointer assignment semantics by assigning a compatible pointer into the address of the reference, \eg @&r1 = &x;@ above.
+This rebinding can occur to an arbitrary depth of reference nesting; $n$ address-of operators applied to a reference nested $m$ times will produce an lvalue pointer nested $n$ times if $n \le m$ (note that $n = m+1$ is simply the usual C rvalue address-of operator applied to the $n = m$ case).
+The explicit address-of operators can be thought of as ``cancelling out'' the implicit dereference operators, \eg @(&`*`)r1 = &x;@ or @(&(&`*`)`*`)r3 = &(&`*`)r1;@ or even @(&`*`)r2 = (&`*`)`*`r3;@ for @&r2 = &r3;@.
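Under the same assumption that a reference is represented as a pointer, rebinding reduces to ordinary pointer assignment. The following C sketch (the function `rebind_demo` and its parameters are hypothetical) mirrors the cancellation rule, with the corresponding CFA statement in each comment:

```c
#include <assert.h>
#include <stddef.h>

/* Each CFA rebinding becomes a plain pointer assignment in C. */
void rebind_demo(int *x, int *y) {
    assert(x != NULL && y != NULL);
    int *r1, *r2, **r3;
    r1 = x;      /* CFA: &r1 = &x;     bind r1 to x                      */
    r3 = &r1;    /* CFA: &&r3 = &&r1;  bind r3 to the reference r1       */
    r2 = *r3;    /* CFA: &r2 = &r3;    one implicit dereference remains  */
    *r2 = 10;    /* CFA: r2 = 10;      writes through to x               */
    r1 = y;      /* CFA: &r1 = &y;     rebind r1; r2 still denotes x     */
    **r3 = 20;   /* CFA: r3 = 20;      r3 follows r1, so this writes y   */
}
```

Note that rebinding @r1@ changes what @r3@ denotes, because @r3@ refers to the reference itself, while @r2@ captured only the address @r1@ held at the time.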
+Since pointers and references share the same internal representation, code using either is equally performant; in fact the \CFA compiler converts references to pointers internally, and the choice between them in user code can be made based solely on convenience.
+By analogy to pointers, \CFA references also allow cv-qualifiers:
+\begin{cfa}
+const int cx = 5;  $\C{// cannot change cx}$
+const int & cr = cx;  $\C{// cannot change cr's referred value}$
+&cr = &cx;  $\C{// rebinding cr allowed}$
+cr = 7;  $\C{// ERROR, cannot change cr's referred value}$
+int & const rc = x;  $\C{// must be initialized, like in \CC}$
+&rc = &x;  $\C{// ERROR, cannot rebind rc}$
+rc = 7;  $\C{// x now equal to 7}$
+\end{cfa}
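The pointer analogy can be made concrete in C, where @const int &@ plays the role of `const int *` (pointee is read-only, binding may change) and @int & const@ the role of `int * const` (binding is fixed, pointee may change). This is a sketch of the analogy only; the function name `const_demo` is hypothetical, and the rejected statements are left commented out so the block compiles.

```c
#include <assert.h>

static const int cx = 5;

int const_demo(void) {
    int x = 1;
    const int *cr = &cx;    /* CFA: const int & cr = cx;               */
    cr = &cx;               /* CFA: &cr = &cx;   rebinding allowed     */
    /* *cr = 7; */          /* CFA: cr = 7;      rejected: const data  */
    assert(*cr == 5);
    int *const rc = &x;     /* CFA: int & const rc = x;                */
    /* rc = &x; */          /* CFA: &rc = &x;    rejected: const bind  */
    *rc = 7;                /* CFA: rc = 7;      x now equal to 7      */
    return x + *cr;         /* 7 + 5 */
}
```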
+\TODO{Pull more draft text from user manual; make sure to discuss initialization and reference conversions}
 \subsection{Constructors and Destructors}
 …
 \subsection{0/1}
+\TODO{Some text already at the end of Section~\ref{sec:poly-fns}}
 \subsection{Units}

Changeset cbe477e

doc/papers/general/Paper.tex
