## Lvalues and References ##
C defines the notion of an _lvalue_, essentially an addressable object, as well
as a number of type _qualifiers_, `const`, `volatile`, and `restrict`.
As these type qualifiers are generally only meaningful to the type system as
applied to lvalues, the two concepts are closely related.
A `const` lvalue cannot be modified; the compiler cannot assume that a
`volatile` lvalue will not be concurrently modified by some other part of the
system; and a `restrict` lvalue must have pointer type, and the compiler may
assume that no other pointer in scope aliases that pointer (this is solely a
performance optimization, and may be ignored by implementers).
_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts
it to an expression result of type `T` (commonly called an _rvalue_ of type
`T`), also strips all the qualifiers from the lvalue, as an expression result
is a value, not an addressable object that can have properties like
immutability.
Though lvalue-to-rvalue conversion strips the qualifiers from lvalues,
derived rvalue types such as pointer types may include qualifiers;
`const int *` is a distinct type from `int *`, though the latter is safely
convertible to the former.
In general, any number of qualifiers can be safely added to the
pointed-to-type of a pointer type, e.g. `int *` converts safely to
`const int *` and `volatile int *`, both of which convert safely to
`const volatile int *`.

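The qualifier rules above can be checked directly in standard C. The following is a minimal sketch, not part of the proposal; `read_cv` and `qualifier_demo` are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: reads through the most-qualified pointer type. */
static int read_cv(const volatile int *p) { return *p; }

/* Walks an int through each qualifier-adding pointer conversion. */
int qualifier_demo(void) {
    int x = 42;
    int *p = &x;                   /* plain pointer to int */
    const int *cp = p;             /* adding const: implicit and safe */
    volatile int *vp = p;          /* adding volatile: implicit and safe */
    const volatile int *cvp = cp;  /* const int * -> const volatile int * */
    cvp = vp;                      /* volatile int * -> const volatile int * */

    /* Lvalue-to-rvalue conversion strips qualifiers: *cp is an lvalue of
       type const int, but its value initializes a plain, mutable int. */
    int copy = *cp;
    copy += 1;                     /* modifies the copy only; x stays 42 */

    assert(read_cv(p) == 42);      /* int * also converts at a call site */
    assert(*cvp == 42);
    return copy;                   /* 43 */
}
```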
Since lvalues are precisely "addressable objects", in C, only lvalues (and
function designators) can be used as the operand of the `&` address-of
operator.
Similarly, only modifiable lvalues may be used as the assigned-to
operand of the mutating operators: assignment, compound assignment
(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without
the `const` qualifier are modifiable, but lvalues of incomplete types, array
types, and struct or union types with const members are also not modifiable.
Lvalues are produced by the following expressions: object identifiers
(function identifiers are not considered to be lvalues), the result of the `*`
dereference operator applied to an object pointer, the result of a member
expression `s.f` if the left argument `s` is an lvalue (note that the
preceding two rules imply that the result of an indirect member expression
`s->f` is always an lvalue, by desugaring to `(*s).f`), and the result of the
indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`).
Somewhat less obviously, parenthesized lvalue expressions, string literals,
and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.

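As a rough illustration of the list above, the following standard C sketch takes the address of (or assigns through) each lvalue-producing form. The field names of `struct foo` and the function `lvalue_forms_demo` are assumptions made for the example.

```c
#include <assert.h>

struct foo { char c; double d; int i; };

/* Exercises each lvalue-producing expression form; returns how many of the
   five checks held. */
int lvalue_forms_demo(void) {
    struct foo s = { 'x', 3.14, 42 };
    struct foo *ps = &s;            /* object identifier: s is an lvalue */
    int a[3] = { 1, 2, 3 };
    int ok = 0;

    int *p1 = &s.i;                 /* member expression on an lvalue */
    int *p2 = &ps->i;               /* ps->i desugars to (*ps).i */
    ok += (p1 == p2);

    int *p3 = &a[1];                /* a[1] desugars to *((a)+(1)) */
    ok += (p3 == a + 1);

    (s.i) += 1;                     /* a parenthesized lvalue is an lvalue */
    ok += (*p1 == 43);

    char (*lit)[3] = &"hi";         /* a string literal is an array lvalue */
    ok += ((*lit)[0] == 'h');

    struct foo *tmp = &(struct foo){ 'x', 3.14, 42 };  /* compound literal */
    tmp->i += 1;                    /* and it is modifiable */
    ok += (tmp->i == 43);

    return ok;                      /* 5 */
}
```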
All of the conversions described above are defined in standard C, but Cforall
requires further features from its type system.
In particular, to allow overloading of the `*?` and `?[?]` dereferencing and
indexing operators, Cforall requires a way to declare that the functions
defining these operators return lvalues.
This is of necessity an extension to standard C, since C functions never
return lvalues and, for syntactic reasons, we wish to distinguish functions
which return lvalues from functions which return pointers.
In the current design, an `lvalue` qualifier can be added to function return
types (and only to function return types), the effect of which is to return a
pointer which is implicitly dereferenced by the caller.

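To make the "implicitly dereferenced pointer" reading concrete, here is a sketch in plain C of what an lvalue-returning indexing function might lower to, with the dereference that the caller would get implicitly written out by hand. The names `struct vec`, `vec_at`, and `lvalue_return_demo` are hypothetical, and this is only one plausible lowering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container, used only for this example. */
struct vec { int data[4]; };

/* Plausible lowering of an lvalue-returning index operator: the function
   returns a pointer to the selected element. */
int *vec_at(struct vec *v, size_t i) { return &v->data[i]; }

int lvalue_return_demo(void) {
    struct vec v = { { 10, 20, 30, 40 } };
    /* With an lvalue return type the caller could write v[2] = 5; the
       dereference inserted on the caller's behalf is spelled out here. */
    *vec_at(&v, 2) = 5;
    *vec_at(&v, 0) += 1;
    return v.data[0] + v.data[2];   /* 11 + 5 */
}
```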
C++ includes the more general concept of _references_, which are typically
implemented as implicitly dereferenced pointers as well.
Another use case which C++ references support is providing a way to pass
function parameters by reference (rather than by value) with a natural
syntax; Cforall in its current state has no such mechanism.
As an example, consider the following (currently typical) copy-constructor
signature and call:

        void ?{}(T *lhs, T rhs);

        T x;
        T y = { x };

Note that the right-hand argument is passed by value, and would in fact be
copied twice in the course of the constructor call `T y = { x };` (once into
the parameter by C's standard `memcpy` semantics, once again in the body of
the copy constructor, though it is possible that an optimizing compiler
will elide the `memcpy`-style copy).
However, to pass by reference using the existing pointer syntax, the example
above would look like this:

        void ?{}(T *lhs, const T *rhs);

        T x;
        T y = { &x };

This example is not even as bad as it could be; assuming pass-by-reference is
the desired semantics for the `?+?` operator, that implies the following
design today:

        T ?+?(const T *lhs, const T *rhs);

        T a, b;
        T c = &a + &b;

In addition to `&a + &b` being unsightly and confusing syntax to add `a` and
`b`, it also introduces a possible ambiguity with pointer arithmetic on `T*`
which can only be resolved by return-type inference.

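The same trade-off shows up in plain C, without operator overloading. The sketch below contrasts by-value and by-pointer parameters for an addition function; `struct T` with fields `re` and `im`, and all three function names, are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical value type standing in for T. */
struct T { int re, im; };

/* By-value parameters: both operands are copied into the callee. */
struct T add_by_value(struct T lhs, struct T rhs) {
    return (struct T){ lhs.re + rhs.re, lhs.im + rhs.im };
}

/* By-pointer parameters: no copies, but every call site needs an explicit &. */
struct T add_by_pointer(const struct T *lhs, const struct T *rhs) {
    return (struct T){ lhs->re + rhs->re, lhs->im + rhs->im };
}

/* Returns 1 if both calling conventions agree on the sum. */
int pass_by_pointer_demo(void) {
    struct T a = { 1, 2 }, b = { 3, 4 };
    struct T c = add_by_value(a, b);
    struct T d = add_by_pointer(&a, &b);   /* same shape as `&a + &b` */
    return c.re == d.re && c.im == d.im && d.re == 4 && d.im == 6;
}
```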
Pass-by-reference and marking functions as returning lvalues instead of the
usual rvalues are actually closely related concepts, as obtaining a reference
to pass depends on the referenced object being addressable, i.e. an lvalue,
and lvalue return types are effectively return-by-reference.
Cforall should also unify the concepts, with a parameterized type for
"reference to `T`", which I will write `T&`.

Firstly, assignment to a function parameter as part of a function call and
local variable initialization have almost identical semantics, so should be
treated similarly for the reference type too; this implies we should be able
to declare local variables of reference type, as in the following:

        int x = 42;
        int& r = x; // r is now an alias for x

Unlike in C++, we would like to have the capability to re-bind references
after initialization, as this allows the attractive syntax of references to
support some further useful code patterns, such as first initializing a
reference after its declaration.
Constant references to `T` (`T& const`) should not be re-bindable.

One option for re-binding references is to use a dedicated operator, as in the
code example below:

        int i = 42, j = 7;
        int& r = i;  // bind r to i
        r = j;       // set i (== r) to 7
        r := j;      // rebind r to j using the new := rebind operator
        i = 42;      // reset i (!= r) to 42
        assert( r == 7 );

Another option for reference rebind is to modify the semantics of the `&`
address-of operator.
In standard C, the address-of operator never yields an lvalue; applied to an
object of type `T`, it yields an rvalue of type `T*`.
If the address-of operator returned an lvalue for references, this would
allow reference rebinding using the usual pointer assignment syntax;
that is, if address-of a `T&` returned a `T*&` then the following works:

        int i = 42, j = 7;
        int& r = i;  // bind r to i
        r = j;       // set i (== r) to 7
        &r = &j;     // rebind r to j using the newly mutable "address-of reference"
        i = 42;      // reset i (!= r) to 42
        assert( r == 7 );

This change (making addresses of references mutable) allows use of existing
operators defined over pointers, as well as elegant handling of nested
references-to-references.

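Reading a reference as an implicitly dereferenced pointer, both rebinding proposals would plausibly lower to the same plain C. The sketch below spells that lowering out by hand; mapping `T& const` to `T *const` is my assumption, and `rebind_demo` is a hypothetical name.

```c
#include <assert.h>

int rebind_demo(void) {
    int i = 42, j = 7;
    int *r = &i;         /* int& r = i;   bind r to i */
    *r = j;              /* r = j;        set i (== *r) to 7 */
    assert(i == 7);
    r = &j;              /* &r = &j; or r := j;   rebind r to j */
    i = 42;              /* reset i; r no longer aliases it */
    assert(*r == 7);

    int *const cr = &i;  /* assumed lowering of a non-rebindable T& const */
    *cr = 1;             /* writing through it is still allowed */
    /* cr = &j;             would be rejected: cr itself is const */
    return *r + *cr;     /* 7 + 1 */
}
```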
The semantics and restrictions of `T&` are effectively the semantics of an
lvalue of type `T`, and by this analogy there should be a safe,
qualifier-dropping conversion from `const volatile restrict T&` (and every
other qualifier combination on the `T` in `T&`) to `T`.
With this conversion, the resolver may type most expressions that C would
call "lvalue of type `T`" as `T&`.
There's also an obvious argument that lvalues of a (possibly-qualified) type
`T` should be convertible to references of type `T`, where `T` is also
so-qualified (e.g. lvalue `int` to `int&`, lvalue `const char` to
`const char&`).
By similar arguments to pointer types, qualifiers should be addable to the
referred-to type of a reference (e.g. `int&` to `const int&`).
As a note, since pointer arithmetic is explicitly not defined on `T&`,
`restrict T&` should be allowable and would have alias-analysis rules that
are actually comprehensible to mere mortals.

Using pass-by-reference semantics for function calls should not put syntactic
constraints on how the function is called; particularly, temporary values
should be able to be passed by reference.
The mechanism for this pass-by-reference would be to store the value of the
temporary expression into a new unnamed temporary, and pass the reference of
that temporary to the function.
As an example, all of the following code should compile and run:

        void f(int& x) { printf("%d\n", x++); }

        int i = 7, j = 11;
        const int answer = 42;

        f(i);      // (1)
        f(42);     // (2)
        f(i + j);  // (3)
        f(answer); // (4)

The semantics of (1) are just like C++'s: "7" is printed, and `i` has the
value 8 afterward.
For (2), "42" is printed, and the increment of the unnamed temporary to 43 is
not visible to the caller; (3) behaves similarly, printing "19", but not
changing `i` or `j`.
(4) is a bit of an interesting case; we want to be able to support named
constants like `answer` that can be used anywhere the constant expression
they're replacing (like `42`) could go; in this sense, (4) and (2) should have
the same semantics.
However, we don't want the mutation to the `x` parameter to be visible in
`answer` afterward, because `answer` is a constant, and thus shouldn't change.
The solution to this is to allow chaining of the two lvalue conversions;
`answer` has the type `const int&`, which can be converted to `int` by the
lvalue-to-rvalue conversion (which drops the qualifiers), then up to `int&`
by the temporary-producing rvalue-to-lvalue conversion.
Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42),
mutated by `f`, then discarded; "42" is printed, just as in case (2), and
`answer` still equals 42 after the call, because it was the temporary that was
mutated, not `answer`.
It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while
`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in
C++, leading to the dreaded "const hell"), but the behaviour of this rule can
be determined by examining local scope with the simple rule "non-`const`
references to `const` variables produce temporaries", which aligns with
programmer intuition that `const` variables cannot be mutated.

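The four calls can be approximated in standard C by lowering `int&` to `int *` and using compound literals as the unnamed temporaries. This is only a sketch of one possible lowering, not what a Cforall compiler is known to emit; `f_lowered` and `temporaries_demo` are hypothetical names, and `f_lowered` returns the printed value purely so the result can be checked.

```c
#include <assert.h>
#include <stdio.h>

/* Hand-lowered form of void f(int& x); also returns the value it printed. */
int f_lowered(int *x) {
    int seen = *x;
    printf("%d\n", (*x)++);
    return seen;
}

int temporaries_demo(void) {
    int i = 7, j = 11;
    const int answer = 42;

    int r1 = f_lowered(&i);                 /* (1) prints 7, i becomes 8 */
    int r2 = f_lowered(&(int){ 42 });       /* (2) prints 42, temporary dropped */
    int r3 = f_lowered(&(int){ i + j });    /* (3) prints 19, i and j untouched */
    int r4 = f_lowered(&(int){ answer });   /* (4) prints 42, answer untouched */

    assert(i == 8 && j == 11 && answer == 42);
    return r1 + r2 + r3 + r4;               /* 7 + 42 + 19 + 42 */
}
```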
To bikeshed syntax for `T&`, there are three basic options: language
keywords (`lvalue T` is already in Cforall), compiler-supported "special"
generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++
programmers).
Keyword or generic based approaches run the risk of name conflicts with
existing code, while any sigil used would have to be carefully chosen to not
create parsing conflicts.