source: doc/proposals/zero_one.md @ 849f2c6b

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-astnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since 849f2c6b was 5c6afcd, checked in by Aaron Moss <a3moss@…>, 8 years ago

Split out reference type and zero and one type proposals from working resolver doc

  • Property mode set to 100644
File size: 8.8 KB
Line 
1## Types for 0 and 1 literals ##
2The literals `0` and `1` are treated specially by Cforall, due to their
3potential uses in operator overloading.
4Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
5multiple interpretations of them according to the existing variable
6overloading rules, with the following declarations in the prelude:
7
8        const int 0, 1;
9        forall ( dtype DT ) const DT * const    0;
10        forall ( ftype FT ) FT * const          0;
11
12This did, however, create some backward-compatibility problems and potential
13performance issues, and works poorly for generic types. To start with, this
14(entirely legal C) code snippet doesn't compile in Cforall:
15
16        if ( 0 ) {}
17
18It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
19`forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
20not choose which `0` variable to take, because they're both exact matches.
21
22The general `!=` computation may also be less efficient than a check for a zero
23value; take the following example of a rational type:
24
25        struct rational { int32_t num, int32_t den };
26        rational 0 = { 0, 1 };
27       
28        int ?!=? (rational a, rational b) {
29                return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
30        }
31       
32        int not_zero (rational a) { return a.num != 0; }
33
34To check if two rationals are equal we need to do a pair of multiplications to
35normalize them (the casts in the example are to prevent overflow), but to
36check if a rational is non-zero we just need to check its numerator, a more
37efficient operation.
38
39Finally, though polymorphic null-pointer variables can be meaningfully
40defined, most other polymorphic variables cannot be, which makes it difficult
41to make generic types "truthy" using the existing system:
42
43        forall(otype T) struct pair { T x; T y; };
44        forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
45
46Now, it seems natural enough to want to define the zero for this pair type as
47a pair of the zero values of its element type (if they're defined).
48The declaration of `pair(T) 0` above is actually illegal though, as there is
49no way to represent the zero values of an infinite number of types in the
50single memory location available for this polymorphic variable - the
51polymorphic null-pointer variables defined in the prelude are legal, but that
52is only because all pointers are the same size and the single zero value is a
53legal value of all pointer types simultaneously; null pointer is, however,
54somewhat unique in this respect.
55
56The technical explanation for the problems with polymorphic zero is that `0` 
57is really a rvalue, not a lvalue - an expression, not an object.
58Drawing from this, the solution we propose is to give `0` a new built-in type,
59`zero_t`, and similarly give `1` the new built-in type `one_t`.
60If the prelude defines `!=` over `zero_t` this solves the `if ( 0 )` problem,
61because now the unambiguous best interpretation of `0 != 0` is to read them
62both as `zero_t` (and say that this expression is false).
63Backwards compatibility with C can be served by defining conversions in the
64prelude from `zero_t` and `one_t` to `int` and the appropriate pointer
65types, as below:
66
67        // int 0;
68        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, zero_t);
69        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, zero_t);
70       
71        // int 1;
72        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, one_t);
73        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, one_t);
74       
75        // forall(dtype DT) const DT* 0;
76        forall(dtype DT) void ?{safe}(const DT**, zero_t);
77        // forall(ftype FT) FT* 0;
78        forall(ftype FT) void ?{safe}(FT**, zero_t);
79
80Further, with this change, instead of making `0` and `1` overloadable
81variables, we can instead allow user-defined constructors (or, more flexibly,
82safe conversions) from `zero_t`, as below:
83
84        // rational 0 = { 0, 1 };
85        void ?{safe} (rational *this, zero_t) { this->num = 0; this->den = 1; }
86
87Note that we don't need to name the `zero_t` parameter to this constructor,
88because its only possible value is a literal zero.
89This one line allows `0` to be used anywhere a `rational` is required, as well
90as enabling the same use of rationals in boolean contexts as above (by
91interpreting the `0` in the desguraring to be a rational by this conversion).
92Furthermore, while defining a conversion function from literal zero to
93`rational` makes rational a "truthy" type able to be used in a boolean
94context, we can optionally further optimize the truth decision on rationals as
95follows:
96
97        int ?!=? (rational a, zero_t) { return a.num != 0; }
98
99This comparison function will be chosen in preference to the more general
100rational comparison function for comparisons against literal zero (like in
101boolean contexts) because it doesn't require a conversion on the `0` argument.
102Functions of the form `int ?!=? (T, zero_t)` can acutally be used in general
103to make a type `T` truthy without making `0` a value which can convert to that
104type, a capability not available in the current design.
105
106This design also solves the problem of polymorphic zero for generic types, as
107in the following example:
108
109        // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
110        forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, zero_t) {
111                this->x = 0; this->y = 0;
112        }
113
114The polymorphic variable declaration didn't work, but this constructor is
115perfectly legal and has the desired semantics.
116
117We can assert that `T` can be used in a boolean context as follows:
118
119        `forall(otype T | { int ?!=?(T, zero_t); })`
120 
121Since the C standard (6.5.16.1.1) specifically states that pointers can be
122assigned into `_Bool` variables (and implies that other artithmetic types can
123be assigned into `_Bool` variables), it seems natural to say that assignment
124into a `_Bool` variable effectively constitutes a boolean context.
125To allow this interpretation, I propose including the following function (or
126its effective equivalent) in the prelude:
127
128        forall(otype T | { int ?!=?(T, zero_t); })
129        void ?{safe}( _Bool *this, T that ) { *this = that != 0; }
130
131Note that this conversion is not transitive; that is, for `t` a variable of
132some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
133of a lower-cost one), `(int)t;` would not use this conversion (and in fact
134would not be legal in the absence of another valid way to convert a `T` to an
135`int`), but `(int)(_Bool)t;` could legally use this conversion.
136
137Similarly giving literal `1` the special type `one_t` allows for more
138concise and consistent specification of the increment and decrement operators,
139using the following de-sugaring:
140
141        ++i => i += 1
142        i++ => (tmp = i, i += 1, tmp)
143        --i => i -= 1
144        i-- => (tmp = i, i -= 1, tmp)
145
146In the examples above, `tmp` is a fresh temporary with its type inferred from
147the return type of `i += 1`.
148Under this proposal, defining a conversion from `one_t` to `T` and a
149`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
150for free in a consistent fashion (similarly for -= and the decrement
151operators).
152If a meaningful `1` cannot be defined for a type, both increment operators can
153still be defined with the signature `lvalue T ?+=? (T*, one_t)`.
154Similarly, if scalar addition can be performed on a type more efficiently than
155by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
156addition operator, it will simultaneously define consistent implementations of
157both increment operators (this can also be accomplished by defining a
158conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
159
160To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
161increment operator" assertions of the form `lvalue T ?+=? (T*, one_t)`,
162we also define a non-transitive unsafe conversion from `_Bool` (allowable
163values `0` and `1`) to `one_t` (and `zero_t`) as follows:
164
165        void ?{unsafe} (one_t*, _Bool) {}
166
167As a note, the desugaring of post-increment above is possibly even more
168efficient than that of C++ - in C++, the copy to the temporary may be hidden
169in a separately-compiled module where it can't be elided in cases where it is
170not used, whereas this approach for Cforall always gives the compiler the
171opportunity to optimize out the temporary when it is not needed.
172Furthermore, one could imagine a post-increment operator that returned some
173type `T2` that was implicitly convertable to `T` but less work than a full
174copy of `T` to create (this seems like an absurdly niche case) - since the
175type of `tmp` is inferred from the return type of `i += 1`, you could set up
176functions with the following signatures to enable an equivalent pattern in
177Cforall:
178
179        lvalue T2 ?+=? (T*, one_t);   // increment operator returns T2
180        void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
181        void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to
182                                      // preserve expected semantics of T x = y++;
Note: See TracBrowser for help on using the repository browser.