1 | ## Types for 0 and 1 literals ##
|
---|
2 | The literals `0` and `1` are treated specially by Cforall, due to their
|
---|
3 | potential uses in operator overloading.
|
---|
4 | Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
|
---|
5 | multiple interpretations of them according to the existing variable
|
---|
6 | overloading rules, with the following declarations in the prelude:
|
---|
7 |
|
---|
8 | const int 0, 1;
|
---|
9 | forall ( dtype DT ) const DT * const 0;
|
---|
10 | forall ( ftype FT ) FT * const 0;
|
---|
11 |
|
---|
12 | This did, however, create some backward-compatibility problems and potential
|
---|
13 | performance issues, and works poorly for generic types. To start with, this
|
---|
14 | (entirely legal C) code snippet doesn't compile in Cforall:
|
---|
15 |
|
---|
16 | if ( 0 ) {}
|
---|
17 |
|
---|
18 | It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
|
---|
19 | `forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
|
---|
20 | not choose which `0` variable to take, because they're both exact matches.
|
---|
21 |
|
---|
22 | The general `!=` computation may also be less efficient than a check for a zero
|
---|
23 | value; take the following example of a rational type:
|
---|
24 |
|
---|
25 | struct rational { int32_t num, int32_t den };
|
---|
26 | rational 0 = { 0, 1 };
|
---|
27 |
|
---|
28 | int ?!=? (rational a, rational b) {
|
---|
29 | return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
|
---|
30 | }
|
---|
31 |
|
---|
32 | int not_zero (rational a) { return a.num != 0; }
|
---|
33 |
|
---|
34 | To check if two rationals are equal we need to do a pair of multiplications to
|
---|
35 | normalize them (the casts in the example are to prevent overflow), but to
|
---|
36 | check if a rational is non-zero we just need to check its numerator, a more
|
---|
37 | efficient operation.
|
---|
38 |
|
---|
39 | Finally, though polymorphic null-pointer variables can be meaningfully
|
---|
40 | defined, most other polymorphic variables cannot be, which makes it difficult
|
---|
41 | to make generic types "truthy" using the existing system:
|
---|
42 |
|
---|
43 | forall(otype T) struct pair { T x; T y; };
|
---|
44 | forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
|
---|
45 |
|
---|
46 | Now, it seems natural enough to want to define the zero for this pair type as
|
---|
47 | a pair of the zero values of its element type (if they're defined).
|
---|
48 | The declaration of `pair(T) 0` above is actually illegal though, as there is
|
---|
49 | no way to represent the zero values of an infinite number of types in the
|
---|
50 | single memory location available for this polymorphic variable - the
|
---|
51 | polymorphic null-pointer variables defined in the prelude are legal, but that
|
---|
52 | is only because all pointers are the same size and the single zero value is a
|
---|
53 | legal value of all pointer types simultaneously; null pointer is, however,
|
---|
54 | somewhat unique in this respect.
|
---|
55 |
|
---|
56 | The technical explanation for the problems with polymorphic zero is that `0`
|
---|
57 | is really a rvalue, not a lvalue - an expression, not an object.
|
---|
58 | Drawing from this, the solution we propose is to give `0` a new built-in type,
|
---|
59 | `zero_t`, and similarly give `1` the new built-in type `one_t`.
|
---|
60 | If the prelude defines `!=` over `zero_t` this solves the `if ( 0 )` problem,
|
---|
61 | because now the unambiguous best interpretation of `0 != 0` is to read them
|
---|
62 | both as `zero_t` (and say that this expression is false).
|
---|
63 | Backwards compatibility with C can be served by defining conversions in the
|
---|
64 | prelude from `zero_t` and `one_t` to `int` and the appropriate pointer
|
---|
65 | types, as below:
|
---|
66 |
|
---|
67 | // int 0;
|
---|
68 | forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, zero_t);
|
---|
69 | forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, zero_t);
|
---|
70 |
|
---|
71 | // int 1;
|
---|
72 | forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, one_t);
|
---|
73 | forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, one_t);
|
---|
74 |
|
---|
75 | // forall(dtype DT) const DT* 0;
|
---|
76 | forall(dtype DT) void ?{safe}(const DT**, zero_t);
|
---|
77 | // forall(ftype FT) FT* 0;
|
---|
78 | forall(ftype FT) void ?{safe}(FT**, zero_t);
|
---|
79 |
|
---|
80 | Further, with this change, instead of making `0` and `1` overloadable
|
---|
81 | variables, we can instead allow user-defined constructors (or, more flexibly,
|
---|
82 | safe conversions) from `zero_t`, as below:
|
---|
83 |
|
---|
84 | // rational 0 = { 0, 1 };
|
---|
85 | void ?{safe} (rational *this, zero_t) { this->num = 0; this->den = 1; }
|
---|
86 |
|
---|
87 | Note that we don't need to name the `zero_t` parameter to this constructor,
|
---|
88 | because its only possible value is a literal zero.
|
---|
89 | This one line allows `0` to be used anywhere a `rational` is required, as well
|
---|
90 | as enabling the same use of rationals in boolean contexts as above (by
|
---|
91 | interpreting the `0` in the desguraring to be a rational by this conversion).
|
---|
92 | Furthermore, while defining a conversion function from literal zero to
|
---|
93 | `rational` makes rational a "truthy" type able to be used in a boolean
|
---|
94 | context, we can optionally further optimize the truth decision on rationals as
|
---|
95 | follows:
|
---|
96 |
|
---|
97 | int ?!=? (rational a, zero_t) { return a.num != 0; }
|
---|
98 |
|
---|
99 | This comparison function will be chosen in preference to the more general
|
---|
100 | rational comparison function for comparisons against literal zero (like in
|
---|
101 | boolean contexts) because it doesn't require a conversion on the `0` argument.
|
---|
102 | Functions of the form `int ?!=? (T, zero_t)` can acutally be used in general
|
---|
103 | to make a type `T` truthy without making `0` a value which can convert to that
|
---|
104 | type, a capability not available in the current design.
|
---|
105 |
|
---|
106 | This design also solves the problem of polymorphic zero for generic types, as
|
---|
107 | in the following example:
|
---|
108 |
|
---|
109 | // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
|
---|
110 | forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, zero_t) {
|
---|
111 | this->x = 0; this->y = 0;
|
---|
112 | }
|
---|
113 |
|
---|
114 | The polymorphic variable declaration didn't work, but this constructor is
|
---|
115 | perfectly legal and has the desired semantics.
|
---|
116 |
|
---|
117 | We can assert that `T` can be used in a boolean context as follows:
|
---|
118 |
|
---|
119 | `forall(otype T | { int ?!=?(T, zero_t); })`
|
---|
120 |
|
---|
121 | Since the C standard (6.5.16.1.1) specifically states that pointers can be
|
---|
122 | assigned into `_Bool` variables (and implies that other artithmetic types can
|
---|
123 | be assigned into `_Bool` variables), it seems natural to say that assignment
|
---|
124 | into a `_Bool` variable effectively constitutes a boolean context.
|
---|
125 | To allow this interpretation, I propose including the following function (or
|
---|
126 | its effective equivalent) in the prelude:
|
---|
127 |
|
---|
128 | forall(otype T | { int ?!=?(T, zero_t); })
|
---|
129 | void ?{safe}( _Bool *this, T that ) { *this = that != 0; }
|
---|
130 |
|
---|
131 | Note that this conversion is not transitive; that is, for `t` a variable of
|
---|
132 | some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
|
---|
133 | of a lower-cost one), `(int)t;` would not use this conversion (and in fact
|
---|
134 | would not be legal in the absence of another valid way to convert a `T` to an
|
---|
135 | `int`), but `(int)(_Bool)t;` could legally use this conversion.
|
---|
136 |
|
---|
137 | Similarly giving literal `1` the special type `one_t` allows for more
|
---|
138 | concise and consistent specification of the increment and decrement operators,
|
---|
139 | using the following de-sugaring:
|
---|
140 |
|
---|
141 | ++i => i += 1
|
---|
142 | i++ => (tmp = i, i += 1, tmp)
|
---|
143 | --i => i -= 1
|
---|
144 | i-- => (tmp = i, i -= 1, tmp)
|
---|
145 |
|
---|
146 | In the examples above, `tmp` is a fresh temporary with its type inferred from
|
---|
147 | the return type of `i += 1`.
|
---|
148 | Under this proposal, defining a conversion from `one_t` to `T` and a
|
---|
149 | `lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
|
---|
150 | for free in a consistent fashion (similarly for -= and the decrement
|
---|
151 | operators).
|
---|
152 | If a meaningful `1` cannot be defined for a type, both increment operators can
|
---|
153 | still be defined with the signature `lvalue T ?+=? (T*, one_t)`.
|
---|
154 | Similarly, if scalar addition can be performed on a type more efficiently than
|
---|
155 | by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
|
---|
156 | addition operator, it will simultaneously define consistent implementations of
|
---|
157 | both increment operators (this can also be accomplished by defining a
|
---|
158 | conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
|
---|
159 |
|
---|
160 | To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
|
---|
161 | increment operator" assertions of the form `lvalue T ?+=? (T*, one_t)`,
|
---|
162 | we also define a non-transitive unsafe conversion from `_Bool` (allowable
|
---|
163 | values `0` and `1`) to `one_t` (and `zero_t`) as follows:
|
---|
164 |
|
---|
165 | void ?{unsafe} (one_t*, _Bool) {}
|
---|
166 |
|
---|
167 | As a note, the desugaring of post-increment above is possibly even more
|
---|
168 | efficient than that of C++ - in C++, the copy to the temporary may be hidden
|
---|
169 | in a separately-compiled module where it can't be elided in cases where it is
|
---|
170 | not used, whereas this approach for Cforall always gives the compiler the
|
---|
171 | opportunity to optimize out the temporary when it is not needed.
|
---|
172 | Furthermore, one could imagine a post-increment operator that returned some
|
---|
173 | type `T2` that was implicitly convertable to `T` but less work than a full
|
---|
174 | copy of `T` to create (this seems like an absurdly niche case) - since the
|
---|
175 | type of `tmp` is inferred from the return type of `i += 1`, you could set up
|
---|
176 | functions with the following signatures to enable an equivalent pattern in
|
---|
177 | Cforall:
|
---|
178 |
|
---|
179 | lvalue T2 ?+=? (T*, one_t); // increment operator returns T2
|
---|
180 | void ?{} (T2*, T); // initialize T2 from T for use in `tmp = i`
|
---|
181 | void ?{safe} (T*, T2); // allow T2 to be used as a T when needed to
|
---|
182 | // preserve expected semantics of T x = y++;
|
---|