Context Navigation

source: doc/proposals/zero_one.md @ e1990f1

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationjenkins-sandboxnew-astnew-ast-unique-exprpthread-emulationqualifiedEnum

Last change on this file since e1990f1 was 5c6afcd, checked in by Aaron Moss <a3moss@…>, 8 years ago
Split out reference type and zero and one type proposals from working resolver doc
Property mode set to `100644`
File size: 8.8 KB

Line
1	## Types for 0 and 1 literals ##
2	The literals `0` and `1` are treated specially by Cforall, due to their
3	potential uses in operator overloading.
4	Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
5	multiple interpretations of them according to the existing variable
6	overloading rules, with the following declarations in the prelude:
7
8	const int 0, 1;
9	forall ( dtype DT ) const DT * const 0;
10	forall ( ftype FT ) FT * const 0;
11
12	This did, however, create some backward-compatibility problems and potential
13	performance issues, and works poorly for generic types. To start with, this
14	(entirely legal C) code snippet doesn't compile in Cforall:
15
16	if ( 0 ) {}
17
18	It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
19	`forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
20	not choose which `0` variable to take, because they're both exact matches.
21
22	The general `!=` computation may also be less efficient than a check for a zero
23	value; take the following example of a rational type:
24
25	struct rational { int32_t num, int32_t den };
26	rational 0 = { 0, 1 };
27
28	int ?!=? (rational a, rational b) {
29	return ((int64_t)a.num)b.den != ((int64_t)b.num)a.den;
30	}
31
32	int not_zero (rational a) { return a.num != 0; }
33
34	To check if two rationals are equal we need to do a pair of multiplications to
35	normalize them (the casts in the example are to prevent overflow), but to
36	check if a rational is non-zero we just need to check its numerator, a more
37	efficient operation.
38
39	Finally, though polymorphic null-pointer variables can be meaningfully
40	defined, most other polymorphic variables cannot be, which makes it difficult
41	to make generic types "truthy" using the existing system:
42
43	forall(otype T) struct pair { T x; T y; };
44	forall(otype T \| { T 0; }) pair(T) 0 = { 0, 0 };
45
46	Now, it seems natural enough to want to define the zero for this pair type as
47	a pair of the zero values of its element type (if they're defined).
48	The declaration of `pair(T) 0` above is actually illegal though, as there is
49	no way to represent the zero values of an infinite number of types in the
50	single memory location available for this polymorphic variable - the
51	polymorphic null-pointer variables defined in the prelude are legal, but that
52	is only because all pointers are the same size and the single zero value is a
53	legal value of all pointer types simultaneously; null pointer is, however,
54	somewhat unique in this respect.
55
56	The technical explanation for the problems with polymorphic zero is that `0`
57	is really a rvalue, not a lvalue - an expression, not an object.
58	Drawing from this, the solution we propose is to give `0` a new built-in type,
59	`zero_t`, and similarly give `1` the new built-in type `one_t`.
60	If the prelude defines `!=` over `zero_t` this solves the `if ( 0 )` problem,
61	because now the unambiguous best interpretation of `0 != 0` is to read them
62	both as `zero_t` (and say that this expression is false).
63	Backwards compatibility with C can be served by defining conversions in the
64	prelude from `zero_t` and `one_t` to `int` and the appropriate pointer
65	types, as below:
66
67	// int 0;
68	forall(otype T \| { void ?{safe}(T, int); }) void ?{safe} (T, zero_t);
69	forall(otype T \| { void ?{unsafe}(T, int); }) void ?{unsafe} (T, zero_t);
70
71	// int 1;
72	forall(otype T \| { void ?{safe}(T, int); }) void ?{safe} (T, one_t);
73	forall(otype T \| { void ?{unsafe}(T, int); }) void ?{unsafe} (T, one_t);
74
75	// forall(dtype DT) const DT* 0;
76	forall(dtype DT) void ?{safe}(const DT**, zero_t);
77	// forall(ftype FT) FT* 0;
78	forall(ftype FT) void ?{safe}(FT**, zero_t);
79
80	Further, with this change, instead of making `0` and `1` overloadable
81	variables, we can instead allow user-defined constructors (or, more flexibly,
82	safe conversions) from `zero_t`, as below:
83
84	// rational 0 = { 0, 1 };
85	void ?{safe} (rational *this, zero_t) { this->num = 0; this->den = 1; }
86
87	Note that we don't need to name the `zero_t` parameter to this constructor,
88	because its only possible value is a literal zero.
89	This one line allows `0` to be used anywhere a `rational` is required, as well
90	as enabling the same use of rationals in boolean contexts as above (by
91	interpreting the `0` in the desguraring to be a rational by this conversion).
92	Furthermore, while defining a conversion function from literal zero to
93	`rational` makes rational a "truthy" type able to be used in a boolean
94	context, we can optionally further optimize the truth decision on rationals as
95	follows:
96
97	int ?!=? (rational a, zero_t) { return a.num != 0; }
98
99	This comparison function will be chosen in preference to the more general
100	rational comparison function for comparisons against literal zero (like in
101	boolean contexts) because it doesn't require a conversion on the `0` argument.
102	Functions of the form `int ?!=? (T, zero_t)` can acutally be used in general
103	to make a type `T` truthy without making `0` a value which can convert to that
104	type, a capability not available in the current design.
105
106	This design also solves the problem of polymorphic zero for generic types, as
107	in the following example:
108
109	// ERROR: forall(otype T \| { T 0; }) pair(T) 0 = { 0, 0 };
110	forall(otype T \| { T 0; }) void ?{safe} (pair(T) *this, zero_t) {
111	this->x = 0; this->y = 0;
112	}
113
114	The polymorphic variable declaration didn't work, but this constructor is
115	perfectly legal and has the desired semantics.
116
117	We can assert that `T` can be used in a boolean context as follows:
118
119	`forall(otype T \| { int ?!=?(T, zero_t); })`
120
121	Since the C standard (6.5.16.1.1) specifically states that pointers can be
122	assigned into `_Bool` variables (and implies that other artithmetic types can
123	be assigned into `_Bool` variables), it seems natural to say that assignment
124	into a `_Bool` variable effectively constitutes a boolean context.
125	To allow this interpretation, I propose including the following function (or
126	its effective equivalent) in the prelude:
127
128	forall(otype T \| { int ?!=?(T, zero_t); })
129	void ?{safe}( _Bool this, T that ) { this = that != 0; }
130
131	Note that this conversion is not transitive; that is, for `t` a variable of
132	some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
133	of a lower-cost one), `(int)t;` would not use this conversion (and in fact
134	would not be legal in the absence of another valid way to convert a `T` to an
135	`int`), but `(int)(_Bool)t;` could legally use this conversion.
136
137	Similarly giving literal `1` the special type `one_t` allows for more
138	concise and consistent specification of the increment and decrement operators,
139	using the following de-sugaring:
140
141	++i => i += 1
142	i++ => (tmp = i, i += 1, tmp)
143	--i => i -= 1
144	i-- => (tmp = i, i -= 1, tmp)
145
146	In the examples above, `tmp` is a fresh temporary with its type inferred from
147	the return type of `i += 1`.
148	Under this proposal, defining a conversion from `one_t` to `T` and a
149	`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
150	for free in a consistent fashion (similarly for -= and the decrement
151	operators).
152	If a meaningful `1` cannot be defined for a type, both increment operators can
153	still be defined with the signature `lvalue T ?+=? (T*, one_t)`.
154	Similarly, if scalar addition can be performed on a type more efficiently than
155	by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
156	addition operator, it will simultaneously define consistent implementations of
157	both increment operators (this can also be accomplished by defining a
158	conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
159
160	To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
161	increment operator" assertions of the form `lvalue T ?+=? (T*, one_t)`,
162	we also define a non-transitive unsafe conversion from `_Bool` (allowable
163	values `0` and `1`) to `one_t` (and `zero_t`) as follows:
164
165	void ?{unsafe} (one_t*, _Bool) {}
166
167	As a note, the desugaring of post-increment above is possibly even more
168	efficient than that of C++ - in C++, the copy to the temporary may be hidden
169	in a separately-compiled module where it can't be elided in cases where it is
170	not used, whereas this approach for Cforall always gives the compiler the
171	opportunity to optimize out the temporary when it is not needed.
172	Furthermore, one could imagine a post-increment operator that returned some
173	type `T2` that was implicitly convertable to `T` but less work than a full
174	copy of `T` to create (this seems like an absurdly niche case) - since the
175	type of `tmp` is inferred from the return type of `i += 1`, you could set up
176	functions with the following signatures to enable an equivalent pattern in
177	Cforall:
178
179	lvalue T2 ?+=? (T*, one_t); // increment operator returns T2
180	void ?{} (T2*, T); // initialize T2 from T for use in `tmp = i`
181	void ?{safe} (T*, T2); // allow T2 to be used as a T when needed to
182	// preserve expected semantics of T x = y++;

Note: See TracBrowser for help on using the repository browser.

Download in other formats: