# Thoughts on Resolver Design #

## Conversions ##
C's implicit "usual arithmetic conversions" define a structure among the
built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of
_safe_ widening conversions.
There is also a set of _explicit_ conversions that are only allowed through a
cast expression.
Based on Glen's notes on conversions [1], I propose that safe and unsafe
conversions be expressed as constructor variants, though I make explicit
(cast) conversions a constructor variant as well rather than a dedicated
operator.
Throughout this article, I will use the following operator names for
constructors and conversion functions from `From` to `To`:

    void ?{} ( To*, To );            // copy constructor
    void ?{} ( To*, From );          // explicit constructor
    void ?{explicit} ( To*, From );  // explicit cast conversion
    void ?{safe} ( To*, From );      // implicit safe conversion
    void ?{unsafe} ( To*, From );    // implicit unsafe conversion

[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html

Glen's design made no distinction between constructors and unsafe implicit
conversions; this is elegant, but interacts poorly with tuples.
Essentially, without making this distinction, a constructor like the following
would add an interpretation of any two `int`s as a `Coord`, needlessly
multiplying the space of possible interpretations of all functions:

    void ?{}( Coord *this, int x, int y );

That said, it would certainly be possible to make a multiple-argument implicit
conversion, as below, though the argument above suggests this option should be
used infrequently:

    void ?{unsafe}( Coord *this, int x, int y );

An alternate possibility would be to only count two-arg constructors
`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and
explicit conversions should also have a compiler-enforced restriction to
ensure that they are two-arg functions (this restriction may be valuable
regardless).

Regardless of syntax, there should be a type assertion that expresses that
`From` is convertible to `To`.
If user-defined conversions are not added to the language,
`void ?{} ( To*, From )` may be a suitable representation, relying on
conversions on the argument types to account for transitivity.
On the other hand, `To*` should perhaps match its target type exactly, so
another assertion syntax specific to conversions may be required, e.g.
`From -> To`.

### Constructor Idiom ###
Basing our notion of conversions on otherwise normal Cforall functions means
that we can use the full range of Cforall features for conversions, including
polymorphism.
Glen [1] defines a _constructor idiom_ that can be used to create chains of
safe conversions without duplicating code; given a type `Safe` to which
members of another type `From` can be directly converted, the constructor
idiom allows us to write a conversion for any type `To` to which `Safe`
converts:

    forall(otype To | { void ?{safe}( To*, Safe ) })
    void ?{safe}( To *this, From that ) {
        Safe tmp = /* some expression involving that */;
        *this = tmp;  // uses assertion parameter
    }

This idiom can also be used, with only minor variations, for a parallel set of
unsafe conversions.

What selective non-use of the constructor idiom gives us is the ability to
define a conversion that may only be the *last* conversion in a chain of such
conversions.
Constructing a conversion graph able to unambiguously represent the full
hierarchy of implicit conversions in C is provably impossible using only
single-step conversions with no additional information (see Appendix B), but
this mechanism is sufficiently powerful (see [1], though the design there has
some minor bugs; the general idea is to use the constructor idiom to define
two chains of conversions, one among the signed integral types and another
among the unsigned, and to use monomorphic conversions to allow conversions
between signed and unsigned integer types).

### Implementation Details ###
It is desirable to have a system which can be efficiently implemented, yet
also one with sufficient power to distinguish between functions on all
possible axes of polymorphism.
The resulting ordering on functions may be a partial order, which may
complicate implementation somewhat; in this case it may be desirable to store
the set of implementations for a given function as the directed acyclic graph
(DAG) representing the order.

## Conversion Costs ##
Each possible resolution of an expression has a _cost_ tuple consisting of
the following components:

1. _unsafe_ conversion cost: summed degree of unsafe conversions; unlike
   CFA03, this is not a simple count of conversions (for symmetry with the
   safe conversions)
2. _polymorphic unifications_: count of parameters and return values bound to
   some polymorphic type for boxing
3. _type variables_: number of polymorphic type variables bound
4. negated _type specializations_: each type assertion specializes the
   polymorphism, thus decreasing the cost; nested polymorphic types (e.g.
   `T*`) are also counted as specializations
5. _safe_ conversions: summed degree of safe conversions
6. _qualifier_ conversions: summed degree of qualifier and reference
   conversions

These components are lexicographically ordered and can be summed
element-wise; summation starts at `(0, 0, 0, 0, 0, 0)`.

**TODO** update below for consistency with this

### Lvalue and Qualifier Conversions ###
C defines the notion of an _lvalue_, essentially an addressable object, as
well as a number of type _qualifiers_: `const`, `volatile`, and `restrict`.
As these type qualifiers are generally only meaningful to the type system as
applied to lvalues, the two concepts are closely related.
A `const` lvalue cannot be modified; the compiler cannot assume that a
`volatile` lvalue will not be concurrently modified by some other part of the
system; and a `restrict` lvalue must have pointer type, and the compiler may
assume that no other pointer in scope aliases that pointer (this is solely a
performance optimization, and may be ignored by implementers).
_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts
it to an expression result of type `T` (commonly called an _rvalue_ of type
`T`), also strips all the qualifiers from the lvalue, as an expression result
is a value, not an addressable object that can have properties like
immutability.
Though lvalue-to-rvalue conversion strips the qualifiers from lvalues,
derived rvalue types such as pointer types may include qualifiers;
`const int *` is a distinct type from `int *`, though the latter is safely
convertible to the former.
In general, any number of qualifiers can be safely added to the
pointed-to-type of a pointer type, e.g. `int *` converts safely to
`const int *` and `volatile int *`, both of which convert safely to
`const volatile int *`.

Since lvalues are precisely "addressable objects", in C only lvalues can be
used as the operand of the `&` address-of operator.
Similarly, only modifiable lvalues may be used as the assigned-to
operand of the mutating operators: assignment, compound assignment
(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without
the `const` qualifier are modifiable, but lvalues of incomplete types, array
types, and struct or union types with `const` members are also not modifiable.
Lvalues are produced by the following expressions: object identifiers
(function identifiers are not considered to be lvalues), the result of the `*`
dereference operator applied to an object pointer, the result of a member
expression `s.f` if the left argument `s` is an lvalue (note that the
preceding two rules imply that the results of indirect member expressions
`s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the
indexing operator `a[i]` (similarly, by its desugaring to `*((a)+(i))`).
Somewhat less obviously, parenthesized lvalue expressions, string literals,
and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.

All of the conversions described above are defined in standard C, but Cforall
requires further features from its type system.
In particular, to allow overloading of the `*?` and `?[?]` dereferencing and
indexing operators, Cforall requires a way to declare that the functions
defining these operators return lvalues; since C functions never return
lvalues, and since for syntactic reasons we wish to distinguish functions
which return lvalues from functions which return pointers, this is of
necessity an extension to standard C.
In the current design, an `lvalue` qualifier can be added to function return
types (and only to function return types), the effect of which is to return a
pointer which is implicitly dereferenced by the caller.
C++ includes the more general concept of _references_, which are typically
implemented as implicitly dereferenced pointers as well.
Another use case which C++ references support is providing a way to pass
function parameters by reference (rather than by value) with a natural
syntax; Cforall in its current state has no such mechanism.
As an example, consider the following (currently typical) copy-constructor
signature and call:

    void ?{}(T *lhs, T rhs);

    T x;
    T y = { x };

Note that the right-hand argument is passed by value, and would in fact be
copied twice in the course of the constructor call `T y = { x };` (once into
the parameter by C's standard `memcpy` semantics, and once again in the body
of the copy constructor, though it is possible that return value optimization
will elide the `memcpy`-style copy).
However, to pass by reference using the existing pointer syntax, the example
above would look like this:

    void ?{}(T *lhs, const T *rhs);

    T x;
    T y = { &x };

This example is not even as bad as it could be; assuming pass-by-reference is
the desired semantics for the `?+?` operator, that implies the following
design today:

    T ?+?(const T *lhs, const T *rhs);

    T a, b;
    T c = &a + &b;

In addition to `&a + &b` being unsightly and confusing syntax for adding `a`
and `b`, it also introduces a possible ambiguity with pointer arithmetic on
`T*` which can only be resolved by return-type inference.

Pass-by-reference and marking functions as returning lvalues instead of the
usual rvalues are actually closely related concepts, as obtaining a reference
to pass depends on the referenced object being addressable, i.e. an lvalue,
and lvalue return types are effectively return-by-reference.
Cforall should also unify the two concepts, with a parameterized type for
"reference to `T`", which I will write `ref T`.
Syntax bikeshedding can be done later (there are some examples at the bottom
of this section), but `ref T` is sufficiently distinct from both the existing
`lvalue T` (which it subsumes) and the closely related C++ `T&` to allow
independent discussion of its semantics.

Firstly, assignment to a function parameter as part of a function call and
local variable initialization have almost identical semantics, so they should
be treated similarly for the reference type too; this implies we should be
able to declare local variables of reference type, as in the following:

    int x = 42;
    ref int r = x;  // r is now an alias for x

Unlike in C++, we would like to have the capability to re-bind references
after initialization, as this allows the attractive syntax of references to
support some further useful code patterns, such as first initializing a
reference after its declaration.
Constant references to `T` (`const ref T`) should not be re-bindable.

One option for re-binding references is to use a dedicated operator, as in the
code example below:

    int i = 42, j = 7;
    ref int r = i;  // bind r to i
    r = j;          // set i (== r) to 7
    r := j;         // rebind r to j using the new := rebind operator
    i = 42;         // reset i (!= r) to 42
    assert( r == 7 );

The other syntactic option for reference re-bind would be to overload
assignment and use type inference on the left- and right-hand sides to
determine whether the referred-to variable on the left should be reassigned to
the value on the right, or whether the reference on the left should be aliased
to the reference on the right.
This could be disambiguated with casts, as in the following code example:

    int i;
    int j;
    ref int r = i;   // (0a)
    ref int s = i;   // (0b)

    i = j;           // (1)
    i = (int)s;      // (2)
    i = s;           // (3)
    // ---------------------
    r = s;           // (4)
    r = (ref int)j;  // (5)
    // ---------------------
    r = j;           // (6)
    r = (int)s;      // (7)

By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as
aliases for `i`.
For C compatibility, (1) has to be assignment; in general, any assignment to a
non-reference type should be assignment, so (2) and (3) are as well.
By types, (4) and (5) should have the same semantics, and the semantics of (6)
and (7) should match as well.
This suggests that (4) and (5) are reference re-bind, and (6) and (7) are
assignment to the referred-to variable; this makes the syntax for explicitly
aliasing a local variable rather ugly (and inconsistent with the
initialization syntax), as well as making it rather awkward to copy the value
stored in one reference-type variable into another reference-type variable
(which is likely more painful in functions with by-reference parameters than
with local variables of reference type).

Because of the aforementioned issues with overloading assignment as reference
rebind, in addition to the fact that reference rebind should not be a
user-overloadable operator (unlike assignment), I propose that reference
rebind should have its own dedicated operator.

The semantics and restrictions of `ref T` are effectively the semantics of an
lvalue of type `T`, and by this analogy there should be a safe,
qualifier-dropping conversion from `ref const volatile restrict T` (and every
other qualifier combination on the `T` in `ref T`) to `T`.
With this conversion, the resolver may type most expressions that C would
call "lvalue of type `T`" as `ref T`.
There is also an obvious argument that lvalues of a (possibly-qualified) type
`T` should be convertible to references of type `T`, where `T` is also
so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to
`ref const char`).
By similar arguments to pointer types, qualifiers should be addable to the
referred-to type of a reference (e.g. `ref int` to `ref const int`).
As a note, since pointer arithmetic is explicitly not defined on `ref T`,
`restrict ref T` should be allowable and would have alias-analysis rules that
are actually comprehensible to mere mortals.

Using pass-by-reference semantics for function calls should not put syntactic
constraints on how the function is called; in particular, temporary values
should be able to be passed by reference.
The mechanism for this pass-by-reference would be to store the value of the
temporary expression into a new unnamed temporary, and pass the reference to
that temporary to the function.
As an example, the following code should all compile and run:

    void f(ref int x) { printf("%d\n", x++); }

    int i = 7, j = 11;
    const int answer = 42;

    f(i);       // (1)
    f(42);      // (2)
    f(i + j);   // (3)
    f(answer);  // (4)

The semantics of (1) are just like C++'s: "7" is printed, and `i` has the
value 8 afterward.
For (2), "42" is printed, and the increment of the unnamed temporary to 43 is
not visible to the caller; (3) behaves similarly, printing "19", but not
changing `i` or `j`.
(4) is a bit of an interesting case; we want to be able to support named
constants like `answer` that can be used anywhere the constant expression
they're replacing (like `42`) could go; in this sense, (4) and (2) should have
the same semantics.
However, we don't want the mutation of the `x` parameter to be visible in
`answer` afterward, because `answer` is a constant, and thus shouldn't change.
The solution to this is to allow chaining of the two `ref` conversions;
`answer` has the type `ref const int`, which can be converted to `int` by the
lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int`
by the temporary-producing rvalue-to-lvalue conversion.
Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42),
mutated by `f`, then discarded; "42" is printed, just as in case (2), and
`answer` still equals 42 after the call, because it was the temporary that was
mutated, not `answer`.
It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while
`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in
C++, leading to the dreaded "const hell"), but the behaviour of this rule can
be determined by examining local scope with the simple rule "non-`const`
references to `const` variables produce temporaries", which aligns with
programmer intuition that `const` variables cannot be mutated.

To bikeshed syntax for `ref T`, there are three basic options: language
keywords (`lvalue T` is already in Cforall), compiler-supported "special"
generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++
programmers).
Keyword- or generic-based approaches run the risk of name conflicts with
existing code, while any sigil used would have to be carefully chosen to not
create parsing conflicts.

**TODO** Consider arguments for move semantics and see if there is a
compelling case for rvalue references.

### Conversion Operator Costs ###
Copy constructors, safe conversions, and unsafe conversions all have an
associated conversion cost, calculated according to the algorithm below:

1. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
2. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
3. Monomorphic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
4. Polymorphic conversion operators (or copy constructors) have a conversion
   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic
   equivalent and the sum of the conversion costs of all conversion operators
   passed as assertion parameters, but where the fourth "count" element of
   the cost tuple is fixed to `1`.

**TODO** Polymorphism cost may need to be reconsidered in the light of the
thoughts on polymorphism below.

**TODO** You basically just want path length in the conversion graph implied
by the set of conversions; the only tricky questions are whether or not you
can account for "mixed" safe and unsafe conversions used to satisfy
polymorphic constraints, whether a polymorphic conversion should cost more
than a monomorphic one, and whether to account for non-conversion constraints
in the polymorphism cost.

### Argument-Parameter Matching ###
Given a function `f` with a parameter list (after tuple flattening)
`(T1 t1, T2 t2, ... Tn tn)`, and a function application
`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the
appropriate parameter is calculated according to the algorithm below:

Given a parameter `t` of type `T` and an expression `<e>` from these lists,
`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with
associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`.
(If any `Ei` is a tuple type, replace it with its first flattened element for
the purposes of this section.)

The cost of matching the interpretation of `<e>` with type `Ei` to `t` with
type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the
conversion operator cost from `Ei` to `T`.

### Object Initialization ###
The cost to initialize an object is calculated very similarly to
argument-parameter matching, with a few modifications.
Firstly, explicit constructors are included in the set of available
conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic
conversion costs (if applicable) plus the _interpretation cost_ of the
constructor, i.e. the sum of the argument-parameter matching costs for its
parameters.
Also, ties in overall cost (interpretation cost plus conversion cost) are
broken by lowest conversion cost (i.e. of alternatives with the same overall
cost, copy constructors are preferred to other explicit constructors, and
explicit constructors are preferred to safe conversions, which are preferred
to unsafe conversions).
An object initialization is properly typed if it has exactly one
minimum-cost interpretation.

### Explicit Casts ###
Explicit casts are handled similarly to object initialization.
Copy constructors and other explicit constructors are not included in the set
of possible conversions, though interpreting a cast as type ascription
(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost
`(0, 0, 0, 0)`.
Explicit conversion operators are also included in the set of possible
conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion
costs are invoked.
Unlike for explicit constructors and other functions, implicit conversions are
never applied to the argument or return type of an explicit cast operator, so
that the cast may be used more effectively as a way for the programmer to
guide type resolution.

## Trait Satisfaction ##
A _trait_ consists of a list of _type variables_ along with a (possibly empty)
set of _assertions_ on those variables.
Assertions can take two forms, _variable assertions_ and the more common
_function assertions_, as in the following example:

    trait a_trait(otype T, otype S) {
        T a_variable_assertion;
        S* another_variable_assertion;
        S a_function_assertion( T* );
    };

Variable assertions enforce that a variable with the given name and type
exists (the type is generally one of the type variables, or derived from one),
while a function assertion enforces that a function with a
_compatible signature_ to the provided function exists.

To test if some list of types _satisfies_ the trait, the types are first
_bound_ to the type variables, and then declarations to satisfy each
assertion are sought out.
Variable assertions require an exact match, because they are passed as object
pointers, and there is no mechanism to employ conversion functions, while
function assertions only require a function that can be wrapped to a
compatible type; for example, the declarations below satisfy
`a_trait(int, short)`:

    int a_variable_assertion;
    short* another_variable_assertion;
    char a_function_assertion( void* );
    // int* may be implicitly converted to void*, and char to short, so the
    // above works

Cforall polymorphic functions have a _constraining trait_, denoted as follows:

    forall(otype A, otype B | some_trait(A, B))

The trait may be anonymous, with the same syntax as a trait declaration, and
traits may be unioned together using `|` or `,`.

**TODO** Consider including field assertions in the list of constraint types,
as well as associated types and the appropriate matching type assertion.

## Polymorphism Costs ##
The type resolver should prefer functions that are "less polymorphic" to
functions that are "more polymorphic".
Determining how to order functions by degree of polymorphism is somewhat less
straightforward, though, as there are multiple axes of polymorphism and it is
not always clear how they compose.
The natural order for degree of polymorphism is a partial order, and this
section includes some open questions on whether it is desirable or feasible to
develop a tie-breaking strategy to impose a total order on the degree of
polymorphism of functions.
Helpfully, though, the degree of polymorphism is a property of functions
rather than function calls, so any complicated graph structure or calculation
representing a (partial) order over function degree of polymorphism can be
calculated once and cached.

### Function Parameters ###
All other things being equal, if a parameter of one function has a concrete
type and the equivalent parameter of another function has a dynamic type, the
first function is less polymorphic:

    void f( int, int );                    // (0) least polymorphic
    forall(otype T) void f( T, int );      // (1a) more polymorphic than (0)
    forall(otype T) void f( int, T );      // (1b) more polymorphic than (0)
                                           //      incomparable with (1a)
    forall(otype T) void f( T, T );        // (2) more polymorphic than (1a/b)

This should extend to parameterized types (pointers and generic types) also:

    forall(otype S) struct box { S val; };

    forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
    forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
    forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
                                           //     incomparable with (3)
    forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)

Every function in the group above is incomparable with (1a/b), but that's fine
because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur
much in practice (unless there are safe or unsafe conversions defined between
the possible argument types).

For degree of polymorphism from arguments, I think we should not distinguish
between different type parameters, e.g. the following should be considered
equally polymorphic:

    forall(otype T, otype S) void f( T, T, S );  // (7)
    forall(otype T, otype S) void f( S, T, T );  // (8)

However parameter lists are compared, parameters of multi-parameter generic
types should ideally be treated as a recursive case, e.g. in the example
below, (9) is less polymorphic than (10), which is less polymorphic than (11):

    forall(otype T, otype S) struct pair { T x; S y; };

    void f( pair(int, int) );                // (9)
    forall(otype T) void f( pair(T, int) );  // (10)
    forall(otype T) void f( pair(T, T) );    // (11)

511 | Parameter comparison could possibly be made somewhat cheaper at loss of some
|
---|
512 | precision by representing each parameter as a value from the natural numbers
|
---|
513 | plus infinity, where infinity represents a monomorphic parameter and a finite
|
---|
514 | number counts how many levels deep the shallowest type variable is, e.g. where
|
---|
515 | `T` is a type variable, `int` would have value infinity, `T` would have value
|
---|
516 | 0, `T*` would have value 1, `box(T)*` would have value 2, etc.
|
---|
517 | Under this scheme, higher values represent less polymorphism.
|
---|
518 | This makes the partial order on parameters a total order, so that many of the
|
---|
519 | incomparable functions above compare equal, though that is perhaps a virtue.
|
---|
520 | It also loses the ability to differentiate between some multi-parameter
|
---|
521 | generic types, such as the parameters in (10) and (11), which would both be
|
---|
522 | valued 1, losing the polymorphism distinction between them.
|
---|
523 |
|
---|
524 | A variant of the above scheme would be to fix a maximum depth of polymorphic
|
---|
525 | type variables (16 seems like a reasonable choice) at which a parameter would
|
---|
526 | be considered to be effectively monomorphic, and to subtract the value
|
---|
527 | described above from that maximum, clamping the result to a minimum of 0.
|
---|
528 | Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has
|
---|
529 | value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0,
|
---|
530 | the same as `int`.
|
---|
531 | This can be quite succinctly represented, and summed without the presence of a
|
---|
532 | single monomorphic parameter pushing the result to infinity, but does lose the
|
---|
533 | ability to distinguish between very deeply structured polymorphic types.
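
Both valuation schemes can be sketched in Python; this is a hypothetical model
of the notes above, not the resolver's actual representation. Types are nested
tuples, `depth` finds the shallowest type variable, and `clamped` is the
fixed-maximum variant:

```python
import math

# Hypothetical term representation: a type is the string "int" (standing for
# any monomorphic type), the string "T" (a type variable), or a pair
# (constructor, argument-types), e.g. ("ptr", ["T"]) for `T*` or
# ("box", ["T"]) for `box(T)`.
def depth(ty):
    """Depth of the shallowest type variable; infinity if monomorphic."""
    if isinstance(ty, str):
        return 0 if ty == "T" else math.inf
    _, args = ty
    return 1 + min(depth(a) for a in args)

def clamped(ty, max_depth=4):
    """Variant scheme: subtract from a fixed maximum, clamping at 0,
    so that higher values mean *more* polymorphism."""
    return max(max_depth - min(depth(ty), max_depth), 0)

ptr = lambda t: ("ptr", [t])
box = lambda t: ("box", [t])

assert depth("int") == math.inf    # `int`
assert depth("T") == 0             # `T`
assert depth(ptr("T")) == 1        # `T*`
assert depth(ptr(box("T"))) == 2   # `box(T)*`

assert clamped("int") == 0           # `int`
assert clamped("T") == 4             # `T`
assert clamped(ptr("T")) == 3        # `T*`
assert clamped(ptr(box("T"))) == 2   # `box(T)*`
assert clamped(ptr(ptr(box(ptr("T"))))) == 0  # `box(T*)**`, same as `int`
```
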

### Parameter Lists ###
A partial order on function parameter lists can be produced by the product
order of the partial orders on parameters described above.
In more detail, this means that for two parameter lists with the same arity,
if any pair of corresponding parameters are incomparable with respect to each
other, the two parameter lists are incomparable; if in all pairs of
corresponding parameters one list's parameter is always (less than or) equal
to the other list's parameter, then the first parameter list is (less than or)
equal to the second parameter list; otherwise the lists are incomparable with
respect to each other.

How to compare parameter lists of different arity is a somewhat open question.
A simple, but perhaps somewhat unsatisfying, solution would be just to say
that such lists are incomparable.
The simplest approach to make them comparable is to say that, given two lists
`(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists
can be compared based on their shared prefix of `n` types.
This approach breaks the transitivity property of the equivalence relation on
the partial order, though, as seen below:

    forall(otype T) void f( T, int );      // (1a)
    forall(otype T) void f( T, int, int ); // (12)
    forall(otype T) void f( T, int, T );   // (13)

By this rule, (1a) is equally polymorphic to both (12) and (13), so by
transitivity (12) and (13) should also be equally polymorphic, but that is not
actually the case.

We can fix the rule by saying that `(T1 ... Tn)` can be compared to
`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting
notional monomorphic parameters.
In this case, (1a) and (12) are equally polymorphic, because (1a) gets
extended with a monomorphic type that compares equal to (12)'s third `int`
parameter, but (1a) is less polymorphic than (13), because its notional
monomorphic third parameter is less polymorphic than (13)'s `T`.
Essentially what this rule says is that any parameter list with more
parameters is no less polymorphic than one with fewer.
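
As a sketch of both the product order and this arity-extension rule, the
following hypothetical Python model abbreviates each parameter to its clamped
polymorphism value (maximum depth 4, so `T` is 4 and `int` is 0) and pads the
shorter list with notional monomorphic parameters before comparing:

```python
def product_order(xs, ys, cmp):
    """Lift an order on parameters (cmp returns -1, 0, 1, or None for
    incomparable) to the product order on parameter lists, first extending
    the shorter list with notional monomorphic parameters."""
    MONO = 0  # clamped polymorphism value of a monomorphic parameter
    n = max(len(xs), len(ys))
    xs = list(xs) + [MONO] * (n - len(xs))
    ys = list(ys) + [MONO] * (n - len(ys))
    cs = [cmp(x, y) for x, y in zip(xs, ys)]
    if None in cs:
        return None
    le = all(c <= 0 for c in cs)
    ge = all(c >= 0 for c in cs)
    if le and ge:
        return 0    # equally polymorphic
    if le:
        return -1   # first list less polymorphic
    if ge:
        return 1    # first list more polymorphic
    return None     # incomparable

# Element order: compare clamped polymorphism values numerically.
num = lambda x, y: (x > y) - (x < y)

# (1a) = (T, int), (12) = (T, int, int), (13) = (T, int, T):
assert product_order([4, 0], [4, 0, 0], num) == 0      # (1a) equal to (12)
assert product_order([4, 0], [4, 0, 4], num) == -1     # (1a) less polymorphic than (13)
assert product_order([4, 0, 0], [4, 0, 4], num) == -1  # transitivity restored
assert product_order([4, 0], [0, 4], num) is None      # (T, int) vs (int, T): incomparable
```

Note that even with a total order on individual parameters, lists can still be
incomparable under the product order, as the last assertion shows.
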

We could collapse this parameter list ordering to a succinct total order by
simply taking the sum of the clamped parameter polymorphism counts, but this
would again make most incomparable parameter lists compare equal, as well as
having the potential for some unexpected results based on the (completely
arbitrary) value chosen for "completely polymorphic".
For instance, if we set 4 to be the maximum depth of polymorphism (as above),
the following functions would be equally polymorphic, which is a somewhat
unexpected result:

    forall(otype T) void g( T, T, T, int );   // 4 + 4 + 4 + 0 = 12
    forall(otype T) void g( T*, T*, T*, T* ); // 3 + 3 + 3 + 3 = 12

These functions would also be considered equally polymorphic:

    forall(otype T) void g( T, int );   // 4 + 0 = 4
    forall(otype T) void g( T**, T** ); // 2 + 2 = 4

This issue can be mitigated by choosing a larger maximum depth of
polymorphism, but this scheme does have the distinct disadvantage of either
specifying the (completely arbitrary) maximum depth as part of the language or
allowing the compiler to refuse to accept otherwise well-typed deeply-nested
polymorphic types.

For purposes of determining polymorphism, the list of return types of a
function should be treated like another parameter list, and combined with the
degree of polymorphism from the parameter list in the same way that the
parameters in the parameter list are combined.
For instance, in the following, (14) is less polymorphic than (15), which is
less polymorphic than (16):

    forall(otype T) int f( T ); // (14)
    forall(otype T) T* f( T );  // (15)
    forall(otype T) T f( T );   // (16)

### Type Variables and Bounds ###
Degree of polymorphism doesn't depend solely on the parameter lists, though.
Glen's thesis (4.4.4, p.89) gives an example showing that it also depends on
the number of type variables:

    forall(otype T) void f( T, int );        // (1a) polymorphic
    forall(otype T) void f( T, T );          // (2)  more polymorphic
    forall(otype T, otype S) void f( T, S ); // (17) most polymorphic

Clearly the `forall` type parameter list needs to factor into the calculation
of degree of polymorphism as well, as it's the only real differentiation
between (2) and (17).
The simplest way to include the type parameter list would be to count the
type variables and say that functions with more type variables are more
polymorphic.

However, it also seems natural that more-constrained type variables should be
counted as "less polymorphic" than less-constrained type variables.
This would allow our resolver to pick more specialized (and presumably more
efficient) implementations of functions where one exists.
For example:

    forall(otype T | { void g(T); }) T f( T ); // (18) less polymorphic
    forall(otype T) T f( T );                  // (16) more polymorphic

We could account for this by counting the number of unique constraints and
saying that functions with more constraints are less polymorphic.

That said, we do model the `forall` constraint list as a (possibly anonymous)
_trait_, and say that each trait is a set of constraints, so we could
presumably define a partial order over traits based on subset inclusion, and
use this partial order instead of the weaker count of constraints to order the
list of type parameters of a function, as below:

    trait has_g(otype T) { void g(T); };
    trait has_h(otype S) { void h(S); };
    trait has_gh(otype R | has_g(R) | has_h(R)) {};
    // has_gh is equivalent to { void g(R); void h(R); }

    forall(otype T | has_gh(T)) T f( T ); // (19)  least polymorphic
    forall(otype T | has_g(T)) T f( T );  // (18)  more polymorphic than (19)
    forall(otype T | has_h(T)) T f( T );  // (18b) more polymorphic than (19)
                                          //       incomparable with (18)
    forall(otype T) T f( T );             // (16)  most polymorphic

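
The subset-inclusion order can be sketched as follows; this hypothetical model
represents traits as sets of constraint-signature strings and assumes the
signatures have already been normalized up to type-variable renaming (the hard
part, as discussed next):

```python
# Traits from the example above, as sets of (pre-normalized) constraint
# signatures; subset inclusion gives the partial order, and a trait with a
# strict superset of constraints is *less* polymorphic.
has_g  = frozenset({"void g(T)"})
has_h  = frozenset({"void h(T)"})
has_gh = has_g | has_h           # declared as has_g(R) | has_h(R)
unconstrained = frozenset()      # plain forall(otype T)

def more_polymorphic(a, b):
    """a is more polymorphic than b if a's constraints are a strict subset."""
    return a < b

assert more_polymorphic(has_g, has_gh)         # (18)  more polymorphic than (19)
assert more_polymorphic(has_h, has_gh)         # (18b) more polymorphic than (19)
assert more_polymorphic(unconstrained, has_g)  # (16)  more polymorphic than (18)
# (18) and (18b) are incomparable: neither constraint set contains the other.
assert not (has_g <= has_h) and not (has_h <= has_g)
```
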

The tricky bit with this is figuring out how to compare the constraint
functions for equality up to type variable renaming; I suspect there's a known
solution, but don't know what it is (perhaps some sort of unification
calculation, though I hope there's a more lightweight option).
We should also be able to take advantage of the programmer-provided trait
subset information (like the constraint on `has_gh` in the example) to more
efficiently generate the partial-order graph for traits, which should be
cacheable for efficiency.

Combining the count of type variables with the (partial) order on the traits
constraining those variables seems like it should be a fairly straightforward
product ordering to me - one `forall` qualifier is (less than or) equal to
another if it has both a (less than or) equal number of type variables and a
(less than or) equal degree of polymorphism from its constraining trait; the
two qualifiers are incomparable otherwise.
If an easier-to-calculate total ordering is desired, it might be acceptable to
use the number of type variables, with ties broken by the number of
constraints.

Similarly, to combine the (partial) orders on parameter and return lists with
the (partial) order on `forall` qualifiers, a product ordering seems like the
reasonable choice, though if we wanted a total order a reasonable choice would
be to use whatever method we use to combine parameter costs into parameter
lists to combine the costs for the parameter and return lists, then break
ties by the order on the `forall` qualifiers.

## Expression Costs ##

### Variable Expressions ###
Variables may be overloaded; that is, there may be multiple distinct variables
with the same name so long as each variable has a distinct type.
The variable expression `x` has one zero-cost interpretation as type `T` for
each variable `T x` in scope.

### Member Selection Expressions ###
For every interpretation `I` of `e` which has a struct or union type `S`,
`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the
same cost as `I`.
Note that there may be more than one member of `S` with the same name, as per
Cforall's usual overloading rules.
The indirect member expression `e->y` is desugared to `(*e).y` and interpreted
analogously.

**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no
interpretations as `e.y` exist.

### Address Expressions ###
Address expressions `&e` have an interpretation for each interpretation `I` of
`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`.
Lvalues result from variable expressions, member selection expressions, or
application of functions returning an lvalue-qualified type.
Note that the dereference operator is overloadable, so the rules for its
resolution follow those for function application below.

**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a
new temporary holding `42` and takes its address.

### Boolean-context Expressions ###
C has a number of "boolean contexts", where expressions are assigned a truth
value; these include both arguments to the short-circuiting `&&` and `||`
operators, as well as the conditional expressions in `if` and `while`
statements, the middle expression in `for` statements, and the first argument
to the `?:` ternary conditional operator.
In all these contexts, C interprets `0` (which is both an integer and a null
pointer literal) as false, and all other integer or pointer values as true.
In this spirit, Cforall allows other types to be considered "truthy" if they
support the following desugaring in a conditional context (see notes on the
interpretation of literal `0` below):

    x => ((int)( x != 0 ))

### Literal Expressions ###
Literal expressions (e.g. `42`, `'c'`, `3.14`, `"Hello, world!"`) have one
zero-cost interpretation with the same type the expression would have in C,
with three exceptions:

Character literals like `'x'` are typed as `char` in Cforall, not `int` as in
C.
This change breaks very little C code (primarily `sizeof 'x'`; the implicit
conversion from `int` to `char` and lack of overloading handle most other
expressions), matches the behaviour of C++, and is more compatible with
programmer intuition.

The literals `0` and `1` are also treated specially by Cforall, due to their
potential uses in operator overloading.
Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
multiple interpretations of them according to the existing variable
overloading rules, with the following declarations in the prelude:

    const int 0, 1;
    forall ( dtype DT ) const DT * const 0;
    forall ( ftype FT ) FT * const 0;

This did, however, create some backward-compatibility problems and potential
performance issues, and works poorly for generic types. To start with, this
(entirely legal C) code snippet doesn't compile in Cforall:

    if ( 0 ) {}

It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
`forall(dtype DT) DT*` have a `!=` operator which returns `int`, the resolver
cannot choose which `0` variable to take, because both are exact matches.

The general `!=` computation may also be less efficient than a check for a
zero value; take the following example of a rational type:

    struct rational { int32_t num; int32_t den; };
    rational 0 = { 0, 1 };

    int ?!=? (rational a, rational b) {
        return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
    }

    int not_zero (rational a) { return a.num != 0; }

To check if two rationals are equal we need to do a pair of multiplications
to normalize them (the casts in the example are to prevent overflow), but to
check if a rational is non-zero we just need to check its numerator, a more
efficient operation.
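
The same contrast can be sketched in Python (a hypothetical stand-in for the
Cforall code above, with rationals as `(num, den)` pairs): the general
inequality needs a pair of cross-multiplications, while the zero test
inspects only the numerator:

```python
def rat_ne(a, b):
    """a != b for rationals (num, den), by cross-multiplication."""
    return a[0] * b[1] != b[0] * a[1]

def not_zero(a):
    """a != 0, checking only the numerator."""
    return a[0] != 0

assert rat_ne((1, 2), (2, 3))       # 1/2 != 2/3
assert not rat_ne((1, 2), (2, 4))   # 1/2 == 2/4 after normalization
assert not_zero((1, 2))
assert not not_zero((0, 5))         # 0/5 is zero, whatever the denominator
```
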

Finally, though polymorphic null-pointer variables can be meaningfully
defined, most other polymorphic variables cannot be, which makes it difficult
to make generic types "truthy" using the existing system:

    forall(otype T) struct pair { T x; T y; };
    forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };

Now, it seems natural enough to want to define the zero for this pair type as
a pair of the zero values of its element type (if they're defined).
The declaration of `pair(T) 0` above is actually illegal, though, as there is
no way to represent the zero values of an infinite number of types in the
single memory location available for this polymorphic variable - the
polymorphic null-pointer variables defined in the prelude are legal, but that
is only because all pointers are the same size and the single zero value is a
legal value of all pointer types simultaneously; the null pointer is, however,
somewhat unique in this respect.

The technical explanation for the problems with polymorphic zero is that `0`
is really an rvalue, not an lvalue - an expression, not an object.
Drawing from this, the solution we propose is to give `0` a new built-in type,
`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in
type `_unit_t`.
If the prelude defines `!=` over `_zero_t`, this solves the `if ( 0 )`
problem, because now the unambiguous best interpretation of `0 != 0` is to
read both literals as `_zero_t` (and say that this expression is false).
Backwards compatibility with C can be served by defining conversions in the
prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer
types, as below:

    // int 0;
    forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
    forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);

    // int 1;
    forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
    forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);

    // forall(dtype DT) const DT* 0;
    forall(dtype DT) void ?{safe}(const DT**, _zero_t);
    // forall(ftype FT) FT* 0;
    forall(ftype FT) void ?{safe}(FT**, _zero_t);

Further, with this change, instead of making `0` and `1` overloadable
variables, we can instead allow user-defined constructors (or, more flexibly,
safe conversions) from `_zero_t`, as below:

    // rational 0 = { 0, 1 };
    void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }

Note that we don't need to name the `_zero_t` parameter to this constructor,
because its only possible value is a literal zero.
This one line allows `0` to be used anywhere a `rational` is required, as well
as enabling the same use of rationals in boolean contexts as above (by
interpreting the `0` in the desugaring to be a rational by this conversion).
Furthermore, while defining a conversion function from literal zero to
`rational` makes rational a "truthy" type able to be used in a boolean
context, we can optionally further optimize the truth decision on rationals as
follows:

    int ?!=? (rational a, _zero_t) { return a.num != 0; }

This comparison function will be chosen in preference to the more general
rational comparison function for comparisons against literal zero (like in
boolean contexts) because it doesn't require a conversion on the `0` argument.
Functions of the form `int ?!=? (T, _zero_t)` can actually be used in general
to make a type `T` truthy without making `0` a value which can convert to that
type, a capability not available in the current design.

This design also solves the problem of polymorphic zero for generic types, as
in the following example:

    // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
    forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
        this->x = 0; this->y = 0;
    }

The polymorphic variable declaration didn't work, but this constructor is
perfectly legal and has the desired semantics.

We can assert that `T` can be used in a boolean context as follows:

    forall(otype T | { int ?!=?(T, _zero_t); })

Since the C standard (6.5.16.1.1) specifically states that pointers can be
assigned into `_Bool` variables (and implies that other arithmetic types can
be assigned into `_Bool` variables), it seems natural to say that assignment
into a `_Bool` variable effectively constitutes a boolean context.
To allow this interpretation, I propose including the following function (or
its effective equivalent) in the prelude:

    forall(otype T | { int ?!=?(T, _zero_t); })
    void ?{safe}( _Bool *this, T that ) { *this = that != 0; }

Note that this conversion is not transitive; that is, for `t` a variable of
some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
of a lower-cost one), `(int)t;` would not use this conversion (and in fact
would not be legal in the absence of another valid way to convert a `T` to an
`int`), but `(int)(_Bool)t;` could legally use this conversion.

Similarly, giving literal `1` the special type `_unit_t` allows for more
concise and consistent specification of the increment and decrement operators,
using the following desugaring:

    ++i => i += 1
    i++ => (tmp = i, i += 1, tmp)
    --i => i -= 1
    i-- => (tmp = i, i -= 1, tmp)

In the examples above, `tmp` is a fresh temporary with its type inferred from
the return type of `i += 1`.
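
The post-increment desugaring can be modeled in Python; the hypothetical
`Counter` class stands in for a type with `+=` defined, and the explicit
temporary captures the old value before the update, becoming the value of the
whole expression:

```python
class Counter:
    """Stand-in for a type with an increment operator defined."""
    def __init__(self, n):
        self.n = n
    def incr(self):        # models `i += 1`, returning the updated value
        self.n += 1
        return self.n

def post_increment(i):
    """Models `i++` => (tmp = i, i += 1, tmp)."""
    tmp = i.n   # tmp = i   (copy taken before the update)
    i.incr()    # i += 1
    return tmp  # the comma expression yields the old value

c = Counter(5)
assert post_increment(c) == 5  # the expression yields the pre-increment value
assert c.n == 6                # while the counter itself has been updated
```
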

Under this proposal, defining a conversion from `_unit_t` to `T` and a
`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
for free in a consistent fashion (similarly for `-=` and the decrement
operators).
If a meaningful `1` cannot be defined for a type, both increment operators can
still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`.
Similarly, if scalar addition can be performed on a type more efficiently than
by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
addition operator, it will simultaneously define consistent implementations of
both increment operators (this can also be accomplished by defining a
conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).

To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`,
we also define a non-transitive unsafe conversion from `_Bool` (allowable
values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:

    void ?{unsafe} (_unit_t*, _Bool) {}

As a note, the desugaring of post-increment above is possibly even more
efficient than that of C++ - in C++, the copy to the temporary may be hidden
in a separately-compiled module where it can't be elided in cases where it is
not used, whereas this approach for Cforall always gives the compiler the
opportunity to optimize out the temporary when it is not needed.
Furthermore, one could imagine a post-increment operator that returned some
type `T2` that was implicitly convertible to `T` but less work than a full
copy of `T` to create (this seems like an absurdly niche case) - since the
type of `tmp` is inferred from the return type of `i += 1`, you could set up
functions with the following signatures to enable an equivalent pattern in
Cforall:

    lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
    void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
    void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to
                                  // preserve expected semantics of T x = y++;

**TODO** Look in C spec for literal type interpretations.
**TODO** Write up proposal for wider range of literal types, put in appendix.

### Initialization and Cast Expressions ###
An initialization expression `T x = e` has one interpretation for each
interpretation `I` of `e` with type `S` which is convertible to `T`.
The cost of the interpretation is the cost of `I` plus the conversion cost
from `S` to `T`.
A cast expression `(T)e` is interpreted as hoisting initialization of a
temporary variable `T tmp = e` out of the current expression, then replacing
`(T)e` by the new temporary `tmp`.

### Assignment Expressions ###
An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then
interpreted according to the usual rules for function application and comma
expressions.
Operator-assignment expressions like `e += f` desugar similarly as
`(?+=?(&e, f), e)`.

### Function Application Expressions ###
Every _compatible function_ and satisfying interpretation of its arguments and
polymorphic variable bindings produces one interpretation for the function
application expression.
Broadly speaking, the resolution cost of a function application is the sum of
the cost of the interpretations of all arguments, the cost of all conversions
to make those argument interpretations match the parameter types, and the
binding cost of any of the function's polymorphic type parameters.

**TODO** Work out binding cost in more detail.
**TODO** Address whether "incomparably polymorphic" should be treated as
"equally polymorphic" and be disambiguated by count of (safe) conversions.
**TODO** Think about what polymorphic return types mean in terms of late
binding.
**TODO** Consider if "f is less polymorphic than g" can mean exactly "f
specializes g"; if we don't consider the assertion parameters (except perhaps
by count) and make polymorphic variables bind exactly (rather than after
implicit conversions) this should actually be pre-computable.
**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted
function declaration is costed out by the resolver in the same way that any
other function declaration is costed; if the deleted declaration is the unique
min-cost resolution, refuse to type the expression; if it is tied for
min-cost, take the non-deleted alternative; and of two equivalent-cost deleted
interpretations with the same return type, pick one arbitrarily rather than
producing an ambiguous resolution. This would also be useful for forbidding
pointer-to-floating-point explicit conversions (C11, 6.5.4.4).
**TODO** Cover default parameters, maybe named parameters (see "named
arguments" thread of 11 March 2016).

### Sizeof, Alignof & Offsetof Expressions ###
`sizeof`, `alignof`, and `offsetof` expressions have at most a single
interpretation, of type `size_t`.
`sizeof` and `alignof` expressions take either a type or an expression as an
argument; if the argument is a type, it must be a complete type which is not a
function type; if an expression, the expression must have a single
interpretation, the type of which conforms to the same rules.
`offsetof` takes two arguments, a type and a member name; the type must be
a complete structure or union type, and the second argument must name a member
of that type.

### Comma Expressions ###
A comma expression `x, y` resolves `x` as if it had been cast to `void`, and
then, if there is a unique interpretation `I` of `x`, has one interpretation
for each interpretation `J` of `y`, with the same type as `J`, costing the sum
of the costs of `I` and `J`.

### Index Expressions ###
**TODO** Consider adding a polymorphic function in the prelude for this, as
per 6.5.2.1.2 in the C standard:

    forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); })
    lvalue E ?[?](T a, I i) { return *(a + i); }

I think this isn't actually a good idea, because the cases for it are niche,
mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a
pointer `p`, and adding it to the prelude would slow down resolution of
every index expression just a bit. Our existing prelude includes an indexing
operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified
variants, which should satisfy C source-compatibility without propagating
this silly desugaring further.

#### Compatible Functions ####
**TODO** This subsection is very much a work in progress and has multiple open
design questions.

A _compatible function_ for an application expression is a visible function
declaration with the same name as the application expression and parameter
types that can be converted to from the argument types.
Function pointers, and variables of types with the `?()` function call
operator overloaded, may also serve as function declarations for purposes of
compatibility.

For monomorphic parameters of a function declaration, the declaration is a
compatible function if there is an argument interpretation that is either an
exact match, or has a safe or unsafe implicit conversion that can be used to
reach the parameter type; for example:

    void f(int);

    f(42);       // compatible; exact match to int type
    f('x');      // compatible; safe conversion from char => int
    f(3.14);     // compatible; unsafe conversion from double => int
    f((void*)0); // not compatible; no implicit conversion from void* => int

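
A toy Python model of this compatibility check for monomorphic parameters
follows; the conversion tables here are illustrative assumptions covering only
the example above, not the full set of C conversions:

```python
# Illustrative subset of implicit conversions: safe (widening) pairs, with
# unsafe (narrowing) conversions as their reverses.
SAFE = {("char", "int"), ("int", "double")}
UNSAFE = {(b, a) for a, b in SAFE}

def compatibility(arg, param):
    """How an argument type matches a monomorphic parameter type:
    exactly, by safe conversion, by unsafe conversion, or not at all."""
    if arg == param:
        return "exact"
    if (arg, param) in SAFE:
        return "safe"
    if (arg, param) in UNSAFE:
        return "unsafe"
    return None  # not a compatible function for this argument

# void f(int);
assert compatibility("int", "int") == "exact"      # f(42)
assert compatibility("char", "int") == "safe"      # f('x')
assert compatibility("double", "int") == "unsafe"  # f(3.14)
assert compatibility("void*", "int") is None       # f((void*)0)
```
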
|
---|
1020 | Per Richard[*], function assertion satisfaction involves recursively searching
|
---|
1021 | for compatible functions, not an exact match on the function types (I don't
|
---|
1022 | believe the current Cforall resolver handles this properly); to extend the
|
---|
1023 | previous example:
|
---|
1024 |
|
---|
1025 | forall(otype T | { void f(T); }) void g(T);
|
---|
1026 |
|
---|
1027 | g(42); // binds T = int, takes f(int) by exact match
|
---|
1028 | g('x'); // binds T = char, takes f(int) by conversion
|
---|
1029 | g(3.14); // binds T = double, takes f(int) by conversion
|
---|
1030 |
|
---|
1031 | [*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the
|
---|
1032 | accessible scopes for definitions corresponding to assertion names, and
|
---|
1033 | choosing the ones whose types correspond *most closely* to the assertion
|
---|
1034 | types." (emphasis mine)
|
---|

There are three approaches we could take to binding type variables: type
variables must bind to argument types exactly, each type variable must bind
exactly to at least one argument, or type variables may bind to any type which
all corresponding arguments can implicitly convert to; I'll provide some
possible motivation for each approach.

There are two main arguments for the more restrictive binding schemes; the
first is that the built-in implicit conversions in C between `void*` and `T*`
for any type `T` can lead to unexpectedly type-unsafe behaviour in a more
permissive binding scheme, for example:

    forall(dtype T) T* id(T *p) { return p; }

    int main() {
        int *p = 0;
        char *c = id(p);
    }

This code compiles in CFA today, and it works because the extra function
wrapper `id` provides a level of indirection that allows the non-chaining
implicit conversions from `int*` => `void*` and `void*` => `char*` to chain.
The resolver types the last line with `T` bound to `void` as follows:

    char *c = (char*)id( (void*)p );

It has been suggested that making the implicit conversions to and from `void*`
explicit in Cforall code (as in C++) would solve this particular problem, and
provide enough other type-safety benefits to outweigh the source-compatibility
break with C; see Appendix D for further details.

The second argument for a more constrained binding scheme is performance;
trait assertions need to be checked after the type variables are bound, and
adding more possible values for the type variables should correspond to a
linear increase in the runtime of the resolver per type variable.
There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of
them is implicitly convertible to any other; if we allow unrestricted binding
of type variables, a common `int` variable (or literal) used in the position
of a polymorphic variable parameter would cause a 20x increase in the amount
of time needed to check trait resolution for that interpretation.
These numbers have yet to be empirically substantiated, but the theory is
reasonable, and given that much of the impetus for re-writing the resolver is
due to its poor performance, I think this is a compelling argument.
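To put a rough number on this, a toy cost model (the 21-type count is from above; the combinatorics are illustrative, not measured):

```python
# Rough cost model for unrestricted type-variable binding: with N mutually
# convertible arithmetic types, one argument in a polymorphic position can
# bind its type variable to any of the N types, so assertion checking runs
# N times instead of once. These numbers are illustrative, not measured.
N_ARITHMETIC_TYPES = 21

def binding_combinations(n_type_vars, n_types=N_ARITHMETIC_TYPES):
    """Candidate binding combinations if every type variable binds freely."""
    return n_types ** n_type_vars

# One type variable: 21 candidates, a 20x increase over exact binding.
print(binding_combinations(1))  # 21
print(binding_combinations(2))  # 441 -- the blow-up compounds per variable
```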

I would also mention that a constrained binding scheme is practical; the most
common type of assertion is a function assertion, and, as mentioned above,
those assertions can be satisfied through implicit conversions.
Thus, in the example above with `g(T)`, where the assertion is `void f(T)`,
we first bind `T = int` or `T = char` or `T = double`, then substitute the
binding into the assertion, yielding assertions of `void f(int)`,
`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy
these assertions to complete the binding.
Though in all three cases the existing function with signature `void f(int)`
satisfies the assertion, the checking work cannot easily be re-used between
variable bindings, because there may be better or worse matches depending on
the specifics of the binding.
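The bind-substitute-satisfy loop sketched above might look like this; the declaration set and conversion pairs are assumptions for illustration:

```python
# Sketch of assertion satisfaction after type-variable binding: substitute
# the binding into the assertion `void f(T)`, then look for a declared f
# whose parameter type the substituted type matches or converts to.
DECLARED_FS = {("int",)}  # assumed: only void f(int) is declared
CONVERTS = {("char", "int"), ("double", "int")}  # assumed conversions

def satisfy_f_assertion(binding):
    """Substitute T = binding into `void f(T)` and try to satisfy it."""
    for params in DECLARED_FS:
        # exact match, or an implicit conversion reaching the parameter
        if binding == params[0] or (binding, params[0]) in CONVERTS:
            return True
    return False

# Each binding must be re-checked; the work is not shared between them.
for t in ("int", "char", "double"):
    print(t, satisfy_f_assertion(t))  # all satisfiable via void f(int)
```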

The main argument for a more flexible binding scheme is that the binding
abstraction can actually cause a wrapped function call that would otherwise
work to cease to resolve, as below:

    forall(otype T | { T ?+?(T, T); })
    T add(T x, T y) { return x + y; }

    int main() {
        int i, j = 2;
        short r, s = 3;
        i = add(j, s);
        r = add(s, j);
    }

Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and
in both cases the `short` variable `s` is promoted to `int` to match `j`.
If, on the other hand, we demand that variables exactly match type variables,
neither call to `add` will compile, because it is impossible to simultaneously
bind `T` to both `int` and `short` (C++ imposes a similar restriction on
template argument deduction).
One alternative that enables this case, while still limiting the possible
type variable bindings, is to say that at least one argument must bind to its
type parameter exactly.
In this case, both calls to `add` would have the set `{ T = int, T = short }`
of candidate bindings, and would have to check both, as well as checking that
`short` could convert to `int` or vice-versa.
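A sketch of this "bind exactly at least once" candidate check, with an assumed one-edge conversion relation:

```python
# Sketch of the "type variables bind exactly at least once" rule for a call
# like add(j, s): candidates are the argument types themselves, and each
# candidate is kept only if every other argument can convert to it.
SAFE_CONVERSIONS = {("short", "int")}  # assumed subset: short widens to int

def converts(src, dst):
    return src == dst or (src, dst) in SAFE_CONVERSIONS

def candidate_bindings(arg_types):
    """Each argument type is a candidate binding for T; keep viable ones."""
    return sorted({t for t in arg_types
                   if all(converts(a, t) for a in arg_types)})

# add(j, s) with int j, short s: T = int works (short => int is safe),
# while T = short fails because int does not safely narrow to short.
print(candidate_bindings(["int", "short"]))  # ['int']
```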

It is worth noting here that parameterized types generally bind their type
parameters exactly anyway, so these "restrictive" semantics only restrict a
small minority of calls; for instance, in the example following, there isn't a
sensible way to type the call to `ptr_add`:

    forall(otype T | { T ?+?(T, T); })
    void ptr_add( T* rtn, T* x, T* y ) {
        *rtn = *x + *y;
    }

    int main() {
        int i, j = 2;
        short s = 3;
        ptr_add(&i, &j, &s); // ERROR: &s is not an int*
    }

I think there is some value in providing library authors with the
capability to express "these two parameter types must match exactly".
This can be done without restricting the language's expressivity, as the `add`
case above can be made to work under the strictest type variable binding
semantics with any addition operator in the system by changing its signature
as follows:

    forall( otype T, otype R, otype S | { R ?+?(T, S); } )
    R add(T x, S y) { return x + y; }

Now, it is somewhat unfortunate that the most general version here is more
verbose (and thus that the path of least resistance would be more restrictive
library code); however, the breaking case in the example code above is a bit
odd anyway - explicitly taking two variables of distinct types and relying on
C's implicit conversions to do the right thing is somewhat bad form,
especially where signed/unsigned conversions are concerned.
I think the more common case for implicit conversions is the following,
though, where the conversion is used on a literal:

    short x = 40;
    short y = add(x, 2);

One option to handle just this case would be to make literals, and only
literals, implicitly convertible to match polymorphic type variables.
The example above would then behave slightly differently than `x + 2` in
C, though, casting the `2` down to `short` rather than the `x` up to `int`, a
possible demerit of this scheme.

The other question to ask would be which conversions would be allowed for
literals; it seems rather odd to allow down-casting `42ull` to `char`, when
the programmer has explicitly specified by the suffix that it's an unsigned
long.
Type interpretations of literals in C are rather complex (see [1]), but one
reasonable approach would be to say that un-suffixed integer literals could be
interpreted as any type convertible from `int`, "u"-suffixed literals could be
interpreted as any type convertible from `unsigned int` except the signed
integer types, and "l"- or "ll"-suffixed literals could only be interpreted as
`long` or `long long`, respectively (or possibly that the "u" suffix excludes
the signed types, while the "l" suffix excludes the types smaller than
`long int`, as in [1]).
Similarly, unsuffixed floating-point literals could be interpreted as `float`,
`double` or `long double`, but "f"- or "l"-suffixed floating-point literals
could only be interpreted as `float` or `long double`, respectively.
I would like to specify that character literals can only be interpreted as
`char`, but the wide-character variants and the C practice of typing character
literals as `int` mean that would likely break code, so character literals
should be able to take any integer type.

[1] http://en.cppreference.com/w/c/language/integer_constant
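One table-driven reading of these proposed literal rules (the exact type sets here are my interpretation of the proposal above, not settled semantics):

```python
# Sketch of the proposed interpretation rules for integer literal suffixes.
# The type sets encode the stricter reading above: unsuffixed literals may
# take any type convertible from int, "u" excludes the signed types, and
# "l"/"ll" pin the literal to one type. This is a proposal sketch, not C.
SIGNED = ["char", "short", "int", "long", "long long"]
UNSIGNED = ["unsigned " + t for t in SIGNED]

def literal_types(suffix):
    """Allowed interpretations of an integer literal, by suffix."""
    if suffix == "":
        return SIGNED + UNSIGNED  # any type convertible from int
    if suffix == "u":
        return UNSIGNED           # signed types excluded
    if suffix == "l":
        return ["long"]
    if suffix == "ll":
        return ["long long"]
    raise ValueError("unknown suffix: " + suffix)

print(literal_types("u"))   # no signed interpretations for 42u
print(literal_types("ll"))  # 42ll can only be long long
```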

With the possible exception of the `add` case above, implicit conversions to
the function types of assertions can handle most of the expected behaviour
from C.
However, implicit conversions cannot be applied to match variable assertions,
as in the following example:

    forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
    T clamp_sum( T x, T y ) {
        T sum = x + y;
        if ( sum < min ) return min;
        if ( max < sum ) return max;
        return sum;
    }

    char min = 'A';
    double max = 100.0;
    //int val = clamp_sum( 'X', 3.14 );     // ERROR (1)

    char max = 'Z';
    char val = clamp_sum( 'X', 3.14 );      // MATCH (2)
    double val = clamp_sum( 40.9, 19.9 );   // MAYBE (3)

In this example, the call to `clamp_sum` at (1) doesn't compile, because even
though there are compatible `min` and `max` variables of types `char` and
`double`, they need to have the same type to match the constraint, and they
don't.
The (2) example does compile, but with a behaviour that might be a bit
unexpected given the "usual arithmetic conversions", in that both values are
narrowed to `char` to match the `min` and `max` constraints, rather than
widened to `double` as is usual for mis-matched arguments to `+`.
The (3) example is the only case discussed here that would require the most
permissive type binding semantics - here, `T` is bound to `char` to match the
constraints, and both parameters are narrowed from `double` to `char`
before the call, which would not be allowed under either of the more
restrictive binding semantics.
However, the behaviour here is unexpected to the programmer, because the
return value will be `(double)'A' /* == 65.0 */` due to the conversions,
rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.
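The surprising (3) result can be reproduced by simulating the narrowing, using Python `int()` truncation to stand in for the `double` => `char` conversion:

```python
# Simulation of clamp_sum at (3) with T bound to char: both double
# arguments are narrowed to char before the call, so the sum 40 + 19 = 59
# falls below min = 'A' (65) and the clamped result widens back to 65.0.
MIN, MAX = ord('A'), ord('Z')  # char min = 'A'; char max = 'Z'

def clamp_sum_as_char(x, y):
    """clamp_sum with T = char: narrow doubles, clamp, widen the result."""
    xc, yc = int(x), int(y)   # double => char narrowing (truncation)
    s = xc + yc
    clamped = max(MIN, min(MAX, s))
    return float(clamped)     # char result widened back to double

print(clamp_sum_as_char(40.9, 19.9))  # 65.0, not the "expected" 60.8
```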

Personally, I think that implicit conversions are not a particularly good
language design, and that the use-cases for them can be better handled with
less powerful features (e.g. more versatile rules for typing constant
expressions).
However, though we do need implicit conversions in monomorphic code for C
compatibility, I'm in favour of restricting their usage in polymorphic code,
both to give programmers some stronger tools to express their intent and to
shrink the search space for the resolver.
Of the possible binding semantics I've discussed, I'm in favour of forcing
polymorphic type variables to bind exactly, though I could be talked into
allowing literal expressions to have more flexibility in their bindings, or
possibly loosening "type variables bind exactly" to "type variables bind
exactly at least once"; I think the unrestricted combination of implicit
conversions and polymorphic type variable binding unnecessarily multiplies the
space of possible function resolutions, and that the added resolution options
are mostly unexpected and therefore confusing and not useful to user
programmers.

## Resolver Architecture ##

### Function Application Resolution ###
Our resolution algorithm for function application expressions is based on
Baker's[3] single-pass bottom-up algorithm, with Cormack's[4] single-pass
top-down algorithm applied where appropriate as an optimization.
Broadly speaking, the cost of this resolution per expression will be
proportional to `i^d`, where `i` is the number of interpretations of each
program symbol, and `d` is the maximum depth of the expression DAG.
Since `d` is determined by the user programmer (and, in general, bounded by a
small constant), opportunities for resolver optimization primarily revolve
around minimizing `i`, the number of interpretations of each symbol that are
considered.

[3] Baker, Theodore P. A one-pass algorithm for overload resolution in Ada.
ACM Transactions on Programming Languages and Systems (1982) 4:4 p.601-614

[4] Cormack, Gordon V. An algorithm for the selection of overloaded functions
in Ada. SIGPLAN Notices (1981) 16:2 p.48-52

Unlike Baker, our system allows implicit type conversions for function
arguments and return types; the problem then becomes to find the valid
interpretation for an expression that has the unique minimal conversion cost,
if one exists.
Interpretations can be produced both by overloaded names and by implicit
conversions applied to existing interpretations; we have proposals to reduce
the number of interpretations considered from both sources.
To simplify the problem for this discussion, we will consider application
resolution restricted to a domain of functions applied to variables, possibly
in a nested manner (e.g. `f( g( x ), y )`, where `x` and `y` are variables and
`f` and `g` are functions), and possibly in a typed context such as a variable
initialization (e.g. `int i = f( x );`); the other aspects of Cforall type
resolution should be straightforward to map into this model.
The types of the symbol tables used for variable and function declarations
look somewhat like the following:

    variable_table = name_map( variable_name, variable_map )

    function_table = name_map( function_name, function_map )

    variable_map = multi_index( by_type( variable_type ),
                                variable_decl_set )

    function_map = multi_index( by_int( n_params ),
                                by_type( return_type ),
                                function_decl_set )

`variable_name` and `function_name` are likely simple strings, with `name_map`
a hash table (or perhaps a trie) mapping string keys to values.
`variable_decl_set` and `function_decl_set` can be thought of for the moment
as simple bags of typed declarations, where the declaration types are linked
to the graph of available conversions for that type.
In a typed context, both the `variable_decl_set` and the `function_decl_set`
should be selectable by type; this is accomplished by the `by_type` index of
both `variable_map` and `function_map`.
The `by_int` index of `function_map` also provides a way to select functions
by their number of parameters; this index may be used to swiftly discard any
function declaration which does not have the appropriate number of parameters
for the argument interpretations being passed to it.
Given the likely small number of entries in this map, it is possible that a
binary search of a sorted vector or even a linear search of an unsorted vector
would be more efficient than the usual hash-based index.
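A minimal Python stand-in for these tables (dicts standing in for `name_map` and the `by_int`/`by_type` indices; all names here are illustrative):

```python
# Minimal stand-in for the function symbol table: a name_map from function
# name to a function_map indexed by parameter count (by_int) and return
# type (by_type). Dicts stand in for whatever indices are actually used.
from collections import defaultdict

function_table = defaultdict(lambda: {"by_int": defaultdict(list),
                                      "by_type": defaultdict(list)})

def declare(name, return_type, param_types):
    decl = (name, return_type, tuple(param_types))
    fmap = function_table[name]
    fmap["by_int"][len(param_types)].append(decl)  # select by arity
    fmap["by_type"][return_type].append(decl)      # select by return type
    return decl

declare("f", "int", ["int"])
declare("f", "int", ["int", "int"])
declare("f", "double", ["double"])

# Discard declarations with the wrong parameter count for a 1-argument call:
print(len(function_table["f"]["by_int"][1]))       # 2
# Filter candidates by required return type in a typed context:
print(len(function_table["f"]["by_type"]["int"]))  # 2
```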

Given these data structures, the general outline of our algorithm follows
Baker, with Cormack's algorithm used as a heuristic filter in typed contexts.

In an untyped context, we use a variant of Baker's bottom-up algorithm.
The leaves of the interpretation DAG are drawn from the variable symbol table,
with entries in the table each producing zero-cost interpretations, and each
implicit conversion available to be applied to the type of an existing entry
producing a further interpretation with the same cost as the conversion.
As in Baker, if two or more interpretations have the same type, only the
minimum cost interpretation with that type is produced; if there is no unique
minimum cost interpretation, then resolution with that type is ambiguous, and
not permitted.
It should be relatively simple to produce the list of interpretations sorted
by cost by producing the interpretations via a breadth-first search of the
conversion graph from the initial interpretations provided in the variable
symbol table.
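This cost-ordered leaf generation can be sketched as a breadth-first search over an assumed conversion graph with unit edge costs:

```python
# Breadth-first search of an assumed conversion graph, producing the
# interpretations of a leaf variable keyed by type with minimum conversion
# cost; only the min-cost interpretation per type is kept, as in Baker.
from collections import deque

CONVERSIONS = {              # assumed safe-conversion edges, each cost 1
    "char": ["int"],
    "int": ["long", "double"],
    "long": ["double"],
}

def interpretations(var_type):
    """Map each reachable type to its minimum conversion cost."""
    costs = {var_type: 0}    # the declared type is a zero-cost leaf
    queue = deque([var_type])
    while queue:
        t = queue.popleft()
        for nxt in CONVERSIONS.get(t, []):
            if nxt not in costs:   # first visit is the min cost (BFS)
                costs[nxt] = costs[t] + 1
                queue.append(nxt)
    return costs

print(interpretations("char"))
# {'char': 0, 'int': 1, 'long': 2, 'double': 2}
```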

To match a function at one of the internal nodes of the DAG, we first look up
the function's name in the function symbol table, then select the appropriate
number of parameters for the provided arguments through the `by_int` index of
the returned `function_map`, then go through the resulting `function_decl_set`
searching for functions whose parameter types can unify with the provided
argument lists; any such matching function produces an interpretation with a
cost that is the sum of its argument costs.
Though this is not included in our simplified model, this unification step may
include binding of polymorphic variables, which introduces a cost for the
function binding itself that must be added to the argument costs.
Also, checking of function assertions would likely be done at this step as
well, possibly eliminating some matching functions (if no suitable
assertions can be satisfied), or adding further conversion costs for
assertion satisfaction.
Once the set of valid function interpretations is produced, these may also be
expanded by the graph of implicit conversions on their return types, as the
variable interpretations were.

This implicit conversion-based expansion of interpretations should be skipped
for the top-level expression if used in an untyped (void) context, e.g. for
`f` in `f( g( x ) );` or `x` in `x;`.
On the other hand, if the top-level expression specifies a type, e.g. in
`int i = f( x );`, only top-level expressions that return that type are
relevant to the search, so the candidates for `f` can be filtered first by
those that return `int` (or a type convertible to it); this can be
accomplished by performing a top-down filter of the interpretations of `f` by
the `by_type` index of the `function_map`, in a manner similar to Cormack's[4]
algorithm.

In a typed context, such as an initialization expression
`T x = f( g( y ), z );`, only interpretations of `f( g( y ), z )` which have
type `T` are valid; since there are likely to be valid interpretations of
`f( g( y ), z )` which cannot be used to initialize a variable of type `T`, we
can use this information to reduce the number of interpretations considered.
Drawing from Cormack[4], we first search for interpretations of `f` where the
return type is `T`; by breadth-first search of the conversion graph, it should
be straightforward to order the interpretations of `f` by the cost to convert
their return type to `T`.
We can also filter out interpretations of `f` with fewer than two parameters,
since both `g( y )` and `z` must produce at least one parameter; we may not,
however, rule out interpretations of `f` with more than two parameters, as
there may be a valid interpretation of `g( y )` as a function returning more
than one value (if the expression was `f( y, z )` instead, we could use an
exact parameter count, assuming that variables of tuple type don't exist).
For each compatible interpretation of `f`, we can add the type of the first
parameter of that interpretation of `f` to a set `S`, then recursively search
for interpretations of `g( y )` that return some type `Si` in `S`, and
similarly for interpretations of `z` that match the type of any of the second
parameters of some `f`.
Naturally, if no matching interpretation of `g( y )` can be found for the
first parameter of some `f`, the type of the second parameter of that `f` will
not be added to the set of valid types for `z`.
Each node in this interpretation DAG is given a cost the same way it would be
in the bottom-up approach, with the exception that when going top-down there
must be a final bottom-up pass to sum the interpretation costs and sort them
as appropriate.

If a parameter type for some `f` is a polymorphic type variable that is left
unbound by the return type (e.g. `forall(otype S) int f(S x, int y)`), the
matching arguments should be found using the bottom-up algorithm above for
untyped contexts, because the polymorphic type variable does not sufficiently
constrain the available interpretations of the argument expression.
Similarly, it would likely be an advantage to use top-down resolution for
cast expressions (e.g. `(int)x`), even when those cast expressions are
subexpressions of an otherwise untyped expression.
It may also be fruitful to switch between the bottom-up and top-down
algorithms if the number of valid interpretations for a subexpression or valid
types for an argument exceeds some heuristic threshold, but finding such
a threshold (if any exists) will require experimental data.
This hybrid top-down/bottom-up search provides more opportunities for pruning
interpretations than either a bottom-up or top-down approach alone, and thus
may be more efficient than either.
A top-down-only approach, however, devolves to linear search through every
possible interpretation in the solution space in an untyped context, and is
thus likely to be inferior to a strictly bottom-up approach, though this
hypothesis needs to be empirically validated.

Another approach would be to abandon expression-tree ordering for
subexpression matching, and to order by "most constrained symbol"; symbols
would be more constrained if there were fewer matching declarations, fewer
subexpressions yet to resolve, or possibly fewer possible types the expression
could resolve to. Ordering the expressions in a priority queue by this metric
would not necessarily produce a top-down or a bottom-up order, but would add
opportunities for pruning based on memoized upper and lower bounds.

Both Baker and Cormack explicitly generate all possible interpretations of a
given expression; thinking of the set of interpretations of an expression as a
list sorted by cost, this is an eager evaluation of the list.
However, since we generally expect that user programmers will not often use
high-cost implicit conversions, one potentially effective way to prune the
search space would be to first find the minimal-cost interpretations of any
given subexpression, then to save the resolution progress of the
subexpressions and attempt to resolve the superexpression using only those
subexpression interpretations.
If no valid interpretation of the superexpression can be found, the resolver
would then repeatedly find the next-most-minimal cost interpretations of the
subexpressions and attempt to produce the minimal cost interpretation of the
superexpression.
This process would proceed until all possible subexpression interpretations
have been found and considered.

A middle ground between the eager and lazy approaches can be taken by
considering the lexicographical order on the cost tuple; essentially, every
interpretation in each of the classes below will be strictly cheaper than any
interpretation in the class after it, so if a min-cost valid interpretation
can be found while only generating interpretations in a given class, that
interpretation is guaranteed to be the best possible one:

1. Interpretations without polymorphic functions or implicit conversions
2. Interpretations without polymorphic functions using only safe conversions
3. Interpretations using polymorphic functions without unsafe conversions
4. Interpretations using unsafe conversions

In this lazy-eager approach, all the interpretations in one class would be
eagerly generated, while the interpretations in the next class would only be
considered if no match was found in the previous class.
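A sketch of this class-banded search, with assumed stand-in interpretation generators:

```python
# Sketch of the lazy-eager strategy: each class of interpretations is
# generated eagerly, but a later (strictly costlier) class is only touched
# if every earlier class failed to produce a valid interpretation.
def resolve_by_class(class_generators, is_valid):
    """class_generators: per-class callables yielding (name, cost) pairs,
    ordered so every class is strictly cheaper than the next."""
    for rank, generate in enumerate(class_generators, start=1):
        valid = [i for i in generate() if is_valid(i)]
        if valid:
            return rank, min(valid, key=lambda i: i[1])  # min-cost match
    return None

# Assumed stand-in classes; only class 3 contains a valid interpretation.
classes = [
    lambda: [("mono-exact", 0)],
    lambda: [("mono-safe", 1)],
    lambda: [("poly-safe", 5), ("poly-safe-alt", 4)],
    lambda: [("unsafe", 9)],
]
rank, best = resolve_by_class(classes, lambda i: i[0].startswith("poly"))
print(rank, best)  # class 3 wins; class 4 is never considered
```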

Another source of efficiency would be to cache the best interpretation
of a subexpression within an environment; this may not be incredibly useful
for explicit parameters (though it may be useful for, e.g., `f( x, x )`, where
both parameters of `f` have the same type), but should pay some dividends for
the implicit assertion parameters, especially the otype parameters for the
argument of a generic type, which will generally be resolved in duplicate for
(at least) the assignment operator, constructor, copy constructor and
destructor of the generic type.

## Appendix A: Partial and Total Orders ##
The `<=` relation on integers is a commonly known _total order_, and
intuitions based on how it works generally apply well to other total orders.
Formally, a total order is some binary relation `<=` over a set `S` such that
for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both hold,
`a` and `b` must be equal; this is the _antisymmetry_ property); total orders
also have a _transitivity_ property: if `a <= b` and `b <= c`, then `a <= c`.
If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.

A _partial order_ is a generalization of this concept where the `<=` relation
is not required to be defined over all pairs of elements in `S` (though there
is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other
words, it is possible for two elements `a` and `b` of `S` to be
_incomparable_, unable to be ordered with respect to one another (any `a` and
`b` for which either `a <= b` or `b <= a` are called _comparable_).
Antisymmetry and transitivity are also required for a partial order, so all
total orders are also partial orders by definition.
One fairly natural partial order is the "subset of" relation over sets from
the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are
both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the
other - they are incomparable under this relation.

We can compose two (or more) partial orders to produce a new partial order on
tuples drawn from both (or all the) sets.
For example, given `a` and `c` from set `S` and `b` and `d` from set `R`,
where `S` and `R` both have partial orders defined on them, we can define
an ordering relation between `(a, b)` and `(c, d)`.
One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff
`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by
the first set and "breaking ties" by the second set.
Another common order is the _product order_, which can be roughly thought of
as "all the components are ordered the same way"; formally, `(a, b) <= (c, d)`
iff `a <= c` and `b <= d`.
One difference between the lexicographical order and the product order is that
in the lexicographical order, if both `a` and `c` and `b` and `d` are
comparable, then `(a, b)` and `(c, d)` will be comparable, while in the
product order you can have `a <= c` and `d <= b` (both comparable) which makes
`(a, b)` and `(c, d)` incomparable.
The product order, on the other hand, has the benefit of not prioritizing one
order over the other.
1488 |
|
---|
Any partial order has a natural representation as a directed acyclic graph
(DAG).
Each element `a` of the set becomes a node of the DAG, with an arc pointing to
its _covering_ elements, any element `b` such that `a < b` but where there is
no `c` such that `a < c` and `c < b`.
Intuitively, the covering elements are the "next ones larger", where you can't
fit another element between the two.
Under this construction, `a < b` is equivalent to "there is a path from `a` to
`b` in the DAG", and the lack of cycles in the directed graph is ensured by
the antisymmetry property of the partial order.

Partial orders can be generalized to _preorders_ by removing the antisymmetry
property.
In a preorder the relation is generally called `<~`, and it is possible for
two distinct elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case
we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`.
Preorders may also be represented as directed graphs, but in this case the
graph may contain cycles.

## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
The short answer is that it's impossible.

The longer answer is that it has to do with what's essentially a diamond
inheritance problem.
In C, `int` converts to both `unsigned int` and `long` "safely"; both convert
to `unsigned long` safely, and it's possible to chain the conversions to
convert `int` to `unsigned long`.
There are two constraints here; one is that the `int` to `unsigned long`
conversion needs to cost more than the other two (because the types aren't as
"close" in a very intuitive fashion), and the other is that the system needs a
way to choose which path to take to get to the destination type.
Now, a fairly natural solution for this would be to just say "C knows how to
convert from `int` to `unsigned long`, so we just put in a direct conversion
and make the compiler smart enough to figure out the costs" - given that
users can build an arbitrary graph of conversions, this needs to be handled
anyway.
We can define a preorder over the types by saying that `a <~ b` if there
exists a chain of conversions from `a` to `b`.
This preorder corresponds roughly to the more usual type-theoretic concept of
subtyping ("if I can convert `a` to `b`, `a` is a more specific type than
`b`"); however, since this graph is arbitrary, it may contain cycles, so if
there is also a path to convert `b` to `a` they are in some sense equivalently
specific.

Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and
`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`)
and this conversion preorder over the destination types `xn` and `ym`.
We could define a preorder by taking chain length and breaking ties by the
conversion preorder, but this would lead to unexpected behaviour when closing
diamonds with an arm longer than one conversion.
Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`,
`B2->C`, and `A->C`.
If we are comparing conversions from `A` to both `B2` and `C`, we expect the
conversion to `B2` to be chosen because it's the more specific type under the
conversion preorder, but since its chain length is longer than the conversion
to `C`, it loses and `C` is chosen.
However, taking the conversion preorder and breaking ties or ambiguities by
chain length also doesn't work, because of cases like the following example,
where the transitivity property is broken and we can't find a global maximum:

`X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`

In this set of arcs, if we're comparing conversions from `X` to each of `Y2`,
`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because
there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain
length.
Also, comparing conversions from `X` to `Z3` and to `W`, we find that the
conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder,
and so is considered to be the nearer type.
By transitivity, then, the conversion from `X` to `Y2` should be cheaper than
the conversion from `X` to `W`, but in this case `Y2` and `W` are
incomparable by the conversion preorder, so the tie is broken by the shorter
path from `X` to `W` in favour of `W`, contradicting the transitivity property
for this proposed order.

Without transitivity, we would need to compare all pairs of conversions, which
would be expensive, and possibly not yield a minimal-cost conversion even if
all pairs were comparable.
In short, this ordering is infeasible, and by extension I believe any ordering
composed solely of single-step conversions between types with no further
user-supplied information will be insufficiently powerful to express the
built-in conversions between C's types.

## Appendix C: Proposed Prelude Changes ##
**TODO** Port Glen's "Future Work" page for builtin C conversions.
**TODO** Move discussion of zero_t, unit_t here.

It may be desirable to have some polymorphic wrapper functions in the prelude
which provide consistent default implementations of various operators given a
definition of one of them.
Naturally, users could still provide a monomorphic overload if they wished to
make their own code more efficient than the polymorphic wrapper could be, but
this would minimize user effort in the common case where the user cannot write
a more efficient function, or is willing to trade some runtime efficiency for
developer time.
As an example, consider the following polymorphic defaults for `+` and `+=`:

    forall(otype T | { T ?+?(T, T); })
    lvalue T ?+=? (T *a, T b) {
        return *a = *a + b;
    }

    forall(otype T | { lvalue T ?+=? (T*, T); })
    T ?+? (T a, T b) {
        T tmp = a;
        return tmp += b;
    }

Both of these have a possibly unnecessary copy (the first in copying the
result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in
cases where these copies are unavoidable the polymorphic wrappers should be
just as performant as the monomorphic equivalents (assuming a compiler
sufficiently clever to inline the extra function call), and allow programmers
to define a set of related operators with maximal concision.

**TODO** Look at what Glen and Richard have already written for this.

## Appendix D: Feasibility of Making void* Conversions Explicit ##
C allows implicit conversions between `void*` and other pointer types, as per
section 6.3.2.3.1 of the standard.
Making these implicit conversions explicit in Cforall would provide
significant type-safety benefits, and is precedented in C++.
A weaker version of this proposal would be to allow implicit conversions to
`void*` (as a sort of "top type" for all pointer types), but to make the
unsafe conversion from `void*` back to a concrete pointer type an explicit
conversion.
However, `int *p = malloc( sizeof(int) );` and friends are hugely common
in C code, and rely on the unsafe implicit conversion from the `void*` return
type of `malloc` to the `int*` type of the variable - obviously it would be
too much of a source-compatibility break to disallow this for C code.
We do already need to wrap C code in an `extern "C"` block, though, so it is
technically feasible to make the `void*` conversions implicit in C but
explicit in Cforall.
Also, for calling C code with `void*`-based APIs, pointers-to-dtype are
calling-convention compatible with `void*`; we could read `void*` in function
signatures as essentially a fresh dtype type variable, e.g.:

    void* malloc( size_t )
        => forall(dtype T0) T0* malloc( size_t )
    void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
        => forall(dtype T0, dtype T1, dtype T2)
           void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )

This would even allow us to leverage some of Cforall's type safety to write
better declarations for legacy C API functions, like the following wrapper for
`qsort`:

    extern "C" { // turns off name-mangling so that this calls the C library
        // call-compatible type-safe qsort signature
        forall(dtype T)
        void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );

        // forbid type-unsafe C signature from resolving
        void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
            = delete;
    }

## Appendix E: Features to Add in Resolver Re-write ##
* Reference types
* Special types for 0 and 1 literals
* Expression type for return statement that resolves similarly to `?=?`
    - This is to get rid of the kludge in the box pass that effectively
      re-implements the resolver poorly.