source: doc/working/assertion_resolution.md @ b12e4ad

Last change on this file since b12e4ad was 804b57e3, checked in by Aaron Moss <a3moss@…>, 7 years ago

Fix conversion of feature paper to article format; stub in added sections on ctors, references, with-stmt

  • Property mode set to 100644
File size: 7.1 KB
Line 
1The prototype system uses a 5-tuple for expression cost:
21. *unsafe*: unsafe conversion cost
32. *poly*: count of parameters and return types bound to a polymorphic type
43. *vars*: count of polymorphic type variables
54. -*spec*: count of polymorphic type specializations and type assertions, lowers cost
6   * e.g. given `dtype T` `T*` has spec 1, `T**` has spec 2, and each type assertion also has spec 1
75. *safe*: safe conversion cost
8
9Note that in this framework, type assertions are bound to their unique minimum-cost resolution (if extant), but unlike the CFA'03, the cost of type assertion resolution does not change the cost of the overall expression resolution. In particular, it is possible to defer type assertion resolution arbitrarily late in expression resolution without changing the cost of any interpretation that has satisfiable type assertions. This means that we can defer assertion resolution until all the expressions explicitly provided in the user code ("primary expressions") are tentatively resolved.
10
11Even after resolving all the primary expressions, there may be types in the assertions which remain polymorphic. I will call these "auxiliary types", as in the following example (drawn from the `multi_type` test in the prototype repository):
12
13```
14struct a {};
15struct b {};
16
17forall(dtype A, dtype B | { B foo(A); A bar(B); })
18A baz(A);
19
20b foo(a);
21a bar(b);
22
23a x;
24baz(x);
25```
26
27In the course of resolving this expression, `A` is obviously bound to `a`, but `B` is left unbound by the primary expressions, an auxiliary type. There is, however, a unique min-cost resolution for this, where `B` is bound to `b`. In the cost model of the prototype system, if there is a unique min-cost mutually-satisfying resolution for the two type assertions, the cost of resolving `baz(x)` is (0, 2, 2, -2, 0).
28
29The most-common failure case for type assertions in CFA is a set of partial specializations of auxilliary types that does not ever specialize to a concrete type, as in this example (drawn from the `exp` test in the prototype repository -- note that an equivalent to the `make` function shows up fairly often as an implicit conversion from `void*` to any other pointer type, and `ctor` and `dtor` correspond to two of the common `otype` constraints):
30
31```
32struct base {};
33forall(dtype T) struct box {};
34
35void ctor(base);
36void dtor(base);
37
38forall(dtype T | { void ctor(T); void dtor(T); })
39void ctor(box(T));
40forall(dtype T | { void ctor(T); void dtor(T); })
41void dtor(box(T));
42
43forall(dtype S | { void ctor(S); void dtor(S); })
44void f(box(S));
45
46forall(dtype Q | { void ctor(Q); void dtor(Q); })
47Q make();
48
49f( make() );
50```
51
52`make()` has a cost of (0, 1, 1, -3, 0) for any binding of `ctor` and `dtor` (spec is -2 for the constraints, -1 for the partially-specialized polymorphic type `box(Q)`). `f( make() )` then forces a binding `Q => box(S)`, with a total expression cost of (0, 2, 2, -6, 0) for any satisfying resolution of the four constraints (`ctor` and `dtor` on `Q` and `S`). The given `ctor` and `dtor` for `box(T)` will satisfy the constraints on `Q` if `S` can be satisfied, so the question is what to bind the auxiliary type `S` to: `base` would work (at zero cost), as would `box(T)` (at cost (0, 1, 1, -3, 0)), provided we can bind the auxiliary type `T`, a recursive case of exactly the same problem.
53
54Now, at this point, in either the prototype cost model or that in CFA-CC, it should be clear that `S => base` is the cheapest resolution, which should be chosen. The prototype model has the advantage that it has already calculated the cost for `S => box(T)` (assuming a satisfactory `T`), while CFA-CC would determine the cost of satisfying `T` to add to the cost of `S => box(T)`. Since `T` can be `box(box(... box(base) ...))` nested arbitrarily deeply, there are an infinite number of possible bindings, and this computation does not terminate without the imposition of a limit on depth of recursive type assertion resolution.
55
56However, given this problem, a few observations can be made. Firstly, all satisfactory bindings of auxiliary types will be concrete types -- that is, they will not include any unbound type variables. Secondly, since type assertion resolution does not allow conversions (at least in the prototype system -- there may be type conversions to match a type variable, but its assertions cannot use implicit conversions), a concrete match for an assertion parameter will always be cheaper than a polymorphic match for an assertion parameter (which may recursively add more type assertions).
57
58It follows, then, that if a mutually-satisfactory set of concrete matches for all assertion parameters can be found, that there is no cheaper set of matches including more polymorphic matches, and thus no need to attempt to recursively resolve further type assertions of potential polymorphic matches. Thus, given some set of concrete matches for all type assertions, if there is a unique min-cost such set, it can safely be used, whereas if there is no uniquely-minimal set, then the assertion set is ambiguous, as there will not be any cheaper resolution. If there is no concrete match (e.g. remove `ctor(base)` and `dtor(base)` from the example above), then the calculation may still recurse infinitely (as discussed above), however it is possible this sort of cycle could be precisely detected and terminated on, or, alternately, the existing approach of limiting depth of recursive type assertion resolution should suffice in practice.
59
60I further argue that in reasonable code the depth at which such concrete solutions can be found is likely to be fairly small, by appeal to structural-induction-based handwaving. I posit that any *useful* polymorphic function either (1) does not satisfy its own type assertions or (2) only satisfies any of its own type assertions through partial specialization of the type. `f` above would be an example of (1), while `ctor` and `dtor` would be an example of (2). `ctor` and `dtor` do satisfy their own type assertions, with the binding `T => box(T')`. However, any object that exists in the system must have a finitely-expressible type (generally with a nesting level bounded by a small natural number), so there should be very few repetitions of this cycle before an appropriate concrete type is discovered. This informal argument also applies if the cycle is composed of two or more functions which satisfy each other's type assertions. While it is syntactically and semantically legal to write a function that falls into neither category, its addition cannot make any previously-unresolvable code resolvable, and thus it is unlikely to exist in user code. An example of such a function would be this, clearly useless: `forall(dtype T | { T* f(T*); } ) T* f(T*);`
61
62In conclusion, the existing CFA-CC compiler essentially does a depth-first exploration of the space of possible resolutions of auxiliary types (up to bounded depth); a breadth-first approach would seem likely to be more efficient in practice, as it could be terminated at the first depth any solution was found with a minimal-cost solution or ambiguous result. This does not remove the necessity for an assertion recursion limit, but should prevent the compiler from reaching that limit as often in practice.
Note: See TracBrowser for help on using the repository browser.