source: doc/working/resolver_design.md @ 81a05ca

ADTarm-ehast-experimentalcleanup-dtorsenumforall-pointer-decayjacob/cs343-translationjenkins-sandboxnew-astnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since 81a05ca was 275f4b4, checked in by Aaron Moss <a3moss@…>, 7 years ago

Update discussion of cost tuple

  • Property mode set to 100644
File size: 84.9 KB
Line 
1# Thoughts on Resolver Design #
2
3## Conversions ##
4C's implicit "usual arithmetic conversions" define a structure among the
5built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of
6_safe_ widening conversions.
7There is also a set of _explicit_ conversions that are only allowed through a
8cast expression.
9Based on Glen's notes on conversions [1], I propose that safe and unsafe
10conversions be expressed as constructor variants, though I make explicit
11(cast) conversions a constructor variant as well rather than a dedicated
12operator.
13Throughout this article, I will use the following operator names for
14constructors and conversion functions from `From` to `To`:
15
16        void ?{} ( To*, To );            // copy constructor
17        void ?{} ( To*, From );          // explicit constructor
18        void ?{explicit} ( To*, From );  // explicit cast conversion
19        void ?{safe} ( To*, From );      // implicit safe conversion
20        void ?{unsafe} ( To*, From );    // implicit unsafe conversion
21
22[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
23
24Glen's design made no distinction between constructors and unsafe implicit
25conversions; this is elegant, but interacts poorly with tuples.
26Essentially, without making this distinction, a constructor like the following
27would add an interpretation of any two `int`s as a `Coord`, needlessly
28multiplying the space of possible interpretations of all functions:
29
30        void ?{}( Coord *this, int x, int y );
31
32That said, it would certainly be possible to make a multiple-argument implicit
33conversion, as below, though the argument above suggests this option should be
34used infrequently:
35
36        void ?{unsafe}( Coord *this, int x, int y );
37
38An alternate possibility would be to only count two-arg constructors
39`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and
40explicit conversions should also have a compiler-enforced restriction to
41ensure that they are two-arg functions (this restriction may be valuable
42regardless).
43
44Regardless of syntax, there should be a type assertion that expresses `From` 
45is convertable to `To`.
46If user-defined conversions are not added to the language,
47`void ?{} ( To*, From )` may be a suitable representation, relying on
48conversions on the argument types to account for transitivity.
49On the other hand, `To*` should perhaps match its target type exactly, so
50another assertion syntax specific to conversions may be required, e.g.
51`From -> To`.
52
53### Constructor Idiom ###
54Basing our notion of conversions off otherwise normal Cforall functions means
55that we can use the full range of Cforall features for conversions, including
56polymorphism.
57Glen [1] defines a _constructor idiom_ that can be used to create chains of
58safe conversions without duplicating code; given a type `Safe` which members
59of another type `From` can be directly converted to, the constructor idiom
60allows us to write a conversion for any type `To` which `Safe` converts to:
61
62        forall(otype To | { void ?{safe}( To*, Safe ) })
63        void ?{safe}( To *this, From that ) {
64                Safe tmp = /* some expression involving that */;
65                *this = tmp; // uses assertion parameter
66        }
67
68This idiom can also be used with only minor variations for a parallel set of
69unsafe conversions.
70
71What selective non-use of the constructor idiom gives us is the ability to
72define a conversion that may only be the *last* conversion in a chain of such.
73Constructing a conversion graph able to unambiguously represent the full
74hierarchy of implicit conversions in C is provably impossible using only
75single-step conversions with no additional information (see Appendix B), but
76this mechanism is sufficiently powerful (see [1], though the design there has
77some minor bugs; the general idea is to use the constructor idiom to define
78two chains of conversions, one among the signed integral types, another among
79the unsigned, and to use monomorphic conversions to allow conversions between
80signed and unsigned integer types).
81
82### Implementation Details ###
83It is desirable to have a system which can be efficiently implemented, yet
84also to have one which has sufficient power to distinguish between functions
85on all possible axes of polymorphism.
86This ordering may be a partial order, which may complicate implementation
87somewhat; in this case it may be desirable to store the set of implementations
88for a given function as the directed acyclic graph (DAG) representing the
89order.
90
91## Conversion Costs ##
92Each possible resolution of an expression has a _cost_ tuple consisting of
93the following components:
941. _unsafe_ conversion cost: summed degree of unsafe conversions; unlike CFA03, this is not a simple count of conversions (for symmetry with the safe conversions)
952. _polymorphic unifications_: count of parameters and return values bound to some polymorphic type for boxing
963. _type variables_: number of polymorphic type variables bound
974. negated _type specializations_: Each type assertion specializes the polymorphism, thus decreasing the cost; nested polymorphic types (e.g. `T*`) are also counted as specializations
985. _safe_ conversions: summed degree of safe conversions
996. _qualifier_ conversions: summed degree of qualifier and reference conversions
100These components are lexically-ordered and can be summed element-wise;
101summation starts at `(0, 0, 0, 0, 0)`.
102
103**TODO** update below for consistency with this
104
105### Lvalue and Qualifier Conversions ###
106C defines the notion of a _lvalue_, essentially an addressable object, as well
107as a number of type _qualifiers_, `const`, `volatile`, and `restrict`.
108As these type qualifiers are generally only meaningful to the type system as
109applied to lvalues, the two concepts are closely related.
110A const lvalue cannot be modified, the compiler cannot assume that a volatile
111lvalue will not be concurrently modified by some other part of the system, and
112a restrict lvalue must have pointer type, and the compiler may assume that no
113other pointer in scope aliases that pointer (this is solely a performance
114optimization, and may be ignored by implementers).
115_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts
116it to an expression result of type `T` (commonly called an _rvalue_ of type
117`T`) also strips all the qualifiers from the lvalue, as an expression result
118is a value, not an addressable object that can have properties like
119immutability.
120Though lvalue-to-rvalue conversion strips the qualifiers from lvalues,
121derived rvalue types such as pointer types may include qualifiers;
122`const int *` is a distinct type from `int *`, though the latter is safely
123convertable to the former.
124In general, any number of qualifiers can be safely added to the
125pointed-to-type of a pointer type, e.g. `int *` converts safely to
126`const int *` and `volatile int *`, both of which convert safely to
127`const volatile int *`.
128
129Since lvalues are precicely "addressable objects", in C, only lvalues can be
130used as the operand of the `&` address-of operator.
131Similarly, only modifiable lvalues may be used as the assigned-to
132operand of the mutating operators: assignment, compound assignment
133(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without
134the `const` qualifier are modifiable, but lvalues of incomplete types, array
135types, and struct or union types with const members are also not modifiable.
136Lvalues are produced by the following expressions: object identifiers
137(function identifiers are not considered to be lvalues), the result of the `*` 
138dereference operator applied to an object pointer, the result of a member
139expression `s.f` if the left argument `s` is an lvalue (note that the
140preceding two rules imply that the result of indirect member expressions
141`s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the
142indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`).
143Somewhat less obviously, parenthesized lvalue expressions, string literals,
144and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.
145
146All of the conversions described above are defined in standard C, but Cforall
147requires further features from its type system.
148In particular, to allow overloading of the `*?` and `?[?]` dereferencing and
149indexing operators, Cforall requires a way to declare that the functions
150defining these operators return lvalues, and since C functions never return
151lvalues and for syntactic reasons we wish to distinguish functions which
152return lvalues from functions which return pointers, this is of necessity an
153extension to standard C.
154In the current design, an `lvalue` qualifier can be added to function return
155types (and only to function return types), the effect of which is to return a
156pointer which is implicitly dereferenced by the caller.
157C++ includes the more general concept of _references_, which are typically
158implemented as implicitly dereferenced pointers as well.
159Another use case which C++ references support is providing a way to pass
160function parameters by reference (rather than by value) with a natural
161syntax; Cforall in its current state has no such mechanism.
162As an example, consider the following (currently typical) copy-constructor
163signature and call:
164
165        void ?{}(T *lhs, T rhs);
166       
167        T x;
168        T y = { x };
169
170Note that the right-hand argument is passed by value, and would in fact be
171copied twice in the course of the constructor call `T y = { x };` (once into
172the parameter by C's standard `memcpy` semantics, once again in the body of
173the copy constructor, though it is possible that return value optimization
174will elide the `memcpy`-style copy).
175However, to pass by reference using the existing pointer syntax, the example
176above would look like this:
177
178        void ?{}(T *lhs, const T *rhs);
179       
180        T x;
181        T y = { &x };
182
183This example is not even as bad as it could be; assuming pass-by-reference is
184the desired semantics for the `?+?` operator, that implies the following
185design today:
186
187        T ?+?(const T *lhs, const T *rhs);
188       
189        T a, b;
190        T c = &a + &b,
191
192In addition to `&a + &b` being unsightly and confusing syntax to add `a` and
193`b`, it also introduces a possible ambiguity with pointer arithmetic on `T*`
194which can only be resolved by return-type inference.
195
196Pass-by-reference and marking functions as returning lvalues instead of the
197usual rvalues are actually closely related concepts, as obtaining a reference
198to pass depends on the referenced object being addressable, i.e. an lvalue,
199and lvalue return types are effectively return-by-reference.
200Cforall should also unify the concepts, with a parameterized type for
201"reference to `T`", which I will write `ref T`.
202Syntax bikeshedding can be done later (there are some examples at the bottom
203of this section), but `ref T` is sufficiently distinct from both the existing
204`lvalue T` (which it subsumes) and the closely related C++ `T&` to allow
205independent discussion of its semantics.
206
207Firstly, assignment to a function parameter as part of a function call and
208local variable initialization have almost identical semantics, so should be
209treated similarly for the reference type too; this implies we should be able
210to declare local variables of reference type, as in the following:
211
212        int x = 42;
213        ref int r = x; // r is now an alias for x
214
215Unlike in C++, we would like to have the capability to re-bind references
216after initialization, as this allows the attractive syntax of references to
217support some further useful code patterns, such as first initializing a
218reference after its declaration.
219Constant references to `T` (`const ref T`) should not be re-bindable.
220
221One option for re-binding references is to use a dedicated operator, as in the
222code example below:
223
224        int i = 42, j = 7;
225        ref int r = i; // bind r to i
226        r = j;         // set i (== r) to 7
227        r := j;        // rebind r to j using the new := rebind operator
228        i = 42;        // reset i (!= r) to 42
229        assert( r == 7 );
230
231The other syntactic option for reference re-bind would be to overload
232assignment and use type inference on the left and right-hand sides to
233determine whether the referred-to variable on the left should be reassigned to
234the value on the right, or if the reference on the left should be aliased to
235the reference on the right.
236This could be disambiguated with casts, as in the following code example:
237
238        int i
239        int j;
240        ref int r = i;   // (0a)
241        ref int s = i;   // (0b)
242       
243        i = j;           // (1)
244        i = (int)s;      // (2)
245        i = s;           // (3)
246        // ---------------------
247        r = s;           // (4)
248        r = (ref int)j;  // (5)
249        // ---------------------
250        r = j;           // (6)
251        r = (int)s;      // (7)
252
253By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as
254aliases for `i`.
255For C compatibility, (1) has to be assignment; in general, any assignment to a
256non-reference type should be assignment, so (2) and (3) are as well.
257By types, (4) and (5) should have the same semantics, and the semantics of (6)
258and (7) should match as well.
259This suggests that (4) and (5) are reference re-bind, and (6) and (7) are an
260assignment to the referred variable; this makes the syntax to explicitly alias
261a local variable rather ugly (and inconsistent with the initialization
262syntax), as well as making it rather awkward to copy the value stored in one
263reference-type variable into another reference type variable (which is likely
264more painful in functions with by-reference parameters than with local
265variables of reference type).
266
267Because of the aforementioned issues with overloading assignment as reference
268rebind, in addition to the fact that reference rebind should not be a
269user-overloadable operator (unlike assignment), I propose refererence rebind
270should have its own dedicated operator.
271
272The semantics and restrictions of `ref T` are effectively the semantics of an
273lvalue of type `T`, and by this analogy there should be a safe, qualifier
274dropping conversion from `ref const volatile restrict T` (and every other
275qualifier combination on the `T` in `ref T`) to `T`.
276With this conversion, the resolver may type most expressions that C would
277call "lvalue of type `T`" as `ref T`.
278There's also an obvious argument that lvalues of a (possibly-qualified) type
279`T` should be convertable to references of type `T`, where `T` is also
280so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to
281`ref const char`).
282By similar arguments to pointer types, qualifiers should be addable to the
283referred-to type of a reference (e.g. `ref int` to `ref const int`).
284As a note, since pointer arithmetic is explictly not defined on `ref T`,
285`restrict ref T` should be allowable and would have alias-analysis rules that
286are actually comprehensible to mere mortals.
287
288Using pass-by-reference semantics for function calls should not put syntactic
289constraints on how the function is called; particularly, temporary values
290should be able to be passed by reference.
291The mechanism for this pass-by-reference would be to store the value of the
292temporary expression into a new unnamed temporary, and pass the reference of
293that temporary to the function.
294As an example, the following code should all compile and run:
295
296        void f(ref int x) { printf("%d\n", x++); }
297       
298        int i = 7, j = 11;
299        const int answer = 42;
300       
301        f(i);      // (1)
302        f(42);     // (2)
303        f(i + j);  // (3)
304        f(answer); // (4)
305
306The semantics of (1) are just like C++'s, "7" is printed, and `i` has the
307value 8 afterward.
308For (2), "42" is printed, and the increment of the unnamed temporary to 43 is
309not visible to the caller; (3) behaves similarly, printing "19", but not
310changing `i` or `j`.
311(4) is a bit of an interesting case; we want to be able to support named
312constants like `answer` that can be used anywhere the constant expression
313they're replacing (like `42`) could go; in this sense, (4) and (2) should have
314the same semantics.
315However, we don't want the mutation to the `x` parameter to be visible in
316`answer` afterward, because `answer` is a constant, and thus shouldn't change.
317The solution to this is to allow chaining of the two `ref` conversions;
318`answer` has the type `ref const int`, which can be converted to `int` by the
319lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int` 
320by the temporary-producing rvalue-to-lvalue conversion.
321Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42),
322mutated by `f`, then discarded; "42" is printed, just as in case (2), and
323`answer` still equals 42 after the call, because it was the temporary that was
324mutated, not `answer`.
325It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while
326`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in
327C++, leading to the dreaded "const hell"), but the behaviour of this rule can
328be determined by examining local scope with the simple rule "non-`const`
329references to `const` variables produce temporaries", which aligns with
330programmer intuition that `const` variables cannot be mutated.
331
332To bikeshed syntax for `ref T`, there are three basic options: language
333keywords (`lvalue T` is already in Cforall), compiler-supported "special"
334generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++
335programmers).
336Keyword or generic based approaches run the risk of name conflicts with
337existing code, while any sigil used would have to be carefully chosen to not
338create parsing conflicts.
339
340**TODO** Consider arguments for move semantics and see if there is a
341compelling case for rvalue references.
342
343### Conversion Operator Costs ###
344Copy constructors, safe conversions, and unsafe conversions all have an
345associated conversion cost, calculated according to the algorithm below:
346
3471. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
3482. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
3493. Monomoprhic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
3504. Polymorphic conversion operators (or copy constructors) have a conversion
351   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic
352   equivalent and the sum of the conversion costs of all conversion operators
353   passed as assertion parameters, but where the fourth "count" element of the
354   cost tuple is fixed to `1`.
355
356**TODO** Polymorphism cost may need to be reconsidered in the light of the
357thoughts on polymorphism below.
358**TODO** You basically just want path-length in the conversion graph implied
359by the set of conversions; the only tricky question is whether or not you can
360account for "mixed" safe and unsafe conversions used to satisfy polymorphic
361constraints, whether a polymorphic conversion should cost more than a
362monomorphic one, and whether to account for non-conversion constraints in the
363polymorphism cost
364
365### Argument-Parameter Matching ###
366Given a function `f` with an parameter list (after tuple flattening)
367`(T1 t1, T2 t2, ... Tn tn)`, and a function application
368`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the
369appropriate parameter is calculated according to the algorithm below:
370
371Given a parameter `t` of type `T` and an expression `<e>` from these lists,
372`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with
373associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`.
374(If any `Ei` is a tuple type, replace it with its first flattened element for
375the purposes of this section.)
376
377The cost of matching the interpretation of `<e>` with type `Ei` to `t1` with
378type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the
379conversion operator cost from `Ei` to `T`.
380
381### Object Initialization ###
382The cost to initialize an object is calculated very similarly to
383argument-parameter matching, with a few modifications.
384Firstly, explicit constructors are included in the set of available
385conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic
386conversion costs (if applicable) and the _interpretation cost_ of the
387constructor, the sum of the argument-parameter matching costs for its
388parameters.
389Also, ties in overall cost (interpretation cost plus conversion cost) are
390broken by lowest conversion cost (i.e. of alternatives with the same overall
391cost, copy constructors are preferred to other explicit constructors,
392explicit constructors are preferred to safe conversions, which are preferred
393to unsafe conversions).
394An object initialization is properly typed if it has exactly one min-cost
395interpretation.
396
397### Explicit Casts ###
398Explicit casts are handled similarly to object initialization.
399Copy constructors and other explicit constructors are not included in the set
400of possible conversions, though interpreting a cast as type ascription
401(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost
402`(0, 0, 0, 0)`.
403Explicit conversion operators are also included in the set of possible
404conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion
405costs are invoked.
406Unlike for explicit constructors and other functions, implicit conversions are
407never applied to the argument or return type of an explicit cast operator, so
408that the cast may be used more effectively as a method for the user programmer
409to guide type resolution.
410
411## Trait Satisfaction ##
412A _trait_ consists of a list of _type variables_ along with a (possibly empty)
413set of _assertions_ on those variables.
414Assertions can take two forms, _variable assertions_ and the more common
415_function assertions_, as in the following example:
416
417        trait a_trait(otype T, otype S) {
418                T a_variable_assertion;
419                S* another_variable_assertion;
420                S a_function_assertion( T* );
421        };
422
423Variable assertions enforce that a variable with the given name and type
424exists (the type is generally one of the type variables, or derived from one),
425while a function assertion enforces that a function with a
426_compatible signature_ to the provided function exists.
427
428To test if some list of types _satisfy_ the trait, the types are first _bound_ 
429to the type variables, and then declarations to satisfy each assertion are
430sought out.
431Variable assertions require an exact match, because they are passed as object
432pointers, and there is no mechanism to employ conversion functions, while
433function assertions only require a function that can be wrapped to a
434compatible type; for example, the declarations below satisfy
435`a_trait(int, short)`:
436
437        int a_variable_assertion;
438        short* another_variable_assertion;
439        char a_function_assertion( void* );
440        // int* may be implicitly converted to void*, and char to short, so the
441        // above works
442
443Cforall Polymorphic functions have a _constraining trait_, denoted as follows:
444
445        forall(otype A, otype B | some_trait(A, B))
446
447The trait may be anonymous, with the same syntax as a trait declaration, and
448may be unioned together using `|` or `,`.
449
450**TODO** Consider including field assertions in the list of constraint types,
451also associated types and the appropriate matching type assertion.
452
453## Polymorphism Costs ##
454The type resolver should prefer functions that are "less polymorphic" to
455functions that are "more polymorphic".
456Determining how to order functions by degree of polymorphism is somewhat less
457straightforward, though, as there are multiple axes of polymorphism and it is
458not always clear how they compose.
459The natural order for degree of polymorphism is a partial order, and this
460section includes some open questions on whether it is desirable or feasible to
461develop a tie-breaking strategy to impose a total order on the degree of
462polymorphism of functions.
463Helpfully, though, the degree of polymorphism is a property of functions
464rather than function calls, so any complicated graph structure or calculation
465representing a (partial) order over function degree of polymorphism can be
466calculated once and cached.
467
468### Function Parameters ###
469All other things being equal, if a parameter of one function has a concrete
470type and the equivalent parameter of another function has a dynamic type, the
471first function is less polymorphic:
472
473                        void f( int, int );  // (0) least polymorphic
474        forall(otype T) void f( T, int );    // (1a) more polymorphic than (0)
475        forall(otype T) void f( int, T );    // (1b) more polymorphic than (0)
476                                             //      incomparable with (1a)
477        forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b)
478
479This should extend to parameterized types (pointers and generic types) also:
480
481        forall(otype S) struct box { S val; };
482       
483        forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
484        forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
485        forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
486                                               //     incomparable with (3)
487        forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)
488
489Every function in the group above is incomparable with (1a/b), but that's fine
490because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur
491much in practice (unless there are safe or unsafe conversions defined between
492the possible argument types).
493
494For degree of polymorphism from arguments, I think we should not distinguish
495between different type parameters, e.g. the following should be considered
496equally polymorphic:
497
498        forall(otype T, otype S) void f( T, T, S );  // (7)
499        forall(otype T, otype S) void f( S, T, T );  // (8)
500
501However parameter lists are compared, parameters of multi-parameter generic
502types should ideally be treated as a recursive case, e.g. in the example
503below, (9) is less polymorphic than (10), which is less polymorphic than (11):
504
505        forall(otype T, otype S) struct pair { T x; S y; };
506       
507                        void f( pair(int, int) );  // (9)
508        forall(otype T) void f( pair(T, int) );    // (10)
509        forall(otype T) void f( pair(T, T) );      // (11)
510
511Parameter comparison could possibly be made somewhat cheaper at loss of some
512precision by representing each parameter as a value from the natural numbers
513plus infinity, where infinity represents a monomorphic parameter and a finite
514number counts how many levels deep the shallowest type variable is, e.g. where
515`T` is a type variable, `int` would have value infinity, `T` would have value
5160, `T*` would have value 1, `box(T)*` would have value 2, etc.
517Under this scheme, higher values represent less polymorphism.
518This makes the partial order on parameters a total order, so that many of the
519incomparable functions above compare equal, though that is perhaps a virtue.
520It also loses the ability to differentiate between some multi-parameter
521generic types, such as the parameters in (10) and (11), which would both be
522valued 1, losing the polymorphism distinction between them.
523
524A variant of the above scheme would be to fix a maximum depth of polymorphic
525type variables (16 seems like a reasonable choice) at which a parameter would
526be considered to be effectively monomorphic, and to subtract the value
527described above from that maximum, clamping the result to a minimum of 0.
528Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has
529value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0,
530the same as `int`.
531This can be quite succinctly represented, and summed without the presence of a
532single monomorphic parameter pushing the result to infinity, but does lose the
533ability to distinguish between very deeply structured polymorphic types.
534
535### Parameter Lists ###
536A partial order on function parameter lists can be produced by the
537product order of the partial orders on parameters described above.
538In more detail, this means that for two parameter lists with the same arity,
539if any pair of corresponding parameters are incomparable with respect to each
540other, the two parameter lists are incomparable; if in all pairs of
541corresponding parameters one list's parameter is always (less than or) equal
542to the other list's parameter than the first parameter list is (less than or)
543equal to the second parameter list; otherwise the lists are incomparable with
544respect to each other.
545
546How to compare parameter lists of different arity is a somewhat open question.
547A simple, but perhaps somewhat unsatisfying, solution would be just to say
548that such lists are incomparable.
549The simplist approach to make them comparable is to say that, given two lists
550`(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists
551can be compared based on their shared prefix of `n` types.
552This approach breaks the transitivity property of the equivalence relation on
553the partial order, though, as seen below:
554
555        forall(otype T) void f( T, int );       // (1a)
556        forall(otype T) void f( T, int, int );  // (12)
557        forall(otype T) void f( T, int, T );    // (13)
558
559By this rule, (1a) is equally polymorphic to both (12) and (13), so by
560transitivity (12) and (13) should also be equally polymorphic, but that is not
561actually the case.
562
563We can fix the rule by saying that `(T1 ... Tn)` can be compared to
564`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting
565notional monomorphic parameters.
566In this case, (1a) and (12) are equally polymorphic, because (1a) gets
567extended with a monomorphic type that compares equal to (12)'s third `int` 
568parameter, but (1a) is less polymorphic than (13), because its notional
569monomorphic third parameter is less polymorphic than (13)'s `T`.
570Essentially what this rule says is that any parameter list with more
571parameters is no less polymorphic than one with fewer.
572
573We could collapse this parameter list ordering to a succinct total order by
574simply taking the sum of the clamped parameter polymorphism counts, but this
575would again make most incomparable parameter lists compare equal, as well as
576having the potential for some unexpected results based on the (completely
577arbitrary) value chosen for "completely polymorphic".
578For instance, if we set 4 to be the maximum depth of polymorphism (as above),
579the following functions would be equally polymorphic, which is a somewhat
580unexpected result:
581
582        forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12
583        forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12
584
585These functions would also be considered equally polymorphic:
586
587        forall(otype T) void g( T, int );    // 4 + 0 = 4;
588        forall(otype T) void g( T**, T** );  // 2 + 2 = 4;
589
590This issue can be mitigated by choosing a larger maximum depth of
591polymorphism, but this scheme does have the distinct disadvantage of either
592specifying the (completely arbitrary) maximum depth as part of the language or
593allowing the compiler to refuse to accept otherwise well-typed deeply-nested
594polymorphic types.
595
596For purposes of determining polymorphism, the list of return types of a
597function should be treated like another parameter list, and combined with the
598degree of polymorphism from the parameter list in the same way that the
599parameters in the parameter list are combined.
600For instance, in the following, (14) is less polymorphic than (15) which is
601less polymorphic than (16):
602
603        forall(otype T) int f( T );  // (14)
604        forall(otype T) T*  f( T );  // (15)
605        forall(otype T) T   f( T );  // (16)
606
607### Type Variables and Bounds ###
608Degree of polymorphism doesn't solely depend on the parameter lists, though.
609Glen's thesis (4.4.4, p.89) gives an example that shows that it also depends
610on the number of type variables as well:
611
612        forall(otype T)          void f( T, int );  // (1a) polymorphic
613        forall(otype T)          void f( T, T );    // (2)  more polymorphic
614        forall(otype T, otype S) void f( T, S );    // (17) most polymorphic
615
616Clearly the `forall` type parameter list needs to factor into calculation of
617degree of polymorphism as well, as it's the only real differentiation between
618(2) and (17).
619The simplest way to include the type parameter list would be to simply count
620the type variables and say that functions with more type variables are more
621polymorphic.
622
623However, it also seems natural that more-constrained type variables should be
624counted as "less polymorphic" than less-constrained type variables.
625This would allow our resolver to pick more specialized (and presumably more
626efficient) implementations of functions where one exists.
627For example:
628
629        forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic
630        forall(otype T)                  T f( T );  // (16) more polymorphic
631
632We could account for this by counting the number of unique constraints and
633saying that functions with more constraints are less polymorphic.
634
635That said, we do model the `forall` constraint list as a (possibly anonymous)
636_trait_, and say that each trait is a set of constraints, so we could
637presumably define a partial order over traits based on subset inclusion, and
638use this partial order instead of the weaker count of constraints to order the
639list of type parameters of a function, as below:
640
641        trait has_g(otype T) { void g(T); };
642        trait has_h(otype S) { void h(T); };
643        trait has_gh(otype R | has_g(R) | has_h(R)) {};
644        // has_gh is equivlent to { void g(R); void h(R); }
645       
646        forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic
647        forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19)
648        forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19)
649                                               //       incomparable with (18)
650        forall(otype T)             T f( T );  // (16) most polymorphic
651
652The tricky bit with this is figuring out how to compare the constraint
653functions for equality up to type variable renaming; I suspect there's a known
654solution, but don't know what it is (perhaps some sort of unification
655calculation, though I hope there's a more lightweight option).
656We also should be able to take advantage of the programmer-provided trait
657subset information (like the constraint on `has_gh` in the example) to more
658efficiently generate the partial-order graph for traits, which should be able
659to be cached for efficiency.
660
661Combining count of type variables with the (partial) order on the trait
662constraining those variables seems like it should be a fairly straightforward
663product ordering to me - one `forall` qualifier is (less than or) equal to
664another if it has both a (less than or) equal number of type variables and a
665(less than or) equal degree of polymorphism from its constraining trait; the
666two qualifiers are incomparable otherwise.
667If an easier-to-calculate total ordering is desired, it might be acceptable to
668use the number of type variables, with ties broken by number of constraints.
669
670Similarly, to combine the (partial) orders on parameter and return lists with
671the (partial) order on `forall` qualifiers, a product ordering seems like the
672reasonable choice, though if we wanted a total order a reasonable choice would
673be to use whatever method we use to combine parameter costs into parameter
674lists to combine the costs for the parameter and return lists, then break ties
675by the order on the `forall` qualifiers.
676
677## Expression Costs ##
678
679### Variable Expressions ###
680Variables may be overloaded; that is, there may be multiple distinct variables
681with the same name so long as each variable has a distinct type.
682The variable expression `x` has one zero-cost interpretation as type `T` for
683each variable `T x` in scope.
684
685### Member Selection Expressions ###
686For every interpretation `I` of `e` which has a struct or union type `S`,
687`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the
688same cost as `I`.
689Note that there may be more than one member of `S` with the same name, as per
690Cforall's usual overloading rules.
691The indirect member expression `e->y` is desugared to `(*e).y` and interpreted
692analogously.
693
694**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no
695interpretations as `e.y` exist.
696
697### Address Expressions ###
698Address expressions `&e` have an interpretation for each interpretation `I` of
699`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`.
700Lvalues result from variable expressions, member selection expressions, or
701application of functions returning an lvalue-qualified type.
702Note that the dereference operator is overloadable, so the rules for its
703resolution follow those for function application below.
704
705**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a
706new temporary holding `42` and takes its address.
707
708### Boolean-context Expressions ###
709C has a number of "boolean contexts", where expressions are assigned a truth
710value; these include both arguments to the short-circuiting `&&` and `||` 
711operators, as well as the conditional expressions in `if` and `while` 
712statements, the middle expression in `for` statements, and the first argument
713to the `?:` ternary conditional operator.
714In all these contexts, C interprets `0` (which is both an integer and a null
715pointer literal) as false, and all other integer or pointer values as true.
716In this spirit, Cforall allows other types to be considered "truthy" if they
717support the following de-sugaring in a conditional context (see notes on
718interpretation of literal `0` below):
719
720        x => ((int)( x != 0 ))
721
722### Literal Expressions ###
723Literal expressions (e.g. 42, 'c', 3.14, "Hello, world!") have one
724zero-cost interpretation with the same type the expression would have in C,
725with three exceptions:
726
727Character literals like 'x' are typed as `char` in Cforall, not `int` as in C.
728This change breaks very little C code (primarily `sizeof 'x'`; the implicit
729conversion from `int` to `char` and lack of overloading handle most other
730expressions), matches the behaviour of C++, and is more compatible with
731programmer intuition.
732
733The literals `0` and `1` are also treated specially by Cforall, due to their
734potential uses in operator overloading.
735Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
736multiple interpretations of them according to the existing variable
737overloading rules, with the following declarations in the prelude:
738
739        const int 0, 1;
740        forall ( dtype DT ) const DT * const    0;
741        forall ( ftype FT ) FT * const          0;
742
743This did, however, create some backward-compatibility problems and potential
744performance issues, and works poorly for generic types. To start with, this
745(entirely legal C) code snippet doesn't compile in Cforall:
746
747        if ( 0 ) {}
748
749It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
750`forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
751not choose which `0` variable to take, because they're both exact matches.
752
753The general != computation may also be less efficient than a check for a zero
754value; take the following example of a rational type:
755
756        struct rational { int32_t num, int32_t den };
757        rational 0 = { 0, 1 };
758       
759        int ?!=? (rational a, rational b) {
760                return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
761        }
762       
763        int not_zero (rational a) { return a.num != 0; }
764
765To check if two rationals are equal we need to do a pair of multiplications to
766normalize them (the casts in the example are to prevent overflow), but to
767check if a rational is non-zero we just need to check its numerator, a more
768efficient operation.
769
770Finally, though polymorphic null-pointer variables can be meaningfully
771defined, most other polymorphic variables cannot be, which makes it difficult
772to make generic types "truthy" using the existing system:
773
774        forall(otype T) struct pair { T x; T y; };
775        forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
776
777Now, it seems natural enough to want to define the zero for this pair type as
778a pair of the zero values of its element type (if they're defined).
779The declaration of `pair(T) 0` above is actually illegal though, as there is
780no way to represent the zero values of an infinite number of types in the
781single memory location available for this polymorphic variable - the
782polymorphic null-pointer variables defined in the prelude are legal, but that
783is only because all pointers are the same size and the single zero value is a
784legal value of all pointer types simultaneously; null pointer is, however,
785somewhat unique in this respect.
786
787The technical explanation for the problems with polymorphic zero is that `0` 
788is really a rvalue, not a lvalue - an expression, not an object.
789Drawing from this, the solution we propose is to give `0` a new built-in type,
790`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in
791type `_unit_t`.
792If the prelude defines != over `_zero_t` this solves the `if ( 0 )` problem,
793because now the unambiguous best interpretation of `0 != 0` is to read them
794both as `_zero_t` (and say that this expression is false).
795Backwards compatibility with C can be served by defining conversions in the
796prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer
797types, as below:
798
799        // int 0;
800        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
801        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);
802       
803        // int 1;
804        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
805        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);
806       
807        // forall(dtype DT) const DT* 0;
808        forall(dtype DT) void ?{safe}(const DT**, _zero_t);
809        // forall(ftype FT) FT* 0;
810        forall(ftype FT) void ?{safe}(FT**, _zero_t);
811
812Further, with this change, instead of making `0` and `1` overloadable
813variables, we can instead allow user-defined constructors (or, more flexibly,
814safe conversions) from `_zero_t`, as below:
815
816        // rational 0 = { 0, 1 };
817        void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }
818
819Note that we don't need to name the `_zero_t` parameter to this constructor,
820because its only possible value is a literal zero.
821This one line allows `0` to be used anywhere a `rational` is required, as well
822as enabling the same use of rationals in boolean contexts as above (by
823interpreting the `0` in the desguraring to be a rational by this conversion).
824Furthermore, while defining a conversion function from literal zero to
825`rational` makes rational a "truthy" type able to be used in a boolean
826context, we can optionally further optimize the truth decision on rationals as
827follows:
828
829        int ?!=? (rational a, _zero_t) { return a.num != 0; }
830
831This comparison function will be chosen in preference to the more general
832rational comparison function for comparisons against literal zero (like in
833boolean contexts) because it doesn't require a conversion on the `0` argument.
834Functions of the form `int ?!=? (T, _zero_t)` can acutally be used in general
835to make a type `T` truthy without making `0` a value which can convert to that
836type, a capability not available in the current design.
837
838This design also solves the problem of polymorphic zero for generic types, as
839in the following example:
840
841        // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
842        forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
843                this->x = 0; this->y = 0;
844        }
845
846The polymorphic variable declaration didn't work, but this constructor is
847perfectly legal and has the desired semantics.
848
849We can assert that `T` can be used in a boolean context as follows:
850
851        `forall(otype T | { int ?!=?(T, _zero_t); })`
852 
853Since the C standard (6.5.16.1.1) specifically states that pointers can be
854assigned into `_Bool` variables (and implies that other artithmetic types can
855be assigned into `_Bool` variables), it seems natural to say that assignment
856into a `_Bool` variable effectively constitutes a boolean context.
857To allow this interpretation, I propose including the following function (or
858its effective equivalent) in the prelude:
859
860        forall(otype T | { int ?!=?(T, _zero_t); })
861        void ?{safe}( _Bool *this, T that ) { *this = that != 0; }
862
863Note that this conversion is not transitive; that is, for `t` a variable of
864some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
865of a lower-cost one), `(int)t;` would not use this conversion (and in fact
866would not be legal in the absence of another valid way to convert a `T` to an
867`int`), but `(int)(_Bool)t;` could legally use this conversion.
868
869Similarly giving literal `1` the special type `_unit_t` allows for more
870concise and consistent specification of the increment and decrement operators,
871using the following de-sugaring:
872
873        ++i => i += 1
874        i++ => (tmp = i, i += 1, tmp)
875        --i => i -= 1
876        i-- => (tmp = i, i -= 1, tmp)
877
878In the examples above, `tmp` is a fresh temporary with its type inferred from
879the return type of `i += 1`.
880Under this proposal, defining a conversion from `_unit_t` to `T` and a
881`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
882for free in a consistent fashion (similarly for -= and the decrement
883operators).
884If a meaningful `1` cannot be defined for a type, both increment operators can
885still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`.
886Similarly, if scalar addition can be performed on a type more efficiently than
887by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
888addition operator, it will simultaneously define consistent implementations of
889both increment operators (this can also be accomplished by defining a
890conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
891
892To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
893increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`,
894we also define a non-transitive unsafe conversion from `_Bool` (allowable
895values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:
896
897        void ?{unsafe} (_unit_t*, _Bool) {}
898
899As a note, the desugaring of post-increment above is possibly even more
900efficient than that of C++ - in C++, the copy to the temporary may be hidden
901in a separately-compiled module where it can't be elided in cases where it is
902not used, whereas this approach for Cforall always gives the compiler the
903opportunity to optimize out the temporary when it is not needed.
904Furthermore, one could imagine a post-increment operator that returned some
905type `T2` that was implicitly convertable to `T` but less work than a full
906copy of `T` to create (this seems like an absurdly niche case) - since the
907type of `tmp` is inferred from the return type of `i += 1`, you could set up
908functions with the following signatures to enable an equivalent pattern in
909Cforall:
910
911        lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
912        void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
913        void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to
914                                      // preserve expected semantics of T x = y++;
915
916**TODO** Look in C spec for literal type interprations.
917**TODO** Write up proposal for wider range of literal types, put in appendix
918
919### Initialization and Cast Expressions ###
920An initialization expression `T x = e` has one interpretation for each
921interpretation `I` of `e` with type `S` which is convertable to `T`.
922The cost of the interpretation is the cost of `I` plus the conversion cost
923from `S` to `T`.
924A cast expression `(T)e` is interpreted as hoisting initialization of a
925temporary variable `T tmp = e` out of the current expression, then replacing
926`(T)e` by the new temporary `tmp`.
927
928### Assignment Expressions ###
929An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then
930interpreted according to the usual rules for function application and comma
931expressions.
932Operator-assignment expressions like `e += f` desugar similarly as
933`(?+=?(&e, f), e)`.
934
935### Function Application Expressions ###
936Every _compatible function_ and satisfying interpretation of its arguments and
937polymorphic variable bindings produces one intepretation for the function
938application expression.
939Broadly speaking, the resolution cost of a function application is the sum of
940the cost of the interpretations of all arguments, the cost of all conversions
941to make those argument interpretations match the parameter types, and the
942binding cost of any of the function's polymorphic type parameters.
943
944**TODO** Work out binding cost in more detail.
945**TODO** Address whether "incomparably polymorphic" should be treated as
946"equally polymorphic" and be disambiguated by count of (safe) conversions.
947**TODO** Think about what polymorphic return types mean in terms of late
948binding.
949**TODO** Consider if "f is less polymorphic than g" can mean exactly "f
950specializes g"; if we don't consider the assertion parameters (except perhaps
951by count) and make polymorphic variables bind exactly (rather than after
952implicit conversions) this should actually be pre-computable.
953**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted
954function declaration is costed out by the resolver in the same way that any
955other function declaration is costed; if the deleted declaration is the unique
956min-cost resolution refuse to type the expression, if it is tied for min-cost
957then take the non-deleted alternative, and of two equivalent-cost deleted
958interpretations with the same return type pick one arbitrarily rather than
959producing an ambiguous resolution. This would also be useful for forbidding
960pointer-to-floating-point explicit conversions (C11, 6.5.4.4).
961**TODO** Cover default parameters, maybe named parameters (see "named
962arguments" thread of 11 March 2016)
963 
964
965### Sizeof, Alignof & Offsetof Expressions ###
966`sizeof`, `alignof`, and `offsetof` expressions have at most a single
967interpretation, of type `size_t`.
968`sizeof` and `alignof` expressions take either a type or an expression as an
969argument; if the argument is a type, it must be a complete type which is not a
970function type, if an expression, the expression must have a single
971interpretation, the type of which conforms to the same rules.
972`offsetof` takes two arguments, a type and a member name; the type must be
973a complete structure or union type, and the second argument must name a member
974of that type.
975
976### Comma Expressions ###
977A comma expression `x, y` resolves `x` as if it had been cast to `void`, and
978then, if there is a unique interpretation `I` of `x`, has one interpretation
979for each interpretation `J` of `y` with the same type as `J` costing the sum
980of the costs of `I` and `J`.
981
982### Index Expressions ###
983**TODO** Consider adding polymorphic function in prelude for this, as per
9846.5.2.1.2 in the C standard:
985
986        forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); })
987        lvalue E ?[?](T a, I i) { return *(a + i); }
988
989I think this isn't actually a good idea, because the cases for it are niche,
990mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a
991pointer `p`, and adding it to the prelude would slow down resolution of
992every index expression just a bit. Our existing prelude includes an indexing
993operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified
994variants, which should satisfy C source-compatibility without propegating this
995silly desugaring further.
996
997#### Compatible Functions ####
998**TODO** This subsection is very much a work in progress and has multiple open
999design questions.
1000
1001A _compatible function_ for an application expression is a visible function
1002declaration with the same name as the application expression and parameter
1003types that can be converted to from the argument types.
1004Function pointers and variables of types with the `?()` function call operator
1005overloaded may also serve as function declarations for purposes of
1006compatibility.
1007
1008For monomorphic parameters of a function declaration, the declaration is a
1009compatible function if there is an argument interpretation that is either an
1010exact match, or has a safe or unsafe implicit conversion that can be used to
1011reach the parameter type; for example:
1012
1013        void f(int);
1014       
1015        f(42);        // compatible; exact match to int type
1016        f('x');       // compatible; safe conversion from char => int
1017        f(3.14);      // compatible; unsafe conversion from double => int
1018        f((void*)0);  // not compatible; no implicit conversion from void* => int
1019
1020Per Richard[*], function assertion satisfaction involves recursively searching
1021for compatible functions, not an exact match on the function types (I don't
1022believe the current Cforall resolver handles this properly); to extend the
1023previous example:
1024
1025        forall(otype T | { void f(T); }) void g(T);
1026       
1027        g(42);    // binds T = int, takes f(int) by exact match
1028        g('x');   // binds T = char, takes f(int) by conversion
1029        g(3.14);  // binds T = double, takes f(int) by conversion
1030
1031[*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the
1032accessible scopes for definitions corresponding to assertion names, and
1033choosing the ones whose types correspond *most closely* to the assertion
1034types." (emphasis mine)
1035
1036There are three approaches we could take to binding type variables: type
1037variables must bind to argument types exactly, each type variable must bind
1038exactly to at least one argument, or type variables may bind to any type which
1039all corresponding arguments can implicitly convert to; I'll provide some
1040possible motivation for each approach.
1041
1042There are two main arguments for the more restrictive binding schemes; the
1043first is that the built-in implicit conversions in C between `void*` and `T*` 
1044for any type `T` can lead to unexpectedly type-unsafe behaviour in a more
1045permissive binding scheme, for example:
1046
1047        forall(dtype T) T* id(T *p) { return p; }
1048       
1049        int main() {
1050                int *p = 0;
1051                char *c = id(p);
1052        }
1053
1054This code compiles in CFA today, and it works because the extra function
1055wrapper `id` provides a level of indirection that allows the non-chaining
1056implicit conversions from `int*` => `void*` and `void*` => `char*` to chain.
1057The resolver types the last line with `T` bound to `void` as follows:
1058
1059        char *c = (char*)id( (void*)p );
1060
1061It has been suggested that making the implicit conversions to and from `void*` 
1062explicit in Cforall code (as in C++) would solve this particular problem, and
1063provide enough other type-safety benefits to outweigh the source-compatibility
1064break with C; see Appendix D for further details.
1065
1066The second argument for a more constrained binding scheme is performance;
1067trait assertions need to be checked after the type variables are bound, and
1068adding more possible values for the type variables should correspond to a
1069linear increase in runtime of the resolver per type variable.
1070There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of
1071them is implicitly convertable to any other; if we allow unrestricted binding
1072of type variables, a common `int` variable (or literal) used in the position
1073of a polymorphic variable parameter would cause a 20x increase in the amount
1074of time needed to check trait resolution for that interpretation.
1075These numbers have yet to be emprically substantiated, but the theory is
1076reasonable, and given that much of the impetus for re-writing the resolver is
1077due to its poor performance, I think this is a compelling argument.
1078
1079I would also mention that a constrained binding scheme is practical; the most
1080common type of assertion is a function assertion, and, as mentioned above,
1081those assertions should be able to be implicitly converted to to match.
1082Thus, in the example above with `g(T)`, where the assertion is `void f(T)`,
1083we first bind `T = int` or `T = char` or `T = double`, then substitute the
1084binding into the assertion, yielding assertions of `void f(int)`,
1085`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy
1086these assertions to complete the binding.
1087Though in all three cases, the existing function with signature `void f(int)` 
1088satisfies this assertion, the checking work cannot easily be re-used between
1089variable bindings, because there may be better or worse matches depending on
1090the specifics of the binding.
1091
1092The main argument for a more flexible binding scheme is that the binding
1093abstraction can actually cause a wrapped function call that would work to
1094cease to resolve, as below:
1095
1096        forall(otype T | { T ?+? (T, T) })
1097        T add(T x, T y) { return x + y; }
1098       
1099        int main() {
1100                int i, j = 2;
1101                short r, s = 3;
1102                i = add(j, s);
1103                r = add(s, j);
1104        }
1105
1106Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and
1107in both cases the short `s` is promoted to `int` to match `j`.
1108If, on the other hand, we demand that variables exactly match type variables,
1109neither call to `add` will compile, because it is impossible to simultaneously
1110bind `T` to both `int` and `short` (C++ has a similar restriction on template
1111variable inferencing).
1112One alternative that enables this case, while still limiting the possible
1113type variable bindings is to say that at least one argument must bind to its
1114type parameter exactly.
1115In this case, both calls to `add` would have the set `{ T = int, T = short }` 
1116for candidate bindings, and would have to check both, as well as checking that
1117`short` could convert to `int` or vice-versa.
1118
1119It is worth noting here that parameterized types generally bind their type
1120parameters exactly anyway, so these "restrictive" semantics only restrict a
1121small minority of calls; for instance, in the example following, there isn't a
1122sensible way to type the call to `ptr-add`:
1123
1124        forall(otype T | { T ?+?(T, T) })
1125        void ptr-add( T* rtn, T* x, T* y ) {
1126                *rtn = *x + *y;
1127        }
1128       
1129        int main() {
1130                int i, j = 2;
1131                short s = 3;
1132                ptr-add(&i, &j, &s); // ERROR &s is not an int*
1133        }
1134
1135I think there is some value in providing library authors with the
1136capability to express "these two parameter types must match exactly".
1137This can be done without restricting the language's expressivity, as the `add` 
1138case above can be made to work under the strictest type variable binding
1139semantics with any addition operator in the system by changing its signature
1140as follows:
1141
1142        forall( otype T, otype R, otype S | { R ?+?(T, S); } )
1143        R add(T x, S y) { return x + y; }
1144
1145Now, it is somewhat unfortunate that the most general version here is more
1146verbose (and thus that the path of least resistence would be more restrictive
1147library code); however, the breaking case in the example code above is a bit
1148odd anyway - explicitly taking two variables of distinct types and relying on
1149C's implicit conversions to do the right thing is somewhat bad form,
1150especially where signed/unsigned conversions are concerned.
1151I think the more common case for implicit conversions is the following,
1152though, where the conversion is used on a literal:
1153
1154        short x = 40;
1155        short y = add(x, 2);
1156
1157One option to handle just this case would be to make literals implicitly
1158convertable to match polymorphic type variables, but only literals.
1159The example above would actually behave slightly differently than `x + 2` in
1160C, though, casting the `2` down to `short` rather than the `x` up to `int`, a
1161possible demerit of this scheme.
1162
1163The other question to ask would be which conversions would be allowed for
1164literals; it seems rather odd to allow down-casting `42ull` to `char`, when
1165the programmer has explicitly specified by the suffix that it's an unsigned
1166long.
1167Type interpretations of literals in C are rather complex (see [1]), but one
1168reasonable approach would be to say that un-suffixed integer literals could be
1169interpreted as any type convertable from int, "u" suffixed literals could be
1170interpreted as any type convertable from "unsigned int" except the signed
1171integer types, and "l" or "ll" suffixed literals could only be interpreted as
1172`long` or `long long`, respectively (or possibly that the "u" suffix excludes
1173the signed types, while the "l" suffix excludes the types smaller than
1174`long int`, as in [1]).
1175Similarly, unsuffixed floating-point literals could be interpreted as `float`,
1176`double` or `long double`, but "f" or "l" suffixed floating-point literals
1177could only be interpreted as `float` or `long double`, respectively.
1178I would like to specify that character literals can only be interpreted as
1179`char`, but the wide-character variants and the C practice of typing character
1180literals as `int` means that would likely break code, so character literals
1181should be able to take any integer type.
1182
1183[1] http://en.cppreference.com/w/c/language/integer_constant
1184
1185With the possible exception of the `add` case above, implicit conversions to
1186the function types of assertions can handle most of the expected behaviour
1187from C.
1188However, implicit conversions cannot be applied to match variable assertions,
1189as in the following example:
1190
1191        forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
1192        T clamp_sum( T x, T y ) {
1193                T sum = x + y;
1194                if ( sum < min ) return min;
1195                if ( max < sum ) return max;
1196                return sum;
1197        }
1198       
1199        char min = 'A';
1200        double max = 100.0;
1201        //int val = clamp_sum( 'X', 3.14 );  // ERROR (1)
1202       
1203        char max = 'Z'
1204        char val = clamp_sum( 'X', 3.14 ); // MATCH (2)
1205        double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3)
1206
1207In this example, the call to `clamp_sum` at (1) doesn't compile, because even
1208though there are compatible `min` and `max` variables of types `char` and
1209`double`, they need to have the same type to match the constraint, and they
1210don't.
1211The (2) example does compile, but with a behaviour that might be a bit
1212unexpected given the "usual arithmetic conversions", in that both values are
1213narrowed to `char` to match the `min` and `max` constraints, rather than
1214widened to `double` as is usual for mis-matched arguments to +.
1215The (3) example is the only case discussed here that would require the most
1216permisive type binding semantics - here, `T` is bound to `char`, to match the
1217constraints, and both the parameters are narrowed from `double` to `char` 
1218before the call, which would not be allowed under either of the more
1219restrictive binding semantics.
1220However, the behaviour here is unexpected to the programmer, because the
1221return value will be `(double)'A' /* == 60.0 */` due to the conversions,
1222rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.
1223
1224Personally, I think that implicit conversions are not a particularly good
1225language design, and that the use-cases for them can be better handled with
1226less powerful features (e.g. more versatile rules for typing constant
1227expressions).
1228However, though we do need implicit conversions in monomorphic code for C
1229compatibility, I'm in favour of restricting their usage in polymorphic code,
1230both to give programmers some stronger tools to express their intent and to
1231shrink the search space for the resolver.
1232Of the possible binding semantics I've discussed, I'm in favour of forcing
1233polymorphic type variables to bind exactly, though I could be talked into
1234allowing literal expressions to have more flexibility in their bindings, or
1235possibly loosening "type variables bind exactly" to "type variables bind
1236exactly at least once"; I think the unrestricted combination of implicit
1237conversions and polymorphic type variable binding unneccesarily multiplies the
1238space of possible function resolutions, and that the added resolution options
1239are mostly unexpected and therefore confusing and not useful to user
1240programmers.
1241
1242## Resolver Architecture ##
1243
1244### Function Application Resolution ###
1245Our resolution algorithm for function application expressions is based on
1246Baker's[3] single-pass bottom-up algorithm, with Cormack's[4] single-pass
1247top-down algorithm applied where appropriate as an optimization.
1248Broadly speaking, the cost of this resolution per expression will be
1249proportional to `i^d`, where `i` is the number of interpretations of each
1250program symbol, and `d` is the maximum depth of the expression DAG.
1251Since `d` is determined by the user programmer (in general, bounded by a small
1252constant), opportunities for resolver optimization primarily revolve around
1253minimizing `i`, the number of interpretations of each symbol that are
1254considered.
1255
1256[3] Baker, Theodore P. A one-pass algorithm for overload resolution in Ada.
1257ACM Transactions on Programming Languages and Systems (1982) 4:4 p.601-614
1258
1259[4] Cormack, Gordon V. An algorithm for the selection of overloaded functions
1260in Ada. SIGPLAN Notices (1981) 16:2 p.48-52
1261
1262Unlike Baker, our system allows implicit type conversions for function
1263arguments and return types; the problem then becomes to find the valid
1264interpretation for an expression that has the unique minimal conversion cost,
1265if such exists.
1266Interpretations can be produced both by overloaded names and implicit
1267conversions applied to existing interpretations; we have proposals to reduce
1268the number of interpretations considered from both sources.
1269To simplify the problem for this discussion, we will consider application
1270resolution restricted to a domain of functions applied to variables, possibly
1271in a nested manner (e.g. `f( g( x ), y )`, where `x` and `y` are variables and
1272`f` and `g` are functions), and possibly in a typed context such as a variable
1273initialization (e.g. `int i = f( x );`); the other aspects of Cforall type
1274resolution should be able to be straightforwardly mapped into this model.
1275The types of the symbol tables used for variable and function declarations
1276look somewhat like the following:
1277
1278        variable_table = name_map( variable_name, variable_map )
1279       
1280        function_table = name_map( function_name, function_map )
1281       
1282        variable_map = multi_index( by_type( variable_type ),
1283                                    variable_decl_set )
1284
1285        function_map = multi_index( by_int( n_params ),
1286                                                                by_type( return_type ),
1287                                                                function_decl_set )
1288
1289`variable_name` and `function_name` are likely simple strings, with `name_map` 
1290a hash table (or perhaps trie) mapping string keys to values.
1291`variable_decl_set` and `function_decl_set` can be thought of for the moment
1292as simple bags of typed declarations, where the declaration types are linked
1293to the graph of available conversions for that type.
1294In a typed context both the `variable_decl_set` and the `function_decl_set` 
1295should be able to be selected upon by type; this is accomplished by the
1296`by_type` index of both `variable_map` and `function_map`.
1297The `by_int` index of `function_map` also provides a way to select functions
1298by their number of parameters; this index may be used to swiftly discard any
1299function declaration which does not have the appropriate number of parameters
1300for the argument interpretations being passed to it; given the likely small
1301number of entries in this map, it is possible that a binary search of a sorted
1302vector or even a linear search of an unsorted vector would be more efficient
1303than the usual hash-based index.
1304
1305Given these data structures, the general outline of our algorithm follows
1306Baker, with Cormack's algorithm used as a heuristic filter in typed contexts.
1307
1308In an untyped context, we use a variant of Baker's bottom-up algorithm.
1309The leaves of the interpretation DAG are drawn from the variable symbol table,
1310with entries in the table each producing zero-cost interpretations, and each
1311implicit conversion available to be applied to the type of an existing entry
1312producing a further interpretation with the same cost as the conversion.
1313As in Baker, if two or more interpretations have the same type, only the
1314minimum cost interpretation with that type is produced; if there is no unique
1315minimum cost interpretation than resolution with that type is ambiguous, and
1316not permitted.
1317It should be relatively simple to produce the list of interpretations sorted
1318by cost by producing the interpretations via a breadth-first search of the
1319conversion graph from the initial interpretations provided in the variable
1320symbol table.
1321
1322To match a function at one of the internal nodes of the DAG, we first look up
1323the function's name in the function symbol table, the appropriate number of
1324parameters for the arguments that are provided through the `by_int` index of
1325the returned `function_map`, then go through the resulting `function_decl_set` 
1326searching for functions where the parameter types can unify with the provided
1327argument lists; any such matching function produces an interpretation with a
1328cost that is the sum of its argument costs.
1329Though this is not included in our simplified model, this unification step may
1330include binding of polymorphic variables, which introduces a cost for the
1331function binding itself which must be added to the argument costs.
1332Also, checking of function assertions would likely be done at this step as
1333well, possibly eliminating some possible matching functions (if no suitable
1334assertions can be satisfied), or adding further conversion costs for the
1335assertion satisfaction.
1336Once the set of valid function interpretations is produced, these may also be
1337expanded by the graph of implicit conversions on their return types, as the
1338variable interpretations were.
1339
1340This implicit conversion-based expansion of interpretations should be skipped
1341for the top-level expression if used in an untyped (void) context, e.g. for
1342`f` in `f( g ( x ) );` or `x` in `x;`.
1343On the other hand, if the top-level expression specifies a type, e.g. in
1344`int i = f( x );`, only top level expressions that return that type are
1345relevant to the search, so the candidates for `f` can be filtered first by
1346those that return `int` (or a type convertable to it); this can be
1347accomplished by performing a top-down filter of the interpretations of `f` by
1348the `by_type` index of the `function_map` in a manner similar to Cormack's[4]
1349algorithm.
1350
1351In a typed context, such as an initialization expression
1352`T x = f( g( y ), z );`, only interpretations of `f( g( y ), z )` which have
1353type `T` are valid; since there are likely to be valid interpretations of
1354`f( g( y ), z )` which cannot be used to initialize a variable of type `T`, we
1355can use this information to reduce the number of interpretations considered.
1356Drawing from Cormack[4], we first search for interpretations of `f` where the
1357return type is `T`; by breadth-first-search of the conversion graph, it should
1358be straightforward to order the interpretations of `f` by the cost to convert
1359their return type to `T`.
1360We can also filter out interpretations of `f` with less than two parameters,
1361since both `g( y )` and `z` must produce at least one parameter; we may not,
1362however, rule out interpretations of `f` with more than two parameters, as
1363there may be a valid interpretation of `g( y )` as a function returning more
1364than one parameter (if the expression was `f( y, z )` instead, we could use an
1365exact parameter count, assuming that variables of tuple type don't exist).
1366For each compatible interpretation of `f`, we can add the type of the first
1367parameter of that interpretation of `f` to a set `S`, and recursively search
1368for interpretations of `g( y )` that return some type `Si` in `S`, and
1369similarly for interpretations of `z` that match the type of any of the second
1370parameters of some `f`.
1371Naturally, if no matching interpretation of `g( y )` can be found for the
1372first parameter of some `f`, the type of the second parameter of that `f` will
1373not be added to the set of valid types for `z`.
1374Each node in this interpretation DAG is given a cost the same way it would be
1375in the bottom-up approach, with the exception that when going top-down there
1376must be a final bottom-up pass to sum the interpretation costs and sort them
1377as appropriate.
1378
1379If a parameter type for some `f` is a polymorphic type variable that is left
1380unbound by the return type (e.g. `forall(otype S) int f(S x, int y)`), the
1381matching arguments should be found using the bottom-up algorithm above for
1382untyped contexts, because the polymorphic type variable does not sufficiently
1383constrain the available interpretations of the argument expression.
1384Similarly, it would likely be an advantage to use top-down resolution for
1385cast expressions (e.g. `(int)x`), even when those cast expressions are
1386subexpressions of an otherwise untyped expression.
1387It may also be fruitful to switch between the bottom-up and top-down
1388algorithms if the number of valid interpretations for a subexpression or valid
1389types for an argument exceeds some heuristic threshold, but finding such
1390a threshold (if any exists) will require experimental data.
1391This hybrid top-down/bottom-up search provides more opportunities for pruning
1392interpretations than either a bottom-up or top-down approach alone, and thus
1393may be more efficient than either.
1394A top-down-only approach, however, devolves to linear search through every
1395possible interpretation in the solution space in an untyped context, and is
1396thus likely to be inferior to a strictly bottom-up approach, though this
1397hypothesis needs to be empirically validated.
1398
1399Another approach would be to abandon expression-tree ordering for
1400subexpression matching, and order by "most constrained symbol"; symbols would 
1401be more constrained if there were fewer matching declarations, fewer
1402subexpressions yet to resolve, or possibly fewer possible types the expression
1403could resolve to. Ordering the expressions in a priority-queue by this metric
1404would not necessarily produce a top-down or a bottom-up order, but would add
1405opportunities for pruning based on memoized upper and lower bounds.
1406
1407Both Baker and Cormack explicitly generate all possible interpretations of a
1408given expression; thinking of the set of interpretations of an expression as a
1409list sorted by cost, this is an eager evaluation of the list.
1410However, since we generally expect that user programmers will not often use
1411high-cost implicit conversions, one potentially effective way to prune the
1412search space would be to first find the minimal-cost interpretations of any
1413given subexpression, then to save the resolution progress of the
1414subexpressions and attempt to resolve the superexpression using only those
1415subexpression interpretations.
1416If no valid interpretation of the superexpression can be found, the resolver
1417would then repeatedly find the next-most-minimal cost interpretations of the
1418subexpressions and attempt to produce the minimal cost interpretation of the
1419superexpression.
1420This process would proceed until all possible subexpression interpretations
1421have been found and considered.
1422
1423A middle ground between the eager and lazy approaches can be taken by
1424considering the lexical order on the cost tuple; essentially, every
1425interpretation in each of the classes below will be strictly cheaper than any
1426interpretation in the class after it, so if a min-cost valid interpretation
1427can be found while only generating interpretations in a given class, that
1428interpretation is guaranteed to be the best possible one:
1429
14301. Interpretations without polymorphic functions or implicit conversions
14312. Interpretations without polymorphic functions using only safe conversions
14323. Interpretations using polymorphic functions without unsafe conversions
14334. Interpretations using unsafe conversions
1434
1435In this lazy-eager approach, all the interpretations in one class would be
1436eagerly generated, while the interpretations in the next class would only be
1437considered if no match was found in the previous class.
1438
1439Another source of efficiency would be to cache the best given interpretation
1440of a subexpression within an environment; this may not be incredibly useful
1441for explict parameters (though it may be useful for, e.g. `f( x, x )`, where
1442both parameters of `f` have the same type), but should pay some dividends for
1443the implicit assertion parameters, especially the otype parameters for the
1444argument of a generic type, which will generally be resolved in duplicate for
1445(at least) the assignment operator, constructor, copy constructor & destructor
1446of the generic type.
1447
1448## Appendix A: Partial and Total Orders ##
1449The `<=` relation on integers is a commonly known _total order_, and
1450intuitions based on how it works generally apply well to other total orders.
1451Formally, a total order is some binary relation `<=` over a set `S` such that
1452for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and
1453`b` must be equal, the _antisymmetry_ property); total orders also have a
1454_transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`.
1455If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
1456 
1457A _partial order_ is a generalization of this concept where the `<=` relation
1458is not required to be defined over all pairs of elements in `S` (though there
1459is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other
1460words, it is possible for two elements `a` and `b` of `S` to be
1461_incomparable_, unable to be ordered with respect to one another (any `a` and
1462`b` for which either `a <= b` or `b <= a` are called _comparable_).
1463Antisymmetry and transitivity are also required for a partial order, so all
1464total orders are also partial orders by definition.
1465One fairly natural partial order is the "subset of" relation over sets from
1466the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are
1467both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the
1468other - they are incomparable under this relation.
1469
1470We can compose two (or more) partial orders to produce a new partial order on
1471tuples drawn from both (or all the) sets.
1472For example, given `a` and `c` from set `S` and `b` and `d` from set `R`,
1473where both `S` and `R` both have partial orders defined on them, we can define
1474a ordering relation between `(a, b)` and `(c, d)`.
1475One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff
1476`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by
1477the first set and "breaking ties" by the second set.
1478Another common order is the _product order_, which can be roughly thought of
1479as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
1480iff `a <= c` and `b <= d`.
1481One difference between the lexicographical order and the product order is that
1482in the lexicographical order if both `a` and `c` and `b` and `d` are
1483comparable then `(a, b)` and `(c, d)` will be comparable, while in the product
1484order you can have `a <= c` and `d <= b` (both comparable) which will make
1485`(a, b)` and `(c, d)` incomparable.
1486The product order, on the other hand, has the benefit of not prioritizing one
1487order over the other.
1488
1489Any partial order has a natural representation as a directed acyclic graph
1490(DAG).
1491Each element `a` of the set becomes a node of the DAG, with an arc pointing to
1492its _covering_ elements, any element `b` such that `a < b` but where there is
1493no `c` such that `a < c` and `c < b`.
1494Intuitively, the covering elements are the "next ones larger", where you can't
1495fit another element between the two.
1496Under this construction, `a < b` is equivalent to "there is a path from `a` to
1497`b` in the DAG", and the lack of cycles in the directed graph is ensured by
1498the antisymmetry property of the partial order.
1499
1500Partial orders can be generalized to _preorders_ by removing the antisymmetry
1501property.
1502In a preorder the relation is generally called `<~`, and it is possible for
1503two distict elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case
1504we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`.
1505Preorders may also be represented as directed graphs, but in this case the
1506graph may contain cycles.
1507
1508## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
1509The short answer is that it's impossible.
1510
1511The longer answer is that it has to do with what's essentially a diamond
1512inheritance problem.
1513In C, `int` converts to `unsigned int` and also `long` "safely"; both convert
1514to `unsigned long` safely, and it's possible to chain the conversions to
1515convert `int` to `unsigned long`.
1516There are two constraints here; one is that the `int` to `unsigned long` 
1517conversion needs to cost more than the other two (because the types aren't as
1518"close" in a very intuitive fashion), and the other is that the system needs a
1519way to choose which path to take to get to the destination type.
1520Now, a fairly natural solution for this would be to just say "C knows how to
1521convert from `int` to `unsigned long`, so we just put in a direct conversion
1522and make the compiler smart enough to figure out the costs" - given that the
1523users can build an arbitrary graph of conversions, this needs to be handled
1524anyway.
1525We can define a preorder over the types by saying that `a <~ b` if there
1526exists a chain of conversions from `a` to `b`.
1527This preorder corresponds roughly to a more usual type-theoretic concept of
1528subtyping ("if I can convert `a` to `b`, `a` is a more specific type than
1529`b`"); however, since this graph is arbitrary, it may contain cycles, so if
1530there is also a path to convert `b` to `a` they are in some sense equivalently
1531specific.
1532
1533Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and
1534`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`)
1535and this conversion preorder over the destination types `xn` and `ym`.
1536We could define a preorder by taking chain length and breaking ties by the
1537conversion preorder, but this would lead to unexpected behaviour when closing
1538diamonds with an arm length of longer than 1.
1539Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`,
1540`B2->C`, and `A->C`.
1541If we are comparing conversions from `A` to both `B2` and `C`, we expect the
1542conversion to `B2` to be chosen because it's the more specific type under the
1543conversion preorder, but since its chain length is longer than the conversion
1544to `C`, it loses and `C` is chosen.
1545However, taking the conversion preorder and breaking ties or ambiguities by
1546chain length also doesn't work, because of cases like the following example
1547where the transitivity property is broken and we can't find a global maximum:
1548
1549        `X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`
1550
1551In this set of arcs, if we're comparing conversions from `X` to each of `Y2`,
1552`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because
1553there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain
1554length.
1555Also, comparing conversions from `X` to `Z3` and to `W`, we find that the
1556conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder,
1557and so is considered to be the nearer type.
1558By transitivity, then, the conversion from `X` to `Y2` should be cheaper than
1559the conversion from `X` to `W`, but in this case the `X` and `W` are
1560incomparable by the conversion preorder, so the tie is broken by the shorter
1561path from `X` to `W` in favour of `W`, contradicting the transitivity property
1562for this proposed order.
1563
1564Without transitivity, we would need to compare all pairs of conversions, which
1565would be expensive, and possibly not yield a minimal-cost conversion even if
1566all pairs were comparable.
1567In short, this ordering is infeasible, and by extension I believe any ordering
1568composed solely of single-step conversions between types with no further
1569user-supplied information will be insufficiently powerful to express the
1570built-in conversions between C's types.
1571
1572## Appendix C: Proposed Prelude Changes ##
1573**TODO** Port Glen's "Future Work" page for builtin C conversions.
1574**TODO** Move discussion of zero_t, unit_t here.
1575
1576It may be desirable to have some polymorphic wrapper functions in the prelude
1577which provide consistent default implementations of various operators given a
1578definition of one of them.
1579Naturally, users could still provide a monomorphic overload if they wished to
1580make their own code more efficient than the polymorphic wrapper could be, but
1581this would minimize user effort in the common case where the user cannot write
1582a more efficient function, or is willing to trade some runtime efficiency for
1583developer time.
1584As an example, consider the following polymorphic defaults for `+` and `+=`:
1585
1586        forall(otype T | { T ?+?(T, T); })
1587        lvalue T ?+=? (T *a, T b) {
1588                return *a = *a + b;
1589        }
1590       
1591        forall(otype T | { lvalue T ?+=? (T*, T) })
1592        T ?+? (T a, T b) {
1593                T tmp = a;
1594                return tmp += b;
1595        }
1596
1597Both of these have a possibly unneccessary copy (the first in copying the
1598result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in
1599cases where these copies are unavoidable the polymorphic wrappers should be
1600just as performant as the monomorphic equivalents (assuming a compiler
1601sufficiently clever to inline the extra function call), and allow programmers
1602to define a set of related operators with maximal concision.
1603
1604**TODO** Look at what Glen and Richard have already written for this.
1605
1606## Appendix D: Feasibility of Making void* Conversions Explicit ##
1607C allows implicit conversions between `void*` and other pointer types, as per
1608section 6.3.2.3.1 of the standard.
1609Making these implicit conversions explicit in Cforall would provide
1610significant type-safety benefits, and is precedented in C++.
1611A weaker version of this proposal would be to allow implicit conversions to
1612`void*` (as a sort of "top type" for all pointer types), but to make the
1613unsafe conversion from `void*` back to a concrete pointer type an explicit
1614conversion.
1615However, `int *p = malloc( sizeof(int) );` and friends are hugely common
1616in C code, and rely on the unsafe implicit conversion from the `void*` return
1617type of `malloc` to the `int*` type of the variable - obviously it would be
1618too much of a source-compatibility break to disallow this for C code.
1619We do already need to wrap C code in an `extern "C"` block, though, so it is
1620technically feasible to make the `void*` conversions implicit in C but
1621explicit in Cforall.
1622Also, for calling C code with `void*`-based APIs, pointers-to-dtype are
1623calling-convention compatible with `void*`; we could read `void*` in function
1624signatures as essentially a fresh dtype type variable, e.g:
1625
1626        void* malloc( size_t )
1627                => forall(dtype T0) T0* malloc( size_t )
1628        void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1629                => forall(dtype T0, dtype T1, dtype T2)
1630                   void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )
1631
1632This would even allow us to leverage some of Cforall's type safety to write
1633better declarations for legacy C API functions, like the following wrapper for
1634`qsort`:
1635
1636        extern "C" { // turns off name-mangling so that this calls the C library
1637                // call-compatible type-safe qsort signature
1638                forall(dtype T)
1639                void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );
1640               
1641                // forbid type-unsafe C signature from resolving
1642                void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1643                        = delete;
1644        }
1645
1646## Appendix E: Features to Add in Resolver Re-write ##
1647* Reference types
1648* Special types for 0 and 1 literals
1649* Expression type for return statement that resolves similarly to ?=?
1650  - This is to get rid of the kludge in the box pass that effectively
1651    re-implements the resolver poorly.
Note: See TracBrowser for help on using the repository browser.