source: doc/working/resolver_design.md @ e4e9173

ADTaaron-thesisarm-ehast-experimentalcleanup-dtorsdeferred_resndemanglerenumforall-pointer-decayjacob/cs343-translationjenkins-sandboxnew-astnew-ast-unique-exprnew-envno_listpersistent-indexerpthread-emulationqualifiedEnumresolv-newwith_gc
Last change on this file since e4e9173 was 41634098, checked in by Aaron Moss <a3moss@…>, 7 years ago

Added white paper on user-defined conversions based on resolver design doc

  • Property mode set to 100644
File size: 84.3 KB
Line 
1# Thoughts on Resolver Design #
2
3## Conversions ##
4C's implicit "usual arithmetic conversions" define a structure among the
5built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of
6_safe_ widening conversions.
7There is also a set of _explicit_ conversions that are only allowed through a
8cast expression.
9Based on Glen's notes on conversions [1], I propose that safe and unsafe
10conversions be expressed as constructor variants, though I make explicit
11(cast) conversions a constructor variant as well rather than a dedicated
12operator.
13Throughout this article, I will use the following operator names for
14constructors and conversion functions from `From` to `To`:
15
16        void ?{} ( To*, To );            // copy constructor
17        void ?{} ( To*, From );          // explicit constructor
18        void ?{explicit} ( To*, From );  // explicit cast conversion
19        void ?{safe} ( To*, From );      // implicit safe conversion
20        void ?{unsafe} ( To*, From );    // implicit unsafe conversion
21
22[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
23
24Glen's design made no distinction between constructors and unsafe implicit
25conversions; this is elegant, but interacts poorly with tuples.
26Essentially, without making this distinction, a constructor like the following
27would add an interpretation of any two `int`s as a `Coord`, needlessly
28multiplying the space of possible interpretations of all functions:
29
30        void ?{}( Coord *this, int x, int y );
31
32That said, it would certainly be possible to make a multiple-argument implicit
33conversion, as below, though the argument above suggests this option should be
34used infrequently:
35
36        void ?{unsafe}( Coord *this, int x, int y );
37
38An alternate possibility would be to only count two-arg constructors
39`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and
40explicit conversions should also have a compiler-enforced restriction to
41ensure that they are two-arg functions (this restriction may be valuable
42regardless).
43
44Regardless of syntax, there should be a type assertion that expresses `From` 
45is convertable to `To`.
46If user-defined conversions are not added to the language,
47`void ?{} ( To*, From )` may be a suitable representation, relying on
48conversions on the argument types to account for transitivity.
49On the other hand, `To*` should perhaps match its target type exactly, so
50another assertion syntax specific to conversions may be required, e.g.
51`From -> To`.
52
53### Constructor Idiom ###
54Basing our notion of conversions off otherwise normal Cforall functions means
55that we can use the full range of Cforall features for conversions, including
56polymorphism.
57Glen [1] defines a _constructor idiom_ that can be used to create chains of
58safe conversions without duplicating code; given a type `Safe` which members
59of another type `From` can be directly converted to, the constructor idiom
60allows us to write a conversion for any type `To` which `Safe` converts to:
61
62        forall(otype To | { void ?{safe}( To*, Safe ) })
63        void ?{safe}( To *this, From that ) {
64                Safe tmp = /* some expression involving that */;
65                *this = tmp; // uses assertion parameter
66        }
67
68This idiom can also be used with only minor variations for a parallel set of
69unsafe conversions.
70
71What selective non-use of the constructor idiom gives us is the ability to
72define a conversion that may only be the *last* conversion in a chain of such.
73Constructing a conversion graph able to unambiguously represent the full
74hierarchy of implicit conversions in C is provably impossible using only
75single-step conversions with no additional information (see Appendix B), but
76this mechanism is sufficiently powerful (see [1], though the design there has
77some minor bugs; the general idea is to use the constructor idiom to define
78two chains of conversions, one among the signed integral types, another among
79the unsigned, and to use monomorphic conversions to allow conversions between
80signed and unsigned integer types).
81
82### Implementation Details ###
83It is desirable to have a system which can be efficiently implemented, yet
84also to have one which has sufficient power to distinguish between functions
85on all possible axes of polymorphism.
86This ordering may be a partial order, which may complicate implementation
87somewhat; in this case it may be desirable to store the set of implementations
88for a given function as the directed acyclic graph (DAG) representing the
89order.
90
91## Conversion Costs ##
92Each possible resolution of an expression has a _cost_ tuple consisting of
93the following components: _unsafe_ conversion cost, _polymorphic_ 
94specialization cost, _safe_ conversion cost, a count of _explicit_ 
95conversions, and _qualifier_ conversion cost.
96These components are lexically-ordered and can be summed element-wise;
97summation starts at `(0, 0, 0, 0, 0)`.
98
99### Lvalue and Qualifier Conversions ###
100C defines the notion of a _lvalue_, essentially an addressable object, as well
101as a number of type _qualifiers_, `const`, `volatile`, and `restrict`.
102As these type qualifiers are generally only meaningful to the type system as
103applied to lvalues, the two concepts are closely related.
104A const lvalue cannot be modified, the compiler cannot assume that a volatile
105lvalue will not be concurrently modified by some other part of the system, and
106a restrict lvalue must have pointer type, and the compiler may assume that no
107other pointer in scope aliases that pointer (this is solely a performance
108optimization, and may be ignored by implementers).
109_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts
110it to an expression result of type `T` (commonly called an _rvalue_ of type
111`T`) also strips all the qualifiers from the lvalue, as an expression result
112is a value, not an addressable object that can have properties like
113immutability.
114Though lvalue-to-rvalue conversion strips the qualifiers from lvalues,
115derived rvalue types such as pointer types may include qualifiers;
116`const int *` is a distinct type from `int *`, though the latter is safely
117convertable to the former.
118In general, any number of qualifiers can be safely added to the
119pointed-to-type of a pointer type, e.g. `int *` converts safely to
120`const int *` and `volatile int *`, both of which convert safely to
121`const volatile int *`.
122
123Since lvalues are precicely "addressable objects", in C, only lvalues can be
124used as the operand of the `&` address-of operator.
125Similarly, only modifiable lvalues may be used as the assigned-to
126operand of the mutating operators: assignment, compound assignment
127(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without
128the `const` qualifier are modifiable, but lvalues of incomplete types, array
129types, and struct or union types with const members are also not modifiable.
130Lvalues are produced by the following expressions: object identifiers
131(function identifiers are not considered to be lvalues), the result of the `*` 
132dereference operator applied to an object pointer, the result of a member
133expression `s.f` if the left argument `s` is an lvalue (note that the
134preceding two rules imply that the result of indirect member expressions
135`s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the
136indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`).
137Somewhat less obviously, parenthesized lvalue expressions, string literals,
138and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.
139
140All of the conversions described above are defined in standard C, but Cforall
141requires further features from its type system.
142In particular, to allow overloading of the `*?` and `?[?]` dereferencing and
143indexing operators, Cforall requires a way to declare that the functions
144defining these operators return lvalues, and since C functions never return
145lvalues and for syntactic reasons we wish to distinguish functions which
146return lvalues from functions which return pointers, this is of necessity an
147extension to standard C.
148In the current design, an `lvalue` qualifier can be added to function return
149types (and only to function return types), the effect of which is to return a
150pointer which is implicitly dereferenced by the caller.
151C++ includes the more general concept of _references_, which are typically
152implemented as implicitly dereferenced pointers as well.
153Another use case which C++ references support is providing a way to pass
154function parameters by reference (rather than by value) with a natural
155syntax; Cforall in its current state has no such mechanism.
156As an example, consider the following (currently typical) copy-constructor
157signature and call:
158
159        void ?{}(T *lhs, T rhs);
160       
161        T x;
162        T y = { x };
163
164Note that the right-hand argument is passed by value, and would in fact be
165copied twice in the course of the constructor call `T y = { x };` (once into
166the parameter by C's standard `memcpy` semantics, once again in the body of
167the copy constructor, though it is possible that return value optimization
168will elide the `memcpy`-style copy).
169However, to pass by reference using the existing pointer syntax, the example
170above would look like this:
171
172        void ?{}(T *lhs, const T *rhs);
173       
174        T x;
175        T y = { &x };
176
177This example is not even as bad as it could be; assuming pass-by-reference is
178the desired semantics for the `?+?` operator, that implies the following
179design today:
180
181        T ?+?(const T *lhs, const T *rhs);
182       
183        T a, b;
184        T c = &a + &b,
185
186In addition to `&a + &b` being unsightly and confusing syntax to add `a` and
187`b`, it also introduces a possible ambiguity with pointer arithmetic on `T*`
188which can only be resolved by return-type inference.
189
190Pass-by-reference and marking functions as returning lvalues instead of the
191usual rvalues are actually closely related concepts, as obtaining a reference
192to pass depends on the referenced object being addressable, i.e. an lvalue,
193and lvalue return types are effectively return-by-reference.
194Cforall should also unify the concepts, with a parameterized type for
195"reference to `T`", which I will write `ref T`.
196Syntax bikeshedding can be done later (there are some examples at the bottom
197of this section), but `ref T` is sufficiently distinct from both the existing
198`lvalue T` (which it subsumes) and the closely related C++ `T&` to allow
199independent discussion of its semantics.
200
201Firstly, assignment to a function parameter as part of a function call and
202local variable initialization have almost identical semantics, so should be
203treated similarly for the reference type too; this implies we should be able
204to declare local variables of reference type, as in the following:
205
206        int x = 42;
207        ref int r = x; // r is now an alias for x
208
209Unlike in C++, we would like to have the capability to re-bind references
210after initialization, as this allows the attractive syntax of references to
211support some further useful code patterns, such as first initializing a
212reference after its declaration.
213Constant references to `T` (`const ref T`) should not be re-bindable.
214
215One option for re-binding references is to use a dedicated operator, as in the
216code example below:
217
218        int i = 42, j = 7;
219        ref int r = i; // bind r to i
220        r = j;         // set i (== r) to 7
221        r := j;        // rebind r to j using the new := rebind operator
222        i = 42;        // reset i (!= r) to 42
223        assert( r == 7 );
224
225The other syntactic option for reference re-bind would be to overload
226assignment and use type inference on the left and right-hand sides to
227determine whether the referred-to variable on the left should be reassigned to
228the value on the right, or if the reference on the left should be aliased to
229the reference on the right.
230This could be disambiguated with casts, as in the following code example:
231
232        int i
233        int j;
234        ref int r = i;   // (0a)
235        ref int s = i;   // (0b)
236       
237        i = j;           // (1)
238        i = (int)s;      // (2)
239        i = s;           // (3)
240        // ---------------------
241        r = s;           // (4)
242        r = (ref int)j;  // (5)
243        // ---------------------
244        r = j;           // (6)
245        r = (int)s;      // (7)
246
247By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as
248aliases for `i`.
249For C compatibility, (1) has to be assignment; in general, any assignment to a
250non-reference type should be assignment, so (2) and (3) are as well.
251By types, (4) and (5) should have the same semantics, and the semantics of (6)
252and (7) should match as well.
253This suggests that (4) and (5) are reference re-bind, and (6) and (7) are an
254assignment to the referred variable; this makes the syntax to explicitly alias
255a local variable rather ugly (and inconsistent with the initialization
256syntax), as well as making it rather awkward to copy the value stored in one
257reference-type variable into another reference type variable (which is likely
258more painful in functions with by-reference parameters than with local
259variables of reference type).
260
261Because of the aforementioned issues with overloading assignment as reference
262rebind, in addition to the fact that reference rebind should not be a
263user-overloadable operator (unlike assignment), I propose refererence rebind
264should have its own dedicated operator.
265
266The semantics and restrictions of `ref T` are effectively the semantics of an
267lvalue of type `T`, and by this analogy there should be a safe, qualifier
268dropping conversion from `ref const volatile restrict T` (and every other
269qualifier combination on the `T` in `ref T`) to `T`.
270With this conversion, the resolver may type most expressions that C would
271call "lvalue of type `T`" as `ref T`.
272There's also an obvious argument that lvalues of a (possibly-qualified) type
273`T` should be convertable to references of type `T`, where `T` is also
274so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to
275`ref const char`).
276By similar arguments to pointer types, qualifiers should be addable to the
277referred-to type of a reference (e.g. `ref int` to `ref const int`).
278As a note, since pointer arithmetic is explictly not defined on `ref T`,
279`restrict ref T` should be allowable and would have alias-analysis rules that
280are actually comprehensible to mere mortals.
281
282Using pass-by-reference semantics for function calls should not put syntactic
283constraints on how the function is called; particularly, temporary values
284should be able to be passed by reference.
285The mechanism for this pass-by-reference would be to store the value of the
286temporary expression into a new unnamed temporary, and pass the reference of
287that temporary to the function.
288As an example, the following code should all compile and run:
289
290        void f(ref int x) { printf("%d\n", x++); }
291       
292        int i = 7, j = 11;
293        const int answer = 42;
294       
295        f(i);      // (1)
296        f(42);     // (2)
297        f(i + j);  // (3)
298        f(answer); // (4)
299
300The semantics of (1) are just like C++'s, "7" is printed, and `i` has the
301value 8 afterward.
302For (2), "42" is printed, and the increment of the unnamed temporary to 43 is
303not visible to the caller; (3) behaves similarly, printing "19", but not
304changing `i` or `j`.
305(4) is a bit of an interesting case; we want to be able to support named
306constants like `answer` that can be used anywhere the constant expression
307they're replacing (like `42`) could go; in this sense, (4) and (2) should have
308the same semantics.
309However, we don't want the mutation to the `x` parameter to be visible in
310`answer` afterward, because `answer` is a constant, and thus shouldn't change.
311The solution to this is to allow chaining of the two `ref` conversions;
312`answer` has the type `ref const int`, which can be converted to `int` by the
313lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int` 
314by the temporary-producing rvalue-to-lvalue conversion.
315Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42),
316mutated by `f`, then discarded; "42" is printed, just as in case (2), and
317`answer` still equals 42 after the call, because it was the temporary that was
318mutated, not `answer`.
319It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while
320`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in
321C++, leading to the dreaded "const hell"), but the behaviour of this rule can
322be determined by examining local scope with the simple rule "non-`const`
323references to `const` variables produce temporaries", which aligns with
324programmer intuition that `const` variables cannot be mutated.
325
326To bikeshed syntax for `ref T`, there are three basic options: language
327keywords (`lvalue T` is already in Cforall), compiler-supported "special"
328generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++
329programmers).
330Keyword or generic based approaches run the risk of name conflicts with
331existing code, while any sigil used would have to be carefully chosen to not
332create parsing conflicts.
333
334**TODO** Consider arguments for move semantics and see if there is a
335compelling case for rvalue references.
336
337### Conversion Operator Costs ###
338Copy constructors, safe conversions, and unsafe conversions all have an
339associated conversion cost, calculated according to the algorithm below:
340
3411. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
3422. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
3433. Monomoprhic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
3444. Polymorphic conversion operators (or copy constructors) have a conversion
345   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic
346   equivalent and the sum of the conversion costs of all conversion operators
347   passed as assertion parameters, but where the fourth "count" element of the
348   cost tuple is fixed to `1`.
349
350**TODO** Polymorphism cost may need to be reconsidered in the light of the
351thoughts on polymorphism below.
352**TODO** You basically just want path-length in the conversion graph implied
353by the set of conversions; the only tricky question is whether or not you can
354account for "mixed" safe and unsafe conversions used to satisfy polymorphic
355constraints, whether a polymorphic conversion should cost more than a
356monomorphic one, and whether to account for non-conversion constraints in the
357polymorphism cost
358
359### Argument-Parameter Matching ###
360Given a function `f` with an parameter list (after tuple flattening)
361`(T1 t1, T2 t2, ... Tn tn)`, and a function application
362`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the
363appropriate parameter is calculated according to the algorithm below:
364
365Given a parameter `t` of type `T` and an expression `<e>` from these lists,
366`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with
367associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`.
368(If any `Ei` is a tuple type, replace it with its first flattened element for
369the purposes of this section.)
370
371The cost of matching the interpretation of `<e>` with type `Ei` to `t1` with
372type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the
373conversion operator cost from `Ei` to `T`.
374
375### Object Initialization ###
376The cost to initialize an object is calculated very similarly to
377argument-parameter matching, with a few modifications.
378Firstly, explicit constructors are included in the set of available
379conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic
380conversion costs (if applicable) and the _interpretation cost_ of the
381constructor, the sum of the argument-parameter matching costs for its
382parameters.
383Also, ties in overall cost (interpretation cost plus conversion cost) are
384broken by lowest conversion cost (i.e. of alternatives with the same overall
385cost, copy constructors are preferred to other explicit constructors,
386explicit constructors are preferred to safe conversions, which are preferred
387to unsafe conversions).
388An object initialization is properly typed if it has exactly one min-cost
389interpretation.
390
391### Explicit Casts ###
392Explicit casts are handled similarly to object initialization.
393Copy constructors and other explicit constructors are not included in the set
394of possible conversions, though interpreting a cast as type ascription
395(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost
396`(0, 0, 0, 0)`.
397Explicit conversion operators are also included in the set of possible
398conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion
399costs are invoked.
400Unlike for explicit constructors and other functions, implicit conversions are
401never applied to the argument or return type of an explicit cast operator, so
402that the cast may be used more effectively as a method for the user programmer
403to guide type resolution.
404
405## Trait Satisfaction ##
406A _trait_ consists of a list of _type variables_ along with a (possibly empty)
407set of _assertions_ on those variables.
408Assertions can take two forms, _variable assertions_ and the more common
409_function assertions_, as in the following example:
410
411        trait a_trait(otype T, otype S) {
412                T a_variable_assertion;
413                S* another_variable_assertion;
414                S a_function_assertion( T* );
415        };
416
417Variable assertions enforce that a variable with the given name and type
418exists (the type is generally one of the type variables, or derived from one),
419while a function assertion enforces that a function with a
420_compatible signature_ to the provided function exists.
421
422To test if some list of types _satisfy_ the trait, the types are first _bound_ 
423to the type variables, and then declarations to satisfy each assertion are
424sought out.
425Variable assertions require an exact match, because they are passed as object
426pointers, and there is no mechanism to employ conversion functions, while
427function assertions only require a function that can be wrapped to a
428compatible type; for example, the declarations below satisfy
429`a_trait(int, short)`:
430
431        int a_variable_assertion;
432        short* another_variable_assertion;
433        char a_function_assertion( void* );
434        // int* may be implicitly converted to void*, and char to short, so the
435        // above works
436
437Cforall Polymorphic functions have a _constraining trait_, denoted as follows:
438
439        forall(otype A, otype B | some_trait(A, B))
440
441The trait may be anonymous, with the same syntax as a trait declaration, and
442may be unioned together using `|` or `,`.
443
444**TODO** Consider including field assertions in the list of constraint types,
445also associated types and the appropriate matching type assertion.
446
447## Polymorphism Costs ##
448The type resolver should prefer functions that are "less polymorphic" to
449functions that are "more polymorphic".
450Determining how to order functions by degree of polymorphism is somewhat less
451straightforward, though, as there are multiple axes of polymorphism and it is
452not always clear how they compose.
453The natural order for degree of polymorphism is a partial order, and this
454section includes some open questions on whether it is desirable or feasible to
455develop a tie-breaking strategy to impose a total order on the degree of
456polymorphism of functions.
457Helpfully, though, the degree of polymorphism is a property of functions
458rather than function calls, so any complicated graph structure or calculation
459representing a (partial) order over function degree of polymorphism can be
460calculated once and cached.
461
462### Function Parameters ###
463All other things being equal, if a parameter of one function has a concrete
464type and the equivalent parameter of another function has a dynamic type, the
465first function is less polymorphic:
466
467                        void f( int, int );  // (0) least polymorphic
468        forall(otype T) void f( T, int );    // (1a) more polymorphic than (0)
469        forall(otype T) void f( int, T );    // (1b) more polymorphic than (0)
470                                             //      incomparable with (1a)
471        forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b)
472
473This should extend to parameterized types (pointers and generic types) also:
474
475        forall(otype S) struct box { S val; };
476       
477        forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
478        forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
479        forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
480                                               //     incomparable with (3)
481        forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)
482
483Every function in the group above is incomparable with (1a/b), but that's fine
484because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur
485much in practice (unless there are safe or unsafe conversions defined between
486the possible argument types).
487
488For degree of polymorphism from arguments, I think we should not distinguish
489between different type parameters, e.g. the following should be considered
490equally polymorphic:
491
492        forall(otype T, otype S) void f( T, T, S );  // (7)
493        forall(otype T, otype S) void f( S, T, T );  // (8)
494
495However parameter lists are compared, parameters of multi-parameter generic
496types should ideally be treated as a recursive case, e.g. in the example
497below, (9) is less polymorphic than (10), which is less polymorphic than (11):
498
499        forall(otype T, otype S) struct pair { T x; S y; };
500       
501                        void f( pair(int, int) );  // (9)
502        forall(otype T) void f( pair(T, int) );    // (10)
503        forall(otype T) void f( pair(T, T) );      // (11)
504
505Parameter comparison could possibly be made somewhat cheaper at loss of some
506precision by representing each parameter as a value from the natural numbers
507plus infinity, where infinity represents a monomorphic parameter and a finite
508number counts how many levels deep the shallowest type variable is, e.g. where
509`T` is a type variable, `int` would have value infinity, `T` would have value
5100, `T*` would have value 1, `box(T)*` would have value 2, etc.
511Under this scheme, higher values represent less polymorphism.
512This makes the partial order on parameters a total order, so that many of the
513incomparable functions above compare equal, though that is perhaps a virtue.
514It also loses the ability to differentiate between some multi-parameter
515generic types, such as the parameters in (10) and (11), which would both be
516valued 1, losing the polymorphism distinction between them.
517
518A variant of the above scheme would be to fix a maximum depth of polymorphic
519type variables (16 seems like a reasonable choice) at which a parameter would
520be considered to be effectively monomorphic, and to subtract the value
521described above from that maximum, clamping the result to a minimum of 0.
522Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has
523value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0,
524the same as `int`.
525This can be quite succinctly represented, and summed without the presence of a
526single monomorphic parameter pushing the result to infinity, but does lose the
527ability to distinguish between very deeply structured polymorphic types.
528
529### Parameter Lists ###
530A partial order on function parameter lists can be produced by the
531product order of the partial orders on parameters described above.
532In more detail, this means that for two parameter lists with the same arity,
533if any pair of corresponding parameters are incomparable with respect to each
534other, the two parameter lists are incomparable; if in all pairs of
535corresponding parameters one list's parameter is always (less than or) equal
536to the other list's parameter than the first parameter list is (less than or)
537equal to the second parameter list; otherwise the lists are incomparable with
538respect to each other.
539
540How to compare parameter lists of different arity is a somewhat open question.
541A simple, but perhaps somewhat unsatisfying, solution would be just to say
542that such lists are incomparable.
543The simplist approach to make them comparable is to say that, given two lists
544`(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists
545can be compared based on their shared prefix of `n` types.
546This approach breaks the transitivity property of the equivalence relation on
547the partial order, though, as seen below:
548
549        forall(otype T) void f( T, int );       // (1a)
550        forall(otype T) void f( T, int, int );  // (12)
551        forall(otype T) void f( T, int, T );    // (13)
552
553By this rule, (1a) is equally polymorphic to both (12) and (13), so by
554transitivity (12) and (13) should also be equally polymorphic, but that is not
555actually the case.
556
557We can fix the rule by saying that `(T1 ... Tn)` can be compared to
558`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting
559notional monomorphic parameters.
560In this case, (1a) and (12) are equally polymorphic, because (1a) gets
561extended with a monomorphic type that compares equal to (12)'s third `int` 
562parameter, but (1a) is less polymorphic than (13), because its notional
563monomorphic third parameter is less polymorphic than (13)'s `T`.
564Essentially what this rule says is that any parameter list with more
565parameters is no less polymorphic than one with fewer.
566
567We could collapse this parameter list ordering to a succinct total order by
568simply taking the sum of the clamped parameter polymorphism counts, but this
569would again make most incomparable parameter lists compare equal, as well as
570having the potential for some unexpected results based on the (completely
571arbitrary) value chosen for "completely polymorphic".
572For instance, if we set 4 to be the maximum depth of polymorphism (as above),
573the following functions would be equally polymorphic, which is a somewhat
574unexpected result:
575
576        forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12
577        forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12
578
579These functions would also be considered equally polymorphic:
580
581        forall(otype T) void g( T, int );    // 4 + 0 = 4;
582        forall(otype T) void g( T**, T** );  // 2 + 2 = 4;
583
584This issue can be mitigated by choosing a larger maximum depth of
585polymorphism, but this scheme does have the distinct disadvantage of either
586specifying the (completely arbitrary) maximum depth as part of the language or
587allowing the compiler to refuse to accept otherwise well-typed deeply-nested
588polymorphic types.
589
590For purposes of determining polymorphism, the list of return types of a
591function should be treated like another parameter list, and combined with the
592degree of polymorphism from the parameter list in the same way that the
593parameters in the parameter list are combined.
594For instance, in the following, (14) is less polymorphic than (15) which is
595less polymorphic than (16):
596
597        forall(otype T) int f( T );  // (14)
598        forall(otype T) T*  f( T );  // (15)
599        forall(otype T) T   f( T );  // (16)
600
601### Type Variables and Bounds ###
602Degree of polymorphism doesn't solely depend on the parameter lists, though.
603Glen's thesis (4.4.4, p.89) gives an example that shows that it also depends
604on the number of type variables as well:
605
606        forall(otype T)          void f( T, int );  // (1a) polymorphic
607        forall(otype T)          void f( T, T );    // (2)  more polymorphic
608        forall(otype T, otype S) void f( T, S );    // (17) most polymorphic
609
610Clearly the `forall` type parameter list needs to factor into calculation of
611degree of polymorphism as well, as it's the only real differentiation between
612(2) and (17).
613The simplest way to include the type parameter list would be to simply count
614the type variables and say that functions with more type variables are more
615polymorphic.
616
617However, it also seems natural that more-constrained type variables should be
618counted as "less polymorphic" than less-constrained type variables.
619This would allow our resolver to pick more specialized (and presumably more
620efficient) implementations of functions where one exists.
621For example:
622
623        forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic
624        forall(otype T)                  T f( T );  // (16) more polymorphic
625
626We could account for this by counting the number of unique constraints and
627saying that functions with more constraints are less polymorphic.
628
629That said, we do model the `forall` constraint list as a (possibly anonymous)
630_trait_, and say that each trait is a set of constraints, so we could
631presumably define a partial order over traits based on subset inclusion, and
632use this partial order instead of the weaker count of constraints to order the
633list of type parameters of a function, as below:
634
635        trait has_g(otype T) { void g(T); };
636        trait has_h(otype S) { void h(T); };
637        trait has_gh(otype R | has_g(R) | has_h(R)) {};
638        // has_gh is equivlent to { void g(R); void h(R); }
639       
640        forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic
641        forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19)
642        forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19)
643                                               //       incomparable with (18)
644        forall(otype T)             T f( T );  // (16) most polymorphic
645
646The tricky bit with this is figuring out how to compare the constraint
647functions for equality up to type variable renaming; I suspect there's a known
648solution, but don't know what it is (perhaps some sort of unification
649calculation, though I hope there's a more lightweight option).
650We also should be able to take advantage of the programmer-provided trait
651subset information (like the constraint on `has_gh` in the example) to more
652efficiently generate the partial-order graph for traits, which should be able
653to be cached for efficiency.
654
655Combining count of type variables with the (partial) order on the trait
656constraining those variables seems like it should be a fairly straightforward
657product ordering to me - one `forall` qualifier is (less than or) equal to
658another if it has both a (less than or) equal number of type variables and a
659(less than or) equal degree of polymorphism from its constraining trait; the
660two qualifiers are incomparable otherwise.
661If an easier-to-calculate total ordering is desired, it might be acceptable to
662use the number of type variables, with ties broken by number of constraints.
663
664Similarly, to combine the (partial) orders on parameter and return lists with
665the (partial) order on `forall` qualifiers, a product ordering seems like the
666reasonable choice, though if we wanted a total order a reasonable choice would
667be to use whatever method we use to combine parameter costs into parameter
668lists to combine the costs for the parameter and return lists, then break ties
669by the order on the `forall` qualifiers.
670
671## Expression Costs ##
672
673### Variable Expressions ###
674Variables may be overloaded; that is, there may be multiple distinct variables
675with the same name so long as each variable has a distinct type.
676The variable expression `x` has one zero-cost interpretation as type `T` for
677each variable `T x` in scope.
678
679### Member Selection Expressions ###
680For every interpretation `I` of `e` which has a struct or union type `S`,
681`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the
682same cost as `I`.
683Note that there may be more than one member of `S` with the same name, as per
684Cforall's usual overloading rules.
685The indirect member expression `e->y` is desugared to `(*e).y` and interpreted
686analogously.
687
688**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no
689interpretations as `e.y` exist.
690
691### Address Expressions ###
692Address expressions `&e` have an interpretation for each interpretation `I` of
693`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`.
694Lvalues result from variable expressions, member selection expressions, or
695application of functions returning an lvalue-qualified type.
696Note that the dereference operator is overloadable, so the rules for its
697resolution follow those for function application below.
698
699**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a
700new temporary holding `42` and takes its address.
701
702### Boolean-context Expressions ###
703C has a number of "boolean contexts", where expressions are assigned a truth
704value; these include both arguments to the short-circuiting `&&` and `||` 
705operators, as well as the conditional expressions in `if` and `while` 
706statements, the middle expression in `for` statements, and the first argument
707to the `?:` ternary conditional operator.
708In all these contexts, C interprets `0` (which is both an integer and a null
709pointer literal) as false, and all other integer or pointer values as true.
710In this spirit, Cforall allows other types to be considered "truthy" if they
711support the following de-sugaring in a conditional context (see notes on
712interpretation of literal `0` below):
713
714        x => ((int)( x != 0 ))
715
716### Literal Expressions ###
717Literal expressions (e.g. 42, 'c', 3.14, "Hello, world!") have one
718zero-cost interpretation with the same type the expression would have in C,
719with three exceptions:
720
721Character literals like 'x' are typed as `char` in Cforall, not `int` as in C.
722This change breaks very little C code (primarily `sizeof 'x'`; the implicit
723conversion from `int` to `char` and lack of overloading handle most other
724expressions), matches the behaviour of C++, and is more compatible with
725programmer intuition.
726
727The literals `0` and `1` are also treated specially by Cforall, due to their
728potential uses in operator overloading.
729Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
730multiple interpretations of them according to the existing variable
731overloading rules, with the following declarations in the prelude:
732
733        const int 0, 1;
734        forall ( dtype DT ) const DT * const    0;
735        forall ( ftype FT ) FT * const          0;
736
737This did, however, create some backward-compatibility problems and potential
738performance issues, and works poorly for generic types. To start with, this
739(entirely legal C) code snippet doesn't compile in Cforall:
740
741        if ( 0 ) {}
742
743It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
744`forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
745not choose which `0` variable to take, because they're both exact matches.
746
747The general != computation may also be less efficient than a check for a zero
748value; take the following example of a rational type:
749
750        struct rational { int32_t num, int32_t den };
751        rational 0 = { 0, 1 };
752       
753        int ?!=? (rational a, rational b) {
754                return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
755        }
756       
757        int not_zero (rational a) { return a.num != 0; }
758
759To check if two rationals are equal we need to do a pair of multiplications to
760normalize them (the casts in the example are to prevent overflow), but to
761check if a rational is non-zero we just need to check its numerator, a more
762efficient operation.
763
764Finally, though polymorphic null-pointer variables can be meaningfully
765defined, most other polymorphic variables cannot be, which makes it difficult
766to make generic types "truthy" using the existing system:
767
768        forall(otype T) struct pair { T x; T y; };
769        forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
770
771Now, it seems natural enough to want to define the zero for this pair type as
772a pair of the zero values of its element type (if they're defined).
773The declaration of `pair(T) 0` above is actually illegal though, as there is
774no way to represent the zero values of an infinite number of types in the
775single memory location available for this polymorphic variable - the
776polymorphic null-pointer variables defined in the prelude are legal, but that
777is only because all pointers are the same size and the single zero value is a
778legal value of all pointer types simultaneously; null pointer is, however,
779somewhat unique in this respect.
780
781The technical explanation for the problems with polymorphic zero is that `0` 
782is really a rvalue, not a lvalue - an expression, not an object.
783Drawing from this, the solution we propose is to give `0` a new built-in type,
784`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in
785type `_unit_t`.
786If the prelude defines != over `_zero_t` this solves the `if ( 0 )` problem,
787because now the unambiguous best interpretation of `0 != 0` is to read them
788both as `_zero_t` (and say that this expression is false).
789Backwards compatibility with C can be served by defining conversions in the
790prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer
791types, as below:
792
793        // int 0;
794        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
795        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);
796       
797        // int 1;
798        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
799        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);
800       
801        // forall(dtype DT) const DT* 0;
802        forall(dtype DT) void ?{safe}(const DT**, _zero_t);
803        // forall(ftype FT) FT* 0;
804        forall(ftype FT) void ?{safe}(FT**, _zero_t);
805
806Further, with this change, instead of making `0` and `1` overloadable
807variables, we can instead allow user-defined constructors (or, more flexibly,
808safe conversions) from `_zero_t`, as below:
809
810        // rational 0 = { 0, 1 };
811        void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }
812
813Note that we don't need to name the `_zero_t` parameter to this constructor,
814because its only possible value is a literal zero.
815This one line allows `0` to be used anywhere a `rational` is required, as well
816as enabling the same use of rationals in boolean contexts as above (by
817interpreting the `0` in the desguraring to be a rational by this conversion).
818Furthermore, while defining a conversion function from literal zero to
819`rational` makes rational a "truthy" type able to be used in a boolean
820context, we can optionally further optimize the truth decision on rationals as
821follows:
822
823        int ?!=? (rational a, _zero_t) { return a.num != 0; }
824
825This comparison function will be chosen in preference to the more general
826rational comparison function for comparisons against literal zero (like in
827boolean contexts) because it doesn't require a conversion on the `0` argument.
828Functions of the form `int ?!=? (T, _zero_t)` can acutally be used in general
829to make a type `T` truthy without making `0` a value which can convert to that
830type, a capability not available in the current design.
831
832This design also solves the problem of polymorphic zero for generic types, as
833in the following example:
834
835        // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
836        forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
837                this->x = 0; this->y = 0;
838        }
839
840The polymorphic variable declaration didn't work, but this constructor is
841perfectly legal and has the desired semantics.
842
843We can assert that `T` can be used in a boolean context as follows:
844
845        `forall(otype T | { int ?!=?(T, _zero_t); })`
846 
847Since the C standard (6.5.16.1.1) specifically states that pointers can be
848assigned into `_Bool` variables (and implies that other artithmetic types can
849be assigned into `_Bool` variables), it seems natural to say that assignment
850into a `_Bool` variable effectively constitutes a boolean context.
851To allow this interpretation, I propose including the following function (or
852its effective equivalent) in the prelude:
853
854        forall(otype T | { int ?!=?(T, _zero_t); })
855        void ?{safe}( _Bool *this, T that ) { *this = that != 0; }
856
857Note that this conversion is not transitive; that is, for `t` a variable of
858some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
859of a lower-cost one), `(int)t;` would not use this conversion (and in fact
860would not be legal in the absence of another valid way to convert a `T` to an
861`int`), but `(int)(_Bool)t;` could legally use this conversion.
862
863Similarly giving literal `1` the special type `_unit_t` allows for more
864concise and consistent specification of the increment and decrement operators,
865using the following de-sugaring:
866
867        ++i => i += 1
868        i++ => (tmp = i, i += 1, tmp)
869        --i => i -= 1
870        i-- => (tmp = i, i -= 1, tmp)
871
872In the examples above, `tmp` is a fresh temporary with its type inferred from
873the return type of `i += 1`.
874Under this proposal, defining a conversion from `_unit_t` to `T` and a
875`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
876for free in a consistent fashion (similarly for -= and the decrement
877operators).
878If a meaningful `1` cannot be defined for a type, both increment operators can
879still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`.
880Similarly, if scalar addition can be performed on a type more efficiently than
881by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
882addition operator, it will simultaneously define consistent implementations of
883both increment operators (this can also be accomplished by defining a
884conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
885
886To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
887increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`,
888we also define a non-transitive unsafe conversion from `_Bool` (allowable
889values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:
890
891        void ?{unsafe} (_unit_t*, _Bool) {}
892
893As a note, the desugaring of post-increment above is possibly even more
894efficient than that of C++ - in C++, the copy to the temporary may be hidden
895in a separately-compiled module where it can't be elided in cases where it is
896not used, whereas this approach for Cforall always gives the compiler the
897opportunity to optimize out the temporary when it is not needed.
898Furthermore, one could imagine a post-increment operator that returned some
899type `T2` that was implicitly convertable to `T` but less work than a full
900copy of `T` to create (this seems like an absurdly niche case) - since the
901type of `tmp` is inferred from the return type of `i += 1`, you could set up
902functions with the following signatures to enable an equivalent pattern in
903Cforall:
904
905        lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
906        void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
907        void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to
908                                      // preserve expected semantics of T x = y++;
909
910**TODO** Look in C spec for literal type interprations.
911**TODO** Write up proposal for wider range of literal types, put in appendix
912
913### Initialization and Cast Expressions ###
914An initialization expression `T x = e` has one interpretation for each
915interpretation `I` of `e` with type `S` which is convertable to `T`.
916The cost of the interpretation is the cost of `I` plus the conversion cost
917from `S` to `T`.
918A cast expression `(T)e` is interpreted as hoisting initialization of a
919temporary variable `T tmp = e` out of the current expression, then replacing
920`(T)e` by the new temporary `tmp`.
921
922### Assignment Expressions ###
923An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then
924interpreted according to the usual rules for function application and comma
925expressions.
926Operator-assignment expressions like `e += f` desugar similarly as
927`(?+=?(&e, f), e)`.
928
929### Function Application Expressions ###
930Every _compatible function_ and satisfying interpretation of its arguments and
931polymorphic variable bindings produces one intepretation for the function
932application expression.
933Broadly speaking, the resolution cost of a function application is the sum of
934the cost of the interpretations of all arguments, the cost of all conversions
935to make those argument interpretations match the parameter types, and the
936binding cost of any of the function's polymorphic type parameters.
937
938**TODO** Work out binding cost in more detail.
939**TODO** Address whether "incomparably polymorphic" should be treated as
940"equally polymorphic" and be disambiguated by count of (safe) conversions.
941**TODO** Think about what polymorphic return types mean in terms of late
942binding.
943**TODO** Consider if "f is less polymorphic than g" can mean exactly "f
944specializes g"; if we don't consider the assertion parameters (except perhaps
945by count) and make polymorphic variables bind exactly (rather than after
946implicit conversions) this should actually be pre-computable.
947**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted
948function declaration is costed out by the resolver in the same way that any
949other function declaration is costed; if the deleted declaration is the unique
950min-cost resolution refuse to type the expression, if it is tied for min-cost
951then take the non-deleted alternative, and of two equivalent-cost deleted
952interpretations with the same return type pick one arbitrarily rather than
953producing an ambiguous resolution. This would also be useful for forbidding
954pointer-to-floating-point explicit conversions (C11, 6.5.4.4).
955**TODO** Cover default parameters, maybe named parameters (see "named
956arguments" thread of 11 March 2016)
957 
958
959### Sizeof, Alignof & Offsetof Expressions ###
960`sizeof`, `alignof`, and `offsetof` expressions have at most a single
961interpretation, of type `size_t`.
962`sizeof` and `alignof` expressions take either a type or an expression as an
963argument; if the argument is a type, it must be a complete type which is not a
964function type, if an expression, the expression must have a single
965interpretation, the type of which conforms to the same rules.
966`offsetof` takes two arguments, a type and a member name; the type must be
967a complete structure or union type, and the second argument must name a member
968of that type.
969
970### Comma Expressions ###
971A comma expression `x, y` resolves `x` as if it had been cast to `void`, and
972then, if there is a unique interpretation `I` of `x`, has one interpretation
973for each interpretation `J` of `y` with the same type as `J` costing the sum
974of the costs of `I` and `J`.
975
976### Index Expressions ###
977**TODO** Consider adding polymorphic function in prelude for this, as per
9786.5.2.1.2 in the C standard:
979
980        forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); })
981        lvalue E ?[?](T a, I i) { return *(a + i); }
982
983I think this isn't actually a good idea, because the cases for it are niche,
984mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a
985pointer `p`, and adding it to the prelude would slow down resolution of
986every index expression just a bit. Our existing prelude includes an indexing
987operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified
988variants, which should satisfy C source-compatibility without propegating this
989silly desugaring further.
990
991#### Compatible Functions ####
992**TODO** This subsection is very much a work in progress and has multiple open
993design questions.
994
995A _compatible function_ for an application expression is a visible function
996declaration with the same name as the application expression and parameter
997types that can be converted to from the argument types.
998Function pointers and variables of types with the `?()` function call operator
999overloaded may also serve as function declarations for purposes of
1000compatibility.
1001
1002For monomorphic parameters of a function declaration, the declaration is a
1003compatible function if there is an argument interpretation that is either an
1004exact match, or has a safe or unsafe implicit conversion that can be used to
1005reach the parameter type; for example:
1006
1007        void f(int);
1008       
1009        f(42);        // compatible; exact match to int type
1010        f('x');       // compatible; safe conversion from char => int
1011        f(3.14);      // compatible; unsafe conversion from double => int
1012        f((void*)0);  // not compatible; no implicit conversion from void* => int
1013
1014Per Richard[*], function assertion satisfaction involves recursively searching
1015for compatible functions, not an exact match on the function types (I don't
1016believe the current Cforall resolver handles this properly); to extend the
1017previous example:
1018
1019        forall(otype T | { void f(T); }) void g(T);
1020       
1021        g(42);    // binds T = int, takes f(int) by exact match
1022        g('x');   // binds T = char, takes f(int) by conversion
1023        g(3.14);  // binds T = double, takes f(int) by conversion
1024
1025[*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the
1026accessible scopes for definitions corresponding to assertion names, and
1027choosing the ones whose types correspond *most closely* to the assertion
1028types." (emphasis mine)
1029
1030There are three approaches we could take to binding type variables: type
1031variables must bind to argument types exactly, each type variable must bind
1032exactly to at least one argument, or type variables may bind to any type which
1033all corresponding arguments can implicitly convert to; I'll provide some
1034possible motivation for each approach.
1035
1036There are two main arguments for the more restrictive binding schemes; the
1037first is that the built-in implicit conversions in C between `void*` and `T*` 
1038for any type `T` can lead to unexpectedly type-unsafe behaviour in a more
1039permissive binding scheme, for example:
1040
1041        forall(dtype T) T* id(T *p) { return p; }
1042       
1043        int main() {
1044                int *p = 0;
1045                char *c = id(p);
1046        }
1047
1048This code compiles in CFA today, and it works because the extra function
1049wrapper `id` provides a level of indirection that allows the non-chaining
1050implicit conversions from `int*` => `void*` and `void*` => `char*` to chain.
1051The resolver types the last line with `T` bound to `void` as follows:
1052
1053        char *c = (char*)id( (void*)p );
1054
1055It has been suggested that making the implicit conversions to and from `void*` 
1056explicit in Cforall code (as in C++) would solve this particular problem, and
1057provide enough other type-safety benefits to outweigh the source-compatibility
1058break with C; see Appendix D for further details.
1059
1060The second argument for a more constrained binding scheme is performance;
1061trait assertions need to be checked after the type variables are bound, and
1062adding more possible values for the type variables should correspond to a
1063linear increase in runtime of the resolver per type variable.
1064There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of
1065them is implicitly convertable to any other; if we allow unrestricted binding
1066of type variables, a common `int` variable (or literal) used in the position
1067of a polymorphic variable parameter would cause a 20x increase in the amount
1068of time needed to check trait resolution for that interpretation.
1069These numbers have yet to be emprically substantiated, but the theory is
1070reasonable, and given that much of the impetus for re-writing the resolver is
1071due to its poor performance, I think this is a compelling argument.
1072
1073I would also mention that a constrained binding scheme is practical; the most
1074common type of assertion is a function assertion, and, as mentioned above,
1075those assertions should be able to be implicitly converted to to match.
1076Thus, in the example above with `g(T)`, where the assertion is `void f(T)`,
1077we first bind `T = int` or `T = char` or `T = double`, then substitute the
1078binding into the assertion, yielding assertions of `void f(int)`,
1079`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy
1080these assertions to complete the binding.
1081Though in all three cases, the existing function with signature `void f(int)` 
1082satisfies this assertion, the checking work cannot easily be re-used between
1083variable bindings, because there may be better or worse matches depending on
1084the specifics of the binding.
1085
1086The main argument for a more flexible binding scheme is that the binding
1087abstraction can actually cause a wrapped function call that would work to
1088cease to resolve, as below:
1089
1090        forall(otype T | { T ?+? (T, T) })
1091        T add(T x, T y) { return x + y; }
1092       
1093        int main() {
1094                int i, j = 2;
1095                short r, s = 3;
1096                i = add(j, s);
1097                r = add(s, j);
1098        }
1099
1100Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and
1101in both cases the short `s` is promoted to `int` to match `j`.
1102If, on the other hand, we demand that variables exactly match type variables,
1103neither call to `add` will compile, because it is impossible to simultaneously
1104bind `T` to both `int` and `short` (C++ has a similar restriction on template
1105variable inferencing).
1106One alternative that enables this case, while still limiting the possible
1107type variable bindings is to say that at least one argument must bind to its
1108type parameter exactly.
1109In this case, both calls to `add` would have the set `{ T = int, T = short }` 
1110for candidate bindings, and would have to check both, as well as checking that
1111`short` could convert to `int` or vice-versa.
1112
1113It is worth noting here that parameterized types generally bind their type
1114parameters exactly anyway, so these "restrictive" semantics only restrict a
1115small minority of calls; for instance, in the example following, there isn't a
1116sensible way to type the call to `ptr-add`:
1117
1118        forall(otype T | { T ?+?(T, T) })
1119        void ptr-add( T* rtn, T* x, T* y ) {
1120                *rtn = *x + *y;
1121        }
1122       
1123        int main() {
1124                int i, j = 2;
1125                short s = 3;
1126                ptr-add(&i, &j, &s); // ERROR &s is not an int*
1127        }
1128
1129I think there is some value in providing library authors with the
1130capability to express "these two parameter types must match exactly".
1131This can be done without restricting the language's expressivity, as the `add` 
1132case above can be made to work under the strictest type variable binding
1133semantics with any addition operator in the system by changing its signature
1134as follows:
1135
1136        forall( otype T, otype R, otype S | { R ?+?(T, S); } )
1137        R add(T x, S y) { return x + y; }
1138
1139Now, it is somewhat unfortunate that the most general version here is more
1140verbose (and thus that the path of least resistence would be more restrictive
1141library code); however, the breaking case in the example code above is a bit
1142odd anyway - explicitly taking two variables of distinct types and relying on
1143C's implicit conversions to do the right thing is somewhat bad form,
1144especially where signed/unsigned conversions are concerned.
1145I think the more common case for implicit conversions is the following,
1146though, where the conversion is used on a literal:
1147
1148        short x = 40;
1149        short y = add(x, 2);
1150
1151One option to handle just this case would be to make literals implicitly
1152convertable to match polymorphic type variables, but only literals.
1153The example above would actually behave slightly differently than `x + 2` in
1154C, though, casting the `2` down to `short` rather than the `x` up to `int`, a
1155possible demerit of this scheme.
1156
1157The other question to ask would be which conversions would be allowed for
1158literals; it seems rather odd to allow down-casting `42ull` to `char`, when
1159the programmer has explicitly specified by the suffix that it's an unsigned
1160long.
1161Type interpretations of literals in C are rather complex (see [1]), but one
1162reasonable approach would be to say that un-suffixed integer literals could be
1163interpreted as any type convertable from int, "u" suffixed literals could be
1164interpreted as any type convertable from "unsigned int" except the signed
1165integer types, and "l" or "ll" suffixed literals could only be interpreted as
1166`long` or `long long`, respectively (or possibly that the "u" suffix excludes
1167the signed types, while the "l" suffix excludes the types smaller than
1168`long int`, as in [1]).
1169Similarly, unsuffixed floating-point literals could be interpreted as `float`,
1170`double` or `long double`, but "f" or "l" suffixed floating-point literals
1171could only be interpreted as `float` or `long double`, respectively.
1172I would like to specify that character literals can only be interpreted as
1173`char`, but the wide-character variants and the C practice of typing character
1174literals as `int` means that would likely break code, so character literals
1175should be able to take any integer type.
1176
1177[1] http://en.cppreference.com/w/c/language/integer_constant
1178
1179With the possible exception of the `add` case above, implicit conversions to
1180the function types of assertions can handle most of the expected behaviour
1181from C.
1182However, implicit conversions cannot be applied to match variable assertions,
1183as in the following example:
1184
1185        forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
1186        T clamp_sum( T x, T y ) {
1187                T sum = x + y;
1188                if ( sum < min ) return min;
1189                if ( max < sum ) return max;
1190                return sum;
1191        }
1192       
1193        char min = 'A';
1194        double max = 100.0;
1195        //int val = clamp_sum( 'X', 3.14 );  // ERROR (1)
1196       
1197        char max = 'Z'
1198        char val = clamp_sum( 'X', 3.14 ); // MATCH (2)
1199        double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3)
1200
1201In this example, the call to `clamp_sum` at (1) doesn't compile, because even
1202though there are compatible `min` and `max` variables of types `char` and
1203`double`, they need to have the same type to match the constraint, and they
1204don't.
1205The (2) example does compile, but with a behaviour that might be a bit
1206unexpected given the "usual arithmetic conversions", in that both values are
1207narrowed to `char` to match the `min` and `max` constraints, rather than
1208widened to `double` as is usual for mis-matched arguments to +.
1209The (3) example is the only case discussed here that would require the most
1210permisive type binding semantics - here, `T` is bound to `char`, to match the
1211constraints, and both the parameters are narrowed from `double` to `char` 
1212before the call, which would not be allowed under either of the more
1213restrictive binding semantics.
1214However, the behaviour here is unexpected to the programmer, because the
1215return value will be `(double)'A' /* == 60.0 */` due to the conversions,
1216rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.
1217
1218Personally, I think that implicit conversions are not a particularly good
1219language design, and that the use-cases for them can be better handled with
1220less powerful features (e.g. more versatile rules for typing constant
1221expressions).
1222However, though we do need implicit conversions in monomorphic code for C
1223compatibility, I'm in favour of restricting their usage in polymorphic code,
1224both to give programmers some stronger tools to express their intent and to
1225shrink the search space for the resolver.
1226Of the possible binding semantics I've discussed, I'm in favour of forcing
1227polymorphic type variables to bind exactly, though I could be talked into
1228allowing literal expressions to have more flexibility in their bindings, or
1229possibly loosening "type variables bind exactly" to "type variables bind
1230exactly at least once"; I think the unrestricted combination of implicit
1231conversions and polymorphic type variable binding unneccesarily multiplies the
1232space of possible function resolutions, and that the added resolution options
1233are mostly unexpected and therefore confusing and not useful to user
1234programmers.
1235
1236## Resolver Architecture ##
1237
1238### Function Application Resolution ###
1239Our resolution algorithm for function application expressions is based on
1240Baker's[3] single-pass bottom-up algorithm, with Cormack's[4] single-pass
1241top-down algorithm applied where appropriate as an optimization.
1242Broadly speaking, the cost of this resolution per expression will be
1243proportional to `i^d`, where `i` is the number of interpretations of each
1244program symbol, and `d` is the maximum depth of the expression DAG.
1245Since `d` is determined by the user programmer (in general, bounded by a small
1246constant), opportunities for resolver optimization primarily revolve around
1247minimizing `i`, the number of interpretations of each symbol that are
1248considered.
1249
1250[3] Baker, Theodore P. A one-pass algorithm for overload resolution in Ada.
1251ACM Transactions on Programming Languages and Systems (1982) 4:4 p.601-614
1252
1253[4] Cormack, Gordon V. An algorithm for the selection of overloaded functions
1254in Ada. SIGPLAN Notices (1981) 16:2 p.48-52
1255
1256Unlike Baker, our system allows implicit type conversions for function
1257arguments and return types; the problem then becomes to find the valid
1258interpretation for an expression that has the unique minimal conversion cost,
1259if such exists.
1260Interpretations can be produced both by overloaded names and implicit
1261conversions applied to existing interpretations; we have proposals to reduce
1262the number of interpretations considered from both sources.
1263To simplify the problem for this discussion, we will consider application
1264resolution restricted to a domain of functions applied to variables, possibly
1265in a nested manner (e.g. `f( g( x ), y )`, where `x` and `y` are variables and
1266`f` and `g` are functions), and possibly in a typed context such as a variable
1267initialization (e.g. `int i = f( x );`); the other aspects of Cforall type
1268resolution should be able to be straightforwardly mapped into this model.
1269The types of the symbol tables used for variable and function declarations
1270look somewhat like the following:
1271
1272        variable_table = name_map( variable_name, variable_map )
1273       
1274        function_table = name_map( function_name, function_map )
1275       
1276        variable_map = multi_index( by_type( variable_type ),
1277                                    variable_decl_set )
1278
1279        function_map = multi_index( by_int( n_params ),
1280                                                                by_type( return_type ),
1281                                                                function_decl_set )
1282
1283`variable_name` and `function_name` are likely simple strings, with `name_map` 
1284a hash table (or perhaps trie) mapping string keys to values.
1285`variable_decl_set` and `function_decl_set` can be thought of for the moment
1286as simple bags of typed declarations, where the declaration types are linked
1287to the graph of available conversions for that type.
1288In a typed context both the `variable_decl_set` and the `function_decl_set` 
1289should be able to be selected upon by type; this is accomplished by the
1290`by_type` index of both `variable_map` and `function_map`.
1291The `by_int` index of `function_map` also provides a way to select functions
1292by their number of parameters; this index may be used to swiftly discard any
1293function declaration which does not have the appropriate number of parameters
1294for the argument interpretations being passed to it; given the likely small
1295number of entries in this map, it is possible that a binary search of a sorted
1296vector or even a linear search of an unsorted vector would be more efficient
1297than the usual hash-based index.
1298
1299Given these data structures, the general outline of our algorithm follows
1300Baker, with Cormack's algorithm used as a heuristic filter in typed contexts.
1301
1302In an untyped context, we use a variant of Baker's bottom-up algorithm.
1303The leaves of the interpretation DAG are drawn from the variable symbol table,
1304with entries in the table each producing zero-cost interpretations, and each
1305implicit conversion available to be applied to the type of an existing entry
1306producing a further interpretation with the same cost as the conversion.
1307As in Baker, if two or more interpretations have the same type, only the
1308minimum cost interpretation with that type is produced; if there is no unique
1309minimum cost interpretation than resolution with that type is ambiguous, and
1310not permitted.
1311It should be relatively simple to produce the list of interpretations sorted
1312by cost by producing the interpretations via a breadth-first search of the
1313conversion graph from the initial interpretations provided in the variable
1314symbol table.
1315
1316To match a function at one of the internal nodes of the DAG, we first look up
1317the function's name in the function symbol table, the appropriate number of
1318parameters for the arguments that are provided through the `by_int` index of
1319the returned `function_map`, then go through the resulting `function_decl_set` 
1320searching for functions where the parameter types can unify with the provided
1321argument lists; any such matching function produces an interpretation with a
1322cost that is the sum of its argument costs.
1323Though this is not included in our simplified model, this unification step may
1324include binding of polymorphic variables, which introduces a cost for the
1325function binding itself which must be added to the argument costs.
1326Also, checking of function assertions would likely be done at this step as
1327well, possibly eliminating some possible matching functions (if no suitable
1328assertions can be satisfied), or adding further conversion costs for the
1329assertion satisfaction.
1330Once the set of valid function interpretations is produced, these may also be
1331expanded by the graph of implicit conversions on their return types, as the
1332variable interpretations were.
1333
1334This implicit conversion-based expansion of interpretations should be skipped
1335for the top-level expression if used in an untyped (void) context, e.g. for
1336`f` in `f( g ( x ) );` or `x` in `x;`.
1337On the other hand, if the top-level expression specifies a type, e.g. in
1338`int i = f( x );`, only top level expressions that return that type are
1339relevant to the search, so the candidates for `f` can be filtered first by
1340those that return `int` (or a type convertable to it); this can be
1341accomplished by performing a top-down filter of the interpretations of `f` by
1342the `by_type` index of the `function_map` in a manner similar to Cormack's[4]
1343algorithm.
1344
1345In a typed context, such as an initialization expression
1346`T x = f( g( y ), z );`, only interpretations of `f( g( y ), z )` which have
1347type `T` are valid; since there are likely to be valid interpretations of
1348`f( g( y ), z )` which cannot be used to initialize a variable of type `T`, we
1349can use this information to reduce the number of interpretations considered.
1350Drawing from Cormack[4], we first search for interpretations of `f` where the
1351return type is `T`; by breadth-first-search of the conversion graph, it should
1352be straightforward to order the interpretations of `f` by the cost to convert
1353their return type to `T`.
1354We can also filter out interpretations of `f` with less than two parameters,
1355since both `g( y )` and `z` must produce at least one parameter; we may not,
1356however, rule out interpretations of `f` with more than two parameters, as
1357there may be a valid interpretation of `g( y )` as a function returning more
1358than one parameter (if the expression was `f( y, z )` instead, we could use an
1359exact parameter count, assuming that variables of tuple type don't exist).
1360For each compatible interpretation of `f`, we can add the type of the first
1361parameter of that interpretation of `f` to a set `S`, and recursively search
1362for interpretations of `g( y )` that return some type `Si` in `S`, and
1363similarly for interpretations of `z` that match the type of any of the second
1364parameters of some `f`.
1365Naturally, if no matching interpretation of `g( y )` can be found for the
1366first parameter of some `f`, the type of the second parameter of that `f` will
1367not be added to the set of valid types for `z`.
1368Each node in this interpretation DAG is given a cost the same way it would be
1369in the bottom-up approach, with the exception that when going top-down there
1370must be a final bottom-up pass to sum the interpretation costs and sort them
1371as appropriate.
1372
1373If a parameter type for some `f` is a polymorphic type variable that is left
1374unbound by the return type (e.g. `forall(otype S) int f(S x, int y)`), the
1375matching arguments should be found using the bottom-up algorithm above for
1376untyped contexts, because the polymorphic type variable does not sufficiently
1377constrain the available interpretations of the argument expression.
1378Similarly, it would likely be an advantage to use top-down resolution for
1379cast expressions (e.g. `(int)x`), even when those cast expressions are
1380subexpressions of an otherwise untyped expression.
1381It may also be fruitful to switch between the bottom-up and top-down
1382algorithms if the number of valid interpretations for a subexpression or valid
1383types for an argument exceeds some heuristic threshold, but finding such
1384a threshold (if any exists) will require experimental data.
1385This hybrid top-down/bottom-up search provides more opportunities for pruning
1386interpretations than either a bottom-up or top-down approach alone, and thus
1387may be more efficient than either.
1388A top-down-only approach, however, devolves to linear search through every
1389possible interpretation in the solution space in an untyped context, and is
1390thus likely to be inferior to a strictly bottom-up approach, though this
1391hypothesis needs to be empirically validated.
1392
1393Another approach would be to abandon expression-tree ordering for
1394subexpression matching, and order by "most constrained symbol"; symbols would 
1395be more constrained if there were fewer matching declarations, fewer
1396subexpressions yet to resolve, or possibly fewer possible types the expression
1397could resolve to. Ordering the expressions in a priority-queue by this metric
1398would not necessarily produce a top-down or a bottom-up order, but would add
1399opportunities for pruning based on memoized upper and lower bounds.
1400
1401Both Baker and Cormack explicitly generate all possible interpretations of a
1402given expression; thinking of the set of interpretations of an expression as a
1403list sorted by cost, this is an eager evaluation of the list.
1404However, since we generally expect that user programmers will not often use
1405high-cost implicit conversions, one potentially effective way to prune the
1406search space would be to first find the minimal-cost interpretations of any
1407given subexpression, then to save the resolution progress of the
1408subexpressions and attempt to resolve the superexpression using only those
1409subexpression interpretations.
1410If no valid interpretation of the superexpression can be found, the resolver
1411would then repeatedly find the next-most-minimal cost interpretations of the
1412subexpressions and attempt to produce the minimal cost interpretation of the
1413superexpression.
1414This process would proceed until all possible subexpression interpretations
1415have been found and considered.
1416
1417A middle ground between the eager and lazy approaches can be taken by
1418considering the lexical order on the cost tuple; essentially, every
1419interpretation in each of the classes below will be strictly cheaper than any
1420interpretation in the class after it, so if a min-cost valid interpretation
1421can be found while only generating interpretations in a given class, that
1422interpretation is guaranteed to be the best possible one:
1423
14241. Interpretations without polymorphic functions or implicit conversions
14252. Interpretations without polymorphic functions using only safe conversions
14263. Interpretations using polymorphic functions without unsafe conversions
14274. Interpretations using unsafe conversions
1428
1429In this lazy-eager approach, all the interpretations in one class would be
1430eagerly generated, while the interpretations in the next class would only be
1431considered if no match was found in the previous class.
1432
1433Another source of efficiency would be to cache the best given interpretation
1434of a subexpression within an environment; this may not be incredibly useful
1435for explict parameters (though it may be useful for, e.g. `f( x, x )`, where
1436both parameters of `f` have the same type), but should pay some dividends for
1437the implicit assertion parameters, especially the otype parameters for the
1438argument of a generic type, which will generally be resolved in duplicate for
1439(at least) the assignment operator, constructor, copy constructor & destructor
1440of the generic type.
1441
1442## Appendix A: Partial and Total Orders ##
1443The `<=` relation on integers is a commonly known _total order_, and
1444intuitions based on how it works generally apply well to other total orders.
1445Formally, a total order is some binary relation `<=` over a set `S` such that
1446for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and
1447`b` must be equal, the _antisymmetry_ property); total orders also have a
1448_transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`.
1449If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
1450 
1451A _partial order_ is a generalization of this concept where the `<=` relation
1452is not required to be defined over all pairs of elements in `S` (though there
1453is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other
1454words, it is possible for two elements `a` and `b` of `S` to be
1455_incomparable_, unable to be ordered with respect to one another (any `a` and
1456`b` for which either `a <= b` or `b <= a` are called _comparable_).
1457Antisymmetry and transitivity are also required for a partial order, so all
1458total orders are also partial orders by definition.
1459One fairly natural partial order is the "subset of" relation over sets from
1460the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are
1461both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the
1462other - they are incomparable under this relation.
1463
1464We can compose two (or more) partial orders to produce a new partial order on
1465tuples drawn from both (or all the) sets.
1466For example, given `a` and `c` from set `S` and `b` and `d` from set `R`,
1467where both `S` and `R` both have partial orders defined on them, we can define
1468a ordering relation between `(a, b)` and `(c, d)`.
1469One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff
1470`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by
1471the first set and "breaking ties" by the second set.
1472Another common order is the _product order_, which can be roughly thought of
1473as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
1474iff `a <= c` and `b <= d`.
1475One difference between the lexicographical order and the product order is that
1476in the lexicographical order if both `a` and `c` and `b` and `d` are
1477comparable then `(a, b)` and `(c, d)` will be comparable, while in the product
1478order you can have `a <= c` and `d <= b` (both comparable) which will make
1479`(a, b)` and `(c, d)` incomparable.
1480The product order, on the other hand, has the benefit of not prioritizing one
1481order over the other.
1482
1483Any partial order has a natural representation as a directed acyclic graph
1484(DAG).
1485Each element `a` of the set becomes a node of the DAG, with an arc pointing to
1486its _covering_ elements, any element `b` such that `a < b` but where there is
1487no `c` such that `a < c` and `c < b`.
1488Intuitively, the covering elements are the "next ones larger", where you can't
1489fit another element between the two.
1490Under this construction, `a < b` is equivalent to "there is a path from `a` to
1491`b` in the DAG", and the lack of cycles in the directed graph is ensured by
1492the antisymmetry property of the partial order.
1493
1494Partial orders can be generalized to _preorders_ by removing the antisymmetry
1495property.
1496In a preorder the relation is generally called `<~`, and it is possible for
1497two distict elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case
1498we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`.
1499Preorders may also be represented as directed graphs, but in this case the
1500graph may contain cycles.
1501
1502## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
1503The short answer is that it's impossible.
1504
1505The longer answer is that it has to do with what's essentially a diamond
1506inheritance problem.
1507In C, `int` converts to `unsigned int` and also `long` "safely"; both convert
1508to `unsigned long` safely, and it's possible to chain the conversions to
1509convert `int` to `unsigned long`.
1510There are two constraints here; one is that the `int` to `unsigned long` 
1511conversion needs to cost more than the other two (because the types aren't as
1512"close" in a very intuitive fashion), and the other is that the system needs a
1513way to choose which path to take to get to the destination type.
1514Now, a fairly natural solution for this would be to just say "C knows how to
1515convert from `int` to `unsigned long`, so we just put in a direct conversion
1516and make the compiler smart enough to figure out the costs" - given that the
1517users can build an arbitrary graph of conversions, this needs to be handled
1518anyway.
1519We can define a preorder over the types by saying that `a <~ b` if there
1520exists a chain of conversions from `a` to `b`.
1521This preorder corresponds roughly to a more usual type-theoretic concept of
1522subtyping ("if I can convert `a` to `b`, `a` is a more specific type than
1523`b`"); however, since this graph is arbitrary, it may contain cycles, so if
1524there is also a path to convert `b` to `a` they are in some sense equivalently
1525specific.
1526
1527Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and
1528`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`)
1529and this conversion preorder over the destination types `xn` and `ym`.
1530We could define a preorder by taking chain length and breaking ties by the
1531conversion preorder, but this would lead to unexpected behaviour when closing
1532diamonds with an arm length of longer than 1.
1533Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`,
1534`B2->C`, and `A->C`.
1535If we are comparing conversions from `A` to both `B2` and `C`, we expect the
1536conversion to `B2` to be chosen because it's the more specific type under the
1537conversion preorder, but since its chain length is longer than the conversion
1538to `C`, it loses and `C` is chosen.
1539However, taking the conversion preorder and breaking ties or ambiguities by
1540chain length also doesn't work, because of cases like the following example
1541where the transitivity property is broken and we can't find a global maximum:
1542
1543        `X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`
1544
1545In this set of arcs, if we're comparing conversions from `X` to each of `Y2`,
1546`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because
1547there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain
1548length.
1549Also, comparing conversions from `X` to `Z3` and to `W`, we find that the
1550conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder,
1551and so is considered to be the nearer type.
1552By transitivity, then, the conversion from `X` to `Y2` should be cheaper than
1553the conversion from `X` to `W`, but in this case the `X` and `W` are
1554incomparable by the conversion preorder, so the tie is broken by the shorter
1555path from `X` to `W` in favour of `W`, contradicting the transitivity property
1556for this proposed order.
1557
1558Without transitivity, we would need to compare all pairs of conversions, which
1559would be expensive, and possibly not yield a minimal-cost conversion even if
1560all pairs were comparable.
1561In short, this ordering is infeasible, and by extension I believe any ordering
1562composed solely of single-step conversions between types with no further
1563user-supplied information will be insufficiently powerful to express the
1564built-in conversions between C's types.
1565
1566## Appendix C: Proposed Prelude Changes ##
1567**TODO** Port Glen's "Future Work" page for builtin C conversions.
1568**TODO** Move discussion of zero_t, unit_t here.
1569
1570It may be desirable to have some polymorphic wrapper functions in the prelude
1571which provide consistent default implementations of various operators given a
1572definition of one of them.
1573Naturally, users could still provide a monomorphic overload if they wished to
1574make their own code more efficient than the polymorphic wrapper could be, but
1575this would minimize user effort in the common case where the user cannot write
1576a more efficient function, or is willing to trade some runtime efficiency for
1577developer time.
1578As an example, consider the following polymorphic defaults for `+` and `+=`:
1579
1580        forall(otype T | { T ?+?(T, T); })
1581        lvalue T ?+=? (T *a, T b) {
1582                return *a = *a + b;
1583        }
1584       
1585        forall(otype T | { lvalue T ?+=? (T*, T) })
1586        T ?+? (T a, T b) {
1587                T tmp = a;
1588                return tmp += b;
1589        }
1590
1591Both of these have a possibly unneccessary copy (the first in copying the
1592result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in
1593cases where these copies are unavoidable the polymorphic wrappers should be
1594just as performant as the monomorphic equivalents (assuming a compiler
1595sufficiently clever to inline the extra function call), and allow programmers
1596to define a set of related operators with maximal concision.
1597
1598**TODO** Look at what Glen and Richard have already written for this.
1599
1600## Appendix D: Feasibility of Making void* Conversions Explicit ##
1601C allows implicit conversions between `void*` and other pointer types, as per
1602section 6.3.2.3.1 of the standard.
1603Making these implicit conversions explicit in Cforall would provide
1604significant type-safety benefits, and is precedented in C++.
1605A weaker version of this proposal would be to allow implicit conversions to
1606`void*` (as a sort of "top type" for all pointer types), but to make the
1607unsafe conversion from `void*` back to a concrete pointer type an explicit
1608conversion.
1609However, `int *p = malloc( sizeof(int) );` and friends are hugely common
1610in C code, and rely on the unsafe implicit conversion from the `void*` return
1611type of `malloc` to the `int*` type of the variable - obviously it would be
1612too much of a source-compatibility break to disallow this for C code.
1613We do already need to wrap C code in an `extern "C"` block, though, so it is
1614technically feasible to make the `void*` conversions implicit in C but
1615explicit in Cforall.
1616Also, for calling C code with `void*`-based APIs, pointers-to-dtype are
1617calling-convention compatible with `void*`; we could read `void*` in function
1618signatures as essentially a fresh dtype type variable, e.g:
1619
1620        void* malloc( size_t )
1621                => forall(dtype T0) T0* malloc( size_t )
1622        void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1623                => forall(dtype T0, dtype T1, dtype T2)
1624                   void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )
1625
1626This would even allow us to leverage some of Cforall's type safety to write
1627better declarations for legacy C API functions, like the following wrapper for
1628`qsort`:
1629
1630        extern "C" { // turns off name-mangling so that this calls the C library
1631                // call-compatible type-safe qsort signature
1632                forall(dtype T)
1633                void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );
1634               
1635                // forbid type-unsafe C signature from resolving
1636                void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1637                        = delete;
1638        }
1639
1640## Appendix E: Features to Add in Resolver Re-write ##
1641* Reference types
1642* Special types for 0 and 1 literals
1643* Expression type for return statement that resolves similarly to ?=?
1644  - This is to get rid of the kludge in the box pass that effectively
1645    re-implements the resolver poorly.
Note: See TracBrowser for help on using the repository browser.