source: doc/working/resolver_design.md @ 55ba733

ADTaaron-thesisarm-ehast-experimentalcleanup-dtorsctordeferred_resndemanglerenumforall-pointer-decaygc_noraiijacob/cs343-translationjenkins-sandboxmemorynew-astnew-ast-unique-exprnew-envno_listpersistent-indexerpthread-emulationqualifiedEnumresolv-newwith_gc
Last change on this file since 55ba733 was d5f1cfc, checked in by Aaron Moss <a3moss@…>, 8 years ago

Update design of reference types

  • Property mode set to 100644
File size: 71.3 KB
Line 
1# Thoughts on Resolver Design #
2
3## Conversions ##
4C's implicit "usual arithmetic conversions" define a structure among the
5built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of
6_safe_ widening conversions.
7There is also a set of _explicit_ conversions that are only allowed through a
8cast expression.
9Based on Glen's notes on conversions [1], I propose that safe and unsafe
10conversions be expressed as constructor variants, though I make explicit
11(cast) conversions a constructor variant as well rather than a dedicated
12operator.
13Throughout this article, I will use the following operator names for
14constructors and conversion functions from `From` to `To`:
15
16        void ?{} ( To*, To );            // copy constructor
17        void ?{} ( To*, From );          // explicit constructor
18        void ?{explicit} ( To*, From );  // explicit cast conversion
19        void ?{safe} ( To*, From );      // implicit safe conversion
20        void ?{unsafe} ( To*, From );    // implicit unsafe conversion
21
22[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
23
24Glen's design made no distinction between constructors and unsafe implicit
25conversions; this is elegant, but interacts poorly with tuples.
26Essentially, without making this distinction, a constructor like the following
27would add an interpretation of any two `int`s as a `Coord`, needlessly
28multiplying the space of possible interpretations of all functions:
29
30        void ?{}( Coord *this, int x, int y );
31
32That said, it would certainly be possible to make a multiple-argument implicit
33conversion, as below, though the argument above suggests this option should be
34used infrequently:
35
36        void ?{unsafe}( Coord *this, int x, int y );
37
38An alternate possibility would be to only count two-arg constructors
39`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and 
40explicit conversions should also have a compiler-enforced restriction to
41ensure that they are two-arg functions (this restriction may be valuable
42regardless).
43
44### Constructor Idiom ###
45Basing our notion of conversions off otherwise normal Cforall functions means
46that we can use the full range of Cforall features for conversions, including
47polymorphism.
48Glen [1] defines a _constructor idiom_ that can be used to create chains of
49safe conversions without duplicating code; given a type `Safe` which members
50of another type `From` can be directly converted to, the constructor idiom
51allows us to write a conversion for any type `To` which `Safe` converts to:
52
53        forall(otype To | { void ?{safe}( To*, Safe ) })
54        void ?{safe}( To *this, From that ) {
55                Safe tmp = /* some expression involving that */;
56                *this = tmp; // uses assertion parameter
57        }
58
59This idiom can also be used with only minor variations for a parallel set of
60unsafe conversions.
61
62What selective non-use of the constructor idiom gives us is the ability to
63define a conversion that may only be the *last* conversion in a chain of such.
64Constructing a conversion graph able to unambiguously represent the full
65hierarchy of implicit conversions in C is provably impossible using only
66single-step conversions with no additional information (see Appendix B), but
67this mechanism is sufficiently powerful (see [1], though the design there has
68some minor bugs; the general idea is to use the constructor idiom to define
69two chains of conversions, one among the signed integral types, another among
70the unsigned, and to use monomorphic conversions to allow conversions between
71signed and unsigned integer types).   
72
73### Implementation Details ###
74It is desirable to have a system which can be efficiently implemented, yet
75also to have one which has sufficient power to distinguish between functions
76on all possible axes of polymorphism.
77This ordering may be a partial order, which may complicate implementation
78somewhat; in this case it may be desirable to store the set of implementations
79for a given function as the directed acyclic graph (DAG) representing the
80order.
81
82## Conversion Costs ##
83Each possible resolution of an expression has a _cost_ consisting of four
84integer components: _unsafe_ conversion cost, _polymorphic_ specialization
85cost, _safe_ conversion cost, and a _count_ of conversions.
86These components form a lexically-ordered tuple which can be summed
87element-wise; summation starts at `(0, 0, 0, 0)`.
88
89### Lvalue and Qualifier Conversions ###
90C defines the notion of a _lvalue_, essentially an addressable object, as well
91as a number of type _qualifiers_, `const`, `volatile`, and `restrict`.
92As these type qualifiers are generally only meaningful to the type system as
93applied to lvalues, the two concepts are closely related.
94A const lvalue cannot be modified, the compiler cannot assume that a volatile
95lvalue will not be concurrently modified by some other part of the system, and
96a restrict lvalue must have pointer type, and the compiler may assume that no
97other pointer in scope aliases that pointer (this is solely a performance
98optimization, and may be ignored by implementers).
99_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts
100it to an expression result of type `T` (commonly called an _rvalue_ of type
101`T`) also strips all the qualifiers from the lvalue, as an expression result
102is a value, not an addressable object that can have properties like
103immutability.
104Though lvalue-to-rvalue conversion strips the qualifiers from lvalues,
105derived rvalue types such as pointer types may include qualifiers;
106`const int *` is a distinct type from `int *`, though the latter is safely
107convertable to the former.
108In general, any number of qualifiers can be safely added to the
109pointed-to-type of a pointer type, e.g. `int *` converts safely to
110`const int *` and `volatile int *`, both of which convert safely to
111`const volatile int *`.
112
113Since lvalues are precicely "addressable objects", in C, only lvalues can be
114used as the operand of the `&` address-of operator.
115Similarly, only modifiable lvalues may be used as the assigned-to
116operand of the mutating operators: assignment, compound assignment
117(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without
118the `const` qualifier are modifiable, but lvalues of incomplete types, array
119types, and struct or union types with const members are also not modifiable.
120Lvalues are produced by the following expressions: object identifiers
121(function identifiers are not considered to be lvalues), the result of the `*` 
122dereference operator applied to an object pointer, the result of a member
123expression `s.f` if the left argument `s` is an lvalue (note that the
124preceding two rules imply that the result of indirect member expressions
125`s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the
126indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`).
127Somewhat less obviously, parenthesized lvalue expressions, string literals,
128and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.
129
130All of the conversions described above are defined in standard C, but Cforall
131requires further features from its type system.
132In particular, to allow overloading of the `*?` and `?[?]` dereferencing and
133indexing operators, Cforall requires a way to declare that the functions
134defining these operators return lvalues, and since C functions never return
135lvalues and for syntactic reasons we wish to distinguish functions which
136return lvalues from functions which return pointers, this is of necessity an
137extension to standard C.
138In the current design, an `lvalue` qualifier can be added to function return
139types (and only to function return types), the effect of which is to return a
140pointer which is implicitly dereferenced by the caller.
141C++ includes the more general concept of _references_, which are typically
142implemented as implicitly dereferenced pointers as well.
143Another use case which C++ references support is providing a way to pass
144function parameters by reference (rather than by value) with a natural
145syntax; Cforall in its current state has no such mechanism.
146As an example, consider the following (currently typical) copy-constructor
147signature and call:
148
149        void ?{}(T *lhs, T rhs);
150       
151        T x;
152        T y = { x };
153
154Note that the right-hand argument is passed by value, and would in fact be
155copied twice in the course of the constructor call `T y = { x };` (once into
156the parameter by C's standard `memcpy` semantics, once again in the body of
157the copy constructor, though it is possible that return value optimization
158will elide the `memcpy`-style copy).
159However, to pass by reference using the existing pointer syntax, the example
160above would look like this:
161
162        void ?{}(T *lhs, const T *rhs);
163       
164        T x;
165        T y = { &x };
166
167This example is not even as bad as it could be; assuming pass-by-reference is
168the desired semantics for the `?+?` operator, that implies the following
169design today:
170
171        T ?+?(const T *lhs, const T *rhs);
172       
173        T a, b;
174        T c = &a + &b,
175
176In addition to `&a + &b` being unsightly and confusing syntax to add `a` and
177`b`, it also introduces a possible ambiguity with pointer arithmetic on `T*`
178which can only be resolved by return-type inference.
179
180Pass-by-reference and marking functions as returning lvalues instead of the
181usual rvalues are actually closely related concepts, as obtaining a reference
182to pass depends on the referenced object being addressable, i.e. an lvalue,
183and lvalue return types are effectively return-by-reference.
184Cforall should also unify the concepts, with a parameterized type for
185"reference to `T`", which I will write `ref T`.
186Syntax bikeshedding can be done later (there are some examples at the bottom
187of this section), but `ref T` is sufficiently distinct from both the existing
188`lvalue T` (which it subsumes) and the closely related C++ `T&` to allow
189independent discussion of its semantics.
190
191Firstly, assignment to a function parameter as part of a function call and
192local variable initialization have almost identical semantics, so should be
193treated similarly for the reference type too; this implies we should be able
194to declare local variables of reference type, as in the following:
195
196        int x = 42;
197        ref int r = x; // r is now an alias for x
198
199Unlike in C++, we would like to have the capability to re-bind references
200after initialization, as this allows the attractive syntax of references to
201support some further useful code patterns, such as first initializing a
202reference after its declaration.
203Constant references to `T` (`const ref T`) should not be re-bindable.
204
205One option for re-binding references is to use a dedicated operator, as in the
206code example below:
207
208        int i = 42, j = 7;
209        ref int r = i; // bind r to i
210        r = j;         // set i (== r) to 7
211        r := j;        // rebind r to j using the new := rebind operator
212        i = 42;        // reset i (!= r) to 42
213        assert( r == 7 );
214
215The other syntactic option for reference re-bind would be to overload
216assignment and use type inference on the left and right-hand sides to
217determine whether the referred-to variable on the left should be reassigned to
218the value on the right, or if the reference on the left should be aliased to
219the reference on the right.
220This could be disambiguated with casts, as in the following code example:
221
222        int i
223        int j;
224        ref int r = i;   // (0a)
225        ref int s = i;   // (0b)
226       
227        i = j;           // (1)
228        i = (int)s;      // (2)
229        i = s;           // (3)
230        // ---------------------
231        r = s;           // (4)
232        r = (ref int)j;  // (5)
233        // ---------------------
234        r = j;           // (6)
235        r = (int)s;      // (7)
236
237By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as
238aliases for `i`.
239For C compatibility, (1) has to be assignment; in general, any assignment to a
240non-reference type should be assignment, so (2) and (3) are as well.
241By types, (4) and (5) should have the same semantics, and the semantics of (6)
242and (7) should match as well.
243This suggests that (4) and (5) are reference re-bind, and (6) and (7) are an
244assignment to the referred variable; this makes the syntax to explicitly alias
245a local variable rather ugly (and inconsistent with the initialization
246syntax), as well as making it rather awkward to copy the value stored in one
247reference-type variable into another reference type variable (which is likely
248more painful in functions with by-reference parameters than with local
249variables of reference type).
250
251Because of the aforementioned issues with overloading assignment as reference
252rebind, in addition to the fact that reference rebind should not be a
253user-overloadable operator (unlike assignment), I propose refererence rebind
254should have its own dedicated operator.
255
256The semantics and restrictions of `ref T` are effectively the semantics of an
257lvalue of type `T`, and by this analogy there should be a safe, qualifier
258dropping conversion from `ref const volatile restrict T` (and every other
259qualifier combination on the `T` in `ref T`) to `T`.
260With this conversion, the resolver may type most expressions that C would
261call "lvalue of type `T`" as `ref T`.
262There's also an obvious argument that lvalues of a (possibly-qualified) type
263`T` should be convertable to references of type `T`, where `T` is also
264so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to
265`ref const char`).
266By similar arguments to pointer types, qualifiers should be addable to the
267referred-to type of a reference (e.g. `ref int` to `ref const int`).
268As a note, since pointer arithmetic is explictly not defined on `ref T`,
269`restrict ref T` should be allowable and would have alias-analysis rules that
270are actually comprehensible to mere mortals.
271
272Using pass-by-reference semantics for function calls should not put syntactic
273constraints on how the function is called; particularly, temporary values
274should be able to be passed by reference.
275The mechanism for this pass-by-reference would be to store the value of the
276temporary expression into a new unnamed temporary, and pass the reference of
277that temporary to the function.
278As an example, the following code should all compile and run:
279
280        void f(ref int x) { printf("%d\n", x++); }
281       
282        int i = 7, j = 11;
283        const int answer = 42;
284       
285        f(i);      // (1)
286        f(42);     // (2)
287        f(i + j);  // (3)
288        f(answer); // (4)
289
290The semantics of (1) are just like C++'s, "7" is printed, and `i` has the
291value 8 afterward.
292For (2), "42" is printed, and the increment of the unnamed temporary to 43 is
293not visible to the caller; (3) behaves similarly, printing "19", but not
294changing `i` or `j`.
295(4) is a bit of an interesting case; we want to be able to support named
296constants like `answer` that can be used anywhere the constant expression
297they're replacing (like `42`) could go; in this sense, (4) and (2) should have
298the same semantics.
299However, we don't want the mutation to the `x` parameter to be visible in
300`answer` afterward, because `answer` is a constant, and thus shouldn't change.
301The solution to this is to allow chaining of the two `ref` conversions;
302`answer` has the type `ref const int`, which can be converted to `int` by the
303lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int` 
304by the temporary-producing rvalue-to-lvalue conversion.
305Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42),
306mutated by `f`, then discarded; "42" is printed, just as in case (2), and
307`answer` still equals 42 after the call, because it was the temporary that was
308mutated, not `answer`.
309It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while
310`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in
311C++, leading to the dreaded "const hell"), but the behaviour of this rule can
312be determined by examining local scope with the simple rule "non-`const`
313references to `const` variables produce temporaries", which aligns with
314programmer intuition that `const` variables cannot be mutated.
315
316To bikeshed syntax for `ref T`, there are three basic options: language
317keywords (`lvalue T` is already in Cforall), compiler-supported "special"
318generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++
319programmers).
320Keyword or generic based approaches run the risk of name conflicts with
321existing code, while any sigil used would have to be carefully chosen to not
322create parsing conflicts.
323
324**TODO** Consider arguments for move semantics and see if there is a
325compelling case for rvalue references.
326
327### Conversion Operator Costs ###
328Copy constructors, safe conversions, and unsafe conversions all have an
329associated conversion cost, calculated according to the algorithm below:
330
3311. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
3322. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
3333. Monomoprhic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
3344. Polymorphic conversion operators (or copy constructors) have a conversion
335   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic
336   equivalent and the sum of the conversion costs of all conversion operators
337   passed as assertion parameters, but where the fourth "count" element of the
338   cost tuple is fixed to `1`.
339
340**TODO** Polymorphism cost may need to be reconsidered in the light of the
341thoughts on polymorphism below.
342**TODO** You basically just want path-length in the conversion graph implied
343by the set of conversions; the only tricky question is whether or not you can
344account for "mixed" safe and unsafe conversions used to satisfy polymorphic
345constraints, whether a polymorphic conversion should cost more than a
346monomorphic one, and whether to account for non-conversion constraints in the
347polymorphism cost
348
349### Argument-Parameter Matching ###
350Given a function `f` with an parameter list (after tuple flattening)
351`(T1 t1, T2 t2, ... Tn tn)`, and a function application
352`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the
353appropriate parameter is calculated according to the algorithm below:
354
355Given a parameter `t` of type `T` and an expression `<e>` from these lists,
356`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with
357associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`.
358(If any `Ei` is a tuple type, replace it with its first flattened element for
359the purposes of this section.)
360
361The cost of matching the interpretation of `<e>` with type `Ei` to `t1` with
362type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the
363conversion operator cost from `Ei` to `T`.
364
365### Object Initialization ###
366The cost to initialize an object is calculated very similarly to
367argument-parameter matching, with a few modifications.
368Firstly, explicit constructors are included in the set of available
369conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic
370conversion costs (if applicable) and the _interpretation cost_ of the
371constructor, the sum of the argument-parameter matching costs for its
372parameters.
373Also, ties in overall cost (interpretation cost plus conversion cost) are
374broken by lowest conversion cost (i.e. of alternatives with the same overall
375cost, copy constructors are preferred to other explicit constructors,
376explicit constructors are preferred to safe conversions, which are preferred
377to unsafe conversions).
378An object initialization is properly typed if it has exactly one min-cost
379interpretation.
380
381### Explicit Casts ###
382Explicit casts are handled similarly to object initialization.
383Copy constructors and other explicit constructors are not included in the set
384of possible conversions, though interpreting a cast as type ascription
385(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost
386`(0, 0, 0, 0)`.
387Explicit conversion operators are also included in the set of possible
388conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion
389costs are invoked.
390Unlike for explicit constructors and other functions, implicit conversions are
391never applied to the argument or return type of an explicit cast operator, so
392that the cast may be used more effectively as a method for the user programmer
393to guide type resolution.
394
395## Trait Satisfaction ##
396A _trait_ consists of a list of _type variables_ along with a (possibly empty)
397set of _assertions_ on those variables.
398Assertions can take two forms, _variable assertions_ and the more common
399_function assertions_, as in the following example:
400
401        trait a_trait(otype T, otype S) {
402                T a_variable_assertion;
403                S* another_variable_assertion;
404                S a_function_assertion( T* );
405        };
406
407Variable assertions enforce that a variable with the given name and type
408exists (the type is generally one of the type variables, or derived from one),
409while a function assertion enforces that a function with a
410_compatible signature_ to the provided function exists.
411
412To test if some list of types _satisfy_ the trait, the types are first _bound_ 
413to the type variables, and then declarations to satisfy each assertion are
414sought out.
415Variable assertions require an exact match, because they are passed as object
416pointers, and there is no mechanism to employ conversion functions, while
417function assertions only require a function that can be wrapped to a
418compatible type; for example, the declarations below satisfy
419`a_trait(int, short)`:
420
421        int a_variable_assertion;
422        short* another_variable_assertion;
423        char a_function_assertion( void* );
424        // int* may be implicitly converted to void*, and char to short, so the
425        // above works
426
427Cforall Polymorphic functions have a _constraining trait_, denoted as follows:
428
429        forall(otype A, otype B | some_trait(A, B))
430
431The trait may be anonymous, with the same syntax as a trait declaration, and
432may be unioned together using `|` or `,`.
433
434**TODO** Consider including field assertions in the list of constraint types,
435also associated types and the appropriate matching type assertion.
436
437## Polymorphism Costs ##
438The type resolver should prefer functions that are "less polymorphic" to
439functions that are "more polymorphic".
440Determining how to order functions by degree of polymorphism is somewhat less
441straightforward, though, as there are multiple axes of polymorphism and it is
442not always clear how they compose.
443The natural order for degree of polymorphism is a partial order, and this
444section includes some open questions on whether it is desirable or feasible to
445develop a tie-breaking strategy to impose a total order on the degree of
446polymorphism of functions.
447Helpfully, though, the degree of polymorphism is a property of functions
448rather than function calls, so any complicated graph structure or calculation
449representing a (partial) order over function degree of polymorphism can be
450calculated once and cached.
451
452### Function Parameters ###
453All other things being equal, if a parameter of one function has a concrete
454type and the equivalent parameter of another function has a dynamic type, the
455first function is less polymorphic:
456
457                        void f( int, int );  // (0) least polymorphic
458        forall(otype T) void f( T, int );    // (1a) more polymorphic than (0)
459        forall(otype T) void f( int, T );    // (1b) more polymorphic than (0)
460                                             //      incomparable with (1a)
461        forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b)
462
463This should extend to parameterized types (pointers and generic types) also:
464
465        forall(otype S) struct box { S val; };
466       
467        forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
468        forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
469        forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
470                                               //     incomparable with (3)
471        forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)
472
473Every function in the group above is incomparable with (1a/b), but that's fine
474because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur
475much in practice (unless there are safe or unsafe conversions defined between
476the possible argument types).
477
478For degree of polymorphism from arguments, I think we should not distinguish
479between different type parameters, e.g. the following should be considered
480equally polymorphic:
481
482        forall(otype T, otype S) void f( T, T, S );  // (7)
483        forall(otype T, otype S) void f( S, T, T );  // (8)
484
485However parameter lists are compared, parameters of multi-parameter generic
486types should ideally be treated as a recursive case, e.g. in the example
487below, (9) is less polymorphic than (10), which is less polymorphic than (11):
488
489        forall(otype T, otype S) struct pair { T x; S y; };
490       
491                        void f( pair(int, int) );  // (9)
492        forall(otype T) void f( pair(T, int) );    // (10)
493        forall(otype T) void f( pair(T, T) );      // (11)
494
495Parameter comparison could possibly be made somewhat cheaper at loss of some
496precision by representing each parameter as a value from the natural numbers
497plus infinity, where infinity represents a monomorphic parameter and a finite
498number counts how many levels deep the shallowest type variable is, e.g. where
499`T` is a type variable, `int` would have value infinity, `T` would have value
5000, `T*` would have value 1, `box(T)*` would have value 2, etc.
501Under this scheme, higher values represent less polymorphism.
502This makes the partial order on parameters a total order, so that many of the
503incomparable functions above compare equal, though that is perhaps a virtue.
504It also loses the ability to differentiate between some multi-parameter
505generic types, such as the parameters in (10) and (11), which would both be
506valued 1, losing the polymorphism distinction between them.
507
508A variant of the above scheme would be to fix a maximum depth of polymorphic
509type variables (16 seems like a reasonable choice) at which a parameter would
510be considered to be effectively monomorphic, and to subtract the value 
511described above from that maximum, clamping the result to a minimum of 0.
512Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has
513value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0,
514the same as `int`.
515This can be quite succinctly represented, and summed without the presence of a
516single monomorphic parameter pushing the result to infinity, but does lose the
517ability to distinguish between very deeply structured polymorphic types.
518
519### Parameter Lists ###
520A partial order on function parameter lists can be produced by the
521product order of the partial orders on parameters described above.
522In more detail, this means that for two parameter lists with the same arity,
523if any pair of corresponding parameters are incomparable with respect to each
524other, the two parameter lists are incomparable; if in all pairs of
525corresponding parameters one list's parameter is always (less than or) equal
526to the other list's parameter than the first parameter list is (less than or)
527equal to the second parameter list; otherwise the lists are incomparable with
528respect to each other.
529
530How to compare parameter lists of different arity is a somewhat open question.
531A simple, but perhaps somewhat unsatisfying, solution would be just to say
532that such lists are incomparable.
533The simplist approach to make them comparable is to say that, given two lists
534`(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists
535can be compared based on their shared prefix of `n` types.
536This approach breaks the transitivity property of the equivalence relation on
537the partial order, though, as seen below:
538
539        forall(otype T) void f( T, int );       // (1a)
540        forall(otype T) void f( T, int, int );  // (12)
541        forall(otype T) void f( T, int, T );    // (13)
542
543By this rule, (1a) is equally polymorphic to both (12) and (13), so by
544transitivity (12) and (13) should also be equally polymorphic, but that is not
545actually the case.
546
547We can fix the rule by saying that `(T1 ... Tn)` can be compared to
548`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting
549notional monomorphic parameters.
550In this case, (1a) and (12) are equally polymorphic, because (1a) gets
551extended with a monomorphic type that compares equal to (12)'s third `int` 
552parameter, but (1a) is less polymorphic than (13), because its notional
553monomorphic third parameter is less polymorphic than (13)'s `T`.
554Essentially what this rule says is that any parameter list with more
555parameters is no less polymorphic than one with fewer.
556
557We could collapse this parameter list ordering to a succinct total order by
558simply taking the sum of the clamped parameter polymorphism counts, but this
559would again make most incomparable parameter lists compare equal, as well as
560having the potential for some unexpected results based on the (completely
561arbitrary) value chosen for "completely polymorphic".
562For instance, if we set 4 to be the maximum depth of polymorphism (as above),
563the following functions would be equally polymorphic, which is a somewhat
564unexpected result:
565
566        forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12
567        forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12
568
569These functions would also be considered equally polymorphic:
570
571        forall(otype T) void g( T, int );    // 4 + 0 = 4;
572        forall(otype T) void g( T**, T** );  // 2 + 2 = 4;
573
574This issue can be mitigated by choosing a larger maximum depth of
575polymorphism, but this scheme does have the distinct disadvantage of either
576specifying the (completely arbitrary) maximum depth as part of the language or
577allowing the compiler to refuse to accept otherwise well-typed deeply-nested
578polymorphic types. 
579
580For purposes of determining polymorphism, the list of return types of a
581function should be treated like another parameter list, and combined with the
582degree of polymorphism from the parameter list in the same way that the
583parameters in the parameter list are combined.
584For instance, in the following, (14) is less polymorphic than (15) which is
585less polymorphic than (16):
586
587        forall(otype T) int f( T );  // (14)
588        forall(otype T) T*  f( T );  // (15)
589        forall(otype T) T   f( T );  // (16)
590
591### Type Variables and Bounds ###
592Degree of polymorphism doesn't solely depend on the parameter lists, though.
593Glen's thesis (4.4.4, p.89) gives an example that shows that it also depends
594on the number of type variables as well:
595
596        forall(otype T)          void f( T, int );  // (1a) polymorphic
597        forall(otype T)          void f( T, T );    // (2)  more polymorphic
598        forall(otype T, otype S) void f( T, S );    // (17) most polymorphic
599
600Clearly the `forall` type parameter list needs to factor into calculation of
601degree of polymorphism as well, as it's the only real differentiation between
602(2) and (17).
603The simplest way to include the type parameter list would be to simply count
604the type variables and say that functions with more type variables are more
605polymorphic.
606
607However, it also seems natural that more-constrained type variables should be
608counted as "less polymorphic" than less-constrained type variables.
609This would allow our resolver to pick more specialized (and presumably more
610efficient) implementations of functions where one exists.
611For example:
612
613        forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic
614        forall(otype T)                  T f( T );  // (16) more polymorphic
615
616We could account for this by counting the number of unique constraints and
617saying that functions with more constraints are less polymorphic.
618
619That said, we do model the `forall` constraint list as a (possibly anonymous)
620_trait_, and say that each trait is a set of constraints, so we could
621presumably define a partial order over traits based on subset inclusion, and
622use this partial order instead of the weaker count of constraints to order the
623list of type parameters of a function, as below:
624
625        trait has_g(otype T) { void g(T); };
626        trait has_h(otype S) { void h(T); };
627        trait has_gh(otype R | has_g(R) | has_h(R)) {};
628        // has_gh is equivlent to { void g(R); void h(R); }
629       
630        forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic
631        forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19)
632        forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19)
633                                               //       incomparable with (18)
634        forall(otype T)             T f( T );  // (16) most polymorphic
635
636The tricky bit with this is figuring out how to compare the constraint
637functions for equality up to type variable renaming; I suspect there's a known
638solution, but don't know what it is (perhaps some sort of unification
639calculation, though I hope there's a more lightweight option).
640We also should be able to take advantage of the programmer-provided trait
641subset information (like the constraint on `has_gh` in the example) to more
642efficiently generate the partial-order graph for traits, which should be able
643to be cached for efficiency.
644
645Combining count of type variables with the (partial) order on the trait
646constraining those variables seems like it should be a fairly straightforward
647product ordering to me - one `forall` qualifier is (less than or) equal to
648another if it has both a (less than or) equal number of type variables and a
649(less than or) equal degree of polymorphism from its constraining trait; the
650two qualifiers are incomparable otherwise.
651If an easier-to-calculate total ordering is desired, it might be acceptable to
652use the number of type variables, with ties broken by number of constraints.
653
654Similarly, to combine the (partial) orders on parameter and return lists with
655the (partial) order on `forall` qualifiers, a product ordering seems like the
656reasonable choice, though if we wanted a total order a reasonable choice would
657be to use whatever method we use to combine parameter costs into parameter
658lists to combine the costs for the parameter and return lists, then break ties
659by the order on the `forall` qualifiers.
660
661## Expression Costs ##
662
663### Variable Expressions ###
664Variables may be overloaded; that is, there may be multiple distinct variables
665with the same name so long as each variable has a distinct type.
666The variable expression `x` has one zero-cost interpretation as type `T` for
667each variable `T x` in scope.
668
669### Member Selection Expressions ###
670For every interpretation `I` of `e` which has a struct or union type `S`,
671`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the
672same cost as `I`.
673Note that there may be more than one member of `S` with the same name, as per
674Cforall's usual overloading rules.
675The indirect member expression `e->y` is desugared to `(*e).y` and interpreted
676analogously.
677
678**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no
679interpretations as `e.y` exist.
680
681### Address Expressions ###
682Address expressions `&e` have an interpretation for each interpretation `I` of
683`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`.
684Lvalues result from variable expressions, member selection expressions, or
685application of functions returning an lvalue-qualified type.
686Note that the dereference operator is overloadable, so the rules for its
687resolution follow those for function application below.
688
689**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a
690new temporary holding `42` and takes its address.
691
692### Boolean-context Expressions ###
693C has a number of "boolean contexts", where expressions are assigned a truth
694value; these include both arguments to the short-circuiting `&&` and `||` 
695operators, as well as the conditional expressions in `if` and `while` 
696statements, the middle expression in `for` statements, and the first argument
697to the `?:` ternary conditional operator.
698In all these contexts, C interprets `0` (which is both an integer and a null
699pointer literal) as false, and all other integer or pointer values as true.
700In this spirit, Cforall allows other types to be considered "truthy" if they
701support the following de-sugaring in a conditional context (see notes on
702interpretation of literal `0` below):
703
704        x => ((int)( x != 0 ))
705
706### Literal Expressions ###
707Literal expressions (e.g. 42, 'c', 3.14, "Hello, world!") have one
708zero-cost interpretation with the same type the expression would have in C,
709with three exceptions:
710
711Character literals like 'x' are typed as `char` in Cforall, not `int` as in C.
712This change breaks very little C code (primarily `sizeof 'x'`; the implicit
713conversion from `int` to `char` and lack of overloading handle most other
714expressions), matches the behaviour of C++, and is more compatible with
715programmer intuition.
716
717The literals `0` and `1` are also treated specially by Cforall, due to their
718potential uses in operator overloading.
719Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
720multiple interpretations of them according to the existing variable
721overloading rules, with the following declarations in the prelude:
722
723        const int 0, 1;
724        forall ( dtype DT ) const DT * const    0;
725        forall ( ftype FT ) FT * const          0;
726
727This did, however, create some backward-compatibility problems and potential
728performance issues, and works poorly for generic types. To start with, this
729(entirely legal C) code snippet doesn't compile in Cforall:
730
731        if ( 0 ) {}
732
733It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
734`forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
735not choose which `0` variable to take, because they're both exact matches.
736
737The general != computation may also be less efficient than a check for a zero
738value; take the following example of a rational type:
739
740        struct rational { int32_t num, int32_t den };
741        rational 0 = { 0, 1 };
742       
743        int ?!=? (rational a, rational b) {
744                return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
745        }
746       
747        int not_zero (rational a) { return a.num != 0; }
748
749To check if two rationals are equal we need to do a pair of multiplications to
750normalize them (the casts in the example are to prevent overflow), but to
751check if a rational is non-zero we just need to check its numerator, a more
752efficient operation.
753
754Finally, though polymorphic null-pointer variables can be meaningfully
755defined, most other polymorphic variables cannot be, which makes it difficult
756to make generic types "truthy" using the existing system:
757
758        forall(otype T) struct pair { T x; T y; };
759        forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
760
761Now, it seems natural enough to want to define the zero for this pair type as
762a pair of the zero values of its element type (if they're defined).
763The declaration of `pair(T) 0` above is actually illegal though, as there is
764no way to represent the zero values of an infinite number of types in the
765single memory location available for this polymorphic variable - the
766polymorphic null-pointer variables defined in the prelude are legal, but that
767is only because all pointers are the same size and the single zero value is a
768legal value of all pointer types simultaneously; null pointer is, however,
769somewhat unique in this respect.
770
771The technical explanation for the problems with polymorphic zero is that `0` 
772is really a rvalue, not a lvalue - an expression, not an object.
773Drawing from this, the solution we propose is to give `0` a new built-in type,
774`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in
775type `_unit_t`.
776If the prelude defines != over `_zero_t` this solves the `if ( 0 )` problem,
777because now the unambiguous best interpretation of `0 != 0` is to read them
778both as `_zero_t` (and say that this expression is false).
779Backwards compatibility with C can be served by defining conversions in the
780prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer
781types, as below:
782
783        // int 0;
784        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
785        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);
786       
787        // int 1;
788        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
789        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);
790       
791        // forall(dtype DT) const DT* 0;
792        forall(dtype DT) void ?{safe}(const DT**, _zero_t);
793        // forall(ftype FT) FT* 0;
794        forall(ftype FT) void ?{safe}(FT**, _zero_t);
795
796Further, with this change, instead of making `0` and `1` overloadable
797variables, we can instead allow user-defined constructors (or, more flexibly,
798safe conversions) from `_zero_t`, as below:
799
800        // rational 0 = { 0, 1 };
801        void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }
802
803Note that we don't need to name the `_zero_t` parameter to this constructor,
804because its only possible value is a literal zero.
805This one line allows `0` to be used anywhere a `rational` is required, as well
806as enabling the same use of rationals in boolean contexts as above (by
807interpreting the `0` in the desguraring to be a rational by this conversion).
808Furthermore, while defining a conversion function from literal zero to
809`rational` makes rational a "truthy" type able to be used in a boolean
810context, we can optionally further optimize the truth decision on rationals as
811follows:
812
813        int ?!=? (rational a, _zero_t) { return a.num != 0; }
814
815This comparison function will be chosen in preference to the more general
816rational comparison function for comparisons against literal zero (like in
817boolean contexts) because it doesn't require a conversion on the `0` argument.
818Functions of the form `int ?!=? (T, _zero_t)` can acutally be used in general
819to make a type `T` truthy without making `0` a value which can convert to that
820type, a capability not available in the current design.
821
822This design also solves the problem of polymorphic zero for generic types, as
823in the following example:
824
825        // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
826        forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
827                this->x = 0; this->y = 0;
828        }
829
830The polymorphic variable declaration didn't work, but this constructor is
831perfectly legal and has the desired semantics.
832
833We can assert that `T` can be used in a boolean context as follows:
834
835        `forall(otype T | { int ?!=?(T, _zero_t); })`
836 
837Since the C standard (6.5.16.1.1) specifically states that pointers can be
838assigned into `_Bool` variables (and implies that other artithmetic types can
839be assigned into `_Bool` variables), it seems natural to say that assignment
840into a `_Bool` variable effectively constitutes a boolean context.
841To allow this interpretation, I propose including the following function (or
842its effective equivalent) in the prelude:
843
844        forall(otype T | { int ?!=?(T, _zero_t); })
845        void ?{safe}( _Bool *this, T that ) { *this = that != 0; }
846
847Note that this conversion is not transitive; that is, for `t` a variable of
848some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
849of a lower-cost one), `(int)t;` would not use this conversion (and in fact
850would not be legal in the absence of another valid way to convert a `T` to an
851`int`), but `(int)(_Bool)t;` could legally use this conversion.
852
853Similarly giving literal `1` the special type `_unit_t` allows for more
854concise and consistent specification of the increment and decrement operators,
855using the following de-sugaring:
856
857        ++i => i += 1
858        i++ => (tmp = i, i += 1, tmp)
859        --i => i -= 1
860        i-- => (tmp = i, i -= 1, tmp)
861
862In the examples above, `tmp` is a fresh temporary with its type inferred from
863the return type of `i += 1`.
864Under this proposal, defining a conversion from `_unit_t` to `T` and a
865`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
866for free in a consistent fashion (similarly for -= and the decrement
867operators).
868If a meaningful `1` cannot be defined for a type, both increment operators can
869still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`.
870Similarly, if scalar addition can be performed on a type more efficiently than
871by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
872addition operator, it will simultaneously define consistent implementations of
873both increment operators (this can also be accomplished by defining a
874conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
875
876To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
877increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`,
878we also define a non-transitive unsafe conversion from `_Bool` (allowable
879values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:
880
881        void ?{unsafe} (_unit_t*, _Bool) {}
882
883As a note, the desugaring of post-increment above is possibly even more
884efficient than that of C++ - in C++, the copy to the temporary may be hidden
885in a separately-compiled module where it can't be elided in cases where it is
886not used, whereas this approach for Cforall always gives the compiler the
887opportunity to optimize out the temporary when it is not needed.
888Furthermore, one could imagine a post-increment operator that returned some
889type `T2` that was implicitly convertable to `T` but less work than a full
890copy of `T` to create (this seems like an absurdly niche case) - since the
891type of `tmp` is inferred from the return type of `i += 1`, you could set up
892functions with the following signatures to enable an equivalent pattern in
893Cforall:
894
895        lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
896        void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
897        void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to
898                                      // preserve expected semantics of T x = y++;
899
900**TODO** Look in C spec for literal type interprations.
901**TODO** Write up proposal for wider range of literal types, put in appendix
902
903### Initialization and Cast Expressions ###
904An initialization expression `T x = e` has one interpretation for each
905interpretation `I` of `e` with type `S` which is convertable to `T`.
906The cost of the interpretation is the cost of `I` plus the conversion cost
907from `S` to `T`.
908A cast expression `(T)e` is interpreted as hoisting initialization of a
909temporary variable `T tmp = e` out of the current expression, then replacing
910`(T)e` by the new temporary `tmp`.
911
912### Assignment Expressions ###
913An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then
914interpreted according to the usual rules for function application and comma
915expressions.
916Operator-assignment expressions like `e += f` desugar similarly as
917`(?+=?(&e, f), e)`.
918
919### Function Application Expressions ###
920Every _compatible function_ and satisfying interpretation of its arguments and
921polymorphic variable bindings produces one intepretation for the function
922application expression.
923Broadly speaking, the resolution cost of a function application is the sum of
924the cost of the interpretations of all arguments, the cost of all conversions
925to make those argument interpretations match the parameter types, and the
926binding cost of any of the function's polymorphic type parameters.
927
928**TODO** Work out binding cost in more detail.
929**TODO** Address whether "incomparably polymorphic" should be treated as
930"equally polymorphic" and be disambiguated by count of (safe) conversions.
931**TODO** Think about what polymorphic return types mean in terms of late
932binding.
933**TODO** Consider if "f is less polymorphic than g" can mean exactly "f
934specializes g"; if we don't consider the assertion parameters (except perhaps
935by count) and make polymorphic variables bind exactly (rather than after
936implicit conversions) this should actually be pre-computable.
937**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted
938function declaration is costed out by the resolver in the same way that any
939other function declaration is costed; if the deleted declaration is the unique
940min-cost resolution refuse to type the expression, if it is tied for min-cost
941then take the non-deleted alternative, and of two equivalent-cost deleted
942interpretations with the same return type pick one arbitrarily rather than
943producing an ambiguous resolution. This would also be useful for forbidding
944pointer-to-floating-point explicit conversions (C11, 6.5.4.4).
945**TODO** Cover default parameters, maybe named parameters (see "named
946arguments" thread of 11 March 2016)
947 
948
949### Sizeof, Alignof & Offsetof Expressions ###
950`sizeof`, `alignof`, and `offsetof` expressions have at most a single
951interpretation, of type `size_t`.
952`sizeof` and `alignof` expressions take either a type or an expression as a 
953an argument; if the argument is a type, it must be a complete type which is
954not a function type, if an expression, the expression must have a single
955interpretation, the type of which conforms to the same rules.
956`offsetof` takes two arguments, a type and a member name; the type must be
957a complete structure or union type, and the second argument must name a member
958of that type.
959
960### Comma Expressions ###
961A comma expression `x, y` resolves `x` as if it had been cast to `void`, and
962then, if there is a unique interpretation `I` of `x`, has one interpretation
963for each interpretation `J` of `y` with the same type as `J` costing the sum
964of the costs of `I` and `J`.
965
966### Index Expressions ###
967**TODO** Consider adding polymorphic function in prelude for this, as per
9686.5.2.1.2 in the C standard:
969
970        forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); })
971        lvalue E ?[?](T a, I i) { return *(a + i); }
972
973I think this isn't actually a good idea, because the cases for it are niche,
974mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a
975pointer `p`, and adding it to the prelude would slow down resolution of
976every index expression just a bit. Our existing prelude includes an indexing
977operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified
978variants, which should satisfy C source-compatibility without propegating this
979silly desugaring further.
980
981#### Compatible Functions ####
982**TODO** This subsection is very much a work in progress and has multiple open
983design questions.
984
985A _compatible function_ for an application expression is a visible function
986declaration with the same name as the application expression and parameter
987types that can be converted to from the argument types.
988Function pointers and variables of types with the `?()` function call operator
989overloaded may also serve as function declarations for purposes of
990compatibility.
991
992For monomorphic parameters of a function declaration, the declaration is a
993compatible function if there is an argument interpretation that is either an
994exact match, or has a safe or unsafe implicit conversion that can be used to
995reach the parameter type; for example:
996
997        void f(int);
998       
999        f(42);        // compatible; exact match to int type
1000        f('x');       // compatible; safe conversion from char => int
1001        f(3.14);      // compatible; unsafe conversion from double => int
1002        f((void*)0);  // not compatible; no implicit conversion from void* => int
1003
1004Per Richard[*], function assertion satisfaction involves recursively searching
1005for compatible functions, not an exact match on the function types (I don't
1006believe the current Cforall resolver handles this properly); to extend the
1007previous example:
1008
1009        forall(otype T | { void f(T); }) void g(T);
1010       
1011        g(42);    // binds T = int, takes f(int) by exact match
1012        g('x');   // binds T = char, takes f(int) by conversion
1013        g(3.14);  // binds T = double, takes f(int) by conversion
1014
1015[*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the
1016accessible scopes for definitions corresponding to assertion names, and
1017choosing the ones whose types correspond *most closely* to the assertion
1018types." (emphasis mine)
1019
1020There are three approaches we could take to binding type variables: type
1021variables must bind to argument types exactly, each type variable must bind
1022exactly to at least one argument, or type variables may bind to any type which
1023all corresponding arguments can implicitly convert to; I'll provide some
1024possible motivation for each approach.
1025
1026There are two main arguments for the more restrictive binding schemes; the
1027first is that the built-in implicit conversions in C between `void*` and `T*` 
1028for any type `T` can lead to unexpectedly type-unsafe behaviour in a more
1029permissive binding scheme, for example:
1030
1031        forall(dtype T) T* id(T *p) { return p; }
1032       
1033        int main() {
1034                int *p = 0;
1035                char *c = id(p);
1036        }
1037
1038This code compiles in CFA today, and it works because the extra function
1039wrapper `id` provides a level of indirection that allows the non-chaining
1040implicit conversions from `int*` => `void*` and `void*` => `char*` to chain.
1041The resolver types the last line with `T` bound to `void` as follows:
1042
1043        char *c = (char*)id( (void*)p );
1044
1045It has been suggested that making the implicit conversions to and from `void*` 
1046explicit in Cforall code (as in C++) would solve this particular problem, and
1047provide enough other type-safety benefits to outweigh the source-compatibility
1048break with C; see Appendix D for further details.
1049
1050The second argument for a more constrained binding scheme is performance;
1051trait assertions need to be checked after the type variables are bound, and
1052adding more possible values for the type variables should correspond to a
1053linear increase in runtime of the resolver per type variable.
1054There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of
1055them is implicitly convertable to any other; if we allow unrestricted binding
1056of type variables, a common `int` variable (or literal) used in the position
1057of a polymorphic variable parameter would cause a 20x increase in the amount
1058of time needed to check trait resolution for that interpretation.
1059These numbers have yet to be emprically substantiated, but the theory is
1060reasonable, and given that much of the impetus for re-writing the resolver is
1061due to its poor performance, I think this is a compelling argument.
1062
1063I would also mention that a constrained binding scheme is practical; the most
1064common type of assertion is a function assertion, and, as mentioned above,
1065those assertions should be able to be implicitly converted to to match.
1066Thus, in the example above with `g(T)`, where the assertion is `void f(T)`,
1067we first bind `T = int` or `T = char` or `T = double`, then substitute the
1068binding into the assertion, yielding assertions of `void f(int)`,
1069`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy
1070these assertions to complete the binding.
1071Though in all three cases, the existing function with signature `void f(int)` 
1072satisfies this assertion, the checking work cannot easily be re-used between
1073variable bindings, because there may be better or worse matches depending on
1074the specifics of the binding.
1075
1076The main argument for a more flexible binding scheme is that the binding
1077abstraction can actually cause a wrapped function call that would work to
1078cease to resolve, as below:
1079
1080        forall(otype T | { T ?+? (T, T) })
1081        T add(T x, T y) { return x + y; }
1082       
1083        int main() {
1084                int i, j = 2;
1085                short r, s = 3;
1086                i = add(j, s);
1087                r = add(s, j);
1088        }
1089
1090Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and
1091in both cases the short `s` is promoted to `int` to match `j`.
1092If, on the other hand, we demand that variables exactly match type variables,
1093neither call to `add` will compile, because it is impossible to simultaneously
1094bind `T` to both `int` and `short` (C++ has a similar restriction on template
1095variable inferencing).
1096One alternative that enables this case, while still limiting the possible
1097type variable bindings is to say that at least one argument must bind to its
1098type parameter exactly.
1099In this case, both calls to `add` would have the set `{ T = int, T = short }` 
1100for candidate bindings, and would have to check both, as well as checking that
1101`short` could convert to `int` or vice-versa.
1102
1103It is worth noting here that parameterized types generally bind their type
1104parameters exactly anyway, so these "restrictive" semantics only restrict a
1105small minority of calls; for instance, in the example following, there isn't a
1106sensible way to type the call to `ptr-add`:
1107
1108        forall(otype T | { T ?+?(T, T) })
1109        void ptr-add( T* rtn, T* x, T* y ) {
1110                *rtn = *x + *y;
1111        }
1112       
1113        int main() {
1114                int i, j = 2;
1115                short s = 3;
1116                ptr-add(&i, &j, &s); // ERROR &s is not an int*
1117        }
1118
1119I think there is some value in providing library authors with the
1120capability to express "these two parameter types must match exactly".
1121This can be done without restricting the language's expressivity, as the `add` 
1122case above can be made to work under the strictest type variable binding
1123semantics with any addition operator in the system by changing its signature
1124as follows:
1125
1126        forall( otype T, otype R, otype S | { R ?+?(T, S); } )
1127        R add(T x, S y) { return x + y; }
1128
1129Now, it is somewhat unfortunate that the most general version here is more
1130verbose (and thus that the path of least resistence would be more restrictive
1131library code); however, the breaking case in the example code above is a bit
1132odd anyway - explicitly taking two variables of distinct types and relying on
1133C's implicit conversions to do the right thing is somewhat bad form,
1134especially where signed/unsigned conversions are concerned.
1135I think the more common case for implicit conversions is the following,
1136though, where the conversion is used on a literal:
1137
1138        short x = 40;
1139        short y = add(x, 2);
1140
1141One option to handle just this case would be to make literals implicitly
1142convertable to match polymorphic type variables, but only literals.
1143The example above would actually behave slightly differently than `x + 2` in
1144C, though, casting the `2` down to `short` rather than the `x` up to `int`, a
1145possible demerit of this scheme.
1146
1147The other question to ask would be which conversions would be allowed for
1148literals; it seems rather odd to allow down-casting `42ull` to `char`, when
1149the programmer has explicitly specified by the suffix that it's an unsigned
1150long.
1151Type interpretations of literals in C are rather complex (see [1]), but one
1152reasonable approach would be to say that un-suffixed integer literals could be
1153interpreted as any type convertable from int, "u" suffixed literals could be
1154interpreted as any type convertable from "unsigned int" except the signed
1155integer types, and "l" or "ll" suffixed literals could only be interpreted as
1156`long` or `long long`, respectively (or possibly that the "u" suffix excludes
1157the signed types, while the "l" suffix excludes the types smaller than
1158`long int`, as in [1]).
1159Similarly, unsuffixed floating-point literals could be interpreted as `float`,
1160`double` or `long double`, but "f" or "l" suffixed floating-point literals
1161could only be interpreted as `float` or `long double`, respectively.
1162I would like to specify that character literals can only be interpreted as
1163`char`, but the wide-character variants and the C practice of typing character
1164literals as `int` means that would likely break code, so character literals
1165should be able to take any integer type.
1166
1167[1] http://en.cppreference.com/w/c/language/integer_constant
1168
1169With the possible exception of the `add` case above, implicit conversions to
1170the function types of assertions can handle most of the expected behaviour
1171from C.
1172However, implicit conversions cannot be applied to match variable assertions,
1173as in the following example:
1174
1175        forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
1176        T clamp_sum( T x, T y ) {
1177                T sum = x + y;
1178                if ( sum < min ) return min;
1179                if ( max < sum ) return max;
1180                return sum;
1181        }
1182       
1183        char min = 'A';
1184        double max = 100.0;
1185        //int val = clamp_sum( 'X', 3.14 );  // ERROR (1)
1186       
1187        char max = 'Z'
1188        char val = clamp_sum( 'X', 3.14 ); // MATCH (2)
1189        double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3)
1190
1191In this example, the call to `clamp_sum` at (1) doesn't compile, because even
1192though there are compatible `min` and `max` variables of types `char` and
1193`double`, they need to have the same type to match the constraint, and they
1194don't.
1195The (2) example does compile, but with a behaviour that might be a bit
1196unexpected given the "usual arithmetic conversions", in that both values are
1197narrowed to `char` to match the `min` and `max` constraints, rather than
1198widened to `double` as is usual for mis-matched arguments to +.
1199The (3) example is the only case discussed here that would require the most
1200permisive type binding semantics - here, `T` is bound to `char`, to match the
1201constraints, and both the parameters are narrowed from `double` to `char` 
1202before the call, which would not be allowed under either of the more
1203restrictive binding semantics.
1204However, the behaviour here is unexpected to the programmer, because the
1205return value will be `(double)'A' /* == 60.0 */` due to the conversions,
1206rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.
1207
1208Personally, I think that implicit conversions are not a particularly good
1209language design, and that the use-cases for them can be better handled with
1210less powerful features (e.g. more versatile rules for typing constant
1211expressions).
1212However, though we do need implicit conversions in monomorphic code for C
1213compatibility, I'm in favour of restricting their usage in polymorphic code,
1214both to give programmers some stronger tools to express their intent and to
1215shrink the search space for the resolver.
1216Of the possible binding semantics I've discussed, I'm in favour of forcing
1217polymorphic type variables to bind exactly, though I could be talked into
1218allowing literal expressions to have more flexibility in their bindings, or
1219possibly loosening "type variables bind exactly" to "type variables bind
1220exactly at least once"; I think the unrestricted combination of implicit
1221conversions and polymorphic type variable binding unneccesarily multiplies the
1222space of possible function resolutions, and that the added resolution options
1223are mostly unexpected and therefore confusing and not useful to user
1224programmers.
1225
1226## Appendix A: Partial and Total Orders ##
1227The `<=` relation on integers is a commonly known _total order_, and
1228intuitions based on how it works generally apply well to other total orders.
1229Formally, a total order is some binary relation `<=` over a set `S` such that
1230for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and
1231`b` must be equal, the _antisymmetry_ property); total orders also have a
1232_transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`.
1233If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
1234 
1235A _partial order_ is a generalization of this concept where the `<=` relation
1236is not required to be defined over all pairs of elements in `S` (though there
1237is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other
1238words, it is possible for two elements `a` and `b` of `S` to be
1239_incomparable_, unable to be ordered with respect to one another (any `a` and
1240`b` for which either `a <= b` or `b <= a` are called _comparable_).
1241Antisymmetry and transitivity are also required for a partial order, so all
1242total orders are also partial orders by definition.
1243One fairly natural partial order is the "subset of" relation over sets from
1244the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are
1245both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the
1246other - they are incomparable under this relation.
1247
1248We can compose two (or more) partial orders to produce a new partial order on
1249tuples drawn from both (or all the) sets.
1250For example, given `a` and `c` from set `S` and `b` and `d` from set `R`,
1251where both `S` and `R` both have partial orders defined on them, we can define
1252a ordering relation between `(a, b)` and `(c, d)`.
1253One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff
1254`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by
1255the first set and "breaking ties" by the second set.
1256Another common order is the _product order_, which can be roughly thought of
1257as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
1258iff `a <= c` and `b <= d`.
1259One difference between the lexicographical order and the product order is that
1260in the lexicographical order if both `a` and `c` and `b` and `d` are
1261comparable then `(a, b)` and `(c, d)` will be comparable, while in the product
1262order you can have `a <= c` and `d <= b` (both comparable) which will make
1263`(a, b)` and `(c, d)` incomparable.
1264The product order, on the other hand, has the benefit of not prioritizing one
1265order over the other.
1266
1267Any partial order has a natural representation as a directed acyclic graph
1268(DAG).
1269Each element `a` of the set becomes a node of the DAG, with an arc pointing to
1270its _covering_ elements, any element `b` such that `a < b` but where there is
1271no `c` such that `a < c` and `c < b`.
1272Intuitively, the covering elements are the "next ones larger", where you can't
1273fit another element between the two.
1274Under this construction, `a < b` is equivalent to "there is a path from `a` to
1275`b` in the DAG", and the lack of cycles in the directed graph is ensured by
1276the antisymmetry property of the partial order.
1277
1278Partial orders can be generalized to _preorders_ by removing the antisymmetry
1279property.
1280In a preorder the relation is generally called `<~`, and it is possible for
1281two distict elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case
1282we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`.
1283Preorders may also be represented as directed graphs, but in this case the
1284graph may contain cycles.
1285
1286## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
1287The short answer is that it's impossible.
1288
1289The longer answer is that it has to do with what's essentially a diamond
1290inheritance problem.
1291In C, `int` converts to `unsigned int` and also `long` "safely"; both convert
1292to `unsigned long` safely, and it's possible to chain the conversions to
1293convert `int` to `unsigned long`.
1294There are two constraints here; one is that the `int` to `unsigned long` 
1295conversion needs to cost more than the other two (because the types aren't as
1296"close" in a very intuitive fashion), and the other is that the system needs a
1297way to choose which path to take to get to the destination type.
1298Now, a fairly natural solution for this would be to just say "C knows how to
1299convert from `int` to `unsigned long`, so we just put in a direct conversion
1300and make the compiler smart enough to figure out the costs" - given that the
1301users can build an arbitrary graph of conversions, this needs to be handled
1302anyway.
1303We can define a preorder over the types by saying that `a <~ b` if there
1304exists a chain of conversions from `a` to `b`.
1305This preorder corresponds roughly to a more usual type-theoretic concept of
1306subtyping ("if I can convert `a` to `b`, `a` is a more specific type than
1307`b`"); however, since this graph is arbitrary, it may contain cycles, so if
1308there is also a path to convert `b` to `a` they are in some sense equivalently
1309specific.
1310
1311Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and
1312`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`)
1313and this conversion preorder over the destination types `xn` and `ym`.
1314We could define a preorder by taking chain length and breaking ties by the
1315conversion preorder, but this would lead to unexpected behaviour when closing
1316diamonds with an arm length of longer than 1.
1317Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`,
1318`B2->C`, and `A->C`.
1319If we are comparing conversions from `A` to both `B2` and `C`, we expect the
1320conversion to `B2` to be chosen because it's the more specific type under the
1321conversion preorder, but since its chain length is longer than the conversion
1322to `C`, it loses and `C` is chosen.
1323However, taking the conversion preorder and breaking ties or ambiguities by
1324chain length also doesn't work, because of cases like the following example
1325where the transitivity property is broken and we can't find a global maximum:
1326
1327        `X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`
1328
1329In this set of arcs, if we're comparing conversions from `X` to each of `Y2`,
1330`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because
1331there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain
1332length.
1333Also, comparing conversions from `X` to `Z3` and to `W`, we find that the
1334conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder,
1335and so is considered to be the nearer type.
1336By transitivity, then, the conversion from `X` to `Y2` should be cheaper than
1337the conversion from `X` to `W`, but in this case the `X` and `W` are
1338incomparable by the conversion preorder, so the tie is broken by the shorter
1339path from `X` to `W` in favour of `W`, contradicting the transitivity property
1340for this proposed order.
1341
1342Without transitivity, we would need to compare all pairs of conversions, which
1343would be expensive, and possibly not yield a minimal-cost conversion even if
1344all pairs were comparable.
1345In short, this ordering is infeasible, and by extension I believe any ordering
1346composed solely of single-step conversions between types with no further
1347user-supplied information will be insufficiently powerful to express the
1348built-in conversions between C's types.
1349
1350## Appendix C: Proposed Prelude Changes ##
1351**TODO** Port Glen's "Future Work" page for builtin C conversions.
1352**TODO** Move discussion of zero_t, unit_t here.
1353
1354It may be desirable to have some polymorphic wrapper functions in the prelude
1355which provide consistent default implementations of various operators given a
1356definition of one of them.
1357Naturally, users could still provide a monomorphic overload if they wished to
1358make their own code more efficient than the polymorphic wrapper could be, but
1359this would minimize user effort in the common case where the user cannot write
1360a more efficient function, or is willing to trade some runtime efficiency for
1361developer time.
1362As an example, consider the following polymorphic defaults for `+` and `+=`:
1363
1364        forall(otype T | { T ?+?(T, T); })
1365        lvalue T ?+=? (T *a, T b) {
1366                return *a = *a + b;
1367        }
1368       
1369        forall(otype T | { lvalue T ?+=? (T*, T) })
1370        T ?+? (T a, T b) {
1371                T tmp = a;
1372                return tmp += b;
1373        }
1374
1375Both of these have a possibly unneccessary copy (the first in copying the
1376result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in
1377cases where these copies are unavoidable the polymorphic wrappers should be
1378just as performant as the monomorphic equivalents (assuming a compiler
1379sufficiently clever to inline the extra function call), and allow programmers
1380to define a set of related operators with maximal concision.
1381
1382**TODO** Look at what Glen and Richard have already written for this.
1383
1384## Appendix D: Feasibility of Making void* Conversions Explicit ##
1385C allows implicit conversions between `void*` and other pointer types, as per
1386section 6.3.2.3.1 of the standard.
1387Making these implicit conversions explicit in Cforall would provide
1388significant type-safety benefits, and is precedented in C++.
1389A weaker version of this proposal would be to allow implicit conversions to
1390`void*` (as a sort of "top type" for all pointer types), but to make the
1391unsafe conversion from `void*` back to a concrete pointer type an explicit
1392conversion.
1393However, `int *p = malloc( sizeof(int) );` and friends are hugely common
1394in C code, and rely on the unsafe implicit conversion from the `void*` return
1395type of `malloc` to the `int*` type of the variable - obviously it would be
1396too much of a source-compatibility break to disallow this for C code.
1397We do already need to wrap C code in an `extern "C"` block, though, so it is
1398technically feasible to make the `void*` conversions implicit in C but
1399explicit in Cforall.
1400Also, for calling C code with `void*`-based APIs, pointers-to-dtype are
1401calling-convention compatible with `void*`; we could read `void*` in function
1402signatures as essentially a fresh dtype type variable, e.g:
1403
1404        void* malloc( size_t )
1405                => forall(dtype T0) T0* malloc( size_t )
1406        void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1407                => forall(dtype T0, dtype T1, dtype T2)
1408                   void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )
1409
1410This would even allow us to leverage some of Cforall's type safety to write
1411better declarations for legacy C API functions, like the following wrapper for
1412`qsort`:
1413
1414        extern "C" { // turns off name-mangling so that this calls the C library
1415                // call-compatible type-safe qsort signature
1416                forall(dtype T)
1417                void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );
1418               
1419                // forbid type-unsafe C signature from resolving
1420                void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1421                        = delete;
1422        }
Note: See TracBrowser for help on using the repository browser.