source: doc/working/resolver_design.md @ f4bc57c

ADTaaron-thesisarm-ehast-experimentalcleanup-dtorsctordeferred_resndemanglerenumforall-pointer-decaygc_noraiijacob/cs343-translationjenkins-sandboxmemorynew-astnew-ast-unique-exprnew-envno_listpersistent-indexerpthread-emulationqualifiedEnumresolv-newwith_gc
Last change on this file since f4bc57c was d14d96a, checked in by Aaron Moss <a3moss@…>, 8 years ago

Update resolver design with draft of architecture

  • Property mode set to 100644
File size: 82.5 KB
Line 
1# Thoughts on Resolver Design #
2
3## Conversions ##
4C's implicit "usual arithmetic conversions" define a structure among the
5built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of
6_safe_ widening conversions.
7There is also a set of _explicit_ conversions that are only allowed through a
8cast expression.
9Based on Glen's notes on conversions [1], I propose that safe and unsafe
10conversions be expressed as constructor variants, though I make explicit
11(cast) conversions a constructor variant as well rather than a dedicated
12operator.
13Throughout this article, I will use the following operator names for
14constructors and conversion functions from `From` to `To`:
15
16        void ?{} ( To*, To );            // copy constructor
17        void ?{} ( To*, From );          // explicit constructor
18        void ?{explicit} ( To*, From );  // explicit cast conversion
19        void ?{safe} ( To*, From );      // implicit safe conversion
20        void ?{unsafe} ( To*, From );    // implicit unsafe conversion
21
22[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
23
24Glen's design made no distinction between constructors and unsafe implicit
25conversions; this is elegant, but interacts poorly with tuples.
26Essentially, without making this distinction, a constructor like the following
27would add an interpretation of any two `int`s as a `Coord`, needlessly
28multiplying the space of possible interpretations of all functions:
29
30        void ?{}( Coord *this, int x, int y );
31
32That said, it would certainly be possible to make a multiple-argument implicit
33conversion, as below, though the argument above suggests this option should be
34used infrequently:
35
36        void ?{unsafe}( Coord *this, int x, int y );
37
38An alternate possibility would be to only count two-arg constructors
39`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and 
40explicit conversions should also have a compiler-enforced restriction to
41ensure that they are two-arg functions (this restriction may be valuable
42regardless).
43
44### Constructor Idiom ###
45Basing our notion of conversions off otherwise normal Cforall functions means
46that we can use the full range of Cforall features for conversions, including
47polymorphism.
48Glen [1] defines a _constructor idiom_ that can be used to create chains of
49safe conversions without duplicating code; given a type `Safe` which members
50of another type `From` can be directly converted to, the constructor idiom
51allows us to write a conversion for any type `To` which `Safe` converts to:
52
53        forall(otype To | { void ?{safe}( To*, Safe ) })
54        void ?{safe}( To *this, From that ) {
55                Safe tmp = /* some expression involving that */;
56                *this = tmp; // uses assertion parameter
57        }
58
59This idiom can also be used with only minor variations for a parallel set of
60unsafe conversions.
61
62What selective non-use of the constructor idiom gives us is the ability to
63define a conversion that may only be the *last* conversion in a chain of such.
64Constructing a conversion graph able to unambiguously represent the full
65hierarchy of implicit conversions in C is provably impossible using only
66single-step conversions with no additional information (see Appendix B), but
67this mechanism is sufficiently powerful (see [1], though the design there has
68some minor bugs; the general idea is to use the constructor idiom to define
69two chains of conversions, one among the signed integral types, another among
70the unsigned, and to use monomorphic conversions to allow conversions between
71signed and unsigned integer types).   
72
73### Implementation Details ###
74It is desirable to have a system which can be efficiently implemented, yet
75also to have one which has sufficient power to distinguish between functions
76on all possible axes of polymorphism.
77This ordering may be a partial order, which may complicate implementation
78somewhat; in this case it may be desirable to store the set of implementations
79for a given function as the directed acyclic graph (DAG) representing the
80order.
81
82## Conversion Costs ##
83Each possible resolution of an expression has a _cost_ tuple consisting of
84the following components: _unsafe_ conversion cost, _polymorphic_ 
85specialization cost, _safe_ conversion cost, a count of _explicit_ 
86conversions, and _qualifier_ conversion cost.
87These components are lexically-ordered and can be summed element-wise;
88summation starts at `(0, 0, 0, 0, 0)`.
89
90### Lvalue and Qualifier Conversions ###
91C defines the notion of a _lvalue_, essentially an addressable object, as well
92as a number of type _qualifiers_, `const`, `volatile`, and `restrict`.
93As these type qualifiers are generally only meaningful to the type system as
94applied to lvalues, the two concepts are closely related.
95A const lvalue cannot be modified, the compiler cannot assume that a volatile
96lvalue will not be concurrently modified by some other part of the system, and
97a restrict lvalue must have pointer type, and the compiler may assume that no
98other pointer in scope aliases that pointer (this is solely a performance
99optimization, and may be ignored by implementers).
100_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts
101it to an expression result of type `T` (commonly called an _rvalue_ of type
102`T`) also strips all the qualifiers from the lvalue, as an expression result
103is a value, not an addressable object that can have properties like
104immutability.
105Though lvalue-to-rvalue conversion strips the qualifiers from lvalues,
106derived rvalue types such as pointer types may include qualifiers;
107`const int *` is a distinct type from `int *`, though the latter is safely
108convertable to the former.
109In general, any number of qualifiers can be safely added to the
110pointed-to-type of a pointer type, e.g. `int *` converts safely to
111`const int *` and `volatile int *`, both of which convert safely to
112`const volatile int *`.
113
114Since lvalues are precicely "addressable objects", in C, only lvalues can be
115used as the operand of the `&` address-of operator.
116Similarly, only modifiable lvalues may be used as the assigned-to
117operand of the mutating operators: assignment, compound assignment
118(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without
119the `const` qualifier are modifiable, but lvalues of incomplete types, array
120types, and struct or union types with const members are also not modifiable.
121Lvalues are produced by the following expressions: object identifiers
122(function identifiers are not considered to be lvalues), the result of the `*` 
123dereference operator applied to an object pointer, the result of a member
124expression `s.f` if the left argument `s` is an lvalue (note that the
125preceding two rules imply that the result of indirect member expressions
126`s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the
127indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`).
128Somewhat less obviously, parenthesized lvalue expressions, string literals,
129and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.
130
131All of the conversions described above are defined in standard C, but Cforall
132requires further features from its type system.
133In particular, to allow overloading of the `*?` and `?[?]` dereferencing and
134indexing operators, Cforall requires a way to declare that the functions
135defining these operators return lvalues, and since C functions never return
136lvalues and for syntactic reasons we wish to distinguish functions which
137return lvalues from functions which return pointers, this is of necessity an
138extension to standard C.
139In the current design, an `lvalue` qualifier can be added to function return
140types (and only to function return types), the effect of which is to return a
141pointer which is implicitly dereferenced by the caller.
142C++ includes the more general concept of _references_, which are typically
143implemented as implicitly dereferenced pointers as well.
144Another use case which C++ references support is providing a way to pass
145function parameters by reference (rather than by value) with a natural
146syntax; Cforall in its current state has no such mechanism.
147As an example, consider the following (currently typical) copy-constructor
148signature and call:
149
150        void ?{}(T *lhs, T rhs);
151       
152        T x;
153        T y = { x };
154
155Note that the right-hand argument is passed by value, and would in fact be
156copied twice in the course of the constructor call `T y = { x };` (once into
157the parameter by C's standard `memcpy` semantics, once again in the body of
158the copy constructor, though it is possible that return value optimization
159will elide the `memcpy`-style copy).
160However, to pass by reference using the existing pointer syntax, the example
161above would look like this:
162
163        void ?{}(T *lhs, const T *rhs);
164       
165        T x;
166        T y = { &x };
167
168This example is not even as bad as it could be; assuming pass-by-reference is
169the desired semantics for the `?+?` operator, that implies the following
170design today:
171
172        T ?+?(const T *lhs, const T *rhs);
173       
174        T a, b;
175        T c = &a + &b,
176
177In addition to `&a + &b` being unsightly and confusing syntax to add `a` and
178`b`, it also introduces a possible ambiguity with pointer arithmetic on `T*`
179which can only be resolved by return-type inference.
180
181Pass-by-reference and marking functions as returning lvalues instead of the
182usual rvalues are actually closely related concepts, as obtaining a reference
183to pass depends on the referenced object being addressable, i.e. an lvalue,
184and lvalue return types are effectively return-by-reference.
185Cforall should also unify the concepts, with a parameterized type for
186"reference to `T`", which I will write `ref T`.
187Syntax bikeshedding can be done later (there are some examples at the bottom
188of this section), but `ref T` is sufficiently distinct from both the existing
189`lvalue T` (which it subsumes) and the closely related C++ `T&` to allow
190independent discussion of its semantics.
191
192Firstly, assignment to a function parameter as part of a function call and
193local variable initialization have almost identical semantics, so should be
194treated similarly for the reference type too; this implies we should be able
195to declare local variables of reference type, as in the following:
196
197        int x = 42;
198        ref int r = x; // r is now an alias for x
199
200Unlike in C++, we would like to have the capability to re-bind references
201after initialization, as this allows the attractive syntax of references to
202support some further useful code patterns, such as first initializing a
203reference after its declaration.
204Constant references to `T` (`const ref T`) should not be re-bindable.
205
206One option for re-binding references is to use a dedicated operator, as in the
207code example below:
208
209        int i = 42, j = 7;
210        ref int r = i; // bind r to i
211        r = j;         // set i (== r) to 7
212        r := j;        // rebind r to j using the new := rebind operator
213        i = 42;        // reset i (!= r) to 42
214        assert( r == 7 );
215
216The other syntactic option for reference re-bind would be to overload
217assignment and use type inference on the left and right-hand sides to
218determine whether the referred-to variable on the left should be reassigned to
219the value on the right, or if the reference on the left should be aliased to
220the reference on the right.
221This could be disambiguated with casts, as in the following code example:
222
223        int i
224        int j;
225        ref int r = i;   // (0a)
226        ref int s = i;   // (0b)
227       
228        i = j;           // (1)
229        i = (int)s;      // (2)
230        i = s;           // (3)
231        // ---------------------
232        r = s;           // (4)
233        r = (ref int)j;  // (5)
234        // ---------------------
235        r = j;           // (6)
236        r = (int)s;      // (7)
237
238By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as
239aliases for `i`.
240For C compatibility, (1) has to be assignment; in general, any assignment to a
241non-reference type should be assignment, so (2) and (3) are as well.
242By types, (4) and (5) should have the same semantics, and the semantics of (6)
243and (7) should match as well.
244This suggests that (4) and (5) are reference re-bind, and (6) and (7) are an
245assignment to the referred variable; this makes the syntax to explicitly alias
246a local variable rather ugly (and inconsistent with the initialization
247syntax), as well as making it rather awkward to copy the value stored in one
248reference-type variable into another reference type variable (which is likely
249more painful in functions with by-reference parameters than with local
250variables of reference type).
251
252Because of the aforementioned issues with overloading assignment as reference
253rebind, in addition to the fact that reference rebind should not be a
254user-overloadable operator (unlike assignment), I propose refererence rebind
255should have its own dedicated operator.
256
257The semantics and restrictions of `ref T` are effectively the semantics of an
258lvalue of type `T`, and by this analogy there should be a safe, qualifier
259dropping conversion from `ref const volatile restrict T` (and every other
260qualifier combination on the `T` in `ref T`) to `T`.
261With this conversion, the resolver may type most expressions that C would
262call "lvalue of type `T`" as `ref T`.
263There's also an obvious argument that lvalues of a (possibly-qualified) type
264`T` should be convertable to references of type `T`, where `T` is also
265so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to
266`ref const char`).
267By similar arguments to pointer types, qualifiers should be addable to the
268referred-to type of a reference (e.g. `ref int` to `ref const int`).
269As a note, since pointer arithmetic is explictly not defined on `ref T`,
270`restrict ref T` should be allowable and would have alias-analysis rules that
271are actually comprehensible to mere mortals.
272
273Using pass-by-reference semantics for function calls should not put syntactic
274constraints on how the function is called; particularly, temporary values
275should be able to be passed by reference.
276The mechanism for this pass-by-reference would be to store the value of the
277temporary expression into a new unnamed temporary, and pass the reference of
278that temporary to the function.
279As an example, the following code should all compile and run:
280
281        void f(ref int x) { printf("%d\n", x++); }
282       
283        int i = 7, j = 11;
284        const int answer = 42;
285       
286        f(i);      // (1)
287        f(42);     // (2)
288        f(i + j);  // (3)
289        f(answer); // (4)
290
291The semantics of (1) are just like C++'s, "7" is printed, and `i` has the
292value 8 afterward.
293For (2), "42" is printed, and the increment of the unnamed temporary to 43 is
294not visible to the caller; (3) behaves similarly, printing "19", but not
295changing `i` or `j`.
296(4) is a bit of an interesting case; we want to be able to support named
297constants like `answer` that can be used anywhere the constant expression
298they're replacing (like `42`) could go; in this sense, (4) and (2) should have
299the same semantics.
300However, we don't want the mutation to the `x` parameter to be visible in
301`answer` afterward, because `answer` is a constant, and thus shouldn't change.
302The solution to this is to allow chaining of the two `ref` conversions;
303`answer` has the type `ref const int`, which can be converted to `int` by the
304lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int` 
305by the temporary-producing rvalue-to-lvalue conversion.
306Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42),
307mutated by `f`, then discarded; "42" is printed, just as in case (2), and
308`answer` still equals 42 after the call, because it was the temporary that was
309mutated, not `answer`.
310It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while
311`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in
312C++, leading to the dreaded "const hell"), but the behaviour of this rule can
313be determined by examining local scope with the simple rule "non-`const`
314references to `const` variables produce temporaries", which aligns with
315programmer intuition that `const` variables cannot be mutated.
316
317To bikeshed syntax for `ref T`, there are three basic options: language
318keywords (`lvalue T` is already in Cforall), compiler-supported "special"
319generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++
320programmers).
321Keyword or generic based approaches run the risk of name conflicts with
322existing code, while any sigil used would have to be carefully chosen to not
323create parsing conflicts.
324
325**TODO** Consider arguments for move semantics and see if there is a
326compelling case for rvalue references.
327
328### Conversion Operator Costs ###
329Copy constructors, safe conversions, and unsafe conversions all have an
330associated conversion cost, calculated according to the algorithm below:
331
3321. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
3332. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
3343. Monomoprhic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
3354. Polymorphic conversion operators (or copy constructors) have a conversion
336   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic
337   equivalent and the sum of the conversion costs of all conversion operators
338   passed as assertion parameters, but where the fourth "count" element of the
339   cost tuple is fixed to `1`.
340
341**TODO** Polymorphism cost may need to be reconsidered in the light of the
342thoughts on polymorphism below.
343**TODO** You basically just want path-length in the conversion graph implied
344by the set of conversions; the only tricky question is whether or not you can
345account for "mixed" safe and unsafe conversions used to satisfy polymorphic
346constraints, whether a polymorphic conversion should cost more than a
347monomorphic one, and whether to account for non-conversion constraints in the
348polymorphism cost
349
350### Argument-Parameter Matching ###
351Given a function `f` with an parameter list (after tuple flattening)
352`(T1 t1, T2 t2, ... Tn tn)`, and a function application
353`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the
354appropriate parameter is calculated according to the algorithm below:
355
356Given a parameter `t` of type `T` and an expression `<e>` from these lists,
357`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with
358associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`.
359(If any `Ei` is a tuple type, replace it with its first flattened element for
360the purposes of this section.)
361
362The cost of matching the interpretation of `<e>` with type `Ei` to `t1` with
363type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the
364conversion operator cost from `Ei` to `T`.
365
366### Object Initialization ###
367The cost to initialize an object is calculated very similarly to
368argument-parameter matching, with a few modifications.
369Firstly, explicit constructors are included in the set of available
370conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic
371conversion costs (if applicable) and the _interpretation cost_ of the
372constructor, the sum of the argument-parameter matching costs for its
373parameters.
374Also, ties in overall cost (interpretation cost plus conversion cost) are
375broken by lowest conversion cost (i.e. of alternatives with the same overall
376cost, copy constructors are preferred to other explicit constructors,
377explicit constructors are preferred to safe conversions, which are preferred
378to unsafe conversions).
379An object initialization is properly typed if it has exactly one min-cost
380interpretation.
381
382### Explicit Casts ###
383Explicit casts are handled similarly to object initialization.
384Copy constructors and other explicit constructors are not included in the set
385of possible conversions, though interpreting a cast as type ascription
386(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost
387`(0, 0, 0, 0)`.
388Explicit conversion operators are also included in the set of possible
389conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion
390costs are invoked.
391Unlike for explicit constructors and other functions, implicit conversions are
392never applied to the argument or return type of an explicit cast operator, so
393that the cast may be used more effectively as a method for the user programmer
394to guide type resolution.
395
396## Trait Satisfaction ##
397A _trait_ consists of a list of _type variables_ along with a (possibly empty)
398set of _assertions_ on those variables.
399Assertions can take two forms, _variable assertions_ and the more common
400_function assertions_, as in the following example:
401
402        trait a_trait(otype T, otype S) {
403                T a_variable_assertion;
404                S* another_variable_assertion;
405                S a_function_assertion( T* );
406        };
407
408Variable assertions enforce that a variable with the given name and type
409exists (the type is generally one of the type variables, or derived from one),
410while a function assertion enforces that a function with a
411_compatible signature_ to the provided function exists.
412
413To test if some list of types _satisfy_ the trait, the types are first _bound_ 
414to the type variables, and then declarations to satisfy each assertion are
415sought out.
416Variable assertions require an exact match, because they are passed as object
417pointers, and there is no mechanism to employ conversion functions, while
418function assertions only require a function that can be wrapped to a
419compatible type; for example, the declarations below satisfy
420`a_trait(int, short)`:
421
422        int a_variable_assertion;
423        short* another_variable_assertion;
424        char a_function_assertion( void* );
425        // int* may be implicitly converted to void*, and char to short, so the
426        // above works
427
428Cforall Polymorphic functions have a _constraining trait_, denoted as follows:
429
430        forall(otype A, otype B | some_trait(A, B))
431
432The trait may be anonymous, with the same syntax as a trait declaration, and
433may be unioned together using `|` or `,`.
434
435**TODO** Consider including field assertions in the list of constraint types,
436also associated types and the appropriate matching type assertion.
437
438## Polymorphism Costs ##
439The type resolver should prefer functions that are "less polymorphic" to
440functions that are "more polymorphic".
441Determining how to order functions by degree of polymorphism is somewhat less
442straightforward, though, as there are multiple axes of polymorphism and it is
443not always clear how they compose.
444The natural order for degree of polymorphism is a partial order, and this
445section includes some open questions on whether it is desirable or feasible to
446develop a tie-breaking strategy to impose a total order on the degree of
447polymorphism of functions.
448Helpfully, though, the degree of polymorphism is a property of functions
449rather than function calls, so any complicated graph structure or calculation
450representing a (partial) order over function degree of polymorphism can be
451calculated once and cached.
452
453### Function Parameters ###
454All other things being equal, if a parameter of one function has a concrete
455type and the equivalent parameter of another function has a dynamic type, the
456first function is less polymorphic:
457
458                        void f( int, int );  // (0) least polymorphic
459        forall(otype T) void f( T, int );    // (1a) more polymorphic than (0)
460        forall(otype T) void f( int, T );    // (1b) more polymorphic than (0)
461                                             //      incomparable with (1a)
462        forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b)
463
464This should extend to parameterized types (pointers and generic types) also:
465
466        forall(otype S) struct box { S val; };
467       
468        forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
469        forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
470        forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
471                                               //     incomparable with (3)
472        forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)
473
474Every function in the group above is incomparable with (1a/b), but that's fine
475because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur
476much in practice (unless there are safe or unsafe conversions defined between
477the possible argument types).
478
479For degree of polymorphism from arguments, I think we should not distinguish
480between different type parameters, e.g. the following should be considered
481equally polymorphic:
482
483        forall(otype T, otype S) void f( T, T, S );  // (7)
484        forall(otype T, otype S) void f( S, T, T );  // (8)
485
486However parameter lists are compared, parameters of multi-parameter generic
487types should ideally be treated as a recursive case, e.g. in the example
488below, (9) is less polymorphic than (10), which is less polymorphic than (11):
489
490        forall(otype T, otype S) struct pair { T x; S y; };
491       
492                        void f( pair(int, int) );  // (9)
493        forall(otype T) void f( pair(T, int) );    // (10)
494        forall(otype T) void f( pair(T, T) );      // (11)
495
496Parameter comparison could possibly be made somewhat cheaper at loss of some
497precision by representing each parameter as a value from the natural numbers
498plus infinity, where infinity represents a monomorphic parameter and a finite
499number counts how many levels deep the shallowest type variable is, e.g. where
500`T` is a type variable, `int` would have value infinity, `T` would have value
5010, `T*` would have value 1, `box(T)*` would have value 2, etc.
502Under this scheme, higher values represent less polymorphism.
503This makes the partial order on parameters a total order, so that many of the
504incomparable functions above compare equal, though that is perhaps a virtue.
505It also loses the ability to differentiate between some multi-parameter
506generic types, such as the parameters in (10) and (11), which would both be
507valued 1, losing the polymorphism distinction between them.
508
509A variant of the above scheme would be to fix a maximum depth of polymorphic
510type variables (16 seems like a reasonable choice) at which a parameter would
511be considered to be effectively monomorphic, and to subtract the value 
512described above from that maximum, clamping the result to a minimum of 0.
513Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has
514value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0,
515the same as `int`.
516This can be quite succinctly represented, and summed without the presence of a
517single monomorphic parameter pushing the result to infinity, but does lose the
518ability to distinguish between very deeply structured polymorphic types.
519
520### Parameter Lists ###
521A partial order on function parameter lists can be produced by the
522product order of the partial orders on parameters described above.
523In more detail, this means that for two parameter lists with the same arity,
524if any pair of corresponding parameters are incomparable with respect to each
525other, the two parameter lists are incomparable; if in all pairs of
526corresponding parameters one list's parameter is always (less than or) equal
527to the other list's parameter than the first parameter list is (less than or)
528equal to the second parameter list; otherwise the lists are incomparable with
529respect to each other.
530
531How to compare parameter lists of different arity is a somewhat open question.
532A simple, but perhaps somewhat unsatisfying, solution would be just to say
533that such lists are incomparable.
534The simplist approach to make them comparable is to say that, given two lists
535`(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists
536can be compared based on their shared prefix of `n` types.
537This approach breaks the transitivity property of the equivalence relation on
538the partial order, though, as seen below:
539
540        forall(otype T) void f( T, int );       // (1a)
541        forall(otype T) void f( T, int, int );  // (12)
542        forall(otype T) void f( T, int, T );    // (13)
543
544By this rule, (1a) is equally polymorphic to both (12) and (13), so by
545transitivity (12) and (13) should also be equally polymorphic, but that is not
546actually the case.
547
548We can fix the rule by saying that `(T1 ... Tn)` can be compared to
549`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting
550notional monomorphic parameters.
551In this case, (1a) and (12) are equally polymorphic, because (1a) gets
552extended with a monomorphic type that compares equal to (12)'s third `int` 
553parameter, but (1a) is less polymorphic than (13), because its notional
554monomorphic third parameter is less polymorphic than (13)'s `T`.
555Essentially what this rule says is that any parameter list with more
556parameters is no less polymorphic than one with fewer.
557
558We could collapse this parameter list ordering to a succinct total order by
559simply taking the sum of the clamped parameter polymorphism counts, but this
560would again make most incomparable parameter lists compare equal, as well as
561having the potential for some unexpected results based on the (completely
562arbitrary) value chosen for "completely polymorphic".
563For instance, if we set 4 to be the maximum depth of polymorphism (as above),
564the following functions would be equally polymorphic, which is a somewhat
565unexpected result:
566
567        forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12
568        forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12
569
570These functions would also be considered equally polymorphic:
571
572        forall(otype T) void g( T, int );    // 4 + 0 = 4;
573        forall(otype T) void g( T**, T** );  // 2 + 2 = 4;
574
575This issue can be mitigated by choosing a larger maximum depth of
576polymorphism, but this scheme does have the distinct disadvantage of either
577specifying the (completely arbitrary) maximum depth as part of the language or
578allowing the compiler to refuse to accept otherwise well-typed deeply-nested
579polymorphic types. 
580
581For purposes of determining polymorphism, the list of return types of a
582function should be treated like another parameter list, and combined with the
583degree of polymorphism from the parameter list in the same way that the
584parameters in the parameter list are combined.
585For instance, in the following, (14) is less polymorphic than (15) which is
586less polymorphic than (16):
587
588        forall(otype T) int f( T );  // (14)
589        forall(otype T) T*  f( T );  // (15)
590        forall(otype T) T   f( T );  // (16)
591
592### Type Variables and Bounds ###
593Degree of polymorphism doesn't solely depend on the parameter lists, though.
594Glen's thesis (4.4.4, p.89) gives an example that shows that it also depends
595on the number of type variables as well:
596
597        forall(otype T)          void f( T, int );  // (1a) polymorphic
598        forall(otype T)          void f( T, T );    // (2)  more polymorphic
599        forall(otype T, otype S) void f( T, S );    // (17) most polymorphic
600
601Clearly the `forall` type parameter list needs to factor into calculation of
602degree of polymorphism as well, as it's the only real differentiation between
603(2) and (17).
604The simplest way to include the type parameter list would be to simply count
605the type variables and say that functions with more type variables are more
606polymorphic.
607
608However, it also seems natural that more-constrained type variables should be
609counted as "less polymorphic" than less-constrained type variables.
610This would allow our resolver to pick more specialized (and presumably more
611efficient) implementations of functions where one exists.
612For example:
613
614        forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic
615        forall(otype T)                  T f( T );  // (16) more polymorphic
616
617We could account for this by counting the number of unique constraints and
618saying that functions with more constraints are less polymorphic.
619
620That said, we do model the `forall` constraint list as a (possibly anonymous)
621_trait_, and say that each trait is a set of constraints, so we could
622presumably define a partial order over traits based on subset inclusion, and
623use this partial order instead of the weaker count of constraints to order the
624list of type parameters of a function, as below:
625
626        trait has_g(otype T) { void g(T); };
627        trait has_h(otype S) { void h(T); };
628        trait has_gh(otype R | has_g(R) | has_h(R)) {};
629        // has_gh is equivlent to { void g(R); void h(R); }
630       
631        forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic
632        forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19)
633        forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19)
634                                               //       incomparable with (18)
635        forall(otype T)             T f( T );  // (16) most polymorphic
636
637The tricky bit with this is figuring out how to compare the constraint
638functions for equality up to type variable renaming; I suspect there's a known
639solution, but don't know what it is (perhaps some sort of unification
640calculation, though I hope there's a more lightweight option).
641We also should be able to take advantage of the programmer-provided trait
642subset information (like the constraint on `has_gh` in the example) to more
643efficiently generate the partial-order graph for traits, which should be able
644to be cached for efficiency.
645
646Combining count of type variables with the (partial) order on the trait
647constraining those variables seems like it should be a fairly straightforward
648product ordering to me - one `forall` qualifier is (less than or) equal to
649another if it has both a (less than or) equal number of type variables and a
650(less than or) equal degree of polymorphism from its constraining trait; the
651two qualifiers are incomparable otherwise.
652If an easier-to-calculate total ordering is desired, it might be acceptable to
653use the number of type variables, with ties broken by number of constraints.
654
655Similarly, to combine the (partial) orders on parameter and return lists with
656the (partial) order on `forall` qualifiers, a product ordering seems like the
657reasonable choice, though if we wanted a total order a reasonable choice would
658be to use whatever method we use to combine parameter costs into parameter
659lists to combine the costs for the parameter and return lists, then break ties
660by the order on the `forall` qualifiers.
661
662## Expression Costs ##
663
664### Variable Expressions ###
665Variables may be overloaded; that is, there may be multiple distinct variables
666with the same name so long as each variable has a distinct type.
667The variable expression `x` has one zero-cost interpretation as type `T` for
668each variable `T x` in scope.
669
670### Member Selection Expressions ###
671For every interpretation `I` of `e` which has a struct or union type `S`,
672`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the
673same cost as `I`.
674Note that there may be more than one member of `S` with the same name, as per
675Cforall's usual overloading rules.
676The indirect member expression `e->y` is desugared to `(*e).y` and interpreted
677analogously.
678
679**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no
680interpretations as `e.y` exist.
681
682### Address Expressions ###
683Address expressions `&e` have an interpretation for each interpretation `I` of
684`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`.
685Lvalues result from variable expressions, member selection expressions, or
686application of functions returning an lvalue-qualified type.
687Note that the dereference operator is overloadable, so the rules for its
688resolution follow those for function application below.
689
690**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a
691new temporary holding `42` and takes its address.
692
693### Boolean-context Expressions ###
694C has a number of "boolean contexts", where expressions are assigned a truth
695value; these include both arguments to the short-circuiting `&&` and `||` 
696operators, as well as the conditional expressions in `if` and `while` 
697statements, the middle expression in `for` statements, and the first argument
698to the `?:` ternary conditional operator.
699In all these contexts, C interprets `0` (which is both an integer and a null
700pointer literal) as false, and all other integer or pointer values as true.
701In this spirit, Cforall allows other types to be considered "truthy" if they
702support the following de-sugaring in a conditional context (see notes on
703interpretation of literal `0` below):
704
705        x => ((int)( x != 0 ))
706
707### Literal Expressions ###
708Literal expressions (e.g. 42, 'c', 3.14, "Hello, world!") have one
709zero-cost interpretation with the same type the expression would have in C,
710with three exceptions:
711
712Character literals like 'x' are typed as `char` in Cforall, not `int` as in C.
713This change breaks very little C code (primarily `sizeof 'x'`; the implicit
714conversion from `int` to `char` and lack of overloading handle most other
715expressions), matches the behaviour of C++, and is more compatible with
716programmer intuition.
717
718The literals `0` and `1` are also treated specially by Cforall, due to their
719potential uses in operator overloading.
720Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing
721multiple interpretations of them according to the existing variable
722overloading rules, with the following declarations in the prelude:
723
724        const int 0, 1;
725        forall ( dtype DT ) const DT * const    0;
726        forall ( ftype FT ) FT * const          0;
727
728This did, however, create some backward-compatibility problems and potential
729performance issues, and works poorly for generic types. To start with, this
730(entirely legal C) code snippet doesn't compile in Cforall:
731
732        if ( 0 ) {}
733
734It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and
735`forall(dtype DT) DT*` have a != operator which returns `int` the resolver can
736not choose which `0` variable to take, because they're both exact matches.
737
738The general != computation may also be less efficient than a check for a zero
739value; take the following example of a rational type:
740
741        struct rational { int32_t num, int32_t den };
742        rational 0 = { 0, 1 };
743       
744        int ?!=? (rational a, rational b) {
745                return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
746        }
747       
748        int not_zero (rational a) { return a.num != 0; }
749
750To check if two rationals are equal we need to do a pair of multiplications to
751normalize them (the casts in the example are to prevent overflow), but to
752check if a rational is non-zero we just need to check its numerator, a more
753efficient operation.
754
755Finally, though polymorphic null-pointer variables can be meaningfully
756defined, most other polymorphic variables cannot be, which makes it difficult
757to make generic types "truthy" using the existing system:
758
759        forall(otype T) struct pair { T x; T y; };
760        forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
761
762Now, it seems natural enough to want to define the zero for this pair type as
763a pair of the zero values of its element type (if they're defined).
764The declaration of `pair(T) 0` above is actually illegal though, as there is
765no way to represent the zero values of an infinite number of types in the
766single memory location available for this polymorphic variable - the
767polymorphic null-pointer variables defined in the prelude are legal, but that
768is only because all pointers are the same size and the single zero value is a
769legal value of all pointer types simultaneously; null pointer is, however,
770somewhat unique in this respect.
771
772The technical explanation for the problems with polymorphic zero is that `0` 
773is really a rvalue, not a lvalue - an expression, not an object.
774Drawing from this, the solution we propose is to give `0` a new built-in type,
775`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in
776type `_unit_t`.
777If the prelude defines != over `_zero_t` this solves the `if ( 0 )` problem,
778because now the unambiguous best interpretation of `0 != 0` is to read them
779both as `_zero_t` (and say that this expression is false).
780Backwards compatibility with C can be served by defining conversions in the
781prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer
782types, as below:
783
784        // int 0;
785        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
786        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);
787       
788        // int 1;
789        forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
790        forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);
791       
792        // forall(dtype DT) const DT* 0;
793        forall(dtype DT) void ?{safe}(const DT**, _zero_t);
794        // forall(ftype FT) FT* 0;
795        forall(ftype FT) void ?{safe}(FT**, _zero_t);
796
797Further, with this change, instead of making `0` and `1` overloadable
798variables, we can instead allow user-defined constructors (or, more flexibly,
799safe conversions) from `_zero_t`, as below:
800
801        // rational 0 = { 0, 1 };
802        void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }
803
804Note that we don't need to name the `_zero_t` parameter to this constructor,
805because its only possible value is a literal zero.
806This one line allows `0` to be used anywhere a `rational` is required, as well
807as enabling the same use of rationals in boolean contexts as above (by
808interpreting the `0` in the desguraring to be a rational by this conversion).
809Furthermore, while defining a conversion function from literal zero to
810`rational` makes rational a "truthy" type able to be used in a boolean
811context, we can optionally further optimize the truth decision on rationals as
812follows:
813
814        int ?!=? (rational a, _zero_t) { return a.num != 0; }
815
816This comparison function will be chosen in preference to the more general
817rational comparison function for comparisons against literal zero (like in
818boolean contexts) because it doesn't require a conversion on the `0` argument.
819Functions of the form `int ?!=? (T, _zero_t)` can acutally be used in general
820to make a type `T` truthy without making `0` a value which can convert to that
821type, a capability not available in the current design.
822
823This design also solves the problem of polymorphic zero for generic types, as
824in the following example:
825
826        // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
827        forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
828                this->x = 0; this->y = 0;
829        }
830
831The polymorphic variable declaration didn't work, but this constructor is
832perfectly legal and has the desired semantics.
833
834We can assert that `T` can be used in a boolean context as follows:
835
836        `forall(otype T | { int ?!=?(T, _zero_t); })`
837 
838Since the C standard (6.5.16.1.1) specifically states that pointers can be
839assigned into `_Bool` variables (and implies that other artithmetic types can
840be assigned into `_Bool` variables), it seems natural to say that assignment
841into a `_Bool` variable effectively constitutes a boolean context.
842To allow this interpretation, I propose including the following function (or
843its effective equivalent) in the prelude:
844
845        forall(otype T | { int ?!=?(T, _zero_t); })
846        void ?{safe}( _Bool *this, T that ) { *this = that != 0; }
847
848Note that this conversion is not transitive; that is, for `t` a variable of
849some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence
850of a lower-cost one), `(int)t;` would not use this conversion (and in fact
851would not be legal in the absence of another valid way to convert a `T` to an
852`int`), but `(int)(_Bool)t;` could legally use this conversion.
853
854Similarly giving literal `1` the special type `_unit_t` allows for more
855concise and consistent specification of the increment and decrement operators,
856using the following de-sugaring:
857
858        ++i => i += 1
859        i++ => (tmp = i, i += 1, tmp)
860        --i => i -= 1
861        i-- => (tmp = i, i -= 1, tmp)
862
863In the examples above, `tmp` is a fresh temporary with its type inferred from
864the return type of `i += 1`.
865Under this proposal, defining a conversion from `_unit_t` to `T` and a
866`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators
867for free in a consistent fashion (similarly for -= and the decrement
868operators).
869If a meaningful `1` cannot be defined for a type, both increment operators can
870still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`.
871Similarly, if scalar addition can be performed on a type more efficiently than
872by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the
873addition operator, it will simultaneously define consistent implementations of
874both increment operators (this can also be accomplished by defining a
875conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
876
877To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an
878increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`,
879we also define a non-transitive unsafe conversion from `_Bool` (allowable
880values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:
881
882        void ?{unsafe} (_unit_t*, _Bool) {}
883
884As a note, the desugaring of post-increment above is possibly even more
885efficient than that of C++ - in C++, the copy to the temporary may be hidden
886in a separately-compiled module where it can't be elided in cases where it is
887not used, whereas this approach for Cforall always gives the compiler the
888opportunity to optimize out the temporary when it is not needed.
889Furthermore, one could imagine a post-increment operator that returned some
890type `T2` that was implicitly convertable to `T` but less work than a full
891copy of `T` to create (this seems like an absurdly niche case) - since the
892type of `tmp` is inferred from the return type of `i += 1`, you could set up
893functions with the following signatures to enable an equivalent pattern in
894Cforall:
895
896        lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
897        void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
898        void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to
899                                      // preserve expected semantics of T x = y++;
900
901**TODO** Look in C spec for literal type interprations.
902**TODO** Write up proposal for wider range of literal types, put in appendix
903
904### Initialization and Cast Expressions ###
905An initialization expression `T x = e` has one interpretation for each
906interpretation `I` of `e` with type `S` which is convertable to `T`.
907The cost of the interpretation is the cost of `I` plus the conversion cost
908from `S` to `T`.
909A cast expression `(T)e` is interpreted as hoisting initialization of a
910temporary variable `T tmp = e` out of the current expression, then replacing
911`(T)e` by the new temporary `tmp`.
912
913### Assignment Expressions ###
914An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then
915interpreted according to the usual rules for function application and comma
916expressions.
917Operator-assignment expressions like `e += f` desugar similarly as
918`(?+=?(&e, f), e)`.
919
920### Function Application Expressions ###
921Every _compatible function_ and satisfying interpretation of its arguments and
922polymorphic variable bindings produces one intepretation for the function
923application expression.
924Broadly speaking, the resolution cost of a function application is the sum of
925the cost of the interpretations of all arguments, the cost of all conversions
926to make those argument interpretations match the parameter types, and the
927binding cost of any of the function's polymorphic type parameters.
928
929**TODO** Work out binding cost in more detail.
930**TODO** Address whether "incomparably polymorphic" should be treated as
931"equally polymorphic" and be disambiguated by count of (safe) conversions.
932**TODO** Think about what polymorphic return types mean in terms of late
933binding.
934**TODO** Consider if "f is less polymorphic than g" can mean exactly "f
935specializes g"; if we don't consider the assertion parameters (except perhaps
936by count) and make polymorphic variables bind exactly (rather than after
937implicit conversions) this should actually be pre-computable.
938**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted
939function declaration is costed out by the resolver in the same way that any
940other function declaration is costed; if the deleted declaration is the unique
941min-cost resolution refuse to type the expression, if it is tied for min-cost
942then take the non-deleted alternative, and of two equivalent-cost deleted
943interpretations with the same return type pick one arbitrarily rather than
944producing an ambiguous resolution. This would also be useful for forbidding
945pointer-to-floating-point explicit conversions (C11, 6.5.4.4).
946**TODO** Cover default parameters, maybe named parameters (see "named
947arguments" thread of 11 March 2016)
948 
949
950### Sizeof, Alignof & Offsetof Expressions ###
951`sizeof`, `alignof`, and `offsetof` expressions have at most a single
952interpretation, of type `size_t`.
953`sizeof` and `alignof` expressions take either a type or an expression as a 
954an argument; if the argument is a type, it must be a complete type which is
955not a function type, if an expression, the expression must have a single
956interpretation, the type of which conforms to the same rules.
957`offsetof` takes two arguments, a type and a member name; the type must be
958a complete structure or union type, and the second argument must name a member
959of that type.
960
961### Comma Expressions ###
962A comma expression `x, y` resolves `x` as if it had been cast to `void`, and
963then, if there is a unique interpretation `I` of `x`, has one interpretation
964for each interpretation `J` of `y` with the same type as `J` costing the sum
965of the costs of `I` and `J`.
966
967### Index Expressions ###
968**TODO** Consider adding polymorphic function in prelude for this, as per
9696.5.2.1.2 in the C standard:
970
971        forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); })
972        lvalue E ?[?](T a, I i) { return *(a + i); }
973
974I think this isn't actually a good idea, because the cases for it are niche,
975mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a
976pointer `p`, and adding it to the prelude would slow down resolution of
977every index expression just a bit. Our existing prelude includes an indexing
978operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified
979variants, which should satisfy C source-compatibility without propegating this
980silly desugaring further.
981
982#### Compatible Functions ####
983**TODO** This subsection is very much a work in progress and has multiple open
984design questions.
985
986A _compatible function_ for an application expression is a visible function
987declaration with the same name as the application expression and parameter
988types that can be converted to from the argument types.
989Function pointers and variables of types with the `?()` function call operator
990overloaded may also serve as function declarations for purposes of
991compatibility.
992
993For monomorphic parameters of a function declaration, the declaration is a
994compatible function if there is an argument interpretation that is either an
995exact match, or has a safe or unsafe implicit conversion that can be used to
996reach the parameter type; for example:
997
998        void f(int);
999       
1000        f(42);        // compatible; exact match to int type
1001        f('x');       // compatible; safe conversion from char => int
1002        f(3.14);      // compatible; unsafe conversion from double => int
1003        f((void*)0);  // not compatible; no implicit conversion from void* => int
1004
1005Per Richard[*], function assertion satisfaction involves recursively searching
1006for compatible functions, not an exact match on the function types (I don't
1007believe the current Cforall resolver handles this properly); to extend the
1008previous example:
1009
1010        forall(otype T | { void f(T); }) void g(T);
1011       
1012        g(42);    // binds T = int, takes f(int) by exact match
1013        g('x');   // binds T = char, takes f(int) by conversion
1014        g(3.14);  // binds T = double, takes f(int) by conversion
1015
1016[*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the
1017accessible scopes for definitions corresponding to assertion names, and
1018choosing the ones whose types correspond *most closely* to the assertion
1019types." (emphasis mine)
1020
1021There are three approaches we could take to binding type variables: type
1022variables must bind to argument types exactly, each type variable must bind
1023exactly to at least one argument, or type variables may bind to any type which
1024all corresponding arguments can implicitly convert to; I'll provide some
1025possible motivation for each approach.
1026
1027There are two main arguments for the more restrictive binding schemes; the
1028first is that the built-in implicit conversions in C between `void*` and `T*` 
1029for any type `T` can lead to unexpectedly type-unsafe behaviour in a more
1030permissive binding scheme, for example:
1031
1032        forall(dtype T) T* id(T *p) { return p; }
1033       
1034        int main() {
1035                int *p = 0;
1036                char *c = id(p);
1037        }
1038
1039This code compiles in CFA today, and it works because the extra function
1040wrapper `id` provides a level of indirection that allows the non-chaining
1041implicit conversions from `int*` => `void*` and `void*` => `char*` to chain.
1042The resolver types the last line with `T` bound to `void` as follows:
1043
1044        char *c = (char*)id( (void*)p );
1045
1046It has been suggested that making the implicit conversions to and from `void*` 
1047explicit in Cforall code (as in C++) would solve this particular problem, and
1048provide enough other type-safety benefits to outweigh the source-compatibility
1049break with C; see Appendix D for further details.
1050
1051The second argument for a more constrained binding scheme is performance;
1052trait assertions need to be checked after the type variables are bound, and
1053adding more possible values for the type variables should correspond to a
1054linear increase in runtime of the resolver per type variable.
1055There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of
1056them is implicitly convertable to any other; if we allow unrestricted binding
1057of type variables, a common `int` variable (or literal) used in the position
1058of a polymorphic variable parameter would cause a 20x increase in the amount
1059of time needed to check trait resolution for that interpretation.
1060These numbers have yet to be emprically substantiated, but the theory is
1061reasonable, and given that much of the impetus for re-writing the resolver is
1062due to its poor performance, I think this is a compelling argument.
1063
1064I would also mention that a constrained binding scheme is practical; the most
1065common type of assertion is a function assertion, and, as mentioned above,
1066those assertions should be able to be implicitly converted to to match.
1067Thus, in the example above with `g(T)`, where the assertion is `void f(T)`,
1068we first bind `T = int` or `T = char` or `T = double`, then substitute the
1069binding into the assertion, yielding assertions of `void f(int)`,
1070`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy
1071these assertions to complete the binding.
1072Though in all three cases, the existing function with signature `void f(int)` 
1073satisfies this assertion, the checking work cannot easily be re-used between
1074variable bindings, because there may be better or worse matches depending on
1075the specifics of the binding.
1076
1077The main argument for a more flexible binding scheme is that the binding
1078abstraction can actually cause a wrapped function call that would work to
1079cease to resolve, as below:
1080
1081        forall(otype T | { T ?+? (T, T) })
1082        T add(T x, T y) { return x + y; }
1083       
1084        int main() {
1085                int i, j = 2;
1086                short r, s = 3;
1087                i = add(j, s);
1088                r = add(s, j);
1089        }
1090
1091Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and
1092in both cases the short `s` is promoted to `int` to match `j`.
1093If, on the other hand, we demand that variables exactly match type variables,
1094neither call to `add` will compile, because it is impossible to simultaneously
1095bind `T` to both `int` and `short` (C++ has a similar restriction on template
1096variable inferencing).
1097One alternative that enables this case, while still limiting the possible
1098type variable bindings is to say that at least one argument must bind to its
1099type parameter exactly.
1100In this case, both calls to `add` would have the set `{ T = int, T = short }` 
1101for candidate bindings, and would have to check both, as well as checking that
1102`short` could convert to `int` or vice-versa.
1103
1104It is worth noting here that parameterized types generally bind their type
1105parameters exactly anyway, so these "restrictive" semantics only restrict a
1106small minority of calls; for instance, in the example following, there isn't a
1107sensible way to type the call to `ptr-add`:
1108
1109        forall(otype T | { T ?+?(T, T) })
1110        void ptr-add( T* rtn, T* x, T* y ) {
1111                *rtn = *x + *y;
1112        }
1113       
1114        int main() {
1115                int i, j = 2;
1116                short s = 3;
1117                ptr-add(&i, &j, &s); // ERROR &s is not an int*
1118        }
1119
1120I think there is some value in providing library authors with the
1121capability to express "these two parameter types must match exactly".
1122This can be done without restricting the language's expressivity, as the `add` 
1123case above can be made to work under the strictest type variable binding
1124semantics with any addition operator in the system by changing its signature
1125as follows:
1126
1127        forall( otype T, otype R, otype S | { R ?+?(T, S); } )
1128        R add(T x, S y) { return x + y; }
1129
1130Now, it is somewhat unfortunate that the most general version here is more
1131verbose (and thus that the path of least resistence would be more restrictive
1132library code); however, the breaking case in the example code above is a bit
1133odd anyway - explicitly taking two variables of distinct types and relying on
1134C's implicit conversions to do the right thing is somewhat bad form,
1135especially where signed/unsigned conversions are concerned.
1136I think the more common case for implicit conversions is the following,
1137though, where the conversion is used on a literal:
1138
1139        short x = 40;
1140        short y = add(x, 2);
1141
1142One option to handle just this case would be to make literals implicitly
1143convertable to match polymorphic type variables, but only literals.
1144The example above would actually behave slightly differently than `x + 2` in
1145C, though, casting the `2` down to `short` rather than the `x` up to `int`, a
1146possible demerit of this scheme.
1147
1148The other question to ask would be which conversions would be allowed for
1149literals; it seems rather odd to allow down-casting `42ull` to `char`, when
1150the programmer has explicitly specified by the suffix that it's an unsigned
1151long.
1152Type interpretations of literals in C are rather complex (see [1]), but one
1153reasonable approach would be to say that un-suffixed integer literals could be
1154interpreted as any type convertable from int, "u" suffixed literals could be
1155interpreted as any type convertable from "unsigned int" except the signed
1156integer types, and "l" or "ll" suffixed literals could only be interpreted as
1157`long` or `long long`, respectively (or possibly that the "u" suffix excludes
1158the signed types, while the "l" suffix excludes the types smaller than
1159`long int`, as in [1]).
1160Similarly, unsuffixed floating-point literals could be interpreted as `float`,
1161`double` or `long double`, but "f" or "l" suffixed floating-point literals
1162could only be interpreted as `float` or `long double`, respectively.
1163I would like to specify that character literals can only be interpreted as
1164`char`, but the wide-character variants and the C practice of typing character
1165literals as `int` means that would likely break code, so character literals
1166should be able to take any integer type.
1167
1168[1] http://en.cppreference.com/w/c/language/integer_constant
1169
1170With the possible exception of the `add` case above, implicit conversions to
1171the function types of assertions can handle most of the expected behaviour
1172from C.
1173However, implicit conversions cannot be applied to match variable assertions,
1174as in the following example:
1175
1176        forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
1177        T clamp_sum( T x, T y ) {
1178                T sum = x + y;
1179                if ( sum < min ) return min;
1180                if ( max < sum ) return max;
1181                return sum;
1182        }
1183       
1184        char min = 'A';
1185        double max = 100.0;
1186        //int val = clamp_sum( 'X', 3.14 );  // ERROR (1)
1187       
1188        char max = 'Z'
1189        char val = clamp_sum( 'X', 3.14 ); // MATCH (2)
1190        double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3)
1191
1192In this example, the call to `clamp_sum` at (1) doesn't compile, because even
1193though there are compatible `min` and `max` variables of types `char` and
1194`double`, they need to have the same type to match the constraint, and they
1195don't.
1196The (2) example does compile, but with a behaviour that might be a bit
1197unexpected given the "usual arithmetic conversions", in that both values are
1198narrowed to `char` to match the `min` and `max` constraints, rather than
1199widened to `double` as is usual for mis-matched arguments to +.
1200The (3) example is the only case discussed here that would require the most
1201permisive type binding semantics - here, `T` is bound to `char`, to match the
1202constraints, and both the parameters are narrowed from `double` to `char` 
1203before the call, which would not be allowed under either of the more
1204restrictive binding semantics.
1205However, the behaviour here is unexpected to the programmer, because the
1206return value will be `(double)'A' /* == 60.0 */` due to the conversions,
1207rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.
1208
1209Personally, I think that implicit conversions are not a particularly good
1210language design, and that the use-cases for them can be better handled with
1211less powerful features (e.g. more versatile rules for typing constant
1212expressions).
1213However, though we do need implicit conversions in monomorphic code for C
1214compatibility, I'm in favour of restricting their usage in polymorphic code,
1215both to give programmers some stronger tools to express their intent and to
1216shrink the search space for the resolver.
1217Of the possible binding semantics I've discussed, I'm in favour of forcing
1218polymorphic type variables to bind exactly, though I could be talked into
1219allowing literal expressions to have more flexibility in their bindings, or
1220possibly loosening "type variables bind exactly" to "type variables bind
1221exactly at least once"; I think the unrestricted combination of implicit
1222conversions and polymorphic type variable binding unneccesarily multiplies the
1223space of possible function resolutions, and that the added resolution options
1224are mostly unexpected and therefore confusing and not useful to user
1225programmers.
1226
1227## Resolver Architecture ##
1228
1229### Function Application Resolution ###
1230Our resolution algorithm for function application expressions is based on
1231Baker's[3] single-pass bottom-up algorithm, with Cormack's[4] single-pass
1232top-down algorithm applied where appropriate as an optimization.
1233Broadly speaking, the cost of this resolution per expression will be
1234proportional to `i^d`, where `i` is the number of interpretations of each
1235program symbol, and `d` is the maximum depth of the expression DAG.
1236Since `d` is determined by the user programmer (in general, bounded by a small
1237constant), opportunities for resolver optimization primarily revolve around
1238minimizing `i`, the number of interpretations of each symbol that are
1239considered.
1240
1241[3] Baker, Theodore P. A one-pass algorithm for overload resolution in Ada.
1242ACM Transactions on Programming Languages and Systems (1982) 4:4 p.601-614
1243
1244[4] Cormack, Gordon V. An algorithm for the selection of overloaded functions
1245in Ada. SIGPLAN Notices (1981) 16:2 p.48-52
1246
1247Unlike Baker, our system allows implicit type conversions for function
1248arguments and return types; the problem then becomes to find the valid
1249interpretation for an expression that has the unique minimal conversion cost,
1250if such exists.
1251Interpretations can be produced both by overloaded names and implicit
1252conversions applied to existing interpretations; we have proposals to reduce
1253the number of interpretations considered from both sources.
1254To simplify the problem for this discussion, we will consider application
1255resolution restricted to a domain of functions applied to variables, possibly
1256in a nested manner (e.g. `f( g( x ), y )`, where `x` and `y` are variables and
1257`f` and `g` are functions), and possibly in a typed context such as a variable
1258initialization (e.g. `int i = f( x );`); the other aspects of Cforall type
1259resolution should be able to be straightforwardly mapped into this model.
1260The types of the symbol tables used for variable and function declarations
1261look somewhat like the following:
1262
1263        variable_table = name_map( variable_name, variable_map )
1264       
1265        function_table = name_map( function_name, function_map )
1266       
1267        variable_map = multi_index( by_type( variable_type ),
1268                                    variable_decl_set )
1269
1270        function_map = multi_index( by_int( n_params ),
1271                                                                by_type( return_type ),
1272                                                                function_decl_set )
1273
1274`variable_name` and `function_name` are likely simple strings, with `name_map` 
1275a hash table (or perhaps trie) mapping string keys to values.
1276`variable_decl_set` and `function_decl_set` can be thought of for the moment
1277as simple bags of typed declarations, where the declaration types are linked
1278to the graph of available conversions for that type.
1279In a typed context both the `variable_decl_set` and the `function_decl_set` 
1280should be able to be selected upon by type; this is accomplished by the
1281`by_type` index of both `variable_map` and `function_map`.
1282The `by_int` index of `function_map` also provides a way to select functions
1283by their number of parameters; this index may be used to swiftly discard any
1284function declaration which does not have the appropriate number of parameters
1285for the argument interpretations being passed to it; given the likely small
1286number of entries in this map, it is possible that a binary search of a sorted
1287vector or even a linear search of an unsorted vector would be more efficient
1288than the usual hash-based index.
1289
1290Given these data structures, the general outline of our algorithm follows
1291Baker, with Cormack's algorithm used as a heuristic filter in typed contexts.
1292
1293In an untyped context, we use a variant of Baker's bottom-up algorithm.
1294The leaves of the interpretation DAG are drawn from the variable symbol table,
1295with entries in the table each producing zero-cost interpretations, and each
1296implicit conversion available to be applied to the type of an existing entry
1297producing a further interpretation with the same cost as the conversion.
1298As in Baker, if two or more interpretations have the same type, only the
1299minimum cost interpretation with that type is produced; if there is no unique
1300minimum cost interpretation than resolution with that type is ambiguous, and
1301not permitted.
1302It should be relatively simple to produce the list of interpretations sorted
1303by cost by producing the interpretations via a breadth-first search of the
1304conversion graph from the initial interpretations provided in the variable
1305symbol table.
1306
1307To match a function at one of the internal nodes of the DAG, we first look up
1308the function's name in the function symbol table, the appropriate number of
1309parameters for the arguments that are provided through the `by_int` index of
1310the returned `function_map`, then go through the resulting `function_decl_set` 
1311searching for functions where the parameter types can unify with the provided
1312argument lists; any such matching function produces an interpretation with a
1313cost that is the sum of its argument costs.
1314Though this is not included in our simplified model, this unification step may
1315include binding of polymorphic variables, which introduces a cost for the
1316function binding itself which must be added to the argument costs.
1317Also, checking of function assertions would likely be done at this step as
1318well, possibly eliminating some possible matching functions (if no suitable
1319assertions can be satisfied), or adding further conversion costs for the
1320assertion satisfaction.
1321Once the set of valid function interpretations is produced, these may also be
1322expanded by the graph of implicit conversions on their return types, as the
1323variable interpretations were.
1324
1325This implicit conversion-based expansion of interpretations should be skipped
1326for the top-level expression if used in an untyped (void) context, e.g. for
1327`f` in `f( g ( x ) );` or `x` in `x;`.
1328On the other hand, if the top-level expression specifies a type, e.g. in
1329`int i = f( x );`, only top level expressions that return that type are
1330relevant to the search, so the candidates for `f` can be filtered first by
1331those that return `int` (or a type convertable to it); this can be
1332accomplished by performing a top-down filter of the interpretations of `f` by
1333the `by_type` index of the `function_map` in a manner similar to Cormack's[4]
1334algorithm.
1335
1336In a typed context, such as an initialization expression
1337`T x = f( g( y ), z );`, only interpretations of `f( g( y ), z )` which have
1338type `T` are valid; since there are likely to be valid interpretations of
1339`f( g( y ), z )` which cannot be used to initialize a variable of type `T`, we
1340can use this information to reduce the number of interpretations considered.
1341Drawing from Cormack[4], we first search for interpretations of `f` where the
1342return type is `T`; by breadth-first-search of the conversion graph, it should
1343be straightforward to order the interpretations of `f` by the cost to convert
1344their return type to `T`.
1345We can also filter out interpretations of `f` with less than two parameters,
1346since both `g( y )` and `z` must produce at least one parameter; we may not,
1347however, rule out interpretations of `f` with more than two parameters, as
1348there may be a valid interpretation of `g( y )` as a function returning more
1349than one parameter (if the expression was `f( y, z )` instead, we could use an
1350exact parameter count, assuming that variables of tuple type don't exist).
1351For each compatible interpretation of `f`, we can add the type of the first
1352parameter of that interpretation of `f` to a set `S`, and recursively search
1353for interpretations of `g( y )` that return some type `Si` in `S`, and
1354similarly for interpretations of `z` that match the type of any of the second
1355parameters of some `f`.
1356Naturally, if no matching interpretation of `g( y )` can be found for the
1357first parameter of some `f`, the type of the second parameter of that `f` will
1358not be added to the set of valid types for `z`.
1359Each node in this interpretation DAG is given a cost the same way it would be
1360in the bottom-up approach, with the exception that when going top-down there
1361must be a final bottom-up pass to sum the interpretation costs and sort them
1362as appropriate.
1363
1364If a parameter type for some `f` is a polymorphic type variable that is left
1365unbound by the return type (e.g. `forall(otype S) int f(S x, int y)`), the
1366matching arguments should be found using the bottom-up algorithm above for
1367untyped contexts, because the polymorphic type variable does not sufficiently
1368constrain the available interpretations of the argument expression.
1369Similarly, it would likely be an advantage to use top-down resolution for
1370cast expressions (e.g. `(int)x`), even when those cast expressions are
1371subexpressions of an otherwise untyped expression.
1372It may also be fruitful to switch between the bottom-up and top-down
1373algorithms if the number of valid interpretations for a subexpression or valid
1374types for an argument exceeds some heuristic threshold, but finding such
1375a threshold (if any exists) will require experimental data.
1376This hybrid top-down/bottom-up search provides more opportunities for pruning
1377interpretations than either a bottom-up or top-down approach alone, and thus
1378may be more efficient than either.
1379A top-down-only approach, however, devolves to linear search through every
1380possible interpretation in the solution space in an untyped context, and is
1381thus likely to be inferior to a strictly bottom-up approach, though this
1382hypothesis needs to be empirically validated.
1383
1384Both Baker and Cormack explicitly generate all possible interpretations of a
1385given expression; thinking of the set of interpretations of an expression as a
1386list sorted by cost, this is an eager evaluation of the list.
1387However, since we generally expect that user programmers will not often use
1388high-cost implicit conversions, one potentially effective way to prune the
1389search space would be to first find the minimal-cost interpretations of any
1390given subexpression, then to save the resolution progress of the
1391subexpressions and attempt to resolve the superexpression using only those
1392subexpression interpretations.
1393If no valid interpretation of the superexpression can be found, the resolver
1394would then repeatedly find the next-most-minimal cost interpretations of the
1395subexpressions and attempt to produce the minimal cost interpretation of the
1396superexpression.
1397This process would proceed until all possible subexpression interpretations
1398have been found and considered.
1399
1400A middle ground between the eager and lazy approaches can be taken by
1401considering the lexical order on the cost tuple; essentially, every
1402interpretation in each of the classes below will be strictly cheaper than any
1403interpretation in the class after it, so if a min-cost valid interpretation
1404can be found while only generating interpretations in a given class, that
1405interpretation is guaranteed to be the best possible one:
1406
14071. Interpretations without polymorphic functions or implicit conversions
14082. Interpretations without polymorphic functions using only safe conversions
14093. Interpretations using polymorphic functions without unsafe conversions
14104. Interpretations using unsafe conversions
1411
1412In this lazy-eager approach, all the interpretations in one class would be
1413eagerly generated, while the interpretations in the next class would only be
1414considered if no match was found in the previous class.
1415
1416## Appendix A: Partial and Total Orders ##
1417The `<=` relation on integers is a commonly known _total order_, and
1418intuitions based on how it works generally apply well to other total orders.
1419Formally, a total order is some binary relation `<=` over a set `S` such that
1420for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and
1421`b` must be equal, the _antisymmetry_ property); total orders also have a
1422_transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`.
1423If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
1424 
1425A _partial order_ is a generalization of this concept where the `<=` relation
1426is not required to be defined over all pairs of elements in `S` (though there
1427is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other
1428words, it is possible for two elements `a` and `b` of `S` to be
1429_incomparable_, unable to be ordered with respect to one another (any `a` and
1430`b` for which either `a <= b` or `b <= a` are called _comparable_).
1431Antisymmetry and transitivity are also required for a partial order, so all
1432total orders are also partial orders by definition.
1433One fairly natural partial order is the "subset of" relation over sets from
1434the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are
1435both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the
1436other - they are incomparable under this relation.
1437
1438We can compose two (or more) partial orders to produce a new partial order on
1439tuples drawn from both (or all the) sets.
1440For example, given `a` and `c` from set `S` and `b` and `d` from set `R`,
1441where both `S` and `R` both have partial orders defined on them, we can define
1442a ordering relation between `(a, b)` and `(c, d)`.
1443One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff
1444`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by
1445the first set and "breaking ties" by the second set.
1446Another common order is the _product order_, which can be roughly thought of
1447as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
1448iff `a <= c` and `b <= d`.
1449One difference between the lexicographical order and the product order is that
1450in the lexicographical order if both `a` and `c` and `b` and `d` are
1451comparable then `(a, b)` and `(c, d)` will be comparable, while in the product
1452order you can have `a <= c` and `d <= b` (both comparable) which will make
1453`(a, b)` and `(c, d)` incomparable.
1454The product order, on the other hand, has the benefit of not prioritizing one
1455order over the other.
1456
1457Any partial order has a natural representation as a directed acyclic graph
1458(DAG).
1459Each element `a` of the set becomes a node of the DAG, with an arc pointing to
1460its _covering_ elements, any element `b` such that `a < b` but where there is
1461no `c` such that `a < c` and `c < b`.
1462Intuitively, the covering elements are the "next ones larger", where you can't
1463fit another element between the two.
1464Under this construction, `a < b` is equivalent to "there is a path from `a` to
1465`b` in the DAG", and the lack of cycles in the directed graph is ensured by
1466the antisymmetry property of the partial order.
1467
1468Partial orders can be generalized to _preorders_ by removing the antisymmetry
1469property.
1470In a preorder the relation is generally called `<~`, and it is possible for
1471two distict elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case
1472we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`.
1473Preorders may also be represented as directed graphs, but in this case the
1474graph may contain cycles.
1475
1476## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
1477The short answer is that it's impossible.
1478
1479The longer answer is that it has to do with what's essentially a diamond
1480inheritance problem.
1481In C, `int` converts to `unsigned int` and also `long` "safely"; both convert
1482to `unsigned long` safely, and it's possible to chain the conversions to
1483convert `int` to `unsigned long`.
1484There are two constraints here; one is that the `int` to `unsigned long` 
1485conversion needs to cost more than the other two (because the types aren't as
1486"close" in a very intuitive fashion), and the other is that the system needs a
1487way to choose which path to take to get to the destination type.
1488Now, a fairly natural solution for this would be to just say "C knows how to
1489convert from `int` to `unsigned long`, so we just put in a direct conversion
1490and make the compiler smart enough to figure out the costs" - given that the
1491users can build an arbitrary graph of conversions, this needs to be handled
1492anyway.
1493We can define a preorder over the types by saying that `a <~ b` if there
1494exists a chain of conversions from `a` to `b`.
1495This preorder corresponds roughly to a more usual type-theoretic concept of
1496subtyping ("if I can convert `a` to `b`, `a` is a more specific type than
1497`b`"); however, since this graph is arbitrary, it may contain cycles, so if
1498there is also a path to convert `b` to `a` they are in some sense equivalently
1499specific.
1500
1501Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and
1502`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`)
1503and this conversion preorder over the destination types `xn` and `ym`.
1504We could define a preorder by taking chain length and breaking ties by the
1505conversion preorder, but this would lead to unexpected behaviour when closing
1506diamonds with an arm length of longer than 1.
1507Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`,
1508`B2->C`, and `A->C`.
1509If we are comparing conversions from `A` to both `B2` and `C`, we expect the
1510conversion to `B2` to be chosen because it's the more specific type under the
1511conversion preorder, but since its chain length is longer than the conversion
1512to `C`, it loses and `C` is chosen.
1513However, taking the conversion preorder and breaking ties or ambiguities by
1514chain length also doesn't work, because of cases like the following example
1515where the transitivity property is broken and we can't find a global maximum:
1516
1517        `X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`
1518
1519In this set of arcs, if we're comparing conversions from `X` to each of `Y2`,
1520`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because
1521there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain
1522length.
1523Also, comparing conversions from `X` to `Z3` and to `W`, we find that the
1524conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder,
1525and so is considered to be the nearer type.
1526By transitivity, then, the conversion from `X` to `Y2` should be cheaper than
1527the conversion from `X` to `W`, but in this case the `X` and `W` are
1528incomparable by the conversion preorder, so the tie is broken by the shorter
1529path from `X` to `W` in favour of `W`, contradicting the transitivity property
1530for this proposed order.
1531
1532Without transitivity, we would need to compare all pairs of conversions, which
1533would be expensive, and possibly not yield a minimal-cost conversion even if
1534all pairs were comparable.
1535In short, this ordering is infeasible, and by extension I believe any ordering
1536composed solely of single-step conversions between types with no further
1537user-supplied information will be insufficiently powerful to express the
1538built-in conversions between C's types.
1539
1540## Appendix C: Proposed Prelude Changes ##
1541**TODO** Port Glen's "Future Work" page for builtin C conversions.
1542**TODO** Move discussion of zero_t, unit_t here.
1543
1544It may be desirable to have some polymorphic wrapper functions in the prelude
1545which provide consistent default implementations of various operators given a
1546definition of one of them.
1547Naturally, users could still provide a monomorphic overload if they wished to
1548make their own code more efficient than the polymorphic wrapper could be, but
1549this would minimize user effort in the common case where the user cannot write
1550a more efficient function, or is willing to trade some runtime efficiency for
1551developer time.
1552As an example, consider the following polymorphic defaults for `+` and `+=`:
1553
1554        forall(otype T | { T ?+?(T, T); })
1555        lvalue T ?+=? (T *a, T b) {
1556                return *a = *a + b;
1557        }
1558       
1559        forall(otype T | { lvalue T ?+=? (T*, T) })
1560        T ?+? (T a, T b) {
1561                T tmp = a;
1562                return tmp += b;
1563        }
1564
1565Both of these have a possibly unneccessary copy (the first in copying the
1566result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in
1567cases where these copies are unavoidable the polymorphic wrappers should be
1568just as performant as the monomorphic equivalents (assuming a compiler
1569sufficiently clever to inline the extra function call), and allow programmers
1570to define a set of related operators with maximal concision.
1571
1572**TODO** Look at what Glen and Richard have already written for this.
1573
1574## Appendix D: Feasibility of Making void* Conversions Explicit ##
1575C allows implicit conversions between `void*` and other pointer types, as per
1576section 6.3.2.3.1 of the standard.
1577Making these implicit conversions explicit in Cforall would provide
1578significant type-safety benefits, and is precedented in C++.
1579A weaker version of this proposal would be to allow implicit conversions to
1580`void*` (as a sort of "top type" for all pointer types), but to make the
1581unsafe conversion from `void*` back to a concrete pointer type an explicit
1582conversion.
1583However, `int *p = malloc( sizeof(int) );` and friends are hugely common
1584in C code, and rely on the unsafe implicit conversion from the `void*` return
1585type of `malloc` to the `int*` type of the variable - obviously it would be
1586too much of a source-compatibility break to disallow this for C code.
1587We do already need to wrap C code in an `extern "C"` block, though, so it is
1588technically feasible to make the `void*` conversions implicit in C but
1589explicit in Cforall.
1590Also, for calling C code with `void*`-based APIs, pointers-to-dtype are
1591calling-convention compatible with `void*`; we could read `void*` in function
1592signatures as essentially a fresh dtype type variable, e.g:
1593
1594        void* malloc( size_t )
1595                => forall(dtype T0) T0* malloc( size_t )
1596        void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1597                => forall(dtype T0, dtype T1, dtype T2)
1598                   void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )
1599
1600This would even allow us to leverage some of Cforall's type safety to write
1601better declarations for legacy C API functions, like the following wrapper for
1602`qsort`:
1603
1604        extern "C" { // turns off name-mangling so that this calls the C library
1605                // call-compatible type-safe qsort signature
1606                forall(dtype T)
1607                void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );
1608               
1609                // forbid type-unsafe C signature from resolving
1610                void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
1611                        = delete;
1612        }
Note: See TracBrowser for help on using the repository browser.