# Thoughts on Resolver Design #

## Conversions ##
C's implicit "usual arithmetic conversions" define a structure among the 
built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of 
_safe_ widening conversions. 
There is also a set of _explicit_ conversions that are only allowed through a 
cast expression.
Based on Glen's notes on conversions [1], I propose that safe and unsafe 
conversions be expressed as constructor variants, though I make explicit 
(cast) conversions a constructor variant as well rather than a dedicated 
operator. 
Throughout this article, I will use the following operator names for 
constructors and conversion functions from `From` to `To`:

	void ?{} ( To*, To );            // copy constructor
	void ?{} ( To*, From );          // explicit constructor
	void ?{explicit} ( To*, From );  // explicit cast conversion
	void ?{safe} ( To*, From );      // implicit safe conversion
	void ?{unsafe} ( To*, From );    // implicit unsafe conversion

[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html

Glen's design made no distinction between constructors and unsafe implicit 
conversions; this is elegant, but interacts poorly with tuples. 
Essentially, without making this distinction, a constructor like the following 
would add an interpretation of any two `int`s as a `Coord`, needlessly 
multiplying the space of possible interpretations of all functions:

	void ?{}( Coord *this, int x, int y );

That said, it would certainly be possible to make a multiple-argument implicit 
conversion, as below, though the argument above suggests this option should be 
used infrequently:

	void ?{unsafe}( Coord *this, int x, int y );

An alternate possibility would be to only count two-arg constructors 
`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and 
explicit conversions should also have a compiler-enforced restriction to 
ensure that they are two-arg functions (this restriction may be valuable 
regardless).

Regardless of syntax, there should be a type assertion that expresses that 
`From` is convertible to `To`. 
If user-defined conversions are not added to the language, 
`void ?{} ( To*, From )` may be a suitable representation, relying on 
conversions on the argument types to account for transitivity. 
On the other hand, `To*` should perhaps match its target type exactly, so 
another assertion syntax specific to conversions may be required, e.g. 
`From -> To`.
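
As a minimal sketch of the constructor-as-assertion representation (the 
function name `from_int` is invented for illustration), a function generic 
over "any type an `int` is convertible to" might look like:

	forall(otype To | { void ?{}( To*, int ); })
	To from_int( int x ) {
		To tmp = x;  // initialization resolved by the asserted constructor
		return tmp;
	}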

### Constructor Idiom ###
Basing our notion of conversions off otherwise normal Cforall functions means 
that we can use the full range of Cforall features for conversions, including 
polymorphism.
Glen [1] defines a _constructor idiom_ that can be used to create chains of 
safe conversions without duplicating code; given a type `Safe` to which 
members of another type `From` can be directly converted, the constructor 
idiom allows us to write a conversion for any type `To` to which `Safe` 
converts: 

	forall(otype To | { void ?{safe}( To*, Safe ) })
	void ?{safe}( To *this, From that ) {
		Safe tmp = /* some expression involving that */;
		*this = tmp; // uses assertion parameter
	}

This idiom can also be used with only minor variations for a parallel set of 
unsafe conversions.
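
For concreteness, a minimal sketch of the unsafe variant, where `Unsafe` 
stands in for the type `From` directly converts to (by analogy with `Safe` 
above):

	forall(otype To | { void ?{unsafe}( To*, Unsafe ) })
	void ?{unsafe}( To *this, From that ) {
		Unsafe tmp = /* some expression involving that */;
		*this = tmp; // uses assertion parameter
	}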

What selective non-use of the constructor idiom gives us is the ability to 
define a conversion that may only be the *last* conversion in a chain of 
conversions. 
Constructing a conversion graph able to unambiguously represent the full 
hierarchy of implicit conversions in C is provably impossible using only 
single-step conversions with no additional information (see Appendix B), but 
this mechanism is sufficiently powerful (see [1], though the design there has 
some minor bugs; the general idea is to use the constructor idiom to define 
two chains of conversions, one among the signed integral types, another among 
the unsigned, and to use monomorphic conversions to allow conversions between 
signed and unsigned integer types).

### Implementation Details ###
It is desirable to have a system which can be efficiently implemented, yet 
which also has sufficient power to distinguish between functions on all 
possible axes of polymorphism. 
The ordering of functions by degree of polymorphism may only be a partial 
order, which may complicate implementation somewhat; in this case it may be 
desirable to store the set of implementations for a given function as the 
directed acyclic graph (DAG) representing the order.
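
A minimal sketch of such a representation (the `Function` type and all field 
names here are hypothetical):

	// One node per implementation; edges point at strictly less-polymorphic
	// implementations, so the nodes and edges form the DAG of the order.
	struct impl_node {
		Function * impl;            // the implementation itself
		struct impl_node ** below;  // less-polymorphic implementations
		unsigned n_below;           // length of the adjacency list
	};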

## Conversion Costs ##
Each possible resolution of an expression has a _cost_ tuple consisting of 
the following components:
1. _unsafe_ conversion cost: summed degree of unsafe conversions; unlike 
   CFA03, this is not a simple count of conversions (for symmetry with the 
   safe conversions)
2. _polymorphic unifications_: count of parameters and return values bound to 
   some polymorphic type for boxing
3. _type variables_: number of polymorphic type variables bound
4. negated _type specializations_: each type assertion specializes the 
   polymorphism, thus decreasing the cost; nested polymorphic types (e.g. 
   `T*`) are also counted as specializations
5. _safe_ conversions: summed degree of safe conversions
6. _qualifier_ conversions: summed degree of qualifier and reference 
   conversions

These components are lexically-ordered and can be summed element-wise; 
summation starts at `(0, 0, 0, 0, 0, 0)` (see the sketch below).
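
A minimal sketch of the intended comparison and summation, assuming the six 
components are stored in a flat array in the order listed above (note that 
storing component 4 negated makes plain lexicographic comparison correct):

	// lexicographic comparison: negative if a < b, 0 if equal, positive if a > b
	int cmp_cost( const int a[6], const int b[6] ) {
		for ( int i = 0; i < 6; ++i ) {
			if ( a[i] != b[i] ) return a[i] - b[i];
		}
		return 0;
	}
	
	// element-wise summation of two cost tuples
	void add_cost( int out[6], const int a[6], const int b[6] ) {
		for ( int i = 0; i < 6; ++i ) out[i] = a[i] + b[i];
	}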

**TODO** update below for consistency with this

### Lvalue and Qualifier Conversions ###
C defines the notion of an _lvalue_, essentially an addressable object, as well 
as a number of type _qualifiers_, `const`, `volatile`, and `restrict`. 
As these type qualifiers are generally only meaningful to the type system as 
applied to lvalues, the two concepts are closely related. 
A `const` lvalue cannot be modified; the compiler cannot assume that a 
`volatile` lvalue will not be concurrently modified by some other part of the 
system; and a `restrict` lvalue must have pointer type, and the compiler may 
assume that no other pointer in scope aliases that pointer (this is solely a 
performance optimization, and may be ignored by implementers).
_Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts 
it to an expression result of type `T` (commonly called an _rvalue_ of type 
`T`), also strips all the qualifiers from the lvalue, as an expression result 
is a value, not an addressable object that can have properties like 
immutability. 
Though lvalue-to-rvalue conversion strips the qualifiers from lvalues, 
derived rvalue types such as pointer types may include qualifiers; 
`const int *` is a distinct type from `int *`, though the latter is safely 
convertible to the former. 
In general, any number of qualifiers can be safely added to the 
pointed-to-type of a pointer type, e.g. `int *` converts safely to 
`const int *` and `volatile int *`, both of which convert safely to 
`const volatile int *`.
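
For example, all of the following are safe qualifier conversions in standard C:

	int * p;
	const int * cp = p;             // add const to pointed-to type
	volatile int * vp = p;          // add volatile to pointed-to type
	const volatile int * cvp = cp;  // add volatile to const int *
	// int * q = cp;                // error: dropping const requires a cast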

Since lvalues are precisely "addressable objects", in C only lvalues can be 
used as the operand of the `&` address-of operator. 
Similarly, only modifiable lvalues may be used as the assigned-to 
operand of the mutating operators: assignment, compound assignment 
(e.g. `+=`), and increment and decrement; roughly speaking, lvalues without 
the `const` qualifier are modifiable, but lvalues of incomplete types, array 
types, and struct or union types with const members are also not modifiable. 
Lvalues are produced by the following expressions: object identifiers 
(function identifiers are not considered to be lvalues), the result of the `*` 
dereference operator applied to an object pointer, the result of a member 
expression `s.f` if the left argument `s` is an lvalue (note that the 
preceding two rules imply that the result of indirect member expressions 
`s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the 
indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`). 
Somewhat less obviously, parenthesized lvalue expressions, string literals, 
and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues.
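
For example, the following standard C declarations exercise the less obvious 
lvalue cases:

	int x;
	int * px = &(x);            // parenthesized lvalues are lvalues
	char * ps = &"hello"[1];    // string literals are lvalues
	int * pc = &(int){ 42 };    // compound literals are lvalues
	// &42;                     // error: integer literals are not lvalues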

All of the conversions described above are defined in standard C, but Cforall 
requires further features from its type system. 
In particular, to allow overloading of the `*?` and `?[?]` dereferencing and 
indexing operators, Cforall requires a way to declare that the functions 
defining these operators return lvalues; since C functions never return 
lvalues, and since for syntactic reasons we wish to distinguish functions 
which return lvalues from functions which return pointers, this is of 
necessity an extension to standard C. 
In the current design, an `lvalue` qualifier can be added to function return 
types (and only to function return types), the effect of which is to return a 
pointer which is implicitly dereferenced by the caller.
C++ includes the more general concept of _references_, which are typically 
implemented as implicitly dereferenced pointers as well. 
Another use case which C++ references support is providing a way to pass 
function parameters by reference (rather than by value) with a natural 
syntax; Cforall in its current state has no such mechanism. 
As an example, consider the following (currently typical) copy-constructor 
signature and call:

	void ?{}(T *lhs, T rhs);
	
	T x;
	T y = { x };

Note that the right-hand argument is passed by value, and would in fact be 
copied twice in the course of the constructor call `T y = { x };` (once into 
the parameter by C's standard `memcpy` semantics, once again in the body of 
the copy constructor, though it is possible that return value optimization 
will elide the `memcpy`-style copy).
However, to pass by reference using the existing pointer syntax, the example 
above would look like this:

	void ?{}(T *lhs, const T *rhs);
	
	T x;
	T y = { &x };

This example is not even as bad as it could be; assuming pass-by-reference is 
the desired semantics for the `?+?` operator, that implies the following 
design today:

	T ?+?(const T *lhs, const T *rhs);
	
	T a, b;
	T c = &a + &b;

In addition to `&a + &b` being unsightly and confusing syntax to add `a` and 
`b`, it also introduces a possible ambiguity with pointer arithmetic on `T*` 
which can only be resolved by return-type inference.

Pass-by-reference and marking functions as returning lvalues instead of the 
usual rvalues are actually closely related concepts, as obtaining a reference 
to pass depends on the referenced object being addressable, i.e. an lvalue, 
and lvalue return types are effectively return-by-reference. 
Cforall should also unify the concepts, with a parameterized type for 
"reference to `T`", which I will write `ref T`. 
Syntax bikeshedding can be done later (there are some examples at the bottom 
of this section), but `ref T` is sufficiently distinct from both the existing 
`lvalue T` (which it subsumes) and the closely related C++ `T&` to allow 
independent discussion of its semantics.

Firstly, assignment to a function parameter as part of a function call and 
local variable initialization have almost identical semantics, so should be 
treated similarly for the reference type too; this implies we should be able 
to declare local variables of reference type, as in the following:

	int x = 42;
	ref int r = x; // r is now an alias for x

Unlike in C++, we would like to have the capability to re-bind references 
after initialization, as this allows the attractive syntax of references to 
support some further useful code patterns, such as first initializing a 
reference after its declaration. 
Constant references to `T` (`const ref T`) should not be re-bindable. 

One option for re-binding references is to use a dedicated operator, as in the 
code example below:

	int i = 42, j = 7;
	ref int r = i; // bind r to i
	r = j;         // set i (== r) to 7
	r := j;        // rebind r to j using the new := rebind operator
	i = 42;        // reset i (!= r) to 42
	assert( r == 7 );

The other syntactic option for reference re-bind would be to overload 
assignment and use type inference on the left and right-hand sides to 
determine whether the referred-to variable on the left should be reassigned to 
the value on the right, or if the reference on the left should be aliased to 
the reference on the right. 
This could be disambiguated with casts, as in the following code example:

	int i;
	int j;
	ref int r = i;   // (0a)
	ref int s = i;   // (0b)
	
	i = j;           // (1)
	i = (int)s;      // (2)
	i = s;           // (3)
	// ---------------------
	r = s;           // (4)
	r = (ref int)j;  // (5)
	// ---------------------
	r = j;           // (6)
	r = (int)s;      // (7)

By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as 
aliases for `i`.
For C compatibility, (1) has to be assignment; in general, any assignment to a 
non-reference type should be assignment, so (2) and (3) are as well. 
By types, (4) and (5) should have the same semantics, and the semantics of (6) 
and (7) should match as well. 
This suggests that (4) and (5) are reference re-bind, and (6) and (7) are an 
assignment to the referred variable; this makes the syntax to explicitly alias 
a local variable rather ugly (and inconsistent with the initialization 
syntax), as well as making it rather awkward to copy the value stored in one 
reference-type variable into another reference type variable (which is likely 
more painful in functions with by-reference parameters than with local 
variables of reference type).

Because of the aforementioned issues with overloading assignment as reference 
rebind, in addition to the fact that reference rebind should not be a 
user-overloadable operator (unlike assignment), I propose reference rebind 
should have its own dedicated operator. 

The semantics and restrictions of `ref T` are effectively the semantics of an 
lvalue of type `T`, and by this analogy there should be a safe, qualifier 
dropping conversion from `ref const volatile restrict T` (and every other 
qualifier combination on the `T` in `ref T`) to `T`. 
With this conversion, the resolver may type most expressions that C would 
call "lvalue of type `T`" as `ref T`. 
There's also an obvious argument that lvalues of a (possibly-qualified) type 
`T` should be convertable to references of type `T`, where `T` is also 
so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to 
`ref const char`). 
By similar arguments to pointer types, qualifiers should be addable to the 
referred-to type of a reference (e.g. `ref int` to `ref const int`). 
As a note, since pointer arithmetic is explicitly not defined on `ref T`, 
`restrict ref T` should be allowable and would have alias-analysis rules that 
are actually comprehensible to mere mortals.

Using pass-by-reference semantics for function calls should not put syntactic 
constraints on how the function is called; particularly, temporary values 
should be able to be passed by reference. 
The mechanism for this pass-by-reference would be to store the value of the 
temporary expression into a new unnamed temporary, and pass the reference of 
that temporary to the function.
As an example, the following code should all compile and run:

	void f(ref int x) { printf("%d\n", x++); }
	
	int i = 7, j = 11;
	const int answer = 42;
	
	f(i);      // (1)
	f(42);     // (2)
	f(i + j);  // (3)
	f(answer); // (4)

The semantics of (1) are just like C++'s, "7" is printed, and `i` has the 
value 8 afterward. 
For (2), "42" is printed, and the increment of the unnamed temporary to 43 is 
not visible to the caller; (3) behaves similarly, printing "19", but not 
changing `i` or `j`. 
(4) is a bit of an interesting case; we want to be able to support named 
constants like `answer` that can be used anywhere the constant expression 
they're replacing (like `42`) could go; in this sense, (4) and (2) should have 
the same semantics. 
However, we don't want the mutation to the `x` parameter to be visible in 
`answer` afterward, because `answer` is a constant, and thus shouldn't change. 
The solution to this is to allow chaining of the two `ref` conversions; 
`answer` has the type `ref const int`, which can be converted to `int` by the 
lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int` 
by the temporary-producing rvalue-to-lvalue conversion. 
Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42), 
mutated by `f`, then discarded; "42" is printed, just as in case (2), and 
`answer` still equals 42 after the call, because it was the temporary that was 
mutated, not `answer`.
It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while 
`f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in 
C++, leading to the dreaded "const hell"), but the behaviour of this rule can 
be determined by examining local scope with the simple rule "non-`const` 
references to `const` variables produce temporaries", which aligns with 
programmer intuition that `const` variables cannot be mutated.

To bikeshed syntax for `ref T`, there are three basic options: language 
keywords (`lvalue T` is already in Cforall), compiler-supported "special" 
generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++ 
programmers). 
Keyword or generic based approaches run the risk of name conflicts with 
existing code, while any sigil used would have to be carefully chosen to not 
create parsing conflicts.

**TODO** Consider arguments for move semantics and see if there is a 
compelling case for rvalue references.

### Conversion Operator Costs ###
Copy constructors, safe conversions, and unsafe conversions all have an 
associated conversion cost, calculated according to the algorithm below:

1. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
2. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
3. Monomorphic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
4. Polymorphic conversion operators (or copy constructors) have a conversion 
   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic 
   equivalent and the sum of the conversion costs of all conversion operators 
   passed as assertion parameters, but where the fourth "count" element of the 
   cost tuple is fixed to `1` (a worked example follows this list).
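
By my reading of these rules, the constructor-idiom safe conversion from the 
previous section would be costed as follows:

	// forall(otype To | { void ?{safe}( To*, Safe ) })
	// void ?{safe}( To*, From );
	//
	//   (0, 1, 0, 1)  base polymorphic conversion cost
	// + (0, 0, 1, 1)  monomorphic equivalent (a safe conversion)
	// + (0, 0, 1, 1)  assertion parameter ?{safe}( To*, Safe )
	// = (0, 1, 2, 1)  with the fourth "count" element fixed to 1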

**TODO** Polymorphism cost may need to be reconsidered in the light of the 
thoughts on polymorphism below.
**TODO** You basically just want path-length in the conversion graph implied 
by the set of conversions; the only tricky question is whether or not you can 
account for "mixed" safe and unsafe conversions used to satisfy polymorphic 
constraints, whether a polymorphic conversion should cost more than a 
monomorphic one, and whether to account for non-conversion constraints in the 
polymorphism cost

### Argument-Parameter Matching ###
Given a function `f` with a parameter list (after tuple flattening) 
`(T1 t1, T2 t2, ... Tn tn)`, and a function application 
`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the 
appropriate parameter is calculated according to the algorithm below:

Given a parameter `t` of type `T` and an expression `<e>` from these lists, 
`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with 
associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`. 
(If any `Ei` is a tuple type, replace it with its first flattened element for 
the purposes of this section.) 

The cost of matching the interpretation of `<e>` with type `Ei` to `t` with 
type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the 
conversion operator cost from `Ei` to `T`.

### Object Initialization ###
The cost to initialize an object is calculated very similarly to 
argument-parameter matching, with a few modifications. 
Firstly, explicit constructors are included in the set of available 
conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic 
conversion costs (if applicable) and the _interpretation cost_ of the 
constructor, the sum of the argument-parameter matching costs for its 
parameters. 
Also, ties in overall cost (interpretation cost plus conversion cost) are 
broken by lowest conversion cost (i.e. of alternatives with the same overall 
cost, copy constructors are preferred to other explicit constructors, 
explicit constructors are preferred to safe conversions, which are preferred 
to unsafe conversions). 
An object initialization is properly typed if it has exactly one min-cost 
interpretation.
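
As an illustrative sketch of this preference order (the types `A` and `B` and 
the declarations below are hypothetical):

	void ?{} ( B*, A );        // explicit constructor
	void ?{safe} ( B*, A );    // safe conversion
	void ?{unsafe} ( B*, A );  // unsafe conversion
	
	A a;
	B y = { a };  // should any of these alternatives tie in overall cost,
	              // the explicit constructor beats the safe conversion,
	              // which beats the unsafe conversion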

### Explicit Casts ###
Explicit casts are handled similarly to object initialization. 
Copy constructors and other explicit constructors are not included in the set 
of possible conversions, though interpreting a cast as type ascription 
(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost 
`(0, 0, 0, 0)`. 
Explicit conversion operators are also included in the set of possible 
conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion 
costs are invoked. 
Unlike for explicit constructors and other functions, implicit conversions are 
never applied to the argument or return type of an explicit cast operator, so 
that the cast may be used more effectively as a method for the user programmer 
to guide type resolution.
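
For example, given the overloaded variable declarations below (variable 
overloading is described under "Expression Costs"), a cast can guide 
resolution by zero-cost ascription:

	int x = 42;
	double x = 3.14;
	
	int i = (int)x;  // ascription selects the int interpretation of x,
	                 // with conversion cost (0, 0, 0, 0)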

## Trait Satisfaction ##
A _trait_ consists of a list of _type variables_ along with a (possibly empty) 
set of _assertions_ on those variables. 
Assertions can take two forms, _variable assertions_ and the more common 
_function assertions_, as in the following example:

	trait a_trait(otype T, otype S) {
		T a_variable_assertion;
		S* another_variable_assertion;
		S a_function_assertion( T* );
	};

Variable assertions enforce that a variable with the given name and type 
exists (the type is generally one of the type variables, or derived from one), 
while a function assertion enforces that a function with a 
_compatible signature_ to the provided function exists. 

To test whether a list of types _satisfies_ the trait, the types are first 
_bound_ to the type variables, and then declarations satisfying each assertion 
are sought out. 
Variable assertions require an exact match, because they are passed as object 
pointers, and there is no mechanism to employ conversion functions, while 
function assertions only require a function that can be wrapped to a 
compatible type; for example, the declarations below satisfy 
`a_trait(int, short)`:

	int a_variable_assertion;
	short* another_variable_assertion;
	char a_function_assertion( void* );
	// int* may be implicitly converted to void*, and char to short, so the 
	// above works 

Cforall polymorphic functions have a _constraining trait_, denoted as follows: 

	forall(otype A, otype B | some_trait(A, B))

The trait may be anonymous, with the same syntax as a trait declaration, and 
may be unioned together using `|` or `,`.
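
For example, the following (hypothetical) declaration unions the named trait 
above with an anonymous one; the assertion name `extra` is invented for 
illustration:

	forall(otype A, otype B | some_trait(A, B) | { void extra(A); })
	void h(A, B);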

**TODO** Consider including field assertions in the list of constraint types, 
also associated types and the appropriate matching type assertion.

## Polymorphism Costs ##
The type resolver should prefer functions that are "less polymorphic" to 
functions that are "more polymorphic". 
Determining how to order functions by degree of polymorphism is somewhat less 
straightforward, though, as there are multiple axes of polymorphism and it is 
not always clear how they compose. 
The natural order for degree of polymorphism is a partial order, and this 
section includes some open questions on whether it is desirable or feasible to 
develop a tie-breaking strategy to impose a total order on the degree of 
polymorphism of functions. 
Helpfully, though, the degree of polymorphism is a property of functions 
rather than function calls, so any complicated graph structure or calculation 
representing a (partial) order over function degree of polymorphism can be 
calculated once and cached.

### Function Parameters ###
All other things being equal, if a parameter of one function has a concrete 
type and the equivalent parameter of another function has a dynamic type, the 
first function is less polymorphic:

	                void f( int, int );  // (0) least polymorphic 
	forall(otype T) void f( T, int );    // (1a) more polymorphic than (0)
	forall(otype T) void f( int, T );    // (1b) more polymorphic than (0)
	                                     //      incomparable with (1a)
	forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b)

This should extend to parameterized types (pointers and generic types) also: 

	forall(otype S) struct box { S val; };
	
	forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
	forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
	forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
	                                       //     incomparable with (3)
	forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)

Every function in the group above is incomparable with (1a/b), but that's fine 
because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur 
much in practice (unless there are safe or unsafe conversions defined between 
the possible argument types).

For degree of polymorphism from arguments, I think we should not distinguish 
between different type parameters, e.g. the following should be considered 
equally polymorphic:

	forall(otype T, otype S) void f( T, T, S );  // (7)
	forall(otype T, otype S) void f( S, T, T );  // (8)

However parameter lists are compared, parameters of multi-parameter generic 
types should ideally be treated as a recursive case, e.g. in the example 
below, (9) is less polymorphic than (10), which is less polymorphic than (11): 

	forall(otype T, otype S) struct pair { T x; S y; };
	
	                void f( pair(int, int) );  // (9)
	forall(otype T) void f( pair(T, int) );    // (10)
	forall(otype T) void f( pair(T, T) );      // (11)

Parameter comparison could possibly be made somewhat cheaper at loss of some 
precision by representing each parameter as a value from the natural numbers 
plus infinity, where infinity represents a monomorphic parameter and a finite 
number counts how many levels deep the shallowest type variable is, e.g. where 
`T` is a type variable, `int` would have value infinity, `T` would have value 
0, `T*` would have value 1, `box(T)*` would have value 2, etc. 
Under this scheme, higher values represent less polymorphism. 
This makes the partial order on parameters a total order, so that many of the 
incomparable functions above compare equal, though that is perhaps a virtue. 
It also loses the ability to differentiate between some multi-parameter 
generic types, such as the parameters in (10) and (11), which would both be 
valued 1, losing the polymorphism distinction between them.

A variant of the above scheme would be to fix a maximum depth of polymorphic 
type variables (16 seems like a reasonable choice) at which a parameter would 
be considered to be effectively monomorphic, and to subtract the value 
described above from that maximum, clamping the result to a minimum of 0. 
Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has 
value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0, 
the same as `int`. 
This can be quite succinctly represented, and summed without the presence of a 
single monomorphic parameter pushing the result to infinity, but does lose the 
ability to distinguish between very deeply structured polymorphic types. 
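
A minimal sketch of this computation, assuming a hypothetical helper that 
reports the nesting depth of the shallowest type variable in a parameter's 
type (or a negative value for a monomorphic parameter):

	enum { MAX_POLY_DEPTH = 4 };  // arbitrary maximum depth, as above
	
	int poly_value( int shallowest_var_depth ) {
		if ( shallowest_var_depth < 0 ) return 0;       // monomorphic
		int v = MAX_POLY_DEPTH - shallowest_var_depth;  // subtract from max
		return v < 0 ? 0 : v;                           // clamp to zero
	}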

### Parameter Lists ###
A partial order on function parameter lists can be produced by the 
product order of the partial orders on parameters described above. 
In more detail, this means that for two parameter lists with the same arity, 
if any pair of corresponding parameters are incomparable with respect to each 
other, the two parameter lists are incomparable; if in all pairs of 
corresponding parameters one list's parameter is always (less than or) equal 
to the other list's parameter then the first parameter list is (less than or) 
equal to the second parameter list; otherwise the lists are incomparable with 
respect to each other.

How to compare parameter lists of different arity is a somewhat open question. 
A simple, but perhaps somewhat unsatisfying, solution would be just to say 
that such lists are incomparable. 
The simplest approach to make them comparable is to say that, given two lists 
`(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists 
can be compared based on their shared prefix of `n` types. 
This approach breaks the transitivity property of the equivalence relation on 
the partial order, though, as seen below:

	forall(otype T) void f( T, int );       // (1a)
	forall(otype T) void f( T, int, int );  // (12)
	forall(otype T) void f( T, int, T );    // (13)

By this rule, (1a) is equally polymorphic to both (12) and (13), so by 
transitivity (12) and (13) should also be equally polymorphic, but that is not 
actually the case.

We can fix the rule by saying that `(T1 ... Tn)` can be compared to 
`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting 
notional monomorphic parameters. 
In this case, (1a) and (12) are equally polymorphic, because (1a) gets 
extended with a monomorphic type that compares equal to (12)'s third `int` 
parameter, but (1a) is less polymorphic than (13), because its notional 
monomorphic third parameter is less polymorphic than (13)'s `T`. 
Essentially what this rule says is that any parameter list with more 
parameters is no less polymorphic than one with fewer. 

We could collapse this parameter list ordering to a succinct total order by 
simply taking the sum of the clamped parameter polymorphism counts, but this 
would again make most incomparable parameter lists compare equal, as well as 
having the potential for some unexpected results based on the (completely 
arbitrary) value chosen for "completely polymorphic". 
For instance, if we set 4 to be the maximum depth of polymorphism (as above), 
the following functions would be equally polymorphic, which is a somewhat 
unexpected result:

	forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12
	forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12

These functions would also be considered equally polymorphic:

	forall(otype T) void g( T, int );    // 4 + 0 = 4;
	forall(otype T) void g( T**, T** );  // 2 + 2 = 4;

This issue can be mitigated by choosing a larger maximum depth of 
polymorphism, but this scheme does have the distinct disadvantage of either 
specifying the (completely arbitrary) maximum depth as part of the language or 
allowing the compiler to refuse to accept otherwise well-typed deeply-nested 
polymorphic types.

For purposes of determining polymorphism, the list of return types of a 
function should be treated like another parameter list, and combined with the 
degree of polymorphism from the parameter list in the same way that the 
parameters in the parameter list are combined. 
For instance, in the following, (14) is less polymorphic than (15) which is 
less polymorphic than (16):

	forall(otype T) int f( T );  // (14)
	forall(otype T) T*  f( T );  // (15)
	forall(otype T) T   f( T );  // (16)

### Type Variables and Bounds ###
Degree of polymorphism doesn't solely depend on the parameter lists, though. 
Glen's thesis (4.4.4, p.89) gives an example that shows that it also depends 
on the number of type variables:

	forall(otype T)          void f( T, int );  // (1a) polymorphic
	forall(otype T)          void f( T, T );    // (2)  more polymorphic
	forall(otype T, otype S) void f( T, S );    // (17) most polymorphic

Clearly the `forall` type parameter list needs to factor into calculation of 
degree of polymorphism as well, as it's the only real differentiation between 
(2) and (17). 
The simplest way to include the type parameter list would be to simply count 
the type variables and say that functions with more type variables are more 
polymorphic.

However, it also seems natural that more-constrained type variables should be 
counted as "less polymorphic" than less-constrained type variables. 
This would allow our resolver to pick more specialized (and presumably more 
efficient) implementations of functions where one exists. 
For example:

	forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic
	forall(otype T)                  T f( T );  // (16) more polymorphic

We could account for this by counting the number of unique constraints and 
saying that functions with more constraints are less polymorphic.

That said, we do model the `forall` constraint list as a (possibly anonymous) 
_trait_, and say that each trait is a set of constraints, so we could 
presumably define a partial order over traits based on subset inclusion, and 
use this partial order instead of the weaker count of constraints to order the 
list of type parameters of a function, as below:

	trait has_g(otype T) { void g(T); };
	trait has_h(otype S) { void h(S); };
	trait has_gh(otype R | has_g(R) | has_h(R)) {};
	// has_gh is equivalent to { void g(R); void h(R); }
	
	forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic
	forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19)
	forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19) 
	                                       //       incomparable with (18)
	forall(otype T)             T f( T );  // (16) most polymorphic

The tricky bit with this is figuring out how to compare the constraint 
functions for equality up to type variable renaming; I suspect there's a known 
solution, but don't know what it is (perhaps some sort of unification 
calculation, though I hope there's a more lightweight option). 
We also should be able to take advantage of the programmer-provided trait 
subset information (like the constraint on `has_gh` in the example) to more 
efficiently generate the partial-order graph for traits, which should be able 
to be cached for efficiency.

Combining count of type variables with the (partial) order on the trait 
constraining those variables seems like it should be a fairly straightforward 
product ordering to me - one `forall` qualifier is (less than or) equal to 
another if it has both a (less than or) equal number of type variables and a 
(less than or) equal degree of polymorphism from its constraining trait; the 
two qualifiers are incomparable otherwise. 
If an easier-to-calculate total ordering is desired, it might be acceptable to 
use the number of type variables, with ties broken by number of constraints.

Similarly, to combine the (partial) orders on parameter and return lists with 
the (partial) order on `forall` qualifiers, a product ordering seems like the 
reasonable choice, though if we wanted a total order a reasonable choice would 
be to use whatever method we use to combine parameter costs into parameter 
lists to combine the costs for the parameter and return lists, then break ties 
by the order on the `forall` qualifiers.

## Expression Costs ##

### Variable Expressions ###
Variables may be overloaded; that is, there may be multiple distinct variables 
with the same name so long as each variable has a distinct type. 
The variable expression `x` has one zero-cost interpretation as type `T` for 
each variable `T x` in scope.

### Member Selection Expressions ###
For every interpretation `I` of `e` which has a struct or union type `S`, 
`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the 
same cost as `I`. 
Note that there may be more than one member of `S` with the same name, as per 
Cforall's usual overloading rules.
The indirect member expression `e->y` is desugared to `(*e).y` and interpreted 
analogously.

**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no 
interpretations as `e.y` exist.

### Address Expressions ###
Address expressions `&e` have an interpretation for each interpretation `I` of 
`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`. 
Lvalues result from variable expressions, member selection expressions, or 
application of functions returning an lvalue-qualified type. 
Note that the dereference operator is overloadable, so the rules for its 
resolution follow those for function application below.

**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a 
new temporary holding `42` and takes its address.

### Boolean-context Expressions ###
C has a number of "boolean contexts", where expressions are assigned a truth 
value; these include both arguments to the short-circuiting `&&` and `||` 
operators, as well as the conditional expressions in `if` and `while` 
statements, the middle expression in `for` statements, and the first argument 
to the `?:` ternary conditional operator. 
In all these contexts, C interprets `0` (which is both an integer and a null 
pointer literal) as false, and all other integer or pointer values as true. 
In this spirit, Cforall allows other types to be considered "truthy" if they 
support the following de-sugaring in a conditional context (see notes on 
interpretation of literal `0` below):

	x => ((int)( x != 0 ))

### Literal Expressions ###
Literal expressions (e.g. `42`, `'c'`, `3.14`, `"Hello, world!"`) have one 
zero-cost interpretation with the same type the expression would have in C, 
with three exceptions:

Character literals like `'x'` are typed as `char` in Cforall, not `int` as in C. 
This change breaks very little C code (primarily `sizeof 'x'`; the implicit 
conversion from `int` to `char` and lack of overloading handle most other 
expressions), matches the behaviour of C++, and is more compatible with 
programmer intuition.

The literals `0` and `1` are also treated specially by Cforall, due to their 
potential uses in operator overloading. 
Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing 
multiple interpretations of them according to the existing variable 
overloading rules, with the following declarations in the prelude:

	const int 0, 1;
	forall ( dtype DT ) const DT * const    0;
	forall ( ftype FT ) FT * const          0;

This did, however, create some backward-compatibility problems and potential 
performance issues, and works poorly for generic types. To start with, this 
(entirely legal C) code snippet doesn't compile in Cforall:

	if ( 0 ) {}

It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and 
`forall(dtype DT) DT*` have a `!=` operator which returns `int`, the resolver 
cannot choose which `0` variable to take, because they're both exact matches.

The general `!=` computation may also be less efficient than a check for a zero 
value; take the following example of a rational type:

	struct rational { int32_t num, den; };
	rational 0 = { 0, 1 };
	
	int ?!=? (rational a, rational b) {
		return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
	}
	
	int not_zero (rational a) { return a.num != 0; }

To check if two rationals are equal we need to do a pair of multiplications to 
normalize them (the casts in the example are to prevent overflow), but to 
check if a rational is non-zero we just need to check its numerator, a more 
efficient operation.

Finally, though polymorphic null-pointer variables can be meaningfully 
defined, most other polymorphic variables cannot be, which makes it difficult 
to make generic types "truthy" using the existing system:

	forall(otype T) struct pair { T x; T y; };
	forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };

Now, it seems natural enough to want to define the zero for this pair type as 
a pair of the zero values of its element type (if they're defined). 
The declaration of `pair(T) 0` above is actually illegal though, as there is 
no way to represent the zero values of an infinite number of types in the 
single memory location available for this polymorphic variable - the 
polymorphic null-pointer variables defined in the prelude are legal, but that 
is only because all pointers are the same size and the single zero value is a 
legal value of all pointer types simultaneously; null pointer is, however, 
somewhat unique in this respect.

The technical explanation for the problems with polymorphic zero is that `0` 
is really an rvalue, not an lvalue - an expression, not an object. 
Drawing from this, the solution we propose is to give `0` a new built-in type, 
`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in 
type `_unit_t`. 
If the prelude defines `!=` over `_zero_t` this solves the `if ( 0 )` problem, 
because now the unambiguous best interpretation of `0 != 0` is to read them 
both as `_zero_t` (and say that this expression is false). 
Backwards compatibility with C can be served by defining conversions in the 
prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer 
types, as below:

	// int 0;
	forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
	forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);
	
	// int 1;
	forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
	forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);
	
	// forall(dtype DT) const DT* 0;
	forall(dtype DT) void ?{safe}(const DT**, _zero_t);
	// forall(ftype FT) FT* 0;
	forall(ftype FT) void ?{safe}(FT**, _zero_t);

Further, with this change, instead of making `0` and `1` overloadable 
variables, we can instead allow user-defined constructors (or, more flexibly, 
safe conversions) from `_zero_t`, as below:

	// rational 0 = { 0, 1 };
	void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }

Note that we don't need to name the `_zero_t` parameter to this constructor, 
because its only possible value is a literal zero. 
This one line allows `0` to be used anywhere a `rational` is required, as well 
as enabling the same use of rationals in boolean contexts as above (by 
interpreting the `0` in the desugaring to be a rational by this conversion). 
Furthermore, while defining a conversion function from literal zero to 
`rational` makes rational a "truthy" type able to be used in a boolean 
context, we can optionally further optimize the truth decision on rationals as 
follows:

	int ?!=? (rational a, _zero_t) { return a.num != 0; }

This comparison function will be chosen in preference to the more general 
rational comparison function for comparisons against literal zero (like in 
boolean contexts) because it doesn't require a conversion on the `0` argument. 
Functions of the form `int ?!=? (T, _zero_t)` can actually be used in general 
to make a type `T` truthy without making `0` a value which can convert to that 
type, a capability not available in the current design.
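
For example, using the `rational` type from above, the following "truthy" use 
resolves without any conversion applied to the `0` in the de-sugaring:

	rational half = { 1, 2 };
	if ( half ) { /* ... */ }  // desugars to if ( (int)(half != 0) ), which
	                           // selects int ?!=? (rational, _zero_t)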

This design also solves the problem of polymorphic zero for generic types, as 
in the following example:

	// ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
	forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
		this->x = 0; this->y = 0;
	}

The polymorphic variable declaration didn't work, but this constructor is 
perfectly legal and has the desired semantics.

We can assert that `T` can be used in a boolean context as follows: 

	forall(otype T | { int ?!=?(T, _zero_t); })
 
Since the C standard (6.5.16.1.1) specifically states that pointers can be 
assigned into `_Bool` variables (and implies that other arithmetic types can 
be assigned into `_Bool` variables), it seems natural to say that assignment 
into a `_Bool` variable effectively constitutes a boolean context. 
To allow this interpretation, I propose including the following function (or 
its effective equivalent) in the prelude:

	forall(otype T | { int ?!=?(T, _zero_t); })
	void ?{safe}( _Bool *this, T that ) { *this = that != 0; }

Note that this conversion is not transitive; that is, for `t` a variable of 
some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence 
of a lower-cost one), `(int)t;` would not use this conversion (and in fact 
would not be legal in the absence of another valid way to convert a `T` to an 
`int`), but `(int)(_Bool)t;` could legally use this conversion.

Similarly giving literal `1` the special type `_unit_t` allows for more 
concise and consistent specification of the increment and decrement operators, 
using the following de-sugaring:

	++i => i += 1
	i++ => (tmp = i, i += 1, tmp)
	--i => i -= 1
	i-- => (tmp = i, i -= 1, tmp)

In the examples above, `tmp` is a fresh temporary with its type inferred from 
the return type of `i += 1`. 
Under this proposal, defining a conversion from `_unit_t` to `T` and a 
`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators 
for free in a consistent fashion (similarly for -= and the decrement 
operators).
If a meaningful `1` cannot be defined for a type, both increment operators can 
still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`. 
Similarly, if scalar addition can be performed on a type more efficiently than 
by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the 
addition operator, it will simultaneously define consistent implementations of 
both increment operators (this can also be accomplished by defining a 
conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
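
A minimal sketch of this pattern for the `rational` type above (the `?+=?` 
definition is elided; suitable copy and assignment operations are assumed):

	// 1 as a rational, via conversion from _unit_t
	void ?{safe} (rational *this, _unit_t) { this->num = 1; this->den = 1; }
	// compound assignment doubles as the addition operator
	lvalue rational ?+=? (rational *lhs, rational rhs);
	
	rational r = { 3, 4 };
	++r;  // => r += 1, with the 1 converted to rational
	r++;  // => (tmp = r, r += 1, tmp)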

To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an 
increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`, 
we also define a non-transitive unsafe conversion from `_Bool` (allowable 
values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:

	void ?{unsafe} (_unit_t*, _Bool) {}

As a note, the desugaring of post-increment above is possibly even more 
efficient than that of C++ - in C++, the copy to the temporary may be hidden 
in a separately-compiled module where it can't be elided in cases where it is 
not used, whereas this approach for Cforall always gives the compiler the 
opportunity to optimize out the temporary when it is not needed. 
Furthermore, one could imagine a post-increment operator that returned some 
type `T2` that was implicitly convertible to `T` but less work than a full 
copy of `T` to create (this seems like an absurdly niche case) - since the 
type of `tmp` is inferred from the return type of `i += 1`, you could set up 
functions with the following signatures to enable an equivalent pattern in 
Cforall:

	lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
	void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
	void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to 
	                              // preserve expected semantics of T x = y++;

**TODO** Look in C spec for literal type interpretations.
**TODO** Write up proposal for wider range of literal types, put in appendix

### Initialization and Cast Expressions ###
An initialization expression `T x = e` has one interpretation for each 
interpretation `I` of `e` with type `S` which is convertible to `T`. 
The cost of the interpretation is the cost of `I` plus the conversion cost 
from `S` to `T`.
A cast expression `(T)e` is interpreted as hoisting initialization of a 
temporary variable `T tmp = e` out of the current expression, then replacing 
`(T)e` by the new temporary `tmp`.
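
For example, under this rule a call such as the following (`g` is a 
hypothetical function):

	g( (T)e );

is interpreted as if it had been written:

	T tmp = e;
	g( tmp );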

### Assignment Expressions ###
An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then 
interpreted according to the usual rules for function application and comma 
expressions. 
Operator-assignment expressions like `e += f` desugar similarly as 
`(?+=?(&e, f), e)`.

### Function Application Expressions ###
Every _compatible function_ and satisfying interpretation of its arguments and 
polymorphic variable bindings produces one interpretation for the function 
application expression. 
Broadly speaking, the resolution cost of a function application is the sum of 
the cost of the interpretations of all arguments, the cost of all conversions 
to make those argument interpretations match the parameter types, and the 
binding cost of any of the function's polymorphic type parameters.

**TODO** Work out binding cost in more detail.
**TODO** Address whether "incomparably polymorphic" should be treated as 
"equally polymorphic" and be disambiguated by count of (safe) conversions.
**TODO** Think about what polymorphic return types mean in terms of late 
binding.
**TODO** Consider if "f is less polymorphic than g" can mean exactly "f 
specializes g"; if we don't consider the assertion parameters (except perhaps 
by count) and make polymorphic variables bind exactly (rather than after 
implicit conversions) this should actually be pre-computable.
**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted 
function declaration is costed out by the resolver in the same way that any 
other function declaration is costed; if the deleted declaration is the unique 
min-cost resolution refuse to type the expression, if it is tied for min-cost 
then take the non-deleted alternative, and of two equivalent-cost deleted 
interpretations with the same return type pick one arbitrarily rather than 
producing an ambiguous resolution. This would also be useful for forbidding 
pointer-to-floating-point explicit conversions (C11, 6.5.4.4).
**TODO** Cover default parameters, maybe named parameters (see "named 
arguments" thread of 11 March 2016)
 

### Sizeof, Alignof & Offsetof Expressions ###
`sizeof`, `alignof`, and `offsetof` expressions have at most a single 
interpretation, of type `size_t`. 
`sizeof` and `alignof` expressions take either a type or an expression as an 
argument; if the argument is a type, it must be a complete type which is not a 
function type, if an expression, the expression must have a single 
interpretation, the type of which conforms to the same rules. 
`offsetof` takes two arguments, a type and a member name; the type must be 
a complete structure or union type, and the second argument must name a member 
of that type.

### Comma Expressions ###
A comma expression `x, y` resolves `x` as if it had been cast to `void`, and 
then, if there is a unique interpretation `I` of `x`, has one interpretation 
for each interpretation `J` of `y` with the same type as `J`, costing the sum 
of the costs of `I` and `J`.

### Index Expressions ###
**TODO** Consider adding polymorphic function in prelude for this, as per 
6.5.2.1.2 in the C standard:

	forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); })
	lvalue E ?[?](T a, I i) { return *(a + i); }

I think this isn't actually a good idea, because the cases for it are niche, 
mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a 
pointer `p`, and adding it to the prelude would slow down resolution of 
every index expression just a bit. Our existing prelude includes an indexing 
operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified 
variants, which should satisfy C source-compatibility without propagating this 
silly desugaring further.

#### Compatible Functions ####
**TODO** This subsection is very much a work in progress and has multiple open 
design questions.

A _compatible function_ for an application expression is a visible function 
declaration with the same name as the application expression and parameter 
types that can be converted to from the argument types. 
Function pointers and variables of types with the `?()` function call operator 
overloaded may also serve as function declarations for purposes of 
compatibility. 

For monomorphic parameters of a function declaration, the declaration is a 
compatible function if there is an argument interpretation that is either an 
exact match, or has a safe or unsafe implicit conversion that can be used to 
reach the parameter type; for example: 

	void f(int);
	
	f(42);        // compatible; exact match to int type
	f('x');       // compatible; safe conversion from char => int
	f(3.14);      // compatible; unsafe conversion from double => int
	f((void*)0);  // not compatible; no implicit conversion from void* => int

Per Richard[*], function assertion satisfaction involves recursively searching 
for compatible functions, not an exact match on the function types (I don't 
believe the current Cforall resolver handles this properly); to extend the 
previous example:

	forall(otype T | { void f(T); }) void g(T);
	
	g(42);    // binds T = int, takes f(int) by exact match
	g('x');   // binds T = char, takes f(int) by conversion
	g(3.14);  // binds T = double, takes f(int) by conversion

[*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the 
accessible scopes for definitions corresponding to assertion names, and 
choosing the ones whose types correspond *most closely* to the assertion 
types." (emphasis mine)

There are three approaches we could take to binding type variables: type 
variables must bind to argument types exactly, each type variable must bind 
exactly to at least one argument, or type variables may bind to any type which 
all corresponding arguments can implicitly convert to; I'll provide some 
possible motivation for each approach.

There are two main arguments for the more restrictive binding schemes; the 
first is that the built-in implicit conversions in C between `void*` and `T*` 
for any type `T` can lead to unexpectedly type-unsafe behaviour in a more 
permissive binding scheme, for example:

	forall(dtype T) T* id(T *p) { return p; }
	
	int main() {
		int *p = 0;
		char *c = id(p);
	}

This code compiles in CFA today, and it works because the extra function 
wrapper `id` provides a level of indirection that allows the non-chaining 
implicit conversions from `int*` => `void*` and `void*` => `char*` to chain. 
The resolver types the last line with `T` bound to `void` as follows:

	char *c = (char*)id( (void*)p );

It has been suggested that making the implicit conversions to and from `void*` 
explicit in Cforall code (as in C++) would solve this particular problem, and 
provide enough other type-safety benefits to outweigh the source-compatibility 
break with C; see Appendix D for further details.

The second argument for a more constrained binding scheme is performance; 
trait assertions need to be checked after the type variables are bound, and 
adding more possible values for the type variables should correspond to a 
linear increase in runtime of the resolver per type variable. 
There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of 
them is implicitly convertible to any other; if we allow unrestricted binding 
of type variables, a common `int` variable (or literal) used in the position 
of a polymorphic variable parameter would cause a 20x increase in the amount 
of time needed to check trait resolution for that interpretation. 
These numbers have yet to be empirically substantiated, but the theory is 
reasonable, and given that much of the impetus for re-writing the resolver is 
due to its poor performance, I think this is a compelling argument. 

I would also mention that a constrained binding scheme is practical; the most 
common type of assertion is a function assertion, and, as mentioned above, 
those assertions can be satisfied through implicit conversions. 
Thus, in the example above with `g(T)`, where the assertion is `void f(T)`, 
we first bind `T = int` or `T = char` or `T = double`, then substitute the 
binding into the assertion, yielding assertions of `void f(int)`, 
`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy 
these assertions to complete the binding. 
Though in all three cases, the existing function with signature `void f(int)` 
satisfies this assertion, the checking work cannot easily be re-used between 
variable bindings, because there may be better or worse matches depending on 
the specifics of the binding.

The main argument for a more flexible binding scheme is that the binding 
abstraction can actually cause a wrapped function call that would work to 
cease to resolve, as below:

	forall(otype T | { T ?+? (T, T) })
	T add(T x, T y) { return x + y; }
	
	int main() {
		int i, j = 2;
		short r, s = 3;
		i = add(j, s);
		r = add(s, j);
	}

Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and 
in both cases the short `s` is promoted to `int` to match `j`. 
If, on the other hand, we demand that variables exactly match type variables, 
neither call to `add` will compile, because it is impossible to simultaneously 
bind `T` to both `int` and `short` (C++ has a similar restriction on template 
variable inferencing). 
One alternative that enables this case, while still limiting the possible 
type variable bindings is to say that at least one argument must bind to its 
type parameter exactly. 
In this case, both calls to `add` would have the set `{ T = int, T = short }` 
for candidate bindings, and would have to check both, as well as checking that 
`short` could convert to `int` or vice-versa.
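
For illustration, the first call might unfold as follows under this rule (a 
sketch; the explicit casts stand in for the implicit conversions the resolver 
would insert):

	i = add(j, s);  // T = int:   add(j, (int)s)   -- s safely widened
	                // T = short: add((short)j, s) -- j unsafely narrowed

Presumably the `T = int` candidate would win on conversion cost, since 
widening `s` is a safe conversion while narrowing `j` is unsafe.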

It is worth noting here that parameterized types generally bind their type 
parameters exactly anyway, so these "restrictive" semantics only restrict a 
small minority of calls; for instance, in the following example, there isn't a 
sensible way to type the call to `ptr_add`:

	forall(otype T | { T ?+?(T, T) })
	void ptr_add( T* rtn, T* x, T* y ) {
		*rtn = *x + *y;
	}
	
	int main() {
		int i, j = 2;
		short s = 3;
		ptr_add(&i, &j, &s); // ERROR: &s is not an int*
	}

I think there is some value in providing library authors with the 
capability to express "these two parameter types must match exactly". 
This can be done without restricting the language's expressivity, as the `add` 
case above can be made to work under the strictest type variable binding 
semantics with any addition operator in the system by changing its signature 
as follows:

	forall( otype T, otype R, otype S | { R ?+?(T, S); } )
	R add(T x, S y) { return x + y; }

Now, it is somewhat unfortunate that the most general version here is more 
verbose (and thus that the path of least resistance would be more restrictive 
library code); however, the breaking case in the example code above is a bit 
odd anyway - explicitly taking two variables of distinct types and relying on 
C's implicit conversions to do the right thing is somewhat bad form, 
especially where signed/unsigned conversions are concerned. 
I think the more common case for implicit conversions is the following, 
though, where the conversion is used on a literal: 

	short x = 40;
	short y = add(x, 2);

One option to handle just this case would be to make literals (and only 
literals) implicitly convertible to match polymorphic type variables. 
The example above would then behave slightly differently from `x + 2` in C, 
though, casting the `2` down to `short` rather than the `x` up to `int`, a 
possible demerit of this scheme. 
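
Concretely, the difference would be (a sketch):

	short x = 40;
	int   c = x + 2;      // C: x promoted to int, addition performed at int
	short y = add(x, 2);  // proposed: T = short, literal 2 narrowed to short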

The other question to ask would be which conversions would be allowed for 
literals; it seems rather odd to allow down-casting `42ull` to `char`, when 
the programmer has explicitly specified by the suffix that it's an unsigned 
long. 
Type interpretations of literals in C are rather complex (see [1]), but one 
reasonable approach would be to say that un-suffixed integer literals could be 
interpreted as any type convertible from `int`, "u" suffixed literals could be 
interpreted as any type convertible from `unsigned int` except the signed 
integer types, and "l" or "ll" suffixed literals could only be interpreted as 
`long` or `long long`, respectively (or possibly that the "u" suffix excludes 
the signed types, while the "l" suffix excludes the types smaller than 
`long int`, as in [1]). 
Similarly, unsuffixed floating-point literals could be interpreted as `float`, 
`double` or `long double`, but "f" or "l" suffixed floating-point literals 
could only be interpreted as `float` or `long double`, respectively. 
I would like to specify that character literals can only be interpreted as 
`char`, but the wide-character variants and the C practice of typing character 
literals as `int` mean that this would likely break code, so character 
literals should be able to take any integer type.

[1] http://en.cppreference.com/w/c/language/integer_constant
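
As a concrete summary of the proposal (a sketch, not settled semantics; each 
comment lists the candidate interpretations of the literal):

	42;     // int, or any type convertible from int
	42u;    // unsigned int, or any type convertible from it
	        //   except the signed integer types
	42l;    // long only (42ll: long long only)
	3.14;   // float, double, or long double
	3.14f;  // float only (3.14l: long double only)
	'x';    // any integer type, for C compatibility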

With the possible exception of the `add` case above, implicit conversions to 
the function types of assertions can handle most of the expected behaviour 
from C. 
However, implicit conversions cannot be applied to match variable assertions, 
as in the following example:

	forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
	T clamp_sum( T x, T y ) {
		T sum = x + y;
		if ( sum < min ) return min;
		if ( max < sum ) return max;
		return sum;
	}
	
	char min = 'A';
	double max = 100.0;
	//int val = clamp_sum( 'X', 3.14 );  // ERROR (1)
	
	char max = 'Z';
	char val = clamp_sum( 'X', 3.14 ); // MATCH (2)
	double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3)

In this example, the call to `clamp_sum` at (1) doesn't compile, because even 
though there are compatible `min` and `max` variables of types `char` and 
`double`, they need to have the same type to match the constraint, and they 
don't. 
The (2) example does compile, but with a behaviour that might be a bit 
unexpected given the "usual arithmetic conversions", in that both values are 
narrowed to `char` to match the `min` and `max` constraints, rather than 
widened to `double` as is usual for mis-matched arguments to `+`.
The (3) example is the only case discussed here that requires the most 
permissive type binding semantics - here, `T` is bound to `char` to match the 
constraints, and both the parameters are narrowed from `double` to `char` 
before the call, which would not be allowed under either of the more 
restrictive binding semantics. 
However, the behaviour here is unexpected to the programmer, because the 
return value will be `(double)'A' /* == 65.0 */` due to the conversions, 
rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.

Personally, I think that implicit conversions are not a particularly good 
language design, and that the use-cases for them can be better handled with 
less powerful features (e.g. more versatile rules for typing constant 
expressions). 
However, though we do need implicit conversions in monomorphic code for C 
compatibility, I'm in favour of restricting their usage in polymorphic code, 
both to give programmers some stronger tools to express their intent and to 
shrink the search space for the resolver. 
Of the possible binding semantics I've discussed, I'm in favour of forcing 
polymorphic type variables to bind exactly, though I could be talked into 
allowing literal expressions to have more flexibility in their bindings, or 
possibly loosening "type variables bind exactly" to "type variables bind 
exactly at least once"; I think the unrestricted combination of implicit 
conversions and polymorphic type variable binding unnecessarily multiplies the 
space of possible function resolutions, and that the added resolution options 
are mostly unexpected and therefore confusing and not useful to user 
programmers.

## Resolver Architecture ##

### Function Application Resolution ###
Our resolution algorithm for function application expressions is based on 
Baker's[3] single-pass bottom-up algorithm, with Cormack's[4] single-pass 
top-down algorithm applied where appropriate as an optimization. 
Broadly speaking, the cost of this resolution per expression will be 
proportional to `i^d`, where `i` is the number of interpretations of each 
program symbol, and `d` is the maximum depth of the expression DAG. 
Since `d` is determined by the user programmer (in general, bounded by a small 
constant), opportunities for resolver optimization primarily revolve around 
minimizing `i`, the number of interpretations of each symbol that are 
considered.

[3] Baker, Theodore P. A one-pass algorithm for overload resolution in Ada. 
ACM Transactions on Programming Languages and Systems (1982) 4:4 p.601-614

[4] Cormack, Gordon V. An algorithm for the selection of overloaded functions 
in Ada. SIGPLAN Notices (1981) 16:2 p.48-52

Unlike Baker, our system allows implicit type conversions for function 
arguments and return types; the problem then becomes to find the valid 
interpretation for an expression that has the unique minimal conversion cost, 
if such exists. 
Interpretations can be produced both by overloaded names and implicit 
conversions applied to existing interpretations; we have proposals to reduce 
the number of interpretations considered from both sources. 
To simplify the problem for this discussion, we will consider application 
resolution restricted to a domain of functions applied to variables, possibly 
in a nested manner (e.g. `f( g( x ), y )`, where `x` and `y` are variables and 
`f` and `g` are functions), and possibly in a typed context such as a variable 
initialization (e.g. `int i = f( x );`); the other aspects of Cforall type 
resolution should be able to be straightforwardly mapped into this model. 
The types of the symbol tables used for variable and function declarations 
look somewhat like the following:

	variable_table = name_map( variable_name, variable_map )
	
	function_table = name_map( function_name, function_map )
	
	variable_map = multi_index( by_type( variable_type ), 
	                            variable_decl_set )

	function_map = multi_index( by_int( n_params ),
	                            by_type( return_type ),
	                            function_decl_set )

`variable_name` and `function_name` are likely simple strings, with `name_map` 
a hash table (or perhaps trie) mapping string keys to values. 
`variable_decl_set` and `function_decl_set` can be thought of for the moment 
as simple bags of typed declarations, where the declaration types are linked 
to the graph of available conversions for that type. 
In a typed context both the `variable_decl_set` and the `function_decl_set` 
should be able to be selected upon by type; this is accomplished by the 
`by_type` index of both `variable_map` and `function_map`. 
The `by_int` index of `function_map` also provides a way to select functions 
by their number of parameters; this index may be used to swiftly discard any 
function declaration which does not have the appropriate number of parameters 
for the argument interpretations being passed to it. 
Given the likely small number of entries in this map, it is possible that a 
binary search of a sorted vector or even a linear search of an unsorted vector 
would be more efficient than the usual hash-based index.

Given these data structures, the general outline of our algorithm follows 
Baker, with Cormack's algorithm used as a heuristic filter in typed contexts. 

In an untyped context, we use a variant of Baker's bottom-up algorithm. 
The leaves of the interpretation DAG are drawn from the variable symbol table, 
with entries in the table each producing zero-cost interpretations, and each 
implicit conversion available to be applied to the type of an existing entry 
producing a further interpretation with the same cost as the conversion. 
As in Baker, if two or more interpretations have the same type, only the 
minimum cost interpretation with that type is produced; if there is no unique 
minimum cost interpretation then resolution with that type is ambiguous, and 
not permitted. 
It should be relatively simple to produce the list of interpretations sorted 
by cost by producing the interpretations via a breadth-first search of the 
conversion graph from the initial interpretations provided in the variable 
symbol table.
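
A minimal sketch of this expansion, assuming a small fixed universe of types 
and unit-cost conversion arcs (with weighted arcs, the same idea generalizes 
to Dijkstra's algorithm); the encoding and function name here are 
hypothetical:

	#define N_TYPES 8
	
	// arc[i][j] is the cost of a single conversion from type i to type j, or
	// -1 if no such conversion exists; dist[j] receives the minimum total cost
	// to reach type j from the initial interpretation's type, or -1 if
	// unreachable. Types are dequeued in cost order, yielding the sorted list.
	void expand_interpretations( int arc[N_TYPES][N_TYPES], int init,
	                             int dist[N_TYPES] ) {
		int queue[N_TYPES];
		int head = 0, tail = 0;
		for ( int j = 0; j < N_TYPES; ++j ) dist[j] = -1;
		dist[init] = 0;        // the zero-cost initial interpretation
		queue[tail++] = init;
		while ( head < tail ) {
			int t = queue[head++];
			for ( int j = 0; j < N_TYPES; ++j ) {
				if ( arc[t][j] >= 0 && dist[j] < 0 ) {  // first visit is cheapest
					dist[j] = dist[t] + arc[t][j];
					queue[tail++] = j;
				}
			}
		}
	}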

To match a function at one of the internal nodes of the DAG, we first look up 
the function's name in the function symbol table, select the candidates with 
the appropriate number of parameters for the provided arguments through the 
`by_int` index of the returned `function_map`, then go through the resulting 
`function_decl_set` searching for functions whose parameter types can unify 
with the provided argument lists; any such matching function produces an 
interpretation with a cost that is the sum of its argument costs. 
Though this is not included in our simplified model, this unification step may 
include binding of polymorphic variables, which introduces a cost for the 
function binding itself which must be added to the argument costs. 
Also, checking of function assertions would likely be done at this step as 
well, possibly eliminating some possible matching functions (if no suitable 
assertions can be satisfied), or adding further conversion costs for the 
assertion satisfaction. 
Once the set of valid function interpretations is produced, these may also be 
expanded by the graph of implicit conversions on their return types, as the 
variable interpretations were. 
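
The cost computation at such a node might look like the following sketch; the 
types and helpers here (`unify_params`, `binding_cost`, `satisfy_assertions`) 
are hypothetical stand-ins for resolver machinery, not existing CFA code:

	typedef struct interp { int cost; /* type, expression, ... */ } interp;
	typedef struct bindings { int n_bound; /* type variable bindings */ } bindings;
	typedef struct function_decl function_decl;
	
	int unify_params( function_decl*, interp*, int, bindings* );
	int binding_cost( const bindings* );
	int satisfy_assertions( function_decl*, bindings* );
	
	// returns the total cost of applying fn to the given argument
	// interpretations, or -1 if the application is invalid
	int application_cost( function_decl *fn, interp *args, int n_args ) {
		bindings env;
		if ( ! unify_params( fn, args, n_args, &env ) ) return -1;
		int cost = binding_cost( &env );  // cost of binding polymorphic variables
		for ( int k = 0; k < n_args; ++k )
			cost += args[k].cost;         // sum of argument costs
		int assn = satisfy_assertions( fn, &env );  // may add conversion costs
		return assn < 0 ? -1 : cost + assn;  // -1: no satisfying assertions
	}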

This implicit conversion-based expansion of interpretations should be skipped 
for the top-level expression if used in an untyped (void) context, e.g. for 
`f` in `f( g ( x ) );` or `x` in `x;`. 
On the other hand, if the top-level expression specifies a type, e.g. in 
`int i = f( x );`, only top-level expressions that return that type are 
relevant to the search, so the candidates for `f` can be filtered first by 
those that return `int` (or a type convertible to it); this can be 
accomplished by performing a top-down filter of the interpretations of `f` by 
the `by_type` index of the `function_map` in a manner similar to Cormack's[4] 
algorithm. 

In a typed context, such as an initialization expression 
`T x = f( g( y ), z );`, only interpretations of `f( g( y ), z )` which have 
type `T` are valid; since there are likely to be valid interpretations of 
`f( g( y ), z )` which cannot be used to initialize a variable of type `T`, we 
can use this information to reduce the number of interpretations considered. 
Drawing from Cormack[4], we first search for interpretations of `f` where the 
return type is `T`; by breadth-first-search of the conversion graph, it should 
be straightforward to order the interpretations of `f` by the cost to convert 
their return type to `T`. 
We can also filter out interpretations of `f` with fewer than two parameters, 
since both `g( y )` and `z` must supply at least one argument each; we may 
not, however, rule out interpretations of `f` with more than two parameters, 
as there may be a valid interpretation of `g( y )` as a function returning 
more than one value (if the expression were `f( y, z )` instead, we could use 
an exact parameter count, assuming that variables of tuple type don't exist). 
For each compatible interpretation of `f`, we can add the type of the first 
parameter of that interpretation of `f` to a set `S`, and recursively search 
for interpretations of `g( y )` that return some type `Si` in `S`, and 
similarly for interpretations of `z` that match the type of any of the second 
parameters of some `f`. 
Naturally, if no matching interpretation of `g( y )` can be found for the 
first parameter of some `f`, the type of the second parameter of that `f` will 
not be added to the set of valid types for `z`. 
Each node in this interpretation DAG is given a cost the same way it would be 
in the bottom-up approach, with the exception that when going top-down there 
must be a final bottom-up pass to sum the interpretation costs and sort them 
as appropriate. 

If a parameter type for some `f` is a polymorphic type variable that is left 
unbound by the return type (e.g. `forall(otype S) int f(S x, int y)`), the 
matching arguments should be found using the bottom-up algorithm above for 
untyped contexts, because the polymorphic type variable does not sufficiently 
constrain the available interpretations of the argument expression. 
Similarly, it would likely be an advantage to use top-down resolution for 
cast expressions (e.g. `(int)x`), even when those cast expressions are 
subexpressions of an otherwise untyped expression. 
It may also be fruitful to switch between the bottom-up and top-down 
algorithms if the number of valid interpretations for a subexpression or valid 
types for an argument exceeds some heuristic threshold, but finding such 
a threshold (if any exists) will require experimental data. 
This hybrid top-down/bottom-up search provides more opportunities for pruning 
interpretations than either a bottom-up or top-down approach alone, and thus 
may be more efficient than either. 
A top-down-only approach, however, devolves to linear search through every 
possible interpretation in the solution space in an untyped context, and is 
thus likely to be inferior to a strictly bottom-up approach, though this 
hypothesis needs to be empirically validated.

Another approach would be to abandon expression-tree ordering for 
subexpression matching, and order by "most constrained symbol"; symbols would  
be more constrained if there were fewer matching declarations, fewer 
subexpressions yet to resolve, or possibly fewer possible types the expression 
could resolve to. Ordering the expressions in a priority-queue by this metric 
would not necessarily produce a top-down or a bottom-up order, but would add 
opportunities for pruning based on memoized upper and lower bounds.

Both Baker and Cormack explicitly generate all possible interpretations of a 
given expression; thinking of the set of interpretations of an expression as a 
list sorted by cost, this is an eager evaluation of the list. 
However, since we generally expect that user programmers will not often use 
high-cost implicit conversions, one potentially effective way to prune the 
search space would be to first find the minimal-cost interpretations of any 
given subexpression, then to save the resolution progress of the 
subexpressions and attempt to resolve the superexpression using only those 
subexpression interpretations.
If no valid interpretation of the superexpression can be found, the resolver 
would then repeatedly find the next-most-minimal cost interpretations of the 
subexpressions and attempt to produce the minimal cost interpretation of the 
superexpression. 
This process would proceed until all possible subexpression interpretations 
have been found and considered. 

A middle ground between the eager and lazy approaches can be taken by 
considering the lexicographical order on the cost tuple; essentially, every 
interpretation in each of the classes below will be strictly cheaper than any 
interpretation in the class after it, so if a min-cost valid interpretation 
can be found while only generating interpretations in a given class, that 
interpretation is guaranteed to be the best possible one:

1. Interpretations without polymorphic functions or implicit conversions
2. Interpretations without polymorphic functions using only safe conversions
3. Interpretations using polymorphic functions without unsafe conversions
4. Interpretations using unsafe conversions

In this lazy-eager approach, all the interpretations in one class would be 
eagerly generated, while the interpretations in the next class would only be 
considered if no match was found in the previous class.
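
A sketch of the control flow, with `generate_class` and `best_valid` as 
hypothetical resolver entry points:

	typedef struct expr expr;
	typedef struct interp interp;
	
	enum cost_class { MONO_EXACT, MONO_SAFE, POLY_NO_UNSAFE, UNSAFE, N_CLASSES };
	
	void generate_class( expr*, enum cost_class );  // eager within one class
	interp *best_valid( expr* );  // unique min-cost valid interpretation, or NULL
	
	interp *resolve( expr *e ) {
		for ( int c = MONO_EXACT; c < N_CLASSES; ++c ) {
			generate_class( e, (enum cost_class)c );
			interp *best = best_valid( e );
			// any match here is strictly cheaper than anything in later classes
			if ( best ) return best;
		}
		return NULL;  // no valid interpretation in any class
	}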

Another source of efficiency would be to cache the best interpretation of a 
given subexpression within an environment; this may not be incredibly useful 
for explicit parameters (though it may be useful for, e.g., `f( x, x )`, where 
both parameters of `f` have the same type), but should pay some dividends for 
the implicit assertion parameters, especially the otype parameters for the 
argument of a generic type, which will generally be resolved in duplicate for 
(at least) the assignment operator, constructor, copy constructor & destructor 
of the generic type. 

## Appendix A: Partial and Total Orders ##
The `<=` relation on integers is a commonly known _total order_, and 
intuitions based on how it works generally apply well to other total orders. 
Formally, a total order is some binary relation `<=` over a set `S` such that 
for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both hold, 
`a` and `b` must be equal - the _antisymmetry_ property); total orders also 
have a _transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`. 
If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
 
A _partial order_ is a generalization of this concept where the `<=` relation 
is not required to be defined over all pairs of elements in `S` (though there 
is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other 
words, it is possible for two elements `a` and `b` of `S` to be 
_incomparable_, unable to be ordered with respect to one another (any `a` and 
`b` for which either `a <= b` or `b <= a` are called _comparable_). 
Antisymmetry and transitivity are also required for a partial order, so all 
total orders are also partial orders by definition. 
One fairly natural partial order is the "subset of" relation over sets from 
the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are 
both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the 
other - they are incomparable under this relation. 

We can compose two (or more) partial orders to produce a new partial order on 
tuples drawn from both (or all the) sets. 
For example, given `a` and `c` from set `S` and `b` and `d` from set `R`, 
where `S` and `R` both have partial orders defined on them, we can define 
an ordering relation between `(a, b)` and `(c, d)`. 
One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff 
`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by 
the first set and "breaking ties" by the second set. 
Another common order is the _product order_, which can be roughly thought of 
as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
iff `a <= c` and `b <= d`. 
One difference between the lexicographical order and the product order is 
that in the lexicographical order, whenever `a` is comparable with `c` and 
`b` with `d`, the pairs `(a, b)` and `(c, d)` are comparable, while in the 
product order you can have `a <= c` and `d <= b` (both pairs of components 
comparable) which makes `(a, b)` and `(c, d)` incomparable. 
The product order, on the other hand, has the benefit of not prioritizing one 
order over the other.
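
A small sketch makes the difference concrete: `(1, 5)` and `(2, 3)` are 
comparable under the lexicographical order, but not under the product order.

	#include <stdbool.h>
	
	// lexicographical order on pairs: order by first, break ties by second
	bool lex_le ( int a, int b, int c, int d ) { return a < c || (a == c && b <= d); }
	// product order on pairs: both components ordered the same way
	bool prod_le( int a, int b, int c, int d ) { return a <= c && b <= d; }
	
	// lex_le(1, 5, 2, 3) is true, while neither prod_le(1, 5, 2, 3) nor
	// prod_le(2, 3, 1, 5) holds: (1, 5) and (2, 3) are incomparable under
	// the product order.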

Any partial order has a natural representation as a directed acyclic graph 
(DAG). 
Each element `a` of the set becomes a node of the DAG, with an arc pointing to 
its _covering_ elements, any element `b` such that `a < b` but where there is 
no `c` such that `a < c` and `c < b`. 
Intuitively, the covering elements are the "next ones larger", where you can't 
fit another element between the two. 
Under this construction, `a < b` is equivalent to "there is a path from `a` to 
`b` in the DAG", and the lack of cycles in the directed graph is ensured by 
the antisymmetry property of the partial order.

Partial orders can be generalized to _preorders_ by removing the antisymmetry 
property. 
In a preorder the relation is generally called `<~`, and it is possible for 
two distinct elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case 
we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`. 
Preorders may also be represented as directed graphs, but in this case the 
graph may contain cycles.

## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
The short answer is that it's impossible.

The longer answer is that it has to do with what's essentially a diamond 
inheritance problem. 
In C, `int` converts to both `unsigned int` and `long` "safely"; both of 
those convert to `unsigned long` safely, and it's possible to chain the 
conversions to convert `int` to `unsigned long`. 
There are two constraints here; one is that the `int` to `unsigned long` 
conversion needs to cost more than the other two (because the types aren't as 
"close" in a very intuitive fashion), and the other is that the system needs a 
way to choose which path to take to get to the destination type. 
Now, a fairly natural solution for this would be to just say "C knows how to 
convert from `int` to `unsigned long`, so we just put in a direct conversion 
and make the compiler smart enough to figure out the costs" - given that the 
users can build an arbitrary graph of conversions, this needs to be handled 
anyway. 
We can define a preorder over the types by saying that `a <~ b` if there 
exists a chain of conversions from `a` to `b`. 
This preorder corresponds roughly to a more usual type-theoretic concept of 
subtyping ("if I can convert `a` to `b`, `a` is a more specific type than 
`b`"); however, since this graph is arbitrary, it may contain cycles, so if 
there is also a path to convert `b` to `a` they are in some sense equivalently 
specific. 

Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and 
`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`) 
and this conversion preorder over the destination types `xn` and `ym`. 
We could define a preorder by taking chain length and breaking ties by the 
conversion preorder, but this would lead to unexpected behaviour when closing 
diamonds with an arm longer than one conversion. 
Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`, 
`B2->C`, and `A->C`. 
If we are comparing conversions from `A` to both `B2` and `C`, we expect the 
conversion to `B2` to be chosen because it's the more specific type under the 
conversion preorder, but since its chain length is longer than the conversion 
to `C`, it loses and `C` is chosen. 
However, taking the conversion preorder and breaking ties or ambiguities by 
chain length also doesn't work, because of cases like the following example 
where the transitivity property is broken and we can't find a global maximum: 

	X->Y1->Y2, X->Z1->Z2->Z3->W, X->W

In this set of arcs, if we're comparing conversions from `X` to each of `Y2`, 
`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because 
there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain 
length. 
Also, comparing conversions from `X` to `Z3` and to `W`, we find that the 
conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder, 
and so is considered to be the nearer type. 
By transitivity, then, the conversion from `X` to `Y2` should be cheaper than 
the conversion from `X` to `W`, but in this case `Y2` and `W` are incomparable 
by the conversion preorder, so the tie is broken by the shorter chain from 
`X` to `W` in favour of `W`, contradicting the transitivity property for this 
proposed order.

Without transitivity, we would need to compare all pairs of conversions, which 
would be expensive, and possibly not yield a minimal-cost conversion even if 
all pairs were comparable. 
In short, this ordering is infeasible, and by extension I believe any ordering 
composed solely of single-step conversions between types with no further 
user-supplied information will be insufficiently powerful to express the 
built-in conversions between C's types.

## Appendix C: Proposed Prelude Changes ##
**TODO** Port Glen's "Future Work" page for builtin C conversions.
**TODO** Move discussion of zero_t, unit_t here.

It may be desirable to have some polymorphic wrapper functions in the prelude 
which provide consistent default implementations of various operators given a 
definition of one of them. 
Naturally, users could still provide a monomorphic overload if they wished to 
make their own code more efficient than the polymorphic wrapper could be, but 
this would minimize user effort in the common case where the user cannot write 
a more efficient function, or is willing to trade some runtime efficiency for 
developer time. 
As an example, consider the following polymorphic defaults for `+` and `+=`:

	forall(otype T | { T ?+?(T, T); })
	lvalue T ?+=? (T *a, T b) {
		return *a = *a + b;
	}
	
	forall(otype T | { lvalue T ?+=? (T*, T) })
	T ?+? (T a, T b) {
		T tmp = a;
		return tmp += b;
	}

Both of these have a possibly unnecessary copy (the first in copying the 
result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in 
cases where these copies are unavoidable the polymorphic wrappers should be 
just as performant as the monomorphic equivalents (assuming a compiler 
sufficiently clever to inline the extra function call), and allow programmers 
to define a set of related operators with maximal concision.
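
For example (a sketch with a hypothetical user type), defining only `?+?` 
makes `+=` available through the wrapper:

	struct Cplx { double re, im; };
	Cplx ?+? ( Cplx a, Cplx b ) {
		Cplx r = { a.re + b.re, a.im + b.im };
		return r;
	}
	
	Cplx x = { 1, 2 }, y = { 3, 4 };
	x += y;  // resolves to the polymorphic ?+=? wrapper with T = Cplx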

**TODO** Look at what Glen and Richard have already written for this.

## Appendix D: Feasibility of Making void* Conversions Explicit ##
C allows implicit conversions between `void*` and other pointer types, as per 
section 6.3.2.3.1 of the standard. 
Making these implicit conversions explicit in Cforall would provide 
significant type-safety benefits, and is precedented in C++. 
A weaker version of this proposal would be to allow implicit conversions to 
`void*` (as a sort of "top type" for all pointer types), but to make the 
unsafe conversion from `void*` back to a concrete pointer type an explicit 
conversion. 
However, `int *p = malloc( sizeof(int) );` and friends are hugely common 
in C code, and rely on the unsafe implicit conversion from the `void*` return 
type of `malloc` to the `int*` type of the variable - obviously it would be 
too much of a source-compatibility break to disallow this for C code. 
We do already need to wrap C code in an `extern "C"` block, though, so it is 
technically feasible to make the `void*` conversions implicit in C but 
explicit in Cforall. 
Also, for calling C code with `void*`-based APIs, pointers-to-dtype are 
calling-convention compatible with `void*`; we could read `void*` in function 
signatures as essentially a fresh dtype type variable, e.g.:

	void* malloc( size_t )
		=> forall(dtype T0) T0* malloc( size_t )
	void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
		=> forall(dtype T0, dtype T1, dtype T2)
		   void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )

This would even allow us to leverage some of Cforall's type safety to write 
better declarations for legacy C API functions, like the following wrapper for 
`qsort`:

	extern "C" { // turns off name-mangling so that this calls the C library
		// call-compatible type-safe qsort signature
		forall(dtype T)
		void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );
		
		// forbid type-unsafe C signature from resolving
		void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
			= delete;
	}
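
A hypothetical use of this wrapper, with the comparator name invented for the 
example:

	int cmp_int( const int *a, const int *b ) { return (*a > *b) - (*a < *b); }
	
	int xs[4] = { 3, 1, 4, 1 };
	qsort( xs, 4, sizeof(int), cmp_int );  // ok: T is bound to int
	// passing a comparator on doubles here would now fail to resolve, rather
	// than compiling silently as it would against the type-unsafe C signature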

## Appendix E: Features to Add in Resolver Re-write ##
* Reference types
* Special types for 0 and 1 literals
* Expression type for return statement that resolves similarly to ?=?
  - This is to get rid of the kludge in the box pass that effectively 
    re-implements the resolver poorly.
