Index: doc/working/resolver_design.md
===================================================================
--- doc/working/resolver_design.md	(revision e24f13ade68e4ef547f6f560caddb2890c2f4a3e)
+++ doc/working/resolver_design.md	(revision e24f13ade68e4ef547f6f560caddb2890c2f4a3e)
@@ -0,0 +1,1176 @@
+# Thoughts on Resolver Design #
+
+## Conversions ##
+C's implicit "usual arithmetic conversions" define a structure among the 
+built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of 
+_safe_ widening conversions. 
+There is also a set of _explicit_ conversions that are only allowed through a 
+cast expression.
+Based on Glen's notes on conversions [1], I propose that safe and unsafe 
+conversions be expressed as constructor variants, though I make explicit 
+(cast) conversions a constructor variant as well rather than a dedicated 
+operator. 
+Throughout this article, I will use the following operator names for 
+constructors and conversion functions from `From` to `To`:
+
+	void ?{} ( To*, To );            // copy constructor
+	void ?{} ( To*, From );          // explicit constructor
+	void ?{explicit} ( To*, From );  // explicit cast conversion
+	void ?{safe} ( To*, From );      // implicit safe conversion
+	void ?{unsafe} ( To*, From );    // implicit unsafe conversion
+
+[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
+
+Glen's design made no distinction between constructors and unsafe implicit 
+conversions; this is elegant, but interacts poorly with tuples. 
+Essentially, without making this distinction, a constructor like the following 
+would add an interpretation of any two `int`s as a `Coord`, needlessly 
+multiplying the space of possible interpretations of all functions:
+
+	void ?{}( Coord *this, int x, int y );
+
+That said, it would certainly be possible to make a multiple-argument implicit 
+conversion, as below, though the argument above suggests this option should be 
+used infrequently:
+
+	void ?{unsafe}( Coord *this, int x, int y );
+
+An alternate possibility would be to only count two-arg constructors 
+`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and  
+explicit conversions should also have a compiler-enforced restriction to 
+ensure that they are two-arg functions (this restriction may be valuable 
+regardless).
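+
+As an illustrative sketch (reusing the `Coord` type above), the 
+compiler-enforced restriction would reject a safe conversion declared with 
+more than two arguments:
+
+	void ?{safe}( Coord *this, int x, int y );  // rejected: not two-arg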
+
+### Constructor Idiom ###
+Basing our notion of conversions off otherwise normal Cforall functions means 
+that we can use the full range of Cforall features for conversions, including 
+polymorphism.
+Glen [1] defines a _constructor idiom_ that can be used to create chains of 
+safe conversions without duplicating code; given a type `Safe` which members 
+of another type `From` can be directly converted to, the constructor idiom 
+allows us to write a conversion for any type `To` which `Safe` converts to: 
+
+	forall(otype To | { void ?{safe}( To*, Safe ) })
+	void ?{safe}( To *this, From that ) {
+		Safe tmp = /* some expression involving that */;
+		*this = tmp; // uses assertion parameter
+	}
+
+This idiom can also be used with only minor variations for a parallel set of 
+unsafe conversions.
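+
+As a sketch, the unsafe variant of the idiom parallels the safe version 
+above, substituting the unsafe operator name throughout:
+
+	forall(otype To | { void ?{unsafe}( To*, Unsafe ) })
+	void ?{unsafe}( To *this, From that ) {
+		Unsafe tmp = /* some expression involving that */;
+		*this = tmp; // uses assertion parameter
+	}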
+
+What selective non-use of the constructor idiom gives us is the ability to 
+define a conversion that may only be the *last* conversion in a chain of such. 
+Constructing a conversion graph able to unambiguously represent the full 
+hierarchy of implicit conversions in C is provably impossible using only 
+single-step conversions with no additional information (see Appendix B), but 
+this mechanism is sufficiently powerful (see [1], though the design there has 
+some minor bugs; the general idea is to use the constructor idiom to define 
+two chains of conversions, one among the signed integral types, another among 
+the unsigned, and to use monomorphic conversions to allow conversions between 
+signed and unsigned integer types).   
+
+### Implementation Details ###
+It is desirable to have a system which can be efficiently implemented, yet 
+also to have one which has sufficient power to distinguish between functions 
+on all possible axes of polymorphism. 
+This ordering may be a partial order, which may complicate implementation 
+somewhat; in this case it may be desirable to store the set of implementations 
+for a given function as the directed acyclic graph (DAG) representing the 
+order.
+
+## Conversion Costs ##
+Each possible resolution of an expression has a _cost_ consisting of four 
+integer components: _unsafe_ conversion cost, _polymorphic_ specialization 
+cost, _safe_ conversion cost, and a _count_ of conversions. 
+These components form a lexically-ordered tuple which can be summed 
+element-wise; summation starts at `(0, 0, 0, 0)`.
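+
+As a small worked example (using the conversion operator costs defined 
+below), element-wise summation and lexicographic comparison behave as 
+follows:
+
+	// (0, 0, 1, 1) + (1, 0, 0, 1) = (1, 0, 1, 2)  element-wise sum
+	// (0, 0, 1, 1) < (1, 0, 0, 1)                 lexicographic order:
+	// any number of safe conversions is cheaper than one unsafe conversion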
+
+**TODO** Costs of T, T*, lvalue T, rvalue T conversions (if applicable)
+
+### Lvalue Conversions ###
+**TODO** Finish me
+
+#### NOTES
+* C standard 6.3.2.1
+* pointer_like_generators.md
+
+### Qualifier Conversions ###
+**TODO** Finish me
+
+#### NOTES
+C standard 6.3.2.3.2: We can add any qualifier to the pointed-to type of a 
+pointer.
+* Glen thinks this means that we should make the default assignment operator 
+  `?=?(T volatile restrict *this, T that)`, but I'm not sure I like the 
+  implications for the actual implementation of forcing `this` to be 
+  volatile
+* I want to consider whether this property should generalize to other 
+  parameterized types (e.g. `lvalue T`, `box(T)`)
+
+C standard 6.3.2.1.1: "modifiable lvalues" recursively exclude structs with 
+const-qualified fields
+
+C standard 6.3.2.1.2: Using lvalues as rvalues implicitly strips qualifiers
+
+C standard 6.2.4.26: 
+
+C standard 6.7.3: **TODO**
+
+### Conversion Operator Costs ###
+Copy constructors, safe conversions, and unsafe conversions all have an 
+associated conversion cost, calculated according to the algorithm below:
+
+1. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)`
+2. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)`
+3. Monomorphic unsafe conversions have a conversion cost of `(1, 0, 0, 1)`
+4. Polymorphic conversion operators (or copy constructors) have a conversion 
+   cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic 
+   equivalent and the sum of the conversion costs of all conversion operators 
+   passed as assertion parameters, but where the fourth "count" element of the 
+   cost tuple is fixed to `1`.
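+
+For example, applying rule 4 to a polymorphic safe conversion whose single 
+assertion parameter is satisfied by a monomorphic safe conversion:
+
+	//   (0, 1, 0, 1)  base polymorphic conversion cost
+	// + (0, 0, 1, 1)  monomorphic equivalent (rule 2)
+	// + (0, 0, 1, 1)  assertion parameter's conversion cost (rule 2)
+	// = (0, 1, 2, 3), with count fixed to 1  =>  (0, 1, 2, 1)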
+
+**TODO** Polymorphism cost may need to be reconsidered in the light of the 
+thoughts on polymorphism below.
+**TODO** You basically just want path-length in the conversion graph implied 
+by the set of conversions; the only tricky question is whether or not you can 
+account for "mixed" safe and unsafe conversions used to satisfy polymorphic 
+constraints, whether a polymorphic conversion should cost more than a 
+monomorphic one, and whether to account for non-conversion constraints in the 
+polymorphism cost
+
+### Argument-Parameter Matching ###
+Given a function `f` with a parameter list (after tuple flattening) 
+`(T1 t1, T2 t2, ... Tn tn)`, and a function application 
+`f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the 
+appropriate parameter is calculated according to the algorithm below:
+
+Given a parameter `t` of type `T` and an expression `<e>` from these lists, 
+`<e>` will have a set of interpretations of types `E1, E2, ... Ek` with 
+associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`. 
+(If any `Ei` is a tuple type, replace it with its first flattened element for 
+the purposes of this section.) 
+
+The cost of matching the interpretation of `<e>` with type `Ei` to `t` with 
+type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the 
+conversion operator cost from `Ei` to `T`.
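+
+As a concrete sketch, matching a `short` variable against an `int` parameter 
+combines the interpretation cost and the conversion operator cost:
+
+	short x;
+	void f( int );
+	f( x );  // interpretation cost of `x` as short: (0, 0, 0, 0)
+	         // + safe conversion cost short => int: (0, 0, 1, 1)
+	         // = (0, 0, 1, 1)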
+
+### Object Initialization ###
+The cost to initialize an object is calculated very similarly to 
+argument-parameter matching, with a few modifications. 
+Firstly, explicit constructors are included in the set of available 
+conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic 
+conversion costs (if applicable) and the _interpretation cost_ of the 
+constructor, the sum of the argument-parameter matching costs for its 
+parameters. 
+Also, ties in overall cost (interpretation cost plus conversion cost) are 
+broken by lowest conversion cost (i.e. of alternatives with the same overall 
+cost, copy constructors are preferred to other explicit constructors, 
+explicit constructors are preferred to safe conversions, which are preferred 
+to unsafe conversions). 
+An object initialization is properly typed if it has exactly one min-cost 
+interpretation.
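+
+As an illustrative sketch (assuming both operators below are defined for 
+some type `A`), the stated preference ordering falls out of the costs:
+
+	void ?{} ( A*, int );      // explicit constructor, cost (0, 0, 0, 1)
+	void ?{safe} ( A*, int );  // safe conversion, cost (0, 0, 1, 1)
+	A a = 42;                  // the explicit constructor is chosen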
+
+### Explicit Casts ###
+Explicit casts are handled similarly to object initialization. 
+Copy constructors and other explicit constructors are not included in the set 
+of possible conversions, though interpreting a cast as type ascription 
+(`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost 
+`(0, 0, 0, 0)`. 
+Explicit conversion operators are also included in the set of possible 
+conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion 
+costs are invoked. 
+Unlike for explicit constructors and other functions, implicit conversions are 
+never applied to the argument or return type of an explicit cast operator, so 
+that the cast may be used more effectively as a method for the user programmer 
+to guide type resolution.
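+
+For instance, with overloaded variables, a cast read as type ascription can 
+disambiguate at zero cost:
+
+	int x;
+	double x;
+	(int)x;  // ascription selects the `int` interpretation of `x`
+	         // at conversion cost (0, 0, 0, 0)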
+
+## Trait Satisfaction ##
+A _trait_ consists of a list of _type variables_ along with a (possibly empty) 
+set of _assertions_ on those variables. 
+Assertions can take two forms, _variable assertions_ and the more common 
+_function assertions_, as in the following example:
+
+	trait a_trait(otype T, otype S) {
+		T a_variable_assertion;
+		S* another_variable_assertion;
+		S a_function_assertion( T* );
+	};
+
+Variable assertions enforce that a variable with the given name and type 
+exists (the type is generally one of the type variables, or derived from one), 
+while a function assertion enforces that a function with a 
+_compatible signature_ to the provided function exists. 
+
+To test if some list of types _satisfy_ the trait, the types are first _bound_ 
+to the type variables, and then declarations to satisfy each assertion are 
+sought out. 
+Variable assertions require an exact match, because they are passed as object 
+pointers, and there is no mechanism to employ conversion functions, while 
+function assertions only require a function that can be wrapped to a 
+compatible type; for example, the declarations below satisfy 
+`a_trait(int, short)`:
+
+	int a_variable_assertion;
+	short* another_variable_assertion;
+	char a_function_assertion( void* );
+	// int* may be implicitly converted to void*, and char to short, so the 
+	// above works 
+
+Cforall polymorphic functions have a _constraining trait_, denoted as follows: 
+
+	forall(otype A, otype B | some_trait(A, B))
+
+The trait may be anonymous, with the same syntax as a trait declaration, and 
+may be unioned together using `|` or `,`.
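+
+For example, an anonymous trait may be unioned with a named one in the same 
+qualifier (a sketch, reusing `some_trait` from above):
+
+	forall(otype A, otype B | { void g(A); } | some_trait(A, B))
+	void h( A, B );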
+
+**TODO** Consider including field assertions in the list of constraint types, 
+also associated types and the appropriate matching type assertion.
+
+## Polymorphism Costs ##
+The type resolver should prefer functions that are "less polymorphic" to 
+functions that are "more polymorphic". 
+Determining how to order functions by degree of polymorphism is somewhat less 
+straightforward, though, as there are multiple axes of polymorphism and it is 
+not always clear how they compose. 
+The natural order for degree of polymorphism is a partial order, and this 
+section includes some open questions on whether it is desirable or feasible to 
+develop a tie-breaking strategy to impose a total order on the degree of 
+polymorphism of functions. 
+Helpfully, though, the degree of polymorphism is a property of functions 
+rather than function calls, so any complicated graph structure or calculation 
+representing a (partial) order over function degree of polymorphism can be 
+calculated once and cached.
+
+### Function Parameters ###
+All other things being equal, if a parameter of one function has a concrete 
+type and the equivalent parameter of another function has a dynamic type, the 
+first function is less polymorphic:
+
+	                void f( int, int );  // (0) least polymorphic 
+	forall(otype T) void f( T, int );    // (1a) more polymorphic than (0)
+	forall(otype T) void f( int, T );    // (1b) more polymorphic than (0)
+	                                     //      incomparable with (1a)
+	forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b)
+
+This should extend to parameterized types (pointers and generic types) also: 
+
+	forall(otype S) struct box { S val; };
+	
+	forall(otype T) void f( T, T* );       // (3) less polymorphic than (2)
+	forall(otype T) void f( T, T** );      // (4) less polymorphic than (3)
+	forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2)
+	                                       //     incomparable with (3)
+	forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5)
+
+Every function in the group above is incomparable with (1a/b), but that's fine 
+because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur 
+much in practice (unless there are safe or unsafe conversions defined between 
+the possible argument types).
+
+For degree of polymorphism from arguments, I think we should not distinguish 
+between different type parameters, e.g. the following should be considered 
+equally polymorphic:
+
+	forall(otype T, otype S) void f( T, T, S );  // (7)
+	forall(otype T, otype S) void f( S, T, T );  // (8)
+
+However parameter lists are compared, parameters of multi-parameter generic 
+types should ideally be treated as a recursive case, e.g. in the example 
+below, (9) is less polymorphic than (10), which is less polymorphic than (11): 
+
+	forall(otype T, otype S) struct pair { T x; S y; };
+	
+	                void f( pair(int, int) );  // (9)
+	forall(otype T) void f( pair(T, int) );    // (10)
+	forall(otype T) void f( pair(T, T) );      // (11)
+
+Parameter comparison could possibly be made somewhat cheaper at loss of some 
+precision by representing each parameter as a value from the natural numbers 
+plus infinity, where infinity represents a monomorphic parameter and a finite 
+number counts how many levels deep the shallowest type variable is, e.g. where 
+`T` is a type variable, `int` would have value infinity, `T` would have value 
+0, `T*` would have value 1, `box(T)*` would have value 2, etc. 
+Under this scheme, higher values represent less polymorphism. 
+This makes the partial order on parameters a total order, so that many of the 
+incomparable functions above compare equal, though that is perhaps a virtue. 
+It also loses the ability to differentiate between some multi-parameter 
+generic types, such as the parameters in (10) and (11), which would both be 
+valued 1, losing the polymorphism distinction between them.
+
+A variant of the above scheme would be to fix a maximum depth of polymorphic 
+type variables (16 seems like a reasonable choice) at which a parameter would 
+be considered to be effectively monomorphic, and to subtract the value  
+described above from that maximum, clamping the result to a minimum of 0. 
+Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has 
+value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0, 
+the same as `int`. 
+This can be quite succinctly represented, and summed without the presence of a 
+single monomorphic parameter pushing the result to infinity, but does lose the 
+ability to distinguish between very deeply structured polymorphic types. 
+
+### Parameter Lists ###
+A partial order on function parameter lists can be produced by the 
+product order of the partial orders on parameters described above. 
+In more detail, this means that for two parameter lists with the same arity, 
+if any pair of corresponding parameters is incomparable, the two parameter 
+lists are incomparable; if, in every pair of corresponding parameters, one 
+list's parameter is (less than or) equal to the other list's parameter, then 
+the first parameter list is (less than or) equal to the second; otherwise the 
+lists are incomparable with respect to each other.
+
+How to compare parameter lists of different arity is a somewhat open question. 
+A simple, but perhaps somewhat unsatisfying, solution would be just to say 
+that such lists are incomparable. 
+The simplest approach to making them comparable is to say that, given two 
+lists `(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the 
+parameter lists can be compared based on their shared prefix of `n` types. 
+This approach breaks the transitivity property of the equivalence relation on 
+the partial order, though, as seen below:
+
+	forall(otype T) void f( T, int );       // (1a)
+	forall(otype T) void f( T, int, int );  // (12)
+	forall(otype T) void f( T, int, T );    // (13)
+
+By this rule, (1a) is equally polymorphic to both (12) and (13), so by 
+transitivity (12) and (13) should also be equally polymorphic, but that is not 
+actually the case.
+
+We can fix the rule by saying that `(T1 ... Tn)` can be compared to 
+`(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting 
+notional monomorphic parameters. 
+In this case, (1a) and (12) are equally polymorphic, because (1a) gets 
+extended with a monomorphic type that compares equal to (12)'s third `int` 
+parameter, but (1a) is less polymorphic than (13), because its notional 
+monomorphic third parameter is less polymorphic than (13)'s `T`. 
+Essentially what this rule says is that any parameter list with more 
+parameters is no less polymorphic than one with fewer. 
+
+We could collapse this parameter list ordering to a succinct total order by 
+simply taking the sum of the clamped parameter polymorphism counts, but this 
+would again make most incomparable parameter lists compare equal, as well as 
+having the potential for some unexpected results based on the (completely 
+arbitrary) value chosen for "completely polymorphic". 
+For instance, if we set 4 to be the maximum depth of polymorphism (as above), 
+the following functions would be equally polymorphic, which is a somewhat 
+unexpected result:
+
+	forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12
+	forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12
+
+These functions would also be considered equally polymorphic:
+
+	forall(otype T) void g( T, int );    // 4 + 0 = 4;
+	forall(otype T) void g( T**, T** );  // 2 + 2 = 4;
+
+This issue can be mitigated by choosing a larger maximum depth of 
+polymorphism, but this scheme does have the distinct disadvantage of either 
+specifying the (completely arbitrary) maximum depth as part of the language or 
+allowing the compiler to refuse to accept otherwise well-typed deeply-nested 
+polymorphic types.  
+
+For purposes of determining polymorphism, the list of return types of a 
+function should be treated like another parameter list, and combined with the 
+degree of polymorphism from the parameter list in the same way that the 
+parameters in the parameter list are combined. 
+For instance, in the following, (14) is less polymorphic than (15) which is 
+less polymorphic than (16):
+
+	forall(otype T) int f( T );  // (14)
+	forall(otype T) T*  f( T );  // (15)
+	forall(otype T) T   f( T );  // (16)
+
+### Type Variables and Bounds ###
+Degree of polymorphism doesn't solely depend on the parameter lists, though. 
+Glen's thesis (4.4.4, p.89) gives an example showing that it also depends on 
+the number of type variables:
+
+	forall(otype T)          void f( T, int );  // (1a) polymorphic
+	forall(otype T)          void f( T, T );    // (2)  more polymorphic
+	forall(otype T, otype S) void f( T, S );    // (17) most polymorphic
+
+Clearly the `forall` type parameter list needs to factor into calculation of 
+degree of polymorphism as well, as it's the only real differentiation between 
+(2) and (17). 
+The simplest way to include the type parameter list would be to simply count 
+the type variables and say that functions with more type variables are more 
+polymorphic.
+
+However, it also seems natural that more-constrained type variables should be 
+counted as "less polymorphic" than less-constrained type variables. 
+This would allow our resolver to pick more specialized (and presumably more 
+efficient) implementations of functions where one exists. 
+For example:
+
+	forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic
+	forall(otype T)                  T f( T );  // (16) more polymorphic
+
+We could account for this by counting the number of unique constraints and 
+saying that functions with more constraints are less polymorphic.
+
+That said, we do model the `forall` constraint list as a (possibly anonymous) 
+_trait_, and say that each trait is a set of constraints, so we could 
+presumably define a partial order over traits based on subset inclusion, and 
+use this partial order instead of the weaker count of constraints to order the 
+list of type parameters of a function, as below:
+
+	trait has_g(otype T) { void g(T); };
+	trait has_h(otype S) { void h(S); };
+	trait has_gh(otype R | has_g(R) | has_h(R)) {};
+	// has_gh is equivalent to { void g(R); void h(R); }
+	
+	forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic
+	forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19)
+	forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19) 
+	                                       //       incomparable with (18)
+	forall(otype T)             T f( T );  // (16) most polymorphic
+
+The tricky bit with this is figuring out how to compare the constraint 
+functions for equality up to type variable renaming; I suspect there's a known 
+solution, but don't know what it is (perhaps some sort of unification 
+calculation, though I hope there's a more lightweight option). 
+We should also be able to take advantage of the programmer-provided trait 
+subset information (like the constraint on `has_gh` in the example) to 
+generate the partial-order graph for traits more efficiently; that graph can 
+then be cached.
+
+Combining count of type variables with the (partial) order on the trait 
+constraining those variables seems like it should be a fairly straightforward 
+product ordering to me - one `forall` qualifier is (less than or) equal to 
+another if it has both a (less than or) equal number of type variables and a 
+(less than or) equal degree of polymorphism from its constraining trait; the 
+two qualifiers are incomparable otherwise. 
+If an easier-to-calculate total ordering is desired, it might be acceptable to 
+use the number of type variables, with ties broken by number of constraints.
+
+Similarly, to combine the (partial) orders on parameter and return lists with 
+the (partial) order on `forall` qualifiers, a product ordering seems like the 
+reasonable choice, though if we wanted a total order a reasonable choice would 
+be to use whatever method we use to combine parameter costs into parameter 
+lists to combine the costs for the parameter and return lists, then break ties 
+by the order on the `forall` qualifiers.
+
+## Expression Costs ##
+
+### Variable Expressions ###
+Variables may be overloaded; that is, there may be multiple distinct variables 
+with the same name so long as each variable has a distinct type. 
+The variable expression `x` has one zero-cost interpretation as type `T` for 
+each variable `T x` in scope.
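+
+For example:
+
+	int x;     // `x` has a zero-cost interpretation of type int
+	double x;  // ... and another zero-cost interpretation of type double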
+
+### Member Selection Expressions ###
+For every interpretation `I` of `e` which has a struct or union type `S`, 
+`e.y` has an interpretation of type `T` for each member `T y` of `S`, with the 
+same cost as `I`. 
+Note that there may be more than one member of `S` with the same name, as per 
+Cforall's usual overloading rules.
+The indirect member expression `e->y` is desugared to `(*e).y` and interpreted 
+analogously.
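+
+For example, using Cforall's member overloading:
+
+	struct S { int y; double y; };
+	struct S s;
+	s.y;  // interpretations of type int and of type double, each with the
+	      // same cost as the interpretation of `s`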
+
+**TODO** Consider allowing `e.y` to be interpreted as `e->y` if no 
+interpretations as `e.y` exist.
+
+### Address Expressions ###
+Address expressions `&e` have an interpretation for each interpretation `I` of 
+`e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`. 
+Lvalues result from variable expressions, member selection expressions, or 
+application of functions returning an lvalue-qualified type. 
+Note that the dereference operator is overloadable, so the rules for its 
+resolution follow those for function application below.
+
+**TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a 
+new temporary holding `42` and takes its address.
+
+### Boolean-context Expressions ###
+C has a number of "boolean contexts", where expressions are assigned a truth 
+value; these include both arguments to the short-circuiting `&&` and `||` 
+operators, as well as the conditional expressions in `if` and `while` 
+statements, the middle expression in `for` statements, and the first argument 
+to the `?:` ternary conditional operator. 
+In all these contexts, C interprets `0` (which is both an integer and a null 
+pointer literal) as false, and all other integer or pointer values as true. 
+In this spirit, Cforall allows other types to be considered "truthy" if they 
+support the following de-sugaring in a conditional context (see notes on 
+interpretation of literal `0` below):
+
+	x => ((int)( x != 0 ))
+
+### Literal Expressions ###
+Literal expressions (e.g. `42`, `'c'`, `3.14`, `"Hello, world!"`) have one 
+zero-cost interpretation with the same type the expression would have in C, 
+with three exceptions:
+
+Character literals like 'x' are typed as `char` in Cforall, not `int` as in C. 
+This change breaks very little C code (primarily `sizeof 'x'`; the implicit 
+conversion from `int` to `char` and lack of overloading handle most other 
+expressions), matches the behaviour of C++, and is more compatible with 
+programmer intuition.
+
+The literals `0` and `1` are also treated specially by Cforall, due to their 
+potential uses in operator overloading. 
+Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing 
+multiple interpretations of them according to the existing variable 
+overloading rules, with the following declarations in the prelude:
+
+	const int 0, 1;
+	forall ( dtype DT ) const DT * const    0;
+	forall ( ftype FT ) FT * const          0;
+
+This did, however, create some backward-compatibility problems and potential 
+performance issues, and works poorly for generic types. To start with, this 
+(entirely legal C) code snippet doesn't compile in Cforall:
+
+	if ( 0 ) {}
+
+It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and 
+`forall(dtype DT) DT*` have a `!=` operator which returns `int`, the resolver 
+cannot choose which `0` variable to take, because they're both exact matches.
+
+The general `!=` computation may also be less efficient than a check for a 
+zero value; take the following example of a rational type:
+
+	struct rational { int32_t num; int32_t den; };
+	rational 0 = { 0, 1 };
+	
+	int ?!=? (rational a, rational b) {
+		return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den;
+	}
+	
+	int not_zero (rational a) { return a.num != 0; }
+
+To check if two rationals are equal we need to do a pair of multiplications to 
+normalize them (the casts in the example are to prevent overflow), but to 
+check if a rational is non-zero we just need to check its numerator, a more 
+efficient operation.
+
+Finally, though polymorphic null-pointer variables can be meaningfully 
+defined, most other polymorphic variables cannot be, which makes it difficult 
+to make generic types "truthy" using the existing system:
+
+	forall(otype T) struct pair { T x; T y; };
+	forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
+
+Now, it seems natural enough to want to define the zero for this pair type as 
+a pair of the zero values of its element type (if they're defined). 
+The declaration of `pair(T) 0` above is actually illegal though, as there is 
+no way to represent the zero values of an infinite number of types in the 
+single memory location available for this polymorphic variable - the 
+polymorphic null-pointer variables defined in the prelude are legal, but that 
+is only because all pointers are the same size and the single zero value is a 
+legal value of all pointer types simultaneously; null pointer is, however, 
+somewhat unique in this respect.
+
+The technical explanation for the problems with polymorphic zero is that `0` 
+is really an rvalue, not an lvalue - an expression, not an object. 
+Drawing from this, the solution we propose is to give `0` a new built-in type, 
+`_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in 
+type `_unit_t`. 
+If the prelude defines != over `_zero_t` this solves the `if ( 0 )` problem, 
+because now the unambiguous best interpretation of `0 != 0` is to read them 
+both as `_zero_t` (and say that this expression is false). 
+Backwards compatibility with C can be served by defining conversions in the 
+prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer 
+types, as below:
+
+	// int 0;
+	forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t);
+	forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t);
+	
+	// int 1;
+	forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t);
+	forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t);
+	
+	// forall(dtype DT) const DT* 0;
+	forall(dtype DT) void ?{safe}(const DT**, _zero_t);
+	// forall(ftype FT) FT* 0;
+	forall(ftype FT) void ?{safe}(FT**, _zero_t);
+
+Further, with this change, instead of making `0` and `1` overloadable 
+variables, we can instead allow user-defined constructors (or, more flexibly, 
+safe conversions) from `_zero_t`, as below:
+
+	// rational 0 = { 0, 1 };
+	void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; }
+
+Note that we don't need to name the `_zero_t` parameter to this constructor, 
+because its only possible value is a literal zero. 
+This one line allows `0` to be used anywhere a `rational` is required, as well 
+as enabling the same use of rationals in boolean contexts as above (by 
+interpreting the `0` in the desugaring as a rational by this conversion). 
+Furthermore, while defining a conversion function from literal zero to 
+`rational` makes rational a "truthy" type able to be used in a boolean 
+context, we can optionally further optimize the truth decision on rationals as 
+follows:
+
+	int ?!=? (rational a, _zero_t) { return a.num != 0; }
+
+This comparison function will be chosen in preference to the more general 
+rational comparison function for comparisons against literal zero (like in 
+boolean contexts) because it doesn't require a conversion on the `0` argument. 
+Functions of the form `int ?!=? (T, _zero_t)` can actually be used in general 
+to make a type `T` truthy without making `0` a value which can convert to that 
+type, a capability not available in the current design.
+
+This design also solves the problem of polymorphic zero for generic types, as 
+in the following example:
+
+	// ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 };
+	forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) {
+		this->x = 0; this->y = 0;
+	}
+
+The polymorphic variable declaration didn't work, but this constructor is 
+perfectly legal and has the desired semantics.
+
+Similarly giving literal `1` the special type `_unit_t` allows for more 
+concise and consistent specification of the increment and decrement operators, 
+using the following de-sugaring:
+
+	++i => i += 1
+	i++ => (tmp = i, i += 1, tmp)
+	--i => i -= 1
+	i-- => (tmp = i, i -= 1, tmp)
+
+In the examples above, `tmp` is a fresh temporary with its type inferred from 
+the return type of `i += 1`. 
+Under this proposal, defining a conversion from `_unit_t` to `T` and a 
+`lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators 
+for free in a consistent fashion (similarly for -= and the decrement 
+operators).
+If a meaningful `1` cannot be defined for a type, both increment operators can 
+still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`. 
+Similarly, if scalar addition can be performed on a type more efficiently than 
+by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the 
+addition operator, it will simultaneously define consistent implementations of 
+both increment operators (this can also be accomplished by defining a 
+conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`).
+
+To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an 
+increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`, 
+we also define a non-transitive unsafe conversion from `_Bool` (allowable 
+values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows:
+
+	void ?{unsafe} (_unit_t*, _Bool) {}
+
+As a note, the desugaring of post-increment above is possibly even more 
+efficient than that of C++ - in C++, the copy to the temporary may be hidden 
+in a separately-compiled module where it can't be elided in cases where it is 
+not used, whereas this approach for Cforall always gives the compiler the 
+opportunity to optimize out the temporary when it is not needed. 
+Furthermore, one could imagine a post-increment operator that returned some 
+type `T2` that was implicitly convertible to `T` but less work than a full 
+copy of `T` to create (this seems like an absurdly niche case) - since the 
+type of `tmp` is inferred from the return type of `i += 1`, you could set up 
+functions with the following signatures to enable an equivalent pattern in 
+Cforall:
+
+	lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2
+	void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i`
+	void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to 
+	                              // preserve expected semantics of T x = y++;
+
+**TODO** Look in C spec for literal type interpretations.
+**TODO** Write up proposal for wider range of literal types, put in appendix
+
+### Initialization and Cast Expressions ###
+An initialization expression `T x = e` has one interpretation for each 
+interpretation `I` of `e` with type `S` which is convertible to `T`. 
+The cost of the interpretation is the cost of `I` plus the conversion cost 
+from `S` to `T`.
+A cast expression `(T)e` is interpreted as hoisting initialization of a 
+temporary variable `T tmp = e` out of the current expression, then replacing 
+`(T)e` by the new temporary `tmp`.
+
+### Assignment Expressions ###
+An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then 
+interpreted according to the usual rules for function application and comma 
+expressions. 
+Operator-assignment expressions like `e += f` desugar similarly as 
+`(?+=?(&e, f), e)`.
+
+### Function Application Expressions ###
+Every _compatible function_ and satisfying interpretation of its arguments and 
+polymorphic variable bindings produces one interpretation for the function 
+application expression. 
+Broadly speaking, the resolution cost of a function application is the sum of 
+the cost of the interpretations of all arguments, the cost of all conversions 
+to make those argument interpretations match the parameter types, and the 
+binding cost of any of the function's polymorphic type parameters.
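+As a rough illustration of this cost model, the following sketch sums argument, 
+conversion, and binding costs for one candidate function; the flat integer 
+costs, the `conversion_cost` table, and the name `application_cost` are all 
+illustrative, not the actual resolver's data structures:

```python
# Simplified model of the resolution cost of a function application: the
# sum of the costs of the argument interpretations, the conversion costs
# needed to reach the parameter types, and the binding cost of any
# polymorphic type parameters. All names and costs here are illustrative.

# conversion_cost[(from_type, to_type)]: cost of an implicit conversion
conversion_cost = {
    ("int", "int"): 0,        # exact match
    ("char", "int"): 1,       # safe conversion
    ("double", "int"): 2,     # unsafe conversion (modelled as costlier)
}

def application_cost(arg_interps, param_types, binding_cost=0):
    """Cost of one interpretation of a function application.

    arg_interps: list of (type, cost) pairs, one interpretation per argument.
    param_types: the parameter types of the candidate function.
    Returns None if some argument cannot convert to its parameter type.
    """
    total = binding_cost
    for (arg_type, arg_cost), param_type in zip(arg_interps, param_types):
        conv = conversion_cost.get((arg_type, param_type))
        if conv is None:
            return None  # not a compatible function for these arguments
        total += arg_cost + conv
    return total

# f(42) with f(int): exact match, cost 0
assert application_cost([("int", 0)], ["int"]) == 0
# f('x'): the safe char => int conversion adds cost
assert application_cost([("char", 0)], ["int"]) == 1
# f((void*)0): no implicit conversion from void* => int
assert application_cost([("void*", 0)], ["int"]) is None
```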
+
+**TODO** Work out binding cost in more detail.
+**TODO** Address whether "incomparably polymorphic" should be treated as 
+"equally polymorphic" and be disambiguated by count of (safe) conversions.
+**TODO** Think about what polymorphic return types mean in terms of late 
+binding.
+**TODO** Consider if "f is less polymorphic than g" can mean exactly "f 
+specializes g"; if we don't consider the assertion parameters (except perhaps 
+by count) and make polymorphic variables bind exactly (rather than after 
+implicit conversions) this should actually be pre-computable.
+**TODO** Add "deletable" functions - take Thierry's suggestion that a deleted 
+function declaration is costed out by the resolver in the same way that any 
+other function declaration is costed; if the deleted declaration is the unique 
+min-cost resolution refuse to type the expression, if it is tied for min-cost 
+then take the non-deleted alternative, and of two equivalent-cost deleted 
+interpretations with the same return type pick one arbitrarily rather than 
+producing an ambiguous resolution.
+
+### Sizeof, Alignof & Offsetof Expressions ###
+`sizeof`, `alignof`, and `offsetof` expressions have at most a single 
+interpretation, of type `size_t`. 
+`sizeof` and `alignof` expressions take either a type or an expression as an 
+argument; if the argument is a type, it must be a complete type which is not a 
+function type; if an expression, the expression must have a single 
+interpretation, the type of which conforms to the same rules. 
+`offsetof` takes two arguments, a type and a member name; the type must be 
+a complete structure or union type, and the second argument must name a member 
+of that type.
+
+### Comma Expressions ###
+A comma expression `x, y` resolves `x` as if it had been cast to `void`, and 
+then, if there is a unique interpretation `I` of `x`, has one interpretation 
+for each interpretation `J` of `y`, with the same type as `J`, costing the sum 
+of the costs of `I` and `J`.
+
+#### Compatible Functions ####
+**TODO** This subsection is very much a work in progress and has multiple open 
+design questions.
+
+A _compatible function_ for an application expression is a visible function 
+declaration with the same name as the application expression and parameter 
+types to which the argument types can be converted. 
+Function pointers and variables of types with the `?()` function call operator 
+overloaded may also serve as function declarations for purposes of 
+compatibility. 
+
+For monomorphic parameters of a function declaration, the declaration is a 
+compatible function if there is an argument interpretation that is either an 
+exact match, or has a safe or unsafe implicit conversion that can be used to 
+reach the parameter type; for example: 
+
+	void f(int);
+	
+	f(42);        // compatible; exact match to int type
+	f('x');       // compatible; safe conversion from char => int
+	f(3.14);      // compatible; unsafe conversion from double => int
+	f((void*)0);  // not compatible; no implicit conversion from void* => int
+
+Per Richard[*], function assertion satisfaction involves recursively searching 
+for compatible functions, not an exact match on the function types (I don't 
+believe the current Cforall resolver handles this properly); to extend the 
+previous example:
+
+	forall(otype T | { void f(T); }) void g(T);
+	
+	g(42);    // binds T = int, takes f(int) by exact match
+	g('x');   // binds T = char, takes f(int) by conversion
+	g(3.14);  // binds T = double, takes f(int) by conversion
+
+[*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the 
+accessible scopes for definitions corresponding to assertion names, and 
+choosing the ones whose types correspond *most closely* to the assertion 
+types." (emphasis mine)
+
+There are three approaches we could take to binding type variables: type 
+variables must bind to argument types exactly, each type variable must bind 
+exactly to at least one argument, or type variables may bind to any type which 
+all corresponding arguments can implicitly convert to; I'll provide some 
+possible motivation for each approach.
+
+There are two main arguments for the more restrictive binding schemes; the 
+first is that the built-in implicit conversions in C between `void*` and `T*` 
+for any type `T` can lead to unexpectedly type-unsafe behaviour in a more 
+permissive binding scheme, for example:
+
+	forall(dtype T) T* id(T *p) { return p; }
+	
+	int main() {
+		int *p = 0;
+		char *c = id(p);
+	}
+
+This code compiles in CFA today, and it works because the extra function 
+wrapper `id` provides a level of indirection that allows the non-chaining 
+implicit conversions from `int*` => `void*` and `void*` => `char*` to chain. 
+The resolver types the last line with `T` bound to `void` as follows:
+
+	char *c = (char*)id( (void*)p );
+
+It has been suggested that making the implicit conversions to and from `void*` 
+explicit in Cforall code (as in C++) would solve this particular problem, and 
+provide enough other type-safety benefits to outweigh the source-compatibility 
+break with C; see Appendix D for further details.
+
+The second argument for a more constrained binding scheme is performance; 
+trait assertions need to be checked after the type variables are bound, and 
+adding more possible values for the type variables should correspond to a 
+linear increase in runtime of the resolver per type variable. 
+There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of 
+them is implicitly convertible to any other; if we allow unrestricted binding 
+of type variables, a common `int` variable (or literal) used in the position 
+of a polymorphic variable parameter would cause a 20x increase in the amount 
+of time needed to check trait resolution for that interpretation. 
+These numbers have yet to be empirically substantiated, but the theory is 
+reasonable, and given that much of the impetus for re-writing the resolver is 
+due to its poor performance, I think this is a compelling argument. 
+
+I would also mention that a constrained binding scheme is practical; the most 
+common type of assertion is a function assertion, and, as mentioned above, 
+those assertions should be able to be implicitly converted to match. 
+Thus, in the example above with `g(T)`, where the assertion is `void f(T)`, 
+we first bind `T = int` or `T = char` or `T = double`, then substitute the 
+binding into the assertion, yielding assertions of `void f(int)`, 
+`void f(char)`, or `void f(double)`, respectively, then attempt to satisfy 
+these assertions to complete the binding. 
+Though in all three cases, the existing function with signature `void f(int)` 
+satisfies this assertion, the checking work cannot easily be re-used between 
+variable bindings, because there may be better or worse matches depending on 
+the specifics of the binding.
+
+The main argument for a more flexible binding scheme is that the binding 
+abstraction can actually cause a wrapped function call that would work to 
+cease to resolve, as below:
+
+	forall(otype T | { T ?+? (T, T) })
+	T add(T x, T y) { return x + y; }
+	
+	int main() {
+		int i, j = 2;
+		short r, s = 3;
+		i = add(j, s);
+		r = add(s, j);
+	}
+
+Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and 
+in both cases the short `s` is promoted to `int` to match `j`. 
+If, on the other hand, we demand that variables exactly match type variables, 
+neither call to `add` will compile, because it is impossible to simultaneously 
+bind `T` to both `int` and `short` (C++ has a similar restriction on template 
+variable inferencing). 
+One alternative that enables this case, while still limiting the possible 
+type variable bindings is to say that at least one argument must bind to its 
+type parameter exactly. 
+In this case, both calls to `add` would have the set `{ T = int, T = short }` 
+for candidate bindings, and would have to check both, as well as checking that 
+`short` could convert to `int` or vice-versa.
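+A minimal sketch of this "at least one argument must bind exactly" scheme; the 
+conversion table and the names `can_bind` and `candidate_bindings` are 
+hypothetical, chosen just for illustration:

```python
# Sketch of the "at least one argument binds exactly" scheme: the
# candidate bindings for a type variable are exactly the argument types,
# and a candidate survives only if every other argument can implicitly
# convert to it. The conversion table below is illustrative.

safe_converts = {("short", "int"), ("char", "int"), ("char", "short")}

def can_bind(arg_type, candidate):
    return arg_type == candidate or (arg_type, candidate) in safe_converts

def candidate_bindings(arg_types):
    """All bindings of T where T equals at least one argument type and
    every argument converts to T."""
    return {c for c in arg_types
            if all(can_bind(a, c) for a in arg_types)}

# add(j, s) with int j, short s: the initial candidate set is
# { T = int, T = short }; T = int survives (short converts to int),
# T = short does not (int does not safely convert to short).
assert candidate_bindings(["int", "short"]) == {"int"}
```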
+
+It is worth noting here that parameterized types generally bind their type 
+parameters exactly anyway, so these "restrictive" semantics only restrict a 
+small minority of calls; for instance, in the example following, there isn't a 
+sensible way to type the call to `ptr-add`:
+
+	forall(otype T | { T ?+?(T, T) })
+	void ptr-add( T* rtn, T* x, T* y ) {
+		*rtn = *x + *y;
+	}
+	
+	int main() {
+		int i, j = 2;
+		short s = 3;
+		ptr-add(&i, &j, &s); // ERROR &s is not an int*
+	}
+
+I think there is some value in providing library authors with the 
+capability to express "these two parameter types must match exactly". 
+This can be done without restricting the language's expressivity, as the `add` 
+case above can be made to work under the strictest type variable binding 
+semantics with any addition operator in the system by changing its signature 
+as follows:
+
+	forall( otype T, otype R, otype S | { R ?+?(T, S); } )
+	R add(T x, S y) { return x + y; }
+
+Now, it is somewhat unfortunate that the most general version here is more 
+verbose (and thus that the path of least resistance would be more restrictive 
+library code); however, the breaking case in the example code above is a bit 
+odd anyway - explicitly taking two variables of distinct types and relying on 
+C's implicit conversions to do the right thing is somewhat bad form, 
+especially where signed/unsigned conversions are concerned. 
+I think the more common case for implicit conversions is the following, 
+though, where the conversion is used on a literal: 
+
+	short x = 40;
+	short y = add(x, 2);
+
+One option to handle just this case would be to make literals implicitly 
+convertible to match polymorphic type variables, but only literals. 
+The example above would actually behave slightly differently than `x + 2` in 
+C, though, casting the `2` down to `short` rather than the `x` up to `int`, a 
+possible demerit of this scheme. 
+
+The other question to ask would be which conversions would be allowed for 
+literals; it seems rather odd to allow down-casting `42ull` to `char`, when 
+the programmer has explicitly specified by the suffix that it's an unsigned 
+long. 
+Type interpretations of literals in C are rather complex (see [1]), but one 
+reasonable approach would be to say that un-suffixed integer literals could be 
+interpreted as any type convertible from `int`, "u" suffixed literals could be 
+interpreted as any type convertible from `unsigned int` except the signed 
+integer types, and "l" or "ll" suffixed literals could only be interpreted as 
+`long` or `long long`, respectively (or possibly that the "u" suffix excludes 
+the signed types, while the "l" suffix excludes the types smaller than 
+`long int`, as in [1]). 
+Similarly, unsuffixed floating-point literals could be interpreted as `float`, 
+`double` or `long double`, but "f" or "l" suffixed floating-point literals 
+could only be interpreted as `float` or `long double`, respectively. 
+I would like to specify that character literals can only be interpreted as 
+`char`, but the wide-character variants and the C practice of typing character 
+literals as `int` means that would likely break code, so character literals 
+should be able to take any integer type.
+
+[1] http://en.cppreference.com/w/c/language/integer_constant
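+These proposed rules could be sketched as a mapping from literal kind and 
+suffix to allowable interpretation types; the type sets below are simplified 
+(plain `char` is omitted, and "any type convertible from `int`" is 
+approximated by all arithmetic types), and all names are illustrative:

```python
# Sketch of the proposed literal interpretation rules. The type sets are
# simplified, and literal_types is an illustrative name, not part of any
# actual design.

signed_ints = {"signed char", "short", "int", "long", "long long"}
unsigned_ints = {"unsigned char", "unsigned short", "unsigned int",
                 "unsigned long", "unsigned long long"}
integer_types = signed_ints | unsigned_ints
floating_types = {"float", "double", "long double"}

def literal_types(kind, suffix=""):
    """Allowable interpretations of a literal under the proposed rules."""
    if kind == "int":
        if suffix == "":
            return integer_types | floating_types  # any type convertible from int
        if suffix == "u":
            return unsigned_ints | floating_types  # excludes the signed integer types
        if suffix == "l":
            return {"long"}
        if suffix == "ll":
            return {"long long"}
    if kind == "float":
        if suffix == "":
            return floating_types
        if suffix == "f":
            return {"float"}
        if suffix == "l":
            return {"long double"}
    if kind == "char":
        return integer_types  # C practice: character literals take integer types
    return set()

# down-casting a "u"-suffixed literal to a signed type is ruled out...
assert "signed char" not in literal_types("int", "u")
# ...and an "l" suffix pins the literal to long
assert literal_types("int", "l") == {"long"}
```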
+
+With the possible exception of the `add` case above, implicit conversions to 
+the function types of assertions can handle most of the expected behaviour 
+from C. 
+However, implicit conversions cannot be applied to match variable assertions, 
+as in the following example:
+
+	forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } )
+	T clamp_sum( T x, T y ) {
+		T sum = x + y;
+		if ( sum < min ) return min;
+		if ( max < sum ) return max;
+		return sum;
+	}
+	
+	char min = 'A';
+	double max = 100.0;
+	//int val = clamp_sum( 'X', 3.14 );  // ERROR (1)
+	
+	char max = 'Z';
+	char val = clamp_sum( 'X', 3.14 ); // MATCH (2)
+	double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3)
+
+In this example, the call to `clamp_sum` at (1) doesn't compile, because even 
+though there are compatible `min` and `max` variables of types `char` and 
+`double`, they need to have the same type to match the constraint, and they 
+don't. 
+The (2) example does compile, but with a behaviour that might be a bit 
+unexpected given the "usual arithmetic conversions", in that both values are 
+narrowed to `char` to match the `min` and `max` constraints, rather than 
+widened to `double` as is usual for mis-matched arguments to `+`.
+The (3) example is the only case discussed here that would require the most 
+permissive type binding semantics - here, `T` is bound to `char`, to match the 
+constraints, and both the parameters are narrowed from `double` to `char` 
+before the call, which would not be allowed under either of the more 
+restrictive binding semantics. 
+However, the behaviour here is unexpected to the programmer, because the 
+return value will be `(double)'A' /* == 60.0 */` due to the conversions, 
+rather than `60.8 /* == 40.9 + 19.9 */` as they might expect.
+
+Personally, I think that implicit conversions are not a particularly good 
+language design, and that the use-cases for them can be better handled with 
+less powerful features (e.g. more versatile rules for typing constant 
+expressions). 
+However, though we do need implicit conversions in monomorphic code for C 
+compatibility, I'm in favour of restricting their usage in polymorphic code, 
+both to give programmers some stronger tools to express their intent and to 
+shrink the search space for the resolver. 
+Of the possible binding semantics I've discussed, I'm in favour of forcing 
+polymorphic type variables to bind exactly, though I could be talked into 
+allowing literal expressions to have more flexibility in their bindings, or 
+possibly loosening "type variables bind exactly" to "type variables bind 
+exactly at least once"; I think the unrestricted combination of implicit 
+conversions and polymorphic type variable binding unnecessarily multiplies the 
+space of possible function resolutions, and that the added resolution options 
+are mostly unexpected and therefore confusing and not useful to user 
+programmers.
+
+## Appendix A: Partial and Total Orders ##
+The `<=` relation on integers is a commonly known _total order_, and 
+intuitions based on how it works generally apply well to other total orders. 
+Formally, a total order is some binary relation `<=` over a set `S` such that 
+for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and 
+`b` must be equal, the _antisymmetry_ property); total orders also have a 
+_transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`. 
+If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
+ 
+A _partial order_ is a generalization of this concept where the `<=` relation 
+is not required to be defined over all pairs of elements in `S` (though there 
+is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other 
+words, it is possible for two elements `a` and `b` of `S` to be 
+_incomparable_, unable to be ordered with respect to one another (any `a` and 
+`b` for which either `a <= b` or `b <= a` are called _comparable_). 
+Antisymmetry and transitivity are also required for a partial order, so all 
+total orders are also partial orders by definition. 
+One fairly natural partial order is the "subset of" relation over sets from 
+the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are 
+both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the 
+other - they are incomparable under this relation. 
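+The subset example is directly checkable in, say, Python, where `<=` on sets 
+is the subset relation:

```python
# The "subset of" relation over Python frozensets is a partial order:
# reflexive, antisymmetric, and transitive, but not total.

a, b = frozenset({1}), frozenset({2})
bottom, top = frozenset(), frozenset({1, 2})

assert bottom <= a and bottom <= b      # { } is a subset of both
assert a <= top and b <= top            # both are subsets of { 1, 2 }
assert not (a <= b) and not (b <= a)    # { 1 } and { 2 } are incomparable
```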
+
+We can compose two (or more) partial orders to produce a new partial order on 
+tuples drawn from both (or all the) sets. 
+For example, given `a` and `c` from set `S` and `b` and `d` from set `R`, 
+where `S` and `R` both have partial orders defined on them, we can define 
+an ordering relation between `(a, b)` and `(c, d)`. 
+One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff 
+`a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by 
+the first set and "breaking ties" by the second set. 
+Another common order is the _product order_, which can be roughly thought of 
+as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
+iff `a <= c` and `b <= d`. 
+One difference between the lexicographical order and the product order is that 
+in the lexicographical order if both `a` and `c` and `b` and `d` are 
+comparable then `(a, b)` and `(c, d)` will be comparable, while in the product 
+order you can have `a <= c` and `d <= b` (both comparable) which will make 
+`(a, b)` and `(c, d)` incomparable. 
+The product order, on the other hand, has the benefit of not prioritizing one 
+order over the other.
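+The difference between the two composite orders can be sketched concretely, 
+using plain integers for both component orders:

```python
# Composite orders over pairs, given component orders <=; the components
# here are plain integers for illustration.

def lex_le(p, q):
    """Lexicographical order: order by first component, break ties by second."""
    (a, b), (c, d) = p, q
    return a < c or (a == c and b <= d)

def prod_le(p, q):
    """Product order: all components ordered the same way."""
    (a, b), (c, d) = p, q
    return a <= c and b <= d

# a <= c but d <= b: comparable lexicographically, incomparable in the
# product order.
p, q = (1, 5), (2, 3)
assert lex_le(p, q)                             # (1,5) <= (2,3) lexicographically
assert not prod_le(p, q) and not prod_le(q, p)  # incomparable under product order
```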
+
+Any partial order has a natural representation as a directed acyclic graph 
+(DAG). 
+Each element `a` of the set becomes a node of the DAG, with an arc pointing to 
+its _covering_ elements, any element `b` such that `a < b` but where there is 
+no `c` such that `a < c` and `c < b`. 
+Intuitively, the covering elements are the "next ones larger", where you can't 
+fit another element between the two. 
+Under this construction, `a < b` is equivalent to "there is a path from `a` to 
+`b` in the DAG", and the lack of cycles in the directed graph is ensured by 
+the antisymmetry property of the partial order.
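+A small sketch of this construction, computing the covering relation for the 
+"subset of" order over the subsets of `{ 1, 2 }` (the name `covers` is 
+illustrative):

```python
# Computing the covering relation of a partial order: b covers a iff
# a < b and no c fits strictly between them. Illustrated on the "subset
# of" order, where < on frozensets is proper subset.

from itertools import chain, combinations

universe = {1, 2}
elements = [frozenset(c) for c in
            chain.from_iterable(combinations(universe, n)
                                for n in range(len(universe) + 1))]

def covers(a, b):
    """True iff b covers a: a < b with nothing strictly in between."""
    if not (a < b):
        return False
    return not any(a < c < b for c in elements)

# { } is covered by { 1 } and { 2 }, but not by { 1, 2 }: the path to
# { 1, 2 } passes through an intermediate element.
assert covers(frozenset(), frozenset({1}))
assert covers(frozenset(), frozenset({2}))
assert not covers(frozenset(), frozenset({1, 2}))
```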
+
+Partial orders can be generalized to _preorders_ by removing the antisymmetry 
+property. 
+In a preorder the relation is generally called `<~`, and it is possible for 
+two distinct elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case 
+we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`. 
+Preorders may also be represented as directed graphs, but in this case the 
+graph may contain cycles.
+
+## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ##
+The short answer is that it's impossible.
+
+The longer answer is that it has to do with what's essentially a diamond 
+inheritance problem. 
+In C, `int` converts to `unsigned int` and also `long` "safely"; both convert 
+to `unsigned long` safely, and it's possible to chain the conversions to 
+convert `int` to `unsigned long`. 
+There are two constraints here; one is that the `int` to `unsigned long` 
+conversion needs to cost more than the other two (because the types aren't as 
+"close" in a very intuitive fashion), and the other is that the system needs a 
+way to choose which path to take to get to the destination type. 
+Now, a fairly natural solution for this would be to just say "C knows how to 
+convert from `int` to `unsigned long`, so we just put in a direct conversion 
+and make the compiler smart enough to figure out the costs" - given that the 
+users can build an arbitrary graph of conversions, this needs to be handled 
+anyway. 
+We can define a preorder over the types by saying that `a <~ b` if there 
+exists a chain of conversions from `a` to `b`. 
+This preorder corresponds roughly to a more usual type-theoretic concept of 
+subtyping ("if I can convert `a` to `b`, `a` is a more specific type than 
+`b`"); however, since this graph is arbitrary, it may contain cycles, so if 
+there is also a path to convert `b` to `a` they are in some sense equivalently 
+specific. 
+
+Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and 
+`(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`) 
+and this conversion preorder over the destination types `xn` and `ym`. 
+We could define a preorder by taking chain length and breaking ties by the 
+conversion preorder, but this would lead to unexpected behaviour when closing 
+diamonds with arms longer than one conversion. 
+Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`, 
+`B2->C`, and `A->C`. 
+If we are comparing conversions from `A` to both `B2` and `C`, we expect the 
+conversion to `B2` to be chosen because it's the more specific type under the 
+conversion preorder, but since its chain length is longer than the conversion 
+to `C`, it loses and `C` is chosen. 
+However, taking the conversion preorder and breaking ties or ambiguities by 
+chain length also doesn't work, because of cases like the following example 
+where the transitivity property is broken and we can't find a global maximum: 
+
+	`X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`
+
+In this set of arcs, if we're comparing conversions from `X` to each of `Y2`, 
+`Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because 
+there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain 
+length. 
+Also, comparing conversions from `X` to `Z3` and to `W`, we find that the 
+conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder, 
+and so is considered to be the nearer type. 
+By transitivity, then, the conversion from `X` to `Y2` should be cheaper than 
+the conversion from `X` to `W`, but in this case the `X` and `W` are 
+incomparable by the conversion preorder, so the tie is broken by the shorter 
+path from `X` to `W` in favour of `W`, contradicting the transitivity property 
+for this proposed order.
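+This transitivity failure can be checked mechanically; the sketch below 
+encodes the arc set above and the proposed (broken) comparison, with 
+`reachable` and `cheaper` as illustrative names:

```python
# Demonstrating the broken ordering on the arc set
# X->Y1->Y2, X->Z1->Z2->Z3->W, X->W: compare conversion chains by the
# conversion preorder on destinations, breaking incomparability by
# chain length.

arcs = {("X", "Y1"), ("Y1", "Y2"),
        ("X", "Z1"), ("Z1", "Z2"), ("Z2", "Z3"), ("Z3", "W"),
        ("X", "W")}

def reachable(a, b):
    """True iff there is a chain of conversions from a to b."""
    seen, frontier = set(), {a}
    while frontier:
        seen |= frontier
        frontier = {y for (x, y) in arcs if x in frontier} - seen
    return b in seen

# shortest chain length from X to each destination
length = {"Y2": 2, "Z3": 3, "W": 1}

def cheaper(a, b):
    """True iff converting X => a beats X => b under the proposed order."""
    if reachable(a, b):            # a is nearer under the conversion preorder
        return True
    if reachable(b, a):
        return False
    return length[a] <= length[b]  # incomparable: break ties by chain length

assert cheaper("Y2", "Z3")     # incomparable destinations, Y2's chain is shorter
assert cheaper("Z3", "W")      # Z3 < W under the conversion preorder
assert not cheaper("Y2", "W")  # ...but transitivity fails: W wins by length
```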
+
+Without transitivity, we would need to compare all pairs of conversions, which 
+would be expensive, and possibly not yield a minimal-cost conversion even if 
+all pairs were comparable. 
+In short, this ordering is infeasible, and by extension I believe any ordering 
+composed solely of single-step conversions between types with no further 
+user-supplied information will be insufficiently powerful to express the 
+built-in conversions between C's types.
+
+## Appendix C: Proposed Prelude Changes ##
+**TODO** Port Glen's "Future Work" page for builtin C conversions.
+**TODO** Move discussion of zero_t, unit_t here.
+
+It may be desirable to have some polymorphic wrapper functions in the prelude 
+which provide consistent default implementations of various operators given a 
+definition of one of them. 
+Naturally, users could still provide a monomorphic overload if they wished to 
+make their own code more efficient than the polymorphic wrapper could be, but 
+this would minimize user effort in the common case where the user cannot write 
+a more efficient function, or is willing to trade some runtime efficiency for 
+developer time. 
+As an example, consider the following polymorphic defaults for `+` and `+=`:
+
+	forall(otype T | { T ?+?(T, T); })
+	lvalue T ?+=? (T *a, T b) {
+		return *a = *a + b;
+	}
+	
+	forall(otype T | { lvalue T ?+=? (T*, T) })
+	T ?+? (T a, T b) {
+		T tmp = a;
+		return tmp += b;
+	}
+
+Both of these have a possibly unnecessary copy (the first in copying the 
+result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in 
+cases where these copies are unavoidable the polymorphic wrappers should be 
+just as performant as the monomorphic equivalents (assuming a compiler 
+sufficiently clever to inline the extra function call), and allow programmers 
+to define a set of related operators with maximal concision.
+
+**TODO** Look at what Glen and Richard have already written for this.
+
+## Appendix D: Feasibility of Making void* Conversions Explicit ##
+C allows implicit conversions between `void*` and other pointer types, as per 
+section 6.3.2.3.1 of the standard. 
+Making these implicit conversions explicit in Cforall would provide 
+significant type-safety benefits, and is precedented in C++. 
+A weaker version of this proposal would be to allow implicit conversions to 
+`void*` (as a sort of "top type" for all pointer types), but to make the 
+unsafe conversion from `void*` back to a concrete pointer type an explicit 
+conversion. 
+However, `int *p = malloc( sizeof(int) );` and friends are hugely common 
+in C code, and rely on the unsafe implicit conversion from the `void*` return 
+type of `malloc` to the `int*` type of the variable - obviously it would be 
+too much of a source-compatibility break to disallow this for C code. 
+We do already need to wrap C code in an `extern "C"` block, though, so it is 
+technically feasible to make the `void*` conversions implicit in C but 
+explicit in Cforall. 
+Also, for calling C code with `void*`-based APIs, pointers-to-dtype are 
+calling-convention compatible with `void*`; we could read `void*` in function 
+signatures as essentially a fresh dtype type variable, e.g.:
+
+	void* malloc( size_t )
+		=> forall(dtype T0) T0* malloc( size_t )
+	void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
+		=> forall(dtype T0, dtype T1, dtype T2)
+		   void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) )
+
+This would even allow us to leverage some of Cforall's type safety to write 
+better declarations for legacy C API functions, like the following wrapper for 
+`qsort`:
+
+	extern "C" { // turns off name-mangling so that this calls the C library
+		// call-compatible type-safe qsort signature
+		forall(dtype T)
+		void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) );
+		
+		// forbid type-unsafe C signature from resolving
+		void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) )
+			= delete;
+	}
