| 1 | # Thoughts on Resolver Design # | 
|---|
| 2 |  | 
|---|
| 3 | ## Conversions ## | 
|---|
| 4 | C's implicit "usual arithmetic conversions" define a structure among the | 
|---|
| 5 | built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of | 
|---|
| 6 | _safe_ widening conversions. | 
|---|
| 7 | There is also a set of _explicit_ conversions that are only allowed through a | 
|---|
| 8 | cast expression. | 
|---|
| 9 | Based on Glen's notes on conversions [1], I propose that safe and unsafe | 
|---|
| 10 | conversions be expressed as constructor variants, though I make explicit | 
|---|
| 11 | (cast) conversions a constructor variant as well rather than a dedicated | 
|---|
| 12 | operator. | 
|---|
| 13 | Throughout this article, I will use the following operator names for | 
|---|
| 14 | constructors and conversion functions from `From` to `To`: | 
|---|
| 15 |  | 
|---|
| 16 | void ?{} ( To*, To );            // copy constructor | 
|---|
| 17 | void ?{} ( To*, From );          // explicit constructor | 
|---|
| 18 | void ?{explicit} ( To*, From );  // explicit cast conversion | 
|---|
| 19 | void ?{safe} ( To*, From );      // implicit safe conversion | 
|---|
| 20 | void ?{unsafe} ( To*, From );    // implicit unsafe conversion | 
|---|
| 21 |  | 
|---|
| 22 | [1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html | 
|---|
| 23 |  | 
|---|
| 24 | Glen's design made no distinction between constructors and unsafe implicit | 
|---|
| 25 | conversions; this is elegant, but interacts poorly with tuples. | 
|---|
| 26 | Essentially, without making this distinction, a constructor like the following | 
|---|
| 27 | would add an interpretation of any two `int`s as a `Coord`, needlessly | 
|---|
| 28 | multiplying the space of possible interpretations of all functions: | 
|---|
| 29 |  | 
|---|
| 30 | void ?{}( Coord *this, int x, int y ); | 
|---|
| 31 |  | 
|---|
| 32 | That said, it would certainly be possible to make a multiple-argument implicit | 
|---|
| 33 | conversion, as below, though the argument above suggests this option should be | 
|---|
| 34 | used infrequently: | 
|---|
| 35 |  | 
|---|
| 36 | void ?{unsafe}( Coord *this, int x, int y ); | 
|---|
| 37 |  | 
|---|
| 38 | An alternate possibility would be to only count two-arg constructors | 
|---|
| 39 | `void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and | 
|---|
| 40 | explicit conversions should also have a compiler-enforced restriction to | 
|---|
| 41 | ensure that they are two-arg functions (this restriction may be valuable | 
|---|
| 42 | regardless). | 
|---|
| 43 |  | 
|---|
| 44 | Regardless of syntax, there should be a type assertion that expresses `From` | 
|---|
| 45 | is convertable to `To`. | 
|---|
| 46 | If user-defined conversions are not added to the language, | 
|---|
| 47 | `void ?{} ( To*, From )` may be a suitable representation, relying on | 
|---|
| 48 | conversions on the argument types to account for transitivity. | 
|---|
| 49 | On the other hand, `To*` should perhaps match its target type exactly, so | 
|---|
| 50 | another assertion syntax specific to conversions may be required, e.g. | 
|---|
| 51 | `From -> To`. | 
|---|
| 52 |  | 
|---|
| 53 | ### Constructor Idiom ### | 
|---|
| 54 | Basing our notion of conversions off otherwise normal Cforall functions means | 
|---|
| 55 | that we can use the full range of Cforall features for conversions, including | 
|---|
| 56 | polymorphism. | 
|---|
| 57 | Glen [1] defines a _constructor idiom_ that can be used to create chains of | 
|---|
| 58 | safe conversions without duplicating code; given a type `Safe` which members | 
|---|
| 59 | of another type `From` can be directly converted to, the constructor idiom | 
|---|
| 60 | allows us to write a conversion for any type `To` which `Safe` converts to: | 
|---|
| 61 |  | 
|---|
| 62 | forall(otype To | { void ?{safe}( To*, Safe ) }) | 
|---|
| 63 | void ?{safe}( To *this, From that ) { | 
|---|
| 64 | Safe tmp = /* some expression involving that */; | 
|---|
| 65 | *this = tmp; // uses assertion parameter | 
|---|
| 66 | } | 
|---|
| 67 |  | 
|---|
| 68 | This idiom can also be used with only minor variations for a parallel set of | 
|---|
| 69 | unsafe conversions. | 
|---|
| 70 |  | 
|---|
| 71 | What selective non-use of the constructor idiom gives us is the ability to | 
|---|
| 72 | define a conversion that may only be the *last* conversion in a chain of such. | 
|---|
| 73 | Constructing a conversion graph able to unambiguously represent the full | 
|---|
| 74 | hierarchy of implicit conversions in C is provably impossible using only | 
|---|
| 75 | single-step conversions with no additional information (see Appendix B), but | 
|---|
| 76 | this mechanism is sufficiently powerful (see [1], though the design there has | 
|---|
| 77 | some minor bugs; the general idea is to use the constructor idiom to define | 
|---|
| 78 | two chains of conversions, one among the signed integral types, another among | 
|---|
| 79 | the unsigned, and to use monomorphic conversions to allow conversions between | 
|---|
| 80 | signed and unsigned integer types). | 
|---|
| 81 |  | 
|---|
| 82 | ### Implementation Details ### | 
|---|
| 83 | It is desirable to have a system which can be efficiently implemented, yet | 
|---|
| 84 | also to have one which has sufficient power to distinguish between functions | 
|---|
| 85 | on all possible axes of polymorphism. | 
|---|
| 86 | This ordering may be a partial order, which may complicate implementation | 
|---|
| 87 | somewhat; in this case it may be desirable to store the set of implementations | 
|---|
| 88 | for a given function as the directed acyclic graph (DAG) representing the | 
|---|
| 89 | order. | 
|---|
| 90 |  | 
|---|
| 91 | ## Conversion Costs ## | 
|---|
| 92 | Each possible resolution of an expression has a _cost_ tuple consisting of | 
|---|
| 93 | the following components: _unsafe_ conversion cost, _polymorphic_ | 
|---|
| 94 | specialization cost, _safe_ conversion cost, a count of _explicit_ | 
|---|
| 95 | conversions, and _qualifier_ conversion cost. | 
|---|
| 96 | These components are lexically-ordered and can be summed element-wise; | 
|---|
| 97 | summation starts at `(0, 0, 0, 0, 0)`. | 
|---|
| 98 |  | 
|---|
| 99 | ### Lvalue and Qualifier Conversions ### | 
|---|
| 100 | C defines the notion of a _lvalue_, essentially an addressable object, as well | 
|---|
| 101 | as a number of type _qualifiers_, `const`, `volatile`, and `restrict`. | 
|---|
| 102 | As these type qualifiers are generally only meaningful to the type system as | 
|---|
| 103 | applied to lvalues, the two concepts are closely related. | 
|---|
| 104 | A const lvalue cannot be modified, the compiler cannot assume that a volatile | 
|---|
| 105 | lvalue will not be concurrently modified by some other part of the system, and | 
|---|
| 106 | a restrict lvalue must have pointer type, and the compiler may assume that no | 
|---|
| 107 | other pointer in scope aliases that pointer (this is solely a performance | 
|---|
| 108 | optimization, and may be ignored by implementers). | 
|---|
| 109 | _Lvalue-to-rvalue conversion_, which takes an lvalue of type `T` and converts | 
|---|
| 110 | it to an expression result of type `T` (commonly called an _rvalue_ of type | 
|---|
| 111 | `T`) also strips all the qualifiers from the lvalue, as an expression result | 
|---|
| 112 | is a value, not an addressable object that can have properties like | 
|---|
| 113 | immutability. | 
|---|
| 114 | Though lvalue-to-rvalue conversion strips the qualifiers from lvalues, | 
|---|
| 115 | derived rvalue types such as pointer types may include qualifiers; | 
|---|
| 116 | `const int *` is a distinct type from `int *`, though the latter is safely | 
|---|
| 117 | convertable to the former. | 
|---|
| 118 | In general, any number of qualifiers can be safely added to the | 
|---|
| 119 | pointed-to-type of a pointer type, e.g. `int *` converts safely to | 
|---|
| 120 | `const int *` and `volatile int *`, both of which convert safely to | 
|---|
| 121 | `const volatile int *`. | 
|---|
| 122 |  | 
|---|
| 123 | Since lvalues are precicely "addressable objects", in C, only lvalues can be | 
|---|
| 124 | used as the operand of the `&` address-of operator. | 
|---|
| 125 | Similarly, only modifiable lvalues may be used as the assigned-to | 
|---|
| 126 | operand of the mutating operators: assignment, compound assignment | 
|---|
| 127 | (e.g. `+=`), and increment and decrement; roughly speaking, lvalues without | 
|---|
| 128 | the `const` qualifier are modifiable, but lvalues of incomplete types, array | 
|---|
| 129 | types, and struct or union types with const members are also not modifiable. | 
|---|
| 130 | Lvalues are produced by the following expressions: object identifiers | 
|---|
| 131 | (function identifiers are not considered to be lvalues), the result of the `*` | 
|---|
| 132 | dereference operator applied to an object pointer, the result of a member | 
|---|
| 133 | expression `s.f` if the left argument `s` is an lvalue (note that the | 
|---|
| 134 | preceding two rules imply that the result of indirect member expressions | 
|---|
| 135 | `s->f` are always lvalues, by desugaring to `(*s).f`), and the result of the | 
|---|
| 136 | indexing operator `a[i]` (similarly by its desugaring to `*((a)+(i))`). | 
|---|
| 137 | Somewhat less obviously, parenthesized lvalue expressions, string literals, | 
|---|
| 138 | and compound literals (e.g. `(struct foo){ 'x', 3.14, 42 }`) are also lvalues. | 
|---|
| 139 |  | 
|---|
| 140 | All of the conversions described above are defined in standard C, but Cforall | 
|---|
| 141 | requires further features from its type system. | 
|---|
| 142 | In particular, to allow overloading of the `*?` and `?[?]` dereferencing and | 
|---|
| 143 | indexing operators, Cforall requires a way to declare that the functions | 
|---|
| 144 | defining these operators return lvalues, and since C functions never return | 
|---|
| 145 | lvalues and for syntactic reasons we wish to distinguish functions which | 
|---|
| 146 | return lvalues from functions which return pointers, this is of necessity an | 
|---|
| 147 | extension to standard C. | 
|---|
| 148 | In the current design, an `lvalue` qualifier can be added to function return | 
|---|
| 149 | types (and only to function return types), the effect of which is to return a | 
|---|
| 150 | pointer which is implicitly dereferenced by the caller. | 
|---|
| 151 | C++ includes the more general concept of _references_, which are typically | 
|---|
| 152 | implemented as implicitly dereferenced pointers as well. | 
|---|
| 153 | Another use case which C++ references support is providing a way to pass | 
|---|
| 154 | function parameters by reference (rather than by value) with a natural | 
|---|
| 155 | syntax; Cforall in its current state has no such mechanism. | 
|---|
| 156 | As an example, consider the following (currently typical) copy-constructor | 
|---|
| 157 | signature and call: | 
|---|
| 158 |  | 
|---|
| 159 | void ?{}(T *lhs, T rhs); | 
|---|
| 160 |  | 
|---|
| 161 | T x; | 
|---|
| 162 | T y = { x }; | 
|---|
| 163 |  | 
|---|
| 164 | Note that the right-hand argument is passed by value, and would in fact be | 
|---|
| 165 | copied twice in the course of the constructor call `T y = { x };` (once into | 
|---|
| 166 | the parameter by C's standard `memcpy` semantics, once again in the body of | 
|---|
| 167 | the copy constructor, though it is possible that return value optimization | 
|---|
| 168 | will elide the `memcpy`-style copy). | 
|---|
| 169 | However, to pass by reference using the existing pointer syntax, the example | 
|---|
| 170 | above would look like this: | 
|---|
| 171 |  | 
|---|
| 172 | void ?{}(T *lhs, const T *rhs); | 
|---|
| 173 |  | 
|---|
| 174 | T x; | 
|---|
| 175 | T y = { &x }; | 
|---|
| 176 |  | 
|---|
| 177 | This example is not even as bad as it could be; assuming pass-by-reference is | 
|---|
| 178 | the desired semantics for the `?+?` operator, that implies the following | 
|---|
| 179 | design today: | 
|---|
| 180 |  | 
|---|
| 181 | T ?+?(const T *lhs, const T *rhs); | 
|---|
| 182 |  | 
|---|
| 183 | T a, b; | 
|---|
| 184 | T c = &a + &b, | 
|---|
| 185 |  | 
|---|
| 186 | In addition to `&a + &b` being unsightly and confusing syntax to add `a` and | 
|---|
| 187 | `b`, it also introduces a possible ambiguity with pointer arithmetic on `T*` | 
|---|
| 188 | which can only be resolved by return-type inference. | 
|---|
| 189 |  | 
|---|
| 190 | Pass-by-reference and marking functions as returning lvalues instead of the | 
|---|
| 191 | usual rvalues are actually closely related concepts, as obtaining a reference | 
|---|
| 192 | to pass depends on the referenced object being addressable, i.e. an lvalue, | 
|---|
| 193 | and lvalue return types are effectively return-by-reference. | 
|---|
| 194 | Cforall should also unify the concepts, with a parameterized type for | 
|---|
| 195 | "reference to `T`", which I will write `ref T`. | 
|---|
| 196 | Syntax bikeshedding can be done later (there are some examples at the bottom | 
|---|
| 197 | of this section), but `ref T` is sufficiently distinct from both the existing | 
|---|
| 198 | `lvalue T` (which it subsumes) and the closely related C++ `T&` to allow | 
|---|
| 199 | independent discussion of its semantics. | 
|---|
| 200 |  | 
|---|
| 201 | Firstly, assignment to a function parameter as part of a function call and | 
|---|
| 202 | local variable initialization have almost identical semantics, so should be | 
|---|
| 203 | treated similarly for the reference type too; this implies we should be able | 
|---|
| 204 | to declare local variables of reference type, as in the following: | 
|---|
| 205 |  | 
|---|
| 206 | int x = 42; | 
|---|
| 207 | ref int r = x; // r is now an alias for x | 
|---|
| 208 |  | 
|---|
| 209 | Unlike in C++, we would like to have the capability to re-bind references | 
|---|
| 210 | after initialization, as this allows the attractive syntax of references to | 
|---|
| 211 | support some further useful code patterns, such as first initializing a | 
|---|
| 212 | reference after its declaration. | 
|---|
| 213 | Constant references to `T` (`const ref T`) should not be re-bindable. | 
|---|
| 214 |  | 
|---|
| 215 | One option for re-binding references is to use a dedicated operator, as in the | 
|---|
| 216 | code example below: | 
|---|
| 217 |  | 
|---|
| 218 | int i = 42, j = 7; | 
|---|
| 219 | ref int r = i; // bind r to i | 
|---|
| 220 | r = j;         // set i (== r) to 7 | 
|---|
| 221 | r := j;        // rebind r to j using the new := rebind operator | 
|---|
| 222 | i = 42;        // reset i (!= r) to 42 | 
|---|
| 223 | assert( r == 7 ); | 
|---|
| 224 |  | 
|---|
| 225 | The other syntactic option for reference re-bind would be to overload | 
|---|
| 226 | assignment and use type inference on the left and right-hand sides to | 
|---|
| 227 | determine whether the referred-to variable on the left should be reassigned to | 
|---|
| 228 | the value on the right, or if the reference on the left should be aliased to | 
|---|
| 229 | the reference on the right. | 
|---|
| 230 | This could be disambiguated with casts, as in the following code example: | 
|---|
| 231 |  | 
|---|
| 232 | int i | 
|---|
| 233 | int j; | 
|---|
| 234 | ref int r = i;   // (0a) | 
|---|
| 235 | ref int s = i;   // (0b) | 
|---|
| 236 |  | 
|---|
| 237 | i = j;           // (1) | 
|---|
| 238 | i = (int)s;      // (2) | 
|---|
| 239 | i = s;           // (3) | 
|---|
| 240 | // --------------------- | 
|---|
| 241 | r = s;           // (4) | 
|---|
| 242 | r = (ref int)j;  // (5) | 
|---|
| 243 | // --------------------- | 
|---|
| 244 | r = j;           // (6) | 
|---|
| 245 | r = (int)s;      // (7) | 
|---|
| 246 |  | 
|---|
| 247 | By the expected aliasing syntax, (0a) and (0b) are initializing `r` and `s` as | 
|---|
| 248 | aliases for `i`. | 
|---|
| 249 | For C compatibility, (1) has to be assignment; in general, any assignment to a | 
|---|
| 250 | non-reference type should be assignment, so (2) and (3) are as well. | 
|---|
| 251 | By types, (4) and (5) should have the same semantics, and the semantics of (6) | 
|---|
| 252 | and (7) should match as well. | 
|---|
| 253 | This suggests that (4) and (5) are reference re-bind, and (6) and (7) are an | 
|---|
| 254 | assignment to the referred variable; this makes the syntax to explicitly alias | 
|---|
| 255 | a local variable rather ugly (and inconsistent with the initialization | 
|---|
| 256 | syntax), as well as making it rather awkward to copy the value stored in one | 
|---|
| 257 | reference-type variable into another reference type variable (which is likely | 
|---|
| 258 | more painful in functions with by-reference parameters than with local | 
|---|
| 259 | variables of reference type). | 
|---|
| 260 |  | 
|---|
| 261 | Because of the aforementioned issues with overloading assignment as reference | 
|---|
| 262 | rebind, in addition to the fact that reference rebind should not be a | 
|---|
| 263 | user-overloadable operator (unlike assignment), I propose refererence rebind | 
|---|
| 264 | should have its own dedicated operator. | 
|---|
| 265 |  | 
|---|
| 266 | The semantics and restrictions of `ref T` are effectively the semantics of an | 
|---|
| 267 | lvalue of type `T`, and by this analogy there should be a safe, qualifier | 
|---|
| 268 | dropping conversion from `ref const volatile restrict T` (and every other | 
|---|
| 269 | qualifier combination on the `T` in `ref T`) to `T`. | 
|---|
| 270 | With this conversion, the resolver may type most expressions that C would | 
|---|
| 271 | call "lvalue of type `T`" as `ref T`. | 
|---|
| 272 | There's also an obvious argument that lvalues of a (possibly-qualified) type | 
|---|
| 273 | `T` should be convertable to references of type `T`, where `T` is also | 
|---|
| 274 | so-qualified (e.g. lvalue `int` to `ref int`, lvalue `const char` to | 
|---|
| 275 | `ref const char`). | 
|---|
| 276 | By similar arguments to pointer types, qualifiers should be addable to the | 
|---|
| 277 | referred-to type of a reference (e.g. `ref int` to `ref const int`). | 
|---|
| 278 | As a note, since pointer arithmetic is explictly not defined on `ref T`, | 
|---|
| 279 | `restrict ref T` should be allowable and would have alias-analysis rules that | 
|---|
| 280 | are actually comprehensible to mere mortals. | 
|---|
| 281 |  | 
|---|
| 282 | Using pass-by-reference semantics for function calls should not put syntactic | 
|---|
| 283 | constraints on how the function is called; particularly, temporary values | 
|---|
| 284 | should be able to be passed by reference. | 
|---|
| 285 | The mechanism for this pass-by-reference would be to store the value of the | 
|---|
| 286 | temporary expression into a new unnamed temporary, and pass the reference of | 
|---|
| 287 | that temporary to the function. | 
|---|
| 288 | As an example, the following code should all compile and run: | 
|---|
| 289 |  | 
|---|
| 290 | void f(ref int x) { printf("%d\n", x++); } | 
|---|
| 291 |  | 
|---|
| 292 | int i = 7, j = 11; | 
|---|
| 293 | const int answer = 42; | 
|---|
| 294 |  | 
|---|
| 295 | f(i);      // (1) | 
|---|
| 296 | f(42);     // (2) | 
|---|
| 297 | f(i + j);  // (3) | 
|---|
| 298 | f(answer); // (4) | 
|---|
| 299 |  | 
|---|
| 300 | The semantics of (1) are just like C++'s, "7" is printed, and `i` has the | 
|---|
| 301 | value 8 afterward. | 
|---|
| 302 | For (2), "42" is printed, and the increment of the unnamed temporary to 43 is | 
|---|
| 303 | not visible to the caller; (3) behaves similarly, printing "19", but not | 
|---|
| 304 | changing `i` or `j`. | 
|---|
| 305 | (4) is a bit of an interesting case; we want to be able to support named | 
|---|
| 306 | constants like `answer` that can be used anywhere the constant expression | 
|---|
| 307 | they're replacing (like `42`) could go; in this sense, (4) and (2) should have | 
|---|
| 308 | the same semantics. | 
|---|
| 309 | However, we don't want the mutation to the `x` parameter to be visible in | 
|---|
| 310 | `answer` afterward, because `answer` is a constant, and thus shouldn't change. | 
|---|
| 311 | The solution to this is to allow chaining of the two `ref` conversions; | 
|---|
| 312 | `answer` has the type `ref const int`, which can be converted to `int` by the | 
|---|
| 313 | lvalue-to-rvalue conversion (which drops the qualifiers), then up to `ref int` | 
|---|
| 314 | by the temporary-producing rvalue-to-lvalue conversion. | 
|---|
| 315 | Thus, an unnamed temporary is inserted, initialized to `answer` (i.e. 42), | 
|---|
| 316 | mutated by `f`, then discarded; "42" is printed, just as in case (2), and | 
|---|
| 317 | `answer` still equals 42 after the call, because it was the temporary that was | 
|---|
| 318 | mutated, not `answer`. | 
|---|
| 319 | It may be somewhat surprising to C++ programmers that `f(i)` mutates `i` while | 
|---|
| 320 | `f(answer)` does not mutate `answer` (though `f(answer)` would be illegal in | 
|---|
| 321 | C++, leading to the dreaded "const hell"), but the behaviour of this rule can | 
|---|
| 322 | be determined by examining local scope with the simple rule "non-`const` | 
|---|
| 323 | references to `const` variables produce temporaries", which aligns with | 
|---|
| 324 | programmer intuition that `const` variables cannot be mutated. | 
|---|
| 325 |  | 
|---|
| 326 | To bikeshed syntax for `ref T`, there are three basic options: language | 
|---|
| 327 | keywords (`lvalue T` is already in Cforall), compiler-supported "special" | 
|---|
| 328 | generic types (e.g. `ref(T)`), or sigils (`T&` is familiar to C++ | 
|---|
| 329 | programmers). | 
|---|
| 330 | Keyword or generic based approaches run the risk of name conflicts with | 
|---|
| 331 | existing code, while any sigil used would have to be carefully chosen to not | 
|---|
| 332 | create parsing conflicts. | 
|---|
| 333 |  | 
|---|
| 334 | **TODO** Consider arguments for move semantics and see if there is a | 
|---|
| 335 | compelling case for rvalue references. | 
|---|
| 336 |  | 
|---|
| 337 | ### Conversion Operator Costs ### | 
|---|
| 338 | Copy constructors, safe conversions, and unsafe conversions all have an | 
|---|
| 339 | associated conversion cost, calculated according to the algorithm below: | 
|---|
| 340 |  | 
|---|
| 341 | 1. Monomorphic copy constructors have a conversion cost of `(0, 0, 0, 0)` | 
|---|
| 342 | 2. Monomorphic safe conversions have a conversion cost of `(0, 0, 1, 1)` | 
|---|
| 343 | 3. Monomoprhic unsafe conversions have a conversion cost of `(1, 0, 0, 1)` | 
|---|
| 344 | 4. Polymorphic conversion operators (or copy constructors) have a conversion | 
|---|
| 345 | cost of `(0, 1, 0, 1)` plus the conversion cost of their monomorphic | 
|---|
| 346 | equivalent and the sum of the conversion costs of all conversion operators | 
|---|
| 347 | passed as assertion parameters, but where the fourth "count" element of the | 
|---|
| 348 | cost tuple is fixed to `1`. | 
|---|
| 349 |  | 
|---|
| 350 | **TODO** Polymorphism cost may need to be reconsidered in the light of the | 
|---|
| 351 | thoughts on polymorphism below. | 
|---|
| 352 | **TODO** You basically just want path-length in the conversion graph implied | 
|---|
| 353 | by the set of conversions; the only tricky question is whether or not you can | 
|---|
| 354 | account for "mixed" safe and unsafe conversions used to satisfy polymorphic | 
|---|
| 355 | constraints, whether a polymorphic conversion should cost more than a | 
|---|
| 356 | monomorphic one, and whether to account for non-conversion constraints in the | 
|---|
| 357 | polymorphism cost | 
|---|
| 358 |  | 
|---|
| 359 | ### Argument-Parameter Matching ### | 
|---|
| 360 | Given a function `f` with an parameter list (after tuple flattening) | 
|---|
| 361 | `(T1 t1, T2 t2, ... Tn tn)`, and a function application | 
|---|
| 362 | `f(<e1>, <e2>, ... <em>)`, the cost of matching each argument to the | 
|---|
| 363 | appropriate parameter is calculated according to the algorithm below: | 
|---|
| 364 |  | 
|---|
| 365 | Given a parameter `t` of type `T` and an expression `<e>` from these lists, | 
|---|
| 366 | `<e>` will have a set of interpretations of types `E1, E2, ... Ek` with | 
|---|
| 367 | associated costs `(u1, p1, s1, c1), (u2, p2, s2, c2), ... (uk, pk, sk, ck)`. | 
|---|
| 368 | (If any `Ei` is a tuple type, replace it with its first flattened element for | 
|---|
| 369 | the purposes of this section.) | 
|---|
| 370 |  | 
|---|
| 371 | The cost of matching the interpretation of `<e>` with type `Ei` to `t1` with | 
|---|
| 372 | type `T` is the sum of the interpretation cost `(ui, pi, si, ci)` and the | 
|---|
| 373 | conversion operator cost from `Ei` to `T`. | 
|---|
| 374 |  | 
|---|
| 375 | ### Object Initialization ### | 
|---|
| 376 | The cost to initialize an object is calculated very similarly to | 
|---|
| 377 | argument-parameter matching, with a few modifications. | 
|---|
| 378 | Firstly, explicit constructors are included in the set of available | 
|---|
| 379 | conversions, with conversion cost `(0, 0, 0, 1)` plus associated polymorphic | 
|---|
| 380 | conversion costs (if applicable) and the _interpretation cost_ of the | 
|---|
| 381 | constructor, the sum of the argument-parameter matching costs for its | 
|---|
| 382 | parameters. | 
|---|
| 383 | Also, ties in overall cost (interpretation cost plus conversion cost) are | 
|---|
| 384 | broken by lowest conversion cost (i.e. of alternatives with the same overall | 
|---|
| 385 | cost, copy constructors are preferred to other explicit constructors, | 
|---|
| 386 | explicit constructors are preferred to safe conversions, which are preferred | 
|---|
| 387 | to unsafe conversions). | 
|---|
| 388 | An object initialization is properly typed if it has exactly one min-cost | 
|---|
| 389 | interpretation. | 
|---|
| 390 |  | 
|---|
| 391 | ### Explicit Casts ### | 
|---|
| 392 | Explicit casts are handled similarly to object initialization. | 
|---|
| 393 | Copy constructors and other explicit constructors are not included in the set | 
|---|
| 394 | of possible conversions, though interpreting a cast as type ascription | 
|---|
| 395 | (`(T)e`, meaning the interpretation of `e` as type `T`) has conversion cost | 
|---|
| 396 | `(0, 0, 0, 0)`. | 
|---|
| 397 | Explicit conversion operators are also included in the set of possible | 
|---|
| 398 | conversions, with cost `(0, 0, 0, 1)` plus whatever polymorphic conversion | 
|---|
| 399 | costs are invoked. | 
|---|
| 400 | Unlike for explicit constructors and other functions, implicit conversions are | 
|---|
| 401 | never applied to the argument or return type of an explicit cast operator, so | 
|---|
| 402 | that the cast may be used more effectively as a method for the user programmer | 
|---|
| 403 | to guide type resolution. | 
|---|
| 404 |  | 
|---|
| 405 | ## Trait Satisfaction ## | 
|---|
| 406 | A _trait_ consists of a list of _type variables_ along with a (possibly empty) | 
|---|
| 407 | set of _assertions_ on those variables. | 
|---|
| 408 | Assertions can take two forms, _variable assertions_ and the more common | 
|---|
| 409 | _function assertions_, as in the following example: | 
|---|
| 410 |  | 
|---|
| 411 | trait a_trait(otype T, otype S) { | 
|---|
| 412 | T a_variable_assertion; | 
|---|
| 413 | S* another_variable_assertion; | 
|---|
| 414 | S a_function_assertion( T* ); | 
|---|
| 415 | }; | 
|---|
| 416 |  | 
|---|
| 417 | Variable assertions enforce that a variable with the given name and type | 
|---|
| 418 | exists (the type is generally one of the type variables, or derived from one), | 
|---|
| 419 | while a function assertion enforces that a function with a | 
|---|
| 420 | _compatible signature_ to the provided function exists. | 
|---|
| 421 |  | 
|---|
| 422 | To test if some list of types _satisfy_ the trait, the types are first _bound_ | 
|---|
| 423 | to the type variables, and then declarations to satisfy each assertion are | 
|---|
| 424 | sought out. | 
|---|
| 425 | Variable assertions require an exact match, because they are passed as object | 
|---|
| 426 | pointers, and there is no mechanism to employ conversion functions, while | 
|---|
| 427 | function assertions only require a function that can be wrapped to a | 
|---|
| 428 | compatible type; for example, the declarations below satisfy | 
|---|
| 429 | `a_trait(int, short)`: | 
|---|
| 430 |  | 
|---|
| 431 | int a_variable_assertion; | 
|---|
| 432 | short* another_variable_assertion; | 
|---|
| 433 | char a_function_assertion( void* ); | 
|---|
| 434 | // int* may be implicitly converted to void*, and char to short, so the | 
|---|
| 435 | // above works | 
|---|
| 436 |  | 
|---|
| 437 | Cforall Polymorphic functions have a _constraining trait_, denoted as follows: | 
|---|
| 438 |  | 
|---|
| 439 | forall(otype A, otype B | some_trait(A, B)) | 
|---|
| 440 |  | 
|---|
| 441 | The trait may be anonymous, with the same syntax as a trait declaration, and | 
|---|
| 442 | may be unioned together using `|` or `,`. | 
|---|
| 443 |  | 
|---|
| 444 | **TODO** Consider including field assertions in the list of constraint types, | 
|---|
| 445 | also associated types and the appropriate matching type assertion. | 
|---|
| 446 |  | 
|---|
| 447 | ## Polymorphism Costs ## | 
|---|
| 448 | The type resolver should prefer functions that are "less polymorphic" to | 
|---|
| 449 | functions that are "more polymorphic". | 
|---|
| 450 | Determining how to order functions by degree of polymorphism is somewhat less | 
|---|
| 451 | straightforward, though, as there are multiple axes of polymorphism and it is | 
|---|
| 452 | not always clear how they compose. | 
|---|
| 453 | The natural order for degree of polymorphism is a partial order, and this | 
|---|
| 454 | section includes some open questions on whether it is desirable or feasible to | 
|---|
| 455 | develop a tie-breaking strategy to impose a total order on the degree of | 
|---|
| 456 | polymorphism of functions. | 
|---|
| 457 | Helpfully, though, the degree of polymorphism is a property of functions | 
|---|
| 458 | rather than function calls, so any complicated graph structure or calculation | 
|---|
| 459 | representing a (partial) order over function degree of polymorphism can be | 
|---|
| 460 | calculated once and cached. | 
|---|
| 461 |  | 
|---|
| 462 | ### Function Parameters ### | 
|---|
| 463 | All other things being equal, if a parameter of one function has a concrete | 
|---|
| 464 | type and the equivalent parameter of another function has a dynamic type, the | 
|---|
| 465 | first function is less polymorphic: | 
|---|
| 466 |  | 
|---|
| 467 | void f( int, int );  // (0) least polymorphic | 
|---|
| 468 | forall(otype T) void f( T, int );    // (1a) more polymorphic than (0) | 
|---|
| 469 | forall(otype T) void f( int, T );    // (1b) more polymorphic than (0) | 
|---|
| 470 | //      incomparable with (1a) | 
|---|
| 471 | forall(otype T) void f( T, T );      // (2) more polymorphic than (1a/b) | 
|---|
| 472 |  | 
|---|
| 473 | This should extend to parameterized types (pointers and generic types) also: | 
|---|
| 474 |  | 
|---|
| 475 | forall(otype S) struct box { S val; }; | 
|---|
| 476 |  | 
|---|
| 477 | forall(otype T) void f( T, T* );       // (3) less polymorphic than (2) | 
|---|
| 478 | forall(otype T) void f( T, T** );      // (4) less polymorphic than (3) | 
|---|
| 479 | forall(otype T) void f( T, box(T) );   // (5) less polymorphic than (2) | 
|---|
| 480 | //     incomparable with (3) | 
|---|
| 481 | forall(otype T) void f( T, box(T*) );  // (6) less polymorphic than (5) | 
|---|
| 482 |  | 
|---|
| 483 | Every function in the group above is incomparable with (1a/b), but that's fine | 
|---|
| 484 | because an `int` isn't a pointer or a `box`, so the ambiguity shouldn't occur | 
|---|
| 485 | much in practice (unless there are safe or unsafe conversions defined between | 
|---|
| 486 | the possible argument types). | 
|---|
| 487 |  | 
|---|
| 488 | For degree of polymorphism from arguments, I think we should not distinguish | 
|---|
| 489 | between different type parameters, e.g. the following should be considered | 
|---|
| 490 | equally polymorphic: | 
|---|
| 491 |  | 
|---|
| 492 | forall(otype T, otype S) void f( T, T, S );  // (7) | 
|---|
| 493 | forall(otype T, otype S) void f( S, T, T );  // (8) | 
|---|
| 494 |  | 
|---|
| 495 | However parameter lists are compared, parameters of multi-parameter generic | 
|---|
| 496 | types should ideally be treated as a recursive case, e.g. in the example | 
|---|
| 497 | below, (9) is less polymorphic than (10), which is less polymorphic than (11): | 
|---|
| 498 |  | 
|---|
| 499 | forall(otype T, otype S) struct pair { T x; S y; }; | 
|---|
| 500 |  | 
|---|
| 501 | void f( pair(int, int) );  // (9) | 
|---|
| 502 | forall(otype T) void f( pair(T, int) );    // (10) | 
|---|
| 503 | forall(otype T) void f( pair(T, T) );      // (11) | 
|---|
| 504 |  | 
|---|
| 505 | Parameter comparison could possibly be made somewhat cheaper at loss of some | 
|---|
| 506 | precision by representing each parameter as a value from the natural numbers | 
|---|
| 507 | plus infinity, where infinity represents a monomorphic parameter and a finite | 
|---|
| 508 | number counts how many levels deep the shallowest type variable is, e.g. where | 
|---|
| 509 | `T` is a type variable, `int` would have value infinity, `T` would have value | 
|---|
| 510 | 0, `T*` would have value 1, `box(T)*` would have value 2, etc. | 
|---|
| 511 | Under this scheme, higher values represent less polymorphism. | 
|---|
| 512 | This makes the partial order on parameters a total order, so that many of the | 
|---|
| 513 | incomparable functions above compare equal, though that is perhaps a virtue. | 
|---|
| 514 | It also loses the ability to differentiate between some multi-parameter | 
|---|
| 515 | generic types, such as the parameters in (10) and (11), which would both be | 
|---|
| 516 | valued 1, losing the polymorphism distinction between them. | 
|---|
| 517 |  | 
|---|
| 518 | A variant of the above scheme would be to fix a maximum depth of polymorphic | 
|---|
| 519 | type variables (16 seems like a reasonable choice) at which a parameter would | 
|---|
| 520 | be considered to be effectively monomorphic, and to subtract the value | 
|---|
| 521 | described above from that maximum, clamping the result to a minimum of 0. | 
|---|
| 522 | Under this scheme, assuming a maximum value of 4, `int` has value 0, `T` has | 
|---|
| 523 | value 4, `T*` has value 3, `box(T)*` has value 2, and `box(T*)**` has value 0, | 
|---|
| 524 | the same as `int`. | 
|---|
| 525 | This can be quite succinctly represented, and summed without the presence of a | 
|---|
| 526 | single monomorphic parameter pushing the result to infinity, but does lose the | 
|---|
| 527 | ability to distinguish between very deeply structured polymorphic types. | 
|---|
| 528 |  | 
|---|
| 529 | ### Parameter Lists ### | 
|---|
| 530 | A partial order on function parameter lists can be produced by the | 
|---|
| 531 | product order of the partial orders on parameters described above. | 
|---|
| 532 | In more detail, this means that for two parameter lists with the same arity, | 
|---|
| 533 | if any pair of corresponding parameters are incomparable with respect to each | 
|---|
| 534 | other, the two parameter lists are incomparable; if in all pairs of | 
|---|
| 535 | corresponding parameters one list's parameter is always (less than or) equal | 
|---|
| 536 | to the other list's parameter than the first parameter list is (less than or) | 
|---|
| 537 | equal to the second parameter list; otherwise the lists are incomparable with | 
|---|
| 538 | respect to each other. | 
|---|
| 539 |  | 
|---|
| 540 | How to compare parameter lists of different arity is a somewhat open question. | 
|---|
| 541 | A simple, but perhaps somewhat unsatisfying, solution would be just to say | 
|---|
| 542 | that such lists are incomparable. | 
|---|
| 543 | The simplist approach to make them comparable is to say that, given two lists | 
|---|
| 544 | `(T1, T2, ... Tn)` and `(S1, S2, ... Sm)`, where `n <= m`, the parameter lists | 
|---|
| 545 | can be compared based on their shared prefix of `n` types. | 
|---|
| 546 | This approach breaks the transitivity property of the equivalence relation on | 
|---|
| 547 | the partial order, though, as seen below: | 
|---|
| 548 |  | 
|---|
| 549 | forall(otype T) void f( T, int );       // (1a) | 
|---|
| 550 | forall(otype T) void f( T, int, int );  // (12) | 
|---|
| 551 | forall(otype T) void f( T, int, T );    // (13) | 
|---|
| 552 |  | 
|---|
| 553 | By this rule, (1a) is equally polymorphic to both (12) and (13), so by | 
|---|
| 554 | transitivity (12) and (13) should also be equally polymorphic, but that is not | 
|---|
| 555 | actually the case. | 
|---|
| 556 |  | 
|---|
| 557 | We can fix the rule by saying that `(T1 ... Tn)` can be compared to | 
|---|
| 558 | `(S1 ... Sm)` by _extending_ the list of `T`s to `m` types by inserting | 
|---|
| 559 | notional monomorphic parameters. | 
|---|
| 560 | In this case, (1a) and (12) are equally polymorphic, because (1a) gets | 
|---|
| 561 | extended with a monomorphic type that compares equal to (12)'s third `int` | 
|---|
| 562 | parameter, but (1a) is less polymorphic than (13), because its notional | 
|---|
| 563 | monomorphic third parameter is less polymorphic than (13)'s `T`. | 
|---|
| 564 | Essentially what this rule says is that any parameter list with more | 
|---|
| 565 | parameters is no less polymorphic than one with fewer. | 
|---|
| 566 |  | 
|---|
| 567 | We could collapse this parameter list ordering to a succinct total order by | 
|---|
| 568 | simply taking the sum of the clamped parameter polymorphism counts, but this | 
|---|
| 569 | would again make most incomparable parameter lists compare equal, as well as | 
|---|
| 570 | having the potential for some unexpected results based on the (completely | 
|---|
| 571 | arbitrary) value chosen for "completely polymorphic". | 
|---|
| 572 | For instance, if we set 4 to be the maximum depth of polymorphism (as above), | 
|---|
| 573 | the following functions would be equally polymorphic, which is a somewhat | 
|---|
| 574 | unexpected result: | 
|---|
| 575 |  | 
|---|
| 576 | forall(otype T) void g( T, T, T, int );    // 4 + 4 + 4 + 0 = 12 | 
|---|
| 577 | forall(otype T) void g( T*, T*, T*, T* );  // 3 + 3 + 3 + 3 = 12 | 
|---|
| 578 |  | 
|---|
| 579 | These functions would also be considered equally polymorphic: | 
|---|
| 580 |  | 
|---|
| 581 | forall(otype T) void g( T, int );    // 4 + 0 = 4; | 
|---|
| 582 | forall(otype T) void g( T**, T** );  // 2 + 2 = 4; | 
|---|
| 583 |  | 
|---|
| 584 | This issue can be mitigated by choosing a larger maximum depth of | 
|---|
| 585 | polymorphism, but this scheme does have the distinct disadvantage of either | 
|---|
| 586 | specifying the (completely arbitrary) maximum depth as part of the language or | 
|---|
| 587 | allowing the compiler to refuse to accept otherwise well-typed deeply-nested | 
|---|
| 588 | polymorphic types. | 
|---|
| 589 |  | 
|---|
| 590 | For purposes of determining polymorphism, the list of return types of a | 
|---|
| 591 | function should be treated like another parameter list, and combined with the | 
|---|
| 592 | degree of polymorphism from the parameter list in the same way that the | 
|---|
| 593 | parameters in the parameter list are combined. | 
|---|
| 594 | For instance, in the following, (14) is less polymorphic than (15) which is | 
|---|
| 595 | less polymorphic than (16): | 
|---|
| 596 |  | 
|---|
| 597 | forall(otype T) int f( T );  // (14) | 
|---|
| 598 | forall(otype T) T*  f( T );  // (15) | 
|---|
| 599 | forall(otype T) T   f( T );  // (16) | 
|---|
| 600 |  | 
|---|
| 601 | ### Type Variables and Bounds ### | 
|---|
| 602 | Degree of polymorphism doesn't solely depend on the parameter lists, though. | 
|---|
| 603 | Glen's thesis (4.4.4, p.89) gives an example that shows that it also depends | 
|---|
| 604 | on the number of type variables as well: | 
|---|
| 605 |  | 
|---|
| 606 | forall(otype T)          void f( T, int );  // (1a) polymorphic | 
|---|
| 607 | forall(otype T)          void f( T, T );    // (2)  more polymorphic | 
|---|
| 608 | forall(otype T, otype S) void f( T, S );    // (17) most polymorphic | 
|---|
| 609 |  | 
|---|
| 610 | Clearly the `forall` type parameter list needs to factor into calculation of | 
|---|
| 611 | degree of polymorphism as well, as it's the only real differentiation between | 
|---|
| 612 | (2) and (17). | 
|---|
| 613 | The simplest way to include the type parameter list would be to simply count | 
|---|
| 614 | the type variables and say that functions with more type variables are more | 
|---|
| 615 | polymorphic. | 
|---|
| 616 |  | 
|---|
| 617 | However, it also seems natural that more-constrained type variables should be | 
|---|
| 618 | counted as "less polymorphic" than less-constrained type variables. | 
|---|
| 619 | This would allow our resolver to pick more specialized (and presumably more | 
|---|
| 620 | efficient) implementations of functions where one exists. | 
|---|
| 621 | For example: | 
|---|
| 622 |  | 
|---|
| 623 | forall(otype T | { void g(T); }) T f( T );  // (18) less polymorphic | 
|---|
| 624 | forall(otype T)                  T f( T );  // (16) more polymorphic | 
|---|
| 625 |  | 
|---|
| 626 | We could account for this by counting the number of unique constraints and | 
|---|
| 627 | saying that functions with more constraints are less polymorphic. | 
|---|
| 628 |  | 
|---|
| 629 | That said, we do model the `forall` constraint list as a (possibly anonymous) | 
|---|
| 630 | _trait_, and say that each trait is a set of constraints, so we could | 
|---|
| 631 | presumably define a partial order over traits based on subset inclusion, and | 
|---|
| 632 | use this partial order instead of the weaker count of constraints to order the | 
|---|
| 633 | list of type parameters of a function, as below: | 
|---|
| 634 |  | 
|---|
| 635 | trait has_g(otype T) { void g(T); }; | 
|---|
| 636 | trait has_h(otype S) { void h(T); }; | 
|---|
| 637 | trait has_gh(otype R | has_g(R) | has_h(R)) {}; | 
|---|
| 638 | // has_gh is equivlent to { void g(R); void h(R); } | 
|---|
| 639 |  | 
|---|
| 640 | forall(otype T | has_gh(T)) T f( T );  // (19) least polymorphic | 
|---|
| 641 | forall(otype T | has_g(T))  T f( T );  // (18) more polymorphic than (19) | 
|---|
| 642 | forall(otype T | has_h(T))  T f( T );  // (18b) more polymorphic than (19) | 
|---|
| 643 | //       incomparable with (18) | 
|---|
| 644 | forall(otype T)             T f( T );  // (16) most polymorphic | 
|---|
| 645 |  | 
|---|
| 646 | The tricky bit with this is figuring out how to compare the constraint | 
|---|
| 647 | functions for equality up to type variable renaming; I suspect there's a known | 
|---|
| 648 | solution, but don't know what it is (perhaps some sort of unification | 
|---|
| 649 | calculation, though I hope there's a more lightweight option). | 
|---|
| 650 | We also should be able to take advantage of the programmer-provided trait | 
|---|
| 651 | subset information (like the constraint on `has_gh` in the example) to more | 
|---|
| 652 | efficiently generate the partial-order graph for traits, which should be able | 
|---|
| 653 | to be cached for efficiency. | 
|---|
| 654 |  | 
|---|
| 655 | Combining count of type variables with the (partial) order on the trait | 
|---|
| 656 | constraining those variables seems like it should be a fairly straightforward | 
|---|
| 657 | product ordering to me - one `forall` qualifier is (less than or) equal to | 
|---|
| 658 | another if it has both a (less than or) equal number of type variables and a | 
|---|
| 659 | (less than or) equal degree of polymorphism from its constraining trait; the | 
|---|
| 660 | two qualifiers are incomparable otherwise. | 
|---|
| 661 | If an easier-to-calculate total ordering is desired, it might be acceptable to | 
|---|
| 662 | use the number of type variables, with ties broken by number of constraints. | 
|---|
| 663 |  | 
|---|
| 664 | Similarly, to combine the (partial) orders on parameter and return lists with | 
|---|
| 665 | the (partial) order on `forall` qualifiers, a product ordering seems like the | 
|---|
| 666 | reasonable choice, though if we wanted a total order a reasonable choice would | 
|---|
| 667 | be to use whatever method we use to combine parameter costs into parameter | 
|---|
| 668 | lists to combine the costs for the parameter and return lists, then break ties | 
|---|
| 669 | by the order on the `forall` qualifiers. | 
|---|
| 670 |  | 
|---|
| 671 | ## Expression Costs ## | 
|---|
| 672 |  | 
|---|
| 673 | ### Variable Expressions ### | 
|---|
| 674 | Variables may be overloaded; that is, there may be multiple distinct variables | 
|---|
| 675 | with the same name so long as each variable has a distinct type. | 
|---|
| 676 | The variable expression `x` has one zero-cost interpretation as type `T` for | 
|---|
| 677 | each variable `T x` in scope. | 
|---|
| 678 |  | 
|---|
| 679 | ### Member Selection Expressions ### | 
|---|
| 680 | For every interpretation `I` of `e` which has a struct or union type `S`, | 
|---|
| 681 | `e.y` has an interpretation of type `T` for each member `T y` of `S`, with the | 
|---|
| 682 | same cost as `I`. | 
|---|
| 683 | Note that there may be more than one member of `S` with the same name, as per | 
|---|
| 684 | Cforall's usual overloading rules. | 
|---|
| 685 | The indirect member expression `e->y` is desugared to `(*e).y` and interpreted | 
|---|
| 686 | analogously. | 
|---|
| 687 |  | 
|---|
| 688 | **TODO** Consider allowing `e.y` to be interpreted as `e->y` if no | 
|---|
| 689 | interpretations as `e.y` exist. | 
|---|
| 690 |  | 
|---|
| 691 | ### Address Expressions ### | 
|---|
| 692 | Address expressions `&e` have an interpretation for each interpretation `I` of | 
|---|
| 693 | `e` that is an lvalue of type `T`, with the same cost as `I` and type `T*`. | 
|---|
| 694 | Lvalues result from variable expressions, member selection expressions, or | 
|---|
| 695 | application of functions returning an lvalue-qualified type. | 
|---|
| 696 | Note that the dereference operator is overloadable, so the rules for its | 
|---|
| 697 | resolution follow those for function application below. | 
|---|
| 698 |  | 
|---|
| 699 | **TODO** Consider allowing conversion-to-lvalue so that, e.g., `&42` spawns a | 
|---|
| 700 | new temporary holding `42` and takes its address. | 
|---|
| 701 |  | 
|---|
| 702 | ### Boolean-context Expressions ### | 
|---|
| 703 | C has a number of "boolean contexts", where expressions are assigned a truth | 
|---|
| 704 | value; these include both arguments to the short-circuiting `&&` and `||` | 
|---|
| 705 | operators, as well as the conditional expressions in `if` and `while` | 
|---|
| 706 | statements, the middle expression in `for` statements, and the first argument | 
|---|
| 707 | to the `?:` ternary conditional operator. | 
|---|
| 708 | In all these contexts, C interprets `0` (which is both an integer and a null | 
|---|
| 709 | pointer literal) as false, and all other integer or pointer values as true. | 
|---|
| 710 | In this spirit, Cforall allows other types to be considered "truthy" if they | 
|---|
| 711 | support the following de-sugaring in a conditional context (see notes on | 
|---|
| 712 | interpretation of literal `0` below): | 
|---|
| 713 |  | 
|---|
| 714 | x => ((int)( x != 0 )) | 
|---|
| 715 |  | 
|---|
| 716 | ### Literal Expressions ### | 
|---|
| 717 | Literal expressions (e.g. 42, 'c', 3.14, "Hello, world!") have one | 
|---|
| 718 | zero-cost interpretation with the same type the expression would have in C, | 
|---|
| 719 | with three exceptions: | 
|---|
| 720 |  | 
|---|
| 721 | Character literals like 'x' are typed as `char` in Cforall, not `int` as in C. | 
|---|
| 722 | This change breaks very little C code (primarily `sizeof 'x'`; the implicit | 
|---|
| 723 | conversion from `int` to `char` and lack of overloading handle most other | 
|---|
| 724 | expressions), matches the behaviour of C++, and is more compatible with | 
|---|
| 725 | programmer intuition. | 
|---|
| 726 |  | 
|---|
| 727 | The literals `0` and `1` are also treated specially by Cforall, due to their | 
|---|
| 728 | potential uses in operator overloading. | 
|---|
| 729 | Earlier versions of Cforall allowed `0` and `1` to be variable names, allowing | 
|---|
| 730 | multiple interpretations of them according to the existing variable | 
|---|
| 731 | overloading rules, with the following declarations in the prelude: | 
|---|
| 732 |  | 
|---|
| 733 | const int 0, 1; | 
|---|
| 734 | forall ( dtype DT ) const DT * const    0; | 
|---|
| 735 | forall ( ftype FT ) FT * const          0; | 
|---|
| 736 |  | 
|---|
| 737 | This did, however, create some backward-compatibility problems and potential | 
|---|
| 738 | performance issues, and works poorly for generic types. To start with, this | 
|---|
| 739 | (entirely legal C) code snippet doesn't compile in Cforall: | 
|---|
| 740 |  | 
|---|
| 741 | if ( 0 ) {} | 
|---|
| 742 |  | 
|---|
| 743 | It desugars to `if ( (int)(0 != 0) ) {}`, and since both `int` and | 
|---|
| 744 | `forall(dtype DT) DT*` have a != operator which returns `int` the resolver can | 
|---|
| 745 | not choose which `0` variable to take, because they're both exact matches. | 
|---|
| 746 |  | 
|---|
| 747 | The general != computation may also be less efficient than a check for a zero | 
|---|
| 748 | value; take the following example of a rational type: | 
|---|
| 749 |  | 
|---|
| 750 | struct rational { int32_t num, int32_t den }; | 
|---|
| 751 | rational 0 = { 0, 1 }; | 
|---|
| 752 |  | 
|---|
| 753 | int ?!=? (rational a, rational b) { | 
|---|
| 754 | return ((int64_t)a.num)*b.den != ((int64_t)b.num)*a.den; | 
|---|
| 755 | } | 
|---|
| 756 |  | 
|---|
| 757 | int not_zero (rational a) { return a.num != 0; } | 
|---|
| 758 |  | 
|---|
| 759 | To check if two rationals are equal we need to do a pair of multiplications to | 
|---|
| 760 | normalize them (the casts in the example are to prevent overflow), but to | 
|---|
| 761 | check if a rational is non-zero we just need to check its numerator, a more | 
|---|
| 762 | efficient operation. | 
|---|
| 763 |  | 
|---|
| 764 | Finally, though polymorphic null-pointer variables can be meaningfully | 
|---|
| 765 | defined, most other polymorphic variables cannot be, which makes it difficult | 
|---|
| 766 | to make generic types "truthy" using the existing system: | 
|---|
| 767 |  | 
|---|
| 768 | forall(otype T) struct pair { T x; T y; }; | 
|---|
| 769 | forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 }; | 
|---|
| 770 |  | 
|---|
| 771 | Now, it seems natural enough to want to define the zero for this pair type as | 
|---|
| 772 | a pair of the zero values of its element type (if they're defined). | 
|---|
| 773 | The declaration of `pair(T) 0` above is actually illegal though, as there is | 
|---|
| 774 | no way to represent the zero values of an infinite number of types in the | 
|---|
| 775 | single memory location available for this polymorphic variable - the | 
|---|
| 776 | polymorphic null-pointer variables defined in the prelude are legal, but that | 
|---|
| 777 | is only because all pointers are the same size and the single zero value is a | 
|---|
| 778 | legal value of all pointer types simultaneously; null pointer is, however, | 
|---|
| 779 | somewhat unique in this respect. | 
|---|
| 780 |  | 
|---|
| 781 | The technical explanation for the problems with polymorphic zero is that `0` | 
|---|
| 782 | is really a rvalue, not a lvalue - an expression, not an object. | 
|---|
| 783 | Drawing from this, the solution we propose is to give `0` a new built-in type, | 
|---|
| 784 | `_zero_t` (name open to bikeshedding), and similarly give `1` the new built-in | 
|---|
| 785 | type `_unit_t`. | 
|---|
| 786 | If the prelude defines != over `_zero_t` this solves the `if ( 0 )` problem, | 
|---|
| 787 | because now the unambiguous best interpretation of `0 != 0` is to read them | 
|---|
| 788 | both as `_zero_t` (and say that this expression is false). | 
|---|
| 789 | Backwards compatibility with C can be served by defining conversions in the | 
|---|
| 790 | prelude from `_zero_t` and `_unit_t` to `int` and the appropriate pointer | 
|---|
| 791 | types, as below: | 
|---|
| 792 |  | 
|---|
| 793 | // int 0; | 
|---|
| 794 | forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _zero_t); | 
|---|
| 795 | forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _zero_t); | 
|---|
| 796 |  | 
|---|
| 797 | // int 1; | 
|---|
| 798 | forall(otype T | { void ?{safe}(T*, int); }) void ?{safe} (T*, _unit_t); | 
|---|
| 799 | forall(otype T | { void ?{unsafe}(T*, int); }) void ?{unsafe} (T*, _unit_t); | 
|---|
| 800 |  | 
|---|
| 801 | // forall(dtype DT) const DT* 0; | 
|---|
| 802 | forall(dtype DT) void ?{safe}(const DT**, _zero_t); | 
|---|
| 803 | // forall(ftype FT) FT* 0; | 
|---|
| 804 | forall(ftype FT) void ?{safe}(FT**, _zero_t); | 
|---|
| 805 |  | 
|---|
| 806 | Further, with this change, instead of making `0` and `1` overloadable | 
|---|
| 807 | variables, we can instead allow user-defined constructors (or, more flexibly, | 
|---|
| 808 | safe conversions) from `_zero_t`, as below: | 
|---|
| 809 |  | 
|---|
| 810 | // rational 0 = { 0, 1 }; | 
|---|
| 811 | void ?{safe} (rational *this, _zero_t) { this->num = 0; this->den = 1; } | 
|---|
| 812 |  | 
|---|
| 813 | Note that we don't need to name the `_zero_t` parameter to this constructor, | 
|---|
| 814 | because its only possible value is a literal zero. | 
|---|
| 815 | This one line allows `0` to be used anywhere a `rational` is required, as well | 
|---|
| 816 | as enabling the same use of rationals in boolean contexts as above (by | 
|---|
| 817 | interpreting the `0` in the desguraring to be a rational by this conversion). | 
|---|
| 818 | Furthermore, while defining a conversion function from literal zero to | 
|---|
| 819 | `rational` makes rational a "truthy" type able to be used in a boolean | 
|---|
| 820 | context, we can optionally further optimize the truth decision on rationals as | 
|---|
| 821 | follows: | 
|---|
| 822 |  | 
|---|
| 823 | int ?!=? (rational a, _zero_t) { return a.num != 0; } | 
|---|
| 824 |  | 
|---|
| 825 | This comparison function will be chosen in preference to the more general | 
|---|
| 826 | rational comparison function for comparisons against literal zero (like in | 
|---|
| 827 | boolean contexts) because it doesn't require a conversion on the `0` argument. | 
|---|
| 828 | Functions of the form `int ?!=? (T, _zero_t)` can acutally be used in general | 
|---|
| 829 | to make a type `T` truthy without making `0` a value which can convert to that | 
|---|
| 830 | type, a capability not available in the current design. | 
|---|
| 831 |  | 
|---|
| 832 | This design also solves the problem of polymorphic zero for generic types, as | 
|---|
| 833 | in the following example: | 
|---|
| 834 |  | 
|---|
| 835 | // ERROR: forall(otype T | { T 0; }) pair(T) 0 = { 0, 0 }; | 
|---|
| 836 | forall(otype T | { T 0; }) void ?{safe} (pair(T) *this, _zero_t) { | 
|---|
| 837 | this->x = 0; this->y = 0; | 
|---|
| 838 | } | 
|---|
| 839 |  | 
|---|
| 840 | The polymorphic variable declaration didn't work, but this constructor is | 
|---|
| 841 | perfectly legal and has the desired semantics. | 
|---|
| 842 |  | 
|---|
| 843 | We can assert that `T` can be used in a boolean context as follows: | 
|---|
| 844 |  | 
|---|
| 845 | `forall(otype T | { int ?!=?(T, _zero_t); })` | 
|---|
| 846 |  | 
|---|
| 847 | Since the C standard (6.5.16.1.1) specifically states that pointers can be | 
|---|
| 848 | assigned into `_Bool` variables (and implies that other artithmetic types can | 
|---|
| 849 | be assigned into `_Bool` variables), it seems natural to say that assignment | 
|---|
| 850 | into a `_Bool` variable effectively constitutes a boolean context. | 
|---|
| 851 | To allow this interpretation, I propose including the following function (or | 
|---|
| 852 | its effective equivalent) in the prelude: | 
|---|
| 853 |  | 
|---|
| 854 | forall(otype T | { int ?!=?(T, _zero_t); }) | 
|---|
| 855 | void ?{safe}( _Bool *this, T that ) { *this = that != 0; } | 
|---|
| 856 |  | 
|---|
| 857 | Note that this conversion is not transitive; that is, for `t` a variable of | 
|---|
| 858 | some "truthy" type `T`, `(_Bool)t;` would use this conversion (in the absence | 
|---|
| 859 | of a lower-cost one), `(int)t;` would not use this conversion (and in fact | 
|---|
| 860 | would not be legal in the absence of another valid way to convert a `T` to an | 
|---|
| 861 | `int`), but `(int)(_Bool)t;` could legally use this conversion. | 
|---|
| 862 |  | 
|---|
| 863 | Similarly giving literal `1` the special type `_unit_t` allows for more | 
|---|
| 864 | concise and consistent specification of the increment and decrement operators, | 
|---|
| 865 | using the following de-sugaring: | 
|---|
| 866 |  | 
|---|
| 867 | ++i => i += 1 | 
|---|
| 868 | i++ => (tmp = i, i += 1, tmp) | 
|---|
| 869 | --i => i -= 1 | 
|---|
| 870 | i-- => (tmp = i, i -= 1, tmp) | 
|---|
| 871 |  | 
|---|
| 872 | In the examples above, `tmp` is a fresh temporary with its type inferred from | 
|---|
| 873 | the return type of `i += 1`. | 
|---|
| 874 | Under this proposal, defining a conversion from `_unit_t` to `T` and a | 
|---|
| 875 | `lvalue T ?+=? (T*, T)` provides both the pre- and post-increment operators | 
|---|
| 876 | for free in a consistent fashion (similarly for -= and the decrement | 
|---|
| 877 | operators). | 
|---|
| 878 | If a meaningful `1` cannot be defined for a type, both increment operators can | 
|---|
| 879 | still be defined with the signature `lvalue T ?+=? (T*, _unit_t)`. | 
|---|
| 880 | Similarly, if scalar addition can be performed on a type more efficiently than | 
|---|
| 881 | by repeated increment, `lvalue T ?+=? (T*, int)` will not only define the | 
|---|
| 882 | addition operator, it will simultaneously define consistent implementations of | 
|---|
| 883 | both increment operators (this can also be accomplished by defining a | 
|---|
| 884 | conversion from `int` to `T` and an addition operator `lvalue T ?+=?(T*, T)`). | 
|---|
| 885 |  | 
|---|
| 886 | To allow functions of the form `lvalue T ?+=? (T*, int)` to satisfy "has an | 
|---|
| 887 | increment operator" assertions of the form `lvalue T ?+=? (T*, _unit_t)`, | 
|---|
| 888 | we also define a non-transitive unsafe conversion from `_Bool` (allowable | 
|---|
| 889 | values `0` and `1`) to `_unit_t` (and `_zero_t`) as follows: | 
|---|
| 890 |  | 
|---|
| 891 | void ?{unsafe} (_unit_t*, _Bool) {} | 
|---|
| 892 |  | 
|---|
| 893 | As a note, the desugaring of post-increment above is possibly even more | 
|---|
| 894 | efficient than that of C++ - in C++, the copy to the temporary may be hidden | 
|---|
| 895 | in a separately-compiled module where it can't be elided in cases where it is | 
|---|
| 896 | not used, whereas this approach for Cforall always gives the compiler the | 
|---|
| 897 | opportunity to optimize out the temporary when it is not needed. | 
|---|
| 898 | Furthermore, one could imagine a post-increment operator that returned some | 
|---|
| 899 | type `T2` that was implicitly convertable to `T` but less work than a full | 
|---|
| 900 | copy of `T` to create (this seems like an absurdly niche case) - since the | 
|---|
| 901 | type of `tmp` is inferred from the return type of `i += 1`, you could set up | 
|---|
| 902 | functions with the following signatures to enable an equivalent pattern in | 
|---|
| 903 | Cforall: | 
|---|
| 904 |  | 
|---|
| 905 | lvalue T2 ?+=? (T*, _unit_t); // increment operator returns T2 | 
|---|
| 906 | void ?{} (T2*, T);            // initialize T2 from T for use in `tmp = i` | 
|---|
| 907 | void ?{safe} (T*, T2);        // allow T2 to be used as a T when needed to | 
|---|
| 908 | // preserve expected semantics of T x = y++; | 
|---|
| 909 |  | 
|---|
| 910 | **TODO** Look in C spec for literal type interprations. | 
|---|
| 911 | **TODO** Write up proposal for wider range of literal types, put in appendix | 
|---|
| 912 |  | 
|---|
| 913 | ### Initialization and Cast Expressions ### | 
|---|
| 914 | An initialization expression `T x = e` has one interpretation for each | 
|---|
| 915 | interpretation `I` of `e` with type `S` which is convertable to `T`. | 
|---|
| 916 | The cost of the interpretation is the cost of `I` plus the conversion cost | 
|---|
| 917 | from `S` to `T`. | 
|---|
| 918 | A cast expression `(T)e` is interpreted as hoisting initialization of a | 
|---|
| 919 | temporary variable `T tmp = e` out of the current expression, then replacing | 
|---|
| 920 | `(T)e` by the new temporary `tmp`. | 
|---|
| 921 |  | 
|---|
| 922 | ### Assignment Expressions ### | 
|---|
| 923 | An assignment expression `e = f` desugars to `(?=?(&e, f), e)`, and is then | 
|---|
| 924 | interpreted according to the usual rules for function application and comma | 
|---|
| 925 | expressions. | 
|---|
| 926 | Operator-assignment expressions like `e += f` desugar similarly as | 
|---|
| 927 | `(?+=?(&e, f), e)`. | 
|---|
| 928 |  | 
|---|
| 929 | ### Function Application Expressions ### | 
|---|
| 930 | Every _compatible function_ and satisfying interpretation of its arguments and | 
|---|
| 931 | polymorphic variable bindings produces one intepretation for the function | 
|---|
| 932 | application expression. | 
|---|
| 933 | Broadly speaking, the resolution cost of a function application is the sum of | 
|---|
| 934 | the cost of the interpretations of all arguments, the cost of all conversions | 
|---|
| 935 | to make those argument interpretations match the parameter types, and the | 
|---|
| 936 | binding cost of any of the function's polymorphic type parameters. | 
|---|
| 937 |  | 
|---|
| 938 | **TODO** Work out binding cost in more detail. | 
|---|
| 939 | **TODO** Address whether "incomparably polymorphic" should be treated as | 
|---|
| 940 | "equally polymorphic" and be disambiguated by count of (safe) conversions. | 
|---|
| 941 | **TODO** Think about what polymorphic return types mean in terms of late | 
|---|
| 942 | binding. | 
|---|
| 943 | **TODO** Consider if "f is less polymorphic than g" can mean exactly "f | 
|---|
| 944 | specializes g"; if we don't consider the assertion parameters (except perhaps | 
|---|
| 945 | by count) and make polymorphic variables bind exactly (rather than after | 
|---|
| 946 | implicit conversions) this should actually be pre-computable. | 
|---|
| 947 | **TODO** Add "deletable" functions - take Thierry's suggestion that a deleted | 
|---|
| 948 | function declaration is costed out by the resolver in the same way that any | 
|---|
| 949 | other function declaration is costed; if the deleted declaration is the unique | 
|---|
| 950 | min-cost resolution refuse to type the expression, if it is tied for min-cost | 
|---|
| 951 | then take the non-deleted alternative, and of two equivalent-cost deleted | 
|---|
| 952 | interpretations with the same return type pick one arbitrarily rather than | 
|---|
| 953 | producing an ambiguous resolution. This would also be useful for forbidding | 
|---|
| 954 | pointer-to-floating-point explicit conversions (C11, 6.5.4.4). | 
|---|
| 955 | **TODO** Cover default parameters, maybe named parameters (see "named | 
|---|
| 956 | arguments" thread of 11 March 2016) | 
|---|
| 957 |  | 
|---|
| 958 |  | 
|---|
| 959 | ### Sizeof, Alignof & Offsetof Expressions ### | 
|---|
| 960 | `sizeof`, `alignof`, and `offsetof` expressions have at most a single | 
|---|
| 961 | interpretation, of type `size_t`. | 
|---|
| 962 | `sizeof` and `alignof` expressions take either a type or an expression as an | 
|---|
| 963 | argument; if the argument is a type, it must be a complete type which is not a | 
|---|
| 964 | function type, if an expression, the expression must have a single | 
|---|
| 965 | interpretation, the type of which conforms to the same rules. | 
|---|
| 966 | `offsetof` takes two arguments, a type and a member name; the type must be | 
|---|
| 967 | a complete structure or union type, and the second argument must name a member | 
|---|
| 968 | of that type. | 
|---|
| 969 |  | 
|---|
| 970 | ### Comma Expressions ### | 
|---|
| 971 | A comma expression `x, y` resolves `x` as if it had been cast to `void`, and | 
|---|
| 972 | then, if there is a unique interpretation `I` of `x`, has one interpretation | 
|---|
| 973 | for each interpretation `J` of `y` with the same type as `J` costing the sum | 
|---|
| 974 | of the costs of `I` and `J`. | 
|---|
| 975 |  | 
|---|
| 976 | ### Index Expressions ### | 
|---|
| 977 | **TODO** Consider adding polymorphic function in prelude for this, as per | 
|---|
| 978 | 6.5.2.1.2 in the C standard: | 
|---|
| 979 |  | 
|---|
| 980 | forall(otype T, otype I, otype R, otype E | { R ?+?(T, I); lvalue E *?(R); }) | 
|---|
| 981 | lvalue E ?[?](T a, I i) { return *(a + i); } | 
|---|
| 982 |  | 
|---|
| 983 | I think this isn't actually a good idea, because the cases for it are niche, | 
|---|
| 984 | mostly odd tricks like `0[p]` as an alternate syntax for dereferencing a | 
|---|
| 985 | pointer `p`, and adding it to the prelude would slow down resolution of | 
|---|
| 986 | every index expression just a bit. Our existing prelude includes an indexing | 
|---|
| 987 | operator `forall(otype T) lvalue T ?[?](ptrdiff_t, T*)`, plus qualified | 
|---|
| 988 | variants, which should satisfy C source-compatibility without propegating this | 
|---|
| 989 | silly desugaring further. | 
|---|
| 990 |  | 
|---|
| 991 | #### Compatible Functions #### | 
|---|
| 992 | **TODO** This subsection is very much a work in progress and has multiple open | 
|---|
| 993 | design questions. | 
|---|
| 994 |  | 
|---|
| 995 | A _compatible function_ for an application expression is a visible function | 
|---|
| 996 | declaration with the same name as the application expression and parameter | 
|---|
| 997 | types that can be converted to from the argument types. | 
|---|
| 998 | Function pointers and variables of types with the `?()` function call operator | 
|---|
| 999 | overloaded may also serve as function declarations for purposes of | 
|---|
| 1000 | compatibility. | 
|---|
| 1001 |  | 
|---|
| 1002 | For monomorphic parameters of a function declaration, the declaration is a | 
|---|
| 1003 | compatible function if there is an argument interpretation that is either an | 
|---|
| 1004 | exact match, or has a safe or unsafe implicit conversion that can be used to | 
|---|
| 1005 | reach the parameter type; for example: | 
|---|
| 1006 |  | 
|---|
| 1007 | void f(int); | 
|---|
| 1008 |  | 
|---|
| 1009 | f(42);        // compatible; exact match to int type | 
|---|
| 1010 | f('x');       // compatible; safe conversion from char => int | 
|---|
| 1011 | f(3.14);      // compatible; unsafe conversion from double => int | 
|---|
| 1012 | f((void*)0);  // not compatible; no implicit conversion from void* => int | 
|---|
| 1013 |  | 
|---|
| 1014 | Per Richard[*], function assertion satisfaction involves recursively searching | 
|---|
| 1015 | for compatible functions, not an exact match on the function types (I don't | 
|---|
| 1016 | believe the current Cforall resolver handles this properly); to extend the | 
|---|
| 1017 | previous example: | 
|---|
| 1018 |  | 
|---|
| 1019 | forall(otype T | { void f(T); }) void g(T); | 
|---|
| 1020 |  | 
|---|
| 1021 | g(42);    // binds T = int, takes f(int) by exact match | 
|---|
| 1022 | g('x');   // binds T = char, takes f(int) by conversion | 
|---|
| 1023 | g(3.14);  // binds T = double, takes f(int) by conversion | 
|---|
| 1024 |  | 
|---|
| 1025 | [*] Bilson, s.2.1.3, p.26-27, "Assertion arguments are found by searching the | 
|---|
| 1026 | accessible scopes for definitions corresponding to assertion names, and | 
|---|
| 1027 | choosing the ones whose types correspond *most closely* to the assertion | 
|---|
| 1028 | types." (emphasis mine) | 
|---|
| 1029 |  | 
|---|
| 1030 | There are three approaches we could take to binding type variables: type | 
|---|
| 1031 | variables must bind to argument types exactly, each type variable must bind | 
|---|
| 1032 | exactly to at least one argument, or type variables may bind to any type which | 
|---|
| 1033 | all corresponding arguments can implicitly convert to; I'll provide some | 
|---|
| 1034 | possible motivation for each approach. | 
|---|
| 1035 |  | 
|---|
| 1036 | There are two main arguments for the more restrictive binding schemes; the | 
|---|
| 1037 | first is that the built-in implicit conversions in C between `void*` and `T*` | 
|---|
| 1038 | for any type `T` can lead to unexpectedly type-unsafe behaviour in a more | 
|---|
| 1039 | permissive binding scheme, for example: | 
|---|
| 1040 |  | 
|---|
| 1041 | forall(dtype T) T* id(T *p) { return p; } | 
|---|
| 1042 |  | 
|---|
| 1043 | int main() { | 
|---|
| 1044 | int *p = 0; | 
|---|
| 1045 | char *c = id(p); | 
|---|
| 1046 | } | 
|---|
| 1047 |  | 
|---|
| 1048 | This code compiles in CFA today, and it works because the extra function | 
|---|
| 1049 | wrapper `id` provides a level of indirection that allows the non-chaining | 
|---|
| 1050 | implicit conversions from `int*` => `void*` and `void*` => `char*` to chain. | 
|---|
| 1051 | The resolver types the last line with `T` bound to `void` as follows: | 
|---|
| 1052 |  | 
|---|
| 1053 | char *c = (char*)id( (void*)p ); | 
|---|
| 1054 |  | 
|---|
| 1055 | It has been suggested that making the implicit conversions to and from `void*` | 
|---|
| 1056 | explicit in Cforall code (as in C++) would solve this particular problem, and | 
|---|
| 1057 | provide enough other type-safety benefits to outweigh the source-compatibility | 
|---|
| 1058 | break with C; see Appendix D for further details. | 
|---|
| 1059 |  | 
|---|
| 1060 | The second argument for a more constrained binding scheme is performance; | 
|---|
| 1061 | trait assertions need to be checked after the type variables are bound, and | 
|---|
| 1062 | adding more possible values for the type variables should correspond to a | 
|---|
| 1063 | linear increase in runtime of the resolver per type variable. | 
|---|
| 1064 | There are 21 built-in arithmetic types in C (ignoring qualifiers), and each of | 
|---|
| 1065 | them is implicitly convertable to any other; if we allow unrestricted binding | 
|---|
| 1066 | of type variables, a common `int` variable (or literal) used in the position | 
|---|
| 1067 | of a polymorphic variable parameter would cause a 20x increase in the amount | 
|---|
| 1068 | of time needed to check trait resolution for that interpretation. | 
|---|
| 1069 | These numbers have yet to be emprically substantiated, but the theory is | 
|---|
| 1070 | reasonable, and given that much of the impetus for re-writing the resolver is | 
|---|
| 1071 | due to its poor performance, I think this is a compelling argument. | 
|---|
| 1072 |  | 
|---|
| 1073 | I would also mention that a constrained binding scheme is practical; the most | 
|---|
| 1074 | common type of assertion is a function assertion, and, as mentioned above, | 
|---|
| 1075 | those assertions should be able to be implicitly converted to to match. | 
|---|
| 1076 | Thus, in the example above with `g(T)`, where the assertion is `void f(T)`, | 
|---|
| 1077 | we first bind `T = int` or `T = char` or `T = double`, then substitute the | 
|---|
| 1078 | binding into the assertion, yielding assertions of `void f(int)`, | 
|---|
| 1079 | `void f(char)`, or `void f(double)`, respectively, then attempt to satisfy | 
|---|
| 1080 | these assertions to complete the binding. | 
|---|
| 1081 | Though in all three cases, the existing function with signature `void f(int)` | 
|---|
| 1082 | satisfies this assertion, the checking work cannot easily be re-used between | 
|---|
| 1083 | variable bindings, because there may be better or worse matches depending on | 
|---|
| 1084 | the specifics of the binding. | 
|---|
| 1085 |  | 
|---|
| 1086 | The main argument for a more flexible binding scheme is that the binding | 
|---|
| 1087 | abstraction can actually cause a wrapped function call that would work to | 
|---|
| 1088 | cease to resolve, as below: | 
|---|
| 1089 |  | 
|---|
| 1090 | forall(otype T | { T ?+? (T, T) }) | 
|---|
| 1091 | T add(T x, T y) { return x + y; } | 
|---|
| 1092 |  | 
|---|
| 1093 | int main() { | 
|---|
| 1094 | int i, j = 2; | 
|---|
| 1095 | short r, s = 3; | 
|---|
| 1096 | i = add(j, s); | 
|---|
| 1097 | r = add(s, j); | 
|---|
| 1098 | } | 
|---|
| 1099 |  | 
|---|
| 1100 | Now, C's implicit conversions mean that you can call `j + s` or `s + j`, and | 
|---|
| 1101 | in both cases the short `s` is promoted to `int` to match `j`. | 
|---|
| 1102 | If, on the other hand, we demand that variables exactly match type variables, | 
|---|
| 1103 | neither call to `add` will compile, because it is impossible to simultaneously | 
|---|
| 1104 | bind `T` to both `int` and `short` (C++ has a similar restriction on template | 
|---|
| 1105 | variable inferencing). | 
|---|
| 1106 | One alternative that enables this case, while still limiting the possible | 
|---|
| 1107 | type variable bindings is to say that at least one argument must bind to its | 
|---|
| 1108 | type parameter exactly. | 
|---|
| 1109 | In this case, both calls to `add` would have the set `{ T = int, T = short }` | 
|---|
| 1110 | for candidate bindings, and would have to check both, as well as checking that | 
|---|
| 1111 | `short` could convert to `int` or vice-versa. | 
|---|
| 1112 |  | 
|---|
| 1113 | It is worth noting here that parameterized types generally bind their type | 
|---|
| 1114 | parameters exactly anyway, so these "restrictive" semantics only restrict a | 
|---|
| 1115 | small minority of calls; for instance, in the example following, there isn't a | 
|---|
| 1116 | sensible way to type the call to `ptr-add`: | 
|---|
| 1117 |  | 
|---|
| 1118 | forall(otype T | { T ?+?(T, T) }) | 
|---|
| 1119 | void ptr-add( T* rtn, T* x, T* y ) { | 
|---|
| 1120 | *rtn = *x + *y; | 
|---|
| 1121 | } | 
|---|
| 1122 |  | 
|---|
| 1123 | int main() { | 
|---|
| 1124 | int i, j = 2; | 
|---|
| 1125 | short s = 3; | 
|---|
| 1126 | ptr-add(&i, &j, &s); // ERROR &s is not an int* | 
|---|
| 1127 | } | 
|---|
| 1128 |  | 
|---|
| 1129 | I think there is some value in providing library authors with the | 
|---|
| 1130 | capability to express "these two parameter types must match exactly". | 
|---|
| 1131 | This can be done without restricting the language's expressivity, as the `add` | 
|---|
| 1132 | case above can be made to work under the strictest type variable binding | 
|---|
| 1133 | semantics with any addition operator in the system by changing its signature | 
|---|
| 1134 | as follows: | 
|---|
| 1135 |  | 
|---|
| 1136 | forall( otype T, otype R, otype S | { R ?+?(T, S); } ) | 
|---|
| 1137 | R add(T x, S y) { return x + y; } | 
|---|
| 1138 |  | 
|---|
| 1139 | Now, it is somewhat unfortunate that the most general version here is more | 
|---|
| 1140 | verbose (and thus that the path of least resistence would be more restrictive | 
|---|
| 1141 | library code); however, the breaking case in the example code above is a bit | 
|---|
| 1142 | odd anyway - explicitly taking two variables of distinct types and relying on | 
|---|
| 1143 | C's implicit conversions to do the right thing is somewhat bad form, | 
|---|
| 1144 | especially where signed/unsigned conversions are concerned. | 
|---|
| 1145 | I think the more common case for implicit conversions is the following, | 
|---|
| 1146 | though, where the conversion is used on a literal: | 
|---|
| 1147 |  | 
|---|
| 1148 | short x = 40; | 
|---|
| 1149 | short y = add(x, 2); | 
|---|
| 1150 |  | 
|---|
| 1151 | One option to handle just this case would be to make literals implicitly | 
|---|
| 1152 | convertable to match polymorphic type variables, but only literals. | 
|---|
| 1153 | The example above would actually behave slightly differently than `x + 2` in | 
|---|
| 1154 | C, though, casting the `2` down to `short` rather than the `x` up to `int`, a | 
|---|
| 1155 | possible demerit of this scheme. | 
|---|
| 1156 |  | 
|---|
| 1157 | The other question to ask would be which conversions would be allowed for | 
|---|
| 1158 | literals; it seems rather odd to allow down-casting `42ull` to `char`, when | 
|---|
| 1159 | the programmer has explicitly specified by the suffix that it's an unsigned | 
|---|
| 1160 | long. | 
|---|
| 1161 | Type interpretations of literals in C are rather complex (see [1]), but one | 
|---|
| 1162 | reasonable approach would be to say that un-suffixed integer literals could be | 
|---|
| 1163 | interpreted as any type convertable from int, "u" suffixed literals could be | 
|---|
| 1164 | interpreted as any type convertable from "unsigned int" except the signed | 
|---|
| 1165 | integer types, and "l" or "ll" suffixed literals could only be interpreted as | 
|---|
| 1166 | `long` or `long long`, respectively (or possibly that the "u" suffix excludes | 
|---|
| 1167 | the signed types, while the "l" suffix excludes the types smaller than | 
|---|
| 1168 | `long int`, as in [1]). | 
|---|
| 1169 | Similarly, unsuffixed floating-point literals could be interpreted as `float`, | 
|---|
| 1170 | `double` or `long double`, but "f" or "l" suffixed floating-point literals | 
|---|
| 1171 | could only be interpreted as `float` or `long double`, respectively. | 
|---|
| 1172 | I would like to specify that character literals can only be interpreted as | 
|---|
| 1173 | `char`, but the wide-character variants and the C practice of typing character | 
|---|
| 1174 | literals as `int` means that would likely break code, so character literals | 
|---|
| 1175 | should be able to take any integer type. | 
|---|
| 1176 |  | 
|---|
| 1177 | [1] http://en.cppreference.com/w/c/language/integer_constant | 
|---|
| 1178 |  | 
|---|
| 1179 | With the possible exception of the `add` case above, implicit conversions to | 
|---|
| 1180 | the function types of assertions can handle most of the expected behaviour | 
|---|
| 1181 | from C. | 
|---|
| 1182 | However, implicit conversions cannot be applied to match variable assertions, | 
|---|
| 1183 | as in the following example: | 
|---|
| 1184 |  | 
|---|
| 1185 | forall( otype T | { int ?<?(T, T); T ?+?(T, T); T min; T max; } ) | 
|---|
| 1186 | T clamp_sum( T x, T y ) { | 
|---|
| 1187 | T sum = x + y; | 
|---|
| 1188 | if ( sum < min ) return min; | 
|---|
| 1189 | if ( max < sum ) return max; | 
|---|
| 1190 | return sum; | 
|---|
| 1191 | } | 
|---|
| 1192 |  | 
|---|
| 1193 | char min = 'A'; | 
|---|
| 1194 | double max = 100.0; | 
|---|
| 1195 | //int val = clamp_sum( 'X', 3.14 );  // ERROR (1) | 
|---|
| 1196 |  | 
|---|
| 1197 | char max = 'Z' | 
|---|
| 1198 | char val = clamp_sum( 'X', 3.14 ); // MATCH (2) | 
|---|
| 1199 | double val = clamp_sum( 40.9, 19.9 ); // MAYBE (3) | 
|---|
| 1200 |  | 
|---|
| 1201 | In this example, the call to `clamp_sum` at (1) doesn't compile, because even | 
|---|
| 1202 | though there are compatible `min` and `max` variables of types `char` and | 
|---|
| 1203 | `double`, they need to have the same type to match the constraint, and they | 
|---|
| 1204 | don't. | 
|---|
| 1205 | The (2) example does compile, but with a behaviour that might be a bit | 
|---|
| 1206 | unexpected given the "usual arithmetic conversions", in that both values are | 
|---|
| 1207 | narrowed to `char` to match the `min` and `max` constraints, rather than | 
|---|
| 1208 | widened to `double` as is usual for mis-matched arguments to +. | 
|---|
| 1209 | The (3) example is the only case discussed here that would require the most | 
|---|
| 1210 | permisive type binding semantics - here, `T` is bound to `char`, to match the | 
|---|
| 1211 | constraints, and both the parameters are narrowed from `double` to `char` | 
|---|
| 1212 | before the call, which would not be allowed under either of the more | 
|---|
| 1213 | restrictive binding semantics. | 
|---|
| 1214 | However, the behaviour here is unexpected to the programmer, because the | 
|---|
| 1215 | return value will be `(double)'A' /* == 60.0 */` due to the conversions, | 
|---|
| 1216 | rather than `60.8 /* == 40.9 + 19.9 */` as they might expect. | 
|---|
| 1217 |  | 
|---|
| 1218 | Personally, I think that implicit conversions are not a particularly good | 
|---|
| 1219 | language design, and that the use-cases for them can be better handled with | 
|---|
| 1220 | less powerful features (e.g. more versatile rules for typing constant | 
|---|
| 1221 | expressions). | 
|---|
| 1222 | However, though we do need implicit conversions in monomorphic code for C | 
|---|
| 1223 | compatibility, I'm in favour of restricting their usage in polymorphic code, | 
|---|
| 1224 | both to give programmers some stronger tools to express their intent and to | 
|---|
| 1225 | shrink the search space for the resolver. | 
|---|
| 1226 | Of the possible binding semantics I've discussed, I'm in favour of forcing | 
|---|
| 1227 | polymorphic type variables to bind exactly, though I could be talked into | 
|---|
| 1228 | allowing literal expressions to have more flexibility in their bindings, or | 
|---|
| 1229 | possibly loosening "type variables bind exactly" to "type variables bind | 
|---|
| 1230 | exactly at least once"; I think the unrestricted combination of implicit | 
|---|
| 1231 | conversions and polymorphic type variable binding unneccesarily multiplies the | 
|---|
| 1232 | space of possible function resolutions, and that the added resolution options | 
|---|
| 1233 | are mostly unexpected and therefore confusing and not useful to user | 
|---|
| 1234 | programmers. | 
|---|
| 1235 |  | 
|---|
| 1236 | ## Resolver Architecture ## | 
|---|
| 1237 |  | 
|---|
| 1238 | ### Function Application Resolution ### | 
|---|
| 1239 | Our resolution algorithm for function application expressions is based on | 
|---|
| 1240 | Baker's[3] single-pass bottom-up algorithm, with Cormack's[4] single-pass | 
|---|
| 1241 | top-down algorithm applied where appropriate as an optimization. | 
|---|
| 1242 | Broadly speaking, the cost of this resolution per expression will be | 
|---|
| 1243 | proportional to `i^d`, where `i` is the number of interpretations of each | 
|---|
| 1244 | program symbol, and `d` is the maximum depth of the expression DAG. | 
|---|
| 1245 | Since `d` is determined by the user programmer (in general, bounded by a small | 
|---|
| 1246 | constant), opportunities for resolver optimization primarily revolve around | 
|---|
| 1247 | minimizing `i`, the number of interpretations of each symbol that are | 
|---|
| 1248 | considered. | 
|---|
| 1249 |  | 
|---|
| 1250 | [3] Baker, Theodore P. A one-pass algorithm for overload resolution in Ada. | 
|---|
| 1251 | ACM Transactions on Programming Languages and Systems (1982) 4:4 p.601-614 | 
|---|
| 1252 |  | 
|---|
| 1253 | [4] Cormack, Gordon V. An algorithm for the selection of overloaded functions | 
|---|
| 1254 | in Ada. SIGPLAN Notices (1981) 16:2 p.48-52 | 
|---|
| 1255 |  | 
|---|
| 1256 | Unlike Baker, our system allows implicit type conversions for function | 
|---|
| 1257 | arguments and return types; the problem then becomes to find the valid | 
|---|
| 1258 | interpretation for an expression that has the unique minimal conversion cost, | 
|---|
| 1259 | if such exists. | 
|---|
| 1260 | Interpretations can be produced both by overloaded names and implicit | 
|---|
| 1261 | conversions applied to existing interpretations; we have proposals to reduce | 
|---|
| 1262 | the number of interpretations considered from both sources. | 
|---|
| 1263 | To simplify the problem for this discussion, we will consider application | 
|---|
| 1264 | resolution restricted to a domain of functions applied to variables, possibly | 
|---|
| 1265 | in a nested manner (e.g. `f( g( x ), y )`, where `x` and `y` are variables and | 
|---|
| 1266 | `f` and `g` are functions), and possibly in a typed context such as a variable | 
|---|
| 1267 | initialization (e.g. `int i = f( x );`); the other aspects of Cforall type | 
|---|
| 1268 | resolution should be able to be straightforwardly mapped into this model. | 
|---|
| 1269 | The types of the symbol tables used for variable and function declarations | 
|---|
| 1270 | look somewhat like the following: | 
|---|
| 1271 |  | 
|---|
| 1272 | variable_table = name_map( variable_name, variable_map ) | 
|---|
| 1273 |  | 
|---|
| 1274 | function_table = name_map( function_name, function_map ) | 
|---|
| 1275 |  | 
|---|
| 1276 | variable_map = multi_index( by_type( variable_type ), | 
|---|
| 1277 | variable_decl_set ) | 
|---|
| 1278 |  | 
|---|
| 1279 | function_map = multi_index( by_int( n_params ), | 
|---|
| 1280 | by_type( return_type ), | 
|---|
| 1281 | function_decl_set ) | 
|---|
| 1282 |  | 
|---|
| 1283 | `variable_name` and `function_name` are likely simple strings, with `name_map` | 
|---|
| 1284 | a hash table (or perhaps trie) mapping string keys to values. | 
|---|
| 1285 | `variable_decl_set` and `function_decl_set` can be thought of for the moment | 
|---|
| 1286 | as simple bags of typed declarations, where the declaration types are linked | 
|---|
| 1287 | to the graph of available conversions for that type. | 
|---|
| 1288 | In a typed context both the `variable_decl_set` and the `function_decl_set` | 
|---|
| 1289 | should be able to be selected upon by type; this is accomplished by the | 
|---|
| 1290 | `by_type` index of both `variable_map` and `function_map`. | 
|---|
| 1291 | The `by_int` index of `function_map` also provides a way to select functions | 
|---|
| 1292 | by their number of parameters; this index may be used to swiftly discard any | 
|---|
| 1293 | function declaration which does not have the appropriate number of parameters | 
|---|
| 1294 | for the argument interpretations being passed to it; given the likely small | 
|---|
| 1295 | number of entries in this map, it is possible that a binary search of a sorted | 
|---|
| 1296 | vector or even a linear search of an unsorted vector would be more efficient | 
|---|
| 1297 | than the usual hash-based index. | 
|---|
| 1298 |  | 
|---|
| 1299 | Given these data structures, the general outline of our algorithm follows | 
|---|
| 1300 | Baker, with Cormack's algorithm used as a heuristic filter in typed contexts. | 
|---|
| 1301 |  | 
|---|
| 1302 | In an untyped context, we use a variant of Baker's bottom-up algorithm. | 
|---|
| 1303 | The leaves of the interpretation DAG are drawn from the variable symbol table, | 
|---|
| 1304 | with entries in the table each producing zero-cost interpretations, and each | 
|---|
| 1305 | implicit conversion available to be applied to the type of an existing entry | 
|---|
| 1306 | producing a further interpretation with the same cost as the conversion. | 
|---|
| 1307 | As in Baker, if two or more interpretations have the same type, only the | 
|---|
| 1308 | minimum cost interpretation with that type is produced; if there is no unique | 
|---|
| 1309 | minimum cost interpretation than resolution with that type is ambiguous, and | 
|---|
| 1310 | not permitted. | 
|---|
| 1311 | It should be relatively simple to produce the list of interpretations sorted | 
|---|
| 1312 | by cost by producing the interpretations via a breadth-first search of the | 
|---|
| 1313 | conversion graph from the initial interpretations provided in the variable | 
|---|
| 1314 | symbol table. | 
|---|
| 1315 |  | 
|---|
| 1316 | To match a function at one of the internal nodes of the DAG, we first look up | 
|---|
| 1317 | the function's name in the function symbol table, the appropriate number of | 
|---|
| 1318 | parameters for the arguments that are provided through the `by_int` index of | 
|---|
| 1319 | the returned `function_map`, then go through the resulting `function_decl_set` | 
|---|
| 1320 | searching for functions where the parameter types can unify with the provided | 
|---|
| 1321 | argument lists; any such matching function produces an interpretation with a | 
|---|
| 1322 | cost that is the sum of its argument costs. | 
|---|
| 1323 | Though this is not included in our simplified model, this unification step may | 
|---|
| 1324 | include binding of polymorphic variables, which introduces a cost for the | 
|---|
| 1325 | function binding itself which must be added to the argument costs. | 
|---|
| 1326 | Also, checking of function assertions would likely be done at this step as | 
|---|
| 1327 | well, possibly eliminating some possible matching functions (if no suitable | 
|---|
| 1328 | assertions can be satisfied), or adding further conversion costs for the | 
|---|
| 1329 | assertion satisfaction. | 
|---|
| 1330 | Once the set of valid function interpretations is produced, these may also be | 
|---|
| 1331 | expanded by the graph of implicit conversions on their return types, as the | 
|---|
| 1332 | variable interpretations were. | 
|---|
| 1333 |  | 
|---|
| 1334 | This implicit conversion-based expansion of interpretations should be skipped | 
|---|
| 1335 | for the top-level expression if used in an untyped (void) context, e.g. for | 
|---|
| 1336 | `f` in `f( g ( x ) );` or `x` in `x;`. | 
|---|
| 1337 | On the other hand, if the top-level expression specifies a type, e.g. in | 
|---|
| 1338 | `int i = f( x );`, only top level expressions that return that type are | 
|---|
| 1339 | relevant to the search, so the candidates for `f` can be filtered first by | 
|---|
| 1340 | those that return `int` (or a type convertable to it); this can be | 
|---|
| 1341 | accomplished by performing a top-down filter of the interpretations of `f` by | 
|---|
| 1342 | the `by_type` index of the `function_map` in a manner similar to Cormack's[4] | 
|---|
| 1343 | algorithm. | 
|---|
| 1344 |  | 
|---|
| 1345 | In a typed context, such as an initialization expression | 
|---|
| 1346 | `T x = f( g( y ), z );`, only interpretations of `f( g( y ), z )` which have | 
|---|
| 1347 | type `T` are valid; since there are likely to be valid interpretations of | 
|---|
| 1348 | `f( g( y ), z )` which cannot be used to initialize a variable of type `T`, we | 
|---|
| 1349 | can use this information to reduce the number of interpretations considered. | 
|---|
| 1350 | Drawing from Cormack[4], we first search for interpretations of `f` where the | 
|---|
| 1351 | return type is `T`; by breadth-first-search of the conversion graph, it should | 
|---|
| 1352 | be straightforward to order the interpretations of `f` by the cost to convert | 
|---|
| 1353 | their return type to `T`. | 
|---|
| 1354 | We can also filter out interpretations of `f` with less than two parameters, | 
|---|
| 1355 | since both `g( y )` and `z` must produce at least one parameter; we may not, | 
|---|
| 1356 | however, rule out interpretations of `f` with more than two parameters, as | 
|---|
| 1357 | there may be a valid interpretation of `g( y )` as a function returning more | 
|---|
| 1358 | than one parameter (if the expression was `f( y, z )` instead, we could use an | 
|---|
| 1359 | exact parameter count, assuming that variables of tuple type don't exist). | 
|---|
| 1360 | For each compatible interpretation of `f`, we can add the type of the first | 
|---|
| 1361 | parameter of that interpretation of `f` to a set `S`, and recursively search | 
|---|
| 1362 | for interpretations of `g( y )` that return some type `Si` in `S`, and | 
|---|
| 1363 | similarly for interpretations of `z` that match the type of any of the second | 
|---|
| 1364 | parameters of some `f`. | 
|---|
| 1365 | Naturally, if no matching interpretation of `g( y )` can be found for the | 
|---|
| 1366 | first parameter of some `f`, the type of the second parameter of that `f` will | 
|---|
| 1367 | not be added to the set of valid types for `z`. | 
|---|
| 1368 | Each node in this interpretation DAG is given a cost the same way it would be | 
|---|
| 1369 | in the bottom-up approach, with the exception that when going top-down there | 
|---|
| 1370 | must be a final bottom-up pass to sum the interpretation costs and sort them | 
|---|
| 1371 | as appropriate. | 
|---|
| 1372 |  | 
|---|
| 1373 | If a parameter type for some `f` is a polymorphic type variable that is left | 
|---|
| 1374 | unbound by the return type (e.g. `forall(otype S) int f(S x, int y)`), the | 
|---|
| 1375 | matching arguments should be found using the bottom-up algorithm above for | 
|---|
| 1376 | untyped contexts, because the polymorphic type variable does not sufficiently | 
|---|
| 1377 | constrain the available interpretations of the argument expression. | 
|---|
| 1378 | Similarly, it would likely be an advantage to use top-down resolution for | 
|---|
| 1379 | cast expressions (e.g. `(int)x`), even when those cast expressions are | 
|---|
| 1380 | subexpressions of an otherwise untyped expression. | 
|---|
| 1381 | It may also be fruitful to switch between the bottom-up and top-down | 
|---|
| 1382 | algorithms if the number of valid interpretations for a subexpression or valid | 
|---|
| 1383 | types for an argument exceeds some heuristic threshold, but finding such | 
|---|
| 1384 | a threshold (if any exists) will require experimental data. | 
|---|
| 1385 | This hybrid top-down/bottom-up search provides more opportunities for pruning | 
|---|
| 1386 | interpretations than either a bottom-up or top-down approach alone, and thus | 
|---|
| 1387 | may be more efficient than either. | 
|---|
| 1388 | A top-down-only approach, however, devolves to linear search through every | 
|---|
| 1389 | possible interpretation in the solution space in an untyped context, and is | 
|---|
| 1390 | thus likely to be inferior to a strictly bottom-up approach, though this | 
|---|
| 1391 | hypothesis needs to be empirically validated. | 
|---|
| 1392 |  | 
|---|
| 1393 | Another approach would be to abandon expression-tree ordering for | 
|---|
| 1394 | subexpression matching, and order by "most constrained symbol"; symbols would | 
|---|
| 1395 | be more constrained if there were fewer matching declarations, fewer | 
|---|
| 1396 | subexpressions yet to resolve, or possibly fewer possible types the expression | 
|---|
| 1397 | could resolve to. Ordering the expressions in a priority-queue by this metric | 
|---|
| 1398 | would not necessarily produce a top-down or a bottom-up order, but would add | 
|---|
| 1399 | opportunities for pruning based on memoized upper and lower bounds. | 
|---|
| 1400 |  | 
|---|
| 1401 | Both Baker and Cormack explicitly generate all possible interpretations of a | 
|---|
| 1402 | given expression; thinking of the set of interpretations of an expression as a | 
|---|
| 1403 | list sorted by cost, this is an eager evaluation of the list. | 
|---|
| 1404 | However, since we generally expect that user programmers will not often use | 
|---|
| 1405 | high-cost implicit conversions, one potentially effective way to prune the | 
|---|
| 1406 | search space would be to first find the minimal-cost interpretations of any | 
|---|
| 1407 | given subexpression, then to save the resolution progress of the | 
|---|
| 1408 | subexpressions and attempt to resolve the superexpression using only those | 
|---|
| 1409 | subexpression interpretations. | 
|---|
| 1410 | If no valid interpretation of the superexpression can be found, the resolver | 
|---|
| 1411 | would then repeatedly find the next-most-minimal cost interpretations of the | 
|---|
| 1412 | subexpressions and attempt to produce the minimal cost interpretation of the | 
|---|
| 1413 | superexpression. | 
|---|
| 1414 | This process would proceed until all possible subexpression interpretations | 
|---|
| 1415 | have been found and considered. | 
|---|
| 1416 |  | 
|---|
| 1417 | A middle ground between the eager and lazy approaches can be taken by | 
|---|
| 1418 | considering the lexical order on the cost tuple; essentially, every | 
|---|
| 1419 | interpretation in each of the classes below will be strictly cheaper than any | 
|---|
| 1420 | interpretation in the class after it, so if a min-cost valid interpretation | 
|---|
| 1421 | can be found while only generating interpretations in a given class, that | 
|---|
| 1422 | interpretation is guaranteed to be the best possible one: | 
|---|
| 1423 |  | 
|---|
| 1424 | 1. Interpretations without polymorphic functions or implicit conversions | 
|---|
| 1425 | 2. Interpretations without polymorphic functions using only safe conversions | 
|---|
| 1426 | 3. Interpretations using polymorphic functions without unsafe conversions | 
|---|
| 1427 | 4. Interpretations using unsafe conversions | 
|---|
| 1428 |  | 
|---|
| 1429 | In this lazy-eager approach, all the interpretations in one class would be | 
|---|
| 1430 | eagerly generated, while the interpretations in the next class would only be | 
|---|
| 1431 | considered if no match was found in the previous class. | 
|---|
| 1432 |  | 
|---|
| 1433 | Another source of efficiency would be to cache the best given interpretation | 
|---|
| 1434 | of a subexpression within an environment; this may not be incredibly useful | 
|---|
| 1435 | for explict parameters (though it may be useful for, e.g. `f( x, x )`, where | 
|---|
| 1436 | both parameters of `f` have the same type), but should pay some dividends for | 
|---|
| 1437 | the implicit assertion parameters, especially the otype parameters for the | 
|---|
| 1438 | argument of a generic type, which will generally be resolved in duplicate for | 
|---|
| 1439 | (at least) the assignment operator, constructor, copy constructor & destructor | 
|---|
| 1440 | of the generic type. | 
|---|
| 1441 |  | 
|---|
| 1442 | ## Appendix A: Partial and Total Orders ## | 
|---|
| 1443 | The `<=` relation on integers is a commonly known _total order_, and | 
|---|
| 1444 | intuitions based on how it works generally apply well to other total orders. | 
|---|
| 1445 | Formally, a total order is some binary relation `<=` over a set `S` such that | 
|---|
| 1446 | for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and | 
|---|
| 1447 | `b` must be equal, the _antisymmetry_ property); total orders also have a | 
|---|
| 1448 | _transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`. | 
|---|
| 1449 | If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`. | 
|---|
| 1450 |  | 
|---|
| 1451 | A _partial order_ is a generalization of this concept where the `<=` relation | 
|---|
| 1452 | is not required to be defined over all pairs of elements in `S` (though there | 
|---|
| 1453 | is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other | 
|---|
| 1454 | words, it is possible for two elements `a` and `b` of `S` to be | 
|---|
| 1455 | _incomparable_, unable to be ordered with respect to one another (any `a` and | 
|---|
| 1456 | `b` for which either `a <= b` or `b <= a` are called _comparable_). | 
|---|
| 1457 | Antisymmetry and transitivity are also required for a partial order, so all | 
|---|
| 1458 | total orders are also partial orders by definition. | 
|---|
| 1459 | One fairly natural partial order is the "subset of" relation over sets from | 
|---|
| 1460 | the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are | 
|---|
| 1461 | both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the | 
|---|
| 1462 | other - they are incomparable under this relation. | 
|---|
| 1463 |  | 
|---|
| 1464 | We can compose two (or more) partial orders to produce a new partial order on | 
|---|
| 1465 | tuples drawn from both (or all the) sets. | 
|---|
| 1466 | For example, given `a` and `c` from set `S` and `b` and `d` from set `R`, | 
|---|
| 1467 | where both `S` and `R` both have partial orders defined on them, we can define | 
|---|
| 1468 | a ordering relation between `(a, b)` and `(c, d)`. | 
|---|
| 1469 | One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff | 
|---|
| 1470 | `a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by | 
|---|
| 1471 | the first set and "breaking ties" by the second set. | 
|---|
| 1472 | Another common order is the _product order_, which can be roughly thought of | 
|---|
| 1473 | as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` | 
|---|
| 1474 | iff `a <= c` and `b <= d`. | 
|---|
| 1475 | One difference between the lexicographical order and the product order is that | 
|---|
| 1476 | in the lexicographical order if both `a` and `c` and `b` and `d` are | 
|---|
| 1477 | comparable then `(a, b)` and `(c, d)` will be comparable, while in the product | 
|---|
| 1478 | order you can have `a <= c` and `d <= b` (both comparable) which will make | 
|---|
| 1479 | `(a, b)` and `(c, d)` incomparable. | 
|---|
| 1480 | The product order, on the other hand, has the benefit of not prioritizing one | 
|---|
| 1481 | order over the other. | 
|---|
| 1482 |  | 
|---|
| 1483 | Any partial order has a natural representation as a directed acyclic graph | 
|---|
| 1484 | (DAG). | 
|---|
| 1485 | Each element `a` of the set becomes a node of the DAG, with an arc pointing to | 
|---|
| 1486 | its _covering_ elements, any element `b` such that `a < b` but where there is | 
|---|
| 1487 | no `c` such that `a < c` and `c < b`. | 
|---|
| 1488 | Intuitively, the covering elements are the "next ones larger", where you can't | 
|---|
| 1489 | fit another element between the two. | 
|---|
| 1490 | Under this construction, `a < b` is equivalent to "there is a path from `a` to | 
|---|
| 1491 | `b` in the DAG", and the lack of cycles in the directed graph is ensured by | 
|---|
| 1492 | the antisymmetry property of the partial order. | 
|---|
| 1493 |  | 
|---|
| 1494 | Partial orders can be generalized to _preorders_ by removing the antisymmetry | 
|---|
| 1495 | property. | 
|---|
| 1496 | In a preorder the relation is generally called `<~`, and it is possible for | 
|---|
| 1497 | two distict elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case | 
|---|
| 1498 | we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`. | 
|---|
| 1499 | Preorders may also be represented as directed graphs, but in this case the | 
|---|
| 1500 | graph may contain cycles. | 
|---|
| 1501 |  | 
|---|
| 1502 | ## Appendix B: Building a Conversion Graph from Un-annotated Single Steps ## | 
|---|
| 1503 | The short answer is that it's impossible. | 
|---|
| 1504 |  | 
|---|
| 1505 | The longer answer is that it has to do with what's essentially a diamond | 
|---|
| 1506 | inheritance problem. | 
|---|
| 1507 | In C, `int` converts to `unsigned int` and also `long` "safely"; both convert | 
|---|
| 1508 | to `unsigned long` safely, and it's possible to chain the conversions to | 
|---|
| 1509 | convert `int` to `unsigned long`. | 
|---|
| 1510 | There are two constraints here; one is that the `int` to `unsigned long` | 
|---|
| 1511 | conversion needs to cost more than the other two (because the types aren't as | 
|---|
| 1512 | "close" in a very intuitive fashion), and the other is that the system needs a | 
|---|
| 1513 | way to choose which path to take to get to the destination type. | 
|---|
| 1514 | Now, a fairly natural solution for this would be to just say "C knows how to | 
|---|
| 1515 | convert from `int` to `unsigned long`, so we just put in a direct conversion | 
|---|
| 1516 | and make the compiler smart enough to figure out the costs" - given that the | 
|---|
| 1517 | users can build an arbitrary graph of conversions, this needs to be handled | 
|---|
| 1518 | anyway. | 
|---|
| 1519 | We can define a preorder over the types by saying that `a <~ b` if there | 
|---|
| 1520 | exists a chain of conversions from `a` to `b`. | 
|---|
| 1521 | This preorder corresponds roughly to a more usual type-theoretic concept of | 
|---|
| 1522 | subtyping ("if I can convert `a` to `b`, `a` is a more specific type than | 
|---|
| 1523 | `b`"); however, since this graph is arbitrary, it may contain cycles, so if | 
|---|
| 1524 | there is also a path to convert `b` to `a` they are in some sense equivalently | 
|---|
| 1525 | specific. | 
|---|
| 1526 |  | 
|---|
| 1527 | Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and | 
|---|
| 1528 | `(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`) | 
|---|
| 1529 | and this conversion preorder over the destination types `xn` and `ym`. | 
|---|
| 1530 | We could define a preorder by taking chain length and breaking ties by the | 
|---|
| 1531 | conversion preorder, but this would lead to unexpected behaviour when closing | 
|---|
| 1532 | diamonds with an arm length of longer than 1. | 
|---|
| 1533 | Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`, | 
|---|
| 1534 | `B2->C`, and `A->C`. | 
|---|
| 1535 | If we are comparing conversions from `A` to both `B2` and `C`, we expect the | 
|---|
| 1536 | conversion to `B2` to be chosen because it's the more specific type under the | 
|---|
| 1537 | conversion preorder, but since its chain length is longer than the conversion | 
|---|
| 1538 | to `C`, it loses and `C` is chosen. | 
|---|
| 1539 | However, taking the conversion preorder and breaking ties or ambiguities by | 
|---|
| 1540 | chain length also doesn't work, because of cases like the following example | 
|---|
| 1541 | where the transitivity property is broken and we can't find a global maximum: | 
|---|
| 1542 |  | 
|---|
| 1543 | `X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W` | 
|---|
| 1544 |  | 
|---|
| 1545 | In this set of arcs, if we're comparing conversions from `X` to each of `Y2`, | 
|---|
| 1546 | `Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because | 
|---|
| 1547 | there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain | 
|---|
| 1548 | length. | 
|---|
| 1549 | Also, comparing conversions from `X` to `Z3` and to `W`, we find that the | 
|---|
| 1550 | conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder, | 
|---|
| 1551 | and so is considered to be the nearer type. | 
|---|
| 1552 | By transitivity, then, the conversion from `X` to `Y2` should be cheaper than | 
|---|
| 1553 | the conversion from `X` to `W`, but in this case the `X` and `W` are | 
|---|
| 1554 | incomparable by the conversion preorder, so the tie is broken by the shorter | 
|---|
| 1555 | path from `X` to `W` in favour of `W`, contradicting the transitivity property | 
|---|
| 1556 | for this proposed order. | 
|---|
| 1557 |  | 
|---|
| 1558 | Without transitivity, we would need to compare all pairs of conversions, which | 
|---|
| 1559 | would be expensive, and possibly not yield a minimal-cost conversion even if | 
|---|
| 1560 | all pairs were comparable. | 
|---|
| 1561 | In short, this ordering is infeasible, and by extension I believe any ordering | 
|---|
| 1562 | composed solely of single-step conversions between types with no further | 
|---|
| 1563 | user-supplied information will be insufficiently powerful to express the | 
|---|
| 1564 | built-in conversions between C's types. | 
|---|
| 1565 |  | 
|---|
| 1566 | ## Appendix C: Proposed Prelude Changes ## | 
|---|
| 1567 | **TODO** Port Glen's "Future Work" page for builtin C conversions. | 
|---|
| 1568 | **TODO** Move discussion of zero_t, unit_t here. | 
|---|
| 1569 |  | 
|---|
| 1570 | It may be desirable to have some polymorphic wrapper functions in the prelude | 
|---|
| 1571 | which provide consistent default implementations of various operators given a | 
|---|
| 1572 | definition of one of them. | 
|---|
| 1573 | Naturally, users could still provide a monomorphic overload if they wished to | 
|---|
| 1574 | make their own code more efficient than the polymorphic wrapper could be, but | 
|---|
| 1575 | this would minimize user effort in the common case where the user cannot write | 
|---|
| 1576 | a more efficient function, or is willing to trade some runtime efficiency for | 
|---|
| 1577 | developer time. | 
|---|
| 1578 | As an example, consider the following polymorphic defaults for `+` and `+=`: | 
|---|
| 1579 |  | 
|---|
| 1580 | forall(otype T | { T ?+?(T, T); }) | 
|---|
| 1581 | lvalue T ?+=? (T *a, T b) { | 
|---|
| 1582 | return *a = *a + b; | 
|---|
| 1583 | } | 
|---|
| 1584 |  | 
|---|
| 1585 | forall(otype T | { lvalue T ?+=? (T*, T) }) | 
|---|
| 1586 | T ?+? (T a, T b) { | 
|---|
| 1587 | T tmp = a; | 
|---|
| 1588 | return tmp += b; | 
|---|
| 1589 | } | 
|---|
| 1590 |  | 
|---|
| 1591 | Both of these have a possibly unneccessary copy (the first in copying the | 
|---|
| 1592 | result of `*a + b` back into `*a`, the second copying `a` into `tmp`), but in | 
|---|
| 1593 | cases where these copies are unavoidable the polymorphic wrappers should be | 
|---|
| 1594 | just as performant as the monomorphic equivalents (assuming a compiler | 
|---|
| 1595 | sufficiently clever to inline the extra function call), and allow programmers | 
|---|
| 1596 | to define a set of related operators with maximal concision. | 
|---|
| 1597 |  | 
|---|
| 1598 | **TODO** Look at what Glen and Richard have already written for this. | 
|---|
| 1599 |  | 
|---|
| 1600 | ## Appendix D: Feasibility of Making void* Conversions Explicit ## | 
|---|
| 1601 | C allows implicit conversions between `void*` and other pointer types, as per | 
|---|
| 1602 | section 6.3.2.3.1 of the standard. | 
|---|
| 1603 | Making these implicit conversions explicit in Cforall would provide | 
|---|
| 1604 | significant type-safety benefits, and is precedented in C++. | 
|---|
| 1605 | A weaker version of this proposal would be to allow implicit conversions to | 
|---|
| 1606 | `void*` (as a sort of "top type" for all pointer types), but to make the | 
|---|
| 1607 | unsafe conversion from `void*` back to a concrete pointer type an explicit | 
|---|
| 1608 | conversion. | 
|---|
| 1609 | However, `int *p = malloc( sizeof(int) );` and friends are hugely common | 
|---|
| 1610 | in C code, and rely on the unsafe implicit conversion from the `void*` return | 
|---|
| 1611 | type of `malloc` to the `int*` type of the variable - obviously it would be | 
|---|
| 1612 | too much of a source-compatibility break to disallow this for C code. | 
|---|
| 1613 | We do already need to wrap C code in an `extern "C"` block, though, so it is | 
|---|
| 1614 | technically feasible to make the `void*` conversions implicit in C but | 
|---|
| 1615 | explicit in Cforall. | 
|---|
| 1616 | Also, for calling C code with `void*`-based APIs, pointers-to-dtype are | 
|---|
| 1617 | calling-convention compatible with `void*`; we could read `void*` in function | 
|---|
| 1618 | signatures as essentially a fresh dtype type variable, e.g: | 
|---|
| 1619 |  | 
|---|
| 1620 | void* malloc( size_t ) | 
|---|
| 1621 | => forall(dtype T0) T0* malloc( size_t ) | 
|---|
| 1622 | void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) ) | 
|---|
| 1623 | => forall(dtype T0, dtype T1, dtype T2) | 
|---|
| 1624 | void qsort( T0*, size_t, size_t, int (*)( const T1*, const T2* ) ) | 
|---|
| 1625 |  | 
|---|
| 1626 | This would even allow us to leverage some of Cforall's type safety to write | 
|---|
| 1627 | better declarations for legacy C API functions, like the following wrapper for | 
|---|
| 1628 | `qsort`: | 
|---|
| 1629 |  | 
|---|
| 1630 | extern "C" { // turns off name-mangling so that this calls the C library | 
|---|
| 1631 | // call-compatible type-safe qsort signature | 
|---|
| 1632 | forall(dtype T) | 
|---|
| 1633 | void qsort( T*, size_t, size_t, int (*)( const T*, const T* ) ); | 
|---|
| 1634 |  | 
|---|
| 1635 | // forbid type-unsafe C signature from resolving | 
|---|
| 1636 | void qsort( void*, size_t, size_t, int (*)( const void*, const void* ) ) | 
|---|
| 1637 | = delete; | 
|---|
| 1638 | } | 
|---|
| 1639 |  | 
|---|
| 1640 | ## Appendix E: Features to Add in Resolver Re-write ## | 
|---|
| 1641 | * Reference types | 
|---|
| 1642 | * Special types for 0 and 1 literals | 
|---|
| 1643 | * Expression type for return statement that resolves similarly to ?=? | 
|---|
| 1644 | - This is to get rid of the kludge in the box pass that effectively | 
|---|
| 1645 | re-implements the resolver poorly. | 
|---|