| [41634098] | 1 | ## User-defined Conversions ##
 | 
|---|
 | 2 | C's implicit "usual arithmetic conversions" define a structure among the 
 | 
|---|
 | 3 | built-in types consisting of _unsafe_ narrowing conversions and a hierarchy of 
 | 
|---|
 | 4 | _safe_ widening conversions. 
 | 
|---|
 | 5 | There is also a set of _explicit_ conversions that are only allowed through a 
 | 
|---|
 | 6 | cast expression.
 | 
|---|
| [6eb131c] | 7 | I propose that safe, unsafe, and explicit (cast) conversions be expressed as 
 | 
|---|
 | 8 | constructor variants. 
 | 
|---|
| [41634098] | 9 | Throughout this article, I will use the following operator names for 
 | 
|---|
 | 10 | constructors and conversion functions from `From` to `To`:
 | 
|---|
 | 11 | 
 | 
|---|
| [6eb131c] | 12 |         void ?{} ( To&, To );            // copy constructor
 | 
|---|
 | 13 |         void ?{} ( To&, From );          // explicit constructor
 | 
|---|
 | 14 |         void ?{explicit} ( To&, From );  // explicit cast conversion
 | 
|---|
 | 15 |         void ?{safe} ( To&, From );      // implicit safe conversion
 | 
|---|
 | 16 |         void ?{unsafe} ( To&, From );    // implicit unsafe conversion
 | 
|---|
| [41634098] | 17 | 
 | 
|---|
| [6eb131c] | 18 | It has been suggested that all constructors would define unsafe implicit 
 | 
|---|
| [41634098] | 19 | conversions; this is elegant, but interacts poorly with tuples. 
 | 
|---|
 | 20 | Essentially, without making this distinction, a constructor like the following 
 | 
|---|
 | 21 | would add an interpretation of any two `int`s as a `Coord`, needlessly 
 | 
|---|
 | 22 | multiplying the space of possible interpretations of all functions:
 | 
|---|
 | 23 | 
 | 
|---|
| [6eb131c] | 24 |         void ?{}( Coord& this, int x, int y );
 | 
|---|
| [41634098] | 25 | 
 | 
|---|
 | 26 | That said, it would certainly be possible to make a multiple-argument implicit 
 | 
|---|
 | 27 | conversion, as below, though the argument above suggests this option should be 
 | 
|---|
 | 28 | used infrequently:
 | 
|---|
 | 29 | 
 | 
|---|
| [6eb131c] | 30 |         void ?{unsafe}( Coord& this, int x, int y );
 | 
|---|
| [41634098] | 31 | 
 | 
|---|
 | 32 | An alternate possibility would be to only count two-arg constructors 
 | 
|---|
| [6eb131c] | 33 | `void ?{} ( To&, From )` as unsafe conversions; under this semantics, safe and 
 | 
|---|
| [41634098] | 34 | explicit conversions should also have a compiler-enforced restriction to 
 | 
|---|
 | 35 | ensure that they are two-arg functions (this restriction may be valuable 
 | 
|---|
 | 36 | regardless).
 | 
|---|
 | 37 | 
 | 
|---|
 | 38 | Regardless of syntax, there should be a type assertion that expresses `From` 
 | 
|---|
 | 39 | is convertable to `To`. 
 | 
|---|
 | 40 | If user-defined conversions are not added to the language, 
 | 
|---|
| [6eb131c] | 41 | `void ?{} ( To&, From )` may be a suitable representation, relying on 
 | 
|---|
| [41634098] | 42 | conversions on the argument types to account for transitivity. 
 | 
|---|
| [6eb131c] | 43 | Since `To&` should be an exact match on `To`, this should put all the implicit 
 | 
|---|
 | 44 | conversions on the RHS.
 | 
|---|
 | 45 | On the other hand, under some models (like [1]), implicit conversions are not 
 | 
|---|
 | 46 | allowed in assertion parameters, so another assertion syntax specific to 
 | 
|---|
 | 47 | conversions may be required, e.g. `From -> To`. 
 | 
|---|
 | 48 | It has also been suggested that, for programmer control, no implicit 
 | 
|---|
 | 49 | conversions (except, possibly, for polymorphic specialization) should be 
 | 
|---|
 | 50 | allowed in resolution of cast operators.
 | 
|---|
 | 51 | 
 | 
|---|
 | 52 | [1] ../working/assertion_resolution.md
 | 
|---|
| [41634098] | 53 | 
 | 
|---|
 | 54 | ### Constructor Idiom ###
 | 
|---|
 | 55 | Basing our notion of conversions off otherwise normal Cforall functions means 
 | 
|---|
 | 56 | that we can use the full range of Cforall features for conversions, including 
 | 
|---|
 | 57 | polymorphism.
 | 
|---|
| [6eb131c] | 58 | In an earlier version of this proposal, Glen Ditchfield defines a 
 | 
|---|
 | 59 | _constructor idiom_ that can be used to create chains of safe conversions 
 | 
|---|
 | 60 | without duplicating code; given a type `Safe` which members of another type 
 | 
|---|
 | 61 | `From` can be directly converted to, the constructor idiom allows us to write 
 | 
|---|
 | 62 | a conversion for any type `To` which `Safe` converts to: 
 | 
|---|
 | 63 | 
 | 
|---|
 | 64 |         forall(otype To | { void ?{safe}( To&, Safe ) })
 | 
|---|
 | 65 |         void ?{safe}( To& this, From that ) {
 | 
|---|
| [41634098] | 66 |                 Safe tmp = /* some expression involving that */;
 | 
|---|
| [6eb131c] | 67 |                 this{ tmp }; // initialize from assertion parameter
 | 
|---|
| [41634098] | 68 |         }
 | 
|---|
 | 69 | 
 | 
|---|
 | 70 | This idiom can also be used with only minor variations for a parallel set of 
 | 
|---|
 | 71 | unsafe conversions.
 | 
|---|
 | 72 | 
 | 
|---|
| [6eb131c] | 73 | Glen's original suggestion said the copy constructor for `To` should also be 
 | 
|---|
 | 74 | accepted as a resolution for `void ?{safe}( To&, Safe )` (`Safe` == `To`), 
 | 
|---|
 | 75 | allowing this same code to be used for the single-step conversion as well. 
 | 
|---|
 | 76 | This proposal does come at the cost of an extra copy initialization of the 
 | 
|---|
 | 77 | target value, though.
 | 
|---|
 | 78 | 
 | 
|---|
 | 79 | Contrariwise, if a monomorphic conversion from `From` to `Safe` is written, 
 | 
|---|
 | 80 | e.g:
 | 
|---|
 | 81 | 
 | 
|---|
 | 82 |         void ?{safe}( Safe& this, From that ) {
 | 
|---|
 | 83 |                 this{ /* some parameters involving that */ };
 | 
|---|
 | 84 |         }
 | 
|---|
 | 85 | 
 | 
|---|
 | 86 | Then the code for a transitive conversion from `From` to any `To` type 
 | 
|---|
 | 87 | convertable from `Safe` is written:
 | 
|---|
 | 88 | 
 | 
|---|
 | 89 |         forall(otype To | { void ?{safe}( To&, Safe ) })
 | 
|---|
 | 90 |         void ?{safe}( To& this, From that ) {
 | 
|---|
 | 91 |                 Safe tmp = that;  // uses monomorphic conversion
 | 
|---|
 | 92 |                 this{ tmp };      // initialize from assertion parameter
 | 
|---|
 | 93 |         }
 | 
|---|
 | 94 | 
 | 
|---|
 | 95 | Given the entirely-boilerplate nature of this code, but negative performance 
 | 
|---|
 | 96 | implications of the unmodified constructor idiom, it might be fruitful to have 
 | 
|---|
 | 97 | transitive and single step conversion operators, and let CFA build the 
 | 
|---|
 | 98 | transitive conversions; some possible names:
 | 
|---|
 | 99 | 
 | 
|---|
 | 100 |         void ?{safe}  (To&, From);    void ?{final safe} (To&, From);  // single-step
 | 
|---|
 | 101 |         void ?{safe*} (To&, From);    void ?{safe}       (To&, From);  // transitive
 | 
|---|
 | 102 | 
 | 
|---|
| [41634098] | 103 | What selective non-use of the constructor idiom gives us is the ability to 
 | 
|---|
 | 104 | define a conversion that may only be the *last* conversion in a chain of such. 
 | 
|---|
| [6eb131c] | 105 | One use for this is to solve the problem that `explicit` conversions were 
 | 
|---|
 | 106 | added to C++ for, that of conversions to `bool` chaining to become conversions 
 | 
|---|
 | 107 | to any arithmetic type.
 | 
|---|
 | 108 | Another use is to unambiguously represent the full hierarchy of implicit 
 | 
|---|
 | 109 | conversions in C by making sign conversions non-transitive, allowing the 
 | 
|---|
 | 110 | compiler to resolve e.g. `int -> unsigned long` as 
 | 
|---|
 | 111 | `int -> long -> unsigned long` over `int -> unsigned int -> unsigned long`. 
 | 
|---|
 | 112 | See [2] for more details.
 | 
|---|
 | 113 | 
 | 
|---|
 | 114 | [2] ../working/glen_conversions/index.html#usual
 | 
|---|
| [41634098] | 115 | 
 | 
|---|
 | 116 | ### Appendix A: Partial and Total Orders ###
 | 
|---|
 | 117 | The `<=` relation on integers is a commonly known _total order_, and 
 | 
|---|
 | 118 | intuitions based on how it works generally apply well to other total orders. 
 | 
|---|
 | 119 | Formally, a total order is some binary relation `<=` over a set `S` such that 
 | 
|---|
 | 120 | for any two members `a` and `b` of `S`, `a <= b` or `b <= a` (if both, `a` and 
 | 
|---|
 | 121 | `b` must be equal, the _antisymmetry_ property); total orders also have a 
 | 
|---|
 | 122 | _transitivity_ property, that if `a <= b` and `b <= c`, then `a <= c`. 
 | 
|---|
 | 123 | If `a` and `b` are distinct elements and `a <= b`, we may write `a < b`.
 | 
|---|
 | 124 |  
 | 
|---|
 | 125 | A _partial order_ is a generalization of this concept where the `<=` relation 
 | 
|---|
 | 126 | is not required to be defined over all pairs of elements in `S` (though there 
 | 
|---|
 | 127 | is a _reflexivity_ requirement that for all `a` in `S`, `a <= a`); in other 
 | 
|---|
 | 128 | words, it is possible for two elements `a` and `b` of `S` to be 
 | 
|---|
 | 129 | _incomparable_, unable to be ordered with respect to one another (any `a` and 
 | 
|---|
 | 130 | `b` for which either `a <= b` or `b <= a` are called _comparable_). 
 | 
|---|
 | 131 | Antisymmetry and transitivity are also required for a partial order, so all 
 | 
|---|
 | 132 | total orders are also partial orders by definition. 
 | 
|---|
 | 133 | One fairly natural partial order is the "subset of" relation over sets from 
 | 
|---|
 | 134 | the same universe; `{ }` is a subset of both `{ 1 }` and `{ 2 }`, which are 
 | 
|---|
 | 135 | both subsets of `{ 1, 2 }`, but neither `{ 1 }` nor `{ 2 }` is a subset of the 
 | 
|---|
 | 136 | other - they are incomparable under this relation. 
 | 
|---|
 | 137 | 
 | 
|---|
 | 138 | We can compose two (or more) partial orders to produce a new partial order on 
 | 
|---|
 | 139 | tuples drawn from both (or all the) sets. 
 | 
|---|
 | 140 | For example, given `a` and `c` from set `S` and `b` and `d` from set `R`, 
 | 
|---|
 | 141 | where both `S` and `R` both have partial orders defined on them, we can define 
 | 
|---|
 | 142 | a ordering relation between `(a, b)` and `(c, d)`. 
 | 
|---|
 | 143 | One common order is the _lexicographical order_, where `(a, b) <= (c, d)` iff 
 | 
|---|
 | 144 | `a < c` or both `a = c` and `b <= d`; this can be thought of as ordering by 
 | 
|---|
 | 145 | the first set and "breaking ties" by the second set. 
 | 
|---|
 | 146 | Another common order is the _product order_, which can be roughly thought of 
 | 
|---|
 | 147 | as "all the components are ordered the same way"; formally `(a, b) <= (c, d)` 
 | 
|---|
 | 148 | iff `a <= c` and `b <= d`. 
 | 
|---|
 | 149 | One difference between the lexicographical order and the product order is that 
 | 
|---|
 | 150 | in the lexicographical order if both `a` and `c` and `b` and `d` are 
 | 
|---|
 | 151 | comparable then `(a, b)` and `(c, d)` will be comparable, while in the product 
 | 
|---|
 | 152 | order you can have `a <= c` and `d <= b` (both comparable) which will make 
 | 
|---|
 | 153 | `(a, b)` and `(c, d)` incomparable. 
 | 
|---|
 | 154 | The product order, on the other hand, has the benefit of not prioritizing one 
 | 
|---|
 | 155 | order over the other.
 | 
|---|
 | 156 | 
 | 
|---|
 | 157 | Any partial order has a natural representation as a directed acyclic graph 
 | 
|---|
 | 158 | (DAG). 
 | 
|---|
 | 159 | Each element `a` of the set becomes a node of the DAG, with an arc pointing to 
 | 
|---|
 | 160 | its _covering_ elements, any element `b` such that `a < b` but where there is 
 | 
|---|
 | 161 | no `c` such that `a < c` and `c < b`. 
 | 
|---|
 | 162 | Intuitively, the covering elements are the "next ones larger", where you can't 
 | 
|---|
 | 163 | fit another element between the two. 
 | 
|---|
 | 164 | Under this construction, `a < b` is equivalent to "there is a path from `a` to 
 | 
|---|
 | 165 | `b` in the DAG", and the lack of cycles in the directed graph is ensured by 
 | 
|---|
 | 166 | the antisymmetry property of the partial order.
 | 
|---|
 | 167 | 
 | 
|---|
 | 168 | Partial orders can be generalized to _preorders_ by removing the antisymmetry 
 | 
|---|
 | 169 | property. 
 | 
|---|
 | 170 | In a preorder the relation is generally called `<~`, and it is possible for 
 | 
|---|
 | 171 | two distict elements `a` and `b` to have `a <~ b` and `b <~ a` - in this case 
 | 
|---|
 | 172 | we write `a ~ b`; `a <~ b` and not `a ~ b` is written `a < b`. 
 | 
|---|
 | 173 | Preorders may also be represented as directed graphs, but in this case the 
 | 
|---|
 | 174 | graph may contain cycles.
 | 
|---|
 | 175 | 
 | 
|---|
 | 176 | ### Appendix B: Building a Conversion Graph from Un-annotated Single Steps ###
 | 
|---|
 | 177 | The short answer is that it's impossible.
 | 
|---|
 | 178 | 
 | 
|---|
 | 179 | The longer answer is that it has to do with what's essentially a diamond 
 | 
|---|
 | 180 | inheritance problem. 
 | 
|---|
 | 181 | In C, `int` converts to `unsigned int` and also `long` "safely"; both convert 
 | 
|---|
 | 182 | to `unsigned long` safely, and it's possible to chain the conversions to 
 | 
|---|
 | 183 | convert `int` to `unsigned long`. 
 | 
|---|
 | 184 | There are two constraints here; one is that the `int` to `unsigned long` 
 | 
|---|
 | 185 | conversion needs to cost more than the other two (because the types aren't as 
 | 
|---|
 | 186 | "close" in a very intuitive fashion), and the other is that the system needs a 
 | 
|---|
 | 187 | way to choose which path to take to get to the destination type. 
 | 
|---|
 | 188 | Now, a fairly natural solution for this would be to just say "C knows how to 
 | 
|---|
 | 189 | convert from `int` to `unsigned long`, so we just put in a direct conversion 
 | 
|---|
 | 190 | and make the compiler smart enough to figure out the costs" - this is the 
 | 
|---|
| [6eb131c] | 191 | approach taken by the existing compiler, but given that in a user-defined 
 | 
|---|
| [41634098] | 192 | conversion proposal the users can build an arbitrary graph of conversions, 
 | 
|---|
 | 193 | this case still needs to be handled. 
 | 
|---|
 | 194 | 
 | 
|---|
 | 195 | We can define a preorder over the types by saying that `a <~ b` if there 
 | 
|---|
 | 196 | exists a chain of conversions from `a` to `b` (see Appendix A for description 
 | 
|---|
 | 197 | of preorders and related constructs). 
 | 
|---|
| [6eb131c] | 198 | This preorder roughly corresponds to a more usual type-theoretic concept of 
 | 
|---|
| [41634098] | 199 | subtyping ("if I can convert `a` to `b`, `a` is a more specific type than 
 | 
|---|
 | 200 | `b`"); however, since this graph is arbitrary, it may contain cycles, so if 
 | 
|---|
 | 201 | there is also a path to convert `b` to `a` they are in some sense equivalently 
 | 
|---|
 | 202 | specific. 
 | 
|---|
 | 203 | 
 | 
|---|
 | 204 | Now, to compare the cost of two conversion chains `(s, x1, x2, ... xn)` and 
 | 
|---|
 | 205 | `(s, y1, y2, ... ym)`, we have both the length of the chains (`n` versus `m`) 
 | 
|---|
 | 206 | and this conversion preorder over the destination types `xn` and `ym`. 
 | 
|---|
 | 207 | We could define a preorder by taking chain length and breaking ties by the 
 | 
|---|
 | 208 | conversion preorder, but this would lead to unexpected behaviour when closing 
 | 
|---|
 | 209 | diamonds with an arm length of longer than 1. 
 | 
|---|
 | 210 | Consider a set of types `A`, `B1`, `B2`, `C` with the arcs `A->B1`, `B1->B2`, 
 | 
|---|
 | 211 | `B2->C`, and `A->C`. 
 | 
|---|
 | 212 | If we are comparing conversions from `A` to both `B2` and `C`, we expect the 
 | 
|---|
 | 213 | conversion to `B2` to be chosen because it's the more specific type under the 
 | 
|---|
 | 214 | conversion preorder, but since its chain length is longer than the conversion 
 | 
|---|
 | 215 | to `C`, it loses and `C` is chosen. 
 | 
|---|
 | 216 | However, taking the conversion preorder and breaking ties or ambiguities by 
 | 
|---|
 | 217 | chain length also doesn't work, because of cases like the following example 
 | 
|---|
 | 218 | where the transitivity property is broken and we can't find a global maximum: 
 | 
|---|
 | 219 | 
 | 
|---|
 | 220 |         `X->Y1->Y2`, `X->Z1->Z2->Z3->W`, `X->W`
 | 
|---|
 | 221 | 
 | 
|---|
 | 222 | In this set of arcs, if we're comparing conversions from `X` to each of `Y2`, 
 | 
|---|
 | 223 | `Z3` and `W`, converting to `Y2` is cheaper than converting to `Z3`, because 
 | 
|---|
 | 224 | there are no conversions between `Y2` and `Z3`, and `Y2` has the shorter chain 
 | 
|---|
 | 225 | length. 
 | 
|---|
 | 226 | Also, comparing conversions from `X` to `Z3` and to `W`, we find that the 
 | 
|---|
 | 227 | conversion to `Z3` is cheaper, because `Z3 < W` by the conversion preorder, 
 | 
|---|
 | 228 | and so is considered to be the nearer type. 
 | 
|---|
 | 229 | By transitivity, then, the conversion from `X` to `Y2` should be cheaper than 
 | 
|---|
| [6eb131c] | 230 | the conversion from `X` to `W`, but in this case the `Y2` and `W` are 
 | 
|---|
| [41634098] | 231 | incomparable by the conversion preorder, so the tie is broken by the shorter 
 | 
|---|
 | 232 | path from `X` to `W` in favour of `W`, contradicting the transitivity property 
 | 
|---|
 | 233 | for this proposed order.
 | 
|---|
 | 234 | 
 | 
|---|
 | 235 | Without transitivity, we would need to compare all pairs of conversions, which 
 | 
|---|
 | 236 | would be expensive, and possibly not yield a minimal-cost conversion even if 
 | 
|---|
 | 237 | all pairs were comparable. 
 | 
|---|
 | 238 | In short, this ordering is infeasible, and by extension I believe any ordering 
 | 
|---|
 | 239 | composed solely of single-step conversions between types with no further 
 | 
|---|
 | 240 | user-supplied information will be insufficiently powerful to express the 
 | 
|---|
 | 241 | built-in conversions between C's types.
 | 
|---|