Conversions for Cforall

NOTE: This proposal for constructors and user-defined conversions does not represent the current state of Cforall language development, but is maintained for its possible utility in building user-defined conversions. See doc/proposals/user_conversions.md for a more current presentation of these ideas.

This is the first draft of a description of a possible extension to the current definition of Cforall ("Cforall-as-is") that would let programmers fit new types into Cforall's system of conversions.

Design Notes
Proposed Extension
Conversion Composition
Constructors and Cost
Heap Allocation
Generic Conversions and Constructors
C's "Usual Arithmetic Conversions"
Other Promotions
Other Pre-Defined Implicit Conversions
Pre-Defined Explicit Conversions
Non-Conversions
Assignment Operators
Overload Resolution
Final Notes

Design Notes

Goals

My design goal for this extension is to provide a framework that explains the bulk of C's conversion semantics in terms of more basic language features, just as Cforall explains most expression semantics in terms of overloaded function calls.

My pragmatic goal is to allow a programmer to define a portable rational number data type, fit it into the existing C type system, and use it in mixed-mode arithmetic expressions, all in a convenient and esthetically pleasing manner.

Conversions

A conversion creates a value from a value of a different type. C defines a large number of conversions, especially between arithmetic types. A subset of these can be performed implicitly. Implicit conversions occur in certain contexts: in assignment expressions, when passing arguments to functions (where parameters are "assigned the value of the corresponding argument"), in initialized declarations (where "the same type constraints and conversions as for simple assignment apply"), and in mixed-mode arithmetic. All conversions can be performed explicitly by cast expressions.

C prefers some implicit conversions, the promotions, to the others. The promotions are ranked among themselves, creating a hierarchy of types. In mixed-mode operations, the "usual arithmetic conversions" promote the operands to what amounts to their least common supertype. Cforall-as-is uses a slightly larger set of promotions to choose the smallest possible promotion when resolving overloading.
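The promotion hierarchy described above can be observed from plain C. The following is a minimal sketch in C11 (not Cforall), assuming a compiler that supports _Generic; the enum type_id, the macro TYPE_ID and the function names are hypothetical helpers introduced only for this illustration.

```c
/* Hypothetical tags for a handful of arithmetic types. */
enum type_id { T_INT, T_UINT, T_LONG, T_ULONG, T_FLOAT, T_DOUBLE, T_OTHER };

/* Reports the type of an expression after C has applied its conversions.
   _Generic inspects the type only; the expression is not evaluated. */
#define TYPE_ID(e) _Generic((e),   \
    int:           T_INT,          \
    unsigned int:  T_UINT,         \
    long:          T_LONG,         \
    unsigned long: T_ULONG,        \
    float:         T_FLOAT,        \
    double:        T_DOUBLE,       \
    default:       T_OTHER)

/* Each function reports the common type that the usual arithmetic
   conversions choose for one mixed-mode addition. */
enum type_id int_plus_unsigned(void) { return TYPE_ID(1 + 1u); }
enum type_id int_plus_long(void)     { return TYPE_ID(1 + 1L); }
enum type_id long_plus_ulong(void)   { return TYPE_ID(1L + 1UL); }
enum type_id int_plus_float(void)    { return TYPE_ID(1 + 1.0f); }
enum type_id float_plus_double(void) { return TYPE_ID(1.0f + 1.0); }
```

In each case the result type is the higher of the two operand types in the hierarchy, for example unsigned int for int plus unsigned int. This is the behaviour that any set of pre-defined conversion functions would have to reproduce.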

An extension should allow Cforall to explain C's conversions as a set of pre-defined functions, including its explicit conversions, implicit conversions, and preferences among conversions. The extension must let the programmer define new conversions for programmer-defined types, for instance so that new arithmetic types can be used conveniently in mixed-mode arithmetic.

Constructors

C++ introduced constructors to the C language family, and I will use its terminology. A constructor is a function that initializes an object. C does not have constructors; instead, it makes do with initialization, which works like assignment. Cforall-as-is does not have constructors, either: instead, by analogy with C's semantics, a programmer-defined assignment function may be called during initialization. However, there is a key difference between a function that implements assignment and a constructor: constructors assume that the object is uninitialized, and must set up any data structure invariants that the object is supposed to obey. An assignment function assumes that the target object obeys its invariants.

A default constructor has no parameters other than the object it initializes. It establishes invariants, but need not do anything else. A default constructor for a rational number type might set the denominator to be non-zero, but leave the numerator undefined.

A copy constructor has two parameters: the object it initializes, and a value of the same type. Its purpose is to copy the value into the object, and so it is very similar to an assignment function. In fact, it could be expressed as a call to a default constructor followed by an assignment.
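The difference between construction and assignment can be sketched in plain C. This is an illustration only, not Cforall syntax: the Rational layout and the function names rational_create, rational_assign and rational_copy_create are assumptions made for the example.

```c
#include <assert.h>

/* Assumed representation of a rational number. */
typedef struct { int numerator; unsigned denominator; } Rational;

/* Default constructor: assumes nothing about *r and establishes the
   invariant (denominator != 0). The numerator is deliberately left unset. */
void rational_create(Rational *r) {
    r->denominator = 1u;
}

/* Assignment: relies on the target already obeying the invariant. */
void rational_assign(Rational *target, Rational source) {
    assert(target->denominator != 0u);  /* target must have been constructed */
    *target = source;
}

/* Copy constructor expressed as default construction followed by assignment. */
void rational_copy_create(Rational *r, Rational source) {
    rational_create(r);
    rational_assign(r, source);
}
```

Calling rational_assign on storage that was never constructed would break the assumption checked by the assert, which is exactly the case a constructor exists to handle.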

A converting constructor also has two parameters, but the second parameter is a value of some type different from the type of the object it initializes. Its purpose is to convert the value to the object's type before copying it, and so it is very similar to a C assignment operation that performs an implicit conversion.

C++ sensibly defines parameter passing as call by initialization, since the parameter is uninitialized when the argument value is placed in it. Extended Cforall should do the same. However, parameter passing is one of the main places where implicit conversions occur. Hence in extended Cforall constructors define the implicit conversions. Cforall should also encourage programmers to maintain the similarity between constructors and assignment.

Ambiguity

In extended Cforall, programmer-defined conversions should fit in with the predefined conversions. For instance, programmer-defined promotions should interact with the normal promotions so that programmer-defined types can take part in mixed-mode arithmetic expressions. The first design that springs to mind is to define a minimal set of conversions between neighbouring types in the type hierarchy, and to have Cforall create conversions between more distant types by composition of predefined and programmer-defined conversions. Unfortunately, if one draws a graph of C's promotions, with C's types as vertices and C's promotions as edges, the result is a directed acyclic graph, not a tree. This means that an attempt to build the full set of promotions by composition of a minimal set of promotions will fail.

Consider a simple processor with 32-bit int and long types. On such a machine, C's "usual arithmetic conversions" dictate that mixed-mode arithmetic that combines a signed integer with an unsigned integer must promote the signed integer to an unsigned type. Here is a directed graph showing some of the minimal set of promotions. Each of the four promotions is necessary, because each could be required by some mixed-mode expression, and none can be decomposed into simpler conversions.

long --------> unsigned long
 ^               ^
 |               |
int ---------> unsigned int

Now imagine attempting to compose an int-to-unsigned long conversion from the minimal set: there are two paths through the graph, so the composition is ambiguous.
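The ambiguity can be made concrete by counting paths. The sketch below, in plain C, encodes the four promotions from the diagram as an adjacency matrix and counts the chains of promotions between two types. The enum names and count_paths are hypothetical and exist only for this illustration.

```c
/* The four types from the diagram. */
enum { TY_INT, TY_UINT, TY_LONG, TY_ULONG, TY_COUNT };

/* edge[from][to] is 1 when the minimal set contains that promotion. */
static const int edge[TY_COUNT][TY_COUNT] = {
    [TY_INT]  = { [TY_UINT]  = 1, [TY_LONG] = 1 },
    [TY_UINT] = { [TY_ULONG] = 1 },
    [TY_LONG] = { [TY_ULONG] = 1 },
};

/* Counts the distinct chains of promotions leading from one type to
   another. The graph is acyclic, so the recursion terminates. */
int count_paths(int from, int to) {
    if (from == to) return 1;
    int n = 0;
    for (int next = 0; next < TY_COUNT; next++)
        if (edge[from][next]) n += count_paths(next, to);
    return n;
}
```

Under these assumptions count_paths(TY_INT, TY_ULONG) finds two chains, one through unsigned int and one through long, while count_paths(TY_INT, TY_LONG) finds only one.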

(In C, and in Cforall-as-is, no ambiguity exists: there is just one int-to-unsigned long promotion, defined by the language semantics. In Cforall-as-is, the preference for int-to-long over int-to-unsigned long is determined by a "conversion cost" calculated from the graph of the full set of promotions, but the calculation depends on maximal path lengths, not the exact path.)

Unfortunately, the same problem with ambiguity creeps in any time conversions might be chained together. The extension must carefully control conversion composition, so that programmers can avoid ambiguous conversions.

Proposed Extension

The rest of this document describes my proposal to add programmer-definable conversions and constructors to Cforall.

If your browser supports CSS style sheets, the proposal will appear in "normal" paragraphs, and commentary on the proposal will have the same appearance as this paragraph.

New Operator Identifiers

Cforall would be given a cast identifier, two constructor identifiers, and a destructor identifier:

(?)?, for cast functions.
(?create)?, for constructors.
(?promote)?, for constructors that are promotions.
(?destroy)?, for destructors.

The ugly identifier (?)? is meant to be mnemonic for the cast expression. The other identifiers are pretty weak (suggestions, anyone?) but are supposed to remind the programmer of the connection between conversions and constructors.

We could instead use a single (?create)? identifier for constructors and add a promote storage class specifier, at some small risk of clashes with identifiers in existing code.

It is an error to declare two functions with different constructor identifiers that have the same type in the same translation unit.

Functions declared with these identifiers can be polymorphic. Unlike other polymorphic functions, the return type of a polymorphic cast function need not be derivable from the type of its parameters.

The return type of a call to a polymorphic cast function can be deduced from the calling context.

forall(type T1) T1 (?)?(T2);  // Legal.
forall(type T1) T1 pfun(T2);  // Illegal -- no way to infer T1.

A cast function from type T2 to type T1 is named "(?)?", accepts exactly one explicit argument of type T2, and returns a value of type T1.

If the cast function is polymorphic, it will have type parameters and assertion parameters as well, and can be said to be a cast function from many different types to many different types.

A default constructor function for type T is named "(?create)?", accepts exactly one explicit argument of type T*, and returns void.

A copy constructor function for type T is named "(?create)?", accepts exactly two explicit arguments of types T* and T, and returns void.

A converting constructor function for type T1 from T2 is named "(?create)?" or "(?promote)?", accepts exactly two explicit arguments of types T1* and T2, and returns void.

A destructor function for type T is named "(?destroy)?", accepts exactly one explicit argument of type T*, and returns void.

The monomorphic function prototypes for these functions are

T1   (?)?(T2);
void (?create)?(T1*);
void (?create)?(T1*, T2);
void (?promote)?(T1*, T2);
void (?destroy)?(T1*);

Cast Expressions

In most cases the cast expression (T)e would be treated like the function call (?)?(e), except that only cast functions to type T would be valid interpretations of (?)?, and e would not be implicitly converted to the cast function's parameter type. In particular, the usual rules for resolving function overloading (see below) would be used to choose the best interpretation of the expression.

For example, in

type Wazzit;
type Thingum;
Wazzit w;
(Thingum)w;

the cast function that is called must be "Thingum (?)?(Wazzit)", or a polymorphic function that can be specialized to that.

The ban on implicit conversions within the cast allows programmers to explicitly control composition of conversions and avoid ambiguity. I also hope that this will make it easier for compilers and programmers to determine which conversions will be applied in which circumstances. If implicit conversions could be applied to the inputs and outputs of casts, when any and all of the conversion functions involved could be polymorphic ... the possibilities seem endless, unfortunately.

Object Definitions

A definition of an object x would call a constructor function. Let T be x's type with type qualifiers removed, and let a be x's address (with type T*).

If type qualifiers weren't ignored, const objects couldn't be initialized, and every constructor would have to be duplicated, with one version for T* objects and one for volatile T* objects.

A definition with an initializer that is a single expression e, optionally enclosed in braces, would call a copy or converting constructor. The call would be treated much like the function call f(a,e), except that only copy and converting constructors for type T would be valid interpretations of f, and e would not be implicitly converted to the type of the constructor's second parameter.
If x has automatic storage duration and is not initialized explicitly, the definition would call a default constructor function. The call would be treated much like the function call f(a), except that only default constructor functions for type T would be valid interpretations of f.
If x has static storage duration and is not initialized explicitly, and is defined within the scope of a type definition that defines T, then T's implementation type would determine how x is initialized.
```
type Rational = struct { int numerator; unsigned denominator; };
Rational r; // Both members initialized to 0.
```
If x has static storage duration and is not initialized explicitly, and the type T is an opaque type, the definition would be treated as if x was initialized with the expression 0.
This is a simple extension of C's rules for static objects, which initialize them all to 0. Frequently, the 0 involved will have type T, and the definition will call a copy constructor.
```
extern type Rational;
extern Rational 0;
static Rational r;  // initialized with the Rational 0.
```
In other cases, the 0 will be an integer or null pointer, and the definition will call a converting constructor.

The obvious alternative design would call T's default constructor. That design would be inconsistent, because some static objects would go uninitialized. It would also cause subtle problems, because a particular static definition could be uninitialized or initialized to 0 depending on whether T is an extern type or a typedef.

Except when calling constructors, parameter passing invokes constructor functions. Passing argument expression e to a parameter would be equivalent to initializing the parameter with that expression. When calling constructors, the value of the argument would be copied into the parameter.

When the lifetime of x ends, a destructor function would be called. The call would be treated much like the function call (?destroy)?(a). When a block ends, the objects that were defined in the block would be destroyed in the reverse of the order in which they are declared.

The storage class specifier register will have the semantics that it has in C++, instead of the semantics of C: it is merely a hint to the implementation that the object will be heavily used, and does not prevent programs from computing the address of the object.

Default Functions

In Cforall-as-is, every declaration with type-class type implicitly declares a default assignment function, with the same scope and linkage as the type. Extended Cforall would also declare a default default constructor and a default destructor.

{
    extern type T;
    T t;           // calls external constructor for T.
}                  // calls external destructor for T.

The destructor and some sort of constructor are necessary to instantiate the type. I include the default constructor because it is the most basic. Arguably the declaration should also declare a default copy constructor, but I chose not to because Cforall can construct a copy constructor from the default constructor and the assignment operator, as will be seen below.

If the type does not need to be instantiated, it probably should have been declared by dtype instead of by type.

A type definition would implicitly define a default constructor and destructor by inheriting the implementation type's default constructor and destructor, just as is done for the implicitly defined default assignment function.

Conversion Composition

As mentioned above, Cforall does not apply implicit conversions to the arguments and results of cast expressions or constructor calls. Neither does it automatically create conversions or constructors by composing programmer-defined conversions: given

T1 (?)?(T2);
T2 (?)?(T3);
T3 v3;
(T1)v3;

then Cforall does not automatically create

T1 (?)?(T3 p) { return (T1)(T2)p; }

Composition of conversions does show up through a third mechanism where the programmer has more control: assertion lists. Consider a Month type that represents months as integers between 0 and 11. Clearly a Month can be promoted to unsigned, and to any type above unsigned in the arithmetic type hierarchy as well.

type Month = unsigned;

forall(type T | void (?promote)?(T*, unsigned))
  void (?promote)?(T* target, Month source) {
    unsigned u_temp = (unsigned)source;
    T t_temp = u_temp;           // calls the assertion parameter.
    *target = t_temp;
  }

The intimidating polymorphic promotion declaration says that, if T is a type and unsigned can be promoted to T, then the function can promote Month to T.

Month m;
unsigned long ul = m;

To initialize ul, Cforall must bind T to unsigned long, find the (pre-defined) unsigned-to-unsigned long promotion, and pass it to the assertion parameter of the polymorphic Month-to-T function.

But what about converting from Month to unsigned itself?

unsigned u = m;  // How?

A monomorphic Month-to-unsigned constructor would do the job, but its body would mostly duplicate the body of the polymorphic function.

Instead, Cforall should use the polymorphic promotion and the unsigned copy constructor. To initialize u, Cforall should pass the unsigned copy constructor to the assertion parameter of the polymorphic Month promotion, and bind T to unsigned.

Note that the polymorphic promotion can promote Month to the standard types, to implementation-defined extended types, and to programmer-defined types that have yet to be written. This is much better than writing a flock of monomorphic promotions, with function bodies that would be nearly identical, to convert Month to each unsigned type individually. The predefined constructors make heavy use of this constructor idiom: instead of writing

void (?promote)? (T1*, T2);

("You can make a T2 into a T1"), write

forall(type T | void (?promote)?(T*, T1) ) void (?promote)?(T*, T2);

("You can make a T2 into anything that can be made from a T1").
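The way an assertion parameter carries the second half of a composed conversion can be imitated in plain C with a function pointer. This is only a rough model under stated assumptions: C has no type parameters, so T is erased to void*, and the names from_unsigned_fn, unsigned_from_unsigned, ulong_from_unsigned and month_promote are hypothetical.

```c
typedef unsigned Month;   /* months 0..11, as in the text */

/* Stand-in for an assertion parameter of type
   void (?promote)?(T*, unsigned): it constructs *target from an unsigned. */
typedef void (*from_unsigned_fn)(void *target, unsigned source);

/* Stand-in for the unsigned copy constructor (T bound to unsigned). */
void unsigned_from_unsigned(void *target, unsigned source) {
    *(unsigned *)target = source;
}

/* Stand-in for the unsigned-to-unsigned long promotion
   (T bound to unsigned long). */
void ulong_from_unsigned(void *target, unsigned source) {
    *(unsigned long *)target = source;
}

/* Stand-in for the polymorphic Month promotion: convert to unsigned, then
   let the assertion parameter finish the job for whichever T was chosen. */
void month_promote(void *target, Month source, from_unsigned_fn make_t) {
    unsigned u_temp = (unsigned)source;
    make_t(target, u_temp);
}
```

A single month_promote serves both "unsigned long ul = m" (pass ulong_from_unsigned) and "unsigned u = m" (pass unsigned_from_unsigned), which mirrors how one polymorphic promotion replaces a family of monomorphic ones.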

Constructors and Cost

Calls to constructors have construction costs, which let Cforall choose the least expensive implicit conversion when given a choice.

The cost of a call to a copy constructor is 0.
The cost of a call to a monomorphic constructor is 1.
The cost of a call to a polymorphic constructor, or a specialization of it, is 1 plus the sum of the construction costs of constructors that are passed to it through assertion parameters.

Note that, although point 3 refers to constructors that are passed at run-time, the translator statically matches arguments to assertion parameters, so it can determine construction costs statically.

Construction cost is defined for every constructor, not just the promotions (which are the equivalent of the safe conversions of Cforall-as-is). This seemed like the easiest way to handle (admittedly dicey) "mixed" constructors, where the constructor and its assertion parameter have different identifiers:

type Thingum;
type Wazzit;
forall(type T | void (?create)?(T*, Thingum) )
  void (?promote)?(T*, Wazzit);

Examples:

"unsigned ui = 42U;" calls a copy constructor, and so has cost 0.

"unsigned ui = m;", where m has type Month, calls the polymorphic Month promotion defined previously. It passes the unsigned-to-unsigned copy constructor to the assertion parameter, and so has cost 1+0 = 1.

"unsigned long ul = m;" calls the polymorphic Month promotion, passing the unsigned-to-unsigned long constructor to the assertion parameter. unsigned-to-unsigned long is defined below and will turn out to have cost 1, so the total cost is 2.

Inside the body of the Month promotion, the assertion parameter has a monomorphic type, and so has a construction cost of 1 where it is called by the initialization of t_temp. The cost of the argument passed through the assertion parameter has no relevance inside the body of the promotion.
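The three cost rules can be written down as a small recursive function. The sketch below is plain C and simplifies the rules by assuming each polymorphic constructor has at most one constructor-valued assertion parameter; struct ctor and every name in it are hypothetical.

```c
#include <stddef.h>

/* Hypothetical description of one constructor as selected by the translator. */
struct ctor {
    int is_copy;                   /* nonzero for a copy constructor */
    const struct ctor *assertion;  /* constructor bound to the assertion
                                      parameter, or NULL if monomorphic */
};

/* Rule 1: a copy constructor costs 0.
   Rule 2: a monomorphic constructor costs 1.
   Rule 3: a polymorphic constructor costs 1 plus the cost of the
           constructor passed through its assertion parameter. */
int construction_cost(const struct ctor *c) {
    if (c->is_copy) return 0;
    if (c->assertion == NULL) return 1;
    return 1 + construction_cost(c->assertion);
}

/* The examples from the text. */
const struct ctor unsigned_copy     = { 1, NULL };
const struct ctor ulong_copy        = { 1, NULL };
const struct ctor unsigned_to_ulong = { 0, &ulong_copy };         /* cost 1 */
const struct ctor month_to_unsigned = { 0, &unsigned_copy };      /* cost 1 */
const struct ctor month_to_ulong    = { 0, &unsigned_to_ulong };  /* cost 2 */
```

Because the chain of assertion arguments is fixed when the translator resolves the call, the whole cost is known statically, as the note on point 3 says.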

Overload Resolution

In Cforall-as-is, there is at most one language-defined implicit conversion between any two types. In extended Cforall, more than one conversion may be applicable, and overload resolution must be adapted to account for that, by using the lowest-cost conversion.

The unsafe conversion cost of a function call expression would be the total conversion cost of implicit calls of (?create)?() constructors applied directly to arguments of the function -- 0 if there are none.

This would replace a rule in Cforall-as-is, which considers all unsafe conversions to be equally bad and just counts them. I think the difference would be subtle and unimportant.

The promotion cost would be the total conversion costs of implicit calls of (?promote)?() constructors applied directly to arguments of the function -- 0 if there are none.

Overload resolution would examine each argument expression individually. The best interpretations of an expression would be:

the interpretations with the lowest unsafe conversion cost;
of these, the interpretations with the lowest promotion cost;
of these, if any can be promoted to the parameter type, then just those that can be converted at minimal cost; otherwise, all remaining interpretations.

The best interpretation would be implicitly converted to the parameter type, by calling the conversion function with minimal cost. If there is more than one best interpretation, or if there is more than one minimal-cost conversion, the argument is ambiguous.

A maximal set of interpretations of the function call expression that have compatible result types produces a single interpretation: the interpretations with the lowest unsafe conversion cost, and of these, the interpretations with the lowest promotion cost. If there is more than one such interpretation, the function call expression is ambiguous.
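The final selection step amounts to a lexicographic comparison of two costs with an ambiguity check. The sketch below models only that step in plain C; it ignores the grouping by compatible result types and the per-argument analysis, and struct interpretation, compare_cost and resolve are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical summary of one interpretation of a call. */
struct interpretation {
    int unsafe_cost;     /* total cost of implicit (?create)? calls  */
    int promotion_cost;  /* total cost of implicit (?promote)? calls */
};

/* Orders interpretations by unsafe cost first and promotion cost second.
   Returns a negative, zero or positive value, like strcmp. */
int compare_cost(const struct interpretation *a,
                 const struct interpretation *b) {
    if (a->unsafe_cost != b->unsafe_cost)
        return a->unsafe_cost < b->unsafe_cost ? -1 : 1;
    if (a->promotion_cost != b->promotion_cost)
        return a->promotion_cost < b->promotion_cost ? -1 : 1;
    return 0;
}

/* Returns the index of the unique cheapest interpretation, or -1 when
   there is none or when the cheapest cost is shared (ambiguous). */
int resolve(const struct interpretation *cands, size_t n) {
    if (n == 0) return -1;
    size_t best = 0;
    int tied = 0;
    for (size_t i = 1; i < n; i++) {
        int c = compare_cost(&cands[i], &cands[best]);
        if (c < 0) { best = i; tied = 0; }   /* strictly cheaper: new winner */
        else if (c == 0) tied = 1;           /* equal to current winner */
    }
    return tied ? -1 : (int)best;
}
```

Note that an interpretation with any unsafe cost loses to one with none, however many promotions the latter needs.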

Heap Allocation

Cforall would define new heap allocation functions that would ensure that constructors and destructors would be applied to objects in the heap. There's lots of room for ambitious design here, but a simple facility might look like this:

forall(type T) void delete(T const volatile restrict* ptr) {
  if (ptr) (?destroy)?(ptr);
  free(ptr);
}

In a call to delete(), the argument might be a pointer to a pointer: T would be a pointer type, and the argument might have all three type qualifiers. (If it doesn't, pointer conversions will add missing qualifiers to the argument.)

// Pointer to a const volatile restricted pointer to an int:
int * const volatile restrict * pcvrpi;
// ...
delete(pcvrpi);   // T bound to int *

A new() function would take the address of a pointer and an initial value, and point the pointer at heap storage initialized to that value.

forall(type T | void (?create)?(T*, T))
  void new(T* volatile restrict* ptr, T val) {
    *ptr = malloc(sizeof(T));
    if (*ptr) (?create)?(*ptr, val);  // explicit constructor call
}

forall(type T | void (?create)?(T*, T))
  void new(T const* volatile restrict* ptr, T val),
       new(T volatile* volatile restrict* ptr, T val),
       new(T restrict* volatile restrict* ptr, T val),
       new(T const volatile* volatile restrict* ptr, T val),
       new(T const restrict* volatile restrict* ptr, T val),
       new(T volatile restrict* volatile restrict* ptr, T val),
       new(T const volatile restrict* volatile restrict* ptr, T val);

Cforall can't add type qualifiers to pointed-at pointer types, so new() needs one variation for each set of type qualifiers.

Another new() function would omit the initial value, and apply the default constructor. Obviously, there's no point in allocating const-qualified uninitialized storage.

forall(type T)
  void new(T* volatile restrict * ptr) {
    *ptr = malloc(sizeof(T));
    if (*ptr) (?create)?(*ptr);   // Explicit default constructor call.
}

forall(type T)
  void new(T volatile* volatile restrict*),
       new(T restrict* volatile restrict*),
       new(T volatile restrict* volatile restrict*);
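For comparison, here is roughly what the same pairing of allocation with construction, and destruction with deallocation, looks like when written by hand in plain C for one type. The Thing type, the live_things counter and all function names are hypothetical; the counter exists only so that the effect of the constructor and destructor is observable.

```c
#include <stdlib.h>

typedef struct { int value; } Thing;   /* hypothetical type */

int live_things = 0;   /* number of constructed, not yet destroyed Things */

/* Stand-ins for (?create)? and (?destroy)? on Thing. */
void thing_create(Thing *t)  { t->value = 0; live_things++; }
void thing_destroy(Thing *t) { (void)t; live_things--; }

/* Like the default-constructing new(): point *ptr at heap storage and
   construct it, unless the allocation failed. */
void thing_new(Thing **ptr) {
    *ptr = malloc(sizeof **ptr);
    if (*ptr) thing_create(*ptr);
}

/* Like delete(): destroy a non-null object, then release its storage. */
void thing_delete(Thing *ptr) {
    if (ptr) thing_destroy(ptr);
    free(ptr);
}
```

The polymorphic new() and delete() in the proposal would let one definition play this role for every type, with the constructor and destructor supplied implicitly.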

Generic Conversions and Constructors

Cforall would provide a polymorphic default constructor function and destructor function, for types that do not have their own:

forall(type T)
  void (?create)?(T*) { return; };

forall(type T)
  void (?destroy)?(T*) { return; };

The generic default constructor and destructor provide C semantics for uninitialized variables: "do nothing".

For every structure type struct s Cforall would define a default constructor function that applies a default constructor to each member, in no particular order. Similarly, it would define a destructor that applies the destructor of each member in no particular order.

Any promotion would be treated as a plain constructor:

forall(type T, type S | void (?promote)?(T*, S))
  void (?create)?(T* target, S source) {
    (?promote)?(target, source);    // Explicit constructor call!
  }

A predefined cast function would allow explicit conversions anywhere that implicit conversions are possible:

forall(type T, type S | void (?create)?(T*, S))
  T (?)?(S source) {
    T temp = source;
    return temp;
  }

A predefined converting constructor would allow initialization anywhere that assignment is defined:

forall(type T | void (?create)?(T*), type S | T ?=?(T*, S))
  void (?create)?(T* target, S source) {
    (?create)?(target);
    *target = source;
  }

This implements the typical semantic link between assignment and initialization.

The predefined copy constructor function is

forall(type T)
  void (?promote)?(T* target, T source) {
    (?create)?(target);
    *target = source;
  }

Since Cforall defines assignment and default constructors for structure types, this provides the copy constructor for structure types.

Finally, Cforall defines the conversion to void, which discards its argument.

forall(type T) void (?promote)?(void*, T);

C's "Usual Arithmetic Conversions"

C has five groups of arithmetic types: signed integers, unsigned integers, complex floating-point numbers, imaginary floating-point numbers, and real floating-point numbers. (Implementations are not required to provide complex and imaginary types.) Some of the "usual arithmetic conversions" promote upward within a group or to a more general group: from int to long long, for instance. Others promote across from a type in one group to a similar type in another group: for instance, from int to unsigned int.

Floating-Point Types

The floating point types would use the constructor idiom for upward promotions, and monomorphic constructors for promotions across from real and imaginary types to complex types with the same precision.

I will use a macro to abbreviate the constructor idiom. "Promoter(T,S)" promotes S to any type that T can be promoted to.

#define Promoter(Target, Source) \
  forall(type T | void (?promote)?(T*, Target)) void (?promote)?(T*, Source)

Promoter(long double _Complex, double _Complex);      // a
Promoter(double _Complex,      float _Complex);       // b
Promoter(long double, double);                        // c
Promoter(double,      float);                         // d
Promoter(long double _Imaginary, double _Imaginary);  // e
Promoter(double _Imaginary,      float _Imaginary);   // f

void (?promote)?(long double _Complex*, long double);             // g
void (?promote)?(long double _Complex*, long double _Imaginary);  // h
void (?promote)?(double _Complex*, double);                       // i
void (?promote)?(double _Complex*, double _Imaginary);            // j
void (?promote)?(float _Complex*, float);                         // k
void (?promote)?(float _Complex*, float _Imaginary);              // l

It helps to draw a graph of the promotions. In this diagram, monomorphic promotions are solid arrows from the source type to the target type, and polymorphic promotions are dotted arrows from the source type to a bubble that surrounds all possible target types. (Twenty years after first hearing about them, I have finally found a use for directed multigraphs!) To determine the promotion from one type to another, find a path of zero or more dotted arrows optionally ending with a solid arrow.

A long double _Complex can be constructed from

a double _Complex, via a, with a long double _Complex copy constructor passed as the assertion parameter.
a long double, via constructor g.
a double, via c (which promotes double to long double and higher), with g passed as the assertion parameter. In other words, the path from double to long double _Complex passes through long double.
a float _Complex, via b. For the assertion parameter, Cforall passes a double _Complex-to-long double _Complex constructor that it makes by specializing a; for the assertion parameter of the specialization, it passes a long double _Complex-to-long double _Complex copy constructor.
a float, via d, with a specialization of c passed as its assertion parameter, with g passed as the specialization's assertion parameter.

Note how "upward" and "across" promotions interact. Polymorphic "upward" promotions connect widely separated types by composing constructors through their assertion parameters. Monomorphic "across" promotions extend composition one step across to corresponding types in different groups.
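The path rule (zero or more polymorphic steps, optionally ending with one monomorphic step) combines with the construction-cost rules into a short search. The sketch below is plain C with hypothetical names throughout; it encodes promotions a to l as two edge tables and reports the cheapest cost together with the number of cheapest chains, so that a count above 1 would indicate ambiguity.

```c
enum { T_F, T_D, T_LD, T_FI, T_DI, T_LDI, T_FC, T_DC, T_LDC };

struct edge { int source, target; };

/* Polymorphic promotions a-f; Promoter(X, S) becomes the edge S -> X. */
const struct edge poly[] = {
    { T_DC, T_LDC }, { T_FC, T_DC },   /* a, b */
    { T_D,  T_LD  }, { T_F,  T_D  },   /* c, d */
    { T_DI, T_LDI }, { T_FI, T_DI },   /* e, f */
};
/* Monomorphic promotions g-l. */
const struct edge mono[] = {
    { T_LD, T_LDC }, { T_LDI, T_LDC }, /* g, h */
    { T_D,  T_DC  }, { T_DI,  T_DC  }, /* i, j */
    { T_F,  T_FC  }, { T_FI,  T_FC  }, /* k, l */
};
enum { NPOLY = sizeof poly / sizeof poly[0],
       NMONO = sizeof mono / sizeof mono[0] };

/* Cheapest construction cost of `to` from `from`, or -1 if none exists.
   A copy costs 0; a monomorphic promotion costs 1 and must be the last
   step; a polymorphic promotion costs 1 plus the cost of the constructor
   bound to its assertion parameter. *ways counts the cheapest chains. */
int promotion_cost(int from, int to, int *ways) {
    if (from == to) { *ways = 1; return 0; }
    int best = -1;
    *ways = 0;
    for (int i = 0; i < NMONO; i++)
        if (mono[i].source == from && mono[i].target == to) {
            best = 1;
            *ways = 1;
        }
    for (int i = 0; i < NPOLY; i++) {
        if (poly[i].source != from) continue;
        int sub_ways;
        int sub = promotion_cost(poly[i].target, to, &sub_ways);
        if (sub < 0) continue;                 /* dead end */
        int cost = 1 + sub;
        if (best < 0 || cost < best) { best = cost; *ways = sub_ways; }
        else if (cost == best) *ways += sub_ways;
    }
    return best;
}
```

With these tables, float reaches double _Complex only through double (d, then i), because the monomorphic step k to float _Complex cannot be followed by anything else. That is the property that keeps the predefined floating-point promotions unambiguous.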

Defining the set of predefined promotions turned out to be quite tricky. For example, if "across" promotions used the constructor idiom, ambiguity would result: a conversion from float to double _Complex could convert upward through double or across through float _Complex. The key points are:

Monomorphic constructors are only used to connect neighbouring types in the conversion hierarchy, because they have construction cost 1.
Polymorphic constructors only connect directly to neighbours, because their minimal cost is 1. They reach other types by composition.
The types in the assertion parameter of a polymorphic constructor specify the exact path between two types by specifying the next type in a sequence of composed constructors.
There can be more than one path between two types, provided that the paths have different construction costs or degrees of polymorphism.

Large Integer Types

The conversions for the integer types cannot be defined by a simple list, because the set of integer types is implementation-defined, the range of each type is implementation-defined, and the set of promotions depends on whether a particular signed type can represent all values of a particular unsigned type. As I read the C standard, every signed type has a matching unsigned type, but the reverse is not true. This complicates the definitions below.

Let the rank of an integer type be the integer conversion rank defined in C99, with the added condition that the ranks form a continuous sequence of integers.
Let r_int be the rank of int.
Let signed(r) and unsigned(r) be the signed integer type and unsigned integer type with rank r.

Integers promote upward to floating-point types. Let SMax be the highest ranking signed integer type, and let UMax be the highest ranking unsigned integer type. Then Cforall would define

Promoter(float, SMax);
Promoter(float, UMax);

Signed types promote across to unsigned types with the same rank. For every r >= r_int such that signed(r) exists, Cforall would define

void (?promote)?( unsigned(r)*, signed(r) );

Lower-ranking signed integers promote to higher-ranking signed integers. For every signed integer type T with rank greater than r_int, let S be the signed integer type with the next lowest rank. Then Cforall would define

Promoter(T, S);

Similarly, lower-ranking unsigned integers promote to higher-ranking unsigned integers. For every r > r_int, Cforall would define

Promoter(unsigned(r), unsigned(r-1));

C's usual arithmetic conversions may promote an unsigned type to a signed type, but only if the signed type can represent every value of the unsigned type. For every r >= r_int, if there are any signed types that can represent every value in unsigned(r), let S be the lowest ranking of these types; then Cforall defines

Promoter(S, unsigned(r));

C's "Integer Promotions"

C's integer promotions apply to "small" types (those with rank less than r_int): they promote to int if int can hold all of their values, and to unsigned int otherwise. At least one unsigned type, _Bool, will promote to int. This breaks the pattern set by the usual arithmetic conversions, where unsigned types always promote to the next larger unsigned type. Consider a machine with 32-bit ints and 16-bit unsigned shorts: if two unsigned shorts are added, they must be promoted to int instead of unsigned int. Hence for this machine there must not be a promotion from unsigned short to unsigned int.
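The integer promotions can be checked on a given implementation with a few lines of C11. This is an illustrative sketch, assuming _Generic support; enum promoted, the PROMOTED macro and the function names are hypothetical helpers.

```c
#include <limits.h>

enum promoted { P_INT, P_UINT, P_OTHER };

/* Classifies the type of an expression without evaluating it. */
#define PROMOTED(e) _Generic((e), \
    int:          P_INT,          \
    unsigned int: P_UINT,         \
    default:      P_OTHER)

/* Unary plus applies the integer promotions to its operand. */
enum promoted promote_bool(void)   { return PROMOTED(+(_Bool)1); }
enum promoted promote_schar(void)  { return PROMOTED(+(signed char)1); }
enum promoted promote_ushort(void) { return PROMOTED(+(unsigned short)1); }

/* What the rule predicts for unsigned short on this implementation:
   int if int can hold every unsigned short value, unsigned int otherwise. */
enum promoted expected_ushort(void) {
    return USHRT_MAX <= INT_MAX ? P_INT : P_UINT;
}
```

On the machine described above (32-bit int, 16-bit unsigned short), promote_ushort() reports P_INT, so unsigned short sits at or below the break point r_break.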

Since the C integer promotions always promote small signed types to int, Cforall would extend the chain of polymorphic "upward" and monomorphic "across" signed integer promotions to the small signed types.

For every signed integer type S with rank less than r_int, Cforall would define

Promoter(T, S);

where T is the signed integer type with the next highest rank.

Let r_break be the rank of the highest-ranking unsigned type whose values can all be represented by int, and let T be the lowest-ranking signed type that can represent all of the values of unsigned(r_break). Cforall would define

Promoter(T, unsigned(r_break));

For every r less than r_int except r_break, Cforall would define

Promoter(unsigned(r+1), unsigned(r));

r_break is the point where the normal pattern of unsigned promotion breaks. Unsigned types with higher rank promote upward toward unsigned int. Unsigned types with lower rank promote upward to the type at the break, which promotes upward to a signed type and onward toward int.
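To make r_break concrete, here is a small C sketch that locates it for the standard small unsigned types on the current implementation. The table and function names are assumptions made for this illustration, and the table ignores any extended integer types an implementation might provide.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical record describing one small unsigned type. */
struct small_unsigned {
    const char *name;
    unsigned long long max;
};

/* Standard unsigned types with rank below int, in increasing rank order. */
static const struct small_unsigned small_types[] = {
    { "_Bool", 1 },
    { "unsigned char", UCHAR_MAX },
    { "unsigned short", USHRT_MAX },
};

/* Index of the highest-ranking small unsigned type whose values all
   fit in int, or -1 if there is none. */
int r_break_index(void) {
    int found = -1;
    for (size_t i = 0; i < sizeof small_types / sizeof small_types[0]; i++) {
        if (small_types[i].max <= (unsigned long long)INT_MAX)
            found = (int)i;
    }
    return found;
}

/* Name of the type at the break. */
const char *r_break_name(void) {
    int i = r_break_index();
    return i < 0 ? "none" : small_types[i].name;
}
```

With 32-bit int and 16-bit unsigned short the break falls at unsigned short, so no small unsigned type promotes directly to unsigned int. With 16-bit int and 16-bit unsigned short it falls at unsigned char.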

For each r < r_int such that signed(r) exists, Cforall would define

void (?promote)?(unsigned(r)*, signed(r));

These "across" promotions are not strictly necessary, but it seems useful to extend the pattern of signed-to-unsigned monomorphic conversions established by the larger integer types. Note that because of these promotions, unsigned(r_break) does promote to the next larger unsigned type, after a detour through a signed type that increases the conversion cost.

Finally, char is equivalent to signed char or unsigned char, on an implementation-defined basis. If char is equivalent to signed char, the implementation would define

Promoter(signed char, char);

Otherwise, it would define

Promoter(unsigned char, char);

Other Promotions

Promotions can add qualifiers to the pointed-to type of a pointer type.

forall(dtype DT) void (?promote)?(const DT**, DT*);
forall(dtype DT) void (?promote)?(volatile DT**, DT*);
forall(dtype DT) void (?promote)?(restrict DT**, DT*);
forall(dtype DT) void (?promote)?(const volatile DT**, DT*);
forall(dtype DT) void (?promote)?(const restrict DT**, DT*);
forall(dtype DT) void (?promote)?(volatile restrict DT**, DT*);
forall(dtype DT) void (?promote)?(const volatile restrict DT**, DT*);

forall(dtype DT) void (?promote)?(const volatile DT**, const DT*);
forall(dtype DT) void (?promote)?(const restrict DT**, const DT*);
forall(dtype DT) void (?promote)?(const volatile restrict DT**, const DT*);

forall(dtype DT) void (?promote)?(const volatile DT**, volatile DT*);
forall(dtype DT) void (?promote)?(volatile restrict DT**, volatile DT*);
forall(dtype DT) void (?promote)?(const volatile restrict DT**, volatile DT*);

forall(dtype DT) void (?promote)?(const restrict DT**, restrict DT*);
forall(dtype DT) void (?promote)?(volatile restrict DT**, restrict DT*);
forall(dtype DT) void (?promote)?(const volatile restrict DT**, restrict DT*);

forall(dtype DT) void (?promote)?(const volatile restrict DT**, const volatile DT*);
forall(dtype DT) void (?promote)?(const volatile restrict DT**, const restrict DT*);
forall(dtype DT) void (?promote)?(const volatile restrict DT**, volatile restrict DT*);

The type qualifier promotions are simple, but verbose because Cforall doesn't abstract over type qualifiers very well. They also give every type qualifier promotion a cost of 1. It is possible to define a smaller set of promotions, some using the constructor idiom, that gives greater cost to promotions that add more qualifiers, but the set is arbitrary and asymmetric: only one of the three promotions that add one qualifier to an unqualified pointer type can use the constructor idiom, or else ambiguity results.
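The C behaviour that these promotions are meant to explain can be sketched as follows. This is plain C rather than Cforall, and the function names are invented for the example; each step adds qualifiers to the pointed-to type without a cast.

```c
/* Each parameter type adds qualifiers to the pointed-to type. */
static int read_const(const int *p) { return *p; }
static int read_const_volatile(const volatile int *p) { return *p; }

int qualifier_demo(void) {
    int x = 21;
    int *p = &x;
    const int *cp = p;             /* int* to const int* */
    const volatile int *cvp = cp;  /* const int* to const volatile int* */
    /* Arguments are converted the same way as the initializers above. */
    return read_const(p) + read_const_volatile(cp) + (*cvp - 21);
}
```

The reverse direction, dropping a qualifier, needs an explicit cast in C, which is why it appears among the explicit conversions rather than here.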

Within the scope of a type definition type T1 = T2;, constructors would convert between the new type and its implementation type.

void (?promote)?(T2*, T1);
void (?promote)?(T2**, T1*);
void (?create)?(T1*, T2);
void (?create)?(T1**, T2*);

The conversion from the new type T1 to its implementation type T2 gives functions that implement operations on T1 access to the type's implementation. This conversion is a promotion because most such functions work with the implementation most of the time. The reverse conversion, from T2 to T1, is merely implicit, so that mixed operations won't be ambiguous.

Other Pre-Defined Implicit Conversions

Arithmetic Conversions

C defines implicit conversions between any two arithmetic types. In Cforall terms, the conversions that are not promotions are ordinary conversions. Most of the ordinary conversions follow a pattern that looks like the Usual Arithmetic Conversions in reverse. Once again, I will use a macro to hide details of the constructor idiom.

#define Creator(Target, Source) \
  forall(type T | void (?create)?(T*, Target)) void (?create)?(T*, Source)

Creator(double _Complex, long double _Complex);
Creator(float _Complex,  double _Complex);
Creator(double, long double);
Creator(float,  double);
Creator(double _Imaginary, long double _Imaginary);
Creator(float _Imaginary,  double _Imaginary);

void (?create)?(long double*,            long double _Complex);
void (?create)?(long double _Imaginary*, long double _Complex);
void (?create)?(double*,            double _Complex);
void (?create)?(double _Imaginary*, double _Complex);
void (?create)?(float*,            float _Complex);
void (?create)?(float _Imaginary*, float _Complex);

The C99 draft standards that I have access to state that real types and imaginary types are implicitly interconvertible. This seems like a mistake, since the result of the conversion will always be zero, but ...

void (?create)?(long double*, long double _Imaginary);
void (?create)?(long double _Imaginary*, long double);
void (?create)?(double*, double _Imaginary);
void (?create)?(double _Imaginary*, double);
void (?create)?(float*, float _Imaginary);
void (?create)?(float _Imaginary*, float);

Let SMax be the highest ranking signed integer type, and let UMax be the highest ranking unsigned integer type. Then Cforall would define

Creator(SMax, float);
Creator(SMax, float _Complex);
Creator(SMax, float _Imaginary);
Creator(UMax, float);
Creator(UMax, float _Complex);
Creator(UMax, float _Imaginary);

For every signed integer type T with rank greater than that of signed char, Cforall would define

Creator(S, T);

where S is the signed integer type with the next lowest rank.

For every rank r greater than the rank of _Bool, Cforall would define

Creator(unsigned(r-1), unsigned(r));

For every rank r such that signed(r) exists, Cforall would define

void (?create)?( signed(r)*, unsigned(r) );

char and _Bool are interconvertible.

void (?create)?(char*, _Bool);
void (?create)?(_Bool*, char);

If char is equivalent to signed char, the implementation would define

Creator(char, signed char);
void (?create)?(char*, unsigned char);

Otherwise, the implementation would define

Creator(char, unsigned char);
void (?create)?(char*, signed char);
void (?create)?(_Bool*, signed char);
void (?create)?(signed char*, _Bool);

Pointer Conversions

Pointer types are implicitly interconvertible with pointers to void, provided that the target type has all of the qualifiers of the source type.

forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, void*))
  void (?create)?(QVPtr*, SourceType*);

This conversion uses the constructor idiom, but note that the assertion parameter is a promotion even though the conversion itself is not a promotion. My intent is that the assertion parameter will be bound to a promotion that adds type qualifiers to a pointer type. A conversion from int* to const void* would bind SourceType to int, QVPtr to const void*, and the assertion parameter to a promotion from void* to const void* (which is a specialization of one of the polymorphic type qualifier promotions given above). Because of this composition of pointer conversions, I don't have to define conversions for every combination of type qualifiers on the target type. I do have to handle all combinations of qualifiers on the source type:

forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, const void*))
  void (?create)?(QVPtr*, const SourceType*);
forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, volatile void*))
  void (?create)?(QVPtr*, volatile SourceType*);
forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, restrict void*))
  void (?create)?(QVPtr*, restrict SourceType*);
forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, const volatile void*))
  void (?create)?(QVPtr*, const volatile SourceType*);
forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, const restrict void*))
  void (?create)?(QVPtr*, const restrict SourceType*);
forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, volatile restrict void*))
  void (?create)?(QVPtr*, volatile restrict SourceType*);
forall(dtype SourceType,
       type QVPtr | void (?promote)?(QVPtr*, const volatile restrict void*))
  void (?create)?(QVPtr*, const volatile restrict SourceType*);

forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, TargetType*))
  void (?create)?(QTPtr*, void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, const TargetType*))
  void (?create)?(QTPtr*, const void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, volatile TargetType*))
  void (?create)?(QTPtr*, volatile void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, restrict TargetType*))
  void (?create)?(QTPtr*, restrict void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, const volatile TargetType*))
  void (?create)?(QTPtr*, const volatile void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, const restrict TargetType*))
  void (?create)?(QTPtr*, const restrict void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, volatile restrict TargetType*))
  void (?create)?(QTPtr*, volatile restrict void*);
forall(type QTPtr,
       dtype TargetType | void (?promote)?(QTPtr*, const volatile restrict TargetType*))
  void (?create)?(QTPtr*, const volatile restrict void*);
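The implicit pointer conversions of this section correspond to ordinary C code such as the following. It is a minimal sketch with invented function names, showing a conversion to a qualified void pointer and a conversion back from a void pointer, both without casts.

```c
#include <stddef.h>
#include <string.h>

/* Accepts any data pointer: the argument is converted to const void*. */
int equal_bytes(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) == 0;
}

int to_void_demo(void) {
    int x = 7, y = 7;
    int *px = &x;
    const int *py = &y;
    /* int* to const void* (conversion plus an added qualifier), and
       const int* to const void*. */
    return equal_bytes(px, py, sizeof x);
}

int from_void_demo(void) {
    int x = 5;
    void *vp = &x;        /* int* to void* */
    int *ip = vp;         /* void* to int*, implicit in C */
    const int *cip = vp;  /* void* to const int*: adds a qualifier too */
    return *ip + *cip;
}
```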

Pre-Defined Explicit Conversions

Function pointers are interconvertible.

forall(ftype FT1, ftype FT2, type T | FT1* (?)?(T) ) FT2* (?)?(FT1*);

Data pointers including pointers to void are interconvertible, regardless of type qualifiers.

forall(dtype DT1, dtype DT2) DT2*                (?)?(DT1*);
forall(dtype DT1, dtype DT2) const DT2*          (?)?(DT1*);
forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(DT1*);
forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(DT1*);

forall(dtype DT1, dtype DT2) DT2*                (?)?(const DT1*);
forall(dtype DT1, dtype DT2) const DT2*          (?)?(const DT1*);
forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(const DT1*);
forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(const DT1*);

forall(dtype DT1, dtype DT2) DT2*                (?)?(volatile DT1*);
forall(dtype DT1, dtype DT2) const DT2*          (?)?(volatile DT1*);
forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(volatile DT1*);
forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(volatile DT1*);

forall(dtype DT1, dtype DT2) DT2*                (?)?(const volatile DT1*);
forall(dtype DT1, dtype DT2) const DT2*          (?)?(const volatile DT1*);
forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(const volatile DT1*);
forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(const volatile DT1*);

Integers and pointers are interconvertible. For every integer type I, Cforall would define

forall(dtype DT, type T | I (?)?(T) ) DT* (?)?(T);
forall(ftype FT, type T | I (?)?(T) ) FT* (?)?(T);

forall(dtype DT, type T | DT* (?)?(T) ) I (?)?(T);
forall(ftype FT, type T | FT* (?)?(T) ) I (?)?(T);
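In plain C these explicit conversions are written as casts. The sketch below uses invented function names and assumes the implementation provides uintptr_t, which is optional in the standard. It round-trips a data pointer through an integer and a function pointer through a different function pointer type.

```c
#include <stdint.h>

/* Data pointer to integer and back; the result compares equal
   to the original pointer. */
int data_pointer_round_trip(void) {
    int x = 0;
    void *p = &x;
    uintptr_t bits = (uintptr_t)p;
    void *q = (void *)bits;
    return q == p;
}

static int forty_two(void) { return 42; }

/* Function pointer to another function pointer type and back.
   Calling through the intermediate type would be undefined. */
int function_pointer_round_trip(void) {
    void (*generic)(void) = (void (*)(void))forty_two;
    int (*back)(void) = (int (*)(void))generic;
    return back();
}

/* A cast may also drop qualifiers from the pointed-to type. */
int drop_const(void) {
    int x = 9;
    const int *cp = &x;
    int *p = (int *)cp;
    return *p;
}
```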

Non-Conversions

C99 has a few other "conversions" that don't fit into this proposal. Outside of some special circumstances (such as application of sizeof),

array lvalues "convert" to pointers
function designators "convert" to pointers to functions
non-array lvalues "convert" to plain values
bit fields undergo "integer promotion" to int or unsigned int values.

I'd like to stop calling these "conversions". Perhaps they could be handled by some verbiage in the semantics of "Primary Expressions".
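The first two of these non-conversions, and the sizeof exception, can be seen in a few lines of plain C. This is only an illustration, with invented function names.

```c
#include <stddef.h>

/* Under sizeof the array does not "convert": the result is the
   size of the whole array. */
size_t size_as_array(void) {
    int a[8];
    return sizeof a;
}

/* In any other expression the array becomes a pointer to its
   first element, so this is the size of an int*. */
size_t size_after_decay(void) {
    int a[8];
    return sizeof (a + 0);
}

/* An array lvalue used as an initializer for a pointer. */
int first_via_pointer(void) {
    int a[3] = { 10, 20, 30 };
    int *p = a;
    return *p;
}

static int seventeen(void) { return 17; }

/* A function designator becomes a pointer to the function. */
int call_via_designator(void) {
    int (*fp)(void) = seventeen;
    return fp();
}
```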

Cforall-as-is provides "specialization", which reduces the number of type parameters or assertion parameters of a polymorphic object or function. Specialization looks like a conversion -- it can happen implicitly or as a result of a cast -- but would no longer be considered to be a conversion.

Assignment Operators

Since extended Cforall separates conversion from assignment, it can simplify Cforall-as-is's set of assignment operators. Implicit conversions can add type qualifiers to the target's type, and to the source's type in the case of pointer assignment.

char ?=?(volatile char*, char);
char ?+=?(volatile char*, char);
// ... and similarly for the rest of the basic types and
// compound assignment operators.

char c;
c = 'a';  // => ?=?( &c, 'a' );
          // => ?=?( (volatile char*)&c, 'a' );

// Assignment between data pointers, where the target has all of
// the qualifiers of the source.
forall(dtype DT)
  DT* ?=?(DT* volatile restrict*, DT*);
forall(dtype DT)
  const DT* ?=?(const DT* volatile restrict*, const DT*);
forall(dtype DT)
  volatile DT* ?=?(volatile DT* volatile restrict*, volatile DT*);
forall(dtype DT)
  const volatile DT* ?=?(const volatile DT* volatile restrict*, const volatile DT*);

// Assignment to data pointers from void pointers.
forall(dtype DT) DT* ?=?(DT* volatile restrict*,  void*);
forall(dtype DT)
  const DT* ?=?(const DT* volatile restrict*, const void*);
forall(dtype DT)
  volatile DT* ?=?(volatile DT* volatile restrict*, volatile void*);
forall(dtype DT)
  const volatile DT* ?=?(const volatile DT* volatile restrict*, const volatile void*);

// Assignment to void pointers from data pointers.
forall(dtype DT)
  void* ?=?(void* volatile restrict*, DT*);
forall(dtype DT)
  const void* ?=?(const void* volatile restrict*, const DT*);
forall(dtype DT)
  volatile void* ?=?(volatile void* volatile restrict*, volatile DT*);
forall(dtype DT)
  const volatile void* ?=?(const volatile void* volatile restrict*, const volatile DT*);

// Assignment from null pointers to other pointer types.
forall(dtype DT)
  void* ?=?(void* volatile restrict*, forall(dtype DT2) const DT2*);
forall(dtype DT)
  const void* ?=?(const void* volatile restrict*, forall(dtype DT2) const DT2*);
forall(dtype DT)
  volatile void* ?=?(volatile void* volatile restrict*, forall(dtype DT2) const DT2*);
forall(dtype DT)
  const volatile void* ?=?(const volatile void* volatile restrict*, forall(dtype DT2) const DT2*);

// Function pointer assignment
forall(ftype FT) FT* ?=?(FT* volatile restrict*, FT*);
forall(ftype FT) FT* ?=?(FT* volatile restrict*, forall(ftype FT2) FT2*);

The difference, relative to Cforall-as-is, is that assignment operators come in one flavor (a pointer to a volatile value as the first operand) instead of two (a pointer to volatile in one case, a plain pointer in the other) or the four that restrict would have led to.

However, to make this work, the type of default assignment functions must also change. A declaration of a type T would implicitly declare

T ?=?(T volatile restrict*, T);
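The reason one flavor suffices can be sketched in plain C, where an ordinary function stands in for the assignment operator. The names are invented for the example. A pointer to a plain char converts implicitly to a pointer to volatile char, so the same function serves both plain and volatile targets.

```c
/* Stand-in for char ?=?(volatile char*, char). */
char assign_char(volatile char *target, char value) {
    *target = value;
    return value;
}

/* Plain target: &c has type char*, and the volatile qualifier
   is added implicitly at the call. */
char plain_target_demo(void) {
    char c = 'x';
    assign_char(&c, 'a');
    return c;
}

/* Volatile target: &vc already has type volatile char*. */
char volatile_target_demo(void) {
    volatile char vc = 'x';
    assign_char(&vc, 'b');
    return vc;
}
```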

Final Notes

The constructor idiom is polymorphic in the object's type: an initial value of one particular type can initialize objects of many types. The constructor that promotes a Wazzit into a Thingum is declared

forall(type T | void (?promote)?(T*, Thingum) )
  void (?promote)?(T*, Wazzit);

("You can make a Wazzit into a Thingum and types higher in the hierarchy.")

It would also be possible to use a constructor idiom where the object's type is fixed and the initial value's type is polymorphic:

forall(type T | void (?promote)?(Wazzit*, T) )
  void (?promote)?(Thingum*, T);

("You can make a Thingum from a Wazzit and types lower in the hierarchy.")

The "polymorphic value" idiom has the advantage that it is fairly obvious that the function is a constructor for type Thingum. In the "polymorphic object" idiom, Thingum is buried in the assertion parameter.

However, I chose the "polymorphic object" idiom because it matches C's semantics for signed-to-unsigned integer conversions. In the "polymorphic object" idiom, the natural way to write the polymorphic promoter from int to larger types is

forall(type T | void (?promote)?(T*, long) )
  void (?promote)?(T* tp, int i) {
    long l = i;
    *tp = (T)l;    // calls the assertion parameter.
    }

Now consider the case of a CPU with 16-bit ints, where we need to convert an int value -1 to a 32-bit unsigned long. The assertion parameter will be bound to the monomorphic long-to-unsigned long promoter. The function body above converts the int -1 to a long -1, and then uses the assertion parameter to convert the result to the correct unsigned long value: 4,294,967,295.

In the "polymorphic value" idiom, the conversion would be done by calling the polymorphic promoter to unsigned long from smaller types:

forall(type T | void (?promote)?(unsigned*, T) )
  void (?promote)?(unsigned long* ulp, T t) {
    unsigned u = t;    // calls the assertion parameter.
    *ulp = u;
    }

This time the assertion parameter will be bound to the int-to-unsigned promoter. The function body uses the assertion parameter to convert the integer -1 to unsigned 65,535, and then converts the result to the incorrect unsigned long value 65,535.
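The two routes can be compared in plain C by using fixed-width types to stand in for the 16-bit int and 32-bit long of the example machine. This is a sketch under that assumption; the function names are invented, and the fixed-width types only simulate the hypothetical CPU.

```c
#include <stdint.h>

/* "Polymorphic object" route: widen within the signed types
   first, then cross over to unsigned. */
uint32_t via_signed_widening(int16_t i) {
    int32_t l = i;       /* int to long: value preserved */
    return (uint32_t)l;  /* long to unsigned long: modulo 2^32 */
}

/* "Polymorphic value" route: cross over to unsigned first,
   then widen within the unsigned types. */
uint32_t via_unsigned_first(int16_t i) {
    uint16_t u = (uint16_t)i;  /* int to unsigned: modulo 2^16 */
    return (uint32_t)u;        /* unsigned to unsigned long: zero-extended */
}

/* What C itself produces for the direct conversion. */
uint32_t direct_conversion(int16_t i) {
    return (uint32_t)i;
}
```

For the input -1, via_signed_widening and direct_conversion both give 4,294,967,295, while via_unsigned_first gives 65,535.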

Clearly the "polymorphic value" idiom would require the implementation to do some unnatural, and probably implementation-dependent, bit mangling to get the right answer. Of course, an implementation is allowed to perform any unnatural acts it chooses. But programmers would have to conform to the prevailing constructor idiom when writing their constructors, and will want to write natural and portable code.