Proposal For Use of Virtual Tables
==================================

This is an adaptation of the earlier virtual proposal, updating it with new
ideas, re-framing it and laying out more design decisions. It should
eventually replace the earlier proposal, but not all features and syntax have
been converted to the new design.

The basic concept of a virtual table (vtable) is the same here as in most
other languages. They will mostly contain function pointers although they
should be able to store anything that goes into a trait.

I also include notes on a sample implementation, which primarily exists to
show there is a reasonable implementation. The code samples for that are in a
slight pseudo-code to help avoid name mangling and keep some CFA features,
while they would actually be written in C.

Trait Instances
---------------

Currently traits are completely abstract. Data types might implement a trait
but traits are not themselves data types. This proposal changes that and
allows instances of traits to be created from instances of data types that
implement the trait.

    trait combiner(otype T) {
        void combine(T&, int);
    };

    struct summation {
        int sum;
    };

    void ?{}( struct summation & this ) {
        this.sum = 0;
    }

    void combine( struct summation & this, int num ) {
        this.sum = this.sum + num;
    }

    trait combiner obj = struct summation{};
    combine(obj, 5);

As with `struct` (and `union` and `enum`), `trait` might be optional when
using the trait as a type name. A trait may be used in an assertion list as
before.

For traits to be used this way they should meet two requirements. First, they
should have only a single polymorphic type. Second, each assertion should use
that type exactly once as a parameter. Extensions may later loosen these
requirements.

If a trait object is used it should generate a series of implicit functions,
each of which implements one of the functions required by the trait.
So for combiner there is an implicit function:

    void combine(trait combiner & this, int);

This is the function actually called by `combine(obj, 5)` at the end of the
example above.

The main use case for trait objects is that they can be stored. They can be
passed into functions, but using the trait directly is preferred in this case.

    trait drawable(otype T) {
        void draw(Surface & to, T & draw);
        Rect(int) drawArea(T & draw);
    };

    struct UpdatingSurface {
        Surface * surface;
        vector(trait drawable) drawables;
    };

    void updateSurface(UpdatingSurface & us) {
        for (size_t i = 0 ; i < us.drawables.size ; ++i) {
            // surface is a pointer while draw takes a reference.
            draw(*us.surface, us.drawables[i]);
        }
    }

Currently these traits are limited to 1 trait parameter and each function
should have exactly 1 parameter of that type. The first limit is required
because we cannot abstract away pairs of types and still pass them into
normal functions, which take them separately.

The second is required because we need to get the vtable from somewhere. If
there are 0 trait objects then no vtable is available; if there is more than
1 then the vtables give conflicting answers on what underlying function to
call. And even then the underlying function assumes a concrete type.

This loop can sort of be broken by using the trait object directly in the
signature. This has a well defined meaning, but might not be useful.

    trait example(otype T) {
        bool test(T & this, trait example & that);
    }

#### Sample Implementation

A simple way to implement trait objects is by a pair of pointers: one to the
underlying object and one to the vtable.

    struct vtable_drawable {
        void (*draw)(Surface &, void *);
        Rect(int) (*drawArea)(void *);
    };

    struct drawable {
        void * object;
        vtable_drawable * vtable;
    };

The functions that run on the trait object would generally be generated using
the following pattern:

    void draw(Surface & surface, drawable & traitObj) {
        return traitObj.vtable->draw(surface, traitObj.object);
    }

There may have to be special cases for things like copy construction, which
might require a more significant wrapper.
On the other hand, moving could be implemented by moving the pointers without
any need to refer to the base object.

### Extension: Multiple Trait Parameters

Currently, this gives traits two independent uses. They use the same syntax,
except for the limits boxable traits have, and yet don't really mix. The most
natural way to bridge them is to allow trait instances to pick one parameter
that they are generic over and to choose concrete types for the others.

There are two ways to do the selection. The first is to do it at the trait
definition. Each trait picks out a single parameter which it can box (here
with the `virtual` qualifier). When you create an instance of a trait object
you provide arguments as for a generic structure, but skip over the marked
parameter.

    trait combiner(virtual otype T, otype Combined) {
        void combine(T &, Combined &);
    }

    trait combiner(int) int_combiner;

The second is to do it at the instantiation point. A placeholder (here the
keyword `virtual`) is used to explicitly skip over the parameter that will be
abstracted away, with the same rules as above if it was the marked parameter.

    trait combiner(otype T, otype Combined) {
        void combine(T &, Combined &);
    };

    trait combiner(virtual, int) int_combiner;

Using both (the first to set the default, the second as a local override)
would also work, although it might be excessively complicated.

This is useful in cases where you want to use a generic type, but leave part
of it open and store a partially generic result. As a simple example:

    trait folder(otype T, otype In, otype Out) {
        void fold(T & this, In);
        Out fold_result(T & this);
    }

This allows you to fold values without putting them in a container. If they
are already in a container this is excessive, but if they are generated over
time this gives you a simple interface. This could for instance be used in a
profiler, where T changes for each profiling statistic and you can plug in
multiple profilers for any run by adding them to an array.

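
As a rough illustration of the profiler idea, here is a minimal C sketch of
what `trait folder(virtual, int, int)` might lower to, using the
pair-of-pointers layout from the sample implementation above. Every name in
it (`folder_int`, `folder_int_vtable`, `sum_folder`, `max_folder`, `fold_all`
and the `*_as_folder` helpers) is hypothetical and not part of the proposal;
a real implementation would generate and mangle its own names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical vtable for `trait folder(virtual, int, int)`:
 * T is erased to void *, while In and Out are fixed to int. */
struct folder_int_vtable {
    void (*fold)(void *self, int in);
    int (*fold_result)(void *self);
};

/* Trait object: pointer to the underlying object plus its vtable. */
struct folder_int {
    void *object;
    const struct folder_int_vtable *vtable;
};

/* First concrete folder: a running sum. */
struct sum_folder { int sum; };

static void sum_fold(void *self, int in) {
    struct sum_folder *s = self;
    s->sum += in;
}

static int sum_fold_result(void *self) {
    struct sum_folder *s = self;
    return s->sum;
}

static const struct folder_int_vtable sum_folder_vtable = {
    sum_fold, sum_fold_result
};

/* Second concrete folder: a running maximum. */
struct max_folder { int max; };

void max_folder_init(struct max_folder *m) { m->max = INT_MIN; }

static void max_fold(void *self, int in) {
    struct max_folder *m = self;
    if (in > m->max) m->max = in;
}

static int max_fold_result(void *self) {
    struct max_folder *m = self;
    return m->max;
}

static const struct folder_int_vtable max_folder_vtable = {
    max_fold, max_fold_result
};

/* Binding a concrete object to the trait packages it with its vtable. */
struct folder_int sum_as_folder(struct sum_folder *s) {
    struct folder_int f = { s, &sum_folder_vtable };
    return f;
}

struct folder_int max_as_folder(struct max_folder *m) {
    struct folder_int f = { m, &max_folder_vtable };
    return f;
}

/* Wrappers on the trait object: look up the function in the vtable. */
void fold(struct folder_int *f, int in) {
    assert(f != NULL && f->vtable != NULL);
    f->vtable->fold(f->object, in);
}

int fold_result(struct folder_int *f) {
    assert(f != NULL && f->vtable != NULL);
    return f->vtable->fold_result(f->object);
}

/* Feed one value to every folder stored in an array. */
void fold_all(struct folder_int *folders, size_t count, int in) {
    for (size_t i = 0; i < count; ++i) {
        fold(&folders[i], in);
    }
}
```

With an array holding one folder of each kind, feeding 3, 7 and 5 through
`fold_all` leaves the sum folder at 15 and the max folder at 7, without the
caller ever naming the concrete types.
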

Hierarchy
---------

Virtual tables by themselves are not quite enough to implement the planned
hierarchy system. An addition of type ids, implemented as pointers which
point to your parent's type id, is required to actually create the shape of
the hierarchy. However vtables would allow behaviour to be carried with the
tree.

The hierarchy would be a tree of types, of traits and structs. Currently we
do not support structural extension, so traits form the internal nodes and
structures the leaf nodes.

The syntax is undecided but it will include a clause like `virtual (PARENT)`
on trait and struct definitions. It marks out all types in a hierarchy.
PARENT may be omitted; if it is, this type is the root of a hierarchy.
Otherwise it is the name of the type that is this type's parent in the
hierarchy.

Traits define a trait instance type that implements all assertions in this
trait and its parents up until the root of the hierarchy. Each trait then
defines a vtable type. Structures will also have a vtable type but it should
be the same as their parent's.

Trait objects within the tree can be statically cast to a parent type. Casts
from a parent type to a child type are conditional: they check to make sure
the underlying instance is an instance of the child type, or an instance of
one of its children. The type is then recoverable at run-time.

As with regular trait objects, calling a function on a trait object will
cause a look-up on the virtual table. The casting rules make sure anything
that can be cast to a trait type will have all the function implementations
for that trait.

Converting from a concrete type (structures at the edge of the hierarchy) to
an abstract type works the same as with normal trait objects: the underlying
object is packaged with a virtual table pointer. Converting back to a
concrete type requires confirming the underlying type matches, but then
simply extracts the pointer to it.

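
To make the conditional cast concrete, here is a minimal C sketch of the
parent-pointer check, assuming each vtable carries a pointer to its type id.
The names `type_id`, `vtable_header`, `trait_object`, `is_a` and
`virtual_cast` are hypothetical, as is the sibling type `parse_error`; the
other type names mirror the exception example that follows. This is one
possible shape for the check, not a settled design.

```c
#include <assert.h>
#include <stddef.h>

/* A type id is only a pointer to the parent's type id (NULL at the root). */
struct type_id {
    const struct type_id *parent;
};

/* A small hierarchy: exception <- io_error <- eof_error,
 * plus a hypothetical sibling: exception <- parse_error. */
const struct type_id exception_id   = { NULL };
const struct type_id io_error_id    = { &exception_id };
const struct type_id eof_error_id   = { &io_error_id };
const struct type_id parse_error_id = { &exception_id };

/* Assumed vtable layout: a pointer to the type id comes first and the
 * trait dependent function pointers would follow it. */
struct vtable_header {
    const struct type_id *id;
};

const struct vtable_header eof_error_vtable = { &eof_error_id };

/* Trait object: the underlying object packaged with a vtable pointer. */
struct trait_object {
    void *object;
    const struct vtable_header *vtable;
};

/* Returns 1 if id is target or one of its descendants, otherwise 0.
 * It walks the parent pointers towards the root of the hierarchy. */
int is_a(const struct type_id *id, const struct type_id *target) {
    for (; id != NULL; id = id->parent) {
        if (id == target) return 1;
    }
    return 0;
}

/* Conditional cast: yields the underlying object if its run-time type is
 * target or a child of target, and NULL otherwise. */
void *virtual_cast(const struct trait_object *obj,
                   const struct type_id *target) {
    assert(obj != NULL && obj->vtable != NULL && target != NULL);
    return is_a(obj->vtable->id, target) ? obj->object : NULL;
}
```

An object bound with `eof_error_vtable` casts successfully to `eof_error`,
`io_error` and `exception`, while a cast to `parse_error` yields NULL. The
cost of the check is proportional to the depth of the type in the tree.
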
Exception Example:
(Also I'm not sure where I got these casing rules.)

    trait exception(otype T) virtual() {
        char const * what(T & this);
    }

    trait io_error(otype T) virtual(exception) {
        FILE * which_file(T & this);
    }

    struct eof_error virtual(io_error) {
        FILE * file;
    }

    char const * what(eof_error &) {
        return "Tried to read from an empty file.";
    }

    FILE * which_file(eof_error & this) {
        return this.file;
    }

Ast Example:

    trait ast_node(otype T) virtual() {
        void print(T & this, ostream & out);
        void visit(T & this, Visitor & visitor);
        CodeLocation const & get_code_location(T & this);
    }

    trait expression_node(otype T) virtual(ast_node) {
        Type eval_type(T const & this);
    }

    struct operator_expression virtual(expression_node) {
        enum operator_kind kind;
        trait expression_node rands[2];
    }

    trait statement_node(otype T) virtual(ast_node) {
        vector(Label) & get_labels(T & this);
    }

    struct goto_statement virtual(statement_node) {
        vector(Label) labels;
        Label target;
    }

    trait declaration_node(otype T) virtual(ast_node) {
        string name_of(T const & this);
        Type type_of(T const & this);
    }

    struct using_declaration virtual(declaration_node) {
        string new_type;
        Type old_type;
    }

    struct variable_declaration virtual(declaration_node) {
        string name;
        Type type;
    }

#### Sample Implementation

The type id may be as little as:

    struct typeid {
        struct typeid const * const parent;
    };

Some linker magic would have to be used to ensure exactly one copy of each
structure for each type exists in memory. There seem to be special once
sections that support this and it should be easier than generating unique
ids across compilation units.

The structure could be extended to contain any additional type information.

There are two general designs for vtables with type ids. The first is to put
the type id at the top of the vtable. This is the most compact and efficient
solution but only works if we have exactly 1 vtable for each type. The second
is to put a pointer to the type id in each vtable.
This has more overhead but allows multiple vtables.

    // First design: the type id sits at the top of the vtable.
    struct _vtable {
        struct typeid const id;

        // Trait dependent list of vtable members.
    };

    // Second design: the vtable holds a pointer to the type id.
    struct _vtable {
        struct typeid const * const id;

        // Trait dependent list of vtable members.
    };

### Virtual Casts

To convert from a pointer to a type higher on the hierarchy to one lower on
the hierarchy, a check is used to make sure that the underlying type is also
of that lower type.

The proposed syntax for this is:

    trait SubType * new_value = (virtual trait SubType *)super_type;

It will return the same pointer if it does point to the subtype and null if
it does not, doing the check and conversion in one operation.

### Inline vtables

Since the structures here are usually made to be turned into trait objects it
might be worth it to have fields on them to store the virtual table pointer.
This would have to be declared on the trait as an assertion (example:
`vtable;` or `T.vtable;`), but if it is, the trait object could be a single
pointer.

There are also three options for where the pointer to the vtable is stored.
It could be anywhere, at a fixed location for each trait, or always at the
front. For the per-trait solution, an extension could specify where it is
(example: `vtable[0];`), which could also be used to combine it with others.
So these options can be combined to allow access to all three options.

### Virtual Tables as Types

Here we consider an encoding plus the implementation of functions on it to be
a type. Which is to say, in the type hierarchy structures aren't concrete
types anymore; instead they are parent types to vtables, which combine the
encoding and implementation.

Resolution Scope
----------------

What is the scope of a resolution? When are the functions in a vtable decided
and how broadly is this applied?

### Type Level:

Each structure has a single resolution for all of the functions in the
virtual trait. This is how many languages that implement this or similar
features do it.

The main thing CFA would need to do it this way is some single point where
the type declaration, including the functions that satisfy the trait, is
fully defined. Currently there are many points where this can happen, not all
of them will have the same definitions, and there is no way to select one
over the other.

Some syntax would have to be added to specify the resolution point. To ensure
a single instance there may have to be two variants, one forward declaration
and one to create the instance. With some compiler magic the forward
declaration may be enough.

    extern trait combiner(struct summation) vtable;
    trait combiner(struct summation) vtable;

Or (with the same variants):

    vtable combiner(struct summation);

The extern variant promises that the vtable will exist while the normal one
is where the resolution actually happens.

### Explicit Resolution Points:

Slightly looser than the above, there are explicit points where the vtables
are resolved, but there is no limit on the number of resolution points that
might be provided. Each time an object is bound to a trait, one of the
resolutions is selected. This might be the most flexible option.

A syntax would have to be provided as above. There may also be the option to
name resolution points so that you can choose between them. This also could
come with the ability to forward declare them.

Especially if they are not named, these resolution points should be able to
appear in functions, where the scoping rules can be used to select one.
However this also means that stack-allocated functions can end up in the
vtable.

    extern trait combiner(struct summation) vtable sum;
    trait combiner(struct summation) vtable sum;

    extern trait combiner(struct summation) vtable sum default;
    trait combiner(struct summation) vtable sum default;

The extern difference is the same as before. The name (`sum` in the samples)
is used at the binding site to say which one is picked. The `default` keyword
can be used in only some of the declarations.
At the binding site this looks like:

    trait combiner fee = (summation_instance, sum);
    trait combiner foe = summation_instance;

(I am not really happy about this syntax, but it kind of works.)
The object being bound is required. The name of the vtable is optional if
there is exactly one vtable name marked with `default`.

These could also be placed inside functions, in which case both the name and
the `default` keyword might be optional. If the name is omitted in an
assignment the closest vtable is chosen (returning to the global default rule
if no appropriate local vtable is in scope).

### Site Based Resolution:

Every place in code where the binding of a vtable to an object occurs has its
own resolution. Syntax-wise this is the simplest as it should be able to use
just the existing declarations and the conversion to trait object. It also is
very close to the current polymorphic resolution rules.

This works like the explicit resolution points, except the resolution points
are implicit and there would be no selection of which resolution to use. The
closest (current) resolution is always selected.

This could easily lead to an explosion of vtables: as it has the most
fine-grained resolution, the number of bindings in a single scope (that
produce the same binding) could be quite high. Merging identical vtables
might help reduce that.

Vtable Lifetime Issues
----------------------

Vtables interact badly with the thunk issue. Conceptually vtables are static
like the type/function data they carry, as those decisions are made by the
resolver at compile time.

Stack allocated functions interact badly with this because they are not
static. There are several ways to try to resolve this, however without a
general solution most can only buy time.

Filling in some fields of a static vtable could cause issues on a recursive
call. And then we are still limited by the lifetime of the stack functions,
as the vtable with stale pointers is still a problem.

Dynamically allocated vtables introduce memory management overhead and
require some way to differentiate between dynamically and statically
allocated tables. The stale function pointer problem continues unless the
functions become dynamically allocated as well, which gives us the same costs
again.

Stack allocating the vtable seems like the best option. The vtable's lifetime
is now the limiting factor but it should be effectively the same as the
shortest lifetime of a function assigned to it. However this still limits the
lifetime "implicitly" and returns to the original problem with thunks.