Proposal For Use of Virtual Tables
==================================

This is an adaptation of the earlier virtual proposal, updating it with new
ideas, re-framing it and laying out more design decisions. It should
eventually replace the earlier proposal, but not all features and syntax have
been converted to the new design.

The basic concept of a virtual table (vtable) is the same here as in most
other languages. They will mostly contain function pointers although they
should be able to store anything that goes into a trait.

I also include notes on a sample implementation, which primarly exists to show
there is a resonable implementation. The code samples for that are in a slight
psudo-code to help avoid name mangling and keeps some CFA features while they
would actually be writen in C.

Trait Instances
---------------

Currently traits are completely abstract. Data types might implement a trait
but traits are not themselves data types. This will change that and allow
instances of traits to be created from instances of data types that implement
the trait.

    trait combiner(otype T) {
		void combine(T&, int);
	};

    struct summation {
		int sum;
	};

	void ?{}( struct summation & this ) {
		this.sum = 0;
	}

    void combine( struct summation & this, int num ) {
		this.sum = this.sum + num;
	}

	trait combiner obj = struct summation{};
	combine(obj, 5);

As with `struct` (and `union` and `enum`), `trait` might be optional when
using the trait as a type name. A trait may be used in assertion list as
before.

For traits to be used this way they should meet two requirements. First they
should only have a single polymorphic type and each assertion should use that
type once as a parameter. Extentions may later loosen these requirements.

If a trait object is used it should generate a series of implicate functions
each of which implements one of the functions required by the trait. So for
combiner there is an implicate:

    void combine(trait combiner & this, int);

This function is the one actually called at the end

The main use case for trait objects is that they can be stored. They can be
passed into functions, but using the trait directly is prefred in this case.

    trait drawable(otype T) {
        void draw(Surface & to, T & draw);
        Rect(int) drawArea(T & draw);
    };

    struct UpdatingSurface {
        Surface * surface;
        vector(trait drawable) drawables;
    };

    void updateSurface(UpdatingSurface & us) {
        for (size_t i = 0 ; i < us.drawables.size ; ++i) {
            draw(us.surface, us.drawables[i]);
        }
    }

Currently these traits are limited to 1 trait parameter and functions should
have exactly 1 parameter. We cannot abstract away pairs of types and still
pass them into normal functions, which take them seperately.

The second is required the because we need to get the vtable from somewhere.
If there are 0 trait objects than no vtable is avalible, if we have more than
1 than the vtables give conflicting answers on what underlying function to
call. And even then the underlying type assumes a concrete type.

This loop can sort of be broken by using the trait object directly in the
signature. This has well defined meaning, but might not be useful.

    trait example(otype T) {
        bool test(T & this, trait example & that);
    }

#### Sample Implementation
A simple way to implement trait objects is by a pair of pointers. One to the
underlying object and one to the vtable.

    struct vtable_drawable {
        void (*draw)(Surface &, void *);
        Rect(int) (*drawArea)(void *);
    };

    struct drawable {
        void * object;
        vtable_drawable * vtable;
    };

The functions that run on the trait object would generally be generated using
the following pattern:

    void draw(Surface & surface, drawable & traitObj) {
        return traitObj.vtable->draw(surface, traitObj.object);
    }

There may have to be special cases for things like copy construction, that
might require a more sigificant wrapper. On the other hand moving could be
implemented by moving the pointers without any need to refer to the base
object.

### Extention: Multiple Trait Parameters
Currently, this gives traits two independent uses. They use the same syntax,
except for limits boxable traits have, and yet don't really mix. The most
natural way to do this is to allow trait instances to pick one parameter
that they are generic over, the others they choose types to implement.

The two ways to do the selection, the first is do it at the trait definition.
Each trait picks out a single parameter which it can box (here the `virtual`
qualifier). When you create an instance of a trait object you provide
arguments like for a generic structure, but skip over the marked parameter.

    trait combiner(virtual otype T, otype Combined) {
        void combine(T &, Combined &);
    }

    trait combiner(int) int_combiner;

The second is to do it at the instaniation point. A placeholder (here the
keyword `virtual`) is used to explicately skip over the parameter that will be
abstracted away, with the same rules as above if it was the marked parameter.

    trait combiner(otype T, otype Combined) {
        void combine(T &, Combined &);
    };

    trait combiner(virtual, int) int_combiner;

Using both (first to set the default, second as a local override) would also
work, although might be exessively complicated.

This is useful in cases where you want to use a generic type, but leave part
of it open and store partially generic result. As a simple example

    trait folder(otype T, otype In, otype Out) {
        void fold(T & this, In);
        Out fold_result(T & this);
    }

Which allows you to fold values without putting them in a container. If they
are already in a container this is exessive, but if they are generated over
time this gives you a simple interface. This could for instance be used in
a profile, where T changes for each profiling statistic and you can plug in
multiple profilers for any run by adding them to an array.

Hierarchy
---------

Virtual tables by them selves are not quite enough to implement the planned
hierarchy system. An addition of type ids, implemented as pointers which
point to your parent's type id, is required to actually create the shape of
the hierarchy. However vtables would allow behaviour to be carried with the
tree.

The hierarchy would be a tree of types, of traits and structs. Currently we do
not support structural extension, so traits form the internal nodes and
structures the leaf nodes.

The syntax is undecided but it will include a clause like `virtual (PARENT)`
on trait and struct definitions. It marks out all types in a hierarchy.
PARENT may be omitted, if it is this type is the root of a hierarchy. Otherwise
it is the name of the type that is this type's parent in the hierarchy.

Traits define a trait instance type that implements all assertions in this
trait and its parents up until the root of the hierarchy. Each trait then
defines a vtable type. Structures will also have a vtable type but it should
be the same as their parent's.

Trait objects within the tree can be statically cast to a parent type. Casts
from a parent type to a child type are conditional, they check to make sure
the underlying instance is an instance of the child type, or an instance of
one of its children. The type then is recoverable at run-time.

As with regular trait objects, calling a function on a trait object will cause
a look-up on the the virtual table. The casting rules make sure anything that
can be cast to a trait type will have all the function implementations for
that trait.

Converting from a concrete type (structures at the edge of the hierarchy) to
an abstract type works the same as with normal trait objects, the underlying
object is packaged with a virtual table pointer. Converting back to an abstract
type requires confirming the underlying type matches, but then simply extracts
the pointer to it.

Exception Example:
(Also I'm not sure where I got these casing rules.)

    trait exception(otype T) virtual() {
        char const * what(T & this);
    }

    trait io_error(otype T) virtual(exception) {
        FILE * which_file(T & this);
    }

    struct eof_error(otype T) virtual(io_error) {
        FILE * file;
    }

    char const * what(eof_error &) {
        return "Tried to read from an empty file.";
    }

    FILE * which_file(eof_error & this) {
        return eof_error.file;
    }

Ast Example:

    trait ast_node(otype T) virtual() {
        void print(T & this, ostream & out);
        void visit(T & this, Visitor & visitor);
        CodeLocation const & get_code_location(T & this);
    }

    trait expression_node(otype T) virtual(ast_node) {
        Type eval_type(T const & this);
    }

    struct operator_expression virtual(expression_node) {
        enum operator_kind kind;
        trait expression_node rands[2];
    }

    trait statement_node(otype T) virtual(ast_node) {
        vector(Label) & get_labels(T & this);
    }

    struct goto_statement virtual(statement_node) {
        vector(Label) labels;
        Label target;
    }

    trait declaration_node(otype T) virtual(ast_node) {
        string name_of(T const & this);
        Type type_of(T const & this);
    }

    struct using_declaration virtual(declaration_node) {
        string new_type;
        Type old_type;
    }

    struct variable_declaration virtual(declaration_node) {
        string name;
        Type type;
    }

#### Sample Implementation
The type id may be as little as:

    struct typeid {
        struct typeid const * const parent;
    };

Some linker magic would have to be used to ensure exactly one copy of each
structure for each type exists in memory. There seem to be spectial once
sections that support this and it should be easier than generating unique
ids across compilation units.

The structure could be extended to contain any additional type information.

There are two general designs for vtables with type ids. The first is to put
the type id at the top of the vtable, this is the most compact and efficient
solution but only works if we have exactly 1 vtable for each type. The second
is to put a pointer to the type id in each vtable. This has more overhead but
allows multiple vtables.

    struct <trait>_vtable {
        struct typeid const id;

        // Trait dependent list of vtable members.
    };

    struct <trait>_vtable {
        struct typeid const * const id;

        // Trait dependent list of vtable members.
    };

### Virtual Casts
To convert from a pointer to a type higher on the hierarchy to one lower on
the hierarchy a check is used to make sure that the underlying type is also
of that lower type.

The proposed syntax for this is:

    trait SubType * new_value = (virtual trait SubType *)super_type;

It will return the same pointer if it does point to the subtype and null if
it does not, doing the check and conversion in one operation.

### Inline vtables
Since the structures here are usually made to be turned into trait objects
it might be worth it to have fields on them to store the virtual table
pointer. This would have to be declared on the trait as an assertion (example:
`vtable;` or `T.vtable;`), but if it is the trait object could be a single
pointer.

There are also three options for where the pointer to the vtable. It could be
anywhere, a fixed location for each trait or always at the front. For the per-
trait solution an extention to specify what it is (example `vtable[0];`) which
could also be used to combine it with others. So these options can be combined
to allow access to all three options.

### Virtual Tables as Types
Here we consider encoding plus the implementation of functions on it to be a
type. Which is to say in the type hierarchy structures aren't concrete types
anymore, instead they are parent types to vtables, which combine the encoding
and implementation.

Resolution Scope
----------------

What is the scope of a resolution? When are the functions in a vtable decided
and how broadly is this applied?

### Type Level:
Each structure has a single resolution for all of the functions in the
virtual trait. This is how many languages that implement this or similar
features do it.

The main thing CFA would need to do it this way is some single point where
the type declaration, including the functions that satisfy the trait, are
all defined. Currently there are many points where this can happen, not all
of them will have the same definitions and no way to select one over the
other.

Some syntax would have to be added to specify the resolution point. To ensure
a single instance there may have to be two variants, one forward declaration
and one to create the instance. With some compiler magic the forward
declaration maybe enough.

    extern trait combiner(struct summation) vtable;
    trait combiner(struct summation) vtable;

Or (with the same variants):

    vtable combiner(struct summation);

The extern variant promises that the vtable will exist while the normal one
is where the resolution actually happens.

### Explicit Resolution Points:
Slightly looser than the above, there are explicit points where the vtables
are resolved, but there is no limit on the number of resolution points that
might be provided. Each time a object is bound to a trait, one of the
resolutions is selected. This might be the most flexible option.

An syntax would have to be provided as above. There may also be the option
to name resolution points so that you can choose between them. This also
could come with the ability to forward declare them.

Especially if they are not named, these resolution points should be able to
appear in functions, where the scoping rules can be used to select one.
However this also means that stack-allocated functions can end up in the
vtable.

    extern trait combiner(struct summation) vtable sum;
    trait combiner(struct summation) vtable sum;

    extern trait combiner(struct summation) vtable sum default;
    trait combiner(struct summation) vtable sum default;

The extern difference is the same before. The name (sum in the samples) is
used at the binding site to say which one is picked. The default keyword can
be used in only some of the declarations.

    trait combiner fee = (summation_instance, sum);
    trait combiner foe = summation_instance;

(I am not really happy about this syntax, but it kind of works.)
The object being bound is required. The name of the vtable is optional if
there is exactly one vtable name marked with default.

These could also be placed inside functions. In which case both the name and
the default keyword might be optional. If the name is ommited in an assignment
the closest vtable is choosen (returning to the global default rule if no
approprate local vtable is in scope).

### Site Based Resolution:
Every place in code where the binding of a vtable to an object occurs has
its own resolution. Syntax-wise this is the simplest as it should be able
to use just the existing declarations and the conversion to trait object.
It also is very close to the current polymorphic resolution rules.

This works as the explicit resolution points except the resolution points
are implicit and their would be no selection of which resolution to use. The
closest (current) resolution is always selected.

This could easily lead to an explosion of vtables as it has the most fine
grained resolution the number of bindings in a single scope (that produces
the same binding) could be quite high. Merging identical vtables might help
reduce that.

Vtable Lifetime Issues
----------------------

Vtables interact badly with the thunk issue. Conceptually vtables are static
like type/function data they carry, as those decisions are made by the
resolver at compile time.

Stack allocated functions interact badly with this because they are not
static. There are several ways to try to resolve this, however without a
general solution most can only buy time.

Filling in some fields of a static vtable could cause issues on a recursive
call. And then we are still limited by the lifetime of the stack functions, as
the vtable with stale pointers is still a problem.

Dynamically allocated vtables introduces memory management overhead and
requires some way to differentiate between dynamic and statically allocated
tables. The stale function pointer problem continues unless those becomes
dynamically allocated as well which gives us the same costs again.

Stack allocating the vtable seems like the best issue. The vtable's lifetime
is now the limiting factor but it should be effectively the same as the
shortest lifetime of a function assigned to it. However this still limits the
lifetime "implicitly" and returns to the original problem with thunks.