source: doc/proposals/vtable.md @ b0ccd1c

ADTarm-ehast-experimentalcleanup-dtorsenumforall-pointer-decayjacob/cs343-translationjenkins-sandboxnew-astnew-ast-unique-exprpthread-emulationqualifiedEnum
Last change on this file since b0ccd1c was 07ac6d0, checked in by Andrew Beach <ajbeach@…>, 5 years ago

Its rough, but I think I have all the content I need in the vtable proposal now.

  • Property mode set to 100644
File size: 16.5 KB
Line 
1Proposal For Use of Virtual Tables
2==================================
3
4This is an adaptation of the earlier virtual proposal, updating it with new
5ideas, re-framing it and laying out more design decisions. It should
6eventually replace the earlier proposal, but not all features and syntax have
7been converted to the new design.
8
9The basic concept of a virtual table (vtable) is the same here as in most
10other languages. They will mostly contain function pointers although they
11should be able to store anything that goes into a trait.
12
13I also include notes on a sample implementation, which primarly exists to show
14there is a resonable implementation. The code samples for that are in a slight
15psudo-code to help avoid name mangling and keeps some CFA features while they
16would actually be writen in C.
17
18Trait Instances
19---------------
20
21Currently traits are completely abstract. Data types might implement a trait
22but traits are not themselves data types. This will change that and allow
23instances of traits to be created from instances of data types that implement
24the trait.
25
26    trait combiner(otype T) {
27                void combine(T&, int);
28        };
29
30    struct summation {
31                int sum;
32        };
33
34        void ?{}( struct summation & this ) {
35                this.sum = 0;
36        }
37
38    void combine( struct summation & this, int num ) {
39                this.sum = this.sum + num;
40        }
41
42        trait combiner obj = struct summation{};
43        combine(obj, 5);
44
45As with `struct` (and `union` and `enum`), `trait` might be optional when
46using the trait as a type name. A trait may be used in assertion list as
47before.
48
49For traits to be used this way they should meet two requirements. First they
50should only have a single polymorphic type and each assertion should use that
51type once as a parameter. Extentions may later loosen these requirements.
52
53If a trait object is used it should generate a series of implicate functions
54each of which implements one of the functions required by the trait. So for
55combiner there is an implicate:
56
57    void combine(trait combiner & this, int);
58
59This function is the one actually called at the end
60
61The main use case for trait objects is that they can be stored. They can be
62passed into functions, but using the trait directly is prefred in this case.
63
64    trait drawable(otype T) {
65        void draw(Surface & to, T & draw);
66        Rect(int) drawArea(T & draw);
67    };
68
69    struct UpdatingSurface {
70        Surface * surface;
71        vector(trait drawable) drawables;
72    };
73
74    void updateSurface(UpdatingSurface & us) {
75        for (size_t i = 0 ; i < us.drawables.size ; ++i) {
76            draw(us.surface, us.drawables[i]);
77        }
78    }
79
80Currently these traits are limited to 1 trait parameter and functions should
81have exactly 1 parameter. We cannot abstract away pairs of types and still
82pass them into normal functions, which take them seperately.
83
84The second is required the because we need to get the vtable from somewhere.
85If there are 0 trait objects than no vtable is avalible, if we have more than
861 than the vtables give conflicting answers on what underlying function to
87call. And even then the underlying type assumes a concrete type.
88
89This loop can sort of be broken by using the trait object directly in the
90signature. This has well defined meaning, but might not be useful.
91
92    trait example(otype T) {
93        bool test(T & this, trait example & that);
94    }
95
96#### Sample Implementation
97A simple way to implement trait objects is by a pair of pointers. One to the
98underlying object and one to the vtable.
99
100    struct vtable_drawable {
101        void (*draw)(Surface &, void *);
102        Rect(int) (*drawArea)(void *);
103    };
104
105    struct drawable {
106        void * object;
107        vtable_drawable * vtable;
108    };
109
110The functions that run on the trait object would generally be generated using
111the following pattern:
112
113    void draw(Surface & surface, drawable & traitObj) {
114        return traitObj.vtable->draw(surface, traitObj.object);
115    }
116
117There may have to be special cases for things like copy construction, that
118might require a more sigificant wrapper. On the other hand moving could be
119implemented by moving the pointers without any need to refer to the base
120object.
121
122### Extention: Multiple Trait Parameters
123Currently, this gives traits two independent uses. They use the same syntax,
124except for limits boxable traits have, and yet don't really mix. The most
125natural way to do this is to allow trait instances to pick one parameter
126that they are generic over, the others they choose types to implement.
127
128The two ways to do the selection, the first is do it at the trait definition.
129Each trait picks out a single parameter which it can box (here the `virtual`
130qualifier). When you create an instance of a trait object you provide
131arguments like for a generic structure, but skip over the marked parameter.
132
133    trait combiner(virtual otype T, otype Combined) {
134        void combine(T &, Combined &);
135    }
136
137    trait combiner(int) int_combiner;
138
139The second is to do it at the instaniation point. A placeholder (here the
140keyword `virtual`) is used to explicately skip over the parameter that will be
141abstracted away, with the same rules as above if it was the marked parameter.
142
143    trait combiner(otype T, otype Combined) {
144        void combine(T &, Combined &);
145    };
146
147    trait combiner(virtual, int) int_combiner;
148
149Using both (first to set the default, second as a local override) would also
150work, although might be exessively complicated.
151
152This is useful in cases where you want to use a generic type, but leave part
153of it open and store partially generic result. As a simple example
154
155    trait folder(otype T, otype In, otype Out) {
156        void fold(T & this, In);
157        Out fold_result(T & this);
158    }
159
160Which allows you to fold values without putting them in a container. If they
161are already in a container this is exessive, but if they are generated over
162time this gives you a simple interface. This could for instance be used in
163a profile, where T changes for each profiling statistic and you can plug in
164multiple profilers for any run by adding them to an array.
165
166Hierarchy
167---------
168
169Virtual tables by them selves are not quite enough to implement the planned
170hierarchy system. An addition of type ids, implemented as pointers which
171point to your parent's type id, is required to actually create the shape of
172the hierarchy. However vtables would allow behaviour to be carried with the
173tree.
174
175The hierarchy would be a tree of types, of traits and structs. Currently we do
176not support structural extension, so traits form the internal nodes and
177structures the leaf nodes.
178
179The syntax is undecided but it will include a clause like `virtual (PARENT)`
180on trait and struct definitions. It marks out all types in a hierarchy.
181PARENT may be omitted, if it is this type is the root of a hierarchy. Otherwise
182it is the name of the type that is this type's parent in the hierarchy.
183
184Traits define a trait instance type that implements all assertions in this
185trait and its parents up until the root of the hierarchy. Each trait then
186defines a vtable type. Structures will also have a vtable type but it should
187be the same as their parent's.
188
189Trait objects within the tree can be statically cast to a parent type. Casts
190from a parent type to a child type are conditional, they check to make sure
191the underlying instance is an instance of the child type, or an instance of
192one of its children. The type then is recoverable at run-time.
193
194As with regular trait objects, calling a function on a trait object will cause
195a look-up on the the virtual table. The casting rules make sure anything that
196can be cast to a trait type will have all the function implementations for
197that trait.
198
199Converting from a concrete type (structures at the edge of the hierarchy) to
200an abstract type works the same as with normal trait objects, the underlying
201object is packaged with a virtual table pointer. Converting back to an abstract
202type requires confirming the underlying type matches, but then simply extracts
203the pointer to it.
204
205Exception Example:
206(Also I'm not sure where I got these casing rules.)
207
208    trait exception(otype T) virtual() {
209        char const * what(T & this);
210    }
211
212    trait io_error(otype T) virtual(exception) {
213        FILE * which_file(T & this);
214    }
215
216    struct eof_error(otype T) virtual(io_error) {
217        FILE * file;
218    }
219
220    char const * what(eof_error &) {
221        return "Tried to read from an empty file.";
222    }
223
224    FILE * which_file(eof_error & this) {
225        return eof_error.file;
226    }
227
228Ast Example:
229
230    trait ast_node(otype T) virtual() {
231        void print(T & this, ostream & out);
232        void visit(T & this, Visitor & visitor);
233        CodeLocation const & get_code_location(T & this);
234    }
235
236    trait expression_node(otype T) virtual(ast_node) {
237        Type eval_type(T const & this);
238    }
239
240    struct operator_expression virtual(expression_node) {
241        enum operator_kind kind;
242        trait expression_node rands[2];
243    }
244
245    trait statement_node(otype T) virtual(ast_node) {
246        vector(Label) & get_labels(T & this);
247    }
248
249    struct goto_statement virtual(statement_node) {
250        vector(Label) labels;
251        Label target;
252    }
253
254    trait declaration_node(otype T) virtual(ast_node) {
255        string name_of(T const & this);
256        Type type_of(T const & this);
257    }
258
259    struct using_declaration virtual(declaration_node) {
260        string new_type;
261        Type old_type;
262    }
263
264    struct variable_declaration virtual(declaration_node) {
265        string name;
266        Type type;
267    }
268
269#### Sample Implementation
270The type id may be as little as:
271
272    struct typeid {
273        struct typeid const * const parent;
274    };
275
276Some linker magic would have to be used to ensure exactly one copy of each
277structure for each type exists in memory. There seem to be spectial once
278sections that support this and it should be easier than generating unique
279ids across compilation units.
280
281The structure could be extended to contain any additional type information.
282
283There are two general designs for vtables with type ids. The first is to put
284the type id at the top of the vtable, this is the most compact and efficient
285solution but only works if we have exactly 1 vtable for each type. The second
286is to put a pointer to the type id in each vtable. This has more overhead but
287allows multiple vtables.
288
289    struct <trait>_vtable {
290        struct typeid const id;
291
292        // Trait dependent list of vtable members.
293    };
294
295    struct <trait>_vtable {
296        struct typeid const * const id;
297
298        // Trait dependent list of vtable members.
299    };
300
301### Virtual Casts
302To convert from a pointer to a type higher on the hierarchy to one lower on
303the hierarchy a check is used to make sure that the underlying type is also
304of that lower type.
305
306The proposed syntax for this is:
307
308    trait SubType * new_value = (virtual trait SubType *)super_type;
309
310It will return the same pointer if it does point to the subtype and null if
311it does not, doing the check and conversion in one operation.
312
313### Inline vtables
314Since the structures here are usually made to be turned into trait objects
315it might be worth it to have fields on them to store the virtual table
316pointer. This would have to be declared on the trait as an assertion (example:
317`vtable;` or `T.vtable;`), but if it is the trait object could be a single
318pointer.
319
320There are also three options for where the pointer to the vtable. It could be
321anywhere, a fixed location for each trait or always at the front. For the per-
322trait solution an extention to specify what it is (example `vtable[0];`) which
323could also be used to combine it with others. So these options can be combined
324to allow access to all three options.
325
326### Virtual Tables as Types
327Here we consider encoding plus the implementation of functions on it to be a
328type. Which is to say in the type hierarchy structures aren't concrete types
329anymore, instead they are parent types to vtables, which combine the encoding
330and implementation.
331
332Resolution Scope
333----------------
334
335What is the scope of a resolution? When are the functions in a vtable decided
336and how broadly is this applied?
337
338### Type Level:
339Each structure has a single resolution for all of the functions in the
340virtual trait. This is how many languages that implement this or similar
341features do it.
342
343The main thing CFA would need to do it this way is some single point where
344the type declaration, including the functions that satisfy the trait, are
345all defined. Currently there are many points where this can happen, not all
346of them will have the same definitions and no way to select one over the
347other.
348
349Some syntax would have to be added to specify the resolution point. To ensure
350a single instance there may have to be two variants, one forward declaration
351and one to create the instance. With some compiler magic the forward
352declaration maybe enough.
353
354    extern trait combiner(struct summation) vtable;
355    trait combiner(struct summation) vtable;
356
357Or (with the same variants):
358
359    vtable combiner(struct summation);
360
361The extern variant promises that the vtable will exist while the normal one
362is where the resolution actually happens.
363
364### Explicit Resolution Points:
365Slightly looser than the above, there are explicit points where the vtables
366are resolved, but there is no limit on the number of resolution points that
367might be provided. Each time a object is bound to a trait, one of the
368resolutions is selected. This might be the most flexible option.
369
370An syntax would have to be provided as above. There may also be the option
371to name resolution points so that you can choose between them. This also
372could come with the ability to forward declare them.
373
374Especially if they are not named, these resolution points should be able to
375appear in functions, where the scoping rules can be used to select one.
376However this also means that stack-allocated functions can end up in the
377vtable.
378
379    extern trait combiner(struct summation) vtable sum;
380    trait combiner(struct summation) vtable sum;
381
382    extern trait combiner(struct summation) vtable sum default;
383    trait combiner(struct summation) vtable sum default;
384
385The extern difference is the same before. The name (sum in the samples) is
386used at the binding site to say which one is picked. The default keyword can
387be used in only some of the declarations.
388
389    trait combiner fee = (summation_instance, sum);
390    trait combiner foe = summation_instance;
391
392(I am not really happy about this syntax, but it kind of works.)
393The object being bound is required. The name of the vtable is optional if
394there is exactly one vtable name marked with default.
395
396These could also be placed inside functions. In which case both the name and
397the default keyword might be optional. If the name is ommited in an assignment
398the closest vtable is choosen (returning to the global default rule if no
399approprate local vtable is in scope).
400
401### Site Based Resolution:
402Every place in code where the binding of a vtable to an object occurs has
403its own resolution. Syntax-wise this is the simplest as it should be able
404to use just the existing declarations and the conversion to trait object.
405It also is very close to the current polymorphic resolution rules.
406
407This works as the explicit resolution points except the resolution points
408are implicit and their would be no selection of which resolution to use. The
409closest (current) resolution is always selected.
410
411This could easily lead to an explosion of vtables as it has the most fine
412grained resolution the number of bindings in a single scope (that produces
413the same binding) could be quite high. Merging identical vtables might help
414reduce that.
415
416Vtable Lifetime Issues
417----------------------
418
419Vtables interact badly with the thunk issue. Conceptually vtables are static
420like type/function data they carry, as those decisions are made by the
421resolver at compile time.
422
423Stack allocated functions interact badly with this because they are not
424static. There are several ways to try to resolve this, however without a
425general solution most can only buy time.
426
427Filling in some fields of a static vtable could cause issues on a recursive
428call. And then we are still limited by the lifetime of the stack functions, as
429the vtable with stale pointers is still a problem.
430
431Dynamically allocated vtables introduces memory management overhead and
432requires some way to differentiate between dynamic and statically allocated
433tables. The stale function pointer problem continues unless those becomes
434dynamically allocated as well which gives us the same costs again.
435
436Stack allocating the vtable seems like the best issue. The vtable's lifetime
437is now the limiting factor but it should be effectively the same as the
438shortest lifetime of a function assigned to it. However this still limits the
439lifetime "implicitly" and returns to the original problem with thunks.
Note: See TracBrowser for help on using the repository browser.