source: doc/proposals/autogen.md@ 43911a0

Last change on this file since 43911a0 was 43911a0, checked in by Michael Brooks <mlbrooks@…>, 22 hours ago

Add issues with autogen removal: hiding too eagerly and not acting inspectably

# Review of Auto-generation
There have been known issues with auto-generated routines for a long time. Although no one has time to leap onto the problem right now, we figure people should start thinking about it. The first step is to collect all the grievances with the current system.

# Core Features
What are the core features of autogeneration, or that autogeneration allows?

## C Compatibility
Old C code should continue to work without any (or with minimal) changes. Furthermore, C-style code should usually work when mixed with CFA features. This includes behaviour not implemented as operators in CFA (such as field access and designators) as well as behaviour that is.

Note that some CFA features can disable C compatibility; for instance, visibility modifiers on fields might disable by-field initialization (PAB explain). However, orthogonal features, such as polymorphism, should not.

## Life-Time Functions
We want to get the life-time functions (destructor, copy assignment and copy construction) without having to write them when they are obvious, as C++ does.

This actually has a lot of overlap with C Compatibility, in that these are also operations you can perform on values in C. So these functions should act like the primitive operations in C in those cases.

## Custom Implementations
We should be able to write custom implementations of the operators. A custom implementation can replace one of the generated functions. It can also add a new operator for the type.

## Purposeful Missing Functions
Among the C-compatibility functions and life-time functions, there are sometimes some that we do not need, and in fact do not want. It should be possible to remove these functions, and any attempt to use them should be rejected at compile time.

# Problems
Those are the principles we have, but here are particular issues.
(Thanks to Mike for producing a lot of the examples.)

## Problems With Generated Functions

### Value Call Semantics
This is actually a more general issue than autogenerated functions, but the copy constructor and copy assignment operators still take their source argument by value. That argument has to be copied C-style in order to implement the copy operator. When this is fixed, autogeneration will have to be updated as well.

Current Forms:
    void ?{}(char &, char);
    char ?=?(char &, char);

New Forms:
    void ?{}(char &, char const &);
    char & ?=?(char &, char const &);

### Unused Assertions Still Added to the Assertion List
All assertions on the type declaration are added to every autogenerated function, even to functions that never use them. For example:

The declaration of:
    forall(T)
    struct Cell { T x; };

Results in the following autogenerated declarations:
    forall(T* | { T ?=?(T&, T); void ?{}(T&); void ?{}(T&, T); void ^?{}(T&); })
    void ?{}(Cell(T)&);
    forall(T* | { T ?=?(T&, T); void ?{}(T&); void ?{}(T&, T); void ^?{}(T&); })
    void ?{}(Cell(T)&, Cell(T));
    forall(T* | { T ?=?(T&, T); void ?{}(T&); void ?{}(T&, T); void ^?{}(T&); })
    void ^?{}(Cell(T)&);
    forall(T* | { T ?=?(T&, T); void ?{}(T&); void ?{}(T&, T); void ^?{}(T&); })
    void ?=?(Cell(T)&, Cell(T));
    forall(T* | { T ?=?(T&, T); void ?{}(T&); void ?{}(T&, T); void ^?{}(T&); })
    void ?{}(Cell(T)&, T);

If these assertions were reduced to the minimal required assertions, the result would instead look something like:
    forall(T* | { void ?{}(T&); })
    void ?{}(Cell(T)&);
    forall(T* | { void ?{}(T&, T); })
    void ?{}(Cell(T)&, Cell(T));
    forall(T* | { void ^?{}(T&); })
    void ^?{}(Cell(T)&);
    forall(T* | { T ?=?(T&, T); })
    void ?=?(Cell(T)&, Cell(T));
    forall(T* | { void ?{}(T&, T); })
    void ?{}(Cell(T)&, T);

The unreduced assertion lists lead to exponential thunk generation for `Cell(Cell(int))` (or a matrix represented as `vector(vector(vector(int)))`).

### Autogened Functions cannot use Available Functions
If you supply an implementation for one of the autogenerated functions, it will not be used while generating other functions.

Consider a type with a custom copy constructor but no user-defined assignment operator. The current (problematic) behaviour reimplements the assignment operator member-wise. The ideal solution would be to create a new implementation of the operator that applies the appropriate destructor, then the custom copy constructor. Although this implementation may be slower, it will have correct behaviour if the other operators are implemented properly.

An alternate behaviour would be simply to remove the assignment operator entirely unless the user explicitly provides one. This is more similar to C++'s "The Rule of Three" (or "The Rule of Five" with move operations), where all three of the lifetime functions must be redefined if any of them are. The advantage of the new assignment operator (mentioned in the "ideal solution") is that it avoids a similar rule of three, needing only destruction and copy construction for proper lifetime behaviour.
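
The "ideal solution" can be modelled in plain C, the language Cforall lowers to. This is only a sketch under assumptions: `Buf`, `buf_ctor`, `buf_copy_ctor`, `buf_dtor` and `buf_assign` are hypothetical names standing in for a user type, its custom lifetime functions and the assignment that could be generated from them. It is not the compiler's actual output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user type that owns heap storage. */
typedef struct { char *data; size_t len; } Buf;

/* Stand-in for a custom constructor from a C string. */
static void buf_ctor(Buf *b, const char *s) {
    assert(b != NULL && s != NULL);
    b->len = strlen(s);
    b->data = malloc(b->len + 1);
    if (b->data == NULL) abort();
    memcpy(b->data, s, b->len + 1);
}

/* Stand-in for the custom copy constructor: deep copy into raw storage. */
static void buf_copy_ctor(Buf *dst, const Buf *src) {
    buf_ctor(dst, src->data);
}

/* Stand-in for the custom destructor. */
static void buf_dtor(Buf *b) {
    free(b->data);
    b->data = NULL;
    b->len = 0;
}

/* What a generated assignment could look like: destroy, then copy-construct.
   The address check is an addition of this sketch, not part of the proposal. */
static Buf *buf_assign(Buf *dst, const Buf *src) {
    if (dst != src) {  /* self-assignment would destroy the source first */
        buf_dtor(dst);
        buf_copy_ctor(dst, src);
    }
    return dst;
}
```

Only `buf_copy_ctor` and `buf_dtor` carry type-specific logic; `buf_assign` is written purely in terms of them, which is why no separate rule of three is needed. The self-assignment check depends on the source being passed by address, as in the "New Forms" signatures.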

## Problems With Removed Functions

### Failed Autogeneration Leaves Behind Declaration
All autogenerated functions are checked by attempting to resolve them. If there is an error, then the autogenerated function is removed. But that only removes the definition, so the declaration can still be considered as a candidate for resolution. The following code will compile but fail during linking.

    forall(T *) struct Cell { T x; };
    Cell(char) s;

This should be an error at resolution time, reporting that no such constructor is defined, instead of making it all the way to the linker.

### Overriding a Function can Lead to Problems
Implementing your own version of a function should always override the autogenerated function. This does not always happen, especially if the declared function does not use the exact same assertions as the autogenerated function (provided via the type declaration itself).

(This issue is filed as Trac Ticket 186.)

### Cannot Manually Remove Functions
You cannot request that a function not be generated. The above cases could be worked around if you could. In addition, there are cases where an autogenerated routine could be created, but you do not want that operator to be callable at all.

You can delete (using `= void`) functions to mask the autogenerated functions. However, the autogenerated functions still exist and you have to get the signatures exactly the same, so this is not considered practical.

The main reason to manually remove functions is to enforce a behaviour-based interface for a type, as opposed to a data-based one. To enforce that new interface, this would have to interact with visibility.

### Automatic Remove Hides Everything
As soon as you declare a custom constructor, _all_ autogenerated constructors become inaccessible. Often, this behaviour is good, and it agrees with C++. But the dual of the desire to "manually remove" also exists: "manually keep." Furthermore, there is a use case for invoking an automatically provided constructor as a helper when implementing a custom constructor.

    struct S {
        int i;
        int & j;
    };
    void ?{}( S & s, int & w ) {
        // Unique best alternative includes deleted identifier in:
        s{ 3, w };
        // intended meaning / workaround:
        // s.i = 3; &s.j = &w;
    }

This use case should also be considered, along with visibility. A private helper constructor should be usable in the implementation of a public value-add constructor. The private helper being an autogenerated function is one such arrangement.

Users should have the option to adopt the idiom: All constructors funnel into the most private, all-member constructor.
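
The funnel idiom can be sketched in plain C as a rough model of what the Cforall example is trying to express. The names `s_init_all` and `s_init` are hypothetical, the reference member is modelled as a pointer, and file-level `static` stands in for "most private"; none of this is existing Cforall behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors struct S from the example; the reference member is modelled as a pointer. */
struct S {
    int i;
    int *j;
};

/* Stand-in for the autogenerated all-member constructor.
   `static` models "most private": only this file can call it. */
static void s_init_all(struct S *s, int i, int *j) {
    assert(s != NULL && j != NULL);
    s->i = i;
    s->j = j;
}

/* Public value-add constructor: fixes i to 3 and funnels into the helper. */
void s_init(struct S *s, int *w) {
    s_init_all(s, 3, w);
}
```

Every member is set in exactly one place, so adding a field only requires updating the all-member helper and the call sites that funnel into it.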

### Removed Functions Linger
Even if a function is made CFA-uncallable, it still shows in the emitted C code (`cfa -CFA`). An uncallable function shows identically to a callable function, with no indication of, "This declaration has been deleted," or, "Here is a redaction of X." This quirk reduces the utility of inspecting the `-CFA` output to answer, "What constructors would be available to me?" or, "What's the net effect of the constructor declarations and deletions that I've given?"

## Other Problems & Requested Features

### Designators
Designators (named parameters or keyword arguments) are a nice feature, and being able to use them with constructors/initializers is really nice. This could be either a general solution for keyword arguments or something special for initializers.

    vector v = {capacity: 128};

The designator syntax (included in the example) being different from C is also a problem for compatibility, but does not change their use in pure Cforall.

### Non-Intuitive Reference Initializer
Initializing reference-type struct members can easily trip up beginners. The following piece of code compiles without error or warning, but will lead to a segmentation fault.

    struct S { int & x; };
    void ?{}( S & this, int & x ) {
        (this.x){ x };
        // The correct way to implement this operation.
        // (&this.x){ &x };
    }
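
One way to read the difference between the two forms is to model the reference member as a pointer in plain C. This is a sketch under that assumption: it takes `(this.x){ x }` to construct the referent through a member that has not been bound yet, and `(&this.x){ &x }` to bind the member itself. The function names `s_ctor_wrong` and `s_ctor_right` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Models `struct S { int & x; }` with the reference as a pointer. */
struct S { int *x; };

/* Models `(this.x){ x }`: writes through this->x, which still holds an
   indeterminate address at this point, hence the crash. Never call this
   on an S whose member has not been bound. */
void s_ctor_wrong(struct S *this, int *x) {
    *this->x = *x;
}

/* Models `(&this.x){ &x }`: binds the member to the referent. */
void s_ctor_right(struct S *this, int *x) {
    assert(this != NULL && x != NULL);
    this->x = x;
}
```

Under this reading the faulty form is legal code that stores through a garbage address, which would explain why it compiles cleanly and only fails at run time.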

### No Const Field Initialization
One cannot initialize a constant field without using a cast, which makes writing constructors for types with such fields more difficult. The following example does not compile.

    struct Const { const int x; };
    void ?{}( Const & this, int x ) {
        (this.x){ x };
        // A correct way to implement this operation.
        // ?{}(*(int*)&this.x, x); // remove const
    }

(The `(*(int*)&this.x){ x };` form appears not to work for unrelated reasons.)

### New Type Parameter Shorthand
A request to include another parameter shorthand for a group of assertions between sized and object-type. Notably, often we don't need to create fresh instances; we just want to manipulate existing instances and destroy them when we are done.

Mike reports that for `forall(T * | [life-time-assertions])` cases he sees an approximate breakdown of:

+ 20%: dtor
+ 40%: dtor + copy ctor
+ 10%: dtor + copy ctor + no-arg ctor
+ 20%: dtor + copy ctor + custom ctor (no need for a no-arg ctor)
+ 10%: anything else

(This was not counting copy assignment, although it could be considered an optimization of destroy and then copy (re)construct.)

### Incorrect Field Detection
When you write your own constructor (or destructor), any fields you do not explicitly construct (or destruct) are automatically constructed (or destructed). But the detection is inaccurate.

Exact issues are not known. But at the very least the rules are not clearly documented, because no one seems to know what they are.

### No-op Constructor
This may be solved in some cases, but there is no clear interface to specify that a constructor should not be run. It would be nice, as in C, to be able to leave a stack-allocated variable uninitialized. This is mostly a performance issue, but it can also allow you to declare a variable before the information needed to construct it has been read.

However, if a constructor is run, then all of its components should be initialized by default.

### Earlier Inline of Autogenerated Function
A warning that comes up around autogenerated functions mentions static functions called from inline functions. Although this may not lead to problems, it does highlight some issues with the C initializer to Cforall constructor conversion.

# Possible Solution and Suggestions
Proposals for features to address some of the above issues.

## Fine-Grained C Constructor Escapes
Currently, the C escape for constructors only works at the top constructor. This suggestion moves the escape from the initialization context to the constructor call/initializer. (As an aside, ideally there would be no need for a C escape because Cforall would never overstep, but until then, we should try to have good escapes.)

There are currently two ways to escape a constructor, so that Cforall always resolves it as a C initializer and not a Cforall constructor call. These are syntactically tied to the initialization context, not the initializer, and semantically apply to the top initializer.

The syntax change could just move the `@` from the declaration to the initializer. Escaped initializers are written `@{ ... }`. This doesn't change the syntax for compound literals (`(TYPE)@{ ... }`), but it does change variable declarations (`DECL @= { ... };` becomes `DECL = @{ ... };`). Each escape means that exactly that initializer must not be a constructor call.

## Initializer/Constructor
A different way to stop Cforall constructors from conflicting with C initializers is to give them a different syntax. This could be a small change to the initializer syntax, the minimum needed to separate the two, or a more drastic change that might enable new features (ex. `ctor_name{ ... }`, allowing for named constructors).

This fixes the backwards compatibility issue and removes the need for escapes, but does result in a larger syntax change for new calls. Separating initializers from constructors might also help with autogeneration and unexpected conflicts between autogen and manually defined functions.

## Autogeneration Attributes
Add attributes that control what routines are autogenerated. At its simplest, `[[cfa_no_autogen]]` could be added to a SUE declaration to prevent any autogenerated routines. That could be all, but it requires manually defining every function that would otherwise be autogenerated, so there could also be more selective attributes (or a single attribute with options) that disable only the autogeneration of particular routines.

## The is_pod Assertion (Optimization)
An assertion that a type is a "plain old data" type. A plain old data type is any type that is entirely defined by its bit pattern without any context used in its definition.

This means:
+ Destroying an instance of the type is a no-op, no clean-up required.
+ Copying the type replicates the bit pattern in the new memory location.
+ Moving the type is equivalent to copying the type.

This means that the type carries size and alignment, and from that you can implement the copy constructor, copy assignment and destructor using just memory operations. That means this is equivalent to `is_value` in terms of operations, but the implementations of those functions can be different, and less data has to be passed around.

One requirement that is not used is that all zeros is a valid bit pattern for that type (or that any given bit pattern is valid). It could be added, and then you could also construct instances of the type by zero-filling the storage. That is an `is_object` interface, and considering the trend from object to value, right now it seems it should at most be a secondary trait/assertion (ex. `is_pod0`).

However, in both of these cases there is actually no new functionality added. These are existing operations. The advantage is that it allows for more optimizations to be used. The function pointers do not need to be passed into polymorphic functions, and some operations can be bundled together. Whether these optimizations save a significant amount of time or memory has to be investigated.
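
The three properties listed above can be sketched in plain C. This is a minimal model, assuming a POD type is described at run time by its size and alignment alone; `pod_info` and the `pod_*` functions are hypothetical names, not part of Cforall or its runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical runtime description of a POD type: size and alignment only. */
typedef struct { size_t size; size_t align; } pod_info;

/* Copy construction: replicate the bit pattern into raw storage. */
static void pod_copy_ctor(const pod_info *t, void *dst, const void *src) {
    assert(t != NULL && dst != NULL && src != NULL);
    memcpy(dst, src, t->size);
}

/* Copy assignment: the same bit copy; memmove tolerates self-assignment. */
static void *pod_assign(const pod_info *t, void *dst, const void *src) {
    return memmove(dst, src, t->size);
}

/* Destruction: a no-op, nothing to clean up. */
static void pod_dtor(const pod_info *t, void *obj) {
    (void)t;
    (void)obj;
}

/* A polymorphic routine needs no lifetime function pointers, only pod_info. */
static void pod_swap(const pod_info *t, void *a, void *b) {
    unsigned char *pa = a, *pb = b;
    for (size_t k = 0; k < t->size; k++) {
        unsigned char tmp = pa[k];
        pa[k] = pb[k];
        pb[k] = tmp;
    }
}
```

Compared with passing a copy constructor, an assignment operator and a destructor as separate function pointers, `pod_swap` receives one small descriptor and can fuse what would otherwise be several lifetime calls into a single pass over the bytes.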