Context Navigation

source: doc/theses/rob_schluntz_MMath/conclusions.tex @ a41e87b

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum

Last change on this file since a41e87b was 67982887, checked in by Peter A. Buhr <pabuhr@…>, 6 years ago
specialize thesis directory-names
Property mode set to `100644`
File size: 16.0 KB

Line
1	%======================================================================
2	\chapter{Conclusions}
3	%======================================================================
4
5	Adding resource management and tuples to \CFA has been a challenging design, engineering, and implementation exercise.
6	On the surface, the work may appear as a rehash of similar mechanisms in \CC.
7	However, every added feature is different than its \CC counterpart, often with extended functionality, better integration with C and its programmers, and always supports separate compilation.
8	All of these new features are being used extensively by the \CFA development-team to build the \CFA runtime system.
9	In particular, the concurrency system is built on top of RAII, library functions @new@ and @delete@ are used to manage dynamically allocated objects, and tuples are used to provide uniform interfaces to C library routines such as @div@ and @remquo@.
10
11	\section{Constructors and Destructors}
12	\CFA supports the RAII idiom using constructors and destructors.
13	There are many engineering challenges in introducing constructors and destructors, partially since \CFA is not an object-oriented language.
14	By making use of managed types, \CFA programmers are afforded an extra layer of safety and ease of use in comparison to C programmers.
15	While constructors and destructors provide a sensible default behaviour, \CFA allows experienced programmers to declare unmanaged objects to take control of object management for performance reasons.
16	Constructors and destructors as named functions fit the \CFA polymorphism model perfectly, allowing polymorphic code to use managed types seamlessly.
17
18	\section{Tuples}
19	\CFA can express functions with multiple return values in a way that is simple, concise, and safe.
20	The addition of multiple-return-value functions naturally requires a way to use multiple return values, which begets tuple types.
21	Tuples provide two useful notions of assignment: multiple assignment, allowing simple, yet expressive assignment between multiple variables, and mass assignment, allowing a lossless assignment of a single value across multiple variables.
22	Tuples have a flexible structure that allows the \CFA type-system to decide how to restructure tuples, making it syntactically simple to pass tuples between functions.
23	Tuple types can be combined with polymorphism and tuple conversions can apply during assertion inference to produce a cohesive feel.
24
25	\section{Variadic Functions}
26	Type-safe variadic functions, with a similar feel to variadic templates, are added to \CFA.
27	The new variadic functions can express complicated recursive algorithms.
28	Unlike variadic templates, it is possible to write @new@ as a library routine and to separately compile @ttype@ polymorphic functions.
29	Variadic functions are statically type checked and provide a user experience that is consistent with that of tuples and polymorphic functions.
30
31	\section{Future Work}
32	\subsection{Constructors and Destructors}
33	Both \CC and Rust support move semantics, which expand the user's control of memory management by providing the ability to transfer ownership of large data, rather than forcing potentially expensive copy semantics.
34	\CFA currently does not support move semantics, partially due to the complexity of the model.
35	The design space is currently being explored with the goal of finding an alternative to move semantics that provides necessary performance benefits, while reducing the amount of repetition required to create a new type, along with the cognitive burden placed on the user.
36
37	% One technique being evaluated is whether named return-values can be used to eliminate unnecessary temporaries \cite{Buhr94a}.
38	% For example,
39	% \begin{cfacode}
40	% struct A { ... };
41	% [A x] f(A x);
42	% [A y] g(A y);
43	% [A z] h(A z);
44
45	% struct A a1, a2;
46	% a2 = h(g(f(a1)));
47	% \end{cfacode}
48	% Here, since both @f@'s argument and return value have the same name and type, the compiler can infer that @f@ returns its argument.
49	% With this knowledge, the compiler can reuse the storage for the argument to @f@ as the argument to @g@. % TODO: cite Till thesis?
50
51	Exception handling is among the features expected to be added to \CFA in the near future.
52	For exception handling to properly interact with the rest of the language, it must ensure all RAII guarantees continue to be met.
53	That is, when an exception is raised, it must properly unwind the stack by calling the destructors for any objects that live between the raise and the handler.
54	This can be accomplished either by augmenting the translator to properly emit code that executes the destructors, or by switching destructors to hook into the GCC @cleanup@ attribute \cite[6.32.1]{GCCExtensions}.
55
56	The @cleanup@ attribute, which is attached to a variable declaration, takes a function name as an argument and schedules that routine to be executed when the variable goes out of scope.
57	\begin{cfacode}
58	struct S { int x; };
59	void __dtor_S(struct S *);
60	{
61	__attribute__((cleanup(__dtor_S))) struct S s;
62	} // calls __dtor_S(&s)
63	\end{cfacode}
64	This mechanism is known and understood by GCC, so that the destructor is properly called in any situation where a variable goes out of scope, including function returns, branches, and built-in GCC exception handling mechanisms using libunwind.
65
66	A caveat of this approach is that the @cleanup@ attribute only permits a function that consumes a single argument of type @T *@ for a variable of type @T@.
67	This restriction means that any destructor that consumes multiple arguments (\eg, because it is polymorphic) or any destructor that is a function pointer (\eg, because it is an assertion parameter) must be called through a local thunk.
68	For example,
69	\begin{cfacode}
70	forall(otype T)
71	struct Box {
72	T x;
73	};
74	forall(otype T) void ^?{}(Box(T) * x); // has implicit parameters
75
76	forall(otype T)
77	void f(T x) {
78	T y = x; // destructor is a function-pointer parameter
79	Box(T) z = { x }; // destructor has multiple parameters
80	}
81	\end{cfacode}
82	currently generates the following
83	\begin{cfacode}
84	void _dtor_BoxT( // consumes more than 1 parameter due to assertions
85	void (_adapter_PTT)(void ()(), void , void ),
86	void (_adapter_T_PTT)(void ()(), void , void , void *),
87	long unsigned int _sizeof_T,
88	long unsigned int _alignof_T,
89	void (_assign_T_PTT)(void , void ),
90	void (_ctor_PT)(void ),
91	void (_ctor_PTT)(void , void *),
92	void (_dtor_PT)(void ),
93	void *x
94	);
95
96	void f(
97	void (_adapter_PTT)(void ()(), void , void ),
98	void (_adapter_T_PTT)(void ()(), void , void , void *),
99	long unsigned int _sizeof_T,
100	long unsigned int _alignof_T,
101	void (_assign_TT)(void , void ),
102	void (_ctor_T)(void ),
103	void (_ctor_TT)(void , void *),
104	void (_dtor_T)(void ),
105	void *x
106	){
107	void *y = __builtin_alloca(_sizeof_T);
108	// constructor call elided
109
110	// generic layout computation elided
111	long unsigned int _sizeof_BoxT = ...;
112	void *z = __builtin_alloca(_sizeof_BoxT);
113	// constructor call elided
114
115	_dtor_BoxT( // ^?{}(&z); -- _dtor_BoxT has > 1 arguments
116	_adapter_PTT,
117	_adapter_T_PTT,
118	_sizeof_T,
119	_alignof_T,
120	_assign_TT,
121	_ctor_T,
122	_ctor_TT,
123	_dtor_T,
124	z
125	);
126	_dtor_T(y); // ^?{}(&y); -- _dtor_T is a function pointer
127	}
128	\end{cfacode}
129	Further to this point, every distinct array type will require a thunk for its destructor, where array destructor code is currently inlined, since array destructors hard code the length of the array.
130
131	For function call temporaries, new scopes have to be added for destructor ordering to remain consistent.
132	In particular, the translator currently destroys argument and return value temporary objects as soon as the statement they were created for ends.
133	In order for this behaviour to be maintained, new scopes have to be added around every statement that contains a function call.
134	Since a nested expression can raise an exception, care must be taken when destroying temporary objects.
135	One way to achieve this is to split statements at every function call, to provide the correct scoping to destroy objects as necessary.
136	For example,
137	\begin{cfacode}
138	struct S { ... };
139	void ?{}(S *, S);
140	void ^?{}(S *);
141
142	S f();
143	S g(S);
144
145	g(f());
146	\end{cfacode}
147	would generate
148	\begin{cfacode}
149	struct S { ... };
150	void _ctor_S(struct S *, struct S);
151	void _dtor_S(struct S *);
152
153	{
154	__attribute__((cleanup(_dtor_S))) struct S _tmp1 = f();
155	__attribute__((cleanup(_dtor_S))) struct S _tmp2 =
156	(_ctor_S(&_tmp2, _tmp1), _tmp2);
157	__attribute__((cleanup(_dtor_S))) struct S _tmp3 = g(_tmp2);
158	} // destroy _tmp3, _tmp2, _tmp1
159	\end{cfacode}
160	Note that destructors must be registered after the temporary is fully initialized, since it is possible for initialization expressions to raise exceptions, and a destructor should never be called on an uninitialized object.
161	This requires a slightly strange looking initializer for constructor calls, where a comma expression is used to produce the value of the object being initialized, after the constructor call, conceptually bitwise copying the initialized data into itself.
162	Since this copy is wholly unnecessary, it is easily optimized away.
163
164	A second approach is to attach an accompanying boolean to every temporary that records whether the object contains valid data, and thus whether the value should be destructed.
165	\begin{cfacode}
166	struct S { ... };
167	void _ctor_S(struct S *, struct S);
168	void _dtor_S(struct S *);
169
170	struct _tmp_bundle_S {
171	bool valid;
172	struct S value;
173	};
174
175	void _dtor_tmpS(struct _tmp_bundle_S * ret) {
176	if (ret->valid) {
177	_dtor_S(&ret->value);
178	}
179	}
180
181	{
182	__attribute__((cleanup(_dtor_tmpS))) struct _tmp_bundle_S _tmp1 = { 0 };
183	__attribute__((cleanup(_dtor_tmpS))) struct _tmp_bundle_S _tmp2 = { 0 };
184	__attribute__((cleanup(_dtor_tmpS))) struct _tmp_bundle_S _tmp3 = { 0 };
185	_tmp2.value = g(
186	(_ctor_S(
187	&_tmp2.value,
188	(_tmp1.value = f(), _tmp1.valid = 1, _tmp1.value)
189	), _tmp2.valid = 1, _tmp2.value)
190	), _tmp3.valid = 1, _tmp3.value;
191	} // destroy _tmp3, _tmp2, _tmp1
192	\end{cfacode}
193	In particular, the boolean is set immediately after argument construction and immediately after return value copy.
194	The boolean is checked as a part of the @cleanup@ routine, forwarding to the object's destructor if the object is valid.
195	One such type and @cleanup@ routine needs to be generated for every type used in a function parameter or return value.
196
197	The former approach generates much simpler code, however splitting expressions requires care to ensure that expression evaluation order does not change.
198	Expression ordering has to be performed by a full compiler, so it is possible that the latter approach would be more suited to the \CFA prototype, whereas the former approach is clearly the better option in a full compiler.
199	More investigation is needed to determine whether the translator's current design can easily handle proper expression ordering.
200
201	As discussed in Section \ref{s:implicit_copy_construction}, return values are destructed with a different @this@ pointer than they are constructed with.
202	This problem can be easily fixed once a full \CFA compiler is built, since it would have full control over the call/return mechanism.
203	In particular, since the callee is aware of where it needs to place the return value, it can construct the return value directly, rather than bitwise copy the internal data.
204
205	Currently, the special functions are always auto-generated, except for generic types where the type parameter does not have assertions for the corresponding operation.
206	For example,
207	\begin{cfacode}
208	forall(dtype T \| sized(T) \| { void ?{}(T *); })
209	struct S { T x; };
210	\end{cfacode}
211	only auto-generates the default constructor for @S@, since the member @x@ is missing the other 3 special functions.
212	Once deleted functions have been added, function generation can make use of this information to disable generation of special functions when a member has a deleted function.
213	For example,
214	\begin{cfacode}
215	struct A {};
216	void ?{}(A *) = delete;
217	struct S { A x; }; // does not generate void ?{}(S *);
218	\end{cfacode}
219
220	Unmanaged objects and their interactions with the managed \CFA environment are an open problem that deserves greater attention.
221	In particular, the interactions between unmanaged objects and copy semantics are subtle and can easily lead to errors.
222	It is possible that the compiler should mark some of these situations as errors by default, and possibly conditionally emit warnings for some situations.
223	Another possibility is to construct, destruct, and assign unmanaged objects using the intrinsic and auto-generated functions.
224	A more thorough examination of the design space for this problem is required.
225
226	Currently, the \CFA translator does not support any warnings.
227	Ideally, the translator should support optional warnings in the case where it can detect that an object has been constructed twice.
228	For example, forwarding constructor calls are guaranteed to initialize the entire object, so redundant constructor calls can cause problems such as memory leaks, while looking innocuous to a novice user.
229	\begin{cfacode}
230	struct B { ... };
231	struct A {
232	B x, y, z;
233	};
234	void ?{}(A * a, B x) {
235	// y, z implicitly default constructed
236	(&a->x){ ... }; // explicitly construct x
237	} // constructs an entire A
238	void ?{}(A * a) {
239	(&a->y){}; // initialize y
240	a{ (B){ ... } }; // forwarding constructor call
241	// initializes entire object, including y
242	}
243	\end{cfacode}
244
245	Finally, while constructors provide a mechanism for establishing invariants, there is currently no mechanism for maintaining invariants without resorting to opaque types.
246	That is, structure fields can be accessed and modified by any block of code without restriction, so while it is possible to ensure that an object is initially set to a valid state, it is not possible to ensure that it remains in a consistent state throughout its lifetime.
247	A popular technique for ensuring consistency in object-oriented programming languages is to provide access modifiers such as @private@, which provides compile-time checks that only privileged code accesses private data.
248	This approach could be added to \CFA, but it requires an idiomatic way of specifying what code is privileged and what data is protected.
249	One possibility is to tie access control into an eventual module system.
250
251	\begin{sloppypar}
252	The current implementation of implicit subobject-construction is currently an all-or-nothing check.
253	That is, if a subobject is conditionally constructed, \eg within an if-statement, no implicit constructors for that object are added.
254	\begin{cfacode}
255	struct A { ... };
256	void ?{}(A * a) { ... }
257
258	struct B {
259	A a;
260	};
261	void ?{}(B * b) {
262	if (...) {
263	(&b->a){}; // explicitly constructed
264	} // does not construct in else case
265	}
266	\end{cfacode}
267	This behaviour is unsafe and breaks the guarantee that constructors fully initialize objects.
268	This situation should be properly handled, either by examining all paths and inserting implicit constructor calls only in the paths missing construction, or by emitting an error or warning.
269	\end{sloppypar}
270
271	\subsection{Tuples}
272	Named result values are planned, but not yet implemented.
273	This feature ties nicely into named tuples, as seen in D and Swift.
274
275	Currently, tuple flattening and structuring conversions are 0-cost conversions in the resolution algorithm.
276	This makes tuples conceptually very simple to work with, but easily causes unnecessary ambiguity in situations where the type system should be able to differentiate between alternatives.
277	Adding an appropriate cost function to tuple conversions will allow tuples to interact with the rest of the programming language more cohesively.
278
279	\subsection{Variadic Functions}
280	Use of @ttype@ functions currently relies heavily on recursion.
281	\CC has opened variadic templates up so that recursion is not strictly necessary in some cases, and it would be interesting to see if any such cases can be applied to \CFA.
282
283	\CC supports variadic templated data-types, making it possible to express arbitrary length tuples, arbitrary parameter function objects, and more with generic types.
284	Currently, \CFA does not support @ttype@-parameter generic-types, though there does not appear to be a technical reason that it cannot.
285	Notably, opening up support for this makes it possible to implement the exit form of scope guard (see section \ref{s:ResMgmt}), making it possible to call arbitrary functions at scope exit in idiomatic \CFA.

Note: See TracBrowser for help on using the repository browser.

Download in other formats: