% Source: doc/theses/andrew_beach_MMath/existing.tex @ be497c6

% Last change on this file: be497c6, checked in by Andrew Beach <ajbeach@…>.
% Commit message: Andrew MMath: Used Peter's feedback for the existing chapter.
\chapter{\CFA{} Existing Features}
\label{c:existing}

\CFA is an open-source project extending ISO C with
modern safety and productivity features, while still ensuring backwards
compatibility with C and its programmers. \CFA is designed to have an
orthogonal feature-set based closely on the C programming paradigm
(non-object-oriented), and these features can be added incrementally to an
existing C code-base, allowing programmers to learn \CFA on an as-needed basis.

Only those \CFA features pertaining to this thesis are discussed.
A familiarity with
C or C-like languages is assumed.

\section{Overloading and \lstinline{extern}}
\CFA has extensive overloading, allowing multiple definitions of the same
name~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;
int f(); double f();
void g( int ); void g( double );
\end{cfa}
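As a hedged illustration, the following sketch shows how the expression
context can select among such overloads. The function name @example@ is
hypothetical, and the comments assume the resolver prefers the candidate that
needs no conversion.
\begin{cfa}
int f(); double f();
void g( int ); void g( double );
void example() {
    int n = f();     // int target: expected to select int f()
    double d = f();  // double target: expected to select double f()
    g( n );          // int argument: expected to select g( int )
    g( d );          // double argument: expected to select g( double )
}
\end{cfa}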
This feature requires name mangling so the assembly symbols are unique for
different overloads. For compatibility with names in C, there is also a syntax
to disable name mangling. These unmangled names cannot be overloaded but act as
the interface between C and \CFA code. The syntax for disabling/enabling
mangling is:
\begin{cfa}
// name mangling on by default
int i; // _X1ii_1
extern "C" {  // disables name mangling
    int j; // j
    extern "Cforall" {  // enables name mangling
        int k; // _X1ki_1
    }
    // revert to no name mangling
}
// revert to name mangling
\end{cfa}
Both forms of @extern@ affect all the declarations within their nested lexical
scope and transition back to the previous mangling state when the lexical scope
ends.

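
A minimal sketch of using an unmangled name as the bridge between the two
languages follows. The function @legacy_add@ is a hypothetical C function,
assumed to be defined in a separately compiled C file, and the two @add@
overloads are likewise only illustrative.
\begin{cfa}
extern "C" {
    // Hypothetical C function: links to the plain symbol legacy_add.
    int legacy_add( int, int );
}
// Mangled CFA names, so both overloads can coexist.
int add( int a, int b ) { return legacy_add( a, b ); }
double add( double a, double b ) { return a + b; }
\end{cfa}
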
\section{Reference Type}
\CFA adds a reference type to C as an auto-dereferencing pointer.
References work very similarly to pointers.
Reference-types are written the same way as a pointer-type but each
asterisk (@*@) is replaced with an ampersand (@&@);
this includes cv-qualifiers and multiple levels of reference.

Generally, references act like pointers with an implicit dereferencing
operation added to each use of the variable.
These automatic dereferences may be disabled with the address-of operator
(@&@).

% Check to see if these are generating errors.
\begin{minipage}{0.5\textwidth}
With references:
\begin{cfa}
int i, j;
int & ri = i;
int && rri = ri;
rri = 3;
&ri = &j;
ri = 5;
\end{cfa}
\end{minipage}
\begin{minipage}{0.5\textwidth}
With pointers:
\begin{cfa}
int i, j;
int * pi = &i;
int ** ppi = &pi;
**ppi = 3;
pi = &j;
*pi = 5;
\end{cfa}
\end{minipage}

References are intended to be used when the indirection of a pointer is
required, but the address is not as important as the value and dereferencing
is the common usage.
A mutable reference may be rebound by applying @&@ to it, which cancels the
implicit dereference, and then assigning a new address,
as in @&ri = &j;@ above.
% ???

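
One common use, sketched here as an assumption rather than taken from the
text above, is a reference parameter: the callee updates the caller's
variables without explicit dereferences. The name @swap_ints@ is hypothetical.
\begin{cfa}
void swap_ints( int & a, int & b ) {
    int tmp = a;  // implicit dereference reads the referenced value
    a = b;        // writes through the reference
    b = tmp;
}
{
    int x = 1, y = 2;
    swap_ints( x, y );  // arguments bind by reference, no explicit & at the call
    // x is now 2 and y is now 1
}
\end{cfa}
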
\section{Operators}

\CFA implements operator overloading by providing special names, where
operator expressions are translated into function calls using these names.
An operator name is created by taking the operator symbols and joining them with
@?@s to show where the arguments go.
For example,
infix multiplication is @?*?@, while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations
(such as @++?@) and postfix operations (@?++@).

As an example, here are the addition and equality operators for a point type.
\begin{cfa}
point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
{
    assert(point{1, 2} + point{3, 4} == point{4, 6});
}
\end{cfa}
Note that this syntax works effectively as a textual transformation:
the compiler converts all operators into functions and then resolves them
normally. This means any combination of types may be used,
although nonsensical ones (like @double ?==?(point, int);@) are discouraged.
This feature is also used for all builtin operators,
although those are implicitly provided by the language.

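
To illustrate the naming scheme on operators not shown above, here is a small
sketch. The @point@ definition with two @int@ fields and both operator bodies
are assumptions made for this example.
\begin{cfa}
struct point { int x, y; };
// Prefix negation: the ? follows the operator symbol.
point -?(point a) { point r = { -a.x, -a.y }; return r; }
// Infix multiplication with mixed operand types.
point ?*?(point a, int k) { point r = { a.x * k, a.y * k }; return r; }
{
    point p = { 1, 2 };
    point q = -p * 3;  // resolved as ?*?( -?( p ), 3 )
}
\end{cfa}
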
%\subsection{Constructors and Destructors}
In \CFA, constructors and destructors are operators, which means they are
functions with special operator names rather than type names as in \Cpp.
Both constructors and destructors can be implicitly called by the compiler;
however, the operator names allow explicit calls.
% Placement new means that this is actually equivalent to C++.

The special name for a constructor is @?{}@, which comes from the
initialization syntax in C, \eg @Example e = { ... }@.
\CFA generates a constructor call each time a variable is declared,
passing the initialization arguments to the constructor.
\begin{cfa}
struct Example { ... };
void ?{}(Example & this) { ... }
{
    Example a;
    Example b = {};
}
void ?{}(Example & this, char first, int num) { ... }
{
    Example c = {'a', 2};
}
\end{cfa}
Both @a@ and @b@ are initialized with the first constructor,
@b@ explicitly and @a@ implicitly.
@c@ is initialized with the second constructor.
Currently, there is no general way to skip initialization.
% I don't use @= anywhere in the thesis.

% I don't like the \^{} symbol but $^\wedge$ isn't better.
Similarly, destructors use the special name @^?{}@ (the @^@ has no special
meaning).
\begin{cfa}
void ^?{}(Example & this) { ... }
{
    Example d;
    ^?{}(d);

    Example e;
} // Implicit call of ^?{}(e);
\end{cfa}

Whenever a type is defined, \CFA creates a default zero-argument
constructor, a copy constructor, a series of argument-per-field constructors
and a destructor. All user constructors are defined after these.

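
As a sketch of what these generated functions permit, assume a hypothetical
two-field type @Pair@ with no user-defined constructors:
\begin{cfa}
struct Pair { int first; int second; };
{
    Pair a;             // default zero-argument constructor
    Pair b = { 1 };     // argument-per-field constructor, first field only
    Pair c = { 1, 2 };  // argument-per-field constructor, both fields
    Pair d = c;         // copy constructor
}  // the generated destructor is called implicitly for each variable
\end{cfa}
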
\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are
defined over multiple types. \CFA polymorphic declarations serve the same role
as \Cpp templates or Java generics. The term ``parametric'' means the polymorphism is
accomplished by passing argument operations to associated \emph{parameters} at
the call site, and these parameters are used in the function to differentiate
among the types the function operates on.

Polymorphic declarations start with a universal @forall@ clause that goes
before the standard (monomorphic) declaration. These declarations have the same
syntax except they may use the universal type names introduced by the @forall@
clause. For example, the following is a polymorphic identity function that
works on any type @T@:
\begin{cfa}
forall( T ) T identity( T val ) { return val; }
int forty_two = identity( 42 );
char capital_a = identity( 'A' );
\end{cfa}
Each use of a polymorphic declaration resolves its polymorphic parameters
(in this case, just @T@) to concrete types (@int@ in the first use and @char@
in the second).

To allow a polymorphic function to be separately compiled, the type @T@ must be
constrained by the operations used on @T@ in the function body. The @forall@
clause is augmented with a list of polymorphic variables (local type names)
and assertions (constraints), which represent the required operations on those
types used in a function, \eg:
\begin{cfa}
forall( T | { void do_once(T); } )
void do_twice(T value) {
    do_once(value);
    do_once(value);
}
\end{cfa}

A polymorphic function can be used in the same way as a normal function. The
polymorphic variables are filled in with concrete types and the assertions are
checked. An assertion is checked by verifying each assertion operation (with
all the variables replaced with the concrete types from the arguments) is
defined at the call site.
\begin{cfa}
void do_once(int i) { ... }
int i;
do_twice(i);
\end{cfa}
Any object with a type fulfilling the assertion may be passed as an argument to
a @do_twice@ call.

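
Assertions can name operators as well as ordinary functions. The following
sketch, with the hypothetical name @twice@, assumes only that an addition
operator exists for @T@:
\begin{cfa}
forall( T | { T ?+?(T, T); } )
T twice(T x) { return x + x; }  // x + x resolves to the asserted ?+?
{
    int four = twice( 2 );        // T is int: builtin int addition satisfies the assertion
    double three = twice( 1.5 );  // T is double: builtin double addition satisfies it
}
\end{cfa}
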
Note, a function named @do_once@ is not required in the scope of @do_twice@ to
compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
allows local replacement of the specific parametric functions needed for a
call.
\begin{cfa}
void do_once(double y) { ... }
int quadruple(int x) {
    void do_once(int & y) { y = y * 2; }
    do_twice(x);
    return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the
argument @x@. It then looks for the most specific definition matching the
assertion, which is the nested integral @do_once@ defined within the
function. The matched assertion function is then passed as a function pointer
to @do_twice@ and called within it.
The global definition of @do_once@ is ignored; however, if @quadruple@ took a
@double@ argument, then the global definition would be used instead as it
would then be a better match.
\todo{cite Aaron's thesis (maybe)}

To avoid typing long lists of assertions, constraints can be collected into
a convenient package called a @trait@, which can then be used in an assertion
instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
    void do_once(T);
}
\end{cfa}
The @forall@ list in the previous example is then replaced with the trait:
\begin{cfa}
forall(dtype T | done_once(T))
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both
functions and variables, and is usually used to create a shorthand for, and
give a descriptive name to, a common grouping of assertions describing a certain
functionality, like @sumable@, @listable@, \etc.

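
As a sketch of a trait that groups more than one assertion, consider the
following. The names @printable_eq@, @print@ and @print_if_equal@ are
hypothetical and chosen only for this example.
\begin{cfa}
trait printable_eq(T) {
    void print(T);
    int ?==?(T, T);
};
forall(T | printable_eq(T))
void print_if_equal(T a, T b) {
    if (a == b) print(a);  // both calls resolve to the trait's assertions
}
\end{cfa}
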
Polymorphic structures and unions are defined by qualifying an aggregate type
with @forall@. The type variables work the same except they are used in field
declarations instead of parameters, returns, and local variable declarations.
\begin{cfa}
forall(dtype T)
struct node {
    node(T) * next;
    T * data;
};
node(int) inode;
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic type usage. Like \Cpp
template usage, a polymorphic type usage must specify a type parameter.

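
Polymorphic functions and polymorphic types combine naturally. In this
minimal sketch the helper @first_data@ is hypothetical, and @node@ is the
structure defined above.
\begin{cfa}
forall(dtype T)
T * first_data(node(T) * list) { return list->data; }
{
    int value = 7;
    node(int) single;
    single.data = &value;
    int * p = first_data(&single);  // T is inferred as int, p points at value
}
\end{cfa}
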
There are many other polymorphism features in \CFA but these are the ones used
by the exception system.

\section{Control Flow}
\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
The two features that interact with
the exception system are @coroutine@ and @thread@; they and their supporting
constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not
required to finish execution when control is handed back to the caller. Instead,
they may suspend execution at any time and be resumed later at the point of
last suspension. (Generators are stackless and coroutines are stackful.) These
types are not concurrent but share some similarities along with common
underpinnings, so they are combined with the \CFA threading library. Further
discussion in this section only refers to the coroutine because generators are
similar.

In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
aggregate type like @struct@, except the structure is implicitly modified by
the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
restricted by the type system to types that provide this special trait. The
coroutine structure acts as the interface between callers and the coroutine,
and its fields are used to pass information in and out of coroutine interface
functions.

Here is a simple example where a single field is used to pass (communicate) the
next number in a sequence.
\begin{cfa}
coroutine CountUp {
    unsigned int next;
};
CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine
object and returns @void@.
%[numbers=left] Why numbers on this one?
\begin{cfa}
void main(CountUp & this) {
    for (unsigned int next = 0 ; true ; ++next) {
        this.next = next;
        suspend;$\label{suspend}$
    }
}
\end{cfa}
In this function, or functions called by this function (helper functions), the
@suspend@ statement is used to return execution to the coroutine's caller
without terminating the coroutine's function.

A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
The first resume calls the @main@ function at the top. Thereafter, resume calls
continue a coroutine in the last suspended function after the @suspend@
statement. In this case there is only one @suspend@ and, hence, the difference
between subsequent calls is the state of variables inside the function and the
coroutine object.
The return value of @resume@ is a reference to the coroutine, to make it
convenient to access fields of the coroutine in the same expression.
Here is a simple example in a helper function:
\begin{cfa}
unsigned int get_next(CountUp & this) {
    return resume(this).next;
}
\end{cfa}

When the main function returns, the coroutine halts and can no longer be
resumed.

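
Putting the pieces together, a caller might drive the coroutine as in the
sketch below. It reuses @CountUp@ and @get_next@ from above; the loop and the
local names are assumptions added for illustration.
\begin{cfa}
{
    CountUp counter;
    for (int i = 0; i < 3; ++i) {
        // Each call resumes main up to its next suspend and reads the field.
        unsigned int n = get_next(counter);  // 0, then 1, then 2
    }
}
\end{cfa}
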
\subsection{Monitor and Mutex Parameter}
Concurrency does not guarantee ordering; without ordering, results are
non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
(mutual exclusion) parameters. A monitor is another kind of aggregate, where
the compiler implicitly inserts a lock and instances are compatible with
@mutex@ parameters.

A function that requires deterministic (ordered) execution acquires mutual
exclusion on a monitor object by qualifying an object reference parameter with
@mutex@.
\begin{cfa}
void example(MonitorA & mutex argA, MonitorB & mutex argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of
the mutex parameters without deadlock. This semantics means all functions with
the same mutex type(s) are part of a critical section for objects of that type
and only one runs at a time.

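
For a concrete, though hypothetical, case, a shared counter can be written as
a monitor. The names @Counter@, @increment@ and @current@ are assumptions for
this sketch, and library includes are omitted as in the other listings.
\begin{cfa}
monitor Counter {
    int count;
};
void ?{}(Counter & this) { this.count = 0; }
// Both functions take the same mutex type, so calls on one Counter are serialized.
void increment(Counter & mutex this) { this.count += 1; }
int current(Counter & mutex this) { return this.count; }
\end{cfa}
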
\subsection{Thread}
Functions, generators, and coroutines are sequential so there is only a single
(but potentially sophisticated) execution path in a program. Threads introduce
multiple execution paths that continue independently.

For threads to work safely with objects, mutual exclusion is required, using
monitors and mutex parameters. For threads to work safely with other threads,
mutual exclusion is also required, in the form of a communication rendezvous,
which also supports internal synchronization as for mutex objects. For exceptions,
only two basic thread operations are important: fork and join.

Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
    const char * input;
    int result;
};
void main(StringWorker & this) {
    const char * localCopy = this.input;
    // ... do some work, perhaps hashing the string ...
    int result = 0; // placeholder for the value computed above
    this.result = result;
}
{
    StringWorker stringworker; // fork thread running in "main"
} // Implicit call to join(stringworker), waits for completion.
\end{cfa}
The thread main is where a new thread starts execution after a fork operation
and then the thread continues executing until it is finished. If another thread
joins with an executing thread, it waits until the executing main completes
execution. In other words, everything a thread does is between a fork and join.

From the outside, this behaviour is accomplished through creation and
destruction of a thread object. Implicitly, fork happens after a thread
object's constructor is run and join happens before the destructor runs. Join
can also be specified explicitly using the @join@ function to wait for a
thread's completion independently from its deallocation (\ie destructor
call). If @join@ is called explicitly, the destructor does not implicitly join.
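
An explicit join might look like the following sketch, which reuses
@StringWorker@ from above. Reading @result@ after the join is an assumption
about typical usage rather than something stated in the text.
\begin{cfa}
{
    StringWorker worker;    // fork: the thread starts running its main
    // ... the creating thread continues with other work ...
    join(worker);           // explicit join: wait for worker's main to finish
    int r = worker.result;  // read the result once the thread has completed
}
\end{cfa}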
