Context Navigation

source: doc/theses/andrew_beach_MMath/existing.tex@ 2b910f9

Visit:

ADT ast-experimental enum forall-pointer-decay jacob/cs343-translation new-ast-unique-expr pthread-emulation qualifiedEnum

Last change on this file since 2b910f9 was 6071efc, checked in by Andrew Beach <ajbeach@…>, 4 years ago
Andrew MMath: Update the first three chapters using Colby's comments.
Property mode set to `100644`
File size: 14.6 KB

Line
1	\chapter{\CFA{} Existing Features}
2	\label{c:existing}
3
4	\CFA is an open-source project extending ISO C with
5	modern safety and productivity features, while still ensuring backwards
6	compatibility with C and its programmers. \CFA is designed to have an
7	orthogonal feature-set based closely on the C programming paradigm
8	(non-object-oriented) and these features can be added incrementally to an
9	existing C code-base allowing programmers to learn \CFA on an as-needed basis.
10
11	Only those \CFA features pertaining to this thesis are discussed.
12	Also, only new features of \CFA will be discussed, a familiarity with
13	C or C-like languages is assumed.
14
15	\section{Overloading and \lstinline{extern}}
16	\CFA has extensive overloading, allowing multiple definitions of the same name
17	to be defined~\cite{Moss18}.
18	\begin{cfa}
19	char i; int i; double i;
20	int f(); double f();
21	void g( int ); void g( double );
22	\end{cfa}
23	This feature requires name mangling so the assembly symbols are unique for
24	different overloads. For compatibility with names in C, there is also a syntax
25	to disable name mangling. These unmangled names cannot be overloaded but act as
26	the interface between C and \CFA code. The syntax for disabling/enabling
27	mangling is:
28	\begin{cfa}
29	// name mangling on by default
30	int i; // _X1ii_1
31	extern "C" { // disables name mangling
32	int j; // j
33	extern "Cforall" { // enables name mangling
34	int k; // _X1ki_1
35	}
36	// revert to no name mangling
37	}
38	// revert to name mangling
39	\end{cfa}
40	Both forms of @extern@ affect all the declarations within their nested lexical
41	scope and transition back to the previous mangling state when the lexical scope
42	ends.
43
44	\section{Reference Type}
45	\CFA adds a reference type to C as an auto-dereferencing pointer.
46	They work very similarly to pointers.
47	Reference-types are written the same way as a pointer-type but each
48	asterisk (@*@) is replaced with a ampersand (@&@);
49	this includes cv-qualifiers and multiple levels of reference.
50
51	Generally, references act like pointers with an implicate dereferencing
52	operation added to each use of the variable.
53	These automatic dereferences may be disabled with the address-of operator
54	(@&@).
55
56	% Check to see if these are generating errors.
57	\begin{minipage}{0,5\textwidth}
58	With references:
59	\begin{cfa}
60	int i, j;
61	int & ri = i;
62	int && rri = ri;
63	rri = 3;
64	&ri = &j;
65	ri = 5;
66	\end{cfa}
67	\end{minipage}
68	\begin{minipage}{0,5\textwidth}
69	With pointers:
70	\begin{cfa}
71	int i, j;
72	int * pi = &i
73	int ** ppi = π
74	**ppi = 3;
75	pi = &j;
76	*pi = 5;
77	\end{cfa}
78	\end{minipage}
79
80	References are intended to be used when you would use pointers but would
81	be dereferencing them (almost) every usage.
82	Mutable references may be assigned to by converting them to a pointer
83	with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above
84
85	\section{Operators}
86
87	\CFA implements operator overloading by providing special names.
88	Operator uses are translated into function calls using these names.
89	These names are created by taking the operator symbols and joining them with
90	@?@s to show where the arguments go.
91	For example,
92	infixed multiplication is @??@ while prefix dereference is @?@.
93	This syntax make it easy to tell the difference between prefix operations
94	(such as @++?@) and post-fix operations (@?++@).
95
96	\begin{cfa}
97	point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
98	bool ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
99	{
100	assert(point{1, 2} + point{3, 4} == point{4, 6});
101	}
102	\end{cfa}
103	Note that these special names are not limited to just being used for these
104	operator functions, and may be used name other declarations.
105	Some ``near misses", that will not match an operator form but looks like
106	it may have been supposed to, will generate wantings but otherwise they are
107	left alone.
108
109	%\subsection{Constructors and Destructors}
110
111	Both constructors and destructors are operators, which means they are
112	functions with special operator names rather than type names in \Cpp. The
113	special operator names may be used to call the functions explicitly.
114	% Placement new means that this is actually equivant to C++.
115
116	The special name for a constructor is @?{}@, which comes from the
117	initialization syntax in C, \eg @Example e = { ... }@.
118	\CFA will generate a constructor call each time a variable is declared,
119	passing the initialization arguments to the constructort.
120	\begin{cfa}
121	struct Example { ... };
122	void ?{}(Example & this) { ... }
123	{
124	Example a;
125	Example b = {};
126	}
127	void ?{}(Example & this, char first, int num) { ... }
128	{
129	Example c = {'a', 2};
130	}
131	\end{cfa}
132	Both @a@ and @b@ will be initalized with the first constructor,
133	while @c@ will be initalized with the second.
134	Currently, there is no general way to skip initialation.
135
136	% I don't like the \^{} symbol but $^\wedge$ isn't better.
137	Similarly destructors use the special name @^?{}@ (the @^@ has no special
138	meaning).
139	These are a normally called implicitly called on a variable when it goes out
140	of scope. They can be called explicitly as well.
141	\begin{cfa}
142	void ^?{}(Example & this) { ... }
143	{
144	Example d;
145	} // <- implicit destructor call
146	\end{cfa}
147
148	Whenever a type is defined, \CFA will create a default zero-argument
149	constructor, a copy constructor, a series of argument-per-field constructors
150	and a destructor. All user constructors are defined after this.
151	Because operators are never part of the type definition they may be added
152	at any time, including on built-in types.
153
154	\section{Polymorphism}
155	\CFA uses parametric polymorphism to create functions and types that are
156	defined over multiple types. \CFA polymorphic declarations serve the same role
157	as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
158	accomplished by passing argument operations to associate \emph{parameters} at
159	the call site, and these parameters are used in the function to differentiate
160	among the types the function operates on.
161
162	Polymorphic declarations start with a universal @forall@ clause that goes
163	before the standard (monomorphic) declaration. These declarations have the same
164	syntax except they may use the universal type names introduced by the @forall@
165	clause. For example, the following is a polymorphic identity function that
166	works on any type @T@:
167	\begin{cfa}
168	forall( T ) T identity( T val ) { return val; }
169	int forty_two = identity( 42 );
170	char capital_a = identity( 'A' );
171	\end{cfa}
172	Each use of a polymorphic declaration resolves its polymorphic parameters
173	(in this case, just @T@) to concrete types (@int@ in the first use and @char@
174	in the second).
175
176	To allow a polymorphic function to be separately compiled, the type @T@ must be
177	constrained by the operations used on @T@ in the function body. The @forall@
178	clause is augmented with a list of polymorphic variables (local type names)
179	and assertions (constraints), which represent the required operations on those
180	types used in a function, \eg:
181	\begin{cfa}
182	forall( T \| { void do_once(T); } )
183	void do_twice(T value) {
184	do_once(value);
185	do_once(value);
186	}
187	\end{cfa}
188
189	A polymorphic function can be used in the same way as a normal function. The
190	polymorphic variables are filled in with concrete types and the assertions are
191	checked. An assertion is checked by verifying each assertion operation (with
192	all the variables replaced with the concrete types from the arguments) is
193	defined at a call site.
194	\begin{cfa}
195	void do_once(int i) { ... }
196	int i;
197	do_twice(i);
198	\end{cfa}
199	Any object with a type fulfilling the assertion may be passed as an argument to
200	a @do_twice@ call.
201
202	Note, a function named @do_once@ is not required in the scope of @do_twice@ to
203	compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
204	allows local replacement of the most specific parametric functions needs for a
205	call.
206	\begin{cfa}
207	void do_once(double y) { ... }
208	int quadruple(int x) {
209	void do_once(int & y) { y = y * 2; }
210	do_twice(x);
211	return x;
212	}
213	\end{cfa}
214	Specifically, the complier deduces that @do_twice@'s T is an integer from the
215	argument @x@. It then looks for the most specific definition matching the
216	assertion, which is the nested integral @do_once@ defined within the
217	function. The matched assertion function is then passed as a function pointer
218	to @do_twice@ and called within it.
219	The global definition of @do_once@ is ignored, however if quadruple took a
220	@double@ argument then the global definition would be used instead as it
221	would be a better match.
222	% Aaron's thesis might be a good reference here.
223
224	To avoid typing long lists of assertions, constraints can be collect into
225	convenient packages called a @trait@, which can then be used in an assertion
226	instead of the individual constraints.
227	\begin{cfa}
228	trait done_once(T) {
229	void do_once(T);
230	}
231	\end{cfa}
232	and the @forall@ list in the previous example is replaced with the trait.
233	\begin{cfa}
234	forall(dtype T \| done_once(T))
235	\end{cfa}
236	In general, a trait can contain an arbitrary number of assertions, both
237	functions and variables, and are usually used to create a shorthand for, and
238	give descriptive names to, common groupings of assertions describing a certain
239	functionality, like @sumable@, @listable@, \etc.
240
241	Polymorphic structures and unions are defined by qualifying the aggregate type
242	with @forall@. The type variables work the same except they are used in field
243	declarations instead of parameters, returns, and local variable declarations.
244	\begin{cfa}
245	forall(dtype T)
246	struct node {
247	node(T) * next;
248	T * data;
249	}
250	node(int) inode;
251	\end{cfa}
252	The generic type @node(T)@ is an example of a polymorphic type usage. Like \Cpp
253	template usage, a polymorphic type usage must specify a type parameter.
254
255	There are many other polymorphism features in \CFA but these are the ones used
256	by the exception system.
257
258	\section{Control Flow}
259	\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
260	The two features that interact with
261	the exception system are @coroutine@ and @thread@; they and their supporting
262	constructs are described here.
263
264	\subsection{Coroutine}
265	A coroutine is a type with associated functions, where the functions are not
266	required to finish execution when control is handed back to the caller. Instead
267	they may suspend execution at any time and be resumed later at the point of
268	last suspension. (Generators are stackless and coroutines are stackful.) These
269	types are not concurrent but share some similarities along with common
270	underpinnings, so they are combined with the \CFA threading library. Further
271	discussion in this section only refers to the coroutine because generators are
272	similar.
273
274	In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
275	aggregate type like @struct,@ except the structure is implicitly modified by
276	the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
277	restricted by the type system to types that provide this special trait. The
278	coroutine structure acts as the interface between callers and the coroutine,
279	and its fields are used to pass information in and out of coroutine interface
280	functions.
281
282	Here is a simple example where a single field is used to pass (communicate) the
283	next number in a sequence.
284	\begin{cfa}
285	coroutine CountUp {
286	unsigned int next;
287	}
288	CountUp countup;
289	\end{cfa}
290	Each coroutine has a @main@ function, which takes a reference to a coroutine
291	object and returns @void@.
292	%[numbers=left] Why numbers on this one?
293	\begin{cfa}
294	void main(CountUp & this) {
295	for (unsigned int next = 0 ; true ; ++next) {
296	next = up;
297	suspend;$\label{suspend}$
298	}
299	}
300	\end{cfa}
301	In this function, or functions called by this function (helper functions), the
302	@suspend@ statement is used to return execution to the coroutine's caller
303	without terminating the coroutine's function.
304
305	A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
306	The first resume calls the @main@ function at the top. Thereafter, resume calls
307	continue a coroutine in the last suspended function after the @suspend@
308	statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes
309	a reference to the coroutine structure and returns the same reference. The
310	return value allows easy access to communication variables defined in the
311	coroutine object. For example, the @next@ value for coroutine object @countup@
312	is both generated and collected in the single expression:
313	@resume(countup).next@.
314
315	\subsection{Monitor and Mutex Parameter}
316	Concurrency does not guarantee ordering; without ordering results are
317	non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
318	(mutual exclusion) parameters. A monitor is another kind of aggregate, where
319	the compiler implicitly inserts a lock and instances are compatible with
320	@mutex@ parameters.
321
322	A function that requires deterministic (ordered) execution, acquires mutual
323	exclusion on a monitor object by qualifying an object reference parameter with
324	@mutex@.
325	\begin{cfa}
326	void example(MonitorA & mutex argA, MonitorB & mutex argB);
327	\end{cfa}
328	When the function is called, it implicitly acquires the monitor lock for all of
329	the mutex parameters without deadlock. This semantics means all functions with
330	the same mutex type(s) are part of a critical section for objects of that type
331	and only one runs at a time.
332
333	\subsection{Thread}
334	Functions, generators, and coroutines are sequential so there is only a single
335	(but potentially sophisticated) execution path in a program. Threads introduce
336	multiple execution paths that continue independently.
337
338	For threads to work safely with objects requires mutual exclusion using
339	monitors and mutex parameters. For threads to work safely with other threads,
340	also requires mutual exclusion in the form of a communication rendezvous, which
341	also supports internal synchronization as for mutex objects. For exceptions,
342	only two basic thread operations are important: fork and join.
343
344	Threads are created like coroutines with an associated @main@ function:
345	\begin{cfa}
346	thread StringWorker {
347	const char * input;
348	int result;
349	};
350	void main(StringWorker & this) {
351	const char * localCopy = this.input;
352	// ... do some work, perhaps hashing the string ...
353	this.result = result;
354	}
355	{
356	StringWorker stringworker; // fork thread running in "main"
357	} // <- implicitly join with thread / wait for completion
358	\end{cfa}
359	The thread main is where a new thread starts execution after a fork operation
360	and then the thread continues executing until it is finished. If another thread
361	joins with an executing thread, it waits until the executing main completes
362	execution. In other words, everything a thread does is between a fork and join.
363
364	From the outside, this behaviour is accomplished through creation and
365	destruction of a thread object. Implicitly, fork happens after a thread
366	object's constructor is run and join happens before the destructor runs. Join
367	can also be specified explicitly using the @join@ function to wait for a
368	thread's completion independently from its deallocation (\ie destructor
369	call). If @join@ is called explicitly, the destructor does not implicitly join.

Note: See TracBrowser for help on using the repository browser.

Download in other formats: