Context Navigation

source: doc/theses/andrew_beach_MMath/existing.tex @ ea54f1e

ADTast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum

Last change on this file since ea54f1e was 1e567ab, checked in by Peter A. Buhr <pabuhr@…>, 3 years ago
fix error in proofreading
Property mode set to `100644`
File size: 15.2 KB

Line
1	\chapter{\CFA{} Existing Features}
2	\label{c:existing}
3
4	\CFA is an open-source project extending ISO C with
5	modern safety and productivity features, while still ensuring backwards
6	compatibility with C and its programmers. \CFA is designed to have an
7	orthogonal feature-set based closely on the C programming paradigm
8	(non-object-oriented) and these features can be added incrementally to an
9	existing C code-base allowing programmers to learn \CFA on an as-needed basis.
10
11	Only those \CFA features pertaining to this thesis are discussed.
12	% Also, only new features of \CFA will be discussed,
13	A familiarity with
14	C or C-like languages is assumed.
15
16	\section{Overloading and \lstinline{extern}}
17	\CFA has extensive overloading, allowing multiple definitions of the same name
18	to be defined~\cite{Moss18}.
19	\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
20	char @i@; int @i@; double @i@;
21	int @f@(); double @f@();
22	void @g@( int ); void @g@( double );
23	\end{lstlisting}
24	This feature requires name mangling so the assembly symbols are unique for
25	different overloads. For compatibility with names in C, there is also a syntax
26	to disable name mangling. These unmangled names cannot be overloaded but act as
27	the interface between C and \CFA code. The syntax for disabling/enabling
28	mangling is:
29	\begin{cfa}
30	// name mangling on by default
31	int i; // _X1ii_1
32	extern "C" { // disables name mangling
33	int j; // j
34	extern "Cforall" { // enables name mangling
35	int k; // _X1ki_1
36	}
37	// revert to no name mangling
38	}
39	// revert to name mangling
40	\end{cfa}
41	Both forms of @extern@ affect all the declarations within their nested lexical
42	scope and transition back to the previous mangling state when the lexical scope
43	ends.
44
45	\section{Reference Type}
46	\CFA adds a reference type to C as an auto-dereferencing pointer.
47	They work very similarly to pointers.
48	Reference-types are written the same way as a pointer-type but each
49	asterisk (@*@) is replaced with a ampersand (@&@);
50	this includes cv-qualifiers and multiple levels of reference.
51
52	Generally, references act like pointers with an implicate dereferencing
53	operation added to each use of the variable.
54	These automatic dereferences may be disabled with the address-of operator
55	(@&@).
56
57	% Check to see if these are generating errors.
58	\begin{minipage}{0,5\textwidth}
59	With references:
60	\begin{cfa}
61	int i, j;
62	int & ri = i;
63	int && rri = ri;
64	rri = 3;
65	&ri = &j; // rebindable
66	ri = 5;
67	\end{cfa}
68	\end{minipage}
69	\begin{minipage}{0,5\textwidth}
70	With pointers:
71	\begin{cfa}
72	int i, j;
73	int * pi = &i
74	int ** ppi = π
75	**ppi = 3;
76	pi = &j;
77	*pi = 5;
78	\end{cfa}
79	\end{minipage}
80
81	References are intended for pointer situations where dereferencing is the common usage,
82	\ie the value is more important than the pointer.
83	Mutable references may be assigned to by converting them to a pointer
84	with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above
85
86	\section{Operators}
87
88	\CFA implements operator overloading by providing special names, where
89	operator usages are translated into function calls using these names.
90	An operator name is created by taking the operator symbols and joining them with
91	@?@s to show where the arguments go.
92	For example,
93	infixed multiplication is @??@, while prefix dereference is @?@.
94	This syntax make it easy to tell the difference between prefix operations
95	(such as @++?@) and post-fix operations (@?++@).
96	For example, plus and equality operators are defined for a point type.
97	\begin{cfa}
98	point ?+?(point a, point b) { return point{a.x + b.x, a.y + b.y}; }
99	int ?==?(point a, point b) { return a.x == b.x && a.y == b.y; }
100	{
101	assert(point{1, 2} + point{3, 4} == point{4, 6});
102	}
103	\end{cfa}
104	Note these special names are not limited to builtin
105	operators, and hence, may be used with arbitrary types.
106	\begin{cfa}
107	double ?+?( int x, point y ); // arbitrary types
108	\end{cfa}
109	% Some ``near misses", that are that do not match an operator form but looks like
110	% it may have been supposed to, will generate warning but otherwise they are
111	% left alone.
112	Because operators are never part of the type definition they may be added
113	at any time, including on built-in types.
114
115	%\subsection{Constructors and Destructors}
116
117	\CFA also provides constructors and destructors as operators, which means they
118	are functions with special operator names rather than type names in \Cpp.
119	While constructors and destructions are normally called implicitly by the compiler,
120	the special operator names, allow explicit calls.
121
122	% Placement new means that this is actually equivalent to C++.
123
124	The special name for a constructor is @?{}@, which comes from the
125	initialization syntax in C, \eg @Example e = { ... }@.
126	\CFA generates a constructor call each time a variable is declared,
127	passing the initialization arguments to the constructor.
128	\begin{cfa}
129	struct Example { ... };
130	void ?{}(Example & this) { ... }
131	void ?{}(Example & this, char first, int num) { ... }
132	Example a; // implicit constructor calls
133	Example b = {};
134	Example c = {'a', 2};
135	\end{cfa}
136	Both @a@ and @b@ are initialized with the first constructor,
137	while @c@ is initialized with the second.
138	Constructor calls can be replaced with C initialization using special operator \lstinline{@=}.
139	\begin{cfa}
140	Example d @= {42};
141	\end{cfa}
142	% I don't like the \^{} symbol but $^\wedge$ isn't better.
143	Similarly, destructors use the special name @^?{}@ (the @^@ has no special
144	meaning).
145	% These are a normally called implicitly called on a variable when it goes out
146	% of scope. They can be called explicitly as well.
147	\begin{cfa}
148	void ^?{}(Example & this) { ... }
149	{
150	Example e; // implicit constructor call
151	^?{}(e); // explicit destructor call
152	?{}(e); // explicit constructor call
153	} // implicit destructor call
154	\end{cfa}
155
156	Whenever a type is defined, \CFA creates a default zero-argument
157	constructor, a copy constructor, a series of argument-per-field constructors
158	and a destructor. All user constructors are defined after this.
159
160	\section{Polymorphism}
161	\CFA uses parametric polymorphism to create functions and types that are
162	defined over multiple types. \CFA polymorphic declarations serve the same role
163	as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
164	accomplished by passing argument operations to associate \emph{parameters} at
165	the call site, and these parameters are used in the function to differentiate
166	among the types the function operates on.
167
168	Polymorphic declarations start with a universal @forall@ clause that goes
169	before the standard (monomorphic) declaration. These declarations have the same
170	syntax except they may use the universal type names introduced by the @forall@
171	clause. For example, the following is a polymorphic identity function that
172	works on any type @T@:
173	\begin{cfa}
174	forall( T ) T identity( T val ) { return val; }
175	int forty_two = identity( 42 );
176	char capital_a = identity( 'A' );
177	\end{cfa}
178	Each use of a polymorphic declaration resolves its polymorphic parameters
179	(in this case, just @T@) to concrete types (@int@ in the first use and @char@
180	in the second).
181
182	To allow a polymorphic function to be separately compiled, the type @T@ must be
183	constrained by the operations used on @T@ in the function body. The @forall@
184	clause is augmented with a list of polymorphic variables (local type names)
185	and assertions (constraints), which represent the required operations on those
186	types used in a function, \eg:
187	\begin{cfa}
188	forall( T \| { void do_once(T); } )
189	void do_twice(T value) {
190	do_once(value);
191	do_once(value);
192	}
193	\end{cfa}
194
195	A polymorphic function can be used in the same way as a normal function. The
196	polymorphic variables are filled in with concrete types and the assertions are
197	checked. An assertion is checked by verifying each assertion operation (with
198	all the variables replaced with the concrete types from the arguments) is
199	defined at a call site.
200	\begin{cfa}
201	void do_once(int i) { ... }
202	int i;
203	do_twice(i);
204	\end{cfa}
205	Any object with a type fulfilling the assertion may be passed as an argument to
206	a @do_twice@ call.
207
208	Note, a function named @do_once@ is not required in the scope of @do_twice@ to
209	compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
210	allows local replacement of the specific parametric functions needs for a
211	call.
212	\begin{cfa}
213	void do_once(double y) { ... }
214	int quadruple(int x) {
215	void do_once(int & y) { y = y * 2; }
216	do_twice(x);
217	return x;
218	}
219	\end{cfa}
220	Specifically, the complier deduces that @do_twice@'s T is an integer from the
221	argument @x@. It then looks for the most specific definition matching the
222	assertion, which is the nested integral @do_once@ defined within the
223	function. The matched assertion function is then passed as a function pointer
224	to @do_twice@ and called within it.
225	The global definition of @do_once@ is ignored, however if quadruple took a
226	@double@ argument, then the global definition would be used instead as it
227	is a better match.
228	% Aaron's thesis might be a good reference here.
229
230	To avoid typing long lists of assertions, constraints can be collect into
231	convenient package called a @trait@, which can then be used in an assertion
232	instead of the individual constraints.
233	\begin{cfa}
234	trait done_once(T) {
235	void do_once(T);
236	}
237	\end{cfa}
238	and the @forall@ list in the previous example is replaced with the trait.
239	\begin{cfa}
240	forall(dtype T \| done_once(T))
241	\end{cfa}
242	In general, a trait can contain an arbitrary number of assertions, both
243	functions and variables, and are usually used to create a shorthand for, and
244	give descriptive names to, common groupings of assertions describing a certain
245	functionality, like @sumable@, @listable@, \etc.
246
247	Polymorphic structures and unions are defined by qualifying an aggregate type
248	with @forall@. The type variables work the same except they are used in field
249	declarations instead of parameters, returns, and local variable declarations.
250	\begin{cfa}
251	forall(dtype T)
252	struct node {
253	node(T) * next;
254	T * data;
255	}
256	node(int) inode;
257	\end{cfa}
258	The generic type @node(T)@ is an example of a polymorphic type usage. Like \Cpp
259	template usage, a polymorphic type usage must specify a type parameter.
260
261	There are many other polymorphism features in \CFA but these are the ones used
262	by the exception system.
263
264	\section{Control Flow}
265	\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
266	The two features that interact with
267	the exception system are @coroutine@ and @thread@; they and their supporting
268	constructs are described here.
269
270	\subsection{Coroutine}
271	A coroutine is a type with associated functions, where the functions are not
272	required to finish execution when control is handed back to the caller. Instead
273	they may suspend execution at any time and be resumed later at the point of
274	last suspension. (Generators are stackless and coroutines are stackful.) These
275	types are not concurrent but share some similarities along with common
276	underpinnings, so they are combined with the \CFA threading library. Further
277	discussion in this section only refers to the coroutine because generators are
278	similar.
279
280	In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
281	aggregate type like @struct,@ except the structure is implicitly modified by
282	the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
283	restricted by the type system to types that provide this special trait. The
284	coroutine structure acts as the interface between callers and the coroutine,
285	and its fields are used to pass information in and out of coroutine interface
286	functions.
287
288	Here is a simple example where a single field is used to pass (communicate) the
289	next number in a sequence.
290	\begin{cfa}
291	coroutine CountUp {
292	unsigned int next;
293	};
294	CountUp countup;
295	for (10) sout \| resume(countup).next; // print 10 values
296	\end{cfa}
297	Each coroutine has a @main@ function, which takes a reference to a coroutine
298	object and returns @void@.
299	%[numbers=left] Why numbers on this one?
300	\begin{cfa}[numbers=left,numberstyle=\scriptsize\sf]
301	void main(CountUp & this) {
302	for (unsigned int up = 0;; ++up) {
303	this.next = up;
304	suspend;$\label{suspend}$
305	}
306	}
307	\end{cfa}
308	In this function, or functions called by this function (helper functions), the
309	@suspend@ statement is used to return execution to the coroutine's resumer
310	without terminating the coroutine's function(s).
311
312	A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
313	The first resume calls the @main@ function at the top. Thereafter, resume calls
314	continue a coroutine in the last suspended function after the @suspend@
315	statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes
316	a reference to the coroutine structure and returns the same reference. The
317	return value allows easy access to communication variables defined in the
318	coroutine object. For example, the @next@ value for coroutine object @countup@
319	is both generated and collected in the single expression:
320	@resume(countup).next@.
321
322	\subsection{Monitor and Mutex Parameter}
323	Concurrency does not guarantee ordering; without ordering results are
324	non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
325	(mutual exclusion) parameters. A monitor is another kind of aggregate, where
326	the compiler implicitly inserts a lock and instances are compatible with
327	@mutex@ parameters.
328
329	A function that requires deterministic (ordered) execution, acquires mutual
330	exclusion on a monitor object by qualifying an object reference parameter with
331	@mutex@.
332	\begin{lstlisting}[language=CFA,{moredelim=**[is][\color{red}]{@}{@}}]
333	void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB);
334	\end{lstlisting}
335	When the function is called, it implicitly acquires the monitor lock for all of
336	the mutex parameters without deadlock. This semantics means all functions with
337	the same mutex type(s) are part of a critical section for objects of that type
338	and only one runs at a time.
339
340	\subsection{Thread}
341	Functions, generators, and coroutines are sequential so there is only a single
342	(but potentially sophisticated) execution path in a program. Threads introduce
343	multiple execution paths that continue independently.
344
345	For threads to work safely with objects requires mutual exclusion using
346	monitors and mutex parameters. For threads to work safely with other threads,
347	also requires mutual exclusion in the form of a communication rendezvous, which
348	also supports internal synchronization as for mutex objects. For exceptions,
349	only two basic thread operations are important: fork and join.
350
351	Threads are created like coroutines with an associated @main@ function:
352	\begin{cfa}
353	thread StringWorker {
354	const char * input;
355	int result;
356	};
357	void main(StringWorker & this) {
358	const char * localCopy = this.input;
359	// ... do some work, perhaps hashing the string ...
360	this.result = result;
361	}
362	{
363	StringWorker stringworker; // fork thread running in "main"
364	} // implicitly join with thread / wait for completion
365	\end{cfa}
366	The thread main is where a new thread starts execution after a fork operation
367	and then the thread continues executing until it is finished. If another thread
368	joins with an executing thread, it waits until the executing main completes
369	execution. In other words, everything a thread does is between a fork and join.
370
371	From the outside, this behaviour is accomplished through creation and
372	destruction of a thread object. Implicitly, fork happens after a thread
373	object's constructor is run and join happens before the destructor runs. Join
374	can also be specified explicitly using the @join@ function to wait for a
375	thread's completion independently from its deallocation (\ie destructor
376	call). If @join@ is called explicitly, the destructor does not implicitly join.

Note: See TracBrowser for help on using the repository browser.

Download in other formats: