source: doc/theses/andrew_beach_MMath/existing.tex @ d3b95f1

Last change on this file since d3b95f1 was 4ed7946e, checked in by Peter A. Buhr <pabuhr@…>, 3 years ago
proofread Andrew's thesis chapters
\chapter{\CFA Existing Features}
\label{c:existing}

\CFA is an open-source project extending ISO C with
modern safety and productivity features, while still ensuring backwards
compatibility with C and its programmers. \CFA is designed to have an
orthogonal feature-set based closely on the C programming paradigm
(non-object-oriented) and these features can be added incrementally to an
existing C code-base allowing programmers to learn \CFA on an as-needed basis.

Only those \CFA features pertaining to this thesis are discussed. Many of the
\CFA syntactic and semantic features used in the thesis should be fairly
obvious to the reader.

\section{Overloading and \lstinline{extern}}
\CFA has extensive overloading, allowing multiple definitions of the same
name~\cite{Moss18}.
\begin{cfa}
char i; int i; double i;
int f(); double f();
void g( int ); void g( double );
\end{cfa}
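As an illustrative sketch of how such overloads might be selected, the
following gives the functions trivial bodies and uses them from a @main@
routine; the bodies and the variables @a@ and @b@ are assumptions added for
illustration only.
\begin{cfa}
int f() { return 1; }
double f() { return 2.5; }
void g( int i ) {}
void g( double d ) {}
int main() {
	int a = f(); // int f() matches the type being initialized
	double b = f(); // double f() matches the type being initialized
	g( a ); // g( int )
	g( b ); // g( double )
}
\end{cfa}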
This feature requires name mangling so the assembly symbols are unique for
different overloads. For compatibility with names in C, there is also a syntax
to disable name mangling. These unmangled names cannot be overloaded but act as
the interface between C and \CFA code. The syntax for disabling/enabling
mangling is:
\begin{cfa}
// name mangling on by default
int i; // _X1ii_1
@extern "C"@ {  // disables name mangling
	int j; // j
	@extern "Cforall"@ {  // enables name mangling
		int k; // _X1ki_1
	}
	// revert to no name mangling
}
// revert to name mangling
\end{cfa}
Both forms of @extern@ affect all the declarations within their nested lexical
scope and transition back to the previous mangling state when the lexical scope
ends.

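A minimal sketch of how the two mangling states might be mixed in one
translation unit follows. The names @c_visible_add@ and @add@ are hypothetical
and chosen only for illustration.
\begin{cfa}
extern "C" {
	// unmangled: the symbol is c_visible_add, so C code can link against it
	int c_visible_add( int a, int b ) { return a + b; }
}
// mangled (the default): each overload gets its own symbol
int add( int a, int b ) { return a + b; }
double add( double a, double b ) { return a + b; }
\end{cfa}
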
\section{Reference Type}
\CFA adds a reference type to C as an auto-dereferencing pointer.
References work very similarly to pointers.
Reference-types are written the same way as a pointer-type but each
asterisk (@*@) is replaced with an ampersand (@&@);
this includes cv-qualifiers and multiple levels of reference, \eg:

\begin{minipage}{0,5\textwidth}
With references:
\begin{cfa}
int i, j;
int & ri = i;
int && rri = ri;
rri = 3;
&ri = &j; // reference assignment
ri = 5;
\end{cfa}
\end{minipage}
\begin{minipage}{0,5\textwidth}
With pointers:
\begin{cfa}
int i, j;
int * pi = &i;
int ** ppi = &pi;
**ppi = 3;
pi = &j; // pointer assignment
*pi = 5;
\end{cfa}
\end{minipage}

References are intended for cases where you would want to use pointers but would
be dereferencing them on (almost) every use.
In most cases a reference can just be thought of as a pointer that
automatically puts a dereference in front of each of its uses (per-level of
reference).
The address-of operator (@&@) acts as an escape and removes one of the
automatic dereference operations.
Mutable references may be assigned by converting them to a pointer
with a @&@ and then assigning a pointer to them, as in @&ri = &j;@ above.

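A common use is passing arguments by reference. The following sketch assumes a
hypothetical helper @exchange@, which is not part of the text above, and shows
both the implicit dereference and the @&@ escape.
\begin{cfa}
void exchange( int & a, int & b ) {
	int tmp = a; // a and b are dereferenced automatically
	a = b;
	b = tmp;
}
int main() {
	int x = 1, y = 2;
	exchange( x, y ); // arguments are bound by reference
	int & r = x;
	r = 10; // assigns to x
	&r = &y; // rebinds r to y
	r = 20; // assigns to y
}
\end{cfa}
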
\section{Operators}

In general, operator names in \CFA are constructed by bracketing an operator
token with @?@, which indicates the position of the arguments. For example,
infixed multiplication is @?*?@ while prefix dereference is @*?@.
This syntax makes it easy to tell the difference between prefix operations
(such as @++?@) and post-fix operations (@?++@).

An operator name may describe any function signature (it is just a name) but
only certain signatures may be called in operator form.
\begin{cfa}
int ?+?( int i, int j, int k ) { return i + j + k; }
{
	sout | ?+?( 3, 4, 5 ); // no infix form
}
\end{cfa}
Some ``near-misses'' for unary/binary operator prototypes generate warnings.

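For contrast, a two-argument @?+?@ can be called in operator form. The sketch
below assumes a hypothetical type @Vec2@, introduced here only for
illustration, and relies on the argument-per-field constructors that \CFA
generates for a structure.
\begin{cfa}
struct Vec2 { int x, y; };
Vec2 ?+?( Vec2 a, Vec2 b ) {
	Vec2 sum = { a.x + b.x, a.y + b.y };
	return sum;
}
int main() {
	Vec2 p = { 1, 2 }, q = { 3, 4 };
	Vec2 s = p + q; // operator form
	Vec2 t = ?+?( p, q ); // the same call written with the operator name
}
\end{cfa}
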
Both constructors and destructors are operators, which means they are
functions with special operator names rather than type names as in \Cpp. The
special operator names may be used to call the functions explicitly (not
allowed in \Cpp for constructors).

The special name for a constructor is @?{}@, where the name @{}@ comes from the
initialization syntax in C, \eg @Structure s = {...}@.
% That initialization syntax is also the operator form.
\CFA generates a constructor call each time a variable is declared,
passing the initialization arguments to the constructor.
\begin{cfa}
struct Structure { ... };
void ?{}(Structure & this) { ... }
{
	Structure a;
	Structure b = {};
}
void ?{}(Structure & this, char first, int num) { ... }
{
	Structure c = {'a', 2};
}
\end{cfa}
Both @a@ and @b@ are initialized with the first constructor,
while @c@ is initialized with the second.
Currently, there is no general way to skip initialization.

% I don't like the \^{} symbol but $^\wedge$ isn't better.
Similarly, destructors use the special name @^?{}@ (the @^@ has no special
meaning). Normally, they are implicitly called on a variable when it goes out
of scope but they can be called explicitly as well.
\begin{cfa}
void ^?{}(Structure & this) { ... }
{
	Structure d;
} // <- implicit destructor call
\end{cfa}

Whenever a type is defined, \CFA creates a default zero-argument
constructor, a copy constructor, a series of argument-per-field constructors
and a destructor. All user constructors are defined after this.
Because operators are never part of the type definition they may be added
at any time, including on built-in types.

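Constructors and destructors are typically paired to manage a resource. The
following is a sketch only: the type @Buffer@ and its field are hypothetical,
and the storage comes from the C library routines @malloc@ and @free@.
\begin{cfa}
#include <stdlib.h> // malloc, free
struct Buffer { char * data; };
void ?{}( Buffer & this, int size ) { // user-defined constructor
	this.data = (char *)malloc( size );
}
void ^?{}( Buffer & this ) { // user-defined destructor
	free( this.data );
}
int main() {
	Buffer b = { 64 }; // calls ?{}( b, 64 )
} // <- implicit call to ^?{}( b )
\end{cfa}
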
\section{Polymorphism}
\CFA uses parametric polymorphism to create functions and types that are
defined over multiple types. \CFA polymorphic declarations serve the same role
as \Cpp templates or Java generics. Here, ``parametric'' means the polymorphism
is accomplished by passing the operations on the argument types as associated
\emph{parameters} at the call site, and these parameters are used in the
function to differentiate among the types the function operates on.

Polymorphic declarations start with a universal @forall@ clause that goes
before the standard (monomorphic) declaration. These declarations have the same
syntax except they may use the universal type names introduced by the @forall@
clause. For example, the following is a polymorphic identity function that
works on any type @T@:
\begin{cfa}
forall( T ) T identity( T val ) { return val; }
int forty_two = identity( 42 );
char capital_a = identity( 'A' );
\end{cfa}
Each use of a polymorphic declaration resolves its polymorphic parameters
(in this case, just @T@) to concrete types (@int@ in the first use and @char@
in the second).

To allow a polymorphic function to be separately compiled, the type @T@ must be
constrained by the operations used on @T@ in the function body. The @forall@
clause is augmented with a list of polymorphic variables (local type names)
and assertions (constraints), which represent the required operations on those
types used in a function, \eg:
\begin{cfa}
forall( T | { void do_once(T); } )
void do_twice(T value) {
	do_once(value);
	do_once(value);
}
\end{cfa}

A polymorphic function can be used in the same way as a normal function. The
polymorphic variables are filled in with concrete types and the assertions are
checked. An assertion is checked by verifying each assertion operation (with
all the variables replaced with the concrete types from the arguments) is
defined at a call site.
\begin{cfa}
void do_once(int i) { ... }
int i;
do_twice(i);
\end{cfa}
Any object with a type fulfilling the assertion may be passed as an argument to
a @do_twice@ call.

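Assertions may also name operators. As a sketch, the hypothetical function
@twice@ below (not part of the example above) requires only that its type has
an addition operator, which the built-in arithmetic types already provide.
\begin{cfa}
forall( T | { T ?+?( T, T ); } )
T twice( T x ) { return x + x; }
int main() {
	int i = twice( 21 ); // T is int, assertion met by built-in int addition
	double d = twice( 1.5 ); // T is double, assertion met by built-in double addition
}
\end{cfa}
A call with a type that has no matching @?+?@ at the call site would fail the
assertion check and be rejected.
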
Note, a function named @do_once@ is not required in the scope of @do_twice@ to
compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
allows local replacement of the most specific parametric functions needed for a
call.
\begin{cfa}
void do_once(double y) { ... }
int quadruple(int x) {
	void do_once(int y) { y = y * 2; } // replace global do_once
	do_twice(x); // use local do_once
	do_twice(x + 1.5); // use global do_once
	return x;
}
\end{cfa}
Specifically, the compiler deduces that @do_twice@'s @T@ is an integer from the
argument @x@. It then looks for the most \emph{specific} definition matching the
assertion, which is the nested integral @do_once@ defined within the
function. The matched assertion function is then passed as a function pointer
to @do_twice@ and called within it. The global definition of @do_once@ is used
for the second call because the floating-point argument is a better match.

To avoid typing long lists of assertions, constraints can be collected into a
convenient package called a @trait@, which can then be used in an assertion
instead of the individual constraints.
\begin{cfa}
trait done_once(T) {
	void do_once(T);
};
\end{cfa}
and the @forall@ list in the previous example is replaced with the trait.
\begin{cfa}
forall(T | done_once(T))
\end{cfa}
In general, a trait can contain an arbitrary number of assertions, both
functions and variables, and is usually used to create a shorthand for, and
give descriptive names to, common groupings of assertions describing a certain
functionality, like @sumable@, @listable@, \etc.

Polymorphic structures and unions are defined by qualifying the aggregate type
with @forall@. The type variables work the same except they are used in field
declarations instead of parameters, returns, and local variable declarations.
\begin{cfa}
forall(dtype T)
struct node {
	node(T) * next;
	T * data;
};
node(int) inode;
\end{cfa}
The generic type @node(T)@ is an example of a polymorphic type usage. Like \Cpp
template usage, a polymorphic type usage must specify a type parameter.

There are many other polymorphism features in \CFA but these are the ones used
by the exception system.

\section{Control Flow}
\CFA has a number of advanced control-flow features: @generator@, @coroutine@, @monitor@, @mutex@ parameters, and @thread@.
The two features that interact with
the exception system are @coroutine@ and @thread@; they and their supporting
constructs are described here.

\subsection{Coroutine}
A coroutine is a type with associated functions, where the functions are not
required to finish execution when control is handed back to the caller. Instead
they may suspend execution at any time and be resumed later at the point of
last suspension. (Generators are stackless and coroutines are stackful.) These
types are not concurrent but share some similarities along with common
underpinnings, so they are combined with the \CFA threading library. Further
discussion in this section only refers to the coroutine because generators are
similar.

In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
aggregate type like @struct@, except the structure is implicitly modified by
the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
restricted by the type system to types that provide this special trait. The
coroutine structure acts as the interface between callers and the coroutine,
and its fields are used to pass information in and out of coroutine interface
functions.

Here is a simple example where a single field is used to pass (communicate) the
next number in a sequence.
\begin{cfa}
coroutine CountUp {
	unsigned int next;
};
CountUp countup;
\end{cfa}
Each coroutine has a @main@ function, which takes a reference to a coroutine
object and returns @void@.
\begin{cfa}[numbers=left]
void main(CountUp & this) {
	for (unsigned int up = 0 ; true ; ++up) {
		this.next = up;
		suspend;$\label{suspend}$
	}
}
\end{cfa}
In this function, or functions called by this function (helper functions), the
@suspend@ statement is used to return execution to the coroutine's caller
without terminating the coroutine's function.

A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
The first resume calls the @main@ function at the top. Thereafter, resume calls
continue a coroutine in the last suspended function after the @suspend@
statement, in this case @main@ line~\ref{suspend}. The @resume@ function takes
a reference to the coroutine structure and returns the same reference. The
return value allows easy access to communication variables defined in the
coroutine object. For example, the @next@ value for coroutine object @countup@
is both generated and collected in the single expression:
@resume(countup).next@.

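A driver for this coroutine might look like the following sketch. It assumes
the @CountUp@ type and a @main@ that stores each successive count in
@this.next@ before suspending, and it assumes the library headers
@coroutine.hfa@ and @fstream.hfa@ (for @sout@).
\begin{cfa}
#include <fstream.hfa>
#include <coroutine.hfa>
int main() {
	CountUp countup;
	for ( int i = 0; i < 5; i += 1 ) {
		// each resume runs the coroutine up to its next suspend
		sout | resume( countup ).next; // yields 0, 1, 2, 3, 4 in turn
	}
}
\end{cfa}
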
\subsection{Monitor and Mutex Parameter}
Concurrency does not guarantee ordering; without ordering, results are
non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
(mutual exclusion) parameters. A monitor is another kind of aggregate, where
the compiler implicitly inserts a lock and instances are compatible with
@mutex@ parameters.

A function that requires deterministic (ordered) execution acquires mutual
exclusion on a monitor object by qualifying an object reference parameter with
@mutex@.
\begin{cfa}
void example(MonitorA & mutex argA, MonitorB & mutex argB);
\end{cfa}
When the function is called, it implicitly acquires the monitor lock for all of
the mutex parameters without deadlock. This semantics means all functions with
the same mutex type(s) are part of a critical section for objects of that type
and only one runs at a time.

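As a sketch of these semantics, consider a hypothetical @Account@ monitor; the
type and both functions are assumptions for illustration, as is the library
header @monitor.hfa@.
\begin{cfa}
#include <monitor.hfa>
monitor Account { int balance; };
void deposit( Account & mutex acct, int amount ) {
	acct.balance += amount; // runs with acct's lock held
}
void transfer( Account & mutex from, Account & mutex to, int amount ) {
	// both locks are acquired on entry, so the two updates appear atomic
	from.balance -= amount;
	to.balance += amount;
}
\end{cfa}
Two threads calling @transfer@ with the same pair of accounts in opposite
orders do not deadlock, because the locks for all @mutex@ parameters are
acquired together on entry.
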
\subsection{Thread}
Functions, generators, and coroutines are sequential so there is only a single
(but potentially sophisticated) execution path in a program. Threads introduce
multiple execution paths that continue independently.

For threads to work safely with objects, mutual exclusion is required, using
monitors and mutex parameters. For threads to work safely with other threads,
mutual exclusion is also required, in the form of a communication rendezvous,
which also supports internal synchronization as for mutex objects. For
exceptions, only two basic thread operations are important: fork and join.

Threads are created like coroutines with an associated @main@ function:
\begin{cfa}
thread StringWorker {
	const char * input;
	int result;
};
void main(StringWorker & this) {
	const char * localCopy = this.input;
	// ... do some work, perhaps hashing the string ...
	this.result = result;
}
{
	StringWorker stringworker; // fork thread running in "main"
} // <- implicitly join with thread / wait for completion
\end{cfa}
The thread main is where a new thread starts execution after a fork operation
and then the thread continues executing until it is finished. If another thread
joins with an executing thread, it waits until the executing main completes
execution. In other words, everything a thread does is between a fork and join.

From the outside, this behaviour is accomplished through creation and
destruction of a thread object. Implicitly, fork happens after a thread
object's constructor is run and join happens before the destructor runs. Join
can also be specified explicitly using the @join@ function to wait for a
thread's completion independently from its deallocation (\ie destructor
call). If @join@ is called explicitly, the destructor does not implicitly join.
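
An explicit join might be used to read a result before the thread object is
deallocated. The following sketch reuses the @StringWorker@ thread type; the
variable names are hypothetical.
\begin{cfa}
{
	StringWorker worker; // fork: thread starts running main( worker )
	// ... the creating thread continues independently ...
	join( worker ); // explicit join: wait until main( worker ) completes
	int answer = worker.result; // read the result after completion
} // <- destructor runs without joining again
\end{cfa}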
