Context Navigation

source: doc/theses/fangren_yu_COOP_S20/Report.tex @ cebfcb8

ADTarm-ehast-experimentalenumforall-pointer-decayjacob/cs343-translationnew-ast-unique-exprpthread-emulationqualifiedEnum

Last change on this file since cebfcb8 was cebfcb8, checked in by Peter A. Buhr <pabuhr@…>, 4 years ago
convert Fangren's co-op report to latex and proofread
Property mode set to `100644`
File size: 28.3 KB

Line
1	\documentclass[twoside,12pt]{article}
2
3	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4
5	% Latex packages used in the document.
6	\usepackage{fullpage,times,comment}
7	\usepackage{epic,eepic}
8	\usepackage{upquote} % switch curled `'" to straight
9	\usepackage{calc}
10	\usepackage{varioref} % extended references
11	\usepackage[labelformat=simple,aboveskip=0pt,farskip=0pt]{subfig}
12	\renewcommand{\thesubfigure}{\alph{subfigure})}
13	\usepackage{latexsym} % \Box glyph
14	\usepackage{mathptmx} % better math font with "times"
15	\usepackage[usenames]{color}
16	\input{common} % common CFA document macros
17	\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
18	\usepackage{breakurl}
19
20	\usepackage[pagewise]{lineno}
21	\renewcommand{\linenumberfont}{\scriptsize\sffamily}
22
23	% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
24	% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
25	% AFTER HYPERREF.
26	\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
27	\newcommand{\NOTE}{\textbf{NOTE}}
28	\newcommand{\TODO}[1]{{\color{Purple}#1}}
29
30	\setlength{\topmargin}{-0.45in} % move running title into header
31	\setlength{\headsep}{0.25in}
32
33	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
34
35	\CFADefaults
36	\lstset{
37	language=C++, % make C++ the default language
38	}% lstset
39	\lstnewenvironment{C++}[1][] % use C++ style
40	{\lstset{language=C++,moredelim=**[is][\color{red}]{@}{@},#1}}{}
41
42	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
43
44	\setcounter{secnumdepth}{3} % number subsubsections
45	\setcounter{tocdepth}{3} % subsubsections in table of contents
46	\makeindex
47
48	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
49
50	\title{\Huge
51	cfa-cc Developer's Reference
52	}% title
53
54	\author{\LARGE
55	Fangren Yu
56	}% author
57
58	\date{
59	\today
60	}% date
61
62	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
63
64	\begin{document}
65	\pagestyle{headings}
66	% changed after setting pagestyle
67	\renewcommand{\sectionmark}[1]{\markboth{\thesection\quad #1}{\thesection\quad #1}}
68	\renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}}
69	\pagenumbering{roman}
70	\linenumbers % comment out to turn off line numbering
71
72	\maketitle
73	\pdfbookmark[1]{Contents}{section}
74	\tableofcontents
75
76	\clearpage
77	\thispagestyle{plain}
78	\pagenumbering{arabic}
79
80
81	\section{Overview}
82
83	cfa-cc is the reference compiler for the \CFA programming language, which is a non-object-oriented extension to C.
84	\CFA attempts to introduce productive modern programming language features to C while maintaining as much backward-compatibility as possible, so that most existing C programs can seamlessly work with \CFA.
85
86	Since the \CFA project dates back to the early 2000s, and only restarted in the past few years, there is a significant amount of legacy code in the current compiler codebase with little documentation.
87	The lack of documentation makes it difficult to develop new features from the current implementation and diagnose problems.
88
89	Currently, the \CFA team is also facing poor compiler performance.
90	For the development of a new programming language, writing standard libraries is an important component.
91	The slow compiler causes building of the library files to take tens of minutes, making iterative development and testing almost impossible.
92	There is an ongoing effort to rewrite the core data-structure of the compiler to overcome the performance issue, but many bugs have appeared during this work, and lack of documentation is hampering debugging.
93
94	This developer's reference manual begins the documentation and should be continuously im\-proved until it eventually covers the entire compiler codebase.
95	For now, the focus is mainly on the parts being rewritten, and also the primary performance bottleneck, namely the resolution algorithm.
96	Its aimed is to provide new project developers with guidance in understanding the codebase, and clarify the purpose and behaviour of certain functions that are not mentioned in the previous \CFA research papers~\cite{Bilson03,Ditchfield92,Moss19}.
97
98
99	\section{Compiler Framework}
100
101	\CFA source code is first transformed into an abstract syntax tree (AST) by the parser before analyzed by the compiler.
102
103
104	\subsection{AST Representation}
105
106
107	There are 4 major categories of AST nodes used by the compiler, along with some derived structures.
108
109	\subsubsection{Declaration Nodes}
110
111	A declaration node represents either of:
112	\begin{itemize}
113	\item
114	type declaration: @struct@, @union@, @typedef@ or type parameter \TODO{(see Appendix A.3)}
115	\item
116	variable declaration
117	\item
118	function declaration
119	\end{itemize}
120	Declarations are introduced by standard C declarations, with the usual scoping rules.
121	In addition, declarations can also be qualified by the \lstinline[language=CFA]@forall@ clause (which is the origin of \CFA's name):
122	\begin{cfa}
123	forall ( <$\emph{TypeParameterList}$> \| <$\emph{AssertionList}$> )
124	$\emph{declaration}$
125	\end{cfa}
126	Type parameters in \CFA are similar to \CC template type parameters.
127	The \CFA declaration
128	\begin{cfa}
129	forall (dtype T) ...
130	\end{cfa}
131	behaves similarly to the \CC template declaration
132	\begin{C++}
133	template <typename T> ...
134	\end{C++}
135
136	Assertions are a distinctive feature of \CFA, similar to \emph{interfaces} in D and Go, and \emph{traits} in Rust.
137	Contrary to the \CC template where arbitrary functions and operators can be used in a template definition, in a \CFA parametric function, operations on parameterized types must be declared in assertions.
138	Consider the following \CC template:
139	\begin{C++}
140	@template@ forall<typename T> T foo( T t ) {
141	return t + t * t;
142	}
143	\end{C++}
144	where there are no explicit requirements on the type @T@.
145	Therefore, the \CC compiler must deduce what operators are required during textual (macro) expansion of the template at each usage.
146	As a result, templates cannot be compiled.
147	\CFA assertions specify restrictions on type parameters:
148	\begin{cfa}
149	forall( dtype T \| @{ T ?+?( T, T ); T ?*?( T, T ) }@ ) int foo ( T t ) {
150	return t + t * t;
151	}
152	\end{cfa}
153	Assertions are written using the usual \CFA function declaration syntax.
154	Only types with operators ``@+@'' and ``@*@'' work with this function, and the function prototype is sufficient to allow separate compilation.
155
156	Type parameters and assertions are used in the following compiler data-structures.
157
158
159	\subsubsection{Type Nodes}
160
161	Type nodes represent the type of an object or expression.
162	Named types reference the corresponding type declarations.
163	The type of a function is its function pointer type (same as standard C).
164	With the addition of type parameters, named types may contain a list of parameter values (actual parameter types).
165
166
167	\subsubsection{Statement Nodes}
168
169	Statement nodes represent the executable statements in the program, including basic expression statements, control flows and blocks.
170	Local declarations (within a block statement) are represented as declaration statements.
171
172
173	\subsubsection{Expression Nodes}
174
175	Some expressions are represented differently before and after the resolution stage:
176	\begin{itemize}
177	\item
178	Name expressions: @NameExpr@ pre-resolution, @VariableExpr@ post-resolution
179	\item
180	Member expressions: @UntypedMemberExpr@ pre-resolution, @MemberExpr@ post-resolution
181	\item
182	\begin{sloppypar}
183	Function call expressions (including overloadable operators): @UntypedExpr@ pre-resolution, @ApplicationExpr@ post-resolution
184	\end{sloppypar}
185	\end{itemize}
186	The pre-resolution representation contains only the symbols.
187	Post-resolution links them to the actual variable and function declarations.
188
189
190	\subsection{Compilation Passes}
191
192	Compilation steps are implemented as passes, which follows a general structural recursion pattern on the syntax tree.
193
194	The basic workflow of compilation passes follows preorder and postorder traversal on the AST data-structure, implemented with visitor pattern, and can be loosely described with the following pseudocode:
195	\begin{C++}
196	Pass::visit( node_t node ) {
197	previsit( node );
198	if ( visit_children )
199	for each child of node:
200	child.accept( this );
201	postvisit( node );
202	}
203	\end{C++}
204	Operations in @previsit@ happen in preorder (top to bottom) and operations in @postvisit@ happen in postorder (bottom to top).
205	The precise order of recursive operations on child nodes can be found in @Common/PassVisitor.impl.h@ (old) and @AST/Pass.impl.hpp@ (new).
206
207	Implementations of compilation passes follow certain conventions:
208	\begin{itemize}
209	\item
210	Passes \textbf{should not} directly override the visit method (Non-virtual Interface principle);
211	if a pass desires different recursion behaviour, it should set @visit_children@ to false and perform recursive calls manually within previsit or postvisit procedures.
212	To enable this option, inherit from the @WithShortCircuiting@ mixin.
213	\item
214	previsit may mutate the node but \textbf{must not} change the node type or return @nullptr@.
215	\item
216	postvisit may mutate the node, reconstruct it to a different node type, or delete it by returning @nullptr@.
217	\item
218	If the previsit or postvisit method is not defined for a node type, the step is skipped.
219	If the return type is declared as @void@, the original node is returned by default.
220	These behaviours are controlled by template specialization rules;
221	see @Common/PassVisitor.proto.h@ (old) and @AST/@ @Pass.proto.hpp@ (new) for details.
222	\end{itemize}
223	Other useful mixin classes for compilation passes include:
224	\begin{itemize}
225	\item
226	@WithGuards@ allows saving and restoring variable values automatically upon entering/exiting the current node.
227	\item
228	@WithVisitorRef@ creates a wrapped entity for the current pass (the actual argument passed to recursive calls internally) for explicit recursion, usually used together with @WithShortCircuiting@.
229	\item
230	@WithSymbolTable@ gives a managed symbol table with built-in scoping-rule handling (\eg on entering and exiting a block statement)
231	\end{itemize}
232	\NOTE: If a pass extends the functionality of another existing pass, due to \CC overloading resolution rules, it \textbf{must} explicitly introduce the inherited previsit and postvisit procedures to its own scope, or otherwise they are not picked up by template resolution:
233	\begin{C++}
234	class Pass2: public Pass1 {
235	@using Pass1::previsit;@
236	@using Pass1::postvisit;@
237	// new procedures
238	}
239	\end{C++}
240
241
242	\subsection{Data Structure Change (new-ast)}
243
244	It has been observed that excessive copying of syntax tree structures accounts for a majority of computation cost and significantly slows down the compiler.
245	In the previous implementation of the syntax tree, every internal node has a unique parent;
246	therefore all copies are required to duplicate the entire subtree.
247	A new, experimental re-implementation of the syntax tree (source under directory @AST/@ hereby referred to as ``new-ast'') attempts to overcome this issue with a functional approach that allows sharing of common sub-structures and only makes copies when necessary.
248
249	The core of new-ast is a customized implementation of smart pointers, similar to @std::shared_ptr@ and @std::weak_ptr@ in the \CC standard library.
250	Reference counting is used to detect sharing and allowing certain optimizations.
251	For a purely functional (immutable) data-structure, all mutations are modelled by shallow copies along the path of mutation.
252	With reference counting optimization, unique nodes are allowed to be mutated in place.
253	This however, may potentially introduce some complications and bugs;
254	a few issues are discussed near the end of this section.
255
256
257	\subsubsection{Source: \lstinline{AST/Node.hpp}}
258
259	Class @ast::Node@ is the base class of all new-ast node classes, which implements reference counting mechanism.
260	Two different counters are recorded: ``strong'' reference count for number of nodes semantically owning it;
261	``weak'' reference count for number of nodes holding a mere reference and only need to observe changes.
262	Class @ast::ptr_base@ is the smart pointer implementation and also takes care of resource management.
263
264	Direct access through the smart pointer is read-only.
265	A mutable access should be obtained by calling @shallowCopy@ or mutate as below.
266
267	Currently, the weak pointers are only used to reference declaration nodes from a named type, or a variable expression.
268	Since declaration nodes are intended to denote unique entities in the program, weak pointers always point to unique (unshared) nodes.
269	This property may change in the future, and weak references to shared nodes may introduce some problems;
270	see mutate function below.
271
272	All node classes should always use smart pointers in structure definitions versus raw pointers.
273	Function
274	\begin{C++}
275	void ast::Node::increment(ref_type ref)
276	\end{C++}
277	increments this node's strong or weak reference count.
278	Function
279	\begin{C++}
280	void ast::Node::decrement(ref_type ref, bool do_delete = true)
281	\end{C++}
282	decrements this node's strong or weak reference count.
283	If strong reference count reaches zero, the node is deleted.
284	\NOTE: Setting @do_delete@ to false may result in a detached node.
285	Subsequent code should manually delete the node or assign it to a strong pointer to prevent memory leak.
286
287	Reference counting functions are internally called by @ast::ptr_base@.
288	Function
289	\begin{C++}
290	template<typename node_t>
291	node_t * shallowCopy(const node_t * node)
292	\end{C++}
293	returns a mutable, shallow copy of node: all child pointers are pointing to the same child nodes.
294	Function
295	\begin{C++}
296	template<typename node_t>
297	node_t * mutate(const node_t * node)
298	\end{C++}
299	returns a mutable pointer to the same node, if the node is unique (strong reference count is 1);
300	otherwise, it returns @shallowCopy(node)@.
301	It is an error to mutate a shared node that is weak-referenced.
302	Currently this does not happen.
303	A problem may appear once weak pointers to shared nodes (\eg expression nodes) are used;
304	special care is needed.
305
306	\NOTE: This naive uniqueness check may not be sufficient in some cases.
307	A discussion of the issue is presented at the end of this section.
308	Functions
309	\begin{C++}
310	template<typename node_t, typename parent_t, typename field_t, typename assn_t>
311	const node_t * mutate_field(const node_t * node, field_t parent_t::* field, assn_t && val)
312	\end{C++}
313	\begin{C++}
314	template<typename node_t, typename parent_t, typename coll_t, typename ind_t,
315	typename field_t>
316	const node_t * mutate_field_index(const node_t * node, coll_t parent_t::* field, ind_t i,
317	field_t && val)
318	\end{C++}
319	are helpers for mutating a field on a node using pointer to a member function (creates shallow copy when necessary).
320
321
322	\subsubsection{Issue: Undetected Sharing}
323
324	The @mutate@ behaviour described above has a problem: deeper shared nodes may be
325	mistakenly considered as unique. \VRef[Figure]{f:DeepNodeSharing} shows how the problem could arise:
326	\begin{figure}
327	\centering
328	\input{DeepNodeSharing}
329	\caption{Deep sharing of nodes}
330	\label{f:DeepNodeSharing}
331	\end{figure}
332	Given the tree rooted at P1, which is logically the chain P1-A-B, and P2 is irrelevant, assume @mutate(B)@ is called.
333	The algorithm considers B as unique since it is only directly owned by A.
334	However, the other tree P2-A-B indirectly shares the node B and is therefore wrongly mutated.
335
336	To partly address this problem, if the mutation is called higher up the tree, a chain mutation helper can be used.
337
338	\subsubsection{Source: \lstinline{AST/Chain.hpp}}
339
340	Function
341	\begin{C++}
342	template<typename node_t, Node::ref_type ref_t>
343	auto chain_mutate(ptr_base<node_t, ref_t> & base)
344	\end{C++}
345	returns a chain mutator handle that takes pointer-to-member to go down the tree, while creating shallow copies as necessary;
346	see @struct _chain_mutator@ in the source code for details.
347
348	For example, in the above diagram, if mutation of B is wanted while at P1, the call using @chain_mutate@ looks like the following:
349	\begin{C++}
350	chain_mutate(P1.a)(&A.b) = new_value_of_b;
351	\end{C++}
352	\NOTE: if some node in chain mutate is shared (therefore shallow copied), it implies that every node further down is also copied, thus correctly executing the functional mutation algorithm.
353	This example code creates copies of both A and B and performs mutation on the new nodes, so that the other tree P2-A-B is untouched.
354	However, if a pass traverses down to node B and performs mutation, for example, in @postvisit(B)@, information on sharing higher up is lost.
355	Since the new-ast structure is only in experimental use with the resolver algorithm, which mostly rebuilds the tree bottom-up, this issue does not actually happen.
356	It should be addressed in the future when other compilation passes are migrated to new-ast and many of them contain procedural mutations, where it might cause accidental mutations to other logically independent trees (\eg common sub-expression) and become a bug.
357
358
359	\section{Compiler Algorithm Documentation}
360
361	This compiler algorithm documentation covers most of the resolver, data structures used in variable and expression resolution, and a few directly related passes.
362	Later passes involving code generation are not included yet;
363	documentation for those will be done latter.
364
365
366	\subsection{Symbol Table}
367
368	\NOTE: For historical reasons, the symbol-table data-structure is called @indexer@ in the old implementation.
369	Hereby, the name is changed to @SymbolTable@.
370	The symbol table stores a mapping from names to declarations, implements a similar name-space separation rule, and provides the same scoping rules as standard C.\footnote{ISO/IEC 9899:1999, Sections 6.2.1 and 6.2.3.}
371	The difference in name-space rule is that @typedef@ aliases are no longer considered ordinary identifiers.
372	In addition to C tag-types (@struct@, @union@, @enum@), \CFA introduces another tag type, @trait@, which is a named collection of assertions.
373
374
375	\subsubsection{Source: \lstinline{AST/SymbolTable.hpp}}
376
377	\TODO{Add something here}
378
379
380	\subsubsection{Source: \lstinline{SymTab/Indexer.h}}
381
382	Function
383	\begin{C++}
384	SymbolTable::addId(const DeclWithType * decl)
385	\end{C++}
386	provides name mangling of identifiers, since \CFA allows overloading of variables and functions.
387	The mangling scheme is closely based on the Itanium \CC ABI,\footnote{\url{https://itanium-cxx-abi.github.io/cxx-abi/abi.html}, Section 5.1} while making adaptations to \CFA specific features, mainly assertions and overloaded variables by type.
388
389	Naming conflicts are handled by mangled names;
390	lookup by name returns a list of declarations with the same identifier name.
391	Functions
392	\begin{C++}
393	SymbolTable::addStruct(const StructDecl * decl)
394	SymbolTable::addUnion(const UnionDecl * decl)
395	SymbolTable::addEnum(const EnumDecl * decl)
396	SymbolTable::addTrait(const TraitDecl * decl)
397	\end{C++}
398	add a tag-type declaration to the symbol table.
399	Function
400	\begin{C++}
401	SymbolTable::addType(const NamedTypeDecl * decl)
402	\end{C++}
403	adds a @typedef@ alias to the symbol table.
404
405	\textbf{C Incompatibility Note}: Since \CFA allows using @struct@, @union@ and @enum@ type-names without a prefix keyword, as in \CC, @typedef@ names and tag-type names cannot be disambiguated by syntax rules.
406	Currently the compiler puts them together and disallows collision.
407	The following program is valid C but invalid \CFA (and \CC):
408	\begin{C++}
409	struct A {};
410	typedef int A; // gcc: ok, cfa: Cannot redefine typedef A
411	struct A sa; // C disambiguates via struct prefix
412	A ia;
413	\end{C++}
414	In practices, such usage is extremely rare, and hence, this change (as in \CC) has minimal impact on existing C programs.
415	The declaration
416	\begin{C++}
417	struct A {};
418	typedef struct A A; // A is an alias for struct A
419	A a;
420	struct A b;
421	\end{C++}
422	is not an error because the alias name is identical to the original.
423	Finally, the following program is allowed in \CFA:
424	\begin{C++}
425	typedef int A;
426	void A(); // name mangled
427	// gcc: A redeclared as different kind of symbol, cfa: ok
428	\end{C++}
429	because the function name is mangled.
430
431
432	\subsection{Type Environment and Unification}
433
434	The following core ideas underlie the parametric type-resolution algorithm.
435	A type environment organizes type parameters into \textbf{equivalent classes} and maps them to actual types.
436	Unification is the algorithm that takes two (possibly parametric) types and parameter mappings, and attempts to produce a common type by matching information in the type environments.
437
438	The unification algorithm is recursive in nature and runs in two different modes internally:
439	\begin{itemize}
440	\item
441	Exact unification mode requires equivalent parameters to match perfectly.
442	\item
443	Inexact unification mode allows equivalent parameters to be converted to a common type.
444	\end{itemize}
445	For a pair of matching parameters (actually, their equivalent classes), if either side is open (not bound to a concrete type yet), they are combined.
446
447	Within the inexact mode, types are allowed to differ on their cv-qualifiers (\eg @const@, @volatile@, \etc);
448	additionally, if a type never appear either in a parameter list or as the base type of a pointer, it may also be widened (\ie safely converted).
449	As \CFA currently does not implement subclassing as in object-oriented languages, widening conversions are only on the primitive types, \eg conversion from @int@ to @long int@.
450
451	The need for two unification modes comes from the fact that parametric types are considered compatible only if all parameters are exactly the same (not just compatible).
452	Pointer types also behaves similarly;
453	in fact, they may be viewed as a primitive kind of parametric types.
454	@int @ and @long @ are different types, just like @vector(int)@ and @vector(long)@ are, for the parametric type @*(T)@ / @vector(T)@, respectively.
455
456	The resolver uses the following @public@ functions:\footnote{
457	Actual code also tracks assertions on type parameters; those extra arguments are omitted here for conciseness.}
458
459
460	\subsubsection{Source: \lstinline{ResolvExpr/Unify.cc}}
461
462	Function
463	\begin{C++}
464	bool unify(const Type * type1, const Type * type2, TypeEnvironment & env,
465	OpenVarSet & openVars, const SymbolTable & symtab, Type *& commonType)
466	\end{C++}
467	returns a boolean indicating if the unification succeeds or fails after attempting to unify @type1@ and @type2@ within current type environment.
468	If the unify succeeds, @env@ is modified by combining the equivalence classes of matching parameters in @type1@ and @type2@, and their common type is written to @commonType@.
469	If the unify fails, nothing changes.
470	Functions
471	\begin{C++}
472	bool typesCompatible(const Type * type1, const Type * type2, const SymbolTable & symtab,
473	const TypeEnvironment & env)
474	bool typesCompatibleIgnoreQualifiers(const Type * type1, const Type * type2,
475	const SymbolTable & symtab, const TypeEnvironment & env)
476	\end{C++}
477	return a boolean indicating if types @type1@ and @type2@ can possibly be the same type.
478	The second version ignores the outermost cv-qualifiers if present.\footnote{
479	In \lstinline@const int * const@, only the second \lstinline@const@ is ignored.}
480	These function have no side effects.
481
482	\NOTE: No attempt is made to widen the types (exact unification is used), although the function names may suggest otherwise, \eg @typesCompatible(int, long)@ returns false.
483
484
485	\subsection{Expression Resolution}
486
487	The design of the current version of expression resolver is outlined in the Ph.D.\ thesis by Aaron Moss~\cite{Moss19}.
488	A summary of the resolver algorithm for each expression type is presented below.
489
490	All overloadable operators are modelled as function calls.
491	For a function call, interpretations of the function and arguments are found recursively.
492	Then the following steps produce a filtered list of valid interpretations:
493	\begin{enumerate}
494	\item
495	From all possible combinations of interpretations of the function and arguments, those where argument types may be converted to function parameter types are considered valid.
496	\item
497	Valid interpretations with the minimum sum of argument costs are kept.
498	\item
499	\label{p:argcost}
500	Argument costs are then discarded; the actual cost for the function call expression is the sum of conversion costs from the argument types to parameter types.
501	\item
502	\label{p:returntype}
503	For each return type, the interpretations with satisfiable assertions are then sorted by actual cost computed in step~\ref{p:argcost}.
504	If for a given type, the minimum cost interpretations are not unique, that return type is ambiguous.
505	If the minimum cost interpretation is unique but contains an ambiguous argument, it is also ambiguous.
506	\end{enumerate}
507	Therefore, for each return type, the resolver produces:
508	\begin{itemize}
509	\item
510	no alternatives
511	\item
512	a single valid alternative
513	\item
514	an ambiguous alternative
515	\end{itemize}
516	\NOTE: an ambiguous alternative may be discarded at the parent expressions because a different return type matches better for the parent expressions.
517
518	The \emph{non}-overloadable expressions in \CFA are: cast expressions, address-of (unary @&@) expressions, short-circuiting logical expressions (@&&@, @\|\|@) and ternary conditional expression (@?:@).
519
520	For a cast expression, the convertible argument types are kept.
521	Then the result is selected by lowest argument cost, and further by lowest conversion cost to target type.
522	If the lowest cost is still not unique or an ambiguous argument interpretation is selected, the cast expression is ambiguous.
523	In an expression statement, the top level expression is implicitly cast to @void@.
524
525	For an address-of expression, only lvalue results are kept and the minimum cost is selected.
526
527	For logical expressions @&&@ and @\|\|@, arguments are implicitly cast to @bool@, and follow the rules fr cast expression above.
528
529	For the ternary conditional expression, the condition is implicitly cast to @bool@, and the branch expressions must have compatible types.
530	Each pair of compatible branch expression types produce a possible interpretation, and the cost is defined as the sum of the expression costs plus the sum of conversion costs to the common type.
531
532	\TODO{Write a specification for expression costs.}
533
534
535	\subsection{Assertion Satisfaction}
536
537	The resolver tries to satisfy assertions on expressions only when it is needed: either while selecting from multiple alternatives of a same result type for a function call (step \ref{p:returntype} of resolving function calls) or upon reaching the top level of an expression statement.
538
539	Unsatisfiable alternatives are discarded.
540	Satisfiable alternatives receive \textbf{implicit parameters}: in \CFA, parametric functions may be separately compiled, as opposed to \CC templates which are only compiled at instantiation.
541	Given the parametric function-definition:
542	\begin{C++}
543	forall (otype T \| {void foo(T);})
544	void bar (T t) { foo(t); }
545	\end{C++}
546	the function @bar@ does not know which @foo@ to call when compiled without knowing the call site, so it requests a function pointer to be passed as an extra argument.
547	At the call site, implicit parameters are automatically inserted by the compiler.
548
549	\TODO{Explain how recursive assertion satisfaction and polymorphic recursion work.}
550
551
552	\section{Tests}
553
554	\subsection{Test Suites}
555
556	Automatic test suites are located under the @tests/@ directory.
557	A test case consists of an input CFA source file (suffix @.cfa@), and an expected output file located in the @tests/.expect/@ directory, with the same file name ending with suffix @.txt@.
558	For example, the test named @tests/tuple/tupleCast.cfa@ has the following files, for example:
559	\begin{C++}
560	tests/
561	tuple/
562	.expect/
563	tupleCast.txt
564	tupleCast.cfa
565	\end{C++}
566	If compilation fails, the error output is compared to the expect file.
567	If the compilation succeeds but does not generate an executable, the compilation output is compared to the expect file.
568	If the compilation succeeds and generates an executable, the executable is run and its output is compared to the expect file.
569	To run the tests, execute the test script @test.py@ under the @tests/@ directory, with a list of test names to be run, or @--all@ (or @make all-tests@) to run all tests.
570	The test script reports test cases fail/success, compilation time and program run time.
571	To see all the options available for @test.py@ using the @--help@ option.
572
573
574	\subsection{Performance Reports}
575
576	To turn on performance reports, pass the @-XCFA -S@ flag to the compiler.
577	Three kinds of performance reports are available:
578	\begin{enumerate}
579	\item
580	Time, reports time spent in each compilation step
581	\item
582	Heap, reports number of dynamic memory allocations, total bytes allocated, and
583	maximum heap memory usage
584	\item
585	Counters, for certain predefined statistics; counters can be registered anywhere in
586	the compiler as a static object, and the interface can be found at
587	@Common/Stats/Counter.h@.
588	\end{enumerate}
589	It is suggested to run performance tests with optimization (@g++@ flag @-O3@).
590
591
592	\bibliographystyle{plain}
593	\bibliography{pl}
594
595
596	\end{document}
597
598	% Local Variables: %
599	% tab-width: 4 %
600	% fill-column: 100 %
601	% compile-command: "make" %
602	% End: %

Note: See TracBrowser for help on using the repository browser.

Download in other formats: