Index: doc/theses/fangren_yu_COOP_S20/Report.tex
===================================================================
--- doc/theses/fangren_yu_COOP_S20/Report.tex	(revision d1864da6101f2b7824ef52fbb50040da9aeac0c1)
+++ doc/theses/fangren_yu_COOP_S20/Report.tex	(revision a75d464d464ddcf437d980074447bb99eb46164d)
@@ -15,26 +15,28 @@
 \usepackage[usenames]{color}
 \input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\newcommand{\NOTE}{\textbf{NOTE}}
+
+\setlength{\topmargin}{-0.45in}							% move running title into header
+\setlength{\headsep}{0.25in}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+\CFADefaults
 \lstset{
+language=C++,											% make C++ the default language
 escapechar=\$,											% LaTeX escape in CFA code
 moredelim=**[is][\color{red}]{`}{`},
 }% lstset
 \lstMakeShortInline@%
-\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
-\usepackage{breakurl}
-
-\usepackage[pagewise]{lineno}
-\renewcommand{\linenumberfont}{\scriptsize\sffamily}
-
-% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
-% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
-% AFTER HYPERREF.
-\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
-
-\setlength{\topmargin}{-0.45in}							% move running title into header
-\setlength{\headsep}{0.25in}
-
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-
-\CFAStyle												% use default CFA format-style
 \lstnewenvironment{C++}[1][]                            % use C++ style
 {\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`},#1}}
@@ -82,11 +84,11 @@
 \section{Overview}
 
-cfa-cc is the reference compiler for the Cforall programming language, which is a non-
+cfa-cc is the reference compiler for the \CFA programming language, which is a non-
 object-oriented extension to C.
-Cforall attempts to introduce productive modern programming language features to C
+\CFA attempts to introduce productive modern programming language features to C
 while maintaining as much backward-compatibility as possible, so that most existing C
-programs can seamlessly work with Cforall.
-
-Since the Cforall project was dated back to the early 2000s, and only restarted in the past
+programs can seamlessly work with \CFA.
+
+Since the \CFA project was dated back to the early 2000s, and only restarted in the past
 few years, there is a significant amount of legacy code in the current compiler codebase,
 with little proper documentation available. This becomes a difficulty while developing new
@@ -94,5 +96,5 @@
 problems.
 
-Currently, the Cforall team is also facing another problem: bad compiler performance. For
+Currently, the \CFA team is also facing another problem: bad compiler performance. For
 the development of a new programming language, writing a standard library is an
 important part. The incompetence of the compiler causes building the library files to take
@@ -106,5 +108,5 @@
 performance bottleneck, namely the resolution algorithm. It is aimed to provide new
 developers to the project enough guidance and clarify the purposes and behavior of certain
-functions which are not mentioned in the previous Cforall research papers.
+functions which are not mentioned in the previous \CFA research papers.
 
 
@@ -132,10 +134,10 @@
 Declarations are introduced by standard C declarations, with the usual scoping rules.
 In addition, declarations can also be introduced by the forall clause (which is the origin
-of Cforall's name):
+of \CFA's name):
 \begin{cfa}
 forall (<$\emph{TypeParameterList}$> | <$\emph{AssertionList}$>)
 	$\emph{declaration}$
 \end{cfa}
-Type parameters in Cforall are similar to \CC template type parameters. The Cforall
+Type parameters in \CFA are similar to \CC template type parameters. The \CFA
 declaration
 \begin{cfa}
@@ -147,6 +149,6 @@
 \end{C++}
 
-Assertions are a distinctive feature of Cforall: contrary to the \CC template where
-arbitrary functions and operators can be used in a template definition, in a Cforall
+Assertions are a distinctive feature of \CFA: contrary to the \CC template where
+arbitrary functions and operators can be used in a template definition, in a \CFA
 parametric function, operations on parameterized types must be declared in assertions.
 
@@ -205,5 +207,5 @@
 tree data structure, implemented with visitor pattern, and can be loosely described with
 the following pseudocode:
-\begin{cfa}
+\begin{C++}
 Pass::visit (node_t node) {
 	previsit(node);
@@ -213,5 +215,5 @@
 	postvisit(node);
 }
-\end{cfa}
+\end{C++}
 Operations in previsit() happen in preorder (top to bottom) and operations in
 postvisit() happen in postorder (bottom to top). The precise order of recursive
@@ -247,10 +249,10 @@
 \item
 WithSymbolTable gives a managed symbol table with built-in scoping rule handling
-(e.g. on entering and exiting a block statement)
+(\eg on entering and exiting a block statement)
 \end{itemize}
-\textbf{NOTE}: If a pass extends the functionality of another existing pass, due to \CC overloading
+\NOTE: If a pass extends the functionality of another existing pass, due to \CC overloading
 resolution rules, it \textbf{must} explicitly introduce the inherited previsit and postvisit procedures
 to its own scope, or otherwise they will not be picked up by template resolution:
-\begin{cfa}
+\begin{C++}
 class Pass2: public Pass1 {
 	using Pass1::previsit;
@@ -258,5 +260,5 @@
 	// new procedures
 }
-\end{cfa}
+\end{C++}
 
 
@@ -272,5 +274,5 @@
 
 The core of new-ast is a customized implementation of smart pointers, similar to
-@std::shared_ptr@ and @std::weak_ptr@ in C++ standard library. Reference counting is
+@std::shared_ptr@ and @std::weak_ptr@ in \CC standard library. Reference counting is
 used to detect sharing and allows optimization. For a purely functional (a.k.a. immutable)
 data structure, all mutations are modelled by shallow copies along the path of mutation.
@@ -309,5 +311,5 @@
 Decrements this node's strong or weak reference count. If strong reference count reaches
 zero, the node is deleted by default.
-\textbf{NOTE}: Setting @do_delete@ to false may result in a detached node. Subsequent code should
+\NOTE: Setting @do_delete@ to false may result in a detached node. Subsequent code should
 manually delete the node or assign it to a strong pointer to prevent memory leak.
 Reference counting functions are internally called by @ast::ptr_base@.
@@ -325,8 +327,8 @@
 Otherwise, returns shallowCopy(node).
 It is an error to mutate a shared node that is weak-referenced. Currently this does not
-happen. The problem may appear once weak pointers to shared nodes (e.g. expression
+happen. The problem may appear once weak pointers to shared nodes (\eg expression
 nodes) are used; special care will be needed.
 
-\textbf{NOTE}: This naive uniqueness check may not be sufficient in some cases. A discussion of the
+\NOTE: This naive uniqueness check may not be sufficient in some cases. A discussion of the
 issue is presented at the end of this section.
 \begin{C++}
@@ -372,4 +374,272 @@
 source code for details.
 
+For example, in the above diagram, if mutation of B is wanted while at P1, the call using
+@chain_mutate@ looks like the following:
+\begin{C++}
+chain_mutate(P1.a)(&A.b) = new_value_of_b;
+\end{C++}
+Note that if some node in chain mutate is shared (therefore shallow copied), it implies that
+every node further down will also be copied, thus correctly executing the functional
+mutation algorithm. This example code creates copies of both A and B and performs
+mutation on the new nodes, so that the other tree P2-A-B is untouched.
+However, if a pass traverses down to node B and performs mutation, for example, in
+@postvisit(B)@, information on sharing higher up is lost. Since the new-ast structure is only in
+experimental use with the resolver algorithm, which mostly rebuilds the tree bottom-up,
+this issue does not actually happen. It should be addressed in the future when other
+compilation passes are migrated to new-ast and many of them contain procedural
+mutations, where it might cause accidental mutations to other logically independent trees
+(\eg common sub-expression) and become a bug.
+
+
+\vspace*{20pt} % FIX ME, spacing problem with this heading ???
+\section{Compiler Algorithm Documentation}
+
+This documentation currently covers most of the resolver, data structures used in variable
+and expression resolution, and a few directly related passes. Later passes involving code
+generation is not included yet; documentation for those will be done afterwards.
+
+\subsection{Symbol Table}
+
+\NOTE: For historical reasons, the symbol table data structure was called ``indexer'' in the
+old implementation. Hereby we will be using the name SymbolTable everywhere.
+The symbol table stores a mapping from names to declarations and implements a similar
+name space separation rule, and the same scoping rules in standard C.\footnote{ISO/IEC 9899:1999, Sections 6.2.1 and 6.2.3} The difference in
+name space rule is that typedef aliases are no longer considered ordinary identifiers.
+In addition to C tag types (struct, union, enum), \CFA introduces another tag type, trait,
+which is a named collection of assertions.
+
+\subsubsection{Source: AST/SymbolTable.hpp}
+
+\subsubsection{Source: SymTab/Indexer.h}
+
+\begin{C++}
+SymbolTable::addId(const DeclWithType * decl)
+\end{C++}
+Since \CFA allows overloading of variables and functions, ordinary identifier names need
+to be mangled. The mangling scheme is closely based on the Itanium \CC ABI,\footnote{\url{https://itanium-cxx-abi.github.io/cxx-abi/abi.html}, Section 5.1} while
+making adaptations to \CFA specific features, mainly assertions and overloaded variables
+by type. Naming conflicts are handled by mangled names; lookup by name returns a list of
+declarations with the same literal identifier name.
+
+\begin{C++}
+SymbolTable::addStruct(const StructDecl * decl)
+SymbolTable::addUnion(const UnionDecl * decl)
+SymbolTable::addEnum(const EnumDecl * decl)
+SymbolTable::addTrait(const TraitDecl * decl)
+\end{C++}
+Adds a tag type declaration to the symbol table.
+\begin{C++}
+SymbolTable::addType(const NamedTypeDecl * decl)
+\end{C++}
+Adds a typedef alias to the symbol table.
+
+\textbf{C Incompatibility Note}: Since Cforall allows using struct, union and enum type names
+without the keywords, typedef names and tag type names cannot be disambiguated by
+syntax rules. Currently the compiler puts them together and disallows collision. The
+following program is valid C but not valid Cforall:
+\begin{C++}
+struct A {};
+typedef int A;
+// gcc: ok, cfa: Cannot redefine typedef A
+\end{C++}
+In actual practices however, such usage is extremely rare, and typedef struct A A; is
+not considered an error, but silently discarded. Therefore, we expect this change to have
+minimal impact on existing C programs.
+Meanwhile, the following program is allowed in Cforall:
+\begin{C++}
+typedef int A;
+void A();
+// gcc: A redeclared as different kind of symbol, cfa: ok
+\end{C++}
+
+\subsection{Type Environment and Unification}
+
+The core of parametric type resolution algorithm.
+Type Environment organizes type parameters in \textbf{equivalent classes} and maps them to
+actual types. Unification is the algorithm that takes two (possibly parametric) types and
+parameter mappings and attempts to produce a common type by matching the type
+environments.
+
+The unification algorithm is recursive in nature and runs in two different modes internally:
+\begin{itemize}
+\item
+\textbf{Exact} unification mode requires equivalent parameters to match perfectly;
+\item
+\textbf{Inexact} unification mode allows equivalent parameters to be converted to a
+common type.
+\end{itemize}
+For a pair of matching parameters (actually, their equivalent classes), if either side is open
+(not bound to a concrete type yet), they are simply combined.
+
+Within inexact mode, types are allowed to differ on their cv-qualifiers; additionally, if a
+type never appear either in parameter list or as the base type of a pointer, it may also be
+widened (i.e. safely converted). As Cforall currently does not implement subclassing similar
+to object-oriented languages, widening conversions are on primitive types only, for
+example the conversion from int to long.
+
+The need for two unification modes come from the fact that parametric types are
+considered compatible only if all parameters are exactly the same (not just compatible).
+Pointer types also behaves similarly; in fact, they may be viewed as a primitive kind of
+parametric types. @int*@ and @long*@ are different types, just like @vector(int)@ and
+@vector(long)@ are, for the parametric type @vector(T)@.
+
+The resolver should use the following ``@public@'' functions:\footnote{
+Actual code also tracks assertions on type parameters; those extra arguments are omitted here for
+conciseness.}
+
+
+\subsubsection{Source: ResolvExpr/Unify.cc}
+
+\begin{C++}
+bool unify(const Type *type1, const Type *type2, TypeEnvironment &env,
+OpenVarSet &openVars, const SymbolTable &symtab, Type *&commonType)
+\end{C++}
+Attempts to unify @type1@ and @type2@ with current type environment.
+
+If operation succeeds, @env@ is modified by combining the equivalence classes of matching
+parameters in @type1@ and @type2@, and their common type is written to commonType.
+
+If operation fails, returns false.
+\begin{C++}
+bool typesCompatible(const Type * type1, const Type * type2, const
+SymbolTable &symtab, const TypeEnvironment &env)
+bool typesCompatibleIgnoreQualifiers(const Type * type1, const Type *
+type2, const SymbolTable &symtab, const TypeEnvironment &env)
+\end{C++}
+
+Determines if type1 and type2 can possibly be the same type. The second version ignores
+the outermost cv-qualifiers if present.\footnote{
+In const \lstinline@int * const@, only the second \lstinline@const@ is ignored.}
+
+The call has no side effect.
+
+\NOTE: No attempts are made to widen the types (exact unification is used), although the
+function names may suggest otherwise. E.g. @typesCompatible(int, long)@ returns false.
+
+
+\subsection{Expression Resolution}
+
+The design of the current version of expression resolver is outlined in the Ph.D. Thesis from
+Aaron Moss~\cite{Moss19}.
+
+A summary of the resolver algorithm for each expression type is presented below.
+
+All overloadable operators are modelled as function calls. For a function call,
+interpretations of the function and arguments are found recursively. Then the following
+steps produce a filtered list of valid interpretations:
+\begin{enumerate}
+\item
+From all possible combinations of interpretations of the function and arguments,
+those where argument types may be converted to function parameter types are
+considered valid.
+\item
+Valid interpretations with the minimum sum of argument costs are kept.
+\item
+Argument costs are then discarded; the actual cost for the function call expression is
+the sum of conversion costs from the argument types to parameter types.
+\item
+For each return type, the interpretations with satisfiable assertions are then sorted
+by actual cost computed in step 3. If for a given type, the minimum cost
+interpretations are not unique, it is said that for that return type the interpretation
+is ambiguous. If the minimum cost interpretation is unique but contains an
+ambiguous argument, it is also considered ambiguous.
+\end{enumerate}
+Therefore, for each return type, the resolver produces either of:
+\begin{itemize}
+\item
+No alternatives
+\item
+A single valid alternative
+\item
+An ambiguous alternative
+\end{itemize}
+Note that an ambiguous alternative may be discarded at the parent expressions because a
+different return type matches better for the parent expressions.
+
+The non-overloadable expressions in Cforall are: cast expressions, address-of (unary @&@)
+expressions, short-circuiting logical expressions (@&&@, @||@) and ternary conditional
+expression (@?:@).
+
+For a cast expression, the convertible argument types are kept. Then the result is selected
+by lowest argument cost, and further by lowest conversion cost to target type. If the lowest
+cost is still not unique, or an ambiguous argument interpretation is selected, the cast
+expression is ambiguous. In an expression statement, the top level expression is implicitly
+cast to void.
+
+For an address-of expression, only lvalue results are kept and the minimum cost is selected.
+
+For logical expressions @&&@ and @||@, arguments are implicitly cast to bool, and follow the rule
+of cast expression as above.
+
+For the ternary conditional expression, the condition is implicitly cast to bool, and the
+branch expressions must have compatible types. Each pair of compatible branch
+expression types produce a possible interpretation, and the cost is defined as the sum of
+expression costs plus the sum of conversion costs to the common type.
+
+TODO: Write a specification for expression costs.
+
+
+\subsection{Assertion Satisfaction}
+
+The resolver tries to satisfy assertions on expressions only when it is needed: either while
+selecting from multiple alternatives of a same result type for a function call (step 4 of
+resolving function calls), or upon reaching the top level of an expression statement.
+
+Unsatisfiable alternatives are discarded. Satisfiable alternatives receive \textbf{implicit
+parameters}: in Cforall, parametric functions are designed such that they can be compiled
+separately, as opposed to \CC templates which are only compiled at instantiation. Given a
+parametric function definition:
+\begin{C++}
+forall (otype T | {void foo(T);})
+void bar (T t) { foo(t); }
+\end{C++}
+The function bar does not know which @foo@ to call when compiled without knowing the call
+site, so it requests a function pointer to be passed as an extra argument. At the call site,
+implicit parameters are automatically inserted by the compiler.
+
+\textbf{TODO}: Explain how recursive assertion satisfaction and polymorphic recursion work.
+
+
+\section{Tests}
+
+\subsection{Test Suites}
+
+Automatic test suites are located under the @tests/@ directory. A test case consists of an
+input CFA source file (name ending with @.cfa@), and an expected output file located
+in @.expect/@ directory relative to the source file, with the same file name ending with @.txt@.
+So a test named @tuple/tupleCast@ has the following files, for example:
+\begin{C++}
+tests/
+..     tuple/
+......     .expect/
+..........       tupleCast.txt
+......     tupleCast.cfa
+\end{C++}
+If compilation fails, the error output is compared to the expect file. If compilation succeeds,
+the built program is run and its output compared to the expect file.
+To run the tests, execute the test script @test.py@ under the @tests/@ directory, with a list of
+test names to be run, or @--all@ to run all tests. The test script reports test cases
+fail/success, compilation time and program run time.
+
+
+\subsection{Performance Reports}
+
+To turn on performance reports, pass @-S@ flag to the compiler.
+
+3 kinds of performance reports are available:
+\begin{enumerate}
+\item
+Time, reports time spent in each compilation step
+\item
+Heap, reports number of dynamic memory allocations, total bytes allocated, and
+maximum heap memory usage
+\item
+Counters, for certain predefined statistics; counters can be registered anywhere in
+the compiler as a static object, and the interface can be found at
+@Common/Stats/Counter.h@.
+\end{enumerate}
+It is suggested to run performance tests with optimized build (@g++@ flag @-O3@)
+
+
 \bibliographystyle{plain}
 \bibliography{pl}
