\documentclass[twoside,12pt]{article}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Latex packages used in the document.
\usepackage{fullpage,times,comment}
\usepackage{epic,eepic}
\usepackage{upquote}									% switch curled `'" to straight
\usepackage{calc}
\usepackage{varioref}									% extended references
\usepackage[labelformat=simple,aboveskip=0pt,farskip=0pt]{subfig}
\renewcommand{\thesubfigure}{\alph{subfigure})}
\usepackage{latexsym}                                   % \Box glyph
\usepackage{mathptmx}                                   % better math font with "times"
\usepackage[usenames]{color}
\input{common}                                          % common CFA document macros
\lstset{
escapechar=\$,											% LaTeX escape in CFA code
moredelim=**[is][\color{red}]{`}{`},
}% lstset
\lstMakeShortInline@%
\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
\usepackage{breakurl}

\usepackage[pagewise]{lineno}
\renewcommand{\linenumberfont}{\scriptsize\sffamily}

% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
% AFTER HYPERREF.
\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}

\setlength{\topmargin}{-0.45in}							% move running title into header
\setlength{\headsep}{0.25in}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\CFAStyle												% use default CFA format-style
\lstnewenvironment{C++}[1][]                            % use C++ style
{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`},#1}}
{}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\setcounter{secnumdepth}{3}                             % number subsubsections
\setcounter{tocdepth}{3}                                % subsubsections in table of contents
\makeindex

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\title{\Huge
cfa-cc Developer's Reference
}% title

\author{\LARGE
Fangren Yu
}% author

\date{
\today
}% date

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}
\pagestyle{headings}
% changed after setting pagestyle
\renewcommand{\sectionmark}[1]{\markboth{\thesection\quad #1}{\thesection\quad #1}}
\renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}}
\pagenumbering{roman}
\linenumbers                                            % comment out to turn off line numbering

\maketitle
\pdfbookmark[1]{Contents}{section}
\tableofcontents

\clearpage
\thispagestyle{plain}
\pagenumbering{arabic}


\section{Overview}

cfa-cc is the reference compiler for the Cforall programming language, which is a non-
object-oriented extension to C.
Cforall attempts to introduce productive modern programming language features to C
while maintaining as much backward-compatibility as possible, so that most existing C
programs can seamlessly work with Cforall.

Since the Cforall project was dated back to the early 2000s, and only restarted in the past
few years, there is a significant amount of legacy code in the current compiler codebase,
with little proper documentation available. This becomes a difficulty while developing new
features based on the previous implementations, and especially while diagnosing
problems.

Currently, the Cforall team is also facing another problem: bad compiler performance. For
the development of a new programming language, writing a standard library is an
important part. The incompetence of the compiler causes building the library files to take
tens of minutes, making iterative development and testing almost impossible. There is
ongoing effort to rewrite the core data structure of the compiler to overcome the
performance issue, but many bugs may appear during the work, and lack of documentation
makes debugging extremely difficult.

This developer's reference will be continuously improved and eventually cover the
compiler codebase. For now, the focus is mainly on the parts being rewritten, and also the
performance bottleneck, namely the resolution algorithm. It is aimed to provide new
developers to the project enough guidance and clarify the purposes and behavior of certain
functions which are not mentioned in the previous Cforall research papers.


\section{Compiler Framework}

\subsection{AST Representation}

Source code input is first transformed into abstract syntax tree (AST) representation by the
parser before analyzed by the compiler.

There are 4 major categories of AST nodes used by the compiler, along with some derived
structures.

\subsubsection{Declaration nodes}

A declaration node represents either of:
\begin{itemize}
\item
Type declaration: struct, union, typedef or type parameter (see Appendix A.3)
\item
Variable declaration
\item
Function declaration
\end{itemize}
Declarations are introduced by standard C declarations, with the usual scoping rules.
In addition, declarations can also be introduced by the forall clause (which is the origin
of Cforall's name):
\begin{cfa}
forall (<$\emph{TypeParameterList}$> | <$\emph{AssertionList}$>)
	$\emph{declaration}$
\end{cfa}
Type parameters in Cforall are similar to \CC template type parameters. The Cforall
declaration
\begin{cfa}
forall (dtype T) ...
\end{cfa}
behaves similarly as the \CC template declaration
\begin{C++}
template <typename T> ...
\end{C++}

Assertions are a distinctive feature of Cforall: contrary to the \CC template where
arbitrary functions and operators can be used in a template definition, in a Cforall
parametric function, operations on parameterized types must be declared in assertions.

Consider the following \CC template:
\begin{C++}
template <typename T> int foo(T t) {
	return bar(t) + baz(t);
}
\end{C++}
Unless bar and baz are also parametric functions taking any argument type, they must be
declared in the assertions, or otherwise the code will not compile:
\begin{cfa}
forall (dtype T | { int bar(T); int baz(t); }) int foo (T t) {
	return bar(t) + baz(t);
}
\end{cfa}
Assertions are written using the usual function declaration syntax. The scope of type
parameters and assertions is the following declaration.

\subsubsection{Type nodes}

A type node represents the type of an object or expression.
Named types reference the corresponding type declarations. The type of a function is its
function pointer type (same as standard C).
With the addition of type parameters, named types may contain a list of parameter values
(actual parameter types).

\subsubsection{Statement nodes}

Statement nodes represent the statements in the program, including basic expression
statements, control flows and blocks.
Local declarations (within a block statement) are represented as declaration statements.

\subsubsection{Expression nodes}

Some expressions are represented differently in the compiler before and after resolution
stage:
\begin{itemize}
\item
Name expressions: NameExpr pre-resolution, VariableExpr post-resolution
\item
Member expressions: UntypedMemberExpr pre-resolution, MemberExpr post-resolution
\item
Function call expressions (including overloadable operators): UntypedExpr pre-resolution, ApplicationExpr post-resolution
\end{itemize}
The pre-resolution representations contain only the symbols. Post-resolution results link
them to the actual variable and function declarations.


\subsection{Compilation Passes}

Compilation steps are implemented as passes, which follows a general structural recursion
pattern on the syntax tree.

The basic work flow of compilation passes follows preorder and postorder traversal on
tree data structure, implemented with visitor pattern, and can be loosely described with
the following pseudocode:
\begin{cfa}
Pass::visit (node_t node) {
	previsit(node);
	if (visit_children)
		for each child of node:
			child.accept(this);
	postvisit(node);
}
\end{cfa}
Operations in previsit() happen in preorder (top to bottom) and operations in
postvisit() happen in postorder (bottom to top). The precise order of recursive
operations on child nodes can be found in @Common/PassVisitor.impl.h@ (old) and
@AST/Pass.impl.hpp@ (new).
Implementations of compilation passes need to follow certain conventions:
\begin{itemize}
\item
Passes \textbf{should not} directly override the visit method (Non-virtual Interface
principle); if a pass desires different recursion behavior, it should set
@visit_children@ to false and perform recursive calls manually within previsit or
postvisit procedures. To enable this option, inherit from @WithShortCircuiting@ mixin.
\item
previsit may mutate the node but \textbf{must not} change the node type or return null.
\item
postvisit may mutate the node, reconstruct it to a different node type, or delete it by
returning null.
\item
If the previsit or postvisit method is not defined for a node type, the step is skipped.
If the return type is declared as void, the original node is returned by default. These
behaviors are controlled by template specialization rules; see
@Common/PassVisitor.proto.h@ (old) and @AST/Pass.proto.hpp@ (new) for details.
\end{itemize}
Other useful mixin classes for compilation passes include:
\begin{itemize}
\item
WithGuards allows saving values of variables and restore automatically upon exiting
the current node.
\item
WithVisitorRef creates a wrapped entity of current pass (the actual argument
passed to recursive calls internally) for explicit recursion, usually used together
with WithShortCircuiting.
\item
WithSymbolTable gives a managed symbol table with built-in scoping rule handling
(e.g. on entering and exiting a block statement)
\end{itemize}
\textbf{NOTE}: If a pass extends the functionality of another existing pass, due to \CC overloading
resolution rules, it \textbf{must} explicitly introduce the inherited previsit and postvisit procedures
to its own scope, or otherwise they will not be picked up by template resolution:
\begin{cfa}
class Pass2: public Pass1 {
	using Pass1::previsit;
	using Pass1::postvisit;
	// new procedures
}
\end{cfa}


\subsection{Data Structure Change WIP (new-ast)}

It has been observed that excessive copying of syntax tree structures accounts for a
majority of computation cost and significantly slows down the compiler. In the previous
implementation of the syntax tree, every internal node has a unique parent; therefore all
copies are required to duplicate everything down to the bottom. A new, experimental
re-implementation of the syntax tree (source under directory AST/ hereby referred to as
``new-ast'') attempts to overcome this issue with a functional approach that allows sharing
of common sub-structures and only makes copies when necessary.

The core of new-ast is a customized implementation of smart pointers, similar to
@std::shared_ptr@ and @std::weak_ptr@ in C++ standard library. Reference counting is
used to detect sharing and allows optimization. For a purely functional (a.k.a. immutable)
data structure, all mutations are modelled by shallow copies along the path of mutation.
With reference counting optimization, unique nodes are allowed to be mutated in place.
This however, may potentially introduce some complications and bugs; a few issues are
discussed near the end of this section.

\subsubsection{Source: AST/Node.hpp}

class @ast::Node@ is the base class of all new-ast node classes, which implements
reference counting mechanism. Two different counters are recorded: ``strong'' reference
count for number of nodes semantically owning it; ``weak'' reference count for number of
nodes holding a mere reference and only need to observe changes.
class @ast::ptr_base@ is the smart pointer implementation and also takes care of
resource management.

Direct access through the smart pointer is read-only. A mutable access should be obtained
by calling shallowCopy or mutate as below.

Currently, the weak pointers are only used to reference declaration nodes from a named
type, or a variable expression. Since declaration nodes are intended to denote unique
entities in the program, weak pointers always point to unique (unshared) nodes. This may
change in the future, and weak references to shared nodes may introduce some problems;
see mutate function below.

All node classes should always use smart pointers in the structure and should not use raw
pointers.

\begin{C++}
void ast::Node::increment(ref_type ref)
\end{C++}
Increments this node's strong or weak reference count.
\begin{C++}
void ast::Node::decrement(ref_type ref, bool do_delete = true)
\end{C++}
Decrements this node's strong or weak reference count. If strong reference count reaches
zero, the node is deleted by default.
\textbf{NOTE}: Setting @do_delete@ to false may result in a detached node. Subsequent code should
manually delete the node or assign it to a strong pointer to prevent memory leak.
Reference counting functions are internally called by @ast::ptr_base@.
\begin{C++}
template<typename node_t>
node_t * shallowCopy(const node_t * node)
\end{C++}
Returns a mutable, shallow copy of node: all child pointers are pointing to the same child
nodes.
\begin{C++}
template<typename node_t>
node_t * mutate(const node_t * node)
\end{C++}
If node is unique (strong reference count is 1), returns a mutable pointer to the same node.
Otherwise, returns shallowCopy(node).
It is an error to mutate a shared node that is weak-referenced. Currently this does not
happen. The problem may appear once weak pointers to shared nodes (e.g. expression
nodes) are used; special care will be needed.

\textbf{NOTE}: This naive uniqueness check may not be sufficient in some cases. A discussion of the
issue is presented at the end of this section.
\begin{C++}
template<typename node_t, typename parent_t, typename field_t, typename assn_t>
const node_t * mutate_field(const node_t * node, field_t parent_t::*field, assn_t && val)
\end{C++}
\begin{C++}
template<typename node_t, typename parent_t, typename coll_t, typename ind_t,
		typename field_t>
const node_t * mutate_field_index(const node_t * node, coll_t parent_t::* field, ind_t i,
		field_t && val)
\end{C++}
Helpers for mutating a field on a node using pointer to member (creates shallow copy
when necessary).

\subsubsection{Issue: Undetected sharing}

The @mutate@ behavior described above has a problem: deeper shared nodes may be
mistakenly considered as unique. \VRef[Figure]{f:DeepNodeSharing} shows how the problem could arise:
\begin{figure}
\centering
\input{DeepNodeSharing}
\caption{Deep sharing of nodes}
\label{f:DeepNodeSharing}
\end{figure}
Suppose that we are working on the tree rooted at P1, which
is logically the chain P1-A-B and P2 is irrelevant, and then
mutate(B) is called. The algorithm considers B as unique since
it is only directly owned by A. However, the other tree P2-A-B
indirectly shares the node B and is therefore wrongly mutated.

To partly address this problem, if the mutation is called higher up the tree, a chain
mutation helper can be used:

\subsubsection{Source: AST/Chain.hpp}

\begin{C++}
template<typename node_t, Node::ref_type ref_t>
auto chain_mutate(ptr_base<node_t, ref_t> & base)
\end{C++}
This function returns a chain mutator handle which takes pointer-to-member to go down
the tree while creating shallow copies as necessary; see @struct _chain_mutator@ in the
source code for details.

\bibliographystyle{plain}
\bibliography{pl}


\end{document}

% Local Variables: %
% tab-width: 4 %
% fill-column: 100 %
% compile-command: "make" %
% End: %
