Changeset 223a633 for doc

Timestamp:

Oct 15, 2020, 3:41:38 PM

Author:

Thierry Delisle <tdelisle@…>

Branches:

ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum, stuck-waitfor-destruct

Parents:

33c3ded (diff), 0b18db7 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.

Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Location:

Files:

13 added
12 edited
23 moved

LaTeXmacros/common.tex (modified) (4 diffs)
LaTeXmacros/lstlang.sty (modified) (3 diffs)
bibliography/pl.bib (modified) (2 diffs)
papers/concurrency/Paper.tex (modified) (19 diffs)
papers/concurrency/annex/local.bib (modified) (1 diff)
papers/concurrency/mail2 (modified) (1 diff)
papers/concurrency/response3 (added)
proposals/ZeroCostPreemption.md (added)
proposals/function_type_change.md (added)
refrat/refrat.tex (modified) (3 diffs)
theses/andrew_beach_MMath/glossaries.tex (added)
theses/andrew_beach_MMath/thesis.tex (modified) (1 diff)
theses/fangren_yu_COOP_S20/Makefile (modified) (1 diff)
theses/fangren_yu_COOP_S20/Report.tex (modified) (11 diffs)
theses/fangren_yu_COOP_S20/cfa_developer_reference.pdf (added)
theses/thierry_delisle_PhD/code/readQ_example/Makefile (added)
theses/thierry_delisle_PhD/code/readQ_example/proto-gui/main.cpp (added)
theses/thierry_delisle_PhD/code/readQ_example/thrdlib/Makefile (added)
theses/thierry_delisle_PhD/code/readQ_example/thrdlib/cforall.hpp (added)
theses/thierry_delisle_PhD/code/readQ_example/thrdlib/fibre.hpp (added)
theses/thierry_delisle_PhD/code/readQ_example/thrdlib/pthread.hpp (added)
theses/thierry_delisle_PhD/code/readQ_example/thrdlib/thread.cpp (added)
theses/thierry_delisle_PhD/code/readQ_example/thrdlib/thread.hpp (added)
theses/thierry_delisle_PhD/code/readyQ_proto/Makefile (moved) (moved from doc/theses/thierry_delisle_PhD/code/Makefile )
theses/thierry_delisle_PhD/code/readyQ_proto/assert.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/assert.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/bitbench/select.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/bitbench/select.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/bts.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/bts.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/bts_test.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/bts_test.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/links.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/links.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/prefetch.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/prefetch.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/process.sh (moved) (moved from doc/theses/thierry_delisle_PhD/code/process.sh )
theses/thierry_delisle_PhD/code/readyQ_proto/processor.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/processor_list.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor_list.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/processor_list_fast.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor_list_fast.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/processor_list_good.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/processor_list_good.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/randbit.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/randbit.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/relaxed_list.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/relaxed_list.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/relaxed_list_layout.cpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/relaxed_list_layout.cpp )
theses/thierry_delisle_PhD/code/readyQ_proto/runperf.sh (moved) (moved from doc/theses/thierry_delisle_PhD/code/runperf.sh )
theses/thierry_delisle_PhD/code/readyQ_proto/scale.sh (moved) (moved from doc/theses/thierry_delisle_PhD/code/scale.sh )
theses/thierry_delisle_PhD/code/readyQ_proto/snzi-packed.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/snzi-packed.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/snzi.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/snzi.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/snzm.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/snzm.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/utils.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/utils.hpp )
theses/thierry_delisle_PhD/code/readyQ_proto/work_stealing.hpp (moved) (moved from doc/theses/thierry_delisle_PhD/code/work_stealing.hpp )
user/Makefile (modified) (1 diff)
user/user.tex (modified) (17 diffs)

Legend:

(space) Unmodified
+ Added
- Removed

doc/LaTeXmacros/common.tex

-              r33c3ded
+              r223a633
 %% Created On       : Sat Apr  9 10:06:17 2016
 %% Last Modified By : Peter A. Buhr
-%% Last Modified On : Fri Sep  4 13:56:52 2020
-%% Update Count     : 383
+%% Last Modified On : Mon Oct  5 09:34:46 2020
+%% Update Count     : 464
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \newlength{\parindentlnth}
 \setlength{\parindentlnth}{\parindent}
-\newcommand{\LstBasicStyle}[1]{{\lst@basicstyle{#1}}}
-\newcommand{\LstKeywordStyle}[1]{{\lst@basicstyle{\lst@keywordstyle{#1}}}}
-\newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}}
-\newlength{\gcolumnposn}                                % temporary hack because lstlisting does not handle tabs correctly
-\newlength{\columnposn}
-\setlength{\gcolumnposn}{2.5in}
-\setlength{\columnposn}{\gcolumnposn}
-\newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\lst@basicstyle{\LstCommentStyle{#2}}}}
-\newcommand{\CRT}{\global\columnposn=\gcolumnposn}
-% allow escape sequence in lstinline
-%\usepackage{etoolbox}
-%\patchcmd{\lsthk@TextStyle}{\let\lst@DefEsc\@empty}{}{}{\errmessage{failed to patch}}
 \usepackage{pslatex}                                    % reduce size of san serif font
 …
 \usepackage{listings}                                                                   % format program code
 \usepackage{lstlang}
+\newcommand{\CFADefaults}{%
+\makeatletter
+\newcommand{\LstBasicStyle}[1]{{\lst@basicstyle{#1}}}
+\newcommand{\LstKeywordStyle}[1]{{\lst@basicstyle{\lst@keywordstyle{#1}}}}
+\newcommand{\LstCommentStyle}[1]{{\lst@basicstyle{\lst@commentstyle{#1}}}}
+\newlength{\gcolumnposn}                                % temporary hack because lstlisting does not handle tabs correctly
+\newlength{\columnposn}
+\setlength{\gcolumnposn}{2.75in}
+\setlength{\columnposn}{\gcolumnposn}
+\newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\lst@basicstyle{\LstCommentStyle{#2}}}}
+\newcommand{\CRT}{\global\columnposn=\gcolumnposn}
+% allow escape sequence in lstinline
+%\usepackage{etoolbox}
+%\patchcmd{\lsthk@TextStyle}{\let\lst@DefEsc\@empty}{}{}{\errmessage{failed to patch}}
+% allow adding to lst literate
+\def\addToLiterate#1{\protect\edef\lst@literate{\unexpanded\expandafter{\lst@literate}\unexpanded{#1}}}
+\lst@Key{add to literate}{}{\addToLiterate{#1}}
+\makeatother
+\newcommand{\CFAStyle}{%
 \lstset{
-language=CFA,
 columns=fullflexible,
 basicstyle=\linespread{0.9}\sf,                 % reduce line spacing and use sanserif font
 …
 belowskip=3pt,
 % replace/adjust listing characters that look bad in sanserif
-literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptscriptstyle\land\,$}}1
+literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.75ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptscriptstyle\land\,$}}1
         {~}{\raisebox{0.3ex}{$\scriptstyle\sim\,$}}1 {`}{\ttfamily\upshape\hspace*{-0.1ex}`}1
         {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex\textgreater}2,
+moredelim=**[is][\color{red}]{?}{?},    % red highlighting ?...? (registered trademark symbol) emacs: C-q M-.
+}% lstset
+}% CFAStyle
+\ifdefined\CFALatin% extra Latin-1 escape characters
+\lstnewenvironment{cfa}[1][]{
+\lstset{
+language=CFA,
+moredelim=**[is][\color{red}]{®}{®},    % red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
 moredelim=**[is][\color{blue}]{ß}{ß},   % blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_
 moredelim=**[is][\color{OliveGreen}]{¢}{¢}, % green highlighting ¢...¢ (cent symbol) emacs: C-q M-"
 moredelim=[is][\lstset{keywords={}}]{¶}{¶}, % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
+% replace/adjust listing characters that look bad in sanserif
+add to literate={`}{\ttfamily\upshape\hspace*{-0.1ex}`}1
 }% lstset
+}% CFADefaults
+\newcommand{\CFAStyle}{%
+\CFADefaults
+\lstset{#1}
+}{}
 % inline code ©...© (copyright symbol) emacs: C-q M-)
 \lstMakeShortInline©                                    % single-character for \lstinline
+}% CFAStyle
+\lstnewenvironment{cfa}[1][]
+{\CFADefaults\lstset{#1}}
+{}
+\else% regular ASCI characters
+\lstnewenvironment{cfa}[1][]{
+\lstset{
+language=CFA,
+escapechar=\$,                                                  % LaTeX escape in CFA code
+moredelim=**[is][\color{red}]{@}{@},    % red highlighting @...@
+}% lstset
+\lstset{#1}
+}{}
+% inline code @...@ (at symbol)
+\lstMakeShortInline@                                    % single-character for \lstinline
+\fi%
 % Local Variables: %

doc/LaTeXmacros/lstlang.sty

-              r33c3ded
+              r223a633
 %% Created On       : Sat May 13 16:34:42 2017
 %% Last Modified By : Peter A. Buhr
-%% Last Modified On : Tue Jan  8 14:40:33 2019
-%% Update Count     : 21
+%% Last Modified On : Wed Sep 23 22:40:04 2020
+%% Update Count     : 24
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
                 auto, _Bool, catch, catchResume, choose, _Complex, __complex, __complex__, __const, __const__,
                 coroutine, disable, dtype, enable, exception, __extension__, fallthrough, fallthru, finally,
-                __float80, float80, __float128, float128, forall, ftype, _Generic, _Imaginary, __imag, __imag__,
+                __float80, float80, __float128, float128, forall, ftype, generator, _Generic, _Imaginary, __imag, __imag__,
                 inline, __inline, __inline__, __int128, int128, __label__, monitor, mutex, _Noreturn, one_t, or,
-                otype, restrict, __restrict, __restrict__, __signed, __signed__, _Static_assert, thread,
+                otype, restrict, __restrict, __restrict__, __signed, __signed__, _Static_assert, suspend, thread,
                 _Thread_local, throw, throwResume, timeout, trait, try, ttype, typeof, __typeof, __typeof__,
                 virtual, __volatile, __volatile__, waitfor, when, with, zero_t,
 …
 % C++ programming language
-\lstdefinelanguage{C++}[ANSI]{C++}{}
+\lstdefinelanguage{C++}[ANSI]{C++}{
+        morekeywords={nullptr,}
+}
 % uC++ programming language, based on ANSI C++

doc/bibliography/pl.bib

-              r33c3ded
+              r223a633
     key         = {Cforall Benchmarks},
     author      = {{\textsf{C}{$\mathbf{\forall}$} Benchmarks}},
-    howpublished= {\href{https://plg.uwaterloo.ca/~cforall/doc/CforallConcurrentBenchmarks.tar}{https://\-plg.uwaterloo.ca/\-$\sim$cforall/\-doc/\-CforallConcurrentBenchmarks.tar}},
+    howpublished= {\href{https://github.com/cforall/ConcurrentBenchmarks_SPE20}{https://\-github.com/\-cforall/\-ConcurrentBenchmarks\_SPE20}},
+}
 …
     title       = {Cooperating Sequential Processes},
     institution = {Technological University},
-    address     = {Eindhoven, Netherlands},
+    address     = {Eindhoven, Neth.},
     year        = 1965,
     note        = {Reprinted in \cite{Genuys68} pp. 43--112.}

doc/papers/concurrency/Paper.tex

-              r33c3ded
+              r223a633
 {}
 \lstnewenvironment{C++}[1][]                            % use C++ style
-{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{uC++}[1][]
-{\lstset{language=uC++,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=uC++,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{Go}[1][]
-{\lstset{language=Golang,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=Golang,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{python}[1][]
-{\lstset{language=python,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=python,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 \lstnewenvironment{java}[1][]
-{\lstset{language=java,moredelim=**[is][\protect\color{red}]{`}{`},#1}\lstset{#1}}
+{\lstset{language=java,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}}
 {}
 …
 \begin{document}
-\linenumbers                            % comment out to turn off line numbering
+%\linenumbers                           % comment out to turn off line numbering
 \maketitle
 …
 \hline
 stateful                        & thread        & \multicolumn{1}{c|}{No} & \multicolumn{1}{c}{Yes} \\
 \hline
 \hline
+\hline
+\hline
 No                                      & No            & \textbf{1}\ \ \ @struct@                              & \textbf{2}\ \ \ @mutex@ @struct@              \\
 \hline
+\hline
 Yes (stackless)         & No            & \textbf{3}\ \ \ @generator@                   & \textbf{4}\ \ \ @mutex@ @generator@   \\
 \hline
+\hline
 Yes (stackful)          & No            & \textbf{5}\ \ \ @coroutine@                   & \textbf{6}\ \ \ @mutex@ @coroutine@   \\
 \hline
+\hline
 No                                      & Yes           & \textbf{7}\ \ \ {\color{red}rejected} & \textbf{8}\ \ \ {\color{red}rejected} \\
 \hline
+\hline
 Yes (stackless)         & Yes           & \textbf{9}\ \ \ {\color{red}rejected} & \textbf{10}\ \ \ {\color{red}rejected} \\
 \hline
+\hline
 Yes (stackful)          & Yes           & \textbf{11}\ \ \ @thread@                             & \textbf{12}\ \ @mutex@ @thread@               \\
 \end{tabular}
 …
 \label{s:RuntimeStructureCluster}
-A \newterm{cluster} is a collection of user and kernel threads, where the kernel threads run the user threads from the cluster's ready queue, and the operating system runs the kernel threads on the processors from its ready queue.
+A \newterm{cluster} is a collection of user and kernel threads, where the kernel threads run the user threads from the cluster's ready queue, and the operating system runs the kernel threads on the processors from its ready queue~\cite{Buhr90a}.
 The term \newterm{virtual processor} is introduced as a synonym for kernel thread to disambiguate between user and kernel thread.
 From the language perspective, a virtual processor is an actual processor (core).
 …
 \end{cfa}
 where CPU time in nanoseconds is from the appropriate language clock.
-Each benchmark is performed @N@ times, where @N@ is selected so the benchmark runs in the range of 2--20 seconds for the specific programming language.
+Each benchmark is performed @N@ times, where @N@ is selected so the benchmark runs in the range of 2--20 seconds for the specific programming language;
+each @N@ appears after the experiment name in the following tables.
 The total time is divided by @N@ to obtain the average time for a benchmark.
 Each benchmark experiment is run 13 times and the average appears in the table.
+For languages with a runtime JIT (Java, Node.js, Python), a single half-hour long experiment is run to check stability;
+all long-experiment results are statistically equivalent, \ie median/average/standard-deviation correlate with the short-experiment results, indicating the short experiments reached a steady state.
 All omitted tests for other languages are functionally identical to the \CFA tests and available online~\cite{CforallConcurrentBenchmarks}.
-% tar --exclude-ignore=exclude -cvhf benchmark.tar benchmark
-% cp -p benchmark.tar /u/cforall/public_html/doc/concurrent_benchmark.tar
 \paragraph{Creation}
 …
 \begin{multicols}{2}
+\lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
+\begin{cfa}
+@coroutine@ MyCoroutine {};
+\begin{cfa}[xleftmargin=0pt]
+`coroutine` MyCoroutine {};
 void ?{}( MyCoroutine & this ) {
 #ifdef EAGER
 …
 void main( MyCoroutine & ) {}
 int main() {
         BENCH( for ( N ) { @MyCoroutine c;@ } )
+        BENCH( for ( N ) { `MyCoroutine c;` } )
         sout | result;
+}
 …
 \begin{tabular}[t]{@{}r*{3}{D{.}{.}{5.2}}@{}}
-\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
-\CFA generator                  & 0.6           & 0.6           & 0.0           \\
-\CFA coroutine lazy             & 13.4          & 13.1          & 0.5           \\
-\CFA coroutine eager    & 144.7         & 143.9         & 1.5           \\
-\CFA thread                             & 466.4         & 468.0         & 11.3          \\
-\uC coroutine                   & 155.6         & 155.7         & 1.7           \\
-\uC thread                              & 523.4         & 523.9         & 7.7           \\
-Python generator                & 123.2         & 124.3         & 4.1           \\
-Node.js generator               & 33.4          & 33.5          & 0.3           \\
-Goroutine thread                & 751.0         & 750.5         & 3.1           \\
-Rust tokio thread               & 1860.0        & 1881.1        & 37.6          \\
-Rust thread                             & 53801.0       & 53896.8       & 274.9         \\
-Java thread                             & 120274.0      & 120722.9      & 2356.7        \\
-Pthreads thread                 & 31465.5       & 31419.5       & 140.4
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA generator (1B)                     & 0.6           & 0.6           & 0.0           \\
+\CFA coroutine lazy     (100M)  & 13.4          & 13.1          & 0.5           \\
+\CFA coroutine eager (10M)      & 144.7         & 143.9         & 1.5           \\
+\CFA thread (10M)                       & 466.4         & 468.0         & 11.3          \\
+\uC coroutine (10M)                     & 155.6         & 155.7         & 1.7           \\
+\uC thread (10M)                        & 523.4         & 523.9         & 7.7           \\
+Python generator (10M)          & 123.2         & 124.3         & 4.1           \\
+Node.js generator (10M)         & 33.4          & 33.5          & 0.3           \\
+Goroutine thread (10M)          & 751.0         & 750.5         & 3.1           \\
+Rust tokio thread (10M)         & 1860.0        & 1881.1        & 37.6          \\
+Rust thread     (250K)                  & 53801.0       & 53896.8       & 274.9         \\
+Java thread (250K)                      & 119256.0      & 119679.2      & 2244.0        \\
+% Java thread (1 000 000)               & 123100.0      & 123052.5      & 751.6         \\
+Pthreads thread (250K)          & 31465.5       & 31419.5       & 140.4
 \end{tabular}
 \end{multicols}
 …
 Internal scheduling is measured using a cycle of two threads signalling and waiting.
 Figure~\ref{f:schedint} shows the code for \CFA, with results in Table~\ref{t:schedint}.
-Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects.
+Java scheduling is significantly greater because the benchmark explicitly creates multiple threads in order to prevent the JIT from making the program sequential, \ie removing all locking.
+Note, the \CFA incremental cost for bulk acquire is a fixed cost for small numbers of mutex objects.
+User-level threading has one kernel thread, eliminating contention between the threads (direct handoff of the kernel thread).
+Kernel-level threading has two kernel threads allowing some contention.
 \begin{multicols}{2}
 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
 \begin{cfa}
+\setlength{\tabcolsep}{3pt}
+\begin{cfa}[xleftmargin=0pt]
 volatile int go = 0;
+@condition c;@
 @monitor@ M {} m1/*, m2, m3, m4*/;
 void call( M & @mutex p1/*, p2, p3, p4*/@ ) {
         @signal( c );@
+}
 void wait( M & @mutex p1/*, p2, p3, p4*/@ ) {
+`condition c;`
+`monitor` M {} m1/*, m2, m3, m4*/;
+void call( M & `mutex p1/*, p2, p3, p4*/` ) {
+        `signal( c );`
+}
+void wait( M & `mutex p1/*, p2, p3, p4*/` ) {
         go = 1; // continue other thread
         for ( N ) { @wait( c );@ } );
+        for ( N ) { `wait( c );` } );
+}
 thread T {};
 …
 \begin{tabular}{@{}r*{3}{D{.}{.}{5.2}}@{}}
-\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
-\CFA @signal@, 1 monitor        & 364.4         & 364.2         & 4.4           \\
-\CFA @signal@, 2 monitor        & 484.4         & 483.9         & 8.8           \\
-\CFA @signal@, 4 monitor        & 709.1         & 707.7         & 15.0          \\
-\uC @signal@ monitor            & 328.3         & 327.4         & 2.4           \\
-Rust cond. variable                     & 7514.0        & 7437.4        & 397.2         \\
-Java @notify@ monitor           & 9623.0        & 9654.6        & 236.2         \\
-Pthreads cond. variable         & 5553.7        & 5576.1        & 345.6
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA @signal@, 1 monitor (10M)  & 364.4         & 364.2         & 4.4           \\
+\CFA @signal@, 2 monitor (10M)  & 484.4         & 483.9         & 8.8           \\
+\CFA @signal@, 4 monitor (10M)  & 709.1         & 707.7         & 15.0          \\
+\uC @signal@ monitor (10M)              & 328.3         & 327.4         & 2.4           \\
+Rust cond. variable     (1M)            & 7514.0        & 7437.4        & 397.2         \\
+Java @notify@ monitor (1M)              & 8717.0        & 8774.1        & 471.8         \\
+% Java @notify@ monitor (100 000 000)           & 8634.0        & 8683.5        & 330.5         \\
+Pthreads cond. variable (1M)    & 5553.7        & 5576.1        & 345.6
 \end{tabular}
 \end{multicols}
 …
 External scheduling is measured using a cycle of two threads calling and accepting the call using the @waitfor@ statement.
 Figure~\ref{f:schedext} shows the code for \CFA with results in Table~\ref{t:schedext}.
-Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects.
+Note, the \CFA incremental cost for bulk acquire is a fixed cost for small numbers of mutex objects.
 \begin{multicols}{2}
 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
+\setlength{\tabcolsep}{5pt}
 \vspace*{-16pt}
 \begin{cfa}
 @monitor@ M {} m1/*, m2, m3, m4*/;
 void call( M & @mutex p1/*, p2, p3, p4*/@ ) {}
 void wait( M & @mutex p1/*, p2, p3, p4*/@ ) {
         for ( N ) { @waitfor( call : p1/*, p2, p3, p4*/ );@ }
+\begin{cfa}[xleftmargin=0pt]
+`monitor` M {} m1/*, m2, m3, m4*/;
+void call( M & `mutex p1/*, p2, p3, p4*/` ) {}
+void wait( M & `mutex p1/*, p2, p3, p4*/` ) {
+        for ( N ) { `waitfor( call : p1/*, p2, p3, p4*/ );` }
+}
 thread T {};
 …
 \columnbreak
 \vspace*{-16pt}
+\vspace*{-18pt}
 \captionof{table}{External-scheduling comparison (nanoseconds)}
 \label{t:schedext}
 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}}
-\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
-\CFA @waitfor@, 1 monitor       & 367.1 & 365.3 & 5.0   \\
-\CFA @waitfor@, 2 monitor       & 463.0 & 464.6 & 7.1   \\
-\CFA @waitfor@, 4 monitor       & 689.6 & 696.2 & 21.5  \\
-\uC \lstinline[language=uC++]|_Accept| monitor  & 328.2 & 329.1 & 3.4   \\
-Go \lstinline[language=Golang]|select| channel  & 365.0 & 365.5 & 1.2
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+\CFA @waitfor@, 1 monitor (10M) & 367.1 & 365.3 & 5.0   \\
+\CFA @waitfor@, 2 monitor (10M) & 463.0 & 464.6 & 7.1   \\
+\CFA @waitfor@, 4 monitor (10M) & 689.6 & 696.2 & 21.5  \\
+\uC \lstinline[language=uC++]|_Accept| monitor (10M)    & 328.2 & 329.1 & 3.4   \\
+Go \lstinline[language=Golang]|select| channel (10M)    & 365.0 & 365.5 & 1.2
 \end{tabular}
 \end{multicols}
 …
 \begin{multicols}{2}
 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
 \begin{cfa}
 @monitor@ M {} m1/*, m2, m3, m4*/;
 call( M & @mutex p1/*, p2, p3, p4*/@ ) {}
+\setlength{\tabcolsep}{3pt}
+\begin{cfa}[xleftmargin=0pt]
+`monitor` M {} m1/*, m2, m3, m4*/;
+call( M & `mutex p1/*, p2, p3, p4*/` ) {}
 int main() {
         BENCH( for( N ) call( m1/*, m2, m3, m4*/ ); )
 …
 \label{t:mutex}
 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}}
-\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
-test-and-test-set lock                  & 19.1  & 18.9  & 0.4   \\
-\CFA @mutex@ function, 1 arg.   & 48.3  & 47.8  & 0.9   \\
-\CFA @mutex@ function, 2 arg.   & 86.7  & 87.6  & 1.9   \\
-\CFA @mutex@ function, 4 arg.   & 173.4 & 169.4 & 5.9   \\
-\uC @monitor@ member rtn.               & 54.8  & 54.8  & 0.1   \\
-Goroutine mutex lock                    & 34.0  & 34.0  & 0.0   \\
-Rust mutex lock                                 & 33.0  & 33.2  & 0.8   \\
-Java synchronized method                & 31.0  & 31.0  & 0.0   \\
-Pthreads mutex Lock                             & 31.0  & 31.1  & 0.4
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+test-and-test-set lock (50M)            & 19.1  & 18.9  & 0.4   \\
+\CFA @mutex@ function, 1 arg. (50M)     & 48.3  & 47.8  & 0.9   \\
+\CFA @mutex@ function, 2 arg. (50M)     & 86.7  & 87.6  & 1.9   \\
+\CFA @mutex@ function, 4 arg. (50M)     & 173.4 & 169.4 & 5.9   \\
+\uC @monitor@ member rtn. (50M)         & 54.8  & 54.8  & 0.1   \\
+Goroutine mutex lock (50M)                      & 34.0  & 34.0  & 0.0   \\
+Rust mutex lock (50M)                           & 33.0  & 33.2  & 0.8   \\
+Java synchronized method (50M)          & 31.0  & 30.9  & 0.5   \\
+% Java synchronized method (10 000 000 000)             & 31.0 & 30.2 & 0.9 \\
+Pthreads mutex Lock (50M)                       & 31.0  & 31.1  & 0.4
 \end{tabular}
 \end{multicols}
 …
 % To: "Peter A. Buhr" <pabuhr@plg2.cs.uwaterloo.ca>
 % Date: Fri, 24 Jan 2020 13:49:18 -0500
+%
+%
 % I can also verify that the previous version, which just tied a bunch of promises together, *does not* go back to the
 % event loop at all in the current version of Node. Presumably they're taking advantage of the fact that the ordering of
 …
 \begin{multicols}{2}
+\lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}
+\begin{cfa}[aboveskip=0pt,belowskip=0pt]
+@coroutine@ C {};
+void main( C & ) { for () { @suspend;@ } }
+\begin{cfa}[xleftmargin=0pt]
+`coroutine` C {};
+void main( C & ) { for () { `suspend;` } }
 int main() { // coroutine test
         C c;
         BENCH( for ( N ) { @resume( c );@ } )
+        BENCH( for ( N ) { `resume( c );` } )
         sout | result;
+}
 int main() { // thread test
         BENCH( for ( N ) { @yield();@ } )
+        BENCH( for ( N ) { `yield();` } )
         sout | result;
+}
 …
 \label{t:ctx-switch}
 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}}
-\multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
-C function                      & 1.8           & 1.8           & 0.0   \\
-\CFA generator          & 1.8           & 2.0           & 0.3   \\
-\CFA coroutine          & 32.5          & 32.9          & 0.8   \\
-\CFA thread                     & 93.8          & 93.6          & 2.2   \\
-\uC coroutine           & 50.3          & 50.3          & 0.2   \\
-\uC thread                      & 97.3          & 97.4          & 1.0   \\
-Python generator        & 40.9          & 41.3          & 1.5   \\
-Node.js await           & 1852.2        & 1854.7        & 16.4  \\
-Node.js generator       & 33.3          & 33.4          & 0.3   \\
-Goroutine thread        & 143.0         & 143.3         & 1.1   \\
-Rust async await        & 32.0          & 32.0          & 0.0   \\
-Rust tokio thread       & 143.0         & 143.0         & 1.7   \\
-Rust thread                     & 332.0         & 331.4         & 2.4   \\
-Java thread                     & 405.0         & 415.0         & 17.6  \\
-Pthreads thread         & 334.3         & 335.2         & 3.9
+\multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\
+C function (10B)                        & 1.8           & 1.8           & 0.0   \\
+\CFA generator (5B)                     & 1.8           & 2.0           & 0.3   \\
+\CFA coroutine (100M)           & 32.5          & 32.9          & 0.8   \\
+\CFA thread (100M)                      & 93.8          & 93.6          & 2.2   \\
+\uC coroutine (100M)            & 50.3          & 50.3          & 0.2   \\
+\uC thread (100M)                       & 97.3          & 97.4          & 1.0   \\
+Python generator (100M)         & 40.9          & 41.3          & 1.5   \\
+Node.js await (5M)                      & 1852.2        & 1854.7        & 16.4  \\
+Node.js generator (100M)        & 33.3          & 33.4          & 0.3   \\
+Goroutine thread (100M)         & 143.0         & 143.3         & 1.1   \\
+Rust async await (100M)         & 32.0          & 32.0          & 0.0   \\
+Rust tokio thread (100M)        & 143.0         & 143.0         & 1.7   \\
+Rust thread (25M)                       & 332.0         & 331.4         & 2.4   \\
+Java thread (100M)                      & 405.0         & 415.0         & 17.6  \\
+% Java thread (  100 000 000)                   & 413.0 & 414.2 & 6.2 \\
+% Java thread (5 000 000 000)                   & 415.0 & 415.2 & 6.1 \\
+Pthreads thread (25M)           & 334.3         & 335.2         & 3.9
 \end{tabular}
 \end{multicols}
 …
 Languages using 1:1 threading based on pthreads can at best meet or exceed, due to language overhead, the pthread results.
 Note, pthreads has a fast zero-contention mutex lock checked in user space.
-Languages with M:N threading have better performance than 1:1 because there is no operating-system interactions.
+Languages with M:N threading have better performance than 1:1 because there is no operating-system interactions (context-switching or locking).
+As well, for locking experiments, M:N threading has less contention if only one kernel thread is used.
 Languages with stackful coroutines have higher cost than stackless coroutines because of stack allocation and context switching;
 however, stackful \uC and \CFA coroutines have approximately the same performance as stackless Python and Node.js generators.
 The \CFA stackless generator is approximately 25 times faster for suspend/resume and 200 times faster for creation than stackless Python and Node.js generators.
+The Node.js context-switch is costly when asynchronous await must enter the event engine because a promise is not fulfilled.
+Finally, the benchmark results correlate across programming languages with and without JIT, indicating the JIT has completed any runtime optimizations.
 …
 The authors recognize the design assistance of Aaron Moss, Rob Schluntz, Andrew Beach, and Michael Brooks; David Dice for commenting and helping with the Java benchmarks; and Gregor Richards for helping with the Node.js benchmarks.
 This research is funded by a grant from Waterloo-Huawei (\url{http://www.huawei.com}) Joint Innovation Lab. %, and Peter Buhr is partially funded by the Natural Sciences and Engineering Research Council of Canada.
+This research is funded by the NSERC/Waterloo-Huawei (\url{http://www.huawei.com}) Joint Innovation Lab. %, and Peter Buhr is partially funded by the Natural Sciences and Engineering Research Council of Canada.
 {%

doc/papers/concurrency/annex/local.bib

r33c3ded	r223a633
59	59	@manual{Cpp-Transactions,
60	60	keywords = {C++, Transactional Memory},
61		title = {Technical Specification for C++ Extensions for Transactional Memory},
	61	title = {Tech. Spec. for C++ Extensions for Transactional Memory},
62	62	organization= {International Standard ISO/IEC TS 19841:2015 },
63	63	publisher = {American National Standards Institute},

doc/papers/concurrency/mail2

-              r33c3ded
+              r223a633
 Software: Practice and Experience Editorial Office
+Date: Wed, 2 Sep 2020 20:55:34 +0000
+From: Richard Jones <onbehalfof@manuscriptcentral.com>
+Reply-To: R.E.Jones@kent.ac.uk
+To: tdelisle@uwaterloo.ca, pabuhr@uwaterloo.ca
+Subject: Software: Practice and Experience - Decision on Manuscript ID
+ SPE-19-0219.R2
+-Sep-2020
+Dear Dr Buhr,
+Many thanks for submitting SPE-19-0219.R2 entitled "Advanced Control-flow and Concurrency in Cforall" to Software: Practice and Experience. The paper has now been reviewed and the comments of the referees are included at the bottom of this letter. I apologise for the length of time it has taken to get these.
+Both reviewers consider this paper to be close to acceptance. However, before I can accept this paper, I would like you address the comments of Reviewer 2, particularly with regard to the description of the adaptation Java harness to deal with warmup. I would expect to see a convincing argument that the computation has reached a steady state. I would also like you to provide the values for N for each benchmark run. This should be very straightforward for you to do. There are a couple of papers on steady state that you may wish to consult (though I am certainly not pushing my own work).
+) Barrett, Edd; Bolz-Tereick, Carl Friedrich; Killick, Rebecca; Mount, Sarah and Tratt, Laurence. Virtual Machine Warmup Blows Hot and Cold. OOPSLA 2017. https://doi.org/10.1145/3133876
+Virtual Machines (VMs) with Just-In-Time (JIT) compilers are traditionally thought to execute programs in two phases: the initial warmup phase determines which parts of a program would most benefit from dynamic compilation, before JIT compiling those parts into machine code; subsequently the program is said to be at a steady state of peak performance. Measurement methodologies almost always discard data collected during the warmup phase such that reported measurements focus entirely on peak performance. We introduce a fully automated statistical approach, based on changepoint analysis, which allows us to determine if a program has reached a steady state and, if so, whether that represents peak performance or not. Using this, we show that even when run in the most controlled of circumstances, small, deterministic, widely studied microbenchmarks often fail to reach a steady state of peak performance on a variety of common VMs. Repeating our experiment on 3 different machines, we found that at most 43.5% of pairs consistently reach a steady state of peak performance.
+) Kalibera, Tomas and Jones, Richard. Rigorous Benchmarking in Reasonable Time. ISMM  2013. https://doi.org/10.1145/2555670.2464160
+Experimental evaluation is key to systems research. Because modern systems are complex and non-deterministic, good experimental methodology demands that researchers account for uncertainty. To obtain valid results, they are expected to run many iterations of benchmarks, invoke virtual machines (VMs) several times, or even rebuild VM or benchmark binaries more than once. All this repetition costs time to complete experiments. Currently, many evaluations give up on sufficient repetition or rigorous statistical methods, or even run benchmarks only in training sizes. The results reported often lack proper variation estimates and, when a small difference between two systems is reported, some are simply unreliable. In contrast, we provide a statistically rigorous methodology for repetition and summarising results that makes efficient use of experimentation time. Time efficiency comes from two key observations. First, a given benchmark on a given platform is typically prone to much less non-determinism than the common worst-case of published corner-case studies. Second, repetition is most needed where most uncertainty arises (whether between builds, between executions or between iterations). We capture experimentation cost with a novel mathematical model, which we use to identify the number of repetitions at each level of an experiment necessary and sufficient to obtain a given level of precision. We present our methodology as a cookbook that guides researchers on the number of repetitions they should run to obtain reliable results. We also show how to present results with an effect size confidence interval. As an example, we show how to use our methodology to conduct throughput experiments with the DaCapo and SPEC CPU benchmarks on three recent platforms.
+You have 42 days from the date of this email to submit your revision. If you are unable to complete the revision within this time, please contact me to request a short extension.
+You can upload your revised manuscript and submit it through your Author Center. Log into https://mc.manuscriptcentral.com/spe and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions".
+When submitting your revised manuscript, you will be able to respond to the comments made by the referee(s) in the space provided.  You can use this space to document any changes you make to the original manuscript.
+If you would like help with English language editing, or other article preparation support, Wiley Editing Services offers expert help with English Language Editing, as well as translation, manuscript formatting, and figure formatting at www.wileyauthors.com/eeo/preparation. You can also check out our resources for Preparing Your Article for general guidance about writing and preparing your manuscript at www.wileyauthors.com/eeo/prepresources.
+Once again, thank you for submitting your manuscript to Software: Practice and Experience. I look forward to receiving your revision.
+Sincerely,
+Richard
+Prof. Richard Jones
+Editor, Software: Practice and Experience
+R.E.Jones@kent.ac.uk
+Referee(s)' Comments to Author:
+Reviewing: 1
+Comments to the Author
+Overall, I felt that this draft was an improvement on previous drafts and I don't have further changes to request.
+I appreciated the new language to clarify the relationship of external and internal scheduling, for example, as well as the new measurements of Rust tokio. Also, while I still believe that the choice between thread/generator/coroutine and so forth could be made crisper and clearer, the current draft of Section 2 did seem adequate to me in terms of specifying the considerations that users would have to take into account to make the choice.
+Reviewing: 2
+Comments to the Author
+First: let me apologise for the delay on this review. I'll blame the global pandemic combined with my institution's senior management's counterproductive decisions for taking up most of my time and all of my energy.
+At this point, reading the responses, I think we've been around the course enough times that further iteration is unlikely to really improve the paper any further, so I'm happy to recommend acceptance.    My main comments are that there were some good points in the responses to *all* the reviews and I strongly encourage the authors to incorporate those discursive responses into the final paper so they may benefit readers as well as reviewers.   I agree with the recommendations of reviewer #2 that the paper could usefully be split in to two, which I think I made to a previous revision, but I'm happy to leave that decision to the Editor.
+Finally, the paper needs to describe how the Java harness was adapted to deal with warmup; why the computation has warmed up and reached a steady state - similarly for js and Python. The tables should also give the "N" chosen for each benchmark run.
+minor points
+* don't start sentences with "However"
+* most downloaded isn't an "Award"
+Date: Thu, 1 Oct 2020 05:34:29 +0000
+From: Richard Jones <onbehalfof@manuscriptcentral.com>
+Reply-To: R.E.Jones@kent.ac.uk
+To: pabuhr@uwaterloo.ca
+Subject: Revision reminder - SPE-19-0219.R2
+-Oct-2020
+Dear Dr Buhr
+SPE-19-0219.R2
+This is a reminder that your opportunity to revise and re-submit your manuscript will expire 14 days from now. If you require more time please contact me directly and I may grant an extension to this deadline, otherwise the option to submit a revision online, will not be available.
+If your article is of potential interest to the general public, (which means it must be timely, groundbreaking, interesting and impact on everyday society) then please e-mail ejp@wiley.co.uk explaining the public interest side of the research. Wiley will then investigate the potential for undertaking a global press campaign on the article.
+I look forward to receiving your revision.
+Sincerely,
+Prof. Richard Jones
+Editor, Software: Practice and Experience
+https://mc.manuscriptcentral.com/spe
+Date: Tue, 6 Oct 2020 15:29:41 +0000
+From: Mayank Roy Chowdhury <onbehalfof@manuscriptcentral.com>
+Reply-To: speoffice@wiley.com
+To: tdelisle@uwaterloo.ca, pabuhr@uwaterloo.ca
+Subject: SPE-19-0219.R3 successfully submitted
+-Oct-2020
+Dear Dr Buhr,
+Your manuscript entitled "Advanced Control-flow and Concurrency in Cforall" has been successfully submitted online and is presently being given full consideration for publication in Software: Practice and Experience.
+Your manuscript number is SPE-19-0219.R3.  Please mention this number in all future correspondence regarding this submission.
+You can view the status of your manuscript at any time by checking your Author Center after logging into https://mc.manuscriptcentral.com/spe.  If you have difficulty using this site, please click the 'Get Help Now' link at the top right corner of the site.
+Thank you for submitting your manuscript to Software: Practice and Experience.
+Sincerely,
+Software: Practice and Experience Editorial Office

doc/refrat/refrat.tex

-              r33c3ded
+              r223a633
 %% Created On       : Wed Apr  6 14:52:25 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Wed Jan 31 17:30:23 2018
 %% Update Count     : 108
+%% Last Modified On : Mon Oct  5 09:02:53 2020
+%% Update Count     : 110
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \usepackage{upquote}                                                                    % switch curled `'" to straight
 \usepackage{calc}
-\usepackage{xspace}
 \usepackage{varioref}                                                                   % extended references
-\usepackage{listings}                                                                   % format program code
 \usepackage[flushmargin]{footmisc}                                              % support label/reference in footnote
 \usepackage{latexsym}                                   % \Box glyph
 \usepackage{mathptmx}                                   % better math font with "times"
 \usepackage[usenames]{color}
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\renewcommand{\UrlFont}{\small\sf}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®}#1}}
+{}
+\newcommand{\CFALatin}{}
 % inline code ©...© (copyright symbol) emacs: C-q M-)
 % red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
 …
 % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
 % math escape $...$ (dollar symbol)
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\renewcommand{\UrlFont}{\small\sf}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®},#1}}
+{}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Names used in the document.
 \newcommand{\Version}{\input{../../version}}
+\newcommand{\Version}{\input{build/version}}
 \newcommand{\Textbf}[2][red]{{\color{#1}{\textbf{#2}}}}
 \newcommand{\Emph}[2][red]{{\color{#1}\textbf{\emph{#2}}}}

doc/theses/andrew_beach_MMath/thesis.tex

-              r33c3ded
+              r223a633
 \usepackage[toc,abbreviations]{glossaries-extra}
+% Main glossary entries -- definitions of relevant terminology
+\newglossaryentry{computer}
+{
+name=computer,
+description={A programmable machine that receives input data,
+               stores and manipulates the data, and provides
+               formatted output}
+}
+% Nomenclature glossary entries -- New definitions, or unusual terminology
+\newglossary*{nomenclature}{Nomenclature}
+\newglossaryentry{dingledorf}
+{
+type=nomenclature,
+name=dingledorf,
+description={A person of supposed average intelligence who makes incredibly
+               brainless misjudgments}
+}
+% List of Abbreviations (abbreviations are from the glossaries-extra package)
+\newabbreviation{aaaaz}{AAAAZ}{American Association of Amateur Astronomers
+               and Zoologists}
+% List of Symbols
+\newglossary*{symbols}{List of Symbols}
+\newglossaryentry{rvec}
+{
+name={$\mathbf{v}$},
+sort={label},
+type=symbols,
+description={Random vector: a location in n-dimensional Cartesian space, where
+               each dimensional component is determined by a random process}
+}
+% Define all the glossaries.
+\input{glossaries}
 % Generate the glossaries defined above.

doc/theses/fangren_yu_COOP_S20/Makefile

r33c3ded	r223a633
46	46	# File Dependencies #
47	47
48
49	48	${DOCUMENT} : ${BASE}.ps
50	49	ps2pdf $<

doc/theses/fangren_yu_COOP_S20/Report.tex

-              r33c3ded
+              r223a633
 \documentclass[twoside,12pt]{article}
+\documentclass[twoside,11pt]{article}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \usepackage[labelformat=simple,aboveskip=0pt,farskip=0pt]{subfig}
 \renewcommand{\thesubfigure}{\alph{subfigure})}
+\usepackage[flushmargin]{footmisc}                                              % support label/reference in footnote
 \usepackage{latexsym}                                   % \Box glyph
 \usepackage{mathptmx}                                   % better math font with "times"
+\usepackage[toc]{appendix}                                                              % article does not have appendix
 \usepackage[usenames]{color}
 \input{common}                                          % common CFA document macros
 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
 \usepackage{breakurl}
+\urlstyle{sf}
+% reduce spacing
+\setlist[itemize]{topsep=5pt,parsep=0pt}% global
+\setlist[enumerate]{topsep=5pt,parsep=0pt}% global
 \usepackage[pagewise]{lineno}
 …
 \renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
 \newcommand{\NOTE}{\textbf{NOTE}}
+\newcommand{\TODO}[1]{{\color{Purple}#1}}
 \setlength{\topmargin}{-0.45in}                                                 % move running title into header
 …
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \CFADefaults
+\CFAStyle                                                                                               % CFA code-style for all languages
 \lstset{
+language=C++,                                                                                   % make C++ the default language
+escapechar=\$,                                                                                  % LaTeX escape in CFA code
+moredelim=**[is][\color{red}]{`}{`},
+language=C++,moredelim=**[is][\color{red}]{@}{@}                % make C++ the default language
 }% lstset
-\lstMakeShortInline@%
 \lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`},#1}}
+{}
+{\lstset{language=C++,moredelim=**[is][\color{red}]{@}{@}}\lstset{#1}}{}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \section{Overview}
+cfa-cc is the reference compiler for the \CFA programming language, which is a non-
+object-oriented extension to C.
+\CFA attempts to introduce productive modern programming language features to C
+while maintaining as much backward-compatibility as possible, so that most existing C
+programs can seamlessly work with \CFA.
+Since the \CFA project was dated back to the early 2000s, and only restarted in the past
+few years, there is a significant amount of legacy code in the current compiler codebase,
+with little proper documentation available. This becomes a difficulty while developing new
+features based on the previous implementations, and especially while diagnosing
+problems.
+Currently, the \CFA team is also facing another problem: bad compiler performance. For
+the development of a new programming language, writing a standard library is an
+important part. The incompetence of the compiler causes building the library files to take
+tens of minutes, making iterative development and testing almost impossible. There is
+ongoing effort to rewrite the core data structure of the compiler to overcome the
+performance issue, but many bugs may appear during the work, and lack of documentation
+makes debugging extremely difficult.
+This developer's reference will be continuously improved and eventually cover the
+compiler codebase. For now, the focus is mainly on the parts being rewritten, and also the
+performance bottleneck, namely the resolution algorithm. It is aimed to provide new
+developers to the project enough guidance and clarify the purposes and behavior of certain
+functions which are not mentioned in the previous \CFA research papers.
+@cfa-cc@ is the reference compiler for the \CFA programming language, which is a non-object-oriented extension to C.
+\CFA attempts to introduce productive modern programming language features to C while maintaining as much backward-compatibility as possible, so that most existing C programs can seamlessly work with \CFA.
+Since the \CFA project dates back to the early 2000s, and only restarted in the past few years, there is a significant amount of legacy code in the current compiler codebase with little documentation.
+The lack of documentation makes it difficult to develop new features from the current implementation and diagnose problems.
+Currently, the \CFA team is also facing poor compiler performance.
+For the development of a new programming language, writing standard libraries is an important component.
+The slow compiler causes building of the library files to take tens of minutes, making iterative development and testing almost impossible.
+There is an ongoing effort to rewrite the core data-structure of the compiler to overcome the performance issue, but many bugs have appeared during this work, and lack of documentation is hampering debugging.
+This developer's reference manual begins the documentation and should be continuously im\-proved until it eventually covers the entire compiler codebase.
+For now, the focus is mainly on the parts being rewritten, and also the primary performance bottleneck, namely the resolution algorithm.
+Its aim is to provide new project developers with guidance in understanding the codebase, and to clarify the purpose and behaviour of certain functions that are not mentioned in the previous \CFA research papers~\cite{Bilson03,Ditchfield92,Moss19}.
 \section{Compiler Framework}
+\CFA source code is first transformed into an abstract syntax tree (AST) by the parser before being analyzed by the compiler.
 \subsection{AST Representation}
+Source code input is first transformed into abstract syntax tree (AST) representation by the
+parser before analyzed by the compiler.
+There are 4 major categories of AST nodes used by the compiler, along with some derived
+structures.
+\subsubsection{Declaration nodes}
+There are 4 major categories of AST nodes used by the compiler, along with some derived structures.
+\subsubsection{Declaration Nodes}
 A declaration node represents one of:
 \begin{itemize}
 \item
 Type declaration: struct, union, typedef or type parameter (see Appendix A.3)
 \item
 Variable declaration
 \item
 Function declaration
+type declaration: @struct@, @union@, @typedef@ or type parameter (see \VRef[Appendix]{s:KindsTypeParameters})
+\item
+variable declaration
+\item
+function declaration
 \end{itemize}
 Declarations are introduced by standard C declarations, with the usual scoping rules.
+In addition, declarations can also be introduced by the forall clause (which is the origin
+of \CFA's name):
+In addition, declarations can also be qualified by the \lstinline[language=CFA]@forall@ clause (which is the origin of \CFA's name):
 \begin{cfa}
 forall (<$\emph{TypeParameterList}$> | <$\emph{AssertionList}$>)
+forall ( <$\emph{TypeParameterList}$> | <$\emph{AssertionList}$> )
         $\emph{declaration}$
 \end{cfa}
 Type parameters in \CFA are similar to \CC template type parameters. The \CFA
 declaration
+Type parameters in \CFA are similar to \CC template type parameters.
+The \CFA declaration
 \begin{cfa}
 forall (dtype T) ...
 \end{cfa}
 behaves similarly as the \CC template declaration
+behaves similarly to the \CC template declaration
 \begin{C++}
 template <typename T> ...
 \end{C++}
+Assertions are a distinctive feature of \CFA: contrary to the \CC template where
+arbitrary functions and operators can be used in a template definition, in a \CFA
+parametric function, operations on parameterized types must be declared in assertions.
+Assertions are a distinctive feature of \CFA, similar to \emph{interfaces} in D and Go, and \emph{traits} in Rust.
+Contrary to the \CC template where arbitrary functions and operators can be used in a template definition, in a \CFA parametric function, operations on parameterized types must be declared in assertions.
 Consider the following \CC template:
 \begin{C++}
 template <typename T> int foo(T t) {
         return bar(t) + baz(t);
+@template@<typename T> T foo( T t ) {
+        return t + t * t;
+}
 \end{C++}
+Unless bar and baz are also parametric functions taking any argument type, they must be
+declared in the assertions, or otherwise the code will not compile:
+where there are no explicit requirements on the type @T@.
+Therefore, the \CC compiler must deduce what operators are required during textual (macro) expansion of the template at each usage.
+As a result, templates cannot be separately compiled.
+\CFA assertions specify restrictions on type parameters:
 \begin{cfa}
 forall (dtype T | { int bar(T); int baz(T); }) int foo (T t) {
         return bar(t) + baz(t);
+forall( dtype T | @{ T ?+?( T, T ); T ?*?( T, T ) }@ ) T foo ( T t ) {
+        return t + t * t;
+}
 \end{cfa}
+Assertions are written using the usual function declaration syntax. The scope of type
+parameters and assertions is the following declaration.
+\subsubsection{Type nodes}
+A type node represents the type of an object or expression.
+Named types reference the corresponding type declarations. The type of a function is its
+function pointer type (same as standard C).
+With the addition of type parameters, named types may contain a list of parameter values
+(actual parameter types).
+\subsubsection{Statement nodes}
+Statement nodes represent the statements in the program, including basic expression
+statements, control flows and blocks.
+Assertions are written using the usual \CFA function declaration syntax.
+Only types with operators ``@+@'' and ``@*@'' work with this function, and the function prototype is sufficient to allow separate compilation.
+Type parameters and assertions are used in the following compiler data-structures.
+\subsubsection{Type Nodes}
+Type nodes represent the type of an object or expression.
+Named types reference the corresponding type declarations.
+The type of a function is its function pointer type (same as standard C).
+With the addition of type parameters, named types may contain a list of parameter values (actual parameter types).
+\subsubsection{Statement Nodes}
+Statement nodes represent the executable statements in the program, including basic expression statements, control flows and blocks.
 Local declarations (within a block statement) are represented as declaration statements.
+\subsubsection{Expression nodes}
+Some expressions are represented differently in the compiler before and after resolution
 stage:
+\subsubsection{Expression Nodes}
+Some expressions are represented differently before and after the resolution stage:
 \begin{itemize}
 \item
+Name expressions: NameExpr pre-resolution, VariableExpr post-resolution
+\item
+Member expressions: UntypedMemberExpr pre-resolution, MemberExpr post-resolution
+\item
+Function call expressions (including overloadable operators): UntypedExpr pre-resolution, ApplicationExpr post-resolution
+Name expressions: @NameExpr@ pre-resolution, @VariableExpr@ post-resolution
+\item
+Member expressions: @UntypedMemberExpr@ pre-resolution, @MemberExpr@ post-resolution
+\item
+\begin{sloppypar}
+Function call expressions (including overloadable operators): @UntypedExpr@ pre-resolution, @ApplicationExpr@ post-resolution
+\end{sloppypar}
 \end{itemize}
 The pre-resolution representations contain only the symbols. Post-resolution results link
 them to the actual variable and function declarations.
+The pre-resolution representation contains only the symbols.
+Post-resolution links them to the actual variable and function declarations.
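The difference between the two representations can be illustrated with a minimal C++ sketch.
All names below (\texttt{VarDecl}, \texttt{NameExprSketch}, \texttt{VariableExprSketch}, \texttt{Scope}, \texttt{resolve\_name}) are hypothetical stand-ins invented for this illustration and are not the cfa-cc classes; the sketch also assumes a single flat scope and ignores overloading, which the real resolver must handle.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a variable declaration node.
struct VarDecl { std::string name; std::string type; };

// Pre-resolution: the expression carries only the symbol.
struct NameExprSketch { std::string name; };

// Post-resolution: the expression points at the declaration it denotes.
struct VariableExprSketch { const VarDecl* decl; };

// Assumption: one flat scope mapping each name to a single declaration.
using Scope = std::map<std::string, VarDecl>;

// Look the symbol up; decl is nullptr when the name is undeclared.
VariableExprSketch resolve_name(const NameExprSketch& e, const Scope& scope) {
    auto it = scope.find(e.name);
    return VariableExprSketch{ it == scope.end() ? nullptr : &it->second };
}
```

After resolution, later stages can read the type directly from the linked declaration instead of repeating the lookup.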
 \subsection{Compilation Passes}
+Compilation steps are implemented as passes, which follows a general structural recursion
+pattern on the syntax tree.
+The basic work flow of compilation passes follows preorder and postorder traversal on
+tree data structure, implemented with visitor pattern, and can be loosely described with
+the following pseudocode:
+\begin{C++}
+Pass::visit (node_t node) {
+        previsit(node);
+        if (visit_children)
+Compilation steps are implemented as passes, which follow a general structural recursion pattern on the syntax tree.
+The basic workflow of a compilation pass is a combined preorder and postorder traversal of the AST data-structure, implemented with the visitor pattern, and can be loosely described with the following pseudocode:
+\begin{C++}
+Pass::visit( node_t node ) {
+        previsit( node );
+        if ( visit_children )
                 for each child of node:
                         child.accept(this);
         postvisit(node);
+                        child.accept( this );
+        postvisit( node );
+}
 \end{C++}
+Operations in previsit() happen in preorder (top to bottom) and operations in
+postvisit() happen in postorder (bottom to top). The precise order of recursive
+operations on child nodes can be found in @Common/PassVisitor.impl.h@ (old) and
+@AST/Pass.impl.hpp@ (new).
+Implementations of compilation passes need to follow certain conventions:
+Operations in @previsit@ happen in preorder (top to bottom) and operations in @postvisit@ happen in postorder (bottom to top).
+The precise order of recursive operations on child nodes can be found in @Common/PassVisitor.impl.h@ (old) and @AST/Pass.impl.hpp@ (new).
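The traversal order can be checked with a small self-contained C++ sketch.
It is only an illustration under simplifying assumptions: \texttt{Node}, \texttt{TracePass} and \texttt{trace\_small\_tree} are hypothetical names, the tree has a single node type, and the pass recurses directly rather than through an \texttt{accept} call as in the pseudocode.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature tree node; not a cfa-cc AST class.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string n) : name(std::move(n)) {}
};

// Records the order in which previsit and postvisit fire.
struct TracePass {
    std::vector<std::string> log;
    bool visit_children = true;   // previsit may clear this to skip the children

    void previsit(const Node& n)  { log.push_back("pre:" + n.name); }
    void postvisit(const Node& n) { log.push_back("post:" + n.name); }

    void visit(const Node& n) {
        visit_children = true;    // reset for every node
        previsit(n);              // preorder: before the children
        if (visit_children)
            for (const auto& c : n.children) visit(*c);
        postvisit(n);             // postorder: after the children
    }
};

// Builds the tree a(b, c) and returns the recorded visit order.
std::vector<std::string> trace_small_tree() {
    Node root("a");
    root.children.push_back(std::make_unique<Node>("b"));
    root.children.push_back(std::make_unique<Node>("c"));
    TracePass pass;
    pass.visit(root);
    return pass.log;
}
```

For the tree a(b, c) the recorded order is pre:a, pre:b, post:b, pre:c, post:c, post:a, so a parent's previsit runs before any of its children and its postvisit runs after all of them.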
+Implementations of compilation passes follow certain conventions:
 \begin{itemize}
 \item
+Passes \textbf{should not} directly override the visit method (Non-virtual Interface
+principle); if a pass desires different recursion behavior, it should set
+@visit_children@ to false and perform recursive calls manually within previsit or
+postvisit procedures. To enable this option, inherit from @WithShortCircuiting@ mixin.
+\item
+previsit may mutate the node but \textbf{must not} change the node type or return null.
+\item
+postvisit may mutate the node, reconstruct it to a different node type, or delete it by
+returning null.
+Passes \textbf{should not} directly override the visit method (Non-virtual Interface principle);
+if a pass desires different recursion behaviour, it should set @visit_children@ to false and perform recursive calls manually within previsit or postvisit procedures.
+To enable this option, inherit from the @WithShortCircuiting@ mixin.
+\item
+previsit may mutate the node but \textbf{must not} change the node type or return @nullptr@.
+\item
+postvisit may mutate the node, reconstruct it to a different node type, or delete it by returning @nullptr@.
 \item
 If the previsit or postvisit method is not defined for a node type, the step is skipped.
 If the return type is declared as void, the original node is returned by default. These
+behaviors are controlled by template specialization rules; see
 @Common/PassVisitor.proto.h@ (old) and @AST/Pass.proto.hpp@ (new) for details.
+If the return type is declared as @void@, the original node is returned by default.
+These behaviours are controlled by template specialization rules;
+see @Common/PassVisitor.proto.h@ (old) and @AST/@ @Pass.proto.hpp@ (new) for details.
 \end{itemize}
 Other useful mixin classes for compilation passes include:
 \begin{itemize}
 \item
+WithGuards allows saving values of variables and restore automatically upon exiting
+the current node.
+\item
+WithVisitorRef creates a wrapped entity of current pass (the actual argument
+passed to recursive calls internally) for explicit recursion, usually used together
+with WithShortCircuiting.
+\item
+WithSymbolTable gives a managed symbol table with built-in scoping rule handling
+(\eg on entering and exiting a block statement)
+@WithGuards@ allows saving and restoring variable values automatically upon entering/exiting the current node.
+\item
+@WithVisitorRef@ creates a wrapped entity for the current pass (the actual argument passed to recursive calls internally) for explicit recursion, usually used together with @WithShortCircuiting@.
+\item
+@WithSymbolTable@ gives a managed symbol table with built-in scoping-rule handling (\eg on entering and exiting a block statement)
 \end{itemize}
+\NOTE: If a pass extends the functionality of another existing pass, due to \CC overloading
+resolution rules, it \textbf{must} explicitly introduce the inherited previsit and postvisit procedures
+to its own scope, or otherwise they will not be picked up by template resolution:
+\NOTE: If a pass extends the functionality of another existing pass, due to \CC overloading resolution rules, it \textbf{must} explicitly introduce the inherited previsit and postvisit procedures to its own scope, or otherwise they are not picked up by template resolution:
 \begin{C++}
 class Pass2: public Pass1 {
         using Pass1::previsit;
         using Pass1::postvisit;
+        @using Pass1::previsit;@
+        @using Pass1::postvisit;@
         // new procedures
+}
 …
+\subsection{Data Structure Change WIP (new-ast)}
+It has been observed that excessive copying of syntax tree structures accounts for a
+majority of computation cost and significantly slows down the compiler. In the previous
+implementation of the syntax tree, every internal node has a unique parent; therefore all
+copies are required to duplicate everything down to the bottom. A new, experimental
+re-implementation of the syntax tree (source under directory AST/ hereby referred to as
+``new-ast'') attempts to overcome this issue with a functional approach that allows sharing
+of common sub-structures and only makes copies when necessary.
+The core of new-ast is a customized implementation of smart pointers, similar to
+@std::shared_ptr@ and @std::weak_ptr@ in \CC standard library. Reference counting is
+used to detect sharing and allows optimization. For a purely functional (a.k.a. immutable)
+data structure, all mutations are modelled by shallow copies along the path of mutation.
+\subsection{Data Structure Change (new-ast)}
+It has been observed that excessive copying of syntax tree structures accounts for a majority of computation cost and significantly slows down the compiler.
+In the previous implementation of the syntax tree, every internal node has a unique parent;
+therefore all copies are required to duplicate the entire subtree.
+A new, experimental re-implementation of the syntax tree (source under directory @AST/@ hereby referred to as ``new-ast'') attempts to overcome this issue with a functional approach that allows sharing of common sub-structures and only makes copies when necessary.
+The core of new-ast is a customized implementation of smart pointers, similar to @std::shared_ptr@ and @std::weak_ptr@ in the \CC standard library.
+Reference counting is used to detect sharing and allows certain optimizations.
+For a purely functional (immutable) data-structure, all mutations are modelled by shallow copies along the path of mutation.
 With reference counting optimization, unique nodes are allowed to be mutated in place.
+This however, may potentially introduce some complications and bugs; a few issues are
+discussed near the end of this section.
+\subsubsection{Source: AST/Node.hpp}
+class @ast::Node@ is the base class of all new-ast node classes, which implements
+reference counting mechanism. Two different counters are recorded: ``strong'' reference
+count for number of nodes semantically owning it; ``weak'' reference count for number of
+nodes holding a mere reference and only need to observe changes.
+class @ast::ptr_base@ is the smart pointer implementation and also takes care of
+resource management.
+Direct access through the smart pointer is read-only. A mutable access should be obtained
+by calling shallowCopy or mutate as below.
+Currently, the weak pointers are only used to reference declaration nodes from a named
+type, or a variable expression. Since declaration nodes are intended to denote unique
+entities in the program, weak pointers always point to unique (unshared) nodes. This may
+change in the future, and weak references to shared nodes may introduce some problems;
+This, however, may introduce some complications and bugs;
+a few issues are discussed near the end of this section.
+\subsubsection{Source: \lstinline{AST/Node.hpp}}
+Class @ast::Node@ is the base class of all new-ast node classes, and implements the reference-counting mechanism.
+Two different counters are recorded: a ``strong'' reference count for the number of nodes semantically owning it;
+a ``weak'' reference count for the number of nodes holding a mere reference that only need to observe changes.
+Class @ast::ptr_base@ is the smart pointer implementation and also takes care of resource management.
+Direct access through the smart pointer is read-only.
+Mutable access should be obtained by calling @shallowCopy@ or @mutate@ as below.
+Currently, the weak pointers are only used to reference declaration nodes from a named type, or a variable expression.
+Since declaration nodes are intended to denote unique entities in the program, weak pointers always point to unique (unshared) nodes.
+This property may change in the future, and weak references to shared nodes may introduce some problems;
 see mutate function below.
+All node classes should always use smart pointers in the structure and should not use raw
+pointers.
+All node classes should always use smart pointers, rather than raw pointers, in structure definitions.
+Function
 \begin{C++}
 void ast::Node::increment(ref_type ref)
 \end{C++}
+Increments this node's strong or weak reference count.
+increments this node's strong or weak reference count.
+Function
 \begin{C++}
 void ast::Node::decrement(ref_type ref, bool do_delete = true)
 \end{C++}
+Decrements this node's strong or weak reference count. If strong reference count reaches
+zero, the node is deleted by default.
+\NOTE: Setting @do_delete@ to false may result in a detached node. Subsequent code should
+manually delete the node or assign it to a strong pointer to prevent memory leak.
+decrements this node's strong or weak reference count.
+If the strong reference count reaches zero, the node is deleted.
+\NOTE: Setting @do_delete@ to false may result in a detached node.
+Subsequent code should manually delete the node or assign it to a strong pointer to prevent a memory leak.
 Reference counting functions are internally called by @ast::ptr_base@.
+Function
 \begin{C++}
 template<typename node_t>
 node_t * shallowCopy(const node_t * node)
 \end{C++}
+Returns a mutable, shallow copy of node: all child pointers are pointing to the same child
+nodes.
+returns a mutable, shallow copy of node: all child pointers are pointing to the same child nodes.
+Function
 \begin{C++}
 template<typename node_t>
 node_t * mutate(const node_t * node)
 \end{C++}
+If node is unique (strong reference count is 1), returns a mutable pointer to the same node.
+Otherwise, returns shallowCopy(node).
+It is an error to mutate a shared node that is weak-referenced. Currently this does not
+happen. The problem may appear once weak pointers to shared nodes (\eg expression
+nodes) are used; special care will be needed.
+\NOTE: This naive uniqueness check may not be sufficient in some cases. A discussion of the
+issue is presented at the end of this section.
+returns a mutable pointer to the same node, if the node is unique (strong reference count is 1);
+otherwise, it returns @shallowCopy(node)@.
+It is an error to mutate a shared node that is weak-referenced.
+Currently this does not happen.
+A problem may appear once weak pointers to shared nodes (\eg expression nodes) are used;
+special care is needed.
+\NOTE: This naive uniqueness check may not be sufficient in some cases.
+A discussion of the issue is presented at the end of this section.
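The copy-on-write rule described for @mutate@ can be sketched as follows. This is a minimal illustration and not the code in @AST/Node.hpp@: the type @ToyNode@ and the functions @toyShallowCopy@ and @toyMutate@ are hypothetical names, weak references are omitted, and the strong count is a plain integer instead of being managed by @ast::ptr_base@.

```cpp
#include <cassert>

// Hypothetical, simplified node: one payload, one shared child, one strong count.
struct ToyNode {
    int value = 0;
    const ToyNode * child = nullptr;  // read-only access to the (possibly shared) child
    mutable int strong_count = 0;     // number of owners of this node
};

// Shallow copy: the child pointer is copied, the child node itself is not.
// The child gains one more owner (the copy); the copy has no owner yet.
ToyNode * toyShallowCopy(const ToyNode * node) {
    ToyNode * copy = new ToyNode(*node);
    copy->strong_count = 0;
    if (copy->child) ++copy->child->strong_count;
    return copy;
}

// A node with exactly one owner is mutated in place; otherwise a copy is made.
ToyNode * toyMutate(const ToyNode * node) {
    if (node->strong_count == 1) return const_cast<ToyNode *>(node);
    return toyShallowCopy(node);
}
```

With one owner, @toyMutate@ returns the original pointer; with two owners it returns a fresh node whose @child@ still points at the same child, so only the node on the mutation path is duplicated.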
+Functions
 \begin{C++}
 template<typename node_t, typename parent_t, typename field_t, typename assn_t>
 const node_t * mutate_field(const node_t * node, field_t parent_t::*field, assn_t && val)
+const node_t * mutate_field(const node_t * node, field_t parent_t::* field, assn_t && val)
 \end{C++}
 \begin{C++}
 …
                 field_t && val)
 \end{C++}
+Helpers for mutating a field on a node using pointer to member (creates shallow copy
+when necessary).
 \subsubsection{Issue: Undetected sharing}
 The @mutate@ behavior described above has a problem: deeper shared nodes may be
+are helpers for mutating a field on a node using a pointer to member (a shallow copy is created when necessary).
+\subsubsection{Issue: Undetected Sharing}
+The @mutate@ behaviour described above has a problem: deeper shared nodes may be
 mistakenly considered as unique. \VRef[Figure]{f:DeepNodeSharing} shows how the problem could arise:
 \begin{figure}
 …
 \label{f:DeepNodeSharing}
 \end{figure}
+Suppose that we are working on the tree rooted at P1, which
+is logically the chain P1-A-B and P2 is irrelevant, and then
+mutate(B) is called. The algorithm considers B as unique since
+it is only directly owned by A. However, the other tree P2-A-B
+indirectly shares the node B and is therefore wrongly mutated.
+To partly address this problem, if the mutation is called higher up the tree, a chain
+mutation helper can be used:
+\subsubsection{Source: AST/Chain.hpp}
+Given the tree rooted at P1, which is logically the chain P1-A-B, and P2 is irrelevant, assume @mutate(B)@ is called.
+The algorithm considers B as unique since it is only directly owned by A.
+However, the other tree P2-A-B indirectly shares the node B and is therefore wrongly mutated.
+To partly address this problem, if the mutation is called higher up the tree, a chain mutation helper can be used.
+\subsubsection{Source: \lstinline{AST/Chain.hpp}}
+Function
 \begin{C++}
 template<typename node_t, Node::ref_type ref_t>
 auto chain_mutate(ptr_base<node_t, ref_t> & base)
 \end{C++}
+This function returns a chain mutator handle which takes pointer-to-member to go down
+the tree while creating shallow copies as necessary; see @struct _chain_mutator@ in the
+source code for details.
+For example, in the above diagram, if mutation of B is wanted while at P1, the call using
+@chain_mutate@ looks like the following:
+returns a chain mutator handle that takes pointer-to-member to go down the tree, while creating shallow copies as necessary;
+see @struct _chain_mutator@ in the source code for details.
+For example, in the above diagram, if mutation of B is wanted while at P1, the call using @chain_mutate@ looks like the following:
 \begin{C++}
 chain_mutate(P1.a)(&A.b) = new_value_of_b;
 \end{C++}
+Note that if some node in chain mutate is shared (therefore shallow copied), it implies that
+every node further down will also be copied, thus correctly executing the functional
+mutation algorithm. This example code creates copies of both A and B and performs
+mutation on the new nodes, so that the other tree P2-A-B is untouched.
+However, if a pass traverses down to node B and performs mutation, for example, in
+@postvisit(B)@, information on sharing higher up is lost. Since the new-ast structure is only in
+experimental use with the resolver algorithm, which mostly rebuilds the tree bottom-up,
+this issue does not actually happen. It should be addressed in the future when other
+compilation passes are migrated to new-ast and many of them contain procedural
+mutations, where it might cause accidental mutations to other logically independent trees
+(\eg common sub-expression) and become a bug.
+\vspace*{20pt} % FIX ME, spacing problem with this heading ???
+\NOTE: if some node in chain mutate is shared (therefore shallow copied), it implies that every node further down is also copied, thus correctly executing the functional mutation algorithm.
+This example code creates copies of both A and B and performs mutation on the new nodes, so that the other tree P2-A-B is untouched.
+However, if a pass traverses down to node B and performs mutation, for example, in @postvisit(B)@, information on sharing higher up is lost.
+Since the new-ast structure is only in experimental use with the resolver algorithm, which mostly rebuilds the tree bottom-up, this issue does not actually happen.
+It should be addressed in the future when other compilation passes are migrated to new-ast and many of them contain procedural mutations, where it might cause accidental mutations to other logically independent trees (\eg common sub-expression) and become a bug.
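The undetected-sharing problem and the effect of mutating from the root can be reproduced with standard-library smart pointers. The sketch below is an analogy only: it uses @std::shared_ptr@ rather than @ast::ptr_base@, and the types @P@, @A@, @B@ and the functions @looks_unique@ and @set_b_from_root@ are hypothetical stand-ins for the nodes in the figure and for a chain mutation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the nodes in the figure: P owns A, A owns B.
struct B { int value = 0; };
struct A { std::shared_ptr<B> b; };
struct P { std::shared_ptr<A> a; };

// Naive uniqueness check: only the direct owner count of the node is examined.
bool looks_unique(const std::shared_ptr<B> & node) {
    return node.use_count() == 1;
}

// Mutation started at the root: once a shared node is found on the path,
// that node and every node below it on the path are shallow copied.
void set_b_from_root(P & root, int new_value) {
    bool copied = false;
    if (root.a.use_count() > 1) {
        root.a = std::make_shared<A>(*root.a);  // shallow copy of A
        copied = true;
    }
    if (copied || root.a->b.use_count() > 1) {
        root.a->b = std::make_shared<B>(*root.a->b);  // shallow copy of B
    }
    root.a->b->value = new_value;
}
```

If two roots share the same @A@, @looks_unique@ still reports the @B@ below it as unique, which is the failure described above; @set_b_from_root@ copies both @A@ and @B@, so the other root keeps observing the old value.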
 \section{Compiler Algorithm Documentation}
+This documentation currently covers most of the resolver, data structures used in variable
+and expression resolution, and a few directly related passes. Later passes involving code
+generation is not included yet; documentation for those will be done afterwards.
+This compiler algorithm documentation covers most of the resolver, data structures used in variable and expression resolution, and a few directly related passes.
+Later passes involving code generation are not included yet;
+documentation for those will be done later.
 \subsection{Symbol Table}
+\NOTE: For historical reasons, the symbol table data structure was called ``indexer'' in the
+old implementation. Hereby we will be using the name SymbolTable everywhere.
+The symbol table stores a mapping from names to declarations and implements a similar
+name space separation rule, and the same scoping rules in standard C.\footnote{ISO/IEC 9899:1999, Sections 6.2.1 and 6.2.3} The difference in
+name space rule is that typedef aliases are no longer considered ordinary identifiers.
+In addition to C tag types (struct, union, enum), \CFA introduces another tag type, trait,
+which is a named collection of assertions.
+\subsubsection{Source: AST/SymbolTable.hpp}
+\subsubsection{Source: SymTab/Indexer.h}
+\NOTE: For historical reasons, the symbol-table data-structure is called @indexer@ in the old implementation.
+Hereafter, the name @SymbolTable@ is used.
+The symbol table stores a mapping from names to declarations, implements a similar name-space separation rule, and provides the same scoping rules as standard C.\footnote{ISO/IEC 9899:1999, Sections 6.2.1 and 6.2.3.}
+The difference in name-space rule is that @typedef@ aliases are no longer considered ordinary identifiers.
+In addition to C tag-types (@struct@, @union@, @enum@), \CFA introduces another tag type, @trait@, which is a named collection of assertions.
+\subsubsection{Source: \lstinline{AST/SymbolTable.hpp}}
+Function
 \begin{C++}
 SymbolTable::addId(const DeclWithType * decl)
 \end{C++}
+Since \CFA allows overloading of variables and functions, ordinary identifier names need
+to be mangled. The mangling scheme is closely based on the Itanium \CC ABI,\footnote{\url{https://itanium-cxx-abi.github.io/cxx-abi/abi.html}, Section 5.1} while
+making adaptations to \CFA specific features, mainly assertions and overloaded variables
+by type. Naming conflicts are handled by mangled names; lookup by name returns a list of
 declarations with the same literal identifier name.
+provides name mangling of identifiers, since \CFA allows overloading of variables and functions.
+The mangling scheme is closely based on the Itanium \CC ABI,\footnote{\url{https://itanium-cxx-abi.github.io/cxx-abi/abi.html}, Section 5.1} with adaptations for \CFA-specific features, mainly assertions and variables overloaded by type.
+Naming conflicts are handled by mangled names;
+lookup by name returns a list of declarations with the same identifier name.
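The interaction between overloading, mangled names, and scoping can be sketched with a small scoped table. This is an illustrative model, not the interface of @AST/SymbolTable.hpp@: the class @ToyTable@, the struct @ToyDecl@, and the mangled strings are hypothetical, and the rule that an inner declaration hides an outer one with the same mangled name is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical declaration record: source name plus a made-up mangled name.
struct ToyDecl { std::string name; std::string mangled; };

class ToyTable {
    // One map per open scope: name -> (mangled name -> declaration).
    using Overloads = std::unordered_map<std::string, ToyDecl>;
    std::vector<std::unordered_map<std::string, Overloads>> scopes;
public:
    ToyTable() : scopes(1) {}
    void enterScope() { scopes.emplace_back(); }
    void leaveScope() { scopes.pop_back(); }

    // Returns false on a conflict: same mangled name already in the innermost scope.
    bool addId(const ToyDecl & decl) {
        return scopes.back()[decl.name].emplace(decl.mangled, decl).second;
    }

    // Returns every visible declaration with this name, innermost scope first.
    std::vector<ToyDecl> lookupId(const std::string & name) const {
        std::vector<ToyDecl> found;
        std::unordered_set<std::string> seen;
        for (auto scope = scopes.rbegin(); scope != scopes.rend(); ++scope) {
            auto it = scope->find(name);
            if (it == scope->end()) continue;
            for (const auto & entry : it->second)
                if (seen.insert(entry.first).second) found.push_back(entry.second);
        }
        return found;
    }
};
```

Two overloads of the same name coexist because their mangled names differ, while a second declaration with an identical mangled name in the same scope is reported as a conflict.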
+Functions
 \begin{C++}
 SymbolTable::addStruct(const StructDecl * decl)
 …
 SymbolTable::addTrait(const TraitDecl * decl)
 \end{C++}
+Adds a tag type declaration to the symbol table.
+add a tag-type declaration to the symbol table.
+Function
 \begin{C++}
 SymbolTable::addType(const NamedTypeDecl * decl)
 \end{C++}
+Adds a typedef alias to the symbol table.
+\textbf{C Incompatibility Note}: Since Cforall allows using struct, union and enum type names
+without the keywords, typedef names and tag type names cannot be disambiguated by
+syntax rules. Currently the compiler puts them together and disallows collision. The
+following program is valid C but not valid Cforall:
+adds a @typedef@ alias to the symbol table.
+\textbf{C Incompatibility Note}: Since \CFA allows using @struct@, @union@ and @enum@ type-names without a prefix keyword, as in \CC, @typedef@ names and tag-type names cannot be disambiguated by syntax rules.
+Currently the compiler puts them together and disallows collision.
+The following program is valid C but invalid \CFA (and \CC):
 \begin{C++}
 struct A {};
+typedef int A; // gcc: ok, cfa: Cannot redefine typedef A
+struct A sa; // C disambiguates via struct prefix
+A ia;
+\end{C++}
+In practice, such usage is extremely rare, and hence, this change (as in \CC) has minimal impact on existing C programs.
+The declaration
+\begin{C++}
+struct A {};
+typedef struct A A; // A is an alias for struct A
+A a;
+struct A b;
+\end{C++}
+is not an error because the alias name is identical to the original.
+Finally, the following program is allowed in \CFA:
+\begin{C++}
 typedef int A;
+// gcc: ok, cfa: Cannot redefine typedef A
+\end{C++}
+In actual practices however, such usage is extremely rare, and typedef struct A A; is
+not considered an error, but silently discarded. Therefore, we expect this change to have
+minimal impact on existing C programs.
+Meanwhile, the following program is allowed in Cforall:
+\begin{C++}
+typedef int A;
+void A();
+void A(); // name mangled
 // gcc: A redeclared as different kind of symbol, cfa: ok
 \end{C++}
+because the function name is mangled.
 \subsection{Type Environment and Unification}
+The core of parametric type resolution algorithm.
+Type Environment organizes type parameters in \textbf{equivalent classes} and maps them to
+actual types. Unification is the algorithm that takes two (possibly parametric) types and
+parameter mappings and attempts to produce a common type by matching the type
+environments.
+The following core ideas underlie the parametric type-resolution algorithm.
+A type environment organizes type parameters into \textbf{equivalence classes} and maps them to actual types.
+Unification is the algorithm that takes two (possibly parametric) types and parameter mappings, and attempts to produce a common type by matching information in the type environments.
 The unification algorithm is recursive in nature and runs in two different modes internally:
 \begin{itemize}
 \item
+\textbf{Exact} unification mode requires equivalent parameters to match perfectly;
+\item
+\textbf{Inexact} unification mode allows equivalent parameters to be converted to a
+common type.
+Exact unification mode requires equivalent parameters to match perfectly.
+\item
+Inexact unification mode allows equivalent parameters to be converted to a common type.
 \end{itemize}
+For a pair of matching parameters (actually, their equivalent classes), if either side is open
+(not bound to a concrete type yet), they are simply combined.
+Within inexact mode, types are allowed to differ on their cv-qualifiers; additionally, if a
+type never appear either in parameter list or as the base type of a pointer, it may also be
+widened (i.e. safely converted). As Cforall currently does not implement subclassing similar
+to object-oriented languages, widening conversions are on primitive types only, for
+example the conversion from int to long.
+The need for two unification modes come from the fact that parametric types are
+considered compatible only if all parameters are exactly the same (not just compatible).
+Pointer types also behaves similarly; in fact, they may be viewed as a primitive kind of
+parametric types. @int*@ and @long*@ are different types, just like @vector(int)@ and
+@vector(long)@ are, for the parametric type @vector(T)@.
+The resolver should use the following ``@public@'' functions:\footnote{
+Actual code also tracks assertions on type parameters; those extra arguments are omitted here for
+conciseness.}
+\subsubsection{Source: ResolvExpr/Unify.cc}
+\begin{C++}
+bool unify(const Type *type1, const Type *type2, TypeEnvironment &env,
+OpenVarSet &openVars, const SymbolTable &symtab, Type *&commonType)
+\end{C++}
+Attempts to unify @type1@ and @type2@ with current type environment.
+If operation succeeds, @env@ is modified by combining the equivalence classes of matching
+parameters in @type1@ and @type2@, and their common type is written to commonType.
+If operation fails, returns false.
+\begin{C++}
+bool typesCompatible(const Type * type1, const Type * type2, const
+SymbolTable &symtab, const TypeEnvironment &env)
+bool typesCompatibleIgnoreQualifiers(const Type * type1, const Type *
+type2, const SymbolTable &symtab, const TypeEnvironment &env)
+\end{C++}
+Determines if type1 and type2 can possibly be the same type. The second version ignores
+the outermost cv-qualifiers if present.\footnote{
+In const \lstinline@int * const@, only the second \lstinline@const@ is ignored.}
+The call has no side effect.
+\NOTE: No attempts are made to widen the types (exact unification is used), although the
+function names may suggest otherwise. E.g. @typesCompatible(int, long)@ returns false.
+For a pair of matching parameters (actually, their equivalence classes), if either side is open (not bound to a concrete type yet), they are combined.
+Within the inexact mode, types are allowed to differ on their cv-qualifiers (\eg @const@, @volatile@, \etc);
+additionally, if a type never appears either in a parameter list or as the base type of a pointer, it may also be widened (\ie safely converted).
+As \CFA currently does not implement subclassing as in object-oriented languages, widening conversions are only on the primitive types, \eg conversion from @int@ to @long int@.
+The need for two unification modes comes from the fact that parametric types are considered compatible only if all parameters are exactly the same (not just compatible).
+Pointer types also behave similarly;
+in fact, they may be viewed as a primitive kind of parametric type.
+@int *@ and @long *@ are different types, just like @vector(int)@ and @vector(long)@ are, for the parametric type @*(T)@ / @vector(T)@, respectively.
+The resolver uses the following @public@ functions:\footnote{
+Actual code also tracks assertions on type parameters; those extra arguments are omitted here for conciseness.}
+\subsubsection{Source: \lstinline{ResolvExpr/Unify.cc}}
+Function
+\begin{C++}
+bool unify(const Type * type1, const Type * type2, TypeEnvironment & env,
+        OpenVarSet & openVars, const SymbolTable & symtab, Type *& commonType)
+\end{C++}
+returns a boolean indicating if the unification succeeds or fails after attempting to unify @type1@ and @type2@ within current type environment.
+If unification succeeds, @env@ is modified by combining the equivalence classes of matching parameters in @type1@ and @type2@, and their common type is written to @commonType@.
+If unification fails, nothing changes.
+Functions
+\begin{C++}
+bool typesCompatible(const Type * type1, const Type * type2, const SymbolTable & symtab,
+        const TypeEnvironment & env)
+bool typesCompatibleIgnoreQualifiers(const Type * type1, const Type * type2,
+        const SymbolTable & symtab, const TypeEnvironment & env)
+\end{C++}
+return a boolean indicating if types @type1@ and @type2@ can possibly be the same type.
+The second version ignores the outermost cv-qualifiers if present.\footnote{
+In \lstinline@const int * const@, only the second \lstinline@const@ is ignored.}
+These functions have no side effects.
+\NOTE: No attempt is made to widen the types (exact unification is used), although the function names may suggest otherwise, \eg @typesCompatible(int, long)@ returns false.
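The difference between the two unification modes can be sketched on a toy type representation. This is not the algorithm in @ResolvExpr/Unify.cc@: the struct @ToyType@, the helpers @prim@ and @ptr@, and the function @toyUnify@ are hypothetical, type variables and the type environment are left out, and widening is modelled by an integer rank (for example, 1 for @int@ and 2 for @long@).

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical toy type: a primitive with a widening rank, or a pointer to a base type.
struct ToyType {
    enum Kind { Prim, Ptr } kind;
    int rank;                       // Prim only: a larger rank is a wider type
    std::shared_ptr<ToyType> base;  // Ptr only
};
using TypePtr = std::shared_ptr<ToyType>;

TypePtr prim(int rank) { return std::make_shared<ToyType>(ToyType{ToyType::Prim, rank, nullptr}); }
TypePtr ptr(TypePtr base) { return std::make_shared<ToyType>(ToyType{ToyType::Ptr, 0, std::move(base)}); }

// Returns the common type, or nullptr when the two types do not unify.
TypePtr toyUnify(const TypePtr & a, const TypePtr & b, bool inexact) {
    if (a->kind != b->kind) return nullptr;
    if (a->kind == ToyType::Ptr) {
        // The base type of a pointer is never widened: recurse in exact mode.
        TypePtr base = toyUnify(a->base, b->base, false);
        return base ? ptr(base) : nullptr;
    }
    if (a->rank == b->rank) return a;
    // Inexact mode converts both sides to the wider primitive type.
    return inexact ? (a->rank > b->rank ? a : b) : nullptr;
}
```

In this model, @int@ and @long@ fail to unify in exact mode and yield @long@ in inexact mode, while @int *@ and @long *@ fail in both modes because the pointer base is always matched exactly.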
 \subsection{Expression Resolution}
+The design of the current version of expression resolver is outlined in the Ph.D. Thesis from
+Aaron Moss~\cite{Moss19}.
+The design of the current version of expression resolver is outlined in the Ph.D.\ thesis by Aaron Moss~\cite{Moss19}.
 A summary of the resolver algorithm for each expression type is presented below.
 All overloadable operators are modelled as function calls. For a function call,
+interpretations of the function and arguments are found recursively. Then the following
 steps produce a filtered list of valid interpretations:
+All overloadable operators are modelled as function calls.
+For a function call, interpretations of the function and arguments are found recursively.
+Then the following steps produce a filtered list of valid interpretations:
 \begin{enumerate}
 \item
+From all possible combinations of interpretations of the function and arguments,
+those where argument types may be converted to function parameter types are
+considered valid.
+From all possible combinations of interpretations of the function and arguments, those where argument types may be converted to function parameter types are considered valid.
 \item
 Valid interpretations with the minimum sum of argument costs are kept.
 \item
+Argument costs are then discarded; the actual cost for the function call expression is
+the sum of conversion costs from the argument types to parameter types.
+\item
+For each return type, the interpretations with satisfiable assertions are then sorted
+by actual cost computed in step 3. If for a given type, the minimum cost
+interpretations are not unique, it is said that for that return type the interpretation
+is ambiguous. If the minimum cost interpretation is unique but contains an
+ambiguous argument, it is also considered ambiguous.
+\label{p:argcost}
+Argument costs are then discarded; the actual cost for the function call expression is the sum of conversion costs from the argument types to parameter types.
+\item
+\label{p:returntype}
+For each return type, the interpretations with satisfiable assertions are then sorted by actual cost computed in step~\ref{p:argcost}.
+If for a given type, the minimum cost interpretations are not unique, that return type is ambiguous.
+If the minimum cost interpretation is unique but contains an ambiguous argument, it is also ambiguous.
 \end{enumerate}
 Therefore, for each return type, the resolver produces either of:
+Therefore, for each return type, the resolver produces:
 \begin{itemize}
 \item
 No alternatives
 \item
 A single valid alternative
 \item
 An ambiguous alternative
+no alternatives
+\item
+a single valid alternative
+\item
+an ambiguous alternative
 \end{itemize}
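The per-return-type selection in the last step can be sketched as follows. This is a simplification and not the resolver's code: the struct @ToyInterp@, the enumeration @Outcome@, and the function @selectPerReturnType@ are hypothetical, the cost is a single integer rather than the resolver's actual cost value, and assertion satisfaction is assumed to have been checked already.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical interpretation record for one valid function-call alternative.
struct ToyInterp {
    std::string returnType;
    int cost;                // conversion cost from argument to parameter types
    bool ambiguousArgument;  // true if some argument interpretation is ambiguous
};

enum class Outcome { Single, Ambiguous };

// Return types with no valid interpretation do not appear in the result.
std::map<std::string, Outcome> selectPerReturnType(const std::vector<ToyInterp> & valid) {
    std::map<std::string, std::vector<ToyInterp>> byType;
    for (const ToyInterp & interp : valid) byType[interp.returnType].push_back(interp);

    std::map<std::string, Outcome> result;
    for (const auto & group : byType) {
        int best = group.second.front().cost;
        for (const ToyInterp & interp : group.second)
            if (interp.cost < best) best = interp.cost;

        int winners = 0;
        bool ambiguousArgument = false;
        for (const ToyInterp & interp : group.second) {
            if (interp.cost != best) continue;
            ++winners;
            ambiguousArgument = ambiguousArgument || interp.ambiguousArgument;
        }
        // Ambiguous if the minimum is not unique, or the winner has an ambiguous argument.
        result[group.first] = (winners > 1 || ambiguousArgument) ? Outcome::Ambiguous : Outcome::Single;
    }
    return result;
}
```

A return type absent from the returned map corresponds to the "no alternatives" case; the other two cases map to @Outcome::Single@ and @Outcome::Ambiguous@.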
+Note that an ambiguous alternative may be discarded at the parent expressions because a
+different return type matches better for the parent expressions.
+The non-overloadable expressions in Cforall are: cast expressions, address-of (unary @&@)
+expressions, short-circuiting logical expressions (@&&@, @||@) and ternary conditional
+expression (@?:@).
+For a cast expression, the convertible argument types are kept. Then the result is selected
+by lowest argument cost, and further by lowest conversion cost to target type. If the lowest
+cost is still not unique, or an ambiguous argument interpretation is selected, the cast
+expression is ambiguous. In an expression statement, the top level expression is implicitly
+cast to void.
+\NOTE: an ambiguous alternative may be discarded at the parent expressions because a different return type matches better for the parent expressions.
+The \emph{non}-overloadable expressions in \CFA are: cast expressions, address-of (unary @&@) expressions, short-circuiting logical expressions (@&&@, @||@) and ternary conditional expression (@?:@).
+For a cast expression, the convertible argument types are kept.
+Then the result is selected by lowest argument cost, and further by lowest conversion cost to target type.
+If the lowest cost is still not unique or an ambiguous argument interpretation is selected, the cast expression is ambiguous.
+In an expression statement, the top level expression is implicitly cast to @void@.
 For an address-of expression, only lvalue results are kept and the minimum cost is selected.
+For logical expressions @&&@ and @||@, arguments are implicitly cast to bool, and follow the rule
+of cast expression as above.
+For the ternary conditional expression, the condition is implicitly cast to bool, and the
+branch expressions must have compatible types. Each pair of compatible branch
+expression types produce a possible interpretation, and the cost is defined as the sum of
+expression costs plus the sum of conversion costs to the common type.
+TODO: Write a specification for expression costs.
+For logical expressions @&&@ and @||@, arguments are implicitly cast to @bool@, and follow the rules for cast expressions above.
+For the ternary conditional expression, the condition is implicitly cast to @bool@, and the branch expressions must have compatible types.
+Each pair of compatible branch expression types produces a possible interpretation, and the cost is defined as the sum of the expression costs plus the sum of conversion costs to the common type.
+\subsection{Conversion and Application Cost}
+Some parts of the previous documentation of the cost system, as described in the Moss thesis~\cite{Moss19}, section 4.1.2, were unclear.
+Some clarifications are presented in this section.
+\begin{enumerate}
+\item
+Conversion to a type denoted by a type parameter may incur additional cost if the match is not exact.
+For example, if a function is declared to accept @(T, T)@ and receives @(int, long)@, @T@ is deduced to be @long@ and an additional widening conversion cost is added for @int@ to @T@.
+\item
+The specialization level of a function is the sum of the least depth of an appearance of a type parameter (counting pointers, references and parameterized types), plus the number of assertions.
+A higher specialization level is favoured if argument conversion costs are equal.
+\item
+Coercion of pointer types is only allowed in explicit cast expressions;
+the only allowed implicit pointer casts are adding qualifiers to the base type and casting to @void *@, and these count as safe conversions.
+Note that an implicit cast from @void *@ to other pointer types is no longer valid, in contrast to standard C.
+\end{enumerate}
 \subsection{Assertion Satisfaction}
+The resolver tries to satisfy assertions on expressions only when it is needed: either while
+selecting from multiple alternatives of a same result type for a function call (step 4 of
+resolving function calls), or upon reaching the top level of an expression statement.
+Unsatisfiable alternatives are discarded. Satisfiable alternatives receive \textbf{implicit
+parameters}: in Cforall, parametric functions are designed such that they can be compiled
+separately, as opposed to \CC templates which are only compiled at instantiation. Given a
+parametric function definition:
+The resolver tries to satisfy assertions on expressions only when needed: either while selecting from multiple alternatives of the same result type for a function call (step \ref{p:returntype} of resolving function calls) or upon reaching the top level of an expression statement.
+Unsatisfiable alternatives are discarded.
+Satisfiable alternatives receive \textbf{implicit parameters}: in \CFA, parametric functions may be separately compiled, as opposed to \CC templates which are only compiled at instantiation.
+Given the parametric function-definition:
 \begin{C++}
 forall (otype T | {void foo(T);})
 void bar (T t) { foo(t); }
 \end{C++}
+The function bar does not know which @foo@ to call when compiled without knowing the call
+site, so it requests a function pointer to be passed as an extra argument. At the call site,
+implicit parameters are automatically inserted by the compiler.
+\textbf{TODO}: Explain how recursive assertion satisfaction and polymorphic recursion work.
+the function @bar@ does not know which @foo@ to call when compiled without knowing the call site, so it requests a function pointer to be passed as an extra argument.
+At the call site, implicit parameters are automatically inserted by the compiler.
+Implementation of implicit parameters is discussed in \VRef[Appendix]{s:ImplementationParametricFunctions}.
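The effect of an implicit parameter can be sketched in plain C. This is only an illustration, not the code the \CFA compiler emits: the type parameter is erased to an opaque pointer passed by address, and the names @foo_int@, @foo_double@ and @demo_implicit_parameter@ are hypothetical. The point is that @bar@ is compiled once, and each call site supplies the @foo@ that satisfies the assertion.

```c
#include <stdio.h>

/* Compiled once for every T: the argument is passed by address as an opaque
   pointer (an assumption of this sketch), and the assertion foo(T) arrives
   as an explicit function pointer. */
static void bar(const void *t, void (*foo)(const void *)) {
    foo(t);
}

static int calls;  /* counts foo invocations so the effect can be checked */

/* Candidate satisfying the assertion for T = int. */
static void foo_int(const void *p) {
    calls += 1;
    printf("foo(int): %d\n", *(const int *)p);
}

/* Candidate satisfying the assertion for T = double. */
static void foo_double(const void *p) {
    calls += 1;
    printf("foo(double): %g\n", *(const double *)p);
}

/* Hypothetical call sites: the compiler's job of inserting the matching foo
   is done by hand here. Returns the number of foo calls made. */
int demo_implicit_parameter(void) {
    int i = 42;
    double d = 2.5;
    calls = 0;
    bar(&i, foo_int);     /* T = int    */
    bar(&d, foo_double);  /* T = double */
    return calls;
}
```

Calling @demo_implicit_parameter()@ from a @main@ function exercises both call sites; @bar@ itself never needs to be recompiled for a new type.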
 \section{Tests}
 …
 \subsection{Test Suites}
+Automatic test suites are located under the @tests/@ directory. A test case consists of an
+input CFA source file (name ending with @.cfa@), and an expected output file located
+in @.expect/@ directory relative to the source file, with the same file name ending with @.txt@.
+So a test named @tuple/tupleCast@ has the following files, for example:
+Automatic test suites are located under the @tests/@ directory.
+A test case consists of an input CFA source file (suffix @.cfa@), and an expected output file located in the @tests/.expect/@ directory, with the same file name ending with suffix @.txt@.
+For example, the test named @tests/tuple/tupleCast.cfa@ has the following files:
 \begin{C++}
 tests/
+..     tuple/
+......     .expect/
+..........       tupleCast.txt
+......     tupleCast.cfa
+\end{C++}
+If compilation fails, the error output is compared to the expect file. If compilation succeeds,
+the built program is run and its output compared to the expect file.
+To run the tests, execute the test script @test.py@ under the @tests/@ directory, with a list of
+test names to be run, or @--all@ to run all tests. The test script reports test cases
+fail/success, compilation time and program run time.
+        tuple/
+                .expect/
+                        tupleCast.txt
+                tupleCast.cfa
+\end{C++}
+If compilation fails, the error output is compared to the expect file.
+If the compilation succeeds but does not generate an executable, the compilation output is compared to the expect file.
+If the compilation succeeds and generates an executable, the executable is run and its output is compared to the expect file.
+To run the tests, execute the test script @test.py@ under the @tests/@ directory, with a list of test names to be run, or @--all@ (or @make all-tests@) to run all tests.
+The test script reports the success or failure of each test case, its compilation time, and its program run time.
+To see all the options available for @test.py@, use the @--help@ option.
 \subsection{Performance Reports}
+To turn on performance reports, pass @-S@ flag to the compiler.
+kinds of performance reports are available:
+To turn on performance reports, pass the @-XCFA -S@ flag to the compiler.
+Three kinds of performance reports are available:
 \begin{enumerate}
 \item
 …
 @Common/Stats/Counter.h@.
 \end{enumerate}
+It is suggested to run performance tests with optimized build (@g++@ flag @-O3@)
+It is suggested to run performance tests with optimization (@g++@ flag @-O3@).
+\appendix
+\section{Appendix}
+\subsection{Kinds of Type Parameters}
+\label{s:KindsTypeParameters}
+A type parameter in a @forall@ clause has one of three kinds:
+\begin{enumerate}[listparindent=0pt]
+\item
+@dtype@: any data type (built-in or user defined) that is not a concrete type.
+A non-concrete type is an incomplete type such as an opaque type or pointer/reference with an implicit (pointer) size and implicitly generated reference and dereference operations.
+\item
+@otype@: any data type (built-in or user defined) that is a concrete type.
+A concrete type is a complete type, \ie types that can be used to create a variable, which also implicitly asserts the existence of default and copy constructors, assignment, and destructor\footnote{\CFA implements the same automatic resource management (RAII) semantics as \CC.}.
+% \item
+% @ftype@: any function type.
+%
+% @ftype@ provides two purposes:
+% \begin{itemize}
+% \item
+% Differentiate function pointer from data pointer because (in theory) some systems have different sizes for these pointers.
+% \item
+% Disallow a function pointer to match an overloaded data pointer, since variables and functions can have the same names.
+% \end{itemize}
+\item
+@ttype@: tuple (variadic) type.
+Restricted to the type for the last parameter in a function, it provides a type-safe way to implement variadic functions.
+Note however, that it has certain restrictions, as described in the implementation section below.
+\end{enumerate}
+\subsection{GNU C Nested Functions}
+\CFA is designed to be mostly compatible with GNU C, an extension to the ISO C99 and C11 standards. The \CFA compiler also implements some language features using GCC extensions, most notably nested functions.
+In ISO C, function definitions are not allowed to be nested. GCC allows nested functions with full lexical scoping. The following example is taken from the GCC documentation\footnote{\url{https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html}}:
+\begin{C++}
+void bar( int * array, int offset, int size ) {
+        int access( int * array, int index ) { return array[index + offset]; }
+        int i;
+        /* ... */
+        for ( i = 0; i < size; i++ )
+                /* ... */ access (array, i) /* ... */
+}
+\end{C++}
+GCC nested functions behave identically to \CC lambda functions with default by-reference capture (stack-allocated, with a lifetime that ends upon exiting the declaring block), and can also be passed as arguments with standard function-pointer types.
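The following sketch shows a nested function that reads a variable of its enclosing function and is passed through an ordinary function-pointer parameter. It relies on the GNU C extension, so it is expected to compile with GCC but not with a strictly ISO C compiler; the names @sum_map@ and @sum_with_offset@ are hypothetical.

```c
#include <stddef.h>

/* Apply f to every element and sum the results.
   f is an ordinary function pointer; sum_map knows nothing about nesting. */
static int sum_map(const int *a, size_t n, int (*f)(int)) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += f(a[i]);
    }
    return total;
}

/* Hypothetical helper: sums the array after adding offset to each element. */
int sum_with_offset(const int *a, size_t n, int offset) {
    /* GNU C nested function: reads offset from the enclosing frame. */
    int add_offset(int x) { return x + offset; }

    /* The pointer to add_offset is valid only while sum_with_offset is
       active, which holds for the duration of this call. */
    return sum_map(a, n, add_offset);
}
```

For instance, calling @sum_with_offset@ on the array @{1, 2, 3}@ with offset 10 adds 10 to each element before summing. Returning the address of @add_offset@ to the caller would be an error, because the enclosing frame no longer exists after the return.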
+\subsection{Implementation of Parametric Functions}
+\label{s:ImplementationParametricFunctions}
+\CFA implements parametric functions using the implicit parameter approach: required assertions are passed to the callee by function pointers;
+the size of a parametric type must also be known if the type is referenced directly (\ie not as a pointer).
+The implementation is similar to the one from Scala\footnote{\url{https://www.scala-lang.org/files/archive/spec/2.13/07-implicits.html}}, with some notable differences in resolution:
+\begin{enumerate}
+\item
+All types, variables, and functions are candidates for implicit parameters.
+\item
+The parameter (assertion) name must match the actual declarations.
+\end{enumerate}
+For example, the \CFA function declaration
+\begin{cfa}
+forall( otype T | { int foo( T, int ); } )
+int bar(T);
+\end{cfa}
+after implicit parameter expansion, has the actual signature\footnote{\textbf{otype} also requires the type to have constructor and destructor, which are the first two function pointers preceding the one for \textbf{foo}.}
+\begin{C++}
+int bar( T, size_t, void (*)(T&), void (*)(T&), int (*)(T, int) );
+\end{C++}
+The implicit parameter approach has an apparent issue: when the satisfying declaration is also parametric, it may require its own implicit parameters too.
+That also causes the supplied implicit parameter to have a different \textbf{actual} type than the \textbf{nominal} type, so it cannot be passed directly.
+Therefore, a wrapper with matching actual type must be created, and it is here where GCC nested functions are used internally by the compiler.
+Consider the following program:
+\begin{cfa}
+int assertion(int);
+forall( otype T | { int assertion(T); } )
+void foo(T);
+forall(otype T | { void foo(T); } )
+void bar(T t) {
+        foo(t);
+}
+\end{cfa}
+The \CFA compiler translates the program to non-parametric form\footnote{In the final code output, \lstinline@T@ needs to be replaced by an opaque type, and arguments must be accessed by a frame pointer offset table, due to the unknown sizes. The presented code here is simplified for better understanding.}
+\begin{C++}
+// ctor, dtor and size arguments are omitted
+void foo(T, int (*)(T));
+void bar(T t, void (*foo)(T)) {
+        foo(t);
+}
+\end{C++}
+However, when @bar(1)@ is called, @foo@ cannot be directly provided as an argument:
+\begin{C++}
+bar(1, foo); // WRONG: foo has different actual type
+\end{C++}
+and an additional step is required:
+\begin{C++}
+{
+        void _foo_wrapper(int t) {
+                foo( t, assertion );
+        }
+        bar( 1, _foo_wrapper );
+}
+\end{C++}
+Nested assertions and implicit parameter creation may continue indefinitely.
+This issue is a limitation of the implicit-parameter implementation.
+In particular, polymorphic variadic recursion must be structural (\ie the number of arguments decreases in any possible recursive calls), otherwise code generation gets into an infinite loop.
+The \CFA compiler sets a limit on assertion depth and reports an error if assertion resolution does not terminate within the limit (as for \lstinline[language=C++]@templates@ in \CC).
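The wrapper step can be tried out with a hand-written version specialized to @T@ = @int@. This is a simplified sketch under the same assumptions as the translated code above (constructor, destructor and size arguments omitted); the names @result@, @assertion_fn@, @foo_fn@, @foo_wrapper@ and @call_bar@ are hypothetical, and the nested function requires GCC.

```c
#include <stdio.h>

static int result;  /* records what foo computed, so the call can be checked */

/* The non-parametric declaration that satisfies foo's assertion. */
static int assertion(int x) { return x + 1; }

/* foo after expansion: its own assertion is an extra parameter,
   so its actual type is void (*)(int, int (*)(int)). */
static void foo(int t, int (*assertion_fn)(int)) {
    result = assertion_fn(t);
}

/* bar after expansion: expects a foo of the nominal type void (*)(int). */
static void bar(int t, void (*foo_fn)(int)) {
    foo_fn(t);
}

/* Hypothetical call site for bar(arg). */
int call_bar(int arg) {
    /* Wrapper with the nominal type: it binds foo's implicit parameter. */
    void foo_wrapper(int t) { foo(t, assertion); }

    bar(arg, foo_wrapper);
    printf("assertion produced %d\n", result);
    return result;
}
```

Passing @foo@ directly to @bar@ is rejected by the C type checker, which is the mismatch between actual and nominal type described above; the wrapper removes it at the cost of one extra call.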
 \bibliographystyle{plain}

doc/user/Makefile

r33c3ded	r223a633
55	55
56	56	${DOCUMENT} : ${BASE}.ps
57		ps2pdf $<
	57	ps2pdf -dPDFSETTINGS=/prepress $<
58	58
59	59	${BASE}.ps : ${BASE}.dvi

doc/user/user.tex

-              r33c3ded
+              r223a633
 %% Created On       : Wed Apr  6 14:53:29 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Fri Mar  6 13:34:52 2020
 %% Update Count     : 3924
+%% Last Modified On : Mon Oct  5 08:57:29 2020
+%% Update Count     : 3998
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \usepackage{upquote}                                                                    % switch curled `'" to straight
 \usepackage{calc}
-\usepackage{xspace}
 \usepackage{varioref}                                                                   % extended references
+\usepackage{listings}                                                                   % format program code
+\usepackage[labelformat=simple,aboveskip=0pt,farskip=0pt]{subfig}
+\renewcommand{\thesubfigure}{\alph{subfigure})}
 \usepackage[flushmargin]{footmisc}                                              % support label/reference in footnote
 \usepackage{latexsym}                                   % \Box glyph
 \usepackage{mathptmx}                                   % better math font with "times"
 \usepackage[usenames]{color}
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®},#1}}
+{}
+\newcommand{\CFALatin}{}
 % inline code ©...© (copyright symbol) emacs: C-q M-)
 % red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
 …
 % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
 % math escape $...$ (dollar symbol)
+\input{common}                                          % common CFA document macros
+\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
+\usepackage{breakurl}
+\renewcommand\footnoterule{\kern -3pt\rule{0.3\linewidth}{0.15pt}\kern 2pt}
+\usepackage[pagewise]{lineno}
+\renewcommand{\linenumberfont}{\scriptsize\sffamily}
+\usepackage[firstpage]{draftwatermark}
+\SetWatermarkLightness{0.9}
+% Default underscore is too low and wide. Cannot use lstlisting "literate" as replacing underscore
+% removes it as a variable-name character so keywords in variables are highlighted. MUST APPEAR
+% AFTER HYPERREF.
+\renewcommand{\textunderscore}{\leavevmode\makebox[1.2ex][c]{\rule{1ex}{0.075ex}}}
+\setlength{\topmargin}{-0.45in}                                                 % move running title into header
+\setlength{\headsep}{0.25in}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\CFAStyle                                                                                               % use default CFA format-style
+\lstnewenvironment{C++}[1][]                            % use C++ style
+{\lstset{language=C++,moredelim=**[is][\protect\color{red}]{®}{®},#1}}
+{}
+\newsavebox{\myboxA}
+\newsavebox{\myboxB}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 \newcommand{\G}[1]{{\Textbf[OliveGreen]{#1}}}
 \newcommand{\KWC}{K-W C\xspace}
-\newsavebox{\LstBox}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 The signature feature of \CFA is \emph{\Index{overload}able} \Index{parametric-polymorphic} functions~\cite{forceone:impl,Cormack90,Duggan96} with functions generalized using a ©forall© clause (giving the language its name):
 \begin{lstlisting}
+\begin{cfa}
 ®forall( otype T )® T identity( T val ) { return val; }
 int forty_two = identity( 42 ); §\C{// T is bound to int, forty\_two == 42}§
 \end{lstlisting}
+\end{cfa}
 % extending the C type system with parametric polymorphism and overloading, as opposed to the \Index*[C++]{\CC{}} approach of object-oriented extensions.
 \CFA{}\hspace{1pt}'s polymorphism was originally formalized by \Index*{Glen Ditchfield}\index{Ditchfield, Glen}~\cite{Ditchfield92}, and first implemented by \Index*{Richard Bilson}\index{Bilson, Richard}~\cite{Bilson03}.
 …
 \begin{comment}
 A simple example is leveraging the existing type-unsafe (©void *©) C ©bsearch© to binary search a sorted floating array:
 \begin{lstlisting}
+\begin{cfa}
 void * bsearch( const void * key, const void * base, size_t dim, size_t size,
                                 int (* compar)( const void *, const void * ));
 …
 double key = 5.0, vals[10] = { /* 10 sorted floating values */ };
 double * val = (double *)bsearch( &key, vals, 10, sizeof(vals[0]), comp ); §\C{// search sorted array}§
 \end{lstlisting}
+\end{cfa}
 which can be augmented simply with polymorphic, type-safe, \CFA-overloaded wrappers:
 \begin{lstlisting}
+\begin{cfa}
 forall( otype T | { int ?<?( T, T ); } ) T * bsearch( T key, const T * arr, size_t size ) {
         int comp( const void * t1, const void * t2 ) { /* as above with double changed to T */ }
 …
 double * val = bsearch( 5.0, vals, 10 ); §\C{// selection based on return type}§
 int posn = bsearch( 5.0, vals, 10 );
 \end{lstlisting}
+\end{cfa}
 The nested function ©comp© provides the hidden interface from typed \CFA to untyped (©void *©) C, plus the cast of the result.
 Providing a hidden ©comp© function in \CC is awkward as lambdas do not use C calling-conventions and template declarations cannot appear at block scope.
 …
 \CFA has replacement libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions, all without rewriting the actual computations.
 For example, it is possible to write a type-safe \CFA wrapper ©malloc© based on the C ©malloc©:
 \begin{lstlisting}
+\begin{cfa}
 forall( dtype T | sized(T) ) T * malloc( void ) { return (T *)malloc( sizeof(T) ); }
 int * ip = malloc(); §\C{// select type and size from left-hand side}§
 double * dp = malloc();
 struct S {...} * sp = malloc();
 \end{lstlisting}
+\end{cfa}
 where the return type supplies the type/size of the allocation, which is impossible in most type systems.
 \end{comment}
 …
 the same level as a ©case© clause; the target label may be case ©default©, but only associated
 with the current ©switch©/©choose© statement.
-\subsection{Loop Control}
-The ©for©/©while©/©do-while© loop-control allows empty or simplified ranges (see Figure~\ref{f:LoopControlExamples}).
-\begin{itemize}
-\item
-The loop index is polymorphic in the type of the comparison value N (when the start value is implicit) or the start value M.
-\item
-An empty conditional implies comparison value of ©1© (true).
-\item
-A comparison N is implicit up-to exclusive range [0,N©®)®©.
-\item
-A comparison ©=© N is implicit up-to inclusive range [0,N©®]®©.
-\item
-The up-to range M ©~©\index{~@©~©} N means exclusive range [M,N©®)®©.
-\item
-The up-to range M ©~=©\index{~=@©~=©} N means inclusive range [M,N©®]®©.
-\item
-The down-to range M ©-~©\index{-~@©-~©} N means exclusive range [N,M©®)®©.
-\item
-The down-to range M ©-~=©\index{-~=@©-~=©} N means inclusive range [N,M©®]®©.
-\item
-©0© is the implicit start value;
-\item
-©1© is the implicit increment value.
-\item
-The up-to range uses operator ©+=© for increment;
-\item
-The down-to range uses operator ©-=© for decrement.
-\item
-©@© means put nothing in this field.
-\item
-©:© means start another index.
-\end{itemize}
 \begin{figure}
 …
+\subsection{Loop Control}
+The ©for©/©while©/©do-while© loop-control allows empty or simplified ranges (see Figure~\ref{f:LoopControlExamples}).
+\begin{itemize}
+\item
+The loop index is polymorphic in the type of the comparison value N (when the start value is implicit) or the start value M.
+\item
+An empty conditional implies comparison value of ©1© (true).
+\item
+A comparison N is implicit up-to exclusive range [0,N©®)®©.
+\item
+A comparison ©=© N is implicit up-to inclusive range [0,N©®]®©.
+\item
+The up-to range M ©~©\index{~@©~©} N means exclusive range [M,N©®)®©.
+\item
+The up-to range M ©~=©\index{~=@©~=©} N means inclusive range [M,N©®]®©.
+\item
+The down-to range M ©-~©\index{-~@©-~©} N means exclusive range [N,M©®)®©.
+\item
+The down-to range M ©-~=©\index{-~=@©-~=©} N means inclusive range [N,M©®]®©.
+\item
+©0© is the implicit start value;
+\item
+©1© is the implicit increment value.
+\item
+The up-to range uses operator ©+=© for increment;
+\item
+The down-to range uses operator ©-=© for decrement.
+\item
+©@© means put nothing in this field.
+\item
+©:© means start another index.
+\end{itemize}
 %\subsection{\texorpdfstring{Labelled \protect\lstinline@continue@ / \protect\lstinline@break@}{Labelled continue / break}}
 \subsection{\texorpdfstring{Labelled \LstKeywordStyle{continue} / \LstKeywordStyle{break} Statement}{Labelled continue / break Statement}}
 …
 for ©break©, the target label can also be associated with a ©switch©, ©if© or compound (©{}©) statement.
 \VRef[Figure]{f:MultiLevelExit} shows ©continue© and ©break© indicating the specific control structure, and the corresponding C program using only ©goto© and labels.
 The innermost loop has 7 exit points, which cause continuation or termination of one or more of the 7 \Index{nested control-structure}s.
+The innermost loop has 8 exit points, which cause continuation or termination of one or more of the 7 \Index{nested control-structure}s.
 \begin{figure}
+\begin{tabular}{@{\hspace{\parindentlnth}}l@{\hspace{\parindentlnth}}l@{\hspace{\parindentlnth}}l@{}}
+\multicolumn{1}{@{\hspace{\parindentlnth}}c@{\hspace{\parindentlnth}}}{\textbf{\CFA}}   & \multicolumn{1}{@{\hspace{\parindentlnth}}c}{\textbf{C}}      \\
+\begin{cfa}
+®LC:® {
+        ... §declarations§ ...
+        ®LS:® switch ( ... ) {
+          case 3:
+                ®LIF:® if ( ... ) {
+                        ®LF:® for ( ... ) {
+                                ®LW:® while ( ... ) {
+                                        ... break ®LC®; ...
+                                        ... break ®LS®; ...
+                                        ... break ®LIF®; ...
+                                        ... continue ®LF;® ...
+                                        ... break ®LF®; ...
+                                        ... continue ®LW®; ...
+                                        ... break ®LW®; ...
+                                } // while
+                        } // for
+                } else {
+                        ... break ®LIF®; ...
+                } // if
+        } // switch
+\centering
+\begin{lrbox}{\myboxA}
+\begin{cfa}[tabsize=3]
+®Compound:® {
+        ®Try:® try {
+                ®For:® for ( ... ) {
+                        ®While:® while ( ... ) {
+                                ®Do:® do {
+                                        ®If:® if ( ... ) {
+                                                ®Switch:® switch ( ... ) {
+                                                        case 3:
+                                                                ®break Compound®;
+                                                                ®break Try®;
+                                                                ®break For®;      /* or */  ®continue For®;
+                                                                ®break While®;  /* or */  ®continue While®;
+                                                                ®break Do®;      /* or */  ®continue Do®;
+                                                                ®break If®;
+                                                                ®break Switch®;
+                                                        } // switch
+                                                } else {
+                                                        ... ®break If®; ...     // terminate if
+                                                } // if
+                                } while ( ... ); // do
+                        } // while
+                } // for
+        } ®finally® { // always executed
+        } // try
 } // compound
 \end{cfa}
+&
+\begin{cfa}
+\end{lrbox}
+\begin{lrbox}{\myboxB}
+\begin{cfa}[tabsize=3]
+{
+        ... §declarations§ ...
+        switch ( ... ) {
+          case 3:
+                if ( ... ) {
+                        for ( ... ) {
+                                while ( ... ) {
+                                        ... goto ®LC®; ...
+                                        ... goto ®LS®; ...
+                                        ... goto ®LIF®; ...
+                                        ... goto ®LFC®; ...
+                                        ... goto ®LFB®; ...
+                                        ... goto ®LWC®; ...
+                                        ... goto ®LWB®; ...
+                                  ®LWC®: ; } ®LWB:® ;
+                          ®LFC:® ; } ®LFB:® ;
+                } else {
+                        ... goto ®LIF®; ...
+                } ®L3:® ;
+        } ®LS:® ;
+} ®LC:® ;
+\end{cfa}
+&
+\begin{cfa}
+// terminate compound
+// terminate switch
+// terminate if
+// continue loop
+// terminate loop
+// continue loop
+// terminate loop
+// terminate if
+\end{cfa}
+\end{tabular}
+                ®ForC:® for ( ... ) {
+                        ®WhileC:® while ( ... ) {
+                                ®DoC:® do {
+                                        if ( ... ) {
+                                                switch ( ... ) {
+                                                        case 3:
+                                                                ®goto Compound®;
+                                                                ®goto Try®;
+                                                                ®goto ForB®;      /* or */  ®goto ForC®;
+                                                                ®goto WhileB®;  /* or */  ®goto WhileC®;
+                                                                ®goto DoB®;      /* or */  ®goto DoC®;
+                                                                ®goto If®;
+                                                                ®goto Switch®;
+                                                        } ®Switch:® ;
+                                                } else {
+                                                        ... ®goto If®; ...      // terminate if
+                                                } ®If:®;
+                                } while ( ... ); ®DoB:® ;
+                        } ®WhileB:® ;
+                } ®ForB:® ;
+} ®Compound:® ;
+\end{cfa}
+\end{lrbox}
+\subfloat[\CFA]{\label{f:CFibonacci}\usebox\myboxA}
+\hspace{2pt}
+\vrule
+\hspace{2pt}
+\subfloat[C]{\label{f:CFAFibonacciGen}\usebox\myboxB}
 \caption{Multi-level Exit}
 \label{f:MultiLevelExit}
 …
 try {
         f(...);
 } catch( E e ; §boolean-predicate§ ) {          §\C[8cm]{// termination handler}§
+} catch( E e ; §boolean-predicate§ ) {          §\C{// termination handler}§
         // recover and continue
 } catchResume( E e ; §boolean-predicate§ ) { §\C{// resumption handler}\CRT§
+} catchResume( E e ; §boolean-predicate§ ) { §\C{// resumption handler}§
         // repair and return
 } finally {
 …
 For implicit formatted input, the common case is reading a sequence of values separated by whitespace, where the type of an input constant must match with the type of the input variable.
 \begin{cquote}
 \begin{lrbox}{\LstBox}
+\begin{lrbox}{\myboxA}
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 int x;   double y;   char z;
 …
 \end{lrbox}
 \begin{tabular}{@{}l@{\hspace{3em}}l@{\hspace{3em}}l@{}}
 \multicolumn{1}{@{}l@{}}{\usebox\LstBox} \\
+\multicolumn{1}{@{}l@{}}{\usebox\myboxA} \\
 \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CFA}}        & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CC}}       & \multicolumn{1}{c}{\textbf{Python}}   \\
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 …
 For example, an initial alignment and fill capability are preserved during a resize copy so the copy has the same alignment and extended storage is filled.
 Without sticky properties it is dangerous to use ©realloc©, resulting in an idiom of manually performing the reallocation to maintain correctness.
+\begin{cfa}
+\end{cfa}
 \CFA memory management extends allocation to support constructors for initialization of allocated storage, \eg in
 …
         // §\CFA§ safe general allocation, fill, resize, alignment, array
+        T * alloc( void );§\indexc{alloc}§
+        T * alloc( size_t dim );
+        T * alloc( T ptr[], size_t dim );
+        T * alloc_set( char fill );§\indexc{alloc_set}§
+        T * alloc_set( T fill );
+        T * alloc_set( size_t dim, char fill );
+        T * alloc_set( size_t dim, T fill );
+        T * alloc_set( size_t dim, const T fill[] );
+        T * alloc_set( T ptr[], size_t dim, char fill );
+        T * alloc_align( size_t align );
+        T * alloc_align( size_t align, size_t dim );
+        T * alloc_align( T ptr[], size_t align ); // aligned realloc array
+        T * alloc_align( T ptr[], size_t align, size_t dim ); // aligned realloc array
+        T * alloc_align_set( size_t align, char fill );
+        T * alloc_align_set( size_t align, T fill );
+        T * alloc_align_set( size_t align, size_t dim, char fill );
+        T * alloc_align_set( size_t align, size_t dim, T fill );
+        T * alloc_align_set( size_t align, size_t dim, const T fill[] );
+        T * alloc_align_set( T ptr[], size_t align, size_t dim, char fill );
+        T * alloc( void );§\indexc{alloc}§                                      §\C[3.5in]{// variable, T size}§
+        T * alloc( size_t dim );                                                        §\C{// array[dim], T size elements}§
+        T * alloc( T ptr[], size_t dim );                                       §\C{// realloc array[dim], T size elements}§
+        T * alloc_set( char fill );§\indexc{alloc_set}§         §\C{// variable, T size, fill bytes with value}§
+        T * alloc_set( T fill );                                                        §\C{// variable, T size, fill with value}§
+        T * alloc_set( size_t dim, char fill );                         §\C{// array[dim], T size elements, fill bytes with value}§
+        T * alloc_set( size_t dim, T fill );                            §\C{// array[dim], T size elements, fill elements with value}§
+        T * alloc_set( size_t dim, const T fill[] );            §\C{// array[dim], T size elements, fill elements with array}§
+        T * alloc_set( T ptr[], size_t dim, char fill );        §\C{// realloc array[dim], T size elements, fill bytes with value}§
+        T * alloc_align( size_t align );                                        §\C{// aligned variable, T size}§
+        T * alloc_align( size_t align, size_t dim );            §\C{// aligned array[dim], T size elements}§
+        T * alloc_align( T ptr[], size_t align );                       §\C{// realloc new aligned array}§
+        T * alloc_align( T ptr[], size_t align, size_t dim ); §\C{// realloc new aligned array[dim]}§
+        T * alloc_align_set( size_t align, char fill );         §\C{// aligned variable, T size, fill bytes with value}§
+        T * alloc_align_set( size_t align, T fill );            §\C{// aligned variable, T size, fill with value}§
+        T * alloc_align_set( size_t align, size_t dim, char fill ); §\C{// aligned array[dim], T size elements, fill bytes with value}§
+        T * alloc_align_set( size_t align, size_t dim, T fill ); §\C{// aligned array[dim], T size elements, fill elements with value}§
+        T * alloc_align_set( size_t align, size_t dim, const T fill[] ); §\C{// aligned array[dim], T size elements, fill elements with array}§
+        T * alloc_align_set( T ptr[], size_t align, size_t dim, char fill ); §\C{// realloc new aligned array[dim], fill new bytes with value}§
         // §\CFA§ safe initialization/copy, i.e., implicit size specification
