Changeset bd72f517 for doc

Timestamp:

May 13, 2025, 1:17:50 PM

Author:

Mike Brooks <mlbrooks@…>

Branches:

master

Children:

0528d79

Parents:

7d02d35 (diff), 2410424 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.

Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Location:

doc

Files:

23 added
17 deleted
21 edited
3 moved

LaTeXmacros/common.sty (modified) (5 diffs)
LaTeXmacros/common.tex (modified) (5 diffs)
bibliography/pl.bib (modified) (2 diffs)
proposals/exceptions.md (added)
theses/fangren_yu_MMath/background.tex (modified) (1 diff)
theses/fangren_yu_MMath/features.tex (modified) (24 diffs)
theses/fangren_yu_MMath/future.tex (modified) (6 diffs)
theses/fangren_yu_MMath/intro.tex (modified) (41 diffs)
theses/fangren_yu_MMath/resolution.tex (modified) (28 diffs)
theses/fangren_yu_MMath/uw-ethesis.bib (modified) (1 diff)
theses/fangren_yu_MMath/uw-ethesis.tex (modified) (1 diff)
theses/mike_brooks_MMath/Makefile (modified) (5 diffs)
theses/mike_brooks_MMath/array.tex (modified) (39 diffs)
theses/mike_brooks_MMath/background.tex (modified) (1 diff)
theses/mike_brooks_MMath/benchmarks/string/result-allocate-attrib-cfa.ssv (added)
theses/mike_brooks_MMath/benchmarks/string/result-allocate-attrib-stl.ssv (added)
theses/mike_brooks_MMath/benchmarks/string/result-allocate-space-cfa.ssv (added)
theses/mike_brooks_MMath/benchmarks/string/result-allocate-space-stl.ssv (added)
theses/mike_brooks_MMath/benchmarks/string/result-allocate-speed-cfa.csv (added)
theses/mike_brooks_MMath/benchmarks/string/result-allocate-speed-stl.csv (added)
theses/mike_brooks_MMath/benchmarks/string/result-append-pbv.csv (modified) (1 diff)
theses/mike_brooks_MMath/pictures/ArrayOfPtr.fig (added)
theses/mike_brooks_MMath/pictures/PtrToArray.fig (added)
theses/mike_brooks_MMath/pictures/measuring-like-layout.pdf (modified)
theses/mike_brooks_MMath/pictures/measuring-like-layout.vsdx (modified)
theses/mike_brooks_MMath/pictures/string-graph-allocn.csv (deleted)
theses/mike_brooks_MMath/pictures/string-graph-allocn.dat (deleted)
theses/mike_brooks_MMath/pictures/string-graph-allocn.png (deleted)
theses/mike_brooks_MMath/pictures/string-graph-pbv.csv (deleted)
theses/mike_brooks_MMath/pictures/string-graph-pbv.dat (deleted)
theses/mike_brooks_MMath/pictures/string-graph-pbv.png (deleted)
theses/mike_brooks_MMath/pictures/string-graph-peq-sharing.csv (deleted)
theses/mike_brooks_MMath/pictures/string-graph-peq-sharing.dat (deleted)
theses/mike_brooks_MMath/pictures/string-graph-peq-sharing.png (deleted)
theses/mike_brooks_MMath/pictures/string-graph-pta-sharing.csv (deleted)
theses/mike_brooks_MMath/pictures/string-graph-pta-sharing.dat (deleted)
theses/mike_brooks_MMath/pictures/string-graph-pta-sharing.png (deleted)
theses/mike_brooks_MMath/pictures/string-graphs-mapping.txt (deleted)
theses/mike_brooks_MMath/pictures/string-graphs-mem.xlsx (deleted)
theses/mike_brooks_MMath/pictures/string-graphs-speed.csv (deleted)
theses/mike_brooks_MMath/pictures/string-graphs-speed.xlsx (deleted)
theses/mike_brooks_MMath/plot-allocn.gp (deleted)
theses/mike_brooks_MMath/plots/common.py (added)
theses/mike_brooks_MMath/plots/string-allocn-attrib.py (added)
theses/mike_brooks_MMath/plots/string-allocn.d (added)
theses/mike_brooks_MMath/plots/string-allocn.gp (added)
theses/mike_brooks_MMath/plots/string-allocn.py (added)
theses/mike_brooks_MMath/plots/string-pbv-fixcorp.py (added)
theses/mike_brooks_MMath/plots/string-pbv-varcorp.py (added)
theses/mike_brooks_MMath/plots/string-pbv.d (added)
theses/mike_brooks_MMath/plots/string-pbv.gp (moved) (moved from doc/theses/mike_brooks_MMath/plot-pbv.gp ) (2 diffs)
theses/mike_brooks_MMath/plots/string-peq-cppemu.gp (modified) (1 diff)
theses/mike_brooks_MMath/plots/string-peq-cppemu.py (modified) (2 diffs)
theses/mike_brooks_MMath/plots/string-peq-sharing.d (added)
theses/mike_brooks_MMath/plots/string-peq-sharing.gp (moved) (moved from doc/theses/mike_brooks_MMath/plot-peq-sharing.gp ) (2 diffs)
theses/mike_brooks_MMath/plots/string-peq-sharing.py (added)
theses/mike_brooks_MMath/plots/string-pta-sharing.d (added)
theses/mike_brooks_MMath/plots/string-pta-sharing.gp (moved) (moved from doc/theses/mike_brooks_MMath/plot-pta-sharing.gp ) (2 diffs)
theses/mike_brooks_MMath/plots/string-pta-sharing.py (added)
theses/mike_brooks_MMath/programs/bkgd-cfa-arrayinteract.cfa (modified) (1 diff)
theses/mike_brooks_MMath/programs/hello-accordion.cfa (modified) (2 diffs)
theses/mike_brooks_MMath/string.tex (modified) (45 diffs)
theses/mike_brooks_MMath/word.cc (added)
theses/mike_brooks_MMath/word.cfa (added)


doc/LaTeXmacros/common.sty

-              r7d02d35
+              rbd72f517
 %% Created On       : Sat Apr  9 10:06:17 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Wed Mar 19 21:22:28 2025
 %% Update Count     : 664
+%% Last Modified On : Mon May  5 21:37:13 2025
+%% Update Count     : 666
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% This latex idiom for checking empty optional parameters
+%    \ifx#1\@empty\else\if\relax\detokenize{#1}\relax
+% first checks if there is no optional parameter specified: \name{...} versus \name[]{...}
+% second checks if the optional parameter is specified but empty: \name[]{...} versus \name[...]{...}
 \setlength{\textheight}{9in}
 …
 \newcommand{\Index}{\@ifstar\@sIndex\@Index}
 % inline text and as-in index: \Index[as-is index text]{inline text}
 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
+\newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi}
 % inline text but index with different as-is text: \Index[index text]{inline text}
 \newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
+\newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi}
 % inline text and code index (cannot use ©)
 …
 \newcommand{\newtermFontInline}{\emph}
 \newcommand{\newterm}{\protect\@ifstar\@snewterm\@newterm}
 \newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
 \newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
+\newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi}
+\newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi}
 % \snake{<identifier>}
 …
 \renewcommand{\reftextfaraway}[1]{\unskip, p.~\pageref{#1}}
 \renewcommand{\reftextpagerange}[2]{\unskip, pp.~\pageref{#1}--\pageref{#2}}
 \newcommand{\VRef}[2][Section]{\ifx#1\@empty\else{#1}\nobreakspace\fi\vref{#2}}
 \newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else{#1}\nobreakspace\fi\vrefrange{#2}{#3}}
 \newcommand{\VPageref}[2][page]{\ifx#1\@empty\else{#1}\nobreakspace\fi\pageref{#2}}
 \newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else{#1}\nobreakspace\fi\pageref{#2}{#3}}
+\newcommand{\VRef}[2][Section]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vref{#2}}
+\newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vrefrange{#2}{#3}}
+\newcommand{\VPageref}[2][page]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}}
+\newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}{#3}}
 \let\Oldthebibliography\thebibliography
 …
 \setlength{\columnposn}{\gcolumnposn}
 \newcommand{\setgcolumn}[1]{\global\gcolumnposn=#1\global\columnposn=\gcolumnposn}
 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}}
 \newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}}
+\newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}}
+\newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}}
 \newcommand{\CRT}{\global\columnposn=\gcolumnposn}
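
The three cases named in the comment block above (no optional argument, an empty optional argument, a non-empty optional argument) can be seen in isolation with a small standalone document. This is only a sketch: `\demo` is a hypothetical macro that is not part of common.sty, and unlike the macros in the diff it prints a different message in each branch so the cases are visible in the output.

```latex
\documentclass{article}
\makeatletter
% \demo is a hypothetical macro used only for illustration.
% The optional argument defaults to \@empty, so "not given" is detectable.
\newcommand{\demo}[2][\@empty]{%
  \ifx#1\@empty
    % #1 is the default token \@empty: the call was \demo{...}
    (no optional argument) #2%
  \else
    \if\relax\detokenize{#1}\relax
      % #1 expands to nothing: the call was \demo[]{...}
      (empty optional argument) #2%
    \else
      % #1 has content: the call was \demo[...]{...}
      (optional argument ``#1'') #2%
    \fi
  \fi}
\makeatother
\begin{document}
\demo{a}\par
\demo[]{b}\par
\demo[x]{c}
\end{document}
```

The outer `\ifx` test only recognizes the default token, so the inner `\if\relax\detokenize{#1}\relax` test is what separates `\demo[]{...}` from `\demo[...]{...}`.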

doc/LaTeXmacros/common.tex

-              r7d02d35
+              rbd72f517
 %% Created On       : Sat Apr  9 10:06:17 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Wed Mar 19 07:37:17 2025
 %% Update Count     : 688
+%% Last Modified On : Mon May  5 21:34:53 2025
+%% Update Count     : 709
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% This latex idiom for checking empty optional parameters
+%    \ifx#1\@empty\else\if\relax\detokenize{#1}\relax
+% first checks if there is no optional parameter specified: \name{...} versus \name[]{...}
+% second checks if the optional parameter is specified but empty: \name[]{...} versus \name[...]{...}
 \setlength{\textheight}{9in}
 …
 \newcommand{\Index}{\@ifstar\@sIndex\@Index}
 % inline text and as-in index: \Index[as-is index text]{inline text}
 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
+\newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi}
 % inline text but index with different as-is text: \Index[index text]{inline text}
 \newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
+\newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi}
 % inline text and code index (cannot use ©)
 …
 \newcommand{\newtermFontInline}{\emph}
 \newcommand{\newterm}{\protect\@ifstar\@snewterm\@newterm}
 \newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
 \newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
+\newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi}
+\newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi}
 % \snake{<identifier>}
 …
 \renewcommand{\reftextfaraway}[1]{\unskip, p.~\pageref{#1}}
 \renewcommand{\reftextpagerange}[2]{\unskip, pp.~\pageref{#1}--\pageref{#2}}
 \newcommand{\VRef}[2][Section]{\ifx#1\@empty\else{#1}\nobreakspace\fi\vref{#2}}
 \newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else{#1}\nobreakspace\fi\vrefrange{#2}{#3}}
 \newcommand{\VPageref}[2][page]{\ifx#1\@empty\else{#1}\nobreakspace\fi\pageref{#2}}
 \newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else{#1}\nobreakspace\fi\pageref{#2}{#3}}
+\newcommand{\VRef}[2][Section]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vref{#2}}
+\newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vrefrange{#2}{#3}}
+\newcommand{\VPageref}[2][page]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}}
+\newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}{#3}}
 \let\Oldthebibliography\thebibliography
 …
 \setlength{\columnposn}{\gcolumnposn}
 \newcommand{\setgcolumn}[1]{\global\gcolumnposn=#1\global\columnposn=\gcolumnposn}
 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}}
 \newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}}
+\newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}}
+\newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}}
 \newcommand{\CRT}{\global\columnposn=\gcolumnposn}
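
As a usage sketch for the revised `\VRef`, the following standalone document copies the added definition and calls it with the default prefix, an explicit prefix, and an empty prefix. The labels `s:intro` and `f:demo` and the placeholder figure are assumptions made for the example; the real macro is loaded from common.tex or common.sty rather than defined inline.

```latex
\documentclass{article}
\usepackage{varioref}
\makeatletter
% Copied from the added \VRef line in the diff above.
\newcommand{\VRef}[2][Section]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vref{#2}}
\makeatother
\begin{document}
\section{Introduction}\label{s:intro}% hypothetical label
\begin{figure}
  \centering\fbox{placeholder}
  \caption{A placeholder figure}\label{f:demo}% hypothetical label
\end{figure}
See \VRef{s:intro}.% default prefix "Section"
See \VRef[Figure]{f:demo}.% explicit prefix
See \VRef[]{s:intro}.% empty prefix: only the reference is printed
\end{document}
```

As with any cross-reference, the document needs two LaTeX runs before the references resolve.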

doc/bibliography/pl.bib

-              r7d02d35
+              rbd72f517
 % M
+@misc{M4,
+    keywords    = {macros, preprocessor},
+    contributer = {pabuhr@plg},
+    author      = {Brian W. Kernighan and Dennis M. Ritchie},
+    title       = {The M4 Macro Processor},
+    year        = 1977,
+    howpublished= {\url{https://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}},
+    optnote     = {Accessed: 2016-09},
+}
 @book{M68K,
     keywords    = {M680XX, Motorola},
 …
+}
+@inproceedings{valgind,
+    keywords    = {Memcheck, Valgrind, dynamic binary analysis, dynamic binary instrumentation, shadow values},
+    contributer = {pabuhr@plg},
+    author      = {Nethercote, Nicholas and Seward, Julian},
+    title       = {{V}algrind: a framework for heavyweight dynamic binary instrumentation},
+    publisher   = {Association for Computing Machinery},
+    address     = {New York, NY, USA},
+    booktitle   = {Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation},
+    pages       = {89-100},
+    location    = {San Diego, California, USA},
+    series      = {PLDI'07},
+    year        = {2007},
+}
 @misc{Vala,
     keywords    = {GObject, Vala},

doc/theses/fangren_yu_MMath/background.tex

-              r7d02d35
+              rbd72f517
 Furthermore, Cyclone's polymorphic functions and types are restricted to abstraction over types with the same layout and calling convention as @void *@, \ie only pointer types and @int@.
 In \CFA terms, all Cyclone polymorphism must be dtype-static.
-While the Cyclone design provides the efficiency benefits discussed in Section~\ref{sec:generic-apps} for dtype-static polymorphism, it is more restrictive than \CFA's general model.
+While the Cyclone design provides the efficiency benefits discussed in~\VRef{s:GenericImplementation} for dtype-static polymorphism, it is more restrictive than \CFA's general model.
 Smith and Volpano~\cite{Smith98} present Polymorphic C, an ML dialect with polymorphic functions, C-like syntax, and pointer types;
 it lacks many of C's features, most notably structure types, and hence, is not a practical C replacement.

doc/theses/fangren_yu_MMath/features.tex

-              r7d02d35
+              rbd72f517
 Here, manipulating the pointer address is the primary operation, while dereferencing the pointer to its value is the secondary operation.
 For example, \emph{within} a data structure, \eg stack or queue, all operations involve pointer addresses and the pointer may never be dereferenced because the referenced object is opaque.
 Alternatively, use a reference when its primary purpose is to alias a value, \eg a function parameter that does not copy the argument (performance reason).
+Alternatively, use a reference when its primary purpose is to alias a value, \eg a function parameter that does not copy the argument, for performance reasons.
 Here, manipulating the value is the primary operation, while changing the pointer address is the secondary operation.
 Succinctly, if the address changes often, use a pointer;
 …
 \CFA adopts a uniform policy between pointers and references where mutability is a separate property made at the declaration.
 The following examples shows how pointers and references are treated uniformly in \CFA.
+The following examples show how pointers and references are treated uniformly in \CFA.
 \begin{cfa}[numbers=left,numberblanklines=false]
 int x = 1, y = 2, z = 3;$\label{p:refexamples}$
 …
 @&@r3 = @&@y; @&&@r3 = @&&@r4;                          $\C{// change r1, r2}$
 \end{cfa}
 Like pointers, reference can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{
+Like pointers, references can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{
 \CC uses \lstinline{&&} for rvalue reference, a feature for move semantics and handling the \lstinline{const} Hell problem.}
 Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r2@ becomes @**r2@.
 …
 The call applies an implicit dereference once to @x@ so the call is typed @f( int & )@ with @T = int@, rather than with @T = int &@.
 As for a pointer type, a reference type may have qualifiers, where @const@ is most common.
+As with a pointer type, a reference type may have qualifiers, where @const@ is most common.
 \begin{cfa}
 int x = 3; $\C{// mutable}$
 …
 Interestingly, C does not give a warning/error if a @const@ pointer is not initialized, while \CC does.
 Hence, type @& const@ is similar to a \CC reference, but \CFA does not preclude initialization with a non-variable address.
 For example, in system's programming, there are cases where an immutable address is initialized to a specific memory location.
+For example, in systems programming, there are cases where an immutable address is initialized to a specific memory location.
 \begin{cfa}
 int & const mem_map = *0xe45bbc67@p@; $\C{// hardware mapped registers ('p' for pointer)}$
 …
 \end{cfa}
 the call to @foo@ must pass @x@ by value, implying auto-dereference, while the call to @bar@ must pass @x@ by reference, implying no auto-dereference.
+Without any restrictions, this ambiguity limits the behaviour of reference types in \CFA polymorphic functions, where a type @T@ can bind to a reference or non-reference type.
+\PAB{My analysis shows} without any restrictions, this ambiguity limits the behaviour of reference types in \CFA polymorphic functions, where a type @T@ can bind to a reference or non-reference type.
 This ambiguity prevents the type system treating reference types the same way as other types, even if type variables could be bound to reference types.
 The reason is that \CFA uses a common \emph{object trait}\label{p:objecttrait} (constructor, destructor and assignment operators) to handle passing dynamic concrete type arguments into polymorphic functions, and the reference types are handled differently in these contexts so they do not satisfy this common interface.
 …
 \end{cfa}
 While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument, a fairly common use case.
 Even if the object trait can be made optional, the current type system often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended.
+Even if the object trait can be made optional, the current compiler implementation often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended.
 Some tweaks are necessary to accommodate reference types in polymorphic contexts and it is unclear what can or cannot be achieved.
 Currently, there are contexts where the \CFA programmer is forced to use a pointer type, giving up the benefits of auto-dereference operations and better syntax with reference types.
 …
 @[x, y, z]@ = foo( 3, 4 );  // return 3 values into a tuple
 \end{cfa}
 Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common context that normally require multiple statements and/or additional declarations, all of which reduces coding time and errors.
+Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common contexts that normally require multiple statements and/or additional declarations.
 \begin{cfa}
 [x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types may be different}$
 …
 Only when returning a tuple from a function is there the notion of a tuple value.
 Overloading in the \CFA type-system must support complex composition of tuples and C type conversions using a costing scheme giving lower cost to widening conversions that do not truncate a value.
+Overloading in the \CFA type-system must support complex composition of tuples and C type conversions using a conversion cost scheme giving lower cost to widening conversions that do not truncate a value.
 \begin{cfa}
 [ int, int ] foo$\(_1\)$( int );                        $\C{// overloaded foo functions}$
 …
 \end{cfa}
 The type resolver only has the tuple return types to resolve the call to @bar@ as the @foo@ parameters are identical.
 The resultion involves unifying the flattened @foo@ return values with @bar@'s parameter list.
+The resolution involves unifying the flattened @foo@ return values with @bar@'s parameter list.
 However, no combination of @foo@s is an exact match with @bar@'s parameters;
 thus, the resolver applies C conversions to obtain a best match.
 …
 bar( foo( 3 ) ) // only one tuple returning call
 \end{lstlisting}
 Hence, programers cannot take advantage of the full power of tuples but type match is straightforward.
+Hence, programmers cannot take advantage of the full power of tuples but type match is straightforward.
 K-W C also supported tuple variables, but with a strong distinction between tuples and tuple values/variables.
 …
 \end{figure}
 The primary issues for tuples in the \CFA type system are polymorphism and conversions.
+\PAB{I identified} the primary issues for tuples in the \CFA type system are polymorphism and conversions.
 Specifically, does it make sense to have a generic (polymorphic) tuple type, as is possible for a structure?
 \begin{cfa}
 …
 \section{Tuple Implementation}
 As noted, tradition languages manipulate multiple values by in/out parameters and/or structures.
+As noted, traditional languages manipulate multiple values by in/out parameters and/or structures.
 K-W C adopted the structure for tuple values or variables, and as needed, the fields are extracted by field access operations.
 As well, for the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects.
 …
 \end{figure}
 Interestingly, in the third implementation of \CFA tuples by Robert Schluntz~\cite[\S~3]{Schluntz17}, the MVR functions revert back to structure based, where it remains in the current version of \CFA.
+Interestingly, in the third implementation of \CFA tuples by Robert Schluntz~\cite[\S~3]{Schluntz17}, the MVR functions revert back to structure based, and this remains in the current version of \CFA.
 The reason for the reversion is a uniform approach for tuple values/variables making tuples first-class types in \CFA, \ie allow tuples with corresponding tuple variables.
 This reversion was possible, because in parallel with Schluntz's work, generic types were added independently by Moss~\cite{Moss19}, and the tuple variables leveraged the same implementation techniques as for generic variables~\cite[\S~3.7]{Schluntz17}.
 …
 Scala, like \CC, provides tuple types through a library using this structural expansion, \eg Scala provides tuple sizes 1 through 22 via hand-coded generic data-structures.
 However, after experience gained building the \CFA runtime system, making tuple-types first-class seems to add little benefit.
+However, after experience gained building the \CFA runtime system, \PAB{I convinced them} making tuple-types first-class seems to add little benefit.
 The main reason is that tuples usages are largely unstructured,
 \begin{cfa}
 …
 looping is used to traverse the argument pack from left to right.
 The @va_list@ interface is walking up the stack (by address) looking at the arguments pushed by the caller.
 (Magic knowledge is needed for arguments pushed using registers.)
+(Compiler-specific ABI knowledge is needed for arguments pushed using registers.)
 \begin{figure}
 …
 Currently in \CFA, variadic polymorphic functions are the only place tuple types are used.
 And because \CFA compiles polymorphic functions versus template expansion, many wrapper functions are generated to implement both user-defined generic-types and polymorphism with variadics.
+\PAB{My analysis showed} many wrapper functions are generated to implement both user-defined generic-types and polymorphism with variadics, because \CFA compiles polymorphic functions versus template expansion.
 Fortunately, the only permitted operations on polymorphic function parameters are given by the list of assertion (trait) functions.
 Nevertheless, this small set of functions eventually needs to be called with flattened tuple arguments.
 Unfortunately, packing the variadic arguments into a rigid @struct@ type and generating all the required wrapper functions is significant work and largely wasted because most are never called.
 Interested readers can refer to pages 77-80 of Robert Schluntz's thesis to see how verbose the translator output is to implement a simple variadic call with 3 arguments.
 As the number of arguments increases, \eg a call with 5 arguments, the translator generates a concrete @struct@ types for a 4-tuple and a 3-tuple along with all the polymorphic type data for them.
+As the number of arguments increases, \eg a call with 5 arguments, the translator generates concrete @struct@ types for a 4-tuple and a 3-tuple along with all the polymorphic type data for them.
 An alternative approach is to put the variadic arguments into an array, along with an offset array to retrieve each individual argument.
 This method is similar to how the C @va_list@ object is used (and how \CFA accesses polymorphic fields in a generic type), but the \CFA variadics generate the required type information to guarantee type safety (like the @printf@ format string).
 …
 Nested \emph{named} aggregates are allowed in C but there is no qualification operator, like the \CC type operator `@::@', to access an inner type.
+\emph{To compensate for the missing type operator, all named nested aggregates are hoisted to global scope, regardless of the nesting depth, and type usages within the nested type are replaced with global type name.}
+To compensate for the missing type operator, all named nested aggregates are hoisted to global scope, regardless of the nesting depth, and type usages within the nested type are replaced with global type name.
 Hoisting nested types can result in name collisions among types at the global level, which defeats the purpose of nesting the type.
 \VRef[Figure]{f:NestedNamedAggregate} shows the nested type @T@ is hoisted to the global scope and the declaration rewrites within structure @S@.
 …
 \end{figure}
 For good reasons, \CC chose to change this semantics:
+\CC chose to change this semantics:
 \begin{cquote}
 \begin{description}[leftmargin=*,topsep=0pt,itemsep=0pt,parsep=0pt]
 …
 Like an anonymous nested type, a named Plan-9 nested type has its field names hoisted into @struct S@, so there is direct access, \eg @s.x@ and @s.i@.
 Hence, the field names must be unique, unlike \CC nested types, but the type names are at a nested scope level, unlike type nesting in C.
 In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $\subset$ @S@ and @W@ $\subset$ @S@, \eg:
+In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $<:$ @S@ and @W@ $<:$ @S@, \eg:
 \begin{cfa}
 void f( union U * u );
 …
 Note, there is no value assignment, such as, @w = s@, to copy the @W@ field from @S@.
 Unfortunately, the Plan-9 designers did not lookahead to other useful features, specifically nested types.
+Unfortunately, the Plan-9 designers did not look ahead to other useful features, specifically nested types.
 This nested type compiles in \CC and \CFA.
 \begin{cfa}
 …
 In addition, a semi-non-compatible change is made so that Plan-9 syntax means a forward declaration in a nested type.
 Since the Plan-9 extension is not part of C and rarely used, this change has minimal impact.
 Hence, all Plan-9 semantics are denoted by the @inline@ qualifier, which is good ``eye-candy'' when reading a structure definition to spot Plan-9 definitions.
+Hence, all Plan-9 semantics are denoted by the @inline@ qualifier, which clearly indicates the usage of Plan-9 definitions.
 Finally, the following code shows the value and pointer polymorphism.
 \begin{cfa}
 …
 In general, non-standard C features (@gcc@) do not need any special treatment, as they are directly passed through to the C compiler.
 However, the Plan-9 semantics allow implicit conversions from the outer type to the inner type, which means the \CFA type resolver must take this information into account.
+However, \PAB{I found} the Plan-9 semantics allow implicit conversions from the outer type to the inner type, which means the \CFA type resolver must take this information into account.
 Therefore, the \CFA resolver must implement the Plan-9 features and insert necessary type conversions into the translated code output.
 In the current version of \CFA, this is the only kind of implicit type conversion other than the standard C arithmetic conversions.
 …
 \end{c++}
 and again the expression @d.x@ is ambiguous.
 While \CC has no direct syntax to disambiguate @x@, \ie @d.B.x@ or @d.C.x@, it is possible with casts, @((B)d).x@ or @((C)d).x@.
+While \CC has no direct syntax to disambiguate @x@, \eg @d.B.x@ or @d.C.x@, it is possible with casts, @((B)d).x@ or @((C)d).x@.
 Like \CC, \CFA compiles the Plan-9 version and provides direct qualification and casts to disambiguate @x@.
 While ambiguous definitions are allowed, duplicate field names is poor practice and should be avoided if possible.
+While ambiguous definitions are allowed, duplicate field names are poor practice and should be avoided if possible.
 However, when a programmer does not control all code, this problem can occur and a naming workaround must exist.

doc/theses/fangren_yu_MMath/future.tex

-              r7d02d35
+              rbd72f517
 The following are feature requests related to type-system enhancements that have surfaced during the development of the \CFA language and library, but have not been implemented yet.
 Currently, developers must work around these missing features, sometimes resulting in inefficiency.
+\PAB{The following sections discuss new features I am proposing to fix these problems.}
 \section{Closed Trait Types}
 Currently, \CFA does not have any closed types, as open type are the basis of its unique type-system, allowing new functions to be added at any time to override existing ones for trait satisfaction.
+Currently, \CFA does not have any closed types, as open types are the basis of its unique type-system, allowing new functions to be added at any time to override existing ones for trait satisfaction.
 Locally-declared nested-functions,\footnote{
 Nested functions are not a feature in C but have been supported by \lstinline{gcc} for multiple decades and are used heavily in \CFA.}
 …
 Library implementers normally do not want users to override certain operations and cause the behaviour of polymorphic invocations to change.
 \item
 Caching and reusing resolution results in the compiler is effected, as newly introduced declarations can participate in assertion resolution;
+Caching and reusing resolution results in the compiler is affected, as newly introduced declarations can participate in assertion resolution;
 as a result, previously invalid subexpressions suddenly become valid, or alternatively cause ambiguity in assertions.
 \end{enumerate}
 …
 \end{figure}
 A \CFA closed trait type is similar to a Haskell type class requiring an explicit instance declaration.
+A \CFA closed trait type is planned to work similarly to a Haskell type class, which requires an explicit instance declaration.
 The syntax for the closed trait might look like:
 \begin{cfa}
 …
 \section{Associated Types}
+\label{s:AssociatedTypes}
 The analysis presented in \VRef{s:AssertionSatisfaction} shows if all type parameters have to be bound before assertion resolution, the complexity of resolving assertions become much lower as every assertion parameter can be resolved independently.
+The analysis presented in \VRef{s:AssertionSatisfaction} shows if all type parameters have to be bound before assertion resolution, the complexity of resolving assertions becomes much lower as every assertion parameter can be resolved independently.
 That is, by utilizing information from higher up the expression tree for return value overloading, most of the type bindings can be resolved.
 However, there are scenarios where some intermediate types need to be involved in certain operations, which are neither input nor output types.
 …
 Note that the type @list *@ satisfies both @pointer_like( list *, int )@ and @pointer_like( list *,@ @list )@ (the latter by the built-in pointer dereference operator) and the expression @*it@ can be either a @struct list@ or an @int@.
 Requiring associated types to be unique makes the @pointer_like@ trait not applicable to @list *@, which is undesirable.
 I have not attempted to implement associated types in \CFA compiler, but based on the above discussions, one option is to make associated type resolution and return type overloading coexist:
+I have not attempted to implement associated types in the \CFA compiler, but based on the above discussions, one option is to make associated type resolution and return type overloading coexist:
 when the associated type appears in the returns, it is deduced from the context and the trait is then verified with ordinary assertion resolution;
 when it does not appear in the returns, the type is required to be uniquely determined by the expression that defines the associated type.
 …
 \section{User-defined Conversions}
 Missing type-system feature is a scheme for user-defined conversions.
+A missing type-system feature in \CFA is a scheme for user-defined conversions.
 Conversion means one type goes through an arbitrarily complex process of changing its value to some meaningful value in another type.
 Because the conversion process can be arbitrarily complex, it requires the power of a function.

doc/theses/fangren_yu_MMath/intro.tex

-              r7d02d35
+              rbd72f517
 \section{Overloading}
+\label{s:Overloading}
+\vspace*{-5pt}
 \begin{quote}
 There are only two hard things in Computer Science: cache invalidation and \emph{naming things}. --- Phil Karlton
 \end{quote}
+\vspace*{-5pt}
 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
 Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
+Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguate the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
 In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process.
 Disambiguating among overloads is implemented by examining each call site and selecting the best matching overloaded function based on criteria like the types and number of arguments and the return context.
 Since the hardware does not support mixed-mode operands, @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type.
+Since the hardware does not support mixed-mode operands, such as @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type.
 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
 This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
 …
 As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
 While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
 Similarly, lexical nesting is another place where overloading occurs.
+Similarly, lexical nesting is another place where duplicate naming issues arise.
 For example, in object-oriented programming, class member names \newterm{shadow} names within members.
 Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
 Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing.
+Some programmers qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
+Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@, silently changing the meaning of @i@ at lower scope levels.
+Again, coding styles exist requiring all variables in nested blocks to be unique to prevent name shadowing problems.
 Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast.
+Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}:
+For example, if variables can be overloaded, shadowed variables of different type can produce ambiguities, indicating potential problems in lower scopes.
+Formally, overloading is defined by Strachey as one kind of \newterm{ad hoc polymorphism}:
+\vspace*{-5pt}
 \begin{quote}
 In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments.
 …
 It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00}
 \end{quote}
+\vspace*{-5pt}
 where a \newterm{transfer function} is an implicit conversion to help find a matching overload:
+\vspace*{-5pt}
 \begin{quote}
 The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap.
 …
 The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00}
 \end{quote}
+\vspace*{-5pt}
 The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument.
 A similar differentiation is applicable for overloading and default parameters.
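The distinction between parametric polymorphism and overloading can be sketched in \CC. This is only an illustration, not taken from the thesis: the names @identity@ and @describe@ are hypothetical. The template supplies one algorithm for every type, whereas each overload supplies its own body.

```cpp
#include <cassert>
#include <string>

// Parametric polymorphism: a single algorithm, instantiated for any type T.
template<typename T>
T identity( T t ) { return t; }

// Overloading: the same name bound to a different algorithm per argument type.
// (describe is a hypothetical name used only for this sketch.)
std::string describe( int ) { return "int"; }
std::string describe( double ) { return "double"; }
std::string describe( const std::string & ) { return "string"; }

// Small self-check: the call site alone selects the overload.
inline void check_overloads() {
    assert( identity( 3 ) == 3 );
    assert( describe( 3 ) == "int" );
    assert( describe( 3.5 ) == "double" );
}
```

In both cases the programmer writes one name at the call site; the difference lies in how many function bodies stand behind that name.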
 …
 \end{cfa}
 the overloaded names @S@ and @E@ are separated into the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
 In general, types are not overloaded because inferencing them is difficult to imagine in a statically programming language.
+In general, types are not overloaded because inferencing them is difficult to imagine in a statically typed programming language.
 \begin{cquote}
 \setlength{\tabcolsep}{26pt}
 …
 \noindent
 \newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading).
 In functional programming-languages, there is always a return type (except for a monad).
+In functional programming-languages, there is always a return type.
 If a return type is specified, the compiler does not have to inference the function body.
 For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
 …
 Hence, parametric overloading requires additional information about the universal types to make them useful.
+This additional information often comes as a set of operations a type must supply (@trait@/-@concept@) and these operations can then be used in the body of the function.
+\begin{cfa}
+forall( T | T ?@++@( T, T ) ) T inc( T t ) { return t@++@; }
+This additional information often comes as a set of operations that must be supplied for a type, \eg \CFA/Rust/Go have traits, \CC templates have concepts, Haskell has type-classes.
+These operations can then be used in the body of the function to manipulate the type's value.
+Here, a type binding to @T@ must have available a @++@ operation with the specified signature.
+\begin{cfa}
+forall( T | @T ?++( T, T )@ ) // trait
+T inc( T t ) { return t@++@; } // change type value
 int i = 3
 i = inc( i )
 …
 \end{cfa}
 Given a qualifying trait, are its elements inferred or declared?
 In the above example, the type system infers @int@ for @T@, infers it needs a @++@ operator that takes an @int@ and returns an @int@, and finds this function in the enclosing environment (\eg standard prelude).
+In the example, the type system infers @int@ for @T@, infers it needs an appropriately typed @++@ operator, and finds it in the enclosing environment, possibly in the language's prelude defining basic types and their operations.
 This implicit inferencing is expensive if matched with implicit conversions when there is no exact match.
 Alternatively, types opt-in to traits via declarations.
 …
 \subsection{Operator Overloading}
 Virtually all programming languages provide general overloading of the arithmetic operators across the basic computational types using the number and type of parameters and returns.
+Many programming languages provide general overloading of the arithmetic operators~\cite{OperOverloading} across the basic computational types using the number and type of parameters and returns.
 However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded.
 Like \CC, \CFA allows general operator overloading for user-defined types.
 …
 \subsection{Function Overloading}
+Both \CFA and \CC allow general overloading for functions, as long as their prototypes differ in the number and type of parameters and returns.
+Many programming languages provide general overloading for functions~\cite{FuncOverloading}, as long as their prototypes differ in the number and type of parameters.
+A few programming languages also use the return type for selecting overloaded functions \see{below}.
 \begin{cfa}
 void f( void );                 $\C[2in]{// (1): no parameter}$
 …
 f( 'A' );                               $\C{// select (2)}\CRT$
 \end{cfa}
 The type system examines each call size and first looks for an exact match and then a best match using conversions.
+The type system examines each call site and first looks for an exact match and then a best match using conversions.
 Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.
 Essentailly, the return types are \emph{reversed curried} into output parameters of the function.
+Essentially, the return types are \emph{reversed curried} into output parameters of the function.
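One way to picture moving a return type into an output parameter is the following \CC sketch, which is an assumption-laden illustration rather than how any of the cited compilers work. \CC rejects two functions that differ only in return type, but once the result is passed back through a reference parameter, ordinary argument-based resolution can tell the two apart. The name @parse@ is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// "int parse( const char * ); double parse( const char * );" is not legal C++,
// because the two declarations differ only in return type.
// Rewriting the result as an output parameter makes the "return type"
// part of the parameter list, where overload resolution can see it.
void parse( const char * s, int & out ) { out = std::atoi( s ); }
void parse( const char * s, double & out ) { out = std::atof( s ); }

// Small self-check: the type of the destination variable selects the overload.
inline void check_parse() {
    int i = 0;
    double d = 0.0;
    parse( "42", i );               // selects the int version
    parse( "2.5", d );              // selects the double version
    assert( i == 42 );
    assert( d == 2.5 );
}
```

The type of the destination variable plays the role that the left-hand side of an assignment plays in a language that resolves on return type directly.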
 For example, in many programming languages with overloading, the following functions are ambiguous without using the return type.
 \begin{cfa}
 …
 \begin{cfa}
 void foo( double d );
 int v;                              $\C[2in]{// (1)}$
+int v;                                  $\C[2in]{// (1)}$
 double v;                               $\C{// (2) variable overloading}$
 foo( v );                               $\C{// select (2)}$
 …
+}
 \end{cfa}
 It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems.
+It is interesting that shadowing \see{namespace pollution in \VRef{s:Overloading}} is considered a normal programming-language feature with only slight software-engineering problems.
 However, variable overloading within a scope is often considered dangerous, without any evidence to corroborate this claim.
 In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems.
 …
 The following covers these issues, and why this scheme is not amenable with the \CFA type system.
 One of the first and powerful type-inferencing system is Hindley--Milner~\cite{Damas82}.
+One of the first and most powerful type-inferencing systems is Hindley--Milner~\cite{Damas82}.
 Here, the type resolver starts with the types of the program constants used for initialization and these constant types flow throughout the program, setting all variable and expression types.
 \begin{cfa}
 …
 Note, return-type inferencing goes in the opposite direction to Hindley--Milner: knowing the type of the result and flowing back through an expression to help select the best possible overloads, and possibly converting the constants for a best match.
 In simpler type-inferencing systems, such as C/\CC/\CFA, there are more specific usages.
+There are multiple ways to indirectly specify a variable's type, \eg from a prior variable or expression.
 \begin{cquote}
 \setlength{\tabcolsep}{10pt}
 …
 \end{tabular}
 \end{cquote}
+The two important capabilities are:
+Here, @type(expr)@ computes the same type as @auto@ does for the right-hand expression.
+The advantages are:
 \begin{itemize}[topsep=0pt]
 \item
 …
 This issue is exaggerated with \CC templates, where type names are 100s of characters long, resulting in unreadable error messages.
 \item
 Ensuring the type of secondary variables, match a primary variable.
+Ensuring the types of secondary variables match a primary variable.
 \begin{cfa}
 int x; $\C{// primary variable}$
 …
 \end{itemize}
 Note, the use of @typeof@ is more restrictive, and possibly safer, than general type-inferencing.
+\begin{cquote}
+\setlength{\tabcolsep}{20pt}
+\begin{tabular}{@{}ll@{}}
 \begin{cfa}
 int x;
 …
 type(x) z = ... // complex expression
 \end{cfa}
+Here, the types of @y@ and @z@ are fixed (branded), whereas with type inferencing, the types of @y@ and @z@ are potentially unknown.
+&
+\begin{cfa}
+int x;
+auto y = ... // complex expression
+auto z = ... // complex expression
+\end{cfa}
+\end{tabular}
+\end{cquote}
+On the left, the types of @y@ and @z@ are fixed (branded), whereas on the right, the types of @y@ and @z@ can fluctuate.
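The same contrast can be sketched in \CC, using @decltype@ as a rough counterpart of @typeof@. This is an illustrative sketch only: @initial@ is a hypothetical helper standing in for a complex initializing expression.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper standing in for "... // complex expression".
char initial() { return 'a'; }

int x = 0;                      // primary variable
decltype(x) y = initial();      // branded: y has the type of x, whatever initial() returns
auto z = initial();             // inferred: z takes the type of its initializer

// Small self-check of the resulting types.
inline void check_types() {
    assert( (std::is_same<decltype(y), int>::value) );
    assert( (std::is_same<decltype(z), char>::value) );
}
```

If the return type of @initial@ later changes, the type of @z@ changes with it, whereas @y@ remains an @int@ and the initializer is converted or rejected.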
 \subsection{Type-Inferencing Issues}
 Each kind of type-inferencing system has its own set of issues that flow onto the programmer in the form of convenience, restrictions, or confusions.
+Each kind of type-inferencing system has its own set of issues that affect the programmer in the form of convenience, restrictions, or confusions.
 A convenience is having the compiler use its overarching program knowledge to select the best type for each variable based on some notion of \emph{best}, which simplifies the programming experience.
 …
 For example, if a change is made in an initialization expression, it can cascade type changes producing many other changes and/or errors.
 At some point, a variable's type needs to remain constant and the initializing expression needs to be modified or be in error when it changes.
 Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the complier can report a mismatch with the constant initialization.
+Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the compiler can report a mismatch with the constant initialization.
 \begin{cfa}
 void f( @int@ x, @int@ y ) {  // brand function prototype
 …
 As a result, understanding and changing the code becomes almost impossible.
 Types provide important clues as to the behaviour of the code, and correspondingly to correctly change or add new code.
 In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on a fancy IDE that can re-engineer types for them.
+In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on an IDE that can re-engineer types for them.
 For example, given:
 \begin{cfa}
 …
 In this situation, having the type name or its short alias is essential.
 \CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint the right types within an expression.
+\CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint correct types within an expression.
 Type inferencing defeats this goal because there is no left-hand type.
+Fundamentally, type inferencing tries to magic away variable types from the programmer.
+However, this results in lazy programming with the potential for poor performance and safety concerns.
+Types are as important as control-flow in writing a good program, and should not be masked, even if it requires the programmer to think!
+A similar issue is garbage collection, where storage management is magicked away, often resulting in poor program design and performance.\footnote{
+There are full-time Java consultants, who are hired to find memory-management problems in large Java programs.}
+The entire area of Computer-Science data-structures is obsessed with time and space, and that obsession should continue into regular programming.
+Understanding space and time issues is an essential part of the programming craft.
+Given @typedef@ and @typeof@ in \CFA, and the strong desire to use the left-hand type in resolution, the decision was made not to support implicit type-inferencing in the type system.
+Fundamentally, type inferencing tries to remove explicit typing from programming.
+However, writing down types is an important aspect of good programming, as it provides a check of the programmer's expected type and the actual type.
+Thinking carefully about types is similar to thinking carefully about data structures, often resulting in better performance and safety.
+Similarly, thinking carefully about storage management in unmanaged languages is an important aspect of good programming, versus implicit storage management (garbage collection) in managed languages.\footnote{
+There are full-time Java consultants, who are hired to find memory-management problems in large Java programs, \eg Monika Beckworth.}
+Given @typedef@ and @typeof@, and the strong desire to use the left-hand type in resolution, no attempt has been made in \CFA to support implicit type-inferencing.
 Should a significant need arise, this decision can be revisited.
 …
 int i, * ip = identity( &i );
 \end{cfa}
 Unlike \CC template functions, \CFA polymorphic functions are compatible with C \emph{separate compilation}, preventing compilation and code bloat.
+Unlike \CC template functions, \CFA polymorphic functions are compatible with \emph{separate compilation}, preventing compilation and code bloat.
 To constrain polymorphic types, \CFA uses \newterm{type assertions}~\cite[pp.~37-44]{Alphard} to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type variable.
 …
 int val = twice( twice( 3 ) );  $\C{// val == 12}$
 \end{cfa}
 Parametric polymorphism and assertions occur in existing type-unsafe (@void *@) C functions, like @qsort@ for sorting an array of unknown values.
+The closest approximation to parametric polymorphism and assertions in C is type-unsafe (@void *@) functions, like @qsort@ for sorting an array of unknown values.
 \begin{cfa}
 void qsort( void * base, size_t nmemb, size_t size, int (*cmp)( const void *, const void * ) );
 …
 The @sized@ assertion passes size and alignment, as a data object has no implicit assertions.
 Both assertions are used in @malloc@ via @sizeof@ and @_Alignof@.
 In practise, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided producing a type-safe call to @malloc@ or @memalign@.
+In practice, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided producing a type-safe call to @malloc@ or @memalign@.
 This mechanism is used to construct type-safe wrapper-libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions.
 …
 forall( T @| sumable( T )@ )   // use trait
 T sum( T a[$\,$], size_t size ) {
         @T@ total = 0;          // initialize by 0 constructor
+        @T@ total = 0;            // initialize by 0 constructor
         for ( i; size )
                 total @+=@ a[i];    // select appropriate +
+                total @+=@ a[i];        // select appropriate +
         return total;
+}
 …
 \end{tabular}
 \end{cquote}
 Traits are implemented by flatten them at use points, as if written in full by the programmer.
+Traits are implemented by flattening them at use points, as if written in full by the programmer.
 Flattening often results in overlapping assertions, \eg operator @+@.
 Hence, trait names play no part in type equivalence.
 …
 Write bespoke data structures for each context.
 While this approach is flexible and supports integration with the C type checker and tooling, it is tedious and error prone, especially for more complex data structures.
 \item
 Use @void *@-based polymorphism, \eg the C standard library functions @bsearch@ and @qsort@, which allow for the reuse of code with common functionality.
 However, this approach eliminates the type checker's ability to ensure argument types are properly matched, often requiring a number of extra function parameters, pointer indirection, and dynamic allocation that is otherwise unnecessary.
 \item
+Use preprocessor macros, similar to \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret.
+Furthermore, writing and using complex preprocessor macros is difficult and inflexible.
+Use an internal macro capability, like \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret.
+Furthermore, writing complex template macros is difficult and complex.
+\item
+Use an external macro capability, like M4~\cite{M4}, to generate generic code, but errors may be difficult to interpret.
+Like internal macros, writing and using external macros is equally difficult and complex.
 \end{enumerate}
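The @void *@ approach from the list above can be sketched with the standard @qsort@, called here from \CC as @std::qsort@. The helper names @cmp_int@ and @sort_ints@ are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>

// The comparator receives untyped pointers; the casts below are unchecked,
// so nothing ties this function to the element type of the array being sorted.
int cmp_int( const void * a, const void * b ) {
    int l = *static_cast<const int *>( a );
    int r = *static_cast<const int *>( b );
    return ( l > r ) - ( l < r );           // negative, zero, or positive
}

// Type-safe wrapper: the element size and comparator are fixed in one place.
void sort_ints( int * a, std::size_t n ) {
    std::qsort( a, n, sizeof( int ), cmp_int );
}

// Small self-check.
inline void check_sort() {
    int a[] = { 3, 1, 2 };
    sort_ints( a, 3 );
    assert( a[0] == 1 && a[1] == 2 && a[2] == 3 );
}
```

Calling @std::qsort@ directly with an array of @double@ and @cmp_int@ is accepted by the type checker, because every pointer converts to @void *@; the wrapper restores the check at the cost of writing one wrapper per element type.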
 …
 \end{tabular}
 \end{cquote}
+\label{s:GenericImplementation}
 \CFA generic types are \newterm{fixed} or \newterm{dynamic} sized.
 Fixed-size types have a fixed memory layout regardless of type parameters, whereas dynamic types vary in memory layout depending on the type parameters.
 …
 For software-engineering reasons, the set of assertions would be refactored into a trait to allow alternative implementations, like a Java \lstinline[language=java]{interface}.
+In summation, the \CFA type system inherits \newterm{nominal typing} for concrete types from C, and adds \newterm{structural typing} for polymorphic types.
+Traits are used like interfaces in Java or abstract base-classes in \CC, but without the nominal inheritance relationships.
+Instead, each polymorphic function or generic type defines the structural type needed for its execution, which is fulfilled at each call site from the lexical environment, like Go~\cite{Go} or Rust~\cite{Rust} interfaces.
+Hence, new lexical scopes and nested functions are used extensively to create local subtypes, as in the @qsort@ example, without having to manage a nominal inheritance hierarchy.
+In summation, the \CFA type system inherits \newterm{nominal typing} for concrete types from C;
+however, without inheritance in \CFA, nominal typing cannot be extended to polymorphic subtyping.
+Instead, \CFA adds \newterm{structural typing} and uses it to generate polymorphism.
+Here, traits are like interfaces in Java or abstract base-classes in \CC, but without the nominal inheritance relationships.
+Instead, each polymorphic function or generic type defines the structural requirements needed for its execution, which are fulfilled at each call site from the lexical environment, like Go~\cite{Go} or Rust~\cite{Rust} interfaces.
+Hence, lexical scopes and nested functions are used extensively to mimic subtypes, as in the @qsort@ example, without managing a nominal inheritance hierarchy.
 …
 general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member}
                                                 & O\footnote{except assignment}/F       & O/F/M & V/O/F & M\footnote{not universal}     & O/M   & O/F/M & no    & no    \\
 general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter number, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type}
+general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter count, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type}
                                                 & T/\#//R\footnote{parameter names can be used to disambiguate among overloads but not create overloads}
                                                                         & T/\#  & T/\#/R        & T/\#  & T/\#/N/R      & T/\#/N/R      & T/\#/N        & T/R \\
 …
 However, the parameter operations are severely restricted because universal types have few operations.
 For example, swift provides a @print@ operation for its universal type, and the java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc.
+For example, Swift provides a @print@ operation for its universal type, and the Java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc.
 This restricted mechanism still supports a few useful functions, where the parameters are abstract entities, \eg:
 \begin{swift}
 …
 \end{swift}
 To make a universal function useable, an abstract description is needed for the operations used on the parameters within the function body.
 Type matching these operations can occur by discover using techniques like \CC template expansion, or explicit stating, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds.
+Type matching these operations can be done by discovery, using techniques like \CC template expansion, or by explicit statement, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds.
 The mechanism chosen can affect separate compilation or require runtime type information (RTTI).
 \begin{description}
 …
 \begin{figure}
 \setlength{\tabcolsep}{15pt}
+\setlength{\tabcolsep}{12pt}
 \begin{tabular}{@{}ll@{}}
 \multicolumn{1}{c}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{Haskell}} \\
 …
 forall( T ) trait sumable {
         void ?{}( T &, zero_t );
+        T ?+=?( T &, T );
+};
+        T ?+=?( T &, T );  };
 forall( T | sumable( T ) )
 T sum( T a[], size_t size ) {
         T total = 0;
         for ( i; size ) total += a[i];
+        return total;
+}
+        return total;  }
 struct S { int i, j; };
 void ?{}( S & s, zero_t ) { s.[i, j] = 0; }
 …
 void ?{}( S & s, int i, int j ) { s.[i, j] = [i, j]; }
 S ?+=?( S & l, S r ) { l.[i, j] += r.[i, j]; }
+int main() {
+        int ia[] = { 1, 2, 3 };
+        sout | sum( ia, 3 );        // trait inference
+        double da[] = { 1.5, 2.5, 3.5 };
+        sout | sum( da, 3 );        // trait inference
+        S sa[] = { {1, 1}, {2, 2}, {3, 3 } };
+        sout | sum( sa, 3 ).[i, j]; // trait inference
+}
+int main() {            // trait inferencing
+        sout | sum( (int []){ 1, 2, 3 }, 3 );
+        sout | sum( (double []){ 1.5, 2.5, 3.5 }, 3 );
+        sout | sum( (S []){ {1,1}, {2,2}, {3,3} }, 3 ).[i, j];  }
 \end{cfa}
+&
 …
         szero :: a
         sadd :: a -> a -> a
 ssum ::  Sumable a $=>$ [a] -> a
 ssum (x:xs) = sadd x (ssum xs)
 ssum [] = szero
 data S = S Int Int deriving Show
 @instance Sumable Int@ where
 …
 @instance Sumable Float@ where
         szero = 0.0
    sadd = (+)
+        sadd = (+)
 @instance Sumable S@ where
         szero = S 0 0
 …
 \end{haskell}
 \end{tabular}
 \caption{Implicitly/Explicitly Trait Inferencing}
 \label{f:ImplicitlyExplicitlyTraitInferencing}
+\caption{Implicit/Explicit Trait Inferencing}
+\label{f:ImplicitExplicitTraitInferencing}
 \end{figure}
 One differentiating feature among these specialization techniques is the ability to implicitly or explicitly infer the trait information at a call site.
 \VRef[Figure]{f:ImplicitlyExplicitlyTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell.
+\VRef[Figure]{f:ImplicitExplicitTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell.
 Here, the \CFA type system inferences the trait functions at each call site, so no additional specification is necessary by the programmer.
 The Haskell program requires the programmer to explicitly bind the trait to each type that can be summed.
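For comparison, \CC template expansion also sits at the implicit end of this spectrum. The sketch below is an assumption-based illustration and not part of the figure. It mirrors the @sum@ example, with value initialization standing in for the zero constructor, and the operations are discovered when the template is expanded at each call site.

```cpp
#include <cassert>
#include <cstddef>

struct S { int i, j; };
// The only operation S must supply beyond value initialization.
S & operator+=( S & l, const S & r ) { l.i += r.i; l.j += r.j; return l; }

// No trait is named: the requirements ( T{} and += ) are implied by the body
// and checked when the template is instantiated.
template<typename T>
T sum( const T * a, std::size_t size ) {
    T total{};                              // stands in for the zero constructor
    for ( std::size_t k = 0; k < size; k += 1 ) total += a[k];
    return total;
}

// Small self-check on a builtin type.
inline void check_sum() {
    int ia[] = { 1, 2, 3 };
    assert( sum( ia, 3 ) == 6 );
}
```

As with the \CFA version, no per-type declaration is needed, but the template body must be visible at every use, which is the separate-compilation cost noted earlier.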
 …
 \end{ada}
 Finally, there is a belief that certain type systems cannot support general overloading, \eg Haskell.
 As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on the opinion of the language designers and the type system they choose, not any reason in type theory.
+As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on design choices made by the language designers, not any reason in type theory.
 The fourth row classifies if conversions are attempted beyond exact match.
 …
 The details of compiler optimization work are covered in a previous technical report~\cite{Yu20}, which essentially forms part of this thesis.
 \item
 The thesis presents a systematic review of the new features added to the \CFA language and its type system.
+This thesis presents a systematic review of the new features added to the \CFA language and its type system.
 Some of the more recent inclusions to \CFA, such as tuples and generic structure types, were not well tested during development due to the limitation of compiler performance.
 Several issues coming from the interactions of various language features are identified and discussed in this thesis;

doc/theses/fangren_yu_MMath/resolution.tex

-              r7d02d35
+              rbd72f517
 \label{c:content2}
 Recapping, the \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
+Recapping, \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
 in addition, C's multiple implicit type-conversions must be respected.
 This generality leads to internal complexity and correspondingly higher compilation cost directly related to type resolution.
 …
 \end{enumerate}
 \VRef[Table]{t:SelectedFileByCompilerBuild} shows improvements for selected tests with accumulated reductions in compile time across each of the 5 fixes.
 To this day, the large reduction in compilation time significantly improves the development of the \CFA's runtime because of its frequent compilation cycles.
+The large reduction in compilation time significantly improves the development of the \CFA runtime because of its frequent compilation cycles.
 \begin{table}[htb]
 …
 Some of those problems arise from the newly introduced language features described in the previous chapter.
 In addition, fixing unexpected interactions within the type system has presented challenges.
 This chapter describes in detail the type-resolution rules currently in use and some major problems that have been identified.
+This chapter describes in detail the type-resolution rules currently in use and some major problems \PAB{I} have identified.
 Not all of those problems have immediate solutions, because fixing them may require redesigning parts of the \CFA type system at a larger scale, which correspondingly affects the language design.
 …
 \begin{enumerate}[leftmargin=*]
 \item \textbf{Unsafe} cost representing a narrowing conversion of arithmetic types, \eg @int@ to @short@, and qualifier-dropping conversions for pointer and reference types.
 Narrowing conversions have the potential to lose (truncation) data.
+Narrowing conversions have the potential to lose (truncate) data.
 A programmer must decide if the computed data-range can safely be shortened to fit the smaller storage.
 Warnings for unsafe conversions are helpful.
 …
 \item \textbf{Safe} cost representing a widening conversion \eg @short@ to @int@, qualifier-adding conversions for pointer and reference types, and value conversion for enumeration constants.
 Even when conversions are safe, the fewest conversions it ranked better, \eg @short@ to @int@ versus @short@ to @long int@.
+When all conversions are safe, closer conversions are ranked better, \eg @short@ to @int@ versus @short@ to @long int@.
 \begin{cfa}
 void f( long int p ); $\C[2.5in]{// 1}$
 …
 \item \textbf{Specialization} cost counting the number of restrictions introduced by type assertions.
 Fewer restriction means fews parametric variables passed at the function call giving better performance.
+Fewer restrictions mean fewer parametric variables passed at the function call, giving better performance.
 \begin{cfa}
 forall( T | { T ?+?( T, T ) } ) void f( T ); $\C[3.25in]{// 1}$
 …
 \end{cfa}
 \end{enumerate}
 Cost tuples are compared by lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.
+Cost tuples are compared in lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.
 At a subexpression level, the lowest cost candidate for each result type is included as a possible interpretation of the expression;
 at the top level, all possible interpretations of different types are considered (generating a total ordering) and the overall lowest cost is selected as the final interpretation of the expression.
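As an illustration only, the lexicographic comparison and the rule that a tie at the lowest cost means ambiguity can be sketched in Python, whose tuples already compare lexicographically. The five field names and the candidate labels are assumptions made for this sketch, every element is treated as a non-negative counter where lower is better, and the per-result-type pruning at the subexpression level is ignored.

```python
from typing import NamedTuple, Optional


class Cost(NamedTuple):
    # Field order is the comparison order: most significant first.
    # The names are hypothetical labels chosen for this sketch.
    unsafe: int
    poly: int
    safe: int
    var: int
    spec: int


def select(candidates: list[tuple[str, Cost]]) -> Optional[str]:
    """Return the unique lowest-cost candidate name, or None when
    several candidates tie at the lowest cost (ambiguous)."""
    best = min(cost for _, cost in candidates)  # lexicographic minimum
    winners = [name for name, cost in candidates if cost == best]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    # One unsafe conversion loses to two safe conversions,
    # because the unsafe element is compared first.
    options = [
        ("narrowing", Cost(1, 0, 0, 0, 0)),
        ("widening", Cost(0, 0, 2, 0, 0)),
    ]
    print(select(options))
```

Because the unsafe element is compared first, a candidate with any unsafe conversion loses to a candidate with none, regardless of how many safe conversions the latter needs.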
 Glen Ditchfield first proposed this costing model~\cite[\S~4.4.5]{Ditchfield92} to generate a resolution behaviour that is reasonable to C programmers based on existing conversions in the C programming language.
 This model carried over into the first implementation of the \CFA type-system by Richard Bilson~\cite[\S~2.2]{Bilson03}, and was extended but not redesigned by Aaron Moss~\cite[chap.~4]{Moss19}.
 Moss's work began to show problems with the underlying costing model;
+Moss's work began to show problems with the underlying cost model;
 these design issues are part of this work.
 …
 Therefore, at each resolution step, the arguments are already given unique interpretations, so the ordering only needs to compare different sets of conversion targets (function parameter types) on the same set of input.
 In \CFA, trying to use such a system is problematic because of the presence of return-type overloading of functions and variable.
+\PAB{My conclusion} is that trying to use such a system in \CFA is problematic because of the presence of return-type overloading of functions and variables.
 Specifically, \CFA expression resolution considers multiple interpretations of argument subexpressions with different types, \eg:
 so it is possible that both the selected function and the set of arguments are different, and cannot be compared with a partial-ordering system.
 …
 \end{quote}
 However, I was unable to generate any Ada example program that demonstrates this preference.
 In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order every legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
+In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order all legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
 Interestingly, the \CFA cost-based model can sometimes make expression resolution too permissive because it always attempts to select the lowest cost option, and only when there are multiple options tied at the lowest cost does it report the expression is ambiguous.
 …
 Other than the case of multiple exact matches, where all have cost zero, incomparable candidates under a partial ordering can often have different expression costs since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections.
 There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is clearly justifiable, and one of them is straight up wrong.
+There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is justifiable, and one of them is clearly wrong.
 \begin{enumerate}[leftmargin=*]
 \item Polymorphic exact match versus non-polymorphic inexact match.
 …
 \end{itemize}
 In this example, option 1 produces the prototype @void f( int )@, which gives an exact match and therefore takes priority.
 The \CC resolution rules effectively makes option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types below @long@.
+The \CC resolution rules effectively make option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types ranked lower than @long@ as well.
 This difference could be explained as compensating for \CFA polymorphic functions being separately compiled versus template inlining;
 hence, calling them requires passing type information and assertions increasing the runtime cost.
 …
 Although it is true that both the sequences 1, 2 and 1, 3, 4 are increasingly more constrained on the argument types, option 2 is not comparable to either option 3 or 4;
 they actually describe independent constraints on the two arguments.
 Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@,
+Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@.
 Because two constraints can independently be satisfied, neither should be considered a better match when trying to resolve a call to @f@ with argument types @(int, int)@;
 reporting such an expression as ambiguous is more appropriate.
 …
 Passing a @pair@ variable to @f@
 \begin{cfa}
 pair p;
+pair(int, double) p;
 f( p );
 \end{cfa}
 gives a cost of 1 poly, 2 variable for the @pair@ overload, versus a cost of 1 poly, 1 variable for the unconstrained overload.
 Programmer expectation is to select option 1 because of the exact match, but the cost model selects 2;
 while either could work, the type system should select a call that meets expectation of say the call is ambiguous, forcing the programmer to mediate.
+it is not possible to write a specialization for @f@ that works on any pair type and gets selected by the type resolver as intended.
 As a result, simply counting the number of polymorphic type variables is no longer correct to order the function candidates as being more constrained.
 \end{enumerate}
 These inconsistencies are not easily solvable in the current cost-model, meaning the currently \CFA codebase has to workaround these defects.
+These inconsistencies are not easily solvable in the current cost-model, meaning that currently the \CFA codebase has to work around these defects.
 One potential solution is to mix the conversion cost and \CC-like partial ordering of specializations.
 For example, observe that the first three elements (unsafe, polymorphic and safe conversions) in the \CFA cost-tuple are related to the argument/parameter types, while the other two elements (polymorphic variable and assertion counts) are properties of the function declaration.
 …
 Here, the unsafe cost of signed to unsigned is factored into the ranking, so the safe conversion is selected over an unsafe one.
 Furthermore, an integral option is taken before considering a floating option.
 This model locally matches the C approach, but provides an ordering when there are many overloaded alternative.
+This model locally matches the C approach, but provides an ordering when there are many overload alternatives.
 However, as Moss pointed out, overload resolution by total cost has problems, \eg handling cast expressions.
 \begin{cquote}
 …
 if an expression has any legal interpretations as a C builtin operation, only the lowest cost one is kept, regardless of the result type.
 \VRef[Figure]{f:CFAArithmeticConversions} shows an alternative \CFA partial-order arithmetic-conversions graphically.
+\VRef[Figure]{f:CFAArithmeticConversions} shows \PAB{my} alternative \CFA partial-order arithmetic-conversions graphically.
 The idea here is to first look for the best integral alternative because integral calculations are exact and cheap.
 If no integral solution is found, then there are different rules to select among floating-point alternatives.
 …
 \section{Type Unification}
 Type unification is the algorithm that assigns values to each (free) type parameters such that the types of the provided arguments and function parameters match.
+Type unification is the algorithm that assigns values to each (free) type parameter such that the types of the provided arguments and function parameters match.
 \CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing).
 …
 With the introduction of generic record types, the parameters must match exactly as well; currently neither covariance nor contravariance is supported for the generics.
 One simplification was made to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed.
+\PAB{I made} one simplification to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed.
 The polymorphic function declarations themselves are still treated as function pointer types internally, however the change means that formal parameter types can no longer be polymorphic.
 Previously it was possible to write function prototypes such as
 …
 A function operates on the call-site arguments together with any local and global variables.
 When the function is polymorphic, the types are inferred at each call site.
 On each invocation, the types to be operate on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
+On each invocation, the types to be operated on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
 For example, consider a polymorphic function that takes one argument of type @T@ and a polymorphic function pointer.
 \begin{cfa}
 …
 The assertion set that needs to be resolved is just the declarations on the function prototype, which also simplifies the assertion satisfaction algorithm, which is discussed further in the next section.
 An implementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened.
+\PAB{My} implementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalence classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened.
 In the general approach commonly used in functional languages, the unification variables are given a lower bound and an upper bound to account for covariance and contravariance of types.
 \CFA does not implement any variance with its generic types and does not allow polymorphic function types, therefore no explicit upper bound is needed and one binding value for each equivalence class suffices.
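A minimal sketch of such an environment, assuming a union-find representation, is given below in Python. The class and method names are hypothetical, types are modelled as plain strings, and the opaque and widening flags are omitted; it only illustrates that one binding per equivalence class suffices when no upper bound is tracked, and is not the compiler's data structure.

```python
from typing import Optional


class TypeEnv:
    """Type variables grouped into equivalence classes (union-find),
    with at most one bound type per class."""

    def __init__(self) -> None:
        self.parent: dict[str, str] = {}  # type variable -> parent variable
        self.bound: dict[str, str] = {}   # class representative -> bound type

    def find(self, var: str) -> str:
        """Return the representative of var's class, creating it if new."""
        self.parent.setdefault(var, var)
        while self.parent[var] != var:
            self.parent[var] = self.parent[self.parent[var]]  # path halving
            var = self.parent[var]
        return var

    def bind(self, var: str, ty: str) -> bool:
        """Bind var's class to ty; fail if it is already bound differently."""
        rep = self.find(var)
        if rep in self.bound and self.bound[rep] != ty:
            return False
        self.bound[rep] = ty
        return True

    def unify_vars(self, a: str, b: str) -> bool:
        """Merge the classes of a and b; fail on conflicting bindings."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True
        ta, tb = self.bound.get(ra), self.bound.get(rb)
        if ta is not None and tb is not None and ta != tb:
            return False
        self.parent[rb] = ra
        if ta is None and tb is not None:
            self.bound[ra] = tb  # carry the binding over to the new representative
        self.bound.pop(rb, None)
        return True

    def lookup(self, var: str) -> Optional[str]:
        """Return the type bound to var's class, or None if unbound."""
        return self.bound.get(self.find(var))


if __name__ == "__main__":
    env = TypeEnv()
    env.bind("T", "int")
    env.unify_vars("T", "U")
    print(env.lookup("U"))
```

Both operations mutate the environment, which is the side effect that later complicates assertion resolution and caching.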
 …
 In previous versions of \CFA, this number was set at 4; as the compiler becomes more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it does not lead to problems.
+In previous versions of \CFA, this number was set at 4; as the compiler has become more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it has not led to problems.
 Only rarely is there a case where the infinite recursion produces an exponentially growing assertion set, causing minutes of time wasted before the limit is reached.
 Fortunately, it is very hard to generate this situation with realistic \CFA code, and the ones that have occurred have clear characteristics, which can be prevented by alternative approaches.
 …
 One example is analysed in this section.
 While the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible.
+\PAB{My analysis shows that} while the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible.
 The most significant impact is that type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means if one expression has multiple associated assertions it is dependent, as the changes to the type environment must be compatible for all the assertions to be resolved.
 Particularly, if one assertion parameter can be resolved in multiple different ways, all of the results need to be checked to make sure the changes to type variable bindings are compatible with other assertions to be resolved.
 …
 In many cases, these problems can be avoided by examining other assertions that provide insight on the desired type binding: if one assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking.
 The Moss algorithm currently used in \CFA was developed using a simplified type-simulator that capture most of \CFA type-system features.
+The Moss algorithm currently used in \CFA was developed using a simplified type system that captures most of \CFA's type system features.
 The simulation results were then ported back to the actual language.
 The simulator used a mix of breadth- and depth-first search in a staged approach.
 …
 If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until there are none pending resolution or the recursion limit is reached, which results in a failure.
 However, in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions.
+However, \PAB{I identify that} in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions.
 Suppose an unbound type variable @T@ appears in two assertions:
 \begin{cfa}
 …
 A type variable introduced by the @forall@ clause of function declaration can appear in parameter types, return types and assertion variables.
 If it appears in parameter types, it can be bound when matching the arguments to parameters at the call site.
 If it only appears in the return type, it can be eventually be determined from the call-site context.
+If it only appears in the return type, it can eventually be determined from the call-site context.
 Currently, type resolution cannot do enough return-type inferencing while performing eager assertion resolution: the return type information is unknown before the parent expression is resolved, unless the expression is an initialization context where the variable type is known.
 By delaying the assertion resolution until the return type becomes known, this problem can be circumvented.
 The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables (associate types).
+The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables (\newterm{associated types}).
 \begin{cfa}
 forall( T | { void foo( @T@ ) } ) int f( float ) {
 …
+}
 \end{cfa}
 This case is rare so forcing every type variable to appear at least once in parameter or return types limits does not limit the expressiveness of \CFA type system to a significant extent.
 The next section presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which is provides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}.
+This case is rare, so forcing every type variable to appear at least once in parameter or return types does not limit the expressiveness of the \CFA type system to a significant extent.
+\VRef{s:AssociatedTypes} presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which provides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issues discussed in \VRef{s:ExpressionCostModel}.
 …
 Based on the experiment results, this approach can improve the performance of expression resolution in general, and sometimes allow difficult instances of assertion resolution problems to be solved that are otherwise infeasible, \eg when the resolution encounters an infinite loop.
 The tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
+\PAB{I identify that} the tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
 If the modifications are cached, \ie the results that cause the type bindings to be modified, it is also necessary to store the changes to type bindings.
 Furthermore, in cases where multiple candidates can be used to satisfy one assertion parameter, all of them must be cached including those that are not eventually selected, since the side effect can produce different results depending on the context.
 …
 However, the implementation of the type environment is simplified;
 it only stores a tentative type binding with a flag indicating whether \emph{widening} is possible for an equivalence class of type variables.
 Formally speaking, this means the type environment used in \CFA is only capable of representing \emph{lower-bound} constraints.
+Formally speaking, \PAB{I concluded} the type environment used in \CFA is only capable of representing \emph{lower-bound} constraints.
 This simplification works most of the time, given the following properties of the existing \CFA type system and the resolution algorithms:
 \begin{enumerate}
 …
 \end{enumerate}
 \CFA does attempt to incorporate upstream type information propagated from variable a declaration with initializer, since the type of the variable being initialized is known.
+\CFA does attempt to incorporate upstream type information propagated from a variable declaration with initializer, since the type of the variable being initialized is known.
 However, the current type-environment representation is flawed in handling such type inferencing, when the return type in the initializer is polymorphic.
 Currently, an inefficient workaround is performed to create the necessary effect.

doc/theses/fangren_yu_MMath/uw-ethesis.bib

-              r7d02d35
+              rbd72f517
 % For use with BibTeX
+@misc{OperOverloading,
+    contributer = {pabuhr@plg},
+    key         = {Operator Overloading},
+    title       = {Operator Overloading},
+    author      = {{WikipediA}},
+    howpublished= {\url{https://en.wikipedia.org/wiki/Operator_overloading}},
+    year        = 2025,
+}
+@misc{FuncOverloading,
+    contributer = {pabuhr@plg},
+    key         = {Function Overloading},
+    title       = {Function Overloading},
+    author      = {{WikipediA}},
+    howpublished= {\url{https://en.wikipedia.org/wiki/Function_overloading}},
+    year        = 2025,
+}

doc/theses/fangren_yu_MMath/uw-ethesis.tex

r7d02d35	rbd72f517
100	100	\lstnewenvironment{ada}[1][]{\lstset{language=Ada,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
101	101
102		\newcommand{\PAB}[1]{{\color{red}PAB: #1}}
	102	\newcommand{\PAB}[1]{{\color{magenta}#1}}
103	103	\newcommand{\newtermFont}{\emph}
104	104	\newcommand{\Newterm}[1]{\newtermFont{#1}}

doc/theses/mike_brooks_MMath/Makefile

-              r7d02d35
+              rbd72f517
 TeXSRC = ${wildcard *.tex}
 PicSRC = ${notdir ${wildcard ${Pictures}/*.png}} ${notdir ${wildcard ${Pictures}/*.fig}}
+PicSRC := ${PicSRC:.fig=.pdf}           # substitute ".fig" with ".pdf"
+GraphSRC_OLD = ${notdir ${wildcard ${Pictures}/*.dat}}
+GraphSRC_OLD := ${GraphSRC_OLD:.dat=.pdf}               # substitute ".dat" with ".pdf"
+PlotINPUTS = ${wildcard ${Plots}/*.gp} ${wildcard ${Plots}/*.py}
+PlotINPUTS := ${addsuffix .INPUTS,${PlotINPUTS}}
+PicSRC := ${PicSRC:.fig=.pdf}                   # substitute ".fig" with ".pdf"
 PlotSRC = ${notdir ${wildcard ${Plots}/*.gp}}
 PlotSRC := ${addprefix ${Build}/plot-,${PlotSRC:.gp=.pdf}}              # substitute ".gp" with ".pdf"
+PlotSRC := ${addprefix ${Build}/plot-,${PlotSRC:.gp=.pdf}} # substitute ".gp" with ".pdf"
 DemoPgmSRC = ${notdir ${wildcard ${Programs}/*-demo.cfa}}
 PgmSRC = ${notdir ${wildcard ${Programs}/*}}
 …
 # Rules and Recipes
 .PHONY : all clean                      # not file names
+.PHONY : all clean                              # not file names
 .SECONDARY:
 #.PRECIOUS : ${Build}/%                         # don't delete intermediates
 …
 # File Dependencies
 ${DOCUMENT}: ${TeXSRC} $(RunPgmOut) ${DemoPgmOut} ${GraphSRC_OLD} ${PlotSRC} ${PicSRC} ${BibSRC} ${BibRep}/pl.bib ${LaTMac}/common.tex Makefile | ${Build}
+${DOCUMENT}: ${TeXSRC} $(RunPgmOut) ${DemoPgmOut} ${PlotSRC} ${PicSRC} ${BibSRC} ${BibRep}/pl.bib ${LaTMac}/common.tex Makefile | ${Build}
         echo ${PicSRC}
         echo ${GraphSRC_OLD}
 …
         ${CFA} $< -o $@
 ${Build}/%: ${Programs}/%.run.cfa | ${Build} # cfa cannot handle pipe
+${Build}/%: ${Programs}/%.run.cfa | ${Build}    # cfa cannot handle pipe
         sed -f ${Programs}/sedcmd $< > ${Build}/tmp.cfa; ${CFA} ${Build}/tmp.cfa -o $@
 …
         $< > $@
-string-graph-peq-sharing.pdf: string-graph-peq-sharing.dat plot-peq-sharing.gp | ${Build}
-        gnuplot plot-peq-sharing.gp
-string-graph-pta-sharing.pdf: string-graph-pta-sharing.dat plot-pta-sharing.gp | ${Build}
-        gnuplot plot-pta-sharing.gp
-string-graph-pbv.pdf: string-graph-pbv.dat plot-pbv.gp | ${Build}
-        gnuplot plot-pbv.gp
-string-graph-allocn.pdf: string-graph-allocn.dat plot-allocn.gp | ${Build}
-        gnuplot plot-allocn.gp
 %.pdf: %.fig | ${Build}
         fig2dev -L pdf $< > ${Build}/$@
 -include $(Plots)/string-peq-cppemu.d
+-include $(Plots)/*.d
 ${Build}/plot-%.dat: ${Plots}/%.py ${Plots}/%.py.INPUTS | ${Build}
-        echo ${PlotINPUTS}
         python3 $< > $@

doc/theses/mike_brooks_MMath/array.tex

-              r7d02d35
+              rbd72f517
 though using a new style of generic parameter.
 \begin{cfa}
 @array( float, 99 )@ x;                                 $\C[2.75in]{// x contains 99 floats}$
 \end{cfa}
 Here, the arguments to the @array@ type are @float@ (element type) and @99@ (length).
 When this type is used as a function parameter, the type-system requires that a call's argument is a perfect match.
+@array( float, 99 )@ x;                                 $\C[2.5in]{// x contains 99 floats}$
+\end{cfa}
+Here, the arguments to the @array@ type are @float@ (element type) and @99@ (dimension).
+When this type is used as a function parameter, the type-system requires the argument is a perfect match.
 \begin{cfa}
 void f( @array( float, 42 )@ & p ) {}   $\C{// p accepts 42 floats}$
 f( x );                                                                 $\C{// statically rejected: type lengths are different, 99 != 42}$
 test2.cfa:3:1 error: Invalid application of existing declaration(s) in expression.
 Applying untyped:  Name: f ... to:  Name: x
 \end{cfa}
+Here, the function @f@'s parameter @p@ is declared with length 42.
+However, the call @f( x )@ is invalid, because @x@'s length is @99@, which does not match @42@.
+A function declaration can be polymorphic over these @array@ arguments by using the \CFA @forall@ declaration prefix.
+Function @f@'s parameter expects an array with dimension 42, but the argument dimension 99 does not match.
+A function can be polymorphic over @array@ arguments using the \CFA @forall@ declaration prefix.
 \begin{cfa}
 forall( T, @[N]@ )
 …
+}
 g( x, 0 );                                                              $\C{// T is float, N is 99, dynamic subscript check succeeds}$
+g( x, 1000 );                                                   $\C{// T is float, N is 99, dynamic subscript check fails}\CRT$
+g( x, 1000 );                                                   $\C{// T is float, N is 99, dynamic subscript check fails}$
 Cforall Runtime error: subscript 1000 exceeds dimension range [0,99) $for$ array 0x555555558020.
 \end{cfa}
+Function @g@ takes an arbitrary type parameter @T@ and a \emph{dimension parameter} @N@.
+A dimension parameter represents a to-be-determined count of elements, managed by the type system.
+The call @g( x, 0 )@ is valid because @g@ accepts any length of array, where the type system infers @float@ for @T@ and length @99@ for @N@.
+Inferring values for @T@ and @N@ is implicit.
+Furthermore, in this case, the runtime subscript @x[0]@ (parameter @i@ being @0@) in @g@ is valid because 0 is in the dimension range $[0,99)$ of argument @x@.
+However, the call @g( x, 1000 )@ is also accepted through compile time;
+however, this case's subscript, @x[1000]@, generates an error, because @1000@ is outside the dimension range $[0,99)$ of argument @x@.
+Function @g@ takes an arbitrary type parameter @T@ and an unsigned integer \emph{dimension} @N@.
+The dimension represents a to-be-determined number of elements, managed by the type system, where 0 represents an empty array.
+The type system implicitly infers @float@ for @T@ and @99@ for @N@.
+Furthermore, the runtime subscript @x[0]@ (parameter @i@ being @0@) in @g@ is valid because 0 is in the dimension range $[0,99)$ for argument @x@.
+The call @g( x, 1000 )@ is also accepted at compile time.
+However, the subscript, @x[1000]@, generates a runtime error, because @1000@ is outside the dimension range $[0,99)$ of argument @x@.
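The two kinds of checking described here can be imitated with a small Python sketch. It is an analogy, not \CFA semantics: Python has no static dimension checking, so the fixed-dimension check in @f@ is performed at run time, and the class name, the error types, and the body of @g@ are assumptions made for illustration.

```python
class Array:
    """Fixed-dimension array with checked subscripts (hypothetical stand-in)."""

    def __init__(self, dim: int, fill: float = 0.0) -> None:
        if dim < 0:
            raise ValueError("dimension must be non-negative")
        self.dim = dim
        self._data = [fill] * dim

    def __getitem__(self, i: int) -> float:
        # Reject anything outside [0, dim); Python's negative indexing is disallowed.
        if not 0 <= i < self.dim:
            raise IndexError(
                f"subscript {i} exceeds dimension range [0,{self.dim})")
        return self._data[i]


def f(p: Array) -> None:
    # Emulates a parameter that accepts exactly 42 elements;
    # in the text this mismatch is rejected at compile time.
    if p.dim != 42:
        raise TypeError(f"dimensions are different, {p.dim} != 42")


def g(x: Array, i: int) -> float:
    # Accepts any dimension; only the subscript is checked, at run time.
    return x[i]


if __name__ == "__main__":
    x = Array(99)
    print(g(x, 0))
    try:
        g(x, 1000)
    except IndexError as err:
        print(err)
```

The split mirrors the text: a dimension mismatch concerns the argument as a whole, while an out-of-range subscript depends on a value only known when the call executes.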
 In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function.
 The syntactic form is chosen to parallel other @forall@ forms:
 \begin{cfa}
 forall( @[N]@ ) ...     $\C[1.5in]{// dimension}$
 forall( T ) ...         $\C{// value datatype (formerly, "otype")}$
 forall( T & ) ...       $\C{// opaque datatype (formerly, "dtype")}\CRT$
+forall( @[N]@ ) ...     $\C{// dimension}$
+forall( T ) ...         $\C{// value datatype}$
+forall( T & ) ...       $\C{// opaque datatype}$
 \end{cfa}
 % The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance.
 …
 \begin{cfa}
 forall( [N] )
+void declDemo( ... ) {
+        float x1[N];                                            $\C{// built-in type ("C array")}$
+        array(float, N) x2;                                     $\C{// type from library}$
+}
+\end{cfa}
+Both of the locally-declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@.
+The two variables have identical size and layout; they both encapsulate 42-float stack allocations, with no additional ``bookkeeping'' allocations or headers.
+void f( ... ) {
+        float x1[@N@];                                          $\C{// C array, no subscript checking}$
+        array(float, N) x2;                                     $\C{// \CFA array, subscript checking}\CRT$
+}
+\end{cfa}
+Both of the stack declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@.
+The two variables have identical size and layout, with no additional ``bookkeeping'' allocations or headers.
+The C array, @x1@, has no subscript checking, while \CFA array, @x2@, does.
 Providing this explicit generic approach requires a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA.
+In all following discussion, ``C array'' means types like that of @x1@ and ``\CFA array'' means the standard-library @array@ type (instantiations), like the type of @x2@.
+Admittedly, the @array@ library type for @x2@ is syntactically different from its C counterpart.
+A future goal (TODO xref) is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@).
+In all following discussion, ``C array'' means types like @x1@ and ``\CFA array'' means types like @x2@.
+A future goal is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@).
 Then, the library @array@ type could be removed, giving \CFA a largely uniform array type.
+At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis;
+feature support and C compatibility are revisited in Section ? TODO.
+At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis.
 My contributions in this chapter are:
 \begin{enumerate}
+\begin{enumerate}[leftmargin=*]
 \item A type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@.
 \item Provide a length-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued.
+\item Provide a dimension/subscript-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued.
 \item Provide argument/parameter passing safety for arrays and subscript safety.
-\item TODO: general parking...
 \item Identify the interesting specific abilities available by the new @array@ type.
 \item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap.
 …
+\begin{comment}
 \section{Dependent Typing}
+General dependent typing allows the type system to encode arbitrary predicates (\eg behavioural specifications for functions),
+which is an anti-goal for my work.
+General dependent typing allows a type system to encode arbitrary predicates, \eg behavioural specifications for functions, which is an anti-goal for my work.
 Firstly, this application is strongly associated with pure functional languages,
 where a characterization of the return value (giving it a precise type, generally dependent upon the parameters)
 …
 Secondly, TODO: bash Rust.
 TODO: cite the crap out of these claims.
+\end{comment}
 \section{Features Added}
 This section shows more about using the \CFA array and dimension parameters, demonstrating their syntax and semantics by way of motivating examples.
+This section shows more about using the \CFA array and dimension parameters, demonstrating syntax and semantics by way of motivating examples.
 As stated, the core capability of the new array is tracking all dimensions within the type system, where dynamic dimensions are represented using type variables.
 By declaring type variables at the front of object declarations, an array dimension is lexically referenceable where it is needed.
+For example, a declaration can share one length, @N@, among a pair of parameters and the return,
+meaning that it requires both input arrays to be of the same length, and guarantees that the result is of that length as well.
+For example, a declaration can share one length, @N@, among a pair of parameters and return type, meaning the input arrays and return array are the same length.
 \lstinput{10-17}{hello-array.cfa}
 Function @f@ does a pointwise comparison of its two input arrays, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array.
 The dynamic allocation of the @ret@ array, by the library @alloc@ function,
+The dynamic allocation of the @ret@ array uses the library @alloc@ function,
 \begin{cfa}
 forall( T & | sized(T) )
 …
+}
 \end{cfa}
 uses the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type.
 Note that @alloc@ only sees one whole type for its @T@ (which is @f@'s @array(bool, N)@); this type's size is a computation based on @N@.
+which captures the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type.
+Note, @alloc@ only sees the whole type for its @T@, @array(bool, N)@, where this type's size is a computation based on @N@.
 This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary \emph{sized} assertions needed by other types.
 (\emph{sized} implies a concrete \vs abstract type with a runtime-available size, exposed as @sizeof@.)
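For comparison, the following is a minimal plain-C sketch of what a hand-written analogue of @f@ might look like; it is not taken from @hello-array.cfa@.
The name @within_half_percent@, the explicit @n@ parameter, and the exact tolerance formula are assumptions made for illustration.
The sketch shows the length travelling as a separate argument, and the allocation size being a runtime computation on @n@ through @sizeof(bool[n])@, which is the work the \CFA version performs implicitly through the @sized@ assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hand-written C analogue of f: the length n is an explicit
   parameter rather than part of the array type. */
bool *within_half_percent(size_t n, const float x[], const float y[]) {
    if (n == 0) return NULL;                 /* a zero-length VLA type is undefined in C */
    bool *ret = malloc(sizeof(bool[n]));     /* size is computed at run time from n */
    if (ret == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        float diff = x[i] - y[i];
        if (diff < 0) diff = -diff;          /* absolute difference */
        float mag = x[i] < 0 ? -x[i] : x[i]; /* magnitude of the reference value */
        ret[i] = diff <= 0.005f * mag;       /* assumed reading of "within half a percent" */
    }
    return ret;
}
```

Nothing in this C version stops a caller from passing arrays whose real lengths differ from @n@; that agreement is what the shared @N@ in the \CFA declaration enforces.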
 …
 \lstinput{30-43}{hello-array.cfa}
 \lstinput{45-48}{hello-array.cfa}
 \caption{\lstinline{f} Harness}
 \label{f:fHarness}
+\caption{\lstinline{f} Example}
+\label{f:fExample}
 \end{figure}
 \VRef[Figure]{f:fHarness} shows a harness that uses function @f@, illustrating how dynamic values are fed into the @array@ type.
+\VRef[Figure]{f:fExample} shows an example using function @f@, illustrating how dynamic values are fed into the @array@ type.
 Here, the dimension of arrays @x@, @y@, and @result@ is specified from a command-line value, @dim@, and these arrays are allocated on the stack.
 Then the @x@ array is initialized with decreasing values, and the @y@ array with amounts offset by constant @0.005@, giving relative differences within tolerance initially and diverging for later values.
 …
 In summary:
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item
 @[N]@ within a @forall@ declares the type variable @N@ to be a managed length.
 \item
 @N@ can be used an expression of type @size_t@ within the declared function body.
+@N@ can be used in an expression with type @size_t@ within the function body.
 \item
 The value of an @N@-expression is the acquired length, derived from the usage site, \ie generic declaration or function call.
 …
 \begin{enumerate}[leftmargin=*]
 \item
+The \CC template @N@ can only be compile-time value, while the \CFA @N@ may be a runtime value.
+% agreed, though already said
+The \CC template @N@ can only be a compile-time value, while the \CFA @N@ may be a runtime value.
 \item
 \CC does not allow a template function to be nested, while \CFA allows its polymorphic functions to be nested.
+% why is this important?
+\item
+Hence, \CC precludes a simple form of information hiding.
+\item
+\label{p:DimensionPassing}
 The \CC template @N@ must be passed explicitly at the call, unless @N@ has a default value, even when \CC can deduce the type of @T@.
 The \CFA @N@ is part of the array type and passed implicitly at the call.
 …
 % mycode/arrr/thesis-examples/check-peter/cs-cpp.cpp, v2
 \item
 \CC cannot have an array of references, but can have an array of pointers.
+\CC cannot have an array of references, but can have an array of @const@ pointers.
 \CC has a (mistaken) belief that references are not objects, but pointers are objects.
 In the \CC example, the arrays fall back on C arrays, which have a duality with references with respect to automatic dereferencing.
 …
 % https://stackoverflow.com/questions/922360/why-cant-i-make-a-vector-of-references
 \item
+\label{p:ArrayCopy}
 C/\CC arrays cannot be copied, while \CFA arrays can be copied, making them a first-class object (although array copy is often avoided for efficiency).
 % fixed by comparing to std::array
 % mycode/arrr/thesis-examples/check-peter/cs-cpp.cpp, v10
 \end{enumerate}
 TODO: settle Mike's concerns with this comparison (perhaps, remove)
+The \CC template @array@ type mitigates points \VRef[]{p:DimensionPassing} and \VRef[]{p:ArrayCopy}, but it is also trying to accomplish a similar mechanism to \CFA @array@.
 \begin{figure}
 …
 Just as the first example in \VRef[Section]{s:ArrayIntro} shows a compile-time rejection of a length mismatch,
 so are length mismatches stopped when they involve dimension parameters.
 While \VRef[Figure]{f:fHarness} shows successfully calling a function @f@ expecting two arrays of the same length,
+While \VRef[Figure]{f:fExample} shows successfully calling a function @f@ expecting two arrays of the same length,
 \begin{cfa}
 array( bool, N ) & f( array( float, N ) &, array( float, N ) & );
 …
 The same argument safety and the associated implicit communication of array length occurs.
 Preexisting \CFA allowed aggregate types to be generalized with type parameters, enabling parameterizing of element types.
+Now, \CFA also allows parameterizing them by length.
+Doing so gives a refinement of C's ``flexible array member'' pattern[TODO: cite ARM 6.7.2.1 pp18]\cite{arr:gnu-flex-mbr}.
+While a C flexible array member can only occur at the end of the enclosing structure,
+\CFA allows length-parameterized array members to be nested at arbitrary locations.
+This flexibility, in turn, allows for multiple array members.
+This has been extended to allow parameterizing by dimension.
+Doing so gives a refinement of C's ``flexible array member''~\cite[\S~6.7.2.1.18]{C11}.
+\begin{cfa}
+struct S {
+        ...
+        double d []; // incomplete array type => flexible array member
+} * s = malloc( sizeof( struct S ) + sizeof( double [10] ) );
+\end{cfa}
+which allocates space for 10 @double@s in the flexible array member at the end of the structure.
+A C flexible array member can only occur at the end of a structure;
+\CFA allows length-parameterized array members to be nested at arbitrary locations, with intervening member declarations.
 \lstinput{10-15}{hello-accordion.cfa}
 The structure has course- and student-level metadata (their respective field names) and a position-based preferences' matrix.
 Its layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters.
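To make the varying offsets concrete, the following plain-C sketch lays out the same three regions by hand in one allocation.
It is an illustration only: the type @SchoolSketch@, its helper functions, the use of @int@ elements, and the student-major ordering of the matrix are all assumptions, not the definitions in @hello-accordion.cfa@.
It shows the offset arithmetic on @C@ and @S@ that a programmer would otherwise write manually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hand-rolled layout: C course ids, then S student ids,
   then an S-by-C matrix of preferences, all in one block of ints. */
typedef struct {
    size_t C, S;   /* both lengths must be carried by hand */
    int *data;     /* single allocation holding all three regions */
} SchoolSketch;

/* Allocate the block; returns nonzero on success. Overflow is not checked. */
int school_init(SchoolSketch *s, size_t C, size_t S) {
    s->C = C;
    s->S = S;
    s->data = calloc(C + S + S * C, sizeof(int));
    return s->data != NULL;
}

/* The student-id region starts after the C course ids. */
int *school_student_id(SchoolSketch *s, size_t is) {
    assert(is < s->S);
    return &s->data[s->C + is];
}

/* The preference region starts after both id regions, so its offset
   depends on both C and S. */
int *school_pref(SchoolSketch *s, size_t is, size_t pref) {
    assert(is < s->S && pref < s->C);
    return &s->data[s->C + s->S + is * s->C + pref];
}
```

Every accessor repeats part of the layout computation, so a single arithmetic slip corrupts a neighbouring region without any diagnostic.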
 \VRef[Figure]{f:checkHarness} shows a program main using @School@ and results with different array sizes.
+\VRef[Figure]{f:checkExample} shows a program main using @School@ and results with different array sizes.
 The @school@ variable holds many students' course-preference forms.
 It is on the stack and its initialization does not use any casting or size arithmetic.
 …
 \end{cquote}
 \caption{\lstinline{School} harness, input and output}
 \label{f:checkHarness}
+\caption{\lstinline{School} Example, Input and Output}
+\label{f:checkExample}
 \end{figure}
 When a function operates on a @School@ structure, the type system handles its memory layout transparently.
 \lstinput{30-37}{hello-accordion.cfa}
 In the example, this @getPref@ function answers, for the student at position @is@, what is the position of its @pref@\textsuperscript{th}-favoured class?
+In the example, function @getPref@ returns, for the student at position @is@, what is the position of their @pref@\textsuperscript{th}-favoured class?
 …
 The repurposed heavy equipment is
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item
         Resolver provided values for a used declaration's type-system variables,
 …
 int main() {
         thing( @10@ ) x;  f( x );  $\C{// prints 10, [4]}$
+        thing( 100 ) y;  f( y );  $\C{// prints 100}$
+        return 0;
+        thing( @100@ ) y;  f( y );  $\C{// prints 100}$
+}
 \end{cfa}
 This example has:
 \begin{enumerate}
+\begin{enumerate}[leftmargin=*]
 \item
         The symbol @N@ being declared as a type variable (a variable of the type system).
 …
 Because the box pass handles a type's size as its main datum, the encoding is chosen to use it.
 The production and recovery are then straightforward.
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item
         The value $n$ is encoded as a type whose size is $n$.
 \item
         Given a dimension expression $e$, produce type @char[@$e$@]@ to represent it.
+        Given a dimension expression $e$, produce an internal type @char[@$e$@]@ to represent it.
         If $e$ evaluates to $n$ then the encoded type has size $n$.
 \item
 …
+}
 int main() {
+        thing( char[@10@] ) x;  f( x );  $\C{// prints 10, [4]}$
+        thing( char[100] ) y;  f( y );  $\C{// prints 100}$
+        return 0;
+        thing( @char[10]@ ) x;  f( x );  $\C{// prints 10, [4]}$
+        thing( @char[100]@ ) y;  f( y );  $\C{// prints 100}$
+}
 \end{cfa}
 Observe:
 \begin{enumerate}
+\begin{enumerate}[leftmargin=*]
 \item
         @N@ is now declared to be a type.
         It is declared to be \emph{sized} (by the @*@), meaning that the box pass shall do its @sizeof(N)@--@__sizeof_N@ extra parameter and expression translation.
+        It is declared to be \emph{sized} (by the @*@), meaning that the box pass shall do its @sizeof(N)@$\rightarrow$@__sizeof_N@ extra parameter and expression translation.
 \item
         @thing(N)@ is a type; the argument to the generic @thing@ is a type (type variable).
 …
         The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is an ordinary expression.
 \item
         The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type of bytes, @char@).
+        The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type of bytes, @char[@$e$@]@).
 \end{enumerate}
 …
         struct __conc_thing_10 {} x;  f( @10@, &x );  $\C{// prints 10, [4]}$
         struct __conc_thing_100 {} y;  f( @100@, &y );  $\C{// prints 100}$
-        return 0;
+}
 \end{cfa}
 Observe:
 \begin{enumerate}
+\begin{enumerate}[leftmargin=*]
 \item
         The type parameter @N@ is gone.
 …
         The @sout...@ expression (being an application of the @?|?@ operator) has a regular variable (parameter) usage for its second argument.
 \item
         Information about the particular @thing@ instantiation (value 10) has moved, from the type, to a regular function-call argument.
+        Information about the particular @thing@ instantiation (value 10) is moved, from the type, to a regular function-call argument.
 \end{enumerate}
 At the end of the desugaring and downstream processing, the original C idiom of ``pass both a length parameter and a pointer'' has been reconstructed.
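The reconstructed idiom can be sketched in plain C as follows.
This is a simplified illustration under stated assumptions, not the compiler's actual output: the names @thing_length@ and @DIM_ENCODE@ are hypothetical, and the sketch only mirrors the two ideas described above, namely that a dimension $e$ is encoded as the size of @char[@$e$@]@, and that this size then travels as an ordinary argument beside a pointer.

```c
#include <stddef.h>

/* Hypothetical encoding step: a dimension e is represented by the type
   char[e], whose size is e because sizeof(char) is 1. */
#define DIM_ENCODE(e) sizeof(char[(e)])

/* Hypothetical lowered function: the dimension arrives as a regular
   size_t argument next to an untyped pointer to the object. */
size_t thing_length(size_t sizeof_N, const void *x) {
    (void)x;           /* the object itself carries no length header */
    return sizeof_N;   /* recovered dimension is the size of the encoding type */
}
```

A call such as @thing_length( DIM_ENCODE(10), &x )@ passes the value 10 explicitly, which corresponds to the extra argument seen in the lowered calls to @f@.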
 …
 The compiler's action produces the more complex form, which if handwritten, would be error-prone.
 Back at the compiler front end, the parsing changes AST schema extensions and validation rules for enabling the sugared user input.
 \begin{itemize}
+At the compiler front end, the changes comprise parsing changes, AST schema extensions, and validation rules for enabling the sugared user input.
+\begin{itemize}[leftmargin=*]
 \item
         Recognize the form @[N]@ as a type-variable declaration within a @forall@.
 …
         Have the new brand of type-variable, \emph{Dimension}, in the AST form of a type-variable, to represent one parsed from @[-]@.
 \item
         Allow a type variable to occur in an expression.  Validate (after parsing) that only dimension-branded type variables are used here.
+        Allow a type variable to occur in an expression.  Validate (after parsing) that only dimension-branded type-variables are used here.
 \item
         Allow an expression to occur in type-argument position.  Brand the resulting type argument as a dimension.
 …
 \label{s:ArrayTypingC}
+Essential in giving a guarantee of accurate length is the compiler's ability
+to reject a program that presumes to mishandle length.
+By contrast, most discussion so far dealt with communicating length,
+from one party who knows it, to another who is willing to work with any given length.
+For scenarios where the concern is a mishandled length,
+the interaction is between two parties who both claim to know something about it.
+Such a scenario occurs in this pure C fragment, which today's C compilers accept:
+\begin{cfa}
+int n = @42@;
+float x[n];
+float (*xp)[@999@] = &x;
+Essential in giving a guarantee of accurate length is the compiler's ability to reject a program that presumes to mishandle length.
+By contrast, most discussion so far deals with communicating length, from one party who knows it, to another willing to work with any given length.
+For scenarios where the concern is a mishandled length, the interaction is between two parties who both claim to know something about it.
+C and \CFA can check when working with two static values.
+\begin{cfa}
+enum { n = 42 };
+float x[@n@];   // or just 42
+float (*xp1)[@42@] = &x;    // accept
+float (*xp2)[@999@] = &x;   // reject
+warning: initialization of 'float (*)[999]' from incompatible pointer type 'float (*)[42]'
+\end{cfa}
+When a variable is involved, C and \CFA take two different approaches.
+Today's C compilers accept the following without warning.
+\begin{cfa}
+static const int n = 42;
+float x[@n@];
+float (* xp)[@999@] = &x; $\C{// should be static rejection here}$
 (*xp)[@500@]; $\C{// in "bound"?}$
 \end{cfa}
 Here, the array @x@ has length 42, while a pointer to it (@xp@) claims length 999.
 So, while the subscript of @xp@ at position 500 is out of bound of its referent @x@,
+So, while the subscript of @xp@ at position 500 is out of bound with its referent @x@,
 the access appears in-bound of the type information available on @xp@.
+Truly, length is being mishandled in the previous step,
+where the type-carried length information on @x@ is not compatible with that of @xp@.
+The \CFA new-array rejects the analogous case:
+\begin{cfa}
+int n = @42@;
+array(float, n) x;
+array(float, 999) * xp = x; $\C{// static rejection here}$
+(*xp)[@500@]; $\C{// runtime check vs len 999}$
+\end{cfa}
+The way the \CFA array is implemented, the type analysis of this case reduces to a case similar to the earlier C version.
+In fact, length is being mishandled in the previous step, where the type-carried length information on @x@ is not compatible with that of @xp@.
+In \CFA, I choose to reject this C example at the point where the type-carried length information on @x@ is not compatible with that of @xp@, and correspondingly, its array counterpart at the same location:
+\begin{cfa}
+static const int n = 42;
+array( float, @n@ ) x;
+array( float, @999@ ) * xp = &x; $\C{// static rejection here}$
+(*xp)[@500@]; $\C{// runtime check passes}$
+\end{cfa}
+The way the \CFA array is implemented, the type analysis for this case reduces to a case similar to the earlier C version.
 The \CFA compiler's compatibility analysis proceeds as:
 \begin{itemize}[parsep=0pt]
 \item
+        Is @array(float, 999)@ type-compatible with @array(float, n)@?
+\item
+        Is @arrayX(float, char[999])@ type-compatible with @arrayX(float, char[n])@?\footnote{
+                Here, \lstinline{arrayX} represents the type that results
+                from desugaring the \lstinline{array} type
+                into a type whose generic parameters are all types.
+                This presentation elides the noisy fact that
+                \lstinline{array} is actually a macro for something bigger;
+                the reduction to \lstinline{char[-]} still proceeds as sketched.}
+\item
+        Is @char[999]@ type-compatible with @char[n]@?
+        Is @array( float, 999 )@ type-compatible with @array( float, n )@?
+\item
+        Is desugared @array( float, char[999] )@ type-compatible with desugared @array( float, char[n] )@?
+%               \footnote{
+%               Here, \lstinline{arrayX} represents the type that results from desugaring the \lstinline{array} type into a type whose generic parameters are all types.
+%               This presentation elides the noisy fact that \lstinline{array} is actually a macro for something bigger;
+%               the reduction to \lstinline{char [-]} still proceeds as sketched.}
+\item
+        Is internal type @char[999]@ type-compatible with internal type @char[n]@?
 \end{itemize}
+Achieving the necessary \CFA rejections meant rejecting the corresponding C case, which is not backward compatible.
+There are two complementary mitigations for this incompatibility.
+First, a simple recourse is available to a programmer who intends to proceed
+with the statically unsound assignment.
+This situation might arise if @n@ were known to be 999,
+rather than 42, as in the introductory examples.
+The programmer can add a cast in the \CFA code.
+\begin{cfa}
+xp = @(float (*)[999])@ &x;
+\end{cfa}
+This addition causes \CFA to accept, because now, the programmer has accepted blame.
+This addition is benign in plain C, because the cast is valid, just unnecessary there.
+Moreover, the addition can even be seen as appropriate ``eye candy,''
+marking where the unchecked length knowledge is used.
+Therefore, a program being onboarded to \CFA can receive a simple upgrade,
+to satisfy the \CFA rules (and arguably become clearer),
+without giving up its validity to a plain C compiler.
+Second, the incompatibility only affects types like pointer-to-array,
+which are infrequently used in C.
+The more common C idiom for aliasing an array is to use a pointer-to-first-element type,
+which does not participate in the \CFA array's length checking.\footnote{
+The answer is false because, in general, the value of @n@ is unknown at compile time, and hence, an error is raised.
+For safety, it makes sense to reject the corresponding C case, which is a non-backwards compatible change.
+There are two mitigations for this incompatibility.
+First, a simple recourse is available in a situation where @n@ is \emph{known} to be 999 by using a cast.
+\begin{cfa}
+float (* xp)[999] = @(float (*)[999])@&x;
+\end{cfa}
+The cast means the programmer has accepted blame.
+Moreover, the cast is ``eye candy'' marking where the unchecked length knowledge is used.
+Therefore, a program being onboarded to \CFA requires some upgrading to satisfy the \CFA rules (and arguably become clearer), without giving up its validity to a plain C compiler.
+Second, the incompatibility only affects types like pointer-to-array, which are infrequently used in C.
+The more common C idiom for aliasing an array is to use a pointer-to-first-element type, which does not participate in the \CFA array's length checking.\footnote{
         Notably, the desugaring of the \lstinline{array} type avoids letting any \lstinline{-[-]} type decay,
         in order to preserve the length information that powers runtime bound-checking.}
+Therefore, the frequency of needing to upgrade legacy C code (as discussed in the first mitigation)
+is anticipated to be low.
+Because the incompatibility represents a low cost to a \CFA onboarding effort
+(with a plausible side benefit of linting the original code for a missing annotation),
+no special measures were added to retain the compatibility.
+It would be possible to flag occurrences of @-[-]@ types that come from @array@ desugaring,
+treating those with stricter \CFA rules, while treating others with classic C rules.
+If future lessons from C project onboarding warrant it,
+this special compatibility measure can be added.
+Having allowed that both the initial C example's check
+\begin{itemize}
+        \item
+                Is @float[999]@ type-compatible with @float[n]@?
+\end{itemize}
+and the second \CFA example's induced check
+\begin{itemize}
+        \item
+                Is @char[999]@ type-compatible with @char[n]@?
+\end{itemize}
+shall have the same answer, (``no''),
+discussion turns to how I got the \CFA compiler to produce this answer.
+In its preexisting form, it produced a (buggy) approximation of the C rules.
+To implement the new \CFA rules, I took the syntactic recursion a step further, obtaining,
+in both cases:
+\begin{itemize}
+        \item
+                Is @999@ compatible with @n@?
+\end{itemize}
+This compatibility question applies to a pair of expressions, whereas the earlier implementation's questions applied only to types.
+Such an expression-compatibility question is a new addition to the \CFA compiler.
+Note, these questions only arise in the context of dimension expressions on (C) array types.
+TODO: ensure these compiler implementation matters are treated under \CFA compiler background:
+type unification,
+cost calculation,
+GenPoly.
+The relevant technical component of the \CFA compiler is the type unification procedure within the type resolver.
+I added rules for continuing this unification into expressions that occur within types.
+It is still fundamentally doing \emph{type} unification
+because it is participating in binding type variables,
+and not participating in binding any variables that stand in for expression fragments
+(for there is no such sort of variable in \CFA's analysis.)
+An unfortunate fact about the \CFA compiler's preexisting implementation is that
+type unification suffers from two forms of duplication.
+The first duplication has (many of) the unification rules stated twice.
+As a result, my additions for dimension expressions are stated twice.
+The extra statement of the rules occurs in the @GenPoly@ module,
+where concrete types like @array(int, 5)@\footnote{
+        Again, the presentation is simplified
+        by leaving the \lstinline{array} macro unexpanded.}
+are lowered into corresponding C types @struct __conc_array_1234@ (the suffix being a generated index).
+In this case, the struct's definition contains fields that hardcode the argument values of @int@ and @5@.
+The next time an @array(-,-)@ concrete instance is encountered, it checks if the previous @struct __conc_array_1234@ is suitable for it.
+Yes, for another occurrence of @array(int, 5)@;
+no, for either @array(rational(int), 5)@ or @array(int, 42)@.
+By the last example, this phase must ``reject''
+the hypothesis that it should reuse the dimension-5 instance's C-lowering for a dimension-42 instance.
+The second duplication has unification (proper) being invoked at two stages of expression resolution.
+As a result, my added rule set needs to handle more cases than the preceding discussion motivates.
+In the program
+\begin{cfa}
+void @f@( double );
+forall( T & ) void @f@( T & );
+void g( int n ) {
+        array( float, n + 1 ) x;
+        f(x);   // overloaded
+}
+\end{cfa}
+when resolving the function call in @g@, the first unification stage
+compares the type @T@ of the parameter with @array( float, n + 1 )@, of the argument.
+TODO: finish.
+The actual rules for comparing two dimension expressions are conservative.
+To answer, ``yes, consider this pair of expressions to be matching,''
+is to imply, ``all else being equal, allow an array with length calculated by $e_1$
+to be passed to a function expecting a length-$e_2$ array.''\footnote{
+        TODO: Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping.
+        Should it be an earlier scoping principle?  Feels like it should matter in more places than here.}
+So, a ``yes'' answer must represent a guarantee that both expressions evaluate to the
+same result, while a ``no'' can tolerate ``they might, but we're not sure'',
+provided that practical recourses are available
+to let programmers express better knowledge.
+The new rule-set in the current release is, in fact, extremely conservative.
+I chose to keep things simple,
+and allow future needs to drive adding additional complexity, within the new framework.
+For starters, the original motivating example's rejection
+is not based on knowledge that
+the @xp@ length of (the literal) 999 is value-unequal to
+the (obvious) runtime value of the variable @n@, which is the @x@ length.
+Rather, the analysis assumes a variable's value can be anything,
+and so there can be no guarantee that its value is 999.
+So, a variable and a literal can never match.
+Two occurrences of the same literal value are obviously a fine match.
+For two occurrences of the same variable, more information is needed.
+For example, this one is fine
+\begin{cfa}
+void f( const int n ) {
+        float x[n];
+        float (*xp)[n] = x;   // accept
+}
+\end{cfa}
+while this one is not:
+\begin{cfa}
+Therefore, the need to upgrade legacy C code is low.
+Finally, if this incompatibility is a problem when onboarding C programs to \CFA, it should be possible to change the C type check to a warning rather than an error, acting as a \emph{lint} of the original code for a missing type annotation.
+To handle two occurrences of the same variable, more information is needed, \eg, this is fine,
+\begin{cfa}
+int n = 42;
+float x[@n@];
+float (*xp)[@n@] = x;   // accept
+\end{cfa}
+where @n@ remains fixed across a contiguous declaration context.
+However, an intervening dynamic statement causes a rejection.
+\begin{cfa}
+int n = 42;
+float x[@n@];
+@n@ = 999; // dynamic change
+float (*xp)[@n@] = x;   // reject
+\end{cfa}
+However, side-effects can occur in a contiguous declaration context.
+\begin{cquote}
+\setlength{\tabcolsep}{20pt}
+\begin{tabular}{@{}ll@{}}
+\begin{cfa}
+// compile unit 1
+extern int @n@;
+extern float g();
 void f() {
+        int n = 42;
+        float x[n];
+        n = 999;
+        float (*xp)[n] = x;   // reject
+}
+\end{cfa}
+Furthermore, the fact that the first example sees @n@ as @const@
+is not actually sufficient.
+In this example, @f@'s length expression's declaration is as @const@ as it can be,
+yet its value still changes between the two invocations:
+\begin{cquote}
+\setlength{\tabcolsep}{15pt}
+\begin{tabular}{@{}ll@{}}
+\begin{cfa}
+// compile unit 1
+void g();
+void f( const int & const nr ) {
+        float x[nr];
+        g();    // change n
+        @float (*xp)[nr] = x;@   // reject
+        float x[@n@] = { g() };
+        float (*xp)[@n@] = x;   // reject
+}
 \end{cfa}
 …
 \begin{cfa}
 // compile unit 2
 static int n = 42;
+int @n@ = 42;
 void g() {
         n = 99;
+}
+f( n );
+        @n@ = 99;
+}
 \end{cfa}
 \end{tabular}
 …
 The issue here is that knowledge needed to make a correct decision is hidden by separate compilation.
 Even within a translation unit, static analysis might not be able to provide all the information.
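The instability can be observed in plain C with a small sketch.
The names @n_storage@, @g@ and @length_changed_across_call@ are hypothetical, and both functions are placed in one file only to keep the sketch self-contained; in the scenario above, @g@ would live in a separate compilation unit.
The sketch shows that a @const@-qualified view of a length does not prevent the underlying value from changing across a call.

```c
/* Hypothetical stand-in for the variable defined in "compile unit 2". */
static int n_storage = 42;

/* Hypothetical stand-in for g(): changes the length behind the caller's back. */
static void g(void) {
    n_storage = 99;
}

/* Reads a length through a pointer-to-const before and after calling g().
   Returns 1 if the observed value changed, 0 otherwise. */
int length_changed_across_call(const int *nr) {
    int before = *nr;   /* length a first array declaration would use */
    g();                /* may modify the referent despite the const view */
    int after = *nr;    /* length a later declaration would use */
    return before != after;
}
```

Passing the address of @n_storage@ makes the two reads disagree, while passing the address of an unrelated local does not, which is why a conservative analysis cannot treat such an expression as stable.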
+My rule set also respects a traditional C feature: In spite of the several limitations of the C rules
+accepting cases that produce different values, there are a few mismatches that C stops.
+C is quite precise when working with two static values.
+\begin{cfa}
+enum { fortytwo = 42 };
+float x[fortytwo];
+float (*xp1)[42] = &x;    // accept
+float (*xp2)[999] = &x;   // reject
+\end{cfa}
+My \CFA rules agree with C's on these cases.
+However, if the example uses @const@, the check is possible.
+\begin{cquote}
+\setlength{\tabcolsep}{20pt}
+\begin{tabular}{@{}ll@{}}
+\begin{cfa}
+// compile unit 1
+extern @const@ int n;
+extern float g();
+void f() {
+        float x[n] = { g() };
+        float (*xp)[n] = x;   // reject
+}
+\end{cfa}
+&
+\begin{cfa}
+// compile unit 2
+@const@ int n = 42;
+void g() {
+        @n = 99@; // allowed
+}
+\end{cfa}
+\end{tabular}
+\end{cquote}
 In summary, the new rules classify expressions into three groups:
 \begin{description}
 \item[Statically Evaluable]
+        Expressions for which a specific value can be calculated (conservatively)
+        at compile-time.
+        A preexisting \CFA compiler module defines which literals, enumerators, and expressions qualify,
+        and evaluates them.
+        Expressions for which a specific value can be calculated (conservatively) at compile-time.
+        A preexisting \CFA compiler module defines which literals, enumerators, and expressions qualify and evaluates them.
 \item[Dynamic but Stable]
         The value of a variable declared as @const@, including a @const@ parameter.
 \item[Potentially Unstable]
         The catch-all category.  Notable examples include:
+        any function-call result, @float x[foo()];@,
+        the particular function-call result that is a pointer dereference, @void f(const int * n)@ @{ float x[*n]; }@, and
+        any use of a reference-typed variable.
+        any function-call result, @float x[foo()]@, the particular function-call result that is a pointer dereference, @void f(const int * n)@ @{ float x[*n]; }@, and any use of a reference-typed variable.
 \end{description}
 Within these groups, my \CFA rules are:
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item
         Accept a Statically Evaluable pair, if both expressions have the same value.
 …
 \end{itemize}
 The traditional C rules are:
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item
         Reject a Statically Evaluable pair, if the expressions have two different values.
 …
         Otherwise, accept.
 \end{itemize}
+\VRef[Figure]{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets.
+It demonstrates that the \CFA false alarms occur in the same cases that C treats as unsafe.
+It also shows that C-incompatibilities occur only in cases that C treats as unsafe.
 \begin{figure}
 …
                 where \lstinline{expr1} and \lstinline{expr2} are meta-variables varying according to the row's Case.
                 Each row's claim applies to other harnesses too, including,
                 \begin{itemize}
+                \begin{itemize}[leftmargin=*]
                 \item
                         calling a function with a parameter like \lstinline{x} and an argument of the \lstinline{xp} type,
 …
                 The table treats symbolic identity (Same/Different on rows)
                 apart from value equality (Equal/Unequal on columns).
                 \begin{itemize}
+                \begin{itemize}[leftmargin=*]
                 \item
                         The expressions \lstinline{1}, \lstinline{0+1} and \lstinline{n}
 …
 \end{figure}
+\VRef[Figure]{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets.
+It demonstrates that the \CFA false alarms occur in the same cases that C treats as unsafe.
+It also shows that C-incompatibilities occur only in cases that C treats as unsafe.
+The conservatism of the new rule set can leave a programmer needing a recourse,
+when needing to use a dimension expression whose stability argument
+is more subtle than current-state analysis.
+\begin{comment}
+Given that the above check
+\begin{itemize}
+        \item
+        Is internal type @char[999]@ type-compatible with internal type @char[n]@?
+\end{itemize}
+answers false, discussion turns to how I got the \CFA compiler to produce this answer.
+In its preexisting form, the type system had a buggy approximation of the C rules.
+To implement the new \CFA rules, I added one further step.
+\begin{itemize}
+        \item
+                Is @999@ compatible with @n@?
+\end{itemize}
+This question applies to a pair of expressions, where the earlier question applies to types.
+An expression-compatibility question is a new addition to the \CFA compiler; it occurs in the context of dimension expressions, and possibly of enumerator assignments, which must be unique.
+% TODO: ensure these compiler implementation matters are treated under \CFA compiler background: type unification, cost calculation, GenPoly.
+The relevant technical component of the \CFA compiler is the standard type-unification within the type resolver.
+\begin{cfa}
+example
+\end{cfa}
+I added rules for continuing this unification into expressions that occur within types.
+It is still fundamentally doing \emph{type} unification because it is participating in binding type variables, and not participating in binding any variables that stand in for expression fragments (for there is no such sort of variable in \CFA's analysis).
+An unfortunate fact about the \CFA compiler's preexisting implementation is that type unification suffers from two forms of duplication.
+In detail, the first duplication has (many of) the unification rules stated twice.
+As a result, my additions for dimension expressions are stated twice.
+The extra statement of the rules occurs in the @GenPoly@ module, where concrete types like @array( int, 5 )@\footnote{
+        Again, the presentation is simplified
+        by leaving the \lstinline{array} macro unexpanded.}
+are lowered into corresponding C types @struct __conc_array_1234@ (the suffix being a generated index).
+In this case, the struct's definition contains fields that hardcode the argument values of @int@ and @5@.
+The next time an @array( -, - )@ concrete instance is encountered, the module checks whether the previous @struct __conc_array_1234@ is suitable for it.
+Yes, for another occurrence of @array( int, 5 )@;
+no, for examples like @array( int, 42 )@ or @array( rational(int), 5 )@.
+In the first example, it must reject the reuse hypothesis for a dimension-@5@ and a dimension-@42@ instance.
+The second duplication has unification (proper) being invoked at two stages of expression resolution.
+As a result, my added rule set needs to handle more cases than the preceding discussion motivates.
+In the program
+\begin{cfa}
+void @f@( double ); // overload
+forall( T & ) void @f@( T & ); // overload
+void g( int n ) {
+        array( float, n + 1 ) x;
+        f(x);   // overloaded
+}
+\end{cfa}
+when resolving the function call @f(x)@ inside @g@, the first unification stage compares the type @T@ of the parameter with the type @array( float, n + 1 )@ of the argument.
+\PAB{TODO: finish.}
+The actual rules for comparing two dimension expressions are conservative.
+To answer, ``yes, consider this pair of expressions to be matching,''
+is to imply, ``all else being equal, allow an array with length calculated by $e_1$
+to be passed to a function expecting a length-$e_2$ array.''\footnote{
+        TODO: Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping.
+        Should it be an earlier scoping principle?  Feels like it should matter in more places than here.}
+So, a ``yes'' answer must represent a guarantee that both expressions evaluate to the
+same result, while a ``no'' can tolerate ``they might, but we're not sure'',
+provided that practical recourses are available
+to let programmers express better knowledge.
+The new rule-set in the current release is, in fact, extremely conservative.
+I chose to keep things simple,
+and allow future needs to drive adding additional complexity, within the new framework.
+For starters, the original motivating example's rejection is not based on knowledge that the @xp@ length of (the literal) 999 is value-unequal to the (obvious) runtime value of the variable @n@, which is the @x@ length.
+Rather, the analysis assumes a variable's value can be anything, and so there can be no guarantee that its value is 999.
+So, a variable and a literal can never match.
+TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation
+\end{comment}
+The conservatism of the new rule set can leave a programmer needing a recourse, when needing to use a dimension expression whose stability argument is more subtle than current-state analysis.
 This recourse is to declare an explicit constant for the dimension value.
+Consider these two dimension expressions,
+whose reuses are rejected by the blunt current-state rules:
+\begin{cfa}
+void f( int & nr, const int nv ) {
+        float x[nr];
+        float (*xp)[nr] = &x;   // reject: nr varying (no references)
+        float y[nv + 1];
+        float (*yp)[nv + 1] = &y;   // reject: ?+? unpredictable (no functions)
+Consider these two dimension expressions, whose uses are rejected by the blunt current-state rules:
+\begin{cfa}
+void f( int @&@ nr, @const@ int nv ) {
+        float x[@nr@];
+        float (*xp)[@nr@] = &x;   // reject: nr varying (no references)
+        float y[@nv + 1@];
+        float (*yp)[@nv + 1@] = &y;   // reject: ?+? unpredictable (no functions)
+}
 \end{cfa}
 Yet, both dimension expressions are reused safely.
+The @nr@ reference is never written, not volatile
+and control does not leave the function between the uses.
+The name @?+?@ resolves to a function that is quite predictable.
+Here, the programmer can add the constant declarations (cast does not work):
+The @nr@ reference is never written, is not volatile (so no implicit load occurs between the declarations), and control does not leave the function between the uses.
+As well, the built-in @?+?@ function is predictable.
+To make these cases work, the programmer must add the following constant declarations (a cast does not work):
 \begin{cfa}
 void f( int & nr, const int nv ) {
 …
 achieved by adding a superfluous ``snapshot it as of now'' directive.
 The snapshotting trick is also used by the translation, though to achieve a different outcome.
+The snapshot trick is also used by the \CFA translation, though to achieve a different outcome.
 Rather obviously, every array must be subscriptable, even a bizarre one:
 \begin{cfa}
+array( float, rand(10) ) x;
+x[0];  // 10% chance of bound-check failure
+\end{cfa}
+Less obvious is that the mechanism of subscripting is a function call,
+which must communicate length accurately.
+The bound-check above (callee logic) must use the actual allocated length of @x@,
+without mistakenly reevaluating the dimension expression, @rand(10)@.
+array( float, @rand(10)@ ) x;
+x[@0@];  // 10% chance of bound-check failure
+\end{cfa}
+Less obvious is that the mechanism of subscripting is a function call, which must communicate length accurately.
+The bound-check above (callee logic) must use the actual allocated length of @x@, without mistakenly reevaluating the dimension expression, @rand(10)@.
 Adjusting the example to make the function's use of length more explicit:
 \begin{cfa}
 forall ( T * )
 void f( T * x ) { sout | sizeof(*x); }
+forall( T * )
+void f( T * x ) { sout | sizeof( *x ); }
 float x[ rand(10) ];
 f( x );
 …
 void f( size_t __sizeof_T, void * x ) { sout | __sizeof_T; }
 \end{cfa}
 the translation must call the dimension argument twice:
+the translation evaluates the dimension expression twice:
 \begin{cfa}
 float x[ rand(10) ];
 f( rand(10), &x );
 \end{cfa}
 Rather, the translation is:
+The correct form is:
 \begin{cfa}
 size_t __dim_x = rand(10);
 …
 f( __dim_x, &x );
 \end{cfa}
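The hazard that hoisting avoids can be reproduced in plain C. The following is a minimal sketch, not the \CFA translator's actual output: @next_dim@ is a hypothetical stand-in for @rand(10)@ that returns a different value on every call, and @callee@ is a hypothetical stand-in for the translated @f@, which receives the size as an explicit argument. It assumes a C99 compiler with variable-length array support.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rand(10): every call yields a different value. */
static size_t calls = 0;
static size_t next_dim(void) { return ++calls * 3; }

/* Hypothetical stand-in for the translated polymorphic callee:
   the size arrives as an explicit argument and is simply reported back. */
static size_t callee(size_t sizeof_T, void *x) { (void)x; return sizeof_T; }

/* Unhoisted: the dimension expression is evaluated a second time for the call,
   so the size passed to the callee disagrees with the allocated size. */
static int unhoisted_sizes_agree(void) {
    float x[next_dim()];
    size_t passed = callee(sizeof(float) * next_dim(), &x);
    return passed == sizeof x;
}

/* Hoisted: the dimension is snapshotted once and reused for both purposes. */
static int hoisted_sizes_agree(void) {
    size_t dim_x = next_dim();
    float x[dim_x];
    size_t passed = callee(sizeof(float) * dim_x, &x);
    return passed == sizeof x;
}

/* Usage: only the hoisted form communicates the allocated length accurately. */
static void check_hoisting(void) {
    assert(!unhoisted_sizes_agree());
    assert(hoisted_sizes_agree());
}
```

In the unhoisted form, the callee is told a size that does not match the allocation, so any bound check based on it is wrong. The snapshot makes the two uses agree by construction.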
 The occurrence of this dimension hoisting during translation was in the preexisting \CFA compiler.
 But its cases were buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.
 For example, when the programmer has already done so manually. \PAB{I don't know what this means.}
+Dimension hoisting already existed in the \CFA compiler.
+But it was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.
+For example, when a programmer has already hoisted the expression manually, as an optimization to preclude duplicated code and/or repeated expression evaluation.
 In the new implementation, these cases are correct, harmonized with the accept/reject criteria.
-TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation
 …
 A multidimensional array implementation has three relevant levels of abstraction, from highest to lowest, where the array occupies \emph{contiguous memory}.
 \begin{enumerate}
+\begin{enumerate}[leftmargin=*]
 \item
 Flexible-stride memory:
 …
 Preexisting \CFA mechanisms achieve this requirement, but with poor performance.
 Furthermore, advanced array users need an exception to the basic mechanism, which does not occur with other aggregates.
 Hence, arrays introduce subleties in supporting an element's lifecycle.
+Hence, arrays introduce subtleties in supporting an element's lifecycle.
 The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (@otype@) pseudo-trait.
 …
 The @worker@ type is designed this way to work with the threading system.
 A thread type forks a thread at the end of each constructor and joins with it at the start of each destructor.
 But a @worker@ cannot begin its forked-thead work without knowing its @id@.
+But a @worker@ cannot begin its forked-thread work without knowing its @id@.
 Therefore, there is a conflict between the implicit actions of the builtin @thread@ type and a user's desire to defer these actions.
 …
 The \CFA array and the field of ``array language'' comparators all leverage dependent types to improve on the expressiveness over C and Java, accommodating examples such as:
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item a \emph{zip}-style operation that consumes two arrays of equal length
 \item a \emph{map}-style operation whose produced length matches the consumed length
 …
 The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA.
 But the example shows these abilities:
 \begin{itemize}
+\begin{itemize}[leftmargin=*]
 \item a built-in way (the @is_enum@ trait) for a generic routine to require enumeration-like information about its instantiating type
 \item an implicit implementation of the trait whenever a user-written enum occurs (@weekday@'s declaration implies @is_enum@)

doc/theses/mike_brooks_MMath/background.tex

r7d02d35	rbd72f517
995	995	Illustrated by pseudocode implementation of an STL-compatible API fragment using LQ as the underlying implementation.
996	996	The gap that makes it pseudocode is that
997		the LQ C macros do not expand to valid ~~C++~~ when instantiated with template parameters---there is no \lstinline{struct El}.
	997	the LQ C macros do not expand to valid \CC when instantiated with template parameters---there is no \lstinline{struct El}.
998	998	When using a custom-patched version of LQ to work around this issue,
999	999	the programs of \VRef[Figure]{f:WrappedRef} and wrapped value work with this shim in place of real STL.

doc/theses/mike_brooks_MMath/benchmarks/string/result-append-pbv.csv

-              r7d02d35
+              rbd72f517
+perfexp-cfa-pta-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,219460000,10.000260
+perfexp-cfa-pta-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,180250000,10.000486
+perfexp-cfa-pta-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,152790000,10.000441
+perfexp-cfa-pta-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,206090000,10.000311
+perfexp-cfa-pta-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,184330000,10.000328
+perfexp-cfa-pta-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,125090000,10.000138
+perfexp-cfa-pta-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,199130000,10.000180
+perfexp-cfa-pta-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,167720000,10.000327
+perfexp-cfa-pta-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,93560000,10.001058
+perfexp-cfa-pta-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,225090000,10.000393
+perfexp-cfa-pta-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,196300000,10.000221
+perfexp-cfa-pta-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,150670000,10.000337
+perfexp-cfa-pta-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,206600000,10.000182
+perfexp-cfa-pta-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,188400000,10.000199
+perfexp-cfa-pta-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,125880000,10.000489
+perfexp-cfa-pta-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,185930000,10.000231
+perfexp-cfa-pta-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,170660000,10.000491
+perfexp-cfa-pta-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,91520000,10.000640
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,146200000,10.000520
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,114140000,10.000734
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,17630000,10.000889
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,139700000,10.000460
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,71910000,10.000768
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,8540000,10.009186
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,129810000,10.000379
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,45280000,10.000006
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,3300000,10.021088
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,146050000,10.000551
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,102800000,10.000490
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,17060000,10.001677
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,137470000,10.000361
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,69520000,10.001142
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,8830000,10.010528
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,117120000,10.000681
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,42960000,10.001950
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,3220000,10.010203
+perfexp-cfa-peq-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,583560000,10.000070
+perfexp-cfa-peq-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,451400000,10.000013
+perfexp-cfa-peq-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,253260000,10.000275
+perfexp-cfa-peq-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,483580000,10.000140
+perfexp-cfa-peq-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,396550000,10.000060
+perfexp-cfa-peq-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,199760000,10.000416
+perfexp-cfa-peq-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,454790000,10.000069
+perfexp-cfa-peq-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,339690000,10.000243
+perfexp-cfa-peq-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,123840000,10.000724
+perfexp-cfa-peq-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,577650000,10.000157
+perfexp-cfa-peq-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,445260000,10.000186
+perfexp-cfa-peq-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,259650000,10.000273
+perfexp-cfa-peq-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,485650000,10.000026
+perfexp-cfa-peq-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,386150000,10.000120
+perfexp-cfa-peq-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,197690000,10.000077
+perfexp-cfa-peq-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,443650000,10.000006
+perfexp-cfa-peq-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,339190000,10.000037
+perfexp-cfa-peq-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,122740000,10.000753
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,595700000,10.000119
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,452000000,10.000055
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,280570000,10.000281
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,501040000,10.000073
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,422280000,10.000131
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,235640000,10.000126
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,461250000,10.000197
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,369020000,10.000057
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,135050000,10.000682
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,529900000,10.000150
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,408530000,10.000108
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,217530000,10.000334
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,463860000,10.000166
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,360110000,10.000008
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,176490000,10.000131
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,424710000,10.000106
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,290930000,10.000172
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,90430000,10.000065
+perfexp-cfa-pbv-ll-share-na,corpus-100-1-1.txt,xxx,100,1.000000,578040000,10.000159
+perfexp-cfa-pbv-ll-share-na,corpus-100-10-1.txt,xxx,100,9.500000,573200000,10.000098
+perfexp-cfa-pbv-ll-share-na,corpus-100-100-1.txt,xxx,100,106.370000,575160000,10.000149
+perfexp-cfa-pbv-ll-share-na,corpus-100-2-1.txt,xxx,100,2.030000,573780000,10.000134
+perfexp-cfa-pbv-ll-share-na,corpus-100-20-1.txt,xxx,100,22.960000,574500000,10.000156
+perfexp-cfa-pbv-ll-share-na,corpus-100-200-1.txt,xxx,100,177.280000,577170000,10.000125
+perfexp-cfa-pbv-ll-share-na,corpus-100-5-1.txt,xxx,100,5.270000,577820000,10.000046
+perfexp-cfa-pbv-ll-share-na,corpus-100-50-1.txt,xxx,100,43.320000,578770000,10.000033
+perfexp-cfa-pbv-ll-share-na,corpus-100-500-1.txt,xxx,100,557.260000,579540000,10.000128
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-1-1.txt,xxx,100,1.000000,191420000,10.000232
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-10-1.txt,xxx,100,9.500000,186330000,10.000046
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-100-1.txt,xxx,100,106.370000,164610000,10.000463
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-2-1.txt,xxx,100,2.030000,182390000,10.000409
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-20-1.txt,xxx,100,22.960000,182280000,10.000252
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-200-1.txt,xxx,100,177.280000,149840000,10.000281
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-5-1.txt,xxx,100,5.270000,152370000,10.000284
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-50-1.txt,xxx,100,43.320000,177430000,10.000397
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-500-1.txt,xxx,100,557.260000,113440000,10.000150
+perfexp-stl-pta-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,152870000,10.000280
+perfexp-stl-pta-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,98530000,10.000299
+perfexp-stl-pta-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,16690000,10.005783
+perfexp-stl-pta-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,136230000,10.000196
+perfexp-stl-pta-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,62110000,10.001423
+perfexp-stl-pta-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,8960000,10.005548
+perfexp-stl-pta-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,104790000,10.000889
+perfexp-stl-pta-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,39170000,10.000011
+perfexp-stl-pta-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,3100000,10.015093
+perfexp-stl-pta-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,154450000,10.000054
+perfexp-stl-pta-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,96570000,10.000834
+perfexp-stl-pta-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,16400000,10.000697
+perfexp-stl-pta-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,133450000,10.000440
+perfexp-stl-pta-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,62540000,10.001476
+perfexp-stl-pta-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,8960000,10.006817
+perfexp-stl-pta-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,106470000,10.000109
+perfexp-stl-pta-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,37460000,10.000100
+perfexp-stl-pta-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,3090000,10.000541
+perfexp-stl-peq-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,863350000,10.000092
+perfexp-stl-peq-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,471070000,10.000189
+perfexp-stl-peq-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,287660000,10.000105
+perfexp-stl-peq-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,669380000,10.000082
+perfexp-stl-peq-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,432290000,10.000131
+perfexp-stl-peq-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,241690000,10.000290
+perfexp-stl-peq-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,510990000,10.000082
+perfexp-stl-peq-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,396380000,10.000235
+perfexp-stl-peq-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,135830000,10.000603
+perfexp-stl-peq-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,785420000,10.000062
+perfexp-stl-peq-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,418030000,10.000094
+perfexp-stl-peq-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,225290000,10.000237
+perfexp-stl-peq-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,550120000,10.000151
+perfexp-stl-peq-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,386080000,10.000206
+perfexp-stl-peq-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,176890000,10.000155
+perfexp-stl-peq-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,441830000,10.000135
+perfexp-stl-peq-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,310200000,10.000299
+perfexp-stl-peq-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,90360000,10.000474
+perfexp-stl-pbv-na-na-na,corpus-100-1-1.txt,xxx,100,1.000000,1267670000,10.000039
+perfexp-stl-pbv-na-na-na,corpus-100-10-1.txt,xxx,100,9.500000,482210000,10.000013
+perfexp-stl-pbv-na-na-na,corpus-100-100-1.txt,xxx,100,106.370000,268680000,10.000097
+perfexp-stl-pbv-na-na-na,corpus-100-2-1.txt,xxx,100,2.030000,806650000,10.000104
+perfexp-stl-pbv-na-na-na,corpus-100-20-1.txt,xxx,100,22.960000,369490000,10.000159
+perfexp-stl-pbv-na-na-na,corpus-100-200-1.txt,xxx,100,177.280000,227020000,10.000244
+perfexp-stl-pbv-na-na-na,corpus-100-5-1.txt,xxx,100,5.270000,534150000,10.000061
+perfexp-stl-pbv-na-na-na,corpus-100-50-1.txt,xxx,100,43.320000,298950000,10.000190
+perfexp-stl-pbv-na-na-na,corpus-100-500-1.txt,xxx,100,557.260000,158310000,10.000104
+perfexp-cfa-pta-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,220120000,10.000178
+perfexp-cfa-pta-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,177430000,10.000414
+perfexp-cfa-pta-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,142410000,10.000162
+perfexp-cfa-pta-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,195500000,10.000161
+perfexp-cfa-pta-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,164560000,10.000548
+perfexp-cfa-pta-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,122260000,10.000279
+perfexp-cfa-pta-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,193960000,10.000071
+perfexp-cfa-pta-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,163430000,10.000175
+perfexp-cfa-pta-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,87960000,10.001073
+perfexp-cfa-pta-ll-share-reuse,corpus-1-1-1.txt,100,1,1.000000,224420000,10.000135
+perfexp-cfa-pta-ll-share-reuse,corpus-1-10-1.txt,100,1,10.000000,223740000,10.000014
+perfexp-cfa-pta-ll-share-reuse,corpus-1-100-1.txt,100,1,100.000000,153300000,10.000091
+perfexp-cfa-pta-ll-share-reuse,corpus-1-2-1.txt,100,1,2.000000,223430000,10.000120
+perfexp-cfa-pta-ll-share-reuse,corpus-1-20-1.txt,100,1,20.000000,210640000,10.000385
+perfexp-cfa-pta-ll-share-reuse,corpus-1-200-1.txt,100,1,200.000000,129790000,10.000596
+perfexp-cfa-pta-ll-share-reuse,corpus-1-5-1.txt,100,1,5.000000,222850000,10.000361
+perfexp-cfa-pta-ll-share-reuse,corpus-1-50-1.txt,100,1,50.000000,201700000,10.000220
+perfexp-cfa-pta-ll-share-reuse,corpus-1-500-1.txt,100,1,500.000000,110000000,10.000407
+perfexp-cfa-pta-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,225030000,10.000360
+perfexp-cfa-pta-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,192640000,10.000254
+perfexp-cfa-pta-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,143960000,10.000633
+perfexp-cfa-pta-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,204500000,10.000450
+perfexp-cfa-pta-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,185400000,10.000274
+perfexp-cfa-pta-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,126420000,10.000791
+perfexp-cfa-pta-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,194450000,10.000396
+perfexp-cfa-pta-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,173140000,10.000364
+perfexp-cfa-pta-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,92390000,10.000098
+perfexp-cfa-pta-ll-share-fresh,corpus-1-1-1.txt,100,1,1.000000,222210000,10.000426
+perfexp-cfa-pta-ll-share-fresh,corpus-1-10-1.txt,100,1,10.000000,209110000,10.000235
+perfexp-cfa-pta-ll-share-fresh,corpus-1-100-1.txt,100,1,100.000000,154750000,10.000076
+perfexp-cfa-pta-ll-share-fresh,corpus-1-2-1.txt,100,1,2.000000,222030000,10.000114
+perfexp-cfa-pta-ll-share-fresh,corpus-1-20-1.txt,100,1,20.000000,208680000,10.000050
+perfexp-cfa-pta-ll-share-fresh,corpus-1-200-1.txt,100,1,200.000000,133490000,10.000231
+perfexp-cfa-pta-ll-share-fresh,corpus-1-5-1.txt,100,1,5.000000,217740000,10.000425
+perfexp-cfa-pta-ll-share-fresh,corpus-1-50-1.txt,100,1,50.000000,200340000,10.000126
+perfexp-cfa-pta-ll-share-fresh,corpus-1-500-1.txt,100,1,500.000000,109570000,10.000365
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,146130000,10.000557
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,110430000,10.000456
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,17440000,10.003114
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,139540000,10.000128
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,70380000,10.000395
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,8670000,10.001712
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,127040000,10.000370
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,44250000,10.002214
+perfexp-cfa-pta-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,3290000,10.007370
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-1-1.txt,100,1,1.000000,139870000,10.000356
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-10-1.txt,100,1,10.000000,115500000,10.000281
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-100-1.txt,100,1,100.000000,18830000,10.003277
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-2-1.txt,100,1,2.000000,144880000,10.000426
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-20-1.txt,100,1,20.000000,82050000,10.001071
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-200-1.txt,100,1,200.000000,8870000,10.002904
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-5-1.txt,100,1,5.000000,138400000,10.000130
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-50-1.txt,100,1,50.000000,38130000,10.002351
+perfexp-cfa-pta-ll-noshare-reuse,corpus-1-500-1.txt,100,1,500.000000,3890000,10.003849
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,143100000,10.000056
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,97990000,10.000081
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,16950000,10.004190
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,135210000,10.000137
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,69270000,10.000092
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,8840000,10.000491
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,112610000,10.000397
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,42480000,10.001402
+perfexp-cfa-pta-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,3250000,10.027871
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-1-1.txt,100,1,1.000000,139830000,10.000681
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-10-1.txt,100,1,10.000000,102320000,10.000624
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-100-1.txt,100,1,100.000000,17610000,10.000917
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-2-1.txt,100,1,2.000000,134520000,10.000287
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-20-1.txt,100,1,20.000000,78150000,10.000982
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-200-1.txt,100,1,200.000000,8930000,10.010066
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-5-1.txt,100,1,5.000000,119920000,10.000537
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-50-1.txt,100,1,50.000000,38540000,10.001545
+perfexp-cfa-pta-ll-noshare-fresh,corpus-1-500-1.txt,100,1,500.000000,3900000,10.024468
+perfexp-cfa-peq-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,580710000,10.000065
+perfexp-cfa-peq-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,430790000,10.000116
+perfexp-cfa-peq-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,247640000,10.000266
+perfexp-cfa-peq-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,464050000,10.000189
+perfexp-cfa-peq-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,377820000,10.000065
+perfexp-cfa-peq-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,195030000,10.000477
+perfexp-cfa-peq-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,430190000,10.000121
+perfexp-cfa-peq-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,331580000,10.000295
+perfexp-cfa-peq-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,123230000,10.000186
+perfexp-cfa-peq-ll-share-reuse,corpus-1-1-1.txt,100,1,1.000000,572750000,10.000172
+perfexp-cfa-peq-ll-share-reuse,corpus-1-10-1.txt,100,1,10.000000,558790000,10.000101
+perfexp-cfa-peq-ll-share-reuse,corpus-1-100-1.txt,100,1,100.000000,291780000,10.000230
+perfexp-cfa-peq-ll-share-reuse,corpus-1-2-1.txt,100,1,2.000000,571220000,10.000023
+perfexp-cfa-peq-ll-share-reuse,corpus-1-20-1.txt,100,1,20.000000,461020000,10.000045
+perfexp-cfa-peq-ll-share-reuse,corpus-1-200-1.txt,100,1,200.000000,220880000,10.000260
+perfexp-cfa-peq-ll-share-reuse,corpus-1-5-1.txt,100,1,5.000000,555180000,10.000153
+perfexp-cfa-peq-ll-share-reuse,corpus-1-50-1.txt,100,1,50.000000,433290000,10.000123
+perfexp-cfa-peq-ll-share-reuse,corpus-1-500-1.txt,100,1,500.000000,165210000,10.000260
+perfexp-cfa-peq-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,591360000,10.000013
+perfexp-cfa-peq-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,432580000,10.000103
+perfexp-cfa-peq-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,253100000,10.000162
+perfexp-cfa-peq-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,470710000,10.000018
+perfexp-cfa-peq-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,381580000,10.000172
+perfexp-cfa-peq-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,197910000,10.000400
+perfexp-cfa-peq-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,437470000,10.000123
+perfexp-cfa-peq-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,337150000,10.000065
+perfexp-cfa-peq-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,127310000,10.000685
+perfexp-cfa-peq-ll-share-fresh,corpus-1-1-1.txt,100,1,1.000000,581300000,10.000103
+perfexp-cfa-peq-ll-share-fresh,corpus-1-10-1.txt,100,1,10.000000,566650000,10.000166
+perfexp-cfa-peq-ll-share-fresh,corpus-1-100-1.txt,100,1,100.000000,295340000,10.000202
+perfexp-cfa-peq-ll-share-fresh,corpus-1-2-1.txt,100,1,2.000000,579220000,10.000012
+perfexp-cfa-peq-ll-share-fresh,corpus-1-20-1.txt,100,1,20.000000,470040000,10.000180
+perfexp-cfa-peq-ll-share-fresh,corpus-1-200-1.txt,100,1,200.000000,223060000,10.000188
+perfexp-cfa-peq-ll-share-fresh,corpus-1-5-1.txt,100,1,5.000000,563440000,10.000100
+perfexp-cfa-peq-ll-share-fresh,corpus-1-50-1.txt,100,1,50.000000,438260000,10.000200
+perfexp-cfa-peq-ll-share-fresh,corpus-1-500-1.txt,100,1,500.000000,166830000,10.000225
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,603080000,10.000107
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,439540000,10.000078
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,279990000,10.000309
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,509720000,10.000099
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,405590000,10.000206
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,230400000,10.000124
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,454270000,10.000057
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,375090000,10.000225
+perfexp-cfa-peq-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,134440000,10.000290
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-1-1.txt,100,1,1.000000,588100000,10.000124
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-10-1.txt,100,1,10.000000,577110000,10.000002
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-100-1.txt,100,1,100.000000,319990000,10.000151
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-2-1.txt,100,1,2.000000,586540000,10.000010
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-20-1.txt,100,1,20.000000,480940000,10.000047
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-200-1.txt,100,1,200.000000,300590000,10.000162
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-5-1.txt,100,1,5.000000,577530000,10.000120
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-50-1.txt,100,1,50.000000,454950000,10.000114
+perfexp-cfa-peq-ll-noshare-reuse,corpus-1-500-1.txt,100,1,500.000000,186210000,10.000221
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,546170000,10.000079
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,403120000,10.000222
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,214740000,10.000444
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,449080000,10.000157
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,351690000,10.000146
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,174630000,10.000540
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,419160000,10.000085
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,296590000,10.000200
+perfexp-cfa-peq-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,78000000,10.000539
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-1-1.txt,100,1,1.000000,541890000,10.000021
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-10-1.txt,100,1,10.000000,511140000,10.000142
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-100-1.txt,100,1,100.000000,243680000,10.000252
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-2-1.txt,100,1,2.000000,532730000,10.000135
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-20-1.txt,100,1,20.000000,413610000,10.000113
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-200-1.txt,100,1,200.000000,192770000,10.000185
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-5-1.txt,100,1,5.000000,495980000,10.000162
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-50-1.txt,100,1,50.000000,367590000,10.000269
+perfexp-cfa-peq-ll-noshare-fresh,corpus-1-500-1.txt,100,1,500.000000,111560000,10.000455
+perfexp-cfa-pbv-ll-share-na,corpus-100-1-1.txt,xxx,100,1.000000,638780000,10.000008
+perfexp-cfa-pbv-ll-share-na,corpus-100-10-1.txt,xxx,100,9.500000,637840000,10.000004
+perfexp-cfa-pbv-ll-share-na,corpus-100-100-1.txt,xxx,100,106.370000,635130000,10.000003
+perfexp-cfa-pbv-ll-share-na,corpus-100-2-1.txt,xxx,100,2.030000,639810000,10.000140
+perfexp-cfa-pbv-ll-share-na,corpus-100-20-1.txt,xxx,100,22.960000,552670000,10.000089
+perfexp-cfa-pbv-ll-share-na,corpus-100-200-1.txt,xxx,100,177.280000,639550000,10.000019
+perfexp-cfa-pbv-ll-share-na,corpus-100-5-1.txt,xxx,100,5.270000,636230000,10.000044
+perfexp-cfa-pbv-ll-share-na,corpus-100-50-1.txt,xxx,100,43.320000,631470000,10.000125
+perfexp-cfa-pbv-ll-share-na,corpus-100-500-1.txt,xxx,100,557.260000,628330000,10.000127
+perfexp-cfa-pbv-ll-share-na,corpus-1-1-1.txt,xxx,1,1.000000,589760000,10.000044
+perfexp-cfa-pbv-ll-share-na,corpus-1-10-1.txt,xxx,1,10.000000,589790000,10.000151
+perfexp-cfa-pbv-ll-share-na,corpus-1-100-1.txt,xxx,1,100.000000,587540000,10.000128
+perfexp-cfa-pbv-ll-share-na,corpus-1-2-1.txt,xxx,1,2.000000,580790000,10.000102
+perfexp-cfa-pbv-ll-share-na,corpus-1-20-1.txt,xxx,1,20.000000,586470000,10.000154
+perfexp-cfa-pbv-ll-share-na,corpus-1-200-1.txt,xxx,1,200.000000,587510000,10.000005
+perfexp-cfa-pbv-ll-share-na,corpus-1-5-1.txt,xxx,1,5.000000,582120000,10.000163
+perfexp-cfa-pbv-ll-share-na,corpus-1-50-1.txt,xxx,1,50.000000,587990000,10.000127
+perfexp-cfa-pbv-ll-share-na,corpus-1-500-1.txt,xxx,1,500.000000,587590000,10.000046
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-1-1.txt,xxx,100,1.000000,218340000,10.000321
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-10-1.txt,xxx,100,9.500000,189550000,10.000174
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-100-1.txt,xxx,100,106.370000,169280000,10.000141
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-2-1.txt,xxx,100,2.030000,197840000,10.000383
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-20-1.txt,xxx,100,22.960000,182700000,10.000041
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-200-1.txt,xxx,100,177.280000,157120000,10.000522
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-5-1.txt,xxx,100,5.270000,155160000,10.000322
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-50-1.txt,xxx,100,43.320000,179110000,10.000218
+perfexp-cfa-pbv-ll-noshare-na,corpus-100-500-1.txt,xxx,100,557.260000,113620000,10.000140
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-1-1.txt,xxx,1,1.000000,216270000,10.000367
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-10-1.txt,xxx,1,10.000000,214390000,10.000157
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-100-1.txt,xxx,1,100.000000,165440000,10.000095
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-2-1.txt,xxx,1,2.000000,217150000,10.000044
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-20-1.txt,xxx,1,20.000000,216760000,10.000321
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-200-1.txt,xxx,1,200.000000,176930000,10.000100
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-5-1.txt,xxx,1,5.000000,200840000,10.000229
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-50-1.txt,xxx,1,50.000000,212960000,10.000273
+perfexp-cfa-pbv-ll-noshare-na,corpus-1-500-1.txt,xxx,1,500.000000,163340000,10.000196
+perfexp-stl-pta-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,151210000,10.000032
+perfexp-stl-pta-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,92400000,10.000662
+perfexp-stl-pta-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,16700000,10.003595
+perfexp-stl-pta-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,132700000,10.000666
+perfexp-stl-pta-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,61670000,10.001135
+perfexp-stl-pta-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,8950000,10.005903
+perfexp-stl-pta-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,105760000,10.000126
+perfexp-stl-pta-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,38020000,10.001290
+perfexp-stl-pta-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,3080000,10.009583
+perfexp-stl-pta-na-na-reuse,corpus-1-1-1.txt,100,1,1.000000,150070000,10.000505
+perfexp-stl-pta-na-na-reuse,corpus-1-10-1.txt,100,1,10.000000,96240000,10.000747
+perfexp-stl-pta-na-na-reuse,corpus-1-100-1.txt,100,1,100.000000,17380000,10.005677
+perfexp-stl-pta-na-na-reuse,corpus-1-2-1.txt,100,1,2.000000,136340000,10.000556
+perfexp-stl-pta-na-na-reuse,corpus-1-20-1.txt,100,1,20.000000,69290000,10.000979
+perfexp-stl-pta-na-na-reuse,corpus-1-200-1.txt,100,1,200.000000,9140000,10.005445
+perfexp-stl-pta-na-na-reuse,corpus-1-5-1.txt,100,1,5.000000,114030000,10.000605
+perfexp-stl-pta-na-na-reuse,corpus-1-50-1.txt,100,1,50.000000,33470000,10.000871
+perfexp-stl-pta-na-na-reuse,corpus-1-500-1.txt,100,1,500.000000,3760000,10.021431
+perfexp-stl-pta-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,151890000,10.000693
+perfexp-stl-pta-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,97910000,10.000289
+perfexp-stl-pta-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,16740000,10.000756
+perfexp-stl-pta-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,134890000,10.000666
+perfexp-stl-pta-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,61040000,10.000514
+perfexp-stl-pta-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,8950000,10.004888
+perfexp-stl-pta-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,101780000,10.000043
+perfexp-stl-pta-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,38440000,10.000510
+perfexp-stl-pta-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,3060000,10.007733
+perfexp-stl-pta-na-na-fresh,corpus-1-1-1.txt,100,1,1.000000,149360000,10.000168
+perfexp-stl-pta-na-na-fresh,corpus-1-10-1.txt,100,1,10.000000,98400000,10.000118
+perfexp-stl-pta-na-na-fresh,corpus-1-100-1.txt,100,1,100.000000,17440000,10.004379
+perfexp-stl-pta-na-na-fresh,corpus-1-2-1.txt,100,1,2.000000,130340000,10.000520
+perfexp-stl-pta-na-na-fresh,corpus-1-20-1.txt,100,1,20.000000,69280000,10.001377
+perfexp-stl-pta-na-na-fresh,corpus-1-200-1.txt,100,1,200.000000,9070000,10.004963
+perfexp-stl-pta-na-na-fresh,corpus-1-5-1.txt,100,1,5.000000,114390000,10.000315
+perfexp-stl-pta-na-na-fresh,corpus-1-50-1.txt,100,1,50.000000,34350000,10.001033
+perfexp-stl-pta-na-na-fresh,corpus-1-500-1.txt,100,1,500.000000,3720000,10.009015
+perfexp-stl-peq-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,867730000,10.000040
+perfexp-stl-peq-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,470370000,10.000155
+perfexp-stl-peq-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,287440000,10.000190
+perfexp-stl-peq-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,667180000,10.000145
+perfexp-stl-peq-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,430260000,10.000102
+perfexp-stl-peq-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,232720000,10.000418
+perfexp-stl-peq-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,515130000,10.000118
+perfexp-stl-peq-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,401280000,10.000122
+perfexp-stl-peq-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,135350000,10.000692
+perfexp-stl-peq-na-na-reuse,corpus-1-1-1.txt,100,1,1.000000,847560000,10.000010
+perfexp-stl-peq-na-na-reuse,corpus-1-10-1.txt,100,1,10.000000,641250000,10.000095
+perfexp-stl-peq-na-na-reuse,corpus-1-100-1.txt,100,1,100.000000,300130000,10.000199
+perfexp-stl-peq-na-na-reuse,corpus-1-2-1.txt,100,1,2.000000,680950000,10.000050
+perfexp-stl-peq-na-na-reuse,corpus-1-20-1.txt,100,1,20.000000,515190000,10.000051
+perfexp-stl-peq-na-na-reuse,corpus-1-200-1.txt,100,1,200.000000,271800000,10.000194
+perfexp-stl-peq-na-na-reuse,corpus-1-5-1.txt,100,1,5.000000,611640000,10.000133
+perfexp-stl-peq-na-na-reuse,corpus-1-50-1.txt,100,1,50.000000,483780000,10.000084
+perfexp-stl-peq-na-na-reuse,corpus-1-500-1.txt,100,1,500.000000,191470000,10.000243
+perfexp-stl-peq-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,779650000,10.000085
+perfexp-stl-peq-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,419300000,10.000184
+perfexp-stl-peq-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,224270000,10.000410
+perfexp-stl-peq-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,545330000,10.000073
+perfexp-stl-peq-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,385000000,10.000210
+perfexp-stl-peq-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,174520000,10.000360
+perfexp-stl-peq-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,443460000,10.000165
+perfexp-stl-peq-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,310460000,10.000174
+perfexp-stl-peq-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,92820000,10.000352
+perfexp-stl-peq-na-na-fresh,corpus-1-1-1.txt,100,1,1.000000,774230000,10.000110
+perfexp-stl-peq-na-na-fresh,corpus-1-10-1.txt,100,1,10.000000,554850000,10.000064
+perfexp-stl-peq-na-na-fresh,corpus-1-100-1.txt,100,1,100.000000,227540000,10.000041
+perfexp-stl-peq-na-na-fresh,corpus-1-2-1.txt,100,1,2.000000,616830000,10.000134
+perfexp-stl-peq-na-na-fresh,corpus-1-20-1.txt,100,1,20.000000,436800000,10.000038
+perfexp-stl-peq-na-na-fresh,corpus-1-200-1.txt,100,1,200.000000,185050000,10.000439
+perfexp-stl-peq-na-na-fresh,corpus-1-5-1.txt,100,1,5.000000,569030000,10.000125
+perfexp-stl-peq-na-na-fresh,corpus-1-50-1.txt,100,1,50.000000,387710000,10.000249
+perfexp-stl-peq-na-na-fresh,corpus-1-500-1.txt,100,1,500.000000,113890000,10.000075
+perfexp-stl-pbv-na-na-na,corpus-100-1-1.txt,xxx,100,1.000000,1267570000,10.000072
+perfexp-stl-pbv-na-na-na,corpus-100-10-1.txt,xxx,100,9.500000,476260000,10.000192
+perfexp-stl-pbv-na-na-na,corpus-100-100-1.txt,xxx,100,106.370000,271870000,10.000171
+perfexp-stl-pbv-na-na-na,corpus-100-2-1.txt,xxx,100,2.030000,807830000,10.000110
+perfexp-stl-pbv-na-na-na,corpus-100-20-1.txt,xxx,100,22.960000,373160000,10.000221
+perfexp-stl-pbv-na-na-na,corpus-100-200-1.txt,xxx,100,177.280000,233700000,10.000081
+perfexp-stl-pbv-na-na-na,corpus-100-5-1.txt,xxx,100,5.270000,536240000,10.000165
+perfexp-stl-pbv-na-na-na,corpus-100-50-1.txt,xxx,100,43.320000,297400000,10.000317
+perfexp-stl-pbv-na-na-na,corpus-100-500-1.txt,xxx,100,557.260000,159500000,10.000290
+perfexp-stl-pbv-na-na-na,corpus-1-1-1.txt,xxx,1,1.000000,1089370000,10.000024
+perfexp-stl-pbv-na-na-na,corpus-1-10-1.txt,xxx,1,10.000000,722490000,10.000040
+perfexp-stl-pbv-na-na-na,corpus-1-100-1.txt,xxx,1,100.000000,311250000,10.000116
+perfexp-stl-pbv-na-na-na,corpus-1-2-1.txt,xxx,1,2.000000,747630000,10.000103
+perfexp-stl-pbv-na-na-na,corpus-1-20-1.txt,xxx,1,20.000000,348820000,10.000149
+perfexp-stl-pbv-na-na-na,corpus-1-200-1.txt,xxx,1,200.000000,302220000,10.000223
+perfexp-stl-pbv-na-na-na,corpus-1-5-1.txt,xxx,1,5.000000,725430000,10.000110
+perfexp-stl-pbv-na-na-na,corpus-1-50-1.txt,xxx,1,50.000000,335730000,10.000280
+perfexp-stl-pbv-na-na-na,corpus-1-500-1.txt,xxx,1,500.000000,258380000,10.000052

doc/theses/mike_brooks_MMath/plots/string-pbv.gp

-              r7d02d35
+              rbd72f517
 #set terminal wxt size 950,1250
+DIR="pictures"
+INDIR="build"
+OUTDIR="build"
 set macros
+set output "build/string-graph-pbv.pdf"
+set output OUTDIR."/plot-string-pbv.pdf"
+set multiplot layout 1, 2 ;
 #set pointsize 2.0
 set grid
 …
 set logscale x
 set logscale y 2
+set xlabel "String Length being passed (interp. varies)" offset 2,0
+set ylabel "Time per append (ns, mean), log_{2} scale"
+set xlabel "String length passed, varying (mean)"
+set ylabel "Time per pass (ns, mean), log_{2} scale"
+set yrange [4:64]
 set linetype 3 dashtype 2
 set linetype 4 dashtype 2
+plot DIR."/string-graph-pbv.dat" \
+           i 0 using 1:2 title columnheader(1) with linespoints lt rgb "blue"   pt  2  ps 1 lw 1, \
+        '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "red"    pt  3  ps 1 lw 1, \
+        '' i 2 using 1:2 title columnheader(1) with linespoints lt rgb "blue"   pt  6  ps 1 lw 1
+plot INDIR."/plot-string-pbv-varcorp.dat" \
+           i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red"    pt  3  ps 1 lw 1, \
+        '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "blue"   pt  6  ps 1 lw 1
+set xlabel "String length passed, fixed"
+set ylabel
+plot INDIR."/plot-string-pbv-fixcorp.dat"  \
+           i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red"    pt  3  ps 1 lw 1, \
+        '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "blue"   pt  6  ps 1 lw 1
+unset multiplot

doc/theses/mike_brooks_MMath/plots/string-peq-cppemu.gp

-              r7d02d35
+              rbd72f517
 set xtics (1,2,5,10,20,50,100,200,500)
 set logscale x
 set logscale y
 set yrange [10:200]
+#set logscale y
+set yrange [0:115]
 set xlabel "String Length being appended (mean, geo. dist.), log scale" offset 2,0
 set ylabel "Time per append (ns, mean)"

doc/theses/mike_brooks_MMath/plots/string-peq-cppemu.py

-              r7d02d35
+              rbd72f517
 import pandas as pd
 import numpy as np
+import sys
 import os
+infile = os.path.dirname(os.path.abspath(__file__)) + '/../benchmarks/string/result-append-pbv.csv'
+sys.path.insert(0, os.path.dirname(__file__))
+from common import *
 prettyFieldNames = {
 …
+}
+timings = pd.read_csv(
+    infile,
+    names=['test', 'corpus', 'concatsPerReset', 'corpusItemCount', 'corpusMeanLenChars', 'concatDoneActualCount', 'execTimeActualSec'],
+    dtype={'test':                  str,
+           'corpus':                str,
+           'concatsPerReset':       'Int64', # allows missing; https://stackoverflow.com/a/70626154
+           'corpusItemCount':       np.int64,
+           'corpusMeanLenChars':    np.float64,
+           'concatDoneActualCount': np.int64,
+           'execTimeActualSec':     np.float64},
+    na_values=['xxx'],
+)
+# print(timings.head())
+timings = loadParseTimingData('result-append-pbv.csv')
+# Filter operation=peq, corpus=100-*-1
+# project: parse executable and corpus names
+timings[['test-slug',
+     'sut-platform',
+     'operation',
+     'sut-cfa-level',
+     'sut-cfa-sharing',
+     'op-alloc']] = timings['test'].str.strip().str.split('-', expand=True)
+timings['sut'] = timings[['sut-platform',
+                    'sut-cfa-level',
+                    'sut-cfa-sharing',
+                    'op-alloc']].agg('-'.join, axis=1)
+timings[['corpus-basename',
+     'corpus-ext']] = timings['corpus'].str.strip().str.split('.', expand=True)
+timings[['corpus-slug',
+     'corpus-nstrs',
+     'corpus-meanlen',
+     'corpus-runid']] = timings['corpus-basename'].str.strip().str.split('-', expand=True)
+timings["corpus-nstrs"] = pd.to_numeric(timings["corpus-nstrs"])
+timings["corpus-meanlen"] = pd.to_numeric(timings["corpus-meanlen"])
+timings["corpus-runid"] = pd.to_numeric(timings["corpus-runid"])
+# project: calculate fact
+timings['op-duration-s'] = timings['execTimeActualSec'] / timings['concatDoneActualCount']
+timings['op-duration-ns'] = timings['op-duration-s'] * 1000 * 1000 * 1000
+# Filter operation=peq
+groupedOp = timings.groupby('operation')
+tgtOpTimings = groupedOp.get_group('peq')
+timings = timings.groupby('operation').get_group('peq')
+timings = timings.groupby('corpus-nstrs').get_group(100)
+timings = timings.groupby('corpus-runid').get_group(1)
 # Emit in groups
 groupedSut = tgtOpTimings.groupby('sut')
+groupedSut = timings.groupby('sut')
 for sut, sgroup in groupedSut:
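
A minimal, stdlib-only sketch of the projection and filtering steps shown in this script, assuming the seven-column layout given in the `names=` list above. The helper names `parse_row` and `group_peq` are hypothetical and are not part of the thesis sources; the actual script uses pandas and, after this change, `loadParseTimingData` from `common`. The three sample rows are copied from `result-append-pbv.csv`.

```python
import csv
import io
from collections import defaultdict

# Three rows copied from result-append-pbv.csv; 'xxx' marks a missing concatsPerReset.
SAMPLE = """\
perfexp-cfa-peq-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,591360000,10.000013
perfexp-stl-peq-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,779650000,10.000085
perfexp-stl-pbv-na-na-na,corpus-1-500-1.txt,xxx,1,500.000000,258380000,10.000052
"""

FIELDS = ['test', 'corpus', 'concatsPerReset', 'corpusItemCount',
          'corpusMeanLenChars', 'concatDoneActualCount', 'execTimeActualSec']


def parse_row(row):
    """Hypothetical helper: split the encoded names and derive ns per operation."""
    rec = dict(zip(FIELDS, row))
    # Test name layout: slug-platform-operation-level-sharing-alloc
    _, platform, operation, level, sharing, alloc = rec['test'].strip().split('-')
    # Corpus name layout: slug-nstrs-meanlen-runid, followed by the extension
    _, nstrs, meanlen, runid = rec['corpus'].strip().split('.')[0].split('-')
    seconds_per_op = float(rec['execTimeActualSec']) / int(rec['concatDoneActualCount'])
    return {
        'operation': operation,
        'sut': '-'.join([platform, level, sharing, alloc]),
        'corpus-nstrs': int(nstrs),
        'corpus-meanlen': int(meanlen),
        'corpus-runid': int(runid),
        'op-duration-ns': seconds_per_op * 1e9,
    }


def group_peq(text):
    """Hypothetical helper: keep operation=peq, corpus=100-*-1, grouped by SUT."""
    by_sut = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        r = parse_row(row)
        if r['operation'] == 'peq' and r['corpus-nstrs'] == 100 and r['corpus-runid'] == 1:
            by_sut[r['sut']].append((r['corpus-meanlen'], r['op-duration-ns']))
    return dict(by_sut)


groups = group_peq(SAMPLE)
for sut, points in sorted(groups.items()):
    for meanlen, ns in points:
        print(f"{sut}\t{meanlen}\t{ns:.2f}")
```

The `pbv` row is dropped by the operation filter, so only the two `peq` systems under test remain, each with one (mean length, ns per append) point.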

doc/theses/mike_brooks_MMath/plots/string-peq-sharing.gp

-              r7d02d35
+              rbd72f517
 #set terminal wxt size 950,1250
+DIR="pictures"
+INDIR="build"
+OUTDIR="build"
 set macros
 set output "build/string-graph-peq-sharing.pdf"
+set output OUTDIR."/plot-string-peq-sharing.pdf"
 #set pointsize 2.0
 set grid
 …
 set xtics (1,2,5,10,20,50,100,200,500)
 set logscale x
+#set logscale y 2
+#set logscale y
+set yrange [10:115]
 set xlabel "String Length being appended (mean, geo. dist.), log scale" offset 2,0
 set ylabel "Time per append (ns, mean)"
 set linetype 2 dashtype 2
 set linetype 4 dashtype 2
 plot DIR."/string-graph-peq-sharing.dat" \
+plot INDIR."/plot-string-peq-sharing.dat" \
            i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red"    pt  2  ps 1 lw 1, \
         '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "red"    pt  3  ps 1 lw 1, \

doc/theses/mike_brooks_MMath/plots/string-pta-sharing.gp

-              r7d02d35
+              rbd72f517
 #set terminal wxt size 950,1250
+DIR="pictures"
+INDIR="build"
+OUTDIR="build"
 set macros
 set output "build/string-graph-pta-sharing.pdf"
+set output OUTDIR."/plot-string-pta-sharing.pdf"
 #set pointsize 2.0
 set grid
 …
 set xtics (1,2,5,10,20,50,100,200,500)
 set logscale x
+set yrange [8:4096]
 set logscale y 2
 set xlabel "String Length being appended (mean, geo. dist.), log scale" offset 2,0
 set ylabel "Time per append (ns, mean), log_{2} scale"
-set linetype 5 dashtype 2
 #show colornames
 plot DIR."/string-graph-pta-sharing.dat" \
+plot INDIR."/plot-string-pta-sharing.dat" \
            i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red"    pt  2  ps 1 lw 1, \
         '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "dark-green" pt  4  ps 1 lw 1, \
         '' i 2 using 1:2 title columnheader(1) with linespoints lt rgb "blue"   pt  6  ps 1 lw 1, \
+        '' i 3  using 1:2 title columnheader(1) with linespoints lt rgb "dark-green" pt  12  ps 1 lw 1, \
+        '' i 4  using 1:2 title columnheader(1) with linespoints lt rgb "blue"  pt  8  ps 1 lw 1
+        '' i 3  using 1:2 title columnheader(1) with linespoints lt rgb "dark-green" pt  12  ps 1 lw 1

doc/theses/mike_brooks_MMath/programs/bkgd-cfa-arrayinteract.cfa

-              r7d02d35
+              rbd72f517
 
 struct tm { int x; };
-forall (T) T alloc();
+forall( T * ) T * alloc();
 
 int main () {

doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa

-              r7d02d35
+              rbd72f517
 int getPref( @School( C, S ) & school@, int is, int pref ) {
         for ( ic; C ) {
+                int curPref = @school.preferences@[ic][is];   $\C{// offset calculation implicit}$
+                if ( curPref == pref ) return ic;
+                if ( pref == @school.preferences@[ic][is] ) return ic; $\C{// offset calculation implicit}$
+        }
         assert( false );
+}
 …
                 sout | school.student_ids[is] | ": " | nonl;
                 for ( pref; 1 ~= nc ) {
                         int ic = getPref(school, is, pref);
+                        int ic = getPref( school, is, pref );
                         sout | school.course_codes[ ic ] | nonl;
+                }

doc/theses/mike_brooks_MMath/string.tex

-              r7d02d35
+              rbd72f517
 \begin{cquote}
 \begin{tabular}{@{}l|l|l|l@{}}
 C @char [ ]@                    &  \CC @string@                 & Java @String@     & \CFA @string@     \\
+C @char [ ]@                    &  \CC @string@                 & Java @String@ & \CFA @string@ \\
 \hline
 @strcpy@, @strncpy@             & @=@                                   & @=@               & @=@       \\
 @strcat@, @strncat@             & @+@, @+=@                             & @+@, @+=@         & @+@, @+=@ \\
+@strcpy@, @strncpy@             & @=@                                   & @=@                   & @=@   \\
+@strcat@, @strncat@             & @+@, @+=@                             & @+@, @+=@             & @+@, @+=@     \\
 @strcmp@, @strncmp@             & @==@, @!=@, @<@, @<=@, @>@, @>=@
                                                 & @equals@, @compareTo@
                                                                                                                                         & @==@, @!=@, @<@, @<=@, @>@, @>=@ \\
 @strlen@                                & @length@, @size@              & @length@                      & @size@        \\
 @[ ]@                                   & @[ ]@                                 & @charAt@          & @[ ]@     \\
 @strncpy@                               & @substr@                              & @substring@       & @( )@, on RHS of @=@      \\
 @strncpy@                               & @replace@                             & @replace@         & @( )@, on LHS of @=@ \\
 @strstr@                                & @find@                                & @indexOf@         & @find@ \\
 @strcspn@                               & @find_first_of@               & @matches@         & @include@ \\
 @strspn@                                & @find_first_not_of@   & @matches@         & @exclude@ \\
 n/a                                             & @c_str@, @data@               & n/a               & @strcpy@, @strncpy@ \\
+                                                                                                & @equals@, @compareTo@
+                                                                                                                                & @==@, @!=@, @<@, @<=@, @>@, @>=@ \\
+@strlen@                                & @length@, @size@              & @length@              & @size@        \\
+@[ ]@                                   & @[ ]@                                 & @charAt@              & @[ ]@ \\
+@strncpy@                               & @substr@                              & @substring@   & @( )@, on RHS of @=@  \\
+@strncpy@                               & @replace@                             & @replace@             & @( )@, on LHS of @=@ \\
+@strstr@                                & @find@                                & @indexOf@             & @find@ \\
+@strcspn@                               & @find_first_of@               & @matches@             & @include@ \\
+@strspn@                                & @find_first_not_of@   & @matches@             & @exclude@ \\
+N/A                                             & @c_str@, @data@               & N/A                   & @strcpy@, @strncpy@ \\
 \end{tabular}
 \end{cquote}
 …
 \section{\CFA \lstinline{string} type}
+\section{\CFA \lstinline{string} Type}
 \label{s:stringType}
 …
 ch = ch + 'b'; $\C[2in]{// LHS disambiguate, add character values}$
 s = 'a' + 'b'; $\C{// LHS disambiguate, concatenate characters}$
 printf( "%c\n", @'a' + 'b'@ ); $\C[2in]{// no LHS information, ambiguous}$
 printf( "%c\n", @(return char)@('a' + 'b') ); $\C{// disambiguate with ascription cast}$
+printf( "%c\n", @'a' + 'b'@ ); $\C{// no LHS information, ambiguous}$
+printf( "%c\n", @(return char)@('a' + 'b') ); $\C{// disambiguate with ascription cast}\CRT$
 \end{cfa}
 The ascription cast, @(return T)@, disambiguates by stating a (LHS) type to use during expression resolution (not a conversion).
 …
 ch = ch * 3; $\C[2in]{// LHS disambiguate, multiply character values}$
 s = 'a' * 3; $\C{// LHS disambiguate, concatenate characters}$
 printf( "%c\n", @'a' * 3@ ); $\C[2in]{// no LHS information, ambiguous}$
 printf( "%c\n", @(return char)@('a' * 3) ); $\C{// disambiguate with ascription cast}$
+printf( "%c\n", @'a' * 3@ ); $\C{// no LHS information, ambiguous}$
+printf( "%c\n", @(return char)@('a' * 3) ); $\C{// disambiguate with ascription cast}\CRT$
 \end{cfa}
 Fortunately, character multiplication without LHS information is even rarer than addition, so repurposing the operator @*@ for @string@ types is not a problem.
 …
+&
 \begin{cfa}
 for ( ;; ) {
+for () {
         size_t posn = exclude( line, alpha );
   if ( posn == len( line ) ) break;
 …
 \end{tabular}
 \end{cquote}
 Input text can be gulped, including whitespace, from the current point to an arbitrary delimiter character using @getline@.
+Input text can be \emph{gulped}, including whitespace, from the current point to an arbitrary delimiter character using @getline@.
 The \CFA philosophy for input is that, for every constant type in C, these constants should be usable as input.
 …
 \end{tabular}
 \end{cquote}
 Note, the ability to read in quoted strings to match with program string constants.
+Note the ability to read in quoted strings containing whitespace, which then match program string constants.
 The @nl@ at the end of an input ignores the rest of the line.
 …
                                         & Lax: The target's type is anything string-like; it may have a different status concerning ownership.
                                                                 & Strict: The target's type is the same as the source; both strings are equivalent peers concerning ownership.
                                                                                         & n/a           & no    & yes   & yes \\
+                                                                                        & N/A           & no    & yes   & yes \\
 \hline
 Referent
 …
         The C ``string'' is @char *@, under the conventions of @<string.h>@. Because this type does not manage a text allocation, symmetry does not apply.
 \item
         The Java @String@ class is analyzed; its @StringBuffer@ class behaves similarly to @C++@.
+        The Java @String@ class is analyzed; its @StringBuffer@ class behaves similarly to \CC.
 \end{itemize}
 \caption{Comparison of languages' strings, storage management perspective.}
 …
 \end{figure}
 In C, these declarations give very different things.
+In C, these declarations are very different.
 \begin{cfa}
 char x[$\,$] = "abcde";
 char * y = "abcde";
 \end{cfa}
 Both associate the declared name with fixed-six contiguous bytes, filled as @{'a', 'b', 'c', 'd', 'e', 0}@.
 But @x@ gets them allocated in the active stack frame (with values filled in as control passes the declaration), while @y@ refers into the executable's read-only data section.
+Both associate the declared name with six fixed, contiguous bytes: @{'a', 'b', 'c', 'd', 'e', 0}@.
+But @x@ is allocated on the stack (with values filled at the declaration), while @y@ refers to the executable's read-only data section.
 With @x@ representing an allocation, it offers information in @sizeof(x)@ that @y@ does not.
 But this extra information is second-class, as it can only be used in the immediate lexical context, \ie it cannot be passed on to string operations or user functions.
+But this extra information is second-class, as it can only be used in the immediate lexical context, \ie it cannot be passed to string operations or user functions.
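The second-class status of the array's size can be seen in a small C sketch.
The helper names @callee_view@, @array_bytes@ and @pointer_bytes@ are hypothetical, introduced only for illustration; the sketch assumes a conforming C compiler and nothing beyond the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callee: an array argument decays to a pointer, so the
   length is only recoverable by scanning for the terminator. */
size_t callee_view( const char * s ) {
    return strlen( s );              /* sizeof( s ) here is sizeof( char * ) */
}

/* Size information available in the declaring scope of an array. */
size_t array_bytes( void ) {
    char x[] = "abcde";              /* six bytes allocated in this frame */
    assert( callee_view( x ) == 5 ); /* the callee sees text, not the allocation */
    return sizeof( x );              /* counts the terminator as well */
}

/* Size information available for a pointer to the same text. */
size_t pointer_bytes( void ) {
    const char * y = "abcde";        /* refers to read-only data */
    return sizeof( y );              /* size of the pointer itself */
}
```

Here @sizeof(x)@ reports the allocation, but only inside @array_bytes@; once @x@ is passed to @callee_view@, that information is gone.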
 Only pointers to text buffers are first-class, and discussed further.
 \begin{cfa}
 char * s = "abcde";
 char * s1 = s;  $\C{// alias state, n/a symmetry, variable-constrained referent}$
 char * s2 = &s[1];  $\C{// alias state, n/a symmetry, variable-constrained referent}\CRT$
 char * s3 = &s2[1];  $\C{// alias state, n/a symmetry, variable-constrained referent}
+char * s1 = s;  $\C[2.25in]{// alias state, N/A symmetry, variable-constrained referent}$
+char * s2 = &s[1];  $\C{// alias state, N/A symmetry, variable-constrained referent}$
+char * s3 = &s2[1];  $\C{// alias state, N/A symmetry, variable-constrained referent}\CRT$
 printf( "%s %s %s %s\n", s, s1, s2, s3 );
 $\texttt{\small abcde abcde bcde cde}$
 …
 string & s5 = s.substr(2,4);  $\C{// error: cannot point to temporary}\CRT$
 \end{cfa}
 The @s1@ lax symmetry reflects how its validity of depends on the lifetime of @s@.
+The @s1@ lax symmetry reflects how its validity depends on the lifetime of @s@.
 It is common practice in \CC to use the @s1@-style for a by-reference function parameter.
 Doing so assumes that the callee only uses the referenced string for the duration of the call, \ie no storing the parameter (as a reference) for later.
 So, when the called function is a constructor, its definition typically uses an @s2@-style copy-initialization.
 Exceptions to this pattern are possible, but require the programmer to assure safety where the type system does not.
 The @s3@ initialization must copy the substring because it must support a subsequent @c_str@ call, which provides a null-termination, generally at a different position than the source string's.
+The @s3@ initialization must copy the substring to support a subsequent @c_str@ call, which provides null-termination, generally at a different position than the source string's.
 @s2@ assignment could be made fast, by reference-counting the text area and using copy-on-write, but would require an implementation upgrade.
 …
 With @s2@, the case for fast-copy is more subtle.
 Certainly, its value is not pointer-equal to @s@, implying at least a further allocation.
 But because Java is not constrained to use a null-terminated representation, a standard-library implementation is free to refer to the source characters in-place.
+But because Java is \emph{not} constrained to use a null-terminated representation, a standard-library implementation is free to refer to the source characters in-place.
 Java does not meet the aliasing requirement because immutability makes it impossible to modify.
 Java's @StringBuffer@ provides aliasing (see @replace@ example on \VPageref{p:JavaReplace}), though without supporting symmetric treatment of a fragment referent, \eg @substring@ of a @StringBuffer@ is a @String@;
 …
+\subsection{Logical overlap}
+\subsection{Logical Overlap}
 It may be unfamiliar to combine \VRef[Figure]{f:StrSemanticCompare}'s alias state and fragment referent in one API, or at the same time.
 This section shows the capability in action.
+In summary, the metaphor of a GUI text editor is intended.
+Selecting a consecutive block of text using the mouse defines an aliased substring within the file.
+Typing in this state overwrites what was there before, replacing the originally selected text with more or less text.
+But the \emph{whole file} grows or shrinks as a result, not just the selection.
+This action models assigning to an aliased substring when the two strings overlap by total containment: one string is the selection, the other is the whole file.
+Now extend the metaphor to a multi-user online editor.
+If Alice selects a range of text at the bottom of the file, while Bob is rewriting a paragraph at the top, Alice's selection holds onto the logical characters initially selected, unaffected by Bob making the total file grow/shrink, and unaffected by Bob causing the start index of Alice's selection to vary.
+This action models assigning to an aliased substring when the two strings do not overlap at all: one string is Alice's selection, the other is Bob's.
+If a third office worker were also watching Alice's and Bob's actions on the whole file (a string with ``all the text'' is kept around), then two further single-user-edit cases give the semantics of the individual edits flowing into the whole.
+But, departing from the document analogy, it is not necessary to keep such a third string:
+no one has to resource-manage ``the document.''
+When an original string, from which both the Alice- and Bob-parts came, ceases to exist, Alice and Bob are left with two independent strings.
+They are independent because Alice and Bob have no API for growing the bounds of a string to subsume text that may once have been around it.
+Edge cases, notably ``Venn-diagram overlap,'' had to have handlings chosen.
+The intent in fleshing out these details was to achieve the above story, with a single API, while keeping the rest as simple as possible.
+The remainder of this section shows the resulting decisions, played out at the API level.
+\CFA uses the marker @`share@ as a dynamic mechanism to indicate alias (mutations shared) \vs snapshot (not quite an immutable result, but one with subsequent mutations isolated).
+\begin{comment}
+The metaphor of a GUI text-editor is used to illustrate combining these features.
+Most editors allow selecting a consecutive block of text (highlighted) to define an aliased substring within a document.
+Typing in this area overwrites the prior text, replacing the selected text with less, same, or more text.
+Importantly, the document also changes size, not just the selection.
+%This alias model is assigning to an aliased substring for two strings overlapping by total containment: one is the selected string, the other is the document.
+Extend the metaphor to two selected areas, where one area can be drag-and-dropped into another, changing the text in the drop area and correspondingly changing the document.
+When the selected areas are independent, the semantics of the drag-and-drop are straightforward.
+However, for overlapping selections, either partial or full, there are multiple useful semantics.
+For example, two areas overlap at the top, or bottom, or a block at a corner, where one area is dropped into the other.
+Or a smaller area is selected within a larger one, and then dropped into the larger area to replace it.
+In both cases, meaningful semantics must be constructed or the operation precluded.
+However, without this advanced capability, certain operations become multi-step, possibly requiring explicit temporaries.
+\end{comment}
+A GUI text-editor provides a metaphor.
+Selecting a block of text using the mouse defines an aliased substring within a document.
+Typing in this area overwrites what was there, replacing the originally selected text with more or less text.
+But the \emph{containing document} also grows or shrinks, not just the selection.
+This action models assigning to an aliased substring when one string is completely contained in the other.
+Extend the metaphor to a multi-user editor.
+If Alice selects a range of text at the bottom, while Bob is rewriting a paragraph at the top, Alice's selection holds onto the characters initially selected, unaffected by Bob making the document grow/shrink even though Alice's start index in the document is changing.
+This action models assigning to an aliased substring when the two strings do not overlap.
+Logically, Alice's and Bob's actions on the whole document are like two single-user-edit cases, giving the semantics of the individual edits flowing into a whole.
+But, there is no need to have two separate document strings.
+Even if a third selection removes all the text, both Alice's and Bob's strings remain.
+The independence of their selections assumes that the editor API does not allow the selection to be enlarged, \ie adding text from the containing environment, which may have disappeared.
+This leaves the ``Venn-diagram overlap'' cases, where Alice's and Bob's selections overlap at the top, bottom, or corner.
+In this case, the selection areas are dependent, and so, changes in content and size in one may have an effect on the other.
+There are multiple possible semantics for this case.
+The remainder of this section shows the chosen semantics for all of the cases.
+String sharing is expressed using the @`share@ marker to indicate aliasing (mutations shared) \vs snapshot (not quite an immutable result, but one with subsequent mutations isolated).
 This aliasing relationship is a sticky property established at initialization.
 For example, here strings @s1@ and @s1a@ are in an aliasing relationship, while @s2@ is in a copy relationship.
 \input{sharing1.tex}
 Here, the aliasing (@`share@) causes partial changes (subscripting) to flow in both directions.
 (In the following examples, watch how @s1@ and @s1a@ change together, and @s2@ is independent.)
+(In the following examples, note how @s1@ and @s1a@ change together, and @s2@ is independent.)
 \input{sharing2.tex}
 Similarly for complete changes.
 …
 \input{sharing4.tex}
 Now, consider string @s1_mid@ being an alias in the middle of @s1@, along with @s2@, made by a simple copy from the middle of @s1@.
+Now, consider string @s1_mid@ being an alias in the middle of @s1@, along with @s2@, made by a copy from the middle of @s1@.
 \input{sharing5.tex}
 Again, @`share@ passes changes in both directions; copy does not.
 …
 When @s1_bgn@'s size increases by 3, @s1_mid@'s starting location moves from 1 to 4 and @s1_end@'s from 3 to 6,
 When changes happens on an aliasing substring that overlap.
+When changes happen on aliasing substrings that overlap.
 \input{sharing10.tex}
 Strings @s1_crs@ and @s1_mid@ overlap at character 4, @j@ because the substrings are 3,2 and 4,2.
+Strings @s1_crs@ and @s1_mid@ overlap at character 4, @j@, because the substrings are 3,2 and 4,2.
 When @s1_crs@'s size increases by 1, @s1_mid@'s starting location moves from 4 to 5, but the overlapping character remains, changing to @'+'@.
 …
+\section{Storage management}
+\section{Storage Management}
 This section discusses issues related to storage management of strings.
 …
 const string s1 = "abc";
 \end{cfa}
+the @const@ applies to the @s1@ pointer to @"abc"@, and @"abc"@ is an immutable constant that is \emph{copied} into the string's storage.
+Hence, @s1@ is not pointing at an immutable constant, meaning its underlying string can be mutable, unless some other designation is specified, such as Java's global immutable rule.
+\subsection{General implementation}
+@const@ applies to the @s1@ pointer to @"abc"@, and @"abc"@ is an immutable constant that is \emph{copied} into the string's storage.
+Hence, @s1@ is not pointing at an immutable constant and its underlying string is mutable, unless some other designation is specified, such as Java's global immutable rule.
+\subsection{General Implementation}
 \label{string-general-impl}
 …
 A string is a smart pointer into this buffer.
 This cycle of frequent cheap allocations, interspersed with infrequent expensive compactions, has obvious similarities to a general-purpose memory manager based on garbage collection (GC).
+This cycle of frequent cheap allocations, interspersed with infrequent expensive compactions, has obvious similarities to a general-purpose memory-manager based on garbage collection (GC).
 A few differences are noteworthy.
 First, in a general purpose manager, the allocated objects may contain pointers to other objects, making the transitive reachability of these objects a crucial property.
 Here, the allocations are text, so one allocation never keeps another alive.
 Second, in a general purpose manager, the handle that keeps an allocation alive is a bare pointer.
 For strings, a fatter representation is acceptable because this pseudo-pointer is only used for enty into the string-heap, not for general data-sub-structure linking around the general heap.
+For strings, a fatter representation is acceptable because this pseudo-pointer is only used for entry into the string-heap, not for general data-substructure linking around the general heap.
 \begin{figure}
 …
 \VRef[Figure]{f:memmgr-basic} shows the representation.
 The heap header and text buffer define a sharing context.
 Normally, one global sharing context is appropriate for an entire program;
 concurrent exceptions are discussed in \VRef{s:ControllingImplicitSharing}.
 A string is a handle into the buffer and node within a linked list.
+Normally, one global context is appropriate for an entire program;
+concurrency is discussed in \VRef{s:ControllingImplicitSharing}.
+A string is a handle to a node in a linked list containing information about a string's text in the buffer.
 The list is doubly linked for $O(1)$ insertion and removal at any location.
 Strings are ordered in the list by text start address.
 The header maintains a next-allocation pointer, @alloc@, pointing to the last live allocation in the buffer.
+The heap header maintains a next-allocation pointer, @alloc@, pointing to the last live allocation in the buffer.
 No external references point into the buffer and the management procedure relocates the text allocations as needed.
 A string handle references a containing string, while its string is contiguous and not null terminated.
 …
 String handles can be allocated in the stack or heap, and represent the string variables in a program.
 Normal C life-time rules apply to guarantee correctness of the string linked-list.
 The text buffer is large enough with good management so that often only one dynamic allocation is necessary during program execution.
+The text buffer is large enough with good management so that often only one dynamic allocation is necessary during program execution, but not so large as to cause program bloat.
 % During this period, strings can vary in size dynamically.
 When the text buffer fills, \ie the next new string allocation causes @alloc@ to point beyond the end of the buffer, the strings are compacted.
 The linked handles define all live strings in the buffer, which indirectly defines the allocated and free space in the buffer.
 Since the string handles are in sorted order, the handle list can be traversed, copying the first live text to the start of the buffer, and subsequent strings after each other.
 If, upon compaction, the amount of free storage would still be less than the new string allocation, a larger text buffer is heap-allocated, the current buffer is copied into the new buffer, and the original buffer is freed.
+The string handles are maintained in sorted order, so the handle list can be traversed, copying the first live text to the start of the buffer, and subsequent strings after each other.
+After compaction, if free storage is still less than the new string allocation, a larger text buffer is heap-allocated, the current buffer is copied into the new buffer, and the original buffer is freed.
 Note, the list of string handles is structurally unaffected during a compaction;
 only the text pointers in the handles are modified to new buffer locations.
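The compaction pass described above can be sketched in C.
This is a minimal illustration under stated assumptions, not the library's implementation: the types @Handle@ and @StrHeap@, their fields, and the function @compact@ are hypothetical names, and @alloc@ is modelled here as the position for the next allocation.
The sketch relies only on the properties stated in the text: handles are kept in a list sorted by text start address, and compaction changes only the text pointers in the handles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handle: a list node ordered by text start address. */
typedef struct Handle {
    char * text;                     /* start of this string's text in the buffer */
    size_t len;                      /* text is not null terminated */
    struct Handle * prev, * next;
} Handle;

/* Hypothetical heap header: buffer bounds plus the sorted handle list. */
typedef struct {
    char * buf;
    size_t cap;
    char * alloc;                    /* assumed: position for the next allocation */
    Handle * first;
} StrHeap;

/* Slide all live text to the front of the buffer, in list order.
   Handles that share or overlap a run of text keep their relative offsets.
   Returns the number of free bytes after compaction. */
size_t compact( StrHeap * h ) {
    assert( h != NULL && h->buf != NULL );
    char * dst = h->buf;             /* next free byte in the compacted layout */
    char * run_src = NULL;           /* old start of the current run of live text */
    char * run_end = NULL;           /* old end of the part of the run already moved */
    char * run_dst = NULL;           /* new start of the current run */
    for ( Handle * p = h->first; p != NULL; p = p->next ) {
        char * p_end = p->text + p->len;
        if ( run_src == NULL || p->text >= run_end ) {
            run_src = p->text;       /* disjoint from the previous run: start a new one */
            run_end = p->text;
            run_dst = dst;
        }
        if ( p_end > run_end ) {     /* move the bytes of this run not yet moved */
            size_t n = (size_t)(p_end - run_end);
            memmove( dst, run_end, n );  /* destination never exceeds source */
            dst += n;
            run_end = p_end;
        }
        p->text = run_dst + (p->text - run_src);  /* only the text pointer changes */
    }
    h->alloc = dst;
    return h->cap - (size_t)(dst - h->buf);
}
```

The list links are never touched, matching the observation that the handle list is structurally unaffected by compaction.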
 …
 Both string initialization styles preserve the string module's internal invariant that the linked-list order matches the buffer order.
 For string destruction, handles are removed from the list.
 As a result, once a last handle using a run of buffer characters is destroyed, that buffer space gets excluded from the next compaction, making its character-count available in the compacted buffer.
 Certain string operations can result in a substring of another string.
 The resulting handle is then placed in the correct sorted position in the list, possible with a short linear search to locate the position.
+Once the last handle using a run of buffer characters is destroyed, that buffer space is excluded from use until the next compaction.
+Certain string operations result in a substring of another string.
+The resulting handle is then placed in the correct sorted position in the list, possibly requiring a short linear search to locate the position.
 For string operations resulting in a new string, that string is allocated at the end of the buffer.
 For shared-edit strings, handles that originally referenced containing locations need to see the new value at the new buffer location.
 …
 \subsection{RAII limitations}
+\subsection{RAII Limitations}
 \label{string-raii-limit}
 Earlier work on \CFA~\cite[ch.~2]{Schluntz17} implemented object constructors and destructors for all types (basic and user defined).
 A constructor is a user-defined function run implicitly \emph{after} an object's storage is allocated, and a destructor is a user-defined function run \emph{before} an object's storage is deallcated.
+A constructor is a user-defined function run implicitly \emph{after} an object's storage is allocated, and a destructor is a user-defined function run \emph{before} an object's storage is deallocated.
 This feature, called Resource Acquisition Is Initialization (RAII)~\cite[p.~389]{Stroustrup94}, helps guarantee invariants for users before accessing an object and for the programming environment after an object terminates.
 …
 \end{cfa}
 A module providing the @T@ type can traverse @all_T@ at relevant times, to keep the objects ``good.''
 Hence, declaring a @T@ not only ensures that it begins with an initially ``good'' value, but it also provides an implicit subscription to a service that keeps the value ``good'' in the future.
+Hence, declaring a @T@ not only ensures that it begins with an initially ``good'' value, but it also provides an implicit subscription to a service that keeps the value ``good'' during its lifetime.
 Again, both \CFA and \CC support this usage style.
 A third capability concerns \emph{implicitly} requested copies.
 When stack-allocated objects are used as parameter and return values, a sender's version exists in one stack frame and a receiver's version exists in another.
+In the parameter direction, the language's function-call handling must arrange for a copy-constructor call to happen\footnote{
+        \CC also offers move constructors and return-value optimization~\cite{RVO20}.
+        These features help reduce unhelpful copy-constructor calls, which, for types like the example \lstinline{S}, would lead to extra memory allocations.
+        \CFA does not currently have these features; adding similarly-intended features to \CFA is desirable.
+        However, this section is about a problem in the realization of features that \CFA already supports.
+        To understand the problem presented, the appropriate comparison is with classic versions of \CC that treated such copy-constructor calls as necessary.}
+at a time near the control transfer into the callee, with the source as the caller's (sender's) version and the target as the callee's (receiver's) version.
+(In the return direction, the roles are reversed and the copy-constructor call happens near the return of control.)
+\CC supports this capability without qualification.
+\CFA offers limited support here; simple examples work, but implicit copying does not combine successfully with the other RAII capabilities discussed.
+In the parameter direction, the language's function-call handling must arrange for a copy-constructor call to happen, at a time near the control transfer into the callee. %, with the source as the caller's (sender's) version and the target as the callee's (receiver's) version.
+In the return direction, the roles are reversed and the copy-constructor call happens near the return of control.
+\CC supports this capability.% without qualification.
+\CFA offers limited support;
+simple examples work, but implicit copying does not combine successfully with the other RAII capabilities discussed.
+\CC also offers move constructors and return-value optimization~\cite{RVO20}.
+These features help reduce unhelpful copy-constructor calls, which, for types like the @S@ example, would lead to extra memory allocations.
+\CFA does not currently have these features; adding similarly-intended features to \CFA is desirable.
+However, this section is about a problem in the realization of features that \CFA already supports.
+Hence, the comparison continues with the classic version of \CC that treated such copy-constructor calls as necessary.
 To summarize the unsupported combinations, the relevant features are:
 …
 At that time, adhering to a principle of minimal intervention, this code could always be treated as passthrough:
 \begin{cfa}
 struct U {...};
+struct U { ... };
 // RAII to go here
 void f( U u ) { F_BODY(u) }
 …
 f( x );
 \end{cfa}
 But adding custom RAII (at ``...here'') changes things.
 The common C++ lowering~\cite[Sec. 3.1.2.3]{cxx:raii-abi} proceeds differently than the present CFA lowering.
 \noindent
 \begin{tabular}{l|l}
 \begin{cfa}
+// C++, likely CFA to be
+But adding custom RAII (at ``...go here'') changes things.
+The common \CC lowering~\cite[Sec. 3.1.2.3]{cxx:raii-abi} proceeds differently than the present \CFA lowering.
+\begin{cquote}
+\setlength{\tabcolsep}{15pt}
+\begin{tabular}{@{}l|l@{}}
+\begin{cfa}
+$\C[0.0in]{// \CC, \CFA future}\CRT$
 struct U {...};
 // RAII elided
 void f( U * __u_orig ) {
         U u = * __u_orig;  // call copy ctor
         F_BODY(u)
+        F_BODY( u );
         // call dtor, u
+}
 U x; // call default ctor
+f( & x ) ;
+f( &x ) ;
 // call dtor, x
 \end{cfa}
+&
 \begin{cfa}
+// CFA today
+$\C[0.0in]{// \CFA today}\CRT$
 struct U {...};
 // RAII elided
 void f( U u ) {
+        F_BODY(u)
+        F_BODY( u );
+}
 U x; // call default ctor
 …
 \end{cfa}
 \end{tabular}
+In the CFA-today scheme, the lowered form is still using a by-value C call.
+C does a @memcpy@ on structs passed by value.
+And so, @F_BDY@ sees the bits of @__u_for_f@ occurring at an address that has never been presented to the @U@ lifecycle functions.
+If @U@ is trying to have a style-\#2 invariant, it shows up broken in @F_BDY@: references that are supposed to be to @u@ are actually to the different location @__u_for_f@.
+The \CC scheme does not have this problem because it constructs the for-@f@ copy in the correct location.
+Yet, the \CFA-today scheme is sufficient to deliver style-\#1 invariants (in this style-\#3 use case) because this scheme still does the correct number of lifecycle calls, using correct values, at correct times.  So, reference-counting or simple ownership applications get their invariants respected under call/return-by-value.
+\end{cquote}
+The current \CFA scheme is still using a by-value C call.
+C does a @memcpy@ on structures passed by value.
+And so, @F_BODY@ sees the bits of @__u_for_f@ occurring at an address that has never been presented to the @U@ lifecycle functions.
+If @U@ is trying to have a style-\#2 invariant, it shows up broken in @F_BODY@: references supposedly to @u@ are actually to @__u_for_f@.
+The \CC scheme does not have this problem because it constructs the for @f@ copy in the correct location within @f@.
+Yet, the current \CFA scheme is sufficient to deliver style-\#1 invariants (in this style-\#3 use case) because this scheme still does the correct number of lifecycle calls, using correct values, at correct times.
+So, reference-counting or simple ownership applications get their invariants respected under call/return-by-value.
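The effect of the bitwise by-value copy on a style-\#2 invariant can be reproduced in plain C.
The sketch below is an illustration under assumptions, not the \CFA lowering itself: the type @U@ here, its @self@ field, and the functions @U_ctor@, @invariant_holds@, @callee_by_value@ and @callee_by_pointer@ are hypothetical stand-ins.
The invariant is simply that an object records its own address, standing in for membership in a linked structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type with a style-#2 invariant: the object knows its own address. */
typedef struct U {
    struct U * self;
    int value;
} U;

/* Stand-in for a constructor: establishes the invariant at a known address. */
void U_ctor( U * u, int value ) {
    assert( u != NULL );
    u->self = u;
    u->value = value;
}

/* Check the invariant at the address where the object now lives. */
int invariant_holds( const U * u ) {
    return u->self == u;
}

/* By-value parameter: C copies the bits into a new object whose address
   was never presented to any lifecycle function. */
int callee_by_value( U u ) {
    return invariant_holds( &u );
}

/* By-pointer parameter with the copy constructed at the callee's location,
   in the spirit of the pointer-passing lowering shown above. */
int callee_by_pointer( const U * orig ) {
    U u;
    U_ctor( &u, orig->value );       /* stand-in for a copy constructor */
    return invariant_holds( &u );
}
```

In @callee_by_value@, the parameter still carries the caller's address in @self@, so the invariant is broken; in @callee_by_pointer@, the copy is constructed in place and the invariant holds.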
 % [Mike is not currently seeing how distinguishing initialization from assignment is relevant]
 …
 % The following discusses the consequences of this semantics with respect to lifetime management of \CFA strings.
+The string API offers style \#3's pass-by-value in, for example, in the return of @"a" + "b"@.
+The string API offers style \#3's pass-by-value, \eg in the return of @"a" + "b"@.
 Its implementation uses the style-\#2 invariant of the string handles being linked to each other, helping to achieve high performance.
 Since these two RAII styles cannont coexist, a workaround splits the API into two layers: one that provides pass-by-value, built upon the other with inter-linked handles.
+Since these two RAII styles cannot coexist, a workaround splits the API into two layers: one that provides pass-by-value, built upon the other with inter-linked handles.
 The layer with pass-by-value incurs a performance penalty, while the layer without delivers the desired runtime performance.
 The slower, friendlier High Level API (HL, type @string@) wrapps the faster, more primitive Low Level API (LL, type @string_res@, abbreviating ``resource'').
+The slower, friendlier High Level API (HL, type @string@) wraps the faster, more primitive Low Level API (LL, type @string_res@, abbreviating ``resource'').
 Both APIs present the same features, up to return-by-value operations being unavailable in LL and implemented via the workaround in HL.
 The intention is for most future code to target HL.
+When the RAII issue is fixed, the full HL feature set will be acheivable using the LL-style lifetime management.
+So then, there will be no need for two API levels; HL will be removed; LL's type will be renamed to @string@; programs written for current HL will run faster.
+When the RAII issue is fixed, the full HL feature set will be achievable using the LL-style lifetime management.
+Then, HL will be removed;
+LL's type will be renamed @string@ and programs written for current HL will run faster.
 In the meantime, performance-critical sections of applications must use LL.
 Subsequent performance experiments \see{\VRef{s:PerformanceAssessment}} use the LL API when comparing \CFA to other languages.
 This measurement gives a fair estimate of the goal state for \CFA.
 A separate measure of the HL overhead is also included.
 \VRef[Section]{string-general-impl} described the goal state for \CFA.  In present state, the type @string_res@ replaces its mention of @string@ as inter-linked handle.
 To use LL, a programmer rewrites invocations that used pass-by-value APIs into invocations where the resourcing is more explicit.
 Many invocations are unaffected, notably including assignment and comparison.
 Of the capabilities listed in \VRef[Figure]{f:StrApiCompare}, only the following three cases have revisions.
 \noindent
+Hence, \VRef[Section]{string-general-impl} is describing the goal state for \CFA.
+In the present state, the type @string_res@ replaces its mention of @string@ as inter-linked handle.
+To use LL, a programmer rewrites invocations using pass-by-value APIs into invocations where resourcing is more explicit.
+Many invocations are unaffected, notably assignment and comparison.
+Of the capabilities listed in \VRef[Figure]{f:StrApiCompare}, only the following three cases need revisions.
+\begin{cquote}
+\setlength{\tabcolsep}{15pt}
 \begin{tabular}{ll}
 HL & LL \\
 \hline
 \begin{cfa}
 string s = "a" + "b";
 \end{cfa}
 …
 string s = "abcde";
 string s2 = s(2, 3); // s2 == "cde"
 s(2,3) = "x"; // s == "abx" && s2 == "cde"
 \end{cfa}
 …
 \begin{cfa}
 string s = "abcde";
 s[2] = "xxx";  // s == "abxxxde"
 \end{cfa}
 …
 \end{cfa}
 \end{tabular}
+\end{cquote}
 The actual HL workaround is having @string@ wrap a pointer to a uniquely owned, heap-allocated @string_res@.  This arrangement has @string@ being style-\#1 RAII, which is compatible with pass-by-value.
+\subsection{Sharing implementation}
+\subsection{Sharing Implementation}
 \label{sharing-impl}
+The \CFA string module has two mechanisms to handle the case when string handles share a run of text.
+The \CFA string module has two mechanisms to deal with string handles sharing text.
 In the first type of sharing, the user requests that both string handles be views of the same logical, modifiable string.
 This state is typically produced by the substring operation.
 …
 $\texttt{\small axcde xc}$
 \end{cfa}
 In a typical substring call, the source string-handle is referencing an entire string, and the resulting, newly made, string handle is referencing a portion of the original.
 In this state, a subsequent modification made by either is visible in both.
+Here, the source string-handle is referencing an entire string, and the resulting, newly made, string handle is referencing a contained portion of the original.
+In this state, a modification made in the overlapping area is visible in both strings.
 The second type of sharing happens when the system implicitly delays the physical execution of a logical \emph{copy} operation, as part of its copy-on-write optimization.
 …
 In this state, a subsequent modification done on one handle triggers the deferred copy action, leaving the handles referencing different text within the buffer, holding distinct values.
 A further abstraction, in the string module's implementation, helps distinguish the two senses of sharing.
+A further abstraction helps distinguish the two senses of sharing.
 A share-edit set (SES) is an equivalence class over string handles, being the reflexive, symmetric and transitive closure of the relationship of one string being constructed from another, with the ``share'' option given.
 The SES is represented by a second linked list among the handles.
 …
 \subsection{Controlling implicit sharing}
+\subsection{Controlling Implicit Sharing}
 \label{s:ControllingImplicitSharing}
 …
 In detail, string sharing has inter-linked string handles, so managing one string is also managing the neighbouring strings, and from there, a data structure of the ``set of all strings.''
 Therefore, it is useful to toggle this capability on or off when it is not providing any application benefit.
-\begin{figure}
-    \begin{tabular}{ll}
-        \lstinputlisting[language=CFA, firstline=10, lastline=55]{sharectx.run.cfa}
+        &
-        \raisebox{-0.17\totalheight}{\includegraphics{string-sharectx.pdf}} % lower
-    \end{tabular}
-        \caption{Controlling copying vs sharing of strings using \lstinline{string_sharectx}.}
-        \label{fig:string-sharectx}
-\end{figure}
 The \CFA string library provides the type @string_sharectx@ to control an ambient sharing context.
 …
 Executing the example does not produce an interesting outcome, but the comments in the picture indicate when the logical copy operation runs with
 \begin{description}
     \item[share:] the copy being deferred, as described through the rest of this section (fast), or
     \item[copy:] the copy performed eagerly (slow).
+        \item[share:] the copy being deferred, as described through the rest of this section (fast), or
+        \item[copy:] the copy performed eagerly (slow).
 \end{description}
 Only eager copies can cross @string_sharectx@ boundaries.
 The intended use is with stack-managed lifetimes, in which the established context lasts until the current function returns, and affects all functions called that do not create their own contexts.
+[ TODO: true up with ``is thread local'' (implement that and expand this discussion to give a concurrent example, or adjust this wording) ]
+\subsection{Sharing and threading}
+\begin{figure}
+        \begin{tabular}{ll}
+                \lstinputlisting[language=CFA, firstline=10, lastline=55]{sharectx.run.cfa}
+                &
+                \raisebox{-0.17\totalheight}{\includegraphics{string-sharectx.pdf}} % lower
+        \end{tabular}
+        \caption{Controlling copying vs sharing of strings using \lstinline{string_sharectx}.}
+        \label{fig:string-sharectx}
+\end{figure}
+\subsection{Sharing and Threading}
 The \CFA string library provides no thread safety, the same as \CC string, providing similar performance goals.
 …
+\subsection{Future work}
+Implementing the small-string optimization is straightforward, as a string header contains a pointer to the string text in the buffer.
+This pointer could be marked with a flag and contain a small string.
+However, there is now a conditional check required on the fast-path to switch between small and large string operations.
+It might be possible to pack 16- or 32-bit Unicode characters within the same string buffer as 8-bit characters.
+Again, locations for identification flags must be found and checked along the fast path to select the correct actions.
+Handling utf8 (variable length) is more problematic because simple pointer arithmetic cannot be used to stride through the variable-length characters.
+Trying to use a secondary array of fixed-sized pointers/offsets to the characters is possible, but raises the question of storage management for the utf8 characters themselves.
+\section{Performance assessment}
+\label{s:PerformanceAssessment}
+I assessed the \CFA string library's speed and memory usage against strings in \CC STL.
+Overall, this analysis shows that adding support for the features shown earlier in the chapter comes at no substantial cost in the performance of features common to both APIs.
+Moreover, the results support the \CFA string's position as a high-level enabler of simplified text processing.
+STL makes its user think about memory management.
+When the user does, and is successful, STL's performance can be very good.
+But when the user fails to think through the consequences of the STL representation, performance becomes poor.
+The \CFA string lets the user work at the level of just putting the right text into the right variables, with corresponding performance degradations reduced or eliminated.
+% The final test shows the overall win of the \CFA text-sharing mechanism.
+% It exercises several operations together, showing \CFA enabling clean user code to achieve performance that STL requires less-clean user code to achieve.
+\subsection{Methodology}
+These tests use a \emph{corpus} of strings.
+Their lengths are important; the specific characters occurring in them are immaterial.
+In a result graph, a corpus's mean string length is often the independent variable shown on the X axis.
+When a corpus contains strings of different lengths, the lengths are drawn from a geometric distribution.
+Therefore, strings much longer than the mean occur nontrivially and strings slightly shorter than the mean occur most often.
+A corpus's string sizes are one of:
+\begin{description}
+        \item [Fixed-size] all string lengths are of the stated size.
+        \item [Varying 1 and up] the string lengths are drawn from the geometric distribution with a stated mean and all lengths occur.
+        \item [Varying 16 and up] string lengths are drawn from the geometric distribution with the stated mean, but only lengths 16 and above occur; thus, the stated mean is above 16.  \PAB{Is this one unused?  May have just been for ``normalize.''}
+\end{description}
+The special treatment of length 16 deals with the short-string optimization (SSO) in STL @string@, currently not implemented in \CFA, though a fine future improvement to \CFA.
+In the general case, an STL string handle is a pointer (to separately allocated text) and a length.
+But when the text is shorter than this representation, the optimization repurposes the handle's storage to eliminate using the heap.
+\subsection{Short-String Optimization}
+\CC implements a short-string ($\le$16) optimization (SSO).
+As a string header contains a pointer to the string text, this pointer can be tagged and used to contain a short string, removing a dynamic memory allocation/deallocation.
 \begin{c++}
 class string {
 …
                 char sstr[sizeof(lstr)]; $\C{// short string <16 characters, text in situ}$
         };
         $\C{// tagging for kind (short or long) elided}$
+        // some tagging for short or long strings
 };
 \end{c++}
+However, there is now a conditional check required on the fast-path to switch between short and long string operations.
+It might be possible to pack 16- or 32-bit Unicode characters within the same string buffer as 8-bit characters.
+Again, locations for identification flags must be found and checked along the fast path to select the correct actions.
+Handling utf8 (variable length) is more problematic because simple pointer arithmetic cannot be used to stride through the variable-length characters.
+Trying to use a secondary array of fixed-sized pointers/offsets to the characters is possible, but raises the question of storage management for the utf8 characters themselves.
+\section{Performance Assessment}
+\label{s:PerformanceAssessment}
+I assessed the \CFA string library's speed and memory usage against strings in \CC STL.
+Overall, this analysis shows that adding support for the features shown earlier in the chapter comes at no substantial cost in the performance of features common to both APIs.
+Moreover, the results support the \CFA string's position as a high-level enabler of simplified text processing.
+STL makes its user think about memory management.
+When the user does, and is successful, STL's performance can be very good.
+But when the user fails to think through the consequences of the STL representation, performance becomes poor.
+The \CFA string lets the user work at the level of just putting the right text into the right variables, with corresponding performance degradations reduced or eliminated.
+% The final test shows the overall win of the \CFA text-sharing mechanism.
+% It exercises several operations together, showing \CFA enabling clean user code to achieve performance that STL requires less-clean user code to achieve.
+\subsection{Methodology}
+These tests use a \emph{corpus} of strings.
+Their lengths are important; the specific characters occurring in them are immaterial.
+In a result graph, a corpus's mean string length is often the independent variable shown on the X axis.
+When a corpus contains strings of different lengths, the lengths are drawn from a geometric distribution.
+Therefore, strings much longer than the mean occur less often and strings slightly shorter than the mean occur most often.
+A corpus's string sizes are one of:
+\begin{description}
+        \item [Fixed-size] all string lengths are of the stated size.
+        \item [Varying 1 and up] the string lengths are drawn from the geometric distribution with a stated mean and all lengths occur.
+        \item [Varying 16 and up] string lengths are drawn from the geometric distribution with the stated mean, but only lengths 16 and above occur; thus, the stated mean is above 16.
+\end{description}
+The special treatment of length 16 deals with the SSO in STL @string@, currently not implemented in \CFA.
 A fixed-size or from-16 distribution ensures that \CC's extra-optimized cases are isolated within, or removed from, the comparison.
 In all experiments that use a corpus, its text is generated and loaded into the system under test before the timed phase begins.
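The corpus-length rules above can be sketched in \CC as follows. This is only an illustration, not the thesis's generator: the name `corpus_lengths`, its parameters, and the choice to realize the ``16 and up'' case by shifting the distribution (rather than rejecting short draws) are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (an assumption, not from the thesis): draw n string
// lengths from a geometric distribution shifted so the smallest possible
// length is min_len and the expected length is `mean`.
std::vector<std::size_t> corpus_lengths(std::size_t n, double mean,
                                        std::size_t min_len, unsigned seed) {
    assert(mean > static_cast<double>(min_len));
    // std::geometric_distribution yields k >= 0 with mean (1 - p) / p,
    // so min_len + k has the requested mean when p = 1 / (mean - min_len + 1).
    const double p = 1.0 / (mean - static_cast<double>(min_len) + 1.0);
    std::mt19937 gen(seed);
    std::geometric_distribution<long> dist(p);
    std::vector<std::size_t> lengths;
    lengths.reserve(n);
    for (std::size_t i = 0; i < n; i += 1) {
        lengths.push_back(min_len + static_cast<std::size_t>(dist(gen)));
    }
    return lengths;
}
```

Calling `corpus_lengths(n, mean, 1, seed)` corresponds to the varying-from-1 construction and `corpus_lengths(n, mean, 16, seed)` to the varying-from-16 construction, under the shifting assumption stated above.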
 To ensure comparable results, a common memory allocator is used for \CFA and \CC.
 \CFA runs the llheap allocator~\cite{Zulfiqar22}; the test rig plugs this same allocator into \CC.
+\CFA runs the llheap allocator~\cite{Zulfiqar22}, which is also plugged into \CC.
 The operations being measured take dozens of nanoseconds, so a succession of many invocations is run and timed as a group.
 The experiments run with fixed duration (targeting approximately 5 seconds), stopping upon passing a goal time, as determined by re-checking @clock()@ every 10,000 invocations, which is never more often than once per 80 ms.
 Timing outcomes report mean nanoseconds per invocation, which includes harness overhead and the targeted string API execution.
+The experiments run for a fixed duration (5 seconds), as determined by re-checking @clock()@ every 10,000 invocations, which is never more often than once per 80 ms.
+Timing outcomes report mean nanoseconds per invocation, which includes harness overhead and the targeted string API execution.
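The timing scheme just described (run in batches, re-check the clock once per batch, report mean nanoseconds per invocation) might look like the following \CC sketch. The function name `mean_ns_per_append`, its parameters, and the fixed 8-character piece are assumptions made for illustration; the actual test rig is not shown in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical harness (an assumption, not the thesis's rig): repeat batches
// of appends until goal_seconds of CPU time has passed, checking clock() only
// once per batch, and return the mean nanoseconds per append. The result
// includes the harness's own loop overhead.
double mean_ns_per_append(double goal_seconds, std::size_t batch) {
    assert(goal_seconds > 0.0 && batch > 0);
    const std::string piece = "abcdefgh";
    const double goal_ticks = goal_seconds * CLOCKS_PER_SEC;
    std::string accum;
    std::size_t count = 0;
    const std::clock_t start = std::clock();
    std::clock_t now = start;
    do {
        accum.clear();                          // start each batch from empty
        for (std::size_t i = 0; i < batch; i += 1) {
            accum += piece;                     // the operation being timed
        }
        count += batch;                         // count number of appends
        now = std::clock();                     // one clock check per batch
    } while (static_cast<double>(now - start) < goal_ticks);
    assert(accum.size() == piece.size() * batch);
    const double elapsed_ns =
        static_cast<double>(now - start) / CLOCKS_PER_SEC * 1e9;
    return elapsed_ns / static_cast<double>(count);
}
```

Checking the clock once per batch keeps the cost of `clock()` itself out of the per-invocation figure, at the price of overshooting the goal time by up to one batch.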
 \PAB{To discuss: hardware and such}
+As discussed in \VRef[Section]{string-raii-limit}, general performance comparisons are made using \CFA's faster, low-level string API, whose string type is named @string_res@.
+\VRef{s:ControllingImplicitSharing} presents an operational mode where \CFA string sharing is turned off.  In this mode, the \CFA string operates similarly to \CC's, by using a distinct heap allocation for each string's text.
+Some experiments include measurements in this mode for baselining purposes.
+It is called ``\CC emulation mode'' or ``nosharing'' here.
+As discussed in \VRef[Section]{string-raii-limit}, general performance comparisons are made using \CFA's faster, low-level string API, named @string_res@.
+\VRef{s:ControllingImplicitSharing} presents an operational mode where \CFA string sharing is turned off.
+In this mode, the \CFA string operates similarly to \CC's, by using a heap allocation for string text.
+Some experiments include measurements in this mode for baselining purposes, called ``\CC emulation mode'' or ``nosharing''.
 \subsection{Test: Append}
 These tests measure the speed of appending strings from the corpus onto a larger, growing string.  They show \CFA performing comparably to \CC overall, though with reduced penalties for simple API misuses for which \CC programmers may not know to watch out.
+These tests measure the speed of appending strings from the corpus onto a larger, growing string.
+They show \CFA performing comparably to \CC overall, though with reduced penalties for simple API misuses.
 The basic harness is:
+\begin{cquote}
+\setlength{\tabcolsep}{20pt}
+\begin{cfa}
+START_TIMER
+for ( ... ) {
+        string_res accum;
+        for ( i; 100 ) {
+                accum += corpus[ f(i) ]; // importing from char * here
+                COUNT_ONE_OP_DONE
+\begin{cfa}
+// set alarm duration
+for ( ... ) { $\C[1.5in]{// loop for duration}$
+        for ( i; N ) { $\C{// perform multiple appends (concatenations)}$
+                accum += corpus[ f( i ) ];
+        }
+        count += N; $\C{// count number of appends}\CRT$
+}
+STOP_TIMER
+\end{cfa}
+\end{cquote}
+The harness's outer loop executes until a sample-worthy amount of execution has happened.
+The inner loop builds up the desired-length string with successive appends, before the outer makes it start over from a blank accumulator.
+Each harness run targets a specific (mean) corpus string length and produces one data point on the result graph.
+\end{cfa}
+The harness's outer loop executes for the experiment duration.
+The string is reset to empty before appending (not shown).
+The inner loop builds up a growing-length string with successive appends.
+Each run targets a specific (mean) corpus string length and produces one data point on the result graph.
 Three specific comparisons are made with this harness.
 Each picks its own independent-variable basis of comparison.
+All three comparisons use the varying-from-1 corpus construction, \ie they allow the STL to show its advantage from small-string optimization.
+All three comparisons use the varying-from-1 corpus construction, \ie they allow the STL to show its advantage for SSO.
 \subsubsection{Fresh vs Reuse in \CC, Emulation Baseline}
 The first experiment compares \CFA with \CC, with \CFA operating in nosharing mode (and \CC having no other mode).
 This experiment simply baselines how \CFA modestly lags \CC's optimization/tuning level generally, yet reproduces a coarser phenomenon.
 This experiment also introduces the first \CC coding pitfall, which the next experiment will show is helped by turning on \CFA sharing.  By this pitfall, a \CC programmer must pay attention to string variable reuse.
 \begin{cquote}
 \setlength{\tabcolsep}{20pt}
+The first experiment compares \CFA with \CC, with \CFA operating in nosharing mode and \CC having no other mode; hence, both string packages use @malloc@/@free@.
+% This experiment establishes a baseline for other experiments.
+This experiment also introduces the first \CC coding pitfall, which the next experiment shows is helped by turning on \CFA sharing.
+% This pitfall shows, a \CC programmer must pay attention to string variable reuse.
+In the following, both programs are doing the same thing: start with @accum@ empty and build it up by appending @N@ strings (type @string@ in \CC and the faster @string_res@ in \CFA).
+\begin{cquote}
+\setlength{\tabcolsep}{40pt}
 \begin{tabular}{@{}ll@{}}
 % \multicolumn{1}{c}{\textbf{fresh}} & \multicolumn{1}{c}{\textbf{reuse}} \\
 …
 for ( ... ) {
         @string_res accum;@       // fresh
         for ( ... )
                 accum @+=@ ...
+        @string_res accum;@     $\C[1.5in]{// fresh}$
+        for ( N )
+                accum @+=@ ...  $\C{// append}\CRT$
+}
 \end{cfa}
 …
 string_res accum;
 for ( ... ) {
         @accum = "";@  $\C[1in]{// reuse\CRT}$
         for ( ... )
                 accum @+=@ ...
+        @accum = "";@  $\C[1.5in]{// reuse}$
+        for ( N )
+                accum @+=@ ...  $\C{// append}\CRT$
+}
 \end{cfa}
 \end{tabular}
 \end{cquote}
+Both programs are doing the same thing: start with @x@ empty and build it up by appending the same chunks.
+A programmer should not have to consider this difference.
+But from under the covers, each string being an individual allocation leaks through.
+While the inner loop is appending text to an @x@ that had not yet grown to have a large capacity, the program is, naturally, paying to extend the variable-length allocation, occasionally.
+This capacity stretching is a sticky property that survives assigning a (short, empty-string) value into an existing initialization.
+So, the ``reuse'' version benefits from not growing the allocation on subsequent runs of the inner loop.
+Yet, the ``fresh'' version is constantly restarting from a small buffer.
+The difference is creating a new or reusing an existing string variable.
+The pitfall is that most programmers do not consider this difference.
+However, creating a new variable implies deallocating the previous string storage and allocating new empty storage.
+As the string grows, further deallocations/allocations are required to release the previous and extend the current string storage.
+So, the fresh version is constantly restarting with zero string storage, while the reuse version benefits from having its prior large storage from the last append sequence.
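For a concrete view of the two styles on the \CC side, the following sketch writes both as functions. The names `build_fresh` and `build_reuse` are hypothetical. That assigning `""` keeps the earlier capacity is an assumption about common STL implementations, not a guarantee of the language standard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// "fresh": a new accumulator per round, so every round regrows its
// storage starting from an empty string.
std::string build_fresh(std::size_t rounds, std::size_t n,
                        const std::string& piece) {
    std::string last;
    for (std::size_t r = 0; r < rounds; r += 1) {
        std::string accum;                      // fresh
        for (std::size_t i = 0; i < n; i += 1) accum += piece;
        last = std::move(accum);                // keep the final round's text
    }
    return last;
}

// "reuse": one accumulator reset to empty each round. On common
// implementations (an assumption) the capacity grown in earlier rounds
// is retained, so later rounds append without reallocating.
std::string build_reuse(std::size_t rounds, std::size_t n,
                        const std::string& piece) {
    std::string accum;
    for (std::size_t r = 0; r < rounds; r += 1) {
        accum = "";                             // reuse
        for (std::size_t i = 0; i < n; i += 1) accum += piece;
    }
    return accum;
}
```

Both functions return the same text; they differ only in how often the underlying storage is allocated and grown.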
 \begin{figure}
 …
         \includegraphics{plot-string-peq-cppemu.pdf}
 %       \includegraphics[width=\textwidth]{string-graph-peq-cppemu.png}
+        \caption{Fresh vs Reuse in \CC, Emulation Baseline.  Average time per iteration with one \lstinline{x += y} invocation (lower is better).  Comparing \CFA's STL emulation mode with STL implementations, and comparing the ``fresh'' with ``reused'' reset styles.}
+        \caption{Fresh vs Reuse in \CC, Emulation Baseline.
+        Average time per iteration with one \lstinline{x += y} invocation (lower is better).
+        Comparing \CFA's STL emulation mode with STL implementations, and comparing the fresh with reused reset styles.}
         \label{fig:string-graph-peq-cppemu}
+\end{figure}
+\VRef[Figure]{fig:string-graph-peq-cppemu} shows the resulting performance.
+The fresh \vs reuse penalty is the dominant difference.
+The cost is 40\% averaged over the cases shown and minimally 24\%.
+It shows up consistently on both the \CFA and STL implementations, and this cost is more prominent with larger strings.
+The lesser \CFA \vs STL difference shows \CFA reproducing STL's performance, up to a 15\% penalty averaged over the cases shown, diminishing with larger strings, and 50\% in the worst case.
+This penalty characterizes implementation fine tuning done with STL and not yet done with \CFA.
+\subsubsection{\CFA's Fresh-Reuse Compromise}
+This comparison has the same setup as the last one, except that the \CFA implementation is switched to use its sharing mode.  The outcome is that the fresh/reuse difference vanishes in \CFA, with \CFA consistently delivering performance that compromises between the two \CC cases.
+\begin{figure}
+\centering
+        \includegraphics{string-graph-peq-sharing.pdf}
+        \bigskip
+        \bigskip
+        \includegraphics{plot-string-peq-sharing.pdf}
 %       \includegraphics[width=\textwidth]{string-graph-peq-sharing.png}
+        \caption{\CFA Compromise for Fresh \vs Reuse.  Average time per iteration with one \lstinline{x += y} invocation (lower is better).  Comparing \CFA's sharing mode with STL, and comparing the ``fresh'' with ``reused'' reset styles.  The \CC results are repeated from \ref{fig:string-graph-peq-cppemu}.}
+        \caption{\CFA Compromise for Fresh \vs Reuse.
+        Average time per iteration with one \lstinline{x += y} invocation (lower is better).
+        Comparing \CFA's sharing mode with STL, and comparing the fresh with reused reset styles.
+        The \CC results are repeated from \VRef[Figure]{fig:string-graph-peq-cppemu}.}
         \label{fig:string-graph-peq-sharing}
 \end{figure}
+\VRef[Figure]{fig:string-graph-peq-sharing} has the result.
+At append lengths 5 and above, \CFA not only splits the two STL cases, but its slowdown of 16\% over STL with user-managed reuse is close to the baseline \CFA-v-STL implementation difference seen with \CFA in STL-emulation mode.
+\subsubsection{\CFA's low overhead for misusing \lstinline{+}}
+A further pitfall occurs when the user writes @x = x + y@, rather than @x += y@.  Again, they are logically equivalent.
+\VRef[Figure]{fig:string-graph-peq-cppemu} shows the resulting performance.
+The two fresh (solid) lines and the two reuse (dash) lines are identical, except for lengths $\le$10, where the \CC SSO has an advantage of 40\% on average and at least 24\%.
+The gap between the fresh and reuse lines is the removal of the dynamic memory allocations and reuse of prior storage, \eg 100M allocations for fresh \vs 100 allocations for reuse across all experiments.
+While allocation reduction is huge, data copying dominates the cost, so the lines are still reasonably close together.
+\subsubsection{\CFA's Sharing Mode}
+This comparison is the same as the last one, except the \CFA implementation is using sharing mode.
+Hence, both \CFA's fresh and reuse versions have no memory allocations, and as before, only for reuse does \CC have no memory allocations.
+\VRef[Figure]{fig:string-graph-peq-sharing} shows the resulting performance.
+For fresh at append lengths 5 and above, \CFA is now closer to the \CC reuse performance, because of removing the dynamic allocations.
+However, for reuse, \CFA has slowed down slightly, to performance matching the new fresh version, as the two versions are now implemented virtually the same.
+The reason for the \CFA reuse slow-down is the overhead of managing the sharing scheme (primarily maintaining the list of handles), without gaining any benefit.
+\begin{comment}
+FIND A HOME!!!
+The potential benefits of the sharing scheme do not give \CFA an edge over \CC when appending onto a reused string, though the first one helps \CFA win at going onto a fresh string.  These abilities are:
+\begin{itemize}
+\item
+To grow a text allocation repeatedly without copying it elsewhere.
+This ability is enabled by \CFA's most-recently modified string being located immediately before the text buffer's \emph{shared} bump-pointer area, \ie often a very large greenfield, relative to the \emph{individual} string being grown.
+With \CC-reuse, this benefit is already reaped by the user's reuse of a pre-stretched allocation.
+Yet \CC-fresh pays the higher cost because its room to grow for free is at most a constant times the original string's length.
+\item
+To share an individual text allocation across multiple related strings.
+This ability is not applicable to appending with @+=@.
+It is in play in [xref sub-experiment pta] and [xref experiment pbv].
+\item
+To share a text arena across unrelated strings, sourcing disparate allocations from a common place.
+That is, always allocating from a bump pointer, and never maintaining free lists.
+This ability is not relevant to running any append scenario on \CFA with sharing, because appending modifies an existing allocation and is not driving several allocations.
+This ability is assessed in [xref experiment allocn].
+\end{itemize}
+This cost, of slowing down append-with-reuse, is \CFA paying the piper for other scenarios done well.
+\CFA prioritizes the fresh use case because it is more natural.
+The \emph{user-invoked} reuse scheme is an unnatural programming act because it deliberately misuses lexical scope: a variable (@accum@) gets its lifetime extended beyond the scope in which it is used.
+A \CFA user needing the best performance on an append scenario can still access the \CC-like speed by invoking noshare.
+This (indirect) resource management is memory-safe, as compared to that required in \CC to use @string&@, where knowledge of another string's lifetime comes into play.
+This abstraction opt-out is also different from invoking the LL API-level option.
+In fact, these considerations are orthogonal.
+But the key difference is that invoking the LL API would be a temporary measure, to use a workaround of a known \CFA language issue; choosing to exempt a string from sharing is a permanent act of program tuning.
+Beyond these comparisons, opting for noshare actually provides program ``eye candy,'' indicating that under-the-hood thinking is becoming relevant here.
+\end{comment}
+\subsubsection{Misusing Concatenation}
+A further pitfall occurs writing the apparently equivalent @x = x + y@ \vs @x += y@.
 For numeric types, the generated code is equivalent, giving identical performance.
 However, for string types there can be a significant difference.
 This pitfall is a particularly likely hazard for beginners.
+This pitfall is particularly likely for beginners.
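The two spellings can be placed side by side in \CC. This is a minimal sketch with hypothetical function names, and the comments describe typical STL behaviour rather than any measured result.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends y to x in place, n times.
std::string append_in_place(std::string x, const std::string& y,
                            std::size_t n) {
    for (std::size_t i = 0; i < n; i += 1) {
        x += y;         // extends x's existing storage when capacity allows
    }
    return x;
}

// Same logical result, written with + then =.
std::string append_via_plus(std::string x, const std::string& y,
                            std::size_t n) {
    for (std::size_t i = 0; i < n; i += 1) {
        x = x + y;      // builds a temporary holding all of x's text plus y,
                        // then assigns that temporary back to x
    }
    return x;
}
```

The results are equal as values, but the second form copies the whole accumulated text on every iteration, so its cost grows with the current length of `x`.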
 In earlier experiments, the choice of \CFA API among HL and LL had no impact on the functionality being tested.
 …
 \end{tabular}
 \end{cquote}
 Note that this ``Goal'' code functions today in HL.
+Note, the goal code functions today in HL but with slower performance.
 \begin{figure}
 \centering
         \includegraphics{string-graph-pta-sharing.pdf}
+        \includegraphics{plot-string-pta-sharing.pdf}
 %       \includegraphics[width=\textwidth]{string-graph-pta-sharing.png}
         \caption{CFA's low overhead for misusing \lstinline{+}.  Average time per iteration with one \lstinline{x += y} invocation (lower is better). Comparing \CFA (having implicit sharing activated) with STL, and comparing the \lstinline{+}-then-\lstinline{=} with the \lstinline{+=} append styles.  The \lstinline{+=} results are repeated from \VRef[Figure]{fig:string-graph-peq-sharing}.}
+        \caption{CFA's low overhead for misusing concatenation.  Average time per iteration with one \lstinline{x += y} invocation (lower is better). Comparing \CFA (having implicit sharing activated) with STL, and comparing the \lstinline{+}-then-\lstinline{=} with the \lstinline{+=} append styles.  The \lstinline{+=} results are repeated from \VRef[Figure]{fig:string-graph-peq-sharing}.}
         \label{fig:string-graph-pta-sharing}
 \end{figure}
+\VRef[Figure]{fig:string-graph-pta-sharing} gives the outcome.  The STL's penalty is $8 \times$ while \CFA's is only $2 \times$, averaged across the cases shown here.
+\VRef[Figure]{fig:string-graph-pta-sharing} gives the outcome, where the Y-axis is log scale because of the large differences.
+The STL's penalty is $8 \times$ while \CFA's is only $2 \times$, averaged across the cases shown here.
 Moreover, the STL's gap increases with string size, while \CFA's converges.
 So again, \CFA helps users who just want to treat strings as values, and not think about the resource management under the covers.
+While not a design goal, and not graphed out, \CFA in STL-emulation mode happened to outperform STL in this case.  User-managed allocation reuse did not affect either implementation in this case; only ``fresh'' results are shown.
+\subsection{Test: Pass argument}
+While not a design goal, and not graphed, \CFA in STL-emulation mode outperformed STL in this case.
+User-managed allocation reuse did not affect either implementation in this case; only ``fresh'' results are shown.
+\subsection{Test: Pass Argument}
 STL has a penalty for passing a string by value, which forces users to think about memory management when communicating values with a function.
 The key \CFA value-add is that a user can think of a string simply as a value; this test shows that \CC charges a stiff penalty for thinking this way, while \CFA does not.
+The key \CFA value-add is that a user can think of a string simply as a value; this test shows that \CC charges a stiff penalty for thinking this way, while \CFA does not.
 This test illustrates a main advantage of the \CFA sharing algorithm (in one case).
 It shows STL's small-string optimization providing a successful mitigation (in the other case).
+It shows STL's SSO providing a successful mitigation (in the other case).
 The basic operation considered is:
 …
+}
+START_TIMER
+for ( i; ... ) {
+        helper( corpus[ f(i) ] ); // imported from char * previously
+        COUNT_ONE_OP_DONE
+for ( ... ) { // loop for duration
+        helper( corpus[ f( i ) ] );
+        count += 1;
+}
-STOP_TIMER
 \end{cfa}
+&
 …
         string_res q = { qref, COPY_VALUE };
+}
+// rest same, elided
+\end{cfa}
+\end{tabular}
+\end{cquote}
+The Goal (HL) version gives the simplest sketch of the test harness.
+It uses a single level of looping.
+Each iteration uses a corpus item as the argument to a function call.
+\end{cfa}
+\end{tabular}
+\end{cquote}
+The goal (HL) version gives the modified test harness, with a single loop.
+Each iteration uses a corpus item as the argument to the function call.
 These corpus items were imported to the string heap before beginning the timed run.
 \begin{figure}
 \centering
         \includegraphics{string-graph-pbv.pdf}
+        \includegraphics{plot-string-pbv.pdf}
 %       \includegraphics[width=\textwidth]{string-graph-pbv.png}
+        \begin{tabularx}{\linewidth}{>{\centering\arraybackslash}X >{\centering\arraybackslash}X} (a) & (b) \end{tabularx}
         \caption{Average time per iteration (lower is better) with one call to a function that takes a by-value string argument, comparing \CFA (having implicit sharing activated) with STL.
+(a) With \emph{Varying-from-1} corpus construction, in which the STL-only benefit of small-string optimization occurs, in varying degrees, at all string sizes.
+(b) With \emph{Fixed-size} corpus construction, in which this benefit applies exactly to strings with length below 16.
+[TODO: show version (b)]}
+(a) With \emph{Varying-from-1} corpus construction, in which the STL-only benefit of SSO occurs, in varying degrees, at all string sizes.
+(b) With \emph{Fixed-size} corpus construction, in which this benefit applies exactly to strings with length below 16.}
         \label{fig:string-graph-pbv}
 \end{figure}
 \VRef[Figure]{fig:string-graph-pbv} shows the costs for calling a function that receives a string argument by value.
 STL's performance worsens as string length increases, while \CFA has the same performance at all sizes.
+STL's performance worsens uniformly as string length increases, while \CFA has the same performance at all sizes.
+However, the STL is better than \CFA until string length 10 because of the SSO.
 While improved, the \CFA cost to pass a string is still nontrivial.
 The contributor is adding and removing the callee's string handle from the global list.
 This cost is $1.5 \times$ to $2 \times$ over STL's when small-string optimization applies, though this cost should be avoidable in the same case, upon a \CFA realization of this optimization.
 At the larger sizes, when STL has to manage storage for the string, STL runs more than $3 \times$ slower, mainly due to time spent in the general-purpose memory allocator.
+\PAB{Need to check that.  Expecting copying to dominate.}
+This cost is $1.5 \times$ to $2 \times$ over STL's when SSO applies, but is avoidable once \CFA realizes this optimization.
+At the larger sizes, the STL runs more than $3 \times$ slower, because it has to allocate/deallocate storage for the parameter and copy the argument string to the parameter.
+If the \CC string is passed by reference, the results are better and flat across string lengths like \CFA.
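The by-value versus by-reference distinction on the \CC side can be observed directly by comparing text addresses. The helper names below are hypothetical, and the sketch assumes a \CC[11] or later STL, where string copies do not share storage.

```cpp
#include <cassert>
#include <string>

// By value: the parameter q is a copy of the caller's string, so its text
// lives at a different address from the caller's text.
bool by_value_has_own_text(std::string q, const char* caller_text) {
    return q.data() != caller_text;
}

// By reference: q is an alias for the caller's string, so no storage is
// allocated and no text is copied.
bool by_ref_has_own_text(const std::string& q, const char* caller_text) {
    return q.data() != caller_text;
}
```

For a string `s`, `by_value_has_own_text(s, s.data())` reports a separate copy, while `by_ref_has_own_text(s, s.data())` does not. For long strings, the by-value copy is where the allocation and copying cost described above arises.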
 …
 A garbage collector, afforded the freedom of managed memory (where it knows about all the pointers and is allowed to modify them), often runs faster than malloc-free in an amortized analysis, even though it must occasionally stop to collect.
+The speedup happens because GC is able to use its collection time to move objects.
+(In the case of the mini-allocator powering the \CFA string library, objects are runs of text.)  Moving objects lets fresh allocations consume from a large contiguous store of available memory; the ``bump pointer'' bookkeeping for such a scheme is very light.
+The speedup happens because GC is able to use its collection time to move objects.
+(In the case of the mini-allocator powering the \CFA string library, objects are runs of text.)
+Moving objects lets fresh allocations consume from a large contiguous store of available memory; the ``bump pointer'' bookkeeping for such a scheme is very light.
 A malloc-free implementation without the freedom to move objects must, in the general case, allocate in the spaces between existing objects; doing so entails the heavier bookkeeping of maintaining a linked structure of freed allocations and/or coalescing freed allocations.
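To show how light bump-pointer bookkeeping is, here is a minimal \CC sketch of a text arena. The class `BumpArena` and its interface are hypothetical and are not the \CFA string library's implementation; in particular, compaction (moving live text to reclaim space) is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical text arena (an assumption for illustration only): allocation
// is a bounds check plus one addition, with no free list to maintain.
class BumpArena {
    std::vector<char> store_;   // one large contiguous store
    std::size_t next_ = 0;      // the bump pointer, kept as an offset
  public:
    explicit BumpArena(std::size_t capacity) : store_(capacity) {}

    // Reserves len bytes; on success writes the run's offset and returns
    // true. Returns false when the remaining space is too small, the point
    // at which a moving collector would compact live text.
    bool allocate(std::size_t len, std::size_t& offset) {
        if (len > store_.size() - next_) return false;
        offset = next_;
        next_ += len;
        return true;
    }

    std::size_t used() const { return next_; }
};
```

A malloc-free allocator cannot rely on such a simple scheme in general, because it cannot move live objects to recreate one large free region.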
 …
 \begin{figure}
 \centering
   \includegraphics{string-graph-allocn.pdf}
+  \includegraphics{plot-string-allocn.pdf}
 % \includegraphics[width=\textwidth]{string-graph-allocn.png}
   \caption{Space and time performance, under varying fraction-live targets, for the five string lengths shown, at \emph{Fixed-size} corpus construction.
