Index: doc/papers/concurrency/Paper.tex
===================================================================
--- doc/papers/concurrency/Paper.tex	(revision d046db2ac11262165d9c13d2f62d3fff987a4f36)
+++ doc/papers/concurrency/Paper.tex	(revision c8ad5d9ee9d1ad9027a536667adcb82740622d91)
@@ -19,5 +19,6 @@
 \usepackage{listings}						% format program code
 \usepackage[labelformat=simple,aboveskip=0pt,farskip=0pt]{subfig}
-\renewcommand{\thesubfigure}{(\alph{subfigure})}
+\renewcommand{\thesubfigure}{(\Alph{subfigure})}
+\captionsetup{justification=raggedright,singlelinecheck=false}
 \usepackage{siunitx}
 \sisetup{ binary-units=true }
@@ -98,5 +99,5 @@
 \newcommand{\abbrevFont}{\textit}			% set empty for no italics
 \@ifundefined{eg}{
-\newcommand{\EG}{\abbrevFont{e}.\abbrevFont{g}.}
+\newcommand{\EG}{\abbrevFont{e}\abbrevFont{g}}
 \newcommand*{\eg}{%
 	\@ifnextchar{,}{\EG}%
@@ -105,5 +106,5 @@
 }}{}%
 \@ifundefined{ie}{
-\newcommand{\IE}{\abbrevFont{i}.\abbrevFont{e}.}
+\newcommand{\IE}{\abbrevFont{i}\abbrevFont{e}}
 \newcommand*{\ie}{%
 	\@ifnextchar{,}{\IE}%
@@ -143,5 +144,5 @@
 		_Alignas, _Alignof, __alignof, __alignof__, asm, __asm, __asm__, __attribute, __attribute__,
 		auto, _Bool, catch, catchResume, choose, _Complex, __complex, __complex__, __const, __const__,
-		coroutine, disable, dtype, enable, __extension__, exception, fallthrough, fallthru, finally,
+		coroutine, disable, dtype, enable, exception, __extension__, fallthrough, fallthru, finally,
 		__float80, float80, __float128, float128, forall, ftype, _Generic, _Imaginary, __imag, __imag__,
 		inline, __inline, __inline__, __int128, int128, __label__, monitor, mutex, _Noreturn, one_t, or,
@@ -169,5 +170,5 @@
 literate={-}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.1ex}}}}1 {^}{\raisebox{0.6ex}{$\scriptstyle\land\,$}}1
 	{~}{\raisebox{0.3ex}{$\scriptstyle\sim\,$}}1 % {`}{\ttfamily\upshape\hspace*{-0.1ex}`}1
-	{<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.5ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex{\textgreater}}2,
+	{<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2 {->}{\makebox[1ex][c]{\raisebox{0.4ex}{\rule{0.8ex}{0.075ex}}}\kern-0.2ex{\textgreater}}2,
 moredelim=**[is][\color{red}]{`}{`},
 }% lstset
@@ -216,16 +217,16 @@
 \author[1]{Thierry Delisle}
 \author[1]{Peter A. Buhr*}
-\authormark{Thierry Delisle \textsc{et al}}
-
-\address[1]{\orgdiv{Cheriton School of Computer Science}, \orgname{University of Waterloo}, \orgaddress{\state{Ontario}, \country{Canada}}}
-
-\corres{*Peter A. Buhr, \email{pabuhr{\char`\@}uwaterloo.ca}}
-\presentaddress{Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada}
-
+\authormark{DELISLE \textsc{et al.}}
+
+\address[1]{\orgdiv{Cheriton School of Computer Science}, \orgname{University of Waterloo}, \orgaddress{\state{Waterloo, ON}, \country{Canada}}}
+
+\corres{*Peter A. Buhr, Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada. \email{pabuhr{\char`\@}uwaterloo.ca}}
+
+\fundingInfo{Natural Sciences and Engineering Research Council of Canada}
 
 \abstract[Summary]{
 \CFA is a modern, polymorphic, \emph{non-object-oriented} extension of the C programming language.
 This paper discusses the design of the concurrency and parallelism features in \CFA, and the concurrent runtime-system.
-These features are created from scratch as ISO C lacks concurrency, relying largely on pthreads.
+These features are created from scratch as ISO C lacks concurrency, relying largely on the pthreads library.
 Coroutines and lightweight (user) threads are introduced into the language.
 In addition, monitors are added as a high-level mechanism for mutual exclusion and synchronization.
@@ -255,15 +256,15 @@
 Examples of high-level approaches are task based~\cite{TBB}, message passing~\cite{Erlang,MPI}, and implicit threading~\cite{OpenMP}.
 
-This paper used the following terminology.
+This paper uses the following terminology.
 A \newterm{thread} is a fundamental unit of execution that runs a sequence of code and requires a stack to maintain state.
-Multiple simultaneous threads gives rise to \newterm{concurrency}, which requires locking to ensure safe communication and access to shared data.
+Multiple simultaneous threads give rise to \newterm{concurrency}, which requires locking to ensure safe communication and access to shared data.
 % Correspondingly, concurrency is defined as the concepts and challenges that occur when multiple independent (sharing memory, timing dependencies, \etc) concurrent threads are introduced.
 \newterm{Locking}, and by extension locks, are defined as a mechanism to prevent progress of threads to provide safety.
 \newterm{Parallelism} is running multiple threads simultaneously.
 Parallelism implies \emph{actual} simultaneous execution, where concurrency only requires \emph{apparent} simultaneous execution.
-As such, parallelism is only observable in differences in performance, which is observed through differences in timing.
+As such, parallelism only affects performance, which is observed through differences in space and/or time.
 
 Hence, there are two problems to be solved in the design of concurrency for a programming language: concurrency and parallelism.
-While these two concepts are often combined, they are in fact distinct, requiring different tools~\cite[\S~2]{Buhr05a}.
+While these two concepts are often combined, they are distinct, requiring different tools~\cite[\S~2]{Buhr05a}.
 Concurrency tools handle synchronization and mutual exclusion, while parallelism tools handle performance, cost and resource utilization.
 
@@ -278,12 +279,12 @@
 
 The following is a quick introduction to the \CFA language, specifically tailored to the features needed to support concurrency.
-
-\CFA is an extension of ISO-C and therefore supports all of the same paradigms as C.
-It is a non-object-oriented system-language, meaning most of the major abstractions have either no runtime overhead or can be opted out easily.
+Most of the following code examples can be found on the \CFA website~\cite{Cforall}.
+
+\CFA is an extension of ISO-C and, therefore, supports all of the same paradigms as C.
+%It is a non-object-oriented system-language, meaning most of the major abstractions have either no runtime overhead or can be opted out easily.
 Like C, the basics of \CFA revolve around structures and routines, which are thin abstractions over machine code.
 The vast majority of the code produced by the \CFA translator respects memory layouts and calling conventions laid out by C.
-Interestingly, while \CFA is not an object-oriented language, lacking the concept of a receiver (\eg {\tt this}), it does have some notion of objects\footnote{C defines the term objects as : ``region of data storage in the execution environment, the contents of which can represent
+Interestingly, while \CFA is not an object-oriented language, lacking the concept of a receiver (\eg @this@) and inheritance, it does have some notion of objects\footnote{C defines the term objects as: ``region of data storage in the execution environment, the contents of which can represent
 values''~\cite[3.15]{C11}}, most importantly construction and destruction of objects.
-Most of the following code examples can be found on the \CFA website~\cite{Cforall}.
 
 
@@ -293,15 +294,18 @@
 In regards to concurrency, the semantic difference between pointers and references are not particularly relevant, but since this document uses mostly references, here is a quick overview of the semantics:
 \begin{cfa}
-int x, *p1 = &x, **p2 = &p1, ***p3 = &p2,
-	&r1 = x,    &&r2 = r1,   &&&r3 = r2;
-***p3 = 3;							$\C{// change x}$
-r3    = 3;							$\C{// change x, ***r3}$
-**p3  = ...;						$\C{// change p1}$
-*p3   = ...;						$\C{// change p2}$
-int y, z, & ar[3] = {x, y, z};		$\C{// initialize array of references}$
-typeof( ar[1]) p;					$\C{// is int, referenced object type}$
-typeof(&ar[1]) q;					$\C{// is int \&, reference type}$
-sizeof( ar[1]) == sizeof(int);		$\C{// is true, referenced object size}$
-sizeof(&ar[1]) == sizeof(int *);	$\C{// is true, reference size}$
+int x, y, z;
+int * p1 = &x, ** p2 = &p1, *** p3 = &p2,	$\C{// pointers to x}$
+	& r1 = x,   && r2 = r1, &&& r3 = r2;	$\C{// references to x}$
+
+*p1 = 3; **p2 = 3; ***p3 = 3;				$\C{// change x}$
+  r1 = 3;    r2 = 3;      r3 = 3;			$\C{// change x}$
+**p3 = &y; *p3 = &z;						$\C{// change p1, p2}$
+&&r3 = &y; &r3 = &z;						$\C{// change p1, p2}$
+int & ar[3] = {x, y, z};					$\C{// initialize array of references}$
+
+typeof( ar[1]) p;							$\C{// is int, referenced object type}$
+typeof(&ar[1]) q;							$\C{// is int \&, reference type}$
+sizeof( ar[1]) == sizeof(int);				$\C{// is true, referenced object size}$
+sizeof(&ar[1]) == sizeof(int *);			$\C{// is true, reference size}$
 \end{cfa}
 The important take away from this code example is that a reference offers a handle to an object, much like a pointer, but which is automatically dereferenced for convenience.
@@ -626,5 +630,5 @@
 \end{lrbox}
 \subfloat[3 States, internal variables]{\label{f:Coroutine3States}\usebox\myboxA}
-\qquad
+\qquad\qquad
 \subfloat[1 State, internal variables]{\label{f:Coroutine1State}\usebox\myboxB}
 \caption{\CFA Coroutine Fibonacci Implementations}
@@ -653,6 +657,5 @@
 
 \begin{figure}
-\centering
-\begin{cfa}
+\begin{cfa}[xleftmargin=4\parindentlnth]
 `coroutine` Format {
 	char ch;								$\C{// used for communication}$
