Index: doc/proposals/concurrency/concurrency.tex
===================================================================
--- doc/proposals/concurrency/concurrency.tex	(revision 9a8dfcc22a0cb55b65ad97ee728cfe06c2b62751)
+++ doc/proposals/concurrency/concurrency.tex	(revision fe84230d44afd91fb519712a68f209d0452629d3)
@@ -1,10 +1,10 @@
 % requires tex packages: texlive-base texlive-latex-base tex-common texlive-humanities texlive-latex-extra texlive-fonts-recommended
 
-% inline code ©...© (copyright symbol) emacs: C-q M-)
-% red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
-% blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_
-% green highlighting ¢...¢ (cent symbol) emacs: C-q M-"
-% LaTex escape §...§ (section symbol) emacs: C-q M-'
-% keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
+% inline code �...� (copyright symbol) emacs: C-q M-)
+% red highlighting �...� (registered trademark symbol) emacs: C-q M-.
+% blue highlighting �...� (sharp s symbol) emacs: C-q M-_
+% green highlighting �...� (cent symbol) emacs: C-q M-"
+% LaTex escape �...� (section symbol) emacs: C-q M-'
+% keyword escape �...� (pilcrow symbol) emacs: C-q M-^
 % math escape $...$ (dollar symbol)
 
@@ -35,5 +35,5 @@
 \usepackage{fancyhdr}
 \renewcommand{\linenumberfont}{\scriptsize\sffamily}
-\input{common}							% bespoke macros used in the document
+\input{style}							% bespoke macros used in the document
 \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
 \usepackage{breakurl}
@@ -153,5 +153,7 @@
 	void ?{}(counter_t & nomutex this);
 	size_t ++?(counter_t & mutex this);
-	void ?{}(size_t * this, counter_t & mutex cnt); //need for mutex is platform dependent here
+
+	//need for mutex is platform dependent here
+	void ?{}(size_t * this, counter_t & mutex cnt);
 \end{lstlisting}
 *semantics of the declaration of \code{mutex struct counter_t} are discussed in details in section \ref{data}
@@ -195,5 +197,6 @@
 	}
 
-	void ?{}(int * this, counter_t & mutex cnt) { //need for mutex is platform dependent here
+	//need for mutex is platform dependent here
+	void ?{}(int * this, counter_t & mutex cnt) {
 		*this = (int)cnt;
 	}
@@ -224,5 +227,5 @@
 \end{lstlisting}
 
-This code acquires both locks before entering the critical section. In practice, writing multi-locking routines that can not lead to deadlocks can be tricky. Having language support for such a feature is therefore a significant asset for \CFA. In the case presented above, \CFA guarantees that the order of aquisition will be consistent across calls to routines using the same monitors as arguments. However, since \CFA monitors use multi-acquiring locks users can effectively force the acquiring order. For example, notice which routines use \code{mutex}/\code{nomutex} and how this affects aquiring order :
+This code acquires both locks before entering the critical section (Referenced as \gls{group-acquire} from now on). In practice, writing multi-locking routines that can not lead to deadlocks can be tricky. Having language support for such a feature is therefore a significant asset for \CFA. In the case presented above, \CFA guarantees that the order of aquisition will be consistent across calls to routines using the same monitors as arguments. However, since \CFA monitors use multi-acquiring locks users can effectively force the acquiring order. For example, notice which routines use \code{mutex}/\code{nomutex} and how this affects aquiring order :
 \begin{lstlisting}
 	void foo(A & mutex a, B & mutex a) {
@@ -282,5 +285,5 @@
 
 \subsection{Internal scheduling} \label{insched}
-Monitors should also be able to schedule what threads access it as a mean of synchronization. Internal scheduling is one of the simple examples of such a feature. It allows users to declare condition variables and wait for them to be signaled. Here is a simple example of such a technique :
+Monitors also need to schedule waiting threads within it as a mean of synchronization. Internal scheduling is one of the simple examples of such a feature. It allows users to declare condition variables and have threads wait and signaled from them. Here is a simple example of such a technique :
 
 \begin{lstlisting}
@@ -300,6 +303,5 @@
 \end{lstlisting}
 
-Here routine \code{foo} waits on the \code{signal} from \code{bar} before making further progress, effectively ensuring a basic ordering. This semantic can easily be extended to multi-monitor calls by offering the same guarantee.
-
+Here routine \code{foo} waits for the \code{signal} from \code{bar} before making further progress, effectively ensuring a basic ordering. This semantic can easily be extended to multi-monitor calls by offering the same guarantee.
 \begin{center}
 \begin{tabular}{ c @{\hskip 0.65in} c }
@@ -307,5 +309,5 @@
 \begin{lstlisting}
 void foo(monitor & mutex a,
-         monitor & mutex b) {
+           monitor & mutex b) {
 	//...
 	wait(a.e);
@@ -316,5 +318,5 @@
 \end{lstlisting} &\begin{lstlisting}
 void bar(monitor & mutex a,
-         monitor & mutex b) {
+           monitor & mutex b) {
 	signal(a.e);
 }
@@ -326,6 +328,5 @@
 \end{tabular}
 \end{center}
-
-A direct extension of the single monitor semantics would be to release all locks when waiting and transferring ownership of all locks when signalling. However, for the purpose of synchronization it may be usefull to only release some of the locks but keep others. On the technical side, partially releasing lock is feasible but from the user perspective a choice must be made for the syntax of this feature. It is possible to do without any extra syntax by relying on order of acquisition (Note that here the use of helper routines is irrelevant, only routines the acquire mutual exclusion have an impact on internal scheduling):
+A direct extension of the single monitor semantics is to release all locks when waiting and transferring ownership of all locks when signalling. However, for the purpose of synchronization it may be usefull to only release some of the locks but keep others. It is possible to support internal scheduling and \gls{group-acquire} without any extra syntax by relying on order of acquisition. Here is an example of the different contexts in which internal scheduling can be used. (Note that here the use of helper routines is irrelevant, only routines acquire mutual exclusion have an impact on internal scheduling):
 
 \begin{center}
@@ -337,5 +338,6 @@
 
 void foo(monitor & mutex a,
-         monitor & mutex b) {
+           monitor & mutex b) {
+
 	wait(e);
 }
@@ -351,10 +353,10 @@
 
 void bar(monitor & mutex a,
-         monitor & nomutex b) {
+           monitor & nomutex b) {
 	foo(a,b);
 }
 
 void foo(monitor & mutex a,
-         monitor & mutex b) {
+           monitor & mutex b) {
 	wait(e);
 }
@@ -365,10 +367,10 @@
 
 void bar(monitor & mutex a,
-         monitor & nomutex b) {
-	foo(a,b);
+           monitor & nomutex b) {
+	baz(a,b);
 }
 
 void baz(monitor & nomutex a,
-         monitor & mutex b) {
+           monitor & mutex b) {
 	wait(e);
 }
@@ -379,147 +381,76 @@
 \end{center}
 
-This can be interpreted in two different ways :
-\begin{flushleft}
-\begin{enumerate}
-	\item \code{wait} atomically releases the monitors acquired by the inner-most routine, \underline{ignoring} nested calls.
-	\item \code{wait} atomically releases the monitors acquired by the inner-most routine, \underline{considering} nested calls.
-\end{enumerate}
-\end{flushleft}
-While the difference between these two is subtle, it has a significant impact. In the first case it means that the calls to \code{foo} would behave the same in Context 1 and 2. This semantic would also mean that the call to \code{wait} in routine \code{baz} would only release \code{monitor b}. While this may seem intuitive with these examples, it does have one significant implication, it creates a strong distinction between acquiring multiple monitors in sequence and acquiring the same monitors simulatenously, i.e. :
+Note that in \CFA, \code{condition} have no particular need to be stored inside a monitor, beyond any software engineering reasons. Context 1 is the simplest way of acquiring more than one monitor (\gls{group-acquire}), using a routine wiht multiple parameters having the \code{mutex} keyword. Context 2 also uses \gls{group-acquire} as well in routine \code{foo}. However, the routine is called by routine \code{bar} which only acquires monitor \code{a}. Since monitors can be acquired multiple times this will not cause a deadlock by itself but it does force the acquiring order to \code{a} then \code{b}. Context 3 also forces the acquiring order to be \code{a} then \code{b} but does not use \gls{group-acquire}. The previous example tries to illustrate the semantics that must be established to support releasing monitors in a \code{wait} statement. In all cases the behavior of the wait statment is to release all the locks that were acquired my the inner-most monitor call. That is \code{a & b} in context 1 and 2 and \code{b} only in context 3. Here are a few other examples of this behavior.
+
 
 \begin{center}
-\begin{tabular}{c @{\hskip 0.35in} c @{\hskip 0.35in} c}
-\begin{lstlisting}
-enterMonitor(a);
-enterMonitor(b);
-// do stuff
-leaveMonitor(b);
-leaveMonitor(a);
-\end{lstlisting} & != &\begin{lstlisting}
-enterMonitor(a);
-enterMonitor(a, b);
-// do stuff
-leaveMonitor(a, b);
-leaveMonitor(a);
+\begin{tabular}{|c|c|c|}
+\begin{lstlisting}
+condition e;
+
+//acquire a
+void foo(monitor & nomutex a,
+           monitor & mutex b) {
+	bar(a,b);
+}
+
+//acquire a
+void bar(monitor & mutex a,
+           monitor & nomutex b) {
+
+	//release a
+	//keep b
+	wait(e);
+}
+
+foo(a, b);
+\end{lstlisting} &\begin{lstlisting}
+condition e;
+
+//acquire a & b
+void foo(monitor & mutex a,
+           monitor & mutex b) {
+	bar(a,b);
+}
+
+//acquire b
+void bar(monitor & mutex a,
+           monitor & nomutex b) {
+
+   	//release b
+	//keep a
+	wait(e);
+}
+
+foo(a, b);
+\end{lstlisting} &\begin{lstlisting}
+condition e;
+
+//acquire a & b
+void foo(monitor & mutex a,
+           monitor & mutex b) {
+	bar(a,b);
+}
+
+//acquire none
+void bar(monitor & nomutex a,
+           monitor & nomutex b) {
+
+	//release a & b
+	//keep none
+	wait(e);
+}
+
+foo(a, b);
 \end{lstlisting}
 \end{tabular}
 \end{center}
-
-This is not intuitive because even if both methods display the same monitors state both inside and outside the critical section respectively, the behavior is different. Furthermore, the actual acquiring order will be exaclty the same since acquiring a monitor from inside its mutual exclusion is a no-op. This means that even if the data and the actual control flow are the same using both methods, the behavior of the \code{wait} will be different. The alternative is option 2, that is releasing acquired monitors, \underline{considering} nesting. This solves the issue of having the two acquiring method differ at the cost of making routine \code{foo} behave differently depending on from which context it is called (Context 1 or 2). Indeed in Context 2, routine \code{foo} actually behaves like routine \code{baz} rather than having the same behavior than in Context 1. The fact that both implicit approaches can be unintuitive depending on the perspective may be a sign that the explicit approach is superior. For this reason this \CFA does not support implicit monitor releasing and uses explicit semantics.
-\\
-
-The following examples shows three alternatives of explicit wait semantics :
-\\
-
-\begin{center}
-\begin{tabular}{|c|c|c|}
-Case 1 & Case 2 & Case 3 \\
-Branding on construction & Explicit release list & Explicit ignore list \\
-\hline
-\begin{lstlisting}
-void foo(monitor & mutex a,
-         monitor & mutex b,
-	   condition & c)
-{
-	// Releases monitors
-	// branded in ctor
-	wait(c);
-}
-
-monitor a;
-monitor b;
-condition1 c1 = {a};
-condition2 c2 = {a, b};
-
-//Will release only a
-foo(a,b,c1);
-
-//Will release a and b
-foo(a,b,c2);
-\end{lstlisting} &\begin{lstlisting}
-void foo(monitor & mutex a,
-         monitor & mutex b,
-	   condition & c)
-{
-	// Releases monitor a
-	// Holds monitor b
-	waitRelease(c, [a]);
-}
-
-monitor a;
-monitor b;
-condition c;
-
-
-
-foo(a,b,c);
-
-
-
-\end{lstlisting} &\begin{lstlisting}
-void foo(monitor & mutex a,
-         monitor & mutex b,
-	   condition & c)
-{
-	// Releases monitor a
-	// Holds monitor b
-	waitHold(c, [b]);
-}
-
-monitor a;
-monitor b;
-condition c;
-
-
-
-foo(a,b,c);
-
-
-
-\end{lstlisting}
-\end{tabular}
-\end{center}
-(Note : Case 2 and 3 use tuple semantics to pass a variable length list of elements.)
-\\
-
-All these cases have their pros and cons. Case 1 is more distinct because it means programmers need to be carefull about where the condition is initialized as well as where it is used. On the other hand, it is very clear and explicitly states which monitor is released and which monitor stays acquired. This is similar to Case 2, which releases only the monitors explictly listed. However, in Case 2, calling the \code{wait} routine instead of the \code{waitRelease} routine releases all the acquired monitor. The Case 3 is an improvement on that since it releases all the monitors except those specified. The result is that the \code{wait} routine can be written as follows :
-\begin{lstlisting}
-void wait(condition & cond) {
-	waitHold(cond, []);
-}
-\end{lstlisting}
-This alternative offers nice and consistent behavior between \code{wait} and \code{waitHold}. However, one large pitfall is that mutual exclusion can now be violated by calls to library code. Indeed, even if the following example seems benign there is one significant problem :
-\begin{lstlisting}
-monitor global;
-
-extern void doStuff(); //uses global
-
-void foo(monitor & mutex m) {
-	//...
-	doStuff(); //warning can release monitor m
-	//...
-}
-
-foo(global);
-\end{lstlisting}
-
-Indeed, if Case 2 or 3 are chosen it any code can violate the mutual exclusion of the calling code by issuing calls to \code{wait} or \code{waitHold} in a nested monitor context. Case 2 can be salvaged by removing the \code{wait} routine from the API but Case 3 cannot prevent users from calling \code{waitHold(someCondition, [])}. For this reason the syntax proposed in Case 3 is rejected. Note that the syntax proposed in case 1 and 2 are not exclusive. Indeed, by supporting two types of condition both cases can be supported :
-\begin{lstlisting}
-struct condition { /*...*/ };
-
-// Second argument is a variable length tuple.
-void wait(condition & cond, [...] monitorsToRelease);
-void signal(condition & cond);
-
-struct conditionN { /*...*/ };
-
-void ?{}(conditionN* this, /*list of N monitors to release*/);
-void wait(conditionN & cond);
-void signal(conditionN & cond);
-\end{lstlisting}
-
-Regardless of the option chosen for wait semantics, signal must be symmetrical. In all cases, signal only needs a single parameter, the condition variable that needs to be signalled. But \code{signal} needs to be called from the same monitor(s) that call to \code{wait}. Otherwise, mutual exclusion cannot be properly transferred back to the waiting monitor.
-
-Finally, an additionnal semantic which can be very usefull is the \code{signalBlock} routine. This routine behaves like signal for all of the semantics discussed above, but with the subtelty that mutual exclusion is transferred to the waiting task immediately rather than wating for the end of the critical section.
+Note the right-most example which uses a helper routine and therefore is not relevant to find which monitors will be released.
+
+These semantics imply that in order to release of subset of the monitors currently held, users must write (and name) a routine that only acquires the desired subset and simply calls wait. While users can use this method, \CFA offers the \code{wait_release}\footnote{Not sure if an overload of \code{wait} would work...} which will release only the specified monitors.
+
+Regardless of the context in which the \code{wait} statement is used, \code{signal} must used holding the same set of monitors. In all cases, signal only needs a single parameter, the condition variable that needs to be signalled. But \code{signal} needs to be called from the same monitor(s) that call to \code{wait}. Otherwise, mutual exclusion cannot be properly transferred back to the waiting monitor.
+
+Finally, an additional semantic which can be very usefull is the \code{signal_block} routine. This routine behaves like signal for all of the semantics discussed above, but with the subtelty that mutual exclusion is transferred to the waiting task immediately rather than wating for the end of the critical section.
 \\
 
@@ -531,5 +462,5 @@
 % #        #   #     #    ###    #     # #     # #     # #       #     #
 % ####### #     #    #    ###     #####   #####  #     # ####### ######
-
+\newpage
 \subsection{External scheduling} \label{extsched}
 As one might expect, the alternative to Internal scheduling is to use External scheduling instead. This method is somewhat more robust to deadlocks since one of the threads keeps a relatively tight control on scheduling. Indeed, as the following examples will demonstrate, external scheduling allows users to wait for events from other threads without the concern of unrelated events occuring. External scheduling can generally be done either in terms of control flow (ex: \uC) or in terms of data (ex: Go). Of course, both of these paradigms have their own strenghts and weaknesses but for this project control flow semantics where chosen to stay consistent with the rest of the languages semantics. Two challenges specific to \CFA arise when trying to add external scheduling with loose object definitions and multi-monitor routines. The following example shows what a simple use \code{accept} versus \code{wait}/\code{signal} and its advantages.
Index: doc/proposals/concurrency/glossary.tex
===================================================================
--- doc/proposals/concurrency/glossary.tex	(revision 9a8dfcc22a0cb55b65ad97ee728cfe06c2b62751)
+++ doc/proposals/concurrency/glossary.tex	(revision fe84230d44afd91fb519712a68f209d0452629d3)
@@ -11,4 +11,10 @@
 {
 Locking done by the called routine. With this technique, a monitor routine called by another routine will aquire the monitor \emph{after} entering the routine body but prior to any other code.
+}
+
+\longnewglossaryentry{group-acquire}
+{name={grouped acquiring}}
+{
+Implicitly acquiring several monitors when entering a monitor.
 }
 
Index: doc/proposals/concurrency/style.tex
===================================================================
--- doc/proposals/concurrency/style.tex	(revision fe84230d44afd91fb519712a68f209d0452629d3)
+++ doc/proposals/concurrency/style.tex	(revision fe84230d44afd91fb519712a68f209d0452629d3)
@@ -0,0 +1,9 @@
+\input{common}                                          % bespoke macros used in the document
+
+\lstset{
+morekeywords=[2]{nomutex,mutex,thread,wait,wait_release,signal,signal_block,accept,monitor},
+keywordstyle=[2]\color{blue},				% second set of keywords for concurency
+basicstyle=\linespread{0.9}\tt\small,		% reduce line spacing and use typewriter font
+stringstyle=\sf\color{Mahogany},			% use sanserif font
+commentstyle=\itshape\color{OliveGreen},		% green and italic comments
+}%
Index: doc/proposals/concurrency/version
===================================================================
--- doc/proposals/concurrency/version	(revision 9a8dfcc22a0cb55b65ad97ee728cfe06c2b62751)
+++ doc/proposals/concurrency/version	(revision fe84230d44afd91fb519712a68f209d0452629d3)
@@ -1,1 +1,1 @@
-0.5.116
+0.5.146
