Index: doc/papers/concurrency/Paper.tex
===================================================================
--- doc/papers/concurrency/Paper.tex	(revision 7bdcac1f3631ee6408a86b4d0321433114fee6d3)
+++ doc/papers/concurrency/Paper.tex	(revision 3fc59bdb3dcc1b79be061440b40a2cebb66b85f3)
@@ -580,7 +580,7 @@
 \subsection{\protect\CFA's Thread Building Blocks}
 
-An important missing feature in C is threading\footnote{While the C11 standard defines a ``threads.h'' header, it is minimal and defined as optional.
+An important missing feature in C is threading\footnote{While the C11 standard defines a \protect\lstinline@threads.h@ header, it is minimal and defined as optional.
 As such, library support for threading is far from widespread.
-At the time of writing the paper, neither \protect\lstinline|gcc| nor \protect\lstinline|clang| support ``threads.h'' in their standard libraries.}.
+At the time of writing the paper, neither \protect\lstinline@gcc@ nor \protect\lstinline@clang@ support \protect\lstinline@threads.h@ in their standard libraries.}.
 In modern programming languages, a lack of threading is unacceptable~\cite{Sutter05, Sutter05b}, and therefore existing and new programming languages must have tools for writing efficient concurrent programs to take advantage of parallelism.
 As an extension of C, \CFA needs to express these concepts in a way that is as natural as possible to programmers familiar with imperative languages.
@@ -1140,5 +1140,5 @@
 }
 \end{cfa}
-A consequence of the strongly typed approach to main is that memory layout of parameters and return values to/from a thread are now explicitly specified in the \textbf{api}.
+A consequence of the strongly typed approach to main is that memory layout of parameters and return values to/from a thread are now explicitly specified in the \textbf{API}.
 \end{comment}
 
@@ -1443,8 +1443,9 @@
 \label{s:InternalScheduling}
 
-While monitor mutual-exclusion provides safe access to shared data, the monitor data may indicate that a thread accessing it cannot proceed, \eg a bounded buffer, Figure~\ref{f:GenericBoundedBuffer}, may be full/empty so produce/consumer threads must block.
+While monitor mutual-exclusion provides safe access to shared data, the monitor data may indicate that a thread accessing it cannot proceed.
+For example, Figure~\ref{f:GenericBoundedBuffer} shows a bounded buffer that may be full/empty so producer/consumer threads must block.
 Leaving the monitor and trying again (busy waiting) is impractical for high-level programming.
 Monitors eliminate busy waiting by providing internal synchronization to schedule threads needing access to the shared data, where the synchronization is blocking (threads are parked) versus spinning.
-The synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} is defined as indicating which thread acquires the critical section next.
+Synchronization is generally achieved with internal~\cite{Hoare74} or external~\cite[\S~2.9.2]{uC++} scheduling, where \newterm{scheduling} defines which thread acquires the critical section next.
 \newterm{Internal scheduling} is characterized by each thread entering the monitor and making an individual decision about proceeding or blocking, while \newterm{external scheduling} is characterized by an entering thread making a decision about proceeding for itself and on behalf of other threads attempting entry.
 
@@ -1537,239 +1538,26 @@
 External scheduling is controlled by the @waitfor@ statement, which atomically blocks the calling thread, releases the monitor lock, and restricts the routine calls that can next acquire mutual exclusion.
 If the buffer is full, only calls to @remove@ can acquire the buffer, and if the buffer is empty, only calls to @insert@ can acquire the buffer.
-Threads making calls to routines that are currently excluded block outside (externally) of the monitor on a calling queue, versus blocking on condition queues inside the monitor.
-
-Both internal and external scheduling extend to multiple monitors in a natural way.
-\begin{cfa}
-monitor M { `condition e`; ... };
-void foo( M & mutex m1, M & mutex m2 ) {
-	... wait( `e` ); ...					$\C{// wait( e, m1, m2 )}$
-	... wait( `e, m1` ); ...
-	... wait( `e, m2` ); ...
-}
-
-void rtn$\(_1\)$( M & mutex m1, M & mutex m2 );
-void rtn$\(_2\)$( M & mutex m1 );
-void bar( M & mutex m1, M & mutex m2 ) {
-	... waitfor( `rtn` ); ...				$\C{// waitfor( rtn\(_1\), m1, m2 )}$
-	... waitfor( `rtn, m1` ); ...			$\C{// waitfor( rtn\(_2\), m1 )}$
-}
-\end{cfa}
-For @wait( e )@, the default semantics is to atomically block the signaller and release all acquired mutex types in the parameter list, \ie @wait( e, m1, m2 )@.
-To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @wait( e, m1 )@.
-Wait statically verifies the released monitors are the acquired mutex-parameters so unconditional release is safe.
-Similarly, for @waitfor( rtn, ... )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@.
-To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @waitfor( rtn, m1 )@.
-Waitfor statically verifies the released monitors are the same as the acquired mutex-parameters of the given routine or routine pointer.
-To statically verify the released monitors match with the accepted routine's mutex parameters, the routine (pointer) prototype must be accessible.
-
-Given the ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.
-\begin{cfa}
-void foo( M & mutex m1, M & mutex m2 ) {
-	... wait( `e, m1` ); ...				$\C{// release m1, keeping m2 acquired )}$
-void baz( M & mutex m1, M & mutex m2 ) {	$\C{// must acquire m1 and m2 )}$
-	... signal( `e` ); ...
-\end{cfa}
-The @wait@ only releases @m1@ so the signalling thread cannot acquire both @m1@ and @m2@ to  enter @baz@ to get to the @signal@.
-While deadlock issues can occur with multiple/nesting acquisition, this issue results from the fact that locks, and by extension monitors, are not perfectly composable.
-
-Finally, an important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads?
-If barging is allowed, synchronization between a singller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
-\begin{quote}
-However, we decree that a signal operation be followed immediately by resumption of a waiting program, without possibility of an intervening procedure call from yet a third program.
-It is only in this way that a waiting program has an absolute guarantee that it can acquire the resource just released by the signalling program without any danger that a third program will interpose a monitor entry and seize the resource instead.~\cite[p.~550]{Hoare74}
-\end{quote}
-\CFA scheduling \emph{precludes} barging, which simplifies synchronization among threads in the monitor and increases correctness.
-For example, there are no loops in either bounded buffer solution in Figure~\ref{f:GenericBoundedBuffer}.
-Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
-
-
-\subsection{Barging Prevention}
-
-Figure~\ref{f:BargingPrevention} shows \CFA code where bulk acquire adds complexity to the internal-signalling semantics.
-The complexity begins at the end of the inner @mutex@ statement, where the semantics of internal scheduling need to be extended for multiple monitors.
-The problem is that bulk acquire is used in the inner @mutex@ statement where one of the monitors is already acquired.
-When the signalling thread reaches the end of the inner @mutex@ statement, it should transfer ownership of @m1@ and @m2@ to the waiting thread to prevent barging into the outer @mutex@ statement by another thread.
-However, both the signalling and signalled threads still need monitor @m1@.
-
-\begin{figure}
-\newbox\myboxA
-\begin{lrbox}{\myboxA}
-\begin{cfa}[aboveskip=0pt,belowskip=0pt]
-monitor M m1, m2;
-condition c;
-mutex( m1 ) {
-	...
-	mutex( m1, m2 ) {
-		... `wait( c )`; // block and release m1, m2
-		// m1, m2 acquired
-	} // $\LstCommentStyle{\color{red}release m2}$
-	// m1 acquired
-} // release m1
-\end{cfa}
-\end{lrbox}
-
-\newbox\myboxB
-\begin{lrbox}{\myboxB}
-\begin{cfa}[aboveskip=0pt,belowskip=0pt]
-
-
-mutex( m1 ) {
-	...
-	mutex( m1, m2 ) {
-		... `signal( c )`; ...
-		// m1, m2 acquired
-	} // $\LstCommentStyle{\color{red}release m2}$
-	// m1 acquired
-} // release m1
-\end{cfa}
-\end{lrbox}
-
-\newbox\myboxC
-\begin{lrbox}{\myboxC}
-\begin{cfa}[aboveskip=0pt,belowskip=0pt]
-
-
-mutex( m1 ) {
-	... `wait( c )`; ...
-	// m1 acquired
-} // $\LstCommentStyle{\color{red}release m1}$
-
-
-
-
-\end{cfa}
-\end{lrbox}
-
-\begin{cquote}
-\subfloat[Waiting Thread]{\label{f:WaitingThread}\usebox\myboxA}
-\hspace{2\parindentlnth}
-\subfloat[Signalling Thread]{\label{f:SignallingThread}\usebox\myboxB}
-\hspace{2\parindentlnth}
-\subfloat[Other Waiting Thread]{\label{f:SignallingThread}\usebox\myboxC}
-\end{cquote}
-\caption{Barging Prevention}
-\label{f:BargingPrevention}
-\end{figure}
-
-The obvious solution to the problem of multi-monitor scheduling is to keep ownership of all locks until the last lock is ready to be transferred.
-It can be argued that that moment is when the last lock is no longer needed, because this semantics fits most closely to the behaviour of single-monitor scheduling.
-This solution has the main benefit of transferring ownership of groups of monitors, which simplifies the semantics from multiple objects to a single group of objects, effectively making the existing single-monitor semantic viable by simply changing monitors to monitor groups.
-This solution releases the monitors once every monitor in a group can be released.
-However, since some monitors are never released (\eg the monitor of a thread), this interpretation means a group might never be released.
-A more interesting interpretation is to transfer the group until all its monitors are released, which means the group is not passed further and a thread can retain its locks.
-
-However, listing \ref{f:int-secret} shows this solution can become much more complicated depending on what is executed while secretly holding B at line \ref{line:secret}, while avoiding the need to transfer ownership of a subset of the condition monitors.
-Figure~\ref{f:dependency} shows a slightly different example where a third thread is waiting on monitor @A@, using a different condition variable.
-Because the third thread is signalled when secretly holding @B@, the goal  becomes unreachable.
-Depending on the order of signals (listing \ref{f:dependency} line \ref{line:signal-ab} and \ref{line:signal-a}) two cases can happen:
-
-\begin{comment}
-\paragraph{Case 1: thread $\alpha$ goes first.} In this case, the problem is that monitor @A@ needs to be passed to thread $\beta$ when thread $\alpha$ is done with it.
-\paragraph{Case 2: thread $\beta$ goes first.} In this case, the problem is that monitor @B@ needs to be retained and passed to thread $\alpha$ along with monitor @A@, which can be done directly or possibly using thread $\beta$ as an intermediate.
-\\
-
-Note that ordering is not determined by a race condition but by whether signalled threads are enqueued in FIFO or FILO order.
-However, regardless of the answer, users can move line \ref{line:signal-a} before line \ref{line:signal-ab} and get the reverse effect for listing \ref{f:dependency}.
-
-In both cases, the threads need to be able to distinguish, on a per monitor basis, which ones need to be released and which ones need to be transferred, which means knowing when to release a group becomes complex and inefficient (see next section) and therefore effectively precludes this approach.
-
-
-\subsubsection{Dependency graphs}
-
-\begin{figure}
-\begin{multicols}{3}
-Thread $\alpha$
-\begin{cfa}[numbers=left, firstnumber=1]
-acquire A
-	acquire A & B
-		wait A & B
-	release A & B
-release A
-\end{cfa}
-\columnbreak
-Thread $\gamma$
-\begin{cfa}[numbers=left, firstnumber=6, escapechar=|]
-acquire A
-	acquire A & B
-		|\label{line:signal-ab}|signal A & B
-	|\label{line:release-ab}|release A & B
-	|\label{line:signal-a}|signal A
-|\label{line:release-a}|release A
-\end{cfa}
-\columnbreak
-Thread $\beta$
-\begin{cfa}[numbers=left, firstnumber=12, escapechar=|]
-acquire A
-	wait A
-|\label{line:release-aa}|release A
-\end{cfa}
-\end{multicols}
-\begin{cfa}[caption={Pseudo-code for the three thread example.},label={f:dependency}]
-\end{cfa}
-\begin{center}
-\input{dependency}
-\end{center}
-\caption{Dependency graph of the statements in listing \ref{f:dependency}}
-\label{fig:dependency}
-\end{figure}
-
-In listing \ref{f:int-bulk-cfa}, there is a solution that satisfies both barging prevention and mutual exclusion.
-If ownership of both monitors is transferred to the waiter when the signaller releases @A & B@ and then the waiter transfers back ownership of @A@ back to the signaller when it releases it, then the problem is solved (@B@ is no longer in use at this point).
-Dynamically finding the correct order is therefore the second possible solution.
-The problem is effectively resolving a dependency graph of ownership requirements.
-Here even the simplest of code snippets requires two transfers and has a super-linear complexity.
-This complexity can be seen in listing \ref{f:explosion}, which is just a direct extension to three monitors, requires at least three ownership transfer and has multiple solutions.
-Furthermore, the presence of multiple solutions for ownership transfer can cause deadlock problems if a specific solution is not consistently picked; In the same way that multiple lock acquiring order can cause deadlocks.
-\begin{figure}
-\begin{multicols}{2}
-\begin{cfa}
-acquire A
-	acquire B
-		acquire C
-			wait A & B & C
-		release C
-	release B
-release A
-\end{cfa}
-
-\columnbreak
-
-\begin{cfa}
-acquire A
-	acquire B
-		acquire C
-			signal A & B & C
-		release C
-	release B
-release A
-\end{cfa}
-\end{multicols}
-\begin{cfa}[caption={Extension to three monitors of listing \ref{f:int-bulk-cfa}},label={f:explosion}]
-\end{cfa}
-\end{figure}
-
-Given the three threads example in listing \ref{f:dependency}, figure \ref{fig:dependency} shows the corresponding dependency graph that results, where every node is a statement of one of the three threads, and the arrows the dependency of that statement (\eg $\alpha1$ must happen before $\alpha2$).
-The extra challenge is that this dependency graph is effectively post-mortem, but the runtime system needs to be able to build and solve these graphs as the dependencies unfold.
-Resolving dependency graphs being a complex and expensive endeavour, this solution is not the preferred one.
-
-\subsubsection{Partial Signalling} \label{partial-sig}
-\end{comment}
-
-Finally, the solution that is chosen for \CFA is to use partial signalling.
-Again using listing \ref{f:int-bulk-cfa}, the partial signalling solution transfers ownership of monitor @B@ at lines \ref{line:signal1} to the waiter but does not wake the waiting thread since it is still using monitor @A@.
-Only when it reaches line \ref{line:lastRelease} does it actually wake up the waiting thread.
-This solution has the benefit that complexity is encapsulated into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.
-This solution has a much simpler implementation than a dependency graph solving algorithms, which is why it was chosen.
-Furthermore, after being fully implemented, this solution does not appear to have any significant downsides.
-
-Using partial signalling, listing \ref{f:dependency} can be solved easily:
-\begin{itemize}
-	\item When thread $\gamma$ reaches line \ref{line:release-ab} it transfers monitor @B@ to thread $\alpha$ and continues to hold monitor @A@.
-	\item When thread $\gamma$ reaches line \ref{line:release-a}  it transfers monitor @A@ to thread $\beta$  and wakes it up.
-	\item When thread $\beta$  reaches line \ref{line:release-aa} it transfers monitor @A@ to thread $\alpha$ and wakes it up.
-\end{itemize}
-
-
-\subsection{Signalling: Now or Later}
+Threads making calls to routines that are currently excluded block outside (external) of the monitor on a calling queue, versus blocking on condition queues inside (internal) of the monitor.
+
+For internal scheduling, non-blocking signalling (as in the producer/consumer example) is used when the signaller is providing the cooperation for a waiting thread;
+the signaller enters the monitor and changes state, detects a waiting thread that can use the state, performs a non-blocking signal on the condition queue for the waiting thread, and exits the monitor to run concurrently.
+The waiter unblocks next, takes the state, and exits the monitor.
+Blocking signalling is the reverse, where the waiter is providing the cooperation for the signalling thread;
+the signaller enters the monitor, detects a waiting thread providing the necessary state, performs a blocking signal to place it on the urgent queue and unblock the waiter.
+The waiter changes state and exits the monitor, and the signaller unblocks next from the urgent queue to take the state.
+
+Figure~\ref{f:DatingService} shows a dating service demonstrating the two forms of signalling: non-blocking and blocking.
+The dating service matches girl and boy threads with matching compatibility codes so they can exchange phone numbers.
+A thread blocks until an appropriate partner arrives.
+The complexity is exchanging phone number in the monitor, 
+While the non-barging monitor prevents a caller from stealing a phone number, the monitor mutual-exclusion property 
+
+The dating service is an example of a monitor that cannot be written using external scheduling because:
+
+The example in table \ref{tbl:datingservice} highlights the difference in behaviour.
+As mentioned, @signal@ only transfers ownership once the current critical section exits; this behaviour requires additional synchronization when a two-way handshake is needed.
+To avoid this explicit synchronization, the @condition@ type offers the @signal_block@ routine, which handles the two-way handshake as shown in the example.
+This feature removes the need for a second condition variable and simplifies programming.
+Like every other monitor semantic, @signal_block@ uses barging prevention, which means mutual-exclusion is baton-passed both on the front end and the back end of the call to @signal_block@, meaning no other thread can acquire the monitor either before or after the call.
 
 \begin{figure}
@@ -1833,22 +1621,246 @@
 \subfloat[\lstinline@signal_block@]{\label{f:DatingSignalBlock}\usebox\myboxB}
 \caption{Dating service. }
-\label{f:Dating service}
+\label{f:DatingService}
 \end{figure}
 
-An important note is that, until now, signalling a monitor was a delayed operation.
-The ownership of the monitor is transferred only when the monitor would have otherwise been released, not at the point of the @signal@ statement.
-However, in some cases, it may be more convenient for users to immediately transfer ownership to the thread that is waiting for cooperation, which is achieved using the @signal_block@ routine.
-
-The example in table \ref{tbl:datingservice} highlights the difference in behaviour.
-As mentioned, @signal@ only transfers ownership once the current critical section exits; this behaviour requires additional synchronization when a two-way handshake is needed.
-To avoid this explicit synchronization, the @condition@ type offers the @signal_block@ routine, which handles the two-way handshake as shown in the example.
-This feature removes the need for a second condition variables and simplifies programming.
-Like every other monitor semantic, @signal_block@ uses barging prevention, which means mutual-exclusion is baton-passed both on the front end and the back end of the call to @signal_block@, meaning no other thread can acquire the monitor either before or after the call.
-
-% ======================================================================
-% ======================================================================
+Both internal and external scheduling extend to multiple monitors in a natural way.
+\begin{cquote}
+\begin{tabular}{@{}l@{\hspace{3\parindentlnth}}l@{}}
+\begin{cfa}
+monitor M { `condition e`; ... };
+void foo( M & mutex m1, M & mutex m2 ) {
+	... wait( `e` ); ...   // wait( e, m1, m2 )
+	... wait( `e, m1` ); ...
+	... wait( `e, m2` ); ...
+}
+\end{cfa}
+&
+\begin{cfa}
+void rtn$\(_1\)$( M & mutex m1, M & mutex m2 );
+void rtn$\(_2\)$( M & mutex m1 );
+void bar( M & mutex m1, M & mutex m2 ) {
+	... waitfor( `rtn` ); ...       // $\LstCommentStyle{waitfor( rtn\(_1\), m1, m2 )}$
+	... waitfor( `rtn, m1` ); ... // $\LstCommentStyle{waitfor( rtn\(_2\), m1 )}$
+}
+\end{cfa}
+\end{tabular}
+\end{cquote}
+For @wait( e )@, the default semantics is to atomically block the signaller and release all acquired mutex types in the parameter list, \ie @wait( e, m1, m2 )@.
+To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @wait( e, m1 )@.
+Wait statically verifies the released monitors are the acquired mutex-parameters so unconditional release is safe.
+Finally, a signaller,
+\begin{cfa}
+void baz( M & mutex m1, M & mutex m2 ) {
+	... signal( e ); ...
+}
+\end{cfa}
+must have acquired at least as many monitor locks as the waiting thread signalled from the front of the condition queue.
+In general, the signaller does not know the order of waiting threads, so it must acquire the maximum number of mutex locks for the worst-case waiting thread.
+
+Similarly, for @waitfor( rtn )@, the default semantics is to atomically block the acceptor and release all acquired mutex types in the parameter list, \ie @waitfor( rtn, m1, m2 )@.
+To override the implicit multi-monitor wait, specific mutex parameter(s) can be specified, \eg @waitfor( rtn, m1 )@.
+Waitfor statically verifies the released monitors are the same as the acquired mutex-parameters of the given routine or routine pointer.
+To statically verify the released monitors match with the accepted routine's mutex parameters, the routine (pointer) prototype must be accessible.
+
+The ability to release a subset of acquired monitors can result in a \newterm{nested monitor}~\cite{Lister77} deadlock.
+\begin{cfa}
+void foo( M & mutex m1, M & mutex m2 ) {
+	... wait( `e, m1` ); ...				$\C{// release m1, keeping m2 acquired}$
+void baz( M & mutex m1, M & mutex m2 ) {	$\C{// must acquire m1 and m2}$
+	... signal( `e` ); ...
+\end{cfa}
+The @wait@ only releases @m1@ so the signalling thread cannot acquire both @m1@ and @m2@ to enter @baz@ to get to the @signal@.
+While deadlock issues can occur with multiple/nesting acquisition, this issue results from the fact that locks, and by extension monitors, are not perfectly composable.
+
+Finally, an important aspect of monitor implementation is barging, \ie can calling threads barge ahead of signalled threads?
+If barging is allowed, synchronization between a signaller and signallee is difficult, often requiring multiple unblock/block cycles (looping around a wait rechecking if a condition is met).
+\begin{quote}
+However, we decree that a signal operation be followed immediately by resumption of a waiting program, without possibility of an intervening procedure call from yet a third program.
+It is only in this way that a waiting program has an absolute guarantee that it can acquire the resource just released by the signalling program without any danger that a third program will interpose a monitor entry and seize the resource instead.~\cite[p.~550]{Hoare74}
+\end{quote}
+\CFA scheduling \emph{precludes} barging, which simplifies synchronization among threads in the monitor and increases correctness.
+For example, there are no loops in either bounded buffer solution in Figure~\ref{f:GenericBoundedBuffer}.
+Supporting barging prevention as well as extending internal scheduling to multiple monitors is the main source of complexity in the design and implementation of \CFA concurrency.
+
+
+\subsection{Barging Prevention}
+
+Figure~\ref{f:BargingPrevention} shows \CFA code where bulk acquire adds complexity to the internal-signalling semantics.
+The complexity begins at the end of the inner @mutex@ statement, where the semantics of internal scheduling need to be extended for multiple monitors.
+The problem is that bulk acquire is used in the inner @mutex@ statement where one of the monitors is already acquired.
+When the signalling thread reaches the end of the inner @mutex@ statement, it should transfer ownership of @m1@ and @m2@ to the waiting threads to prevent barging into the outer @mutex@ statement by another thread.
+However, both the signalling thread and waiting thread W1 still need monitor @m1@.
+
+\begin{figure}
+\newbox\myboxA
+\begin{lrbox}{\myboxA}
+\begin{cfa}[aboveskip=0pt,belowskip=0pt]
+monitor M m1, m2;
+condition c;
+mutex( m1 ) { // $\LstCommentStyle{\color{red}outer}$
+	...
+	mutex( m1, m2 ) { // $\LstCommentStyle{\color{red}inner}$
+		... `signal( c )`; ...
+		// m1, m2 acquired
+	} // $\LstCommentStyle{\color{red}release m2}$
+	// m1 acquired
+} // release m1
+\end{cfa}
+\end{lrbox}
+
+\newbox\myboxB
+\begin{lrbox}{\myboxB}
+\begin{cfa}[aboveskip=0pt,belowskip=0pt]
+
+
+mutex( m1 ) {
+	...
+	mutex( m1, m2 ) {
+		... `wait( c )`; // block and release m1, m2
+		// m1, m2 acquired
+	} // $\LstCommentStyle{\color{red}release m2}$
+	// m1 acquired
+} // release m1
+\end{cfa}
+\end{lrbox}
+
+\newbox\myboxC
+\begin{lrbox}{\myboxC}
+\begin{cfa}[aboveskip=0pt,belowskip=0pt]
+
+
+mutex( m2 ) {
+	... `wait( c )`; ...
+	// m2 acquired
+} // $\LstCommentStyle{\color{red}release m2}$
+
+
+
+
+\end{cfa}
+\end{lrbox}
+
+\begin{cquote}
+\subfloat[Signalling Thread]{\label{f:SignallingThread}\usebox\myboxA}
+\hspace{2\parindentlnth}
+\subfloat[Waiting Thread (W1)]{\label{f:WaitingThread}\usebox\myboxB}
+\hspace{2\parindentlnth}
+\subfloat[Waiting Thread (W2)]{\label{f:OtherWaitingThread}\usebox\myboxC}
+\end{cquote}
+\caption{Barging Prevention}
+\label{f:BargingPrevention}
+\end{figure}
+
+One scheduling solution is for the signaller to keep ownership of all locks until the last lock is ready to be transferred, because this semantics fits most closely to the behaviour of single-monitor scheduling.
+However, Figure~\ref{f:OtherWaitingThread} shows this solution is complex depending on other waiters, resulting in choices when the signaller finishes the inner mutex-statement.
+The signaller can retain @m2@ until completion of the outer mutex statement and pass the locks to waiter W1, or it can pass @m2@ to waiter W2 after completing the inner mutex-statement, while continuing to hold @m1@.
+In the latter case, waiter W2 must eventually pass @m2@ to waiter W1, which is complex because W2 may have waited before W1 so it is unaware of W1.
+Furthermore, there is an execution sequence where the signaller always finds waiter W2, and hence, waiter W1 starves.
+
+While a number of approaches were examined~\cite[\S~4.3]{Delisle18}, the solution chosen for \CFA is a novel technique called \newterm{partial signalling}.
+Signalled threads are moved to an urgent queue and the waiter at the front defines the set of monitors necessary for it to unblock.
+Partial signalling transfers ownership of monitors to the front waiter.
+When the signaller thread exits or waits in the monitor, the front waiter is unblocked if all its monitors are released.
+This solution has the benefit that complexity is encapsulated into only two actions: passing monitors to the next owner when they should be released and conditionally waking threads if all conditions are met.
+
+\begin{comment}
+Figure~\ref{f:dependency} shows a slightly different example where a third thread is waiting on monitor @A@, using a different condition variable.
+Because the third thread is signalled when secretly holding @B@, the goal  becomes unreachable.
+Depending on the order of signals (listing \ref{f:dependency} line \ref{line:signal-ab} and \ref{line:signal-a}) two cases can happen:
+
+\paragraph{Case 1: thread $\alpha$ goes first.} In this case, the problem is that monitor @A@ needs to be passed to thread $\beta$ when thread $\alpha$ is done with it.
+\paragraph{Case 2: thread $\beta$ goes first.} In this case, the problem is that monitor @B@ needs to be retained and passed to thread $\alpha$ along with monitor @A@, which can be done directly or possibly using thread $\beta$ as an intermediate.
+\\
+
+Note that ordering is not determined by a race condition but by whether signalled threads are enqueued in FIFO or FILO order.
+However, regardless of the answer, users can move line \ref{line:signal-a} before line \ref{line:signal-ab} and get the reverse effect for listing \ref{f:dependency}.
+
+In both cases, the threads need to be able to distinguish, on a per monitor basis, which ones need to be released and which ones need to be transferred, which means knowing when to release a group becomes complex and inefficient (see next section) and therefore effectively precludes this approach.
+
+
+\subsubsection{Dependency graphs}
+
+\begin{figure}
+\begin{multicols}{3}
+Thread $\alpha$
+\begin{cfa}[numbers=left, firstnumber=1]
+acquire A
+	acquire A & B
+		wait A & B
+	release A & B
+release A
+\end{cfa}
+\columnbreak
+Thread $\gamma$
+\begin{cfa}[numbers=left, firstnumber=6, escapechar=|]
+acquire A
+	acquire A & B
+		|\label{line:signal-ab}|signal A & B
+	|\label{line:release-ab}|release A & B
+	|\label{line:signal-a}|signal A
+|\label{line:release-a}|release A
+\end{cfa}
+\columnbreak
+Thread $\beta$
+\begin{cfa}[numbers=left, firstnumber=12, escapechar=|]
+acquire A
+	wait A
+|\label{line:release-aa}|release A
+\end{cfa}
+\end{multicols}
+\begin{cfa}[caption={Pseudo-code for the three thread example.},label={f:dependency}]
+\end{cfa}
+\begin{center}
+\input{dependency}
+\end{center}
+\caption{Dependency graph of the statements in listing \ref{f:dependency}}
+\label{fig:dependency}
+\end{figure}
+
+In listing \ref{f:int-bulk-cfa}, there is a solution that satisfies both barging prevention and mutual exclusion.
+If ownership of both monitors is transferred to the waiter when the signaller releases @A & B@ and then the waiter transfers back ownership of @A@ back to the signaller when it releases it, then the problem is solved (@B@ is no longer in use at this point).
+Dynamically finding the correct order is therefore the second possible solution.
+The problem is effectively resolving a dependency graph of ownership requirements.
+Here even the simplest of code snippets requires two transfers and has a super-linear complexity.
+This complexity can be seen in listing \ref{f:explosion}, which is just a direct extension to three monitors, requires at least three ownership transfers, and has multiple solutions.
+Furthermore, the presence of multiple solutions for ownership transfer can cause deadlock problems if a specific solution is not consistently picked, in the same way that multiple lock-acquisition orders can cause deadlocks.
+\begin{figure}
+\begin{multicols}{2}
+\begin{cfa}
+acquire A
+	acquire B
+		acquire C
+			wait A & B & C
+		release C
+	release B
+release A
+\end{cfa}
+
+\columnbreak
+
+\begin{cfa}
+acquire A
+	acquire B
+		acquire C
+			signal A & B & C
+		release C
+	release B
+release A
+\end{cfa}
+\end{multicols}
+\begin{cfa}[caption={Extension to three monitors of listing \ref{f:int-bulk-cfa}},label={f:explosion}]
+\end{cfa}
+\end{figure}
+
+Given the three-thread example in listing \ref{f:dependency}, figure \ref{fig:dependency} shows the corresponding dependency graph, where every node is a statement of one of the three threads, and the arrows are the dependencies of that statement (\eg $\alpha1$ must happen before $\alpha2$).
+The extra challenge is that this dependency graph is effectively post-mortem, but the runtime system needs to be able to build and solve these graphs as the dependencies unfold.
+Because resolving dependency graphs is a complex and expensive endeavour, this solution is not the preferred one.
+
+\subsubsection{Partial Signalling} \label{partial-sig}
+\end{comment}
+
+
 \section{External scheduling} \label{extsched}
-% ======================================================================
-% ======================================================================
+
 An alternative to internal scheduling is external scheduling (see Table~\ref{tbl:sched}).
 
Index: doc/proposals/user_conversions.md
===================================================================
--- doc/proposals/user_conversions.md	(revision 7bdcac1f3631ee6408a86b4d0321433114fee6d3)
+++ doc/proposals/user_conversions.md	(revision 3fc59bdb3dcc1b79be061440b40a2cebb66b85f3)
@@ -5,20 +5,16 @@
 There is also a set of _explicit_ conversions that are only allowed through a 
 cast expression.
-Based on Glen's notes on conversions [1], I propose that safe and unsafe 
-conversions be expressed as constructor variants, though I make explicit 
-(cast) conversions a constructor variant as well rather than a dedicated 
-operator. 
+I propose that safe, unsafe, and explicit (cast) conversions be expressed as 
+constructor variants. 
 Throughout this article, I will use the following operator names for 
 constructors and conversion functions from `From` to `To`:
 
-	void ?{} ( To*, To );            // copy constructor
-	void ?{} ( To*, From );          // explicit constructor
-	void ?{explicit} ( To*, From );  // explicit cast conversion
-	void ?{safe} ( To*, From );      // implicit safe conversion
-	void ?{unsafe} ( To*, From );    // implicit unsafe conversion
-
-[1] http://plg.uwaterloo.ca/~cforall/Conversions/index.html
-
-Glen's design made no distinction between constructors and unsafe implicit 
+	void ?{} ( To&, To );            // copy constructor
+	void ?{} ( To&, From );          // explicit constructor
+	void ?{explicit} ( To&, From );  // explicit cast conversion
+	void ?{safe} ( To&, From );      // implicit safe conversion
+	void ?{unsafe} ( To&, From );    // implicit unsafe conversion
+
+It has been suggested that all constructors would define unsafe implicit 
 conversions; this is elegant, but interacts poorly with tuples. 
 Essentially, without making this distinction, a constructor like the following 
@@ -26,5 +22,5 @@
 multiplying the space of possible interpretations of all functions:
 
-	void ?{}( Coord *this, int x, int y );
+	void ?{}( Coord& this, int x, int y );
 
 That said, it would certainly be possible to make a multiple-argument implicit 
@@ -32,8 +28,8 @@
 used infrequently:
 
-	void ?{unsafe}( Coord *this, int x, int y );
+	void ?{unsafe}( Coord& this, int x, int y );
 
 An alternate possibility would be to only count two-arg constructors 
-`void ?{} ( To*, From )` as unsafe conversions; under this semantics, safe and 
+`void ?{} ( To&, From )` as unsafe conversions; under this semantics, safe and 
 explicit conversions should also have a compiler-enforced restriction to 
 ensure that they are two-arg functions (this restriction may be valuable 
@@ -43,9 +39,16 @@
 is convertable to `To`. 
 If user-defined conversions are not added to the language, 
-`void ?{} ( To*, From )` may be a suitable representation, relying on 
+`void ?{} ( To&, From )` may be a suitable representation, relying on 
 conversions on the argument types to account for transitivity. 
-On the other hand, `To*` should perhaps match its target type exactly, so 
-another assertion syntax specific to conversions may be required, e.g. 
-`From -> To`.
+Since `To&` should be an exact match on `To`, this should put all the implicit 
+conversions on the RHS.
+On the other hand, under some models (like [1]), implicit conversions are not 
+allowed in assertion parameters, so another assertion syntax specific to 
+conversions may be required, e.g. `From -> To`. 
+It has also been suggested that, for programmer control, no implicit 
+conversions (except, possibly, for polymorphic specialization) should be 
+allowed in resolution of cast operators.
+
+[1] ../working/assertion_resolution.md
 
 ### Constructor Idiom ###
@@ -53,13 +56,14 @@
 that we can use the full range of Cforall features for conversions, including 
 polymorphism.
-Glen [1] defines a _constructor idiom_ that can be used to create chains of 
-safe conversions without duplicating code; given a type `Safe` which members 
-of another type `From` can be directly converted to, the constructor idiom 
-allows us to write a conversion for any type `To` which `Safe` converts to: 
-
-	forall(otype To | { void ?{safe}( To*, Safe ) })
-	void ?{safe}( To *this, From that ) {
+In an earlier version of this proposal, Glen Ditchfield defines a 
+_constructor idiom_ that can be used to create chains of safe conversions 
+without duplicating code; given a type `Safe` which members of another type 
+`From` can be directly converted to, the constructor idiom allows us to write 
+a conversion for any type `To` which `Safe` converts to: 
+
+	forall(otype To | { void ?{safe}( To&, Safe ) })
+	void ?{safe}( To& this, From that ) {
 		Safe tmp = /* some expression involving that */;
-		*this = tmp; // uses assertion parameter
+		this{ tmp }; // initialize from assertion parameter
 	}
 
@@ -67,14 +71,46 @@
 unsafe conversions.
 
+Glen's original suggestion said the copy constructor for `To` should also be 
+accepted as a resolution for `void ?{safe}( To&, Safe )` (`Safe` == `To`), 
+allowing this same code to be used for the single-step conversion as well. 
+This proposal does come at the cost of an extra copy initialization of the 
+target value, though.
+
+Contrariwise, if a monomorphic conversion from `From` to `Safe` is written, 
+e.g.:
+
+	void ?{safe}( Safe& this, From that ) {
+		this{ /* some parameters involving that */ };
+	}
+
+Then the code for a transitive conversion from `From` to any `To` type 
+convertible from `Safe` is written:
+
+	forall(otype To | { void ?{safe}( To&, Safe ) })
+	void ?{safe}( To& this, From that ) {
+		Safe tmp = that;  // uses monomorphic conversion
+		this{ tmp };      // initialize from assertion parameter
+	}
+
+Given the entirely-boilerplate nature of this code, but negative performance 
+implications of the unmodified constructor idiom, it might be fruitful to have 
+transitive and single step conversion operators, and let CFA build the 
+transitive conversions; some possible names:
+
+	void ?{safe}  (To&, From);    void ?{final safe} (To&, From);  // single-step
+	void ?{safe*} (To&, From);    void ?{safe}       (To&, From);  // transitive
+
 What selective non-use of the constructor idiom gives us is the ability to 
 define a conversion that may only be the *last* conversion in a chain of such. 
-Constructing a conversion graph able to unambiguously represent the full 
-hierarchy of implicit conversions in C is provably impossible using only 
-single-step conversions with no additional information (see Appendix A), but 
-this mechanism is sufficiently powerful (see [1], though the design there has 
-some minor bugs; the general idea is to use the constructor idiom to define 
-two chains of conversions, one among the signed integral types, another among 
-the unsigned, and to use monomorphic conversions to allow conversions between 
-signed and unsigned integer types).
+One use for this is to solve the problem that `explicit` conversions were 
+added to C++ for, that of conversions to `bool` chaining to become conversions 
+to any arithmetic type.
+Another use is to unambiguously represent the full hierarchy of implicit 
+conversions in C by making sign conversions non-transitive, allowing the 
+compiler to resolve e.g. `int -> unsigned long` as 
+`int -> long -> unsigned long` over `int -> unsigned int -> unsigned long`. 
+See [2] for more details.
+
+[2] ../working/glen_conversions/index.html#usual
 
 ### Appendix A: Partial and Total Orders ###
@@ -153,5 +189,5 @@
 convert from `int` to `unsigned long`, so we just put in a direct conversion 
 and make the compiler smart enough to figure out the costs" - this is the 
-approach taken by the existing compipler, but given that in a user-defined 
+approach taken by the existing compiler, but given that in a user-defined 
 conversion proposal the users can build an arbitrary graph of conversions, 
 this case still needs to be handled. 
@@ -160,5 +196,5 @@
 exists a chain of conversions from `a` to `b` (see Appendix A for description 
 of preorders and related constructs). 
-This preorder corresponds roughly to a more usual type-theoretic concept of 
+This preorder roughly corresponds to a more usual type-theoretic concept of 
 subtyping ("if I can convert `a` to `b`, `a` is a more specific type than 
 `b`"); however, since this graph is arbitrary, it may contain cycles, so if 
@@ -192,5 +228,5 @@
 and so is considered to be the nearer type. 
 By transitivity, then, the conversion from `X` to `Y2` should be cheaper than 
-the conversion from `X` to `W`, but in this case the `X` and `W` are 
+the conversion from `X` to `W`, but in this case the `Y2` and `W` are 
 incomparable by the conversion preorder, so the tie is broken by the shorter 
 path from `X` to `W` in favour of `W`, contradicting the transitivity property 
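
Note on the sign-conversion example in the proposal above: the effect of making sign conversions non-transitive can be checked with a small model. The following is a minimal sketch in Python, not Cforall, and it is not the resolver's actual algorithm. The edge labels `WIDEN` and `SIGN`, the `EDGES` table, and the `chains` helper are hypothetical names introduced only for illustration. It assumes the four-type graph over `int`, `long`, `unsigned int`, and `unsigned long`, and it treats a sign conversion as usable only as the last step of a chain.

```python
# Hypothetical model of the four-type conversion graph; the edge kinds and
# the "sign conversions may only end a chain" rule are assumptions made for
# illustration, not the Cforall resolver's algorithm.

WIDEN = "widen"  # transitive step, e.g. int -> long
SIGN = "sign"    # sign conversion, e.g. int -> unsigned int

EDGES = {
    "int": [("long", WIDEN), ("unsigned int", SIGN)],
    "long": [("unsigned long", SIGN)],
    "unsigned int": [("unsigned long", WIDEN)],
    "unsigned long": [],
}


def chains(src, dst, sign_is_final):
    """Return every conversion chain from src to dst as a list of type names.

    When sign_is_final is true, a SIGN edge is accepted only as the last
    step of a chain, mirroring a non-transitive ("final") conversion.
    """
    if src == dst:
        return [[dst]]
    found = []
    for nxt, kind in EDGES[src]:
        if sign_is_final and kind == SIGN and nxt != dst:
            continue  # a final conversion cannot be followed by another step
        for tail in chains(nxt, dst, sign_is_final):
            found.append([src] + tail)
    return found


if __name__ == "__main__":
    for flag in (False, True):
        for chain in chains("int", "unsigned long", flag):
            print(flag, " -> ".join(chain))
```

With every edge treated as transitive, the model finds two chains from `int` to `unsigned long`, which is the ambiguity. With sign conversions restricted to the final step, only `int -> long -> unsigned long` remains.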
Index: doc/working/glen_conversions/index.html
===================================================================
--- doc/working/glen_conversions/index.html	(revision 3fc59bdb3dcc1b79be061440b40a2cebb66b85f3)
+++ doc/working/glen_conversions/index.html	(revision 3fc59bdb3dcc1b79be061440b40a2cebb66b85f3)
@@ -0,0 +1,1462 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
+            "http://www.w3.org/TR/html4/strict.dtd">
+<html>
+<head>
+<title>Conversions for Cforall</title>
+<style type='text/css'>
+.rationale { font-size: smaller; background-color: #F0F0FF }
+pre { margin-left: 2em; border-width: 1px; }
+code, pre { font-family: courier; font-weight: bold; }
+pre i {font-weight: normal; }
+dfn {font-weight: bold; font-style: italic; }
+</style>
+</head>
+
+<body>
+<h1>Conversions for Cforall</h1>
+
+<p><b>NOTE:</b> This proposal for constructors and user-defined conversions 
+does not represent the current state of Cforall language development, but is 
+maintained for its possible utility in building user-defined conversions. See 
+<tt>doc/proposals/user_conversions.md</tt> for a more current presentation of 
+these ideas.</p>
+
+<p>This is the first draft of a description of a possible extension to the
+current definition of Cforall ("Cforall-as-is") that would let programmers
+fit new types into Cforall's system of conversions.</p>
+
+<ol>
+  <li><a href="#notes">Design Notes</a>
+      <ol>
+	<li><a href="#goals">Goals</a></li>
+	<li><a href="#conversions">Conversions</a></li>
+	<li><a href="#constructors">Constructors</a></li>
+	<li><a href="#ambiguity">Ambiguity</a></li>
+      </ol>
+  </li>
+  <li><a href="#extension">Proposed Extension</a>
+      <ol>
+	<li><a href="#ops">New Operator Identifiers</a></li>
+	<li><a href="#casts">Cast Expressions</a></li>
+	<li><a href='#definitions'>Object Definitions</a></li>
+	<li><a href='#default'>Default Functions</a></li>
+      </ol>
+  </li>
+  <li><a href="#chaining">Conversion Composition</a></li>
+  <li><a href='#cost'>Constructors and Cost</a></li>
+  <li><a href='#heap'>Heap Allocation</a></li>
+  <li><a href='#generic'>Generic Conversions and Constructors</a></li>
+  <li><a href="#usual">C's "Usual Arithmetic Conversions"</a>
+      <ol>
+	<li><a href="#floating">Floating-Point Types</a></li>
+	<li><a href="#largeint">Large Integer Types</a></li>
+	<li><a href="#intpromo">C's "Integer Promotions"</a></li>
+      </ol>
+  </li>
+  <li><a href="#otherpromo">Other Promotions</a></li>
+  <li><a href="#demotions">Other Pre-Defined Implicit Conversions</a></li>
+  <li><a href="#explicit">Pre-Defined Explicit Conversions</a></li>
+  <li><a href="#nonconversions">Non-Conversions</a></li>
+  <li><a href="#assignment">Assignment Operators</a></li>
+  <li><a href="#overload">Overload Resolution</a></li>
+  <li><a href="#final">Final Notes</a></li>
+</ol>
+
+<h2 id="notes">Design Notes</h2>
+
+<h3 id='goals'>Goals</h3>
+<p>My design goal for this extension is to provide a framework that
+explains the bulk of C's conversion semantics in terms of more basic
+languages, just as Cforall explains most expression semantics in terms of
+overloaded function calls.</p>
+
+<p>My pragmatic goal is to allow a programmer to define a portable rational
+number data type, fit it into the existing C type system, and use it in
+mixed-mode arithmetic expressions, all in a convenient and esthetically
+pleasing manner.</p>
+
+<h3 id="conversions">Conversions</h3>
+
+<p>A <dfn>conversion</dfn> creates a value from a value of a different
+type.  C defines a large number of conversions, especially between
+arithmetic types.  A subset of these can be performed by <dfn>implicit
+conversions</dfn>, which occur in certain contexts: in assignment
+expressions, when passing arguments to functions (where parameters are
+"assigned the value of the corresponding argument"), in initialized
+declarations (where "the same type constraints and conversions as for
+simple assignment apply"), and in mixed mode arithmetic.  All conversions
+can be performed explicitly by cast expressions.</p>
+
+<p>C prefers some implicit conversions, the <dfn>promotions</dfn>, to the
+others.  The promotions are ranked among themselves, creating a hierarchy
+of types.  In mixed-mode operations, the "usual arithmetic conversions"
+promote the operands to what amounts to their least common supertype.
+Cforall-as-is uses a slightly larger set of promotions to choose the
+smallest possible promotion when resolving overloading.</p>
+
+<p>An extension should allow Cforall to explain C's conversions as a set of
+pre-defined functions, including its explicit conversions, implicit
+conversions, and preferences among conversions.  The extension must let the
+programmer define new conversions for programmer-defined types, for
+instance so that new arithmetic types can be used conveniently in
+mixed-mode arithmetic.</p>
+
+<h3 id="constructors">Constructors</h3>
+
+<p>C++ introduced constructors to the C language family, and I will use its
+terminology.  A <dfn>constructor</dfn> is a function that initializes an
+object.  C does not have constructors; instead, it makes do with
+initialization, which works like assignment.  Cforall-as-is does not have
+constructors, either: instead, by analogy with C's semantics, a
+programmer-defined assignment function may be called during initialization.
+However, there is a key difference between a function that implements
+assignment and a constructor: constructors assume that the object is
+uninitialized, and must set up any data structure invariants that the
+object is supposed to obey.  An assignment function assumes that the target
+object obeys its invariants.</p>
+
+<p>A <dfn>default constructor</dfn> has no parameters other than the object it
+initializes.  It establishes invariants, but need not do anything else.  A
+default constructor for a rational number type might set the denominator to be
+non-zero, but leave the numerator undefined.</p>
+
+<p>A <dfn>copy constructor</dfn> has two parameters: the object it
+initializes, and a value of the same type.  Its purpose is to copy the
+value into the object, and so it is very similar to an assignment
+function.  In fact, it could be expressed as a call to a default constructor
+followed by an assignment.</p>
+
+<p>A <dfn>converting constructor</dfn> also has two parameters, but the
+second parameter is a value of some type different from the type
+of the object it initializes.  Its purpose is to convert the value to the
+object's type before copying it, and so it is very similar to a C
+assignment operation that performs an implicit conversion.</p>
+
+<p>C++ sensibly defines parameter passing as call by initialization, since
+the parameter is uninitialized when the argument value is placed in it.
+Extended Cforall should do the same.  However, parameter passing is one of
+the main places where implicit conversions occur.  Hence in extended
+Cforall <em>constructors define the implicit conversions</em>.  Cforall
+should also encourage programmers to maintain the similarity between
+constructors and assignment.</p>
+
+<h3 id="ambiguity">Ambiguity</h3>
+
+<p>In extended Cforall, programmer-defined conversions should fit in with
+the predefined conversions.  For instance, programmer-defined promotions
+should interact with the normal promotions so that programmer-defined types
+can take part in mixed-mode arithmetic expressions.  The first design that
+springs to mind is to define a minimal set of conversions between
+neighbouring types in the type hierarchy, and to have Cforall create
+conversions between more distant types by composition of predefined and
+programmer-defined conversions.  Unfortunately, if one draws a graph of C's
+promotions, with C's types as vertices and C's promotions as edges, the
+result is a directed acyclic graph, not a tree.  This means that an attempt
+to build the full set of promotions by composition of a minimal set of
+promotions will fail.</p>
+
+<p> Consider a simple processor with 32-bit <code>int</code> and
+<code>long</code> types.  On such a machine, C's "usual arithmetic
+conversions" dictate that mixed-mode arithmetic that combines a signed
+integer with an unsigned integer must promote the signed integer to an
+unsigned type.  Here is a directed graph showing some of the minimal
+set of promotions.  Each of the four promotions is necessary, because each
+could be required by some mixed-mode expression, and none can be decomposed
+into simpler conversions.</p>
+
+<pre>
+long --------> unsigned long
+ ^               ^
+ |               |
+int ---------> unsigned int
+</pre>
+
+<p>Now imagine attempting to compose an <code>int</code>-to-<code>unsigned
+long</code> conversion from the minimal set: there are two paths through
+the graph, so the composition is ambiguous.</p>
+
+<p>(In C, and in Cforall-as-is, no ambiguity exists: there is just one
+<code>int</code>-to-<code>unsigned long</code> promotion, defined by the
+language semantics.  In Cforall-as-is, the preference for
+<code>int</code>-to-<code>long</code> over
+<code>int</code>-to-<code>unsigned long</code> is determined by a
+"conversion cost" calculated from the graph of the full set of promotions,
+but the calculation depends on maximal path lengths, not the exact
+path.)</p>
+
+<p>Unfortunately, the same problem with ambiguity creeps in any time
+conversions might be chained together.  The extension must carefully
+control conversion composition, so that programmers can avoid ambiguous
+conversions.</p>
+
+<h2 id='extension'>Proposed Extension</h2>
+
+<p>The rest of this document describes my proposal to add
+programmer-definable conversions and constructors to Cforall.</p>
+
+<p class='rationale'>If your browser supports CSS style sheets, the
+proposal will appear in "normal" paragraphs, and commentary on the proposal
+will have the same appearance as this paragraph.</p>
+
+<h3 id='ops'>New Operator Identifiers</h3>
+
+<p>Cforall would be given a <dfn>cast identifier</dfn>, two
+<dfn>constructor identifiers</dfn>, and a <dfn>destructor
+identifier</dfn>:</p>
+
+<ul>
+  <li> <code>(?)?</code>, for cast functions.</li>
+  <li> <code>(?create)?</code>, for constructors.</li>
+  <li> <code>(?promote)?</code>, for constructors that are promotions.</li>
+  <li> <code>(?destroy)?</code>, for destructors.</li>
+</ul>
+
+<div class='rationale'>
+<p>The ugly identifier <code>(?)?</code> is meant to be mnemonic for the
+cast expression.  The other identifiers are pretty weak (suggestions,
+anyone?) but are supposed to remind the programmer of the connection
+between conversions and constructors.</p>
+
+<p>We could instead use a single <code>(?create)?</code> identifier for
+constructors and add a <code>promote</code> storage class specifier, at
+some small risk of clashes with identifiers in existing code.</p> </div>
+
+<p>It is an error to declare two functions with different constructor
+identifiers that have the same type in the same translation unit.</p>
+
+<p>Functions declared with these identifiers can be polymorphic.  Unlike
+other polymorphic functions, the return type of a polymorphic cast function
+need not be derivable from the type of its parameters.</p>
+
+<div class='rationale'>
+<p>The return type of a call to a polymorphic cast
+function can be deduced from the calling context.</p>
+
+<pre>
+forall(type T1) T1 (?)?(T2);  // <i>Legal.</i>
+forall(type T1) T1 pfun(T2);  // <i>Illegal -- no way to infer </i>T1.
+</pre>
+</div>
+
+<p>A <dfn>cast function</dfn> from type <code>T1</code> to type
+<code>T2</code> is named "<code>(?)?</code>", accepts exactly one explicit
+argument of type <i>T1</i>, and returns a value of type <i>T2</i>.</p>
+
+<p class='rationale'>If the cast function is polymorphic, it will have
+type parameters and assertion parameters as well, and can be said to be a
+cast function from many different types to many different types.
+</p>
+
+<p>A <dfn>default constructor function</dfn> for type <i>T</i> is named
+"<code>(?create)?</code>", accepts exactly one explicit argument of type
+<i>T</i><code>*</code>, and returns <code>void</code>.</p>
+
+<p>A <dfn>copy constructor function</dfn> for type <i>T</i> is named
+"<code>(?create)?</code>", accepts exactly two explicit arguments of types
+<i>T</i><code>*</code> and <i>T</i>, and returns <code>void</code>.</p>
+
+<p>A <dfn>converting constructor function</dfn> for type <i>T1</i> from
+<i>T2</i> is named "<code>(?create)?</code>" or "<code>(?promote)?</code>",
+accepts exactly two explicit arguments of types <i>T1</i><code>*</code> and
+<i>T2</i>, and returns <code>void</code>.</p>
+
+<p>A <dfn>destructor function</dfn> for type <i>T</i> is named
+"<code>(?destroy)?</code>", accepts exactly one explicit argument of type
+<i>T</i><code>*</code>, and returns <code>void</code>.</p>
+
+<div class='rationale'>
+
+<p>The monomorphic function prototypes for these functions are</p>
+<pre>
+<i>T1</i>   (?)?(<i>T2</i>);
+void (?create)?(<i>T1</i>*);
+void (?create)?(<i>T1</i>*, <i>T2</i>);
+void (?promote)?(<i>T1</i>*, <i>T2</i>);
+void (?destroy)?(<i>T1</i>*);
+</pre>
+</div>
+
+<h3 id='casts'>Cast Expressions</h3>
+
+<p>In most cases the cast expression <code>(<i>T</i>)<i>e</i></code> would
+be treated like the function call <code>(?)?(<i>e</i>)</code>, except that
+only cast functions to type <i>T</i> would be valid interpretations of
+<code>(?)?</code>, and <code><i>e</i></code> would not be implicitly
+converted to the cast function's parameter type.  In particular, the usual
+rules for resolving function overloading (see <a href='#overload'>below</a>)
+would be used to choose the best interpretation of the expression.</p>
+
+<div class='rationale'>
+<p>For example, in</p>
+<pre>
+type Wazzit;
+type Thingum;
+Wazzit w;
+(Thingum)w;
+</pre>
+<p>the cast function that is called must be "<code>Thingum
+(?)?(Wazzit)</code>", or a polymorphic function that can be specialized to
+that.</p>
+
+<p>The ban on implicit conversions within the cast allows programmers to
+explicitly control composition of conversions and avoid ambiguity.  I also
+hope that this will make it easier for compilers and programmers to
+determine which conversions will be applied in which circumstances.  If
+implicit conversions could be applied to the inputs and outputs of casts,
+when any and all of the conversion functions involved could be polymorphic
+... the possibilities seem endless, unfortunately.</p>
+
+</div>
+
+<h3 id='definitions'>Object Definitions</h3>
+<p>A definition of an object <i>x</i> would call a constructor function.  Let
+<i>T</i> be <i>x</i>'s type with type qualifiers removed, and let <i>a</i>
+be <i>x</i>'s address (with type <code><i>T</i>*</code>).</p>
+
+<p class='rationale'>If type qualifiers weren't ignored, <code>const</code>
+objects couldn't be initialized, and every constructor would have to be
+duplicated, with one version for <i>T</i>* objects and one for
+<code>volatile</code> <i>T</i>* objects.</p>
+
+<ul>
+  <li>A definition with an initializer that is a single expression
+      <i>e</i>, optionally enclosed in braces, would call a copy or converting
+      constructor.  The call would be treated much like the function call
+      <code><i>f</i>(<i>a</i>,<i>e</i>)</code>, except that only copy and 
+      converting constructors for type <i>T</i> would be valid interpretations
+      of <code><i>f</i></code>, and <i>e</i> would not be  implicitly
+      converted to the type of the constructor's second parameter.</li>
+  <li>If <i>x</i> has automatic storage duration and is not initialized
+      explicitly, the definition would call a default constructor function.
+      The call would be treated much like the function call
+      <code><i>f</i>(<i>a</i>)</code>, except that only default
+      constructor functions for type <i>T</i> would be valid interpretations of
+      <code><i>f</i></code>.</li>
+  <li><p>If <i>x</i> has static storage duration and is not initialized
+      explicitly, and is defined within the scope of a type definition that
+      defines <i>T</i>, then <i>T</i>'s implementation type would determine how
+      <i>x</i> is initialized.</p>
+      <div class='rationale'>
+      <pre>
+      type Rational = struct { int numerator; unsigned denominator; };
+      Rational r; // Both members initialized to 0.
+      </pre>
+      </div>
+  </li>
+  <li><p>If <i>x</i> has static storage duration and is not initialized
+      explicitly, and the type <i>T</i> is an opaque type, the definition
+      would be treated as if <i>x</i> was initialized with the expression
+      <code>0</code>.</p>
+      <div class='rationale'>
+      <p>This is a simple extension of C's rules for static objects,
+      which initialize them all to 0.  Frequently, the 0 involved
+      will have type <i>T</i>, and the definition will call a copy
+      constructor.</p>
+      <pre>
+      extern type Rational;
+      extern Rational 0;
+      static Rational r;  // initialized with the Rational 0.
+      </pre>
+      <p>In other cases, the 0 will be an integer or null pointer, and the
+      definition will call a converting constructor.</p>
+
+      <p>The obvious alternative design would call <i>T</i>'s default
+      constructor.  That design would be inconsistent, because some static
+      objects would go uninitialized.  It would also cause subtle problems,
+      because a particular static definition could be uninitialized or
+      initialized to 0 depending on whether <i>T</i> is an <code>extern
+      type</code> or a <code>typedef</code>.</p>
+      </div>
+  </li>
+</ul>
+
+<p>Except when calling constructors, parameter passing invokes constructor
+functions.  Passing argument expression <i>e</i> to a parameter would be
+equivalent to initializing the parameter with that expression.  When
+calling constructors, the value of the argument would be copied into the
+parameter.</p>
+
+<p>When the lifetime of <i>x</i> ends, a destructor function would be called.
+The call would be treated much like the function call
+<code>(?destroy)?(<i>a</i>)</code>.  When a block ends, the objects that were
+defined in the block would be destroyed in the reverse of the order in which
+they are declared.</p>
+
+<p>The storage class specifier <code>register</code> will have the
+semantics that it has in C++, instead of the semantics of C: it is merely a
+hint to the implementation that the object will be heavily used, and does
+not prevent programs from computing the address of the object.</p>
+
+<h3 id='default'>Default Functions</h3>
+
+<p>In Cforall-as-is, every declaration with type-class <code>type</code>
+implicitly declares a default assignment function, with the same scope and
+linkage as the type.  Extended Cforall would also declare a <dfn>default
+default constructor</dfn> and a <dfn>default destructor</dfn>.</p>
+
+<div class='rationale'>
+<pre>
+{
+    extern type T;
+    T t;           // <i>calls external constructor for T.</i>
+    }              // <i>calls external destructor for T.</i>
+</pre>
+<p>The destructor and some sort of constructor are necessary to instantiate
+the type.  I include the default constructor because it is the most basic.
+Arguably the declaration should also declare a default copy constructor,
+but I chose not to because Cforall can construct a copy constructor from
+the default constructor and the assignment operator, as will be seen
+<a href="#generic">below</a>.</p>
+
+<p>If the type does not need to be instantiated, it probably should have
+been declared by <code>dtype</code> instead of by <code>type</code>.</p>
+</div>
+
+<p>A type definition would implicitly define a default constructor and
+destructor by inheriting the implementation type's default constructor and
+destructor, just as is done for the implicitly defined default assignment
+function.</p>
+
+<h2 id='chaining'>Conversion Composition</h2>
+
+<p>As mentioned above, Cforall does not apply implicit conversions to the
+arguments and results of cast expressions or constructor calls.  Neither
+does it automatically create conversions or constructors by composing
+programmer-defined conversions: given</p>
+
+<pre>
+T1 (?)?(T2);
+T2 (?)?(T3);
+T3 v3;
+(T1)v3;
+</pre>
+
+<p>then Cforall does not automatically create</p>
+
+<pre>
+T1 (?)?(T3 p) { return (T1)(T2)p; }
+</pre>
+
+<p>Composition of conversions does show up through a third mechanism where
+the programmer has more control: assertion lists.  Consider a
+<code>Month</code> type, that represents months as integers between 0 and
+11.  Clearly a <code>Month</code> can be promoted to <code>unsigned</code>,
+and to any type above <code>unsigned</code> in the arithmetic type
+hierarchy as well.</p>
+
+<pre id='monthpromo'>
+type Month = unsigned;
+
+forall(type T | void (?promote)?(T*, unsigned))
+  void (?promote)?(T* target, Month source) {
+    unsigned u_temp = (unsigned)source;
+    T t_temp = u_temp;           // <i>calls the assertion parameter.</i>
+    *target = t_temp;
+  }
+</pre>
+
+<p>The intimidating polymorphic promotion declaration says that, if
+<code>T</code> is a type and <code>unsigned</code> can be promoted to
+<code>T</code>, then the function can promote <code>Month</code> to
+<code>T</code>.</p>
+
+<pre>
+Month m;
+unsigned long ul = m;
+</pre>
+
+<p>To initialize <code>ul</code>, Cforall must bind <code>T</code> to
+<code>unsigned long</code>, find the (pre-defined)
+<code>unsigned</code>-to-<code>unsigned long</code> promotion, and pass it
+to the assertion parameter of the polymorphic
+<code>Month</code>-to-<code>T</code> function.</p>
+
+<p>But what about converting from <code>Month</code> to
+<code>unsigned</code> itself?</p>
+
+<pre>
+unsigned u = m;  // <i>How?</i>
+</pre>
+
+<p>A monomorphic <code>Month</code>-to-<code>unsigned</code> constructor
+would do the job, but its body would mostly duplicate the body of the
+polymorphic function.</p>
+
+<p>Instead, Cforall should use the polymorphic promotion and the
+<code>unsigned</code> copy constructor.  To initialize <code>u</code>,
+Cforall should pass the <code>unsigned</code> copy constructor to the assertion
+parameter of the polymorphic <code>Month</code> promotion, and bind
+<code>T</code> to <code>unsigned</code>.</p>
+
+<p>Note that the polymorphic promotion can promote <code>Month</code> to
+the standard types, to implementation-defined extended types, and to
+programmer-defined types that have yet to be written.  This is much better
+than writing a flock of monomorphic promotions, with function bodies that
+would be nearly identical, to convert <code>Month</code> to each unsigned
+type individually.  The predefined constructors make heavy use of this
+<dfn id='idiom'>constructor idiom</dfn>: instead of writing</p>
+
+<pre>
+void (?promote)? (T1*, T2);
+</pre>
+
+<p>("You can make a T2 into a T1"), write</p>
+<pre>
+forall(type T | void (?promote)?(T*, T1) ) void (?promote)?(T*, T2);
+</pre>
+
+<p>("You can make a T2 into anything that can be made from a T1").</p>
+
+<h2 id='cost'>Constructors and Cost</h2>
+
+<p>Calls to constructors have <dfn>construction costs</dfn>, which let
+Cforall choose the least expensive implicit conversion when given a
+choice.</p>
+
+<ol>
+  <li>The cost of a call to a copy constructor is 0.</li>
+  <li>The cost of a call to a monomorphic constructor is 1.</li>
+  <li>The cost of a call to a polymorphic constructor, or a specialization
+      of it, is 1 plus the sum of the construction costs of constructors
+      that are passed to it through assertion parameters.</li>
+</ol>
+
+<div class='rationale'>
+
+<p>Note that, although point 3 refers to constructors that are
+passed at run-time, the translator statically matches arguments to
+assertion parameters, so it can determine construction costs statically.</p>
+
+<p>Construction cost is defined for <em>every</em>
+constructor, not just the promotions (which are the equivalent of the safe
+conversions of Cforall-as-is).  This seemed like the easiest way to handle
+(admittedly dicey) "mixed" constructors, where the constructor and its
+assertion parameter have different identifiers:</p>
+
+<pre>
+type Thingum;
+type Wazzit;
+forall(type T | void (?create)?(T*, Thingum) )
+  void (?promote)?(T*, Wazzit);
+</pre>
+</div>
+
+<h3>Examples:</h3>
+<p>"<code>unsigned ui = 42U;</code>" calls a copy constructor, and so has
+cost 0.</p>
+
+<p>"<code>unsigned ui = m;</code>", where <code>m</code> has type
+<code>Month</code>, calls the polymorphic <code>Month</code> promotion
+defined <a href="#monthpromo">previously</a>.  It passes the
+<code>unsigned</code>-to-<code>unsigned</code> copy constructor to the
+assertion parameter, and so has cost 1+0&nbsp;=&nbsp;1.</p>
+
+<p>"<code>unsigned long ul = m;</code>" calls the polymorphic
+<code>Month</code> promotion, passing the
+<code>unsigned</code>-to-<code>unsigned long</code> constructor to the
+assertion parameter.  <code>unsigned</code>-to-<code>unsigned long</code>
+is defined below and will turn out to have cost 1, so the total cost is 2.</p>
+
+<p>Inside the body of the <code>Month</code> promotion, the assertion
+parameter has a monomorphic type, and so has a construction cost of 1 where
+it is called by the initialization of <code>t_temp</code>.  The cost of the
+<em>argument</em> passed through the assertion parameter has no relevance
+inside the body of the promotion.</p>
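+<p>The three cost rules and the examples above can be checked with a small
+model.  The following Python sketch is only an illustration: the
+<code>Constructor</code> class and its <code>kind</code> labels are
+hypothetical names, not part of Cforall, and the
+<code>unsigned</code>-to-<code>unsigned long</code> promotion is assumed to
+be a polymorphic promotion with a copy constructor bound to its assertion
+parameter, which gives it the cost of 1 stated above.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Constructor:
    """A constructor call, modelled only as far as its cost is concerned."""
    kind: str                                       # "copy", "mono" or "poly"
    assertions: list = field(default_factory=list)  # constructors bound to assertion parameters


def construction_cost(ctor):
    """Apply the three construction cost rules recursively."""
    if ctor.kind == "copy":
        return 0                                    # rule 1
    if ctor.kind == "mono":
        return 1                                    # rule 2
    if ctor.kind == "poly":                         # rule 3
        return 1 + sum(construction_cost(a) for a in ctor.assertions)
    raise ValueError(f"unknown constructor kind: {ctor.kind}")


# Assumption: unsigned -> unsigned long is polymorphic with a copy constructor bound.
unsigned_to_ulong = Constructor("poly", [Constructor("copy")])

# "unsigned ui = m;" binds the unsigned copy constructor to the Month promotion.
month_to_unsigned = Constructor("poly", [Constructor("copy")])

# "unsigned long ul = m;" binds unsigned -> unsigned long instead.
month_to_ulong = Constructor("poly", [unsigned_to_ulong])

print(construction_cost(month_to_unsigned))  # 1
print(construction_cost(month_to_ulong))     # 2
```

+<p>Because the assertion arguments are matched statically, this recursion
+can be evaluated entirely at translation time.</p>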
+
+<h2 id='overload'>Overload Resolution</h2>
+
+<p>In Cforall-as-is, there is at most one language-defined implicit
+conversion between any two types.  In extended Cforall, more than one
+conversion may be applicable, and overload resolution must be adapted to
+account for that, by using the lowest-cost conversion.</p>
+
+<p>The <dfn>unsafe conversion cost</dfn> of a function call expression
+would be the total conversion cost of implicit calls of
+<code>(?create)?()</code> constructors applied directly to arguments of the
+function -- 0 if there are none.</p>
+
+<p class='rationale'>This would replace a rule in Cforall-as-is, which
+considers all unsafe conversions to be equally bad and just counts them.  I
+think the difference would be subtle and unimportant.</p>
+
+<p>The <dfn>promotion cost</dfn> would be the total conversion costs of
+implicit calls of <code>(?promote)?()</code> constructors applied directly
+to arguments of the function -- 0 if there are none.</p>
+
+<p>Overload resolution would examine each argument expression individually.
+The best interpretations of an expression would be:</p>
+
+<ol>
+  <li>the interpretations with the lowest unsafe conversion cost;</li>
+  <li>of these, the interpretations with the lowest promotion cost;</li>
+  <li>of these, if any can be promoted to the parameter type, then just
+      those that can be converted at minimal cost; otherwise, all remaining
+      interpretations.</li>
+</ol>
+
+<p>The best interpretation would be implicitly converted to the parameter
+type, by calling the conversion function with minimal cost.  If there is
+more than one best interpretation, or if there is more than one
+minimal-cost conversion, the argument is ambiguous.</p>
+
+<p>A maximal set of interpretations of the function call expression that
+have compatible result types produces a single interpretation: the
+interpretations with the lowest unsafe conversion cost, and of these, the
+interpretations with the lowest promotion cost.  If there is more than one
+such interpretation, the function call expression is ambiguous.</p>
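+<p>The two-level comparison (unsafe conversion cost first, promotion cost
+second) can be sketched as follows.  This Python model is a simplification
+under stated assumptions: each interpretation is reduced to a hypothetical
+<code>(label, unsafe_cost, promotion_cost)</code> tuple, the labels and
+costs are invented for illustration, and step 3 of the argument rule is
+left out.</p>

```python
class AmbiguousCall(Exception):
    """Raised when more than one interpretation is equally good."""


def best_interpretations(candidates):
    """Filter (label, unsafe_cost, promotion_cost) tuples by unsafe cost, then promotion cost."""
    if not candidates:
        return []
    lowest_unsafe = min(unsafe for _, unsafe, _ in candidates)
    survivors = [c for c in candidates if c[1] == lowest_unsafe]
    lowest_promotion = min(promo for _, _, promo in survivors)
    return [c for c in survivors if c[2] == lowest_promotion]


def resolve(candidates):
    """Return the label of the single best interpretation, or report ambiguity."""
    if not candidates:
        raise ValueError("no interpretations to choose from")
    best = best_interpretations(candidates)
    if len(best) != 1:
        raise AmbiguousCall([label for label, _, _ in best])
    return best[0][0]


# Hypothetical interpretations of one call; the costs are made up.
candidates = [
    ("f(long)", 0, 1),
    ("f(double)", 0, 2),
    ("f(char)", 1, 0),   # cheapest promotion, but needs an unsafe conversion
]
print(resolve(candidates))  # f(long)
```

+<p>Note that <code>f(char)</code> loses despite its promotion cost of 0,
+because unsafe conversion cost is compared first.</p>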
+
+<h2 id='heap'>Heap Allocation</h2>
+
+<p>Cforall would define new heap allocation functions that would ensure
+that constructors and destructors would be applied to objects in the
+heap.  There's lots of room for ambitious design here, but a simple
+facility might look like this:</p>
+
+<pre>
+forall(type T) void delete(T const volatile restrict* ptr) {
+  if (ptr) (?destroy)?(ptr);
+  free(ptr);
+}
+</pre>
+
+<div class='rationale'>
+<p>In a call to <code>delete()</code>, the argument might be a pointer to a
+pointer: <code>T</code> would be a pointer type, and the argument might
+have all three type qualifiers.  (If it doesn't, pointer conversions will add
+missing qualifiers to the argument.)</p>
+<pre>
+// <i>Pointer to a const volatile restricted pointer to an int:</i>
+int * const volatile restrict * pcvrpi;
+// <i>...</i>
+delete(pcvrpi);   // T<i> bound to </i>int *
+</pre>
+</div>
+
+<p>A <code>new()</code> function would take the address of a pointer and an
+initial value, and point the pointer at heap storage initialized to that
+value.</p>
+<pre>
+forall(type T | void (?create)?(T*, T))
+  void new(T* volatile restrict* ptr, T val) {
+    *ptr = malloc(sizeof(T));
+    if (*ptr) (?create)?(*ptr, val);  // <i>explicit constructor call</i>
+}
+
+forall(type T | void (?create)?(T*, T))
+  void new(T const* volatile restrict* ptr, T val),
+       new(T volatile* volatile restrict* ptr, T val),
+       new(T restrict* volatile restrict* ptr, T val),
+       new(T const volatile* volatile restrict* ptr, T val),
+       new(T const restrict* volatile restrict* ptr, T val),
+       new(T volatile restrict* volatile restrict* ptr, T val),
+       new(T const volatile restrict* volatile restrict* ptr, T val);
+</pre>
+<p class='rationale'>Cforall can't add type qualifiers to pointed-at
+pointer types, so <code>new()</code> needs one variation for each set of
+type qualifiers.</p>
+
+<p>Another <code>new()</code> function would omit the initial value, and
+apply the default constructor.  <span class='rationale'>Obviously, there's
+no point in allocating <code>const</code>-qualified uninitialized
+storage.</span></p>
+<pre>
+forall(type T)
+  void new(T* volatile restrict * ptr) {
+    *ptr = malloc(sizeof(T));
+    if (*ptr) (?create)?(*ptr);   // <i>Explicit default constructor call.</i>
+}
+
+forall(type T)
+  void new(T volatile* volatile restrict*),
+       new(T restrict* volatile restrict*),
+       new(T volatile restrict* volatile restrict*);
+</pre>
+
+<h2 id='generic'>Generic Conversions and Constructors</h2>
+
+<p>Cforall would provide a polymorphic default constructor function and
+destructor function, for types that do not have their own:</p>
+
+<pre>
+forall(type T)
+  void (?create)?(T*) { return; };
+
+forall(type T)
+  void (?destroy)?(T*) { return; };
+</pre>
+
+<p class='rationale'>The generic default constructor and destructor provide
+C semantics for uninitialized variables: "do nothing".</p>
+
+<p>For every structure type <code>struct <i>s</i></code> Cforall would define a
+default constructor function that applies a default constructor to each
+member, in no particular order.  Similarly, it would define a destructor that
+applies the destructor of each member in no particular order.</p>
+
+<p>Any promotion would be treated as a plain constructor:</p>
+<pre>
+forall(type T, type S | void (?promote)?(T*, S))
+  void (?create)?(T* target, S source) {
+    (?promote)?(target, source);    // <i>Explicit constructor call!</i>
+  }
+</pre>
+
+<p>A predefined cast function would allow explicit conversions anywhere
+that implicit conversions are possible:</p>
+<pre>
+forall(type T, type S | void (?create)?(T*, S))
+  T (?)?(S source) {
+    T temp = source;
+    return temp;
+  }
+</pre>
+
+<p>A predefined converting constructor would allow initialization anywhere
+that assignment is defined:</p>
+<pre>
+forall(type T | void (?create)?(T*), type S | T ?=?(T*, S))
+  void (?create)?(T* target, S source) {
+    (?create)?(target);
+    *target = source;
+  }
+</pre>
+
+<p class='rationale'>This implements the typical semantic link between
+assignment and initialization.</p>
+
+<p>The predefined copy constructor function is</p>
+<pre>
+forall(type T)
+  void (?promote)?(T* target, T source) {
+    (?create)?(target);
+    *target = source;
+  }
+</pre>
+
+<p class='rationale'>Since Cforall defines assignment and default
+constructors for structure types, this provides the copy constructor for
+structure types.</p>
+
+<p>Finally, Cforall defines the conversion to <code>void</code>, which
+discards its argument.</p>
+<pre>
+forall(type T) void (?promote)?(void*, T);
+</pre>
+
+<h2 id='usual'>C's "Usual Arithmetic Conversions"</h2>
+
+<p>C has five groups of arithmetic types: signed integers, unsigned
+integers, complex floating-point numbers, imaginary floating-point numbers,
+and real floating-point numbers.  (Implementations are not required to
+provide complex and imaginary types.)  Some of the "usual arithmetic
+conversions" promote upward within a group or to a more general group: from
+<code>int</code> to <code>long long</code>, for instance.   Others
+promote across from a type in one group to a similar type in another group:
+for instance, from <code>int</code> to <code>unsigned int</code>.</p>
+
+<h3 id='floating'>Floating-Point Types</h3>
+
+<p>The floating point types would use the <a href="#idiom">constructor
+idiom</a> for upward promotions, and monomorphic constructors for
+promotions across from real and imaginary types to complex types with the
+same precision.</p> 
+
+<p>I will use a macro to abbreviate the constructor idiom.
+"<code>Promoter(T,S)</code>" promotes <code>S</code> to any type that
+<code>T</code> can be promoted to:</p>
+<pre>
+#define Promoter(Target, Source) \
+  forall(type T | void (?promote)?(T*, Target)) void (?promote)?(T*, Source)
+
+Promoter(long double _Complex, double _Complex);      // <i>a</i>
+Promoter(double _Complex,      float _Complex);       // <i>b</i>
+Promoter(long double, double);                        // <i>c</i>
+Promoter(double,      float);                         // <i>d</i>
+Promoter(long double _Imaginary, double _Imaginary);  // <i>e</i>
+Promoter(double _Imaginary,      float _Imaginary);   // <i>f</i>
+
+void (?promote)?(long double _Complex*, long double);             // <i>g</i>
+void (?promote)?(long double _Complex*, long double _Imaginary);  // <i>h</i>
+void (?promote)?(double _Complex*, double);                       // <i>i</i>
+void (?promote)?(double _Complex*, double _Imaginary);            // <i>j</i>
+void (?promote)?(float _Complex*, float);                         // <i>k</i>
+void (?promote)?(float _Complex*, float _Imaginary);              // <i>l</i>
+</pre>
+
+<div class='rationale'>
+<p>It helps to draw a graph of the promotions.  In this diagram,
+monomorphic promotions are solid arrows from the source type to the target
+type, and polymorphic promotions are dotted arrows from the source type to
+a bubble that surrounds all possible target types.  (Twenty years after
+first hearing about them, I have finally found a use for directed
+multigraphs!)  To determine the promotion from one type to another, find a
+path of zero or more dotted arrows optionally ending with a solid arrow.</p>
+<div>
+<img alt="Floating point promotions" src="./float_promo.png">
+</div>
+
+<p>A <code>long double _Complex</code> can be constructed from</p>
+<ol>
+  <li>a <code>double _Complex</code>, via <i>a</i>, with a <code>long
+      double _Complex</code> copy constructor passed as the assertion
+      parameter.</li>
+  <li>a <code>long double</code>, via constructor <i>g</i>.</li>
+  <li>a <code>double</code>, via <i>c</i> (which promotes
+      <code>double</code> to <code>long double</code> and higher), with
+      <i>g</i> passed as the assertion parameter.  In other words, the path
+      from <code>double</code> to <code>long double _Complex</code> passes
+      through <code>long double</code>.</li>
+  <li>a <code>float _Complex</code>, via <i>b</i>.  For the assertion
+      parameter, Cforall passes a <code>double
+      _Complex</code>-to-<code>long double _Complex</code> constructor that
+      it makes by specializing <i>a</i>; for the assertion parameter of the
+      specialization, it passes a <code>long double
+      _Complex</code>-to-<code>long double _Complex</code> copy
+      constructor.</li>
+  <li>a <code>float</code>, via <i>d</i>, with a specialization of <i>c</i>
+      passed as its assertion parameter, with <i>g</i> passed as the
+      specialization's assertion parameter.</li>
+</ol>
+
+<p>Note how "upward" and "across" promotions interact.  Polymorphic
+"upward" promotions connect widely separated types by composing
+constructors through their assertion parameters.  Monomorphic "across"
+promotions extend composition one step across to corresponding types in
+different groups.</p>
+
+<p>Defining the set of predefined promotions turned out to be quite tricky.
+For example, if "across" promotions used the constructor idiom, ambiguity
+would result: a conversion from <code>float</code> to <code>double
+_Complex</code> could convert upward through <code>double</code> or across
+through <code>float _Complex</code>.  The key points are:</p>
+<ol>
+  <li>Monomorphic constructors are only used to connect neighbouring types
+      in the conversion hierarchy, because they have construction cost 1.</li>
+  <li>Polymorphic constructors only connect directly to neighbours, because
+      their minimal cost is 1.  They reach other types by composition.</li>
+  <li>The types in the assertion parameter of a polymorphic constructor
+      specify the exact path between two types by specifying the
+      next type in a sequence of composed constructors.</li>
+  <li>There can be more than one path between two types, provided that the
+      paths have different construction costs or degrees of
+      polymorphism.</li>
+</ol>
+
+</div>
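+<p>The path rule (zero or more dotted arrows, optionally ending with a solid
+arrow) can be modelled directly from declarations <i>a</i> to <i>l</i>.  The
+Python sketch below is an illustration only: the <code>POLY</code> and
+<code>MONO</code> tables and the <code>promotion_costs</code> function are
+hypothetical names, and types are represented as plain strings.  It lists
+the construction cost of every derivation, so a result with exactly one
+entry indicates an unambiguous promotion.</p>

```python
# Polymorphic ("dotted") promotions a-f: source -> type named in the assertion.
POLY = {
    "double _Complex": "long double _Complex",      # a
    "float _Complex": "double _Complex",            # b
    "double": "long double",                        # c
    "float": "double",                              # d
    "double _Imaginary": "long double _Imaginary",  # e
    "float _Imaginary": "double _Imaginary",        # f
}

# Monomorphic ("solid") promotions g-l, as (source, target) pairs.
MONO = {
    ("long double", "long double _Complex"),             # g
    ("long double _Imaginary", "long double _Complex"),  # h
    ("double", "double _Complex"),                       # i
    ("double _Imaginary", "double _Complex"),            # j
    ("float", "float _Complex"),                         # k
    ("float _Imaginary", "float _Complex"),              # l
}


def promotion_costs(source, target):
    """Return the sorted construction costs of every way to build target from source."""
    costs = []
    if source == target:
        costs.append(0)          # copy constructor
    if (source, target) in MONO:
        costs.append(1)          # monomorphic constructor
    if source in POLY:
        # A polymorphic promotion costs 1 plus the cost of whatever is bound
        # to its assertion parameter: a constructor of target from the next type up.
        costs.extend(1 + c for c in promotion_costs(POLY[source], target))
    return sorted(costs)


print(promotion_costs("float", "long double _Complex"))  # [3]
print(promotion_costs("float", "double _Complex"))       # [2]
```

+<p>In this model, <code>float</code> reaches <code>double _Complex</code>
+only through <code>double</code>, because the "across" promotion <i>k</i>
+is monomorphic and cannot be followed by a further step.</p>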
+
+<h3 id='largeint'>Large Integer Types</h3>
+
+<p class='rationale'>The conversions for the integer types cannot be
+defined by a simple list, because the set of integer types is
+implementation-defined, the range of each type is implementation-defined,
+and the set of promotions depends on whether a particular signed type can
+represent all values of a particular unsigned type.  As I read the C
+standard, every signed type has a matching unsigned type, but the reverse
+is not true.  This complicates the definitions below.</p>
+
+<ul>
+  <li>Let the <dfn>rank</dfn> of an integer type be the integer conversion
+      rank defined in C99, with the added condition that the ranks form a
+      continuous sequence of integers.</li>
+  <li>Let <dfn><i>r</i><sub>int</sub></dfn> be the rank of
+      <code>int</code>.</li>
+  <li>Let <dfn><code>signed(<i>r</i>)</code></dfn> and
+      <dfn><code>unsigned(<i>r</i>)</code></dfn>
+      be the signed integer type and unsigned integer type with rank
+      <i>r</i>.</li>
+</ul>
+
+<p>Integers promote upward to floating-point types.  Let <i>SMax</i> be the
+highest ranking signed integer type, and let <i>UMax</i> be the highest
+ranking unsigned integer type.  Then Cforall would define</p>
+
+<pre>
+Promoter(float, <i>SMax</i>);
+Promoter(float, <i>UMax</i>);
+</pre>
+
+<p>Signed types promote across to unsigned types with the same rank.  For
+every <i>r</i> &gt;= <i>r</i><sub>int</sub> such that
+<code>signed(<i>r</i>)</code> exists, Cforall would define</p>
+
+<pre>
+void (?promote)?( unsigned(<i>r</i>)*, signed(<i>r</i>) );
+</pre>
+
+<p>Lower-ranking signed integers promote to higher-ranking signed integers.
+For every signed integer type <i>T</i> with rank greater than
+<i>r</i><sub>int</sub>, let <i>S</i> be the signed integer type with the
+next lowest rank.  Then Cforall would define</p>
+
+<pre>
+Promoter(<i>T</i>, <i>S</i>);
+</pre>
+
+<p>Similarly, lower-ranking unsigned integers promote to higher-ranking
+unsigned integers.  For every <i>r</i> &gt; <i>r</i><sub>int</sub>, Cforall
+would define</p>
+
+<pre>
+Promoter(unsigned(<i>r</i>), unsigned(<i>r</i>-1));
+</pre>
+
+<p>C's usual arithmetic conversions may promote an unsigned type to a
+signed type, but only if the signed type can represent every value of the
+unsigned type.  For every <i>r</i> &gt;= <i>r</i><sub>int</sub>, if there
+are any signed types that can represent every value in
+<code>unsigned(<i>r</i>)</code>, let <i>S</i> be the
+lowest ranking of these types; then Cforall would define</p>
+
+<pre>
+Promoter(<i>S</i>, unsigned(<i>r</i>));
+</pre>
+
+<h3 id='intpromo'>C's "Integer Promotions"</h3>
+
+<div class='rationale'>
+
+<p>C's <dfn>integer promotions</dfn> apply to "small" types (those with
+rank less than <i>r</i><sub>int</sub>): they promote to <code>int</code> if
+<code>int</code> can hold all of their values, and to <code>unsigned
+int</code> otherwise.  At least one unsigned type, <code>_Bool</code>,
+will promote to <code>int</code>.  This breaks the pattern set by the usual
+arithmetic conversions, where unsigned types always promote to the next
+larger unsigned type.  Consider a machine with 32-bit <code>int</code>s and
+16-bit <code>unsigned short</code>s: if two <code>unsigned short</code>s
+are added, they must be promoted to <code>int</code> instead of
+<code>unsigned int</code>.  Hence for this machine there must <em>not</em>
+be a promotion from <code>unsigned short</code> to <code>unsigned
+int</code>.</p>
+
+<p>Since the C integer promotions always promote small signed types to
+<code>int</code>, Cforall would extend the chain of polymorphic "upward"
+and monomorphic "across" signed integer promotions to the small
+signed types.</p>
+</div>
+
+<p>For every signed integer type <i>S</i> with rank less than
+<i>r</i><sub>int</sub>, Cforall would define</p>
+<pre>
+Promoter(<i>T</i>, <i>S</i>);
+</pre>
+<p>where <i>T</i> is the signed integer type with the next highest
+rank.</p>
+
+<p>Let <i>r</i><sub>break</sub> be the rank of the highest-ranking unsigned
+type whose values can all be represented by <code>int</code>, and let
+<i>T</i> be the lowest-ranking signed type that can represent all of the
+values of <code>unsigned(<i>r</i><sub>break</sub>)</code>.  Cforall would
+define</p>
+
+<pre>
+Promoter(<i>T</i>, unsigned(<i>r</i><sub>break</sub>));
+</pre>
+
+<p>For every
+<i>r</i> less than <i>r</i><sub>int</sub> except
+<i>r</i><sub>break</sub>, Cforall would define</p>
+<pre>
+Promoter(unsigned(<i>r</i>+1), unsigned(<i>r</i>));
+</pre>
+
+<p class='rationale'><i>r</i><sub>break</sub> is the point where the normal
+pattern of unsigned promotion breaks.  Unsigned types with higher rank
+promote upward toward <code>unsigned int</code>.  Unsigned types with
+lower rank promote upward to the type at the break, which promotes upward
+to a signed type and onward toward <code>int</code>.</p>
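+<p>To make the role of <i>r</i><sub>break</sub> concrete, the following
+Python sketch derives the small unsigned promotions from a table of type
+widths.  Everything in it is an assumption made for illustration: the
+<code>SMALL_TYPES</code> table describes one common configuration (8-bit
+<code>char</code>, 16-bit <code>short</code>, 32-bit <code>int</code>, no
+padding bits), ranks are numbered from 0, and the function names are
+hypothetical.</p>

```python
# (rank, unsigned name, unsigned bits, signed name, signed bits).
# _Bool has no signed counterpart. All widths are illustrative assumptions.
SMALL_TYPES = [
    (0, "_Bool", 1, None, None),
    (1, "unsigned char", 8, "signed char", 8),
    (2, "unsigned short", 16, "short", 16),
    (3, "unsigned int", 32, "int", 32),
]
R_INT = 3


def fits(unsigned_bits, signed_bits):
    """True if a signed type holds every value of the unsigned type (no padding bits assumed)."""
    return signed_bits - 1 >= unsigned_bits


def small_unsigned_promoters(types=SMALL_TYPES, r_int=R_INT):
    """Return r_break and the Promoter declarations for unsigned types below int."""
    by_rank = {t[0]: t for t in types}
    int_bits = by_rank[r_int][4]
    # r_break: highest-ranking unsigned type whose values all fit in int.
    r_break = max(r for r, _, ubits, _, _ in types if fits(ubits, int_bits))
    # T: lowest-ranking signed type that can represent unsigned(r_break).
    break_bits = by_rank[r_break][2]
    target = next(sname for _, _, _, sname, sbits in types
                  if sname is not None and fits(break_bits, sbits))
    decls = [f"Promoter({target}, {by_rank[r_break][1]});"]
    # Every other small unsigned type promotes to the next unsigned rank.
    for r in range(r_int):
        if r != r_break:
            decls.append(f"Promoter({by_rank[r + 1][1]}, {by_rank[r][1]});")
    return r_break, decls


r_break, decls = small_unsigned_promoters()
print(r_break)
for d in decls:
    print(d)
```

+<p>With these widths the break falls at <code>unsigned short</code>, which
+promotes to <code>int</code>, matching the 32-bit example in the rationale
+above.</p>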
+
+<p>For each <i>r</i> &lt; <i>r</i><sub>int</sub> such that
+<code>signed(<i>r</i>)</code> exists, Cforall would define</p>
+<pre>
+void (?promote)?(unsigned(<i>r</i>)*, signed(<i>r</i>));
+</pre>
+
+<p class='rationale'>These "across" promotions are not strictly necessary,
+but it seems useful to extend the pattern of signed-to-unsigned monomorphic
+conversions established by the larger integer types.  Note that because of
+these promotions, <code>unsigned(<i>r</i><sub>break</sub>)</code> does
+promote to the next larger unsigned type, after a detour through a signed
+type that increases the conversion cost.</p>
+
+<p>Finally, <code>char</code> is equivalent to <code>signed char</code> or
+<code>unsigned char</code>, on an implementation-defined basis.  If
+<code>char</code> is equivalent to <code>signed char</code>, the
+implementation would define</p>
+
+<pre>
+Promoter(signed char, char);
+</pre>
+
+<p>Otherwise, it would define</p>
+<pre>
+Promoter(unsigned char, char);
+</pre>
+
+<h2 id="otherpromo">Other Promotions</h2>
+<p>Promotions can add qualifiers to the pointed-to type of a
+pointer type.</p>
+<pre>
+forall(dtype DT) void (?promote)?(const DT**, DT*);
+forall(dtype DT) void (?promote)?(volatile DT**, DT*);
+forall(dtype DT) void (?promote)?(restrict DT**, DT*);
+forall(dtype DT) void (?promote)?(const volatile DT**, DT*);
+forall(dtype DT) void (?promote)?(const restrict DT**, DT*);
+forall(dtype DT) void (?promote)?(volatile restrict DT**, DT*);
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, DT*);
+
+forall(dtype DT) void (?promote)?(const volatile DT**, const DT*);
+forall(dtype DT) void (?promote)?(const restrict DT**, const DT*);
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, const DT*);
+
+forall(dtype DT) void (?promote)?(const volatile DT**, volatile DT*);
+forall(dtype DT) void (?promote)?(volatile restrict DT**, volatile DT*);
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, volatile DT*);
+
+forall(dtype DT) void (?promote)?(const restrict DT**, restrict DT*);
+forall(dtype DT) void (?promote)?(volatile restrict DT**, restrict DT*);
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, restrict DT*);
+
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, const volatile DT*);
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, const restrict DT*);
+forall(dtype DT) void (?promote)?(const volatile restrict DT**, volatile restrict DT*);
+</pre>
+
+<p class='rationale'> The type qualifier promotions are simple, but verbose
+because Cforall doesn't abstract over type qualifiers very well.  They also
+give <em>every</em> type qualifier promotion a cost of 1.  It is possible
+to define a smaller set of promotions, some using the constructor idiom,
+that gives greater cost to promotions that add more qualifiers, but the set
+is arbitrary and asymmetric: only one of the three promotions that add one
+qualifier to an unqualified pointer type can use the constructor idiom, or
+else ambiguity results.</p>
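+<p>The listing above is easier to check mechanically than by eye.  The
+Python sketch below enumerates every pair of qualifier sets in which the
+target strictly adds qualifiers to the source, and spells each pair in the
+listing's notation.  The function names are hypothetical, and the sketch
+assumes that "one promotion per strict superset" is the intended rule.</p>

```python
from itertools import combinations

QUALIFIERS = ("const", "volatile", "restrict")


def qualifier_sets():
    """All 8 subsets of the three qualifiers, smallest first."""
    return [frozenset(c)
            for n in range(len(QUALIFIERS) + 1)
            for c in combinations(QUALIFIERS, n)]


def qualifier_promotions():
    """(source, target) pairs where target strictly adds qualifiers to source."""
    sets = qualifier_sets()
    return [(s, t) for s in sets for t in sets if s < t]   # '<' is proper subset


def spell(quals, pointee="DT"):
    """Render a qualified pointer type using the listing's word order."""
    words = [q for q in QUALIFIERS if q in quals]
    return " ".join(words + [pointee]) + "*"


promotions = qualifier_promotions()
print(len(promotions))  # 19
for source, target in promotions[:3]:
    print(f"forall(dtype DT) void (?promote)?({spell(target)}*, {spell(source)});")
```

+<p>The count of 19 agrees with the number of declarations in the
+listing: 27 subset pairs in total, less the 8 pairs in which source and
+target are identical.</p>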
+
+<p>Within the scope of a type definition <code>type <i>T1</i> =
+<i>T2</i>;</code>, constructors would convert between the new type and its
+implementation type.</p>
+
+<pre>
+void (?promote)?(<i>T2</i>*, <i>T1</i>);
+void (?promote)?(<i>T2</i>**, <i>T1</i>*);
+void (?create)?(<i>T1</i>*, <i>T2</i>);
+void (?create)?(<i>T1</i>**, <i>T2</i>*);
+</pre>
+
+<p class='rationale'>The conversion from the new type
+<code><i>T1</i></code> to the implementation type <code><i>T2</i></code>
+gives functions that implement operations on <code><i>T1</i></code> access
+to the type's implementation.  The conversion is a promotion because most
+such functions work with the implementation most of the time.  The reverse
+conversion is merely implicit, so that mixed operations won't be
+ambiguous.</p>
+
+<h2 id="demotions">Other Pre-Defined Implicit Conversions</h2>
+<h3>Arithmetic Conversions</h3>
+
+<p class='rationale'>C defines implicit conversions between any two
+arithmetic types.  In Cforall terms, the conversions that are not
+promotions are ordinary conversions.  Most of the ordinary conversions
+follow a pattern that looks like the <a href='#usual'>Usual Arithmetic
+Conversions</a> in reverse.  Once again, I will use a macro to hide details
+of the constructor idiom.</p>
+
+<pre>
+#define Creator(Target, Source) \
+  forall(type T | void (?create)?(T*, Target)) void (?create)?(T*, Source)
+
+Creator(double _Complex, long double _Complex);
+Creator(float _Complex,  double _Complex);
+Creator(double, long double);
+Creator(float,  double);
+Creator(double _Imaginary, long double _Imaginary);
+Creator(float _Imaginary,  double _Imaginary);
+
+void (?create)?(long double*,            long double _Complex);
+void (?create)?(long double _Imaginary*, long double _Complex);
+void (?create)?(double*,            double _Complex);
+void (?create)?(double _Imaginary*, double _Complex);
+void (?create)?(float*,            float _Complex);
+void (?create)?(float _Imaginary*, float _Complex);
+</pre>
+
+<p class='rationale'>The C99 draft standards that I have access to state
+that real types and imaginary types are implicitly interconvertible.  This
+seems like a mistake, since the result of the conversion will always be
+zero, but ...</p>
+
+<pre>
+void (?create)?(long double*, long double _Imaginary);
+void (?create)?(long double _Imaginary*, long double);
+void (?create)?(double*, double _Imaginary);
+void (?create)?(double _Imaginary*, double);
+void (?create)?(float*, float _Imaginary);
+void (?create)?(float _Imaginary*, float);
+</pre>
+
+<p>Let <i>SMax</i> be the highest ranking signed integer type, and let
+<i>UMax</i> be the highest ranking unsigned integer type.  Then Cforall
+would define</p>
+
+<pre>
+Creator(<i>SMax</i>, float);
+Creator(<i>SMax</i>, float _Complex);
+Creator(<i>SMax</i>, float _Imaginary);
+Creator(<i>UMax</i>, float);
+Creator(<i>UMax</i>, float _Complex);
+Creator(<i>UMax</i>, float _Imaginary);
+</pre>
+
+<p>For every signed integer type <i>T</i> with rank greater than that of
+<code>signed char</code>, Cforall would define</p>
+
+<pre>
+Creator(<i>S</i>, <i>T</i>);
+</pre>
+<p>where <i>S</i> is the signed integer type with the next lowest rank.</p>
+
+<p>For every rank <i>r</i> greater than the rank of <code>_Bool</code>,
+Cforall would define</p>
+
+<pre>
+Creator(unsigned(<i>r</i>-1), unsigned(<i>r</i>));
+</pre>
+
+<p>For every rank <i>r</i> such that <code>signed(<i>r</i>)</code> exists,
+Cforall would define</p>
+
+<pre>
+void (?create)?( signed(<i>r</i>)*, unsigned(<i>r</i>) );
+</pre>
+
+<p><code>char</code> and <code>_Bool</code> are interconvertible.</p>
+<pre>
+void (?create)?(char*, _Bool);
+void (?create)?(_Bool*, char);
+</pre>
+
+<p>If <code>char</code> is equivalent to <code>signed char</code>, the
+implementation would define</p>
+
+<pre>
+Creator(char, signed char);
+void (?create)?(char*, unsigned char);
+</pre>
+
+<p>Otherwise, the implementation would define</p>
+<pre>
+Creator(char, unsigned char);
+void (?create)?(char*, signed char);
+void (?create)?(_Bool*, signed char);
+void (?create)?(signed char*, _Bool);
+</pre>
+
+<h3>Pointer conversions</h3>
+
+<p>Pointer types are implicitly interconvertible with pointers to void,
+provided that the target type has all of the qualifiers of the source
+type.</p>
+
+<pre>
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, void*))
+  void (?create)?(QVPtr*, SourceType*);
+</pre>
+
+<p class='rationale'>This conversion uses the constructor idiom, but note
+that the assertion parameter is a promotion even though the conversion
+itself is not a promotion.  My intent is that the assertion parameter will
+be bound to a promotion that adds <a href='#otherpromo'>type qualifiers</a>
+to a pointer type.  A conversion from <code>int*</code> to <code>const
+void*</code> would bind <code>SourceType</code> to <code>int</code>,
+<code>QVPtr</code> to <code>const void*</code>, and the assertion parameter
+to a promotion from <code>void*</code> to <code>const void*</code> (which
+is a specialization of one of the polymorphic type qualifier promotions
+given above).  Because of this composition of pointer conversions, I don't
+have to define conversions for every combination of type qualifiers on the
+target type.  I do have to handle all combinations of qualifiers on the
+source type:</p>
+
+<pre>
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, const void*))
+  void (?create)?(QVPtr*, const SourceType*);
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, volatile void*))
+  void (?create)?(QVPtr*, volatile SourceType*);
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, restrict void*))
+  void (?create)?(QVPtr*, restrict SourceType*);
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, const volatile void*))
+  void (?create)?(QVPtr*, const volatile SourceType*);
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, const restrict void*))
+  void (?create)?(QVPtr*, const restrict SourceType*);
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, volatile restrict void*))
+  void (?create)?(QVPtr*, volatile restrict SourceType*);
+forall(dtype SourceType,
+       type QVPtr | void (?promote)?(QVPtr*, const volatile restrict void*))
+  void (?create)?(QVPtr*, const volatile restrict SourceType*);
+
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, TargetType*))
+  void (?create)?(QTPtr*, void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, const TargetType*))
+  void (?create)?(QTPtr*, const void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, volatile TargetType*))
+  void (?create)?(QTPtr*, volatile void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, restrict TargetType*))
+  void (?create)?(QTPtr*, restrict void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, const volatile TargetType*))
+  void (?create)?(QTPtr*, const volatile void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, const restrict TargetType*))
+  void (?create)?(QTPtr*, const restrict void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, volatile restrict TargetType*))
+  void (?create)?(QTPtr*, volatile restrict void*);
+forall(type QTPtr,
+       dtype TargetType | void (?promote)?(QTPtr*, const volatile restrict TargetType*))
+  void (?create)?(QTPtr*, const volatile restrict void*);
+</pre>
+
+<h2 id="explicit">Pre-Defined Explicit Conversions</h2>
+<p>Function pointers are interconvertible.</p>
+<pre>
+forall(ftype FT1, ftype FT2, type T | FT1* (?)?(T) ) FT2* (?)?(FT1*);
+</pre>
+
+<p>Data pointers including pointers to <code>void</code> are
+interconvertible, regardless of type qualifiers.</p>
+
+<pre>
+forall(dtype DT1, dtype DT2) DT2*                (?)?(DT1*);
+forall(dtype DT1, dtype DT2) const DT2*          (?)?(DT1*);
+forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(DT1*);
+forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(DT1*);
+
+forall(dtype DT1, dtype DT2) DT2*                (?)?(const DT1*);
+forall(dtype DT1, dtype DT2) const DT2*          (?)?(const DT1*);
+forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(const DT1*);
+forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(const DT1*);
+
+forall(dtype DT1, dtype DT2) DT2*                (?)?(volatile DT1*);
+forall(dtype DT1, dtype DT2) const DT2*          (?)?(volatile DT1*);
+forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(volatile DT1*);
+forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(volatile DT1*);
+
+forall(dtype DT1, dtype DT2) DT2*                (?)?(const volatile DT1*);
+forall(dtype DT1, dtype DT2) const DT2*          (?)?(const volatile DT1*);
+forall(dtype DT1, dtype DT2) volatile DT2*       (?)?(const volatile DT1*);
+forall(dtype DT1, dtype DT2) const volatile DT2* (?)?(const volatile DT1*);
+</pre>
+
+<p>Integers and pointers are interconvertible.  For every integer type
+<i>I</i> define</p>
+<pre>
+forall(dtype DT, type T | <i>I</i> (?)?(T) ) DT* (?)?(T);
+forall(ftype FT, type T | <i>I</i> (?)?(T) ) FT* (?)?(T);
+
+forall(dtype DT, type T | DT* (?)?(T) ) <i>I</i> (?)?(T);
+forall(ftype FT, type T | FT* (?)?(T) ) <i>I</i> (?)?(T);
+</pre>
+
+<h2 id='nonconversions'>Non-Conversions</h2>
+<p>C99 has a few other "conversions" that don't fit into this proposal.
+Outside of some special circumstances (such as application of
+<code>sizeof</code>),</p>
+<ul>
+  <li>array lvalues "convert" to pointers</li>
+  <li>function designators "convert" to pointers to functions</li>
+  <li>non-array lvalues "convert" to plain values</li>
+  <li>bit fields undergo "integer promotion" to <code>int</code> or
+      <code>unsigned int</code> values.</li>
+</ul>
+
+<p>I'd like to stop calling these "conversions".  Perhaps they could be
+handled by some verbiage in the semantics of "Primary Expressions".</p>
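+
+<p>A minimal sketch in plain C (not Cforall) of three of these cases.  The
+function names are hypothetical and chosen only for illustration; the sketch
+relies on <code>sizeof</code> being one of the special circumstances in
+which an array lvalue is not turned into a pointer.</p>

```c
#include <stddef.h>

/* Operand of sizeof: the array lvalue is NOT turned into a pointer,
   so the result is the size of the whole array. */
size_t size_of_array(void) {
    int a[8];
    return sizeof a;
}

/* Ordinary expression context: the array lvalue "converts" to a pointer
   to its first element, so only a pointer's size is reported. */
size_t size_of_decayed(void) {
    int a[8];
    int *p = a;
    return sizeof p;
}

/* A narrow unsigned bit field undergoes integer promotion to int
   before it takes part in arithmetic. */
size_t size_of_promoted_bit_field(void) {
    struct { unsigned f : 3; } s = { 7 };
    return sizeof(s.f + 1);
}
```

+<p>None of these involves a user-visible conversion function, which is why
+they sit awkwardly beside the conversions defined earlier.</p>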
+
+<p>Cforall-as-is provides "specialization", which reduces the number of
+type parameters or assertion parameters of a polymorphic object or
+function.  Specialization looks like a conversion -- it can happen
+implicitly or as a result of a cast -- but would no longer be considered to
+be a conversion.</p>
+
+<h2 id='assignment'>Assignment</h2>
+
+<p>Since extended Cforall separates conversion from assignment, it can
+simplify Cforall-as-is's set of assignment operators.  Implicit conversions
+can add type qualifiers to the target's type, and to the source's type in
+the case of pointer assignment.</p>
+
+<pre>
+char ?=?(volatile char*, char);
+char ?+=?(volatile char*, char);
+// <i>... and similarly for the rest of the basic types and</i>
+// <i>compound assignment operators.</i>
+</pre>
+<pre class='rationale'>
+char c;
+c = 'a';  // <i>=&gt; ?=?( &amp;c, 'a' );</i>
+          // <i>=&gt; ?=?( (volatile char*)&amp;c, 'a' );</i>
+</pre>
+
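+<p>As a rough plain-C analogue (an assumption for illustration, not the
+Cforall definition), a single function taking a pointer to
+<code>volatile</code> serves both volatile and non-volatile targets, because
+C adds the qualifier implicitly at the call.  The names
+<code>assign_char</code> and <code>demo</code> are hypothetical.</p>

```c
/* Plain-C stand-in for  char ?=?(volatile char*, char). */
char assign_char(volatile char *target, char value) {
    *target = value;   /* store through the volatile-qualified pointer */
    return value;      /* assignment yields the assigned value */
}

char demo(void) {
    char c = 'x';
    assign_char(&c, 'a');   /* char* -> volatile char*: qualifier added implicitly */
    return c;
}
```

+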
+<pre>
+// <i>Assignment between data pointers, where the target has all of</i>
+// <i>the qualifiers of the source.</i>
+forall(dtype DT)
+  DT* ?=?(DT* volatile restrict*, DT*);
+forall(dtype DT)
+  const DT* ?=?(const DT* volatile restrict*, const DT*);
+forall(dtype DT)
+  volatile DT* ?=?(volatile DT* volatile restrict*, volatile DT*);
+forall(dtype DT)
+  const volatile DT* ?=?(const volatile DT* volatile restrict*, const volatile DT*);
+
+// <i>Assignment to data pointers from </i>void<i> pointers.</i>
+forall(dtype DT)
+  DT* ?=?(DT* volatile restrict*, void*);
+forall(dtype DT)
+  const DT* ?=?(const DT* volatile restrict*, const void*);
+forall(dtype DT)
+  volatile DT* ?=?(volatile DT* volatile restrict*, volatile void*);
+forall(dtype DT)
+  const volatile DT* ?=?(const volatile DT* volatile restrict*, const volatile void*);
+
+// <i>Assignment to </i>void<i> pointers from data pointers.</i>
+forall(dtype DT)
+  void* ?=?(void* volatile restrict*, DT*);
+forall(dtype DT)
+  const void* ?=?(const void* volatile restrict*, const DT*);
+forall(dtype DT)
+  volatile void* ?=?(volatile void* volatile restrict*, volatile DT*);
+forall(dtype DT)
+  const volatile void* ?=?(const volatile void* volatile restrict*, const volatile DT*);
+
+// <i>Assignment from null pointers to other pointer types.</i>
+forall(dtype DT)
+  DT* ?=?(DT* volatile restrict*, forall(dtype DT2) const DT2*);
+forall(dtype DT)
+  const DT* ?=?(const DT* volatile restrict*, forall(dtype DT2) const DT2*);
+forall(dtype DT)
+  volatile DT* ?=?(volatile DT* volatile restrict*, forall(dtype DT2) const DT2*);
+forall(dtype DT)
+  const volatile DT* ?=?(const volatile DT* volatile restrict*, forall(dtype DT2) const DT2*);
+
+// <i>Function pointer assignment</i>
+forall(ftype FT) FT* ?=?(FT* volatile restrict*, FT*);
+forall(ftype FT) FT* ?=?(FT* volatile restrict*, forall(ftype FT2) FT2*);
+</pre>
+
+<div class='rationale'>
+
+<p>The difference, relative to Cforall-as-is, is that assignment operators
+come in one flavor (a pointer to a volatile value as the first operand)
+instead of two (a pointer to volatile in one case, a plain pointer in the
+other) or the four that <code>restrict</code> would have led to.</p>
+
+<p>However, to make this work, the type of <dfn>default assignment</dfn>
+functions must also change.  A declaration of a type <code>T</code> would
+implicitly declare</p> <pre> T ?=?(T volatile restrict*, T) </pre> </div>
+
+<h2 id='final'>Final Notes</h2>
+
+<p>The <a href='#idiom'>constructor idiom</a> is polymorphic in the
+object's type: an initial value of one particular type can initialize
+objects of many types.  The constructor that promotes a <code>Wazzit</code>
+into a <code>Thingum</code> is declared</p>
+
+<pre>
+forall(type T | void (?promote)?(T*, Thingum) )
+  void (?promote)?(T*, Wazzit);
+</pre>
+<p>("You can make a <code>Wazzit</code> into a <code>Thingum</code> and
+types higher in the hierarchy.")</p>
+
+<p>It would also be possible to use a constructor idiom where the object's
+type is fixed and the initial value's type is polymorphic:</p>
+
+<pre>
+forall(type T | void (?promote)?(Wazzit*, T) )
+  void (?promote)?(Thingum*, T);
+</pre>
+<p>("You can make a <code>Thingum</code> from a <code>Wazzit</code> and
+types lower in the hierarchy.")</p>
+
+<p>The "polymorphic value" idiom has the advantage that it is fairly
+obvious that the function is a constructor for type <code>Thingum</code>.
+In the "polymorphic object" idiom, <code>Thingum</code> is buried in the
+assertion parameter.</p>
+
+<p>However, I chose the "polymorphic object" idiom because it matches C's
+semantics for signed-to-unsigned integer conversions.  In the "polymorphic
+object" idiom, the natural way to write the polymorphic promoter from
+<code>int</code> to larger types is 
+</p>
+
+<pre>
+forall(type T | void (?promote)?(T*, long) )
+  void (?promote)?(T* tp, int i) {
+    long l = i;
+    *tp = (T)l;    // <i>calls the assertion parameter.</i>
+    }
+</pre>
+
+<p>Now consider the case of a CPU with 16-bit <code>int</code>s, where we
+need to convert an <code>int</code> value <code>-1</code> to a 32-bit
+<code>unsigned long</code>.  The assertion parameter will be bound to the
+monomorphic <code>long</code>-to-<code>unsigned long</code> promoter.  The
+function body above converts the <code>int</code> -1 to a <code>long</code>
+-1, and then uses the assertion parameter to convert the result to the
+correct <code>unsigned long</code> value: 4,294,967,295.</p>
+
+<p>In the "polymorphic value" idiom, the conversion would be done by
+calling the polymorphic promoter to <code>unsigned long</code> from smaller
+types:</p>
+
+<pre>
+forall(type T | void (?promote)?(unsigned*, T) )
+  void (?promote)?(unsigned long* ulp, T t) {
+    unsigned u = t;    // <i>calls the assertion parameter.</i>
+    *ulp = u;
+    }
+</pre>
+
+<p>This time the assertion parameter will be bound to the
+<code>int</code>-to-<code>unsigned</code> promoter.  The function body uses
+the assertion parameter to convert the integer -1 to <code>unsigned</code>
+65,535, and then converts the result to the incorrect <code>unsigned
+long</code> value 65,535.</p>
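+
+<p>The arithmetic of the two idioms can be checked in plain C.  This is only
+a sketch: the fixed-width types from <code>stdint.h</code> stand in for the
+16-bit <code>int</code> and 32-bit <code>long</code> of the CPU described
+above, and the function names are hypothetical.</p>

```c
#include <stdint.h>

/* "Polymorphic object" path: widen within the signed types first,
   then convert to the unsigned target. */
uint32_t via_signed_long(int16_t i) {
    int32_t l = i;          /* int -> long: value preserved (-1 stays -1) */
    return (uint32_t)l;     /* long -> unsigned long: reduced modulo 2^32 */
}

/* "Polymorphic value" path: convert to unsigned at the narrow width first,
   then widen. */
uint32_t via_unsigned(int16_t i) {
    uint16_t u = (uint16_t)i;   /* int -> unsigned: reduced modulo 2^16 */
    return u;                   /* unsigned -> unsigned long: value preserved */
}
```

+<p>For <code>-1</code> the first path yields 4,294,967,295 and the second
+yields 65,535; for non-negative inputs the two paths agree.</p>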
+
+<p>Clearly the "polymorphic value" idiom would require the implementation
+to do some unnatural, and probably implementation-dependent, bit mangling
+to get the right answer.  Of course, an implementation is allowed to
+perform any unnatural acts it chooses.  But programmers would have to
+conform to the prevailing constructor idiom when writing their
+constructors, and will want to write natural and portable code.</p>
+
+<!--
+Multi-argument constructors and {...} notation.  Default and keyword
+parameters?
+
+mutable.
+
+Automating implementation-dependent promotion, so new types can fit in
+easily.
+
+Cast has no cost; implicit construction does.
+
+Allow instantiation of dtype/incomplete type if the type has a constructor? 
+The problem is space allocation: constructors would have to allocate space,
+which would interfere with their use in dynamic allocation.
+
+generic function that treats promoters as creators might cause loops when
+chaining creators.
+
+-->
+</body>
+</html>
Index: src/tests/preempt_longrun/Makefile.am
===================================================================
--- src/tests/preempt_longrun/Makefile.am	(revision 7bdcac1f3631ee6408a86b4d0321433114fee6d3)
+++ src/tests/preempt_longrun/Makefile.am	(revision 3fc59bdb3dcc1b79be061440b40a2cebb66b85f3)
@@ -45,5 +45,5 @@
 
 clean-local:
-	rm -f ${TESTS}
+	rm -f ${TESTS} core* out.log
 
 % : %.c ${CC}
Index: src/tests/preempt_longrun/Makefile.in
===================================================================
--- src/tests/preempt_longrun/Makefile.in	(revision 7bdcac1f3631ee6408a86b4d0321433114fee6d3)
+++ src/tests/preempt_longrun/Makefile.in	(revision 3fc59bdb3dcc1b79be061440b40a2cebb66b85f3)
@@ -889,5 +889,5 @@
 
 clean-local:
-	rm -f ${TESTS}
+	rm -f ${TESTS} core* out.log
 
 % : %.c ${CC}
