\chapter{Concurrency in \CFA}\label{s:cfa_concurrency}

The groundwork for concurrency in \CFA was laid by Thierry Delisle in his Master's thesis~\cite{Delisle18}, which introduced coroutines, user-level threading, and monitors. Delisle also added other concurrency building blocks to \CFA that are not covered in that work, such as locks, futures, and condition variables.

\section{Threading Model}\label{s:threading}
\CFA provides user-level threading with an $M:N$ threading model, where $M$ user threads are scheduled on $N$ cores, and both $M$ and $N$ can be set explicitly by the user. A program uses a core by creating an instance of the \code{processor} struct. User thread types are defined with the \code{thread} keyword, which takes the place of the \code{struct} keyword in a type declaration. Each thread type must have a corresponding \code{main} routine, which is where a thread of that type starts running once it is created. Listing~\ref{l:cfa_thd_init} shows an example of processor and thread creation. Every program starts with one processor, and any processors the program creates are added alongside it. Thus, a program that wants $N$ processors needs to allocate $N-1$. A thread is joined by deallocating it: a heap-allocated thread is joined when it is deleted, and a stack-allocated thread is joined when it goes out of scope. The thread performing the deallocation waits for the thread being deallocated to terminate before the deallocation occurs. A thread terminates by returning from its \code{main} routine.

\begin{cfacode}[tabsize=3,caption={\CFA user thread and processor creation},label={l:cfa_thd_init}]

#include <thread.hfa>   // CFA threading support
#include <stdio.h>      // printf

thread my_thread {};    // user thread type
void main( my_thread & this ) { // thread start routine
    printf( "Hello threading world\n" );
}

int main() {
    // add 2 processors to the initial one: 3 total
    processor p[2];
    {
        my_thread t1;
        my_thread t2;
    } // block exit joins t1 and t2: waits for both to terminate
}

\end{cfacode}
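
The heap-allocated case described above can be sketched in the same way. Listing~\ref{l:cfa_thd_heap} is a minimal illustration and not taken from the \CFA sources: it assumes that the \code{new} and \code{delete} routines from \code{stdlib.hfa} are used for dynamic allocation, and the names \code{heap_thread} and \code{t} are hypothetical. It requests four processors in total by allocating three, and joins the thread by deleting it.

\begin{cfacode}[tabsize=3,caption={Sketch of joining a heap-allocated \CFA user thread},label={l:cfa_thd_heap}]
#include <thread.hfa>   // CFA threading support
#include <stdlib.hfa>   // new, delete (assumed allocation routines)
#include <stdio.h>      // printf

thread heap_thread {};  // hypothetical user thread type
void main( heap_thread & this ) { // thread start routine
    printf( "Hello from a heap-allocated thread\n" );
}

int main() {
    // 4 processors wanted: allocate 3, plus the initial one
    processor p[3];
    heap_thread * t = new(); // thread starts running once created
    delete( t );             // join: waits for t to terminate, then frees it
}
\end{cfacode}

The call to \code{delete} plays the same role as the closing brace in Listing~\ref{l:cfa_thd_init}: the deallocating thread blocks until the thread's \code{main} returns.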