[14e1053] | 1 | % ====================================================================== |
---|
| 2 | % ====================================================================== |
---|
| 3 | \chapter{Conclusion}\label{s:conclusion} |
---|
| 4 | % ====================================================================== |
---|
| 5 | % ====================================================================== |
---|
[c1b6bc6] | 6 | |
---|
[79b05224] | 7 | The goal of this thesis is to expand concurrent support in \CFA to fill in gaps and increase support for writing safe and efficient concurrent programs. |
---|
[0f8b1a7] | 8 | The presented features achieve this goal and provide users with the means to write scalable concurrent programs in \CFA through multiple avenues. |
---|
| 9 | Additionally, the tools presented provide safety and productivity features including: detection of deadlock and other common concurrency errors, easy concurrent shutdown, and toggleable performance statistics. |
---|
[aae9c17] | 10 | |
---|
[79b05224] | 11 | For locking, the mutex statement provides a safe and easy-to-use interface for mutual exclusion. |
---|
[0f8b1a7] | 12 | If programmers prefer the message-passing paradigm, \CFA now supports it in the form of channels and actors. |
---|
| 13 | The @waituntil@ statement simplifies writing concurrent programs in both the message-passing and shared-memory paradigms of concurrency. |
---|
[79b05224] | 14 | Finally, no other programming language provides a synchronous multiplexing tool that is polymorphic over resources like \CFA's @waituntil@. |
---|
| 15 | This work successfully provides users with familiar concurrent-language support, but with additional value added over similar utilities in other popular languages. |
---|
| 16 | |
---|
| 17 | On overview of the contributions made in this thesis include the following: |
---|
[d3c3261] | 18 | \begin{enumerate} |
---|
[79b05224] | 19 | \item The mutex statement, which provides performant and deadlock-free multi-lock acquisition. |
---|
| 20 | \item Channels with comparable performance to Go, which have safety and productivity features including deadlock detection and an easy-to-use exception-based channel @close@ routine. |
---|
| 21 | \item An in-memory actor system, which achieves the lowest latency message send of systems tested due to the novel copy-queue data structure. |
---|
[9509d67a] | 22 | \item As well, the actor system has built-in detection of six common actor errors, with excellent performance compared to other systems across all benchmarks presented in this thesis. |
---|
| 23 | \item A @waituntil@ statement, which tackles the hard problem of allowing a thread wait synchronously for an arbitrary set of concurrent resources. |
---|
[d3c3261] | 24 | \end{enumerate} |
---|
| 25 | |
---|
[0f8b1a7] | 26 | The added features are now commonly used to solve concurrent problems in \CFA. |
---|
[79b05224] | 27 | The @mutex@ statement sees use across almost all concurrent code in \CFA, as it is the simplest mechanism for providing thread-safe input and output. |
---|
| 28 | The channels and the @waituntil@ statement see use in programs where a thread operates as a server or administrator, which accepts and distributes work among channels based on some shared state. |
---|
| 29 | When implemented, the polymorphic support of the @waituntil@ statement will see use with the actor system to enable user threads outside the actor system to wait for work to be done or for actors to finish. |
---|
| 30 | Finally, the new features are often combined, \eg channels pass pointers to shared memory that may still need mutual exclusion, requiring the @mutex@ statement to be used. |
---|
[1f10959] | 31 | |
---|
[d3c3261] | 32 | From the novel copy-queue data structure in the actor system and the plethora of user-supporting safety features, all these utilities build upon existing concurrent tooling with value added. |
---|
| 33 | Performance results verify that each new feature is comparable or better than similar features in other programming languages. |
---|
[0f8b1a7] | 34 | All in all, this suite of concurrent tools expands a \CFA programmer's ability to easily write safe and performant multi-threaded programs. |
---|
[14e1053] | 35 | |
---|
| 36 | \section{Future Work} |
---|
[c1b6bc6] | 37 | |
---|
[14e1053] | 38 | \subsection{Further Implicit Concurrency} |
---|
[c1b6bc6] | 39 | |
---|
[0f8b1a7] | 40 | This thesis only scratches the surface of implicit concurrency by providing an actor system. |
---|
[14e1053] | 41 | There is room for more implicit concurrency tools in \CFA. |
---|
[79b05224] | 42 | User-defined implicit concurrency in the form of annotated loops or recursive concurrent functions exists in other languages and libraries~\cite{uC++,OpenMP}. |
---|
[c1b6bc6] | 43 | Similar implicit concurrency mechanisms could be implemented and expanded on in \CFA. |
---|
| 44 | Additionally, the problem of automatic parallelism of sequential programs via the compiler is an interesting research space that other languages have approached~\cite{wilson94,haskell:parallel} and could be explored in \CFA. |
---|
[14e1053] | 45 | |
---|
[5e81a9c] | 46 | \subsection{Advanced Actor Stealing Heuristics} |
---|
[c1b6bc6] | 47 | |
---|
[79b05224] | 48 | In this thesis, two basic victim-selection heuristics are chosen when implementing the work-stealing actor-system. |
---|
| 49 | Good victim selection is an active area of work-stealing research, especially when taking into account NUMA effects and cache locality~\cite{barghi18,wolke17}. |
---|
[c1b6bc6] | 50 | The actor system in \CFA is modular and exploration of other victim-selection heuristics for queue stealing in \CFA could be provided by pluggable modules. |
---|
| 51 | Another question in work stealing is: when should a worker thread steal? |
---|
[79b05224] | 52 | Work-stealing systems can often be too aggressive when stealing, causing the cost of the steal to be have a negative rather positive effect on performance. |
---|
[c1b6bc6] | 53 | Given that thief threads often have cycles to spare, there is room for a more nuanced approaches when stealing. |
---|
| 54 | Finally, there is the very difficult problem of blocking and unblocking idle threads for workloads with extreme oscillations in CPU needs. |
---|
[14e1053] | 55 | |
---|
| 56 | \subsection{Synchronously Multiplexing System Calls} |
---|
[c1b6bc6] | 57 | |
---|
[79b05224] | 58 | There are many tools that try to synchronously wait for or asynchronously check I/O. |
---|
| 59 | Improvements in this area pay dividends in many areas of I/O based programming~\cite{linux:select,linux:poll,linux:epoll,linux:iouring}. |
---|
[14e1053] | 60 | Research on improving user-space tools to synchronize over I/O and other system calls is an interesting area to explore in the world of concurrent tooling. |
---|
[c1b6bc6] | 61 | Specifically, incorporating I/O into the @waituntil@ to allow a network server to work with multiple kinds of asynchronous I/O interconnects without using tradition event loops. |
---|
[14e1053] | 62 | |
---|
[c9019ce] | 63 | \subsection{Better Atomic Operations} |
---|
[c1b6bc6] | 64 | |
---|
| 65 | When writing low-level concurrent programs, especially lock/wait-free programs, low-level atomic instructions need to be used. |
---|
[c9019ce] | 66 | In C, the gcc-builtin atomics~\cite{gcc:atomics} are commonly used, but leave much to be desired. |
---|
[14e1053] | 67 | Some of the problems include the following. |
---|
| 68 | Archaic and opaque macros often have to be used to ensure that atomic assembly is generated instead of locks. |
---|
| 69 | The builtins are polymorphic, but not type safe since they use void pointers. |
---|
[c1b6bc6] | 70 | The semantics and safety of these builtins require careful navigation since they require the user to have a deep understanding of concurrent memory-ordering models. |
---|
[14e1053] | 71 | Furthermore, these atomics also often require a user to understand how to fence appropriately to ensure correctness. |
---|
[79b05224] | 72 | All these problems and more would benefit from language support in \CFA. |
---|
[c1b6bc6] | 73 | Adding good language support for atomics is a difficult problem, which if solved well, would allow for easier and safer writing of low-level concurrent code. |
---|
[930a800] | 74 | |
---|