Index: doc/theses/mubeen_zulfiqar_MMath/performance.tex
===================================================================
--- doc/theses/mubeen_zulfiqar_MMath/performance.tex	(revision cd1a5e82757d4c0f985e674d59731246eb66df2a)
+++ doc/theses/mubeen_zulfiqar_MMath/performance.tex	(revision 57af3f3fae95ef06a0cafe16bfd0bfa6e8f9cf62)
@@ -2,81 +2,82 @@
 \label{c:Performance}
 
+This chapter uses the micro-benchmarks from \VRef[Chapter]{s:Benchmarks} to test a number of current memory allocators, including llheap.
+The goal is to see if llheap is competitive with the current best memory allocators.
+
+
 \section{Machine Specification}
 
-The performance experiments were run on three different multicore systems to determine if there is consistency across platforms:
+The performance experiments were run on two different multi-core architectures (x86 and ARM) to determine if there is consistency across platforms:
 \begin{itemize}
 \item
-{\bf Nasus} AMD EPYC 7662, 64-core socket $\times$ 2, 2.0 GHz, GCC version 9.3.0
+\textbf{Nasus} AMD EPYC 7662, 64-core socket $\times$ 2, 2.0 GHz, GCC version 9.3.0
 \item
-{\bf Algol} Huawei ARM TaiShan 2280 V2 Kunpeng 920, 24-core socket $\times$ 4, 2.6 GHz, GCC version 9.4.0
+\textbf{Algol} Huawei ARM TaiShan 2280 V2 Kunpeng 920, 24-core socket $\times$ 4, 2.6 GHz, GCC version 9.4.0
 \end{itemize}
 
 
-\section{Existing Memory Allocators}\label{sec:curAllocatorSec}
-With dynamic allocation being an important feature of C, there are many stand-alone memory allocators that have been designed for different purposes. For this thesis, we chose 7 of the most popular and widely used memory allocators.
+\section{Existing Memory Allocators}
+\label{sec:curAllocatorSec}
+
+With dynamic allocation being an important feature of C, there are many stand-alone memory allocators that have been designed for different purposes.
+For this thesis, 7 of the most popular and widely used memory allocators were selected for comparison.
+
+\subsection{glibc}
+glibc~\cite{glibc} is the default gcc thread-safe allocator.
+\\
+\textbf{Version:} Ubuntu GLIBC 2.31-0ubuntu9.7 2.31\\
+\textbf{Configuration:} Compiled by Ubuntu 20.04.\\
+\textbf{Compilation command:} N/A
 
 \subsection{dlmalloc}
-dlmalloc (FIX ME: cite allocator with download link) is a thread-safe allocator that is single threaded and single heap. dlmalloc maintains free-lists of different sizes to store freed dynamic memory. (FIX ME: cite wasik)
+dlmalloc~\cite{dlmalloc} is a thread-safe allocator that is single-threaded and uses a single heap.
+It maintains free-lists of different sizes to store freed dynamic memory.
 \\
+\textbf{Version:} 2.8.6\\
+\textbf{Configuration:} Compiled with preprocessor @USE_LOCKS@.\\
+\textbf{Compilation command:} @gcc -g3 -O3 -Wall -Wextra -fno-builtin-malloc -fno-builtin-calloc@ @-fno-builtin-realloc -fno-builtin-free -fPIC -shared -DUSE_LOCKS -o libdlmalloc.so malloc-2.8.6.c@
+
+\subsection{hoard}
+Hoard~\cite{hoard} is a thread-safe allocator that is multi-threaded and uses a heap-layer framework. It has per-thread heaps that have thread-local free-lists, and a global shared heap.
 \\
-{\bf Version:} 2.8.6\\
-{\bf Configuration:} Compiled with pre-processor USE\_LOCKS.\\
-{\bf Compilation command:}\\
-cc -g3 -O3 -Wall -Wextra -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free -fPIC -shared -DUSE\_LOCKS -o libdlmalloc.so malloc-2.8.6.c
-
-\subsection{hoard}
-Hoard (FIX ME: cite allocator) is a thread-safe allocator that is multi-threaded and using a heap layer framework. It has per-thread heaps that have thread-local free-lists, and a global shared heap. (FIX ME: cite wasik)
+\textbf{Version:} 3.13\\
+\textbf{Configuration:} Compiled with hoard's default configurations and @Makefile@.\\
+\textbf{Compilation command:} @make all@
+
+\subsection{jemalloc}
+jemalloc~\cite{jemalloc} is a thread-safe allocator that uses multiple arenas. Each thread is assigned an arena. Each arena has chunks that contain contiguous memory regions of the same size. An arena has multiple chunks that contain regions of multiple sizes.
 \\
+\textbf{Version:} 5.2.1\\
+\textbf{Configuration:} Compiled with jemalloc's default configurations and @Makefile@.\\
+\textbf{Compilation command:} @autogen.sh; configure; make; make install@
+
+\subsection{pt3malloc}
+pt3malloc~\cite{pt3malloc} is a modification of dlmalloc.
+It is a thread-safe multi-threaded memory allocator that uses multiple heaps. A pt3malloc heap has a similar design to dlmalloc's heap.
 \\
-{\bf Version:} 3.13\\
-{\bf Configuration:} Compiled with hoard's default configurations and Makefile.\\
-{\bf Compilation command:}\\
-make all
-
-\subsection{jemalloc}
-jemalloc (FIX ME: cite allocator) is a thread-safe allocator that uses multiple arenas. Each thread is assigned an arena. Each arena has chunks that contain contagious memory regions of same size. An arena has multiple chunks that contain regions of multiple sizes.
+\textbf{Version:} 1.8\\
+\textbf{Configuration:} Compiled with pt3malloc's @Makefile@ using option ``linux-shared''.\\
+\textbf{Compilation command:} @make linux-shared@
+
+\subsection{rpmalloc}
+rpmalloc~\cite{rpmalloc} is a thread-safe allocator that is multi-threaded and uses a per-thread heap. Each heap has multiple size-classes and each size-class contains memory regions of the relevant size.
 \\
+\textbf{Version:} 1.4.1\\
+\textbf{Configuration:} Compiled with rpmalloc's default configurations and ninja build system.\\
+\textbf{Compilation command:} @python3 configure.py; ninja@
+
+\subsection{tbb malloc}
+tbb malloc~\cite{tbbmallocmail} is a thread-safe allocator that is multi-threaded and uses a private heap for each thread.
+Each private heap has multiple bins of different sizes. Each bin contains free regions of the same size.
 \\
-{\bf Version:} 5.2.1\\
-{\bf Configuration:} Compiled with jemalloc's default configurations and Makefile.\\
-{\bf Compilation command:}\\
-./autogen.sh\\
-./configure\\
-make\\
-make install
-
-\subsection{pt3malloc}
-pt3malloc (FIX ME: cite allocator) is a modification of dlmalloc. It is a thread-safe multi-threaded memory allocator that uses multiple heaps. pt3malloc heap has similar design to dlmalloc's heap.
-\\
-\\
-{\bf Version:} 1.8\\
-{\bf Configuration:} Compiled with pt3malloc's Makefile using option "linux-shared".\\
-{\bf Compilation command:}\\
-make linux-shared
-
-\subsection{rpmalloc}
-rpmalloc (FIX ME: cite allocator) is a thread-safe allocator that is multi-threaded and uses per-thread heap. Each heap has multiple size-classes and each size-class contains memory regions of the relevant size.
-\\
-\\
-{\bf Version:} 1.4.1\\
-{\bf Configuration:} Compiled with rpmalloc's default configurations and ninja build system.\\
-{\bf Compilation command:}\\
-python3 configure.py\\
-ninja
-
-\subsection{tbb malloc}
-tbb malloc (FIX ME: cite allocator) is a thread-safe allocator that is multi-threaded and uses private heap for each thread. Each private-heap has multiple bins of different sizes. Each bin contains free regions of the same size.
-\\
-\\
-{\bf Version:} intel tbb 2020 update 2, tbb\_interface\_version == 11102\\
-{\bf Configuration:} Compiled with tbbmalloc's default configurations and Makefile.\\
-{\bf Compilation command:}\\
-make
-
-\section{Experiment Environment}
-We used our micro becnhmark suite (FIX ME: cite mbench) to evaluate these memory allocators \ref{sec:curAllocatorSec} and our own memory allocator uHeap \ref{sec:allocatorSec}.
-
-\section{Results}
-FIX ME: add experiment, knobs, graphs, description+analysis
+\textbf{Version:} Intel TBB 2020 update 2, tbb\_interface\_version == 11102\\
+\textbf{Configuration:} Compiled with tbbmalloc's default configurations and @Makefile@.\\
+\textbf{Compilation command:} @make@
+
+% \section{Experiment Environment}
+% We used our micro benchmark suite (FIX ME: cite mbench) to evaluate these memory allocators \ref{sec:curAllocatorSec} and our own memory allocator uHeap \ref{sec:allocatorSec}.
+
+\section{Experiments}
+% FIX ME: add experiment, knobs, graphs, description+analysis
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -84,29 +85,37 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
-\subsection{Churn Benchmark}
-
-Churn benchmark tested memory allocators for speed under intensive dynamic memory usage.
-
+\subsection{Churn Micro-Benchmark}
+
+Churn tests allocators for speed under intensive dynamic memory usage (see \VRef{s:ChurnBenchmark}).
 This experiment was run with following configurations:
-
--maxS		 : 500
-
--minS		 : 50
-
--stepS		 : 50
-
--distroS	 : fisher
-
--objN		 : 100000
-
--cSpots		 : 16
-
--threadN	 : \{ 1, 2, 4, 8, 16 \} *
-
-* Each allocator was tested for its performance across different number of threads. Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
-
-Results are shown in figure \ref{fig:churn} for both algol and nasus.
-X-axis shows number of threads. Each allocator's performance for each thread is shown in different colors.
-Y-axis shows the total time experiment took to finish.
+\begin{description}[itemsep=0pt,parsep=0pt]
+\item[thread:]
+1, 2, 4, 8, 16
+\item[spots:]
+16
+\item[obj:]
+100,000
+\item[max:]
+500
+\item[min:]
+50
+\item[step:]
+50
+\item[distro:]
+fisher
+\end{description}
+
+% -maxS		 : 500
+% -minS		 : 50
+% -stepS		 : 50
+% -distroS	 : fisher
+% -objN		 : 100000
+% -cSpots		 : 16
+% -threadN	 : 1, 2, 4, 8, 16
+
+\VRef[Figure]{fig:churn} shows the results for algol and nasus.
+The X-axis shows the number of threads.
+Each allocator's performance for each thread is shown in different colors.
+The Y-axis shows the total experiment time.
 
 \begin{figure}
@@ -118,4 +127,7 @@
 \end{figure}
 
+All allocators did well in this micro-benchmark, except for dlmalloc on the ARM.
+
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %% THRASH
@@ -124,17 +136,19 @@
 \subsection{Cache Thrash}
 
-Thrash benchmark tested memory allocators for active false sharing.
-
+Thrash tests memory allocators for active false sharing (see \VRef{sec:benchThrashSec}).
 This experiment was run with following configurations:
-
--cacheIt 	: 1000
-
--cacheRep	: 1000000
-
--cacheObj	: 1
-
--threadN 	: \{ 1, 2, 4, 8, 16 \} *
-
-* Each allocator was tested for its performance across different number of threads. Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
+\begin{description}[itemsep=0pt,parsep=0pt]
+\item[thread:]
+1, 2, 4, 8, 16
+\item[iterations:]
+1,000
+\item[cacheRW:]
+1,000,000
+\item[size:]
+1
+\end{description}
+
+% * Each allocator was tested for its performance across different number of threads.
+% Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
 
 Results are shown in figure \ref{fig:cacheThrash} for both algol and nasus.
@@ -155,18 +169,21 @@
 
 \subsection{Cache Scratch}
-
-Scratch benchmark tested memory allocators for program induced allocator preserved passive false sharing.
-
+\label{s:CacheScratch}
+
+Scratch tests memory allocators for program-induced allocator-preserved passive false-sharing.
 This experiment was run with following configurations:
-
--cacheIt 	: 1000
-
--cacheRep	: 1000000
-
--cacheObj	: 1
-
--threadN 	: \{ 1, 2, 4, 8, 16 \} *
-
-* Each allocator was tested for its performance across different number of threads. Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
+\begin{description}[itemsep=0pt,parsep=0pt]
+\item[threads:]
+1, 2, 4, 8, 16
+\item[iterations:]
+1,000
+\item[cacheRW:]
+1,000,000
+\item[size:]
+1
+\end{description}
+
+% * Each allocator was tested for its performance across different number of threads.
+% Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
 
 Results are shown in figure \ref{fig:cacheScratch} for both algol and nasus.
@@ -186,23 +203,32 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
-\subsection{Speed Benchmark}
-
-Speed benchmark tested memory allocators for runtime latency.
-
+\subsection{Speed Micro-Benchmark}
+
+Speed tests memory allocators for runtime latency (see \VRef{s:SpeedMicroBenchmark}).
 This experiment was run with following configurations:
-
--maxS    :  500
-
--minS    :  50
-
--stepS   :  50
-
--distroS :  fisher
-
--objN    :  1000000
-
--threadN    : \{ 1, 2, 4, 8, 16 \} *
-
-* Each allocator was tested for its performance across different number of threads. Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
+\begin{description}[itemsep=0pt,parsep=0pt]
+\item[max:]
+500
+\item[min:]
+50
+\item[step:]
+50
+\item[distro:]
+fisher
+\item[objects:]
+1,000,000
+\item[workers:]
+1, 2, 4, 8, 16
+\end{description}
+
+% -maxS    :  500
+% -minS    :  50
+% -stepS   :  50
+% -distroS :  fisher
+% -objN    :  1000000
+% -threadN    : \{ 1, 2, 4, 8, 16 \} *
+
+%* Each allocator was tested for its performance across different number of threads.
+%Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN.
 
 Results for speed benchmark are shown in 12 figures, one figure for each chain of speed benchmark.
@@ -337,51 +363,71 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
-\subsection{Memory Benchmark}
-
-Speed benchmark tested memory allocators for their memory footprint.
-
-This experiment was run with following two configurations for each allocator:\\
-
+\subsection{Memory Micro-Benchmark}
+
+This experiment is run with the following two configurations for each allocator.
+The difference between the two configurations is the number of producers and consumers.
+Configuration 1 has one producer and one consumer, and configuration 2 has four producers where each producer has four consumers.
+
+\noindent
 Configuration 1:
-
--threadA :  1
-
--threadF :  1
-
--maxS    :  500
-
--minS    :  50
-
--stepS   :  50
-
--distroS :  fisher
-
--objN    :  100000
-
--consumeS:  100000\\
-
+\begin{description}[itemsep=0pt,parsep=0pt]
+\item[producer (K):]
+1
+\item[consumer (M):]
+1
+\item[round:]
+100,000
+\item[max:]
+500
+\item[min:]
+50
+\item[step:]
+50
+\item[distro:]
+fisher
+\item[objects (N):]
+100,000
+\end{description}
+
+% -threadA :  1
+% -threadF :  1
+% -maxS    :  500
+% -minS    :  50
+% -stepS   :  50
+% -distroS :  fisher
+% -objN    :  100000
+% -consumeS:  100000
+
+\noindent
 Configuration 2:
-
--threadA :  4
-
--threadF :  4
-
--maxS    :  500
-
--minS    :  50
-
--stepS   :  50
-
--distroS :  fisher
-
--objN    :  100000
-
--consumeS:  100000
-
-Difference between the two configurations is the number of producers and consumers.
-Configuration 1 has one producer and one consumer.
-While configuration 2 has four producers where each producer has four consumers.
-
-\begin{table}[h!]
+\begin{description}[itemsep=0pt,parsep=0pt]
+\item[producer (K):]
+4
+\item[consumer (M):]
+4
+\item[round:]
+100,000
+\item[max:]
+500
+\item[min:]
+50
+\item[step:]
+50
+\item[distro:]
+fisher
+\item[objects (N):]
+100,000
+\end{description}
+
+% -threadA :  4
+% -threadF :  4
+% -maxS    :  500
+% -minS    :  50
+% -stepS   :  50
+% -distroS :  fisher
+% -objN    :  100000
+% -consumeS:  100000
+
+\begin{table}[b]
 \centering
     \begin{tabular}{ |c|c|c| }
@@ -411,5 +457,5 @@
 
 Results for memory benchmark are shown in 16 figures, two figures for each of the 8 allocators, one for each configuration.
-Table \ref{table:mem-benchmark-figs} shows the list of figures that contain memory benchmar results.
+Table \ref{table:mem-benchmark-figs} shows the list of figures that contain memory benchmark results.
 
 Each figure has 2 graphs, one for each experiment environment.
@@ -426,9 +472,9 @@
 * These statistics are gathered by monitoring the \textit{/proc/self/maps} file of the process in linux system.
 
-For each subgraph, x-axis shows the time during the program lifetime at which the datapoint was generated.
+For each subgraph, x-axis shows the time during the program lifetime at which the data point was generated.
 Y-axis shows the memory usage in bytes.
 
-For the experiment, at a certain time in the prgram's life, the difference betweem the memory requested by the benchmark (\textit{current\_req\_mem(B)})
-and the memory that the process has recieved from system (\textit{heap}, \textit{mmap}) should be minimum.
+For the experiment, at a certain time in the program's life, the difference between the memory requested by the benchmark (\textit{current\_req\_mem(B)})
+and the memory that the process has received from system (\textit{heap}, \textit{mmap}) should be minimum.
 This difference is the memory overhead caused by the allocator and shows the level of fragmentation in the allocator.
 
@@ -438,5 +484,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-cfa} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-cfa} }
-\caption{Memory benchmaark results with 1 producer for cfa memory allocator}
+\caption{Memory benchmark results with 1 producer for cfa memory allocator}
 \label{fig:mem-1-prod-1-cons-100-cfa}
 \end{figure}
@@ -447,5 +493,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-dl} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-dl} }
-\caption{Memory benchmaark results with 1 producer for dl memory allocator}
+\caption{Memory benchmark results with 1 producer for dl memory allocator}
 \label{fig:mem-1-prod-1-cons-100-dl}
 \end{figure}
@@ -456,5 +502,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-glc} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-glc} }
-\caption{Memory benchmaark results with 1 producer for glibc memory allocator}
+\caption{Memory benchmark results with 1 producer for glibc memory allocator}
 \label{fig:mem-1-prod-1-cons-100-glc}
 \end{figure}
@@ -465,5 +511,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-hrd} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-hrd} }
-\caption{Memory benchmaark results with 1 producer for hoard memory allocator}
+\caption{Memory benchmark results with 1 producer for hoard memory allocator}
 \label{fig:mem-1-prod-1-cons-100-hrd}
 \end{figure}
@@ -474,5 +520,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-je} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-je} }
-\caption{Memory benchmaark results with 1 producer for je memory allocator}
+\caption{Memory benchmark results with 1 producer for je memory allocator}
 \label{fig:mem-1-prod-1-cons-100-je}
 \end{figure}
@@ -483,5 +529,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-pt3} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-pt3} }
-\caption{Memory benchmaark results with 1 producer for pt3 memory allocator}
+\caption{Memory benchmark results with 1 producer for pt3 memory allocator}
 \label{fig:mem-1-prod-1-cons-100-pt3}
 \end{figure}
@@ -492,5 +538,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-rp} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-rp} }
-\caption{Memory benchmaark results with 1 producer for rp memory allocator}
+\caption{Memory benchmark results with 1 producer for rp memory allocator}
 \label{fig:mem-1-prod-1-cons-100-rp}
 \end{figure}
@@ -501,5 +547,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-tbb} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-tbb} }
-\caption{Memory benchmaark results with 1 producer for tbb memory allocator}
+\caption{Memory benchmark results with 1 producer for tbb memory allocator}
 \label{fig:mem-1-prod-1-cons-100-tbb}
 \end{figure}
@@ -510,5 +556,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-cfa} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-cfa} }
-\caption{Memory benchmaark results with 4 producers for cfa memory allocator}
+\caption{Memory benchmark results with 4 producers for cfa memory allocator}
 \label{fig:mem-4-prod-4-cons-100-cfa}
 \end{figure}
@@ -519,5 +565,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-dl} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-dl} }
-\caption{Memory benchmaark results with 4 producers for dl memory allocator}
+\caption{Memory benchmark results with 4 producers for dl memory allocator}
 \label{fig:mem-4-prod-4-cons-100-dl}
 \end{figure}
@@ -528,5 +574,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-glc} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-glc} }
-\caption{Memory benchmaark results with 4 producers for glibc memory allocator}
+\caption{Memory benchmark results with 4 producers for glibc memory allocator}
 \label{fig:mem-4-prod-4-cons-100-glc}
 \end{figure}
@@ -537,5 +583,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-hrd} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-hrd} }
-\caption{Memory benchmaark results with 4 producers for hoard memory allocator}
+\caption{Memory benchmark results with 4 producers for hoard memory allocator}
 \label{fig:mem-4-prod-4-cons-100-hrd}
 \end{figure}
@@ -546,5 +592,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-je} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-je} }
-\caption{Memory benchmaark results with 4 producers for je memory allocator}
+\caption{Memory benchmark results with 4 producers for je memory allocator}
 \label{fig:mem-4-prod-4-cons-100-je}
 \end{figure}
@@ -555,5 +601,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-pt3} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-pt3} }
-\caption{Memory benchmaark results with 4 producers for pt3 memory allocator}
+\caption{Memory benchmark results with 4 producers for pt3 memory allocator}
 \label{fig:mem-4-prod-4-cons-100-pt3}
 \end{figure}
@@ -564,5 +610,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-rp} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-rp} }
-\caption{Memory benchmaark results with 4 producers for rp memory allocator}
+\caption{Memory benchmark results with 4 producers for rp memory allocator}
 \label{fig:mem-4-prod-4-cons-100-rp}
 \end{figure}
@@ -573,5 +619,5 @@
     \subfigure[Algol]{ \includegraphics[width=0.9\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-tbb} }
     \subfigure[Nasus]{ \includegraphics[width=0.9\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-tbb} }
-\caption{Memory benchmaark results with 4 producers for tbb memory allocator}
+\caption{Memory benchmark results with 4 producers for tbb memory allocator}
 \label{fig:mem-4-prod-4-cons-100-tbb}
 \end{figure}
