[d286e94d] | 1 | \chapter{Performance} |
---|
[6978468] | 2 | \label{c:Performance} |
---|
[080471a] | 3 | |
---|
[c9136d9] | 4 | This chapter uses the micro-benchmarks from \VRef[Chapter]{s:Benchmarks} to test a number of current memory allocators, including llheap. |
---|
| 5 | The goal is to see if llheap is competitive with the current best memory allocators. |
---|
| 6 | |
---|
| 7 | |
---|
[028404f] | 8 | \section{Machine Specification} |
---|
| 9 | |
---|
[b81ab1c6] | 10 | The performance experiments were run on two different multi-core architectures (x64 and ARM) to determine if there is consistency across platforms: |
---|
[028404f] | 11 | \begin{itemize} |
---|
| 12 | \item |
---|
[c9136d9] | 13 | \textbf{Nasus} AMD EPYC 7662, 64-core socket $\times$ 2, 2.0 GHz, GCC version 9.3.0 |
---|
[028404f] | 14 | \item |
---|
[c9136d9] | 15 | \textbf{Algol} Huawei ARM TaiShan 2280 V2 Kunpeng 920, 24-core socket $\times$ 4, 2.6 GHz, GCC version 9.4.0 |
---|
[028404f] | 16 | \end{itemize} |
---|
| 17 | |
---|
| 18 | |
---|
[c9136d9] | 19 | \section{Existing Memory Allocators} |
---|
| 20 | \label{sec:curAllocatorSec} |
---|
[028404f] | 21 | |
---|
[c9136d9] | 22 | With dynamic allocation being an important feature of C, there are many stand-alone memory allocators that have been designed for different purposes. |
---|
[b81ab1c6] | 23 | For this thesis, 7 of the most popular and widely used memory allocators were selected for comparison, along with llheap. |
---|
[c9136d9] | 24 | |
---|
[a6c10de] | 25 | \paragraph{llheap (\textsf{llh})} |
---|
[b81ab1c6] | 26 | is the thread-safe allocator from \VRef[Chapter]{c:Allocator}. |
---|
| 27 | \\ |
---|
| 28 | \textbf{Version:} 1.0\\ |
---|
| 29 | \textbf{Configuration:} Compiled with dynamic linking, but without statistics or debugging.\\ |
---|
| 30 | \textbf{Compilation command:} @make@ |
---|
| 31 | |
---|
| 32 | \paragraph{glibc (\textsf{glc})} |
---|
| 33 | \cite{glibc} is the default gcc thread-safe allocator. |
---|
[ba897d21] | 34 | \\ |
---|
[c9136d9] | 35 | \textbf{Version:} Ubuntu GLIBC 2.31-0ubuntu9.7 2.31\\ |
---|
| 36 | \textbf{Configuration:} Compiled by Ubuntu 20.04.\\ |
---|
| 37 | \textbf{Compilation command:} N/A |
---|
| 38 | |
---|
[b81ab1c6] | 39 | \paragraph{dlmalloc (\textsf{dl})} |
---|
| 40 | \cite{dlmalloc} is a single-heap allocator with a single-threaded design that is made thread-safe by locking. |
---|
[c9136d9] | 41 | It maintains free-lists of different sizes to store freed dynamic memory. |
---|
[ba897d21] | 42 | \\ |
---|
[c9136d9] | 43 | \textbf{Version:} 2.8.6\\ |
---|
| 44 | \textbf{Configuration:} Compiled with preprocessor @USE_LOCKS@.\\ |
---|
| 45 | \textbf{Compilation command:} @gcc -g3 -O3 -Wall -Wextra -fno-builtin-malloc -fno-builtin-calloc@ @-fno-builtin-realloc -fno-builtin-free -fPIC -shared -DUSE_LOCKS -o libdlmalloc.so malloc-2.8.6.c@ |
---|
[028404f] | 46 | |
---|
[b81ab1c6] | 47 | \paragraph{hoard (\textsf{hrd})} |
---|
| 48 | \cite{hoard} is a thread-safe allocator that is multi-threaded and uses a heap-layer framework. It has per-thread heaps that have thread-local free-lists, and a global shared heap. |
---|
[ba897d21] | 49 | \\ |
---|
[c9136d9] | 50 | \textbf{Version:} 3.13\\ |
---|
| 51 | \textbf{Configuration:} Compiled with hoard's default configurations and @Makefile@.\\ |
---|
| 52 | \textbf{Compilation command:} @make all@ |
---|
[028404f] | 53 | |
---|
[b81ab1c6] | 54 | \paragraph{jemalloc (\textsf{je})} |
---|
| 55 | \cite{jemalloc} is a thread-safe allocator that uses multiple arenas. Each thread is assigned an arena. |
---|
| 56 | Each arena has chunks that contain contiguous memory regions of the same size. An arena has multiple chunks that contain regions of multiple sizes. |
---|
[ba897d21] | 57 | \\ |
---|
[c9136d9] | 58 | \textbf{Version:} 5.2.1\\ |
---|
| 59 | \textbf{Configuration:} Compiled with jemalloc's default configurations and @Makefile@.\\ |
---|
| 60 | \textbf{Compilation command:} @autogen.sh; configure; make; make install@ |
---|
[028404f] | 61 | |
---|
[8f94a63] | 62 | \paragraph{ptmalloc3 (\textsf{pt3})} |
---|
| 63 | \cite{ptmalloc3} is a modification of dlmalloc. |
---|
[b81ab1c6] | 64 | It is a thread-safe multi-threaded memory allocator that uses multiple heaps. |
---|
[8f94a63] | 65 | Each ptmalloc3 heap has a design similar to dlmalloc's heap. |
---|
[ba897d21] | 66 | \\ |
---|
[c9136d9] | 67 | \textbf{Version:} 1.8\\ |
---|
[8f94a63] | 68 | \textbf{Configuration:} Compiled with ptmalloc3's @Makefile@ using option ``linux-shared''.\\ |
---|
[c9136d9] | 69 | \textbf{Compilation command:} @make linux-shared@ |
---|
[028404f] | 70 | |
---|
[b81ab1c6] | 71 | \paragraph{rpmalloc (\textsf{rp})} |
---|
| 72 | \cite{rpmalloc} is a thread-safe allocator that is multi-threaded and uses per-thread heaps. |
---|
| 73 | Each heap has multiple size-classes and each size-class contains memory regions of the relevant size. |
---|
[ba897d21] | 74 | \\ |
---|
[c9136d9] | 75 | \textbf{Version:} 1.4.1\\ |
---|
| 76 | \textbf{Configuration:} Compiled with rpmalloc's default configurations and ninja build system.\\ |
---|
| 77 | \textbf{Compilation command:} @python3 configure.py; ninja@ |
---|
[028404f] | 78 | |
---|
[b81ab1c6] | 79 | \paragraph{tbb malloc (\textsf{tbb})} |
---|
| 80 | \cite{tbbmalloc} is a thread-safe allocator that is multi-threaded and uses a private heap for each thread. |
---|
| 81 | Each private-heap has multiple bins of different sizes. Each bin contains free regions of the same size. |
---|
[ba897d21] | 82 | \\ |
---|
[c9136d9] | 83 | \textbf{Version:} intel tbb 2020 update 2, tbb\_interface\_version == 11102\\ |
---|
| 84 | \textbf{Configuration:} Compiled with tbbmalloc's default configurations and @Makefile@.\\ |
---|
| 85 | \textbf{Compilation command:} @make@ |
---|
[080471a] | 86 | |
---|
[c9136d9] | 87 | % \section{Experiment Environment} |
---|
| 88 | % We used our micro benchmark suite (FIX ME: cite mbench) to evaluate these memory allocators \ref{sec:curAllocatorSec} and our own memory allocator uHeap \ref{sec:allocatorSec}. |
---|
[080471a] | 89 | |
---|
[c9136d9] | 90 | \section{Experiments} |
---|
[b81ab1c6] | 91 | |
---|
| 92 | Each micro-benchmark is configured and run with each of the allocators. |
---|
| 93 | The less time an allocator takes to complete a benchmark the better, so lower in the graphs is better. |
---|
[080471a] | 94 | |
---|
[ba897d21] | 95 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 96 | %% CHURN |
---|
| 97 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
[080471a] | 98 | |
---|
[c9136d9] | 99 | \subsection{Churn Micro-Benchmark} |
---|
[080471a] | 100 | |
---|
[c9136d9] | 101 | Churn tests allocators for speed under intensive dynamic memory usage (see \VRef{s:ChurnBenchmark}). |
---|
[ba897d21] | 102 | This experiment was run with the following configuration: |
---|
[c9136d9] | 103 | \begin{description}[itemsep=0pt,parsep=0pt] |
---|
| 104 | \item[thread:] |
---|
| 105 | 1, 2, 4, 8, 16 |
---|
| 106 | \item[spots:] |
---|
| 107 | 16 |
---|
| 108 | \item[obj:] |
---|
| 109 | 100,000 |
---|
| 110 | \item[max:] |
---|
| 111 | 500 |
---|
| 112 | \item[min:] |
---|
| 113 | 50 |
---|
| 114 | \item[step:] |
---|
| 115 | 50 |
---|
| 116 | \item[distro:] |
---|
| 117 | fisher |
---|
| 118 | \end{description} |
---|
| 119 | |
---|
| 120 | % -maxS : 500 |
---|
| 121 | % -minS : 50 |
---|
| 122 | % -stepS : 50 |
---|
| 123 | % -distroS : fisher |
---|
| 124 | % -objN : 100000 |
---|
| 125 | % -cSpots : 16 |
---|
| 126 | % -threadN : 1, 2, 4, 8, 16 |
---|
| 127 | |
---|
| 128 | \VRef[Figure]{fig:churn} shows the results for algol and nasus. |
---|
[b81ab1c6] | 129 | The X-axis shows the number of threads; |
---|
| 130 | the Y-axis shows the total experiment time. |
---|
[c9136d9] | 131 | At each thread count, each allocator's performance is shown in a different color. |
---|
[ba897d21] | 132 | |
---|
| 133 | \begin{figure} |
---|
| 134 | \centering |
---|
[b81ab1c6] | 135 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/churn} } |
---|
| 136 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/churn} } |
---|
[ba897d21] | 137 | \caption{Churn} |
---|
| 138 | \label{fig:churn} |
---|
| 139 | \end{figure} |
---|
| 140 | |
---|
[b81ab1c6] | 141 | All allocators did well in this micro-benchmark, except for \textsf{dl} on the ARM. |
---|
[4b2ea0d] | 142 | \textsf{dl}'s performance decreases, and the gap between it and the other allocators widens, as the number of worker threads increases. |
---|
| 143 | \textsf{je} was the fastest, although there is not much difference between \textsf{je} and the rest of the allocators. |
---|
| 144 | |
---|
[b81ab1c6] | 145 | llheap is slightly slower because it uses ownership, where many of the allocations have remote frees, which requires locking. |
---|
| 146 | When llheap is compiled without ownership, its performance is the same as the other allocators (not shown). |
---|
[c9136d9] | 147 | |
---|
[ba897d21] | 148 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 149 | %% THRASH |
---|
| 150 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
[080471a] | 151 | |
---|
| 152 | \subsection{Cache Thrash} |
---|
[4b2ea0d] | 153 | \label{sec:cache-thrash-perf} |
---|
[080471a] | 154 | |
---|
[c9136d9] | 155 | Thrash tests memory allocators for active false sharing (see \VRef{sec:benchThrashSec}). |
---|
[ba897d21] | 156 | This experiment was run with the following configuration: |
---|
[c9136d9] | 157 | \begin{description}[itemsep=0pt,parsep=0pt] |
---|
[b81ab1c6] | 158 | \item[threads:] |
---|
[c9136d9] | 159 | 1, 2, 4, 8, 16 |
---|
| 160 | \item[iterations:] |
---|
| 161 | 1,000 |
---|
| 162 | \item[cacheRW:] |
---|
| 163 | 1,000,000 |
---|
| 164 | \item[size:] |
---|
| 165 | 1 |
---|
| 166 | \end{description} |
---|
| 167 | |
---|
| 168 | % * Each allocator was tested for its performance across different number of threads. |
---|
| 169 | % Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN. |
---|
[ba897d21] | 170 | |
---|
[b81ab1c6] | 171 | \VRef[Figure]{fig:cacheThrash} shows the results for algol and nasus. |
---|
| 172 | The X-axis shows the number of threads; |
---|
| 173 | the Y-axis shows the total experiment time. |
---|
| 174 | At each thread count, each allocator's performance is shown in a different color. |
---|
[ba897d21] | 175 | |
---|
| 176 | \begin{figure} |
---|
| 177 | \centering |
---|
[a6c10de] | 178 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/cache_thrash_0-thrash} } |
---|
| 179 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/cache_thrash_0-thrash} } |
---|
[ba897d21] | 180 | \caption{Cache Thrash} |
---|
| 181 | \label{fig:cacheThrash} |
---|
| 182 | \end{figure} |
---|
| 183 | |
---|
[4b2ea0d] | 184 | All allocators did well in this micro-benchmark, except for \textsf{dl} and \textsf{pt3}. |
---|
| 185 | \textsf{dl} uses a single heap for all threads, so it is understandable that it generates so much active false-sharing. |
---|
| 186 | Requests from different threads are handled sequentially by the single heap using locks, so objects allocated to different threads can be placed on the same cache line. |
---|
| 187 | \textsf{pt3} uses multiple heaps, but not strictly one heap per thread. |
---|
| 188 | Hence, multiple threads sharing one heap can get objects allocated on the same cache line, which may cause active false-sharing. |
---|
| 189 | The rest of the memory allocators generate little or no active false-sharing. |
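
The condition behind active false-sharing can be illustrated with a minimal C sketch.
It is not taken from the benchmark: the helper names and the 64-byte line size are assumptions made for illustration, and the actual line size depends on the processor.
The sketch checks whether two consecutive 1-byte allocations, such as a shared heap might hand to two different threads, fall on the same cache line.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache-line size in bytes (hypothetical; query the hardware for the real value). */
enum { CACHE_LINE = 64 };

/* Return 1 if the two addresses fall in the same cache line, 0 otherwise. */
static int same_cache_line(const void *a, const void *b) {
    return ((uintptr_t)a / CACHE_LINE) == ((uintptr_t)b / CACHE_LINE);
}

/* Allocate two 1-byte objects back to back and report whether they share
   a cache line. If each object were given to a different thread, writes by
   one thread would invalidate the line cached by the other. */
static int adjacent_allocations_share_line(void) {
    char *x = malloc(1);
    char *y = malloc(1);
    int shared = (x != NULL && y != NULL) && same_cache_line(x, y);
    free(x);
    free(y);
    return shared;
}
```

An allocator with per-thread heaps avoids this situation when each heap draws from separate cache lines, whereas a single shared heap has no such separation.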
---|
[b81ab1c6] | 190 | |
---|
[ba897d21] | 191 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 192 | %% SCRATCH |
---|
| 193 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 194 | |
---|
| 195 | \subsection{Cache Scratch} |
---|
| 196 | |
---|
[b81ab1c6] | 197 | Scratch tests memory allocators for program-induced allocator-preserved passive false-sharing (see \VRef{s:CacheScratch}). |
---|
[ba897d21] | 198 | This experiment was run with the following configuration: |
---|
[c9136d9] | 199 | \begin{description}[itemsep=0pt,parsep=0pt] |
---|
| 200 | \item[threads:] |
---|
| 201 | 1, 2, 4, 8, 16 |
---|
| 202 | \item[iterations:] |
---|
| 203 | 1,000 |
---|
| 204 | \item[cacheRW:] |
---|
| 205 | 1,000,000 |
---|
| 206 | \item[size:] |
---|
| 207 | 1 |
---|
| 208 | \end{description} |
---|
| 209 | |
---|
| 210 | % * Each allocator was tested for its performance across different number of threads. |
---|
| 211 | % Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN. |
---|
[ba897d21] | 212 | |
---|
[b81ab1c6] | 213 | \VRef[Figure]{fig:cacheScratch} shows the results for algol and nasus. |
---|
| 214 | The X-axis shows the number of threads; |
---|
| 215 | the Y-axis shows the total experiment time. |
---|
| 216 | At each thread count, each allocator's performance is shown in a different color. |
---|
[ba897d21] | 217 | |
---|
| 218 | \begin{figure} |
---|
| 219 | \centering |
---|
[a6c10de] | 220 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/cache_scratch_0-scratch} } |
---|
| 221 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/cache_scratch_0-scratch} } |
---|
[ba897d21] | 222 | \caption{Cache Scratch} |
---|
| 223 | \label{fig:cacheScratch} |
---|
| 224 | \end{figure} |
---|
| 225 | |
---|
[4b2ea0d] | 226 | This micro-benchmark divides the allocators into two groups. |
---|
| 227 | The first group contains the best performers: \textsf{llh}, \textsf{je}, and \textsf{rp}. |
---|
| 228 | These memory allocators generate little or no passive false-sharing, and their performance difference is negligible. |
---|
| 229 | The second group contains the low performers, which are the rest of the memory allocators. |
---|
| 230 | These memory allocators seem to preserve program-induced passive false-sharing. |
---|
| 231 | \textsf{hrd}'s performance keeps getting worse as the number of threads increases. |
---|
| 232 | |
---|
| 233 | Interestingly, allocators such as \textsf{hrd} and \textsf{glc} were among the best performers in the cache-thrash micro-benchmark, as described in section \ref{sec:cache-thrash-perf}. |
---|
| 234 | However, these allocators were among the low performers in this micro-benchmark. |
---|
| 235 | This result suggests that these allocators do not actively produce false-sharing, but they may preserve program-induced passive false-sharing. |
---|
[b81ab1c6] | 236 | |
---|
[ba897d21] | 237 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 238 | %% SPEED |
---|
| 239 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 240 | |
---|
[c9136d9] | 241 | \subsection{Speed Micro-Benchmark} |
---|
[ba897d21] | 242 | |
---|
[b81ab1c6] | 243 | Speed tests memory allocators for runtime latency (see \VRef{s:SpeedMicroBenchmark}). |
---|
[ba897d21] | 244 | This experiment was run with the following configuration: |
---|
[c9136d9] | 245 | \begin{description}[itemsep=0pt,parsep=0pt] |
---|
| 246 | \item[max:] |
---|
| 247 | 500 |
---|
| 248 | \item[min:] |
---|
| 249 | 50 |
---|
| 250 | \item[step:] |
---|
| 251 | 50 |
---|
| 252 | \item[distro:] |
---|
| 253 | fisher |
---|
| 254 | \item[objects:] |
---|
| 255 | 1,000,000 |
---|
| 256 | \item[workers:] |
---|
| 257 | 1, 2, 4, 8, 16 |
---|
| 258 | \end{description} |
---|
| 259 | |
---|
| 260 | % -maxS : 500 |
---|
| 261 | % -minS : 50 |
---|
| 262 | % -stepS : 50 |
---|
| 263 | % -distroS : fisher |
---|
| 264 | % -objN : 1000000 |
---|
| 265 | % -threadN : \{ 1, 2, 4, 8, 16 \} * |
---|
| 266 | |
---|
| 267 | %* Each allocator was tested for its performance across different number of threads. |
---|
| 268 | %Experiment was repeated for each allocator for 1, 2, 4, 8, and 16 threads by setting the configuration -threadN. |
---|
[3c79ea9] | 269 | |
---|
[b81ab1c6] | 270 | \VRefrange[Figures]{fig:speed-3-malloc}{fig:speed-14-malloc-calloc-realloc-free} show 12 figures, one figure for each chain of the speed benchmark. |
---|
| 271 | The X-axis shows the number of threads; |
---|
| 272 | the Y-axis shows the total experiment time. |
---|
| 273 | At each thread count, each allocator's performance is shown in a different color. |
---|
[3c79ea9] | 274 | |
---|
| 275 | \begin{itemize} |
---|
[b81ab1c6] | 276 | \item \VRef[Figure]{fig:speed-3-malloc} shows results for chain: malloc |
---|
| 277 | \item \VRef[Figure]{fig:speed-4-realloc} shows results for chain: realloc |
---|
| 278 | \item \VRef[Figure]{fig:speed-5-free} shows results for chain: free |
---|
| 279 | \item \VRef[Figure]{fig:speed-6-calloc} shows results for chain: calloc |
---|
| 280 | \item \VRef[Figure]{fig:speed-7-malloc-free} shows results for chain: malloc-free |
---|
| 281 | \item \VRef[Figure]{fig:speed-8-realloc-free} shows results for chain: realloc-free |
---|
| 282 | \item \VRef[Figure]{fig:speed-9-calloc-free} shows results for chain: calloc-free |
---|
| 283 | \item \VRef[Figure]{fig:speed-10-malloc-realloc} shows results for chain: malloc-realloc |
---|
| 284 | \item \VRef[Figure]{fig:speed-11-calloc-realloc} shows results for chain: calloc-realloc |
---|
| 285 | \item \VRef[Figure]{fig:speed-12-malloc-realloc-free} shows results for chain: malloc-realloc-free |
---|
| 286 | \item \VRef[Figure]{fig:speed-13-calloc-realloc-free} shows results for chain: calloc-realloc-free |
---|
| 287 | \item \VRef[Figure]{fig:speed-14-malloc-calloc-realloc-free} shows results for chain: malloc-realloc-free-calloc |
---|
[3c79ea9] | 288 | \end{itemize} |
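
As a rough illustration of what one chain measures, the following C sketch times a single-threaded malloc-realloc-free chain.
It is an assumption about the general shape of such a loop, not the benchmark's actual code: the function name, the doubling of the size in @realloc@, and the use of the C11 routine @timespec_get@ for timing are all hypothetical choices.

```c
#include <stdlib.h>
#include <time.h>

/* Run the chain malloc -> realloc -> free for `objects` objects of `size`
   bytes (size must be at least 1) and return the elapsed wall-clock time in
   seconds, or -1.0 if an allocation fails. Hypothetical helper for illustration. */
static double time_malloc_realloc_free(size_t objects, size_t size) {
    struct timespec begin, end;
    timespec_get(&begin, TIME_UTC);
    for (size_t i = 0; i < objects; i++) {
        char *p = malloc(size);
        if (p == NULL) return -1.0;
        p[0] = 1;                        /* touch the object so it is really used */
        char *q = realloc(p, size * 2);  /* grow the object (assumed resize policy) */
        if (q == NULL) { free(p); return -1.0; }
        free(q);
    }
    timespec_get(&end, TIME_UTC);
    return (double)(end.tv_sec - begin.tv_sec)
         + (double)(end.tv_nsec - begin.tv_nsec) / 1e9;
}
```

Running the same function once per allocator, for example by preloading each allocator's shared library, would give comparable times for one chain and one thread.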
---|
[ba897d21] | 289 | |
---|
[b81ab1c6] | 290 | All allocators did well in this micro-benchmark across all allocation chains, except for \textsf{dl} and \textsf{pt3}. |
---|
[4b2ea0d] | 291 | \textsf{dl} had the lowest performance overall, and its performance kept getting worse as the number of threads increased. |
---|
| 292 | \textsf{dl} uses a single heap with a global lock that can become a bottleneck. |
---|
| 293 | Multiple threads doing memory allocation in parallel can create contention on \textsf{dl}'s single heap. |
---|
| 294 | \textsf{pt3}, which is a modification of \textsf{dl} for multi-threaded applications, does not use per-thread heaps and may have similar bottlenecks. |
---|
| 295 | |
---|
| 296 | There is a sudden increase in the completion time of chains that include \textsf{calloc}, and all allocators are relatively slower in these chains. |
---|
| 297 | \textsf{calloc} uses \textsf{memset} to set the allocated memory to zero. |
---|
| 298 | \textsf{memset} must write every byte of the allocation, which takes a long time compared to the actual memory allocation. |
---|
| 299 | Hence, a major part of the time for chains that include \textsf{calloc} is spent in \textsf{memset}. |
---|
| 300 | However, the difference among the memory allocators running the same chain of memory-allocation operations still gives an idea of their relative performance. |
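
The extra cost in chains that include calloc can be seen in a minimal C sketch of a zeroing allocation.
The function @zeroing_alloc@ is a hypothetical stand-in written for illustration, not the code of any allocator tested here; a real @calloc@ may skip the zero fill when it knows the memory is already zero, for example when it is freshly obtained from the operating system.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for calloc: allocate count * size bytes, then
   zero-fill them. The memset touches every byte, so its cost grows with
   the allocation size, unlike the allocation step itself. */
static void *zeroing_alloc(size_t count, size_t size) {
    if (size != 0 && count > SIZE_MAX / size) return NULL; /* multiplication would overflow */
    size_t bytes = count * size;
    void *p = malloc(bytes);
    if (p != NULL) memset(p, 0, bytes);
    return p;
}
```

Because every allocator pays roughly the same zero-fill cost, the remaining differences between allocators on these chains still reflect their allocation paths.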
---|
[b81ab1c6] | 301 | |
---|
[ba897d21] | 302 | %speed-3-malloc.eps |
---|
| 303 | \begin{figure} |
---|
| 304 | \centering |
---|
[b81ab1c6] | 305 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-3-malloc} } |
---|
| 306 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-3-malloc} } |
---|
[3c79ea9] | 307 | \caption{Speed benchmark chain: malloc} |
---|
[ba897d21] | 308 | \label{fig:speed-3-malloc} |
---|
| 309 | \end{figure} |
---|
| 310 | |
---|
| 311 | %speed-4-realloc.eps |
---|
| 312 | \begin{figure} |
---|
| 313 | \centering |
---|
[b81ab1c6] | 314 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-4-realloc} } |
---|
| 315 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-4-realloc} } |
---|
[3c79ea9] | 316 | \caption{Speed benchmark chain: realloc} |
---|
[ba897d21] | 317 | \label{fig:speed-4-realloc} |
---|
| 318 | \end{figure} |
---|
| 319 | |
---|
| 320 | %speed-5-free.eps |
---|
| 321 | \begin{figure} |
---|
| 322 | \centering |
---|
[b81ab1c6] | 323 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-5-free} } |
---|
| 324 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-5-free} } |
---|
[3c79ea9] | 325 | \caption{Speed benchmark chain: free} |
---|
[ba897d21] | 326 | \label{fig:speed-5-free} |
---|
| 327 | \end{figure} |
---|
| 328 | |
---|
| 329 | %speed-6-calloc.eps |
---|
| 330 | \begin{figure} |
---|
| 331 | \centering |
---|
[b81ab1c6] | 332 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-6-calloc} } |
---|
| 333 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-6-calloc} } |
---|
[3c79ea9] | 334 | \caption{Speed benchmark chain: calloc} |
---|
[ba897d21] | 335 | \label{fig:speed-6-calloc} |
---|
| 336 | \end{figure} |
---|
| 337 | |
---|
| 338 | %speed-7-malloc-free.eps |
---|
| 339 | \begin{figure} |
---|
| 340 | \centering |
---|
[b81ab1c6] | 341 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-7-malloc-free} } |
---|
| 342 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-7-malloc-free} } |
---|
[3c79ea9] | 343 | \caption{Speed benchmark chain: malloc-free} |
---|
[ba897d21] | 344 | \label{fig:speed-7-malloc-free} |
---|
| 345 | \end{figure} |
---|
| 346 | |
---|
| 347 | %speed-8-realloc-free.eps |
---|
| 348 | \begin{figure} |
---|
| 349 | \centering |
---|
[b81ab1c6] | 350 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-8-realloc-free} } |
---|
| 351 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-8-realloc-free} } |
---|
[3c79ea9] | 352 | \caption{Speed benchmark chain: realloc-free} |
---|
[ba897d21] | 353 | \label{fig:speed-8-realloc-free} |
---|
| 354 | \end{figure} |
---|
| 355 | |
---|
| 356 | %speed-9-calloc-free.eps |
---|
| 357 | \begin{figure} |
---|
| 358 | \centering |
---|
[b81ab1c6] | 359 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-9-calloc-free} } |
---|
| 360 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-9-calloc-free} } |
---|
[3c79ea9] | 361 | \caption{Speed benchmark chain: calloc-free} |
---|
[ba897d21] | 362 | \label{fig:speed-9-calloc-free} |
---|
| 363 | \end{figure} |
---|
| 364 | |
---|
| 365 | %speed-10-malloc-realloc.eps |
---|
| 366 | \begin{figure} |
---|
| 367 | \centering |
---|
[b81ab1c6] | 368 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-10-malloc-realloc} } |
---|
| 369 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-10-malloc-realloc} } |
---|
[3c79ea9] | 370 | \caption{Speed benchmark chain: malloc-realloc} |
---|
[ba897d21] | 371 | \label{fig:speed-10-malloc-realloc} |
---|
| 372 | \end{figure} |
---|
| 373 | |
---|
| 374 | %speed-11-calloc-realloc.eps |
---|
| 375 | \begin{figure} |
---|
| 376 | \centering |
---|
[b81ab1c6] | 377 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-11-calloc-realloc} } |
---|
| 378 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-11-calloc-realloc} } |
---|
[3c79ea9] | 379 | \caption{Speed benchmark chain: calloc-realloc} |
---|
[ba897d21] | 380 | \label{fig:speed-11-calloc-realloc} |
---|
| 381 | \end{figure} |
---|
| 382 | |
---|
| 383 | %speed-12-malloc-realloc-free.eps |
---|
| 384 | \begin{figure} |
---|
| 385 | \centering |
---|
[b81ab1c6] | 386 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-12-malloc-realloc-free} } |
---|
| 387 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-12-malloc-realloc-free} } |
---|
[3c79ea9] | 388 | \caption{Speed benchmark chain: malloc-realloc-free} |
---|
[ba897d21] | 389 | \label{fig:speed-12-malloc-realloc-free} |
---|
| 390 | \end{figure} |
---|
| 391 | |
---|
| 392 | %speed-13-calloc-realloc-free.eps |
---|
| 393 | \begin{figure} |
---|
| 394 | \centering |
---|
[b81ab1c6] | 395 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-13-calloc-realloc-free} } |
---|
| 396 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-13-calloc-realloc-free} } |
---|
[3c79ea9] | 397 | \caption{Speed benchmark chain: calloc-realloc-free} |
---|
[ba897d21] | 398 | \label{fig:speed-13-calloc-realloc-free} |
---|
| 399 | \end{figure} |
---|
| 400 | |
---|
| 401 | %speed-14-{m,c,re}alloc-free.eps |
---|
| 402 | \begin{figure} |
---|
| 403 | \centering |
---|
[b81ab1c6] | 404 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/speed-14-m-c-re-alloc-free} } |
---|
| 405 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/speed-14-m-c-re-alloc-free} } |
---|
[3c79ea9] | 406 | \caption{Speed benchmark chain: malloc-calloc-realloc-free} |
---|
| 407 | \label{fig:speed-14-malloc-calloc-realloc-free} |
---|
[ba897d21] | 408 | \end{figure} |
---|
| 409 | |
---|
| 410 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 411 | %% MEMORY |
---|
| 412 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
| 413 | |
---|
[b81ab1c6] | 414 | \newpage |
---|
[c9136d9] | 415 | \subsection{Memory Micro-Benchmark} |
---|
[ba897d21] | 416 | |
---|
[b81ab1c6] | 417 | This experiment is run with the following two configurations for each allocator. |
---|
[c9136d9] | 418 | The difference between the two configurations is the number of producers and consumers. |
---|
[b81ab1c6] | 419 | Configuration 1 has one producer and one consumer, and configuration 2 has 4 producers, where each producer has 4 consumers. |
---|
[3c79ea9] | 420 | |
---|
[c9136d9] | 421 | \noindent |
---|
[3c79ea9] | 422 | Configuration 1: |
---|
[c9136d9] | 423 | \begin{description}[itemsep=0pt,parsep=0pt] |
---|
| 424 | \item[producer (K):] |
---|
| 425 | 1 |
---|
| 426 | \item[consumer (M):] |
---|
| 427 | 1 |
---|
| 428 | \item[round:] |
---|
| 429 | 100,000 |
---|
| 430 | \item[max:] |
---|
| 431 | 500 |
---|
| 432 | \item[min:] |
---|
| 433 | 50 |
---|
| 434 | \item[step:] |
---|
| 435 | 50 |
---|
| 436 | \item[distro:] |
---|
| 437 | fisher |
---|
| 438 | \item[objects (N):] |
---|
| 439 | 100,000 |
---|
| 440 | \end{description} |
---|
| 441 | |
---|
| 442 | % -threadA : 1 |
---|
| 443 | % -threadF : 1 |
---|
| 444 | % -maxS : 500 |
---|
| 445 | % -minS : 50 |
---|
| 446 | % -stepS : 50 |
---|
| 447 | % -distroS : fisher |
---|
| 448 | % -objN : 100000 |
---|
| 449 | % -consumeS: 100000 |
---|
| 450 | |
---|
| 451 | \noindent |
---|
[3c79ea9] | 452 | Configuration 2: |
---|
[c9136d9] | 453 | \begin{description}[itemsep=0pt,parsep=0pt] |
---|
| 454 | \item[producer (K):] |
---|
| 455 | 4 |
---|
| 456 | \item[consumer (M):] |
---|
| 457 | 4 |
---|
| 458 | \item[round:] |
---|
| 459 | 100,000 |
---|
| 460 | \item[max:] |
---|
| 461 | 500 |
---|
| 462 | \item[min:] |
---|
| 463 | 50 |
---|
| 464 | \item[step:] |
---|
| 465 | 50 |
---|
| 466 | \item[distro:] |
---|
| 467 | fisher |
---|
| 468 | \item[objects (N):] |
---|
| 469 | 100,000 |
---|
| 470 | \end{description} |
---|
| 471 | |
---|
| 472 | % -threadA : 4 |
---|
| 473 | % -threadF : 4 |
---|
| 474 | % -maxS : 500 |
---|
| 475 | % -minS : 50 |
---|
| 476 | % -stepS : 50 |
---|
| 477 | % -distroS : fisher |
---|
| 478 | % -objN : 100000 |
---|
| 479 | % -consumeS: 100000 |
---|
| 480 | |
---|
[4994d67] | 481 | % \begin{table}[b] |
---|
| 482 | % \centering |
---|
| 483 | % \begin{tabular}{ |c|c|c| } |
---|
| 484 | % \hline |
---|
| 485 | % Memory Allocator & Configuration 1 Result & Configuration 2 Result\\ |
---|
| 486 | % \hline |
---|
[a6c10de] | 487 | % llh & \VRef[Figure]{fig:mem-1-prod-1-cons-100-llh} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-llh}\\ |
---|
[4994d67] | 488 | % \hline |
---|
| 489 | % dl & \VRef[Figure]{fig:mem-1-prod-1-cons-100-dl} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-dl}\\ |
---|
| 490 | % \hline |
---|
| 491 | % glibc & \VRef[Figure]{fig:mem-1-prod-1-cons-100-glc} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-glc}\\ |
---|
| 492 | % \hline |
---|
| 493 | % hoard & \VRef[Figure]{fig:mem-1-prod-1-cons-100-hrd} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-hrd}\\ |
---|
| 494 | % \hline |
---|
| 495 | % je & \VRef[Figure]{fig:mem-1-prod-1-cons-100-je} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-je}\\ |
---|
| 496 | % \hline |
---|
| 497 | % pt3 & \VRef[Figure]{fig:mem-1-prod-1-cons-100-pt3} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-pt3}\\ |
---|
| 498 | % \hline |
---|
| 499 | % rp & \VRef[Figure]{fig:mem-1-prod-1-cons-100-rp} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-rp}\\ |
---|
| 500 | % \hline |
---|
| 501 | % tbb & \VRef[Figure]{fig:mem-1-prod-1-cons-100-tbb} & \VRef[Figure]{fig:mem-4-prod-4-cons-100-tbb}\\ |
---|
| 502 | % \hline |
---|
| 503 | % \end{tabular} |
---|
| 504 | % \caption{Memory benchmark results} |
---|
| 505 | % \label{table:mem-benchmark-figs} |
---|
| 506 | % \end{table} |
---|
| 507 | % Table \ref{table:mem-benchmark-figs} shows the list of figures that contain memory benchmark results. |
---|
[3c79ea9] | 508 | |
---|
[a6c10de] | 509 | \VRefrange[Figures]{fig:mem-1-prod-1-cons-100-llh}{fig:mem-4-prod-4-cons-100-tbb} show 16 figures, two figures for each of the 8 allocators, one for each configuration. |
---|
[3c79ea9] | 510 | Each figure has 2 graphs, one for each experiment environment. |
---|
[4994d67] | 511 | Each graph has the following 5 subgraphs that show memory usage and statistics throughout the micro-benchmark's lifetime. |
---|
[3c79ea9] | 512 | \begin{itemize} |
---|
| 513 | \item \textit{\textbf{current\_req\_mem(B)}} shows the amount of dynamic memory requested by the benchmark and currently in use. |
---|
[4994d67] | 514 | \item \textit{\textbf{heap}}* shows the memory requested by the program (allocator) from the system that lies in the heap (@sbrk@) area. |
---|
| 515 | \item \textit{\textbf{mmap\_so}}* shows the memory requested by the program (allocator) from the system that lies in the @mmap@ area. |
---|
| 516 | \item \textit{\textbf{mmap}}* shows the memory requested by the program (allocator or shared libraries) from the system that lies in the @mmap@ area. |
---|
| 517 | \item \textit{\textbf{total\_dynamic}} shows the total usage of dynamic memory by the benchmark program, which is a sum of \textit{heap}, \textit{mmap}, and \textit{mmap\_so}. |
---|
[3c79ea9] | 518 | \end{itemize} |
---|
[4994d67] | 519 | * These statistics are gathered by monitoring a process's @/proc/self/maps@ file. |
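
A minimal C sketch of this kind of monitoring is given below.
It is an illustration under stated assumptions, not the benchmark's monitoring code: the helper names are hypothetical, only the region tagged @[heap]@ is handled, and how the benchmark separates the \textit{mmap} and \textit{mmap\_so} regions is not reproduced.
It also assumes a 64-bit Linux system where addresses fit in @unsigned long@ and each line of the maps file fits in the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of a maps file, e.g.
   "55d0c8a2b000-55d0c8a4c000 rw-p 00000000 00:00 0    [heap]",
   and return the size of the region in bytes, or 0 if the line does not
   start with an address range. */
static unsigned long region_bytes(const char *line) {
    unsigned long start, end;
    if (sscanf(line, "%lx-%lx", &start, &end) != 2 || end < start) return 0;
    return end - start;
}

/* Sum the sizes of all regions whose line contains `tag` (e.g. "[heap]").
   Returns -1 if the file cannot be opened. */
static long tagged_bytes(const char *path, const char *tag) {
    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;
    char line[512];
    long total = 0;
    while (fgets(line, sizeof line, f) != NULL)
        if (strstr(line, tag) != NULL) total += (long)region_bytes(line);
    fclose(f);
    return total;
}
```

Calling @tagged_bytes("/proc/self/maps", "[heap]")@ at regular intervals would give one sample of the \textit{heap} curve per call.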
---|
[3c79ea9] | 520 | |
---|
[4994d67] | 521 | The X-axis shows the time when the memory information is polled. |
---|
| 522 | The Y-axis shows the memory usage in bytes. |
---|
[3c79ea9] | 523 | |
---|
[4994d67] | 524 | For the experiment, at any given time in the program's life, the difference between the memory requested by the benchmark (\textit{current\_req\_mem(B)}) and the memory that the process has received from the system (\textit{heap}, \textit{mmap}) should be minimal. |
---|
[3c79ea9] | 525 | This difference is the memory overhead caused by the allocator and shows the level of fragmentation in the allocator. |
---|
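Written as a formula, and assuming the overhead is taken over the \textit{heap} and \textit{mmap} curves as stated above (the symbol $\textit{overhead}$ and the polling time $t$ are introduced here only for illustration), the quantity to minimize at each poll is:
\[
\textit{overhead}(t) = \textit{heap}(t) + \textit{mmap}(t) - \textit{current\_req\_mem}(t)
\]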
| 526 | |
---|
[4994d67] | 527 | First, the differences in the shape of the curves between architectures (top ARM, bottom x64) are small; the main differences are in the amount of memory used. |
---|
| 528 | Hence, it is possible to focus on either the top or bottom graph. |
---|
[4b2ea0d] | 529 | The heap curve remains zero for 4 memory allocators: \textsf{hrd}, \textsf{je}, \textsf{pt3}, and \textsf{rp}. |
---|
| 530 | These memory allocators do not use the @sbrk@ area; instead, they only use @mmap@ to get memory from the system. |
---|
[4994d67] | 531 | |
---|
[4b2ea0d] | 532 | \textsf{hrd} and \textsf{tbb} have a higher memory footprint than the others as they use more total dynamic memory. |
---|
| 533 | One possible reason is the use of superblocks, as both of these memory allocators create superblocks where each block contains objects of the same size. |
---|
| 534 | These superblocks are maintained throughout the life of the program. |
---|
[4994d67] | 535 | |
---|
[4b2ea0d] | 536 | \textsf{pt3} is the only memory allocator for which the total dynamic memory goes down in the second half of the program lifetime when the memory is freed by the benchmark program. |
---|
| 537 | This makes \textsf{pt3} the only memory allocator that gives memory back to the operating system as it is freed by the program. |
---|
| 538 | |
---|
| 539 | % FOR 1 THREAD |
---|
[4994d67] | 540 | |
---|
[a6c10de] | 541 | %mem-1-prod-1-cons-100-llh.eps |
---|
[ba897d21] | 542 | \begin{figure} |
---|
| 543 | \centering |
---|
[a6c10de] | 544 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-llh} } |
---|
| 545 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-llh} } |
---|
[4b2ea0d] | 546 | \caption{Memory benchmark results with Configuration-1 for llh memory allocator} |
---|
[a6c10de] | 547 | \label{fig:mem-1-prod-1-cons-100-llh} |
---|
[ba897d21] | 548 | \end{figure} |
---|
| 549 | |
---|
[4994d67] | 550 | %mem-1-prod-1-cons-100-dl.eps |
---|
| 551 | \begin{figure} |
---|
| 552 | \centering |
---|
| 553 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-dl} } |
---|
| 554 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-dl} } |
---|
[4b2ea0d] | 555 | \caption{Memory benchmark results with Configuration-1 for dl memory allocator} |
---|
[4994d67] | 556 | \label{fig:mem-1-prod-1-cons-100-dl} |
---|
| 557 | \end{figure} |
---|
| 558 | |
---|
| 559 | %mem-1-prod-1-cons-100-glc.eps |
---|
| 560 | \begin{figure} |
---|
| 561 | \centering |
---|
| 562 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-glc} } |
---|
| 563 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-glc} } |
---|
[4b2ea0d] | 564 | \caption{Memory benchmark results with Configuration-1 for glibc memory allocator} |
---|
[4994d67] | 565 | \label{fig:mem-1-prod-1-cons-100-glc} |
---|
| 566 | \end{figure} |
---|
| 567 | |
---|
| 568 | %mem-1-prod-1-cons-100-hrd.eps |
---|
| 569 | \begin{figure} |
---|
| 570 | \centering |
---|
| 571 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-hrd} } |
---|
| 572 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-hrd} } |
---|
[4b2ea0d] | 573 | \caption{Memory benchmark results with Configuration-1 for hoard memory allocator} |
---|
[4994d67] | 574 | \label{fig:mem-1-prod-1-cons-100-hrd} |
---|
| 575 | \end{figure} |
---|
| 576 | |
---|
| 577 | %mem-1-prod-1-cons-100-je.eps |
---|
| 578 | \begin{figure} |
---|
| 579 | \centering |
---|
| 580 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-je} } |
---|
| 581 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-je} } |
---|
[4b2ea0d] | 582 | \caption{Memory benchmark results with Configuration-1 for je memory allocator} |
---|
[4994d67] | 583 | \label{fig:mem-1-prod-1-cons-100-je} |
---|
| 584 | \end{figure} |
---|
| 585 | |
---|
| 586 | %mem-1-prod-1-cons-100-pt3.eps |
---|
| 587 | \begin{figure} |
---|
| 588 | \centering |
---|
| 589 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-pt3} } |
---|
| 590 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-pt3} } |
---|
[4b2ea0d] | 591 | \caption{Memory benchmark results with Configuration-1 for pt3 memory allocator} |
---|
[4994d67] | 592 | \label{fig:mem-1-prod-1-cons-100-pt3} |
---|
| 593 | \end{figure} |
---|
| 594 | |
---|
| 595 | %mem-1-prod-1-cons-100-rp.eps |
---|
| 596 | \begin{figure} |
---|
| 597 | \centering |
---|
| 598 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-rp} } |
---|
| 599 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-rp} } |
---|
[4b2ea0d] | 600 | \caption{Memory benchmark results with Configuration-1 for rp memory allocator} |
---|
[4994d67] | 601 | \label{fig:mem-1-prod-1-cons-100-rp} |
---|
| 602 | \end{figure} |
---|
| 603 | |
---|
| 604 | %mem-1-prod-1-cons-100-tbb.eps |
---|
| 605 | \begin{figure} |
---|
| 606 | \centering |
---|
| 607 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-1-prod-1-cons-100-tbb} } |
---|
| 608 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-1-prod-1-cons-100-tbb} } |
---|
[4b2ea0d] | 609 | \caption{Memory benchmark results with Configuration-1 for tbb memory allocator} |
---|
[4994d67] | 610 | \label{fig:mem-1-prod-1-cons-100-tbb} |
---|
| 611 | \end{figure} |
---|
| 612 | |
---|
[4b2ea0d] | 613 | % FOR 4 THREADS |
---|
| 614 | |
---|
| 615 | %mem-4-prod-4-cons-100-llh.eps |
---|
| 616 | \begin{figure} |
---|
| 617 | \centering |
---|
| 618 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-llh} } |
---|
| 619 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-llh} } |
---|
| 620 | \caption{Memory benchmark results with Configuration-2 for llh memory allocator} |
---|
| 621 | \label{fig:mem-4-prod-4-cons-100-llh} |
---|
| 622 | \end{figure} |
---|
| 623 | |
---|
| 624 | %mem-4-prod-4-cons-100-dl.eps |
---|
| 625 | \begin{figure} |
---|
| 626 | \centering |
---|
| 627 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-dl} } |
---|
| 628 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-dl} } |
---|
| 629 | \caption{Memory benchmark results with Configuration-2 for dl memory allocator} |
---|
| 630 | \label{fig:mem-4-prod-4-cons-100-dl} |
---|
| 631 | \end{figure} |
---|
| 632 | |
---|
| 633 | %mem-4-prod-4-cons-100-glc.eps |
---|
| 634 | \begin{figure} |
---|
| 635 | \centering |
---|
| 636 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-glc} } |
---|
| 637 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-glc} } |
---|
| 638 | \caption{Memory benchmark results with Configuration-2 for glibc memory allocator} |
---|
| 639 | \label{fig:mem-4-prod-4-cons-100-glc} |
---|
| 640 | \end{figure} |
---|
| 641 | |
---|
| 642 | %mem-4-prod-4-cons-100-hrd.eps |
---|
| 643 | \begin{figure} |
---|
| 644 | \centering |
---|
| 645 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-hrd} } |
---|
| 646 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-hrd} } |
---|
| 647 | \caption{Memory benchmark results with Configuration-2 for hoard memory allocator} |
---|
| 648 | \label{fig:mem-4-prod-4-cons-100-hrd} |
---|
| 649 | \end{figure} |
---|
| 650 | |
---|
| 651 | %mem-4-prod-4-cons-100-je.eps |
---|
| 652 | \begin{figure} |
---|
| 653 | \centering |
---|
| 654 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-je} } |
---|
| 655 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-je} } |
---|
| 656 | \caption{Memory benchmark results with Configuration-2 for je memory allocator} |
---|
| 657 | \label{fig:mem-4-prod-4-cons-100-je} |
---|
| 658 | \end{figure} |
---|
| 659 | |
---|
| 660 | %mem-4-prod-4-cons-100-pt3.eps |
---|
| 661 | \begin{figure} |
---|
| 662 | \centering |
---|
| 663 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-pt3} } |
---|
| 664 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-pt3} } |
---|
| 665 | \caption{Memory benchmark results with Configuration-2 for pt3 memory allocator} |
---|
| 666 | \label{fig:mem-4-prod-4-cons-100-pt3} |
---|
| 667 | \end{figure} |
---|
| 668 | |
---|
| 669 | %mem-4-prod-4-cons-100-rp.eps |
---|
| 670 | \begin{figure} |
---|
| 671 | \centering |
---|
| 672 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-rp} } |
---|
| 673 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-rp} } |
---|
| 674 | \caption{Memory benchmark results with Configuration-2 for rp memory allocator} |
---|
| 675 | \label{fig:mem-4-prod-4-cons-100-rp} |
---|
| 676 | \end{figure} |
---|
| 677 | |
---|
[ba897d21] | 678 | %mem-4-prod-4-cons-100-tbb.eps |
---|
| 679 | \begin{figure} |
---|
| 680 | \centering |
---|
[b81ab1c6] | 681 | \subfigure[Algol]{ \includegraphics[width=0.95\textwidth]{evaluations/algol-perf-eps/mem-4-prod-4-cons-100-tbb} } |
---|
| 682 | \subfigure[Nasus]{ \includegraphics[width=0.95\textwidth]{evaluations/nasus-perf-eps/mem-4-prod-4-cons-100-tbb} } |
---|
[4b2ea0d] | 683 | \caption{Memory benchmark results with Configuration-2 for tbb memory allocator} |
---|
[ba897d21] | 684 | \label{fig:mem-4-prod-4-cons-100-tbb} |
---|
| 685 | \end{figure} |
---|