Changeset 1eec0b0 for doc/theses/mubeen_zulfiqar_MMath/intro.tex
- Timestamp: Feb 22, 2022, 2:42:45 PM
- Branches: ADT, ast-experimental, enum, master, pthread-emulation, qualifiedEnum
- Children: 5cefa43
- Parents: 5c216b4
- git-author: Peter A. Buhr <pabuhr@…> (02/20/22 20:37:23)
- git-committer: Peter A. Buhr <pabuhr@…> (02/22/22 14:42:45)
- File: 1 edited
doc/theses/mubeen_zulfiqar_MMath/intro.tex
\chapter{Introduction}


\section{Introduction}

% Shared-memory multi-processor computers are ubiquitous and important for improving application performance.
% However, writing programs that take advantage of multiple processors is not an easy task~\cite{Alexandrescu01b}, \eg shared resources can become a bottleneck when increasing (scaling) threads.
% One crucial shared resource is program memory, since it is used by all threads in a shared-memory concurrent-program~\cite{Berger00}.
% Therefore, providing high-performance, scalable memory-management is important for virtually all shared-memory multi-threaded programs.

Memory management takes a sequence of program-generated allocation/deallocation requests and attempts to satisfy them within a fixed-sized block of memory while minimizing the total amount of memory used.
A general-purpose dynamic-allocation algorithm cannot anticipate future allocation requests, so its output is rarely optimal.
However, memory allocators do take advantage of regularities in the allocation patterns of typical programs to produce excellent results, both in time and space (similar to LRU paging).
In general, allocators use a number of similar techniques, each optimizing specific allocation patterns.
Nevertheless, memory allocators are a series of compromises, occasionally with some static or dynamic tuning parameters to optimize specific program-request patterns.


\subsection{Memory Structure}
\label{s:MemoryStructure}

\VRef[Figure]{f:ProgramAddressSpace} shows the typical layout of a program's address space divided into the following zones (right to left): static code/data, dynamic allocation, dynamic code/data, and stack, with free memory surrounding the dynamic code/data~\cite{memlayout}.
Static code and data are placed into memory at load time from the executable and are fixed-sized at runtime.
Dynamic-allocation memory starts empty and grows/shrinks as the program dynamically creates/deletes variables with independent lifetimes.
The programming-language's runtime manages this area, where management complexity is a function of the mechanism for deleting variables.
Dynamic code/data memory is managed by the dynamic loader for libraries loaded at runtime, which is complex, especially in a multi-threaded program~\cite{Huang06}.
However, changes to the dynamic code/data space are typically infrequent, many occurring at program startup, and are largely outside of a program's control.
Stack memory is managed by the program call-mechanism using simple LIFO management, which works well for sequential programs.
For multi-threaded programs (and coroutines), a new stack is created for each thread;
these thread stacks are commonly created in dynamic-allocation memory.
This thesis focuses on management of the dynamic-allocation memory.

\begin{figure}
\centering
\input{AddressSpace}
\vspace{-5pt}
\caption{Program Address Space Divided into Zones}
\label{f:ProgramAddressSpace}
\end{figure}


\subsection{Dynamic Memory-Management}
\label{s:DynamicMemoryManagement}

Modern programming languages manage dynamic-allocation memory in different ways.
Some languages, such as Lisp~\cite{CommonLisp}, Java~\cite{Java}, Go~\cite{Go}, and Haskell~\cite{Haskell}, provide explicit allocation but \emph{implicit} deallocation of data through garbage collection~\cite{Wilson92}.
In general, garbage collection supports memory compaction, where dynamic (live) data is moved during runtime to better utilize space.
However, moving data requires finding pointers to it and updating them to reflect the new data locations.
Programming languages such as C~\cite{C}, \CC~\cite{C++}, and Rust~\cite{Rust} provide the programmer with explicit allocation \emph{and} deallocation of data.
These languages cannot find and subsequently move live data because pointers can be created to any storage zone, including internal components of allocated objects, and may contain temporary invalid values generated by pointer arithmetic.
Attempts have been made to perform quasi garbage collection in C/\CC~\cite{Boehm88}, but it is a compromise.
This thesis only examines dynamic memory-management with \emph{explicit} deallocation.
While garbage collection and compaction are not part of this work, many of the results are applicable to the allocation phase in any memory-management approach.
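As an illustration only (this fragment is not drawn from the thesis implementation), the following C code creates an \emph{interior} pointer into the middle of an allocated object;
a compacting collector would have to find and update both the object pointer and the interior pointer, as well as any temporarily invalid intermediate values produced by pointer arithmetic, which cannot be done precisely from C's untyped memory.
\begin{lstlisting}[language=C]
#include <stdlib.h>

struct Node { int header; int data[100]; };

int main( void ) {
	struct Node * node = malloc( sizeof(struct Node) );	// dynamically allocated object
	if ( node == NULL ) return 1;		// allocation failure
	int * interior = &node->data[50];	// interior pointer: refers inside node, not to its start
	interior -= 25;				// pointer arithmetic yields another interior pointer
	*interior = 7;				// node is live, but possibly reachable only via interior
	free( node );				// deallocation must therefore be explicit
}
\end{lstlisting}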
Most programs use a general-purpose allocator, often the one provided implicitly by the programming-language's runtime.
When this allocator proves inadequate, programmers often write specialized allocators for specific needs.
C and \CC allow easy replacement of the default memory allocator with an alternative specialized or general-purpose memory-allocator.
(Jikes RVM MMTk~\cite{MMTk} provides a similar generalization for the Java virtual machine.)
However, high-performance memory-allocators for kernel and user multi-threaded programs are still being designed and improved.
For this reason, several alternative general-purpose allocators have been written for C/\CC with the goal of scaling in a multi-threaded program~\cite{Berger00,mtmalloc,streamflow,tcmalloc}.
This work examines the design of high-performance allocators for use by kernel and user multi-threaded applications written in C/\CC.
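To make the replacement mechanism concrete, the following minimal sketch (illustrative only, and not the allocator developed in this thesis) interposes on the default C allocator simply by defining the standard entry points;
linking these definitions into a program, or preloading them as a shared library, overrides the glibc versions.
\begin{lstlisting}[language=C]
#include <stddef.h>
#include <string.h>

static _Alignas(16) char arena[1 << 20];	// fixed 1 MiB arena; bump allocation, storage never reused
static size_t next = 0;

void * malloc( size_t size ) {
	size_t total = 16 + ((size + 15) & ~(size_t)15);	// 16-byte header plus padded request
	if ( next + total > sizeof(arena) ) return NULL;	// arena exhausted => allocation failure
	char * block = &arena[next];
	next += total;
	*(size_t *)block = size;		// remember request size for realloc
	return block + 16;			// payload is 16-byte aligned
}

void free( void * addr ) { (void)addr; }	// storage is never reused in this sketch

void * calloc( size_t dim, size_t elemSize ) {
	void * addr = malloc( dim * elemSize );
	if ( addr ) memset( addr, 0, dim * elemSize );	// zero fill
	return addr;
}

void * realloc( void * oaddr, size_t size ) {
	void * addr = malloc( size );
	if ( addr && oaddr ) {
		size_t osize = *(size_t *)((char *)oaddr - 16);	// original request size
		memcpy( addr, oaddr, osize < size ? osize : size );
	}
	return addr;
}
\end{lstlisting}
A usable replacement must also supply the remaining entry points, \eg the aligned-allocation routines, and must be thread-safe;
this sketch is neither, and only shows that the allocation interface is interchangeable.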

\subsection{Contributions}
\label{s:Contributions}

This work provides the following contributions in the area of concurrent dynamic allocation:
\begin{enumerate}
\item
Implementation of a new stand-alone concurrent memory allocator ($\approx$1,200 lines of code) for C/\CC programs using kernel threads (1:1 threading), and specialized versions of the allocator for the programming languages \uC and \CFA using user-level threads running over multiple kernel threads (M:N threading).

\item
Adopt the return of @nullptr@ for a zero-sized allocation, rather than an actual memory address, both of which can be passed to @free@.
Most allocators use @nullptr@ to indicate an allocation failure, such as full memory;
hence the need to return an alternate value for a zero-sized allocation.
The alternative is to abort the program on allocation failure.
In theory, notifying the programmer of a failure allows recovery;
in practice, it is almost impossible to gracefully recover from allocation failure, especially full memory, so the cheaper return of @nullptr@ for a zero-sized allocation is adopted.

\item
Extended the standard C heap functionality by preserving with each allocation its original request size versus the amount allocated due to bucketing, whether an allocation is zero filled, and the allocation alignment.

\item
Use the zero fill and alignment as \emph{sticky} properties for @realloc@, to realign existing storage, or preserve existing zero-fill and alignment when storage is copied.
Without this extension, it is unsafe to @realloc@ storage initially allocated with zero-fill/alignment, as these properties are not preserved when copying.
This silent generation of a problem is unintuitive to programmers and difficult to locate because it is transient.

\item
Provide additional heap operations to complete programmer expectations with respect to accessing different allocation properties (a usage sketch of these and the following query and statistics routines appears after this list of contributions).
\begin{itemize}
\item
@resize( oaddr, size )@ re-purpose an old allocation for a new type \emph{without} preserving fill or alignment.
\item
@resize( oaddr, alignment, size )@ re-purpose an old allocation with new alignment but \emph{without} preserving fill.
\item
@realloc( oaddr, alignment, size )@ same as the previous @realloc@ but adding or changing alignment.
\item
@aalloc( dim, elemSize )@ same as @calloc@ except memory is \emph{not} zero filled.
\item
@amemalign( alignment, dim, elemSize )@ same as @aalloc@ with memory alignment.
\item
@cmemalign( alignment, dim, elemSize )@ same as @calloc@ with memory alignment.
\end{itemize}

\item
Provide additional query operations to access information about an allocation:
\begin{itemize}
\item
@malloc_alignment( addr )@ returns the alignment of the allocation pointed-to by @addr@.
If the allocation is not aligned or @addr@ is the @nulladdr@, the minimal alignment is returned.
\item
@malloc_zero_fill( addr )@ returns a boolean result indicating if the memory pointed-to by @addr@ is allocated with zero fill, e.g., by @calloc@/@cmemalign@.
\item
@malloc_size( addr )@ returns the size of the memory allocation pointed-to by @addr@.
\item
@malloc_usable_size( addr )@ returns the usable size of the memory pointed-to by @addr@, i.e., the bin size containing the allocation, where @malloc_size( addr )@ $\le$ @malloc_usable_size( addr )@.
\end{itemize}

\item
Provide complete and fast allocation statistics to help understand program behaviour:
\begin{itemize}
\item
@malloc_stats()@ print memory-allocation statistics on the file-descriptor set by @malloc_stats_fd@.
\item
@malloc_info( options, stream )@ print memory-allocation statistics as an XML string on the file-descriptor set by @malloc_stats_fd@.
\item
@malloc_stats_fd( fd )@ set the file-descriptor number for printing memory-allocation statistics (default @STDERR_FILENO@).
This file descriptor is used implicitly by @malloc_stats@ and @malloc_info@.
\end{itemize}

\item
Provide mostly contention-free allocation and free operations via a heap-per-kernel-thread implementation.

\item
Provide extensive contention-free runtime checks to validate allocation operations and identify the amount of unfreed storage at program termination.

\item
Build 4 different versions of the allocator:
\begin{itemize}
\item
static or dynamic linking
\item
statistic/debugging (testing) or no statistic/debugging (performance)
\end{itemize}
A program may link to any of these 4 versions of the allocator, often without recompilation.
(It is possible to separate statistics and debugging, giving 8 different versions.)

\item
A micro-benchmark test-suite for comparing allocators rather than relying on a suite of arbitrary programs.
These micro-benchmarks have adjustment knobs to simulate allocation patterns hard-coded into arbitrary test programs.
\end{enumerate}
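The following usage sketch exercises several of the above routines from C;
the header containing the prototypes and the return types of the query routines are assumed here for illustration, and the overloaded @resize@/@realloc@ forms taking an alignment appear only in comments because plain C has no overloading.
\begin{lstlisting}[language=C]
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>			// STDERR_FILENO
#include <malloc.h>			// assumed location of the extended prototypes

int main( void ) {
	void * zero = malloc( 0 );	// this allocator returns NULL for a zero-sized request ...
	free( zero );			// ... and free accepts NULL

	int * a = aalloc( 100, sizeof(int) );		// like calloc, but not zero filled
	int * b = amemalign( 64, 100, sizeof(int) );	// like aalloc, 64-byte aligned
	int * c = cmemalign( 64, 100, sizeof(int) );	// like calloc, 64-byte aligned

	a = resize( a, 200 * sizeof(int) );	// re-purpose storage; fill/alignment not preserved
	c = realloc( c, 200 * sizeof(int) );	// zero fill and 64-byte alignment are sticky
	// resize( oaddr, alignment, size ) and realloc( oaddr, alignment, size ) add or change
	// alignment; these overloaded forms need overloading, as in C++ or CFA.

	printf( "align %zu zero-fill %d size %zu usable %zu\n",
		(size_t)malloc_alignment( c ), (int)malloc_zero_fill( c ),
		(size_t)malloc_size( c ), malloc_usable_size( c ) );

	malloc_stats_fd( STDERR_FILENO );	// where malloc_stats/malloc_info print (the default)
	malloc_stats();				// dump allocation statistics

	free( a ); free( b ); free( c );
}
\end{lstlisting}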

\begin{comment}
\noindent
====================
…
\section{Introduction}
Dynamic memory allocation and management is one of the core features of C.
It gives the programmer the freedom to allocate, free, use, and manage dynamic memory.
The programmer is not given complete control of dynamic memory management;
instead, a memory-allocator interface is provided, which the programmer uses to allocate/free dynamic memory for the application.

A memory allocator is a layer between the programmer and the system.
The allocator obtains dynamic memory from the system in the heap/mmap area of application storage and manages it for the programmer's use.

The GNU C Library (FIX ME: cite this) provides an interchangeable memory allocator that can be replaced with a custom memory allocator that supports required features and fulfills the application's custom needs.
It also allows others to innovate in memory allocation and design their own memory allocator.
The GNU C Library has set guidelines that should be followed when designing a stand-alone memory allocator, and requires new memory allocators to provide at least the following set of functions in their interface:

\begin{itemize}
…
\end{itemize}

In addition to the above functions, the GNU C Library also provides some more functions to increase the usability of the dynamic memory allocator.
Most stand-alone allocators also provide all or some of the above additional functions.

\begin{itemize}
…
\end{itemize}

With the rise of concurrent applications, memory allocators should be able to fulfill dynamic memory requests from multiple threads in parallel without causing contention on shared resources.
There needs to be a set of standard benchmarks that can be used to evaluate an allocator's performance in different scenarios.

\section{Research Objectives}
…
Design a lightweight concurrent memory allocator with added features and usability that are currently not present in other memory allocators.
\item
Design a suite of benchmarks to evaluate multiple aspects of a memory allocator.
\end{itemize}

\section{An outline of the thesis}
LAST FIX ME: add outline at the end
\end{comment}