source: doc/theses/mubeen_zulfiqar_MMath/allocator.tex @ 1f8dbfe

Last change on this file since 1f8dbfe was 1f8dbfe, checked in by m3zulfiq <m3zulfiq@…>, 3 years ago

Added the new routines in C and CFA allocator interface

\chapter{Allocator}

\noindent
====================

Writing Points:
\begin{itemize}
\item
Objective of uHeapLmmm.
\item
Design philosophy.
\item
Background and previous design of uHeapLmmm.
\item
Distributed design of uHeapLmmm.

----- SHOULD WE GIVE IMPLEMENTATION DETAILS HERE? -----

\PAB{Maybe. There might be an Implementation chapter.}
\item
figure.
\item
Advantages of distributed design.
\end{itemize}

The new features added to uHeapLmmm (incl. @malloc_size@ routine)
\CFA alloc interface with examples.
\begin{itemize}
\item
Why did we need it?
\item
The added benefits.
\end{itemize}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% uHeapLmmm Design
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Objective of uHeapLmmm}
uHeapLmmm is a lightweight memory allocator. Its objective is a minimal concurrent memory allocator that adds new features while still fulfilling the GNU C Library requirements (FIX ME: cite requirements).

\subsection{Design philosophy}
The objective of uHeapLmmm's new design was to fulfill the following requirements:
\begin{itemize}
\item It should be concurrent so that it can be used in multi-threaded programs.
\item It should avoid, as much as possible, global locks on resources shared across all threads.
\item Its performance (FIX ME: cite performance benchmarks) should be comparable to the commonly used allocators (FIX ME: cite common allocators).
\item It should be a lightweight memory allocator.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Background and previous design of uHeapLmmm}
uHeapLmmm was originally designed by X in X (FIX ME: add original author after confirming with Peter).
(FIX ME: make and add figure of previous design with description)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Distributed design of uHeapLmmm}
uHeapLmmm's design was reviewed and changed to fulfill new requirements (FIX ME: cite allocator philosophy). For this purpose, the following two designs of uHeapLmmm were proposed:

\paragraph{Design 1: Decentralized}
Fixed number of heaps: shard the heap into N heaps, each with a bump-area allocated from the @sbrk@ area.
Kernel threads (KT) are assigned to the N heaps.
When KTs $\le$ N, the heaps are uncontended.
When KTs $>$ N, the heaps are contended.
By adjusting N, this approach reduces storage at the cost of speed due to contention.
In all cases, a thread acquires/releases a lock, contended or uncontended.
\begin{cquote}
\centering
\input{AllocDS1}
\end{cquote}
Problems: need to know when a KT is created and destroyed to know when to assign/un-assign a heap to the KT.

\paragraph{Design 2: Centralized}
One heap, but lower bucket sizes are N-shared across KTs.
This design leverages the fact that 95\% of allocation requests are less than 512 bytes and there are only 3--5 different request sizes.
When KTs $\le$ N, the important bucket sizes are uncontended.
When KTs $>$ N, the free buckets are contended.
Therefore, threads are only contending for a small number of buckets, which are distributed among them to reduce contention.
\begin{cquote}
\centering
\input{AllocDS2}
\end{cquote}
Problems: need to know when a kernel thread (KT) is created and destroyed to know when to assign a shared bucket-number.
When no thread is assigned a bucket number, its free storage is unavailable. All KTs contend for one lock on @sbrk@ for their initial allocations (before the free lists are populated).

Out of the two designs, Design 1 was chosen because its concurrency is better across all bucket sizes: Design 2 shards only a few buckets of selected sizes, while Design 1 shards all the buckets. Design 1 shards the whole heap, which has all the buckets, and additionally shards the @sbrk@ area.
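
The decentralized design can be illustrated with a minimal C sketch. It is not the uHeapLmmm implementation: the names @Heap@, @heaps_init@, @heap_for@, and @bump_alloc@, the constants @N_HEAPS@ and @RESERVE@, the fixed array standing in for storage taken from @sbrk@, and the modulo assignment of KTs to heaps are all assumptions made for illustration. It only shows that each heap owns its own lock and bump area, so two KTs contend only when they map to the same heap.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { N_HEAPS = 4, RESERVE = 1 << 16 };         // hypothetical values for N and the per-heap reserve

typedef struct {
    pthread_mutex_t lock;                        // acquired on every allocation, contended or not
    size_t used;                                 // bytes already handed out from the bump area
    _Alignas(16) unsigned char reserve[RESERVE]; // stand-in for storage taken from sbrk
} Heap;

static Heap heaps[N_HEAPS];

// Create the N heaps once, before any allocation.
static void heaps_init(void) {
    for (int i = 0; i < N_HEAPS; i += 1) {
        pthread_mutex_init(&heaps[i].lock, NULL);
        heaps[i].used = 0;
    }
}

// Assign a kernel-thread number to one of the N heaps (assumed policy: modulo).
static Heap *heap_for(unsigned kt) { return &heaps[kt % N_HEAPS]; }

// Bump-allocate size bytes, rounded up to a multiple of 16, from the heap assigned to kt.
static void *bump_alloc(unsigned kt, size_t size) {
    if (size > RESERVE) return NULL;             // can never fit, and keeps the rounding safe
    size = (size + 15) & ~(size_t)15;
    Heap *h = heap_for(kt);
    void *p = NULL;
    pthread_mutex_lock(&h->lock);
    assert(h->used <= RESERVE);
    if (size <= RESERVE - h->used) {
        p = h->reserve + h->used;
        h->used += size;
    }
    pthread_mutex_unlock(&h->lock);
    return p;
}
```

With @N_HEAPS@ set to 4, KTs 0 and 4 map to the same heap and serialize on its lock, while KTs 0 and 1 never touch each other's lock.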

\subsection{Advantages of distributed design}
The distributed design of uHeapLmmm is concurrent, so it works in multi-threaded applications.

Some key benefits of the distributed design of uHeapLmmm are as follows:

\begin{itemize}
\item
Bump allocation is concurrent because memory taken from @sbrk@ is sharded across all heaps as a bump-allocation reserve. The lock on bump allocation (on memory taken from @sbrk@) is only contended if KTs $>$ N. Contention on the @sbrk@ area is less likely, as it only happens when the heaps assigned to two KTs run short of bump-allocation reserve simultaneously.
\item
N heaps are created at the start of the program and destroyed at the end of the program. When a KT is created, it is only assigned to one of the heaps. When a KT is destroyed, it is only dissociated from its assigned heap; the heap is not destroyed. That heap goes back to the pool of heaps, ready to be used by some new KT. If the heap was shared among multiple KTs (as in the case of KTs $>$ N), then on deletion of one KT the heap is still in use by the other KTs. Because heaps are reusable, no heaps are created or deleted at run time, which helps keep the memory footprint low.
\item
It is possible to use sharing and stealing techniques to share/find unused storage when a free list is unused or empty.
\item
The distributed design avoids unnecessary locks on resources shared across all KTs.
\end{itemize}

FIX ME: Cite performance comparison of the two heap designs if required

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Added Features and Methods}
To improve the uHeapLmmm allocator (FIX ME: cite uHeapLmmm) interface and make it more user friendly, we added a few more routines to the C allocator. We also built a \CFA (FIX ME: cite cforall) interface on top of the C interface to increase the usability of the allocator.

\subsection{C Interface}
We added a few more features and routines to the allocator's C interface that make the allocator more usable to programmers. These features give the programmer more control over dynamic memory allocation.

\subsubsection{\lstinline{void * aalloc( size_t dim, size_t elemSize )}}
@aalloc@ is an extension of @malloc@. It allows the programmer to allocate a dynamic array of objects without calculating the total size of the array explicitly. The only alternative to this routine in other allocators is @calloc@, but @calloc@ also fills the dynamic memory with 0, which makes it slower for a programmer who only wants to allocate an array of objects without filling it with 0.
\paragraph{Usage}
@aalloc@ takes two parameters.
\begin{itemize}
\item
@dim@: number of objects in the array.
\item
@elemSize@: size of each object in the array.
\end{itemize}
It returns the address of a dynamic object allocated on the heap that can contain @dim@ objects of size @elemSize@. On failure, it returns a @NULL@ pointer.
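
The behaviour described above can be approximated on top of standard @malloc@. The following is a minimal sketch, not the uHeapLmmm routine: the name @aalloc_sketch@ is hypothetical, and the overflow check on @dim * elemSize@ is an assumption about how such a routine would guard the multiplication.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical stand-in for aalloc: array allocation without zero fill.
static void *aalloc_sketch(size_t dim, size_t elemSize) {
    // Reject requests whose total size dim * elemSize would overflow size_t.
    if (elemSize != 0 && dim > SIZE_MAX / elemSize) return NULL;
    return malloc(dim * elemSize);               // contents are left uninitialized
}
```

A call such as @aalloc_sketch( 8, sizeof(int) )@ replaces the explicit @malloc( 8 * sizeof(int) )@ and skips the zero fill that @calloc( 8, sizeof(int) )@ would perform.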

\subsubsection{\lstinline{void * resize( void * oaddr, size_t size )}}
@resize@ is an extension of @realloc@. It allows the programmer to reuse a currently allocated dynamic object with a new size requirement. Its alternative in other allocators is @realloc@, but @realloc@ also copies the data from the old object to the new object, which makes it slower for a programmer who only wants to reuse an old dynamic object for a new size requirement and does not need the old data preserved.
\paragraph{Usage}
@resize@ takes two parameters.
\begin{itemize}
\item
@oaddr@: the address of the old object that needs to be resized.
\item
@size@: the new size to which the old object needs to be resized.
\end{itemize}
It returns an object of the given size, but it does not preserve the data in the old object. On failure, it returns a @NULL@ pointer.
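
The difference from @realloc@ can be seen in a minimal sketch built from standard @malloc@ and @free@. The name @resize_sketch@ is hypothetical, and this is not the uHeapLmmm routine, which can reuse the old storage in place; leaving the old object untouched on failure is also an assumption. The only point illustrated is that no data is copied.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical stand-in for resize: storage of the new size, old contents NOT copied.
static void *resize_sketch(void *oaddr, size_t size) {
    void *naddr = malloc(size);
    if (naddr == NULL) return NULL;              // assumption: old object stays valid on failure
    free(oaddr);                                 // free(NULL) is a no-op, so oaddr may be NULL
    return naddr;
}
```

A buffer that is refilled from scratch on every use is the typical client: each call such as @buf = resize_sketch( buf, n )@ obtains room for @n@ bytes without paying for a copy of stale data.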

\subsubsection{\lstinline{void * resize( void * oaddr, size_t nalign, size_t size )}}
This @resize@ is an extension of the above @resize@ (FIX ME: cite above resize). In addition to changing the size of an old object, it can also realign the old object to a new alignment requirement.
\paragraph{Usage}
This @resize@ takes three parameters. Compared to the above @resize@ (FIX ME: cite above resize), it takes the additional parameter @nalign@.
\begin{itemize}
\item
@oaddr@: the address of the old object that needs to be resized.
\item
@nalign@: the new alignment to which the old object needs to be realigned.
\item
@size@: the new size to which the old object needs to be resized.
\end{itemize}
It returns an object with the size and alignment given in the parameters. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{void * amemalign( size_t alignment, size_t dim, size_t elemSize )}}
@amemalign@ is a hybrid of @memalign@ and @aalloc@. It allows the programmer to allocate an aligned dynamic array of objects without calculating the total size of the array explicitly.
\paragraph{Usage}
@amemalign@ takes three parameters.
\begin{itemize}
\item
@alignment@: the alignment to which the dynamic array needs to be aligned.
\item
@dim@: number of objects in the array.
\item
@elemSize@: size of each object in the array.
\end{itemize}
It returns a dynamic array with the capacity to contain @dim@ objects of size @elemSize@. The returned dynamic array is aligned to the given alignment. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{void * cmemalign( size_t alignment, size_t dim, size_t elemSize )}}
@cmemalign@ is a hybrid of @amemalign@ and @calloc@. It allows the programmer to allocate an aligned dynamic array of objects that is 0 filled. The current way to do this in other allocators is to allocate an aligned object with @memalign@ and then fill it with 0 explicitly. This routine provides both alignment and 0 filling implicitly.
\paragraph{Usage}
@cmemalign@ takes three parameters.
\begin{itemize}
\item
@alignment@: the alignment to which the dynamic array needs to be aligned.
\item
@dim@: number of objects in the array.
\item
@elemSize@: size of each object in the array.
\end{itemize}
It returns a dynamic array with the capacity to contain @dim@ objects of size @elemSize@. The returned dynamic array is aligned to the given alignment and is 0 filled. On failure, it returns a @NULL@ pointer.
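
The two-step idiom that @cmemalign@ folds into a single call can be sketched with standard C11 routines. The name @cmemalign_sketch@ is hypothetical, and the sketch substitutes C11 @aligned_alloc@ for @memalign@ to stay portable; the power-of-two check and the rounding of the size to a multiple of the alignment are assumptions made to satisfy @aligned_alloc@, not documented behaviour of uHeapLmmm.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for cmemalign: aligned array allocation followed by an explicit zero fill.
static void *cmemalign_sketch(size_t alignment, size_t dim, size_t elemSize) {
    // aligned_alloc needs a power-of-two alignment.
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) return NULL;
    // Reject requests whose total size would overflow size_t.
    if (elemSize != 0 && dim > SIZE_MAX / elemSize) return NULL;
    size_t size = dim * elemSize;
    if (size > SIZE_MAX - (alignment - 1)) return NULL;
    // C11 asks for a size that is a multiple of the alignment, so round up.
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    void *p = aligned_alloc(alignment, rounded);
    if (p != NULL) memset(p, 0, size);           // step two: the explicit zero fill
    return p;
}
```

Dropping the @memset@ line gives the corresponding sketch for @amemalign@.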

\subsubsection{\lstinline{size_t malloc_alignment( void * addr )}}
@malloc_alignment@ returns the alignment of a currently allocated dynamic object. It helps the programmer with memory management and personal bookkeeping. In particular, it helps in verifying the alignment of a dynamic object in a scenario similar to producer-consumer, where a producer allocates a dynamic object and the consumer needs to ensure that the object was allocated with the required alignment.
\paragraph{Usage}
@malloc_alignment@ takes one parameter.
\begin{itemize}
\item
@addr@: the address of the currently allocated dynamic object.
\end{itemize}
@malloc_alignment@ returns the alignment of the given dynamic object. On failure, it returns the default alignment of the uHeapLmmm allocator.

\subsubsection{\lstinline{bool malloc_zero_fill( void * addr )}}
@malloc_zero_fill@ returns whether a currently allocated dynamic object was zero filled at the time of allocation. It helps the programmer with memory management and personal bookkeeping. In particular, it helps in verifying the zero-filled property of a dynamic object in a scenario similar to producer-consumer, where a producer allocates a dynamic object and the consumer needs to ensure that the object was zero filled at the time of allocation.
\paragraph{Usage}
@malloc_zero_fill@ takes one parameter.
\begin{itemize}
\item
@addr@: the address of the currently allocated dynamic object.
\end{itemize}
@malloc_zero_fill@ returns true if the dynamic object was initially zero filled and false otherwise. On failure, it returns false.

\subsubsection{\lstinline{size_t malloc_size( void * addr )}}
@malloc_size@ returns the allocation size of a currently allocated dynamic object. It helps the programmer with memory management and personal bookkeeping. In particular, it helps in verifying the size of a dynamic object in a scenario similar to producer-consumer, where a producer allocates a dynamic object and the consumer needs to ensure that the object was allocated with the required size. Its current alternative in other allocators is @malloc_usable_size@. However, @malloc_size@ differs from @malloc_usable_size@: @malloc_usable_size@ returns the total data capacity of the dynamic object, including the extra space at the end of the object, whereas @malloc_size@ returns the size that was given to the allocator when the object was allocated. This size is updated when an object is reallocated, resized, or passed through a similar allocator routine.
\paragraph{Usage}
@malloc_size@ takes one parameter.
\begin{itemize}
\item
@addr@: the address of the currently allocated dynamic object.
\end{itemize}
@malloc_size@ returns the allocation size of the given dynamic object. On failure, it returns zero.
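
One way to see what distinguishes the requested size from the usable size is to record the request in a header in front of each object. The following is a minimal sketch of that idea on top of standard @malloc@; the names @Header@, @sized_malloc@, @sized_size@, and @sized_free@ are hypothetical, and nothing here describes how uHeapLmmm actually lays out its headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical per-object header that remembers the size the caller asked for.
typedef union {
    size_t requested;                            // size passed to the allocation call
    max_align_t pad;                             // keeps the payload suitably aligned
} Header;

// Allocate size bytes and record the request in front of the payload.
static void *sized_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(Header)) return NULL;
    Header *h = malloc(sizeof(Header) + size);
    if (h == NULL) return NULL;
    h->requested = size;
    return h + 1;                                // payload starts right after the header
}

// Report the requested size, or zero for a NULL address.
static size_t sized_size(void *addr) {
    if (addr == NULL) return 0;
    return ((Header *)addr - 1)->requested;
}

// Release an object obtained from sized_malloc.
static void sized_free(void *addr) {
    if (addr != NULL) free((Header *)addr - 1);
}
```

For an object obtained with @sized_malloc( 13 )@, @sized_size@ reports 13, even though the underlying allocator may have reserved a larger block.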

\subsubsection{\lstinline{void * realloc( void * oaddr, size_t nalign, size_t size )}}
This @realloc@ is an extension of the default @realloc@ (FIX ME: cite default realloc). In addition to reallocating an old object and preserving its data, it can also realign the old object to a new alignment requirement.
\paragraph{Usage}
This @realloc@ takes three parameters. Compared to the default @realloc@, it takes the additional parameter @nalign@.
\begin{itemize}
\item
@oaddr@: the address of the old object that needs to be reallocated.
\item
@nalign@: the new alignment to which the old object needs to be realigned.
\item
@size@: the new size to which the old object needs to be resized.
\end{itemize}
It returns an object with the size and alignment given in the parameters that preserves the data in the old object. On failure, it returns a @NULL@ pointer.

\subsection{CFA Malloc Interface}
We added some routines to the malloc interface of \CFA. These routines can only be used in \CFA and not in our standalone uHeapLmmm allocator, as they use features that are provided by \CFA but not by C. They make the allocator even more usable to programmers.
\CFA lets a routine know the return type expected at a call to the allocator. So, in most of these added routines, we removed the object-size parameter, as the allocator can calculate the size of the object from the return type.
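
Plain C cannot infer the type from the call context, but the effect of dropping the size parameter can be approximated with macros that take the type explicitly. This is only a C analogy under stated assumptions, not the \CFA interface: the macro names @MALLOC_T@ and @CALLOC_T@ and the type @Point@ are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double x, y; } Point;           // hypothetical example type

// The type replaces the size argument; the macro derives sizeof(T) itself.
#define MALLOC_T(T)      ((T *)malloc(sizeof(T)))
#define CALLOC_T(T, dim) ((T *)calloc((dim), sizeof(T)))
```

Here @MALLOC_T( Point )@ plays the role of the parameterless polymorphic @malloc@, and @CALLOC_T( int, 4 )@ the role of the one-parameter @calloc@; in \CFA the type argument disappears as well, because it is taken from the type of the receiving pointer.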

\subsubsection{\lstinline{T * malloc( void )}}
This @malloc@ is a simplified polymorphic form of the default @malloc@ (FIX ME: cite malloc). It takes no parameters, whereas the default @malloc@ takes one parameter.
\paragraph{Usage}
This @malloc@ takes no parameters.
It returns a dynamic object of the size of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * aalloc( size_t dim )}}
This @aalloc@ is a simplified polymorphic form of the above @aalloc@ (FIX ME: cite aalloc). It takes one parameter, whereas the above @aalloc@ takes two parameters.
\paragraph{Usage}
@aalloc@ takes one parameter.
\begin{itemize}
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * calloc( size_t dim )}}
This @calloc@ is a simplified polymorphic form of the default @calloc@ (FIX ME: cite calloc). It takes one parameter, whereas the default @calloc@ takes two parameters.
\paragraph{Usage}
This @calloc@ takes one parameter.
\begin{itemize}
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@, and that is zero filled. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * resize( T * ptr, size_t size )}}
This @resize@ is a simplified polymorphic form of the above @resize@ (FIX ME: cite resize with alignment). It takes two parameters, whereas the above @resize@ takes three parameters. It frees the programmer from explicitly stating the alignment of the allocation, as \CFA lets the allocator obtain the alignment from the return type.
\paragraph{Usage}
This @resize@ takes two parameters.
\begin{itemize}
\item
@ptr@: address of the old object.
\item
@size@: the required size of the new object.
\end{itemize}
It returns a dynamic object of the size given in the parameters. The returned object is aligned to the alignment of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * realloc( T * ptr, size_t size )}}
This @realloc@ is a simplified polymorphic form of the above @realloc@ (FIX ME: cite realloc with align). It takes two parameters, whereas the above @realloc@ takes three parameters. It frees the programmer from explicitly stating the alignment of the allocation, as \CFA lets the allocator obtain the alignment from the return type.
\paragraph{Usage}
This @realloc@ takes two parameters.
\begin{itemize}
\item
@ptr@: address of the old object.
\item
@size@: the required size of the new object.
\end{itemize}
It returns a dynamic object of the size given in the parameters that preserves the data in the given object. The returned object is aligned to the alignment of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * memalign( size_t align )}}
This @memalign@ is a simplified polymorphic form of the default @memalign@ (FIX ME: cite memalign). It takes one parameter, whereas the default @memalign@ takes two parameters.
\paragraph{Usage}
@memalign@ takes one parameter.
\begin{itemize}
\item
@align@: the required alignment of the dynamic object.
\end{itemize}
It returns a dynamic object of the size of type @T@ that is aligned to the given parameter @align@. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * amemalign( size_t align, size_t dim )}}
This @amemalign@ is a simplified polymorphic form of the above @amemalign@ (FIX ME: cite amemalign). It takes two parameters, whereas the above @amemalign@ takes three parameters.
\paragraph{Usage}
@amemalign@ takes two parameters.
\begin{itemize}
\item
@align@: required alignment of the dynamic array.
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. The returned object is aligned to the given parameter @align@. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * cmemalign( size_t align, size_t dim )}}
This @cmemalign@ is a simplified polymorphic form of the above @cmemalign@ (FIX ME: cite cmemalign). It takes two parameters, whereas the above @cmemalign@ takes three parameters.
\paragraph{Usage}
@cmemalign@ takes two parameters.
\begin{itemize}
\item
@align@: required alignment of the dynamic array.
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. The returned object is aligned to the given parameter @align@ and is zero filled. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * aligned_alloc( size_t align )}}
This @aligned_alloc@ is a simplified polymorphic form of the default @aligned_alloc@ (FIX ME: cite aligned_alloc). It takes one parameter, whereas the default @aligned_alloc@ takes two parameters.
\paragraph{Usage}
This @aligned_alloc@ takes one parameter.
\begin{itemize}
\item
@align@: required alignment of the dynamic object.
\end{itemize}
It returns a dynamic object of the size of type @T@ that is aligned to the given parameter. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{int posix_memalign( T ** ptr, size_t align )}}
This @posix_memalign@ is a simplified polymorphic form of the default @posix_memalign@ (FIX ME: cite posix_memalign). It takes two parameters, whereas the default @posix_memalign@ takes three parameters.
\paragraph{Usage}
This @posix_memalign@ takes two parameters.
\begin{itemize}
\item
@ptr@: address of the variable in which to store the address of the allocated object.
\item
@align@: required alignment of the dynamic object.
\end{itemize}
It stores the address of a dynamic object of the size of type @T@ in the given parameter @ptr@. This object is aligned to the given parameter @align@. On success, it returns zero; on failure, it returns a nonzero error code, as the default @posix_memalign@ does.

\subsubsection{\lstinline{T * valloc( void )}}
This @valloc@ is a simplified polymorphic form of the default @valloc@ (FIX ME: cite valloc). It takes no parameters, whereas the default @valloc@ takes one parameter.
\paragraph{Usage}
@valloc@ takes no parameters.
It returns a dynamic object of the size of type @T@ that is aligned to the page size. On failure, it returns a @NULL@ pointer.

\subsubsection{\lstinline{T * pvalloc( void )}}
This @pvalloc@ is a simplified polymorphic form of the default @pvalloc@ (FIX ME: cite pvalloc). It takes no parameters, whereas the default @pvalloc@ takes one parameter.
\paragraph{Usage}
@pvalloc@ takes no parameters.
It returns a dynamic object whose size is calculated by rounding the size of type @T@ up to a multiple of the page size. The returned object is also aligned to the page size. On failure, it returns a @NULL@ pointer.

\subsection{Alloc Interface}
Why did we need it?
The added benefits.