\chapter{Allocator}

\noindent
====================

Writing Points:
\begin{itemize}
\item Objective of uHeapLmmm.
\item Design philosophy.
\item Background and previous design of uHeapLmmm.
\item Distributed design of uHeapLmmm.

----- SHOULD WE GIVE IMPLEMENTATION DETAILS HERE? -----

\PAB{Maybe. There might be an Implementation chapter.}
\item figure.
\item Advantages of distributed design.
\end{itemize}

The new features added to uHeapLmmm (incl. @malloc_size@ routine)
\CFA alloc interface with examples.
\begin{itemize}
\item Why did we need it?
\item The added benefits.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% uHeapLmmm Design %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Objective of uHeapLmmm}
uHeapLmmm is a lightweight memory allocator. Its objective is to provide a minimal concurrent memory allocator that adds new features while still fulfilling the GNU C Library requirements (FIX ME: cite requirements).

\subsection{Design philosophy}
The new design of uHeapLmmm has to fulfill the following requirements:
\begin{itemize}
\item It should be concurrent, so that it can be used in multi-threaded programs.
\item It should avoid, as much as possible, global locks on resources shared across all threads.
\item Its performance (FIX ME: cite performance benchmarks) should be comparable to the commonly used allocators (FIX ME: cite common allocators).
\item It should be a lightweight memory allocator.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Background and previous design of uHeapLmmm}
uHeapLmmm was originally designed by X in X (FIX ME: add original author after confirming with Peter).
(FIX ME: make and add figure of previous design with description)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Distributed design of uHeapLmmm}
uHeapLmmm's design was reviewed and changed to fulfill the new requirements (FIX ME: cite allocator philosophy). For this purpose, the following two designs of uHeapLmmm were proposed:

\paragraph{Design 1: Decentralized}
Fixed number of heaps: shard the heap into N heaps, each with a bump area allocated from the @sbrk@ area.
Kernel threads (KT) are assigned to the N heaps.
When KTs $\le$ N, the heaps are uncontended.
When KTs $>$ N, the heaps are contended.
By adjusting N, this approach reduces storage at the cost of speed due to contention.
In all cases, a thread acquires and releases a lock, whether contended or uncontended.
\begin{cquote}
\centering
\input{AllocDS1}
\end{cquote}
Problems: need to know when a KT is created and destroyed to know when to assign/un-assign a heap to the KT.

\paragraph{Design 2: Centralized}
One heap, but the lower bucket sizes are N-shared across KTs.
This design leverages the fact that 95\% of allocation requests are less than 512 bytes and there are only 3--5 different request sizes.
When KTs $\le$ N, the important bucket sizes are uncontended.
When KTs $>$ N, the free buckets are contended.
Therefore, threads only contend for a small number of buckets, which are distributed among them to reduce contention.
\begin{cquote}
\centering
\input{AllocDS2}
\end{cquote}
Problems: need to know when a kernel thread (KT) is created and destroyed to know when to assign a shared bucket-number.
When no thread is assigned a bucket number, its free storage is unavailable.
All KTs contend for one lock on @sbrk@ for their initial allocations (before the free lists are populated).

Of the two designs, Design 1 was chosen because its concurrency is better across all bucket sizes: Design 2 shards only a few buckets of selected sizes, while Design 1 shards all the buckets. Design 1 shards the whole heap, which contains all the buckets, and additionally shards the @sbrk@ area.

\subsection{Advantages of distributed design}
The distributed design of uHeapLmmm is concurrent, so it works in multi-threaded applications.

Some key benefits of the distributed design of uHeapLmmm are as follows:

\begin{itemize}
\item
Bump allocation is concurrent because the memory taken from @sbrk@ is sharded across all heaps as a bump-allocation reserve. The lock on bump allocation (on memory taken from @sbrk@) is only contended if KTs $>$ N. Contention on the @sbrk@ area itself is unlikely, as it only happens when the heaps assigned to two KTs run short of bump-allocation reserve simultaneously.
\item
N heaps are created at the start of the program and destroyed at the end of the program. When a KT is created, it is only assigned to one of the heaps. When a KT is destroyed, it is only dissociated from its assigned heap; the heap itself is not destroyed. That heap goes back to the pool of heaps, ready to be used by a new KT. If the heap was shared among multiple KTs (as happens when KTs $>$ N), then after deletion of one KT it remains in use by the other KTs. Because heaps are reusable, no heaps are created or deleted at run time, which helps keep the memory footprint low.
\item
It is possible to use sharing and stealing techniques to share/find unused storage when a free list is unused or empty.
\item
The distributed design avoids unnecessary locks on resources shared across all KTs.
\end{itemize}

FIX ME: Cite performance comparison of the two heap designs if required

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Added Features and Methods}
To improve the interface of the uHeapLmmm allocator (FIX ME: cite uHeapLmmm) and make it more user friendly, we added a few more routines to the C allocator. We also built a \CFA (FIX ME: cite cforall) interface on top of the C interface to increase the usability of the allocator.

\subsection{C Interface}
We added a few more features and routines to the allocator's C interface that make the allocator more usable to programmers. These features give the programmer more control over dynamic memory allocation.

\subsubsection{@void * aalloc( size_t dim, size_t elemSize )@}
@aalloc@ is an extension of @malloc@. It allows the programmer to allocate a dynamic array of objects without calculating the total size of the array explicitly. The only alternative to this routine in other allocators is @calloc@, but @calloc@ also fills the dynamic memory with 0, which makes it slower for a programmer who only wants to allocate an array of objects without zero filling it.
\paragraph{Usage}
@aalloc@ takes two parameters.

\begin{itemize}
\item
@dim@: number of objects in the array.
\item
@elemSize@: size of each object in the array.
\end{itemize}
It returns the address of a dynamic object allocated on the heap that can contain @dim@ objects of size @elemSize@. On failure, it returns a @NULL@ pointer.

\subsubsection{@void * resize( void * oaddr, size_t size )@}
@resize@ is an extension of @realloc@. It allows the programmer to reuse a currently allocated dynamic object with a new size requirement.
Its alternative in other allocators is @realloc@, but @realloc@ also copies the data in the old object to the new object, which makes it slower for a programmer who only wants to reuse an old dynamic object for a new size requirement and does not need the old data preserved.
\paragraph{Usage}
@resize@ takes two parameters.

\begin{itemize}
\item
@oaddr@: the address of the old object that needs to be resized.
\item
@size@: the new size to which the old object needs to be resized.
\end{itemize}
It returns an object of the given size, but it does not preserve the data in the old object. On failure, it returns a @NULL@ pointer.

\subsubsection{@void * resize( void * oaddr, size_t nalign, size_t size )@}
This @resize@ is an extension of the above @resize@ (FIX ME: cite above resize). In addition to changing the size of an old object, it can also realign the old object to a new alignment requirement.
\paragraph{Usage}
This @resize@ takes three parameters. Compared to the above @resize@ (FIX ME: cite above resize), it takes the additional parameter @nalign@.

\begin{itemize}
\item
@oaddr@: the address of the old object that needs to be resized.
\item
@nalign@: the new alignment to which the old object needs to be realigned.
\item
@size@: the new size to which the old object needs to be resized.
\end{itemize}
It returns an object with the size and alignment given in the parameters. On failure, it returns a @NULL@ pointer.

\subsubsection{@void * amemalign( size_t alignment, size_t dim, size_t elemSize )@}
@amemalign@ is a hybrid of @memalign@ and @aalloc@. It allows the programmer to allocate an aligned dynamic array of objects without calculating the total size of the array explicitly.
\paragraph{Usage}
@amemalign@ takes three parameters.

\begin{itemize}
\item
@alignment@: the alignment to which the dynamic array needs to be aligned.
\item
@dim@: number of objects in the array.
\item
@elemSize@: size of each object in the array.
\end{itemize}
It returns a dynamic array with the capacity to contain @dim@ objects of size @elemSize@. The returned dynamic array is aligned to the given alignment. On failure, it returns a @NULL@ pointer.

\subsubsection{@void * cmemalign( size_t alignment, size_t dim, size_t elemSize )@}
@cmemalign@ is a hybrid of @amemalign@ and @calloc@. It allows the programmer to allocate an aligned dynamic array of objects that is zero filled. The current way to do this with other allocators is to allocate an aligned object with @memalign@ and then fill it with 0 explicitly. This routine provides both alignment and zero filling implicitly.
\paragraph{Usage}
@cmemalign@ takes three parameters.

\begin{itemize}
\item
@alignment@: the alignment to which the dynamic array needs to be aligned.
\item
@dim@: number of objects in the array.
\item
@elemSize@: size of each object in the array.
\end{itemize}
It returns a dynamic array with the capacity to contain @dim@ objects of size @elemSize@. The returned dynamic array is aligned to the given alignment and is zero filled. On failure, it returns a @NULL@ pointer.

\subsubsection{@size_t malloc_alignment( void * addr )@}
@malloc_alignment@ returns the alignment of a currently allocated dynamic object. It helps the programmer with memory management and personal bookkeeping. In particular, it lets the programmer verify the alignment of a dynamic object in a scenario similar to producer-consumer, where a producer allocates a dynamic object and the consumer needs to ensure that the object was allocated with the required alignment.
\paragraph{Usage}
@malloc_alignment@ takes one parameter.

\begin{itemize}
\item
@addr@: the address of the currently allocated dynamic object.
\end{itemize}
@malloc_alignment@ returns the alignment of the given dynamic object.
On failure, it returns the default alignment of the uHeapLmmm allocator.

\subsubsection{@bool malloc_zero_fill( void * addr )@}
@malloc_zero_fill@ returns whether a currently allocated dynamic object was zero filled at the time of allocation. It helps the programmer with memory management and personal bookkeeping. In particular, it lets the programmer verify the zero-filled property of a dynamic object in a scenario similar to producer-consumer, where a producer allocates a dynamic object and the consumer needs to ensure that the object was zero filled at the time of allocation.
\paragraph{Usage}
@malloc_zero_fill@ takes one parameter.

\begin{itemize}
\item
@addr@: the address of the currently allocated dynamic object.
\end{itemize}
@malloc_zero_fill@ returns true if the dynamic object was initially zero filled and false otherwise. On failure, it returns false.

\subsubsection{@size_t malloc_size( void * addr )@}
@malloc_size@ returns the allocation size of a currently allocated dynamic object. It helps the programmer with memory management and personal bookkeeping. In particular, it lets the programmer verify the size of a dynamic object in a scenario similar to producer-consumer, where a producer allocates a dynamic object and the consumer needs to ensure that the object was allocated with the required size. Its current alternative in other allocators is @malloc_usable_size@. However, the two differ: @malloc_usable_size@ returns the total data capacity of the dynamic object, including any extra space at the end of the object, whereas @malloc_size@ returns the size that was given to the allocator when the object was allocated. This size is updated when an object is reallocated, resized, or passed through a similar allocator routine.
\paragraph{Usage}
@malloc_size@ takes one parameter.

\begin{itemize}
\item
@addr@: the address of the currently allocated dynamic object.
\end{itemize}
@malloc_size@ returns the allocation size of the given dynamic object. On failure, it returns zero.

\subsubsection{@void * realloc( void * oaddr, size_t nalign, size_t size )@}
This @realloc@ is an extension of the default @realloc@ (FIX ME: cite default realloc). In addition to reallocating an old object and preserving its data, it can also realign the old object to a new alignment requirement.
\paragraph{Usage}
This @realloc@ takes three parameters. Compared to the default @realloc@, it takes the additional parameter @nalign@.

\begin{itemize}
\item
@oaddr@: the address of the old object that needs to be reallocated.
\item
@nalign@: the new alignment to which the old object needs to be realigned.
\item
@size@: the new size to which the old object needs to be resized.
\end{itemize}
It returns an object with the size and alignment given in the parameters that preserves the data in the old object. On failure, it returns a @NULL@ pointer.

\subsection{\CFA Malloc Interface}
We added some routines to the malloc interface of \CFA. These routines can only be used in \CFA and not in our standalone uHeapLmmm allocator, as they rely on features that are provided by \CFA but not by C. They make the allocator even more usable to programmers.
\CFA lets the allocator know the return type of a call to it. So, in most of these added routines, the object-size parameter is removed, because the allocator can calculate the size of the object from the return type.

\subsubsection{@T * malloc( void )@}
This @malloc@ is a simplified polymorphic form of the default @malloc@ (FIX ME: cite malloc). It takes no parameters, whereas the default @malloc@ takes one.
\paragraph{Usage}
This @malloc@ takes no parameters.
It returns a dynamic object of the size of type @T@. On failure, it returns a @NULL@ pointer.
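
To illustrate what the type-driven form saves the programmer, the following is a minimal C sketch, not the uHeapLmmm or \CFA implementation. Plain C cannot infer the size from the return type, so the sketch passes the type name to a macro instead. The names @aalloc_sketch@, @MALLOC_T@, and @AALLOC_T@ are hypothetical, and the overflow check is an assumption about reasonable behaviour rather than something stated above.

\begin{lstlisting}[language=C]
#include <stdint.h>
#include <stdlib.h>

// Hypothetical stand-in for an aalloc-like routine built on malloc:
// allocates room for dim objects of elemSize bytes WITHOUT zero filling.
// Assumption: a dim * elemSize overflow is reported as failure (NULL).
void *aalloc_sketch(size_t dim, size_t elemSize) {
    if (elemSize != 0 && dim > SIZE_MAX / elemSize) return NULL;
    return malloc(dim * elemSize);
}

// Hypothetical macros: the type name supplies the object size,
// so the caller never writes sizeof or a size computation.
#define MALLOC_T(T)      ((T *)malloc(sizeof(T)))
#define AALLOC_T(T, dim) ((T *)aalloc_sketch((dim), sizeof(T)))
\end{lstlisting}

With these, @double * d = MALLOC_T(double);@ plays the role of the parameterless polymorphic @malloc@, with the difference that \CFA obtains the type from the left-hand side rather than from an explicit argument.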
\subsubsection{@T * aalloc( size_t dim )@}
This @aalloc@ is a simplified polymorphic form of the above @aalloc@ (FIX ME: cite aalloc). It takes one parameter, whereas the above @aalloc@ takes two.
\paragraph{Usage}
@aalloc@ takes one parameter.

\begin{itemize}
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * calloc( size_t dim )@}
This @calloc@ is a simplified polymorphic form of the default @calloc@ (FIX ME: cite calloc). It takes one parameter, whereas the default @calloc@ takes two.
\paragraph{Usage}
This @calloc@ takes one parameter.

\begin{itemize}
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a zero-filled dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * resize( T * ptr, size_t size )@}
This @resize@ is a simplified polymorphic form of the above @resize@ (FIX ME: cite resize with alignment). It takes two parameters, whereas the above @resize@ takes three. It frees the programmer from explicitly stating the alignment of the allocation, as \CFA lets the allocator obtain the alignment from the return type.
\paragraph{Usage}
This @resize@ takes two parameters.

\begin{itemize}
\item
@ptr@: address of the old object.
\item
@size@: the required size of the new object.
\end{itemize}
It returns a dynamic object of the size given in the parameters. The returned object is aligned to the alignment of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * realloc( T * ptr, size_t size )@}
This @realloc@ is a simplified polymorphic form of the above @realloc@ (FIX ME: cite realloc with align). It takes two parameters, whereas the above @realloc@ takes three.
It frees the programmer from explicitly stating the alignment of the allocation, as \CFA lets the allocator obtain the alignment from the return type.
\paragraph{Usage}
This @realloc@ takes two parameters.

\begin{itemize}
\item
@ptr@: address of the old object.
\item
@size@: the required size of the new object.
\end{itemize}
It returns a dynamic object of the size given in the parameters that preserves the data in the given object. The returned object is aligned to the alignment of type @T@. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * memalign( size_t align )@}
This @memalign@ is a simplified polymorphic form of the default @memalign@ (FIX ME: cite memalign). It takes one parameter, whereas the default @memalign@ takes two.
\paragraph{Usage}
@memalign@ takes one parameter.

\begin{itemize}
\item
@align@: the required alignment of the dynamic object.
\end{itemize}
It returns a dynamic object of the size of type @T@ that is aligned to the given parameter @align@. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * amemalign( size_t align, size_t dim )@}
This @amemalign@ is a simplified polymorphic form of the above @amemalign@ (FIX ME: cite amemalign). It takes two parameters, whereas the above @amemalign@ takes three.
\paragraph{Usage}
@amemalign@ takes two parameters.

\begin{itemize}
\item
@align@: required alignment of the dynamic array.
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. The returned object is aligned to the given parameter @align@. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * cmemalign( size_t align, size_t dim )@}
This @cmemalign@ is a simplified polymorphic form of the above @cmemalign@ (FIX ME: cite cmemalign). It takes two parameters, whereas the above @cmemalign@ takes three.
\paragraph{Usage}
@cmemalign@ takes two parameters.
\begin{itemize}
\item
@align@: required alignment of the dynamic array.
\item
@dim@: required number of objects in the array.
\end{itemize}
It returns a dynamic object with the capacity to contain @dim@ objects, each of the size of type @T@. The returned object is aligned to the given parameter @align@ and is zero filled. On failure, it returns a @NULL@ pointer.

\subsubsection{@T * aligned_alloc( size_t align )@}
This @aligned_alloc@ is a simplified polymorphic form of the default @aligned_alloc@ (FIX ME: cite aligned_alloc). It takes one parameter, whereas the default @aligned_alloc@ takes two.
\paragraph{Usage}
This @aligned_alloc@ takes one parameter.

\begin{itemize}
\item
@align@: required alignment of the dynamic object.
\end{itemize}
It returns a dynamic object of the size of type @T@ that is aligned to the given parameter. On failure, it returns a @NULL@ pointer.

\subsubsection{@int posix_memalign( T ** ptr, size_t align )@}
This @posix_memalign@ is a simplified polymorphic form of the default @posix_memalign@ (FIX ME: cite posix_memalign). It takes two parameters, whereas the default @posix_memalign@ takes three.
\paragraph{Usage}
This @posix_memalign@ takes two parameters.

\begin{itemize}
\item
@ptr@: address of the variable in which to store the address of the allocated object.
\item
@align@: required alignment of the dynamic object.
\end{itemize}
It stores the address of a dynamic object of the size of type @T@ in the given parameter @ptr@. This object is aligned to the given parameter. Because the return type is @int@, failure is reported through a nonzero return value rather than a @NULL@ pointer, as with the default @posix_memalign@.

\subsubsection{@T * valloc( void )@}
This @valloc@ is a simplified polymorphic form of the default @valloc@ (FIX ME: cite valloc). It takes no parameters, whereas the default @valloc@ takes one.
\paragraph{Usage}
@valloc@ takes no parameters.
It returns a dynamic object of the size of type @T@ that is aligned to the page size. On failure, it returns a @NULL@ pointer.
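
The aligned array routines described earlier (@amemalign@, @cmemalign@) combine three steps that a C programmer otherwise writes by hand: computing the total size, requesting aligned storage, and zero filling. The following is a minimal sketch of that combination on top of the C11 routine @aligned_alloc@; it is not the uHeapLmmm implementation. The name @cmemalign_sketch@ is hypothetical, and the rejection of non-power-of-two alignments and of size overflow is an assumption.

\begin{lstlisting}[language=C]
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical cmemalign-like helper: aligned, zero-filled array of
// dim objects of elemSize bytes. Returns NULL on failure.
void *cmemalign_sketch(size_t alignment, size_t dim, size_t elemSize) {
    // Assumption: alignment must be a nonzero power of two.
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) return NULL;
    // Assumption: an overflowing dim * elemSize is a failure.
    if (elemSize != 0 && dim > SIZE_MAX / elemSize) return NULL;
    size_t size = dim * elemSize;

    // C11 aligned_alloc expects a size that is a multiple of the
    // alignment, so round the request up (at least one alignment unit).
    size_t rem = size % alignment;
    size_t padded = size;
    if (rem != 0) {
        if (size > SIZE_MAX - (alignment - rem)) return NULL;
        padded = size + (alignment - rem);
    }
    if (padded == 0) padded = alignment;

    void *p = aligned_alloc(alignment, padded);
    if (p != NULL) memset(p, 0, size);   // zero fill the requested bytes
    return p;
}
\end{lstlisting}

Dropping the @memset@ call gives the corresponding @amemalign@-style behaviour, and a \CFA polymorphic wrapper would supply @elemSize@ from the return type.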
\subsubsection{@T * pvalloc( void )@}
This @pvalloc@ is a simplified polymorphic form of the default @pvalloc@ (FIX ME: cite pcvalloc). It takes no parameters, whereas the default @pvalloc@ takes one.
\paragraph{Usage}
@pvalloc@ takes no parameters.
It returns a dynamic object whose size is calculated by rounding the size of type @T@ up to a multiple of the page size. The returned object is also aligned to the page size. On failure, it returns a @NULL@ pointer.

\subsection{Alloc Interface}
In addition to improving the allocator interface both for \CFA and for our standalone allocator uHeapLmmm in C, we also added a new alloc interface in \CFA that increases the usability of dynamic memory allocation.
This interface helps programmers in three major ways.

\begin{itemize}
\item
Routine Name: the alloc interface frees programmers from remembering different routine names for different kinds of dynamic allocation.
\item
Parameter Positions: the alloc interface frees programmers from remembering parameter positions in calls to routines.
\item
Object Size: the alloc interface does not require the programmer to state the object size, as \CFA allows the allocator to determine the object size from the return type of the alloc call.
\end{itemize}

The alloc interface uses polymorphism, backtick routines (FIX ME: cite backtick), and ttype parameters of \CFA (FIX ME: cite ttype) to provide a very simple dynamic memory allocation interface to programmers. The new interface has just one routine name, @alloc@, which can be used to perform a wide range of dynamic allocations. The parameters use backtick functions to give the alloc interface a feature similar to named parameters, so that programmers do not have to remember parameter positions in an alloc call, except for the position of the dimension (@dim@) parameter.

\subsubsection{Routine: T * alloc( ... )}
A call to @alloc@ without any parameters returns one dynamically allocated object of the size of type @T@.
Only the dimension (@dim@) parameter for array allocation has a fixed position in the @alloc@ routine.
If the programmer wants to allocate an array of objects, then the required number of members in the array has to be given as the first parameter to the @alloc@ routine.
The @alloc@ routine accepts six kinds of arguments. Using different combinations of the parameters, different kinds of allocation can be performed. Any combination of parameters can be used together except @`realloc@ and @`resize@, which should not be used simultaneously in one call, as that creates ambiguity about whether to reallocate or resize a currently allocated dynamic object. If both @`resize@ and @`realloc@ are used in a call to @alloc@, then the latter one takes effect or unexpected results might be produced.

\paragraph{Dim}
This is the only parameter in the @alloc@ routine that has a fixed position, and it is also the only parameter that does not use a backtick function. For an array allocation of objects of type @T@, it has to be passed as the first argument of the @alloc@ call.
It represents the required number of members in the array allocation, as in \CFA's @aalloc@ (FIX ME: cite aalloc).
This parameter should be of type @size_t@.

Example: @int * a = alloc( 5 )@
This call returns a dynamic array of five integers.

\paragraph{Align}
This parameter is position-free and uses the backtick routine align (@`align@). The parameter passed with @`align@ should be of type @size_t@. If the alignment parameter is not a power of two or is less than the default alignment of the allocator (which can be found using the routine @libAlign@ in \CFA), then the passed alignment is rejected and the default alignment is used.

Example: @int * b = alloc( 5 , 64`align )@
This call returns a dynamic array of five integers. The allocated object is aligned to 64.

\paragraph{Fill}
This parameter is position-free and uses the backtick routine fill (@`fill@). In the case of @`realloc@, only the extra space after copying the data from the old object is filled with the given parameter.
Three types of parameters can be passed using @`fill@.
\begin{itemize}
\item
@char@: A @char@ can be passed with @`fill@ to fill the whole dynamic allocation with the given @char@, repeated until the end of the required allocation.
\item
Object of the return type: An object of the return type can be passed with @`fill@ to fill the whole dynamic allocation with the given object, repeated until the end of the required allocation.
\item
Dynamic object of the return type: A dynamic object of the return type can be passed with @`fill@ to fill the dynamic allocation with the contents of the given dynamic object. In this case, the allocated memory is not filled repeatedly until the end of the allocation. The filling stops when either the end of the object passed to @`fill@ or the end of the requested allocation is reached.
\end{itemize}

Example: @int * b = alloc( 5 , 'a'`fill )@
This call returns a dynamic array of five integers. The allocated object is filled with the character @'a'@, repeated until the end of the requested allocation size.

Example: @int * b = alloc( 5 , 4`fill )@
This call returns a dynamic array of five integers. The allocated object is filled with the integer 4, repeated until the end of the requested allocation size.

Example: @int * b = alloc( 5 , a`fill )@ where @a@ is a pointer of @int@ type
This call returns a dynamic array of five integers. The data in @a@ is copied to the returned object once, without repetition, until the end of @a@ or the end of the newly allocated object is reached.

\paragraph{Resize}
This parameter is position-free and uses the backtick routine resize (@`resize@). It represents the old dynamic object (@oaddr@) that the programmer wants to
\begin{itemize}
\item
resize to a new size.
\item
realign to a new alignment.
\item
fill with something.
\end{itemize}
The data in the old dynamic object is not preserved in the new object. The type of the object passed to @`resize@ and the return type of the @alloc@ call can be different.

Example: @int * b = alloc( 5 , a`resize )@
This call resizes object @a@ to a dynamic array that can contain 5 integers.
Example: @int * b = alloc( 5 , a`resize , 32`align )@
This call resizes object @a@ to a dynamic array that can contain 5 integers. The returned object is also aligned to 32.

Example: @int * b = alloc( 5 , a`resize , 32`align , 2`fill )@
This call resizes object @a@ to a dynamic array that can contain 5 integers. The returned object is also aligned to 32 and is filled with 2.

\paragraph{Realloc}
This parameter is position-free and uses the backtick routine realloc (@`realloc@). It represents the old dynamic object (@oaddr@) that the programmer wants to
\begin{itemize}
\item
realloc to a new size.
\item
realign to a new alignment.
\item
fill with something.
\end{itemize}
The data in the old dynamic object is preserved in the new object. The type of the object passed to @`realloc@ and the return type of the @alloc@ call cannot be different.

Example: @int * b = alloc( 5 , a`realloc )@
This call reallocates object @a@ to a dynamic array that can contain 5 integers.

Example: @int * b = alloc( 5 , a`realloc , 32`align )@
This call reallocates object @a@ to a dynamic array that can contain 5 integers. The returned object is also aligned to 32.

Example: @int * b = alloc( 5 , a`realloc , 32`align , 2`fill )@
This call reallocates object @a@ to a dynamic array that can contain 5 integers. The returned object is also aligned to 32. The extra space after copying the data of @a@ to the returned object is filled with 2.
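
The interaction of @`realloc@ and @`fill@ in the last example can be sketched in plain C as follows. This is only an illustration of the described semantics for @int@ arrays, not the \CFA implementation. The name @realloc_fill_sketch@ is hypothetical. Unlike the \CFA interface, the sketch takes the old dimension explicitly, because standard C has no portable counterpart to @malloc_size@. It also ignores alignment and treats a new dimension of zero as a failure, both of which are assumptions.

\begin{lstlisting}[language=C]
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper: grow (or shrink) an int array to newDim elements,
// preserving the old data and filling only the extra elements with fill.
// On failure it returns NULL and leaves the old array untouched.
int *realloc_fill_sketch(int *old, size_t oldDim, size_t newDim, int fill) {
    // Assumption: zero-sized or overflowing requests are failures.
    if (newDim == 0 || newDim > SIZE_MAX / sizeof(int)) return NULL;

    int *p = realloc(old, newDim * sizeof(int));
    if (p == NULL) return NULL;          // old is still valid here

    // Only the space beyond the copied data is filled.
    for (size_t i = oldDim; i < newDim; i++) p[i] = fill;
    return p;
}
\end{lstlisting}

For an old array holding 7, 8, 9, calling @realloc_fill_sketch( a, 3, 5, 2 )@ yields an array holding 7, 8, 9, 2, 2. A @`resize@ counterpart would skip the preservation guarantee and fill all five elements.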