Changeset 16c795c for doc/papers/llheap


Ignore:
Timestamp:
Mar 11, 2024, 10:45:46 PM (11 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
75d789c
Parents:
a885357
Message:

update llheap paper

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/llheap/Paper.tex

    ra885357 r16c795c  
    554554\label{s:MultipleHeaps}
    555555
    556 Figure~\ref{f:ThreadHeapRelationship} shows how a multi-threaded allocator can subdivide a single global heap into multiple heaps to reduce contention among threads.
     556Figure~\ref{f:ThreadHeapRelationship} shows how a multi-threaded allocator reduced contention by subdividing a single heap into multiple heaps.
    557557
    558558\begin{figure}
     
    580580
    581581\begin{description}[leftmargin=*]
    582 \item[T:1 model (Figure~\ref{f:SingleHeap})] has all threads allocating and deallocating objects from one heap.
    583 Memory is obtained from the freed objects, or reserved memory in the heap, or from the OS;
    584 the heap may also return freed memory to the OS.
    585 The arrows indicate the direction memory moves for each alocation/deallocation operation.
    586 To safely handle concurrency, a single lock may be used for all heap operations or fine-grained locking for different operations.
    587 Regardless, a single heap is a significant source of contention for threaded programs with a large amount of memory allocations.
    588 
    589 \item[T:H model (Figure~\ref{f:SharedHeaps})] subdivides the heap independently from the threads.
    590 The decision to create a heap and which heap a thread allocates/deallocates during its lifetime depends on the allocator design.
    591 Locking is required within each heap because of multiple tread access, but contention is reduced because fewer threads access a specific heap.
    592 The goal is to have mininal heaps (storage) and thread contention per heap (time).
    593 However, the worst case results in more heaps than threads, \eg if the number of threads is large at startup creating a large number of heaps and then the number of threads reduces.
     582\item[T:1 model (Figure~\ref{f:SingleHeap})] is all threads (T) sharing a single heap (1).
     583The arrows indicate memory movement for allocation/deallocation operations.
     584Memory is obtained from freed objects, reserved memory, or the OS;
     585freed memory can be returned to the OS.
     586To handle concurrency, a single lock is used for all heap operations or fine-grained locking if operations can be made independent.
     587As threads perform large numbers of allocations, a single heap becomes a significant source of contention.
     588
     589\item[T:H model (Figure~\ref{f:SharedHeaps})] is multiple threads (T) sharing multiple heaps (H).
     590The allocator independently allocates/deallocates heaps and assigns threads to heaps based on dynamic contention pressure.
     591Locking is required within each heap, but contention is reduced because fewer threads access a specific heap.
     592The goal is minimal heaps (storage) and contention per heap (time).
     593A worst case is more heaps than threads, \eg many threads at startup create a large number of heaps and then the threads reduce.
    594594
    595595% For example, multiple heaps are managed in a pool, starting with a single or a fixed number of heaps that increase\-/decrease depending on contention\-/space issues.
     
    612612% Allocated and free objects are labelled by the thread or heap they are associated with.
    613613% (Links between free objects are removed for simplicity.)
    614 The management information for multiple heaps in the static zone must be able to locate all heaps.
    615 The management information for the heaps must reside in the dynamic-allocation zone if there are a variable number.
    616 Each heap in the dynamic zone is composed of a list of free objects and a pointer to its reserved memory.
    617 An alternative implementation is for all heaps to share one reserved memory, which requires a separate lock for the reserved storage to ensure mutual exclusion when acquiring new memory.
    618 Because multiple threads can allocate/free/reallocate adjacent storage, all forms of false sharing may occur.
    619 Other storage-management options are to use @mmap@ to set aside (large) areas of virtual memory for each heap and suballocate each heap's storage within that area, pushing part of the storage management complexity back to the OS.
     614% The management information for multiple heaps in the static zone must be able to locate all heaps.
     615% The management information for the heaps must reside in the dynamic-allocation zone if there are a variable number.
     616% Each heap in the dynamic zone is composed of a list of free objects and a pointer to its reserved memory.
     617% An alternative implementation is for all heaps to share one reserved memory, which requires a separate lock for the reserved storage to ensure mutual exclusion when acquiring new memory.
     618% Because multiple threads can allocate/free/reallocate adjacent storage, all forms of false sharing may occur.
     619% Other storage-management options are to use @mmap@ to set aside (large) areas of virtual memory for each heap and suballocate each heap's storage within that area, pushing part of the storage management complexity back to the OS.
    620620
    621621% \begin{figure}
     
    644644In general, the cost is minimal since the majority of memory operations are completed without the use of the global heap.
    645645
    646 \item[1:1 model (Figure~\ref{f:PerThreadHeap})] has each thread with its own heap, eliminating most contention and locking because threads seldom access another thread's heap (see Section~\ref{s:Ownership}).
    647 An additional benefit of thread heaps is improved locality due to better memory layout.
    648 As each thread only allocates from its heap, all objects are consolidated in the storage area for that heap, better utilizing each CPUs cache and accessing fewer pages.
    649 In contrast, the T:H model spreads each thread's objects over a larger area in different heaps.
    650 Thread heaps can also reduces false-sharing, except at crucial boundaries overlapping memory from another thread's heap.
    651 For example, assume page boundaries coincide with cache line boundaries, if a thread heap always acquires pages of memory then no two threads share a page or cache line unless pointers are passed among them.
    652 % Hence, allocator-induced active false-sharing cannot occur because the memory for thread heaps never overlaps.
    653 
    654 When a thread terminates, there are two options for handling its thread heap.
    655 First is to free all objects in the thread heap to the global heap and destroy the thread heap.
    656 Second is to place the thread heap on a list of available heaps and reuse it for a new thread in the future.
    657 Destroying the thread heap immediately may reduce external fragmentation sooner, since all free objects are freed to the global heap and may be reused by other threads.
    658 Alternatively, reusing thread heaps may improve performance if the inheriting thread makes similar allocation requests as the thread that previously held the thread heap because any unfreed storage is immediately accessible.
     646\item[1:1 model (Figure~\ref{f:PerThreadHeap})] is each thread (1) has a heap (1), eliminating most contention and locking if threads seldom access another thread's heap (see Section~\ref{s:Ownership}).
     647A thread's objects are consolidated in its heap, better utilizing the cache and paging during thread execution.
     648In contrast, the T:H model can spread thread objects over a larger area in different heaps.
     649Thread heaps can also reduces false-sharing, unless there are overlapping memory boundaries from another thread's heap.
     650%For example, assume page boundaries coincide with cache line boundaries, if a thread heap always acquires pages of memory then no two threads share a page or cache line unless pointers are passed among them.
     651
     652When a thread terminates, it can free its heap objects to the global heap, or the thread heap is retained as-is and reused for a new thread in the future.
     653Destroying a heap can reduce external fragmentation sooner, since all free objects in the global heap are available for immediate reuse.
     654Alternatively, reusing a heap can aid the inheriting thread, if it has a similar allocation pattern because the heap in primed with unfreed storage of the right sizes.
    659655\end{description}
    660656
Note: See TracChangeset for help on using the changeset viewer.