Index: doc/papers/llheap/Paper.tex
===================================================================
--- doc/papers/llheap/Paper.tex	(revision 1ae3ac46904e0b164dcec0e13b996849dc2b67a6)
+++ doc/papers/llheap/Paper.tex	(revision baa1d5dcfe059fec982d2f0351950afe46056e2a)
@@ -77,16 +77,17 @@
 \lstset{
 columns=fullflexible,
-basicstyle=\linespread{0.9}\sf,							% reduce line spacing and use sanserif font
-stringstyle=\tt,										% use typewriter font
-tabsize=5,												% N space tabbing
-xleftmargin=\parindentlnth,								% indent code to paragraph indentation
-%mathescape=true,										% LaTeX math escape in CFA code $...$
-escapechar=\$,											% LaTeX escape in CFA code
-keepspaces=true,										%
-showstringspaces=false,									% do not show spaces with cup
-showlines=true,											% show blank lines at end of code
-aboveskip=4pt,											% spacing above/below code block
-belowskip=3pt,
-moredelim=**[is][\color{red}]{`}{`},
+basicstyle=\linespread{0.9}\sf,			% reduce line spacing and use sanserif font
+stringstyle=\small\tt,					% use typewriter font
+tabsize=5,								% N space tabbing
+xleftmargin=\parindentlnth,				% indent code to paragraph indentation
+escapechar=\$,							% LaTeX escape in CFA code
+%mathescape=true,						% LaTeX math escape in CFA code $...$
+keepspaces=true,						%
+showstringspaces=false,					% do not show spaces with cup
+showlines=true,							% show blank lines at end of code
+aboveskip=4pt,							% spacing above/below code block
+belowskip=2pt,
+numberstyle=\footnotesize\sf,			% numbering style
+moredelim=**[is][\color{red}]{@}{@},
 }% lstset
 
@@ -1082,9 +1083,9 @@
 
 The primary design objective for llheap is low-latency across all allocator calls independent of application access-patterns and/or number of threads, \ie very seldom does the allocator have a delay during an allocator call.
-(Large allocations requiring initialization, \eg zero fill, and/or copying are not covered by the low-latency objective.)
+Excluded from the low-latency objective are (large) allocations requiring initialization, \eg zero fill, and/or data copying, which are outside the allocator's purview.
 A direct consequence of this objective is very simple or no storage coalescing;
 hence, llheap's design is willing to use more storage to lower latency.
 This objective is apropos because systems research and industrial applications are striving for low latency and computers have huge amounts of RAM memory.
-Finally, llheap's performance should be comparable with the current best allocators (see performance comparison in Section~\ref{c:Performance}).
+Finally, llheap's performance should be comparable with the current best allocators, both in space and time (see performance comparison in Section~\ref{c:Performance}).
 
 % The objective of llheap's new design was to fulfill following requirements:
@@ -1205,9 +1206,8 @@
 % \label{s:AllocationFastpath}
 
-llheap's design was reviewed and changed multiple times during its development.  Only the final design choices are
-discussed in this paper.
+llheap's design was reviewed and changed multiple times during its development; only the final choices are discussed here.
 (See~\cite{Zulfiqar22} for a discussion of alternate choices and reasons for rejecting them.)
 All designs were analyzed for the allocation/free \newterm{fastpath}, \ie when an allocation can immediately return free storage or returned storage is not coalesced.
-The heap model choosen is 1:1, which is the T:H model with T = H, where there is one thread-local heap for each KT.
+The heap model chosen is 1:1, which is the T:H model with T = H, where there is one thread-local heap for each KT.
 (See Figure~\ref{f:THSharedHeaps} but with a heap bucket per KT and no bucket or local-pool lock.)
 Hence, immediately after a KT starts, its heap is created and just before a KT terminates, its heap is (logically) deleted.
@@ -1426,5 +1426,5 @@
 
 
-Algorithm~\ref{alg:heapObjectFreeOwn} shows the de-allocation (free) outline for an object at address $A$ with ownership.
+Algorithm~\ref{alg:heapObjectFreeOwn} shows the deallocation (free) outline for an object at address $A$ with ownership.
 First, the address is divided into small (@sbrk@) or large (@mmap@).
 For large allocations, the storage is unmapped back to the OS.
@@ -1433,5 +1433,5 @@
 If the bucket is not local to the thread, the allocation is pushed onto the owning thread's associated away stack.
 
-Algorithm~\ref{alg:heapObjectFreeNoOwn} shows the de-allocation (free) outline for an object at address $A$ without ownership.
+Algorithm~\ref{alg:heapObjectFreeNoOwn} shows the deallocation (free) outline for an object at address $A$ without ownership.
 The algorithm is the same as for ownership except if the bucket is not local to the thread.
 Then the corresponding bucket of the owner thread is computed for the deallocating thread, and the allocation is pushed onto the deallocating thread's bucket.
@@ -1792,112 +1792,63 @@
 The C dynamic-memory API is extended with the following routines:
 
-\paragraph{\lstinline{void * aalloc( size_t dim, size_t elemSize )}}
-extends @calloc@ for allocating a dynamic array of objects without calculating the total size of array explicitly but \emph{without} zero-filling the memory.
-@aalloc@ is significantly faster than @calloc@, which is the only alternative given by the standard memory-allocation routines.
-
-\noindent\textbf{Usage}
-@aalloc@ takes two parameters.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@dim@: number of array objects
-\item
-@elemSize@: size of array object
-\end{itemize}
+\medskip\noindent
+\lstinline{void * aalloc( size_t dim, size_t elemSize )}
+extends @calloc@ for allocating a dynamic array of objects with total size @dim@ $\times$ @elemSize@ but \emph{without} zero-filling the memory.
+@aalloc@ is significantly faster than @calloc@, which is the only alternative given by the standard memory-allocation routines for array allocation.
 It returns the address of the dynamic array or @NULL@ if either @dim@ or @elemSize@ are zero.
 
-\paragraph{\lstinline{void * resize( void * oaddr, size_t size )}}
-extends @realloc@ for resizing an existing allocation \emph{without} copying previous data into the new allocation or preserving sticky properties.
+\medskip\noindent
+\lstinline{void * resize( void * oaddr, size_t size )}
+extends @realloc@ for resizing an existing allocation, @oaddr@, to the new @size@ (smaller or larger than previous) \emph{without} copying previous data into the new allocation or preserving sticky properties.
 @resize@ is significantly faster than @realloc@, which is the only alternative.
-
-\noindent\textbf{Usage}
-@resize@ takes two parameters.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@oaddr@: address to be resized
-\item
-@size@: new allocation size (smaller or larger than previous)
-\end{itemize}
 It returns the address of the old or new storage with the specified new size or @NULL@ if @size@ is zero.
 
-\paragraph{\lstinline{void * amemalign( size_t alignment, size_t dim, size_t elemSize )}}
-extends @aalloc@ and @memalign@ for allocating an aligned dynamic array of objects.
+\medskip\noindent
+\lstinline{void * amemalign( size_t alignment, size_t dim, size_t elemSize )}
+extends @aalloc@ and @memalign@ for allocating a dynamic array of objects with the starting address on the @alignment@ boundary.
 Sets sticky alignment property.
-
-\noindent\textbf{Usage}
-@amemalign@ takes three parameters.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@alignment@: alignment requirement
-\item
-@dim@: number of array objects
-\item
-@elemSize@: size of array object
-\end{itemize}
 It returns the address of the aligned dynamic-array or @NULL@ if either @dim@ or @elemSize@ are zero.
 
-\paragraph{\lstinline{void * cmemalign( size_t alignment, size_t dim, size_t elemSize )}}
+\medskip\noindent
+\lstinline{void * cmemalign( size_t alignment, size_t dim, size_t elemSize )}
 extends @amemalign@ with zero fill and has the same usage as @amemalign@.
 Sets sticky zero-fill and alignment property.
 It returns the address of the aligned, zero-filled dynamic-array or @NULL@ if either @dim@ or @elemSize@ are zero.
 
-\paragraph{\lstinline{size_t malloc_alignment( void * addr )}}
-returns the alignment of the dynamic object for use in aligning similar allocations.
-
-\noindent\textbf{Usage}
-@malloc_alignment@ takes one parameter.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@addr@: address of an allocated object.
-\end{itemize}
-It returns the alignment of the given object, where objects not allocated with alignment return the minimal allocation alignment.
-
-\paragraph{\lstinline{bool malloc_zero_fill( void * addr )}}
-returns true if the object has the zero-fill sticky property for use in zero filling similar allocations.
-
-\noindent\textbf{Usage}
-@malloc_zero_fill@ takes one parameters.
-
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@addr@: address of an allocated object.
-\end{itemize}
-It returns true if the zero-fill sticky property is set and false otherwise.
-
-\paragraph{\lstinline{size_t malloc_size( void * addr )}}
-returns the request size of the dynamic object (updated when an object is resized) for use in similar allocations.
-See also @malloc_usable_size@.
-
-\noindent\textbf{Usage}
-@malloc_size@ takes one parameters.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@addr@: address of an allocated object.
-\end{itemize}
-It returns the request size or zero if @addr@ is @NULL@.
-
-\paragraph{\lstinline{int malloc_stats_fd( int fd )}}
-changes the file descriptor where @malloc_stats@ writes statistics (default @stdout@).
-
-\noindent\textbf{Usage}
-@malloc_stats_fd@ takes one parameters.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@fd@: file descriptor.
-\end{itemize}
-It returns the previous file descriptor.
-
-\paragraph{\lstinline{size_t malloc_expansion()}}
+\medskip\noindent
+\lstinline{size_t malloc_alignment( void * addr )}
+returns the object alignment, where objects not allocated with alignment return the minimal allocation alignment.
+The result can be used to align similar allocations.
+
+\medskip\noindent
+\lstinline{bool malloc_zero_fill( void * addr )}
+returns true if the object's zero-fill sticky property is set and false otherwise.
+The result can be used to zero fill similar allocations.
+
+\medskip\noindent
+\lstinline{size_t malloc_size( void * addr )}
+returns the object's request size, which is updated when the object is resized, or zero if @addr@ is @NULL@ (see also @malloc_usable_size@).
+The result can be used to size similar allocations.
+
+\medskip\noindent
+\lstinline{int malloc_stats_fd( int fd )}
+changes the file descriptor where @malloc_stats@ writes statistics (default @stdout@) and returns the previous file descriptor.
+
+\medskip\noindent
+\lstinline{size_t malloc_expansion()}
 \label{p:malloc_expansion}
 set the amount (bytes) to extend the heap when there is insufficient free storage to service an allocation request.
 It returns the heap extension size used throughout a program when requesting more memory from the system using @sbrk@ system-call, \ie called once at heap initialization.
 
-\paragraph{\lstinline{size_t malloc_mmap_start()}}
+\medskip\noindent
+\lstinline{size_t malloc_mmap_start()}
 set the crossover between allocations occurring in the @sbrk@ area or separately mapped.
 It returns the crossover point used throughout a program, \ie called once at heap initialization.
 
-\paragraph{\lstinline{size_t malloc_unfreed()}}
+\medskip\noindent
+\lstinline{size_t malloc_unfreed()}
 \label{p:malloc_unfreed}
 amount subtracted to adjust for unfreed program storage (debug only).
-It returns the new subtraction amount and called by @malloc_stats@.
+It returns the new subtraction amount and is called by @malloc_stats@ (discussed in Section~\ref{}).
 
 
@@ -1906,21 +1857,13 @@
 The following extensions take advantage of overload polymorphism in the \CC type-system.
 
-\paragraph{\lstinline{void * resize( void * oaddr, size_t nalign, size_t size )}}
-extends @resize@ with an alignment re\-quirement.
-
-\noindent\textbf{Usage}
-takes three parameters.
-\begin{itemize}[topsep=3pt,itemsep=2pt,parsep=0pt]
-\item
-@oaddr@: address to be resized
-\item
-@nalign@: alignment requirement
-\item
-@size@: new allocation size (smaller or larger than previous)
-\end{itemize}
+\medskip\noindent
+\lstinline{void * resize( void * oaddr, size_t nalign, size_t size )}
+extends @resize@ with an alignment requirement, @nalign@.
 It returns the address of the old or new storage with the specified new size and alignment, or @NULL@ if @size@ is zero.
 
-\paragraph{\lstinline{void * realloc( void * oaddr, size_t nalign, size_t size )}}
-extends @realloc@ with an alignment re\-quirement and has the same usage as aligned @resize@.
+\medskip\noindent
+\lstinline{void * realloc( void * oaddr, size_t nalign, size_t size )}
+extends @realloc@ with an alignment requirement, @nalign@.
+It returns the address of the old or new storage with the specified new size and alignment, or @NULL@ if @size@ is zero.
 
 
@@ -1979,5 +1922,5 @@
 object size: like the \CFA's C-interface, programmers do not have to specify object size or cast allocation results.
 \end{itemize}
-Note, postfix function call is an alternative call syntax, using backtick @`@, where the argument appears before the function name, \eg
+Note, postfix function call is an alternative call syntax, using backtick @`@, so the argument appears before the function name, \eg
 \begin{cfa}
 duration ?@`@h( int h );		// ? denote the position of the function operand
@@ -1987,5 +1930,8 @@
 \end{cfa}
 
-\paragraph{\lstinline{T * alloc( ... )} or \lstinline{T * alloc( size_t dim, ... )}}
+The following extensions take advantage of overload polymorphism in the \CC type-system.
+
+\medskip\noindent
+\lstinline{T * alloc( ... )} or \lstinline{T * alloc( size_t dim, ... )}
 is overloaded with a variable number of specific allocation operations, or an integer dimension parameter followed by a variable number of specific allocation operations.
 These allocation operations can be passed as named arguments when calling the \lstinline{alloc} routine.
@@ -1996,7 +1942,10 @@
 
 The allocation property functions are:
-\subparagraph{\lstinline{T_align ?`align( size_t alignment )}}
+
+\medskip\noindent
+\lstinline{T_align ?`align( size_t alignment )}
 to align the allocation.
-The alignment parameter must be $\ge$ the default alignment (@libAlign()@ in \CFA) and a power of two, \eg:
+The alignment parameter must be $\ge$ the default alignment (@libAlign()@ in \CFA) and a power of two.
+The following example returns a dynamic object and object array aligned on a 4096-byte boundary.
 \begin{cfa}
 int * i0 = alloc( @4096`align@ );  sout | i0 | nl;
@@ -2006,10 +1955,10 @@
 0x555555574000 0x555555574000 0x555555574004 0x555555574008
 \end{cfa}
-returns a dynamic object and object array aligned on a 4096-byte boundary.
-
-\subparagraph{\lstinline{S_fill(T) ?`fill ( /* various types */ )}}
+
+\medskip\noindent
+\lstinline{S_fill(T) ?`fill ( /* various types */ )}
 to initialize storage.
 There are three ways to fill storage:
-\begin{enumerate}
+\begin{enumerate}[itemsep=0pt,parsep=0pt]
 \item
 A char fills each byte of each object.
@@ -2020,5 +1969,5 @@
 \end{enumerate}
 For example:
-\begin{cfa}[numbers=left]
+\begin{cfa}[numbers=left,xleftmargin=2.5\parindentlnth]
 int * i0 = alloc( @0n`fill@ );  sout | *i0 | nl;  // disambiguate 0
 int * i1 = alloc( @5`fill@ );  sout | *i1 | nl;
@@ -2029,5 +1978,5 @@
 int * i6 = alloc( 5, @[i3, 3]`fill@ );  for ( i; 5 ) sout | i6[i]; sout | nl;
 \end{cfa}
-\begin{lstlisting}[numbers=left]
+\begin{lstlisting}[numbers=left,xleftmargin=2.5\parindentlnth]
 0
 5
@@ -2041,14 +1990,15 @@
 Examples 4 to 7 fill an array of objects with values, another array, or part of an array.
 
-\subparagraph{\lstinline{S_resize(T) ?`resize( void * oaddr )}}
+\medskip\noindent
+\lstinline{S_resize(T) ?`resize( void * oaddr )}
 used to resize, realign, and fill, where the old object data is not copied to the new object.
 The old object type may be different from the new object type, since the values are not used.
 For example:
-\begin{cfa}[numbers=left]
+\begin{cfa}[numbers=left,xleftmargin=2.5\parindentlnth]
 int * i = alloc( @5`fill@ );  sout | i | *i;
 i = alloc( @i`resize@, @256`align@, @7`fill@ );  sout | i | *i;
 double * d = alloc( @i`resize@, @4096`align@, @13.5`fill@ );  sout | d | *d;
 \end{cfa}
-\begin{lstlisting}[numbers=left]
+\begin{lstlisting}[numbers=left,xleftmargin=2.5\parindentlnth]
 0x55555556d5c0 5
 0x555555570000 7
@@ -2057,5 +2007,5 @@
 Examples 2 to 3 change the alignment, fill, and size for the initial storage of @i@.
 
-\begin{cfa}[numbers=left]
+\begin{cfa}[numbers=left,xleftmargin=2.5\parindentlnth]
 int * ia = alloc( 5, @5`fill@ );  for ( i; 5 ) sout | ia[i]; sout | nl;
 ia = alloc( 10, @ia`resize@, @7`fill@ ); for ( i; 10 ) sout | ia[i]; sout | nl;
@@ -2063,5 +2013,5 @@
 ia = alloc( 3, @ia`resize@, @4096`align@, @2`fill@ );  sout | ia; for ( i; 3 ) sout | &ia[i] | ia[i]; sout | nl;
 \end{cfa}
-\begin{lstlisting}[numbers=left]
+\begin{lstlisting}[numbers=left,xleftmargin=2.5\parindentlnth]
 5 5 5 5 5
 7 7 7 7 7 7 7 7 7 7
@@ -2071,15 +2021,16 @@
 Examples 2 to 4 change the array size, alignment and fill for the initial storage of @ia@.
 
-\subparagraph{\lstinline{S_realloc(T) ?`realloc( T * a ))}}
+\medskip\noindent
+\lstinline{S_realloc(T) ?`realloc( T * a )}
 used to resize, realign, and fill, where the old object data is copied to the new object.
 The old object type must be the same as the new object type, since the value is used.
 Note, for @fill@, only the extra space after copying the data from the old object is filled with the given parameter.
 For example:
-\begin{cfa}[numbers=left]
+\begin{cfa}[numbers=left,xleftmargin=2.5\parindentlnth]
 int * i = alloc( @5`fill@ );  sout | i | *i;
 i = alloc( @i`realloc@, @256`align@ );  sout | i | *i;
 i = alloc( @i`realloc@, @4096`align@, @13`fill@ );  sout | i | *i;
 \end{cfa}
-\begin{lstlisting}[numbers=left]
+\begin{lstlisting}[numbers=left,xleftmargin=2.5\parindentlnth]
 0x55555556d5c0 5
 0x555555570000 5
@@ -2089,5 +2040,5 @@
 The @13`fill@ in example 3 does nothing because no extra space is added.
 
-\begin{cfa}[numbers=left]
+\begin{cfa}[numbers=left,xleftmargin=2.5\parindentlnth]
 int * ia = alloc( 5, @5`fill@ );  for ( i; 5 ) sout | ia[i]; sout | nl;
 ia = alloc( 10, @ia`realloc@, @7`fill@ ); for ( i; 10 ) sout | ia[i]; sout | nl;
@@ -2095,5 +2046,5 @@
 ia = alloc( 3, @ia`realloc@, @4096`align@, @2`fill@ );  sout | ia; for ( i; 3 ) sout | &ia[i] | ia[i]; sout | nl;
 \end{cfa}
-\begin{lstlisting}[numbers=left]
+\begin{lstlisting}[numbers=left,xleftmargin=2.5\parindentlnth]
 5 5 5 5 5
 5 5 5 5 5 7 7 7 7 7
