Changeset b4fd981 for doc/theses/aaron_moss_PhD/phd/typeenvironment.tex
 Timestamp:
 Jan 25, 2019, 5:19:24 PM (5 years ago)
 Branches:
 ADT, aaronthesis, armeh, astexperimental, cleanupdtors, enum, forallpointerdecay, jacob/cs343translation, jenkinssandbox, master, newast, newastuniqueexpr, pthreademulation, qualifiedEnum
 Children:
 21cf101
 Parents:
 c58bb11
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

doc/theses/aaron_moss_PhD/phd/typeenvironment.tex
rc58bb11 rb4fd981 70 70 \subsection{Na\"{\i}ve} 71 71 72 The type environment data structure used in Bilson's \cite{Bilson03} original implementation of \CFACC{} is a straightforward translation of the definitions in Section~\ref{envdefnsec} to \CC{} code; a !TypeEnvironment! contains a list of !EqvClass! type equivalence classes, each of which contains the type bound information and a treebased sorted set of type variables.72 The type environment data structure used in Bilson's~\cite{Bilson03} original implementation of \CFACC{} is a straightforward translation of the definitions in Section~\ref{envdefnsec} to \CC{} code; a !TypeEnvironment! contains a list of !EqvClass! type equivalence classes, each of which contains the type bound information and a treebased sorted set of type variables. 73 73 This approach has the benefit of being easy to understand and not imposing lifecycle or inheritance constraints on its use, but, as can be seen in Table~\ref{envboundstable}, does not support many of the desired operations with any particular efficiency. 74 74 Some variations on this structure may improve performance somewhat; for instance, replacing the !EqvClass! variable storage with a hashbased set would reduce search and update times from $O(\log n)$ to amortized $O(1)$, while adding an index for the type variables in the entire environment would remove the need to check each type class individually to maintain the disjointness property. … … 91 91 \subsection{UnionFind} \label{envunionfindapproach} 92 92 93 Given the nature of the classes of type variables as disjoint sets, another natural approach to implementing a type environment is the unionfind disjoint set data structure \cite{Galler64}.93 Given the nature of the classes of type variables as disjoint sets, another natural approach to implementing a type environment is the unionfind disjoint set data structure~\cite{Galler64}. 94 94 Unionfind efficiently implements two operations over a partition of a collection of elements into disjoint sets; $find(x)$ locates the \emph{representative} of $x$, the element which canonically names its set, while $union(r, s)$ merges two sets represented by $r$ and $s$, respectively. 95 95 The unionfind data structure is based on providing each element with a reference to its parent element, such that the root of a tree of elements is the representative of the set of elements contained in the tree. … … 110 110 Unfortunately, the literature\cite{Tarjan84,Galil91,Patwary10} on unionfind does not present a way to keep references to children without breaking the asymptotic time bounds of the algorithm; I have discovered a method to do so which, despite its simplicity, seems to be novel. 111 111 112 \TODO{port figure from slideshow}113 114 112 The core idea of this ``unionfind with classes'' data structure and algorithm is to keep the members of each class stored in a circularlylinked list. 115 113 Aho, Hopcroft, and Ullman also include a circularlylinked list in their 1974 textbook~\cite{Aho74}. … … 117 115 In my version, the list data structure does not affect the layout of the unionfind tree, maintaining the same asymptotic bounds as unionfind. 118 116 In more detail, each element is given a !next! pointer to another element in the same class; this !next! pointer initially points to the element itself. 119 When two classes are unified, the !next! pointers of the representatives of those classes are swapped, splicing the two circularlylinked lists together .117 When two classes are unified, the !next! pointers of the representatives of those classes are swapped, splicing the two circularlylinked lists together as illustrated in Figure~\ref{unionfindclassesfig}. 120 118 Importantly, though this approach requires an extra pointer per element, it does maintain the linear space bound of unionfind, and because it only requires updating the two root nodes in $union$ it does not asymptotically increase runtime either. 121 119 The basic approach is compatible with all pathcompression techniques, and allows the members of any class to be retrieved in time linear in the size of the class simply by following the !next! pointers from any element. 120 121 \begin{figure} 122 \centering 123 \includegraphics{figures/unionfindwithclasses} 124 \caption[Union operation for unionfind with classes.]{Union operation for unionfind with classes. Solid lines indicate parent pointers, dashed lines are \lstinline{next} pointers.} 125 \label{unionfindclassesfig} 126 \end{figure} 122 127 123 128 If the pathcompression optimization is abandoned, unionfind with classes also encodes a reversible history of all the $union$ operations applied to a given class. … … 143 148 Conchon and Filli\^{a}tre~\cite{Conchon07} present a persistent unionfind data structure based on the persistent array of Baker~\cite{Baker78,Baker91}. 144 149 145 \TODO{port figure from slideshow}146 147 150 In Baker's persistent array, an array reference contains either a pointer to the array or a pointer to an \emph{edit node}; these edit nodes contain an array index, the value in that index, and another array reference pointing either to the array or a different edit node. 148 151 In this manner, a tree of edits is formed, rooted at the actual array. 149 Read from the actual array at the root can be performed in constant time, as with a nonpersistent array.152 Reads from the actual array at the root can be performed in constant time, as with a nonpersistent array. 150 153 The persistent array can be mutated in constant time by directly modifying the underlying array, then replacing its array reference with an edit node containing the mutated index, the previous value at that index, and a reference to the mutated array. If the current array reference is not the root, mutation consists simply of constructing a new edit node encoding the change and referring to the current array reference. 151 154 The mutation algorithm at the root is in some sense a special case of the key operation on persistent arrays, $reroot$. … … 163 166 Besides replacing the underlying array with a hash table, the other major change in this approach is to replace the two types of array references, !Array! and !Edit!, with four node types, !Table!, !Edit!, !Add!, and !Remove!, where !Add! adds a new keyvalue pair, !Remove! removes a key, and !Edit! mutates an existing keyvalue pair. 164 167 In this variant of \CFACC{}, this persistent hash table is used as the side map discussed in Section~\ref{envunionfindapproach} for class bounds. 165 The actual unionfind data structure is slightly modified from this approach, with a !Base! node containing the root unionfind data structure, !Add! nodes adding new elements, !AddTo! nodes defining the union of two type classes, and !Remove! and !RemoveFrom! nodes as inverses of the previous two elements, for purposes of maintaining the edit list .168 The actual unionfind data structure is slightly modified from this approach, with a !Base! node containing the root unionfind data structure, !Add! nodes adding new elements, !AddTo! nodes defining the union of two type classes, and !Remove! and !RemoveFrom! nodes as inverses of the previous two elements, for purposes of maintaining the edit list; Figure~\ref{persistentunionfindfig} demonstrates the structure of a simple example. 166 169 Making !AddTo! and !RemoveFrom! single nodes shortens the edit path for improved performance, while also providing semantic information missing from the raw array updates in Conchon and Filli\^{a}tre's data structure. 167 170 The singlenode approach, does, however, break under most pathcompression algorithms; !RemoveFrom! can be applied to the underlying data structure using the ``leaf of last union'' approach discussed in in Section~\ref{envunionfindclassesapproach}; this was judged an acceptable tradeoff for the added semantic information and shortened paths. 171 172 \begin{figure} 173 \centering 174 \includegraphics{figures/persistentunionfind} 175 \caption[Persistent unionfind data structure.]{Persistent unionfind data structure. Shows the edit nodes to reverse back to an empty structure.} 176 \label{persistentunionfindfig} 177 \end{figure} 168 178 169 179 Maintaining explicit information on $union$ operations in the persistent unionfind edit tree in the form of !AddTo! and !RemoveFrom! nodes exposes a new option for combining type environments.
Note: See TracChangeset
for help on using the changeset viewer.