- Timestamp:
- Jan 25, 2019, 5:19:24 PM (6 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- 21cf101
- Parents:
- c58bb11
- Location:
- doc/theses/aaron_moss_PhD/phd
- Files:
-
- 4 added
- 2 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/aaron_moss_PhD/phd/Makefile
rc58bb11 rb4fd981 30 30 FIGURES = ${addsuffix .eps, \ 31 31 resolution-dag \ 32 union-find-with-classes \ 33 persistent-union-find \ 32 34 } 33 35 -
doc/theses/aaron_moss_PhD/phd/type-environment.tex
rc58bb11 rb4fd981 70 70 \subsection{Na\"{\i}ve} 71 71 72 The type environment data structure used in Bilson's \cite{Bilson03} original implementation of \CFACC{} is a straightforward translation of the definitions in Section~\ref{env-defn-sec} to \CC{} code; a !TypeEnvironment! contains a list of !EqvClass! type equivalence classes, each of which contains the type bound information and a tree-based sorted set of type variables.72 The type environment data structure used in Bilson's~\cite{Bilson03} original implementation of \CFACC{} is a straightforward translation of the definitions in Section~\ref{env-defn-sec} to \CC{} code; a !TypeEnvironment! contains a list of !EqvClass! type equivalence classes, each of which contains the type bound information and a tree-based sorted set of type variables. 73 73 This approach has the benefit of being easy to understand and not imposing life-cycle or inheritance constraints on its use, but, as can be seen in Table~\ref{env-bounds-table}, does not support many of the desired operations with any particular efficiency. 74 74 Some variations on this structure may improve performance somewhat; for instance, replacing the !EqvClass! variable storage with a hash-based set would reduce search and update times from $O(\log n)$ to amortized $O(1)$, while adding an index for the type variables in the entire environment would remove the need to check each type class individually to maintain the disjointness property. … … 91 91 \subsection{Union-Find} \label{env-union-find-approach} 92 92 93 Given the nature of the classes of type variables as disjoint sets, another natural approach to implementing a type environment is the union-find disjoint set data structure \cite{Galler64}.93 Given the nature of the classes of type variables as disjoint sets, another natural approach to implementing a type environment is the union-find disjoint set data structure~\cite{Galler64}. 94 94 Union-find efficiently implements two operations over a partition of a collection of elements into disjoint sets; $find(x)$ locates the \emph{representative} of $x$, the element which canonically names its set, while $union(r, s)$ merges two sets represented by $r$ and $s$, respectively. 95 95 The union-find data structure is based on providing each element with a reference to its parent element, such that the root of a tree of elements is the representative of the set of elements contained in the tree. … … 110 110 Unfortunately, the literature\cite{Tarjan84,Galil91,Patwary10} on union-find does not present a way to keep references to children without breaking the asymptotic time bounds of the algorithm; I have discovered a method to do so which, despite its simplicity, seems to be novel. 111 111 112 \TODO{port figure from slideshow}113 114 112 The core idea of this ``union-find with classes'' data structure and algorithm is to keep the members of each class stored in a circularly-linked list. 115 113 Aho, Hopcroft, and Ullman also include a circularly-linked list in their 1974 textbook~\cite{Aho74}. … … 117 115 In my version, the list data structure does not affect the layout of the union-find tree, maintaining the same asymptotic bounds as union-find. 118 116 In more detail, each element is given a !next! pointer to another element in the same class; this !next! pointer initially points to the element itself. 119 When two classes are unified, the !next! pointers of the representatives of those classes are swapped, splicing the two circularly-linked lists together .117 When two classes are unified, the !next! pointers of the representatives of those classes are swapped, splicing the two circularly-linked lists together as illustrated in Figure~\ref{union-find-classes-fig}. 120 118 Importantly, though this approach requires an extra pointer per element, it does maintain the linear space bound of union-find, and because it only requires updating the two root nodes in $union$ it does not asymptotically increase runtime either. 121 119 The basic approach is compatible with all path-compression techniques, and allows the members of any class to be retrieved in time linear in the size of the class simply by following the !next! pointers from any element. 120 121 \begin{figure} 122 \centering 123 \includegraphics{figures/union-find-with-classes} 124 \caption[Union operation for union-find with classes.]{Union operation for union-find with classes. Solid lines indicate parent pointers, dashed lines are \lstinline{next} pointers.} 125 \label{union-find-classes-fig} 126 \end{figure} 122 127 123 128 If the path-compression optimization is abandoned, union-find with classes also encodes a reversible history of all the $union$ operations applied to a given class. … … 143 148 Conchon and Filli\^{a}tre~\cite{Conchon07} present a persistent union-find data structure based on the persistent array of Baker~\cite{Baker78,Baker91}. 144 149 145 \TODO{port figure from slideshow}146 147 150 In Baker's persistent array, an array reference contains either a pointer to the array or a pointer to an \emph{edit node}; these edit nodes contain an array index, the value in that index, and another array reference pointing either to the array or a different edit node. 148 151 In this manner, a tree of edits is formed, rooted at the actual array. 149 Read from the actual array at the root can be performed in constant time, as with a non-persistent array.152 Reads from the actual array at the root can be performed in constant time, as with a non-persistent array. 150 153 The persistent array can be mutated in constant time by directly modifying the underlying array, then replacing its array reference with an edit node containing the mutated index, the previous value at that index, and a reference to the mutated array. If the current array reference is not the root, mutation consists simply of constructing a new edit node encoding the change and referring to the current array reference. 151 154 The mutation algorithm at the root is in some sense a special case of the key operation on persistent arrays, $reroot$. … … 163 166 Besides replacing the underlying array with a hash table, the other major change in this approach is to replace the two types of array references, !Array! and !Edit!, with four node types, !Table!, !Edit!, !Add!, and !Remove!, where !Add! adds a new key-value pair, !Remove! removes a key, and !Edit! mutates an existing key-value pair. 164 167 In this variant of \CFACC{}, this persistent hash table is used as the side map discussed in Section~\ref{env-union-find-approach} for class bounds. 165 The actual union-find data structure is slightly modified from this approach, with a !Base! node containing the root union-find data structure, !Add! nodes adding new elements, !AddTo! nodes defining the union of two type classes, and !Remove! and !RemoveFrom! nodes as inverses of the previous two elements, for purposes of maintaining the edit list .168 The actual union-find data structure is slightly modified from this approach, with a !Base! node containing the root union-find data structure, !Add! nodes adding new elements, !AddTo! nodes defining the union of two type classes, and !Remove! and !RemoveFrom! nodes as inverses of the previous two elements, for purposes of maintaining the edit list; Figure~\ref{persistent-union-find-fig} demonstrates the structure of a simple example. 166 169 Making !AddTo! and !RemoveFrom! single nodes shortens the edit path for improved performance, while also providing semantic information missing from the raw array updates in Conchon and Filli\^{a}tre's data structure. 167 170 The single-node approach, does, however, break under most path-compression algorithms; !RemoveFrom! can be applied to the underlying data structure using the ``leaf of last union'' approach discussed in in Section~\ref{env-union-find-classes-approach}; this was judged an acceptable trade-off for the added semantic information and shortened paths. 171 172 \begin{figure} 173 \centering 174 \includegraphics{figures/persistent-union-find} 175 \caption[Persistent union-find data structure.]{Persistent union-find data structure. Shows the edit nodes to reverse back to an empty structure.} 176 \label{persistent-union-find-fig} 177 \end{figure} 168 178 169 179 Maintaining explicit information on $union$ operations in the persistent union-find edit tree in the form of !AddTo! and !RemoveFrom! nodes exposes a new option for combining type environments.
Note: See TracChangeset
for help on using the changeset viewer.