Changeset b3edf7f5 for doc/theses/aaron_moss_PhD
 Timestamp:
 Feb 21, 2019, 6:40:58 PM (5 years ago)
 Branches:
 ADT, aaronthesis, armeh, astexperimental, cleanupdtors, enum, forallpointerdecay, jacob/cs343translation, jenkinssandbox, master, newast, newastuniqueexpr, pthreademulation, qualifiedEnum
 Children:
 d065ded
 Parents:
 f316c68
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

doc/theses/aaron_moss_PhD/phd/typeenvironment.tex
rf316c68 rb3edf7f5 5 5 As discussed in Chapter~\ref{resolutionchap}, being able to efficiently determine which type variables are bound to which concrete types or whether two type environments are compatible is a core requirement of the resolution algorithm. 6 6 Furthermore, expression resolution involves a search through many related possible solutions, so being able to reuse shared subsets of type environment data and to switch between environments quickly is desirable for performance. 7 In this chapter I discuss and empirically compare a number of type environment data structure variants, including some novel variations on the unionfind\cite{Galler64} data structure introduced in this thesis. 7 In this chapter I discuss a number of type environment data structure variants, including some novel variations on the unionfind \cite{Galler64} data structure introduced in this thesis. 8 Chapter~\ref{exprchap} contains empirical comparisons of the performance of these data structures when integrated into the resolution algorithm. 8 9 9 10 \section{Definitions} \label{envdefnsec} … … 108 109 Since the unionfind data structure stores only links from children to parents and not viceversa, there is no way to reconstruct a class from one of its elements without a linear search over the entire data structure, with $find$ called on each element to check its membership in the class. 109 110 The situation is even worse for the $split$ operation, which would require extra information to maintain the order that each child was added to its parent node. 110 Unfortunately, the literature \cite{Tarjan84,Galil91,Patwary10} on unionfind does not present a way to keep references to children without breaking the asymptotic time bounds of the algorithm; I have discovered a method to do so which, despite its simplicity, seems to be novel.111 Unfortunately, the literature \cite{Tarjan84,Galil91,Patwary10} on unionfind does not present a way to keep references to children without breaking the asymptotic time bounds of the algorithm; I have discovered a method to do so which, despite its simplicity, seems to be novel. 111 112 112 113 The core idea of this ``unionfind with classes'' data structure and algorithm is to keep the members of each class stored in a circularlylinked list. … … 158 159 This is accomplished by tracing the path from some edit node to the root node of the array (always the underlying array), recursively applying the edits to the underlying array and replacing each edit node's successor with the inverse edit. 159 160 In this way, any previous state of the persistent array can be restored in time proportional to the number of edits to the current state of the array. 160 While $reroot$ does maintain the same value mapping in every version of the persistent array, the internal mutations it performs means that it is not threadsafe, andmust be used behind a lock in a concurrent context.161 While $reroot$ does maintain the same value mapping in every version of the persistent array, the internal mutations it performs break threadsafety, and thus it must be used behind a lock in a concurrent context. 161 162 Also, the root node with the actual array may in principle be anywhere in the tree, and does not provide information to report its leaf nodes, so some form of automatic garbage collection is generally required for the data structure. 162 163 Since the graph of edit nodes is treestructured, reference counting approaches suffice for garbage collection; Conchon and Filli\^{a}tre~\cite{Conchon07} also observe that if the only $reroot$ operations are for backtracking then the tail of inverse edit nodes may be elided, suggesting the possibility of stackbased memory management. … … 248 249 The unionfind data structure is designed to support $find$ efficiently, and thus for any variant, the cost is simply the distance up the tree to the representative element. 249 250 For basic unionfind, this is amortized to the inverse Ackermann function, $O(\alpha(m))$, essentially a small constant, though the loss of path compression in persistent unionfind raises this cost to the still cheap $O(\log m)$. 250 Basic unionfind is not designed to support the $report$ operation, however, so it must be simulated by checking the representative of every type var aible, at cost $O(n\alpha(m))$.251 Basic unionfind is not designed to support the $report$ operation, however, so it must be simulated by checking the representative of every type variable, at cost $O(n\alpha(m))$. 251 252 Persistent unionfind, on the other hand, uses the ``with classes'' extension to unionfind described in Section~\ref{envunionfindclassesapproach} to run $report$ in $O(m)$ time. 252 253 … … 265 266 Persistent unionfind, as the name suggests, is more optimized, with $O(1)$ cost to $save$ a backtrackcapable reference to the current environment state, and $O(p)$ cost to revert to that state (possibly destroying nolongerused edit nodes along the path). 266 267 267 % Future work: design multithreaded version of C&F persistent map  core idea is some sort of threadboundary edit node 268 \section{Conclusion \& Future Work} 269 270 This chapter has discussed the type environment data structure, as well as some more sophisticated approaches to optimize its performance for workloads encountered in the expression resolution problem, and asymptotic analysis of each approach. 271 Chapter~\ref{exprchap} provides experimental performance results for a representative set of these approaches. 272 One contribution of this thesis is the unionfind with classes data structure for efficient retrieval of unionfind class members, along with a related algorithm for reversing the history of $union$ operations in this data structure. 273 This reversible history contributes to the second novel contribution of this chapter, a type environment data structure based off the persistent unionfind data structure of Conchon and Filli\^{a}tre~\cite{Conchon07}. 274 This persistent unionfind environment uses the $split$ operation introduced in unionfind with classes and the edit history of the persistent data structure to support an environmentcombining algorithm that only considers the edits between the environments to be merged. 275 276 This persistent unionfind data structure is efficient, but not threadsafe; as suggested in Section~\ref{resnconclusionsec}, it may be valuable to parallelize the \CFA{} expression resolver. 277 However, allowing multiple threads concurrent access to the persistent data structure is likely to result in a sort of ``reroot thrashing'', as different threads reroot the data structure to their own versions of interest. 278 It is possible this could be mitigated by partitioning the data structure into separate subtrees for each thread, with each subtree having its own root node, and the boundaries between them implemented with a lockequipped !ThreadBoundary! edit node.
Note: See TracChangeset
for help on using the changeset viewer.