source: doc/theses/aaron_moss_PhD/phd/experiments.tex @ 0f78f3c7

Last change on this file since 0f78f3c7 was 8d752592, checked in by Aaron Moss <a3moss@…>, 5 years ago

thesis: discuss design differences between cfa-cc and resolv-proto

\chapter{Experiments}
\label{expr-chap}

I have implemented a prototype system to test the practical effectiveness of the various algorithms described in Chapters~\ref{resolution-chap} and~\ref{env-chap}.
This prototype system implements essentially just the expression resolution pass of \CFACC{}, with a simplified version of the \CFA{} type system and a parser to read in problem instances.
Because of its simpler language model and lack of a requirement to generate runnable code, the prototype system allows for quicker iteration on algorithms, yet it captures enough of the nuances of \CFA{} to have some predictive power for the runtime performance of algorithmic variants in \CFACC{} itself.

There are three sources of problem instances for the resolver prototype.
The first is a set of small, hand-written tests designed to exercise the expressive power and correctness of the prototype.
These tests are valuable for regression testing, but run too quickly to be useful as performance tests.
The second sort of problem instance is procedurally generated according to a set of parameters (distributions of polymorphic versus monomorphic functions, number of function arguments, number of types, \etc{}); the procedural problem generator can be used to explore the behaviour of an algorithm with respect to certain sorts of problem instances by varying the input parameters.
I have implemented a \CFACC{} pass, enabled by a flag, that outputs information that can be used to initialize the procedural generator's parameters to realistic values.
The final sort of problem instance is derived from actual \CFA{} code.
The prototype has a rich enough representation of \CFA{} that actual instances of expression resolution can be expressed with good fidelity, and I have implemented a compiler pass for \CFACC{} that generates instances from \CFA{} code.
Since all development in \CFA{} is currently done by our research team, I have tested the prototype system on all extant \CFA{} code, primarily the standard library and compiler test suite.

\section{Resolver Prototype Features}

The resolver prototype can express most of the \CFA{} features described in Chapter~\ref{cfa-chap}.
It supports both monomorphic and polymorphic functions, with type assertions for polymorphic functions.
Traits are not explicitly represented, but \CFACC{} inlines traits before the resolver pass, so this is a faithful representation of the existing compiler problem.
The prototype system supports variable declarations as well as function declarations, and has a lexical-scoping scheme which obeys \CFA{}-like overloading and overriding rules.

The type system of the resolver prototype also captures the key aspects of the \CFA{} type system.
\emph{Concrete types} represent the built-in arithmetic types of \CFA{}, along with the implicit conversions between them.
Each concrete type is represented by an integer ID, and the conversion cost from $x$ to $y$ is $|y-x|$, counted as a safe conversion if $y > x$, or as an unsafe conversion if $y < x$.
This is markedly simpler than the graph of conversion costs in \CFA{}, but captures the essentials of the design well.
For simplicity, !zero_t! and !one_t!, the types of !0! and !1!, are represented by the type corresponding to !int!.
\emph{Named types} are analogues of \CFA{} aggregates, such as structs and unions; aggregate fields are encoded as unary functions from the struct type to the field type, named based on the field name.
Named types also support type parameters, and as such can represent generic types.
Generic named types are also used to represent the built-in parameterized types of \CFA{}; !T*! is encoded as \texttt{\#\$ptr<T>}.
\CFA{} arrays are also represented as pointers, to simulate array-to-pointer decay, while top-level reference types are replaced by their referent to simulate the variety of reference conversions.
Function types have first-class representation in the prototype as the type of both function declarations and function pointers, though the function type in the prototype system loses information about type assertions, so polymorphic function pointers cannot be expressed.
Void and tuple types are also supported in the prototype, to express the multiple-return-value functions in \CFA{}, though varargs functions and !ttype! tuple-typed type variables are absent from the prototype system.
The prototype system also does not represent type qualifiers (\eg{} !const!, !volatile!), so all such qualifiers are stripped during conversion to the prototype system.
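
The concrete-type cost rule described above can be sketched in a few lines of C++.
This is only an illustration under stated assumptions: the names !ConvCost!, !conversionCost! and !demoConversionCost!, and the choice to keep separate safe and unsafe counters, are hypothetical and are not taken from the prototype's source.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical cost record: the chapter only states that a conversion is
// either safe or unsafe; keeping two counters is an assumption of this sketch.
struct ConvCost {
    int safe;    // distance moved "upward" in type ID (y > x)
    int unsafe;  // distance moved "downward" in type ID (y < x)
};

// Cost of converting concrete type ID x to concrete type ID y.
// The magnitude is |y - x|; the direction decides safe versus unsafe.
inline ConvCost conversionCost(int x, int y) {
    const int dist = std::abs(y - x);
    if (y > x) return ConvCost{dist, 0};
    if (y < x) return ConvCost{0, dist};
    return ConvCost{0, 0};  // identical types: no conversion needed
}

// Small usage example with arbitrary IDs.
inline void demoConversionCost() {
    assert(conversionCost(2, 5).safe == 3);    // widening by three steps
    assert(conversionCost(5, 2).unsafe == 3);  // narrowing by three steps
}
```

Under this encoding, comparing two candidate conversions reduces to integer arithmetic on IDs, which is what makes it simpler than a graph of conversion costs.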

The resolver prototype supports three sorts of expressions in its input language.
The simplest are \emph{value expressions}, which are expressions declared to have a certain type; these implement literal expressions in \CFA{}, and, already being typed, are passed through the resolver unchanged.
The second sort, \emph{name expressions}, represents variable expressions in \CFA{}; these contain the name of a variable or function, and are matched to an appropriate declaration overloading that name by the resolver.
The third sort, the \emph{function expression}, represents a call to a function, with a name and zero or more argument subexpressions.
As is usual in \CFA{}, operators are represented as function calls; however, as mentioned above, the prototype system represents field access expressions !a.f! as function expressions as well.
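
The three input expression forms can be pictured as a small tagged union.
The following C++17 sketch is an assumption-laden illustration, not the prototype's actual data model: all type and function names are hypothetical, and using the bare field name as the function name is a simplification of the ``named based on the field name'' encoding.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct ValueExpr { int typeId; };        // already typed; passes through unchanged
struct NameExpr  { std::string name; };  // resolved against overloaded declarations
struct FuncExpr  {                       // call with zero or more arguments
    std::string name;
    std::vector<ExprPtr> args;
};

struct Expr {
    std::variant<ValueExpr, NameExpr, FuncExpr> node;
};

inline ExprPtr makeValue(int typeId) {
    return std::make_shared<Expr>(Expr{ValueExpr{typeId}});
}

inline ExprPtr makeName(std::string name) {
    return std::make_shared<Expr>(Expr{NameExpr{std::move(name)}});
}

inline ExprPtr makeCall(std::string name, std::vector<ExprPtr> args) {
    return std::make_shared<Expr>(Expr{FuncExpr{std::move(name), std::move(args)}});
}

// A field access a.f becomes a unary call f(a); the function name used here
// (the bare field name) is an assumption of this sketch.
inline ExprPtr makeFieldAccess(ExprPtr base, const std::string& field) {
    assert(base != nullptr);
    return makeCall(field, {std::move(base)});
}
```

For example, !makeFieldAccess(makeName("a"), "f")! builds a function expression named !f! whose single argument is the name expression !a!, so the resolver needs no separate case for member access.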

The main area for future expansion in the design of the resolver prototype is conversions.
Cast expressions are implemented in the output language of the resolver, but cannot be expressed in the input.
The only implicit conversions supported are those between the arithmetic-like concrete types, which cover most, but not all, of \CFA{}'s built-in implicit conversions\footnote{Notable absences include conversions from \lstinline{void*} to other pointer types and from \lstinline{0} to pointer types.}.
Future work should include a way to express implicit (and possibly explicit) conversions in the input language, with an investigation of the most efficient way to handle implicit conversions, and potentially a design for user-defined conversions.

\section{Resolver Prototype Design}

As discussed above, the resolver prototype works over a simplified version of the \CFA{} type system, for speed of development.
The build system for the resolver prototype uses a number of conditional compilation flags to switch between algorithm variants while retaining maximal shared code.
A different executable name is also generated for each algorithmic variant so that distinct variants can be more easily tested against each other.

The primary architectural difference between the resolver prototype and \CFACC{} is that the prototype system uses a simple mark-and-sweep garbage collector for memory management, while \CFACC{} takes a manual memory management approach.
This decision was made for the purpose of faster development iteration, but has proved to be a significant performance benefit as well.
\CFACC{} frequently needs to make deep clones of significant object graphs to ensure memory ownership (followed by eventual deletion of these clones), an unnecessarily time-consuming process.
The prototype, on the other hand, only needs to clone modified nodes, and can share identical subsets of the object graph.
The key design decision enabling this is that all subnodes are held by !const! pointer, and thus cannot be mutated once they have been stored in a parent node.
With minimal programming discipline, it can thus be ensured that any expression is either mutable or shared, but never both; the Dotty research compiler for Scala takes a similar architectural approach\cit{}. % this citation would be "personal correspondence"
The tree mutator abstraction is designed to take advantage of this, only creating new nodes if a node must actually be mutated.
I attempted to port this garbage collector to \CFACC{}, but without success.
The GC could be used for memory management with few changes to the codebase, but without a substantial re-write to enforce the same ``!const! children'' discipline, \CFACC{} could not take advantage of the potential to share sub-objects.
Without sharing of sub-objects, the GC variant of \CFACC{} must do all the same allocations and deletions, and garbage-collector overhead degraded performance unacceptably (though it did fix some known memory leaks introduced by failures of the existing manual memory management scheme).
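
The ``!const! children'' discipline and the copy-only-on-change behaviour of the tree mutator can be illustrated with a small persistent tree.
This sketch makes two loud assumptions: it substitutes !std::shared_ptr! reference counting for the prototype's mark-and-sweep garbage collector, and the names !Node!, !makeNode! and !renameLabel! are hypothetical stand-ins for the real node and mutator classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
// Children are only reachable through pointers-to-const, so a node cannot
// be modified once it has been stored in a parent.
using NodePtr = std::shared_ptr<const Node>;

struct Node {
    std::string label;
    std::vector<NodePtr> children;
};

inline NodePtr makeNode(std::string label, std::vector<NodePtr> children = {}) {
    return std::make_shared<Node>(Node{std::move(label), std::move(children)});
}

// Mutator-style pass: renames nodes labelled `from` to `to`.
// A new node is allocated only where the node or one of its descendants
// changes; untouched subtrees are returned as-is and therefore shared.
inline NodePtr renameLabel(const NodePtr& node,
                           const std::string& from,
                           const std::string& to) {
    assert(node != nullptr);
    std::vector<NodePtr> newChildren;
    newChildren.reserve(node->children.size());
    bool changed = false;
    for (const NodePtr& child : node->children) {
        NodePtr result = renameLabel(child, from, to);
        changed = changed || (result != child);
        newChildren.push_back(std::move(result));
    }
    if (node->label == from) return makeNode(to, std::move(newChildren));
    if (changed) return makeNode(node->label, std::move(newChildren));
    return node;  // nothing changed below: share the existing subtree
}
```

Renaming one leaf of a tree allocates new nodes only along the path from the root to that leaf; every other subtree is shared between the old and new trees, which is the saving that a deep-cloning ownership scheme cannot obtain.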

Another minor architectural difference between \CFACC{} and the prototype system is that \CFACC{} makes extensive use of the pointer-chasing !std::list!, !std::set!, and !std::map! data structures, while the prototype uses the array-based !std::vector! and the hash-based !unordered_! variants of !set! and !map! instead. \TODO{investigate performance difference by testing a resolver prototype variant with List etc. redefined}
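
One way the comparison suggested in the note above might be set up is with alias templates selected by a conditional compilation flag.
This is a sketch only: the flag name !RP_NODE_CONTAINERS! and the aliases !Set! and !Map! are hypothetical, and the prototype's real aliases and build flags may differ.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <set>
#include <unordered_map>
#include <unordered_set>
#include <vector>

#ifdef RP_NODE_CONTAINERS  // hypothetical flag: node-based containers, as in the compiler
template<typename T> using List = std::list<T>;
template<typename T> using Set = std::set<T>;
template<typename K, typename V> using Map = std::map<K, V>;
#else                      // default: array-based and hash-based containers
template<typename T> using List = std::vector<T>;
template<typename T> using Set = std::unordered_set<T>;
template<typename K, typename V> using Map = std::unordered_map<K, V>;
#endif

// Code written against the aliases compiles unchanged under either setting,
// provided it uses only operations common to both container families.
inline int countDistinct(const List<int>& xs) {
    Set<int> seen(xs.begin(), xs.end());
    return static_cast<int>(seen.size());
}

// Small usage example.
inline void demoContainers() {
    List<int> xs{3, 1, 3, 2, 1};
    assert(countDistinct(xs) == 3);
}
```

Building the same sources once with and once without the flag would give two executables that differ only in container choice, in keeping with the one-executable-per-variant build scheme described earlier in this section.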

The final difference between \CFACC{} and the resolver prototype is that, as an experiment in language usability, the prototype performs resolution-based rather than unification-based assertion satisfaction, as discussed in Section~\ref{resn-conclusion-sec}.
This enables coding patterns not available in \CFACC{}, \eg{} a more flexible approach to type assertion satisfaction and better handling of functions returning polymorphic type variables that do not exist in the parameter list.
\TODO{test performance; shouldn't be too hard to change \texttt{resolveAssertions} to use unification}

\section{Experimental Results}

% use Jenkins daily build logs to rebuild speedup graph with more data

% look back at Resolution Algorithms section for threads to tie up "does the algorithm look like this?"