Changeset f1240b0


Timestamp: Apr 16, 2019, 3:51:40 PM (6 years ago)
Author: Aaron Moss <a3moss@…>
Branches: ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children: a786586
Parents: 397848f5
git-author: Aaron Moss <a3moss@…> (04/16/19 15:50:52)
git-committer: Aaron Moss <a3moss@…> (04/16/19 15:51:40)
Message: thesis: strip CFA-DCA results, add speedup graph
Location: doc/theses/aaron_moss_PhD/phd
Files: 3 edited

  • doc/theses/aaron_moss_PhD/phd/evaluation/cfa-plots.gp (r397848f5 → rf1240b0)

        set key outside

      - plot for [i=2:6] 'evaluation/cfa-time.tsv' using 2:i title columnheader
      + plot for [i=2:5] 'evaluation/cfa-time.tsv' using 2:i title columnheader

        # MEMORY #
        …
        set ylabel "peak memory (MB)"

      - plot for [i=2:6] 'evaluation/cfa-mem-by-time.tsv' using 7:(column(i)/1000) title columnheader
      + plot for [i=2:5] 'evaluation/cfa-mem-by-time.tsv' using 7:(column(i)/1000) title columnheader
      +
      + # SPEEDUP #
      + set output BUILD."cfa-speedup.tex"
      +
      + set ylabel "speedup w.r.t. baseline"
      + unset logscale y
      +
      + plot 'evaluation/cfa-time.tsv' using 2:(column(2)/column(2)) title 'baseline', \
      +      '' using 2:(column(2)/column(3)) title columnheader, \
      +      '' using 2:(column(3)/column(4)) title 'inter-round', \
      +      '' using 2:(column(4)/column(5)) title columnheader

        # # RUNTIME SPEEDUP #
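
    The new SPEEDUP block above builds each series as a ratio of runtime columns from cfa-time.tsv. The following is a minimal, self-contained gnuplot sketch of that column-ratio technique; it is not part of the changeset, the inline $times datablock holds purely illustrative toy numbers, and the assumption that columns 2–5 hold the baseline and successive variant runtimes (with the baseline column doubling as the x-axis) is inferred from the plot command rather than stated in the diff.

        # Editor's sketch only -- hypothetical data, assumed column layout.
        # Columns: test name, baseline runtime, then runtimes of later variants.
        $times << EOD
        test-small   1.0   0.9   1.8   1.1
        test-medium  4.0   3.1   7.5   4.2
        test-large  20.0  14.0  38.0  19.0
        EOD

        set key outside
        set ylabel "speedup w.r.t. baseline"

        # Each series divides an earlier variant's runtime by a later one's:
        # values above 1 are speedups, values below 1 are slowdowns.
        plot $times using 2:(column(2)/column(2)) title 'baseline', \
             $times using 2:(column(2)/column(3)) title 'round 1', \
             $times using 2:(column(3)/column(4)) title 'inter-round', \
             $times using 2:(column(4)/column(5)) title 'round 2'

    Explicit series titles are used here because the toy datablock has no header row; the changeset's plot command instead takes two of its titles from the column headers of cfa-time.tsv.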
  • doc/theses/aaron_moss_PhD/phd/experiments.tex (r397848f5 → rf1240b0)

        \section{\CFA{} Results} \label{cfa-results-sec}

      - I have integrated most of the algorithmic techniques discussed in this chapter into \CFACC{}.
      + I have integrated a number of the algorithmic techniques discussed in this chapter into \CFACC{}.
        This integration took place over a period of months while \CFACC{} was under active development on a number of other fronts, so it is not possible to completely isolate the effects of the algorithmic changes, but I believe the algorithmic changes to have had the most-significant effects on performance over the study period.
        To generate this data, representative commits from the \texttt{git} history of the project were checked out and compiled, then run on the same machine used for the resolver prototype experiments discussed in Section~\ref{proto-exp-sec}.
        …
        I performed two rounds of modification to \CFACC{}; the first round moved from Bilson's original combined-bottom-up algorithm to an un-combined bottom-up algorithm, denoted \textsc{cfa-co} and \textsc{cfa-bu}, respectively.
        A top-down algorithm was not attempted in \CFACC{} due to its poor performance in the prototype.
      - The second round of modifications addressed assertion satisfaction, taking Bilson's original \textsc{cfa-imm} algorithm, and iteratively modifying it, first to use the deferred approach \textsc{cfa-def}, then caching those results in the \textsc{cfa-dca} algorithm.
      + The second round of modifications addressed assertion satisfaction, taking Bilson's original \textsc{cfa-imm} algorithm and modifying it to use the deferred approach \textsc{cfa-def}.
      + Due to time constraints a deferred-cached assertion satisfaction algorithm for \CFACC{} could not be completed, but both preliminary results from this effort and the averaged prototype results from Section~\ref{proto-exp-sec} indicate that assertion satisfaction caching is not likely to be a fruitful optimization for \CFACC{}.
        The new environment data structures discussed in Section~\ref{proto-exp-sec} have not been successfully merged into \CFACC{} due to their dependencies on the garbage-collection framework in the prototype; I spent several months modifying \CFACC{} to use similar garbage collection, but due to \CFACC{} not being designed to use such memory management the performance of the modified compiler was non-viable.
        It is possible that the persistent union-find environment could be modified to use a reference-counted pointer internally without changing the entire memory-management framework of \CFACC{}, but such an attempt is left to future work.

      - As can be seen in Figures~\ref{cfa-time-fig} and~\ref{cfa-mem-fig}, the time and peak memory results for these five versions of \CFACC{} show that assertion resolution dominates total resolution cost, with the \textsc{cfa-def} and \textsc{cfa-dca} variants running consistently faster than the others on more expensive test cases; no difference can be seen between these two algorithms' performance, but which agrees with the prototype experiments in Section~\ref{proto-exp-sec}.
      - The results from \CFACC{} for \textsc{cfa-co} \textit{vs.}\ \textsc{cfa-bu} do not mirror those from the prototype; I conjecture this is mostly due to the different memory-management schemes and sorts of data required to run type unification and assertion satisfaction calculations, as \CFACC{} performance has proven to be particularly sensitive to the amount of heap allocation performed.
      - This data also shows a noticeable regression in compiler performance in the eleven months between \textsc{cfa-bu} and \textsc{cfa-imm}, which use the same resolution algorithms; this regression is not due to expression resolution, as no integration work happened in this time, but I am unable to ascertain its actual cause.
      + As can be seen in Figures~\ref{cfa-time-fig}--\ref{cfa-mem-fig}, the time and peak memory results for these five versions of \CFACC{} show that assertion resolution dominates total resolution cost, with the \textsc{cfa-def} variant running consistently faster than the others on more expensive test cases, and the speedup from the deferred approach increasing with the difficulty of the test case.
      + The results from \CFACC{} for \textsc{cfa-co} \vs{} \textsc{cfa-bu} do not mirror those from the prototype; I conjecture this is mostly due to the different memory-management schemes and sorts of data required to run type unification and assertion satisfaction calculations, as \CFACC{} performance has proven to be particularly sensitive to the amount of heap allocation performed.
      + This data also shows a noticeable regression in compiler performance in the eleven months between \textsc{cfa-bu} and \textsc{cfa-imm}, which use the same resolution algorithms; this approximate doubling in runtime is not due to expression resolution, as no integration work happened in this time, but I am unable to ascertain its actual cause.
      + To isolate the effects of the algorithmic changes from this unrelated performance regression, the speedup results in Figure~\ref{cfa-speedup-fig} are shown with respect to the start of each modification round, \textsc{cfa-bu} \vs{} \textsc{cfa-co} and \textsc{cfa-def} \vs{} \textsc{cfa-imm}.
        It should also be noted with regard to the peak memory results in Figure~\ref{cfa-mem-fig} that the peak memory usage does not always occur during the resolution phase of the compiler.

        …
        \input{cfa-time}
        \caption[\CFACC{} runtime against \textsc{cfa-co} baseline.]{\CFACC{} runtime against \textsc{cfa-co} baseline. Note log scales on both axes.} \label{cfa-time-fig}
      + \end{figure}
      +
      + \begin{figure}
      + \centering
      + \input{cfa-speedup}
      + \caption[\CFACC{} speedup.]{\CFACC{} speedup against \textsc{cfa-co} baseline runtime. To isolate the effect of algorithmic changes, \textsc{cfa-bu} speedup is \vs{} \textsc{cfa-co} while \textsc{cfa-def} speedup is \vs{} \textsc{cfa-imm}. The `inter-round' series shows slowdown between \textsc{cfa-bu} and \textsc{cfa-imm}.} \label{cfa-speedup-fig}
        \end{figure}

        …
        % some data you collected personally for imm vs. def vs. dca variants in resolv-proto/logs/rp-bu-tec_vs_cfacc.ods

      - % look back at Resolution Algorithms section for threads to tie up "does the algorithm look like this?"
      -
        \section{Conclusion}
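
    For reference, the ratios behind the new speedup figure can be written out explicitly from the caption and the cfa-plots.gp plot command. The LaTeX fragment below is an editor's sketch rather than part of the commit; the symbol $t_{\textsc{x}}$, the runtime of variant \textsc{x} on a given test case, and the example numbers are introduced here purely for illustration.

        % Sketch only -- not part of experiments.tex. Each speedup is measured
        % against the variant at the start of its modification round, and the
        % inter-round series exposes the unrelated regression between
        % cfa-bu and cfa-imm (values below 1 are slowdowns).
        \[
            S_{\textsc{cfa-bu}}  = \frac{t_{\textsc{cfa-co}}}{t_{\textsc{cfa-bu}}}, \qquad
            S_{\textsc{cfa-def}} = \frac{t_{\textsc{cfa-imm}}}{t_{\textsc{cfa-def}}}, \qquad
            S_{\textrm{inter-round}} = \frac{t_{\textsc{cfa-bu}}}{t_{\textsc{cfa-imm}}} .
        \]
        % Illustrative example: a test case resolving in 40s under cfa-imm and
        % 25s under cfa-def plots at S = 40/25 = 1.6 on the cfa-def series.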
  • doc/theses/aaron_moss_PhD/phd/macros.tex (r397848f5 → rf1240b0)

        \newcommand{\etc}{\textit{etc.}\@}
        \newcommand{\etal}{\textit{et~al.}\@}
      + \newcommand{\vs}{\textit{vs.}\@}

        \newcommand{\myset}[1]{\left\{#1\right\}}