Changeset 735b627 for doc/theses/fangren_yu_COOP_F20

Ignore:
Timestamp:
Feb 8, 2021, 10:30:51 PM (12 months ago)
Branches:
arm-eh, jacob/cs343-translation, master, new-ast-unique-expr
Children:
f4eb705
Parents:
7584279
Message:

update to new common macros, incorporate Thierry's fixes

File:
1 edited

Legend:

Unmodified
 r7584279 \CFA language, developed by the Programming Language Group at the University of Waterloo, has a long history, with the initial language design in 1992 by Glen Ditchfield~\cite{Ditchfield92} and the first proof-of-concept compiler built in 2003 by Richard Bilson~\cite{Bilson03}. Many new features have been added to the language over time, but the core of \CFA's type-system --- parametric functions introduced by the @forall@ clause (hence the name of the language) providing parametric overloading --- remains mostly unchanged. The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take a multiple minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work. The current \CFA reference compiler, @cfa-cc@, is designed using the visitor pattern~\cite{vistorpattern} over an abstract syntax tree (AST), where multiple passes over the AST modify it for subsequent passes. @cfa-cc@ still includes many parts taken directly from the original Bilson implementation, which served as the starting point for this enhancement work to the type system. Unfortunately, the prior implementation did not provide the efficiency required for the language to be practical: a \CFA source file of approximately 1000 lines of code can take multiple minutes to compile. The cause of the problem is that the old compiler used inefficient data structures and algorithms for expression resolution, which involved significant copying and redundant work. This report presents a series of optimizations to the performance-critical parts of the resolver, with a major rework of the compiler data-structures using a functional-programming approach to reduce memory complexity. The improvements were suggested by running the compiler builds with a performance profiler against the \CFA standard-library source-code and a test suite to find the most underperforming components in the compiler algorithm. \end{itemize} The resolver algorithm, designed for overload resolution, uses a significant amount of reused, and hence copying, for the intermediate representations, especially in the following two places: The resolver algorithm, designed for overload resolution, allows a significant amount of code reused, and hence copying, for the intermediate representations, especially in the following two places: \begin{itemize} \item forall( dtype T | sized( T ) ) T * malloc( void ) { return (T *)malloc( sizeof(T) ); } // call C malloc int * i = malloc();  // type deduced from left-hand size $\Rightarrow$ no size argument or return cast int * i = malloc();  // type deduced from left-hand size $$$\Rightarrow$$$ no size argument or return cast \end{cfa} An unbound return-type is problematic in resolver complexity because a single match of a function call with an unbound return type may create multiple candidates. In the worst case, consider a function declared that returns any @otype@ (defined \VPageref{otype}): \begin{cfa} void f( int ); double g$_1$( int ); int g$_2$( long ); double g$$$_1$$$( int ); int g$$$_2$$$( long ); f( g( 42 ) ); \end{cfa}