# Changeset 53f9448

Ignore:
Timestamp:
Apr 2, 2017, 2:54:55 PM (5 years ago)
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
ea73caf
Parents:
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg2:software/cfa/cfa-cc

Location:
doc
Files:
2 edited

### Legend:

Unmodified
 r0788c03d \begin{abstract} The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from commercial operating-systems to hobby projects. This installation base and the programmers producing it represent a massive software-engineering investment spanning decades and likely to continue for decades more. Nonetheless, C, first standardized over thirty years ago, lacks many features that make programming in more modern languages safer and more productive. The goal of the \CFA project is to create an extension of C that provides modern safety and productivity features while still ensuring strong backwards compatibility with C and its programmers. Prior projects have attempted similar goals~\cite{C++,Grossman06} but failed to honour C programming-style, e.g., adding object-oriented, functional programming with garbage collection is a non-starter for many C developers. Specifically, \CFA is designed to have an orthogonal feature-set based closely on the C programming paradigm, so that \CFA features can be added \emph{incrementally} to existing C code-bases, and C programmers can learn \CFA extensions on an as-needed basis, preserving investment in existing code and engineers. This paper describes only two \CFA extensions, generic and tuple types, and how they are implemented in accordance with these principles. The C programming language is a foundational technology for modern computing with millions of lines of code implementing everything from commercial operating-systems to hobby projects. This installation base and the programmers producing it represent a massive software-engineering investment spanning decades and likely to continue for decades more. Nonetheless, C, first standardized over thirty years ago, lacks many features that make programming in more modern languages safer and more productive. The goal of the \CFA project is to create an extension of C that provides modern safety and productivity features while still ensuring strong backwards compatibility with C and its programmers. Prior projects have attempted similar goals but failed to honour C programming-style; for instance, adding object-oriented or functional programming with garbage collection is a non-starter for many C developers. Specifically, \CFA is designed to have an orthogonal feature-set based closely on the C programming paradigm, so that \CFA features can be added \emph{incrementally} to existing C code-bases, and C programmers can learn \CFA extensions on an as-needed basis, preserving investment in existing code and engineers. This paper describes only two \CFA extensions, generic and tuple types, and how they are implemented in accordance with these principles. \end{abstract} \item Extensions introduced by \CFA must be translated in the most efficient way possible. \end{enumerate} The purpose of these goals is to ensure that existing C code-bases can be converted to \CFA incrementally and with minimal effort, and that programmers who know C can productively generate \CFA code without training beyond the features they wish to employ. In its current implementation, \CFA is compiled by translating it to the GCC-dialect of C, allowing it to leverage the portability and code optimizations provided by GCC, meeting goals (1)-(3). \CFA has been previously extended with polymorphic functions and name overloading (including operator overloading)~\citep{Bilson03}, and deterministically-executed constructors and destructors~\citep{Schluntz17}. This paper describes how generic and tuple types are designed and implemented in \CFA in accordance with both the backward compatibility goals and existing features described above. The purpose of these goals is to ensure that existing C code-bases can be converted to \CFA incrementally and with minimal effort, and that programmers who know C can productively generate \CFA code without training beyond the features they wish to employ. In its current implementation, \CFA is compiled by translating it to the GCC-dialect of C~\citep{GCCExtensions}, allowing it to leverage the portability and code optimizations provided by GCC, meeting goals (1)-(3). \CFA has been previously extended with polymorphic functions and name overloading (including operator overloading) by \citet{Bilson03}, and deterministically-executed constructors and destructors by \citet{Schluntz17}. This paper describes how generic and tuple types are designed and implemented in \CFA in accordance with both the backward compatibility goals and existing features described above. \label{sec:poly-fns} \CFA's polymorphism was originally formalized by~\citet{Ditchfield92}, and first implemented by~\citet{Bilson03}. The signature feature of \CFA is parametric-polymorphic functions; such functions are written using a @forall@ clause (which gives the language its name): \CFA's polymorphism was originally formalized by \citet{Ditchfield92}, and first implemented by \citet{Bilson03}. The signature feature of \CFA is parametric-polymorphic functions; such functions are written using a @forall@ clause (which gives the language its name): \begin{lstlisting} forall( otype T ) T identity( T val ) { return val; } int val = twice( twice( 3.7 ) ); \end{lstlisting} which works for any type @T@ with an addition operator defined. The translator accomplishes this polymorphism by creating a wrapper function for calling @+@ with @T@ bound to @double@, then providing this function to the first call of @twice@. It then has the option of using the same @twice@ again and converting the result to @int@ on assignment, or creating another @twice@ with type parameter @T@ bound to @int@ because \CFA uses the return type in its type analysis. The first approach has a late conversion from integer to floating-point on the final assignment, while the second has an eager conversion to integer. \CFA minimize the number of conversions and their potential to lose information, so it selects the first approach. which works for any type @T@ with an addition operator defined. The translator accomplishes this polymorphism by creating a wrapper function for calling @+@ with @T@ bound to @double@, then providing this function to the first call of @twice@. It then has the option of using the same @twice@ again and converting the result to @int@ on assignment, or creating another @twice@ with type parameter @T@ bound to @int@ because \CFA uses the return type in its type analysis. The first approach has a late conversion from integer to floating-point on the final assignment, while the second has an eager conversion to integer. \CFA minimizes the number of conversions and their potential to lose information, so it selects the first approach. Monomorphic specializations of polymorphic functions can satisfy polymorphic type-assertions. \end{lstlisting} @qsort@ and @bsearch@ can only be called with arguments for which there exists a function named @<@ taking two arguments of the same type and returning an @int@ value. Here, the build-in monomorphic specialization of @<@ for type @double@ is passed as an additional implicit parameter to the calls of @qsort@ and @bsearch@. Here, the built-in monomorphic specialization of @<@ for type @double@ is passed as an additional implicit parameter to the calls of @qsort@ and @bsearch@. Crucial to the design of a new programming language are the libraries to access thousands of external features. \CC's type-system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a templated @bsearch@. Call-site inferencing and nested functions provide a localized form of inhertance. For example, @qsort@ only sorts in asending order using @<@. However, it is trivial to locally change this behaviour: Call-site inferencing and nested functions provide a localized form of inheritance. For example, @qsort@ only sorts in ascending order using @<@. However, it is trivial to locally change this behaviour: \begin{lstlisting} { % \end{lstlisting} \begin{lstlisting} trait sumable( otype T ) { const T 0;                                                              $\C{// variable zero}$ trait summable( otype T ) { void ?{}(T*, zero_t);                                   $\C{// constructor from 0 literal}$ T ?+?( T, T );                                                  $\C{// assortment of additions}$ T ?+=?( T *, T ); T ?++( T * ); }; forall( otype T | sumable( T ) ) forall( otype T | summable( T ) ) T sum( T a[$\,$], size_t dimension ) { T total = 0;                                                    $\C{// instantiate T, select 0}$ T total = { 0 };                                                        $\C{// instantiate T from 0}$ for ( unsigned int i = 0; i < dimension; i += 1 ) total += a[i];                                          $\C{// select appropriate +}$ }; \end{lstlisting} Given the information provided for an @otype@, variables of polymorphic type can be treated as if they were a complete struct type -- they can be stack-allocated using the @alloca@ compiler builtin, default or copy-initialized, assigned, and deleted. As an example, the @sum@ function produces generated code something like the following (simplified for clarity and brevity): Given the information provided for an @otype@, variables of polymorphic type can be treated as if they were a complete struct type -- they can be stack-allocated using the @alloca@ compiler builtin, default or copy-initialized, assigned, and deleted. As an example, the @sum@ function produces generated code something like the following (simplified for clarity and brevity) \TODO{} fix example, maybe elide, it's likely too long with the more complicated function: \begin{lstlisting} void abs( size_t _sizeof_M, size_t _alignof_M, If assertion arguments must match exactly, then the call to @g@ cannot be resolved, since the expected type of @f@ is flat, while the only @f@ in scope requires a tuple type. Since tuples are fluid, this requirement reduces the usability of tuples in polymorphic code. To ease this pain point, function parameter and return lists are flattened for the purposes of type unification, which allows the previous example to pass expression resolution. This relaxation is made possible by extending the existing thunk generation scheme, as described by~\citet{Bilson03}. Now, whenever a candidate's parameter structure does not exactly match the formal parameter's structure, a thunk is generated to specialize calls to the actual function: This relaxation is made possible by extending the existing thunk generation scheme, as described by \citet{Bilson03}. Now, whenever a candidate's parameter structure does not exactly match the formal parameter's structure, a thunk is generated to specialize calls to the actual function: \begin{lstlisting} int _thunk(int _p0, double _p1, double _p2) {