Changeset b39e3dae

Ignore:
Timestamp:
Apr 11, 2017, 7:43:00 AM (4 years ago)
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
5a48d79, a0fc78a
Parents:
b715c9a
Message:

Location:
doc
Files:
2 edited

Legend:

Unmodified
 rb715c9a (3) \CFA code must be at least as portable as standard C code; (4) Extensions introduced by \CFA must be translated in the most efficient way possible. These goals ensure existing C code-bases can be converted to \CFA incrementally with minimal effort, and C programmers can productively generate \CFA code without training beyond the features being used. In its current implementation, \CFA is compiled by translating it to the GCC-dialect of C~\citep{GCCExtensions}, allowing it to leverage the portability and code optimizations provided by GCC, meeting goals (1)-(3). Ultimately, a compiler is necessary for advanced features and optimal performance. These goals ensure existing C code-bases can be converted to \CFA incrementally with minimal effort, and C programmers can productively generate \CFA code without training beyond the features being used. We claim \CC is diverging from C, and hence, incremental additions of language features require significant effort and training, while suffering from historically poor design choices. \CFA is currently implemented as a source-to-source translator from \CFA to the GCC-dialect of C~\citep{GCCExtensions}, allowing it to leverage the portability and code optimizations provided by GCC, meeting goals (1)-(3). Ultimately, a compiler is necessary for advanced features and optimal performance. This paper identifies shortcomings in existing approaches to generic and variadic data types in C-like languages and presents a design for generic and variadic types avoiding those shortcomings. Specifically, the solution is both reusable and type-checked, as well as conforming to the design goals of \CFA with ergonomic use of existing C abstractions. The new constructs are empirically compared with both standard C and \CC; the results show the new design is comparable in performance. The @identity@ function above can be applied to any complete \emph{object type} (or @otype@). The type variable @T@ is transformed into a set of additional implicit parameters encoding sufficient information about @T@ to create and return a variable of that type. The \CFA implementation passes the size and alignment of the type represented by an @otype@ parameter, as well as an assignment operator, constructor, copy constructor and destructor. If this extra information is not needed, \eg for a pointer, the type parameter can be declared as a \emph{data type} (or @dtype@). Here, the runtime cost of polymorphism is spread over each polymorphic call, due to passing more arguments to polymorphic functions; preliminary experiments have shown this overhead is similar to \CC virtual function calls. An advantage of this design is that, unlike \CC template functions, \CFA polymorphic functions are compatible with C \emph{separate} compilation, preventing code bloat. In \CFA, the polymorphism runtime-cost is spread over each polymorphic call, due to passing more arguments to polymorphic functions; preliminary experiments show this overhead is similar to \CC virtual-function calls. An advantage of this design is that, unlike \CC template-functions, \CFA polymorphic-functions are compatible with C \emph{separate compilation}, preventing compilation and code bloat. Since bare polymorphic-types provide only a narrow set of available operations, \CFA provides a \emph{type assertion} mechanism to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type-variable. For example, the function @twice@ can be defined using the \CFA syntax for operator overloading: int val = twice( twice( 3.7 ) ); \end{lstlisting} which works for any type @T@ with a matching addition operator. The polymorphism is achieved by creating a wrapper function for calling @+@ with @T@ bound to @double@, then passing this function to the first call of @twice@. There is now the option of using the same @twice@ and converting the result to @int@ on assignment, or creating another @twice@ with type parameter @T@ bound to @int@ because \CFA uses the return type~\cite{Ada} in its type analysis. The first approach has a late conversion from @int@ to @double@ on the final assignment, while the second has an eager conversion to @int@. \CFA minimizes the number of conversions and their potential to lose information, so it selects the first approach, which corresponds with C-programmer intuition. which works for any type @T@ with a matching addition operator. The polymorphism is achieved by creating a wrapper function for calling @+@ with @T@ bound to @double@, then passing this function to the first call of @twice@. There is now the option of using the same @twice@ and converting the result to @int@ on assignment, or creating another @twice@ with type parameter @T@ bound to @int@ because \CFA uses the return type (as in~\cite{Ada}) in its type analysis. The first approach has a late conversion from @int@ to @double@ on the final assignment, while the second has an eager conversion to @int@. \CFA minimizes the number of conversions and their potential to lose information, so it selects the first approach, which corresponds with C-programmer intuition. Crucial to the design of a new programming language are the libraries to access thousands of external software features. \CFA inherits a massive compatible library-base, where other programming languages must rewrite or provide fragile inter-language communication with C. Like \CC, \CFA inherits a massive compatible library-base, where other programming languages must rewrite or provide fragile inter-language communication with C. A simple example is leveraging the existing type-unsafe (@void *@) C @bsearch@ to binary search a sorted floating-point array: \begin{lstlisting} int posn = bsearch( 5.0, vals, 10 ); \end{lstlisting} The nested routine @comp@ provides the hidden interface from typed \CFA to untyped (@void *@) C, plus the cast of the result. The nested routine @comp@ (impossible in \CC as lambdas do not use C calling conventions) provides the hidden interface from typed \CFA to untyped (@void *@) C, plus the cast of the result. As well, an alternate kind of return is made available: position versus pointer to found element. \CC's type-system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a templated @bsearch@. forall( otype T | summable( T ) ) T sum( T a[$\,$], size_t size ) {  // use trait T total = { 0 };                                    $\C{// instantiate T from 0 by calling its constructor}$ for ( unsigned int i = 0; i < size; i += 1 ) total += a[i];                                        $\C{// select appropriate +}$ for ( unsigned int i = 0; i < size; i += 1 ) total += a[i]; $\C{// select appropriate +}$ return total; } \end{lstlisting} A trait name plays no part in type equivalence; it is solely a macro for a list of assertions. Traits may overlap assertions without conflict, and therefore, do not form a hierarchy. In fact, the set of operators is incomplete, \eg no assignment, but @otype@ is syntactic sugar for the following implicit trait: