# Changeset a332d432 for doc

Ignore:
Timestamp:
Oct 4, 2018, 3:52:45 PM (4 years ago)
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, no_list, persistent-indexer, pthread-emulation, qualifiedEnum
Children:
a72b240
Parents:
a32346b
Message:

Added to related work, implementation, and future work for generic types chapter

Location:
doc/theses/aaron_moss_PhD/phd
Files:
2 edited

### Legend:

Unmodified
 ra32346b \subsection{Related Work} One approach to the design of generic types is that taken by \CC{} templates\cit{}. One approach to the design of generic types is that taken by \CC{} templates\cite{C++}. The template approach is closely related to the macro-expansion approach to C polymorphism demonstrated in Figure~\ref{macro-generic-fig}, but where the macro-expansion syntax has been given first-class language support. Template expansion has the benefit of generating code with near-optimal runtime efficiency, as distinct optimizations can be applied for each instantiation of the template. The most significant restriction of the \CC{} template model is that it breaks separate compilation and C's translation-unit-based encapsulation mechanisms. Because a \CC{} template is not actually code, but rather a sort of recipe'' to generate code, template code must be visible at its call site to be used. Furthermore, \CC{} template code cannot be type-checked without instantiating it, a time consuming process with no hope of improvement until \CC{} concepts\cite{C++Concepts} are standardized in \CCtwenty{}. C code, by contrast, only needs a !struct! or function declaration to call that function or use (by-pointer) values of that type, a desirable property to maintain for \CFA{}. Java\cit{} has another prominent implementation for generic types, based on a significantly different approach than \CC{}. Java\cite{Java8} has another prominent implementation for generic types, introduced in Java~5 and based on a significantly different approach than \CC{}. The Java approach has much more in common with the !void*!-polymorphism shown in Figure~\ref{void-generic-fig}; since in Java nearly all data is stored by reference, the Java approach to polymorphic data is to store pointers to arbitrary data and insert type-checked implicit casts at compile-time. This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model and garbage collector. To use this model, a more C-like language such as \CFA{} would be required to dynamically allocate internal storage for variables, track their lifetime, and properly clean them up afterward. \TODO{Talk about Go, maybe Rust, Swift, etc. as well; specifically mention fat pointer'' polymorphism} \TODO{Talk about Cyclone as well, and why my generics are more powerful} Cyclone\cite{Grossman06} is another language extending C, and also provides capabilities for polymorphic functions and existential types, similar to \CFA{}'s !forall! functions and generic types. Cyclone existential types can include function pointers in a construct similar to a virtual function table, but these pointers must be explicitly initialized at some point in the code, which is tedious and error-prone compared to \CFA{}'s implicit assertion satisfaction. Furthermore, Cyclone's polymorphic functions and types are restricted to abstraction over types with the same layout and calling convention as !void*!, \ie{} only pointer types and !int!. In the \CFA{} terminology discussed in Section~\ref{generic-impl-sec}, all Cyclone polymorphism must be dtype-static. While the Cyclone polymorphism design provides the efficiency benefits discussed in Section~\ref{dtype-static-sec} for dtype-static polymorphism, it is more restrictive than the more general model of \CFA{}. Many other languages include some form of generic types. As a brief survey, ML\cite{ML} was the first language to support parameteric polymorphism, but unlike \CFA{} does not support the use of assertions and traits to constrain type arguments. Haskell\cite{Haskell10} combines ML-style polymorphism with the notion of type classes, similar to \CFA{} traits, but requiring an explicit association with their implementing types, unlike \CFA{}. Objective-C\cite{obj-c-book} is an extension to C which has had some industrial success; however, it did not support type-checked generics until recently\cite{xcode7}, and it's garbage-collected, message-passing object-oriented model is a radical departure from C. Go\cite{Go}, and Rust\cite{Rust} are modern compiled languages with abstraction features similar to \CFA{} traits, \emph{interfaces} in Go and \emph{traits} in Rust. Go has implicit interface implementation and uses a fat pointer'' construct to pass polymorphic objects to functions, similar in principle to \CFA{}'s implicit forall paramters. Go does not, however, allow user code to define generic types, restricting Go programmers to the small set of generic types defined by the compiler. Rust has powerful abstractions for generic programming, including explicit implemenation of traits and options for both separately-compiled virtual dispatch and template-instantiated static dispatch in functions. On the other hand, the safety guarantees of Rust's \emph{lifetime} abstraction and borrow checker impose a distinctly idiosyncratic programming style and steep learning curve; \CFA{}, with its more modest safety features, allows direct ports of C code while maintaining the idiomatic style of the original source. \subsection{\CFA{} Generics} In this example, !with_len! is defined at the same scope as !pair!, but it could be called from any context that can see the definition of !pair! and a declaration of !with_len!. If its return type was !pair(const char*, int)*!, callers of !with_len! would only need the declaration !forall(otype R, otype S) struct pair;! to call it, in accordance with the usual C rules for opaque types. If its return type was !pair(const char*, int)*!, callers of !with_len! would only need the declaration !forall(otype R, otype S) struct pair! to call it, in accordance with the usual C rules for opaque types. !with_len! is itself a monomorphic function, returning a type that is structurally identical to !struct { const char* first; int second; }!, and as such could be called from C given an appropriate redeclaration and demangling flags. \end{cfa} \section{Implementation} % forall constraints on struct/union constrain default constructor (TODO check with Rob) % TODO discuss layout function algorithm, application to separate compilation % TODO put a static const field in for _n_fields for each generic, describe utility for separate compilation (actually no, you need to be able to see the type for it to be sized) % mention that tuples are implemented on top of a per-arity generic type \section{Performance Experiments} % TODO pull benchmarks from Moss et al. \CFA{} generic types also support the type constraints from !forall! functions. For example, the following declaration of a sorted set type ensures that the set key implements equality and relational comparison: \begin{cfa} forall(otype Key | { int ?==?(Key, Key); int ?first, b->first ); return c ? c : cmp( a->second, b->second ); } \end{cfa} Since !pair(T*, T*)! is a concrete type, there are no implicit parameters passed to !lexcmp!; hence, the generated code is identical to a function written in standard C using !void*!, yet the \CFA{} version is type-checked to ensure members of both pairs and arguments to the comparison function match in type. Another useful pattern enabled by reused dtype-static type instantiations is zero-cost \emph{tag structures}. Sometimes, information is only used for type checking and can be omitted at runtime. In the example below, !scalar! is a dtype-static type; hence, all uses have a single structure definition containing !unsigned long! and can share the same implementations of common functions like !?+?!. These implementations may even be separately compiled, unlike \CC{} template functions. However, the \CFA{} type checker ensures matching types are used by all calls to !?+?!, preventing nonsensical computations like adding a length to a volume. \begin{cfa} forall(dtype Unit) struct scalar { unsigned long value; }; struct metres {}; struct litres {}; forall(dtype U) scalar(U) ?+?(scalar(U) a, scalar(U) b) { return (scalar(U)){ a.value + b.value }; } scalar(metres) half_marathon = { 21098 }; scalar(litres) pool = { 2500000 }; scalar(metres) marathon = half_marathon + half_marathon; marathon + pool; $\C[4in]{// compiler ERROR, mismatched types}$ \end{cfa} \section{Performance Experiments} \label{generic-performance-sec} \TODO{pull benchmarks from Moss et al.} \section{Future Work} % mention future work adding non-type generic parameters, like ints % taking advantage of generic layout functions to provide field assertions in forall qualifiers % mention packed generic layouts (significantly more complex layout function, but possible) The generic types design presented here is already sufficiently expressive to implement a variety of useful library types. However, some other features based on this design could further improve \CFA{}. The most pressing addition is the ability to have non-type generic parameters. C already supports fixed-length array types, \eg{} !int[10]!; these types are essentially generic types with unsigned integer parameters, and allowing \CFA{} users the capability to build similar types is a requested feature. More exotically, the ability to have these non-type parameters depend on dynamic runtime values rather than static compile-time constants opens up interesting opportunities for type-checking problematic code patterns. For example, if a collection iterator was parameterized over the pointer to the collection it was drawn from, then a sufficiently powerful static analysis pass could ensure that that iterator was only used for that collection, eliminating one source of hard-to-find bugs. The implementation mechanisms behind this generic types design can also be used to add new features to \CFA{}. One such potential feature would be to add \emph{field assertions} to the existing function and variable assertions on polymorphic type variables. Implementation of these field assertions would be based on the same code that supports member access by dynamic offset calculation for dynamic generic types. Simulating field access can already be done more flexibly in \CFA{} by declaring a trait containing an accessor function to be called from polymorphic code, but these accessor functions impose some overhead both to write and call, and directly providing field access via an implicit offset parameter would be both more concise and more efficient. Of course, there are language design trade-offs to such an approach, notably that providing the two similar features of field and function assertions would impose a burden of choice on programmers writing traits, with field assertions more efficient, but function assertions more general; given this open design question we have deferred a decision on field assertions until we have more experience using \CFA{}. If field assertions are included in the language, a natural extension would be to provide a structural inheritance mechanism for every !struct! type that simply turns the list of !struct! fields into a list of field assertions, allowing monomorphic functions over that type to be generalized to polymorphic functions over other similar types with added or reordered fields. \CFA{} could also support a packed or otherwise size-optimized representation for generic types based on a similar mechanism --- the layout function would need to be re-written, but nothing in the use of the offset arrays implies that the field offsets need be monotonically increasing. With respect to the broader \CFA{} polymorphism design, the experimental results in Section~\ref{generic-performance-sec} demonstrate that though the runtime impact of \CFA{}'s dynamic virtual dispatch is low, it is not as low as the static dispatch of \CC{} template inlining. However, rather than subject all \CFA{} users to the compile-time costs of ubiquitous template expansion, we are considering more targeted mechanisms for performance-sensitive code. Two promising approaches are are an !inline! annotation at polymorphic function call sites to create a template specialization of the function (provided the code is visible) or placing a different !inline! annotation on polymorphic function definitions to instantiate a specialized version of the function for some set of types. These approaches are not mutually exclusive and allow performance optimizations to be applied only when necessary, without suffering global code bloat. In general, the \CFA{} team believes that separate compilation works well with loaded hardware caches by producing smaller code, which may offset the benefit of larger inlined code.