Changeset 673cd63 for doc/theses/aaron_moss_PhD/phd/generic-types.tex
- Timestamp:
- Apr 26, 2019, 4:59:48 PM (5 years ago)
- Branches:
- ADT, arm-eh, ast-experimental, cleanup-dtors, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
- Children:
- 5b11c25
- Parents:
- ffe2fad (diff), 1bc5975 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the(diff)
links above to see all the changes relative to each parent. - File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/aaron_moss_PhD/phd/generic-types.tex
rffe2fad r673cd63 27 27 int int_list_head( const struct int_list* ls ) { return ls->value; } 28 28 29 $\C[\textwidth]{// all code must be duplicated for every generic instantiation}$29 // all code must be duplicated for every generic instantiation 30 30 31 31 struct string_list { const char* value; struct string_list* next; }; … … 40 40 { return ls->value; } 41 41 42 $\C[\textwidth]{// use is efficient and idiomatic}$42 // use is efficient and idiomatic 43 43 44 44 int main() { … … 65 65 struct list { void* value; struct list* next; }; 66 66 67 $\C[\textwidth]{// internal memory management requires helper functions}$67 // internal memory management requires helper functions 68 68 69 69 void list_insert( struct list** ls, void* x, void* (*copy)(void*) ) { … … 75 75 void* list_head( const struct list* ls ) { return ls->value; } 76 76 77 $\C[\textwidth]{// helpers duplicated per type}$77 // helpers duplicated per type 78 78 79 79 void* int_copy(void* x) { … … 105 105 #include <stdio.h> $\C{// for printf}$ 106 106 107 $\C[\textwidth]{// code is nested in macros}$107 // code is nested in macros 108 108 109 109 #define list(N) N ## _list … … 127 127 define_list(string, const char*); $\C[3in]{// defines string\_list}$ 128 128 129 $\C[\textwidth]{// use is efficient, but syntactically idiosyncratic}$129 // use is efficient, but syntactically idiosyncratic 130 130 131 131 int main() { … … 143 143 \end{figure} 144 144 145 \CC{}, Java, and other languages use \emph{generic types} to produce type-safe abstract data types.146 Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and fromwhich this chapter is closely based.145 \CC{}, Java, and other languages use \emph{generic types} (or \emph{parameterized types}) to produce type-safe abstract data types. 146 Design and implementation of generic types for \CFA{} is the first major contribution of this thesis, a summary of which is published in \cite{Moss18} and on which this chapter is closely based. 147 147 \CFA{} generic types integrate efficiently and naturally with the existing polymorphic functions in \CFA{}, while retaining backward compatibility with C in layout and support for separate compilation. 148 148 A generic type can be declared in \CFA{} by placing a !forall! specifier on a !struct! or !union! declaration, and instantiated using a parenthesized list of types after the generic name. … … 156 156 forall(otype T) struct list { T value; list(T)* next; }; 157 157 158 $\C[\textwidth]{// single polymorphic implementation of each function}$159 $\C[\textwidth]{// overloading reduces need for namespace prefixes}$158 // single polymorphic implementation of each function 159 // overloading reduces need for namespace prefixes 160 160 161 161 forall(otype T) void insert( list(T)** ls, T x ) { … … 167 167 forall(otype T) T head( const list(T)* ls ) { return ls->value; } 168 168 169 $\C[\textwidth]{// use is clear and efficient}$169 // use is clear and efficient 170 170 171 171 int main() { … … 189 189 A key insight for this design is that C already possesses a handful of built-in generic types (\emph{derived types} in the language of the standard \cite[\S{}6.2.5]{C11}), notably pointer (!T*!) and array (!T[]!), and that user-definable generics should act similarly. 190 190 191 \subsection{Related Work} 191 \subsection{Related Work} \label{generic-related-sec} 192 192 193 193 One approach to the design of generic types is that taken by \CC{} templates \cite{C++}. … … 202 202 Java \cite{Java8} has another prominent implementation for generic types, introduced in Java~5 and based on a significantly different approach than \CC{}. 203 203 The Java approach has much more in common with the !void*!-polymorphism shown in Figure~\ref{void-generic-fig}; since in Java nearly all data is stored by reference, the Java approach to polymorphic data is to store pointers to arbitrary data and insert type-checked implicit casts at compile-time. 204 This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model and garbage collector.204 This process of \emph{type erasure} has the benefit of allowing a single instantiation of polymorphic code, but relies heavily on Java's object model. 205 205 To use this model, a more C-like language such as \CFA{} would be required to dynamically allocate internal storage for variables, track their lifetime, and properly clean them up afterward. 206 206 … … 277 277 A \emph{dtype-static} type has polymorphic parameters but is still concrete. 278 278 Polymorphic pointers are an example of dtype-static types; given some type variable !T!, !T! is a polymorphic type, but !T*! has a fixed size and can therefore be represented by a !void*! in code generation. 279 In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait because the compiler does not know their size and alignment) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment.279 In particular, generic types where all parameters are un-!sized! (\ie{} they do not conform to the built-in !sized! trait, which is satisfied by all types the compiler knows the size and alignment of) are always concrete, as there is no possibility for their layout to vary based on type parameters of unknown size and alignment. 280 280 More precisely, a type is concrete if and only if all of its !sized! type parameters are concrete, and a concrete type is dtype-static if any of its type parameters are (possibly recursively) polymorphic. 281 281 To illustrate, the following code using the !pair! type from above has each use of !pair! commented with its class: … … 333 333 \end{cfa} 334 334 335 Here, !_assign_T! is passed in as an implicit parameter from !otype T! and takes two !T*! (!void*! in the generated code ), a destination and a source, and !_retval! is the pointer to a caller-allocated buffer for the return value, the usual \CFA{} method to handle dynamically-sized return types.335 Here, !_assign_T! is passed in as an implicit parameter from !otype T! and takes two !T*! (!void*! in the generated code\footnote{A GCC extension allows arithmetic on \lstinline{void*}, calculated as if \lstinline{sizeof(void) == 1}.}), a destination and a source, and !_retval! is the pointer to a caller-allocated buffer for the return value, the usual \CFA{} method to handle dynamically-sized return types. 336 336 !_offsetof_pair! is the offset array passed into !value!; this array is statically generated at the call site as: 337 337 … … 350 350 Similarly, the layout function can only safely be called from a context where the generic type definition is visible, because otherwise the caller does not know how large to allocate the array of member offsets. 351 351 352 The C standard does not specify a memory layout for structs, but the POSIX ABI for x86 \cite{POSIX08} does; this memory layout is common for C implementations, but is a platform-specific issue for porting \CFA{}.352 The C standard does not specify a memory layout for structs, but the System V ABI \cite{SysVABI} does; compatibility with this standard is sufficient for \CFA{}'s currently-supported architectures, though future ports may require different layout-function generation algorithms. 353 353 This algorithm, sketched below in pseudo-\CFA{}, is a straightforward mapping of consecutive fields into the first properly-aligned offset in the !struct! layout; layout functions for !union! types omit the offset array and simply calculate the maximum size and alignment over all union variants. 354 354 Since \CFACC{} generates a distinct layout function for each type, constant-folding and loop unrolling are applied. … … 423 423 424 424 To validate the practicality of this generic type design, microbenchmark-based tests were conducted against a number of comparable code designs in C and \CC{}, first published in \cite{Moss18}. 425 Since allthese languages are all C-based and compiled with the same compiler backend, maximal-performance benchmarks should show little runtime variance, differing only in length and clarity of source code.425 Since these languages are all C-based and compiled with the same compiler backend, maximal-performance benchmarks should show little runtime variance, differing only in length and clarity of source code. 426 426 A more illustrative comparison measures the costs of idiomatic usage of each language's features. 427 The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other otherlanguages, code for which is included in Appendix~\ref{generic-bench-app}.428 The experiment uses element types !int! and !pair(short, char)! and pushes $N = 4 0M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.427 The code below shows the \CFA{} benchmark tests for a generic stack based on a singly-linked list; the test suite is equivalent for the other languages, code for which is included in Appendix~\ref{generic-bench-app}. 428 The experiment uses element types !int! and !pair(short, char)! and pushes $N = 4M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack. 429 429 430 430 \begin{cfa} … … 451 451 452 452 The four versions of the benchmark implemented are C with !void*!-based polymorphism, \CFA{} with parametric polymorphism, \CC{} with templates, and \CC{} using only class inheritance for polymorphism, denoted \CCV{}. 453 The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, mimicking a Java-like interface; in particular, runtime checks are necessary to safely downcast objects.453 The \CCV{} variant illustrates an alternative object-oriented idiom where all objects inherit from a base !object! class, a language design similar to Java 4; in particular, runtime checks are necessary to safely downcast objects. 454 454 The most notable difference among the implementations is the memory layout of generic types: \CFA{} and \CC{} inline the stack and pair elements into corresponding list and pair nodes, while C and \CCV{} lack such capability and, instead, must store generic objects via pointers to separately allocated objects. 455 455 Note that the C benchmark uses unchecked casts as C has no runtime mechanism to perform such checks, whereas \CFA{} and \CC{} provide type safety statically. … … 481 481 482 482 The C and \CCV{} variants are generally the slowest and have the largest memory footprint, due to their less-efficient memory layout and the pointer indirection necessary to implement generic types in those languages; this inefficiency is exacerbated by the second level of generic types in the pair benchmarks. 483 By contrast, the \CFA{} and \CC{} variants run in roughly equivalenttime for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead.483 By contrast, the \CFA{} and \CC{} variants run in noticeably less time for both the integer and pair because of the equivalent storage layout, with the inlined libraries (\ie{} no separate compilation) and greater maturity of the \CC{} compiler contributing to its lead. 484 484 \CCV{} is slower than C largely due to the cost of runtime type checking of downcasts (implemented with !dynamic_cast!); the outlier for \CFA{}, pop !pair!, results from the complexity of the generated-C polymorphic code. 485 485 The gcc compiler is unable to optimize some dead code and condense nested calls; a compiler designed for \CFA{} could more easily perform these optimizations. 486 Finally, the binary size for \CFA{} is larger because of static linking with \CFA{} libraries.486 Finally, the binary size for \CFA{} is larger because of static linking with the \CFA{} prelude library, which includes function definitions for all the built-in operators. 487 487 488 488 \CFA{} is also competitive in terms of source code size, measured as a proxy for programmer effort.
Note: See TracChangeset
for help on using the changeset viewer.