| 1 | \chapter{Related Work}
|
|---|
| 2 |
|
|---|
| 3 | \section{Polymorphism}
|
|---|
| 4 |
|
|---|
| 5 | ML~\cite{ML} was the first language to support parametric polymorphism.
|
|---|
| 6 | Like \CFA, it supports universal type parameters, but not the use of assertions and traits to constrain type arguments.
|
|---|
| 7 | Haskell~\cite{Haskell10} combines ML-style polymorphism, polymorphic data types, and type inference with the notion of type classes, collections of overloadable methods that correspond in intent to traits in \CFA.
|
|---|
| 8 | Unlike \CFA, Haskell requires an explicit association between types and their classes that specifies the implementation of operations.
|
|---|
| 9 | These associations determine the functions that are assertion arguments for particular combinations of class and type, in contrast to \CFA where the assertion arguments are selected at function call sites based upon the set of operations in scope at that point.
|
|---|
| 10 | Haskell also severely restricts the use of overloading: an overloaded name can only be associated with a single class, and methods with overloaded names can only be defined as part of instance declarations.
|
|---|
| 11 |
|
|---|
| 12 | \CC provides three disjoint polymorphic extensions to C: overloading, inheritance, and templates.
|
|---|
| 13 | The overloading is restricted because resolution does not use the return type, inheritance requires learning object-oriented programming and coping with a restricted nominal-inheritance hierarchy, templates cannot be separately compiled resulting in compilation/code bloat and poor error messages, and determining how these mechanisms interact and which to use is confusing.
|
|---|
| 14 | In contrast, \CFA has a single facility for polymorphic code supporting type-safe separate compilation of polymorphic functions and generic (opaque) types, which uniformly leverage the C procedural paradigm.
|
|---|
| 15 | The key mechanism to support separate compilation is \CFA's \emph{explicit} use of assumed type properties.
|
|---|
| 16 | Until \CC concepts~\cite{C++Concepts} are standardized (anticipated for \CCtwenty), \CC provides no way of specifying the requirements of a generic function beyond compilation errors during template expansion;
|
|---|
| 17 | furthermore, \CC concepts are restricted to template polymorphism.
|
|---|
| 18 |
|
|---|
| 19 | Cyclone~\cite{Grossman06} also provides capabilities for polymorphic functions and existential types, similar to \CFA's @forall@ functions and generic types.
|
|---|
| 20 | Cyclone existential types can include function pointers in a construct similar to a virtual function table, but these pointers must be explicitly initialized at some point in the code, which is a tedious and potentially error-prone process.
|
|---|
| 21 | Furthermore, Cyclone's polymorphic functions and types are restricted to abstraction over types with the same layout and calling convention as @void *@, \ie only pointer types and @int@.
|
|---|
| 22 | In \CFA terms, all Cyclone polymorphism must be dtype-static.
|
|---|
| 23 | While the Cyclone design provides the efficiency benefits discussed in~\VRef{s:GenericImplementation} for dtype-static polymorphism, it is more restrictive than \CFA's general model.
|
|---|
| 24 | Smith and Volpano~\cite{Smith98} present Polymorphic C, an ML dialect with polymorphic functions, C-like syntax, and pointer types;
|
|---|
| 25 | it lacks many of C's features, most notably structure types, and hence, is not a practical C replacement.
|
|---|
| 26 |
|
|---|
| 27 | Objective-C~\cite{obj-c-book} is an industrially successful extension to C.
|
|---|
| 28 | However, Objective-C is a radical departure from C, using an object-oriented model with message passing.
|
|---|
| 29 | Objective-C did not support type-checked generics until recently \cite{xcode7}, historically using less-efficient runtime checking of object types.
|
|---|
| 30 | The GObject~\cite{GObject} framework also adds object-oriented programming with runtime type-checking and reference-counting garbage collection to C;
|
|---|
| 31 | these features are more intrusive additions than those provided by \CFA, in addition to the runtime overhead of reference counting.
|
|---|
| 32 | Vala~\cite{Vala} compiles to GObject-based C, adding the burden of learning a separate language syntax to the aforementioned demerits of GObject as a modernization path for existing C code bases.
|
|---|
| 33 | Java~\cite{Java8} included generic types in Java~5, which are type checked at compilation and type erased at runtime, similar to \CFA's.
|
|---|
| 34 | However, in Java, each object carries its own table of method pointers, whereas \CFA passes the method pointers separately to maintain a C-compatible layout.
|
|---|
| 35 | Java is also a garbage-collected, object-oriented language, with the associated resource usage and C-interoperability burdens.
|
|---|
| 36 |
|
|---|
| 37 | D~\cite{D}, Go, and Rust~\cite{Rust} are modern compiled languages with abstraction features similar to \CFA traits, \emph{interfaces} in D and Go, and \emph{traits} in Rust.
|
|---|
| 38 | However, each language represents a significant departure from C in terms of language model, and none has the same level of compatibility with C as \CFA.
|
|---|
| 39 | D and Go are garbage-collected languages, imposing the associated runtime overhead.
|
|---|
| 40 | The necessity of accounting for data transfer between managed runtimes and the unmanaged C runtime complicates foreign-function interfaces to C.
|
|---|
| 41 | Furthermore, while generic types and functions are available in Go, they are limited to a small fixed set provided by the compiler, with no language facility to define more.
|
|---|
| 42 | D restricts garbage collection to its own heap by default, whereas Rust is not garbage collected and, thus, has a lighter-weight runtime more interoperable with C.
|
|---|
| 43 | Rust also possesses much more powerful abstraction capabilities for writing generic code than Go.
|
|---|
| 44 | On the other hand, Rust's borrow checker provides strong safety guarantees but is complex and difficult to learn and imposes a distinctly idiomatic programming style.
|
|---|
| 45 | \CFA, with its more modest safety features, allows direct ports of C code while maintaining the idiomatic style of the original source.
|
|---|
| 46 |
|
|---|
| 47 |
|
|---|
| 48 | \section{Tuples/variadics}
|
|---|
| 49 |
|
|---|
| 50 | \vspace*{-5pt}
|
|---|
| 51 | Many programming languages have some form of tuple construct and/or variadic functions, \eg SETL, C, KW-C, \CC, D, Go, Java, ML, and Scala.
|
|---|
| 52 | SETL~\cite{SETL} is a high-level mathematical programming language, with tuples being one of the primary data types.
|
|---|
| 53 | Tuples in SETL allow subscripting, dynamic expansion, and multiple assignment.
|
|---|
| 54 | C provides variadic functions through @va_list@ objects, but the programmer is responsible for managing the number of arguments and their types;
|
|---|
| 55 | thus, the mechanism is type unsafe.
|
|---|
| 56 | KW-C~\cite{Buhr94a}, a predecessor of \CFA, introduced tuples to C as an extension of the C syntax, taking much of its inspiration from SETL.
|
|---|
| 57 | The main contributions of that work were adding MRVF, tuple mass and multiple assignment, and record-member access.
|
|---|
| 58 | \CCeleven introduced @std::tuple@ as a library variadic-template structure.
|
|---|
| 59 | Tuples are a generalization of @std::pair@, in that they allow for arbitrary length, fixed-size aggregation of heterogeneous values.
|
|---|
| 60 | Operations include @std::get<N>@ to extract values, @std::tie@ to create a tuple of references used for assignment, and lexicographic comparisons.
|
|---|
| 61 | \CCseventeen proposes \emph{structured bindings}~\cite{Sutter15} to eliminate predeclaring variables and the use of @std::tie@ for binding the results.
|
|---|
| 62 | This extension requires the use of @auto@ to infer the types of the new variables; hence, complicated expressions with a nonobvious type must be documented with some other mechanism.
|
|---|
| 63 | Furthermore, structured bindings are not a full replacement for @std::tie@, as it always declares new variables.
|
|---|
| 64 | Like \CC, D provides tuples through a library variadic-template structure.
|
|---|
| 65 | Go does not have tuples but supports MRVF.
|
|---|
| 66 | Java's variadic functions appear similar to C's but are type safe using homogeneous arrays, which are less useful than \CFA's heterogeneously typed variadic functions.
|
|---|
| 67 | Tuples are a fundamental abstraction in most functional programming languages, such as Standard ML~\cite{sml}, Haskell, and Scala~\cite{Scala}, which decompose tuples using pattern matching.
|
|---|
| 68 |
|
|---|
| 69 |
|
|---|
| 70 | \section{Type Resolution}
|
|---|
| 71 |
|
|---|
| 72 | \CFA expression resolution must deal with extensive overloading and inference of polymorphic types with assertions.
|
|---|
| 73 | The goal is to keep the base algorithm used in type unification simple enough so resolving a complicated expression can still be done reasonably fast.
|
|---|
| 74 | The following is work that handles the aforementioned bidirectional subtyping relations concisely.
|
|---|
| 75 |
|
|---|
| 76 | Melo~\etal~\cite{Melo17} developed PsycheC, a tool built for inferencing missing type and variable declarations of incomplete C programs, which can also be viewed as a dialect of C with type inferencing.
|
|---|
| 77 | As PsycheC is built for analyzing standard C programs, it does not have any kind of overloading or polymorphism.
|
|---|
| 78 | Instead, all top-level variables and function parameters may have indeterminate types.
|
|---|
| 79 | The scope of the problem PsycheC deals with is quite the opposite of that in \CFA, but the techniques used to solve for the unknown types are similar.
|
|---|
| 80 |
|
|---|
| 81 | In PsycheC, a unification variable is assigned to every identifier with unknown type, and each use of a variable generates a constraint on its type.
|
|---|
| 82 | Just like in \CFA, the most common kinds of expression contexts PsycheC deals with are assignments and function calls, which creates inequality constraints: the right hand side of the assignment is a subtype of the left hand side;
|
|---|
| 83 | function argument passing is treated the same as assignment.
|
|---|
| 84 | Since both sides of the inequality can have unknown types, type resolution needs to handle both upper bound and lower bound constraints.
|
|---|
| 85 | Instead of directly solving with lower and upper bounds, the authors developed a two-phase unification algorithm where the equivalences are first resolved and substituted into the inequalities, and showed that by ordering the inequalities using the partial order of concrete types involved, a second pass of unification treating inequalities as equivalences is sufficient to resolve all type variables.
|
|---|
| 86 | However, PsycheC does not consider the hierarchy of arithmetic types, as they are all mutually convertible in C, and therefore, considered \emph{equivalent} in unification.
|
|---|
| 87 | It also lacks structured types except simple pointers, meaning their strategy of unifying subtypes as equivalences does not work in a more general setting such as in \CFA.
|
|---|
| 88 |
|
|---|
| 89 | Pierce and Turner~\cite{Pierce00} presented a \emph{local} type-inference method that restricts the propagation of types and use of unification variables within adjacent syntax-tree nodes.
|
|---|
| 90 | The technique provides sufficient type checking and inference for the ML family of languages.
|
|---|
| 91 | Odersky, Zenger, and Zenger~\cite{Odersky01} refined this method by introducing a coloured type-system that distinguishes inherited types (given directly by context) and synthesized types (deducted from the term itself), to handle bidirectional type propagation in a more concise manner.
|
|---|
| 92 | Their algorithm is designed for functional programming, where a first-class (untyped) lambda expression is common and type inference must also deduce the argument types of lambdas requiring bidirectional inference.
|
|---|
| 93 | Given that \CFA does not have lambda expressions, but the rest of the type inference cases are quite similar, implementing this type-inference algorithm for \CFA expression resolution might be promising.
|
|---|
| 94 | Moreover, the current \CFA implementation also keeps unification variables \emph{almost} local, except when a type variable does not appear in the parameter list.
|
|---|
| 95 | Hence, it cannot be given a value from the argument types;
|
|---|
| 96 | rather, it is passed upwards as a free variable for matching.
|
|---|
| 97 | In all other cases, the substituted minimal types are passed upwards as the result types.
|
|---|
| 98 | This process means the current implementation of \CFA expression resolution semantically agrees with Pierce's type-inference system in most of the cases.
|
|---|
| 99 | A question remains whether these methods can be adapted to handle \CFA assertion resolution.
|
|---|