source: doc/theses/fangren_yu_MMath/background.tex@ b28ce93

Last change on this file since b28ce93 was 206f4cf, checked in by Fangren Yu <f37yu@…>, 4 months ago

proofreading fixes

  • Property mode set to 100644
File size: 12.3 KB
Line 
1\chapter{Related Work}
2
3\section{Polymorphism}
4
5ML~\cite{ML} was the first language to support parametric polymorphism.
6Like \CFA, it supports unconstrained, universal, type parameters.
7\CFA differs by adding assertions and traits to constrain type arguments.
8Haskell~\cite{Haskell10} combines ML-style polymorphism, polymorphic data types, and type inference with the notion of type classes, collections of overloadable methods.
9The class/type-class association constrain type arguments by indicating the set of functions that become implicit assertion arguments and specify the implementation of these operations.
10As pointed out \see{\VRef[Figure]{f:ImplicitExplicitTraitInferencing}}, Haskell requires an explicit association between types and constraints (type-class).
11Otherwise, Haskell does not provide general overloading.
12\CFA differs by allowing general overloading and constraining type arguments with traits.
13Most importantly, \CFA automates selection of the assertion arguments at the function call-sites based on the set of operations in scope at that point to generalize reuse.
14
15\CC provides three disjoint polymorphic extensions to C: overloading, inheritance, and templates.
16Selecting among them and understanding how they interact is part of the challenge in \CC program development.
17General overloading is available, subtyping inheritance can be single or multiple, templates are typed macro-expansion over a universal type, which precludes separate compilation.
18Universal template types are constrained using @concept@s~\cite{C++Concepts}, but only as a guideline, as the template expansion can still discover additional constraints.
19Template expansion can result in code bloat and poor error messages.
20Type inferencing is available using \lstinline[language=C++]{auto}, precluding using the return type for overload selection.
21\CFA differs by providing a simplified, uniform facility for polymorphic code, eliminating subtyping polymorphism, and encompassing overloading among parametric functions and universal (generic) types, all of which are separately compilable.
22Overload resolution uses the return type and arithmetic conversions to make precise function selections versus generating ambiguities, at the cost of type inferencing.
23Both \CFA call-site inferencing and \CC template expansion search in the local environment to satisfy explicit assertions or to find named functions to complete the template, respectively.
24
25Cyclone~\cite{Grossman06} also provides capabilities for polymorphic functions and existential types, similar to \CFA's @forall@ functions and generic types.
26Cyclone's existential types can include function pointers in a construct similar to a virtual function table, but these pointers must be explicitly initialized at some point in the code, which is potentially error prone.
27Furthermore, Cyclone's polymorphic functions and types are restricted to abstraction over types with the same layout and calling convention as @void *@, \ie only pointer types and @int@.
28In \CFA terms, all Cyclone polymorphism must be an incomplete data-type @forall( T * )@ \see{\VRef{s:PolymorphicFunction}}, which provides the efficiency benefits of fixed-size types \see{\VPageref{s:GenericImplementation}}.
29\CFA differs by adding object and variadic kinds of polymorphism, which provide more expressive reuse forms.
30Smith and Volpano~\cite{Smith98} present Polymorphic C, an ML dialect with polymorphic functions, C-like syntax, and pointer types.
31While these language purport to be C replacements, they are significantly different from C, and hence, not a practical C replacement.
32\CFA differs in providing better imperative-style polymorphism, while retaining backwards syntax and semantic compatibility with C.
33
34Objective-C~\cite{obj-c-book} and its successor Swift~\cite{Swift} are used in iOS phone applications.
35Swift has polymorphic functions and generics, object subtyping, traits via @extensions@, general function/method overloading, including parameter names and return type.
36Objective-C communication is via message passing rather than function call, while swift uses function call unless interacting with Objective-C.
37Swift and \CFA's type-systems are very similar, minus the object-oriented subtyping in \CFA, which is felt to be unnecessary.
38The GObject~\cite{GObject} framework adds object-oriented programming to C with runtime type-checking and reference-counting garbage collection, as a modernization path for existing C code-bases.
39Vala~\cite{Vala} compiles to GObject-based C, but requires additional language syntax over GObject.
40\CFA's path for modernizing works with the existing C type system and runtime, \ie not adding object-oriented types or garbage collection.
41Java~\cite{Java8} has object-oriented subtyping, generic @interface@s that act like traits, which are type checked at compilation and type erased at runtime similar to \CFA's, and general overloading on methods.
42However, in Java, each class carries its own table of method pointers, and use of interfaces must be pre-declared with the @implements@ keyword, whereas \CFA passes trait pointers at call-site maintaining a C-compatible layout.
43Java is also garbage-collected.
44
45D~\cite{D}, Go, and Rust~\cite{Rust} are compiled languages with generic functions and types using traits similar to \CFA: \emph{interfaces} in D and Go, and \emph{traits} in Rust.
46D and Go are garbage-collected languages;
47Rust is not garbage collected.
48Go's generic types and functions are limited to a small fixed-set provided by the compiler, with no language facility to define more.
49Rust also possesses more powerful abstraction capabilities for writing generic code than Go.
50While Rust's borrow checker provides strong safety guarantees, it is complex and difficult to learn, and imposes a distinctly idiomatic programming style different from C.
51\CFA, with its modest safety features, has a comparable type-system to Rust's that also maintains C backwards compatibility, which provides a modernization path for existing C code-bases.
52
53
54\section{Tuples/Variadics}
55
56Many programming languages have some form of tuple construct and/or variadic functions, \eg SETL, C, KW-C, \CC, D, Go, Java, ML, and Scala.
57SETL~\cite{SETL} is a high-level mathematical programming language, with tuples being one of the primary data types.
58Tuples in SETL allow subscripting, dynamic expansion, and multiple assignment.
59KW-C~\cite{Buhr94a}, a predecessor of \CFA, introduced tuples to C as an extension of the C syntax, taking much of its inspiration from SETL.
60This work added multiple return value functions (MRVF), tuple mass and multiple assignment, and record-member access, giving unstructured tuples.
61Structured tuples (tuple variables) are also introduced including multiple coercions between structured and unstructured tuples.
62Like \CC, D provides tuples through a library variadic-template structure.
63Go does not have tuples but supports MRVF.
64Tuples are a fundamental abstraction in most functional programming languages, such as Standard ML~\cite{sml}, Haskell, and Scala~\cite{Scala}, which decompose tuples using pattern matching.
65From KW-C unstructured tuples, \CFA took MRVF, mass and multiple assignment, and record-member access.
66While \CFA has some structured-tuple capabilities, \VRef{s:TupleImplementation}, my analysis suggests this feature might be removed.
67
68An alternative to tuple types is variadic (variable number of arguments) functions or types.
69C provides variadic functions through @va_list@ objects, but the programmer is responsible for managing the number of arguments and their types;
70thus, the mechanism is type unsafe.
71\CC{11} introduced @std::tuple@ as a library variadic-template structure.
72Tuples are a generalization of @std::pair@ allowing for arbitrary length, fixed-size aggregation of heterogeneous values.
73Operations include @std::get<N>@ to extract values, @std::tie@ to create a tuple of references used for assignment, and lexicographic comparisons.
74\CC{17} proposes \emph{structured bindings}~\cite{Sutter15} to eliminate pre-declaring variables and the use of @std::tie@ for binding the results.
75This extension requires the use of @auto@ to infer the types of the new variables; hence, complicated expressions with a nonobvious type must be documented with some other mechanism.
76Furthermore, structured bindings are not a full replacement for @std::tie@, as it always declares new variables.
77Java's variadic functions appear similar to C's but are type safe using homogeneous arrays.
78\CFA's heterogeneous variadic functions \see{\VPageref{p:VariadicFunctions}} provide a type-safe version of C variadic, although limited in features, which fits into the overall \CFA type-system design.
79
80
81\section{Type Resolution}
82
83% \CFA expression resolution must deal with extensive overloading and inference of polymorphic types with assertions.
84% The goal is to keep the base algorithm used in type unification simple enough so resolving a complicated expression can still be done reasonably fast.
85% The following is work that handles the aforementioned bidirectional subtyping relations concisely.
86
87Psyche-C~\cite{Melo17} is a tool built for inferencing missing type and variable declarations of incomplete C programs, which can also be viewed as a dialect of C with type inferencing.
88As PsycheC is built for analyzing standard C programs, it does not have any kind of overloading or polymorphism.
89Instead, all top-level variables and function parameters may have indeterminate types.
90The scope of the problem PsycheC deals with is quite the opposite of that in \CFA, but the techniques used to solve for the unknown types are similar.
91
92In PsycheC, a unification variable is assigned to every identifier with unknown type, and each use of a variable generates a constraint on its type.
93Just like in \CFA, the most common kinds of expression contexts PsycheC deals with are assignments and function calls, which creates inequality constraints: the right hand side of the assignment is a subtype of the left hand side;
94function argument passing is treated the same as assignment.
95Since both sides of the inequality can have unknown types, type resolution needs to handle both upper bound and lower bound constraints.
96Instead of directly solving with lower and upper bounds, the authors developed a two-phase unification algorithm where the equivalences are first resolved and substituted into the inequalities, and showed that by ordering the inequalities using the partial order of concrete types involved, a second pass of unification treating inequalities as equivalences is sufficient to resolve all type variables.
97However, PsycheC does not consider the hierarchy of arithmetic types, as they are all mutually convertible in C, and therefore, considered \emph{equivalent} in unification.
98It also lacks structured types except simple pointers, meaning their strategy of unifying subtypes as equivalences does not work in a more general setting such as in \CFA.
99
100Pierce and Turner~\cite{Pierce00} presented a \emph{local} type-inference method that restricts the propagation of types and use of unification variables within adjacent syntax-tree nodes.
101The technique provides sufficient type checking and inference for the ML family of languages.
102Odersky, Zenger, and Zenger~\cite{Odersky01} refined this method by introducing a coloured type-system that distinguishes inherited types (given directly by context) and synthesized types (deducted from the term itself), to handle bidirectional type propagation in a more concise manner.
103Their algorithm is designed for functional programming, where a first-class (untyped) lambda expression is common and type inference must also deduce the argument types of lambdas requiring bidirectional inference.
104Given that \CFA does not have lambda expressions, but the rest of the type inference cases are quite similar, implementing this type-inference algorithm for \CFA expression resolution might be promising.
105Moreover, the current \CFA implementation also keeps unification variables \emph{almost} local, except when a type variable does not appear in the parameter list.
106Hence, it cannot be given a value from the argument types;
107rather, it is passed upwards as a free variable for matching.
108In all other cases, the substituted minimal types are passed upwards as the result types.
109This process means the current implementation of \CFA expression resolution semantically agrees with Pierce's type-inference system in most of the cases.
110A question remains whether these methods can be adapted to handle \CFA assertion resolution.
Note: See TracBrowser for help on using the repository browser.