source: doc/theses/aaron_moss_PhD/phd/introduction.tex @ 555c599

Last change on this file since 555c599 was bd405fa, checked in by Aaron Moss <a3moss@…>, 6 years ago

thesis: final edits

  • Property mode set to 100644
File size: 6.8 KB
Line 
1\chapter{Introduction}
2
3The C programming language~\cite{C11} has had a wide-ranging impact on the design of software and programming languages.
4In the 30 years since its first standardization, it has consistently been one of the most popular programming languages, with billions of lines of C code still in active use, and tens of thousands of trained programmers producing it. The TIOBE index~\cite{TIOBE} tracks popularity of programming languages over time, and C has never dropped below second place:
5
6\begin{table}[h]
7\caption[TIOBE index over time]{Current top 5 places in the TIOBE index averaged over years} \label{tiobe-table}
8
9\centering
10\begin{tabular}{@{}cccccccc@{}}
11        & 2018  & 2013  & 2008  & 2003  & 1998  & 1993  & 1988  \\
12Java            & 1             & 2             & 1             & 1             & 18    & --    & --    \\
13\textbf{C}      & \textbf{2} & \textbf{1} & \textbf{2} & \textbf{2} & \textbf{1} & \textbf{1} & \textbf{1} \\
14\CC{}           & 3             & 4             & 3             & 3             & 2             & 2             & 5             \\
15Python          & 4             & 7             & 6             & 11    & 22    & 17    & --    \\
16\Csharp{}       & 5             & 5             & 7             & 8             & --    & --    & --    \\
17\end{tabular}
18\end{table}
19
20The impact of C on programming language design is also obvious from Table~\ref{tiobe-table}; with the exception of Python, all of the top five languages use C-like syntax and control structures.
21\CC{}~\cite{C++} is even a largely backwards-compatible extension of C.
22Though its lasting popularity and wide impact on programming language design point to the continued relevance of C, there is also widespread desire of programmers for languages with more expressive power and programmer-friendly features; accommodating both maintenance of legacy C code and development of the software of the future is a difficult task for a single programming language.
23
24\CFA{}\footnote{Pronounced ``C-for-all'', and written \CFA{} or \CFL{}.} is an evolutionary modernization of the C programming language that aims to fulfill both these ends well.
25\CFA{} both fixes existing design problems and adds multiple new features to C, including name overloading, user-defined operators, parametric-polymorphic routines, and type constructors and destructors, among others.
26The new features make \CFA{} more powerful and expressive than C, while maintaining strong backward-compatibility with both C code and the procedural paradigm expected by C programmers.
27Unlike other popular C extensions like \CC{} and Objective-C, \CFA{} adds modern features to C without imposing an object-oriented paradigm to use them.
28However, these new features do impose a compile-time cost, particularly in the expression resolver, which must evaluate the typing rules of a significantly more complex type system.
29
30This thesis is focused on making \CFA{} a more powerful and expressive language, both by adding new features to the \CFA{} type system and ensuring that both added and existing features can be efficiently implemented in \CFACC{}, the \CFA{} reference compiler.
31Particular contributions of this work include:
32\begin{itemize}
33\item design and implementation of parametric-polymorphic (``generic'') types in a manner compatible with the existing polymorphism design of \CFA{} (Chapter~\ref{generic-chap}),
34\item a new expression resolution algorithm designed to quickly locate the optimal declarations for a \CFA{} expression (Chapter~\ref{resolution-chap}),
35\item a type environment data structure based on a novel variant of the union-find algorithm (Chapter~\ref{env-chap}),
36\item and as a technical contribution, a prototype system for compiler algorithm development which encapsulates the essential aspects of the \CFA{} type system without incurring the technical debt of the existing system or the friction-inducing necessity of maintaining a working compiler (Chapter~\ref{expr-chap}).
37\end{itemize}
38
39The prototype system, which implements the algorithmic contributions of this thesis, is the first performant type-checker implementation for a \CFA{}-style type system.
40As the existence of an efficient compiler is necessary for the practical viability of a programming language, the contributions of this thesis comprise a validation of the \CFA{} language design that was previously lacking.
41
42Though the direction and experimental validation of this work is fairly narrowly focused on the \CFA{} programming language, the tools used and results obtained should be of interest to a wider compiler and programming language design community.
43In particular, with the addition of \emph{concepts} in \CCtwenty{}~\cite{C++Concepts}, conforming \CC{} compilers must support a model of type assertions very similar to that in \CFA{}, and the algorithmic techniques used here may prove useful.
44Much of the difficulty of type-checking \CFA{} stems from the language design choice to allow overload selection from the context of a function call based on function return type in addition to the type of the arguments to the call; this feature allows the programmer to specify fewer redundant type annotations on functions that are polymorphic in their return type.
45As an example in \CC{}:
46\begin{C++}
47template<typename T> T* zero() { return new T( 0 ); }
48
49int* z = zero<int>();  $\C{// must specify int twice}$
50\end{C++}
51
52\CFA{} allows !int* z = zero()!, which elides the second !int!.
53While the !auto! keyword in \CCeleven{} supports similar inference in a limited set of contexts (\eg{} !auto z = zero<int>()!), the demonstration of the richer inference in \CFA{} raises possibilities for similar features in future versions of \CC{}.
54By contrast to \CC{}, Java~8~\cite{Java8} and Scala~\cite{Scala} use comparably powerful forms of type inference to \CFA{}, so the algorithmic techniques in this thesis may be effective for those languages' compiler implementors.
55Type environments are also widely modelled in compiler implementations, particularly for functional languages, though also increasingly commonly for other languages (such as Rust~\cite{Rust}) that perform type inference; the type environment presented here may be useful to those language implementors.
56
57One area of inquiry that is outside the scope of this thesis is formalization of the \CFA{} type system.
58Ditchfield~\cite{Ditchfield92} defined the $F_\omega^\ni$ polymorphic lambda calculus, which is the theoretical basis for the \CFA{} type system.
59Ditchfield did not, however, prove any soundness or completeness properties for $F_\omega^\ni$; such proofs remain future work.
60A number of formalisms other than $F_\omega^\ni$ could potentially be adapted to model \CFA{}.
61One promising candidate is \emph{local type inference} \cite{Pierce00,Odersky01}, which describes similar contextual propagation of type information; another is the polymorphic conformity-based model of the Emerald~\cite{Black90} programming language, which defines a subtyping relation on types that is conceptually similar to \CFA{} traits.
62These modelling approaches could potentially be used to extend an existing formal semantics for C such as Cholera \cite{Norrish98}, CompCert \cite{Leroy09}, or Formalin \cite{Krebbers14}.
Note: See TracBrowser for help on using the repository browser.