1 | % inline code ©...© (copyright symbol) emacs: C-q M-) |
---|
2 | % red highlighting ®...® (registered trademark symbol) emacs: C-q M-. |
---|
3 | % blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_ |
---|
4 | % green highlighting ¢...¢ (cent symbol) emacs: C-q M-" |
---|
5 | % LaTex escape §...§ (section symbol) emacs: C-q M-' |
---|
6 | % keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^ |
---|
7 | % math escape $...$ (dollar symbol) |
---|
8 | |
---|
9 | \documentclass[twoside,11pt]{article} |
---|
10 | |
---|
11 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
12 | |
---|
13 | % Latex packages used in the document (copied from CFA user manual). |
---|
14 | \usepackage[T1]{fontenc} % allow Latin1 (extended ASCII) characters |
---|
15 | \usepackage{textcomp} |
---|
16 | \usepackage[latin1]{inputenc} |
---|
17 | \usepackage{fullpage,times,comment} |
---|
18 | \usepackage{epic,eepic} |
---|
19 | \usepackage{upquote} % switch curled `'" to straight |
---|
20 | \usepackage{calc} |
---|
21 | \usepackage{xspace} |
---|
22 | \usepackage{graphicx} |
---|
23 | \usepackage{varioref} % extended references |
---|
24 | \usepackage{listings} % format program code |
---|
25 | \usepackage[flushmargin]{footmisc} % support label/reference in footnote |
---|
26 | \usepackage{latexsym} % \Box glyph |
---|
27 | \usepackage{mathptmx} % better math font with "times" |
---|
28 | \usepackage[usenames]{color} |
---|
29 | \usepackage[pagewise]{lineno} |
---|
30 | \renewcommand{\linenumberfont}{\scriptsize\sffamily} |
---|
31 | \input{common} % bespoke macros used in the document |
---|
32 | \usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref} |
---|
33 | \usepackage{breakurl} |
---|
34 | \renewcommand{\UrlFont}{\small\sf} |
---|
35 | |
---|
36 | \setlength{\topmargin}{-0.45in} % move running title into header |
---|
37 | \setlength{\headsep}{0.25in} |
---|
38 | |
---|
39 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
40 | |
---|
41 | \newsavebox{\LstBox} |
---|
42 | |
---|
43 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
44 | |
---|
45 | \title{\Huge |
---|
46 | \vspace*{1in} |
---|
47 | Efficient Type Resolution in \CFA \\ |
---|
48 | \vspace*{0.25in} |
---|
49 | \huge |
---|
50 | PhD Comprehensive II Research Proposal |
---|
51 | \vspace*{1in} |
---|
52 | } |
---|
53 | |
---|
54 | \author{\huge |
---|
55 | \vspace*{0.1in} |
---|
56 | Aaron Moss \\ |
---|
57 | \Large Cheriton School of Computer Science \\ |
---|
58 | \Large University of Waterloo |
---|
59 | } |
---|
60 | |
---|
61 | \date{ |
---|
62 | \today |
---|
63 | } |
---|
64 | |
---|
65 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
66 | |
---|
67 | \begin{document} |
---|
68 | \pagestyle{headings} |
---|
69 | % changed after setting pagestyle |
---|
70 | \renewcommand{\sectionmark}[1]{\markboth{\thesection\quad #1}{\thesection\quad #1}} |
---|
71 | \renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}} |
---|
72 | \pagenumbering{roman} |
---|
73 | \linenumbers % comment out to turn off line numbering |
---|
74 | |
---|
75 | \maketitle |
---|
76 | \thispagestyle{empty} |
---|
77 | |
---|
78 | \clearpage |
---|
79 | \thispagestyle{plain} |
---|
80 | \pdfbookmark[1]{Contents}{section} |
---|
81 | \tableofcontents |
---|
82 | |
---|
83 | \clearpage |
---|
84 | \thispagestyle{plain} |
---|
85 | \pagenumbering{arabic} |
---|
86 | |
---|
87 | \section{Introduction} |
---|
88 | |
---|
89 | \CFA\footnote{Pronounced ``C-for-all'', and written \CFA, CFA, or \CFL.} is an evolutionary modernization of the C programming language currently being designed and built at the University of Waterloo by a team led by Peter Buhr. |
---|
90 | Features added to C by \CFA include name overloading, user-defined operators, parametric-polymorphic routines, and type constructors and destructors, among others. |
---|
91 | These features make \CFA significantly more powerful and expressive than C, but impose a significant compile-time cost to implement, particularly in the expression resolver, which must evaluate the typing rules of a much more complex type system. |
---|
92 | The primary goal of this proposed research project is to develop a sufficiently performant expression resolution algorithm, experimentally validate its performance, and integrate it into \Index*{CFA-CC}, the \CFA reference compiler. |
---|
93 | Secondary goals of this project include the development of various new language features for \CFA; parametric-polymorphic (``generic'') types have already been designed and implemented, and reference types and user-defined conversions are under design consideration. |
---|
94 | The experimental performance-testing architecture for resolution algorithms will also be used to determine the compile-time cost of adding such new features to the \CFA type system. |
---|
95 | |
---|
96 | \section{\CFA} |
---|
97 | |
---|
98 | To make the scope of the proposed expression resolution problem more explicit, it is necessary to define the features of both C and \CFA (both current and proposed) which affect this algorithm. |
---|
99 | In some cases the interactions of multiple features make expression resolution a significantly more complex problem than any individual feature would; in others a feature which does not by itself add any complexity to expression resolution will trigger previously rare edge cases much more frequently. |
---|
100 | |
---|
101 | \subsection{Polymorphic Functions} |
---|
102 | The most significant feature \CFA adds is parametric-polymorphic functions. |
---|
103 | Such functions are written using a ©forall© clause, the feature that gave the language its name: |
---|
104 | \begin{lstlisting} |
---|
105 | forall(otype T) |
---|
106 | T identity(T x) { |
---|
107 | return x; |
---|
108 | } |
---|
109 | |
---|
110 | int forty_two = identity(42); // T is bound to int, forty_two == 42 |
---|
111 | \end{lstlisting} |
---|
112 | The ©identity© function above can be applied to any complete object type (or ``©otype©''). |
---|
113 | The type variable ©T© is transformed into a set of additional implicit parameters to ©identity© which encode sufficient information about ©T© to create and return a variable of that type. |
---|
114 | The current \CFA implementation passes the size and alignment of the type represented by an ©otype© parameter, as well as an assignment operator, constructor, copy constructor \& destructor. |
---|
115 | |
---|
116 | Since bare polymorphic types do not provide a great range of available operations, \CFA also provides a \emph{type assertion} mechanism to provide further information about a type: |
---|
117 | \begin{lstlisting} |
---|
118 | forall(otype T | { T twice(T); }) |
---|
119 | T four_times(T x) { |
---|
120 | return twice( twice(x) ); |
---|
121 | } |
---|
122 | |
---|
123 | double twice(double d) { return d * 2.0; } // (1) |
---|
124 | |
---|
125 | double magic = four_times(10.5); // T is bound to double, uses (1) to satisfy type assertion |
---|
126 | \end{lstlisting} |
---|
127 | These type assertions may be either variable or function declarations which depend on a polymorphic type variable. |
---|
128 | ©four_times© can only be called with an argument for which there exists a function named ©twice© that can take that argument and return another value of the same type; a pointer to the appropriate ©twice© function will be passed as an additional implicit parameter to the call to ©four_times©. |
---|
129 | |
---|
130 | Monomorphic specializations of polymorphic functions can themselves be used to satisfy type assertions. |
---|
131 | For instance, ©twice© could have been define as below, using the \CFA syntax for operator overloading: |
---|
132 | \begin{lstlisting} |
---|
133 | forall(otype S | { S ?+?(S, S); }) |
---|
134 | S twice(S x) { return x + x; } // (2) |
---|
135 | \end{lstlisting} |
---|
136 | This version of ©twice© will work for any type ©S© that has an addition operator defined for it, and it could have been used to satisfy the type assertion on ©four_times©. |
---|
137 | The compiler accomplishes this by creating a wrapper function calling ©twice // (2)© with ©S© bound to ©double©, then providing this wrapper function to ©four_times©\footnote{©twice // (2)© could have had a type parameter named ©T©; \CFA specifies a renaming the type parameters, which would avoid the name conflict with the parameter ©T© of ©four_times©.}. |
---|
138 | |
---|
139 | Finding appropriate functions to satisfy type assertions is essentially a recursive case of expression resolution, as it takes a name (that of the type assertion) and attempts to match it to a suitable declaration in the current scope. |
---|
140 | If a polymorphic function can be used to satisfy one of its own type assertions, this recursion may not terminate, as it is possible that function will be examined as a candidate for its own type assertion unboundedly repeatedly. |
---|
141 | To avoid infinite loops, the current \Index*{CFA-CC} compiler imposes a fixed limit on the possible depth of recursion, similar to that employed by most \Index*[C++]{\CC} compilers for template expansion; this restriction means that there are some semantically well-typed expressions which cannot be resolved by {CFA-CC}. |
---|
142 | One area of potential improvement this project proposes to investigate is the possibility of using the compiler's knowledge of the current set of declarations to make a more precise judgement of when further type assertion satisfaction recursion will not produce a well-typed expression. |
---|
143 | |
---|
144 | \subsection{Name Overloading} |
---|
145 | In C, no more than one function or variable in the same scope may share the same name, and function or variable declarations in inner scopes with the same name as a declaration in an outer scope hide the outer declaration. |
---|
146 | This makes finding the proper declaration to match to a function application or variable expression a simple matter of symbol table lookup, which can be easily and efficiently implemented. |
---|
147 | \CFA, on the other hand, allows overloading of variable and function names % TODO talk about uses for this |
---|
148 | |
---|
149 | \subsection{Implicit Conversions} |
---|
150 | % TODO also discuss possibility of user-generated implicit conversions here |
---|
151 | |
---|
152 | \subsection{Generic Types} |
---|
153 | |
---|
154 | \subsection{Multiple Return Values} |
---|
155 | |
---|
156 | \subsection{Reference Types} |
---|
157 | % TODO discuss rvalue-to-lvalue conversion here |
---|
158 | |
---|
159 | \section{Expression Resolution} |
---|
160 | % TODO cite Baker, Cormack, etc. |
---|
161 | |
---|
162 | \subsection{Symbol Table} |
---|
163 | % TODO not sure this is sufficiently novel, but it is an improvement to CFA-CC |
---|
164 | |
---|
165 | \section{Completion Timeline} |
---|
166 | |
---|
167 | \section{Conclusion} |
---|
168 | |
---|
169 | \newpage |
---|
170 | |
---|
171 | \bibliographystyle{plain} |
---|
172 | \bibliography{cfa} |
---|
173 | |
---|
174 | |
---|
175 | \addcontentsline{toc}{section}{\indexname} % add index name to table of contents |
---|
176 | \begin{theindex} |
---|
177 | Italic page numbers give the location of the main entry for the referenced term. |
---|
178 | Plain page numbers denote uses of the indexed term. |
---|
179 | Entries for grammar non-terminals are italicized. |
---|
180 | A typewriter font is used for grammar terminals and program identifiers. |
---|
181 | \indexspace |
---|
182 | \input{comp_II.ind} |
---|
183 | \end{theindex} |
---|
184 | |
---|
185 | \end{document} |
---|