Context Navigation

source: doc/proposals/tuples/tuples.tex @ 5dd0704

ADTaaron-thesisarm-ehast-experimentalcleanup-dtorsdeferred_resndemanglerenumforall-pointer-decayjacob/cs343-translationjenkins-sandboxnew-astnew-ast-unique-exprnew-envno_listpersistent-indexerpthread-emulationqualifiedEnumresolv-newwith_gc

Last change on this file since 5dd0704 was 2298a7b8, checked in by Rob Schluntz <rschlunt@…>, 8 years ago
added ctor and tuple design docs
Property mode set to `100644`
File size: 22.5 KB

Line
1	% inline code ©...© (copyright symbol) emacs: C-q M-)
2	% red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
3	% blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_
4	% green highlighting ¢...¢ (cent symbol) emacs: C-q M-"
5	% LaTex escape §...§ (section symbol) emacs: C-q M-'
6	% keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
7	% math escape $...$ (dollar symbol)
8
9	\documentclass[twoside,11pt]{article}
10
11	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
12
13	% Latex packages used in the document (copied from CFA user manual).
14	\usepackage[T1]{fontenc} % allow Latin1 (extended ASCII) characters
15	\usepackage{textcomp}
16	\usepackage[latin1]{inputenc}
17	\usepackage{fullpage,times,comment}
18	\usepackage{epic,eepic}
19	\usepackage{upquote} % switch curled `'" to straight
20	\usepackage{calc}
21	\usepackage{xspace}
22	\usepackage{graphicx}
23	\usepackage{varioref} % extended references
24	\usepackage{listings} % format program code
25	\usepackage[flushmargin]{footmisc} % support label/reference in footnote
26	\usepackage{latexsym} % \Box glyph
27	\usepackage{mathptmx} % better math font with "times"
28	\usepackage[usenames]{color}
29	\usepackage[pagewise]{lineno}
30	\renewcommand{\linenumberfont}{\scriptsize\sffamily}
31	\input{common} % bespoke macros used in the document
32	\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
33	\usepackage{breakurl}
34	\renewcommand{\UrlFont}{\small\sf}
35
36	\setlength{\topmargin}{-0.45in} % move running title into header
37	\setlength{\headsep}{0.25in}
38
39	\usepackage{caption}
40	\usepackage{subcaption}
41	\usepackage{bigfoot}
42
43	\interfootnotelinepenalty=10000
44
45	\title{
46	\Huge \vspace*{1in} Tuple Design in \CFA \\
47	\vspace*{1in}
48	}
49
50	\author{
51	\huge Rob Schluntz \\
52	\Large \vspace*{0.1in} \texttt{rschlunt@uwaterloo.ca} \\
53	\Large Cheriton School of Computer Science \\
54	\Large University of Waterloo
55	}
56
57	\date{
58	\today
59	}
60
61	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
62
63	\newcommand{\bigO}[1]{O\!\left( #1 \right)}
64
65	\begin{document}
66	\pagestyle{headings}
67	% changed after setting pagestyle
68	\renewcommand{\sectionmark}[1]{\markboth{\thesection\quad #1}{\thesection\quad #1}}
69	\renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}}
70	\pagenumbering{roman}
71	% \linenumbers % comment out to turn off line numbering
72
73	\maketitle
74	\thispagestyle{empty}
75
76	\clearpage
77	\thispagestyle{plain}
78	\pdfbookmark[1]{Contents}{section}
79	\tableofcontents
80
81	\clearpage
82	\thispagestyle{plain}
83	\pagenumbering{arabic}
84
85
86	\section{Introduction}
87	This document describes my understanding of the existing tuple design~\cite{Buhr94a,Till89}, mixed with my thoughts on improvements after various discussions with Peter, Aaron, and Thierry.
88
89	\section{Tuple Expressions}
90	A tuple expression is an expression which produces a fixed-size, ordered list of values of heterogeneous types.
91	The type of a tuple expression is the tuple of the subexpression types.
92	In Cforall, a tuple expression is denoted by a list of expressions enclosed in square brackets.
93
94	For example, the expression ©[5, 'x', 10.5]© has type ©[int, char, double]©.
95	Tuples are a compile-time phenomenon and have little to no run-time presence.
96
97	\subsection{Tuple Variables}
98	It is possible to have variables of tuple type, pointer to tuple type, and array of tuple type.
99	Tuple types can be composed of any types, except for array types \footnote{I did not see this issue mentioned at all in the original design. A tuple containing an array type seems to make sense up until you try to use a tuple containing an array type as a function parameter. At this point you lose information about the size of the array, which makes tuple assignment difficult. Rather than allowing arrays in most situations and disallowing only as function parameters, it seems like it would be better to be consistent across the board.}.
100
101	\begin{lstlisting}
102	[double, int] di;
103	[double, int] * pdi
104	[double, int] adi[10];
105	\end{lstlisting}
106	The program above declares a variable of type ©[double, int]©, a variable of type pointer to ©[double, int]©, and an array of ten ©[double, int]©.
107
108	\subsection{Flattening and Structuring}
109	In Cforall, tuples do not have a rigid structure.
110	In function call contexts, tuples support implicit flattening and restructuring \footnote{In the original tuple design, four tuple coercions were described: opening, closing, flattening, and structuring. I've combined flattening with opening and structuring with closing in my description, as the distinctions do not seem useful in Cforall since these coercions happen only as arguments to function calls, and I believe all of the semantics are properly covered by the simplified descriptions.}. Tuple flattening recursively expands a tuple into the list of its basic components. Tuple structuring packages a list of expressions into a value of tuple type.
111
112	\begin{lstlisting}
113	int f(int, int);
114	int g([int, int]);
115	int h(int, [int, int]);
116	[int, int] x;
117	int y;
118
119	f(x);
120	g(y, 10);
121	h(x, y);
122	\end{lstlisting}
123	In Cforall, each of these calls is valid.
124	In the call to ©f©, ©x© is implicitly flattened so that the components of ©x© are passed as the two arguments to ©f©.
125	For the call to ©g©, the values ©y© and ©10© are structured into a single argument of type ©[int, int]© to match the type of the parameter of ©g©.
126	Finally, in the call to ©h©, ©y© is flattened to yield an argument list of length 3, of which the first component of ©x© is passed as the first parameter of ©h©, and the second component of ©x© and ©y© are structured into the second argument of type ©[int, int]©.
127
128	\section{Functions}
129	\subsection{Argument Passing}
130	In resolving a function call, all of the arguments to the call are flattened.
131	While determining if a particular function/argument-list combination is valid, the arguments are structured to match the shape of each formal parameter, in order.
132
133	For example, given a function declaration ©[int] f(int, [double, int])©, the call ©f([5, 10.2], 0)© can be satisfied by first flattening the tuple to yield the expression ©f(5, 10.2, 0)© and then structuring the argument list to match the formal parameter list structure as ©f(5, [10.2, 0])©.
134
135	\subsection{Multiple-Return-Value Functions}
136	Functions can be declared to return more than one value.
137	Multiple return values are packaged into a tuple value when the function returns.
138	A multiple-returning function with return type ©T© can return any expression which is implicitly convertible to ©T©.
139
140	\subsection{Tuple Assignment}
141	An assignment where the left side of the assignment operator has a tuple type is called tuple assignment.
142	There are two kinds of tuple assignment depending on whether the right side of the assignment operator has a tuple type or a non-tuple type, called Multiple Assignment and Mass Assignment respectively.
143	Let $L_i$ for $i$ in $[0, n)$ represent each component of the flattened left side, $R_i$ represent each component of the flattened right side of a multiple assignment, and $R$ represent the right side of a mass assignment.
144
145	For a multiple assignment to be valid, both tuples must have the same number of elements when flattened. Multiple assignment assigns $R_i$ to $L_i$ for each $i$.
146	That is, ©?=?(&$L_i$, $R_i$)© must be a well-typed expression.
147
148	Mass assignment assigns the value $R$ to each $L_i$. For a mass assignment to be valid, ©?=?(&$L_i$, $R$)© must be a well-typed expression.
149	This differs from C cascading assignment (e.g. ©a=b=c©) in that conversions are applied to $R$ in each individual assignment, which prevents data loss from the chain of conversions that can happen during a cascading assignment.
150
151	Both kinds of tuple assignment have parallel semantics, such that each value on the left side and right side is evaluated \emph{before} any assignments occur.
152	Tuple assignment is an expression where the result type is the type of the left side of the assignment, as in normal assignment (i.e. the tuple of the types of the left side expressions) \footnote{This is a change from the original tuple design, wherein tuple assignment was a statement. This decision appears to have been made in an attempt to fix what was seen as a problem with assignment, but at the same time this doesn't seem to fit C or Cforall very well. In another language, tuple assignment as a statement could be reasonable, but I don't see a good justification for making this the only kind of assignment that isn't an expression. In this case, I would value consistency over idealism}.
153	These semantics allow cascading tuple assignment to work out naturally in any context where a tuple is permitted.
154
155	The following example shows multiple, mass, and cascading assignment in one expression
156	\begin{lstlisting}
157	int a, b;
158	double c, d;
159	[void] f([int, int]);
160	f([c, a] = [b, d] = 1.5);
161	\end{lstlisting}
162	First a mass assignment of ©1.5© into ©[b, d]©, which assigns ©1.5© into ©b©, which is truncated to ©1©, and ©1© into ©d©, producing the tuple ©[1, 1.5]© as a result.
163	That tuple is used as the right side of the multiple assignment (i.e. ©[c, a] = [1, 1.5]©) which assigns ©1© into ©c© and ©1.5© into ©a©, which is truncated to ©1©, producing the result ©[1, 1]©.
164	Finally, the tuple ©[1, 1]© is used as an expression in the call to ©f©.
165
166	\subsection{Tuple Construction}
167	Tuple construction and destruction follow the same rules and semantics as tuple assignment, except that in the case where there is no right side, the default constructor or destructor is called on each component of the tuple.
168
169	It is possible to define constructors and assignment functions for tuple types that provide new semantics.
170	For example, the function ©void ?{}([T, U] *, S);© can be defined to allow a tuple variable to be constructed from a value of type ©S©. Due to the structure of generated constructors, it is possible to pass a tuple to a generated constructor for a type with a member prefix that matches the type of the tuple (e.g. an instance of ©struct S { int x; double y; int z }© can be constructed with a tuple of type ©[int, double]©, `out of the box').
171
172	\section{Other Tuple Expressions}
173	\subsection{Member Tuple Expression}
174	It is possible to access multiple fields from a single expression using a Member Tuple Expression \footnote{Called ``record field tuple'' in the original design, but there's no reason to limit this feature to only structs, so ``field tuple'' or ``member tuple'' feels more appropriate.}.
175	The result is a single tuple-valued expression whose type is the tuple of the types of the members.
176	A member tuple expression has the form ©a.[x, y, z];© where ©a© is an expression with type ©T©, where ©T© supports member access expressions, and ©x, y, z© are all members of ©T© with types ©T$_x$©, ©T$_y$©, and ©T$_z$© respectively.
177	Then the type of ©a.[x, y, z]© is ©[T$_x$, T$_y$, T$_z$]©.
178	It is possible for a member tuple expression to contain other member access expressions, e.g. ©a.[x, y.[i, j], z.k]©.
179	This expression is equivalent to ©[a.x, [a.y.i, a.y.j], a.z.k]©.
180	It is guaranteed that the aggregate expression to the left of the ©.© on a member tuple expression is evaluated once.
181
182	\subsection{Tuple Indexing}
183	Sometimes it is desirable to access a single component of a tuple-valued expression without creating unnecessary temporary variables to assign to.
184	Given a tuple-valued expression ©e© and a compile-time constant integer $i$ where $0 \leq i < n$, where $n$ is the number of components in ©e©, ©e.i© will access the $i$\textsuperscript{th} component of ©e©.
185	% \footnote{If this syntax cannot be parsed, we can make it ©_i©, and a semantic check ensures that ©_i© has the right form. The capability to access a component of a tuple is helpful internally, and there doesn't seem to be a disadvantage to exposing it to users. On the other hand, it is more general than casting and much more explicit, while also being less verbose.}.
186	It is possible to use a member tuple expression with tuple indexing to manually restructure a tuple (rearrange components, drop components, duplicate components, etc.).
187
188	% TODO: mention that Tuple.member_name and Aggregate.index could have sensible semantics, but introduce complexity into the model. Agg.idx could mean get the ith member of the aggregate (further, this could be extended for enumerations as well, where the LHS is a type instead of a value), but it's not clear there is a compelling use-case. Tuple.member_name can either mean "distribute the member across the elements of the tuple" [effectively a compile-time map], or alternatively array.member_name (to mean basically the same thing). The problem with this is that it takes this expression's meaning from being clear at compile-time to needing resolver support, as the member name needs to appropriately distribute across every member of the tuple, which could itself be a tuple, etc. Again, the extra complexity is not currently justified.
189
190	For example
191	\begin{lstlisting}
192	[int, double] x;
193	[double, int, double] y = [x.1, x.0, x.1]; // (1)
194
195	[int, int, int] f();
196	[x.0, y.1] = f().[0, 2]; // (2)
197	\end{lstlisting}
198
199	(1) ©y© is initialized using a tuple expression which selects components from the tuple variable ©x©.
200
201	(2) A mass assignment of the first and third components from the return value of ©f© into the first component of ©x© and the second component of ©y©.
202
203	\subsection{Casting}
204	A cast to tuple type is valid when $T_n \leq S_m$, where $T_n$ is the number of components in the target type and $S_m$ is the number of components in the source type, and for each $i$ in $[0, n)$, $S_i$ can be cast to $T_i$.
205	Excess elements ($S_j$ for all $j$ in $[n, m)$) are evaluated, but their values are discarded so that they are not included in the result expression.
206	This naturally follows the way that a cast to void works in C.
207
208	For example,
209	\begin{lstlisting}
210	[int, int, int] f();
211	[int, [int, int], int] g();
212
213	([int, double])f(); // (1)
214	([int, int, int])g(); // (2)
215	([void, [int, int]])g(); // (3)
216	([int, int, int, int])g(); // (4)
217	([int, [int, int, int]])g(); // (5)
218	\end{lstlisting}
219
220	(1) discards the last element of the return value and converts the second element to type double.
221
222	(2) discards the second component of the second element of the return value of ©g© (if ©g© is free of side effects, this is equivalent to ©[(int)(g().0), (int)(g().1.0), (int)(g().2)]©).
223
224	(3) discards the first and third return values (equivalent to ©[(int)(g().1.0), (int)(g().1.1)]©).
225
226	(4) is invalid because the cast target type contains 4 components, while the source type contains only 3.
227
228	(5) is invalid because the cast ©([int, int, int])(g().1)© is invalid (i.e. it is invalid to cast ©[int, int]© to ©[int, int, int]©)
229
230
231	\section{Tuples for Variadic Functions}
232	Functions with tuple parameters can be used to provide type-safe variadic functions.
233	It appears that it would be possible to leverage tuples to get similar power to what \CC vardiadic templates provide, but with the ability to separately compile them.
234
235	\subsection{Option 1: Allow type parameters to match whole tuples, rather than just their components}
236	This option could be implemented with two phases of argument matching when a function contains type parameters and the argument list contains tuple arguments.
237	If flattening and structuring fail to produce a match, a second attempt at matching the function and argument combination is made where tuple arguments are not expanded and structure must match exactly, modulo implicit conversions. \footnote{It may be desirable to skip the exact matching rule if flattening and structuring produce a match that fails when inferring assertion parameters, at least in the current resolver since our assertion inference appears to be very inefficient.}
238
239	For example:
240	\begin{lstlisting}
241	forall(otype T, otype U \| { T g(U); })
242	void f(T, U);
243
244	[int, int] g([int, int, int, int]);
245
246	f([1, 2], [3, 4, 5, 6]);
247	\end{lstlisting}
248	With flattening and structuring, the call is first transformed into ©f(1, 2, 3, 4, 5, 6)©.
249	Since the first argument of type ©T© does not have a tuple type, unification decides that ©T=int© and ©1© is matched as the first parameter.
250	Likewise, ©U© does not have a tuple type, so ©U=int© and ©2© is accepted as the second parameter.
251	There are now no remaining formal parameters, there are remaining arguments, and the function is not variadic, so the match fails.
252
253	With exact matching, ©T=[int,int]© and ©U=[int,int,int,int]© and so the arguments type check.
254	When inferring assertion ©g©, a match is found.
255	\footnote{This type of interaction between tuple arguments and type parameters is desirable for perfect forwarding, but it's not obvious to me exactly how this should interact with assertion inference. Ideally, the same rules should apply for assertion satisfaction as apply to argument matching (i.e. flattening \& structuring should be attempted, followed by an exact match attempt on failure), but this may be more complicated than it sounds for assertion satisfaction. Aaron, I'm especially interested to hear your thoughts on this with respect to efficiency in the resolver redesign.
256
257	For example, should we allow this to match?
258	\begin{lstlisting}
259	forall(otype T, otype U \| { T g(U); })
260	void f(T, U);
261
262	[int, int] g(int, ®[int, int]®, int);
263
264	f([1, 2], [3, 4, 5, 6]);
265	\end{lstlisting}
266	To only have an exact matching rule here feels too strict. At the very least, it would be nice to accept ©[int, int] g(int, int, int, int)©, since that would allow for argument lists to be packaged and sent off to polymorphic functions and then directly forwarded to other functions.}.
267
268	The addition of an exact matching rule only affects the outcome for polymorphic type binding when tuples are involved.
269	For non-tuple arguments, exact matching and flattening \& structuring are equivalent. For tuple arguments to a function without polymorphic formal parameters, flattening and structuring work whenever an exact match would have worked (the tuple is flattened and implicitly restructured to its original structure).
270	Thus there is nothing to be gained from permitting the exact matching rule to take effect when a function does not contain polymorphism and none of the arguments are tuples.
271
272	\subsection{Option 2: Add another type parameter kind}
273	Perhaps a simpler alternative would be to add another kind of type parameter (e.g., ©ttype©).
274	There should be at most one ©ttype© parameter which must occur last in a parameter list.
275	Matching against a ©ttype© parameter would consume/package all remaining argument components into a tuple, and would also match no arguments.
276	These semantics more closely match normal variadic semantics, while being type-safe. C variadic syntax and ©ttype© polymorphism probably should not be mixed, since it is not clear where to draw the line to decide which arguments belong where.\footnote{In fact, if we go with this proposal, it might be desirable to disallow polymorphic functions to use C variadic syntax to encourage a Cforall style. Aside from maybe calling C variadic functions, it's not obvious to me there would be anything you can do with C variadics that couldn't also be done with ©ttype© parameters. }
277
278	Example 1: taken from Wikipedia, demonstrates variadic templates done in a Cforall style
279	\begin{lstlisting}
280	void func(void) {} // termination version (1)
281	forall(otype T, ttype Params \| { void process(const T &); void func(const Params &); })
282	void func(const T& arg1, const Params & p) { // (2)
283	process(arg1);
284	func(p);
285	}
286	void process(int); // (3)
287	void process(double); // (4)
288	func(1, 2.0, 3.5, 4);
289	\end{lstlisting}
290	In the call to ©func©, the value ©1© is taken as the first parameter, so ©T© unifies with ©int©, and the arguments ©2.0©, ©3.5©, and ©4© are consumed to form a tuple argument of type ©[double, double, int]©.
291	To satisfy the assertions to ©func©, the functions (3) and (2) are implicitly selected to satisfy the requirements of ©void process(const T &)© and ©void func(const Params &)© respectively.
292	Since (2) requires assertion parameters, the process repeats selecting (4) and (2).
293	The matching process continues recursively until reaching the base case where (3) and (1) are selected.
294	The end result is semantically equivalent to ©process(1), process(2.0), process(3.5), process(4)©.
295
296	Since (2) is not an exact match for the expected assertion parameter, a thunk is generated that wraps a call to ©func© that accepts an argument of type ©[double, double, int]©.
297	This conversion already occurs in the Cforall translator, but may require some modification to handle all of the cases present here.
298
299	Example 2: new (i.e. type-safe malloc + constructors)
300	\begin{lstlisting}
301	forall(dtype T, ttype Params \| sized(T) \| { void ?{}(T *, Params); })
302	T * new(Params p) {
303	return malloc(){ p };
304	}
305	array(int) * x = new(1, 2, 3, 4);
306	\end{lstlisting}
307	In the call to ©new©, ©array(int)© is selected to match ©T©, and ©Params© is expanded ot match ©[int, int, int, int]©. To satisfy the assertions, a constructor with an interface compatible with ©void ?{}(array(int) *, int, int, int, int)© must exist in the current scope.
308
309	Assertion inference can also be special cased to match functions that take tuples of any structure only for ttype parameters, if desired.
310
311
312	\subsection{Conclusions}
313	With either option, we can generate a thunk to perform the conversion from the actual argument's structure to the structure expected by the assertion parameter and that function would be passed as the assertion argument, in a manner similar to the other thunks that are already generated.
314
315	I prefer option 2, because it is simpler and I think the semantics are clearer.
316	I wouldn't be surprised if it was also noticeably more efficient, because of the lack of backtracking.
317
318	As a side note, option 1 also requires calls to be written explicitly, e.g. ©array(int) * x = new([1, 2, 3, 4]);©, which isn't particularly appealing.
319	It shifts the burden from the compiler to the programmer, which is almost always wrong, and doesn't match with the way our tuples can be used elsewhere.
320	The more I think about it, the more I'm convinced option 1 is the wrong approach, but I'm putting it out anyway in case someone has a good thought on how to make it work correctly.
321
322	\addcontentsline{toc}{section}{\refname}
323	\bibliographystyle{plain}
324	\bibliography{cfa}
325
326	%\addcontentsline{toc}{section}{\indexname} % add index name to table of contents
327	%\begin{theindex}
328	%Italic page numbers give the location of the main entry for the referenced term.
329	%Plain page numbers denote uses of the indexed term.
330	%Entries for grammar non-terminals are italicized.
331	%A typewriter font is used for grammar terminals and program identifiers.
332	%\indexspace
333	%\input{comp_II.ind}
334	%\end{theindex}
335
336	\end{document}

Note: See TracBrowser for help on using the repository browser.

Download in other formats: