Context Navigation

source: doc/proposals/tuples/tuples.tex @ a7efc96

Last change on this file since a7efc96 was 484ee53, checked in by Peter A. Buhr <pabuhr@…>, 6 years ago
update Makefiles so ${Build} is order only
Property mode set to `100644`
File size: 22.6 KB

Rev	Line
[2298a7b8]	1	\documentclass[twoside,11pt]{article}
	2
	3	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	4
	5	% Latex packages used in the document (copied from CFA user manual).
	6	\usepackage[T1]{fontenc} % allow Latin1 (extended ASCII) characters
	7	\usepackage{textcomp}
	8	\usepackage[latin1]{inputenc}
[484ee53]	9
[2298a7b8]	10	\usepackage{fullpage,times,comment}
	11	\usepackage{epic,eepic}
[484ee53]	12	\usepackage{upquote} % switch curled `'" to straight
[2298a7b8]	13	\usepackage{calc}
	14	\usepackage{xspace}
	15	\usepackage{graphicx}
[484ee53]	16	\usepackage{varioref} % extended references
	17	\usepackage{listings} % format program code
	18	\usepackage[flushmargin]{footmisc} % support label/reference in footnote
[2298a7b8]	19	\usepackage{latexsym} % \Box glyph
	20	\usepackage{mathptmx} % better math font with "times"
	21	\usepackage[usenames]{color}
	22	\usepackage[pagewise]{lineno}
	23	\renewcommand{\linenumberfont}{\scriptsize\sffamily}
	24	\input{common} % bespoke macros used in the document
	25	\usepackage[dvips,plainpages=false,pdfpagelabels,pdfpagemode=UseNone,colorlinks=true,pagebackref=true,linkcolor=blue,citecolor=blue,urlcolor=blue,pagebackref=true,breaklinks=true]{hyperref}
	26	\usepackage{breakurl}
	27	\renewcommand{\UrlFont}{\small\sf}
	28
[484ee53]	29	\setlength{\topmargin}{-0.45in} % move running title into header
[2298a7b8]	30	\setlength{\headsep}{0.25in}
	31
	32	\usepackage{caption}
	33	\usepackage{subcaption}
	34	\usepackage{bigfoot}
	35
	36	\interfootnotelinepenalty=10000
	37
[484ee53]	38	\CFAStyle % use default CFA format-style
	39	% inline code ©...© (copyright symbol) emacs: C-q M-)
	40	% red highlighting ®...® (registered trademark symbol) emacs: C-q M-.
	41	% blue highlighting ß...ß (sharp s symbol) emacs: C-q M-_
	42	% green highlighting ¢...¢ (cent symbol) emacs: C-q M-"
	43	% LaTex escape §...§ (section symbol) emacs: C-q M-'
	44	% keyword escape ¶...¶ (pilcrow symbol) emacs: C-q M-^
	45	% math escape $...$ (dollar symbol)
	46
	47
[2298a7b8]	48	\title{
	49	\Huge \vspace*{1in} Tuple Design in \CFA \\
	50	\vspace*{1in}
	51	}
	52
	53	\author{
	54	\huge Rob Schluntz \\
	55	\Large \vspace*{0.1in} \texttt{rschlunt@uwaterloo.ca} \\
	56	\Large Cheriton School of Computer Science \\
	57	\Large University of Waterloo
	58	}
	59
	60	\date{
	61	\today
	62	}
	63
	64	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	65
	66	\newcommand{\bigO}[1]{O\!\left( #1 \right)}
	67
	68	\begin{document}
	69	\pagestyle{headings}
	70	% changed after setting pagestyle
	71	\renewcommand{\sectionmark}[1]{\markboth{\thesection\quad #1}{\thesection\quad #1}}
	72	\renewcommand{\subsectionmark}[1]{\markboth{\thesubsection\quad #1}{\thesubsection\quad #1}}
	73	\pagenumbering{roman}
	74	% \linenumbers % comment out to turn off line numbering
	75
	76	\maketitle
	77	\thispagestyle{empty}
	78
	79	\clearpage
	80	\thispagestyle{plain}
	81	\pdfbookmark[1]{Contents}{section}
	82	\tableofcontents
	83
	84	\clearpage
	85	\thispagestyle{plain}
	86	\pagenumbering{arabic}
	87
	88
	89	\section{Introduction}
	90	This document describes my understanding of the existing tuple design~\cite{Buhr94a,Till89}, mixed with my thoughts on improvements after various discussions with Peter, Aaron, and Thierry.
	91
	92	\section{Tuple Expressions}
	93	A tuple expression is an expression which produces a fixed-size, ordered list of values of heterogeneous types.
	94	The type of a tuple expression is the tuple of the subexpression types.
	95	In Cforall, a tuple expression is denoted by a list of expressions enclosed in square brackets.
	96
	97	For example, the expression ©[5, 'x', 10.5]© has type ©[int, char, double]©.
	98	Tuples are a compile-time phenomenon and have little to no run-time presence.
	99
	100	\subsection{Tuple Variables}
	101	It is possible to have variables of tuple type, pointer to tuple type, and array of tuple type.
	102	Tuple types can be composed of any types, except for array types \footnote{I did not see this issue mentioned at all in the original design. A tuple containing an array type seems to make sense up until you try to use a tuple containing an array type as a function parameter. At this point you lose information about the size of the array, which makes tuple assignment difficult. Rather than allowing arrays in most situations and disallowing only as function parameters, it seems like it would be better to be consistent across the board.}.
	103
	104	\begin{lstlisting}
	105	[double, int] di;
	106	[double, int] * pdi
	107	[double, int] adi[10];
	108	\end{lstlisting}
	109	The program above declares a variable of type ©[double, int]©, a variable of type pointer to ©[double, int]©, and an array of ten ©[double, int]©.
	110
	111	\subsection{Flattening and Structuring}
	112	In Cforall, tuples do not have a rigid structure.
	113	In function call contexts, tuples support implicit flattening and restructuring \footnote{In the original tuple design, four tuple coercions were described: opening, closing, flattening, and structuring. I've combined flattening with opening and structuring with closing in my description, as the distinctions do not seem useful in Cforall since these coercions happen only as arguments to function calls, and I believe all of the semantics are properly covered by the simplified descriptions.}. Tuple flattening recursively expands a tuple into the list of its basic components. Tuple structuring packages a list of expressions into a value of tuple type.
	114
	115	\begin{lstlisting}
	116	int f(int, int);
	117	int g([int, int]);
	118	int h(int, [int, int]);
	119	[int, int] x;
	120	int y;
	121
	122	f(x);
	123	g(y, 10);
	124	h(x, y);
	125	\end{lstlisting}
	126	In Cforall, each of these calls is valid.
	127	In the call to ©f©, ©x© is implicitly flattened so that the components of ©x© are passed as the two arguments to ©f©.
	128	For the call to ©g©, the values ©y© and ©10© are structured into a single argument of type ©[int, int]© to match the type of the parameter of ©g©.
	129	Finally, in the call to ©h©, ©y© is flattened to yield an argument list of length 3, of which the first component of ©x© is passed as the first parameter of ©h©, and the second component of ©x© and ©y© are structured into the second argument of type ©[int, int]©.
	130
	131	\section{Functions}
	132	\subsection{Argument Passing}
	133	In resolving a function call, all of the arguments to the call are flattened.
	134	While determining if a particular function/argument-list combination is valid, the arguments are structured to match the shape of each formal parameter, in order.
	135
	136	For example, given a function declaration ©[int] f(int, [double, int])©, the call ©f([5, 10.2], 0)© can be satisfied by first flattening the tuple to yield the expression ©f(5, 10.2, 0)© and then structuring the argument list to match the formal parameter list structure as ©f(5, [10.2, 0])©.
	137
	138	\subsection{Multiple-Return-Value Functions}
	139	Functions can be declared to return more than one value.
	140	Multiple return values are packaged into a tuple value when the function returns.
	141	A multiple-returning function with return type ©T© can return any expression which is implicitly convertible to ©T©.
	142
	143	\subsection{Tuple Assignment}
	144	An assignment where the left side of the assignment operator has a tuple type is called tuple assignment.
	145	There are two kinds of tuple assignment depending on whether the right side of the assignment operator has a tuple type or a non-tuple type, called Multiple Assignment and Mass Assignment respectively.
	146	Let $L_i$ for $i$ in $[0, n)$ represent each component of the flattened left side, $R_i$ represent each component of the flattened right side of a multiple assignment, and $R$ represent the right side of a mass assignment.
	147
	148	For a multiple assignment to be valid, both tuples must have the same number of elements when flattened. Multiple assignment assigns $R_i$ to $L_i$ for each $i$.
	149	That is, ©?=?(&$L_i$, $R_i$)© must be a well-typed expression.
	150
	151	Mass assignment assigns the value $R$ to each $L_i$. For a mass assignment to be valid, ©?=?(&$L_i$, $R$)© must be a well-typed expression.
	152	This differs from C cascading assignment (e.g. ©a=b=c©) in that conversions are applied to $R$ in each individual assignment, which prevents data loss from the chain of conversions that can happen during a cascading assignment.
	153
	154	Both kinds of tuple assignment have parallel semantics, such that each value on the left side and right side is evaluated \emph{before} any assignments occur.
	155	Tuple assignment is an expression where the result type is the type of the left side of the assignment, as in normal assignment (i.e. the tuple of the types of the left side expressions) \footnote{This is a change from the original tuple design, wherein tuple assignment was a statement. This decision appears to have been made in an attempt to fix what was seen as a problem with assignment, but at the same time this doesn't seem to fit C or Cforall very well. In another language, tuple assignment as a statement could be reasonable, but I don't see a good justification for making this the only kind of assignment that isn't an expression. In this case, I would value consistency over idealism}.
	156	These semantics allow cascading tuple assignment to work out naturally in any context where a tuple is permitted.
	157
	158	The following example shows multiple, mass, and cascading assignment in one expression
	159	\begin{lstlisting}
	160	int a, b;
	161	double c, d;
	162	[void] f([int, int]);
	163	f([c, a] = [b, d] = 1.5);
	164	\end{lstlisting}
	165	First a mass assignment of ©1.5© into ©[b, d]©, which assigns ©1.5© into ©b©, which is truncated to ©1©, and ©1© into ©d©, producing the tuple ©[1, 1.5]© as a result.
	166	That tuple is used as the right side of the multiple assignment (i.e. ©[c, a] = [1, 1.5]©) which assigns ©1© into ©c© and ©1.5© into ©a©, which is truncated to ©1©, producing the result ©[1, 1]©.
	167	Finally, the tuple ©[1, 1]© is used as an expression in the call to ©f©.
	168
	169	\subsection{Tuple Construction}
	170	Tuple construction and destruction follow the same rules and semantics as tuple assignment, except that in the case where there is no right side, the default constructor or destructor is called on each component of the tuple.
	171
	172	It is possible to define constructors and assignment functions for tuple types that provide new semantics.
	173	For example, the function ©void ?{}([T, U] *, S);© can be defined to allow a tuple variable to be constructed from a value of type ©S©. Due to the structure of generated constructors, it is possible to pass a tuple to a generated constructor for a type with a member prefix that matches the type of the tuple (e.g. an instance of ©struct S { int x; double y; int z }© can be constructed with a tuple of type ©[int, double]©, `out of the box').
	174
	175	\section{Other Tuple Expressions}
	176	\subsection{Member Tuple Expression}
	177	It is possible to access multiple fields from a single expression using a Member Tuple Expression \footnote{Called ``record field tuple'' in the original design, but there's no reason to limit this feature to only structs, so ``field tuple'' or ``member tuple'' feels more appropriate.}.
	178	The result is a single tuple-valued expression whose type is the tuple of the types of the members.
	179	A member tuple expression has the form ©a.[x, y, z];© where ©a© is an expression with type ©T©, where ©T© supports member access expressions, and ©x, y, z© are all members of ©T© with types ©T$_x$©, ©T$_y$©, and ©T$_z$© respectively.
	180	Then the type of ©a.[x, y, z]© is ©[T$_x$, T$_y$, T$_z$]©.
	181	It is possible for a member tuple expression to contain other member access expressions, e.g. ©a.[x, y.[i, j], z.k]©.
	182	This expression is equivalent to ©[a.x, [a.y.i, a.y.j], a.z.k]©.
	183	It is guaranteed that the aggregate expression to the left of the ©.© on a member tuple expression is evaluated once.
	184
	185	\subsection{Tuple Indexing}
	186	Sometimes it is desirable to access a single component of a tuple-valued expression without creating unnecessary temporary variables to assign to.
	187	Given a tuple-valued expression ©e© and a compile-time constant integer $i$ where $0 \leq i < n$, where $n$ is the number of components in ©e©, ©e.i© will access the $i$\textsuperscript{th} component of ©e©.
	188	% \footnote{If this syntax cannot be parsed, we can make it ©_i©, and a semantic check ensures that ©_i© has the right form. The capability to access a component of a tuple is helpful internally, and there doesn't seem to be a disadvantage to exposing it to users. On the other hand, it is more general than casting and much more explicit, while also being less verbose.}.
	189	It is possible to use a member tuple expression with tuple indexing to manually restructure a tuple (rearrange components, drop components, duplicate components, etc.).
	190
	191	% TODO: mention that Tuple.member_name and Aggregate.index could have sensible semantics, but introduce complexity into the model. Agg.idx could mean get the ith member of the aggregate (further, this could be extended for enumerations as well, where the LHS is a type instead of a value), but it's not clear there is a compelling use-case. Tuple.member_name can either mean "distribute the member across the elements of the tuple" [effectively a compile-time map], or alternatively array.member_name (to mean basically the same thing). The problem with this is that it takes this expression's meaning from being clear at compile-time to needing resolver support, as the member name needs to appropriately distribute across every member of the tuple, which could itself be a tuple, etc. Again, the extra complexity is not currently justified.
	192
	193	For example
	194	\begin{lstlisting}
	195	[int, double] x;
	196	[double, int, double] y = [x.1, x.0, x.1]; // (1)
	197
	198	[int, int, int] f();
	199	[x.0, y.1] = f().[0, 2]; // (2)
	200	\end{lstlisting}
	201
	202	(1) ©y© is initialized using a tuple expression which selects components from the tuple variable ©x©.
	203
	204	(2) A mass assignment of the first and third components from the return value of ©f© into the first component of ©x© and the second component of ©y©.
	205
	206	\subsection{Casting}
	207	A cast to tuple type is valid when $T_n \leq S_m$, where $T_n$ is the number of components in the target type and $S_m$ is the number of components in the source type, and for each $i$ in $[0, n)$, $S_i$ can be cast to $T_i$.
	208	Excess elements ($S_j$ for all $j$ in $[n, m)$) are evaluated, but their values are discarded so that they are not included in the result expression.
	209	This naturally follows the way that a cast to void works in C.
	210
	211	For example,
	212	\begin{lstlisting}
	213	[int, int, int] f();
	214	[int, [int, int], int] g();
	215
	216	([int, double])f(); // (1)
	217	([int, int, int])g(); // (2)
	218	([void, [int, int]])g(); // (3)
	219	([int, int, int, int])g(); // (4)
	220	([int, [int, int, int]])g(); // (5)
	221	\end{lstlisting}
	222
	223	(1) discards the last element of the return value and converts the second element to type double.
	224
	225	(2) discards the second component of the second element of the return value of ©g© (if ©g© is free of side effects, this is equivalent to ©[(int)(g().0), (int)(g().1.0), (int)(g().2)]©).
	226
	227	(3) discards the first and third return values (equivalent to ©[(int)(g().1.0), (int)(g().1.1)]©).
	228
	229	(4) is invalid because the cast target type contains 4 components, while the source type contains only 3.
	230
	231	(5) is invalid because the cast ©([int, int, int])(g().1)© is invalid (i.e. it is invalid to cast ©[int, int]© to ©[int, int, int]©)
	232
	233
	234	\section{Tuples for Variadic Functions}
	235	Functions with tuple parameters can be used to provide type-safe variadic functions.
	236	It appears that it would be possible to leverage tuples to get similar power to what \CC vardiadic templates provide, but with the ability to separately compile them.
	237
	238	\subsection{Option 1: Allow type parameters to match whole tuples, rather than just their components}
	239	This option could be implemented with two phases of argument matching when a function contains type parameters and the argument list contains tuple arguments.
	240	If flattening and structuring fail to produce a match, a second attempt at matching the function and argument combination is made where tuple arguments are not expanded and structure must match exactly, modulo implicit conversions. \footnote{It may be desirable to skip the exact matching rule if flattening and structuring produce a match that fails when inferring assertion parameters, at least in the current resolver since our assertion inference appears to be very inefficient.}
	241
	242	For example:
	243	\begin{lstlisting}
	244	forall(otype T, otype U \| { T g(U); })
	245	void f(T, U);
	246
	247	[int, int] g([int, int, int, int]);
	248
	249	f([1, 2], [3, 4, 5, 6]);
	250	\end{lstlisting}
	251	With flattening and structuring, the call is first transformed into ©f(1, 2, 3, 4, 5, 6)©.
	252	Since the first argument of type ©T© does not have a tuple type, unification decides that ©T=int© and ©1© is matched as the first parameter.
	253	Likewise, ©U© does not have a tuple type, so ©U=int© and ©2© is accepted as the second parameter.
	254	There are now no remaining formal parameters, there are remaining arguments, and the function is not variadic, so the match fails.
	255
	256	With exact matching, ©T=[int,int]© and ©U=[int,int,int,int]© and so the arguments type check.
	257	When inferring assertion ©g©, a match is found.
	258	\footnote{This type of interaction between tuple arguments and type parameters is desirable for perfect forwarding, but it's not obvious to me exactly how this should interact with assertion inference. Ideally, the same rules should apply for assertion satisfaction as apply to argument matching (i.e. flattening \& structuring should be attempted, followed by an exact match attempt on failure), but this may be more complicated than it sounds for assertion satisfaction. Aaron, I'm especially interested to hear your thoughts on this with respect to efficiency in the resolver redesign.
	259
	260	For example, should we allow this to match?
	261	\begin{lstlisting}
	262	forall(otype T, otype U \| { T g(U); })
	263	void f(T, U);
	264
	265	[int, int] g(int, ®[int, int]®, int);
	266
	267	f([1, 2], [3, 4, 5, 6]);
	268	\end{lstlisting}
	269	To only have an exact matching rule here feels too strict. At the very least, it would be nice to accept ©[int, int] g(int, int, int, int)©, since that would allow for argument lists to be packaged and sent off to polymorphic functions and then directly forwarded to other functions.}.
	270
	271	The addition of an exact matching rule only affects the outcome for polymorphic type binding when tuples are involved.
	272	For non-tuple arguments, exact matching and flattening \& structuring are equivalent. For tuple arguments to a function without polymorphic formal parameters, flattening and structuring work whenever an exact match would have worked (the tuple is flattened and implicitly restructured to its original structure).
	273	Thus there is nothing to be gained from permitting the exact matching rule to take effect when a function does not contain polymorphism and none of the arguments are tuples.
	274
	275	\subsection{Option 2: Add another type parameter kind}
	276	Perhaps a simpler alternative would be to add another kind of type parameter (e.g., ©ttype©).
	277	There should be at most one ©ttype© parameter which must occur last in a parameter list.
	278	Matching against a ©ttype© parameter would consume/package all remaining argument components into a tuple, and would also match no arguments.
	279	These semantics more closely match normal variadic semantics, while being type-safe. C variadic syntax and ©ttype© polymorphism probably should not be mixed, since it is not clear where to draw the line to decide which arguments belong where.\footnote{In fact, if we go with this proposal, it might be desirable to disallow polymorphic functions to use C variadic syntax to encourage a Cforall style. Aside from maybe calling C variadic functions, it's not obvious to me there would be anything you can do with C variadics that couldn't also be done with ©ttype© parameters. }
	280
	281	Example 1: taken from Wikipedia, demonstrates variadic templates done in a Cforall style
	282	\begin{lstlisting}
	283	void func(void) {} // termination version (1)
	284	forall(otype T, ttype Params \| { void process(const T &); void func(const Params &); })
	285	void func(const T& arg1, const Params & p) { // (2)
	286	process(arg1);
	287	func(p);
	288	}
	289	void process(int); // (3)
	290	void process(double); // (4)
	291	func(1, 2.0, 3.5, 4);
	292	\end{lstlisting}
	293	In the call to ©func©, the value ©1© is taken as the first parameter, so ©T© unifies with ©int©, and the arguments ©2.0©, ©3.5©, and ©4© are consumed to form a tuple argument of type ©[double, double, int]©.
	294	To satisfy the assertions to ©func©, the functions (3) and (2) are implicitly selected to satisfy the requirements of ©void process(const T &)© and ©void func(const Params &)© respectively.
	295	Since (2) requires assertion parameters, the process repeats selecting (4) and (2).
	296	The matching process continues recursively until reaching the base case where (3) and (1) are selected.
	297	The end result is semantically equivalent to ©process(1), process(2.0), process(3.5), process(4)©.
	298
	299	Since (2) is not an exact match for the expected assertion parameter, a thunk is generated that wraps a call to ©func© that accepts an argument of type ©[double, double, int]©.
	300	This conversion already occurs in the Cforall translator, but may require some modification to handle all of the cases present here.
	301
	302	Example 2: new (i.e. type-safe malloc + constructors)
	303	\begin{lstlisting}
	304	forall(dtype T, ttype Params \| sized(T) \| { void ?{}(T *, Params); })
	305	T * new(Params p) {
	306	return malloc(){ p };
	307	}
	308	array(int) * x = new(1, 2, 3, 4);
	309	\end{lstlisting}
	310	In the call to ©new©, ©array(int)© is selected to match ©T©, and ©Params© is expanded ot match ©[int, int, int, int]©. To satisfy the assertions, a constructor with an interface compatible with ©void ?{}(array(int) *, int, int, int, int)© must exist in the current scope.
	311
	312	Assertion inference can also be special cased to match functions that take tuples of any structure only for ttype parameters, if desired.
	313
	314
	315	\subsection{Conclusions}
	316	With either option, we can generate a thunk to perform the conversion from the actual argument's structure to the structure expected by the assertion parameter and that function would be passed as the assertion argument, in a manner similar to the other thunks that are already generated.
	317
	318	I prefer option 2, because it is simpler and I think the semantics are clearer.
	319	I wouldn't be surprised if it was also noticeably more efficient, because of the lack of backtracking.
	320
	321	As a side note, option 1 also requires calls to be written explicitly, e.g. ©array(int) * x = new([1, 2, 3, 4]);©, which isn't particularly appealing.
	322	It shifts the burden from the compiler to the programmer, which is almost always wrong, and doesn't match with the way our tuples can be used elsewhere.
	323	The more I think about it, the more I'm convinced option 1 is the wrong approach, but I'm putting it out anyway in case someone has a good thought on how to make it work correctly.
	324
	325	\addcontentsline{toc}{section}{\refname}
	326	\bibliographystyle{plain}
[5ff188f]	327	\bibliography{pl}
[2298a7b8]	328
	329	%\addcontentsline{toc}{section}{\indexname} % add index name to table of contents
	330	%\begin{theindex}
	331	%Italic page numbers give the location of the main entry for the referenced term.
	332	%Plain page numbers denote uses of the indexed term.
	333	%Entries for grammar non-terminals are italicized.
	334	%A typewriter font is used for grammar terminals and program identifiers.
	335	%\indexspace
	336	%\input{comp_II.ind}
	337	%\end{theindex}
	338
	339	\end{document}

Note: See TracBrowser for help on using the repository browser.

Download in other formats: