Context Navigation

source: doc/proposals/tuples.md@ 1cc5c6a

Visit:

Last change on this file since 1cc5c6a was 5485bcac, checked in by Andrew Beach <ajbeach@…>, 12 months ago
Updated tuple proposal to include array interations.
Property mode set to `100644`
File size: 38.6 KB

Line
1	Tuples
2	======
3
4	This is a proposal discusses update to Cforall tuples as they exist after
5	the work done by Robert Schluntz (after they were added to Cforall by
6	Rodolfo Esteves and to K-W C by Dave Till). The new features and ideas
7	discussed here are designed to address problems that have appeared as tuples
8	have been used in the past 6 years. Some or all of the changes discussed may
9	already be implemented.
10
11	The core change is breaking the current restructurable tuples into unstructured
12	and structured tuples. Unstructured tuples handle most of the existing uses,
13	with structured tuples filling in a few missing use cases.
14
15	Current State of Tuples
16	-----------------------
17	An overview of the current tuple design is the starting place for the proposal.
18
19	### Tuple Syntax
20
21	Currently, tuples have three main components: tuple types, tuple
22	expressions/values (constructing tuples), and tuple index expressions
23	(deconstructing tuples).
24
25	Current syntax for tuple types:
26
27	- Nullary: [void] or []
28
29	[void] f();
30	[] f();
31
32	- Unary: [TYPE]
33
34	[int] f();
35	[double] x = 1.5;
36
37	- Multiary: [TYPE, TYPE, ...]
38
39	[bool, char] f();
40	[short, int, long] x;
41
42	Tuple types can appear anywhere regular types may be used, except for the
43	nullary tuple which can only appear where void can be used (usually return
44	types), with the exception of special case when void is used to make an empty
45	parameter list.
46
47	Current syntax for tuple expressions:
48
49	- Nullary: (Same as `void`, use return without an expression.)
50
51	[] f() { return; }
52
53	- Unary: EXPR (in return only) or {EXPR}
54
55	[int] f() { return 3; }
56	[int] x = {3};
57
58	- Multiary: [EXPR, EXPR, ...]
59
60	[int, int] f() { return [3, 4]; }
61	[bool, char, int] x = [true, 'a', 3];
62
63	Tuple expressions should be useable whenever an expression of that type is
64	expected. Issues with the current state will be discussed later.
65
66	Current syntax for tuple indexing is an integer constant, where its value
67	selects a tuple member, e.g.:
68
69	[char, int] tup = ['a', 0];
70	char ch = tup.0;
71	int value = tup.1;
72
73	### Mass and Multi-Assignment
74
75	Two special forms of assignment can be used to set values in tuples: mass and
76	multi. Mass assignment assigns every element in the destination tuple to a
77	single source value.
78
79	[int, long, float] dst;
80	int src = 4;
81	dst = src;
82	// Becomes: dst.0 = src; dst.1 = src; dst.2 = src;
83
84	Multi-assignment assigns every element in the destination tuple to the
85	corresponding element in the source tuple. Both tuples must be the same size
86	and the elements pairs must be assignment compatible conversions.
87
88	[long, int, float] dst;
89	[int, char, double] src = [1, 'a', 300.0];
90	dst = src;
91	// Becomes: dst.0 = src.0; dst.1 = src.1; dst.2 = src.1;
92
93	### Tuple Restructuring
94
95	Tuples can be restructured as part of overload resolution. Function calls
96	unpack tuples and repack tuples to match signatures. This semantics is a form
97	of implicit conversion and is considered during overload resolution.
98
99	A simple example is matching multiple parameters of a function to a single
100	argument expression, where each parameter is bound to a different element of
101	the returned tuple.
102
103	[int, int] argFunc();
104	void parmFunc(int a, int b, int c, int d);
105
106	parmFunc(argFunc(), argFunc());
107
108	// Roughly equivilent to:
109	[int, int] fst = argFunc();
110	[int, int] snd = argFunc();
111	parmFunc(fst.0, fst.1, snd.0, snd.1);
112
113	### Tuple Casts
114
115	C-style casts can be used on tuples. These are usually conversion casts (casts
116	that perform operations on the cast type, as opposed to reinterpreting the
117	existing value).
118
119	Tuple casts can remove elements from the end of a tuple and apply a recursive
120	cast to any of the elements. As an example:
121
122	[char, char, char] x = ['a', 'b', 'c'];
123	([int, char])x;
124
125	This casts the first element type of x from a char to an int and drops the last
126	element. The second line can be replaced with the following code, which creates
127	a new tuple by directly casting the kept elements of the old tuple:
128
129	[(int)x.0, (char)x.1];
130
131	Note, tuple casting is more restricted than the implicit tuple restructuring.
132	It cannot do any restructuring beyond dropping the trailing elements of
133	tuples. For example,
134
135	int i; char c; bool b;
136	[i, b, c, i] = [i, [b, c], i];
137	[int, [bool, char], int] w;
138	[i, b, c, i] = w;
139	[i, b, c] = ([int, [bool, char]])w;
140	// But this does not work:
141	[i, b, c, i] = ([int, bool, char, int])w;
142
143	### Polymorphic Tuple Parameters
144
145	One kind of polymorphic parameter is the tuple type parameter, previously
146	noted by the `ttype` keyword and now marked by a tailing ellipses (`...`).
147	Although described as a tuple type, it is usually viewed as its flattened
148	sequence of types.
149
150	// Fixed Length Max - Base Case:
151	forall(T \| { int ?>?( T, T ); })
152	T max(T v1, T v2) { return v1 > v2 ? v1 : v2; }
153
154	// Variable Length Max - Recursive:
155	forall(T, Ts... \| { T max(T, T); T max(Ts); })
156	T max(T arg, Ts args) {
157	return max(arg, max(args));
158	}
159
160	This feature introduces a type name into scope (Ts). It is used as a type and
161	the value it declares (args) is a tuple value, and that the second assertion
162	"T max(Ts);" matches takes a single tuple, but with restructuring it can also
163	match functions that take a series of non-tuple arguments.
164
165	A good way to explain this is to show a partial implementation the following
166	call `max(int_a, int_b, int_c);` into actual code:
167
168	void _thunk0(int v1, int v2) {
169	return max{?>?}(v1, v2);
170	}
171	void _thunk1([int, int] tup) {
172	return max{?>?}(tup.0, tup.1);
173	}
174	max{_thunk0, _thunk1}(int_a, [int_b, int_c]);
175
176	The part to highlight is that "_thunk1", one of the functions created to
177	make an inexact match that can be used as a call site into an exact match
178	so the function can be passed through function parameters.
179	Although, the tuple type `[int, int]` matches a parameter list `int, int`,
180	just with a bit of low lever restructuring.
181
182	In larger cases (with four or more parameters in the first case), the
183	recursive case will work similarly, creating an intermidate function to
184	restructure the tuple into a new call such as:
185
186	void _thunk2([int, int, int] tup) {
187	return max{_thunk0, _thunk1}(tup.0, [tup.1, tup.2]);
188	}
189
190	Issues with the Current State
191	-----------------------------
192	There are a variety of problems with the current implementation which need to
193	be fixed.
194
195	### Tuples are not Objects
196
197	Spoilers: this proposal actually takes them even further away from being
198	objects, but illustrates why the current version is not as useful is it could
199	be.
200
201	Tuples do not have the lifetime operators (copy construction, copy assignment
202	and destructor) that define an object type in Cforall. This means they are
203	not polymorphic otypes, they are objects in C's usage of the word. This means
204	that tuples cannot be used as a single object in
205
206	There are several reasons for this. The fluid nature of tuples (flattening
207	and restructuring) means that matching against a tuple type is fluid in some
208	inconvent ways.
209
210	forall(T, U) void ?{}([T, U] & this;
211	void ?{}([int, int, int] & this);
212	// What constructor(s) should be used here?
213	[int, [int, int]] pair;
214
215	Should the polymorpic constructor be applied twice or should the tuple be
216	restructured and the three element constructor be used? This is not really
217	something a user should be wondering about (especially since tuples should
218	always be simple containers wrapping their elements).
219
220	### Providing TType Arguments is Inconsistent
221
222	The syntax for ttype arguments is slightly inconsistent. It has not come up
223	much yet, because you do not directly provide ttype polymorphic arguments to
224	functions and there are very few existing use-cases for ttype structures.
225
226	forall(T, Us...)
227	void function(T head, Us tail) {}
228
229	forall(T, Us...)
230	struct Type {};
231
232	int a; bool b; char c;
233	function(a, b, c);
234	function(a, [b, c]);
235	// Doesn't work: Type(int, bool, char) x;
236	Type(int, [bool, char]) x;
237
238	The `function` case works either way because the tuple value can be
239	restructured if it does not match exactly. The `Type` case though does not
240	allow restructuring when the type is passed as a type value. This may be
241	the most correct form, but seems surprising compared to the common use.
242
243	### Tuple Arrays Not Supported
244
245	You cannot make an array of tuples. For some reason the type just will not
246	work. Thoretically it could, each tuple type has a size and alignment and
247	could be put in an array. However the following type is not accepted:
248
249	[int, char] array[3] = {{1, 'a'}, {2, 'b'}, {3, 'c'}};
250
251	Now the opposite, a tuple containing an array, does actually work:
252
253	[int[3], char] data = {{1, 2, 3}, 'd'};
254
255	### Unary Tuple Value Syntax
256
257	Unary tuples don't use the standard tuple syntax and one of the two alternate
258	syntax options your may use instead doesn't work consistently. Solving both
259	of these issues would help with consistency.
260
261	### Syntax Conflict
262
263	The tuple syntax conflicts with designators and the C23 (C++-style) attribute
264	syntax.
265
266	int a[10] = { [2] = 3 };
267
268	Here [2] = 3 is an array designator, but has the same syntax as an assignment
269	to a tuple of references, it isn't until the resolver determains that 2 is
270	not a reference that that case can be ruled out.
271
272	[[ unused ]] [[int, int], int] x;
273
274	Here the opening `[[` could be a nested tuple or the beginning of an
275	attribute, the parser can't figure out until later.
276
277	These conflicts break C compatibility goals of Cforall. Designators had to
278	their syntax change and Cforall cannot parse the new attributes.
279	The behaviour of these cases, if they could be parsed, should be unchanged.
280
281	Change in Model
282	---------------
283	This proposal modifies the existing tuples to better handle its main use
284	cases. There are some use cases that are no longer handled, so in addition
285	to changing the existing "restructured tuples" to "unstructured tuples"
286	(or packs) a new "structured tuple" is added in the form of struct tuple.
287
288	The new tuples is even more "unstructured" than before. New tuples are
289	considered a sequence of types or typed entities. These packs are then unpacked
290	into the surrounding context. Put differently, tuples are now flattened as much
291	as possible, with some places (like parameter lists) being treated as an
292	implicit tuple and the tuple being flattened into that.
293
294	Structured tuples are now a separate feature: a structure called "tuple".
295	These are polymorphic structures; an instance should act as a structure, except
296	its fields are accessed using indices instead of field names. Experience so far
297	is that structured tuples are not used often, but fill in the use cases that
298	unstructured tuples no longer support.
299
300	Note that the underlying implementation might not actually look like this,
301	but this is how it should seem to the user.
302
303	Changed Features
304	----------------
305	Some of the concrete changes to the features of the language.
306
307	### Unstructured Tuple / Pack Type
308
309	The evolution of our current tuples (called packs in an attempt to make the
310	names more distinct). They work like the current tuples except they are
311	always flattened. You might still be able to declare nested tuples, but these
312	are not meaningful, they are flattened to a single level and, where
313	approprate, is inlined into the sounding content.
314
315	[bool, [char, int], long] x;
316	// Becomes:
317	[bool, char, int, long] x;
318
319	void f(bool a0, [char, [int, long]] a1, float a2);
320	// Becomes:
321	void f(bool a0, char a1_0, int a1_1, long a1_2, float a2);
322
323	[,] f(int a0, [,] a1, [bool,] a2);
324	// Becomes:
325	void f(int a0, bool a2);
326
327	This actually means that tuples do not participate in overload resolution
328	in the way the do currently. Restructuring is always free because they are
329	always reduced to the same form as part of type matching.
330
331	Packs are still not object types and do not have lifetime functions. They
332	cannot be used as object types, nor as any data type, but they can be still
333	used as ttypes.
334
335	(Unless they need to have lifetime functions for other features.)
336
337	### Structured Tuple Type
338
339	There is a standard library or built-in type named `tuple`, it does not need
340	a special syntax to write types or values. The type definition might need
341	some primitive support, but if supported as a regular type would look
342	something like this:
343
344	forall(Ts...)
345	struct tuple {
346	inline Ts all;
347	};
348
349	This type wraps up an unstructured tuple, in this case a pack of members,
350	and gives it "edges" and so structure. It also gives tuples an interface
351	more like a normal structure type, for the anonymous structure use case.
352
353	The struct tuple does have lifetime functions, can use list initializer
354	syntax and otherwise be treated as a normal structure. You can use regular
355	member access syntax to convert the struct tuple into a unstructured pack,
356	writing `object.all`.
357
358	The `inline` modifier is a new specialized feature to allow you use tuple
359	index expressions directly on the type. With or without inline you should
360	be able to chain a tuple index expression onto the member expression, such
361	as `object.all.1`. The inline specifier allows you to skip the middle
362	and use `object.1`.
363
364	More complex operations can usually be preformed by taking the pack out of
365	the struct tuple and acting on it. Polymorphic operations only have to go one
366	level down to get to base operations, there is no need for recursive
367	construction, nor manually coding different sizes of tuple. This should give
368	most of the ease of working with a primitive tuple, because effectively you
369	are except it has been given fixed edges.
370
371	A tuple struct should also be an object type, and let's you pass multiple
372	values through a single polymorphic slot.
373
374	A note on the naming here, because there are more pieces that need names
375	instead of symbols. The name `tuple` follows because that is the common name,
376	although this means this is now the default tuple in some sense. The name of
377	the pack was picked to work with inline and make the difference between `.0`
378	and `.all` clear. (If inline doesn't work then `get` might be a better name.)
379
380	### Type Qualifier Distribution
381
382	Because tuples are no longer object types, applying type modifiers to them,
383	such as cv-qualifiers and pointers, no longer makes sense. That syntax is
384	now considered distribution, applying the type modifier to each element of
385	the tuple.
386
387	Previously `const [int, bool] &` would mean a const reference to a tuple of an
388	integer and a boolean. Now it means an alias for `[const int &, const bool &]`,
389	a tuple of a reference to a constant integer and a reference to a constant
390	boolean. This also applies to polymorphic tuple type packs `Ts &` in
391	polymorphic functions.
392
393	This new approach can specify restrictions on tuple variables as for a single
394	type variable. For example, this approach can replace the special cased tuple
395	operations multi-assignment (N-to-N) and mass-assignment (1-to-N).
396
397	// Multi-Assignment
398	forall(T, Us... \|
399	{ T & ?=?(T &, T const &); Us & ?=?(Us &, Us const &); })
400	[T, Us] & ?=?(T & dst, Us & dsts, T const & src, Us const & srcs) {
401	dst = src;
402	dsts = srcs;
403	return [dst, dsts];
404	}
405
406	// Mass-Assignment
407	forall(T, U, Vs... \|
408	{ U & ?=?(U &, T const &); Vs & ?=?(Vs &, T const &); })
409	[U, Vs] & ?=?(U & dst, Vs & dsts, T const & src) {
410	dst = src;
411	dsts = src;
412	return [dst, dsts];
413	}
414
415	These may not work exactly as given (for one, the copy assignment assertion
416	in the first function would likely be redundant/conflicting with the implicit
417	assertions on the parameters), but they show the pattern. Multi-assignment
418	also would be very hard to write with simple tuple types, because the
419	relationship between the two halves of the parameter list does have to line
420	up, that cannot be enforced with two different tuples.
421
422	This example doesn't have to be implemented this way because we do have the
423	special case operations for these already.
424
425	### Type Packs
426
427	This approach is not a new feature, but a reframing/extension of existing tuple
428	tuple polymorphic parameters as polymorphic type packs. The old `Vars...`
429	syntax introduces a pack of types into scope. It can be used in much the same
430	way as a tuple, but in some new ways to.
431
432	The primary existing use remains: to use a polymorphic pack in a parameter
433	list, both as part of an assertion and in the signature of the main
434	function. The difference is that this is not an enclosed tuple, but a series of
435	types. The only effective difference this makes is it does not prefer to match
436	another tuple/pack.
437
438	This pattern continues to a parameter defined with a pack of types, which
439	is considered a pack of parameters, and the name it introduces is a pack
440	of variables, or a flattened tuple.
441
442	forall(Params...)
443	void function(Params values);
444
445	New use cases include declarations of members and variables. For example, the
446	creation of a structured tuple structure:
447
448	forall(Fields...)
449	struct tuple {
450	Fields get;
451	};
452
453	This is again, treated as a pack of members. They have the same layout as
454	if they were written out by hand. Now, the name get is still accessed as if
455	it was a regular, singular, member. The result of that expression is a
456	pack expression, a tuple of all the field accesses, which can be used with a
457	tuple index expression to access the underlying members. For example:
458
459	tuple(bool, char, int) data = { ... };
460	// This expression:
461	data.get.2;
462	// Is the same as (if the fields had these names):
463	data.[__anonymous0, __anonymous1, __anonymous2].2;
464
465	For local declarations it works similarly, except the name introduced is
466	directly usable as a tuple instead of going through a member access.
467
468	forall(Objects...)
469	void function( ??? ) {
470	???
471	Objects objs = ???;
472	???
473	}
474
475	### Tuple Pack Restructuring
476
477	Implicit restructuring still occurs, but is not considered as part of
478	overload resolution. The structure of tuples is ignored, where individual
479	tuples begin or end is not considered, just the sequence of types.
480
481	This can be viewed as flattening both sides (the arguments/caller and the
482	parameters/callee) before type resolution. Both any nested packs and packs
483	into any larger context:
484
485	call(false, ['a', [-7,]], 3.14)
486	// Inline Nesting:
487	call(false, ['a', -7], 3.14)
488	// Inline into Context:
489	call(false, 'a', -7, 3.14)
490
491	The main context where tuples are inlined are parameter lists, in that you
492	can consider a parameter list as a single pack. Return types could be viewed
493	in this way, but because they are presented in a single type, they are
494	already wrapped in a tuple.
495
496	[int, [bool, char]] f();
497	// Inline Nesting:
498	[int, bool, char] f();
499
500	This also happens "after" any polymorphic parameters are instantiated.
501
502	Empty packs are also collapsed, logically being removed or replaced with
503	void as in the following example:
504
505	// For example, this function:
506	forall(Ts...) Ts example(int first, Ts middle, int last);
507	// With Ts = [,], should not be treated as:
508	[,] example(int first, [,] middle, int last);
509	// But instead:
510	void example(int first, int last);
511
512	Or closer to it, there are some slight differences between the nullary tuple
513	and void, for instance creating some instances for consistency with the other
514	tuple types.
515
516	The struct tuple is never implicitly restructured.
517
518	### Tuple Declaration/Deconstruction
519
520	Declaring a tuple acts as a pack of variable declarations. When this is done
521	with a written out type (as opposed to a polymorphic parameter above), then the
522	elements of the tuple can be named.
523
524	[int quo, int rem] quorem = divmod(a, b);
525
526	Here `ret` refers to the tuple, the entire pack, while `quo` and `rem`
527	give explicit names to elements of the tuple. Not all the names have to be
528	provided, at the very least, any element name can be omitted if the pack name
529	is provided, and the pack name can be omitted if all the element names are
530	provided. That would ensure every element can be accessed, but it could be
531	reduced even more, effectively immediately dropping some values if there is
532	no named to access it.
533
534	This does not add any new functionality, more it is a qualify of life feature
535	making it very easy to give elements of a tuple descriptive names
536	instead of trying to combine a name for the whole tuple with an index.
537	(`quo` and `rem` vs. `quorem.0` and `quorem.1`)
538
539	### Tuple Casts
540
541	Tuple casts are no longer allowed to do any restructuring. Internal
542	restructuring would not be useful, as there is no internal structure.
543	Dropping tail elements could be added back in but it is a narrow use case
544	so it may be replaced with other features (for example: tuple slicing).
545
546	// Allowed:
547	([bool, short])tuple_bool_char;
548	([int, float])tuple_float_float;
549
550	// Forbidden:
551	([int, int, int])tuple_int_int;
552	([long, [float, int]])tuple_long_float_int;
553
554	### Tuple Array Support
555
556	Arrays of unstructured tuples/packs are not actually required, an array of
557	structures will store the data just as well and this is not the transient
558	context where the unlabeled tuples work well. So it would work fine to
559	not allow an array of packs, but if we do not a better error should be
560	created.
561
562	All that being said, the following could just work.
563
564	[int, char] u_array[3] = {...};
565
566	What should definitely work is an array of struct tuple, because that should
567	work where a generic structure could be used. This is also part of why arrays
568	of packs are not needed.
569
570	tuple([int, char]) s_array[3] = {...};
571
572	### New Tuple Literal Syntax
573
574	In addition to the main redesign of this proposal, the syntax is updated to
575	be more consistant. It is now a "[]" list with "," separators and a ","/comma
576	terminator that is optional for tuples with 2 or more elements (those that
577	already have a comma in them) and manditory in other tuples.
578
579	It should also be symmetric between types and value, differing only in that
580	an element (ITEM below) of a tuple type is a type and the element of a tuple
581	value is a value.
582
583	- Nullary Tuple: [,]
584	- Unary Tuples: [ITEM,]
585	- Multiary Tuples: [ITEM, ITEM] [ITEM, ITEM,] [ITEM, ITEM, ITEM] ...
586
587	Currently, although syntax is provided to write them, but manually writing
588	a nullary or unary tuple should be very rare. This is because a nullary tuple
589	can either be erased or treated as void, while unary tuples can be treated
590	as their element type. Similary for nested tuples, can be treated as the
591	flattened tuple.
592
593	(Update: Orignal nullary tuple proposal was `[]`, but that had some issues.)
594
595	### TType Argument Syntax
596
597	It would be nice to seamlessly provide packs of types in polymorphic
598	argument lists, however doing so requires a new restriction.
599
600	As an example of the change, consider providing arguments to a polymorphic
601	structure with a otype and a ttype parameter.
602
603	forall(T, Us...) struct D {...};
604	D(int, [,]) // D(int)
605	D(bool, [char,]) // D(bool, char)
606	D(long, [int, short]) // D(long, int, short)
607
608	This is a nice syntaxtic improvement for the common case, it does restrict
609	some of the more complex cases, such as cases with two ttypes. There are
610	various restrictions on this for functions. If the rules make it unambigous
611	then the grouping could be omitted, or a more advanced version, it can be
612	ommited only in the indivual cases where it is unambigous.
613
614	### Tuple Slicing (Range Tuple Indexing)
615
616	(Disclaimer: this is a bit more impulsive, see end for notes.)
617
618	This is an extension to tuple indexing. Currently, only single location of a
619	tuple can be index, extracting the element at that location. By extending the
620	index expression to be a list-range a multiple sub-tuples can be extracted.
621
622	['a', 'b', 'c', 'd', 'e', 'f'].0~3
623	['a', 'b', 'c', 'd', 'e', 'f'].1~5~2
624	// Produces:
625	['a', 'b', 'c']
626	['b', 'd']
627
628	In words, the first indexes the first 3 tuple elements (0, 1, 2, stopping
629	before index 3) and the second indexes elements starting at index 1, stoping
630	before index 5 going in steps of 2. This does not allow for arbitary
631	selections, but it covers many common slices.
632
633	To get the new tuple, the range has to be constructed and traversed at compile
634	time. The size of the range is the size of the result tuple, with each element
635	in the result tuple decided by the matching element of the range, which gives
636	the index of the original tuple to place there. The type of the indices may be
637	an integral type, but all indices must be in range, otherwise it is a compile
638	time error.
639
640	In terms of the existing range loops, if you could write this dynamically, it
641	would look something like this:
642
643	// Implementation of: output = input.RANGE
644	dyn output = []
645	for ( index : RANGE ) { output = [output, input.index] }
646
647	The result of the expression is only has to be considered a tuple if the
648	range has two or more values, otherwise it can be considered void or the same
649	as indexing the single value in the range.
650
651	Some closing notes, this is dependent on generalized range expressions.
652	The iterators proposal has a section on some new range types, this feature
653	is intended to be built on that feature. Not simply reuse some of the syntax
654	that is also used in the special for loop. This may add some requirements
655	about the compile-time evaluation on range expressions to make sure these
656	expressions can be evaluated at compile-time.
657
658	Implementation
659	--------------
660	An overview of the implementation details of the new proposal.
661
662	### Structured Tuple
663
664	There is actually a decision point here. If it can be implemented entirely
665	in the standard library, that would be good. If it cannot, it can actually
666	be implemented as the same primitive implementation as we use now and may be
667	how unstructured tuple packs are implemented.
668
669	### Member Pack Implementation
670
671	Implementing member packs can use the same pattern as current built-in tuple
672	syntax. This is a two stage process, first is by ... then the it can be
673	handled by the existing code for polymorphic but not variadic structures.
674
675	forall(Ts...)
676	struct tuple {
677	Ts get;
678	};
679
680	So when used with diffences length an hidden declaration is created which
681	has the pack parameter replaced with the approprate number of object type
682	parameters. Such as in the following examples:
683
684	struct tuple$0 {
685	};
686
687	forall(Ts$0)
688	struct tuple$1 {
689	Ts$0 get$0;
690	};
691
692	forall(Ts$0, Ts$1)
693	struct tuple$2 {
694	Ts$0 get$0;
695	Ts$1 get$1;
696	};
697
698	Then it can be handled by the regular boxing code. Detection of these cases
699	will have to changed because it is no longer a single case. But the pattern
700	does work if the structure has additional fields in it, as they are just
701	placed before and after the pack.
702
703	forall(U, Ts...)
704	struct blob {
705	int header;
706	Ts data;
707	U footer;
708	};
709
710	forall(U, Ts$0, Ts$1)
711	struct blob$2 {
712	int header;
713	Ts$0 data$0;
714	Ts$1 data$1;
715	U footer;
716	};
717
718	The `inline` member pack just effects the resolver. It should be expanded out
719	to the member access and then the tuple index expression.
720
721	### Local Declaration Pack
722
723	Packs of local declarations can be handled very differently if they are
724	concrete or not. Concrete packs can be writen out in full (even if some
725	individual elements are polymorphic) can be inlined directly, expanding out
726	into a series of object declarations.
727
728	[A, B] local = func();
729	call(local);
730
731	A local$0;
732	B local$1;
733	?{}([&local$0, &local$1], func());
734	call([local$0, local$1]);
735
736	In the polymorphic case though, that entire group needs to be reduced to a
737	block that can put into a generic function. Here we can use something like
738	the currently existing implementation, where the entire tuple an opaque
739	bundle, passed to adapters and thunks for handling.
740
741	This is also how tuple declarations are implemented. In the concrete case the
742	element names are real and the pack name is replaced with a tuple expression
743	of the elements. In the polymorphic case the pack name is real and the
744	element names are replaced with tuple index expressions.
745
746	### Unstructured Tuple Implementation
747
748	Unstructured tuples have two general implementation stratagies. Either they
749	are erased and the packs inlined into their context, or they are wrapped
750	up as like a structured tuple, generally using the current implementation.
751
752	For example, if a concrete pack (size and all the types are known), appears
753	in a parameter list the pack can be split and inlined into the argument list.
754
755	void function(int a, [bool, char, short] b, float c);
756	// Becomes:
757	void function(int a, bool b0, char b1, short b2, float c);
758
759	On the other hand, if a polymorphic pack appears in the same place, it will
760	have to be wrapped up in a dynamic structure so it different instances can
761
762	forall(Ts...)
763	void function(int a, Ts b, float c);
764	// Becomes (after the implicit packing and unpacking operations):
765	forall(T)
766	void function(int a, T* b, float c);
767
768	### AST Updates
769
770	The AST may already be capable of representing most of this. I have only
771	identified one feature that definitely will now work in the current AST, and
772	that are the object like tuple declarations with named elements.
773
774	Particularly, an object declaration cannot name elements of the tuple.
775	To this end a new node type, `TupleDecl`,
776	should be added to handle tuple deconstruction declarations
777	(other cases can still be handled with `ObjectDecl`).
778
779	This would act much like a `FunctionDecl` except for tuples, narrowing the
780	possible types, to `TupleType` instances instead of `FunctionType` instances,
781	and storing some additional information. In this case, the names of the
782	elements of the tuples.
783
784	(I'm not actually going to decide the implementation now, but some early
785	examination of the problem suggests that it might be better off wrapping a
786	series of `ObjectDecl` rather than just some strings to hold the names.)
787
788	Related Features in Other Languages
789	-----------------------------------
790	Other languages have similar features. Organized by the related feature,
791	then language (this does mean we can and do hit languages multiple times).
792
793	(In hindsight, I may have gone overboard with the number of examples.)
794
795	### Structured Tuples
796
797	There are many languages with structured tuples. Usually just called tuples,
798	but they usually follow the same rules as a polymorphic anonymous structures,
799	that is to say N types are provided to create a new N-arity tuple. They also
800	usually have some special syntax to index the tuples, because there are no
801	field names.
802
803	#### Rust
804
805	Rust has the standard tuples as a primitive in the language. Each arity of
806	tuple is a different polymorphic type.
807
808	Tuple types and expressions are written with a parenthesized, comma separated
809	list of types or expressions. To avoid confusion with the grouping "(...)",
810	one-element tuples must have a trailing comma (and larger tuples may have a
811	trailing comma).
812
813	const empty: () = ();
814	const single: (usize,) = (12,)
815	const double: (usize, isize) = (34, -56);
816
817	Element access uses similar syntax to field access (that is "."), but uses an
818	integer literal instead of a field name (for example: "pair.1"). Tuples
819	support pattern matching with similar syntax to their expression form.
820
821	Some tuple features only apply up to 12-arity tuples.
822	It is not directly stated, but I believe the difference in the more limited
823	features are implemented in the standard library, one for each arity of tuple.
824
825	- https://doc.rust-lang.org/std/primitive.tuple.html
826	- https://doc.rust-lang.org/reference/types/tuple.html
827	- https://doc.rust-lang.org/reference/expressions/tuple-expr.html
828
829	#### C++
830
831	Implemented as a template type defined in the standard library. No special
832	language features exist to support this, due to the power of C++'s template
833	meta-programming.
834
835	C++ is also one of the few languages with support for variadic polymorphic
836	types, so there is one template that defines all the types. It is written
837	as `std::tuple<TYPE...>`, where "TYPE..." is any number of comma separated
838	types. This is the standard notation for template type instances.
839
840	There is no special syntax for member access of a tuple. A template function is
841	used, e.g., `std::get<0>( tuple )`.
842
843	C++ is an interesting case because it is one of the few cases where no part
844	of the tuple system is built-in. It is entirely created in the standard
845	library using templates. This also has massive error messages when something
846	goes wrong, hopefully our version of struct tuple will reduce the problem.
847
848	C++ also has structured binding, a kind of limited pattern matching. In a
849	structured binding declaration, you can write an auto typed declaration where
850	a list of identifiers in a `[]` replaces the single identifier.
851
852	auto [first, second] = getSize2Tuple();
853
854	The base type has to be `auto` because the base type for each element varies.
855	Still, qualifiers (cv and reference) on the auto will be distributed to types
856	on the individual members.
857
858	The type bound to the binding may be an array, a "plain" structure or any
859	type that implements the tuple-like interface (`std::tuple_size` and
860	`std::tuple_element`).
861
862	- https://en.cppreference.com/w/cpp/utility/tuple
863	- https://en.cppreference.com/w/cpp/language/structured_binding
864
865	#### Haskell
866
867	Haskell has a special syntax for tuples, but otherwise they are completely
868	normal polymorphic types. Because of this, tuples have a maximum size.
869	Haskell (98) supports tuples of up to 15 elements and the standard library
870	has functions for tuples of up to 7 elements.
871
872	The syntax for tuples is a comma separated list of elements. Either element
873	types to create a tuple type, or element values to create a tuple value. The
874	context decides among them, such as `(6, "six")` or `(Bool, Char, Int)`.
875
876	Also all the elements can be removed, getting an expression like "(,)" or
877	"(,,,)", which can be then be used as a function, for a type, or an expression.
878
879	Haskell supports pattern matching as the main way to extract values from a
880	tuple, although helper functions "fst" and "snd" are provided for field
881	access on two element tuples.
882
883	Haskell does not have 0 or 1-element tuples. The nil type, written "()" for
884	both type and value, is effectively the 0 element tuple. There is also a type
885	called "Solo" that is a polymorphic structure with one field and is used as
886	the 1-element tuple, but has regular Haskell data-type syntax.
887
888	- https://www.haskell.org/onlinereport/basic.html
889
890	#### OCaml
891
892	OCaml only supports multi-element (2 or more) tuples. It does have the `unit`
893	type, which has one value, written `()`. Tuple types are written as a '*'
894	separated list of types, tuple values are written as ',' separated list of
895	expressions. Pattern matching tuples is supported and uses the same syntax as
896	values. Parenthesizing these lists is only needed for grouping.
897
898	- https://ocaml.org/docs/basic-data-types#tuples
899
900	#### Swift
901
902	Swift has tuple types that use the basic parenthesized, comma separated list of
903	types or values. It only supports 0 and 2 or more element tuples (the `Void`
904	type is an alias for the empty tuple type).
905
906	Swift also supports named tuples. Names can be added before the tuple element,
907	both for the tuple type and value. The syntax is a name followed by a colon,
908	e.g., `(first: int, second: int)`. These names are a fixed part of the type,
909	and can be used as part of field access notation (otherwise numbers are used
910	in-place of field names `tuple.0` vs. `tuple.first`).
911
912	- https://docs.swift.org/swift-book/documentation/the-swift-programming-language/types/#Tuple-Type
913
914	#### Python
915
916	In Python tuples are immutable lists. Because they are dynamically typed,
917	there is only one tuple type `tuple`.
918
919	It also has various named tuples. The first, namedtuple, allows naming the
920	elements of the tuple. The second, NamedTuple, is actually a way of creating a
921	typed record in a normally untyped language.
922
923	- https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
924	- https://docs.python.org/3/library/collections.html#collections.namedtuple
925	- https://docs.python.org/3/library/typing.html#typing.NamedTuple
926
927	#### LISP
928	As LISP is dynamically typed, its `cons` data type is an untyped pair and is
929	(perhaps infamously) the main constructor of all compound data types. The
930	function `cons` takes two arguments and builds a pair. Functions `car` and
931	`cdr` get the first and second elements of the pair.
932
933	### Packs
934	Packs (or unstructured tuples) are a much less common feature. In fact, there
935	might just be one language, C++, that supports packs. The most common use
936	case for unstructured tuples is returning multiple values, so there is a
937	comparison to languages that have that as a special feature.
938
939	#### Go
940	Go does not have built in tuple types, but it has multi-return syntax that
941	looks like the tuple syntax of many other languages.
942
943	```
944	func main() {
945	i, j := returnIntInt()
946	...
947	}
948
949	func returnIntInt() (int, int) {
950	return 12, 34
951	}
952	```
953
954	Go can unpack multiple return values and pass them all to a function, but
955	the unpacked tuple must match the parameter list exactly.
956
957	```
958	func acceptIntInt(a int, b int) {
959	fmt.Println(a, b)
960	}
961	```
962
963	- https://golangdocs.com/functions-in-golang
964	- https://go.dev/src/go/types/tuple.go
965
966	#### Lua
967	Lua is a scripting language that is dynamically typed and stack based. Although
968	the stack is usually only directly visible in the C-API, it does allow any
969	function to return any number of values, one, zero or more, from any return
970	expression.
971
972	```
973	local funcion f()
974	return 12, 34
975	end
976
977	local i, j = f()
978	```
979
980	The last argument in an argument list can be an expression - function call -
981	with multiple results and all the results are passed to the function after
982	other arguments.
983
984	Because Lua is dynamically typed, multiple return values are allowed anywhere
985	and the length of the value listed is adjusted, discarding any extra values
986	and filling in new values with the value `nil`. This is also used if a
987	function is called with the incorrect number of arguments.
988
989	- https://lua.org/
990	- https://lua.org/manual/5.4/manual.html#3.4.12
991
992	#### C++
993
994	We return to C++ to view template parameter packs. These are the symmetric
995	with our pack feature (and both are even used to construct the structured
996	tuples in the last section).
997
998	C++ template parameter packs are a modification that can be applied to any
999	template parameter, so the parameter now takes a series of arguments instead
1000	of just one. These are used in pack expansion,
1001	which usually expand to a comma separated list, but it can also be a chain of
1002	boolean binary operator applications. For example, if the parameter
1003	`typename... Ts` is in scope, then you can declare `std::tuple<Ts...>` to
1004	introduce a tuple type, if Ts = bool, char, int then the type becomes
1005	`std::tuple<bool, char, int>`.
1006
1007	A common use is for perfect argument forwarding, which shows some different
1008	uses of the pattern:
1009
1010	```
1011	template<typename inner_t>
1012	class Outer {
1013	inner_t inner;
1014	public:
1015	template<typename... Args>
1016	Outer(Args&&... args) : inner(std::forward<Args>(args)...) {}
1017	};
1018	```
1019
1020	In the first application, `Args&&... args` both uses a pack and introduces
1021	another one. `Arg0&& arg0, Arg1&& arg1, Arg2&& arg2, ...` is the pattern
1022	it expands too. The `&&` is used to copy the argument type's reference
1023	qualifier (the default can strip references away).
1024
1025	The second application, `std::forward<Args>(args)...` uses two packs. These
1026	are expanded in parallel, and must be the same length (in this case they
1027	always will be). It also helps show that the `...` is actually the bit doing
1028	the expansion, it is a suffix "operator" to the expansion pattern.
1029
1030	There are also fold expressions that use binary operators to combine a pack
1031	into a single expression. For example, you can write a variadic sum function
1032	that compiles down to the primitive additions.
1033
1034	```
1035	template<typename... Args>
1036	int sum(Args&&... args) {
1037	return (args + ... + 0);
1038	}
1039	```
1040
1041	C++ is about the best you could ask for in this area, but it does a lot of work
1042	at compile time to make this happen.
1043
1044	- https://en.cppreference.com/w/cpp/language/template_parameters
1045	- https://en.cppreference.com/w/cpp/language/parameter_pack
1046	- https://en.cppreference.com/w/cpp/language/fold

Note: See TracBrowser for help on using the repository browser.

Download in other formats: