source: doc/proposals/tuples.md @ 90be0cf

Last change on this file since 90be0cf was 12c4a5f, checked in by Andrew Beach <ajbeach@…>, 3 weeks ago

Folding in feedback to the tuple proposal and rewriting the last rewrite.

  • Property mode set to 100644
File size: 37.6 KB
Line 
1Tuples
2======
3
4This is a proposal discusses update to Cforall tuples as they exist after
5the work done by Robert Schluntz (after they were added to Cforall by
6Rodolfo Esteves and to K-W C by Dave Till). The new features and ideas
7discussed here are designed to address problems that have appeared as tuples
8have been used in the past 6 years. Some or all of the changes discussed may
9already be implemented.
10
11The core change is breaking the current restructurable tuples into unstructured
12and structured tuples. Unstructured tuples handle most of the existing uses,
13with structured tuples filling in a few missing use cases.
14
15Current State of Tuples
16-----------------------
17An overview of the current tuple design is the starting place for the proposal.
18
19### Tuple Syntax
20
21Currently, tuples have three main components: tuple types, tuple
22expressions/values (constructing tuples), and tuple index expressions
23(deconstructing tuples).
24
25Current syntax for tuple types:
26
27-   Nullary: [void] or []
28
29        [void] f();
30        [] f();
31
32-   Unary: [TYPE]
33
34        [int] f();
35        [double] x = 1.5;
36
37-   Multiary: [TYPE, TYPE, ...]
38
39        [bool, char] f();
40        [short, int, long] x;
41
42Tuple types can appear anywhere regular types may be used, except for the
43nullary tuple which can only appear where void can be used (usually return
44types), with the exception of special case when void is used to make an empty
45parameter list.
46
47Current syntax for tuple expressions:
48
49-   Nullary: (Same as `void`, use return without an expression.)
50
51        [] f() { return; }
52
53-   Unary: EXPR (in return only) or {EXPR}
54
55        [int] f() { return 3; }
56        [int] x = {3};
57
58-   Multiary: [EXPR, EXPR, ...]
59
60        [int, int] f() { return [3, 4]; }
61        [bool, char, int] x = [true, 'a', 3];
62
63Tuple expressions should be useable whenever an expression of that type is
64expected. Issues with the current state will be discussed later.
65
66Current syntax for tuple indexing is an integer constant, where its value
67selects a tuple member, e.g.:
68
69        [char, int] tup = ['a', 0];
70        char ch = tup.0;
71        int value = tup.1;
72
73### Mass and Multi-Assignment
74
75Two special forms of assignment can be used to set values in tuples: mass and
76multi.  Mass assignment assigns every element in the destination tuple to a
77single source value.
78
79        [int, long, float] dst;
80        int src = 4;
81        dst = src;
82        // Becomes: dst.0 = src; dst.1 = src; dst.2 = src;
83
84Multi-assignment assigns every element in the destination tuple to the
85corresponding element in the source tuple. Both tuples must be the same size
86and the elements pairs must be assignment compatible conversions.
87
88        [long, int, float] dst;
89        [int, char, double] src = [1, 'a', 300.0];
90        dst = src;
91        // Becomes: dst.0 = src.0; dst.1 = src.1; dst.2 = src.1;
92
93### Tuple Restructuring
94
95Tuples can be restructured as part of overload resolution. Function calls
96unpack tuples and repack tuples to match signatures. This semantics is a form
97of implicit conversion and is considered during overload resolution.
98
99A simple example is matching multiple parameters of a function to a single
100argument expression, where each parameter is bound to a different element of
101the returned tuple.
102
103        [int, int] argFunc();
104        void parmFunc(int a, int b, int c, int d);
105
106        parmFunc(argFunc(), argFunc());
107
108        // Roughly equivilent to:
109        [int, int] fst = argFunc();
110        [int, int] snd = argFunc();
111        parmFunc(fst.0, fst.1, snd.0, snd.1);
112
113### Tuple Casts
114
115C-style casts can be used on tuples. These are usually conversion casts (casts
116that perform operations on the cast type, as opposed to reinterpreting the
117existing value).
118
119Tuple casts can remove elements from the end of a tuple and apply a recursive
120cast to any of the elements. As an example:
121
122        [char, char, char] x = ['a', 'b', 'c'];
123        ([int, char])x;
124
125This casts the first element type of x from a char to an int and drops the last
126element. The second line can be replaced with the following code, which creates
127a new tuple by directly casting the kept elements of the old tuple:
128
129        [(int)x.0, (char)x.1];
130
131Note, tuple casting is more restricted than the implicit tuple restructuring.
132It cannot do any restructuring beyond dropping the trailing elements of
133tuples. For example,
134
135        int i; char c; bool b;
136        [i, b, c, i] = [i, [b, c], i];
137        [int, [bool, char], int] w;
138        [i, b, c, i] = w;
139        [i, b, c] = ([int, [bool, char]])w;
140        // But this does not work:
141        [i, b, c, i] = ([int, bool, char, int])w;
142
143### Polymorphic Tuple Parameters
144
145One kind of polymorphic parameter is the tuple type parameter, previously
146noted by the `ttype` keyword and now marked by a tailing ellipses (`...`).
147Although described as a tuple type, it is usually viewed as its flattened
148sequence of types.
149
150        // Fixed Length Max - Base Case:
151        forall(T | { int ?>?( T, T ); })
152        T max(T v1, T v2) { return v1 > v2 ? v1 : v2; }
153
154        // Variable Length Max - Recursive:
155        forall(T, Ts... | { T max(T, T); T max(Ts); })
156        T max(T arg, Ts args) {
157                return max(arg, max(args));
158        }
159
160This feature introduces a type name into scope (Ts). It is used as a type and
161the value it declares (args) is a tuple value, and that the second assertion
162"T max(Ts);" matches takes a single tuple, but with restructuring it can also
163match functions that take a series of non-tuple arguments.
164
165A good way to explain this is to show a partial implementation the following
166call `max(int_a, int_b, int_c);` into actual code:
167
168        void _thunk0(int v1, int v2) {
169                return max{?>?}(v1, v2);
170        }
171        void _thunk1([int, int] tup) {
172                return max{?>?}(tup.0, tup.1);
173        }
174        max{_thunk0, _thunk1}(int_a, [int_b, int_c]);
175
176The part to highlight is that "_thunk1", one of the functions created to
177make an inexact match that can be used as a call site into an exact match
178so the function can be passed through function parameters.
179Although, the tuple type `[int, int]` matches a parameter list `int, int`,
180just with a bit of low lever restructuring.
181
182In larger cases (with four or more parameters in the first case), the
183recursive case will work similarly, creating an intermidate function to
184restructure the tuple into a new call such as:
185
186        void _thunk2([int, int, int] tup) {
187                return max{_thunk0, _thunk1}(tup.0, [tup.1, tup.2]);
188        }
189
190Issues with the Current State
191-----------------------------
192There are a variety of problems with the current implementation which need to
193be fixed.
194
195### Tuples are not Objects
196
197Spoilers: this proposal actually takes them even further away from being
198objects, but illustrates why the current version is not as useful is it could
199be.
200
201Tuples do not have the lifetime operators (copy construction, copy assignment
202and destructor) that define an object type in Cforall. This means they are
203not polymorphic otypes, they are objects in C's usage of the word. This means
204that tuples cannot be used as a single object in
205
206There are several reasons for this. The fluid nature of tuples (flattening
207and restructuring) means that matching against a tuple type is fluid in some
208inconvent ways.
209
210        forall(T, U) void ?{}([T, U] & this;
211        void ?{}([int, int, int] & this);
212        // What constructor(s) should be used here?
213        [int, [int, int]] pair;
214
215Should the polymorpic constructor be applied twice or should the tuple be
216restructured and the three element constructor be used? This is not really
217something a user should be wondering about (especially since tuples should
218always be simple containers wrapping their elements).
219
220### Providing TType Arguments is Inconsistent
221
222The syntax for ttype arguments is slightly inconsistent. It has not come up
223much yet, because you do not directly provide ttype polymorphic arguments to
224functions and there are very few existing use-cases for ttype structures.
225
226        forall(T, Us...)
227        void function(T head, Us tail) {}
228
229        forall(T, Us...)
230        struct Type {};
231
232        int a;  bool b;  char c;
233        function(a, b, c);
234        function(a, [b, c]);
235        // Doesn't work: Type(int, bool, char) x;
236        Type(int, [bool, char]) x;
237
238The `function` case works either way because the tuple value can be
239restructured if it does not match exactly. The `Type` case though does not
240allow restructuring when the type is passed as a type value. This may be
241the most correct form, but seems surprising compared to the common use.
242
243### Unary Tuple Value Syntax
244
245Unary tuples don't use the standard tuple syntax and one of the two alternate
246syntax options your may use instead doesn't work consistently. Solving both
247of these issues would help with consistency.
248
249### Syntax Conflict
250
251The tuple syntax conflicts with designators and the C23 (C++-style) attribute
252syntax.
253
254        int a[10] = { [2] = 3 };
255
256Here [2] = 3 is an array designator, but has the same syntax as an assignment
257to a tuple of references, it isn't until the resolver determains that 2 is
258not a reference that that case can be ruled out.
259
260        [[ unused ]] [[int, int], int] x;
261
262Here the opening `[[` could be a nested tuple or the beginning of an
263attribute, the parser can't figure out until later.
264
265These conflicts break C compatibility goals of Cforall. Designators had to
266their syntax change and Cforall cannot parse the new attributes.
267The behaviour of these cases, if they could be parsed, should be unchanged.
268
269Change in Model
270---------------
271This proposal modifies the existing tuples to better handle its main use
272cases. There are some use cases that are no longer handled, so in addition
273to changing the existing "restructured tuples" to "unstructured tuples"
274(or packs) a new "structured tuple" is added in the form of struct tuple.
275
276The new tuples is even more "unstructured" than before. New tuples are
277considered a sequence of types or typed entities. These packs are then unpacked
278into the surrounding context. Put differently, tuples are now flattened as much
279as possible, with some places (like parameter lists) being treated as an
280implicit tuple and the tuple being flattened into that.
281
282Structured tuples are now a separate feature: a structure called "tuple".
283These are polymorphic structures; an instance should act as a structure, except
284its fields are accessed using indices instead of field names. Experience so far
285is that structured tuples are not used often, but fill in the use cases that
286unstructured tuples no longer support.
287
288Note that the underlying implementation might not actually look like this,
289but this is how it should seem to the user.
290
291Changed Features
292----------------
293Some of the concrete changes to the features of the language.
294
295### Unstructured Tuple / Pack Type
296
297The evolution of our current tuples (called packs in an attempt to make the
298names more distinct). They work like the current tuples except they are
299always flattened. You might still be able to declare nested tuples, but these
300are not meaningful, they are flattened to a single level and, where
301approprate, is inlined into the sounding content.
302
303        [bool, [char, int], long] x;
304        // Becomes:
305        [bool, char, int, long] x;
306
307        void f(bool a0, [char, [int, long]] a1, float a2);
308        // Becomes:
309        void f(bool a0, char a1_0, int a1_1, long a1_2, float a2);
310
311        [,] f(int a0, [,] a1, [bool,] a2);
312        // Becomes:
313        void f(int a0, bool a2);
314
315This actually means that tuples do not participate in overload resolution
316in the way the do currently. Restructuring is always free because they are
317always reduced to the same form as part of type matching.
318
319Packs are still not object types and do not have lifetime functions. They
320cannot be used as object types, nor as any data type, but they can be still
321used as ttypes.
322
323(Unless they need to have lifetime functions for other features.)
324
325### Structured Tuple Type
326
327There is a standard library or built-in type named `tuple`, it does not need
328a special syntax to write types or values. The type definition might need
329some primitive support, but if supported as a regular type would look
330something like this:
331
332        forall(Ts...)
333        struct tuple {
334                inline Ts all;
335        };
336
337This type wraps up an unstructured tuple, in this case a pack of members,
338and gives it "edges" and so structure. It also gives tuples an interface
339more like a normal structure type, for the anonymous structure use case.
340
341The struct tuple does have lifetime functions, can use list initializer
342syntax and otherwise be treated as a normal structure. You can use regular
343member access syntax to convert the struct tuple into a unstructured pack,
344writing `object.all`.
345
346The `inline` modifier is a new specialized feature to allow you use tuple
347index expressions directly on the type. With or without inline you should
348be able to chain a tuple index expression onto the member expression, such
349as `object.all.1`. The inline specifier allows you to skip the middle
350and use `object.1`.
351
352More complex operations can usually be preformed by taking the pack out of
353the struct tuple and acting on it. Polymorphic operations only have to go one
354level down to get to base operations, there is no need for recursive
355construction, nor manually coding different sizes of tuple. This should give
356most of the ease of working with a primitive tuple, because effectively you
357are except it has been given fixed edges.
358
359A tuple struct should also be an object type, and let's you pass multiple
360values through a single polymorphic slot.
361
362A note on the naming here, because there are more pieces that need names
363instead of symbols. The name `tuple` follows because that is the common name,
364although this means this is now the default tuple in some sense. The name of
365the pack was picked to work with inline and make the difference between `.0`
366and `.all` clear. (If inline doesn't work then `get` might be a better name.)
367
368### Type Qualifier Distribution
369
370Because tuples are no longer object types, applying type modifiers to them,
371such as cv-qualifiers and pointers, no longer makes sense. That syntax is
372now considered distribution, applying the type modifier to each element of
373the tuple.
374
375Previously `const [int, bool] &` would mean a const reference to a tuple of an
376integer and a boolean. Now it means an alias for `[const int &, const bool &]`,
377a tuple of a reference to a constant integer and a reference to a constant
378boolean. This also applies to polymorphic tuple type packs `Ts &` in
379polymorphic functions.
380
381This new approach can specify restrictions on tuple variables as for a single
382type variable. For example, this approach can replace the special cased tuple
383operations multi-assignment (N-to-N) and mass-assignment (1-to-N).
384
385        // Multi-Assignment
386        forall(T, Us... |
387                        { T & ?=?(T &, T const &); Us & ?=?(Us &, Us const &); })
388        [T, Us] & ?=?(T & dst, Us & dsts, T const & src, Us const & srcs) {
389                dst = src;
390                dsts = srcs;
391                return [dst, dsts];
392        }
393
394        // Mass-Assignment
395        forall(T, U, Vs... |
396                        { U & ?=?(U &, T const &); Vs & ?=?(Vs &, T const &); })
397        [U, Vs] & ?=?(U & dst, Vs & dsts, T const & src) {
398                dst = src;
399                dsts = src;
400                return [dst, dsts];
401        }
402
403These may not work exactly as given (for one, the copy assignment assertion
404in the first function would likely be redundant/conflicting with the implicit
405assertions on the parameters), but they show the pattern. Multi-assignment
406also would be very hard to write with simple tuple types, because the
407relationship between the two halves of the parameter list does have to line
408up, that cannot be enforced with two different tuples.
409
410This example doesn't have to be implemented this way because we do have the
411special case operations for these already.
412
413### Type Packs
414
415This approach is not a new feature, but a reframing/extension of existing tuple
416tuple polymorphic parameters as polymorphic type packs. The old `Vars...`
417syntax introduces a pack of types into scope. It can be used in much the same
418way as a tuple, but in some new ways to.
419
420The primary existing use remains: to use a polymorphic pack in a parameter
421list, both as part of an assertion and in the signature of the main
422function. The difference is that this is not an enclosed tuple, but a series of
423types. The only effective difference this makes is it does not prefer to match
424another tuple/pack.
425
426This pattern continues to a parameter defined with a pack of types, which
427is considered a pack of parameters, and the name it introduces is a pack
428of variables, or a flattened tuple.
429
430        forall(Params...)
431        void function(Params values);
432
433New use cases include declarations of members and variables. For example, the
434creation of a structured tuple structure:
435
436        forall(Fields...)
437        struct tuple {
438                Fields get;
439        };
440
441This is again, treated as a pack of members. They have the same layout as
442if they were written out by hand. Now, the name get is still accessed as if
443it was a regular, singular, member. The result of that expression is a
444pack expression, a tuple of all the field accesses, which can be used with a
445tuple index expression to access the underlying members. For example:
446
447        tuple(bool, char, int) data = { ... };
448        // This expression:
449        data.get.2;
450        // Is the same as (if the fields had these names):
451        data.[__anonymous0, __anonymous1, __anonymous2].2;
452
453For local declarations it works similarly, except the name introduced is
454directly usable as a tuple instead of going through a member access.
455
456        forall(Objects...)
457        void function( ??? ) {
458                ???
459                Objects objs = ???;
460                ???
461        }
462
463### Tuple Pack Restructuring
464
465Implicit restructuring still occurs, but is not considered as part of
466overload resolution. The structure of tuples is ignored, where individual
467tuples begin or end is not considered, just the sequence of types.
468
469This can be viewed as flattening both sides (the arguments/caller and the
470parameters/callee) before type resolution. Both any nested packs and packs
471into any larger context:
472
473        call(false, ['a', [-7,]], 3.14)
474        // Inline Nesting:
475        call(false, ['a', -7], 3.14)
476        // Inline into Context:
477        call(false, 'a', -7, 3.14)
478
479The main context where tuples are inlined are parameter lists, in that you
480can consider a parameter list as a single pack. Return types could be viewed
481in this way, but because they are presented in a single type, they are
482already wrapped in a tuple.
483
484        [int, [bool, char]] f();
485        // Inline Nesting:
486        [int, bool, char] f();
487
488This also happens "after" any polymorphic parameters are instantiated.
489
490Empty packs are also collapsed, logically being removed or replaced with
491void as in the following example:
492
493        // For example, this function:
494        forall(Ts...) Ts example(int first, Ts middle, int last);
495        // With Ts = [,], should not be treated as:
496        [,] example(int first, [,] middle, int last);
497        // But instead:
498        void example(int first, int last);
499
500Or closer to it, there are some slight differences between the nullary tuple
501and void, for instance creating some instances for consistency with the other
502tuple types.
503
504The struct tuple is never implicitly restructured.
505
506### Tuple Declaration/Deconstruction
507
508Declaring a tuple acts as a pack of variable declarations. When this is done
509with a written out type (as opposed to a polymorphic parameter above), then the
510elements of the tuple can be named.
511
512        [int quo, int rem] quorem = divmod(a, b);
513
514Here `ret` refers to the tuple, the entire pack, while `quo` and `rem`
515give explicit names to elements of the tuple. Not all the names have to be
516provided, at the very least, any element name can be omitted if the pack name
517is provided, and the pack name can be omitted if all the element names are
518provided. That would ensure every element can be accessed, but it could be
519reduced even more, effectively immediately dropping some values if there is
520no named to access it.
521
522This does not add any new functionality, more it is a qualify of life feature
523making it very easy to give elements of a tuple descriptive names
524instead of trying to combine a name for the whole tuple with an index.
525(`quo` and `rem` vs. `quorem.0` and `quorem.1`)
526
527### Tuple Casts
528
529Tuple casts are no longer allowed to do any restructuring. Internal
530restructuring would not be useful, as there is no internal structure.
531Dropping tail elements could be added back in but it is a narrow use case
532so it may be replaced with other features (for example: tuple slicing).
533
534        // Allowed:
535        ([bool, short])tuple_bool_char;
536        ([int, float])tuple_float_float;
537
538        // Forbidden:
539        ([int, int, int])tuple_int_int;
540        ([long, [float, int]])tuple_long_float_int;
541
542### New Tuple Literal Syntax
543
544In addition to the main redesign of this proposal, the syntax is updated to
545be more consistant. It is now a "[]" list with "," separators and a ","/comma
546terminator that is optional for tuples with 2 or more elements (those that
547already have a comma in them) and manditory in other tuples.
548
549It should also be symmetric between types and value, differing only in that
550an element (ITEM below) of a tuple type is a type and the element of a tuple
551value is a value.
552
553-   Nullary Tuple: [,]
554-   Unary Tuples: [ITEM,]
555-   Multiary Tuples: [ITEM, ITEM] [ITEM, ITEM,] [ITEM, ITEM, ITEM] ...
556
557Currently, although syntax is provided to write them, but manually writing
558a nullary or unary tuple should be very rare. This is because a nullary tuple
559can either be erased or treated as void, while unary tuples can be treated
560as their element type. Similary for nested tuples, can be treated as the
561flattened tuple.
562
563(Update: Orignal nullary tuple proposal was `[]`, but that had some issues.)
564
565### TType Argument Syntax
566
567It would be nice to seamlessly provide packs of types in polymorphic
568argument lists, however doing so requires a new restriction.
569
570As an example of the change, consider providing arguments to a polymorphic
571structure with a otype and a ttype parameter.
572
573        forall(T, Us...) struct D {...};
574        D(int, [,])            // D(int)
575        D(bool, [char,])       // D(bool, char)
576        D(long, [int, short])  // D(long, int, short)
577
578This is a nice syntaxtic improvement for the common case, it does restrict
579some of the more complex cases, such as cases with two ttypes. There are
580various restrictions on this for functions. If the rules make it unambigous
581then the grouping could be omitted, or a more advanced version, it can be
582ommited only in the indivual cases where it is unambigous.
583
584### Tuple Slicing (Range Tuple Indexing)
585
586(Disclaimer: this is a bit more impulsive, see end for notes.)
587
588This is an extension to tuple indexing. Currently, only single location of a
589tuple can be index, extracting the element at that location. By extending the
590index expression to be a list-range a multiple sub-tuples can be extracted.
591
592        ['a', 'b', 'c', 'd', 'e', 'f'].0~3
593        ['a', 'b', 'c', 'd', 'e', 'f'].1~5~2
594        // Produces:
595        ['a', 'b', 'c']
596        ['b', 'd']
597
598In words, the first indexes the first 3 tuple elements (0, 1, 2, stopping
599before index 3) and the second indexes elements starting at index 1, stoping
600before index 5 going in steps of 2. This does not allow for arbitary
601selections, but it covers many common slices.
602
603To get the new tuple, the range has to be constructed and traversed at compile
604time. The size of the range is the size of the result tuple, with each element
605in the result tuple decided by the matching element of the range, which gives
606the index of the original tuple to place there.  The type of the indices may be
607an integral type, but all indices must be in range, otherwise it is a compile
608time error.
609
610In terms of the existing range loops, if you could write this dynamically, it
611would look something like this:
612
613        // Implementation of: output = input.RANGE
614        dyn output = []
615        for ( index : RANGE ) { output = [output, input.index] }
616
617The result of the expression is only has to be considered a tuple if the
618range has two or more values, otherwise it can be considered void or the same
619as indexing the single value in the range.
620
621Some closing notes, this is dependent on generalized range expressions.
622The iterators proposal has a section on some new range types, this feature
623is intended to be built on that feature. Not simply reuse some of the syntax
624that is also used in the special for loop. This may add some requirements
625about the compile-time evaluation on range expressions to make sure these
626expressions can be evaluated at compile-time.
627
628Implementation
629--------------
630An overview of the implementation details of the new proposal.
631
632### Structured Tuple
633
634There is actually a decision point here. If it can be implemented entirely
635in the standard library, that would be good. If it cannot, it can actually
636be implemented as the same primitive implementation as we use now and may be
637how unstructured tuple packs are implemented.
638
639### Member Pack Implementation
640
641Implementing member packs can use the same pattern as current built-in tuple
642syntax. This is a two stage process, first is by ... then the it can be
643handled by the existing code for polymorphic but not variadic structures.
644
645        forall(Ts...)
646        struct tuple {
647                Ts get;
648        };
649
650So when used with diffences length an hidden declaration is created which
651has the pack parameter replaced with the approprate number of object type
652parameters. Such as in the following examples:
653
654        struct tuple$0 {
655        };
656
657        forall(Ts$0)
658        struct tuple$1 {
659                Ts$0 get$0;
660        };
661
662        forall(Ts$0, Ts$1)
663        struct tuple$2 {
664                Ts$0 get$0;
665                Ts$1 get$1;
666        };
667
668Then it can be handled by the regular boxing code. Detection of these cases
669will have to changed because it is no longer a single case. But the pattern
670does work if the structure has additional fields in it, as they are just
671placed before and after the pack.
672
673        forall(U, Ts...)
674        struct blob {
675                int header;
676                Ts data;
677                U footer;
678        };
679
680        forall(U, Ts$0, Ts$1)
681        struct blob$2 {
682                int header;
683                Ts$0 data$0;
684                Ts$1 data$1;
685                U footer;
686        };
687
688The `inline` member pack just effects the resolver. It should be expanded out
689to the member access and then the tuple index expression.
690
691### Local Declaration Pack
692
693Packs of local declarations can be handled very differently if they are
694concrete or not. Concrete packs can be writen out in full (even if some
695individual elements are polymorphic) can be inlined directly, expanding out
696into a series of object declarations.
697
698        [A, B] local = func();
699        call(local);
700
701        A local$0;
702        B local$1;
703        ?{}([&local$0, &local$1], func());
704        call([local$0, local$1]);
705
706In the polymorphic case though, that entire group needs to be reduced to a
707block that can put into a generic function. Here we can use something like
708the currently existing implementation, where the entire tuple an opaque
709bundle, passed to adapters and thunks for handling.
710
711This is also how tuple declarations are implemented. In the concrete case the
712element names are real and the pack name is replaced with a tuple expression
713of the elements. In the polymorphic case the pack name is real and the
714element names are replaced with tuple index expressions.
715
716### Unstructured Tuple Implementation
717
718Unstructured tuples have two general implementation stratagies. Either they
719are erased and the packs inlined into their context, or they are wrapped
720up as like a structured tuple, generally using the current implementation.
721
722For example, if a concrete pack (size and all the types are known), appears
723in a parameter list the pack can be split and inlined into the argument list.
724
725        void function(int a, [bool, char, short] b, float c);
726        // Becomes:
727        void function(int a, bool b0, char b1, short b2, float c);
728
729On the other hand, if a polymorphic pack appears in the same place, it will
730have to be wrapped up in a dynamic structure so it different instances can
731
732        forall(Ts...)
733        void function(int a, Ts b, float c);
734        // Becomes (after the implicit packing and unpacking operations):
735        forall(T)
736        void function(int a, T* b, float c);
737
738### AST Updates
739
740The AST may already be capable of representing most of this. I have only
741identified one feature that definitely will now work in the current AST, and
742that are the object like tuple declarations with named elements.
743
744Particularly, an object declaration cannot name elements of the tuple.
745To this end a new node type, `TupleDecl`,
746should be added to handle tuple deconstruction declarations
747(other cases can still be handled with `ObjectDecl`).
748
749This would act much like a `FunctionDecl` except for tuples, narrowing the
750possible types, to `TupleType` instances instead of `FunctionType` instances,
751and storing some additional information. In this case, the names of the
752elements of the tuples.
753
754(I'm not actually going to decide the implementation now, but some early
755examination of the problem suggests that it might be better off wrapping a
756series of `ObjectDecl` rather than just some strings to hold the names.)
757
758Related Features in Other Languages
759-----------------------------------
760Other languages have similar features. Organized by the related feature,
761then language (this does mean we can and do hit languages multiple times).
762
763(In hindsight, I may have gone overboard with the number of examples.)
764
765### Structured Tuples
766
767There are many languages with structured tuples. Usually just called tuples,
768but they usually follow the same rules as a polymorphic anonymous structures,
769that is to say N types are provided to create a new N-arity tuple. They also
770usually have some special syntax to index the tuples, because there are no
771field names.
772
773#### Rust
774
775Rust has the standard tuples as a primitive in the language. Each arity of
776tuple is a different polymorphic type.
777
778Tuple types and expressions are written with a parenthesized, comma separated
779list of types or expressions. To avoid confusion with the grouping "(...)",
780one-element tuples must have a trailing comma (and larger tuples may have a
781trailing comma).
782
783        const empty: () = ();
784        const single: (usize,) = (12,)
785        const double: (usize, isize) = (34, -56);
786
787Element access uses similar syntax to field access (that is "."), but uses an
788integer literal instead of a field name (for example: "pair.1"). Tuples
789support pattern matching with similar syntax to their expression form.
790
791Some tuple features only apply up to 12-arity tuples.
792It is not directly stated, but I believe the difference in the more limited
793features are implemented in the standard library, one for each arity of tuple.
794
795-   https://doc.rust-lang.org/std/primitive.tuple.html
796-   https://doc.rust-lang.org/reference/types/tuple.html
797-   https://doc.rust-lang.org/reference/expressions/tuple-expr.html
798
799#### C++
800
801Implemented as a template type defined in the standard library. No special
802language features exist to support this, due to the power of C++'s template
803meta-programming.
804
805C++ is also one of the few languages with support for variadic polymorphic
806types, so there is one template that defines all the types. It is written
807as `std::tuple<TYPE...>`, where "TYPE..." is any number of comma separated
808types. This is the standard notation for template type instances.
809
810There is no special syntax for member access of a tuple. A template function is
811used, e.g., `std::get<0>( tuple )`.
812
813C++ is an interesting case because it is one of the few cases where no part
814of the tuple system is built-in. It is entirely created in the standard
815library using templates. This also has massive error messages when something
816goes wrong, hopefully our version of struct tuple will reduce the problem.
817
818C++ also has structured binding, a kind of limited pattern matching. In a
819structured binding declaration, you can write an auto typed declaration where
820a list of identifiers in a `[]` replaces the single identifier.
821
822        auto [first, second] = getSize2Tuple();
823
824The base type has to be `auto` because the base type for each element varies.
825Still, qualifiers (cv and reference) on the auto will be distributed to types
826on the individual members.
827
828The type bound to the binding may be an array, a "plain" structure or any
829type that implements the tuple-like interface (`std::tuple_size` and
830`std::tuple_element`).
831
832-   https://en.cppreference.com/w/cpp/utility/tuple
833-   https://en.cppreference.com/w/cpp/language/structured_binding
834
835#### Haskell
836
837Haskell has a special syntax for tuples, but otherwise they are completely
838normal polymorphic types. Because of this, tuples have a maximum size.
839Haskell (98) supports tuples of up to 15 elements and the standard library
840has functions for tuples of up to 7 elements.
841
842The syntax for tuples is a comma separated list of elements. Either element
843types to create a tuple type, or element values to create a tuple value. The
844context decides among them, such as `(6, "six")` or `(Bool, Char, Int)`.
845
846Also all the elements can be removed, getting an expression like "(,)" or
847"(,,,)", which can be then be used as a function, for a type, or an expression.
848
849Haskell supports pattern matching as the main way to extract values from a
850tuple, although helper functions "fst" and "snd" are provided for field
851access on two element tuples.
852
853Haskell does not have 0 or 1-element tuples. The nil type, written "()" for
854both type and value, is effectively the 0 element tuple. There is also a type
855called "Solo" that is a polymorphic structure with one field and is used as
856the 1-element tuple, but has regular Haskell data-type syntax.
857
858-   https://www.haskell.org/onlinereport/basic.html
859
860#### OCaml
861
862OCaml only supports multi-element (2 or more) tuples.  It does have the `unit`
863type, which has one value, written `()`. Tuple types are written as a '*'
864separated list of types, tuple values are written as ',' separated list of
865expressions. Pattern matching tuples is supported and uses the same syntax as
866values. Parenthesizing these lists is only needed for grouping.
867
868-   https://ocaml.org/docs/basic-data-types#tuples
869
870#### Swift
871
872Swift has tuple types that use the basic parenthesized, comma separated list of
873types or values. It only supports 0 and 2 or more element tuples (the `Void`
874type is an alias for the empty tuple type).
875
876Swift also supports named tuples. Names can be added before the tuple element,
877both for the tuple type and value. The syntax is a name followed by a colon,
878e.g., `(first: int, second: int)`. These names are a fixed part of the type,
879and can be used as part of field access notation (otherwise numbers are used
880in-place of field names `tuple.0` vs. `tuple.first`).
881
882-   https://docs.swift.org/swift-book/documentation/the-swift-programming-language/types/#Tuple-Type
883
884#### Python
885
886In Python tuples are immutable lists. Because they are dynamically typed,
887there is only one tuple type `tuple`.
888
889It also has various named tuples. The first, namedtuple, allows naming the
890elements of the tuple. The second, NamedTuple, is actually a way of creating a
891typed record in a normally untyped language.
892
893-   https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
894-   https://docs.python.org/3/library/collections.html#collections.namedtuple
895-   https://docs.python.org/3/library/typing.html#typing.NamedTuple
896
897#### LISP
898As LISP is dynamically typed, its `cons` data type is an untyped pair and is
899(perhaps infamously) the main constructor of all compound data types.  The
900function `cons` takes two arguments and builds a pair. Functions `car` and
901`cdr` get the first and second elements of the pair.
902
903### Packs
904Packs (or unstructured tuples) are a much less common feature. In fact, there
905might just be one language, C++, that supports packs. The most common use
906case for unstructured tuples is returning multiple values, so there is a
907comparison to languages that have that as a special feature.
908
909#### Go
910Go does not have built in tuple types, but it has multi-return syntax that
911looks like the tuple syntax of many other languages.
912
913```
914func main() {
915        i, j := returnIntInt()
916        ...
917}
918
919func returnIntInt() (int, int) {
920        return 12, 34
921}
922```
923
924Go can unpack multiple return values and pass them all to a function, but
925the unpacked tuple must match the parameter list exactly.
926
927```
928func acceptIntInt(a int, b int) {
929        fmt.Println(a, b)
930}
931```
932
933-   https://golangdocs.com/functions-in-golang
934-   https://go.dev/src/go/types/tuple.go
935
936#### Lua
937Lua is a scripting language that is dynamically typed and stack based. Although
938the stack is usually only directly visible in the C-API, it does allow any
939function to return any number of values, one, zero or more, from any return
940expression.
941
942```
943local funcion f()
944        return 12, 34
945end
946
947local i, j = f()
948```
949
950The last argument in an argument list can be an expression - function call -
951with multiple results and all the results are passed to the function after
952other arguments.
953
954Because Lua is dynamically typed, multiple return values are allowed anywhere
955and the length of the value listed is adjusted, discarding any extra values
956and filling in new values with the value `nil`. This is also used if a
957function is called with the incorrect number of arguments.
958
959-   https://lua.org/
960-   https://lua.org/manual/5.4/manual.html#3.4.12
961
962#### C++
963
964We return to C++ to view template parameter packs. These are the symmetric
965with our pack feature (and both are even used to construct the structured
966tuples in the last section).
967
968C++ template parameter packs are a modification that can be applied to any
969template parameter, so the parameter now takes a series of arguments instead
970of just one. These are used in pack expansion,
971which usually expand to a comma separated list, but it can also be a chain of
972boolean binary operator applications. For example, if the parameter
973`typename... Ts` is in scope, then you can declare `std::tuple<Ts...>` to
974introduce a tuple type, if Ts = bool, char, int then the type becomes
975`std::tuple<bool, char, int>`.
976
977A common use is for perfect argument forwarding, which shows some different
978uses of the pattern:
979
980```
981template<typename inner_t>
982class Outer {
983        inner_t inner;
984public:
985        template<typename... Args>
986        Outer(Args&&... args) : inner(std::forward<Args>(args)...) {}
987};
988```
989
990In the first application, `Args&&... args` both uses a pack and introduces
991another one. `Arg0&& arg0, Arg1&& arg1, Arg2&& arg2, ...` is the pattern
992it expands too. The `&&` is used to copy the argument type's reference
993qualifier (the default can strip references away).
994
995The second application, `std::forward<Args>(args)...` uses two packs. These
996are expanded in parallel, and must be the same length (in this case they
997always will be). It also helps show that the `...` is actually the bit doing
998the expansion, it is a suffix "operator" to the expansion pattern.
999
1000There are also fold expressions that use binary operators to combine a pack
1001into a single expression. For example, you can write a variadic sum function
1002that compiles down to the primitive additions.
1003
1004```
1005template<typename... Args>
1006int sum(Args&&... args) {
1007        return (args + ... + 0);
1008}
1009```
1010
1011C++ is about the best you could ask for in this area, but it does a lot of work
1012at compile time to make this happen.
1013
1014-   https://en.cppreference.com/w/cpp/language/template_parameters
1015-   https://en.cppreference.com/w/cpp/language/parameter_pack
1016-   https://en.cppreference.com/w/cpp/language/fold
Note: See TracBrowser for help on using the repository browser.