| [8baa40aa] | 1 | Iterators | 
|---|
|  | 2 | ========= | 
|---|
|  | 3 | This is the proposal for adding iterators to Cforall and the standard | 
|---|
| [4d3666d] | 4 | library. Iterators provide a common interface for sequences of values in | 
|---|
| [8baa40aa] | 5 | the language. Many inputs and outputs can be described in terms of sequences, | 
|---|
|  | 6 | creating a common interface that can be used in many places. | 
|---|
|  | 7 |  | 
|---|
|  | 8 | Related Traits | 
|---|
|  | 9 | -------------- | 
|---|
|  | 10 | There are two groups of types that interact with this proposal. | 
|---|
|  | 11 |  | 
|---|
| [1f922f4] | 12 | Iterable | 
|---|
|  | 13 |  | 
|---|
|  | 14 | An iterable is a container of some type that has a default iterator that goes | 
|---|
|  | 15 | over its own contents. It may be a logical container, like a mathematical range, | 
|---|
|  | 16 | that does not store elements in memory, but it should have logical elements. | 
|---|
|  | 17 | A trait is defined to standardize how to get that iterator. | 
|---|
|  | 18 |  | 
|---|
|  | 19 | forall(C, I) | 
|---|
|  | 20 | trait is_iterable { | 
|---|
|  | 21 | I get_iterator(C&); | 
|---|
|  | 22 | } | 
|---|
|  | 23 |  | 
|---|
|  | 24 | This is not particularly useful on its own, but is used as standardized glue | 
|---|
|  | 25 | between some other features below. | 
|---|
|  | 26 |  | 
|---|
| [8baa40aa] | 27 | Iterator | 
|---|
|  | 28 |  | 
|---|
| [1f922f4] | 29 | An iterator is an object that can be advanced through a sequence of values. | 
|---|
|  | 30 | An iterator can be checked to see if it is at the end of the sequence, if not | 
|---|
|  | 31 | the next value can be retrieved from it and the iterator advanced. This can | 
|---|
|  | 32 | be done in as little as a single operation: | 
|---|
| [8baa40aa] | 33 |  | 
|---|
| [1f922f4] | 34 | forall(Iter, Elem) | 
|---|
|  | 35 | trait is_iterator { | 
|---|
|  | 36 | maybe(Elem) next(Iter&); | 
|---|
| [8baa40aa] | 37 |  | 
|---|
| [1f922f4] | 38 | // is_iterable(I, I&) | 
|---|
|  | 39 | Iter& get_iterator(Iter&); | 
|---|
|  | 40 | } | 
|---|
| [8baa40aa] | 41 |  | 
|---|
| [1f922f4] | 42 | This interface is not fixed, the function could change or be split into | 
|---|
|  | 43 | multiple functions (ex: `bool next(Elem&, Iter&)`). It should be natural, | 
|---|
|  | 44 | but also efficient to implement. This version will be used for examples in | 
|---|
|  | 45 | the proposal. | 
|---|
| [8baa40aa] | 46 |  | 
|---|
| [1f922f4] | 47 | When next is called, if there are elements left in the iterators, the next | 
|---|
|  | 48 | element is returned in a maybe and the iterator is advanced. If the iterator | 
|---|
|  | 49 | is exhausted nothing is changed and an empty maybe is returned. | 
|---|
|  | 50 |  | 
|---|
|  | 51 | Iterators are also iterables. They must return themselves, this allows | 
|---|
|  | 52 | iterators to be used directly where iterables are expected. | 
|---|
|  | 53 |  | 
|---|
|  | 54 | For-Each Loop | 
|---|
|  | 55 | ------------- | 
|---|
|  | 56 | The main part of this proposal which is part of the language definition, | 
|---|
|  | 57 | the rest is part of the standard library. And this is the new for-each loop: | 
|---|
|  | 58 |  | 
|---|
|  | 59 | for ( IDENTIFIER ; EXPRESSION ) STATEMENT | 
|---|
|  | 60 |  | 
|---|
|  | 61 | While this uses similar syntax to the existing special for loops, but takes | 
|---|
|  | 62 | a standard expression. This expression is evaluated, and then the iterator is | 
|---|
|  | 63 | retrieved from it (this should be the identity function iterators). Each | 
|---|
|  | 64 | iteration of the loop then checks the iterator to see if it is exhausted, if | 
|---|
|  | 65 | not it uses the value to execute the statement. If the iterator is exhausted, | 
|---|
|  | 66 | the loop is completed, and control continues after the loop (or the else | 
|---|
|  | 67 | clause if provided). | 
|---|
|  | 68 |  | 
|---|
|  | 69 | The loop should show the same behaviour as the following code: | 
|---|
|  | 70 | (`typeof(?)` should be decided by the resolver.) | 
|---|
|  | 71 |  | 
|---|
|  | 72 | { | 
|---|
|  | 73 | typeof(?) __iterable = EXPRESSION; | 
|---|
|  | 74 | typeof(?) __iterator = get_iterator(__iterable); | 
|---|
|  | 75 | typeof(?) __next = next(__iterator); | 
|---|
|  | 76 | while (has_value(&__next)) { | 
|---|
|  | 77 | typeof(?) IDENTIFIER = get(&__next) | 
|---|
|  | 78 | STATEMENT | 
|---|
|  | 79 | __next = next(__iterator); | 
|---|
|  | 80 | } | 
|---|
|  | 81 | } | 
|---|
|  | 82 |  | 
|---|
|  | 83 | This new loop can (but may not need to) also support the existing variants, | 
|---|
|  | 84 | where the name is omitted or multiple iterators are advanced in lock-step: | 
|---|
|  | 85 |  | 
|---|
|  | 86 | for ( EXPRESSION ) STATEMENT | 
|---|
|  | 87 | for ( IDENTIFIER0 ; EXPRESSION0 : IDENTIFIER1 ; EXPRESSION1 ) STATEMENT | 
|---|
|  | 88 |  | 
|---|
|  | 89 | These may not be needed. The unnamed variant is a nice short hand. | 
|---|
|  | 90 | The multiple iterator method could be replaced with an iterator transformer | 
|---|
|  | 91 | that has the same behaviour: | 
|---|
| [8baa40aa] | 92 |  | 
|---|
| [1f922f4] | 93 | for (i, j, k; zip(iter, range, container)) { ... } | 
|---|
| [8baa40aa] | 94 |  | 
|---|
| [1f922f4] | 95 | This would require some further extensions, but would also allow single | 
|---|
|  | 96 | iterators to iterate over sets of related values. It also shows how library | 
|---|
|  | 97 | features can be used to extend the for-each loop. | 
|---|
| [8baa40aa] | 98 |  | 
|---|
| [1f922f4] | 99 | Containers as Iterables | 
|---|
|  | 100 | ----------------------- | 
|---|
|  | 101 | The primary iterator application is going over the contents of various | 
|---|
|  | 102 | collections. Most of the collections should support the iterable interface. | 
|---|
|  | 103 | Each of these iterators should go over each element in the container. | 
|---|
|  | 104 |  | 
|---|
|  | 105 | They could support multiple iterator interfaces. For example, a map could go | 
|---|
|  | 106 | over key-value-pairs, or just keys or just values. | 
|---|
|  | 107 |  | 
|---|
|  | 108 | There isn't a lot to say here, except that as the implementation progresses, | 
|---|
|  | 109 | various rules about iterator invalidation should be hammered out. | 
|---|
| [8baa40aa] | 110 |  | 
|---|
|  | 111 | Ranges | 
|---|
|  | 112 | ------ | 
|---|
| [1f922f4] | 113 | The existing range syntax is incompatible with the for-each loop. But this | 
|---|
|  | 114 | can be recovered by making at least some of the range syntax into operators. | 
|---|
|  | 115 | These operators create and return new range types. | 
|---|
|  | 116 |  | 
|---|
|  | 117 | The ranges are types that map a random numeric range. For iteration, a range | 
|---|
|  | 118 | is also considered an iterator over the numbers in the range. | 
|---|
|  | 119 |  | 
|---|
|  | 120 | The there could be up to 6 new operators to create ranges. These are: | 
|---|
|  | 121 | `~` `+~` `~=` `+~=` `-~` `-~=` | 
|---|
|  | 122 |  | 
|---|
|  | 123 | Parsing issues might rule some of these out. Others may be rendered | 
|---|
|  | 124 | irrelevant by iterator transformers. For example, adding a transformer that | 
|---|
|  | 125 | reverses the iterator could replace having downwards range operators. | 
|---|
|  | 126 |  | 
|---|
|  | 127 | The one range that does not involve the operator is the lone number. This is | 
|---|
|  | 128 | a shorthand for a half-open/exclusive range from zero to the given number. | 
|---|
|  | 129 | That is the same behaviour as one (or two ~ and +~) of the existing range | 
|---|
|  | 130 | operators. So it only saves 2-3 characters. | 
|---|
|  | 131 |  | 
|---|
|  | 132 | But if we want to save those characters, there may be a special form that | 
|---|
|  | 133 | could be used (for example, in the unnamed case) or you could treat integers | 
|---|
|  | 134 | as iterables that create iterators that count from zero up to the limit. | 
|---|
|  | 135 |  | 
|---|
|  | 136 | forall(N) struct Upto { N current; N limit; }; | 
|---|
|  | 137 | Upto(int) get_iterator(const int&); | 
|---|
|  | 138 |  | 
|---|
|  | 139 | Enumerations as Iterators | 
|---|
|  | 140 | ------------------------- | 
|---|
|  | 141 | To convert enumerations to the new form, a new syntax will have to be created | 
|---|
|  | 142 | to treat an enumeration type as an expression returning a range. | 
|---|
|  | 143 |  | 
|---|
|  | 144 | Using the unmodified type name seems problematic, it could be done if we | 
|---|
|  | 145 | convert the type expression to some type object that can be passed to the | 
|---|
|  | 146 | get_iterator function. But there are other alternatives. | 
|---|
|  | 147 |  | 
|---|
|  | 148 | An operator could be created to convert the type into an iterable expression. | 
|---|
|  | 149 |  | 
|---|
|  | 150 | range_over(EnumType) | 
|---|
|  | 151 |  | 
|---|
|  | 152 | It is also possible to create a library type that iterates over the range. | 
|---|
|  | 153 | Then we use a compound literal of a nullary type as a flag to get it. | 
|---|
|  | 154 |  | 
|---|
|  | 155 | (ERange(EnumType)){} | 
|---|
|  | 156 |  | 
|---|
|  | 157 | Both of these are placed in the EXPRESSION part of the for-each loop. | 
|---|
|  | 158 |  | 
|---|
|  | 159 | Iterator Transformers | 
|---|
|  | 160 | --------------------- | 
|---|
|  | 161 | New library features could be created to work with iterators. One of the | 
|---|
|  | 162 | biggest groups are transformers, which allow for more complicated compound | 
|---|
|  | 163 | operations on iterators. | 
|---|
| [8baa40aa] | 164 |  | 
|---|
| [1f922f4] | 165 | For a few examples: | 
|---|
| [8baa40aa] | 166 |  | 
|---|
| [1f922f4] | 167 | // Create a new iterator that transforms elements from a source iterator. | 
|---|
|  | 168 | forall(I, T, U | is_iterator(I, T)) | 
|---|
|  | 169 | /* Iterator over U */ map(U (*func)(T), I iter); | 
|---|
| [0e3f80d] | 170 |  | 
|---|
| [1f922f4] | 171 | // Create an iterator that keeps only elements that satisfy a predicate. | 
|---|
|  | 172 | forall(I, T | is_iterator(I, T)) | 
|---|
|  | 173 | /* Iterator over T */ filter(bool (*pred)(const T&), I iter); | 
|---|
| [8baa40aa] | 174 |  | 
|---|
| [1f922f4] | 175 | forall(Is..., Ts... | is_iterator(Is, Ts)...) | 
|---|
|  | 176 | /* Iterator over [Ts] */ zip(Is iters); | 
|---|
| [8baa40aa] | 177 |  | 
|---|
| [1f922f4] | 178 | forall(Index, I, T | is_serial(Index) | is_iterator(I, T)) | 
|---|
|  | 179 | /* Iterator over [Index, T]) */ enumerate(Index, I); | 
|---|
| [8baa40aa] | 180 |  | 
|---|
| [1f922f4] | 181 | The return type is omitted in each case, because usually these functions end | 
|---|
|  | 182 | up being thin wrappers around a constructor for a specialized type that is | 
|---|
|  | 183 | the iterator, in each case, this is the return type. | 
|---|
| [8baa40aa] | 184 |  | 
|---|
| [1f922f4] | 185 | Also, these should be over iterables (with the conversion from iterable to | 
|---|
|  | 186 | iterator part of the thin wrapper), not just iterators. But the examples are | 
|---|
|  | 187 | easier to write out this way. | 
|---|
|  | 188 |  | 
|---|
|  | 189 | Here is one example expanded to cover those details: | 
|---|
|  | 190 |  | 
|---|
|  | 191 | forall(Iter, Elem) | 
|---|
|  | 192 | struct Filter { | 
|---|
|  | 193 | bool (*pred)(const Elem&); | 
|---|
|  | 194 | Iter iterator; | 
|---|
|  | 195 | } | 
|---|
|  | 196 |  | 
|---|
|  | 197 | forall(Iter, Elem | is_iterator(Iter, Elem)) | 
|---|
|  | 198 | maybe(Elem) next(Filter(Iter, Elem) & this) { | 
|---|
|  | 199 | maybe(Elem) ret; | 
|---|
|  | 200 | do { | 
|---|
|  | 201 | ret = next(this.iterator); | 
|---|
|  | 202 | } while (has_value(ret) && this.pred(get(&ret))); | 
|---|
|  | 203 | return ret; | 
|---|
|  | 204 | } | 
|---|
|  | 205 |  | 
|---|
|  | 206 | forall(T, Iter, Elem | is_iterable(T, Iter) | is_iterator(Iter, Elem)) | 
|---|
|  | 207 | Filter(Iter, Elem) filter(bool (*pred)(const Elem&), T& iterable) { | 
|---|
|  | 208 | return (Filter(Iter, Elem)){ pred, get_iterator(iterable) }; | 
|---|
|  | 209 | } | 
|---|
|  | 210 |  | 
|---|
|  | 211 | Iterator Consumers | 
|---|
|  | 212 | ------------------ | 
|---|
|  | 213 | Eventually we will have to turn the iterators into something else. Iterator | 
|---|
|  | 214 | consumers take an iterator and turn it into something else. The biggest of | 
|---|
|  | 215 | these is the for-each loop, as described above. In addition various library | 
|---|
|  | 216 | functions can be added. | 
|---|
|  | 217 |  | 
|---|
|  | 218 | One group of functions are container constructors which take everything from | 
|---|
|  | 219 | the iterable and build a new container out of it. There would be various | 
|---|
|  | 220 | instances of this because you will need one for each container type. | 
|---|
|  | 221 |  | 
|---|
|  | 222 | Higher order folding functions (for example, the functional | 
|---|
|  | 223 | `foldr : (a -> b -> b) -> b -> [a]`) are a bit less useful since Cforall does | 
|---|
|  | 224 | not work with anonymous functions quite as well. However, some of the | 
|---|
|  | 225 | specialized versions may still be useful. For example, getting the sum of an | 
|---|
|  | 226 | iterable's elements. | 
|---|
|  | 227 |  | 
|---|
|  | 228 | forall(T, I, N | is_iterable(T, I) | is_iterator(I, N) | | 
|---|
|  | 229 | { void ?{}(N&, zero_t); N ?+=?(N&, N); }) | 
|---|
|  | 230 | N sum(T & iterable) { | 
|---|
|  | 231 | N acc = 0; | 
|---|
|  | 232 | for ( i ; iterable ) acc += i; | 
|---|
|  | 233 | return acc; | 
|---|
|  | 234 | } | 
|---|
|  | 235 |  | 
|---|
|  | 236 | Iterator Producers | 
|---|
|  | 237 | ------------------ | 
|---|
|  | 238 | Another group of library features are iterator producers, that take some | 
|---|
|  | 239 | data and create an iterator out of it. Many of already discussed cases are | 
|---|
|  | 240 | actually examples of this (containers, ranges and enumerations). | 
|---|
|  | 241 |  | 
|---|
|  | 242 | There are a few more cases that might be useful. Such as "repeat", which | 
|---|
|  | 243 | takes a single element and iterates over it infinitely. This is usually | 
|---|
|  | 244 | combined with other iterators and not used directly. | 
|---|
|  | 245 |  | 
|---|
|  | 246 | Other Improvements to Cforall | 
|---|
|  | 247 | ---------------------------- | 
|---|
|  | 248 | All code snippets should be considered pseudo-code because they require some | 
|---|
|  | 249 | features that do not currently exist. There are also some that just run into | 
|---|
|  | 250 | current bugs. For instance, my proof of concept for ERange encountered a few | 
|---|
|  | 251 | warnings even after I got it working. | 
|---|
|  | 252 |  | 
|---|
|  | 253 | This proposal might also benefit from associated types. Both is_iterable and | 
|---|
|  | 254 | is_iterator may not need to be generic over all their type parameters. | 
|---|
|  | 255 |  | 
|---|
|  | 256 | Tuple enhancements might also help with iterators that want to return more | 
|---|
|  | 257 | than one value. | 
|---|
| [8baa40aa] | 258 |  | 
|---|
|  | 259 | Related Work | 
|---|
|  | 260 | ------------ | 
|---|
| [1f922f4] | 261 | Python has a robust iterator tool set. It uses the same iterator / iterable | 
|---|
|  | 262 | divide as described above and has the interfaces (note, Python uses duck | 
|---|
|  | 263 | typing instead of defined interfaces, so these are "imaginary"): | 
|---|
|  | 264 |  | 
|---|
|  | 265 | class Iterable: | 
|---|
|  | 266 |  | 
|---|
|  | 267 | def __iter__(self): | 
|---|
|  | 268 | # Return an iterator over the iterable's contents. | 
|---|
|  | 269 |  | 
|---|
|  | 270 | class Iterator(Iterable): | 
|---|
|  | 271 |  | 
|---|
|  | 272 | def __iter__(self): | 
|---|
|  | 273 | # Iterators are iterable over themselves. | 
|---|
|  | 274 | return self | 
|---|
|  | 275 |  | 
|---|
|  | 276 | def __next__(self): | 
|---|
|  | 277 | # Return next value if present and advance internal state. | 
|---|
|  | 278 | # Otherwise raise StopIteration exception. | 
|---|
|  | 279 |  | 
|---|
|  | 280 | It uses this in for loops, the for loop takes an iterable gets the iterable, | 
|---|
|  | 281 | and iterates until the iterable is exhausted. Iterators are iterables so they | 
|---|
|  | 282 | can be passed in directly as well. | 
|---|
|  | 283 |  | 
|---|
|  | 284 | It also has a `range` built-in which handles integer loops, although only | 
|---|
|  | 285 | increasing half-open ranges. | 
|---|
| [0e3f80d] | 286 |  | 
|---|
|  | 287 | In addition, it has many dedicated iterator constructors and transformers, | 
|---|
|  | 288 | and many containers can both produce and be constructed from iterators. | 
|---|
| [1f922f4] | 289 | For example, the chain function goes though a series of iterators in sequence | 
|---|
|  | 290 | an can be used by: | 
|---|
| [8baa40aa] | 291 |  | 
|---|
| [1f922f4] | 292 | import itertools | 
|---|
|  | 293 | itertools.chain(iter_a, iter_b) | 
|---|
|  | 294 |  | 
|---|
|  | 295 | Python enumerations are implemented in the standard library and are iterables | 
|---|
|  | 296 | over all their possible values. There is no special construct for this, in | 
|---|
|  | 297 | Python types are object, and the type of an enum type object is EnumType | 
|---|
|  | 298 | which supports the iterable interface. | 
|---|
|  | 299 |  | 
|---|
|  | 300 | for elem in SomeEnum: | 
|---|
|  | 301 | print(elem) | 
|---|
|  | 302 |  | 
|---|
|  | 303 | Python References | 
|---|
| [8baa40aa] | 304 | +   https://docs.python.org/3/reference/datamodel.html#object.__iter__ | 
|---|
|  | 305 | +   https://docs.python.org/3/library/functions.html#func-range | 
|---|
| [1f922f4] | 306 | +   https://docs.python.org/3/library/enum.html | 
|---|
| [8baa40aa] | 307 |  | 
|---|
| [0e3f80d] | 308 | C++ has many iterator tools at well, except for the fact it's "iterators" are | 
|---|
| [8baa40aa] | 309 | not what are usually called iterators (as above) but rather an abstraction of | 
|---|
| [0e3f80d] | 310 | pointers. The notable missing feature is that a single iterator has no | 
|---|
|  | 311 | concept of being empty or not, instead it must be compared to the end | 
|---|
|  | 312 | iterator. | 
|---|
|  | 313 |  | 
|---|
|  | 314 | However, C++ ranges have an interface much more similar to iterators. | 
|---|
|  | 315 | They do appear to be a wrapper around the "pointer" iterators. | 
|---|
|  | 316 |  | 
|---|
| [1f922f4] | 317 | template< class T > | 
|---|
|  | 318 | concept range = requires( T& t ) { | 
|---|
|  | 319 | ranges::begin(t); | 
|---|
|  | 320 | ranges::end(t); | 
|---|
|  | 321 | }; | 
|---|
|  | 322 |  | 
|---|
|  | 323 | C++ also has "structured bindings" which can be used to unpack an object and | 
|---|
|  | 324 | bind its components to different names. This can be used to iterate over | 
|---|
|  | 325 | multiple values simultaneously. In this example from the Cforall compiler | 
|---|
|  | 326 | using a custom utility function written without special support. | 
|---|
|  | 327 |  | 
|---|
|  | 328 | for ( auto const & [index, handler] : enumerate( terminate_handlers ) ) { | 
|---|
|  | 329 | ... | 
|---|
|  | 330 | } | 
|---|
|  | 331 |  | 
|---|
|  | 332 | C++ References | 
|---|
| [0e3f80d] | 333 | +   https://en.cppreference.com/w/cpp/ranges | 
|---|
| [1f922f4] | 334 | +   https://en.cppreference.com/w/cpp/language/structured_binding | 
|---|
| [8baa40aa] | 335 |  | 
|---|
| [4d3666d] | 336 | Rust also has a imperative implementation of a functional style of iterators, | 
|---|
| [1f922f4] | 337 | with iterator tools implemented as "Provided Methods". Provided methods | 
|---|
| [8baa40aa] | 338 | including a great number of standard transformers. Otherwise, it is very | 
|---|
|  | 339 | similar to Python. | 
|---|
|  | 340 |  | 
|---|
| [1f922f4] | 341 | pub trait Iterator { | 
|---|
|  | 342 | type Item; | 
|---|
|  | 343 |  | 
|---|
|  | 344 | fn next(&mut self) -> Option<Self::Item>; | 
|---|
|  | 345 |  | 
|---|
|  | 346 | // Many provided methods ... | 
|---|
|  | 347 | } | 
|---|
|  | 348 |  | 
|---|
|  | 349 | Not that Rust has many of its iterator transformers stored as methods on the | 
|---|
|  | 350 | iterator. These methods are "provided" so you only have to define next, but | 
|---|
|  | 351 | you can redefine the other methods to provide optimizations. | 
|---|
|  | 352 |  | 
|---|
|  | 353 | Rust also has range expressions. Based around `start .. end` and | 
|---|
|  | 354 | `start ..= end`, with the sub-expressions giving the start and end of the | 
|---|
|  | 355 | range being optional, except for end after `..=`. If start is omitted, the | 
|---|
|  | 356 | range has no lower bound, otherwise the range uses start as its inclusive | 
|---|
|  | 357 | lower bound. It is the same case for end and the upper bound, except that the | 
|---|
|  | 358 | `..=` will have an inclusive upper bound instead of an exclusive one. | 
|---|
|  | 359 |  | 
|---|
|  | 360 | Rust achieves ordering and step control by iterator transformers. `.rev()` | 
|---|
|  | 361 | allows you to reverse a range and `.step_by(step)` provides a positive step | 
|---|
|  | 362 | value. | 
|---|
|  | 363 |  | 
|---|
|  | 364 | Rust Reference | 
|---|
|  | 365 | +   https://doc.rust-lang.org/std/iter/trait.Iterator.html | 
|---|
|  | 366 | +   https://doc.rust-lang.org/reference/expressions/range-expr.html | 
|---|
|  | 367 | +   https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.rev | 
|---|