Changeset 154fdc8 for doc/rob_thesis/intro.tex
- Timestamp:
- Apr 19, 2017, 10:15:45 AM (9 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
- Children:
- cd348e7
- Parents:
- 221c2de7 (diff), de4ce0e (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the(diff)links above to see all the changes relative to each parent. - File:
-
- 1 edited
-
doc/rob_thesis/intro.tex (modified) (37 diffs)
Legend:
- Unmodified
- Added
- Removed
-
doc/rob_thesis/intro.tex
r221c2de7 r154fdc8 5 5 \section{\CFA Background} 6 6 \label{s:background} 7 \CFA is a modernextension to the C programming language.7 \CFA \footnote{Pronounced ``C-for-all'', and written \CFA or Cforall.} is a modern non-object-oriented extension to the C programming language. 8 8 As it is an extension of C, there is already a wealth of existing C code and principles that govern the design of the language. 9 9 Among the goals set out in the original design of \CFA, four points stand out \cite{Bilson03}. … … 16 16 Therefore, these design principles must be kept in mind throughout the design and development of new language features. 17 17 In order to appeal to existing C programmers, great care must be taken to ensure that new features naturally feel like C. 18 The remainder of this section describes some of the important new features that currently exist in \CFA, to give the reader the necessary context in which the new features presented in this thesis must dovetail. % TODO: harmonize with? 18 These goals ensure existing C code-bases can be converted to \CFA incrementally with minimal effort, and C programmers can productively generate \CFA code without training beyond the features being used. 19 Unfortunately, \CC is actively diverging from C, so incremental additions require significant effort and training, coupled with multiple legacy design-choices that cannot be updated. 20 21 The remainder of this section describes some of the important new features that currently exist in \CFA, to give the reader the necessary context in which the new features presented in this thesis must dovetail. 19 22 20 23 \subsection{C Background} … … 29 32 A a1 = { 1, .y:7, 6 }; 30 33 A a2[4] = { [2]:a0, [0]:a1, { .z:3 } }; 31 // equ vialent to34 // equivalent to 32 35 // A a0 = { 0, 8, 0, 1 }; 33 36 // A a1 = { 1, 0, 7, 6 }; … … 36 39 Designations allow specifying the field to initialize by name, rather than by position. 37 40 Any field not explicitly initialized is initialized as if it had static storage duration \cite[p.~141]{C11}. 38 A designator specifies the current object for initialization, and as such any undesignated sub objects pick up where the last initialization left off.39 For example, in the initialization of @a1@, the initializer of @y@ is @7@, and the unnamed initializer @6@ initializes the next sub object, @z@.40 Later initializers override earlier initializers, so a sub object for which there is more than one initializer is only initailized by its last initializer.41 Th is can be seen in the initialization of @a0@, where @x@ is designated twice, and thus initialized to @8@.42 Note that in \CFA, designations use a colon separator, rather than an equals sign as in C .41 A designator specifies the current object for initialization, and as such any undesignated sub-objects pick up where the last initialization left off. 42 For example, in the initialization of @a1@, the initializer of @y@ is @7@, and the unnamed initializer @6@ initializes the next sub-object, @z@. 43 Later initializers override earlier initializers, so a sub-object for which there is more than one initializer is only initialized by its last initializer. 44 These semantics can be seen in the initialization of @a0@, where @x@ is designated twice, and thus initialized to @8@. 45 Note that in \CFA, designations use a colon separator, rather than an equals sign as in C, because this syntax is one of the few places that conflicts with the new language features. 43 46 44 47 C also provides \emph{compound literal} expressions, which provide a first-class mechanism for creating unnamed objects. … … 53 56 \end{cfacode} 54 57 Compound literals create an unnamed object, and result in an lvalue, so it is legal to assign a value into a compound literal or to take its address \cite[p.~86]{C11}. 55 Syntactically, compound literals look like a cast operator followed by a brace-enclosed initializer, but semantically are different from a C cast, which only applies basic conversions and is never an lvalue.58 Syntactically, compound literals look like a cast operator followed by a brace-enclosed initializer, but semantically are different from a C cast, which only applies basic conversions and coercions and is never an lvalue. 56 59 57 60 \subsection{Overloading} … … 59 62 Overloading is the ability to specify multiple entities with the same name. 60 63 The most common form of overloading is function overloading, wherein multiple functions can be defined with the same name, but with different signatures. 61 Like in \CC, \CFA allows overloading based both on the number of parameters and on the types of parameters. 64 C provides a small amount of built-in overloading, \eg + is overloaded for the basic types. 65 Like in \CC, \CFA allows user-defined overloading based both on the number of parameters and on the types of parameters. 62 66 \begin{cfacode} 63 67 void f(void); // (1) … … 91 95 92 96 There are times when a function should logically return multiple values. 93 Since a function in standard C can only return a single value, a programmer must either take in additional return values by address, or the function's designer must create a wrapper structure t0 package multiple return-values. 97 Since a function in standard C can only return a single value, a programmer must either take in additional return values by address, or the function's designer must create a wrapper structure to package multiple return-values. 98 For example, the first approach: 94 99 \begin{cfacode} 95 100 int f(int * ret) { // returns a value through parameter ret … … 101 106 int res1 = g(&res2); // explicitly pass storage 102 107 \end{cfacode} 103 The former solution is awkward because it requires the caller to explicitly allocate memory for $n$ result variables, even if they are only temporary values used as a subexpression, or even not used at all. 108 is awkward because it requires the caller to explicitly allocate memory for $n$ result variables, even if they are only temporary values used as a subexpression, or even not used at all. 109 The second approach: 104 110 \begin{cfacode} 105 111 struct A { … … 112 118 ... res3.x ... res3.y ... // use result values 113 119 \end{cfacode} 114 The latter approach requires the callerto either learn the field names of the structure or learn the names of helper routines to access the individual return values.115 Both solutions are syntactically unnatural.116 117 In \CFA, it is possible to directly declare a function returning mu tliple values.118 This provides important semantic information to the caller, since return values are only for output.119 \begin{cfacode} 120 [int, int] f() { // don't need to create anew type120 is awkward because the caller has to either learn the field names of the structure or learn the names of helper routines to access the individual return values. 121 Both approaches are syntactically unnatural. 122 123 In \CFA, it is possible to directly declare a function returning multiple values. 124 This extension provides important semantic information to the caller, since return values are only for output. 125 \begin{cfacode} 126 [int, int] f() { // no new type 121 127 return [123, 37]; 122 128 } 123 129 \end{cfacode} 124 However, the ability to return multiple values requires a syntax for accepting the results from a function. 130 However, the ability to return multiple values is useless without a syntax for accepting the results from the function. 131 125 132 In standard C, return values are most commonly assigned directly into local variables, or are used as the arguments to another function call. 126 133 \CFA allows both of these contexts to accept multiple return values. … … 148 155 g(f()); // selects (2) 149 156 \end{cfacode} 150 In this example, the only possible call to @f@ that can produce the two @int@s required by @g@ is the second option.151 A similar reasoning holds for assigning into multiple variables.157 In this example, the only possible call to @f@ that can produce the two @int@s required for assigning into the variables @x@ and @y@ is the second option. 158 A similar reasoning holds calling the function @g@. 152 159 153 160 In \CFA, overloading also applies to operator names, known as \emph{operator overloading}. … … 163 170 \begin{cfacode} 164 171 struct A { int i; }; 165 int ?+?(A x, A y); 172 int ?+?(A x, A y); // '?'s represent operands 166 173 bool ?<?(A x, A y); 167 174 \end{cfacode} 168 Notably, the only difference i n this example is syntax.175 Notably, the only difference is syntax. 169 176 Most of the operators supported by \CC for operator overloading are also supported in \CFA. 170 Of notable exception are the logical operators ( e.g. @||@), the sequence operator (i.e. @,@), and the member-access operators (e.g.@.@ and \lstinline{->}).177 Of notable exception are the logical operators (\eg @||@), the sequence operator (\ie @,@), and the member-access operators (\eg @.@ and \lstinline{->}). 171 178 172 179 Finally, \CFA also permits overloading variable identifiers. 173 180 This feature is not available in \CC. 174 \begin{cfacode} % TODO: pick something better than x? max, zero, one?181 \begin{cfacode} 175 182 struct Rational { int numer, denom; }; 176 183 int x = 3; // (1) … … 186 193 In this example, there are three definitions of the variable @x@. 187 194 Based on the context, \CFA attempts to choose the variable whose type best matches the expression context. 195 When used judiciously, this feature allows names like @MAX@, @MIN@, and @PI@ to apply across many types. 188 196 189 197 Finally, the values @0@ and @1@ have special status in standard C. … … 197 205 } 198 206 \end{cfacode} 199 Every if statement in C compares the condition with @0@, and every increment and decrement operator is semantically equivalent to adding or subtracting the value @1@ and storing the result.207 Every if- and iteration-statement in C compares the condition with @0@, and every increment and decrement operator is semantically equivalent to adding or subtracting the value @1@ and storing the result. 200 208 Due to these rewrite rules, the values @0@ and @1@ have the types \zero and \one in \CFA, which allow for overloading various operations that connect to @0@ and @1@ \footnote{In the original design of \CFA, @0@ and @1@ were overloadable names \cite[p.~7]{cforall}.}. 201 The types \zero and \one have special built in implicit conversions to the various integral types, and a conversion to pointer types for @0@, which allows standard C code involving @0@ and @1@ to work as normal.209 The types \zero and \one have special built-in implicit conversions to the various integral types, and a conversion to pointer types for @0@, which allows standard C code involving @0@ and @1@ to work as normal. 202 210 \begin{cfacode} 203 211 // lvalue is similar to returning a reference in C++ … … 240 248 template<typename T> 241 249 T sum(T *arr, int n) { 242 T t; 250 T t; // default construct => 0 243 251 for (; n > 0; n--) t += arr[n-1]; 244 252 return t; … … 258 266 \end{cfacode} 259 267 The first thing to note here is that immediately following the declaration of @otype T@ is a list of \emph{type assertions} that specify restrictions on acceptable choices of @T@. 260 In particular, the assertions above specify that there must be a an assignment from \zero to @T@ and an addition assignment operator from @T@ to @T@.268 In particular, the assertions above specify that there must be an assignment from \zero to @T@ and an addition assignment operator from @T@ to @T@. 261 269 The existence of an assignment operator from @T@ to @T@ and the ability to create an object of type @T@ are assumed implicitly by declaring @T@ with the @otype@ type-class. 262 270 In addition to @otype@, there are currently two other type-classes. … … 278 286 A major difference between the approaches of \CC and \CFA to polymorphism is that the set of assumed properties for a type is \emph{explicit} in \CFA. 279 287 One of the major limiting factors of \CC's approach is that templates cannot be separately compiled. 280 In contrast, the explicit nature of assertions allows \CFA's polymorphic functions to be separately compiled. 288 In contrast, the explicit nature of assertions allows \CFA's polymorphic functions to be separately compiled, as the function prototype states all necessary requirements separate from the implementation. 289 For example, the prototype for the previous sum function is 290 \begin{cfacode} 291 forall(otype T | **R**{ T ?=?(T *, zero_t); T ?+=?(T *, T); }**R**) 292 T sum(T *arr, int n); 293 \end{cfacode} 294 With this prototype, a caller in another translation unit knows all of the constraints on @T@, and thus knows all of the operations that need to be made available to @sum@. 281 295 282 296 In \CFA, a set of assertions can be factored into a \emph{trait}. … … 293 307 This capability allows specifying the same set of assertions in multiple locations, without the repetition and likelihood of mistakes that come with manually writing them out for each function declaration. 294 308 309 An interesting application of return-type resolution and polymorphism is a type-safe version of @malloc@. 310 \begin{cfacode} 311 forall(dtype T | sized(T)) 312 T * malloc() { 313 return (T*)malloc(sizeof(T)); // call C malloc 314 } 315 int * x = malloc(); // malloc(sizeof(int)) 316 double * y = malloc(); // malloc(sizeof(double)) 317 318 struct S { ... }; 319 S * s = malloc(); // malloc(sizeof(S)) 320 \end{cfacode} 321 The built-in trait @sized@ ensures that size and alignment information for @T@ is available in the body of @malloc@ through @sizeof@ and @_Alignof@ expressions respectively. 322 In calls to @malloc@, the type @T@ is bound based on call-site information, allowing \CFA code to allocate memory without the potential for errors introduced by manually specifying the size of the allocated block. 323 295 324 \section{Invariants} 296 % TODO: discuss software engineering benefits of ctor/dtors: {pre/post} conditions, invariants 297 % an important invariant is the state of the environment (memory, resources) 298 % some objects pass their contract to the object user 299 An \emph{invariant} is a logical assertion that true for some duration of a program's execution. 325 An \emph{invariant} is a logical assertion that is true for some duration of a program's execution. 300 326 Invariants help a programmer to reason about code correctness and prove properties of programs. 301 327 328 \begin{sloppypar} 302 329 In object-oriented programming languages, type invariants are typically established in a constructor and maintained throughout the object's lifetime. 303 Th is is typically achieved through a combination of accesscontrol modifiers and a restricted interface.330 These assertions are typically achieved through a combination of access-control modifiers and a restricted interface. 304 331 Typically, data which requires the maintenance of an invariant is hidden from external sources using the \emph{private} modifier, which restricts reads and writes to a select set of trusted routines, including member functions. 305 332 It is these trusted routines that perform all modifications to internal data in a way that is consistent with the invariant, by ensuring that the invariant holds true at the end of the routine call. 333 \end{sloppypar} 306 334 307 335 In C, the @assert@ macro is often used to ensure invariants are true. 308 336 Using @assert@, the programmer can check a condition and abort execution if the condition is not true. 309 This is a powerful tool thatforces the programmer to deal with logical inconsistencies as they occur.337 This powerful tool forces the programmer to deal with logical inconsistencies as they occur. 310 338 For production, assertions can be removed by simply defining the preprocessor macro @NDEBUG@, making it simple to ensure that assertions are 0-cost for a performance intensive application. 311 339 \begin{cfacode} … … 354 382 \end{dcode} 355 383 The D compiler is able to assume that assertions and invariants hold true and perform optimizations based on those assumptions. 356 357 An important invariant is the state of the execution environment, including the heap, the open file table, the state of global variables, etc. 358 Since resources are finite, it is important to ensure that objects clean up properly when they are finished, restoring the execution environment to a stable state so that new objects can reuse resources. 384 Note, these invariants are internal to the type's correct behaviour. 385 386 Types also have external invariants with the state of the execution environment, including the heap, the open-file table, the state of global variables, etc. 387 Since resources are finite and shared (concurrency), it is important to ensure that objects clean up properly when they are finished, restoring the execution environment to a stable state so that new objects can reuse resources. 359 388 360 389 \section{Resource Management} … … 366 395 The program stack grows and shrinks automatically with each function call, as needed for local variables. 367 396 However, whenever a program needs a variable to outlive the block it is created in, the storage must be allocated dynamically with @malloc@ and later released with @free@. 368 This pattern is extended to more complex objects, such as files and sockets, which also outlive the block where they are created, but at their core isresource management.369 Once allocated storage escapes a block, the responsibility for deallocating the storage is not specified in a function's type, that is, that the return value is owned by the caller.397 This pattern is extended to more complex objects, such as files and sockets, which can also outlive the block where they are created, and thus require their own resource management. 398 Once allocated storage escapes\footnote{In garbage collected languages, such as Java, escape analysis \cite{Choi:1999:EAJ:320385.320386} is used to determine when dynamically allocated objects are strictly contained within a function, which allows the optimizer to allocate them on the stack.} a block, the responsibility for deallocating the storage is not specified in a function's type, that is, that the return value is owned by the caller. 370 399 This implicit convention is provided only through documentation about the expectations of functions. 371 400 372 401 In other languages, a hybrid situation exists where resources escape the allocation block, but ownership is precisely controlled by the language. 373 This pattern requires a strict interface and protocol for a data structure, where the protocol consistsof a pre-initialization and a post-termination call, and all intervening access is done via interface routines.374 This kind of encapsulation is popular in object-oriented programming languages, and like the stack, it contains a significant portion of resourcemanagement cases.402 This pattern requires a strict interface and protocol for a data structure, consisting of a pre-initialization and a post-termination call, and all intervening access is done via interface routines. 403 This kind of encapsulation is popular in object-oriented programming languages, and like the stack, it takes care of a significant portion of resource-management cases. 375 404 376 405 For example, \CC directly supports this pattern through class types and an idiom known as RAII \footnote{Resource Acquisition is Initialization} by means of constructors and destructors. … … 380 409 On the other hand, destructors provide a simple mechanism for tearing down an object and resetting the environment in which the object lived. 381 410 RAII ensures that if all resources are acquired in a constructor and released in a destructor, there are no resource leaks, even in exceptional circumstances. 382 A type with at least one non-trivial constructor or destructor will henceforth bereferred to as a \emph{managed type}.383 In the context of \CFA, a non-trivial constructor is either a user defined constructor or an auto generated constructor that calls a non-trivial constructor.384 385 For the remaining resource ownership cases, programmer must follow a brittle, explicit protocol for freeing resources or an implicit porotocol implemented viathe programming language.411 A type with at least one non-trivial constructor or destructor is henceforth referred to as a \emph{managed type}. 412 In the context of \CFA, a non-trivial constructor is either a user defined constructor or an auto-generated constructor that calls a non-trivial constructor. 413 414 For the remaining resource ownership cases, a programmer must follow a brittle, explicit protocol for freeing resources or an implicit protocol enforced by the programming language. 386 415 387 416 In garbage collected languages, such as Java, resources are largely managed by the garbage collector. 388 Still, garbage collectors aretypically focus only on memory management.417 Still, garbage collectors typically focus only on memory management. 389 418 There are many kinds of resources that the garbage collector does not understand, such as sockets, open files, and database connections. 390 419 In particular, Java supports \emph{finalizers}, which are similar to destructors. 391 Sadly, finalizers come with far fewer guarantees, to the point where a completely conforming JVM may never call a single finalizer. % TODO: citation JVM spec; http://stackoverflow.com/a/2506514/2386739 392 Due to operating system resource limits, this is unacceptable for many long running tasks. % TODO: citation?393 Instead, the paradigm in Java requires programmers manually keep track of all resource\emph{except} memory, leading many novices and experts alike to forget to close files, etc.394 Complicating the picture, uncaught exceptions can cause control flow to change dramatically, leaking a resource which appears on first glance to be closed.420 Unfortunately, finalizers are only guaranteed to be called before an object is reclaimed by the garbage collector \cite[p.~373]{Java8}, which may not happen if memory use is not contentious. 421 Due to operating-system resource-limits, this is unacceptable for many long running programs. 422 Instead, the paradigm in Java requires programmers to manually keep track of all resources \emph{except} memory, leading many novices and experts alike to forget to close files, etc. 423 Complicating the picture, uncaught exceptions can cause control flow to change dramatically, leaking a resource that appears on first glance to be released. 395 424 \begin{javacode} 396 425 void write(String filename, String msg) throws Exception { … … 403 432 } 404 433 \end{javacode} 405 Any line in this program can throw an exception. 406 This leads to a profusion of finally blocks around many function bodies, since it isn't always clear when an exception may be thrown. 434 Any line in this program can throw an exception, which leads to a profusion of finally blocks around many function bodies, since it is not always clear when an exception may be thrown. 407 435 \begin{javacode} 408 436 public void write(String filename, String msg) throws Exception { … … 422 450 \end{javacode} 423 451 In Java 7, a new \emph{try-with-resources} construct was added to alleviate most of the pain of working with resources, but ultimately it still places the burden squarely on the user rather than on the library designer. 424 Furthermore, for complete safety this pattern requires nested objects to be declared separately, otherwise resources which can throw an exception on close can leak nested resources. % TODO: cite oracle article http://www.oracle.com/technetwork/articles/java/trywithresources-401775.html?452 Furthermore, for complete safety this pattern requires nested objects to be declared separately, otherwise resources that can throw an exception on close can leak nested resources \cite{TryWithResources}. 425 453 \begin{javacode} 426 454 public void write(String filename, String msg) throws Exception { 427 try ( 455 try ( // try-with-resources 428 456 FileOutputStream out = new FileOutputStream(filename); 429 457 FileOutputStream log = new FileOutputStream("log.txt"); … … 434 462 } 435 463 \end{javacode} 436 On the other hand, the Java compiler generates more code if more resources are declared, meaning that users must be more familiar with each type and library designers must provide better documentation. 464 Variables declared as part of a try-with-resources statement must conform to the @AutoClosable@ interface, and the compiler implicitly calls @close@ on each of the variables at the end of the block. 465 Depending on when the exception is raised, both @out@ and @log@ are null, @log@ is null, or both are non-null, therefore, the cleanup for these variables at the end is automatically guarded and conditionally executed to prevent null-pointer exceptions. 466 467 While Rust \cite{Rust} does not enforce the use of a garbage collector, it does provide a manual memory management environment, with a strict ownership model that automatically frees allocated memory and prevents common memory management errors. 468 In particular, a variable has ownership over its associated value, which is freed automatically when the owner goes out of scope. 469 Furthermore, values are \emph{moved} by default on assignment, rather than copied, which invalidates the previous variable binding. 470 \begin{rustcode} 471 struct S { 472 x: i32 473 } 474 let s = S { x: 123 }; 475 let z = s; // move, invalidate s 476 println!("{}", s.x); // error, s has been moved 477 \end{rustcode} 478 Types can be made copyable by implementing the @Copy@ trait. 479 480 Rust allows multiple unowned views into an object through references, also known as borrows, provided that a reference does not outlive its referent. 481 A mutable reference is allowed only if it is the only reference to its referent, preventing data race errors and iterator invalidation errors. 482 \begin{rustcode} 483 let mut x = 10; 484 { 485 let y = &x; 486 let z = &x; 487 println!("{} {}", y, z); // prints 10 10 488 } 489 { 490 let y = &mut x; 491 // let z1 = &x; // not allowed, have mutable reference 492 // let z2 = &mut x; // not allowed, have mutable reference 493 *y = 5; 494 println!("{}", y); // prints 5 495 } 496 println!("{}", x); // prints 5 497 \end{rustcode} 498 Since references are not owned, they do not release resources when they go out of scope. 499 There is no runtime cost imposed on these restrictions, since they are enforced at compile-time. 500 501 Rust provides RAII through the @Drop@ trait, allowing arbitrary code to execute when the object goes out of scope, providing automatic clean up of auxiliary resources, much like a \CC program. 502 \begin{rustcode} 503 struct S { 504 name: &'static str 505 } 506 507 impl Drop for S { // RAII for S 508 fn drop(&mut self) { // destructor 509 println!("dropped {}", self.name); 510 } 511 } 512 513 { 514 let x = S { name: "x" }; 515 let y = S { name: "y" }; 516 } // prints "dropped y" "dropped x" 517 \end{rustcode} 437 518 438 519 % D has constructors and destructors that are worth a mention (under classes) https://dlang.org/spec/spec.html … … 442 523 The programming language, D, also manages resources with constructors and destructors \cite{D}. 443 524 In D, @struct@s are stack allocated and managed via scoping like in \CC, whereas @class@es are managed automatically by the garbage collector. 444 Like Java, using the garbage collector means that destructors may never be called, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up.525 Like Java, using the garbage collector means that destructors are called indeterminately, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up. 445 526 Since D supports RAII, it is possible to use the same techniques as in \CC to ensure that resources are released in a timely manner. 446 Finally, D provides a scope guard statement, which allows an arbitrary statement to be executed at normal scope exit with \emph{success}, at exceptional scope exit with \emph{failure}, or at normal and exceptional scope exit with \emph{exit}. % cite? https://dlang.org/spec/statement.html#ScopeGuardStatement 447 It has been shown that the \emph{exit} form of the scope guard statement can be implemented in a library in \CC. % cite: http://www.drdobbs.com/184403758 448 449 % TODO: discussion of lexical scope vs. dynamic 450 % see Peter's suggestions 451 % RAII works in both cases. Guaranteed to work in stack case, works in heap case if root is deleted (but it's dangerous to rely on this, because of exceptions) 527 Finally, D provides a scope guard statement, which allows an arbitrary statement to be executed at normal scope exit with \emph{success}, at exceptional scope exit with \emph{failure}, or at normal and exceptional scope exit with \emph{exit}. % https://dlang.org/spec/statement.html#ScopeGuardStatement 528 It has been shown that the \emph{exit} form of the scope guard statement can be implemented in a library in \CC \cite{ExceptSafe}. 529 530 To provide managed types in \CFA, new kinds of constructors and destructors are added to \CFA and discussed in Chapter 2. 452 531 453 532 \section{Tuples} 454 533 \label{s:Tuples} 455 In mathematics, tuples are finite-length sequences which, unlike sets, a llow duplicate elements.456 In programming languages, tuples are a construct thatprovide fixed-sized heterogeneous lists of elements.534 In mathematics, tuples are finite-length sequences which, unlike sets, are ordered and allow duplicate elements. 535 In programming languages, tuples provide fixed-sized heterogeneous lists of elements. 457 536 Many programming languages have tuple constructs, such as SETL, \KWC, ML, and Scala. 458 537 … … 462 541 Adding tuples to \CFA has previously been explored by Esteves \cite{Esteves04}. 463 542 464 The design of tuples in \KWC took much of its inspiration from SETL .543 The design of tuples in \KWC took much of its inspiration from SETL \cite{SETL}. 465 544 SETL is a high-level mathematical programming language, with tuples being one of the primary data types. 466 545 Tuples in SETL allow a number of operations, including subscripting, dynamic expansion, and multiple assignment. … … 470 549 \begin{cppcode} 471 550 tuple<int, int, int> triple(10, 20, 30); 472 get<1>(triple); // access component 1 => 30551 get<1>(triple); // access component 1 => 20 473 552 474 553 tuple<int, double> f(); … … 482 561 Tuples are simple data structures with few specific operations. 483 562 In particular, it is possible to access a component of a tuple using @std::get<N>@. 484 Another interesting feature is @std::tie@, which creates a tuple of references, which allows assigningthe results of a tuple-returning function into separate local variables, without requiring a temporary variable.563 Another interesting feature is @std::tie@, which creates a tuple of references, allowing assignment of the results of a tuple-returning function into separate local variables, without requiring a temporary variable. 485 564 Tuples also support lexicographic comparisons, making it simple to write aggregate comparators using @std::tie@. 486 565 487 There is a proposal for \CCseventeen called \emph{structured bindings} , that introduces new syntax to eliminate the need to pre-declare variables and use @std::tie@ for binding the results from a function call. % TODO: cite http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0144r0.pdf566 There is a proposal for \CCseventeen called \emph{structured bindings} \cite{StructuredBindings}, that introduces new syntax to eliminate the need to pre-declare variables and use @std::tie@ for binding the results from a function call. 488 567 \begin{cppcode} 489 568 tuple<int, double> f(); … … 492 571 tuple<int, int, int> triple(10, 20, 30); 493 572 auto & [t1, t2, t3] = triple; 494 t2 = 0; // changes triple573 t2 = 0; // changes middle element of triple 495 574 496 575 struct S { int x; double y; }; … … 498 577 auto [x, y] = s; // unpack s 499 578 \end{cppcode} 500 Structured bindings allow unpacking any struct with all public non-static data members into fresh local variables.579 Structured bindings allow unpacking any structure with all public non-static data members into fresh local variables. 501 580 The use of @&@ allows declaring new variables as references, which is something that cannot be done with @std::tie@, since \CC references do not support rebinding. 502 This extension requires the use of @auto@ to infer the types of the new variables, so complicated expressions with a non-obvious type must documented with some other mechanism.581 This extension requires the use of @auto@ to infer the types of the new variables, so complicated expressions with a non-obvious type must be documented with some other mechanism. 503 582 Furthermore, structured bindings are not a full replacement for @std::tie@, as it always declares new variables. 504 583 505 Like \CC, D provides tuples through a library variadic template struct.584 Like \CC, D provides tuples through a library variadic-template structure. 506 585 In D, it is possible to name the fields of a tuple type, which creates a distinct type. 507 \begin{dcode} % TODO: cite http://dlang.org/phobos/std_typecons.html 586 % http://dlang.org/phobos/std_typecons.html 587 \begin{dcode} 508 588 Tuple!(float, "x", float, "y") point2D; 509 Tuple!(float, float) float2; // different type s589 Tuple!(float, float) float2; // different type from point2D 510 590 511 591 point2D[0]; // access first element … … 521 601 The @expand@ method produces the components of the tuple as a list of separate values, making it possible to call a function that takes $N$ arguments using a tuple with $N$ components. 522 602 523 Tuples are a fundamental abstraction in most functional programming languages, such as Standard ML .603 Tuples are a fundamental abstraction in most functional programming languages, such as Standard ML \cite{sml}. 524 604 A function in SML always accepts exactly one argument. 525 605 There are two ways to mimic multiple argument functions: the first through currying and the second by accepting tuple arguments. … … 533 613 \end{smlcode} 534 614 Here, the function @binco@ appears to take 2 arguments, but it actually takes a single argument which is implicitly decomposed via pattern matching. 535 Tuples are a foundational tool in SML, allowing the creation of arbitrarily complex structured datatypes.536 537 Scala, like \CC, provides tuple types through the standard library .615 Tuples are a foundational tool in SML, allowing the creation of arbitrarily-complex structured data-types. 616 617 Scala, like \CC, provides tuple types through the standard library \cite{Scala}. 538 618 Scala provides tuples of size 1 through 22 inclusive through generic data structures. 539 619 Tuples support named access and subscript access, among a few other operations. 540 620 \begin{scalacode} 541 val a = new Tuple3 [Int, String, Double](0, "Text", 2.1)// explicit creation542 val b = (6, 'a', 1.1f) // syntactic sugar forTuple3[Int, Char, Float]621 val a = new Tuple3(0, "Text", 2.1) // explicit creation 622 val b = (6, 'a', 1.1f) // syntactic sugar: Tuple3[Int, Char, Float] 543 623 val (i, _, d) = triple // extractor syntax, ignore middle element 544 624 … … 547 627 \end{scalacode} 548 628 In Scala, tuples are primarily used as simple data structures for carrying around multiple values or for returning multiple values from a function. 549 The 22-element restriction is an odd and arbitrary choice, but in practice it does n't cause problems since large tuples are uncommon.629 The 22-element restriction is an odd and arbitrary choice, but in practice it does not cause problems since large tuples are uncommon. 550 630 Subscript access is provided through the @productElement@ method, which returns a value of the top-type @Any@, since it is impossible to receive a more precise type from a general subscripting method due to type erasure. 551 631 The disparity between named access beginning at @_1@ and subscript access starting at @0@ is likewise an oddity, but subscript access is typically avoided since it discards type information. … … 553 633 554 634 555 \Csharp has similarly strange limitations, allowing tuples of size up to 7 components. % TODO: citehttps://msdn.microsoft.com/en-us/library/system.tuple(v=vs.110).aspx635 \Csharp also has tuples, but has similarly strange limitations, allowing tuples of size up to 7 components. % https://msdn.microsoft.com/en-us/library/system.tuple(v=vs.110).aspx 556 636 The officially supported workaround for this shortcoming is to nest tuples in the 8th component. 557 637 \Csharp allows accessing a component of a tuple by using the field @Item$N$@ for components 1 through 7, and @Rest@ for the nested tuple. 558 638 559 560 % TODO: cite 5.3 https://docs.python.org/3/tutorial/datastructures.html 561 In Python, tuples are immutable sequences that provide packing and unpacking operations. 639 In Python \cite{Python}, tuples are immutable sequences that provide packing and unpacking operations. 562 640 While the tuple itself is immutable, and thus does not allow the assignment of components, there is nothing preventing a component from being internally mutable. 563 641 The components of a tuple can be accessed by unpacking into multiple variables, indexing, or via field name, like D. 564 642 Tuples support multiple assignment through a combination of packing and unpacking, in addition to the common sequence operations. 565 643 566 % TODO: cite https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/Types.html#//apple_ref/doc/uid/TP40014097-CH31-ID448 567 Swift, like D, provides named tuples, with components accessed by name, index, or via extractors. 644 Swift \cite{Swift}, like D, provides named tuples, with components accessed by name, index, or via extractors. 568 645 Tuples are primarily used for returning multiple values from a function. 569 646 In Swift, @Void@ is an alias for the empty tuple, and there are no single element tuples. 647 648 Tuples comparable to those described above are added to \CFA and discussed in Chapter 3. 570 649 571 650 \section{Variadic Functions} … … 581 660 printf("%d %g %c %s", 10, 3.5, 'X', "a string"); 582 661 \end{cfacode} 583 Through the use of a format string, @printf@ allowsC programmers to print any of the standard C data types.662 Through the use of a format string, C programmers can communicate argument type information to @printf@, allowing C programmers to print any of the standard C data types. 584 663 Still, @printf@ is extremely limited, since the format codes are specified by the C standard, meaning users cannot define their own format codes to extend @printf@ for new data types or new formatting rules. 585 664 665 \begin{sloppypar} 586 666 C provides manipulation of variadic arguments through the @va_list@ data type, which abstracts details of the manipulation of variadic arguments. 587 667 Since the variadic arguments are untyped, it is up to the function to interpret any data that is passed in. 588 668 Additionally, the interface to manipulate @va_list@ objects is essentially limited to advancing to the next argument, without any built-in facility to determine when the last argument is read. 589 This requires the use of an \emph{argument descriptor} to pass information to the function about the structure of the argument list, including the number of arguments and their types.669 This limitation requires the use of an \emph{argument descriptor} to pass information to the function about the structure of the argument list, including the number of arguments and their types. 590 670 The format string in @printf@ is one such example of an argument descriptor. 591 671 \begin{cfacode} … … 618 698 Furthermore, if the user makes a mistake, compile-time checking is typically restricted to standard format codes and their corresponding types. 619 699 In general, this means that C's variadic functions are not type-safe, making them difficult to use properly. 700 \end{sloppypar} 620 701 621 702 % When arguments are passed to a variadic function, they undergo \emph{default argument promotions}. … … 641 722 A parameter pack matches 0 or more elements, which can be types or expressions depending on the context. 642 723 Like other templates, variadic template functions rely on an implicit set of constraints on a type, in this example a @print@ routine. 643 That is, it is possible to use the @f@ routine anyany type provided there is a corresponding @print@ routine, making variadic templates fully open to extension, unlike variadic functions in C.724 That is, it is possible to use the @f@ routine on any type provided there is a corresponding @print@ routine, making variadic templates fully open to extension, unlike variadic functions in C. 644 725 645 726 Recent \CC standards (\CCfourteen, \CCseventeen) expand on the basic premise by allowing variadic template variables and providing convenient expansion syntax to remove the need for recursion in some cases, amongst other things. … … 672 753 Unfortunately, Java's use of nominal inheritance means that types must explicitly inherit from classes or interfaces in order to be considered a subclass. 673 754 The combination of these two issues greatly restricts the usefulness of variadic functions in Java. 755 756 Type-safe variadic functions are added to \CFA and discussed in Chapter 4.
Note:
See TracChangeset
for help on using the changeset viewer.