Ignore:
Timestamp:
Apr 7, 2017, 6:25:23 PM (7 years ago)
Author:
Rob Schluntz <rschlunt@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
2ccb93c
Parents:
c51b5a3
Message:

thesis conclusions and editting pass

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/rob_thesis/intro.tex

    rc51b5a3 rf92aa32  
    55\section{\CFA Background}
    66\label{s:background}
    7 \CFA is a modern non-object-oriented extension to the C programming language.
     7\CFA \footnote{Pronounced ``C-for-all'', and written \CFA or Cforall.} is a modern non-object-oriented extension to the C programming language.
    88As it is an extension of C, there is already a wealth of existing C code and principles that govern the design of the language.
    99Among the goals set out in the original design of \CFA, four points stand out \cite{Bilson03}.
     
    2929A a1 = { 1, .y:7, 6 };
    3030A a2[4] = { [2]:a0, [0]:a1, { .z:3 } };
    31 // equvialent to
     31// equivalent to
    3232// A a0 = { 0, 8, 0, 1 };
    3333// A a1 = { 1, 0, 7, 6 };
     
    3636Designations allow specifying the field to initialize by name, rather than by position.
    3737Any field not explicitly initialized is initialized as if it had static storage duration \cite[p.~141]{C11}.
    38 A designator specifies the current object for initialization, and as such any undesignated subobjects pick up where the last initialization left off.
    39 For example, in the initialization of @a1@, the initializer of @y@ is @7@, and the unnamed initializer @6@ initializes the next subobject, @z@.
    40 Later initializers override earlier initializers, so a subobject for which there is more than one initializer is only initailized by its last initializer.
     38A designator specifies the current object for initialization, and as such any undesignated sub-objects pick up where the last initialization left off.
     39For example, in the initialization of @a1@, the initializer of @y@ is @7@, and the unnamed initializer @6@ initializes the next sub-object, @z@.
     40Later initializers override earlier initializers, so a sub-object for which there is more than one initializer is only initialized by its last initializer.
    4141These semantics can be seen in the initialization of @a0@, where @x@ is designated twice, and thus initialized to @8@.
    4242Note that in \CFA, designations use a colon separator, rather than an equals sign as in C, because this syntax is one of the few places that conflicts with the new language features.
     
    116116Both solutions are syntactically unnatural.
    117117
    118 In \CFA, it is possible to directly declare a function returning mutliple values.
     118In \CFA, it is possible to directly declare a function returning multiple values.
    119119This extension provides important semantic information to the caller, since return values are only for output.
    120120\begin{cfacode}
     
    308308S * s = malloc();       // malloc(sizeof(S))
    309309\end{cfacode}
    310 The built-in trait @sized@ ensures that size and alignment information for @T@ is available to @malloc@ through @sizeof@ and @_Alignof@ expressions respectively.
     310The built-in trait @sized@ ensures that size and alignment information for @T@ is available in the body of @malloc@ through @sizeof@ and @_Alignof@ expressions respectively.
    311311In calls to @malloc@, the type @T@ is bound based on call-site information, allowing \CFA code to allocate memory without the potential for errors introduced by manually specifying the size of the allocated block.
    312312
     
    371371Note, these invariants are internal to the type's correct behaviour.
    372372
    373 Types also have external invarients with state of the execution environment, including the heap, the open file-table, the state of global variables, etc.
     373Types also have external invariants with the state of the execution environment, including the heap, the open-file table, the state of global variables, etc.
    374374Since resources are finite and shared (concurrency), it is important to ensure that objects clean up properly when they are finished, restoring the execution environment to a stable state so that new objects can reuse resources.
    375375
     
    382382The program stack grows and shrinks automatically with each function call, as needed for local variables.
    383383However, whenever a program needs a variable to outlive the block it is created in, the storage must be allocated dynamically with @malloc@ and later released with @free@.
    384 This pattern is extended to more complex objects, such as files and sockets, which also outlive the block where they are created, but at their core is resource management.
     384This pattern is extended to more complex objects, such as files and sockets, which can also outlive the block where they are created, and thus require their own resource management.
    385385Once allocated storage escapes\footnote{In garbage collected languages, such as Java, escape analysis \cite{Choi:1999:EAJ:320385.320386} is used to determine when dynamically allocated objects are strictly contained within a function, which allows the optimizer to allocate them on the stack.} a block, the responsibility for deallocating the storage is not specified in a function's type, that is, that the return value is owned by the caller.
    386386This implicit convention is provided only through documentation about the expectations of functions.
    387387
    388388In other languages, a hybrid situation exists where resources escape the allocation block, but ownership is precisely controlled by the language.
    389 This pattern requires a strict interface and protocol for a data structure, where the protocol consists of a pre-initialization and a post-termination call, and all intervening access is done via interface routines.
    390 This kind of encapsulation is popular in object-oriented programming languages, and like the stack, it contains a significant portion of resource management cases.
     389This pattern requires a strict interface and protocol for a data structure, consisting of a pre-initialization and a post-termination call, and all intervening access is done via interface routines.
     390This kind of encapsulation is popular in object-oriented programming languages, and like the stack, it takes care of a significant portion of resource management cases.
    391391
    392392For example, \CC directly supports this pattern through class types and an idiom known as RAII \footnote{Resource Acquisition is Initialization} by means of constructors and destructors.
     
    397397RAII ensures that if all resources are acquired in a constructor and released in a destructor, there are no resource leaks, even in exceptional circumstances.
    398398A type with at least one non-trivial constructor or destructor is henceforth referred to as a \emph{managed type}.
    399 In the context of \CFA, a non-trivial constructor is either a user defined constructor or an auto generated constructor that calls a non-trivial constructor.
    400 
    401 For the remaining resource ownership cases, programmer must follow a brittle, explicit protocol for freeing resources or an implicit porotocol implemented via the programming language.
     399In the context of \CFA, a non-trivial constructor is either a user defined constructor or an auto-generated constructor that calls a non-trivial constructor.
     400
     401For the remaining resource ownership cases, programmer must follow a brittle, explicit protocol for freeing resources or an implicit protocol implemented via the programming language.
    402402
    403403In garbage collected languages, such as Java, resources are largely managed by the garbage collector.
     
    406406In particular, Java supports \emph{finalizers}, which are similar to destructors.
    407407Sadly, finalizers are only guaranteed to be called before an object is reclaimed by the garbage collector \cite[p.~373]{Java8}, which may not happen if memory use is not contentious.
    408 Due to operating-system resource-limits, this is unacceptable for many long running programs. % TODO: citation?
     408Due to operating-system resource-limits, this is unacceptable for many long running programs.
    409409Instead, the paradigm in Java requires programmers to manually keep track of all resources \emph{except} memory, leading many novices and experts alike to forget to close files, etc.
    410410Complicating the picture, uncaught exceptions can cause control flow to change dramatically, leaking a resource that appears on first glance to be released.
     
    452452Depending on when the exception is raised, both @out@ and @log@ are null, @log@ is null, or both are non-null, therefore, the cleanup for these variables at the end is appropriately guarded and conditionally executed to prevent null-pointer exceptions.
    453453
    454 % TODO: discuss Rust?
    455 % Like \CC, Rust \cite{Rust} provides RAII through constructors and destructors.
    456 % Smart pointers are deeply integrated in the Rust type-system.
     454While Rust \cite{Rust} does not enforce the use of a garbage collector, it does provide a manual memory management environment, with a strict ownership model that automatically frees allocated memory and prevents common memory management errors.
     455In particular, a variable has ownership over its associated value, which is freed automatically when the owner goes out of scope.
     456Furthermore, values are \emph{moved} by default on assignment, rather than copied, which invalidates the previous variable binding.
     457\begin{rustcode}
     458struct S {
     459  x: i32
     460}
     461let s = S { x: 123 };
     462let z = s;           // move, invalidate s
     463println!("{}", s.x); // error, s has been moved
     464\end{rustcode}
     465Types can be made copyable by implementing the @Copy@ trait.
     466
     467Rust allows multiple unowned views into an object through references, also known as borrows, provided that a reference does not outlive its referent.
     468A mutable reference is allowed only if it is the only reference to its referent, preventing data race errors and iterator invalidation errors.
     469\begin{rustcode}
     470let mut x = 10;
     471{
     472  let y = &x;
     473  let z = &x;
     474  println!("{} {}", y, z); // prints 10 10
     475}
     476{
     477  let y = &mut x;
     478  // let z1 = &x;     // not allowed, have mutable reference
     479  // let z2 = &mut x; // not allowed, have mutable reference
     480  *y = 5;
     481  println!("{}", y); // prints 5
     482}
     483println!("{}", x); // prints 5
     484\end{rustcode}
     485Since references are not owned, they do not release resources when they go out of scope.
     486There is no runtime cost imposed on these restrictions, since they are enforced at compile-time.
     487
     488Rust provides RAII through the @Drop@ trait, allowing arbitrary code to execute when the object goes out of scope, allowing Rust programs to automatically clean up auxiliary resources much like a \CC program.
     489\begin{rustcode}
     490struct S {
     491  name: &'static str
     492}
     493
     494impl Drop for S {  // RAII for S
     495  fn drop(&mut self) {
     496    println!("dropped {}", self.name);
     497  }
     498}
     499
     500{
     501  let x = S { name: "x" };
     502  let y = S { name: "y" };
     503} // prints "dropped y" "dropped x"
     504\end{rustcode}
    457505
    458506% D has constructors and destructors that are worth a mention (under classes) https://dlang.org/spec/spec.html
     
    462510The programming language, D, also manages resources with constructors and destructors \cite{D}.
    463511In D, @struct@s are stack allocated and managed via scoping like in \CC, whereas @class@es are managed automatically by the garbage collector.
    464 Like Java, using the garbage collector means that destructors may never be called, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up.
     512Like Java, using the garbage collector means that destructors are called indeterminately, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up.
    465513Since D supports RAII, it is possible to use the same techniques as in \CC to ensure that resources are released in a timely manner.
    466 Finally, D provides a scope guard statement, which allows an arbitrary statement to be executed at normal scope exit with \emph{success}, at exceptional scope exit with \emph{failure}, or at normal and exceptional scope exit with \emph{exit}. % TODO: cite? https://dlang.org/spec/statement.html#ScopeGuardStatement
     514Finally, D provides a scope guard statement, which allows an arbitrary statement to be executed at normal scope exit with \emph{success}, at exceptional scope exit with \emph{failure}, or at normal and exceptional scope exit with \emph{exit}. % https://dlang.org/spec/statement.html#ScopeGuardStatement
    467515It has been shown that the \emph{exit} form of the scope guard statement can be implemented in a library in \CC \cite{ExceptSafe}.
    468516
    469 To provide managed types in \CFA, new kinds of constructors and destructors are added to C and discussed in Chapter 2.
     517To provide managed types in \CFA, new kinds of constructors and destructors are added to \CFA and discussed in Chapter 2.
    470518
    471519\section{Tuples}
    472520\label{s:Tuples}
    473 In mathematics, tuples are finite-length sequences which, unlike sets, allow duplicate elements.
     521In mathematics, tuples are finite-length sequences which, unlike sets, are ordered and allow duplicate elements.
    474522In programming languages, tuples provide fixed-sized heterogeneous lists of elements.
    475523Many programming languages have tuple constructs, such as SETL, \KWC, ML, and Scala.
     
    523571Like \CC, D provides tuples through a library variadic template struct.
    524572In D, it is possible to name the fields of a tuple type, which creates a distinct type.
    525 % TODO: cite http://dlang.org/phobos/std_typecons.html
     573% http://dlang.org/phobos/std_typecons.html
    526574\begin{dcode}
    527575Tuple!(float, "x", float, "y") point2D;
     
    572620
    573621
    574 \Csharp also has tuples, but has similarly strange limitations, allowing tuples of size up to 7 components. % TODO: cite https://msdn.microsoft.com/en-us/library/system.tuple(v=vs.110).aspx
     622\Csharp also has tuples, but has similarly strange limitations, allowing tuples of size up to 7 components. % https://msdn.microsoft.com/en-us/library/system.tuple(v=vs.110).aspx
    575623The officially supported workaround for this shortcoming is to nest tuples in the 8th component.
    576624\Csharp allows accessing a component of a tuple by using the field @Item$N$@ for components 1 through 7, and @Rest@ for the nested tuple.
     
    585633In Swift, @Void@ is an alias for the empty tuple, and there are no single element tuples.
    586634
    587 % TODO: this statement feels like it's too strong
    588 Tuples as powerful as the above languages are added to C and discussed in Chapter 3.
     635Tuples comparable to those described above are added to \CFA and discussed in Chapter 3.
    589636
    590637\section{Variadic Functions}
     
    600647printf("%d %g %c %s", 10, 3.5, 'X', "a string");
    601648\end{cfacode}
    602 Through the use of a format string, @printf@ allows C programmers to print any of the standard C data types.
     649Through the use of a format string, C programmers can communicate argument type information to @printf@, allowing C programmers to print any of the standard C data types.
    603650Still, @printf@ is extremely limited, since the format codes are specified by the C standard, meaning users cannot define their own format codes to extend @printf@ for new data types or new formatting rules.
    604651
     
    692739The combination of these two issues greatly restricts the usefulness of variadic functions in Java.
    693740
    694 Type-safe variadic functions are added to C and discussed in Chapter 4.
     741Type-safe variadic functions are added to \CFA and discussed in Chapter 4.
Note: See TracChangeset for help on using the changeset viewer.