# Changeset f92aa32 for doc/rob_thesis/intro.tex

Ignore:
Timestamp:
Apr 7, 2017, 6:25:23 PM (5 years ago)
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
2ccb93c
Parents:
c51b5a3
Message:

thesis conclusions and editting pass

File:
1 edited

### Legend:

Unmodified
 rc51b5a3 \section{\CFA Background} \label{s:background} \CFA is a modern non-object-oriented extension to the C programming language. \CFA \footnote{Pronounced C-for-all'', and written \CFA or Cforall.} is a modern non-object-oriented extension to the C programming language. As it is an extension of C, there is already a wealth of existing C code and principles that govern the design of the language. Among the goals set out in the original design of \CFA, four points stand out \cite{Bilson03}. A a1 = { 1, .y:7, 6 }; A a2[4] = { [2]:a0, [0]:a1, { .z:3 } }; // equvialent to // equivalent to // A a0 = { 0, 8, 0, 1 }; // A a1 = { 1, 0, 7, 6 }; Designations allow specifying the field to initialize by name, rather than by position. Any field not explicitly initialized is initialized as if it had static storage duration \cite[p.~141]{C11}. A designator specifies the current object for initialization, and as such any undesignated subobjects pick up where the last initialization left off. For example, in the initialization of @a1@, the initializer of @y@ is @7@, and the unnamed initializer @6@ initializes the next subobject, @z@. Later initializers override earlier initializers, so a subobject for which there is more than one initializer is only initailized by its last initializer. A designator specifies the current object for initialization, and as such any undesignated sub-objects pick up where the last initialization left off. For example, in the initialization of @a1@, the initializer of @y@ is @7@, and the unnamed initializer @6@ initializes the next sub-object, @z@. Later initializers override earlier initializers, so a sub-object for which there is more than one initializer is only initialized by its last initializer. These semantics can be seen in the initialization of @a0@, where @x@ is designated twice, and thus initialized to @8@. Note that in \CFA, designations use a colon separator, rather than an equals sign as in C, because this syntax is one of the few places that conflicts with the new language features. Both solutions are syntactically unnatural. In \CFA, it is possible to directly declare a function returning mutliple values. In \CFA, it is possible to directly declare a function returning multiple values. This extension provides important semantic information to the caller, since return values are only for output. \begin{cfacode} S * s = malloc();       // malloc(sizeof(S)) \end{cfacode} The built-in trait @sized@ ensures that size and alignment information for @T@ is available to @malloc@ through @sizeof@ and @_Alignof@ expressions respectively. The built-in trait @sized@ ensures that size and alignment information for @T@ is available in the body of @malloc@ through @sizeof@ and @_Alignof@ expressions respectively. In calls to @malloc@, the type @T@ is bound based on call-site information, allowing \CFA code to allocate memory without the potential for errors introduced by manually specifying the size of the allocated block. Note, these invariants are internal to the type's correct behaviour. Types also have external invarients with state of the execution environment, including the heap, the open file-table, the state of global variables, etc. Types also have external invariants with the state of the execution environment, including the heap, the open-file table, the state of global variables, etc. Since resources are finite and shared (concurrency), it is important to ensure that objects clean up properly when they are finished, restoring the execution environment to a stable state so that new objects can reuse resources. The program stack grows and shrinks automatically with each function call, as needed for local variables. However, whenever a program needs a variable to outlive the block it is created in, the storage must be allocated dynamically with @malloc@ and later released with @free@. This pattern is extended to more complex objects, such as files and sockets, which also outlive the block where they are created, but at their core is resource management. This pattern is extended to more complex objects, such as files and sockets, which can also outlive the block where they are created, and thus require their own resource management. Once allocated storage escapes\footnote{In garbage collected languages, such as Java, escape analysis \cite{Choi:1999:EAJ:320385.320386} is used to determine when dynamically allocated objects are strictly contained within a function, which allows the optimizer to allocate them on the stack.} a block, the responsibility for deallocating the storage is not specified in a function's type, that is, that the return value is owned by the caller. This implicit convention is provided only through documentation about the expectations of functions. In other languages, a hybrid situation exists where resources escape the allocation block, but ownership is precisely controlled by the language. This pattern requires a strict interface and protocol for a data structure, where the protocol consists of a pre-initialization and a post-termination call, and all intervening access is done via interface routines. This kind of encapsulation is popular in object-oriented programming languages, and like the stack, it contains a significant portion of resource management cases. This pattern requires a strict interface and protocol for a data structure, consisting of a pre-initialization and a post-termination call, and all intervening access is done via interface routines. This kind of encapsulation is popular in object-oriented programming languages, and like the stack, it takes care of a significant portion of resource management cases. For example, \CC directly supports this pattern through class types and an idiom known as RAII \footnote{Resource Acquisition is Initialization} by means of constructors and destructors. RAII ensures that if all resources are acquired in a constructor and released in a destructor, there are no resource leaks, even in exceptional circumstances. A type with at least one non-trivial constructor or destructor is henceforth referred to as a \emph{managed type}. In the context of \CFA, a non-trivial constructor is either a user defined constructor or an auto generated constructor that calls a non-trivial constructor. For the remaining resource ownership cases, programmer must follow a brittle, explicit protocol for freeing resources or an implicit porotocol implemented via the programming language. In the context of \CFA, a non-trivial constructor is either a user defined constructor or an auto-generated constructor that calls a non-trivial constructor. For the remaining resource ownership cases, programmer must follow a brittle, explicit protocol for freeing resources or an implicit protocol implemented via the programming language. In garbage collected languages, such as Java, resources are largely managed by the garbage collector. In particular, Java supports \emph{finalizers}, which are similar to destructors. Sadly, finalizers are only guaranteed to be called before an object is reclaimed by the garbage collector \cite[p.~373]{Java8}, which may not happen if memory use is not contentious. Due to operating-system resource-limits, this is unacceptable for many long running programs. % TODO: citation? Due to operating-system resource-limits, this is unacceptable for many long running programs. Instead, the paradigm in Java requires programmers to manually keep track of all resources \emph{except} memory, leading many novices and experts alike to forget to close files, etc. Complicating the picture, uncaught exceptions can cause control flow to change dramatically, leaking a resource that appears on first glance to be released. Depending on when the exception is raised, both @out@ and @log@ are null, @log@ is null, or both are non-null, therefore, the cleanup for these variables at the end is appropriately guarded and conditionally executed to prevent null-pointer exceptions. % TODO: discuss Rust? % Like \CC, Rust \cite{Rust} provides RAII through constructors and destructors. % Smart pointers are deeply integrated in the Rust type-system. While Rust \cite{Rust} does not enforce the use of a garbage collector, it does provide a manual memory management environment, with a strict ownership model that automatically frees allocated memory and prevents common memory management errors. In particular, a variable has ownership over its associated value, which is freed automatically when the owner goes out of scope. Furthermore, values are \emph{moved} by default on assignment, rather than copied, which invalidates the previous variable binding. \begin{rustcode} struct S { x: i32 } let s = S { x: 123 }; let z = s;           // move, invalidate s println!("{}", s.x); // error, s has been moved \end{rustcode} Types can be made copyable by implementing the @Copy@ trait. Rust allows multiple unowned views into an object through references, also known as borrows, provided that a reference does not outlive its referent. A mutable reference is allowed only if it is the only reference to its referent, preventing data race errors and iterator invalidation errors. \begin{rustcode} let mut x = 10; { let y = &x; let z = &x; println!("{} {}", y, z); // prints 10 10 } { let y = &mut x; // let z1 = &x;     // not allowed, have mutable reference // let z2 = &mut x; // not allowed, have mutable reference *y = 5; println!("{}", y); // prints 5 } println!("{}", x); // prints 5 \end{rustcode} Since references are not owned, they do not release resources when they go out of scope. There is no runtime cost imposed on these restrictions, since they are enforced at compile-time. Rust provides RAII through the @Drop@ trait, allowing arbitrary code to execute when the object goes out of scope, allowing Rust programs to automatically clean up auxiliary resources much like a \CC program. \begin{rustcode} struct S { name: &'static str } impl Drop for S {  // RAII for S fn drop(&mut self) { println!("dropped {}", self.name); } } { let x = S { name: "x" }; let y = S { name: "y" }; } // prints "dropped y" "dropped x" \end{rustcode} % D has constructors and destructors that are worth a mention (under classes) https://dlang.org/spec/spec.html The programming language, D, also manages resources with constructors and destructors \cite{D}. In D, @struct@s are stack allocated and managed via scoping like in \CC, whereas @class@es are managed automatically by the garbage collector. Like Java, using the garbage collector means that destructors may never be called, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up. Like Java, using the garbage collector means that destructors are called indeterminately, requiring the use of finally statements to ensure dynamically allocated resources that are not managed by the garbage collector, such as open files, are cleaned up. Since D supports RAII, it is possible to use the same techniques as in \CC to ensure that resources are released in a timely manner. Finally, D provides a scope guard statement, which allows an arbitrary statement to be executed at normal scope exit with \emph{success}, at exceptional scope exit with \emph{failure}, or at normal and exceptional scope exit with \emph{exit}. % TODO: cite? https://dlang.org/spec/statement.html#ScopeGuardStatement Finally, D provides a scope guard statement, which allows an arbitrary statement to be executed at normal scope exit with \emph{success}, at exceptional scope exit with \emph{failure}, or at normal and exceptional scope exit with \emph{exit}. % https://dlang.org/spec/statement.html#ScopeGuardStatement It has been shown that the \emph{exit} form of the scope guard statement can be implemented in a library in \CC \cite{ExceptSafe}. To provide managed types in \CFA, new kinds of constructors and destructors are added to C and discussed in Chapter 2. To provide managed types in \CFA, new kinds of constructors and destructors are added to \CFA and discussed in Chapter 2. \section{Tuples} \label{s:Tuples} In mathematics, tuples are finite-length sequences which, unlike sets, allow duplicate elements. In mathematics, tuples are finite-length sequences which, unlike sets, are ordered and allow duplicate elements. In programming languages, tuples provide fixed-sized heterogeneous lists of elements. Many programming languages have tuple constructs, such as SETL, \KWC, ML, and Scala. Like \CC, D provides tuples through a library variadic template struct. In D, it is possible to name the fields of a tuple type, which creates a distinct type. % TODO: cite http://dlang.org/phobos/std_typecons.html % http://dlang.org/phobos/std_typecons.html \begin{dcode} Tuple!(float, "x", float, "y") point2D; \Csharp also has tuples, but has similarly strange limitations, allowing tuples of size up to 7 components. % TODO: cite https://msdn.microsoft.com/en-us/library/system.tuple(v=vs.110).aspx \Csharp also has tuples, but has similarly strange limitations, allowing tuples of size up to 7 components. % https://msdn.microsoft.com/en-us/library/system.tuple(v=vs.110).aspx The officially supported workaround for this shortcoming is to nest tuples in the 8th component. \Csharp allows accessing a component of a tuple by using the field @Item$N$@ for components 1 through 7, and @Rest@ for the nested tuple. In Swift, @Void@ is an alias for the empty tuple, and there are no single element tuples. % TODO: this statement feels like it's too strong Tuples as powerful as the above languages are added to C and discussed in Chapter 3. Tuples comparable to those described above are added to \CFA and discussed in Chapter 3. \section{Variadic Functions} printf("%d %g %c %s", 10, 3.5, 'X', "a string"); \end{cfacode} Through the use of a format string, @printf@ allows C programmers to print any of the standard C data types. Through the use of a format string, C programmers can communicate argument type information to @printf@, allowing C programmers to print any of the standard C data types. Still, @printf@ is extremely limited, since the format codes are specified by the C standard, meaning users cannot define their own format codes to extend @printf@ for new data types or new formatting rules. The combination of these two issues greatly restricts the usefulness of variadic functions in Java. Type-safe variadic functions are added to C and discussed in Chapter 4. Type-safe variadic functions are added to \CFA and discussed in Chapter 4.