# Changeset 3f8ab8f for doc

Timestamp:
Feb 7, 2018, 5:04:29 PM (5 years ago)
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
169d944
Parents:
0723a57 (diff), 77acd07d (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg2:software/cfa/cfa-cc

File:
1 edited

## doc/papers/general/Paper.tex

p2 = &y;  $\C{// p2 points to y}$
p3 = &p1;  $\C{// p3 points to p1}$
*p2 = ((*p1 + *p2) * (**p3 - *p1)) / (**p3 - 15);
\end{cfa}
Unfortunately, the dereference and address-of operators introduce a great deal of syntactic noise when dealing with pointed-to values rather than pointers, as well as the potential for subtle bugs.
It would be desirable to have the compiler figure out how to elide the dereference operators in a complex expression such as @*p2 = ((*p1 + *p2) * (**p3 - *p1)) / (**p3 - 15);@, for both brevity and clarity.
For both brevity and clarity, it would be desirable to have the compiler figure out how to elide the dereference operators in a complex expression such as the assignment to @*p2@ above.
However, since C defines a number of forms of \emph{pointer arithmetic}, two similar expressions involving pointers to arithmetic types (\eg @*p1 + x@ and @p1 + x@) may each have well-defined but distinct semantics, introducing the possibility that a programmer may write one when they mean the other, and precluding any simple algorithm for elision of dereference operators.
To solve these problems, \CFA introduces reference types @T&@; a @T&@ has exactly the same value as a @T*@, but where the @T*@ takes the address interpretation by default, a @T&@ takes the value interpretation by default, as below:

Secondly, unlike the references in \CC, which always point to a fixed address, \CFA references are rebindable.
This allows \CFA references to be default-initialized (to a null pointer), and also to point to different addresses throughout their lifetime.
This allows \CFA references to be default-initialized (\eg to a null pointer), and also to point to different addresses throughout their lifetime.
This rebinding is accomplished without adding any new syntax to \CFA, but simply by extending the existing semantics of the address-of operator in C.
In C, the address of an lvalue is always an rvalue, as in general that address is not stored anywhere in memory, and does not itself have an address.

The syntactic motivation for this is clearest when considering overloaded operator-assignment, \eg @int ?+=?(int &, int)@; given @int x, y@, the expected call syntax is @x += y@, not @&x += y@.
This initialization of references from lvalues rather than pointers can be considered an ``lvalue-to-reference'' conversion rather than an elision of the address-of operator; similarly, use of the value pointed to by a reference in an rvalue context can be thought of as a ``reference-to-rvalue'' conversion.
\CFA includes one more reference conversion, an ``rvalue-to-reference'' conversion, implemented by means of an implicit temporary.
More generally, this initialization of references from lvalues rather than pointers is an instance of an ``lvalue-to-reference'' conversion rather than an elision of the address-of operator; this conversion can actually be used in any context in \CFA where an implicit conversion would be allowed.
Similarly, use of the value pointed to by a reference in an rvalue context can be thought of as a ``reference-to-rvalue'' conversion, and \CFA also includes a qualifier-adding ``reference-to-reference'' conversion, analogous to the @T *@ to @const T *@ conversion in standard C.
The final reference conversion included in \CFA is the ``rvalue-to-reference'' conversion, implemented by means of an implicit temporary.
When an rvalue is used to initialize a reference, it is instead used to initialize a hidden temporary value with the same lexical scope as the reference, and the reference is initialized to the address of this temporary.
This allows complex values to be succinctly and efficiently passed to functions, without the syntactic overhead of explicit definition of a temporary variable or the runtime cost of pass-by-value.
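As a rough illustration of the rvalue-to-reference conversion described above, the following plain C sketch writes out by hand what the hidden temporary amounts to. A \CFA reference parameter is modelled here as an ordinary C pointer, and the names @Pair@, @make_pair@, @sum_of@, @tmp@, and @demo_rvalue_to_reference@ are hypothetical; this is not the actual output of the \CFA translator.

```c
#include <assert.h>

/* Hypothetical example type, used only for this sketch. */
struct Pair { int first; int second; };

/* Models a function taking a reference parameter; in C the
   reference is written as a pointer to the referenced object. */
static int sum_of(const struct Pair *p) {
    return p->first + p->second;
}

/* Hypothetical helper whose result is an rvalue at the call site. */
static struct Pair make_pair(int a, int b) {
    struct Pair p = { a, b };
    return p;
}

static int demo_rvalue_to_reference(void) {
    /* The rvalue result of make_pair has no address of its own,
       so it is first stored in a temporary declared in the same
       lexical scope as the reference it will initialize. */
    struct Pair tmp = make_pair(3, 4);

    /* The "reference" is then initialized to the address of that
       temporary; the struct itself is not copied again for the call. */
    int result = sum_of(&tmp);

    assert(result == 7);
    return result;
}
```

The point of the conversion is that the temporary and the address-of step need not be written by the programmer; in this sketch they are spelled out explicitly only to make the mechanism visible.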
One of the strengths of C is the control over memory management it gives programmers, allowing resource release to be more consistent and precisely timed than is possible with garbage-collected memory management.
However, this manual approach to memory management is often verbose, and it is useful to manage resources other than memory (\eg file handles) using the same mechanism as memory.
\CC is well-known for an approach to manual memory management that addresses both these issues, Resource Allocation Is Initialization (RAII), implemented by means of special \emph{constructor} and \emph{destructor} functions; we have implemented a similar feature in \CFA.
\TODO{Fill out section. Mention field-constructors and at-equal escape hatch to C-style initialization. Probably pull some text from Rob's thesis for first draft.}
\CC is well-known for an approach to manual memory management that addresses both these issues, Resource Acquisition Is Initialization (RAII), implemented by means of special \emph{constructor} and \emph{destructor} functions; we have implemented a similar feature in \CFA.
While RAII is a common feature of object-oriented programming languages, its inclusion in \CFA does not violate the design principle that \CFA retain the same procedural paradigm as C.
In particular, \CFA does not implement class-based encapsulation: neither the constructor nor any other function has privileged access to the implementation details of a type, except through the translation-unit-scope method of opaque structs provided by C.
In \CFA, a constructor is a function named @?{}@, while a destructor is a function named @^?{}@; like other \CFA operators, these names represent the syntax used to call the constructor or destructor, \eg @S s = { ... };@ or @^(s){};@.
Every constructor and destructor must have a return type of @void@, and its first parameter must have a reference type whose base type is the type of the object the function constructs or destructs.
This first parameter is informally called the @this@ parameter, as in many object-oriented languages, though a programmer may give it an arbitrary name.
Destructors must have exactly one parameter, while constructors allow passing of zero or more additional arguments along with the @this@ parameter.
\begin{cfa}
struct Array {
    int * data;
    int len;
};

void ?{}( Array& arr ) {
    arr.len = 10;
    arr.data = calloc( arr.len, sizeof(int) );
}

void ^?{}( Array& arr ) {
    free( arr.data );
}

{
    Array x;
    ?{}(x);       $\C{// implicitly compiler-generated}$
    // ... use x
    ^?{}(x);      $\C{// implicitly compiler-generated}$
}
\end{cfa}
In the example above, a \emph{default constructor} (\ie one with no parameters besides the @this@ parameter) and destructor are defined for the @Array@ struct, a dynamic array of @int@.
@Array@ is an example of a \emph{managed type} in \CFA, a type with a non-trivial constructor or destructor, or with a field of a managed type.
As in the example, all instances of managed types are implicitly constructed upon allocation, and destructed upon deallocation; this ensures proper initialization and cleanup of resources contained in managed types, in this case the @data@ array on the heap.
The exact details of the placement of these implicit constructor and destructor calls are omitted here for brevity; the interested reader should consult \cite{Schluntz17}.
Constructor calls are intended to seamlessly integrate with existing C initialization syntax, providing a simple and familiar syntax to veteran C programmers and allowing constructor calls to be inserted into legacy C code with minimal code changes.
As such, \CFA also provides syntax for \emph{copy initialization} and \emph{initialization parameters}:
\begin{cfa}
void ?{}( Array& arr, Array other );
void ?{}( Array& arr, int size, int fill );
Array y = { 20, 0xDEADBEEF }, z = y;
\end{cfa}
Copy constructors have exactly two parameters, the second of which has the same type as the base type of the @this@ parameter; appropriate care is taken in the implementation to avoid recursive calls to the copy constructor when initializing this second parameter.
Other constructor calls look just like C initializers, except rather than using field-by-field initialization (as in C), an initialization which matches a defined constructor will call the constructor instead.
In addition to initialization syntax, \CFA provides two ways to explicitly call constructors and destructors.
Explicit calls to constructors double as a placement syntax, useful for construction of member fields in user-defined constructors and reuse of large storage allocations.
While the existing function-call syntax works for explicit calls to constructors and destructors, \CFA also provides a more concise \emph{operator syntax} for both:
\begin{cfa}
Array a, b;
(a){};                                  $\C{// default construct}$
(b){ a };                               $\C{// copy construct}$
^(a){};                                 $\C{// destruct}$
(a){ 5, 0xFFFFFFFF };   $\C{// explicit constructor call}$
\end{cfa}
To provide a uniform type interface for @otype@ polymorphism, the \CFA compiler automatically generates a default constructor, copy constructor, assignment operator, and destructor for all types.
These default functions can be overridden by user-generated versions of them.
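To make the implicit calls around a managed object more concrete, the following plain C sketch performs by hand the constructor and destructor calls that the text says are inserted for the @Array@ example. The function names @array_ctor@, @array_ctor_fill@, @array_dtor@, and @demo_array_lifetime@ are hypothetical stand-ins for @?{}@ and @^?{}@, the body of the two-argument constructor is an assumption (the text only declares it), and the order of the destructor calls is a choice made for this sketch rather than a statement about the \CFA translator.

```c
#include <assert.h>
#include <stdlib.h>

struct Array { int *data; int len; };

/* Stand-in for: void ?{}( Array& arr ) */
static void array_ctor(struct Array *arr) {
    arr->len = 10;
    arr->data = calloc((size_t)arr->len, sizeof(int));
}

/* Stand-in for: void ?{}( Array& arr, int size, int fill )
   The body is assumed here; the source gives only the prototype. */
static void array_ctor_fill(struct Array *arr, int size, int fill) {
    arr->len = size;
    arr->data = malloc((size_t)size * sizeof(int));
    for (int i = 0; arr->data != NULL && i < size; i++) {
        arr->data[i] = fill;
    }
}

/* Stand-in for: void ^?{}( Array& arr ) */
static void array_dtor(struct Array *arr) {
    free(arr->data);
}

static int demo_array_lifetime(void) {
    int observed;
    {
        struct Array x;
        array_ctor(&x);               /* written by hand where "Array x;" would construct implicitly */

        struct Array y;
        array_ctor_fill(&y, 20, 7);   /* corresponds to "Array y = { 20, 7 };" */

        assert(x.data != NULL && y.data != NULL);
        /* x: 10 zeroed elements; y: 20 elements all equal to 7 */
        observed = x.len + y.len + x.data[0] + y.data[19];

        array_dtor(&y);               /* written by hand where the objects leave scope */
        array_dtor(&x);
    }
    return observed;
}
```

Written this way, every exit path from the block has to remember both destructor calls; the managed-type mechanism described above removes that burden by generating the calls at the points of allocation and deallocation.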
For compatibility with the standard behaviour of C, the default constructor and destructor for all basic, pointer, and reference types do nothing, while the copy constructor and assignment operator are bitwise copies; if default zero-initialization is desired, the default constructors can be overridden.
For user-generated types, the four functions are also automatically generated.
@enum@ types are handled the same as their underlying integral type, and unions are also bitwise copied and no-op initialized and destructed.
For compatibility with C, a copy constructor from the first union member type is also defined.
For @struct@ types, each of the four functions is implicitly defined to call its corresponding function on each member of the struct.
To better simulate the behaviour of C initializers, a set of \emph{field constructors} is also generated for structures.
For each non-empty prefix of a structure's member-list, a constructor is generated that copy-constructs the members passed as parameters and default-constructs the remaining members.
To allow users to limit the set of constructors available for a type, when a user declares any constructor or destructor, the corresponding generated function and all field constructors for that type are hidden from expression resolution; similarly, the generated default constructor is hidden upon declaration of any constructor.
These semantics closely mirror the rule for implicit declaration of constructors in \CC\cite[p.~186]{ANSI98:C++}.
In rare situations, programmers may not wish to have constructors and destructors called; in these cases, \CFA provides an ``escape hatch'' to not call them.
If a variable is initialized using the syntax \lstinline|S x @= {}|, it will be an \emph{unmanaged object}, and will not have constructors or destructors called.
Any C initializer can be the right-hand side of an \lstinline|@=| initializer, \eg \lstinline|Array a @= { 0, 0x0 }|, with the usual C initialization semantics.
In addition to the expressive power, \lstinline|@=| provides a simple path for migrating legacy C code to \CFA, by providing a mechanism to incrementally convert initializers; the \CFA design team decided to introduce a new syntax for this escape hatch because we believe that our RAII implementation will handle the vast majority of code in a desirable way, and we wished to maintain familiar syntax for this common case.

\subsection{Default Parameters}