Changes in doc/papers/general/Paper.tex [0723a57:396fd72]
doc/papers/general/Paper.tex (modified) (11 diffs)
--- doc/papers/general/Paper.tex (r0723a57)
+++ doc/papers/general/Paper.tex (r396fd72)
@@ -1052,8 +1052,8 @@
 \label{s:WithClauseStatement}
 
-Grouping heterogenous data into \newterm{aggregate}s (structure/union) is a common programming practice, and an aggregate can be further organized into more complex structures, such as arrays and containers:
+Grouping heterogenous data into \newterm{aggregate}s is a common programming practice, and an aggregate can be further organized into more complex structures, such as arrays and containers:
 \begin{cfa}
-struct S { $\C{// aggregate}$
-    char c; $\C{// fields}$
+struct S { $\C{// aggregate}$
+    char c; $\C{// fields}$
     int i;
     double d;
@@ -1061,8 +1061,8 @@
 S s, as[10];
 \end{cfa}
-However, routines manipulating aggregates must repeat the aggregate name to access its containing fields:
+However, routines manipulating aggregates have repeition of the aggregate name to access its containing fields:
 \begin{cfa}
 void f( S s ) {
-    `s.`c; `s.`i; `s.`d; $\C{// access containing fields}$
+    `s.`c; `s.`i; `s.`d; $\C{// access containing fields}$
 }
 \end{cfa}
@@ -1070,13 +1070,13 @@
 \begin{C++}
 class C {
-    char c; $\C{// fields}$
+    char c; $\C{// fields}$
     int i;
     double d;
-    int mem() { $\C{// implicit "this" parameter}$
-        `this->`c; `this->`i; `this->`d; $\C{// access containing fields}$
+    int mem() { $\C{// implicit "this" parameter}$
+        `this->`c; `this->`i; `this->`d;$\C{// access containing fields}$
     }
 }
 \end{C++}
-Nesting of member routines in a \lstinline[language=C++]@class@ allows eliding \lstinline[language=C++]@this->@ because of lexical scoping.
+Nesting of member routines in a \lstinline[language=C++]@class@ allows eliding \lstinline[language=C++]@this->@ because of nested lexical-scoping.
 
 % In object-oriented programming, there is an implicit first parameter, often names @self@ or @this@, which is elided.
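Reviewer note (illustration only, not part of the changeset): the hunks above contrast a routine that must repeat the aggregate name with a C++ member routine that can elide `this->`. A minimal C++ sketch of that contrast follows. The names `sum_elided`, `sum_qualified`, and `sum_free` are hypothetical and do not appear in Paper.tex; the struct only mirrors the field layout of the paper's `S`.

```cpp
#include <cassert>

// Hypothetical aggregate with the same fields as the paper's S.
struct S {
    char c;
    int i;
    double d;

    // Member routine: fields are found through the enclosing class scope,
    // so the this-> qualification can be elided.
    double sum_elided() const { return c + i + d; }

    // The same computation with the qualification written out.
    double sum_qualified() const { return this->c + this->i + this->d; }
};

// Free routine: every field access repeats the aggregate name.
double sum_free(const S & s) { return s.c + s.i + s.d; }
```

Both member routines compute the same value; the only difference is the amount of qualification the programmer writes, which is the redundancy the `with` clause in the following hunks is meant to remove for arbitrary aggregates.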
@@ -1088,9 +1088,9 @@
 % \TODO{Fill out section. Be sure to mention arbitrary expressions in with-blocks, recent change driven by Thierry to prioritize field name over parameters.}
 
-\CFA provides a @with@ clause/statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate qualification to fields by opening a scope containing the field identifiers.
-Hence, the qualified fields become variables with the side-effect that it is easier to optimizing field references in a block.
+\CFA provides a @with@ clause/statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate qualification to fields by opening a scope containing field identifiers.
+Hence, the qualified fields become variables, and making it easier to optimize field references in a block.
 \begin{cfa}
-void f( S s ) `with( s )` { $\C{// with clause}$
-    c; i; d; $\C{\color{red}// s.c, s.i, s.d}$
+void f( S s ) `with( s )` { $\C{// with clause}$
+    c; i; d; $\C{\color{red}// s.c, s.i, s.d}$
 }
 \end{cfa}
@@ -1098,35 +1098,20 @@
 \begin{cfa}
 int mem( S & this ) `with( this )` { $\C{// with clause}$
-    c; i; d; $\C{\color{red}// this.c, this.i, this.d}$
+    c; i; d; $\C{\color{red}// this.c, this.i, this.d}$
 }
 \end{cfa}
-The generality over the object-oriented approach is that multiple aggregate parameters can be opened, not just \lstinline[language=C++]@this@:
+The key generality over the object-oriented approach is that one aggregate parameter \lstinline[language=C++]@this@ is not treated specially over other aggregate parameters:
 \begin{cfa}
 struct T { double m, n; };
 int mem( S & s, T & t ) `with( s, t )` { $\C{// multiple aggregate parameters}$
-    c; i; d; $\C{\color{red}// s.c, s.i, s.d}$
-    m; n; $\C{\color{red}// t.m, t.n}$
+    c; i; d; $\C{\color{red}// s.c, s.i, s.d}$
+    m; n; $\C{\color{red}// t.m, t.n}$
 }
 \end{cfa}
-The equivalent object-oriented approach is:
+The equivalent object-oriented style is:
 \begin{cfa}
-int S::mem( T & t ) { $\C{// multiple aggregate parameters}$
-    c; i; d; $\C{\color{red}// this-\textgreater.c, this-\textgreater.i, this-\textgreater.d}$
-    `t.`m; `t.`n; $\C{// must qualify}$
-}
-\end{cfa}
-
-\begin{cfa}
-struct S { int i, j; } sv;
-with( sv ) {
-    S & sr = sv;
-    with( sr ) {
-        S * sp = &sv;
-        with( *sp ) {
-            i = 3; j = 4; $\C{\color{red}// sp-{\textgreater}i, sp-{\textgreater}j}$
-        }
-        i = 3; j = 4; $\C{\color{red}// sr.i, sr.j}$
-    }
-    i = 3; j = 4; $\C{\color{red}// sv.i, sv.j}$
+int S::mem( T & t ) { $\C{// multiple aggregate parameters}$
+    c; i; d; $\C{\color{red}// this-\textgreater.c, this-\textgreater.i, this-\textgreater.d}$
+    `t.`m; `t.`n;
 }
 \end{cfa}
@@ -1137,7 +1122,7 @@
 struct S1 { ... } s1;
 struct S2 { ... } s2;
-`with( s1 )` { $\C{// with statement}$
+`with( s1 )` { $\C{// with statement}$
     // access fields of s1 without qualification
-    `with( s2 )` { $\C{// nesting}$
+    `with( s2 )` { $\C{// nesting}$
         // access fields of s1 and s2 without qualification
     }
@@ -1155,18 +1140,18 @@
 struct T { int i; int k; int m } b, c;
 `with( a, b )` {
-    j + k; $\C{// unambiguous, unique names define unique types}$
-    i; $\C{// ambiguous, same name and type}$
-    a.i + b.i; $\C{// unambiguous, qualification defines unique names}$
-    m; $\C{// ambiguous, same name and no context to define unique type}$
-    m = 5.0; $\C{// unambiguous, same name and context defines unique type}$
-    m = 1; $\C{// unambiguous, same name and context defines unique type}$
-}
-`with( c )` { ... } $\C{// ambiguous, same name and no context}$
-`with( (S)c )` { ... } $\C{// unambiguous, same name and cast defines unique type}$
+    j + k; $\C{// unambiguous, unique names define unique types}$
+    i; $\C{// ambiguous, same name and type}$
+    a.i + b.i; $\C{// unambiguous, qualification defines unique names}$
+    m; $\C{// ambiguous, same name and no context to define unique type}$
+    m = 5.0; $\C{// unambiguous, same name and context defines unique type}$
+    m = 1; $\C{// unambiguous, same name and context defines unique type}$
+}
+`with( c )` { ... } $\C{// ambiguous, same name and no context}$
+`with( (S)c )` { ... } $\C{// unambiguous, same name and cast defines unique type}$
 \end{cfa}
 
 The components in the "with" clause
 
-  with ( a, b, c ) { ... }
+  with ( a, b, c ) { ... }
 
 serve 2 purposes: each component provides a type and object. The type must be a
@@ -1220,8 +1205,9 @@
 p2 = &y; $\C{// p2 points to y}$
 p3 = &p1; $\C{// p3 points to p1}$
+*p2 = ((*p1 + *p2) * (**p3 - *p1)) / (**p3 - 15);
 \end{cfa}
 
 Unfortunately, the dereference and address-of operators introduce a great deal of syntactic noise when dealing with pointed-to values rather than pointers, as well as the potential for subtle bugs.
-It would be desirable to have the compiler figure out how to elide the dereference operators in a complex expression such as @*p2 = ((*p1 + *p2) * (**p3 - *p1)) / (**p3 - 15);@, for both brevity and clarity.
+For both brevity and clarity, it would be desirable to have the compiler figure out how to elide the dereference operators in a complex expression such as the assignment to @*p2@ above.
 However, since C defines a number of forms of \emph{pointer arithmetic}, two similar expressions involving pointers to arithmetic types (\eg @*p1 + x@ and @p1 + x@) may each have well-defined but distinct semantics, introducing the possibility that a user programmer may write one when they mean the other, and precluding any simple algorithm for elision of dereference operators.
 To solve these problems, \CFA introduces reference types @T&@; a @T&@ has exactly the same value as a @T*@, but where the @T*@ takes the address interpretation by default, a @T&@ takes the value interpretation by default, as below:
@@ -1248,5 +1234,5 @@
 
 Secondly, unlike the references in \CC which always point to a fixed address, \CFA references are rebindable.
-This allows \CFA references to be default-initialized (to a null pointer), and also to point to different addresses throughout their lifetime.
+This allows \CFA references to be default-initialized (\eg to a null pointer), and also to point to different addresses throughout their lifetime.
 This rebinding is accomplished without adding any new syntax to \CFA, but simply by extending the existing semantics of the address-of operator in C.
 In C, the address of a lvalue is always a rvalue, as in general that address is not stored anywhere in memory, and does not itself have an address.
@@ -1274,6 +1260,7 @@
 The syntactic motivation for this is clearest when considering overloaded operator-assignment, \eg @int ?+=?(int &, int)@; given @int x, y@, the expected call syntax is @x += y@, not @&x += y@.
 
-This initialization of references from lvalues rather than pointers can be considered a ``lvalue-to-reference'' conversion rather than an elision of the address-of operator; similarly, use of a the value pointed to by a reference in an rvalue context can be thought of as a ``reference-to-rvalue'' conversion.
-\CFA includes one more reference conversion, an ``rvalue-to-reference'' conversion, implemented by means of an implicit temporary.
+More generally, this initialization of references from lvalues rather than pointers is an instance of a ``lvalue-to-reference'' conversion rather than an elision of the address-of operator; this conversion can actually be used in any context in \CFA an implicit conversion would be allowed.
+Similarly, use of a the value pointed to by a reference in an rvalue context can be thought of as a ``reference-to-rvalue'' conversion, and \CFA also includes a qualifier-adding ``reference-to-reference'' conversion, analagous to the @T *@ to @const T *@ conversion in standard C.
+The final reference conversion included in \CFA is ``rvalue-to-reference'' conversion, implemented by means of an implicit temporary.
 When an rvalue is used to initialize a reference, it is instead used to initialize a hidden temporary value with the same lexical scope as the reference, and the reference is initialized to the address of this temporary.
 This allows complex values to be succinctly and efficiently passed to functions, without the syntactic overhead of explicit definition of a temporary variable or the runtime cost of pass-by-value.
@@ -1284,8 +1271,83 @@
 One of the strengths of C is the control over memory management it gives programmers, allowing resource release to be more consistent and precisely timed than is possible with garbage-collected memory management.
 However, this manual approach to memory management is often verbose, and it is useful to manage resources other than memory (\eg file handles) using the same mechanism as memory.
-\CC is well-known for an approach to manual memory management that addresses both these issues, Resource Allocation Is Initialization (RAII), implemented by means of special \emph{constructor} and \emph{destructor} functions; we have implemented a similar feature in \CFA.
-
-\TODO{Fill out section. Mention field-constructors and at-equal escape hatch to C-style initialization. Probably pull some text from Rob's thesis for first draft.}
-
+\CC is well-known for an approach to manual memory management that addresses both these issues, Resource Aquisition Is Initialization (RAII), implemented by means of special \emph{constructor} and \emph{destructor} functions; we have implemented a similar feature in \CFA.
+While RAII is a common feature of object-oriented programming languages, its inclusion in \CFA does not violate the design principle that \CFA retain the same procedural paradigm as C.
+In particular, \CFA does not implement class-based encapsulation: neither the constructor nor any other function has privileged access to the implementation details of a type, except through the translation-unit-scope method of opaque structs provided by C.
+
+In \CFA, a constructor is a function named @?{}@, while a destructor is a function named @^?{}@; like other \CFA operators, these names represent the syntax used to call the constructor or destructor, \eg @S s = { ... };@ or @^(s){};@.
+Every constructor and destructor must have a return type of @void@, and its first parameter must have a reference type whose base type is the type of the object the function constructs or destructs.
+This first parameter is informally called the @this@ parameter, as in many object-oriented languages, though a programmer may give it an arbitrary name.
+Destructors must have exactly one parameter, while constructors allow passing of zero or more additional arguments along with the @this@ parameter.
+
+\begin{cfa}
+struct Array {
+    int * data;
+    int len;
+};
+
+void ?{}( Array& arr ) {
+    arr.len = 10;
+    arr.data = calloc( arr.len, sizeof(int) );
+}
+
+void ^?{}( Array& arr ) {
+    free( arr.data );
+}
+
+{
+    Array x;
+    `?{}(x);` $\C{// implicitly compiler-generated}$
+    // ... use x
+    `^?{}(x);` $\C{// implicitly compiler-generated}$
+}
+\end{cfa}
+
+In the example above, a \emph{default constructor} (\ie one with no parameters besides the @this@ parameter) and destructor are defined for the @Array@ struct, a dynamic array of @int@.
+@Array@ is an example of a \emph{managed type} in \CFA, a type with a non-trivial constructor or destructor, or with a field of a managed type.
+As in the example, all instances of managed types are implicitly constructed upon allocation, and destructed upon deallocation; this ensures proper initialization and cleanup of resources contained in managed types, in this case the @data@ array on the heap.
+The exact details of the placement of these implicit constructor and destructor calls are omitted here for brevity, the interested reader should consult \cite{Schluntz17}.
+
+Constructor calls are intended to seamlessly integrate with existing C initialization syntax, providing a simple and familiar syntax to veteran C programmers and allowing constructor calls to be inserted into legacy C code with minimal code changes.
+As such, \CFA also provides syntax for \emph{copy initialization} and \emph{initialization parameters}:
+
+\begin{cfa}
+void ?{}( Array& arr, Array other );
+
+void ?{}( Array& arr, int size, int fill );
+
+Array y = { 20, 0xDEADBEEF }, z = y;
+\end{cfa}
+
+Copy constructors have exactly two parameters, the second of which has the same type as the base type of the @this@ parameter; appropriate care is taken in the implementation to avoid recursive calls to the copy constructor when initializing this second parameter.
+Other constructor calls look just like C initializers, except rather than using field-by-field initialization (as in C), an initialization which matches a defined constructor will call the constructor instead.
+
+In addition to initialization syntax, \CFA provides two ways to explicitly call constructors and destructors.
+Explicit calls to constructors double as a placement syntax, useful for construction of member fields in user-defined constructors and reuse of large storage allocations.
+While the existing function-call syntax works for explicit calls to constructors and destructors, \CFA also provides a more concise \emph{operator syntax} for both:
+
+\begin{cfa}
+Array a, b;
+(a){}; $\C{// default construct}$
+(b){ a }; $\C{// copy construct}$
+^(a){}; $\C{// destruct}$
+(a){ 5, 0xFFFFFFFF }; $\C{// explicit constructor call}$
+\end{cfa}
+
+To provide a uniform type interface for @otype@ polymorphism, the \CFA compiler automatically generates a default constructor, copy constructor, assignment operator, and destructor for all types.
+These default functions can be overridden by user-generated versions of them.
+For compatibility with the standard behaviour of C, the default constructor and destructor for all basic, pointer, and reference types do nothing, while the copy constructor and assignment operator are bitwise copies; if default zero-initialization is desired, the default constructors can be overridden.
+For user-generated types, the four functions are also automatically generated.
+@enum@ types are handled the same as their underlying integral type, and unions are also bitwise copied and no-op initialized and destructed.
+For compatibility with C, a copy constructor from the first union member type is also defined.
+For @struct@ types, each of the four functions are implicitly defined to call their corresponding functions on each member of the struct.
+To better simulate the behaviour of C initializers, a set of \emph{field constructors} is also generated for structures.
+A constructor is generated for each non-empty prefix of a structure's member-list which copy-constructs the members passed as parameters and default-constructs the remaining members.
+To allow users to limit the set of constructors available for a type, when a user declares any constructor or destructor, the corresponding generated function and all field constructors for that type are hidden from expression resolution; similarly, the generated default constructor is hidden upon declaration of any constructor.
+These semantics closely mirror the rule for implicit declaration of constructors in \CC\cite[p.~186]{ANSI98:C++}.
+
+In rare situations user programmers may not wish to have constructors and destructors called; in these cases, \CFA provides an ``escape hatch'' to not call them.
+If a variable is initialized using the syntax \lstinline|S x @= {}| it will be an \emph{unmanaged object}, and will not have constructors or destructors called.
+Any C initializer can be the right-hand side of an \lstinline|@=| initializer, \eg \lstinline|Array a @= { 0, 0x0 }|, with the usual C initialization semantics.
+In addition to the expressive power, \lstinline|@=| provides a simple path for migrating legacy C code to \CFA, by providing a mechanism to incrementally convert initializers; the \CFA design team decided to introduce a new syntax for this escape hatch because we believe that our RAII implementation will handle the vast majority of code in a desirable way, and we wished to maintain familiar syntax for this common case.
 
 \subsection{Default Parameters}
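Reviewer note (illustration only, not part of the changeset): the RAII text added in the last hunk describes managed types whose constructors run at declaration and whose destructors run at scope exit. The sketch below is a rough C++ analogue of the paper's `Array` example, written under stated assumptions: it uses C++ constructor and destructor syntax rather than the \CFA `?{}` and `^?{}` operators, the `live_arrays` counter and the `live_inside_scope` function are hypothetical names added only to make the implicit calls observable, and the fill value 7 replaces the paper's `0xDEADBEEF`.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical counter, used only to observe constructor/destructor calls.
static int live_arrays = 0;

// C++ analogue of the paper's managed type: a heap-backed array of int.
class Array {
  public:
    // Default constructor, cf. the paper's ?{}( Array& arr ).
    Array() : Array(10, 0) {}

    // Constructor with initialization parameters,
    // cf. ?{}( Array& arr, int size, int fill ).
    Array(int size, int fill)
        : data_(static_cast<int *>(std::calloc(size, sizeof(int)))),
          len_(data_ ? size : 0) {
        for (int k = 0; k < len_; ++k) data_[k] = fill;
        ++live_arrays;
    }

    // Copy constructor, cf. ?{}( Array& arr, Array other ); makes a deep copy.
    Array(const Array & other) : Array(other.len_, 0) {
        for (int k = 0; k < len_; ++k) data_[k] = other.data_[k];
    }

    // Assignment is left out of this sketch.
    Array & operator=(const Array &) = delete;

    // Destructor, cf. ^?{}( Array& arr ); releases the heap storage.
    ~Array() {
        std::free(data_);
        --live_arrays;
    }

    int len() const { return len_; }
    int at(int k) const { return data_[k]; }

  private:
    int * data_;
    int len_;
};

// Returns the number of live Array objects while three are in scope.
int live_inside_scope() {
    Array x;                // default-constructed at declaration
    Array y(20, 7), z = y;  // cf. Array y = { 20, ... }, z = y;
    return live_arrays;     // value is read before x, y, z are destroyed
}                           // z, y, x destroyed here, in reverse order
```

Calling `live_inside_scope()` when no other `Array` exists observes three live objects inside the block and none afterwards, which matches the implicit construct-on-allocation and destruct-on-deallocation behaviour the added text describes for managed types. The sketch does not model the \CFA-specific features in the hunk (field constructors, the `@=` escape hatch, or operator-syntax explicit calls).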