Changeset 14cbfad for doc/papers


Ignore:
Timestamp:
Feb 17, 2018, 2:08:03 PM (4 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, resolv-new, with_gc
Children:
c0b4db0
Parents:
93401f8
Message:

complete draft of "with" statement

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/general/Paper.tex

    r93401f8 r14cbfad  
    233233
    234234C already has a limited form of ad-hoc polymorphism in the form of its basic arithmetic operators, which apply to a variety of different types using identical syntax.
    235 \CFA extends the built-in operator overloading by allowing users to define overloads for any function, not just operators, and even any variable; Section~\ref{sec:libraries} includes a number of examples of how this overloading simplifies \CFA programming relative to C.
     235\CFA extends the built-in operator overloading by allowing users to define overloads for any function, not just operators, and even any variable;
     236Section~\ref{sec:libraries} includes a number of examples of how this overloading simplifies \CFA programming relative to C.
    236237Code generation for these overloaded functions and variables is implemented by the usual approach of mangling the identifier names to include a representation of their type, while \CFA decides which overload to apply based on the same ``usual arithmetic conversions'' used in C to disambiguate operator overloads.
    237238As an example:
     
    254255The macro wrapping the generic expression imposes some limitations; as an example, it could not implement the example above, because the variables @max@ would collide with the functions @max@.
    255256Ergonomic limitations of @_Generic@ include the necessity to put a fixed list of supported types in a single place and manually dispatch to appropriate overloads, as well as possible namespace pollution from the functions dispatched to, which must all have distinct names.
     257
     258% http://fanf.livejournal.com/144696.html
     259% http://www.robertgamble.net/2012/01/c11-generic-selections.html
     260% https://abissell.com/2014/01/16/c11s-_generic-keyword-macro-applications-and-performance-impacts/
     261
    256262
    257263\subsection{\texorpdfstring{\LstKeywordStyle{forall} Functions}{forall Functions}}
     
    290296void * bsearch( const void * key, const void * base, size_t nmemb, size_t size,
    291297                                int (* compar)( const void *, const void * ));
     298
    292299int comp( const void * t1, const void * t2 ) { return *(double *)t1 < *(double *)t2 ? -1 :
    293300                                *(double *)t2 < *(double *)t1 ? 1 : 0; }
     301
    294302double key = 5.0, vals[10] = { /* 10 sorted float values */ };
    295303double * val = (double *)bsearch( &key, vals, 10, sizeof(vals[0]), comp );      $\C{// search sorted array}$
     
    300308        int comp( const void * t1, const void * t2 ) { /* as above with double changed to T */ }
    301309        return (T *)bsearch( &key, arr, size, sizeof(T), comp ); }
     310
    302311forall( otype T | { int ?<?( T, T ); } ) unsigned int bsearch( T key, const T * arr, size_t size ) {
    303312        T * result = bsearch( key, arr, size ); $\C{// call first version}$
    304313        return result ? result - arr : size; }  $\C{// pointer subtraction includes sizeof(T)}$
     314
    305315double * val = bsearch( 5.0, vals, 10 );        $\C{// selection based on return type}$
    306316int posn = bsearch( 5.0, vals, 10 );
     
    311321\CC's type-system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a templated @bsearch@.
    312322
    313 \CFA has replacement libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions, all without rewriting the actual computations.
     323\CFA has replacement libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions, all without rewriting the actual computations (see Section~\ref{sec:libraries}).
    314324For example, it is possible to write a type-safe \CFA wrapper @malloc@ based on the C @malloc@:
    315325\begin{lstlisting}
     
    346356\CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each function declaration:
    347357\begin{lstlisting}
    348 trait summable( otype T ) {
     358trait `summable`( otype T ) {
    349359        void ?{}( T *, zero_t );                                $\C{// constructor from 0 literal}$
    350360        T ?+?( T, T );                                                  $\C{// assortment of additions}$
     
    352362        T ++?( T * );
    353363        T ?++( T * ); };
     364
    354365forall( otype T `| summable( T )` ) T sum( T a[$\,$], size_t size ) {  // use trait
    355366        `T` total = { `0` };                                    $\C{// instantiate T from 0 by calling its constructor}$
     
    425436forall( otype T ) T value( pair( const char *, T ) p ) { return p.second; }
    426437forall( dtype F, otype T ) T value_p( pair( F *, T * ) p ) { return * p.second; }
     438
    427439pair( const char *, int ) p = { "magic", 42 };
    428440int magic = value( p );
     
    11501162@case@ clauses are made disjoint by the @break@ statement.
    11511163While the ability to fall through \emph{is} a useful form of control flow, it does not match well with programmer intuition, resulting in many errors from missing @break@ statements.
    1152 \CFA provides a new control structure, @choose@, which mimics @switch@, but reverses the meaning of fall through:
     1164For backwards compatibility, \CFA provides a \emph{new} control structure, @choose@, which mimics @switch@, but reverses the meaning of fall through:
    11531165\begin{cquote}
    11541166\lstDeleteShortInline@%
     
    11571169\begin{cfa}
    11581170`choose` ( day ) {
    1159   case Mon~Thu:
    1160         // program
    1161 
    1162   case Fri:
    1163         // program
     1171  case Mon~Thu:  // program
     1172
     1173  case Fri:  // program
    11641174        wallet += pay;
    11651175        `fallthrough;`
    1166   case Sat:
    1167         // party
     1176  case Sat:  // party
    11681177        wallet -= party;
    11691178
    1170   case Sun:
    1171         // rest
    1172 
    1173   default:
    1174         // error
     1179  case Sun:  // rest
     1180
     1181  default:  // error
    11751182}
    11761183\end{cfa}
     
    11781185\begin{cfa}
    11791186switch ( day ) {
    1180   case Mon: case Tue: case Wed: case Thu:
    1181         // program
     1187  case Mon: case Tue: case Wed: case Thu:  // program
    11821188        `break;`
    1183   case Fri:
    1184         // program
     1189  case Fri:  // program
    11851190        wallet += pay;
    11861191
    1187   case Sat:
    1188         // party
     1192  case Sat:  // party
    11891193        wallet -= party;
    11901194        `break;`
    1191   case Sun:
    1192         // rest
     1195  case Sun:  // rest
    11931196        `break;`
    1194   default:
    1195         // error
     1197  default:  // error
    11961198}
    11971199\end{cfa}
     
    12281230\label{s:WithClauseStatement}
    12291231
    1230 Grouping heterogenous data into \newterm{aggregate}s (structure/union) is a common programming practice, and an aggregate can be further organized into more complex structures, such as arrays and containers:
     1232Grouping heterogeneous data into \newterm{aggregate}s (structure/union) is a common programming practice, and an aggregate can be further organized into more complex structures, such as arrays and containers:
    12311233\begin{cfa}
    12321234struct S {                                                                      $\C{// aggregate}$
     
    12431245}
    12441246\end{cfa}
     1247which extends to multiple levels of qualification for nested aggregates.
    12451248A similar situation occurs in object-oriented programming, \eg \CC:
    12461249\begin{C++}
     
    12491252        int i;
    12501253        double d;
    1251         int mem() {                                                             $\C{// implicit "this" parameter}$
     1254        int f() {                                                               $\C{// implicit "this" parameter}$
    12521255                `this->`c; `this->`i; `this->`d;        $\C{// access containing fields}$
    12531256        }
    12541257}
    12551258\end{C++}
    1256 Nesting of member routines in a \lstinline[language=C++]@class@ allows eliding \lstinline[language=C++]@this->@ because of lexical scoping.
     1259Object-oriented nesting of member routines in a \lstinline[language=C++]@class@ allows eliding \lstinline[language=C++]@this->@ because of lexical scoping.
    12571260However, for other aggregate parameters, qualification is necessary:
    12581261\begin{cfa}
    12591262struct T { double m, n; };
    1260 int C::mem( T & t ) {                                           $\C{// multiple aggregate parameters}$
     1263int C::f( T & t ) {                                                     $\C{// multiple aggregate parameters}$
    12611264        c; i; d;                                                                $\C{\color{red}// this-\textgreater.c, this-\textgreater.i, this-\textgreater.d}$
    12621265        `t.`m; `t.`n;                                                   $\C{// must qualify}$
     
    12641267\end{cfa}
    12651268
    1266 % In object-oriented programming, there is an implicit first parameter, often names @self@ or @this@, which is elided.
    1267 % In any programming language, some functions have a naturally close relationship with a particular data type.
    1268 % Object-oriented programming allows this close relationship to be codified in the language by making such functions \newterm{class methods} of their related data type.
    1269 % Class methods have certain privileges with respect to their associated data type, notably un-prefixed access to the fields of that data type.
    1270 % When writing C functions in an object-oriented style, this un-prefixed access is swiftly missed, as access to fields of a @Foo* f@ requires an extra three characters @f->@ every time, which disrupts coding flow and clutters the produced code.
    1271 %
    1272 % \TODO{Fill out section. Be sure to mention arbitrary expressions in with-blocks, recent change driven by Thierry to prioritize field name over parameters.}
    1273 
    1274 To simplify the programmer experience, \CFA provides a @with@ clause/statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate qualification to fields by opening a scope containing the field identifiers.
     1269To simplify the programmer experience, \CFA provides a @with@ statement (see Pascal~\cite[\S~4.F]{Pascal}) to elide aggregate qualification to fields by opening a scope containing the field identifiers.
    12751270Hence, the qualified fields become variables with the side-effect that it is easier to optimizing field references in a block.
    12761271\begin{cfa}
    1277 void f( S s ) `with( s )` {                                     $\C{// with clause}$
    1278         c; i; d;                                                                $\C{\color{red}// s.c, s.i, s.d}$
    1279 }
    1280 \end{cfa}
    1281 and the equivalence for object-style programming is:
    1282 \begin{cfa}
    1283 int mem( S & this ) `with( this )` {            $\C{// with clause}$
     1272void f( S & this ) `with ( this )` {            $\C{// with statement}$
    12841273        c; i; d;                                                                $\C{\color{red}// this.c, this.i, this.d}$
    12851274}
     
    12871276with the generality of opening multiple aggregate-parameters:
    12881277\begin{cfa}
    1289 int mem( S & s, T & t ) `with( s, t )` {        $\C{// multiple aggregate parameters}$
     1278int f( S & s, T & t ) `with ( s, t )` {         $\C{// multiple aggregate parameters}$
    12901279        c; i; d;                                                                $\C{\color{red}// s.c, s.i, s.d}$
    12911280        m; n;                                                                   $\C{\color{red}// t.m, t.n}$
     
    12931282\end{cfa}
    12941283
    1295 In detail, the @with@ clause/statement has the form:
     1284In detail, the @with@ statement has the form:
    12961285\begin{cfa}
    12971286$\emph{with-statement}$:
     
    13051294
    13061295All expressions in the expression list are open in ``parallel'' within the compound statement.
    1307 This semantic is different from Pascal, which nests the openings.
    1308 The difference between parallel and nesting occurs for fields with the same name but different type:
    1309 \begin{cfa}
    1310 struct S { int i; int j; double m; } s, w;
    1311 struct T { int i; int k; int m } t, w;
    1312 with( s, t ) {
    1313         j + k;                                                                  $\C{// unambiguous, s.j + t.m}$
     1296This semantic is different from Pascal, which nests the openings from left to right.
     1297The difference between parallel and nesting occurs for fields with the same name and type:
     1298\begin{cfa}
     1299struct S { int `i`; int j; double m; } s, w;
     1300struct T { int `i`; int k; int m; } t, w;
     1301with ( s, t ) {
     1302        j + k;                                                                  $\C{// unambiguous, s.j + t.k}$
    13141303        m = 5.0;                                                                $\C{// unambiguous, t.m = 5.0}$
    13151304        m = 1;                                                                  $\C{// unambiguous, s.m = 1}$
    1316         int a = s.i + m;                                                $\C{// unambiguous, a = s.i + t.i}$
    1317         int b = s.i + t.i;                                              $\C{// unambiguous, qualification}$
    1318         sout | (double)m | endl;                                $\C{// unambiguous, cast}$
    1319         i;                                                                              $\C{// ambiguous}$
    1320 }
    1321 \end{cfa}
    1322 \CFA's ability to overload variables means usages of field with the same names can be automatically disambiguated, eliminating most qualification.
     1305        int a = m;                                                              $\C{// unambiguous, a = s.i }$
     1306        double b = m;                                                   $\C{// unambiguous, b = t.m}$
     1307        int c = s.i + t.i;                                              $\C{// unambiguous, qualification}$
     1308        (double)m;                                                              $\C{// unambiguous, cast}$
     1309}
     1310\end{cfa}
     1311For parallel semantics, both @s.i@ and @t.i@ are visible, so @i@ is ambiguous without qualification;
     1312for nested semantics, @t.i@ hides @s.i@, so @i@ implies @t.i@.
     1313\CFA's ability to overload variables means fields with the same name but different types are automatically disambiguated, eliminating most qualification when opening multiple aggregates.
    13231314Qualification or a cast is used to disambiguate.
    1324 A cast may be necessary to disambiguate between the overload variables in a @with@ expression:
    1325 \begin{cfa}
    1326 with( w ) { ... }                                                       $\C{// ambiguous, same name and no context}$
    1327 with( (S)w ) { ... }                                            $\C{// unambiguous}$
    1328 \end{cfa}
    1329 
     1315
     1316There is an interesting problem between parameters and the routine @with@, \eg:
     1317\begin{cfa}
     1318void ?{}( S & s, int i ) with ( s ) {           $\C{// constructor}$
     1319        `s.i = i;` j = 3; m = 5.5;
     1320}
     1321\end{cfa}
     1322Here, the assignment @s.i = i@ means @s.i = s.i@, which is meaningless, and there is no mechanism to qualify the parameter @i@, making the assignment impossible using the routine @with@.
     1323To solve this problem, parameters are treated like an initialized aggregate:
     1324\begin{cfa}
     1325struct Params {
     1326        S & s;
     1327        int i;
     1328} params;
     1329\end{cfa}
     1330and implicitly opened \emph{after} a routine open, to give them higher priority:
     1331\begin{cfa}
     1332void ?{}( S & s, int i ) with ( s ) `with( $\emph{\color{red}params}$ )` {
     1333        s.i = i; j = 3; m = 5.5;
     1334}
     1335\end{cfa}
     1336Finally, a cast may be used to disambiguate among overload variables in a @with@ expression:
     1337\begin{cfa}
     1338with ( w ) { ... }                                                      $\C{// ambiguous, same name and no context}$
     1339with ( (S)w ) { ... }                                           $\C{// unambiguous, cast}$
     1340\end{cfa}
     1341and @with@ expressions may be pointers and references (see Section~\ref{s:References}) to aggregates:
    13301342\begin{cfa}
    13311343struct S { int i, j; } sv;
    1332 with( sv ) {
     1344with ( sv ) {                                                           $\C{variable}$
    13331345        S & sr = sv;
    1334         with( sr ) {
     1346        with ( sr ) {                                                   $\C{reference}$
    13351347                S * sp = &sv;
    1336                 with( *sp ) {
     1348                with ( *sp ) {                                          $\C{pointer}$
    13371349                        i = 3; j = 4;                                   $\C{\color{red}// sp-{\textgreater}i, sp-{\textgreater}j}$
    13381350                }
     
    13431355\end{cfa}
    13441356
    1345 The statement form is used within a block:
    1346 \begin{cfa}
    1347 int foo() {
    1348         struct S1 { ... } s1;
    1349         struct S2 { ... } s2;
    1350         `with( s1 )` {                                                  $\C{// with statement}$
    1351                 // access fields of s1 without qualification
    1352                 `with( s2 )` {                                          $\C{// nesting}$
    1353                         // access fields of s1 and s2 without qualification
    1354                 }
    1355         }
    1356         `with( s1, s2 )` {
    1357                 // access unambiguous fields of s1 and s2 without qualification
    1358         }
    1359 }
    1360 \end{cfa}
    13611357
    13621358% \subsection{Exception Handling ???}
     1359
    13631360
    13641361\section{Declarations}
     
    16151612
    16161613\subsection{References}
     1614\label{s:References}
    16171615
    16181616All variables in C have an \newterm{address}, a \newterm{value}, and a \newterm{type};
Note: See TracChangeset for help on using the changeset viewer.