Timestamp:
Mar 4, 2021, 7:40:25 PM (5 years ago)
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
77d601f
Parents:
342af53 (diff), a5040fe (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

File:
1 edited

  • doc/theses/andrew_beach_MMath/existing.tex

    r342af53 r8e4aa05  
    1 \chapter{\CFA{} Existing Features}
    2 
    3 \section{Overloading and extern}
    4 Cforall has overloading, allowing multiple definitions of the same name to
    5 be defined.
    6 
    7 This also adds name mangling so that the assembly symbols are unique for
    8 different overloads. For compatibility with names in C there is also a
    9 syntax to disable the name mangling. These unmangled names cannot be overloaded
    10 but act as the interface between C and \CFA code.
    11 
    12 The syntax for disabling mangling is:
    13 \begin{lstlisting}
    14 extern "C" {
    15     ...
    16 }
    17 \end{lstlisting}
    18 
    19 To re-enable mangling once it is disabled the syntax is:
    20 \begin{lstlisting}
    21 extern "Cforall" {
    22     ...
    23 }
    24 \end{lstlisting}
    25 
    26 Both should occur at the declaration level and affect all the declarations
    27 in \texttt{...}. Neither cares about the state of mangling when they begin
    28 and will return to that state after the group is finished. So re-enabling
    29 is only used to nest areas of mangled and unmangled declarations.
    30 
    31 \section{References}
    32 \CFA adds references to C. These are auto-dereferencing pointers and use the
    33 same syntax as pointers except they use ampersand (\codeCFA{\&}) instead of
    34 the asterisk (\codeCFA{*}). They can also be constant or mutable; if they
    35 are mutable they may be assigned to by using the address-of operator
    36 (\codeCFA{\&}) which converts them into a pointer.
     1\chapter{\CFA Existing Features}
     2
     3\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with
     4modern safety and productivity features, while still ensuring backwards
     5compatibility with C and its programmers.  \CFA is designed to have an
     6orthogonal feature-set based closely on the C programming paradigm
     7(non-object-oriented) and these features can be added incrementally to an
     8existing C code-base allowing programmers to learn \CFA on an as-needed basis.
     9
     10Only those \CFA features pertinent to this thesis are discussed.  Many of the
     11\CFA syntactic and semantic features used in the thesis should be fairly
     12obvious to the reader.
     13
     14\section{Overloading and \lstinline{extern}}
     15\CFA has extensive overloading, allowing multiple definitions of the same name
     16to be defined.~\cite{Moss18}
     17\begin{cfa}
     18char i; int i; double i;                        $\C[3.75in]{// variable overload}$
     19int f(); double f();                            $\C{// return overload}$
     20void g( int ); void g( double );        $\C{// parameter overload}\CRT$
     21\end{cfa}
     22This feature requires name mangling so the assembly symbols are unique for
     23different overloads. For compatibility with names in C, there is also a syntax
     24to disable name mangling. These unmangled names cannot be overloaded but act as
     25the interface between C and \CFA code.  The syntax for disabling/enabling
     26mangling is:
     27\begin{cfa}
     28// name mangling
     29int i; // _X1ii_1
     30@extern "C"@ {  // no name mangling
     31        int j; // j
     32        @extern "Cforall"@ {  // name mangling
     33                int k; // _X1ki_1
     34        }
     35        // no name mangling
     36}
     37// name mangling
     38\end{cfa}
     39Both forms of @extern@ affect all the declarations within their nested lexical
     40scope and transition back to the previous mangling state when the lexical scope
      41ends.
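For example, as an illustrative sketch, a standard C routine such as @sqrt@ can
be declared inside an @extern "C"@ block so \CFA code calls it through its
unmangled C symbol:
\begin{cfa}
extern "C" {
	double sqrt( double );  // C library routine, symbol remains "sqrt"
}
double r = sqrt( 2.0 );  // call resolves to the unmangled C symbol
\end{cfa}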
     42
     43\section{Reference Type}
      44\CFA adds a rebindable reference type to C that is more expressive than the \Cpp
     45reference.  Multi-level references are allowed and act like auto-dereferenced
     46pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA
     47references may also be mutable or non-mutable. If mutable, a reference variable
     48may be assigned to using the address-of operator (@&@), which converts the
     49reference to a pointer.
     50\begin{cfa}
     51int i, j;
     52int @&@ ri = i, @&&@ rri = ri;
     53rri = 3;  // auto-dereference assign to i
     54@&@ri = @&@j; // rebindable
     55ri = 5;   // assign to j
     56\end{cfa}
    3757
    3858\section{Constructors and Destructors}
    3959
    4060Both constructors and destructors are operators, which means they are just
    41 functions with special names. The special names are used to define them and
    42 may be used to call the functions explicitly. The \CFA special names are
    43 constructed by taking the tokens in the operators and putting \texttt{?} where
    44 the arguments would go. So multiplication is \texttt{?*?} while dereference
    45 is \texttt{*?}. This also makes it easy to tell the difference between
    46 pre-fix operations (such as \texttt{++?}) and post-fix operations
    47 (\texttt{?++}).
    48 
    49 The special name for constructors is \texttt{?\{\}}, which comes from the
    50 initialization syntax in C. The special name for destructors is
    51 \texttt{\^{}?\{\}}. % I don't like the \^{} symbol but $^\wedge$ isn't better.
    52 
    53 Any time an object of type T goes out of scope the destructor matching
    54 \codeCFA{void ^?\{\}(T \&);} is called. In theory this is also true of
    55 primitive types such as \codeCFA{int}, but in practice those are no-ops and
    56 are usually omitted for optimization.
      61functions with special operator names rather than the type names used in \Cpp. The
     62special operator names may be used to call the functions explicitly (not
     63allowed in \Cpp for constructors).
     64
     65In general, operator names in \CFA are constructed by bracketing an operator
      66token with @?@, which indicates where the arguments go. For example, infix
      67multiplication is @?*?@ while prefix dereference is @*?@. This syntax makes it
      68easy to tell the difference between prefix operations (such as @++?@) and
      69postfix operations (@?++@).
     70
     71The special name for a constructor is @?{}@, which comes from the
      72initialization syntax in C. The special name for a destructor is @^?{}@, where
     73the @^@ has no special meaning.
     74% I don't like the \^{} symbol but $^\wedge$ isn't better.
     75\begin{cfa}
     76struct T { ... };
     77void ?@{}@(@T &@ this, ...) { ... }  // constructor
      78void @^?{}@(@T &@ this, ...) { ... } // destructor
     79{
     80        T s = @{@ ... @}@;  // same constructor/initialization braces
     81} // destructor call automatically generated
     82\end{cfa}
     83The first parameter is a reference parameter to the type for the
     84constructor/destructor. Destructors may have multiple parameters.  The compiler
      85implicitly matches an overloaded constructor @void ?{}(T &, ...);@ to an
     86object declaration with associated initialization, and generates a construction
     87call after the object is allocated. When an object goes out of scope, the
     88matching overloaded destructor @void ^?{}(T &);@ is called.  Without explicit
     89definition, \CFA creates a default and copy constructor, destructor and
     90assignment (like \Cpp). It is possible to define constructors/destructors for
     91basic and existing types.
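As a sketch of the last point, a constructor and destructor can be written for a
pre-existing C type (the @Buffer@ type here is hypothetical), giving it automatic
initialization and cleanup in \CFA code:
\begin{cfa}
#include <stdlib.h>
struct Buffer { char * data; size_t size; };      // plain C type
void ?{}( Buffer & this, size_t size ) {          // constructor
	this.data = (char *)malloc( size );
	this.size = size;
}
void ^?{}( Buffer & this ) { free( this.data ); } // destructor
{
	Buffer b = { 1024 };  // constructed with a 1K allocation
} // ^?{}( b ) called automatically, freeing the allocation
\end{cfa}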
    5792
    5893\section{Polymorphism}
    59 \CFA uses polymorphism to create functions and types that are defined over
    60 different types. \CFA polymorphic declarations serve the same role as \CPP
    61 templates or Java generics.
    62 
    63 Polymorphic declarations start with a forall clause that goes before the
    64 standard (monomorphic) declaration. These declarations have the same syntax
    65 except that you may use the names introduced by the forall clause in them.
    66 
    67 Forall clauses are written \codeCFA{forall( ... )} where \codeCFA{...} becomes
    68 the list of polymorphic variables (local type names) and assertions, which
    69 represent required operations on those types.
    70 
    71 \begin{lstlisting}
    72 forall(dtype T | { void do_once(T &); })
    73 void do_twice(T & value) {
    74     do_once(value);
    75     do_once(value);
    76 }
    77 \end{lstlisting}
    78 
    79 A polymorphic function can be used in the same way normal functions are.
    80 The polymorphic variables are filled in with concrete types and the
    81 assertions are checked. An assertion is checked by seeing if a name of that
    82 type (with all the variables replaced with the concrete types) is defined at
    83 the call site.
    84 
    85 As an example, even if no function named \codeCFA{do_once} is defined
    86 near the definition of \codeCFA{do_twice}, the following code will work.
    87 \begin{lstlisting}
     94\CFA uses parametric polymorphism to create functions and types that are
     95defined over multiple types. \CFA polymorphic declarations serve the same role
     96as \Cpp templates or Java generics. The ``parametric'' means the polymorphism is
      97accomplished by passing argument operations to associated \emph{parameters} at
     98the call site, and these parameters are used in the function to differentiate
     99among the types the function operates on.
     100
     101Polymorphic declarations start with a universal @forall@ clause that goes
     102before the standard (monomorphic) declaration. These declarations have the same
     103syntax except they may use the universal type names introduced by the @forall@
     104clause.  For example, the following is a polymorphic identity function that
     105works on any type @T@:
     106\begin{cfa}
     107@forall( T )@ @T@ identity( @T@ val ) { return val; }
     108int forty_two = identity( 42 ); // T bound to int, forty_two == 42
     109\end{cfa}
     110
     111To allow a polymorphic function to be separately compiled, the type @T@ must be
     112constrained by the operations used on @T@ in the function body. The @forall@
      113clause is augmented with a list of polymorphic variables (local type names)
     114and assertions (constraints), which represent the required operations on those
     115types used in a function, \eg:
     116\begin{cfa}
      117forall( T @| { void do_once(T &); }@) // assertion
      118void do_twice(T & value) {
     119        do_once(value);
     120        do_once(value);
     121}
      122void do_once(int & i) { ... }  // provide assertion
     123int i;
     124do_twice(i); // implicitly pass assertion do_once to do_twice
     125\end{cfa}
     126Any object with a type fulfilling the assertion may be passed as an argument to
     127a @do_twice@ call.
     128
     129A polymorphic function can be used in the same way as a normal function.  The
     130polymorphic variables are filled in with concrete types and the assertions are
     131checked. An assertion is checked by verifying each assertion operation (with
     132all the variables replaced with the concrete types from the arguments) is
     133defined at a call site.
     134
     135Note, a function named @do_once@ is not required in the scope of @do_twice@ to
     136compile it, unlike \Cpp template expansion. Furthermore, call-site inferencing
      137allows local replacement of the most specific parametric function needed for a
     138call.
     139\begin{cfa}
      140void do_once(double & y) { ... } // global
    88141int quadruple(int x) {
    89     void do_once(int & y) {
    90         y = y * 2;
    91     }
    92     do_twice(x);
    93     return x;
    94 }
    95 \end{lstlisting}
    96 This is not the recommended way to implement a quadruple function but it
    97 does work. The compiler will deduce that \codeCFA{do_twice}'s T is an
    98 integer from the argument. It will then look for a definition matching the
    99 assertion which is the \codeCFA{do_once} defined within the function. That
    100 function will be passed in as a function pointer to \codeCFA{do_twice} and
    101 called within it.
    102 
    103 To avoid typing out long lists of assertions again and again there are also
    104 traits which collect assertions into convenient packages that can then be used
    105 in assertion lists instead of all of their components.
    106 \begin{lstlisting}
    107 trait done_once(dtype T) {
    108     void do_once(T &);
    109 }
    110 \end{lstlisting}
    111 
    112 After this the forall list in the previous example could be written
    113 with the trait instead of the assertion itself.
    114 \begin{lstlisting}
    115 forall(dtype T | done_once(T))
    116 \end{lstlisting}
    117 
    118 Traits can have an arbitrary number of assertions in them and are usually used to
    119 create shorthands for, and give descriptive names to, common groupings of
    120 assertions.
    121 
    122 Polymorphic structures and unions may also be defined by putting a forall
    123 clause before the declaration. The type variables work the same way except
    124 they are now used in field declarations instead of parameters and local variables.
    125 
    126 \begin{lstlisting}
      142        void do_once(int & y) { y = y * 2; } // local
     143        do_twice(x); // using local "do_once"
     144        return x;
     145}
     146\end{cfa}
      147Specifically, the compiler deduces that @do_twice@'s T is an integer from the
     148argument @x@. It then looks for the most specific definition matching the
     149assertion, which is the nested integral @do_once@ defined within the
     150function. The matched assertion function is then passed as a function pointer
     151to @do_twice@ and called within it.
     152
      153To avoid typing long lists of assertions, constraints can be collected into
      154a convenient package called a @trait@, which can then be used in an assertion
     155instead of the individual constraints.
     156\begin{cfa}
     157trait done_once(T) {
      158        void do_once(T &);
     159}
     160\end{cfa}
     161and the @forall@ list in the previous example is replaced with the trait.
     162\begin{cfa}
      163forall( T | @done_once(T)@ )
     164\end{cfa}
     165In general, a trait can contain an arbitrary number of assertions, both
      166functions and variables, and is usually used to create a shorthand for, and
     167give descriptive names to, common groupings of assertions describing a certain
     168functionality, like @sumable@, @listable@, \etc.
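For example, a @sumable@-style trait might bundle the operations needed to total
values of a type; the following is only a sketch (using \CFA's @zero_t@, the type
of the @0@ literal), not the exact library definition:
\begin{cfa}
trait sumable( T ) {
	void ?{}( T &, zero_t );  // construct from the 0 literal
	T ?+?( T, T );            // addition
}
forall( T | sumable( T ) )
T sum( size_t size, T values[] ) {
	T total = 0;              // uses the zero_t constructor
	for ( size_t i = 0; i < size; i += 1 ) total = total + values[i];
	return total;
}
\end{cfa}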
     169
     170Polymorphic structures and unions are defined by qualifying the aggregate type
     171with @forall@. The type variables work the same except they are used in field
     172declarations instead of parameters, returns, and local variable declarations.
     173\begin{cfa}
    127174forall(dtype T)
    128175struct node {
    129     node(T) * next;
    130     T * data;
    131 }
    132 \end{lstlisting}
    133 
    134 The \codeCFA{node(T)} is a use of a polymorphic structure. Polymorphic types
    135 must be provided their polymorphic parameters.
    136 
    137 There are many other features of polymorphism that have not been given here but
    138 these are the ones used by the exception system.
     176        node(T) * next;  // generic linked node
     177        T * data;
      178};
     179\end{cfa}
     180The generic type @node(T)@ is an example of a polymorphic-type usage.  Like \Cpp
      181template usage, a polymorphic-type usage must specify a type parameter.
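As a brief usage sketch (the variables are hypothetical), each instantiation
supplies a type argument and the fields then have the corresponding types:
\begin{cfa}
int i = 42;
node(int) n = { 0, &i };  // T bound to int: next is node(int) *, data is int *
node(node(int)) outer;    // the type argument may itself be generic
\end{cfa}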
     182
     183There are many other polymorphism features in \CFA but these are the ones used
     184by the exception system.
    139185
    140186\section{Concurrency}
    141 
    142 \CFA has a number of concurrency features, \codeCFA{thread}s,
    143 \codeCFA{monitor}s and \codeCFA{mutex} parameters, \codeCFA{coroutine}s and
    144 \codeCFA{generator}s. The two features that interact with the exception system
    145 are \codeCFA{thread}s and \codeCFA{coroutine}s; they and their supporting
    146 constructs will be described here.
    147 
    148 \subsection{Coroutines}
    149 Coroutines are routines that do not have to finish execution to hand control
    150 back to their caller; instead they may suspend their execution at any time
    151 and resume it later.
    152 Coroutines are not true concurrency but share some similarities and many of
    153 the same underpinnings and so are included as part of the \CFA threading
    154 library.
    155 
    156 In \CFA coroutines are created using the \codeCFA{coroutine} keyword which
    157 works just like \codeCFA{struct} except that the created structure will be
    158 modified by the compiler to satisfy the \codeCFA{is_coroutine} trait.
    159 
    160 These structures act as the interface between callers and the coroutine,
    161 the fields are used to pass information in and out. Here is a simple example
    162 where the single field is used to pass the next number in a sequence out.
    163 \begin{lstlisting}
     187\CFA has a number of concurrency features: @thread@, @monitor@, @mutex@
     188parameters, @coroutine@ and @generator@. The two features that interact with
     189the exception system are @thread@ and @coroutine@; they and their supporting
     190constructs are described here.
     191
     192\subsection{Coroutine}
     193A coroutine is a type with associated functions, where the functions are not
     194required to finish execution when control is handed back to the caller. Instead
     195they may suspend execution at any time and be resumed later at the point of
     196last suspension. (Generators are stackless and coroutines are stackful.) These
     197types are not concurrent but share some similarities along with common
     198underpinnings, so they are combined with the \CFA threading library. Further
     199discussion in this section only refers to the coroutine because generators are
     200similar.
     201
     202In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
      203aggregate type like @struct@, except the structure is implicitly modified by
     204the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
     205restricted by the type system to types that provide this special trait.  The
     206coroutine structure acts as the interface between callers and the coroutine,
     207and its fields are used to pass information in and out of coroutine interface
     208functions.
     209
     210Here is a simple example where a single field is used to pass (communicate) the
     211next number in a sequence.
     212\begin{cfa}
    164213coroutine CountUp {
    165     unsigned int next;
    166 }
    167 \end{lstlisting}
    168 
    169 The routine part of the coroutine is a main function for the coroutine. It
    170 takes a reference to a coroutine object and returns nothing. In this function,
    171 and any functions called by this function, the suspend statement may be used
    172 to return execution to the coroutine's caller. When control returns to the
    173 function it continues from that same suspend statement instead of at the top
    174 of the function.
    175 \begin{lstlisting}
    176 void main(CountUp & this) {
    177     unsigned int next = 0;
    178     while (true) {
    179         this.next = next;
    180         suspend;
    181         next = next + 1;
    182     }
    183 }
    184 \end{lstlisting}
    185 
    186 Control is passed to the coroutine with the resume function. This includes the
    187 first time when the coroutine is starting up. The resume function takes a
    188 reference to the coroutine structure and returns the same reference. The
    189 return value is for easy access to communication variables. For example the
    190 next value from a count-up can be generated and collected in a single
    191 expression: \codeCFA{resume(count).next}.
     214        unsigned int next; // communication variable
      215};
     216CountUp countup;
     217\end{cfa}
      218Each coroutine has a @main@ function, which takes a reference to a coroutine
     219object and returns @void@.
     220\begin{cfa}[numbers=left]
     221void main(@CountUp & this@) { // argument matches trait is_coroutine
     222        unsigned int up = 0;  // retained between calls
     223        while (true) {
      224                this.next = up; // make "up" available outside function
     225                @suspend;@$\label{suspend}$
     226                up += 1;
     227        }
     228}
     229\end{cfa}
     230In this function, or functions called by this function (helper functions), the
     231@suspend@ statement is used to return execution to the coroutine's caller
     232without terminating the coroutine.
     233
     234A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
     235The first resume calls the @main@ function at the top. Thereafter, resume calls
     236continue a coroutine in the last suspended function after the @suspend@
     237statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
     238a reference to the coroutine structure and returns the same reference. The
     239return value allows easy access to communication variables defined in the
     240coroutine object. For example, the @next@ value for coroutine object @countup@
     241is both generated and collected in the single expression:
     242@resume(countup).next@.
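Putting the pieces together, a hypothetical driver can print the first few
values of the sequence (using C's @printf@):
\begin{cfa}
#include <stdio.h>
int main() {
	for ( int i = 0; i < 5; i += 1 ) {
		printf( "%u\n", resume( countup ).next );  // prints 0 1 2 3 4
	}
}
\end{cfa}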
    192243
    193244\subsection{Monitors and Mutex}
    194 
    195 True concurrency does not guarantee ordering. To get some of that ordering back
    196 \CFA uses monitors and mutex (mutual exclusion) parameters. A monitor is
    197 another special declaration that contains a lock and is compatible with mutex
    198 parameters.
    199 
    200 Function parameters can have the \codeCFA{mutex} qualifiers on reference
    201 arguments, for example \codeCFA{void example(a_monitor & mutex arg);}. When the
    202 function is called it will acquire the lock on all of the mutex parameters.
    203 
    204 This means that all functions that mutex on a type are part of a critical
    205 section and only one will ever run at a time.
     245Concurrency does not guarantee ordering; without ordering results are
     246non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
     247(mutual exclusion) parameters. A monitor is another kind of aggregate, where
     248the compiler implicitly inserts a lock and instances are compatible with
     249@mutex@ parameters.
     250
      251A function that requires deterministic (ordered) execution acquires mutual
     252exclusion on a monitor object by qualifying an object reference parameter with
     253@mutex@.
     254\begin{cfa}
     255void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB);
     256\end{cfa}
     257When the function is called, it implicitly acquires the monitor lock for all of
     258the mutex parameters without deadlock.  This semantics means all functions with
     259the same mutex type(s) are part of a critical section for objects of that type
     260and only one runs at a time.
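As a sketch (the @Counter@ type and its functions are hypothetical), a monitor
wraps the shared data and each @mutex@ function becomes part of its critical
section:
\begin{cfa}
monitor Counter {
	int count;
};
void increment( Counter & @mutex@ this ) { this.count += 1; }  // lock held during call
int get( Counter & @mutex@ this ) { return this.count; }       // lock held during call
\end{cfa}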
    206261
    207262\subsection{Threads}
    208 While coroutines allow new things to be done with a single execution path,
    209 threads actually introduce new paths of execution that continue independently.
    210 Now for threads to work together there must be some communication between them
    211 and that means the timing of certain operations does have to be known. There
    212 are various means of synchronization and mutual exclusion provided by \CFA but
    213 for exceptions only the basic two -- fork and join -- are needed.
    214 
    215 Threads are created like coroutines except the keyword is changed:
    216 \begin{lstlisting}
     263Functions, generators, and coroutines are sequential so there is only a single
     264(but potentially sophisticated) execution path in a program. Threads introduce
     265multiple execution paths that continue independently.
     266
     267For threads to work safely with objects requires mutual exclusion using
      268monitors and mutex parameters. For threads to work safely with other threads
      269also requires mutual exclusion in the form of a communication rendezvous, which
      270also supports internal synchronization as for mutex objects. For exceptions,
      271only the two basic operations are important: thread fork and join.
     272
     273Threads are created like coroutines with an associated @main@ function:
     274\begin{cfa}
    217275thread StringWorker {
    218     const char * input;
    219     int result;
     276        const char * input;
     277        int result;
    220278};
    221 
    222279void main(StringWorker & this) {
    223     const char * localCopy = this.input;
    224     // ... do some work, perhaps hashing the string ...
    225     this.result = result;
    226 }
    227 \end{lstlisting}
    228 The main function will start executing after the fork operation and continue
    229 executing until it is finished. If another thread joins with this one it will
    230 wait until main has completed execution. In other words everything the thread
    231 does is between fork and join.
    232 
    233 From the outside this is the creation and destruction of the thread object.
    234 Fork happens after the constructor is run and join happens before the
    235 destructor runs. Join also happens during the \codeCFA{join} function which
    236 can be used to join a thread earlier. If it is used the destructor does not
    237 join as that has already been completed.
     280        const char * localCopy = this.input;
     281        // ... do some work, perhaps hashing the string ...
     282        this.result = result;
     283}
     284{
     285        StringWorker stringworker; // fork thread running in "main"
     286} // implicitly join with thread $\(\Rightarrow\)$ wait for completion
     287\end{cfa}
     288The thread main is where a new thread starts execution after a fork operation
     289and then the thread continues executing until it is finished. If another thread
     290joins with an executing thread, it waits until the executing main completes
     291execution. In other words, everything a thread does is between a fork and join.
     292
     293From the outside, this behaviour is accomplished through creation and
     294destruction of a thread object.  Implicitly, fork happens after a thread
     295object's constructor is run and join happens before the destructor runs. Join
     296can also be specified explicitly using the @join@ function to wait for a
     297thread's completion independently from its deallocation (\ie destructor
     298call). If @join@ is called explicitly, the destructor does not implicitly join.
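As a sketch of the explicit form (the constructor is hypothetical, used to set
the input before the fork), the result can be read safely once @join@ returns
and the destructor then skips the implicit join:
\begin{cfa}
void ?{}( StringWorker & this, const char * s ) { this.input = s; }  // runs before fork
{
	StringWorker worker = { "some string" };  // construct, then fork
	// ... other work in the current thread ...
	join( worker );          // explicit join: wait for the worker to finish
	int r = worker.result;   // safe to read after the join
} // destructor runs, but does not join again
\end{cfa}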