Context Navigation

-              r62c0f8a
+              r6c79bef
 \chapter{\CFA Existing Features}
+\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with modern safety and productivity features, while still ensuring backwards compatibility with C and its programmers.
+\CFA is designed to have an orthogonal feature-set based closely on the C programming paradigm (non-object-oriented) and these features can be added incrementally to an existing C code-base allowing programmers to learn \CFA on an as-needed basis.
+\section{Overloading and extern}
+Cforall has overloading, allowing multiple definitions of the same name to
+be defined.~\cite{Moss18}
+This also adds name mangling so that the assembly symbols are unique for
+different overloads. For compatability with names in C there is also a
+syntax to diable the name mangling. These unmangled names cannot be overloaded
+but act as the interface between C and \CFA code.
+The syntax for disabling mangling is:
+\begin{cfa}
+extern "C" {
+    ...
+}
+\end{cfa}
+To re-enable mangling once it is disabled the syntax is:
+\begin{cfa}
+extern "Cforall" {
+    ...
+}
+\end{cfa}
+Both should occur at the declaration level and effect all the declarations
+in @...@. Neither care about the state of mangling when they begin
+and will return to that state after the group is finished. So re-enabling
+is only used to nest areas of mangled and unmangled declarations.
+\section{References}
+\CFA adds references to C. These are auto-dereferencing pointers and use the
+same syntax as pointers except they use ampersand (@&@) instead of
+the asterisk (@*@). They can also be constaint or mutable, if they
+are mutable they may be assigned to by using the address-of operator
+(@&@) which converts them into a pointer.
+\CFA (C-for-all)~\cite{Cforall} is an open-source project extending ISO C with
+modern safety and productivity features, while still ensuring backwards
+compatibility with C and its programmers.  \CFA is designed to have an
+orthogonal feature-set based closely on the C programming paradigm
+(non-object-oriented) and these features can be added incrementally to an
+existing C code-base allowing programmers to learn \CFA on an as-needed basis.
+Only those \CFA features pertinent to this thesis are discussed.  Many of the
+\CFA syntactic and semantic features used in the thesis should be fairly
+obvious to the reader.
+\section{Overloading and \lstinline|extern|}
+\CFA has extensive overloading, allowing multiple definitions of the same name
+to be defined.~\cite{Moss18}
+\begin{cfa}
+char i; int i; double i;                        $\C[3.75in]{// variable overload}$
+int f(); double f();                            $\C{// return overload}$
+void g( int ); void g( double );        $\C{// parameter overload}\CRT$
+\end{cfa}
+This feature requires name mangling so the assembly symbols are unique for
+different overloads. For compatibility with names in C, there is also a syntax
+to disable name mangling. These unmangled names cannot be overloaded but act as
+the interface between C and \CFA code.  The syntax for disabling/enabling
+mangling is:
+\begin{cfa}
+// name mangling
+int i; // _X1ii_1
+@extern "C"@ {  // no name mangling
+        int j; // j
+        @extern "Cforall"@ {  // name mangling
+                int k; // _X1ki_1
+        }
+        // no name mangling
+}
+// name mangling
+\end{cfa}
+Both forms of @extern@ affect all the declarations within their nested lexical
+scope and transition back to the previous mangling state when the lexical scope
+ends.
+\section{Reference Type}
+\CFA adds a rebindable reference type to C, but more expressive than the \CC
+reference.  Multi-level references are allowed and act like auto-dereferenced
+pointers using the ampersand (@&@) instead of the pointer asterisk (@*@). \CFA
+references may also be mutable or non-mutable. If mutable, a reference variable
+may be assigned to using the address-of operator (@&@), which converts the
+reference to a pointer.
+\begin{cfa}
+int i, j;
+int @&@ ri = i, @&&@ rri = ri;
+rri = 3;  // auto-dereference assign to i
+@&@ri = @&@j; // rebindable
+ri = 5;   // assign to j
+\end{cfa}
 \section{Constructors and Destructors}
 Both constructors and destructors are operators, which means they are just
+functions with special names. The special names are used to define them and
+may be used to call the functions expicately. The \CFA special names are
+constructed by taking the tokens in the operators and putting @?@ where
+the arguments would go. So multiplication is @?*?@ while dereference
+is @*?@. This also make it easy to tell the difference between
+pre-fix operations (such as @++?@) and post-fix operations
+(@?++@).
+The special name for contructors is @?{}@, which comes from the
+initialization syntax in C. The special name for destructors is
+@^{}@. % I don't like the \^{} symbol but $^\wedge$ isn't better.
+Any time a type T goes out of scope the destructor matching
+@void ^?{}(T &);@ is called. In theory this is also true of
+primitive types such as @int@, but in practice those are no-ops and
+are usually omitted for optimization.
+functions with special operator names rather than type names in \CC. The
+special operator names may be used to call the functions explicitly (not
+allowed in \CC for constructors).
+In general, operator names in \CFA are constructed by bracketing an operator
+token with @?@, which indicates where the arguments. For example, infixed
+multiplication is @?*?@ while prefix dereference is @*?@. This syntax make it
+easy to tell the difference between prefix operations (such as @++?@) and
+post-fix operations (@?++@).
+The special name for a constructor is @?{}@, which comes from the
+initialization syntax in C. The special name for a destructor is @^{}@, where
+the @^@ has no special meaning.
+% I don't like the \^{} symbol but $^\wedge$ isn't better.
+\begin{cfa}
+struct T { ... };
+void ?@{}@(@T &@ this, ...) { ... }  // constructor
+void ?@^{}@(@T &@ this, ...) { ... } // destructor
+{
+        T s = @{@ ... @}@;  // same constructor/initialization braces
+} // destructor call automatically generated
+\end{cfa}
+The first parameter is a reference parameter to the type for the
+constructor/destructor. Destructors may have multiple parameters.  The compiler
+implicitly matches an overloaded constructor @void ^?{}(T &, ...);@ to an
+object declaration with associated initialization, and generates a construction
+call after the object is allocated. When an object goes out of scope, the
+matching overloaded destructor @void ^?{}(T &);@ is called.  Without explicit
+definition, \CFA creates a default and copy constructor, destructor and
+assignment (like \CC). It is possible to define constructors/destructors for
+basic and existing types.
 \section{Polymorphism}
+\CFA uses polymorphism to create functions and types that are defined over
+different types. \CFA polymorphic declarations serve the same role as \CC
+templates or Java generics.
+Polymorphic declaractions start with a forall clause that goes before the
+standard (monomorphic) declaration. These declarations have the same syntax
+except that you may use the names introduced by the forall clause in them.
+Forall clauses are written @forall( ... )@ where @...@ becomes
+the list of polymorphic variables (local type names) and assertions, which
+repersent required operations on those types.
+\begin{cfa}
+forall(dtype T | { void do_once(T &); })
+void do_twice(T & value) {
+    do_once(value);
+    do_once(value);
+}
+\end{cfa}
+A polymorphic function can be used in the same way normal functions are.
+The polymorphics variables are filled in with concrete types and the
+assertions are checked. An assertion checked by seeing if that name of that
+type (with all the variables replaced with the concrete types) is defined at
+the the call site.
+As an example, even if no function named @do_once@ is not defined
+near the definition of @do_twice@ the following code will work.
+\begin{cfa}
+\CFA uses parametric polymorphism to create functions and types that are
+defined over multiple types. \CFA polymorphic declarations serve the same role
+as \CC templates or Java generics. The ``parametric'' means the polymorphism is
+accomplished by passing argument operations to associate \emph{parameters} at
+the call site, and these parameters are used in the function to differentiate
+among the types the function operates on.
+Polymorphic declarations start with a universal @forall@ clause that goes
+before the standard (monomorphic) declaration. These declarations have the same
+syntax except they may use the universal type names introduced by the @forall@
+clause.  For example, the following is a polymorphic identity function that
+works on any type @T@:
+\begin{cfa}
+@forall( T )@ @T@ identity( @T@ val ) { return val; }
+int forty_two = identity( 42 ); // T bound to int, forty_two == 42
+\end{cfa}
+To allow a polymorphic function to be separately compiled, the type @T@ must be
+constrained by the operations used on @T@ in the function body. The @forall@
+clauses is augmented with a list of polymorphic variables (local type names)
+and assertions (constraints), which represent the required operations on those
+types used in a function, \eg:
+\begin{cfa}
+forall( T @| { void do_once(T); }@) // assertion
+void do_twice(T value) {
+        do_once(value);
+        do_once(value);
+}
+void do_once(int i) { ... }  // provide assertion
+int i;
+do_twice(i); // implicitly pass assertion do_once to do_twice
+\end{cfa}
+Any object with a type fulfilling the assertion may be passed as an argument to
+a @do_twice@ call.
+A polymorphic function can be used in the same way as a normal function.  The
+polymorphic variables are filled in with concrete types and the assertions are
+checked. An assertion is checked by verifying each assertion operation (with
+all the variables replaced with the concrete types from the arguments) is
+defined at a call site.
+Note, a function named @do_once@ is not required in the scope of @do_twice@ to
+compile it, unlike \CC template expansion. Furthermore, call-site inferencing
+allows local replacement of the most specific parametric functions needs for a
+call.
+\begin{cfa}
+void do_once(double y) { ... } // global
 int quadruple(int x) {
+    void do_once(int & y) {
+        y = y * 2;
+    }
+    do_twice(x);
+    return x;
+}
+\end{cfa}
+This is not the recommended way to implement a quadruple function but it
+does work. The complier will deduce that @do_twice@'s T is an
+integer from the argument. It will then look for a definition matching the
+assertion which is the @do_once@ defined within the function. That
+function will be passed in as a function pointer to @do_twice@ and
+called within it.
+To avoid typing out long lists of assertions again and again there are also
+traits which collect assertions into convenent packages that can then be used
+in assertion lists instead of all of their components.
+\begin{cfa}
+trait done_once(dtype T) {
+    void do_once(T &);
+}
+\end{cfa}
+After this the forall list in the previous example could instead be written
+with the trait instead of the assertion itself.
+\begin{cfa}
+forall(dtype T | done_once(T))
+\end{cfa}
+Traits can have arbitrary number of assertions in them and are usually used to
+create short hands for, and give descriptive names to, commond groupings of
+assertions.
+Polymorphic structures and unions may also be defined by putting a forall
+clause before the declaration. The type variables work the same way except
+are now used in field declaractions instead of parameters and local variables.
+        void do_once(int y) { y = y * 2; } // local
+        do_twice(x); // using local "do_once"
+        return x;
+}
+\end{cfa}
+Specifically, the complier deduces that @do_twice@'s T is an integer from the
+argument @x@. It then looks for the most specific definition matching the
+assertion, which is the nested integral @do_once@ defined within the
+function. The matched assertion function is then passed as a function pointer
+to @do_twice@ and called within it.
+To avoid typing long lists of assertions, constraints can be collect into
+convenient packages called a @trait@, which can then be used in an assertion
+instead of the individual constraints.
+\begin{cfa}
+trait done_once(T) {
+        void do_once(T);
+}
+\end{cfa}
+and the @forall@ list in the previous example is replaced with the trait.
+\begin{cfa}
+forall(dtype T | @done_once(T)@)
+\end{cfa}
+In general, a trait can contain an arbitrary number of assertions, both
+functions and variables, and are usually used to create a shorthand for, and
+give descriptive names to, common groupings of assertions describing a certain
+functionality, like @sumable@, @listable@, \etc.
+Polymorphic structures and unions are defined by qualifying the aggregate type
+with @forall@. The type variables work the same except they are used in field
+declarations instead of parameters, returns, and local variable declarations.
 \begin{cfa}
 forall(dtype T)
 struct node {
+    node(T) * next;
+    T * data;
+}
+\end{cfa}
+The @node(T)@ is a use of a polymorphic structure. Polymorphic types
+must be provided their polymorphic parameters.
+There are many other features of polymorphism that have not given here but
+these are the ones used by the exception system.
+        node(T) * next;  // generic linked node
+        T * data;
+}
+\end{cfa}
+The generic type @node(T)@ is an example of a polymorphic-type usage.  Like \CC
+templates usage, a polymorphic-type usage must specify a type parameter.
+There are many other polymorphism features in \CFA but these are the ones used
+by the exception system.
 \section{Concurrency}
+\CFA has a number of concurrency features, @thread@s,
+@monitor@s and @mutex@ parameters, @coroutine@s and
+@generator@s. The two features that interact with the exception system
+are @thread@s and @coroutine@s; they and their supporting
+constructs will be described here.
+\subsection{Coroutines}
+Coroutines are routines that do not have to finish execution to hand control
+back to their caller, instead they may suspend their execution at any time
+and resume it later.
+Coroutines are not true concurrency but share some similarities and many of
+the same underpinnings and so are included as part of the \CFA threading
+library.
+In \CFA coroutines are created using the @coroutine@ keyword which
+works just like @struct@ except that the created structure will be
+modified by the compiler to satify the @is_coroutine@ trait.
+These structures act as the interface between callers and the coroutine,
+the fields are used to pass information in and out. Here is a simple example
+where the single field is used to pass the next number in a sequence out.
+\CFA has a number of concurrency features: @thread@, @monitor@, @mutex@
+parameters, @coroutine@ and @generator@. The two features that interact with
+the exception system are @thread@ and @coroutine@; they and their supporting
+constructs are described here.
+\subsection{Coroutine}
+A coroutine is a type with associated functions, where the functions are not
+required to finish execution when control is handed back to the caller. Instead
+they may suspend execution at any time and be resumed later at the point of
+last suspension. (Generators are stackless and coroutines are stackful.) These
+types are not concurrent but share some similarities along with common
+underpinnings, so they are combined with the \CFA threading library. Further
+discussion in this section only refers to the coroutine because generators are
+similar.
+In \CFA, a coroutine is created using the @coroutine@ keyword, which is an
+aggregate type like @struct,@ except the structure is implicitly modified by
+the compiler to satisfy the @is_coroutine@ trait; hence, a coroutine is
+restricted by the type system to types that provide this special trait.  The
+coroutine structure acts as the interface between callers and the coroutine,
+and its fields are used to pass information in and out of coroutine interface
+functions.
+Here is a simple example where a single field is used to pass (communicate) the
+next number in a sequence.
 \begin{cfa}
 coroutine CountUp {
+    unsigned int next;
+}
+\end{cfa}
+The routine part of the coroutine is a main function for the coroutine. It
+takes a reference to a coroutine object and returns nothing. In this function,
+and any functions called by this function, the suspend statement may be used
+to return execution to the coroutine's caller. When control returns to the
+function it continue from that same suspend statement instead of at the top
+of the function.
+\begin{cfa}
+void main(CountUp & this) {
+    unsigned int next = 0;
+    while (true) {
+        this.next = next;
+        suspend;
+        next = next + 1;
+    }
+}
+\end{cfa}
+Control is passed to the coroutine with the resume function. This includes the
+first time when the coroutine is starting up. The resume function takes a
+reference to the coroutine structure and returns the same reference. The
+return value is for easy access to communication variables. For example the
+next value from a count-up can be generated and collected in a single
+expression: @resume(count).next@.
+        unsigned int next; // communication variable
+}
+CountUp countup;
+\end{cfa}
+Each coroutine has @main@ function, which takes a reference to a coroutine
+object and returns @void@.
+\begin{cfa}[numbers=left]
+void main(@CountUp & this@) { // argument matches trait is_coroutine
+        unsigned int up = 0;  // retained between calls
+        while (true) {
+                next = up; // make "up" available outside function
+                @suspend;@$\label{suspend}$
+                up += 1;
+        }
+}
+\end{cfa}
+In this function, or functions called by this function (helper functions), the
+@suspend@ statement is used to return execution to the coroutine's caller
+without terminating the coroutine.
+A coroutine is resumed by calling the @resume@ function, \eg @resume(countup)@.
+The first resume calls the @main@ function at the top. Thereafter, resume calls
+continue a coroutine in the last suspended function after the @suspend@
+statement, in this case @main@ line~\ref{suspend}.  The @resume@ function takes
+a reference to the coroutine structure and returns the same reference. The
+return value allows easy access to communication variables defined in the
+coroutine object. For example, the @next@ value for coroutine object @countup@
+is both generated and collected in the single expression:
+@resume(countup).next@.
 \subsection{Monitors and Mutex}
+True concurrency does not garrenty ordering. To get some of that ordering back
+\CFA uses monitors and mutex (mutual exclution) parameters. A monitor is
+another special declaration that contains a lock and is compatable with mutex
+parameters.
+Function parameters can have the @mutex@ qualifiers on reference
+arguments, for example @void example(a_monitor & mutex arg);@. When the
+function is called it will acquire the lock on all of the mutex parameters.
+This means that all functions that mutex on a type are part of a critical
+section and only one will ever run at a time.
+Concurrency does not guarantee ordering; without ordering results are
+non-deterministic. To claw back ordering, \CFA uses monitors and @mutex@
+(mutual exclusion) parameters. A monitor is another kind of aggregate, where
+the compiler implicitly inserts a lock and instances are compatible with
+@mutex@ parameters.
+A function that requires deterministic (ordered) execution, acquires mutual
+exclusion on a monitor object by qualifying an object reference parameter with
+@mutex@.
+\begin{cfa}
+void example(MonitorA & @mutex@ argA, MonitorB & @mutex@ argB);
+\end{cfa}
+When the function is called, it implicitly acquires the monitor lock for all of
+the mutex parameters without deadlock.  This semantics means all functions with
+the same mutex type(s) are part of a critical section for objects of that type
+and only one runs at a time.
 \subsection{Threads}
+While coroutines allow new things to be done with a single execution path
+threads actually introduce new paths of execution that continue independently.
+Now for threads to work together their must be some communication between them
+and that means the timing of certain operations does have to be known. There
+or various means of syncronization and mutual exclution provided by \CFA but
+for exceptions only the basic two -- fork and join -- are needed.
+Threads are created like coroutines except the keyword is changed:
+Functions, generators, and coroutines are sequential so there is only a single
+(but potentially sophisticated) execution path in a program. Threads introduce
+multiple execution paths that continue independently.
+For threads to work safely with objects requires mutual exclusion using
+monitors and mutex parameters. For threads to work safely with other threads,
+also requires mutual exclusion in the form of a communication rendezvous, which
+also supports internal synchronization as for mutex objects. For exceptions
+only the basic two basic operations are important: thread fork and join.
+Threads are created like coroutines with an associated @main@ function:
 \begin{cfa}
 thread StringWorker {
     const char * input;
     int result;
+        const char * input;
+        int result;
 };
 void main(StringWorker & this) {
+    const char * localCopy = this.input;
+    // ... do some work, perhaps hashing the string ...
+    this.result = result;
+}
+\end{cfa}
+The main function will start executing after the fork operation and continue
+executing until it is finished. If another thread joins with this one it will
+wait until main has completed execution. In other words everything the thread
+does is between fork and join.
+From the outside this is the creation and destruction of the thread object.
+Fork happens after the constructor is run and join happens before the
+destructor runs. Join also happens during the @join@ function which
+can be used to join a thread earlier. If it is used the destructor does not
+join as that has already been completed.
+        const char * localCopy = this.input;
+        // ... do some work, perhaps hashing the string ...
+        this.result = result;
+}
+{
+        StringWorker stringworker; // fork thread running in "main"
+} // implicitly join with thread $\(\Rightarrow\)$ wait for completion
+\end{cfa}
+The thread main is where a new thread starts execution after a fork operation
+and then the thread continues executing until it is finished. If another thread
+joins with an executing thread, it waits until the executing main completes
+execution. In other words, everything a thread does is between a fork and join.
+From the outside, this behaviour is accomplished through creation and
+destruction of a thread object.  Implicitly, fork happens after a thread
+object's constructor is run and join happens before the destructor runs. Join
+can also be specified explicitly using the @join@ function to wait for a
+thread's completion independently from its deallocation (\ie destructor
+call). If @join@ is called explicitly, the destructor does not implicitly join.

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 6c79bef for doc/theses

Legend:

doc/theses/andrew_beach_MMath/existing.tex

Download in other formats: