Ignore:
Timestamp:
Jul 29, 2024, 1:32:10 PM (5 months ago)
Author:
JiadaL <j82liang@…>
Branches:
master
Children:
ce02877
Parents:
5aeb1a9
Message:

update thesis

Location:
doc/theses/jiada_liang_MMath
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/jiada_liang_MMath/CFAenum.tex

    r5aeb1a9 r38e20a80  
    66% The following sections detail all of my new contributions to enumerations in \CFA.
    77\CFA extends the enumeration declaration by parameterizing with a type (like a generic type).
    8 \begin{clang}[identifierstyle=\linespread{0.9}\it]
     8
     9
     10\begin{cfa}[caption={CFA Enum},captionpos=b,label={l:CFAEnum}]
    911$\it enum$-specifier:
    1012        enum @(type-specifier$\(_{opt}\)$)@ identifier$\(_{opt}\)$ { cfa-enumerator-list }
     
    1820        $\it inline$ identifier
    1921        enumeration-constant = expression
    20 \end{clang}
     22\end{cfa}
     23
    2124A \newterm{\CFA enumeration}, or \newterm{\CFA enum}, has an optional type declaration in the bracket next to the @enum@ keyword.
    22 Without optional type declarations, the syntax defines "opaque enums".
    23 Otherwise, \CFA enum with type declaration are "typed enums".
     25Without optional type declarations, the syntax defines \newterm{opaque enums}.
     26Otherwise, \CFA enum with type declaration are \newterm{typed enums}.
    2427
    2528\section{Opaque Enum}
     
    2730Opaque enum is a special CFA enumeration type, where the internal representation is chosen by the compiler and hidden from users.
    2831Compared C enum, opaque enums are more restrictive in terms of typing, and cannot be implicitly converted to integers.
    29 Enumerators of opaque enum cannot have initializer. Declaring initializer in the body of opaque enum results in a syntax error.
     32Enumerators of opaque enum cannot have initializer. Declaring initializer in the body of opaque enum results in a compile error.
    3033\begin{cfa}
    3134enum@()@ Planets { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, URANUS, NEPTUNE };
    3235
    3336Planet p = URANUS;
    34 @int i = VENUS; // Error, VENUS cannot be converted into an integral type@
    35 \end{cfa}
    36 Each opage enum has two @attributes@: @position@ and @label@. \CFA auto-generates @attribute functions@ @posn()@ and @label()@ for every \CFA enum to returns the respective attributes.
    37 \begin{cfa}
    38 // Auto-generated
    39 int posn(Planet p);
    40 char * s label(Planet p);
    41 \end{cfa}
     37int i = VENUS; @// Error, VENUS cannot be converted into an integral type
     38\end{cfa}
     39% Each opaque enum has two @attributes@: @position@ and @label@. \CFA auto-generates @attribute functions@ @posn()@ and @label()@ for every \CFA enum to returns the respective attributes.
     40Opaque enumerations have two defining properties: @label@ (name) and @order@ (position), exposed to users by predefined @attribute functions@ , with the following signatures:
     41\begin{cfa}
     42forall( E ) {
     43        unsigned posn(E e);
     44        const char * s label(E e);
     45};
     46\end{cfa}
     47With polymorphic type parameter E being substituted by enumeration types such as @Planet@.
    4248
    4349\begin{cfa}
     
    4652\end{cfa}
    4753
    48 % \subsection{Representation}
    49 \CFA uses chooses signed int as the underlying representation of an opaque enum variable, holding the value of enumeration position. Therefore, @posn()@ is in fact a cast that bypassing type system, converting an
    50 cfa enum to its integral representation.
    51 
    52 Labels information are stored in a global array. @label()@ is a function that maps enum position to an element of the array.
     54\subsection{Representation}
     55The underlying representation of \CFA enumeration object is its order, saved as an integral type. Therefore, the size of a \CFA enumeration is consistent with C enumeration.
     56Attribute function @posn@ performs type substitution on an expression from \CFA type to integral type.
     57Names of enumerators are stored in a global data structure, with @label@ maps \CFA enumeration object to corresponding data.
    5358
    5459\section{Typed Enum}
     
    5661
    5762\CFA extends the enumeration declaration by parameterizing with a type (like a generic type), allowing enumerators to be assigned any values from the declared type.
    58 Figure~\ref{f:EumeratorTyping} shows a series of examples illustrating that all \CFA types can be use with an enumeration and each type's constants used to set the enumerator constants.
     63Figure~\ref{f:EumeratorTyping} shows a series of examples illustrating that all \CFA types can be use with an enumeration and each type's values used to set the enumerator constants.
    5964Note, the synonyms @Liz@ and @Beth@ in the last declaration.
    6065Because enumerators are constants, the enumeration type is implicitly @const@, so all the enumerator types in Figure~\ref{f:EumeratorTyping} are logically rewritten with @const@.
     
    7075        enum( @_Complex@ ) Plane { X = 1.5+3.4i, Y = 7+3i, Z = 0+0.5i };
    7176// pointer
    72         enum( @const char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
     77        enum( @char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
    7378        int i, j, k;
    7479        enum( @int *@ ) ptr { I = &i,  J = &j,  K = &k };
     
    107112
    108113
    109 \subsection{Implicit Conversion}
     114\subsection{Value Conversion}
    110115C has an implicit type conversion from an enumerator to its base type @int@.
    111 Correspondingly, \CFA has an implicit (safe) conversion from a typed enumerator to its base type.
     116Correspondingly, \CFA has an implicit conversion from a typed enumerator to its base type.
    112117\begin{cfa}
    113118char currency = Dollar;
    114 string fred = Fred;                                             $\C{// implicit conversion from char * to \CFA string type}$
    115 Person student = Beth;
    116 \end{cfa}
    117 
    118 % The implicit conversion is accomplished by the compiler adding @value()@ function calls as a candidate with safe cost. Therefore, the expression
    119 % \begin{cfa}
    120 % char currency = Dollar;
    121 % \end{cfa}
    122 % is equivalent to
    123 % \begin{cfa}
    124 % char currency = value(Dollar);
    125 % \end{cfa}
    126 % Such conversion an @additional@ safe
    127 
    128 The implicit conversion is accomplished by the resolver adding call to @value()@ functions as a resolution candidate with a @implicit@ cost.
    129 Implicit cost is an additional category to Aaron's cost model. It is more signicant than @unsafe@ to have
    130 the compiler choosing implicit conversion over the narrowing conversion; It is less signicant to @poly@
    131 so that function overloaded with enum traits will be selected over the implicit. @Enum trait@ will be discussed in the chapter.
    132 
    133 Therefore, \CFA conversion cost is 8-tuple
    134 @@(unsafe, implicit, poly, safe, sign, vars, specialization, reference)@@
     119void foo( char * );
     120foo( Fred );
     121\end{cfa}
     122% \CFA enumeration being resolved as its base type because \CFA inserts an implicit @value()@ call on an \CFA enumeration.
     123During the resolution of expression e with \CFA enumeration type, \CFA adds @value(e)@ as an additional candidate with an extra \newterm{value} cost.
     124For expression @char currency = Dollar@, the is no defined conversion from Dollar (\CFA enumeration) type to basic type and the conversion cost is @infinite@,
     125thus the only valid candidate is @value(Dollar)@.
     126
     127@Value@ is a new category in \CFA's conversion cost model. It is defined to be a more significant factor than a @unsafe@ but weight less than @poly@.
     128The resultin g conversion cost is a 8-tuple:
     129@@(unsafe, value, poly, safe, sign, vars, specialization, reference)@@.
     130
     131\begin{cfa}
     132void bar(int);
     133enum(int) Month !{
     134        January=31, February=29, March=31, April=30, May=31, June-30,
     135        July=31, August=31, September=30, October=31, November=30, December=31
     136};
     137
     138Month a = Februrary;    // (1), with cost (0, 1, 0, 0, 0, 0, 0, 0)
     139double a = 5.5;                 // (2), with cost (1, 0, 0, 0, 0, 0, 0, 0)
     140
     141bar(a);
     142\end{cfa}
     143In the previous example, candidate (1) has an value cost to parameter type int, with is lower than (2) as an unsafe conversion from double to int.
     144\CFA chooses value cost over unsafe cost and therefore @a@ of @bar(a)@ is resolved as an @Month@.
     145
     146\begin{cfa}
     147forall(T | @CfaEnum(T)@) void bar(T);
     148
     149bar(a);                                 // (3), with cost (0, 0, 1, 0, 0, 0, 0, 0)
     150\end{cfa}
     151% @Value@ is designed to be less significant than @poly@ to allow function being generic over \CFA enumeration (see ~\ref{c:trait}).
     152Being generic over @CfaEnum@ traits (a pre-defined interface for \CFA enums) is a practice in \CFA to implement functions over \CFA enumerations, as will see in chapter~\ref{c:trait}.
     153@Value@ is a being a more significant cost than @poly@ implies if a overloaeded function defined for @CfaEnum@ (and other generic type), \CFA always
     154try to resolve it as a @CfaEnum@, rather to insert a @value@ conversion.
     155
     156\subsection{Coercion}
     157While implicit conversion from a \CFA enumeration has been disabled, a explicit coercion cast to basic type is still possible to be consistent with C. In which case,
     158\CFA converts a \CFA enumeration variable as a basic type, with the value of the @position@ of the variable.
    135159
    136160\section{Auto Initialization}
     
    145169The complexity of the constant expression depends on the level of runtime computation the compiler implements, \eg \CC \lstinline[language={[GNU]C++}]{constexpr} provides complex compile-time computation across multiple types, which blurs the compilation/runtime boundary.
    146170
    147 % The notion of auto-initialization can be generalized in \CFA through the trait @AutoInitializable@.
    148 % \begin{cfa}
    149 % forall(T) @trait@ AutoInitializable {
    150 %       void ?{}( T & o, T v );                         $\C{// initialization}$
    151 %       void ?{}( T & t, zero_t );                      $\C{// 0}$
    152 %       T ?++( T & t);                                          $\C{// increment}$
    153 % };
    154 % \end{cfa}
    155 % In addition, there is an implicit enumeration counter, @ecnt@ of type @T@, managed by the compiler.
    156 % For example, the type @Odd@ satisfies @AutoInitializable@:
    157 % \begin{cfa}
    158 % struct Odd { int i; };
    159 % void ?{}( Odd & o, int v ) { if ( v & 1 ) o.i = v; else /* error not odd */ ; };
    160 % void ?{}( Odd & o, zero_t ) { o.i = 1; };
    161 % Odd ?++( Odd o ) { return (Odd){ o.i + 2 }; };
    162 % \end{cfa}
    163 % and implicit initialization is available.
    164 % \begin{cfa}
    165 % enum( Odd ) { A, B, C = 7, D };                       $\C{// 1, 3, 7, 9}$
    166 % \end{cfa}
    167 % where the compiler performs the following transformation and runs the code.
    168 % \begin{cfa}
    169 % enum( Odd ) {
    170 %       ?{}( ecnt, @0@ }  ?{}( A, ecnt },       ?++( ecnt )  ?{}( B, ecnt ),
    171 %       ?{}( ecnt, 7 )  ?{}( C, ecnt ), ?++( ecnt )  ?{}( D, ecnt )
    172 % };
    173 % \end{cfa}
    174 
    175 The notion of auto-initialization is generalized in \CFA enum in the following way:
    176 Enumerator e is the first enumerator of \CFA enumeration E with base type T. If e declares no no initializer, e is auto-initialized by the $zero\_t$ constructor of T.
    177 \CFA reports a compile time error if T has no $zero\_t$ constructor.
    178 Enumerator e is an enumerator of base-type T enumeration E that position i, where $i \neq 0$. And d is the enumerator with position @i-1@, e is auto-initialized with
    179 the result of @value(d)++@. If operator @?++@ is not defined for type T, \CFA reports a compile time error.
    180 
    181 Unfortunately, auto-initialization is not implemented because \CFA is only a transpiler, relying on generated C code to perform the detail work.
     171% The notion of auto-initialization is generalized in \CFA enumertation E with base type T in the following way:
     172When an enumerator @e@ does not have a initializer, if @e@ has enumeration type @E@ with base type @T@, \CFA auto-initialize @e@ with the following scheme:
     173\begin{enumerate}
     174% \item Enumerator e is the first enumerator of \CFA enumeration E with base type T. If e declares no no initializer, e is auto-initialized by the $zero\_t$ constructor of T.
     175\item if e is first enumerator, e is initialized with T's @zero_t@.
     176\item otherwise, if d is the enumerator defined just before e, with d has has been initialized with expression @l@ (@l@ can also be an auto-generated), e is initialized with @l++@.
     177% \CFA reports a compile time error if T has no $zero\_t$ constructor.
     178% Enumerator e is an enumerator of base-type T enumeration E that position i, where $i \neq 0$. And d is the enumerator with position @i-1@, e is auto-initialized with
     179% the result of @value(d)++@. If operator @?++@ is not defined for type T, \CFA reports a compile time error.
     180
     181% Unfortunately, auto-initialization is not implemented because \CFA is only a transpiler, relying on generated C code to perform the detail work.
     182% C does not have the equivalent of \CC \lstinline[language={[GNU]C++}]{constexpr}, and it is currently beyond the scope of the \CFA project to implement a complex runtime interpreter in the transpiler.
     183% Nevertheless, the necessary language concepts exist to support this feature.
     184\end{enumerate}
     185while @?++( T )@ can be explicitly overloaded or implicitly overloaded with properly defined @one_t@ and @?+?(T, T)@.
     186
     187Unfortunately, auto-initialization with only constant expression is not enforced because \CFA is only a transpiler, relying on generated C code to perform the detail work.
    182188C does not have the equivalent of \CC \lstinline[language={[GNU]C++}]{constexpr}, and it is currently beyond the scope of the \CFA project to implement a complex runtime interpreter in the transpiler.
    183189Nevertheless, the necessary language concepts exist to support this feature.
    184 
    185 
    186190
    187191\section{Enumeration Inheritance}
     
    292296In most programming languages, an enumerator is implicitly converted to its value (like a typed macro substitution).
    293297However, enumerator synonyms and typed enumerations make this implicit conversion to value incorrect in some contexts.
    294 In these contexts, a programmer's initition assumes an implicit conversion to position.
     298In these contexts, a programmer's intuition assumes an implicit conversion to position.
    295299
    296300For example, an intuitive use of enumerations is with the \CFA @switch@/@choose@ statement, where @choose@ performs an implicit @break@ rather than a fall-through at the end of a @case@ clause.
  • doc/theses/jiada_liang_MMath/background.tex

    r5aeb1a9 r38e20a80  
    401401
    402402\subsection{Conversion Cost}
    403 In C, functions argument and parameter type does not need to be exact match, and the compiler performs an @implicit conversion@ on argument.
    404 \begin{cfa}
    405 void foo(double i);
    406 foo(42);
    407 \end{cfa}
    408 The implicit conversion in C is relatively simple because of the abscence of overloading, with the exception of binary operators, for which the
    409 compiler needs to find a common type of both operands and the result. The pattern is known as "usual arithmetic conversions".
    410 
    411 \CFA generalizes C implicit conversion to function overloading as a concept of @conversion cost@.
    412 Initially designed by Bilson, conversion cost is a 3-tuple, @(unsafe, poly, safe)@, where unsafe is the number of narrowing conversion,
    413 poly is the count of polymorphics type binding, and safe is the sum of the degree of widening conversion. Every
    414 basic type in \CFA has been assigned with a @distance to Byte@, or @distance@, and the degree of widening conversion is the difference between two distances.
    415 
    416 Aaron extends conversion cost to a 7-tuple,
    417 @@(unsafe, poly, safe, sign, vars, specialization, reference)@@. The summary of Aaron's cost model is the following:
     403\label{s:ConversionCost}
     404In C, function call arguments and function parameters do not need to be a exact match. When types mismatch, C performs an \newterm{implicit conversion}
     405on argument to parameter type. The process is trivial with the exception on binary operators;  When types of operands are different,
     406C nees to decide which operands need implicit conversion. C defines the resolution pattern as "usual arithmetic conversion",
     407in which C looks for a \newterm{common type} between operands, and convert either one or both operands to the common type.
     408Loosely defined, a common type is a the smallest type in terms of size of representation that both operands can be converted into without losing their precision.
     409Such conversion is called "widening" or "safe conversion".
     410
     411C binary operators can be restated as 2-arity functions that overloaded with different parameters. "Usual arithmetic conversion" is to find a overloaded
     412instance that for both arguments, the conversion to parameter type is a widening conversion to the smallest type.
     413
     414\CFA generalizes "usual arithmetic conversion" to \newterm{conversion cost}. In the first design by Bilson, conversion cost is a 3-tuple,
     415@(unsafe, poly, safe)@, where @unsafe@ the number of unsafe (narrorow conversion) from argument to parameter,
     416@poly@ is the number of polymorphic function parameter,
     417and @safe@ is sum of degree of safe (widening) conversion.
     418Sum of degree is a method to quantify C's integer and floating-point rank.
     419Every pair of widening conversion types has been assigned with a \newterm{distance}, and distance between the two same type is 0.
     420For example, the distance from char to int is 2, distance from integer to long is 1, and distance from int to long long int is 2.
     421The distance does not mirror C's rank system. For example, the rank of char and signed char are the same in C, but the distance from char to signed char is assigned with 1.
     422@safe@ cost is summing all pair of argument to parameter safe conversion distance.
     423Among the three in Bilson's model, @unsafe@ is the most significant cost and @safe@ is the least significant one, with an implication that \CFA always choose
     424a candidate with the lowest @unsafe@ if possible.
     425
     426Assume the overloaded function @foo@ is called with two @int@ parameter. The cost for every overloaded @foo@ has been list along:
     427\begin{cfa}
     428void foo(char, char);                           // (2, 0, 0)
     429void foo(char, int);                            // (1, 0, 0)
     430forall(T, V) void foo(T, V);            // (0, 2, 0)
     431forall(T)        void foo(T, T);                // (0, 2, 0)
     432forall(T)        void foo(T, int);              // (0, 1, 0)
     433void foo(long long, long);                      // (0, 0, 3)
     434void foo(long, long);                           // (0, 0, 2)
     435void foo(int, long);                            // (0, 0, 1)
     436
     437int i, j; foo(i, j);
     438\end{cfa}
     439The overloaded instances are ordered from the highest to the lowest cost, and \CFA select the last candidate.
     440
     441In the later iteration of \CFA, Schluntz and Aaron expanded conversion cost to a 7-tuple with 4 additional categories,
     442@@(unsafe, poly, safe, sign, vars, specialization, reference)@@.
     443with interpretation listed below:
    418444\begin{itemize}
    419 \item Unsafe is the number of argument that implicitly convert to a type with high rank.
    420 \item Poly accounts for number of polymorphics binding in the function declaration.
    421 \item Safe is sum of distance (add reference/appendix later).
     445\item Unsafe
     446\item Poly
     447\item Safe
    422448\item Sign is the number of sign/unsign variable conversion.
    423 \item Vars is the number of polymorphics type declared in @forall@.
    424 \item Specialization is opposite number of function declared in @forall@. More function declared implies more constraint on polymorphics type, and therefore has the lower cost.
    425 \item Reference is number of lvalue-to-rvalue conversion.
     449\item Vars is the number of polymorphics type variable.
     450\item Specialization is negative value of the number of type assertion.
     451\item Reference is number of reference-to-rvalue conversion.
    426452\end{itemize}
     453The extended conversion cost models looks for candidates that are more specific and less generic.
     454@Var@s was introduced by Aaron to disambugate @forall(T, V) void foo(T, V)@ and @forall(T) void foo(T, T)@. The extra type parameter @V@
     455makes it more generic and less specific. More generic type means less constraints on types of its parameters. \CFA is in favor of candidates with more
     456restrictions on polymorphism so @forall(T) void foo(T, T)@ has lower cost. @Specialization@ is a value that always less than or equal to zero. For every type assertion in @forall@ clause, \CFA subtracts one from
     457@specialization@, starting from zero. More type assertions often means more constraints on argument type, and making the function less generic.
     458
     459\CFA defines two special cost value: @zero@ and @infinite@ A conversion cost is @zero@ when argument and parameter has exact match, and a conversion cost is @infinite@ when
     460there is no defined conversion between two types. For example, the conversion cost from int to a struct type S is @infinite@.
  • doc/theses/jiada_liang_MMath/implementation.tex

    r5aeb1a9 r38e20a80  
    1 \chapter{Enumeration Implementation}
    2 
    3 \section{Enumeration Traits}
    4 
    5 \CFA defines a set of traits containing operators and helper functions for @enum@.
    6 A \CFA enumeration satisfies all of these traits allowing it to interact with runtime features in \CFA.
    7 Each trait is discussed in detail.
    8 
    9 The trait @CfaEnum@:
     1\chapter{Enumeration Traits}
     2\label{c:trait}
     3
     4\section{CfaEnum and TypedEnum}
     5
     6\CFA defines attribute functions @label()@ and @posn()@ for all \CFA enumerations,
     7and therefore \CFA enumerations fulfills the type assertions with the combination.
     8With the observation, we define trait @CfaEnum@:
    109\begin{cfa}
    1110forall( E ) trait CfaEnum {
     
    1413};
    1514\end{cfa}
    16 asserts an enumeration type @E@ has named enumerator constants (@label@) with positions (@posn@).
    17 
    18 The trait @TypedEnum@ extends @CfaEnum@:
     15
     16% The trait @TypedEnum@ extends @CfaEnum@ with an additional value() assertion:
     17Typed enumerations are \CFA enumeration with an additional @value@ attribute. Extending
     18CfaEnum traits, TypedEnum is a subset of CFAEnum that implements attribute function @value()@,
     19which includes all typed enumerations.
    1920\begin{cfa}
    2021forall( E, V | CfaEnum( E ) ) trait TypedEnum {
     
    2223};
    2324\end{cfa}
    24 asserting an enumeration type @E@ can have homogeneous enumerator values of type @V@.
    25 
    26 The declarative syntax
    27 \begin{cfa}
    28 enum(T) E { A = ..., B = ..., C = ... };
    29 \end{cfa}
    30 creates an enumerated type E with @label@, @posn@ and @value@ implemented automatically.
    31 \begin{cfa}
    32 void foo( T t ) { ... }
    33 void bar(E e) {
    34         choose ( e ) {
    35                 case A: printf( "\%d", posn( e) );
    36                 case B: printf( "\%s", label( e ) );
    37                 case C: foo( value( e ) );
    38         }
    39 }
    40 \end{cfa}
    41 
    42 Implementing general functions across all enumeration types is possible by asserting @CfaEnum( E, T )@, \eg:
    43 \begin{cfa}
    44 #include <string.hfa>
    45 forall( E, T | CfaEnum( E, T ) | {unsigned int toUnsigned(T)} )
    46 string formatEnum( E e ) {
    47         unsigned int v = toUnsigned( value( e ) );
    48         string out = label(e) + '(' + v +')';
    49         return out;
    50 }
    51 formatEnum( Week.Mon );
    52 formatEnum( RGB.Green );
    53 \end{cfa}
    54 
    55 \CFA does not define attribute functions for C-style enumeration.
    56 But it is possible for users to explicitly implement enumeration traits for C enum and any other types.
    57 \begin{cfa}
    58 enum Fruit { Apple, Pear, Cherry };                     $\C{// C enum}$
     25Type parameter V of TypedEnum trait binds to return type of @value()@, which is also the base
     26type for typed enumerations. CfaEnum and TypedEnum triats constitues a CfaEnum function interfaces, as well a way to define functions
     27over all CfaEnum enumerations.
     28\begin{cfa}
     29// for all type E that implements value() to return type T, where T is a type that convertible to string
     30forall(  E, T | TypedEnum( E, T ) | { ?{}(string &, T ); } )
     31string format_enum( E e ) { return label(E) + "(" + string(value(e)) + ")"; }
     32
     33// int is convertible to string; implemented in the standard library
     34enum(int) RGB { Red = 0xFF0000, Green = 0x00FF00, Blue = 0x0000FF };
     35
     36struct color_code { int R; int G; int B };
     37// Implement color_code to string conversion
     38?{}(string & this, struct color_code p ) {
     39        this = string(p.R) + ',' + string(p.G) + ',' + string(p.B);
     40}
     41enum(color_code) Rainbow {
     42        Red = {255, 0, 0}, Orange = {255, 127, 0}, Yellow = {255, 255, 0}, Green = {0, 255, 0},
     43        Blue = {0, 0, 255}, Indigo = {75, 0, 130}, Purple = {148, 0, 211}
     44};
     45
     46format_enum(RGB.Green); // "Green(65280)"
     47format_enum(Rainbow.Green); // "Green(0,255,0)"
     48\end{cfa}
     49
     50
     51% Not only CFA enumerations can be used with CfaEnum trait, other types that satisfy
     52% CfaEnum assertions are all valid.
     53Types does not need be defined as \CFA enumerations to work with CfaEnum traits. CfaEnum applies to any type
     54with @label()@ and @value()@ being properly defined.
     55Here is an example on how to extend a C enumeration to comply CfaEnum traits:
     56\begin{cfa}
     57enum Fruit { Apple, Banana, Cherry };                   $\C{// C enum}$
    5958const char * label( Fruit f ) {
    60         choose ( f ) {
     59        choose( f ) {
    6160                case Apple: return "Apple";
    62                 case Bear: return "Pear";
     61                case Banana: return "Banana";
    6362                case Cherry: return "Cherry";
    6463        }
    6564}
    66 unsigned posn( Fruit f ) { return f; }
    67 const char * value( Fruit f ) { return ""; } $\C{// value can return any non void type}$
    68 formatEnum( Apple );                                            $\C{// Fruit is now a \CFA enum}$
    69 \end{cfa}
    70 
    71 A type that implements trait @CfaEnum@, \ie, a type has no @value@, is called an opaque enum.
    72 
    73 % \section{Enumerator Opaque Type}
    74 
    75 % \CFA provides a special opaque enumeration type, where the internal representation is chosen by the compiler and only equality operations are available.
    76 \begin{cfa}
    77 enum@()@ Planets { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, URANUS, NEPTUNE };
    78 \end{cfa}
    79 
    80 
    81 In addition, \CFA implements @Bound@ and @Serial@ for \CFA enumerations.
     65unsigned posn( Fruit f ) { return f; }
     66char value( Fruit f ) {
     67        choose(f)  {
     68                case Apple: return 'a';
     69                case Banana: return 'b';
     70                case Cherry: return 'c';
     71        }
     72}
     73
     74format_enum(Cherry); // "Cherry(c)"
     75\end{cfa}
     76
     77\subsection{Bounded and Serial}
     78A bounded type defines a lower bound and a upper bound.
    8279\begin{cfa}
    8380forall( E ) trait Bounded {
    84         E first();
    85         E last();
    86 };
    87 \end{cfa}
    88 The function @first()@ and @last()@ of enumerated type E return the first and the last enumerator declared in E, respectively. \eg:
    89 \begin{cfa}
    90 Workday day = first();                                  $\C{// Mon}$
    91 Planet outermost = last();                              $\C{// NEPTUNE}$
    92 \end{cfa}
    93 @first()@ and @last()@ are overloaded with return types only, so in the example, the enumeration type is found on the left-hand side of the assignment.
    94 Calling either functions without a context results in a type ambiguity, except in the rare case where the type environment has only one enumeration.
    95 \begin{cfa}
    96 @first();@                                                              $\C{// ambiguous because both Workday and Planet implement Bounded}$
    97 sout | @last()@;
    98 Workday day = first();                                  $\C{// day provides type Workday}$
     81        E lowerBound();
     82        E lowerBound();
     83};
     84
     85\end{cfa}
     86Both Bounded functions are implement for \CFA enumerations, with @lowerBound()@ returning the first enumerator and @upperBound()@ returning
     87the last enumerator.
     88\begin{cfa}
     89Workday day = lowerBound();                                     $\C{// Mon}$
     90Planet outermost = upperBound();                                $\C{// NEPTUNE}$
     91\end{cfa}
     92
     93The lowerBound() and upperBound() are functions overloaded on return type only, means their type resolution solely depend on the outer context,
     94including expected type as a function argument, or the left hand size of an assignment expression.
     95Calling either function without a context results in a type ambiguity, except in the rare case where the type environment has only one
     96type overloads the functions, including \CFA enumerations, which has Bounded functions automatic defined.
     97\begin{cfa}
     98@lowerBound();@                 $\C{// ambiguous as both Workday and Planet implement Bounded}$
     99sout | @lowerBound()@;
     100Workday day = first();          $\C{// day provides type Workday}$
    99101void foo( Planet p );
    100 foo( last() );                                                  $\C{// argument provides type Planet}$
    101 \end{cfa}
    102 
    103 The trait @Serial@:
     102foo( last() );                      $\C{// argument provides type Planet}$
     103\end{cfa}
     104
     105@Serial@ is a subset of @Bounded@, with functions maps elements against integers, as well implements a sequential order between members.
    104106\begin{cfa}
    105107forall( E | Bounded( E ) ) trait Serial {
    106108        unsigned fromInstance( E e );
    107         E fromInt( unsigned int posn );
     109        E fromInt( unsigned int i );
    108110        E succ( E e );
    109111        E pred( E e );
    110 };
    111 \end{cfa}
    112 is a @Bounded@ trait, where elements can be mapped to an integer sequence.
    113 A type @T@ matching @Serial@ can project to an unsigned @int@ type, \ie an instance of type T has a corresponding integer value.
    114 %However, the inverse may not be possible, and possible requires a bound check.
    115 The mapping from a serial type to integer is defined by @fromInstance@, which returns the enumerator's position.
    116 The inverse operation is @fromInt@, which performs a bound check using @first()@ and @last()@ before casting the integer into an enumerator.
    117 Specifically, for enumerator @E@ declaring $N$ enumerators, @fromInt( i )@ returns the $i-1_{th}$ enumerator, if $0 \leq i < N$, or raises the exception @enumBound@.
    118 
    119 The @succ( E e )@ and @pred( E e )@ imply the enumeration positions are consecutive and ordinal.
    120 Specifically, if @e@ is the $i_{th}$ enumerator, @succ( e )@ returns the $i+1_{th}$ enumerator when $e \ne last()$, and @pred( e )@ returns the $i-1_{th}$ enumerator when $e \ne first()$.
    121 The exception @enumRange@ is raised if the result of either operation is outside the range of type @E@.
     112        unsigned Countof( E e );
     113};
     114\end{cfa}
     115
     116% A Serail type can project to an unsigned @int@ type, \ie an instance of type T has a corresponding integer value.
     117Function @fromInstance()@ projects a @Bounded@ member to a number and @fromInt@ is the inverser. Function @succ()@ take an element, returns the "next"
     118member in sequential order and @pred()@ returns the "last" member.
     119
     120A Serial type E may not be having a one-to-one mapping to integer because of bound. An integer that cannot be mapping to a member of E is called the member \newterm{out of bound}.
     121Calling @succ()@ on @upperBound@ or @pred()@ on @lowerBound()@ results in out of bound.
     122
     123\CFA implements Serial interface for CFA enumerations with \newterm{bound check} on @fromInt()@, @succ()@ and @pred()@, and abort the program if the function call results in out of bound.
     124Unlike a cast, conversion between \CFA enumeration and integer with @Serial@ interface is type safe.
     125Specifically for @fromInt@, \CFA abort if input i smaller than @fromInstance(lowerBound())@ or greater than @fromInstance(upperBound())@
     126
     127Function @Countof@ takes an object as a parameter and returns the number of elements in the Serial type, which is @fromInstance( upper ) - fromInstance( lower ) + 1@.
     128@Countof@ does not use its arugment as procedural input; it needs
     129an argument to anchor its polymorphic type T.
     130
     131\CFA has an expression @countof@ (lower case) that returns the number of enumerators defined for enumerations.
     132\begin{cfa}
     133enum RGB {Red, Green, Blue};
     134countof( RGB ); // (1)
     135countof( Red ); // (2)
     136\end{cfa}
     137Both expressions from the previous example are converted to constant expression @3@ with no function call at runtime.
     138@countof@ also works for any type T that defines @Countof@ and @lowerBound@, for which it turns into
     139a function call @Countof( T )@. The resolution step on expression @countof(e)@ works as the following with priority ordered:
     140\begin{enumerate}
     141\item Looks for an enumeration named e, such as @enum e {... }@.
     142If such an enumeration e exists, \CFA replace @countof(e)@  with constant expression with number of enumerator of e.
     143\item Looks for a non-enumeration type named e that defines @Countof@ and @lowerBound@, including e being a polymorphic type, such as @forall(e)@.
     144If type e exists, \CFA replaces it with @Countof(lowerBound())@, where lowerBound() is bounded to type e. 
     145\item Looks for an enumerator e that defined in enumeration E. If such an enumeration e exists, \CFA replace @countof(e)@ with constant expression with number of enumerator of E.
     146\item Looks for a name e in the context with expression type E. If such name e exists, \CFA replace @countof(e)@ with function call @Countof(e)@.
     147\item If 1-4 fail, \CFA reports a type error on expression @countof(e)@.
     148\end{enumerate}
     149
     150With the @Bounded@ and @Serial@, a loop over enumeration can be implemented in the following ways:
     151\begin{cfa}
     152enum() E { ... }
     153for( unsigned i = 0; i < countof(E); i++ ) { ... }
     154for( E e = lowerBound(); ; e = succ(e) ) { ...; if (e == upperBound()) break; }
     155
     156forall( T ) {
     157        for( unsigned i = 0; i < countof(T); i++ ) { ... }
     158        for( T e = lowerBound(); ; e = succ(e) ) { ...; if (e == upperBound()) break; }
     159}
     160\end{cfa}
    122161
    123162Finally, there is an associated trait defining comparison operators among enumerators.
     
    154193
    155194
    156 \section{Enumeration Variable}
    157 
    158 Although \CFA enumeration captures three different attributes, an enumeration instance does not store all this information.
    159 The @sizeof@ a \CFA enumeration instance is always 4 bytes, the same size as a C enumeration instance (@sizeof( int )@).
    160 It comes from the fact that:
    161 \begin{enumerate}
    162 \item
    163 a \CFA enumeration is always statically typed;
    164 \item
    165 it is always resolved as one of its attributes regarding real usage.
    166 \end{enumerate}
    167 When creating an enumeration instance @colour@ and assigning it with the enumerator @Color.Green@, the compiler allocates an integer variable and stores the position 1.
    168 The invocations of $positions()$, $value()$, and $label()$ turn into calls to special functions defined in the prelude:
    169 \begin{cfa}
    170 position( green );
    171 >>> position( Colour, 1 ) -> int
    172 value( green );
    173 >>> value( Colour, 1 ) -> T
    174 label( green );
    175 >>> label( Colour, 1) -> char *
    176 \end{cfa}
    177 @T@ represents the type declared in the \CFA enumeration defined and @char *@ in the example.
    178 These generated functions are $Companion Functions$, they take an $companion$ object and the position as parameters.
    179 
    180 
    181 \section{Enumeration Data}
    182 
    183 \begin{cfa}
    184 enum(T) E { ... };
    185 // backing data
    186 T * E_values;
    187 char ** E_labels;
    188 \end{cfa}
    189 Storing values and labels as arrays can sometimes help support enumeration features.
    190 However, the data structures are the overhead for the programs. We want to reduce the memory usage for enumeration support by:
    191 \begin{itemize}
    192         \item Only generates the data array if necessary
    193         \item The compilation units share the data structures.
    194         No extra overhead if the data structures are requested multiple times.
    195 \end{itemize}
    196 
    197 
    198 \section{Unification}
    199 
    200 \section{Enumeration as Value}
    201 \label{section:enumeration_as_value}
    202 An \CFA enumeration with base type T can be used seamlessly as T, without explicitly calling the pseudo-function value.
    203 \begin{cfa}
    204 char * green_value = Colour.Green; // "G"
    205 // Is equivalent to
    206 // char * green_value = value( Color.Green ); "G"
    207 \end{cfa}
    208 
    209 
    210 \section{Unification Distance}
    211 
    212 \begin{cfa}
    213 T_2 Foo(T1);
    214 \end{cfa}
    215 The @Foo@ function expects a parameter with type @T1@. In C, only a value with the exact type T1 can be used as a parameter for @Foo@. In \CFA, @Foo@ accepts value with some type @T3@ as long as @distance(T1, T3)@ is not @Infinite@.
    216 
    217 @path(A, B)@ is a compiler concept that returns one of the following:
    218 \begin{itemize}
    219         \item Zero or 0, if and only if $A == B$.
    220         \item Safe, if B can be used as A without losing its precision, or B is a subtype of A.
    221         \item Unsafe, if B loses its precision when used as A, or A is a subtype of B.
    222         \item Infinite, if B cannot be used as A. A is not a subtype of B and B is not a subtype of A.
    223 \end{itemize}
    224 
    225 For example, @path(int, int)==Zero@, @path(int, char)==Safe@, @path(int, double)==Unsafe@, @path(int, struct S)@ is @Infinite@ for @struct S{}@.
    226 @distance(A, C)@ is the minimum sum of paths from A to C. For example, if @path(A, B)==i@, @path(B, C)==j@, and @path(A, C)=k@, then $$distance(A,C)==min(path(A,B), path(B,C))==i+j$$.
    227 
    228 (Skip over the distance matrix here because it is mostly irrelevant for enumeration discussion. In the actual implementation, distance( E, T ) is 1.)
    229 
    230 The arithmetic of distance is the following:
    231 \begin{itemize}
    232         \item $Zero + v= v$, for some value v.
    233         \item $Safe * k <  Unsafe$, for finite k.
    234         \item $Unsafe * k < Infinite$, for finite k.
    235         \item $Infinite + v = Infinite$, for some value v.
    236 \end{itemize}
    237 
    238 For @enum(T) E@, @path(T, E)==Safe@ and @path(E,T)==Infinite@. In other words, enumeration type E can be @safely@ used as type T, but type T cannot be used when the resolution context expects a variable with enumeration type @E@.
    239 
    240 
    241 \section{Variable Overloading and Parameter Unification}
    242 
    243 \CFA allows variable names to be overloaded. It is possible to overload a variable that has type T and an enumeration with type T.
    244 \begin{cfa}
    245 char * green = "Green";
    246 Colour green = Colour.Green; // "G"
    247 
    248 void bar(char * s) { return s; }
    249 void foo(Colour c) { return value( c ); }
    250 
    251 foo( green ); // "G"
    252 bar( green ); // "Green"
    253 \end{cfa}
    254 \CFA's conversion distance helps disambiguation in this overloading. For the function @bar@ which expects the parameter s to have type @char *@, $distance(char *,char *) == Zero$ while $distance(char *, Colour) == Safe$, the path from @char *@ to the enumeration with based type @char *@, \CFA chooses the @green@ with type @char *@ unambiguously. On the other hand, for the function @foo@, @distance(Colour, char *)@ is @Infinite@, @foo@ picks the @green@ with type @char *@.
    255 
    256 \section{Function Overloading}
    257 Similarly, functions can be overloaded with different signatures. \CFA picks the correct function entity based on the distance between parameter types and the arguments.
    258 \begin{cfa}
    259 Colour green = Colour.Green;
    260 void foo(Colour c) { sout | "It is an enum"; } // First foo
    261 void foo(char * s) { sout | "It is a string"; } // Second foo
    262 foo( green ); // "It is an enum"
    263 \end{cfa}
    264 Because @distance(Colour, Colour)@ is @Zero@ and @distance(char *, Colour)@ is @Safe@, \CFA determines the @foo( green )@ is a call to the first foo.
    265 
    266 \section{Attributes Functions}
    267 The pseudo-function @value()@ "unboxes" the enumeration and the type of the expression is the underlying type. Therefore, in the section~\ref{section:enumeration_as_value} when assigning @Colour.Green@ to variable typed @char *@, the resolution distance is @Safe@, while assigning @value(Color.Green) to @char *) has resolution distance @Zero@.
    268 
    269 \begin{cfa}
    270 int s1;
    271 \end{cfa}
    272 The generated code for an enumeration instance is simply an int. It is to hold the position of an enumeration. And usage of variable @s1@ will be converted to return one of its attributes: label, value, or position, concerning the @Unification@ rule
    273 
    274 % \section{Unification and Resolution (this implementation will probably not be used, safe as reference for now)}
    275 
    276 % \begin{cfa}
    277 % enum Colour( char * ) { Red = "R", Green = "G", Blue = "B"  };
    278 % \end{cfa}
    279 % The @EnumInstType@ is convertible to other types.
    280 % A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
    281 % The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
    282 
    283 % \begin{cfa}[caption={Null Context}, label=lst:null_context]
    284 % {
    285 %       Colour.Green;
    286 % }
    287 % \end{cfa}
    288 % In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
    289 % In this case, any of the attributes is resolvable.
    290 % According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
    291 % The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
    292 % When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.
    293 % \begin{cfa}[caption={Null Context Generated Code}, label=lst:null_context]
    294 % {
    295 %       "G";
    296 % }
    297 % \end{cfa}
    298 % \begin{cfa}[caption={int Context}, label=lst:int_context]
    299 % {
    300 %       int g = Colour.Green;
    301 % }
    302 % \end{cfa}
    303 % The assignment expression gives a context for the EnumInstType resolution.
    304 % The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.
    305 % The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.
    306 % In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.
    307 % $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call
    308 % \begin{cfa}[label=lst:attr_resolution_1]
    309 % {
    310 %       Unify( int, char * ); // unify with the type of value
    311 %       Unify( int, int ); // unify with the type of position
    312 %       Unify( int, char * ); // unify with the type of label
    313 % }
    314 % \end{cfa}
    315 % \begin{cfa}[label=lst:attr_resolution_precedence]
    316 % {
    317 %       Unification( T1, EnumInstType<T2> ) {
    318 %               if ( Unify( T1, T2 ) ) return T2;
    319 %               if ( Unify( T1, int ) ) return int;
    320 %               if ( Unify( T1, char * ) ) return char *;
    321 %               Error: Cannot Unify T1 with EnumInstType<T2>;
    322 %       }
    323 % }
    324 % \end{cfa}
    325 % After the unification, @EnumInstType@ is replaced by its attributes.
    326 
    327 % \begin{cfa}[caption={Unification Functions}, label=lst:unification_func_call]
    328 % {
    329 %       T2 foo ( T1 ); // function take variable with T1 as a parameter
    330 %       foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>
    331 %       >>>> Unification( T1, EnumInstType<T3> )
    332 % }
    333 % \end{cfa}
    334 % % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.
    335 % Backward conversion:
    336 % \begin{cfa}[caption={Unification Functions}, label=lst:unification_func_call]
    337 % {
    338 %       enum Colour colour = 1;
    339 % }
    340 % \end{cfa}
    341 
    342 % \begin{cfa}[caption={Unification Functions}, label=lst:unification_func_call]
    343 % {
    344 %       Unification( EnumInstType<Colour>, int ) >>> label
    345 % }
    346 % \end{cfa}
    347 % @int@ can be unified with the label of Colour.
    348 % @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into
    349 % \begin{cfa}
    350 % {
    351 %       enum Colour colour = Colour.Green;
    352 % }
    353 % \end{cfa}
    354 % Steps:
    355 % \begin{enumerate}
    356 % \item
    357 % identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
    358 % \item
    359 % @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@
    360 % \item
    361 % return the enumeration constant at position 1
    362 % \end{enumerate}
    363 % \begin{cfa}
    364 % {
    365 %       enum T (int) { ... } // Declaration
    366 %       enum T t = 1;
    367 % }
    368 % \end{cfa}
    369 % Steps:
    370 % \begin{enumerate}
    371 % \item
    372 % identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
    373 % \item
    374 % @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@
    375 % \item
    376 % return the FIRST enumeration constant that has the value 1, by searching through the values array
    377 % \end{enumerate}
    378 % The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness
    379 
    380 % \section{Casting}
    381 % Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:
    382 % \begin{cfa}
    383 % enum( int ) Foo { A = 10, B = 100, C = 1000 };
    384 % (int) Foo.A;
    385 % \end{cfa}
    386 % The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.
    387 
    388 % \section{Value Conversion}
    389 % As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.
    390 
    391 % \begin{cfa}
    392 % Foo a; // int a;
    393 % int j = a;
    394 % char * s = a;
    395 % \end{cfa}
    396 % Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. The generated code for the second line will be
    397 % \begin{cfa}
    398 % int j = value( Foo, a )
    399 % \end{cfa}
    400 % Similarly, the generated code for the third line is
    401 % \begin{cfa}
    402 % char * j = label( Foo, a )
    403 % \end{cfa}
    404 
    405 
    406 \section{Enumerator Initialization}
    407 
    408 An enumerator must have a deterministic immutable value, either be explicitly initialized in the enumeration definition, or implicitly initialized by rules.
    409 
    410 
    411 \section{C Enumeration Rule}
    412 
    413 A C enumeration has an integral type. If not initialized, the first enumerator implicitly has the integral value 0, and other enumerators have a value equal to its $predecessor + 1$.
    414 
    415 
    416 \section{Auto Initialization}
    417 
    418 C auto-initialization works for the integral type @int@ with constant expressions.
    419 \begin{cfa}
    420 enum Alphabet ! {
    421         A = 'A', B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z,
    422         a = 'a', b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
    423 };
    424 \end{cfa}
    425 The complexity of the constant expression depends on the level of runtime computation the compiler implements, \eg \CC \lstinline[language={[GNU]C++}]{constexpr} provides complex compile-time computation across multiple types, which blurs the compilation/runtime boundary.
    426 
    427 The notion of auto-initialization can be generalized in \CFA through the trait @AutoInitializable@.
    428 \begin{cfa}
    429 forall(T) @trait@ AutoInitializable {
    430         void ?{}( T & o, T v );                         $\C{// initialization}$
    431         void ?{}( T & t, zero_t );                      $\C{// 0}$
    432         T ?++( T & t);                                          $\C{// increment}$
    433 };
    434 \end{cfa}
    435 In addition, there is an implicit enumeration counter, @ecnt@ of type @T@, managed by the compiler.
    436 For example, the type @Odd@ satisfies @AutoInitializable@:
    437 \begin{cfa}
    438 struct Odd { int i; };
    439 void ?{}( Odd & o, int v ) { if ( v & 1 ) o.i = v; else /* error not odd */ ; };
    440 void ?{}( Odd & o, zero_t ) { o.i = 1; };
    441 Odd ?++( Odd o ) { return (Odd){ o.i + 2 }; };
    442 \end{cfa}
    443 and implicit initialization is available.
    444 \begin{cfa}
    445 enum( Odd ) { A, B, C = 7, D };                 $\C{// 1, 3, 7, 9}$
    446 \end{cfa}
    447 where the compiler performs the following transformation and runs the code.
    448 \begin{cfa}
    449 enum( Odd ) {
    450         ?{}( ecnt, @0@ }  ?{}( A, ecnt },       ?++( ecnt )  ?{}( B, ecnt ),
    451         ?{}( ecnt, 7 )  ?{}( C, ecnt ), ?++( ecnt )  ?{}( D, ecnt )
    452 };
    453 \end{cfa}
    454 
    455 Unfortunately, auto-initialization is not implemented because \CFA is only a transpiler, relying on generated C code to perform the detail work.
    456 C does not have the equivalent of \CC \lstinline[language={[GNU]C++}]{constexpr}, and it is currently beyond the scope of the \CFA project to implement a complex runtime interpreter in the transpiler.
    457 Nevertheless, the necessary language concepts exist to support this feature.
    458 
    459 
    460 \section{Enumeration Features}
    461 
    462 
    463195\section{Iteration and Range}
    464196
     
    541273for ( char * ch; labels( Alphabet ) )
    542274\end{cfa}
    543 
    544 
    545 % \section{Non-uniform Type}
    546 % TODO: Working in Progress, might need to change other sections. Conflict with the resolution right now.
    547 
    548 % \begin{cfa}
    549 % enum T( int, char * ) {
    550 %        a=42, b="Hello World"
    551 % };
    552 % \end{cfa}
    553 % The enum T declares two different types: int and char *. The enumerators of T hold values of one of the declared types.
    554 
    555 \section{Enumeration Inheritance}
    556 
    557 \begin{cfa}[label=lst:EnumInline]
    558 enum( char * ) Name { Jack = "Jack", Jill = "Jill" };
    559 enum /* inferred */ Name2 { inline Name, Sue = "Sue", Tom = "Tom" };
    560 \end{cfa}
    561 \lstinline{Inline} allows Enumeration Name2 to inherit enumerators from Name1 by containment, and a Name enumeration is a subtype of enumeration Name2. An enumeration instance of type Name can be used where an instance of Name2 is expected.
    562 \begin{cfa}[label=lst:EnumInline]
    563 Name Fred;
    564 void f( Name2 );
    565 f( Fred );
    566 \end{cfa}
    567 If enumeration A declares @inline B@ in its enumeration body, enumeration A is the "inlining enum" and enumeration B is the "inlined enum".
    568 
    569 An enumeration can inline at most one other enumeration. The inline declaration must be placed before the first enumerator of the inlining enum. The inlining enum has all the enumerators from the inlined enum, with the same labels, values, and position.
    570 \begin{cfa}[label=lst:EnumInline]
    571 enum /* inferred */ Name2 { inline Name, Sue = "Sue", Tom = "Tom" };
    572 // is equivalent to enum Name2 { Jack = "Jack", Jill="Jill", Sue = "Sue", Tom = "Tom" };
    573 \end{cfa}
    574 Name.Jack is equivalent to Name2.Jack. Their attributes are all identical. Opening both Name and Name2 in the same scope will not introduce ambiguity.
    575 \begin{cfa}[label=lst:EnumInline]
    576 with( Name, Name2 ) { Jack; } // Name.Jack and Name2.Jack are equivalent. No ambiguity
    577 \end{cfa}
    578 
    579 \section{Implementation}
    580 
    581 \section{Static Attribute Expression}
    582 \begin{cfa}[label=lst:static_attr]
    583 enum( char * ) Colour {
    584         Red = "red", Blue = "blue", Green = "green"
    585 };
    586 \end{cfa}
    587 An enumerator expression returns its enumerator value as a constant expression with no runtime cost. For example, @Colour.Red@ is equivalent to the constant expression "red", and \CFA finishes the expression evaluation before generating the corresponding C code. Applying a pseudo-function to a constant enumerator expression results in a constant expression as well. @value( Colour.Red )@, @position( Colour. Red )@, and @label( Colour.Red )@ are equivalent to constant expression with char * value "red", int value 0, and char * value "Red", respectively.
    588 
    589 \section{Runtime Attribute Expression and Weak Referenced Data}
    590 \begin{cfa}[label=lst:dynamic_attr]
    591 Colour c;
    592 ...
    593 value( c ); // or c
    594 \end{cfa}
    595 An enumeration variable c is equivalent to an integer variable with the value of @position( c )@ In Example~\ref{lst:dynamic_attr}, the value of enumeration variable c is unknown at compile time. In this case, the pseudo-function calls are reduced to expression that returns the enumerator values at runtime.
    596 
    597 \CFA stores the variables and labels in @const@ arrays to provide runtime lookup for enumeration information.
    598 
    599 \begin{cfa}[label=lst:attr_array]
    600 const char * Colour_labels [3] = { "Red", "Blue", "Green" };
    601 const char * Colour_values [3] = { "red", "blue", "green" };
    602 \end{cfa}
    603 The \CFA compiles transforms the attribute expressions into array access.
    604 \begin{cfa}[label=lst:attr_array_access]
    605 position( c ) // c; an integer
    606 value( c ); // Colour_values[c]
    607 label( c ); // Colour_labels[c]
    608 \end{cfa}
    609 
    610 To avoid unnecessary memory usage, the labels and values array are only generated as needed, and only generate once across all compilation units. By default, \CFA defers the declaration of the label and value arrays until an call to attribute function with a dynamic value. If an attribute function is never called on a dynamic value of an enumerator, the array will never be allocated. Once the arrays are created, all compilation units share a weak reference to the allocation array.
    611 
    612 \section{Enum Prelude}
    613 
    614 \begin{cfa}[label=lst:enum_func_dec]
    615 forall( T ) {
    616         unsigned position( unsigned );
    617         T value( unsigned );
    618         char * label( unsigned );
    619 }
    620 \end{cfa}
    621 \CFA loads the declaration of enumeration function from the enum.hfa.
    622 
    623 \section{Internal Representation}
    624 
    625 The definition of an enumeration is represented by an internal type called @EnumDecl@. At the minimum, it stores all the information needed to construct the companion object. Therefore, an @EnumDecl@ can be represented as the following:
    626 \begin{cfa}[label=lst:EnumDecl]
    627 forall(T)
    628 class EnumDecl {
    629         T* values;
    630         char** label;
    631 };
    632 \end{cfa}
    633 
    634 The internal representation of an enumeration constant is @EnumInstType@.
    635 An @EnumInstType@ has a reference to the \CFA-enumeration declaration and the position of the enumeration constant.
    636 \begin{cfa}[label=lst:EnumInstType]
    637 class EnumInstType {
    638         EnumDecl enumDecl;
    639         int position;
    640 };
    641 \end{cfa}
    642 In the later discussion, we will use @EnumDecl<T>@ to symbolize a @EnumDecl@ parameterized by type T, and @EnumInstType<T>@ is a declared instance of @EnumDecl<T>@.
    643 
    644 \begin{cfa}[caption={Enum Type Functions}, label=lst:cforall_enum_data]
    645 const T * const values;
    646 const char * label;
    647 int length;
    648 \end{cfa}
    649 Companion data are necessary information to represent an enumeration. They are stored as standalone pieces, rather than a structure. Those data will be loaded "on demand".
    650 Companion data are needed only if the according pseudo-functions are called. For example, the value of the enumeration Workday is loaded only if there is at least one compilation that has call $value(Workday)$. Once the values are loaded, all compilations share these values array to reduce memory usage.
    651 
    652 
    653 % \section{(Rework) Companion Object and Companion Function}
    654 
    655 % \begin{cfa}[caption={Enum Type Functions}, label=lst:cforall_enum_functions]
    656 % forall( T )
    657 % struct Companion {
    658 %       const T * const values;
    659 %                const char * label;
    660 %       int length;
    661 % };
    662 % \end{cfa}
    663 % \CFA generates companion objects, an instance of structure that encloses @necessary@ data to represent an enumeration. The size of the companion is unknown at the compilation time, and it "grows" in size to compensate for the @usage@.
    664 
    665 % The companion object is singleton across the compilation (investigation).
    666 
    667 % \CFA generates the definition of companion functions.
    668 % Because \CFA implicitly stores an enumeration instance as its position, the companion function @position@ does nothing but return the position it is passed.
    669 % Companions function @value@ and @label@ return the array item at the given position of @values@ and @labels@, respectively.
    670 % \begin{cfa}[label=lst:companion_definition]
    671 % int position( Companion o, int pos ) { return pos; }
    672 % T value( Companion o, int pos ) { return o.values[ pos ]; }
    673 % char * label( Companion o, int pos ) { return o.labels[ pos ]; }
    674 % \end{cfa}
    675 % Notably, the @Companion@ structure definition, and all companion objects, are visible to users.
    676 % A user can retrieve values and labels defined in an enumeration by accessing the values and labels directly, or indirectly by calling @Companion@ functions @values@ and @labels@
    677 % \begin{cfa}[label=lst:companion_definition_values_labels]
    678 % Colour.values; // read the Companion's values
    679 % values( Colour ); // same as Colour.values
    680 % \end{cfa}
    681 
    682 \section{Companion Traits (experimental)}
    683 Not sure its semantics yet, and it might replace a companion object.
    684 \begin{cfa}[label=lst:companion_trait]
    685 forall(T1) {
    686         trait Companion(otype T2<otype T1>) {
    687                 T1 value((otype T2<otype T1> const &);
    688                 int position(otype T2<otype T1> const &);
    689                 char * label(otype T2<otype T1> const &);
    690         }
    691 }
    692 \end{cfa}
    693 All enumerations implicitly implement the Companion trait, an interface to access attributes. The Companion can be a data type because it fulfills to requirements to have concrete instances, which are:
    694 
    695 \begin{enumerate}
    696   \item The instance of enumeration has a single polymorphic type.
    697   \item Each assertion should use the type once as a parameter.
    698 \end{enumerate}
    699 
    700 \begin{cfa}
    701 enum(int) Weekday {
    702         Mon = 10, Tue, ...
    703 };
    704 
    705 T value( enum Weekday<T> & this);
    706 int position( enum Weekday<T> & this )
    707 char * label( enum Weekday<T> & this )
    708 
    709 trait Companion obj = (enum(int)) Workday.Weekday;
    710 value(obj); // 10
    711 \end{cfa}
    712 The enumeration comes with default implementation to the Companion traits functions. The usage of Companion functions would make \CFA allocates and initializes the necessary companion arrays, and return the data at the position represented by the enumeration.
    713 (...)
    714 
    715 \section{User Define Enumeration Functions}
    716 
    717 Companion objects make extending features for \CFA enumeration easy.
    718 \begin{cfa}[label=lst:companion_user_definition]
    719 char * charastic_string( Companion o, int position ) {
    720         return sprintf( "Label: %s; Value: %s", label( o, position ), value( o, position) );
    721 }
    722 printf( charactic_string ( Color, 1 ) );
    723 >>> Label: Green; Value: G
    724 \end{cfa}
    725 Defining a function takes a Companion object effectively defines functions for all \CFA enumeration.
    726 
    727 The \CFA compiler turns a function call that takes an enumeration instance as a parameter into a function call with a companion object plus a position.
    728 Therefore, a user can use the syntax with a user-defined enumeration function call:
    729 \begin{cfa}[label=lst:companion_user_definition]
    730 charactic_string( Color.Green ); // equivalent to charactic_string( Color, 1 )
    731 >>> Label: Green; Value: G
    732 \end{cfa}
    733 Similarly, the user can work with the enumeration type itself: (see section ref...)
    734 \begin{cfa}[ label=lst:companion_user_definition]
    735 void print_enumerators ( Companion o ) {
    736         for ( c : Companion o ) {
    737                 sout | label (c) | value( c ) ;
    738         }
    739 }
    740 print_enumerators( Colour );
    741 \end{cfa}
    742 
    743 
    744 \section{Declaration}
    745 
    746 The qualified enumeration syntax is dedicated to \CFA enumeration.
    747 \begin{cfa}[label=lst:range_functions]
    748 enum (type_declaration) name { enumerator = const_expr, enumerator = const_expr, ... }
    749 \end{cfa}
    750 A compiler stores the name, the underlying type, and all enumerators in an @enumeration table@.
    751 During the $Validation$ pass, the compiler links the type declaration to the type's definition.
    752 It ensures that the name of an enumerator is unique within the enumeration body, and checks if all values of the enumerator have the declaration type.
    753 If the declared type is not @AutoInitializable@, \CFA rejects the enumeration definition.
    754 Otherwise, it attempts to initialize enumerators with the enumeration initialization pattern. (a reference to a future initialization pattern section)
    755 
    756 \begin{cfa}[label=lst:init]
    757 struct T { ... };
    758 void ?{}( T & t, zero_t ) { ... };
    759 void ?{}( T & t, one_t ) { ... };
    760 T ?+?( T & lhs, T & rhs ) { ... };
    761 
    762 enum (T) Sample {
    763         Zero: 0 /* zero_t */,
    764         One: Zero + 1 /* ?+?( Zero, one_t ) */ , ...
    765 };
    766 \end{cfa}
    767 Challenge: \\
    768 The value of an enumerator, or the initializer, requires @const_expr@.
    769 While previously getting around the issue by pushing it to the C compiler, it might not work anymore because of the user-defined types, user-defined @zero_t@, @one_t@, and addition operation.
    770 Might not be able to implement a \emph{correct} static check.
    771 
    772 \CFA $autogens$ a Companion object for the declared enumeration.
    773 \begin{cfa}[label=lst:companion]
    774 Companion( T ) Sample {
    775         .values: { 0, 0+1, 0+1+1, 0+1+1+1, ... }, /* 0: zero_t, 1: one_t, +: ?+?{} */
    776         .labels: { "Zero", "One", "Two", "Three", ...},
    777         .length: /* number of enumerators */
    778 };
    779 \end{cfa}
    780 \CFA stores values as intermediate expressions because the result of the function call to the function @?+?{}(T&, T&)@ is statically unknown to \CFA.
    781 But the result is computed at run time, and the compiler ensures the @values@ are not changed.
    782 
    783 \section{Qualified Expression}
    784 
    785 \CFA uses qualified expression to address the scoping of \CFA-enumeration.
    786 \begin{cfa}[label=lst:qualified_expression]
    787 aggregation_name.field;
    788 \end{cfa}
    789 The qualified expression is not dedicated to \CFA enumeration.
    790 It is a feature that is supported by other aggregation in \CFA as well, including a C enumeration.
    791 When C enumerations are unscoped, the qualified expression syntax still helps to disambiguate names in the context.
    792 \CFA recognizes if the expression references a \CFA aggregation by searching the presence of @aggregation_name@ in the \CFA enumeration table.
    793 If the @aggregation_name@ is identified as a \CFA enumeration, the compiler checks if @field@ presents in the declared \CFA enumeration.
    794 
    795 \section{Instance Declaration}
    796 
    797 
    798 \begin{cfa}[label=lst:var_declaration]
    799 enum Sample s1;
    800 \end{cfa}
    801 
    802 The declaration \CFA-enumeration variable has the same syntax as the C-enumeration. Internally, such a variable will be represented as an EnumInstType.
  • doc/theses/jiada_liang_MMath/uw-ethesis.tex

    r5aeb1a9 r38e20a80  
    226226\input{intro}
    227227\input{background}
     228\input{CEnum}
    228229\input{CFAenum}
    229230\input{implementation}
    230231\input{relatedwork}
    231 \input{performance}
    232232\input{conclusion}
    233233
Note: See TracChangeset for help on using the changeset viewer.