\chapter{\CFA Enumeration}

\CFA supports C enumeration using the same syntax and semantics for backwards compatibility.
\CFA also extends C-style enumeration by adding a number of new features that bring enumerations in line with other modern programming languages.
Any enumeration extensions must be intuitive to C programmers both in syntax and semantics.
The following sections detail all of my new contributions to enumerations in \CFA.

\begin{comment}
Not supported.
\end{comment}
% \section{Aliasing}
% C already provides @const@-style aliasing using the unnamed enumerator \see{\VRef{s:TypeName}}, even if the name @enum@ is misleading (@const@ would be better).
% Given the existence of this form, it is straightforward to extend it with types other than @int@.
% \begin{cfa}
% enum E { Size = 20u, PI = 3.14159L, Jack = L"John" };
% \end{cfa}
% which matches with @const@ aliasing in other programming languages.
% Here, the type of the enumerator is the type of the initialization constant, \eg @typeof(20u)@ for @Size@ implies @unsigned int@.
% Auto-initialization is restricted to the case where all constants are @int@, matching with C.
% As seen in \VRef{s:EnumeratorTyping}, this feature is just a shorthand for multiple typed-enumeration declarations.

\section{Enumerator Visibility}
\label{s:EnumeratorVisibility}

In C, unscoped enumerators present a \newterm{naming problem} when multiple enumeration types appear in the same scope with duplicate enumerator names.
There is no mechanism in C to resolve these naming conflicts other than renaming one of the duplicates, which may be impossible if the conflict comes from system include files.

The \CFA type-system allows extensive overloading, including enumerators.
Furthermore, \CFA uses the environment, such as the left-hand side of an assignment or a function parameter, to pinpoint the best overloaded name.
% Furthermore, \CFA uses the left-hand of assignment in type resolution to pinpoint the best overloaded name.
Finally, qualification and casting are provided to disambiguate any ambiguous situations.
\begin{cfa}
enum E1 { First, Second, Third, Fourth };
enum E2 { @Fourth@, @Third@, @Second@, @First@ }; $\C{// same enumerator names}$
E1 f() { return Third; }				$\C{// overloaded functions, different return types}$
E2 f() { return Fourth; }
void g(E1 e);
void h(E2 e);
void foo() {
	E1 e1 = First;   E2 e2 = First;		$\C{// initialization}$
	e1 = Second;   e2 = Second;			$\C{// assignment}$
	e1 = f();   e2 = f();				$\C{// function return}$
	g(First);   h(First);				$\C{// function parameter}$
	int i = @E1.@First + @E2.@First;	$\C{// disambiguate with qualification}$
	int j = @(E1)@First + @(E2)@First;	$\C{// disambiguate with cast}$
}
\end{cfa}
\CFA overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
Experience from \CFA developers is that the type system implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded variables and functions.
Any ambiguity can be resolved using qualification or casting.

\section{Enumerator Scoping}

An enumeration can be scoped, using @'!'@, so the enumerator constants are not projected into the enclosing scope.
\begin{cfa}
enum Week @!@ { Mon, Tue, Wed, Thu = 10, Fri, Sat, Sun };
enum RGB @!@ { Red, Green, Blue };
\end{cfa}
Now the enumerators \emph{must} be qualified with the associated enumeration type.
\begin{cfa}
Week week = @Week.@Mon;
week = @Week.@Sat;
RGB rgb = @RGB.@Red;
rgb = @RGB.@Blue;
\end{cfa}
It is possible to toggle back to unscoping using the \CFA @with@ clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
\begin{cfa}
with ( @Week@, @RGB@ ) {				$\C{// type names}$
	week = @Sun@;						$\C{// no qualification}$
	rgb = @Green@;
}
\end{cfa}
As in Section~\ref{s:EnumeratorVisibility}, opening multiple scoped enumerations in a @with@ can result in duplicate enumerator names, but \CFA implicit type resolution and explicit qualification/casting handle this localized scenario.

\section{Enumeration Traits}

\CFA defines a set of traits containing operators and helper functions for @enum@.
A \CFA enumeration satisfies all of these traits, allowing it to interact with runtime features in \CFA.
Each trait is discussed in detail.

The trait @CfaEnum@:
\begin{cfa}
forall( E ) trait CfaEnum {
	char * label( E e );
	unsigned int posn( E e );
};
\end{cfa}
describes an enumeration as a named constant with a position.
The trait @TypeEnum@:
\begin{cfa}
forall( E, V ) trait TypeEnum {
	V value( E e );
};
\end{cfa}
asserts two types @E@ and @V@, with @V@ being the base type for the enumeration @E@.

The declarative syntax
\begin{cfa}
enum(T) E { A = ..., B = ..., C = ... };
\end{cfa}
creates an enumerated type @E@ with @label@, @posn@ and @value@ implemented automatically.
\begin{cfa}
void foo( T t ) { ... }
void bar(E e) {
	choose (e) {
		case A: printf("\%d", posn(e));
		case B: printf("\%s", label(e));
		case C: foo(value(e));
	}
}
\end{cfa}

Implementing general functions across all enumeration types is possible by asserting @CfaEnum( E )@ and @TypeEnum( E, T )@, \eg:
\begin{cfa}
#include
forall( E, T | CfaEnum( E ) | TypeEnum( E, T ) | { unsigned int toUnsigned( T ); } )
string formatEnum( E e ) {
	unsigned int v = toUnsigned( value( e ) );
	string out = label( e ) + '(' + v + ')';
	return out;
}
formatEnum( Week.Mon );
formatEnum( RGB.Green );
\end{cfa}

\CFA does not define attribute functions for C-style enumerations.
However, it is possible for users to explicitly implement the enumeration traits for C enumerations and any other types.
\begin{cfa}
enum Fruit { Apple, Banana, Cherry };	$\C{// C enum}$
char * label(Fruit f) {
	switch(f) {
		case Apple: return "A";
		case Banana: return "B";
		case Cherry: return "C";
	}
}
unsigned posn(Fruit f) { return f; }
char * value(Fruit f) { return ""; }	$\C{// value can return any non void type}$
formatEnum( Apple );					$\C{// Fruit is now a Cfa enum}$
\end{cfa}

A type that implements only trait @CfaEnum@, \ie a type with no @value@ function, is called an opaque enumeration.

% \section{Enumerator Opaque Type}
% \CFA provides a special opaque enumeration type, where the internal representation is chosen by the compiler and only equality operations are available.
\begin{cfa}
enum@()@ Planet { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, URANUS, NEPTUNE };
\end{cfa}

In addition, \CFA implements @Bounded@ and @Serial@ for \CFA enumerations.
\begin{cfa}
forall( E ) trait Bounded {
	E first();
	E last();
};
\end{cfa}
The functions @first()@ and @last()@ of enumerated type @E@ return the first and the last enumerator declared in @E@, respectively, \eg:
\begin{cfa}
Workday day = first();					$\C{// Mon}$
Planet outermost = last();				$\C{// NEPTUNE}$
\end{cfa}
@first()@ and @last()@ are overloaded on return type only, so in the example, the enumeration type is found on the left-hand side of the assignment.
Calling either function without a context results in a type ambiguity, except in the rare case where the type environment has only one enumeration.
\begin{cfa}
@first();@								$\C{// ambiguous because both Workday and Planet implement Bounded}$
sout | @last()@;
Workday day = first();					$\C{// day provides type Workday}$
void foo( Planet p );
foo( last() );							$\C{// parameter provides type Planet}$
\end{cfa}

The trait @Serial@:
\begin{cfa}
forall( E | Bounded( E ) ) trait Serial {
	unsigned fromInstance( E e );
	E fromInt( unsigned int posn );
	E succ( E e );
	E pred( E e );
};
\end{cfa}
is a @Bounded@ trait, where elements can be mapped to an integer sequence.
A type @T@ matching @Serial@ can project to an unsigned @int@ type, \ie an instance of type @T@ has a corresponding integer value.
%However, the inverse may not be possible, and possible requires a bound check.
The mapping from a serial type to integer is defined by @fromInstance@, which returns the enumerator's position.
The inverse operation is @fromInt@, which performs a bound check using @first()@ and @last()@ before casting the integer into an enumerator.
Specifically, for enumeration @E@ declaring $N$ enumerators, @fromInt( i )@ returns the enumerator at position $i$ (the $(i+1)^{th}$ enumerator declared), if $0 \leq i < N$, or raises the exception @enumBound@.

The functions @succ( E e )@ and @pred( E e )@ imply the enumeration positions are consecutive and ordinal.
Specifically, if @e@ is the $i^{th}$ enumerator, @succ( e )@ returns the $(i+1)^{th}$ enumerator when $e \ne last()$, and @pred( e )@ returns the $(i-1)^{th}$ enumerator when $e \ne first()$.
The exception @enumRange@ is raised if the result of either operation is outside the range of type @E@.

Finally, there is an associated set of comparison operators among enumerators, defined in terms of enumerator position.
\begin{cfa}
forall( E | CfaEnum( E ) ) {
	// comparison
	int ?==?( E l, E r ); 		$\C{// true if l and r are same enumerators}$
	int ?!=?( E l, E r ); 		$\C{// true if l and r are different enumerators}$
	int ?!=?( E l, zero_t );	$\C{// true if l is not the first enumerator}$
	int ?<?( E l, E r ); 		$\C{// true if l is an enumerator before r}$
	int ?<=?( E l, E r ); 		$\C{// true if l before or the same as r}$
	int ?>?( E l, E r ); 		$\C{// true if l is an enumerator after r}$
	int ?>=?( E l, E r ); 		$\C{// true if l after or the same as r}$
}
\end{cfa}

\section{Typed Enum}
\label{s:EnumeratorTyping}

\CFA extends the enumeration declaration by parameterizing with a type (like a generic type), allowing enumerators to be assigned any values from the declared type.
Figure~\ref{f:EumeratorTyping} shows a series of examples illustrating that all \CFA types can be used with an enumeration and each type's constants used to set the enumerator constants.
Note, the synonyms @Liz@ and @Beth@ in the last declaration.
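
To make the distinction among an enumerator's label, position and value concrete, the following plain-C sketch (restricted to the C subset of \CFA) hand-codes what a typed enumeration such as @enum( const char * ) Name@ conceptually provides.
It is an illustration under stated assumptions only: the tables @Name_labels@ and @Name_values@ and the helpers @Name_label@, @Name_posn@ and @Name_value@ are hypothetical names invented here, and the sketch does not claim to reflect how the \CFA compiler represents enumerations.
\begin{cfa}
enum Name { Fred, Mary, Jane, NameCount };	// positions 0, 1, 2

// hypothetical companion tables, indexed by enumerator position
static const char * const Name_labels[NameCount] = { "Fred", "Mary", "Jane" };
static const char * const Name_values[NameCount] = { "FRED", "MARY", "JANE" };

// hand-written analogues of the three attribute functions
const char * Name_label( enum Name e ) { return Name_labels[e]; }
unsigned int Name_posn( enum Name e ) { return (unsigned int)e; }
const char * Name_value( enum Name e ) { return Name_values[e]; }
\end{cfa}
In this sketch, the position is the only thing stored in a variable of the enumeration type; the label and value are looked up on demand, so two enumerators may share a value without sharing a position.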
Because enumerators are constants, the enumeration type is implicitly @const@, so all the enumerator types in Figure~\ref{f:EumeratorTyping} are logically rewritten with @const@.

C has an implicit type conversion from an enumerator to its base type @int@.
Correspondingly, \CFA has an implicit (safe) conversion from a typed enumerator to its base type.
\begin{cfa}
char currency = Dollar;
string fred = Fred;						$\C{// implicit conversion from char * to \CFA string type}$
Person student = Beth;
\end{cfa}

% \begin{cfa}
% struct S { int i, j; };
% enum( S ) s { A = { 3, 4 }, B = { 7, 8 } };
% enum( @char@ ) Currency { Dollar = '$\textdollar$', Euro = '$\texteuro$', Pound = '$\textsterling$' };
% enum( @double@ ) Planet { Venus = 4.87, Earth = 5.97, Mars = 0.642 }; // mass
% enum( @char *@ ) Colour { Red = "red", Green = "green", Blue = "blue" };
% enum( @Currency@ ) Europe { Euro = '$\texteuro$', Pound = '$\textsterling$' }; // intersection
% \end{cfa}

\begin{figure}
\begin{cfa}
// integral
	enum( @char@ ) Currency { Dollar = '$\textdollar$', Cent = '$\textcent$', Yen = '$\textyen$', Pound = '$\textsterling$', Euro = 'E' };
	enum( @signed char@ ) srgb { Red = -1, Green = 0, Blue = 1 };
	enum( @long long int@ ) BigNum { X = 123_456_789_012_345, Y = 345_012_789_456_123 };
// non-integral
	enum( @double@ ) Math { PI_2 = 1.570796, PI = 3.141593, E = 2.718282 };
	enum( @_Complex@ ) Plane { X = 1.5+3.4i, Y = 7+3i, Z = 0+0.5i };
// pointer
	enum( @const char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
	int i, j, k;
	enum( @int *@ ) ptr { I = &i, J = &j, K = &k };
	enum( @int &@ ) ref { I = i, J = j, K = k };
// tuple
	enum( @[int, int]@ ) { T = [ 1, 2 ] };	$\C{// new \CFA type}$
// function
	void f() {...}   void g() {...}
	enum( @void (*)()@ ) funs { F = f, G = g };
// aggregate
	struct Person { char * name; int age, height; };
@***@enum( @Person@ ) friends { @Liz@ = { "ELIZABETH", 22, 170 }, @Beth@ = Liz,
									Jon = { "JONATHAN", 35, 190 } };
\end{cfa}
\caption{Enumerator Typing}
\label{f:EumeratorTyping}
\end{figure}

An advantage of typed enumerations is eliminating the \emph{harmonizing} problem between an enumeration and its companion data \see{\VRef{s:Usage}}:
\begin{cfa}
enum( char * ) integral_types {
	chr = "char", schar = "signed char", uschar = "unsigned char",
	sshort = "signed short int", ushort = "unsigned short int",
	sint = "signed int", usint = "unsigned int",
	...
};
\end{cfa}
Note, the enumeration type can be a structure (see @Person@ in Figure~\ref{f:EumeratorTyping}), so it is possible to have the equivalent of multiple arrays of companion data using an array of structures.

While the enumeration type can be any C aggregate, the aggregate's \CFA constructors are not used to evaluate an enumerator's value.
\CFA enumeration constants are compile-time values (static);
calling constructors happens at runtime (dynamic).

\section{Enumeration Inheritance}

\CFA Plan-9 inheritance may be used with enumerations, where Plan-9 inheritance is containment inheritance with implicit unscoping (like a nested unnamed @struct@/@union@ in C).
\begin{cfa}
enum( char * ) Names { /* as above */ };
enum( char * ) Names2 { @inline Names@, Jack = "JACK", Jill = "JILL" };
enum( char * ) Names3 { @inline Names2@, Sue = "SUE", Tom = "TOM" };
\end{cfa}
Enumeration @Names2@ inherits all the enumerators and their values from enumeration @Names@ by containment, and a @Names@ enumeration is a subtype of enumeration @Names2@.
Note that enumerators must be unique in inheritance but enumerator values may be repeated.

% The enumeration type for the inheriting type must be the same as the inherited type;
% hence the enumeration type may be omitted for the inheriting enumeration and it is inferred from the inherited enumeration, as for @Name3@.
% When inheriting from integral types, automatic numbering may be used, so the inheritance placement left to right is important.
Specifically, the inheritance relationship for @Names@ is:
\begin{cfa}
Names $\(\subset\)$ Names2 $\(\subset\)$ Names3		$\C{// enum type of Names}$
\end{cfa}
A subtype can be cast to its supertype, assigned to a supertype variable, or be used as a function argument that expects the supertype.
\begin{cfa}
Names fred = Names.Fred;
(Names2) fred;   (Names3) fred;   (Names3) Names2.Jack;	$\C{// cast to super type}$
Names2 fred2 = fred;   Names3 fred3 = fred2;	$\C{// assign to super type}$
\end{cfa}
For the given function prototypes, the following calls are valid.
\begin{cquote}
\begin{tabular}{ll}
\begin{cfa}
void f( Names );
void g( Names2 );
void h( Names3 );
void j( const char * );
\end{cfa}
&
\begin{cfa}
f( Fred );
g( Fred );   g( Jill );
h( Fred );   h( Jill );   h( Sue );
j( Fred );   j( Jill );   j( Sue );   j( "WILL" );
\end{cfa}
\end{tabular}
\end{cquote}
Note, the validity of calls is the same for call-by-reference as for call-by-value, and @const@ restrictions are the same as for other types.

\section{Enumerator Control Structures}

Enumerators can be used in multiple contexts.
In most programming languages, an enumerator is implicitly converted to its value (like a typed macro substitution).
However, enumerator synonyms and typed enumerations make this implicit conversion to value incorrect in some contexts.
In these contexts, a programmer's intuition assumes an implicit conversion to position.

For example, an intuitive use of enumerations is with the \CFA @switch@/@choose@ statement, where @choose@ performs an implicit @break@ rather than a fall-through at the end of a @case@ clause.
\begin{cquote}
\begin{cfa}
enum Count { First, Second, Third, Fourth };
Count e;
\end{cfa}
\begin{tabular}{ll}
\begin{cfa}
choose( e ) {
	case @First@: ...;
	case @Second@: ...;
	case @Third@: ...;
	case @Fourth@: ...;
}
\end{cfa}
&
\begin{cfa}
// rewrite
choose( @value@( e ) ) {
	case @value@( First ): ...;
	case @value@( Second ): ...;
	case @value@( Third ): ...;
	case @value@( Fourth ): ...;
}
\end{cfa}
\end{tabular}
\end{cquote}
Here, the intuitive code on the left is implicitly transformed into the standard implementation on the right, using the value of the enumeration variable and enumerators.
However, this implementation is fragile, \eg if the enumeration is changed to:
\begin{cfa}
enum Count { First, Second, Third @= First@, Fourth };
\end{cfa}
which makes @Third == First@ and @Fourth == Second@, causing a compilation error because of duplicate @case@ clauses.
To better match programmer intuition, \CFA toggles between value and position semantics depending on the language context.
For conditional clauses and switch statements, \CFA uses the robust position implementation.
\begin{cfa}
choose( @posn@( e ) ) {
	case @posn@( First ): ...;
	case @posn@( Second ): ...;
	case @posn@( Third ): ...;
	case @posn@( Fourth ): ...;
}
\end{cfa}

\begin{cfa}
Count variable_a = First, variable_b = Second, variable_c = Third, variable_d = Fourth;
p(variable_a); // 0
p(variable_b); // 1
p(variable_c); // "Third"
p(variable_d); // 3
\end{cfa}

\begin{cfa}
for (d; Workday) { sout | d; }
for (p; +~=Planet) { sout | p; }
for (c; -~=Alphabet ) { sout | c; }
\end{cfa}
The @range loop@ for enumerations is syntactic sugar that loops over all enumerators, assigning each enumerator to a variable in turn.
The loop control of the range loop consists of two parts: a variable declaration and a @range expression@, where the type of the variable can be inferred from the range expression.
The range expression is an enumeration type, optionally prefixed by @+~=@ or @-~=@.
Without a prefix, or prefixed with @+~=@, the loop iterates over all enumerators from the first to the last.
With a @-~=@ prefix, the loop iterates backward, from the last enumerator to the first.

On a side note, the loop syntax
\begin{cfa}
for ( typeof(Workday) d; d <= last(); d = succ(d) );
\end{cfa}
does not work.
When @d == last()@, the loop control still attempts to assign @succ(d)@ to @d@, which raises an @enumRange@ exception.

\CFA evaluates a conditional by comparing its predicate with zero using @!=@: the ``if'' branch is taken when the predicate is not equal to zero, and the ``else'' branch otherwise.
Overloading the @!=@ operator for an enumeration type against @zero_t@ therefore defines a conceptual conversion from the enumeration to boolean, which can be used in predicates.
\begin{cfa}
enum(int) ErrorCode {
	Normal = 0,
	Slow = 1,
	Overheat = 1000,
	OutOfResource = 1001
};
bool ?!=?(ErrorCode lhs, zero_t) { return value(lhs) >= 1000; }
ErrorCode code = ...;
if (code) { scream(); }
\end{cfa}

Note that \CFA does not itself define a boolean conversion for enumerations.
If no @?!=?(ErrorCode, zero_t)@ overload is defined, \CFA looks for a boolean conversion of the enumerator's value and reports a compile-time error if no such conversion is available.
\begin{cfa}
enum(int) Weekday { Mon, Tues, Wed, Thurs, Fri, Sat, Sun, };
enum() Colour { Red, Green, Blue };
enum(S) Fruit { Apple, Banana, Cherry };
Weekday w = ...;
Colour c = ...;
Fruit f = ...;
if (w) { ... } // w is true if and only if w != Mon, because value(Mon) == 0 (auto initialized)
if (c) { ... } // error
if (f) { ... } // depends on ?!=?(S lhs, zero_t ), and error if no such overloading available
\end{cfa}

As an alternative, users can define the boolean conversion for any type satisfying @CfaEnum@:
\begin{cfa}
forall(E | CfaEnum(E))
bool ?!=?(E lhs, zero_t) {
	return posn(lhs) != 0;
}
\end{cfa}
which effectively treats the first enumerator as logical zero and all other enumerators as non-zero.

\section{Enumerated Arrays}

An enumerated array is a \CFA array that uses an enumeration type as its index.
\begin{cfa}
enum() Colour {
	Red, Orange, Yellow, Green, Blue, Indigo, Violet
};
string colourCode[Colour] = { "#e81416", "#ffa500", "#faeb36", "#79c314", "#487de7", "#4b369d", "#70369d" };
sout | "Colour Code of Orange is " | colourCode[Orange];
\end{cfa}

\section{Planet Example}

\VRef[Figure]{f:PlanetExample} shows an archetypal enumeration example illustrating most of the \CFA enumeration features.
@Planet@ is an enumeration of type @MR@.
Each planet enumerator is initialized to a specific mass/radius, @MR@, value.
The unnamed enumeration provides the gravitational-constant enumerator @G@.
Function @surfaceGravity@ uses the @with@ clause to remove @p@ qualification from fields @mass@ and @radius@.
The program main uses the pseudo function @countof@ to obtain the number of enumerators in @Planet@, and safely converts the random value into a @Planet@ enumerator using @fromInt@.
The resulting random orbital-body is used in a @choose@ statement.
The enumerators in the @case@ clauses are tested by enumerator position.
The prints use @label@ to print an enumerator's name.
Finally, a loop enumerates through the planets computing the weight on each planet for a given earth mass.
The print statement does an equality comparison between an enumeration variable and an enumerator (@p == MOON@).
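
For comparison, the core of \VRef[Figure]{f:PlanetExample} can be sketched in plain C (the C subset of \CFA), where the enumeration and its companion data must be kept in step by hand and the safe conversion is an explicit range check.
The sketch rests on assumptions that are not part of \CFA: the names @Orbiter@, @OrbiterCount@, @orbiter_data@ and @orbiter_fromInt@ are hypothetical, only three bodies are listed, and a boolean result stands in for the exception that @fromInt@ raises.
\begin{cfa}
#include <stdbool.h>

struct MR { double mass, radius; };		// kg, m

enum Orbiter { EARTH, MOON, JUPITER, OrbiterCount };	// OrbiterCount stands in for countof

// hypothetical companion data, indexed by enumerator position and kept in step by hand
static const struct MR orbiter_data[OrbiterCount] = {
	[EARTH] = { 5.976e24, 6.3781e6 },
	[MOON] = { 7.346e22, 1.7380e6 },
	[JUPITER] = { 1898.e24, 71.492e6 },
};

static const double G = 6.6743e-11;		// universal gravitational constant (m3 kg-1 s-2)

// hypothetical analogue of fromInt: range check before the cast;
// returns false where CFA would raise an exception
bool orbiter_fromInt( unsigned int posn, enum Orbiter * out ) {
	if ( posn >= OrbiterCount ) return false;
	*out = (enum Orbiter)posn;
	return true;
}

double surfaceGravity( enum Orbiter o ) {
	struct MR mr = orbiter_data[o];
	return G * mr.mass / ( mr.radius * mr.radius );
}

double surfaceWeight( enum Orbiter o, double otherMass ) {
	return otherMass * surfaceGravity( o );
}
\end{cfa}
Given an Earth weight of 100, @surfaceWeight( JUPITER, 100 / surfaceGravity( EARTH ) )@ evaluates to about 252.8, in agreement with the output in the figure.
The typed enumeration removes the need for the separate @orbiter_data@ table and for the hand-written range check.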
\begin{figure}
\small
\begin{cfa}
struct MR { double mass, radius; };
enum( @MR@ ) Planet {						$\C{// typed enumeration}$
	//            mass (kg)   radius (m)
	MERCURY = { 0.330_E24, 2.4397_E6 },
	VENUS = { 4.869_E24, 6.0518_E6 },
	EARTH = { 5.976_E24, 6.3781_E6 },
	MOON = { 7.346_E22, 1.7380_E6 },		$\C{// not a planet}$
	MARS = { 0.642_E24, 3.3972_E6 },
	JUPITER = { 1898._E24, 71.492_E6 },
	SATURN = { 568.8_E24, 60.268_E6 },
	URANUS = { 86.86_E24, 25.559_E6 },
	NEPTUNE = { 102.4_E24, 24.746_E6 },
	PLUTO = { 1.303_E22, 1.1880_E6 },		$\C{// not a planet}$
};
enum( double ) { G = 6.6743_E-11 };			$\C{// universal gravitational constant (m3 kg-1 s-2)}$
static double surfaceGravity( Planet p ) @with( p )@ {
	return G * mass / ( radius @\@ 2 );		$\C{// no qualification, exponentiation}$
}
static double surfaceWeight( Planet p, double otherMass ) {
	return otherMass * surfaceGravity( p );
}
int main( int argc, char * argv[] ) {
	if ( argc != 2 ) @exit@ | "Usage: " | argv[0] | "earth-weight";  // terminate program
	double earthWeight = convert( argv[1] );
	double earthMass = earthWeight / surfaceGravity( EARTH );
	Planet rp = @fromInt@( prng( @countof@( Planet ) ) ); $\C{// select random orbiting body}$
	@choose( rp )@ {						$\C{// implicit breaks}$
	  case MERCURY, VENUS, EARTH, MARS:
		sout | @rp@ | "is a rocky planet";
	  case JUPITER, SATURN, URANUS, NEPTUNE:
		sout | rp | "is a gas-giant planet";
	  default:
		sout | rp | "is not a planet";
	}
	for ( @p; Planet@ ) {					$\C{// enumerate}$
		sout | "Your weight on" | ( @p == MOON@ ? "the" : " " ) | p
			   | "is" | wd( 1,1, surfaceWeight( p, earthMass ) ) | "kg";
	}
}
$\$$ planet 100
JUPITER is a gas-giant planet
Your weight on MERCURY is 37.7 kg
Your weight on VENUS is 90.5 kg
Your weight on EARTH is 100.0 kg
Your weight on the MOON is 16.6 kg
Your weight on MARS is 37.9 kg
Your weight on JUPITER is 252.8 kg
Your weight on SATURN is 106.6 kg
Your weight on URANUS is 90.5 kg
Your weight on NEPTUNE is 113.8 kg
Your weight on PLUTO is 6.3 kg
\end{cfa}
\caption{Planet Example}
\label{f:PlanetExample}
\end{figure}