\chapter{\CFA Enumeration}

% \CFA supports C enumeration using the same syntax and semantics for backwards compatibility.
% \CFA also extends C-Style enumeration by adding a number of new features that bring enumerations inline with other modern programming languages.
% Any enumeration extensions must be intuitive to C programmers both in syntax and semantics.
% The following sections detail all of my new contributions to enumerations in \CFA.
\CFA extends the enumeration declaration by parameterizing with a type (like a generic type).

\begin{clang}[identifierstyle=\linespread{0.9}\it]
$\it enum$-specifier:
	enum @(type-specifier$\(_{opt}\)$)@ identifier$\(_{opt}\)$ { cfa-enumerator-list }
	enum @(type-specifier$\(_{opt}\)$)@ identifier$\(_{opt}\)$ { cfa-enumerator-list , }
	enum @(type-specifier$\(_{opt}\)$)@ identifier
cfa-enumerator-list:
	cfa-enumerator
	cfa-enumerator, cfa-enumerator-list
cfa-enumerator:
	enumeration-constant
	$\it inline$ identifier
	enumeration-constant = expression
\end{clang}

A \newterm{\CFA enumeration}, or \newterm{\CFA enum}, has an optional type declaration in the parentheses following the @enum@ keyword.
Without the optional type declaration, the syntax defines an ``opaque enum''.
Otherwise, a \CFA enum with a type declaration is a ``typed enum''.

\section{Opaque Enum}
\label{s:OpaqueEnum}

An opaque enum is a special \CFA enumeration type, where the internal representation is chosen by the compiler and hidden from users.
Compared to a C enum, an opaque enum is more restrictive in terms of typing, and cannot be implicitly converted to an integer.
Enumerators of an opaque enum cannot have initializers.
Declaring an initializer in the body of an opaque enum results in a syntax error.
\begin{cfa}
enum@()@ Planet { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, URANUS, NEPTUNE };
Planet p = URANUS;
@int i = VENUS; // Error, VENUS cannot be converted into an integral type@
\end{cfa}
Each opaque enum has two @attributes@: @position@ and @label@.
\CFA auto-generates the @attribute functions@ @posn()@ and @label()@ for every \CFA enum to return the respective attributes.
\begin{cfa}
// Auto-generated
int posn(Planet p);
char * label(Planet p);
\end{cfa}
\begin{cfa}
unsigned i = posn(VENUS); // 1
char * s = label(MARS); // "MARS"
\end{cfa}

% \subsection{Representation}
\CFA chooses @signed int@ as the underlying representation of an opaque enum variable, holding the position of the enumerator.
Therefore, @posn()@ is in fact a cast that bypasses the type system, converting a \CFA enum to its integral representation.
Label information is stored in a global array, and @label()@ is a function that maps an enum position to an element of that array.

\section{Typed Enum}
\label{s:EnumeratorTyping}

\CFA extends the enumeration declaration by parameterizing with a type (like a generic type), allowing enumerators to be assigned any values from the declared type.
Figure~\ref{f:EumeratorTyping} shows a series of examples illustrating that all \CFA types can be used with an enumeration, and each type's constants can be used to set the enumerator constants.
Note the synonyms @Liz@ and @Beth@ in the last declaration.
Because enumerators are constants, the enumeration type is implicitly @const@, so all the enumerator types in Figure~\ref{f:EumeratorTyping} are logically rewritten with @const@.
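Before turning to the examples, the position-based representation of \VRef{s:OpaqueEnum} can be approximated in plain C.
The following is only a sketch under stated assumptions, not the code \CFA generates: the names @Planet_N@, @Planet_labels@, @Planet_posn@, and @Planet_label@ are hypothetical stand-ins for the auto-generated @posn()@ and @label()@.

```c
#include <assert.h>

/* Hypothetical C approximation of an opaque enum: a variable holds the position. */
typedef enum { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, URANUS, NEPTUNE } Planet;
enum { Planet_N = 8 };						/* assumed: number of enumerators */

/* Labels live in one global array indexed by position. */
static const char * const Planet_labels[Planet_N] = {
	"MERCURY", "VENUS", "EARTH", "MARS", "JUPITER", "SATURN", "URANUS", "NEPTUNE"
};

/* posn() is only a cast to the integral representation. */
int Planet_posn( Planet p ) { return (int)p; }

/* label() maps the position to an element of the label array. */
const char * Planet_label( Planet p ) {
	int i = Planet_posn( p );
	assert( 0 <= i && i < Planet_N );		/* guard against values forged by casts */
	return Planet_labels[i];
}
```

With these definitions, @Planet_posn( VENUS )@ yields 1 and @Planet_label( MARS )@ yields the string @"MARS"@, matching the attribute calls shown for the opaque enum.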
\begin{figure}
\begin{cfa}
// integral
	enum( @char@ ) Currency { Dollar = '$\textdollar$', Cent = '$\textcent$', Yen = '$\textyen$', Pound = '$\textsterling$', Euro = 'E' };
	enum( @signed char@ ) srgb { Red = -1, Green = 0, Blue = 1 };
	enum( @long long int@ ) BigNum { X = 123_456_789_012_345, Y = 345_012_789_456_123 };
// non-integral
	enum( @double@ ) Math { PI_2 = 1.570796, PI = 3.141593, E = 2.718282 };
	enum( @_Complex@ ) Plane { X = 1.5+3.4i, Y = 7+3i, Z = 0+0.5i };
// pointer
	enum( @const char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
	int i, j, k;
	enum( @int *@ ) ptr { I = &i, J = &j, K = &k };
	enum( @int &@ ) ref { I = i, J = j, K = k };
// tuple
	enum( @[int, int]@ ) { T = [ 1, 2 ] }; $\C{// new \CFA type}$
// function
	void f() {...} void g() {...}
	enum( @void (*)()@ ) funs { F = f, G = g };
// aggregate
	struct Person { char * name; int age, height; };
@***@enum( @Person@ ) friends { @Liz@ = { "ELIZABETH", 22, 170 }, @Beth@ = Liz,
			Jon = { "JONATHAN", 35, 190 } };
\end{cfa}
\caption{Enumerator Typing}
\label{f:EumeratorTyping}
\end{figure}

An advantage of typed enumerations is eliminating the \emph{harmonizing} problem between an enumeration and its companion data \see{\VRef{s:Usage}}:
\begin{cfa}
enum( char * ) integral_types {
	chr = "char", schar = "signed char", uschar = "unsigned char",
	sshort = "signed short int", ushort = "unsigned short int",
	sint = "signed int", usint = "unsigned int",
	...
};
\end{cfa}
Note, the enumeration type can be a structure (see @Person@ in Figure~\ref{f:EumeratorTyping}), so it is possible to have the equivalent of multiple arrays of companion data using an array of structures.

While the enumeration type can be any C aggregate, the aggregate's \CFA constructors are not used to evaluate an enumerator's value.
\CFA enumeration constants are compile-time values (static);
calling constructors happens at runtime (dynamic).

@value@ is an @attribute@ that is defined for a typed enum, along with position and label.
The @values@ of a typed enum are stored in a global array of the declared type, initialized with the values of the enumerator initializers.
The @value()@ function maps an enum to an element of the array.

\subsection{Implicit Conversion}

C has an implicit type conversion from an enumerator to its base type @int@.
Correspondingly, \CFA has an implicit (safe) conversion from a typed enumerator to its base type.
\begin{cfa}
char currency = Dollar;
string fred = Fred; $\C{// implicit conversion from char * to \CFA string type}$
Person student = Beth;
\end{cfa}

% The implicit conversion is accomplished by the compiler adding @value()@ function calls as a candidate with safe cost. Therefore, the expression
% \begin{cfa}
% char currency = Dollar;
% \end{cfa}
% is equivalent to
% \begin{cfa}
% char currency = value(Dollar);
% \end{cfa}
% Such conversion an @additional@ safe

The implicit conversion is accomplished by the resolver adding a call to the @value()@ function as a resolution candidate with an @implicit@ cost.
Implicit cost is an additional category in Aaron's cost model.
It is less significant than @unsafe@, so the compiler chooses the implicit conversion over a narrowing conversion;
it is more significant than @poly@, so a function overloaded with enum traits is selected over the implicit conversion.
@Enum trait@ will be discussed in the chapter.
Therefore, the \CFA conversion cost is the 8-tuple
@(unsafe, implicit, poly, safe, sign, vars, specialization, reference)@.

\section{Auto Initialization}

C auto-initialization works for the integral type @int@ with constant expressions.
\begin{cfa}
enum Alphabet !
{
	A = 'A', B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z,
	a = 'a', b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
};
\end{cfa}
The complexity of the constant expression depends on the level of runtime computation the compiler implements, \eg \CC \lstinline[language={[GNU]C++}]{constexpr} provides complex compile-time computation across multiple types, which blurs the compilation/runtime boundary.

% The notion of auto-initialization can be generalized in \CFA through the trait @AutoInitializable@.
% \begin{cfa}
% forall(T) @trait@ AutoInitializable {
% 	void ?{}( T & o, T v ); $\C{// initialization}$
% 	void ?{}( T & t, zero_t ); $\C{// 0}$
% 	T ?++( T & t); $\C{// increment}$
% };
% \end{cfa}
% In addition, there is an implicit enumeration counter, @ecnt@ of type @T@, managed by the compiler.
% For example, the type @Odd@ satisfies @AutoInitializable@:
% \begin{cfa}
% struct Odd { int i; };
% void ?{}( Odd & o, int v ) { if ( v & 1 ) o.i = v; else /* error not odd */ ; };
% void ?{}( Odd & o, zero_t ) { o.i = 1; };
% Odd ?++( Odd o ) { return (Odd){ o.i + 2 }; };
% \end{cfa}
% and implicit initialization is available.
% \begin{cfa}
% enum( Odd ) { A, B, C = 7, D }; $\C{// 1, 3, 7, 9}$
% \end{cfa}
% where the compiler performs the following transformation and runs the code.
% \begin{cfa}
% enum( Odd ) {
% 	?{}( ecnt, @0@ } ?{}( A, ecnt }, ?++( ecnt ) ?{}( B, ecnt ),
% 	?{}( ecnt, 7 ) ?{}( C, ecnt ), ?++( ecnt ) ?{}( D, ecnt )
% };
% \end{cfa}

The notion of auto-initialization is generalized for a \CFA enum in the following way.
Let e be the first enumerator of \CFA enumeration E with base type T.
If e declares no initializer, e is auto-initialized by the @zero_t@ constructor of T.
\CFA reports a compile-time error if T has no @zero_t@ constructor.
Otherwise, let e be an enumerator without an initializer in enumeration E with base type T at position i, where $i \neq 0$.
Let d be the enumerator at position @i-1@; then e is auto-initialized with the result of @value(d)++@.
If operator @?++@ is not defined for type T, \CFA reports a compile-time error.

Unfortunately, auto-initialization is not implemented because \CFA is only a transpiler, relying on generated C code to perform the detailed work.
C does not have the equivalent of \CC \lstinline[language={[GNU]C++}]{constexpr}, and it is currently beyond the scope of the \CFA project to implement a complex runtime interpreter in the transpiler.
Nevertheless, the necessary language concepts exist to support this feature.

\section{Enumeration Inheritance}

\CFA Plan-9 inheritance may be used with enumerations, where Plan-9 inheritance is containment inheritance with implicit unscoping (like a nested unnamed @struct@/@union@ in C).
\begin{cfa}
enum( char * ) Names { /* as above */ };
enum( char * ) Names2 { @inline Names@, Jack = "JACK", Jill = "JILL" };
enum( char * ) Names3 { @inline Names2@, Sue = "SUE", Tom = "TOM" };
\end{cfa}
Enumeration @Names2@ inherits all the enumerators and their values from enumeration @Names@ by containment, and a @Names@ enumeration is a @subtype@ of enumeration @Names2@.
Note that enumerators must be unique in inheritance, but enumerator values may be repeated.

% The enumeration type for the inheriting type must be the same as the inherited type;
% hence the enumeration type may be omitted for the inheriting enumeration and it is inferred from the inherited enumeration, as for @Name3@.
% When inheriting from integral types, automatic numbering may be used, so the inheritance placement left to right is important.
Specifically, the inheritance relationship for @Names@ is:
\begin{cfa}
Names $\(\subset\)$ Names2 $\(\subset\)$ Names3 $\C{// enum type of Names}$
\end{cfa}

When a new enumeration @N@ inlines a \CFA enumeration @O@, @N@ copies all enumerators from @O@, including those @O@ obtains through inheritance.
Enumerators inherited from @O@ keep the same @label@ and @value@, but their @position@ may shift to the right if other enumerators or inlined enumerations are declared before @inline O@.
\begin{cfa}
enum() Phynchocephalia { Tuatara };
enum() Squamata { Snake, Lizard };
enum() Lepidosauromorpha { inline Phynchocephalia, inline Squamata, Kuehneosauridae };
\end{cfa}
For example, @Snake@ has position 0 in @Squamata@, but position 1 in @Lepidosauromorpha@, because @Tuatara@, inherited from @Phynchocephalia@, is at position 0 in @Lepidosauromorpha@.

A subtype enumeration can be cast, or implicitly converted, into its supertype with a safe cost.
\begin{cfa}
enum Squamata squamata_lizard = Lizard;
posn(squamata_lizard); // 1
enum Lepidosauromorpha lepidosauromorpha_lizard = squamata_lizard;
posn(lepidosauromorpha_lizard); // 2
void foo( Lepidosauromorpha l );
foo( squamata_lizard );
posn( (Lepidosauromorpha) squamata_lizard ); // 2

Lepidosauromorpha s = Snake;
\end{cfa}
The last expression in the preceding example is unambiguous.
While both @Squamata.Snake@ and @Lepidosauromorpha.Snake@ are valid candidates, @Squamata.Snake@ has an associated safe cost, so \CFA selects the zero-cost candidate @Lepidosauromorpha.Snake@.

As discussed in \VRef{s:OpaqueEnum}, \CFA chooses position as the representation of a \CFA enum.
Conversion involves a change of type and possibly of @position@.
When converting a subtype to a supertype, the position can only stay the same or increase.
The difference between the position in the subtype and the position in the supertype is an ``offset''.

\CFA runs the following algorithm to determine the offset of an enumerator in a supertype.
% In a summary, \CFA loops over members (include enumerators and inline enums) of the supertype.
% If the member is the matching enumerator, the algorithm returns its position.
% If the member is a inline enumeration, the algorithm trys to find the enumerator in the inline enumeration.
% If success, it returns the position of enumerator in the inline enumeration, plus
% the position in the current enumeration. Otherwises, it increase the offset by the size of inline enumeration.

\begin{cfa}
struct Enumerator;
struct CFAEnum {
	vector<variant<CFAEnum, Enumerator>> members;	// enumerators and inline enums, in declaration order
};
// returns (found, offset); when not found, offset is the number of enumerators in dst
pair<bool, int> calculateEnumOffset( CFAEnum dst, Enumerator e ) {
	int offset = 0;
	for( auto v: dst.members ) {
		if ( v.holds_alternative<Enumerator>() ) {	// plain enumerator
			auto m = v.get<Enumerator>();
			if ( m == e ) return make_pair( true, 0 );
			offset++;
		} else {	// inline enumeration: search it recursively
			auto p = calculateEnumOffset( v, e );
			if ( p.first ) return make_pair( true, offset + p.second );
			offset += p.second;
		}
	}
	return make_pair( false, offset );
}
\end{cfa}

% \begin{cfa}
% Names fred = Name.Fred;
% (Names2) fred; (Names3) fred; (Name3) Names.Jack; $\C{// cast to super type}$
% Names2 fred2 = fred; Names3 fred3 = fred2; $\C{// assign to super type}$
% \end{cfa}
For the given function prototypes, the following calls are valid.
\begin{cquote}
\begin{tabular}{ll}
\begin{cfa}
void f( Names );
void g( Names2 );
void h( Names3 );
void j( const char * );
\end{cfa}
&
\begin{cfa}
f( Fred );
g( Fred ); g( Jill );
h( Fred ); h( Jill ); h( Sue );
j( Fred ); j( Jill ); j( Sue ); j( "WILL" );
\end{cfa}
\end{tabular}
\end{cquote}
Note, the validity of calls is the same for call-by-reference as for call-by-value, and @const@ restrictions are the same as for other types.

\section{Enumerator Control Structures}

Enumerators can be used in multiple contexts.
In most programming languages, an enumerator is implicitly converted to its value (like a typed macro substitution).
However, enumerator synonyms and typed enumerations make this implicit conversion to value incorrect in some contexts.
In these contexts, a programmer's intuition assumes an implicit conversion to position.

For example, an intuitive use of enumerations is with the \CFA @switch@/@choose@ statement, where @choose@ performs an implicit @break@ rather than a fall-through at the end of a @case@ clause.
(For this discussion, ignore the fact that @case@ requires a compile-time constant.)
\begin{cfa}[belowskip=0pt]
enum Count { First, Second, Third, Fourth };
Count e;
\end{cfa}
\begin{cquote}
\setlength{\tabcolsep}{15pt}
\noindent
\begin{tabular}{@{}ll@{}}
\begin{cfa}[aboveskip=0pt]

choose( e ) {
	case @First@: ...;
	case @Second@: ...;
	case @Third@: ...;
	case @Fourth@: ...;
}
\end{cfa}
&
\begin{cfa}[aboveskip=0pt]
// rewrite
choose( @value@( e ) ) {
	case @value@( First ): ...;
	case @value@( Second ): ...;
	case @value@( Third ): ...;
	case @value@( Fourth ): ...;
}
\end{cfa}
\end{tabular}
\end{cquote}
Here, the intuitive code on the left is implicitly transformed into the standard implementation on the right, using the value of the enumeration variable and enumerators.
However, this implementation is fragile, \eg if the enumeration is changed to:
\begin{cfa}
enum Count { First, Second, Third @= First@, Fourth };
\end{cfa}
making @Third == First@ and @Fourth == Second@, causing a compilation error because of duplicate @case@ clauses.
To better match programmer intuition, \CFA toggles between value and position semantics depending on the language context.
For conditional clauses and switch statements, \CFA uses the robust position implementation.
\begin{cfa}
if ( @posn@( e ) < posn( Third ) ) ...
choose( @posn@( e ) ) {
	case @posn@( First ): ...;
	case @posn@( Second ): ...;
	case @posn@( Third ): ...;
	case @posn@( Fourth ): ...;
}
\end{cfa}

\CFA provides a special form of for-control for enumerating through an enumeration, where the range is a type.
\begin{cfa}
for ( cx; @Count@ ) { sout | cx | nonl; } sout | nl;
for ( cx; +~= Count ) { sout | cx | nonl; } sout | nl;
for ( cx; -~= Count ) { sout | cx | nonl; } sout | nl;
First Second Third Fourth
First Second Third Fourth
Fourth Third Second First
\end{cfa}
The enumeration type is syntactic sugar for looping over all enumerators and assigning each enumerator to the loop index, whose type is inferred from the range type.
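Conceptually, the enumerating loop above visits every position of @Count@ in ascending or descending order.
A minimal plain-C sketch of that traversal follows, assuming a hypothetical label array @Count_labels@, a hypothetical size constant @Count_N@, and a hypothetical helper @Count_walk@ that collects the labels into a buffer instead of printing them; it is an illustration of the idea, not the code \CFA generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { First, Second, Third, Fourth } Count;
enum { Count_N = 4 };						/* assumed: number of enumerators */
static const char * const Count_labels[Count_N] = { "First", "Second", "Third", "Fourth" };

/* Collect the labels of all enumerators into buf, separated by blanks.
   forward != 0 walks positions 0..N-1 (like +~=), forward == 0 walks N-1..0 (like -~=). */
void Count_walk( int forward, char * buf, size_t size ) {
	assert( size > 0 );
	buf[0] = '\0';
	for ( int i = 0; i < Count_N; i += 1 ) {
		Count cx = (Count)( forward ? i : Count_N - 1 - i );	/* loop index is an enumerator */
		if ( i > 0 ) strncat( buf, " ", size - strlen( buf ) - 1 );
		strncat( buf, Count_labels[cx], size - strlen( buf ) - 1 );
	}
}
```

Walking forward fills the buffer with @First Second Third Fourth@, and walking backward fills it with @Fourth Third Second First@, mirroring the output of the \CFA loops.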
The prefix @+~=@ or @-~=@ iterates forward or backward through the inclusive enumeration range, where no prefix defaults to @+~=@.

C has an idiom for @if@ and loop predicates of comparing the predicate result ``not equal to 0''.
\begin{cfa}
if ( x + y /* != 0 */ ) ...
while ( p /* != 0 */ ) ...
\end{cfa}
This idiom extends to enumerations because there is a boolean conversion in terms of the enumeration value, if and only if such a conversion is available.
For example, such a conversion exists for all numerical types (integral and floating-point).
It is possible to explicitly extend this idiom to any typed enumeration by overloading the @!=@ operator.
\begin{cfa}
bool ?!=?( Name n, zero_t ) { return n != Fred; }
Name n = Mary;
if ( n ) ... // result is true
\end{cfa}
Specialized meanings are also possible.
\begin{cfa}
enum(int) ErrorCode { Normal = 0, Slow = 1, Overheat = 1000, OutOfResource = 1001 };
bool ?!=?( ErrorCode ec, zero_t ) { return ec >= Overheat; }
ErrorCode code = ...;
if ( code ) { problem(); }
\end{cfa}

\section{Enumeration Dimension}

\VRef{s:EnumeratorTyping} introduced the harmonizing problem between an enumeration and secondary information.
When possible, using a typed enumeration for the secondary information is the best approach.
However, there are times when combining these two types is not possible.
For example, the secondary information might precede the enumeration and/or its type is needed directly to declare parameters of functions.
In these cases, secondary arrays of the enumeration size are necessary.

To support some level of harmonizing in these cases, an array dimension can be defined using an enumeration type, and the enumerators used as subscripts.
\begin{cfa}
enum E { A, B, C, N }; // possibly predefined
float H1[N] = { [A] : 3.4, [B] : 7.1, [C] : 0.01 }; // C
float H2[@E@] = { [A] : 3.4, [B] : 7.1, [C] : 0.01 }; // CFA
\end{cfa}
(Note, C uses the symbol @'='@ for designator initialization, but \CFA had to change to @':'@ because of problems with tuple syntax.)
This approach is also necessary for a predefined typed enumeration (unchangeable), when additional secondary information needs to be added.

\section{Planet Example}

\VRef[Figure]{f:PlanetExample} shows an archetypal enumeration example illustrating most of the \CFA enumeration features.
@Planet@ is an enumeration of type @MR@.
Each planet enumerator is initialized to a specific mass/radius, @MR@, value.
The unnamed enumeration provides the gravitational-constant enumerator @G@.
Function @surfaceGravity@ uses the @with@ clause to remove @p@ qualification from fields @mass@ and @radius@.
The program main uses the pseudo function @countof@ to obtain the number of enumerators in @Planet@, and safely converts the random value into a @Planet@ enumerator using @fromInt@.
The resulting random orbital-body is used in a @choose@ statement.
The enumerators in the @case@ clause use the enumerator position for testing.
The prints use @label@ to print an enumerator's name.
Finally, a loop enumerates through the planets computing the weight on each planet for a given earth mass.
The print statement does an equality comparison with an enumeration variable and enumerator (@p == MOON@).
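For comparison, the core of the example can be approximated in plain C, where the typed enumeration becomes an ordinary enumeration plus a position-indexed value array.
This is a sketch under stated assumptions rather than the output of the \CFA translator: @Planet_N@, @Planet_values@, and @Planet_value@ are hypothetical names, and only the first three enumerators of the figure are reproduced for brevity.

```c
#include <assert.h>

typedef struct { double mass, radius; } MR;		/* mass in kg, radius in m */

/* Hypothetical subset of the Planet enumeration; the variable holds the position. */
typedef enum { MERCURY, VENUS, EARTH, Planet_N } Planet;

/* value attribute: one global array of the base type, indexed by position */
static const MR Planet_values[Planet_N] = {
	{ 0.330e24, 2.4397e6 },		/* MERCURY */
	{ 4.869e24, 6.0518e6 },		/* VENUS */
	{ 5.976e24, 6.3781e6 },		/* EARTH */
};

static const double G = 6.6743e-11;		/* gravitational constant (m3 kg-1 s-2) */

/* value() maps an enumerator to its element of the value array. */
MR Planet_value( Planet p ) {
	assert( 0 <= (int)p && (int)p < Planet_N );
	return Planet_values[p];
}

double surfaceGravity( Planet p ) {
	MR v = Planet_value( p );				/* explicit, where CFA converts implicitly */
	return G * v.mass / ( v.radius * v.radius );
}

double surfaceWeight( Planet p, double otherMass ) {
	return otherMass * surfaceGravity( p );
}
```

The C version must keep the enumerator list and the value array in step by hand, which is the harmonizing problem the typed enumeration removes.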
\begin{figure}
\small
\begin{cfa}
struct MR { double mass, radius; }; $\C{// planet definition}$
enum( @MR@ ) Planet { $\C{// typed enumeration}$
	//	mass (kg)	radius (m)
	MERCURY = { 0.330_E24, 2.4397_E6 },
	VENUS = { 4.869_E24, 6.0518_E6 },
	EARTH = { 5.976_E24, 6.3781_E6 },
	MOON = { 7.346_E22, 1.7380_E6 }, $\C{// not a planet}$
	MARS = { 0.642_E24, 3.3972_E6 },
	JUPITER = { 1898._E24, 71.492_E6 },
	SATURN = { 568.8_E24, 60.268_E6 },
	URANUS = { 86.86_E24, 25.559_E6 },
	NEPTUNE = { 102.4_E24, 24.746_E6 },
	PLUTO = { 1.303_E22, 1.1880_E6 }, $\C{// not a planet}$
};
enum( double ) { G = 6.6743_E-11 }; $\C{// universal gravitational constant (m3 kg-1 s-2)}$
static double surfaceGravity( Planet p ) @with( p )@ {
	return G * mass / ( radius @\@ 2 ); $\C{// no qualification, exponentiation}$
}
static double surfaceWeight( Planet p, double otherMass ) {
	return otherMass * surfaceGravity( p );
}
int main( int argc, char * argv[] ) {
	if ( argc != 2 ) @exit@ | "Usage: " | argv[0] | "earth-weight"; // terminate program
	double earthWeight = convert( argv[1] );
	double earthMass = earthWeight / surfaceGravity( EARTH );
	Planet rp = @fromInt@( prng( @countof@( Planet ) ) ); $\C{// select random orbiting body}$
	@choose( rp )@ { $\C{// implicit breaks}$
	  case MERCURY, VENUS, EARTH, MARS:
		sout | @rp@ | "is a rocky planet";
	  case JUPITER, SATURN, URANUS, NEPTUNE:
		sout | rp | "is a gas-giant planet";
	  default:
		sout | rp | "is not a planet";
	}
	for ( @p; Planet@ ) { $\C{// enumerate}$
		sout | "Your weight on" | ( @p == MOON@ ?
			"the" : " " ) | p | "is" | wd( 1,1, surfaceWeight( p, earthMass ) ) | "kg";
	}
}
$\$$ planet 100
JUPITER is a gas-giant planet
Your weight on MERCURY is 37.7 kg
Your weight on VENUS is 90.5 kg
Your weight on EARTH is 100.0 kg
Your weight on the MOON is 16.6 kg
Your weight on MARS is 37.9 kg
Your weight on JUPITER is 252.8 kg
Your weight on SATURN is 106.6 kg
Your weight on URANUS is 90.5 kg
Your weight on NEPTUNE is 113.8 kg
Your weight on PLUTO is 6.3 kg
\end{cfa}
\caption{Planet Example}
\label{f:PlanetExample}
\end{figure}
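Returning to the offset computation of the Enumeration Inheritance section, the pseudo-code given there can be restated as a small self-contained C program fragment.
The sketch below makes several assumptions that are not part of \CFA: enumerators are identified by their label string, and the types @Member@, @EnumDesc@, and @Result@ are hypothetical descriptors introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct EnumDesc EnumDesc;
typedef struct {
	const char * label;				/* non-NULL for a plain enumerator */
	const EnumDesc * inlined;		/* non-NULL for an inline enumeration */
} Member;
struct EnumDesc { const Member * members; size_t count; };
typedef struct { bool found; int offset; } Result;

/* Offset added to the position of enumerator e when it is converted to enumeration dst.
   When e is not found, offset is the total number of enumerators in dst. */
Result calculateEnumOffset( const EnumDesc * dst, const char * e ) {
	assert( dst != NULL && e != NULL );
	int offset = 0;
	for ( size_t i = 0; i < dst->count; i += 1 ) {
		const Member * m = &dst->members[i];
		if ( m->inlined == NULL ) {				/* plain enumerator */
			if ( strcmp( m->label, e ) == 0 ) return (Result){ true, 0 };
			offset += 1;
		} else {								/* inline enumeration: search recursively */
			Result r = calculateEnumOffset( m->inlined, e );
			if ( r.found ) return (Result){ true, offset + r.offset };
			offset += r.offset;
		}
	}
	return (Result){ false, offset };
}

/* Descriptors for the inheritance example of the chapter. */
static const Member phyn_members[] = { { "Tuatara", NULL } };
static const EnumDesc Phynchocephalia = { phyn_members, 1 };
static const Member squa_members[] = { { "Snake", NULL }, { "Lizard", NULL } };
static const EnumDesc Squamata = { squa_members, 2 };
static const Member lepi_members[] = {
	{ NULL, &Phynchocephalia }, { NULL, &Squamata }, { "Kuehneosauridae", NULL }
};
static const EnumDesc Lepidosauromorpha = { lepi_members, 3 };
```

For @Lizard@, the computed offset into @Lepidosauromorpha@ is 1, which added to its position 1 in @Squamata@ gives the position 2 reported in the conversion example.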