\chapter{\CFA Enumeration}

% \CFA supports C enumeration using the same syntax and semantics for backwards compatibility.
% \CFA also extends C-Style enumeration by adding a number of new features that bring enumerations inline with other modern programming languages.
% Any enumeration extensions must be intuitive to C programmers both in syntax and semantics.
% The following sections detail all of my new contributions to enumerations in \CFA.
\CFA extends the enumeration declaration by parameterizing with a type (like a generic type).

\begin{cfa}[caption={CFA Enum},captionpos=b,label={l:CFAEnum}]
$\it enum$-specifier:
	enum @(type-specifier$\(_{opt}\)$)@ identifier$\(_{opt}\)$ { cfa-enumerator-list }
	enum @(type-specifier$\(_{opt}\)$)@ identifier$\(_{opt}\)$ { cfa-enumerator-list , }
	enum @(type-specifier$\(_{opt}\)$)@ identifier
cfa-enumerator-list:
	cfa-enumerator
	cfa-enumerator, cfa-enumerator-list
cfa-enumerator:
	enumeration-constant
	$\it inline$ identifier
	enumeration-constant = expression
\end{cfa}

A \newterm{\CFA enumeration}, or \newterm{\CFA enum}, has an optional type declaration in the parentheses next to the @enum@ keyword.
Without the optional type declaration, the syntax defines an \newterm{opaque enum}.
Otherwise, a \CFA enum with a type declaration is a \newterm{typed enum}.

\section{Opaque Enum}
\label{s:OpaqueEnum}
An opaque enum is a special \CFA enumeration type whose internal representation is chosen by the compiler and hidden from users.
Compared to a C enum, an opaque enum is more restrictive in terms of typing and cannot be implicitly converted to an integer.
Enumerators of an opaque enum cannot have initializers.
Declaring an initializer in the body of an opaque enum results in a compile-time error.
\begin{cfa}
enum@()@ Planet { MERCURY, VENUS, EARTH, MARS, JUPITER, SATURN, URANUS, NEPTUNE };

Planet p = URANUS;
int i = VENUS; @// Error, VENUS cannot be converted into an integral type@
\end{cfa}
% Each opaque enum has two @attributes@: @position@ and @label@.
Opaque enumerations have two defining properties: @label@ (name) and @order@ (position).
\CFA auto-generates the @attribute functions@ @posn()@ and @label()@ for every \CFA enum to return these properties, with the following signatures:
\begin{cfa}
forall( E ) {
	unsigned posn(E e);
	const char * label(E e);
};
\end{cfa}
Here, the polymorphic type parameter @E@ is substituted by an enumeration type such as @Planet@.
\begin{cfa}
unsigned i = posn(VENUS); // 1
const char * s = label(MARS); // "MARS"
\end{cfa}

\subsection{Representation}
The underlying representation of a \CFA enumeration object is its order, saved as an integral type.
Therefore, the size of a \CFA enumeration is consistent with that of a C enumeration.
The attribute function @posn@ performs a type substitution on an expression, from the \CFA enumeration type to an integral type.
Names of enumerators are stored in a global data structure, and @label@ maps a \CFA enumeration object to its corresponding name.

\section{Typed Enum}
\label{s:EnumeratorTyping}

\CFA extends the enumeration declaration by parameterizing with a type (like a generic type), allowing enumerators to be assigned any values from the declared type.
Figure~\ref{f:EumeratorTyping} shows a series of examples illustrating that all \CFA types can be used with an enumeration and each type's values used to set the enumerator constants.
Note, the synonyms @Liz@ and @Beth@ in the last declaration.
Because enumerators are constants, the enumeration type is implicitly @const@, so all the enumerator types in Figure~\ref{f:EumeratorTyping} are logically rewritten with @const@.
\begin{figure}
\begin{cfa}
// integral
	enum( @char@ ) Currency { Dollar = '$\textdollar$', Cent = '$\textcent$', Yen = '$\textyen$', Pound = '$\textsterling$', Euro = 'E' };
	enum( @signed char@ ) srgb { Red = -1, Green = 0, Blue = 1 };
	enum( @long long int@ ) BigNum { X = 123_456_789_012_345, Y = 345_012_789_456_123 };
// non-integral
	enum( @double@ ) Math { PI_2 = 1.570796, PI = 3.141593, E = 2.718282 };
	enum( @_Complex@ ) Plane { X = 1.5+3.4i, Y = 7+3i, Z = 0+0.5i };
// pointer
	enum( @char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
	int i, j, k;
	enum( @int *@ ) ptr { I = &i, J = &j, K = &k };
	enum( @int &@ ) ref { I = i, J = j, K = k };
// tuple
	enum( @[int, int]@ ) { T = [ 1, 2 ] }; $\C{// new \CFA type}$
// function
	void f() {...} void g() {...}
	enum( @void (*)()@ ) funs { F = f, G = g };
// aggregate
	struct Person { char * name; int age, height; };
@***@enum( @Person@ ) friends { @Liz@ = { "ELIZABETH", 22, 170 }, @Beth@ = Liz,
			Jon = { "JONATHAN", 35, 190 } };
\end{cfa}
\caption{Enumerator Typing}
\label{f:EumeratorTyping}
\end{figure}

An advantage of typed enumerations is eliminating the \emph{harmonizing} problem between an enumeration and its companion data \see{\VRef{s:Usage}}:
\begin{cfa}
enum( char * ) integral_types {
	chr = "char", schar = "signed char", uschar = "unsigned char",
	sshort = "signed short int", ushort = "unsigned short int",
	sint = "signed int", usint = "unsigned int",
	...
};
\end{cfa}
Note, the enumeration type can be a structure (see @Person@ in Figure~\ref{f:EumeratorTyping}), so it is possible to have the equivalent of multiple arrays of companion data using an array of structures.

While the enumeration type can be any C aggregate, the aggregate's \CFA constructors are not used to evaluate an enumerator's value.
\CFA enumeration constants are compile-time values (static); calling constructors happens at runtime (dynamic).

@value@ is an attribute defined for typed enums, along with position and label.
The values of a typed enum are stored in a global array of the declared type, initialized with the values of the enumerator initializers.
The @value()@ function maps an enum to an element of this array.

\subsection{Value Conversion}
C has an implicit type conversion from an enumerator to its base type @int@.
Correspondingly, \CFA has an implicit conversion from a typed enumerator to its base type.
This feature allows a typed enumeration to be used seamlessly where its base type is expected.
\begin{cfa}
char currency = Dollar;
void foo( char * );
foo( Fred );
\end{cfa}

During the resolution of an expression @e@ with a \CFA enumeration type, \CFA adds @value(e)@ as an additional candidate with an extra \newterm{value} cost.
For the expression @char currency = Dollar@, there is no defined conversion from the \CFA enumeration type of @Dollar@ to a basic type, so that conversion cost is @infinite@ and the only valid candidate is @value(Dollar)@.

@Value@ is a new category in \CFA's conversion cost model.
It is defined to be a less significant factor than @unsafe@ but a more significant factor than @poly@.
The resulting conversion cost is an 8-tuple:
@(unsafe, value, poly, safe, sign, vars, specialization, reference)@.

\begin{cfa}
void bar(int);
enum(int) Month ! {
	January=31, February=29, March=31, April=30, May=31, June=30,
	July=31, August=31, September=30, October=31, November=30, December=31
};

Month a = February;	// (1), with cost (0, 1, 0, 0, 0, 0, 0, 0)
double a = 5.5;		// (2), with cost (1, 0, 0, 0, 0, 0, 0, 0)

bar(a);
\end{cfa}
In the previous example, candidate (1) has a value cost to the parameter type @int@, which is lower than the unsafe cost of candidate (2), a conversion from @double@ to @int@.
\CFA chooses value cost over unsafe cost, and therefore @a@ in @bar(a)@ is resolved as a @Month@.

\begin{cfa}
forall(T | @CfaEnum(T)@) void bar(T);

bar(a);	// (3), with cost (0, 0, 1, 0, 0, 0, 0, 0)
\end{cfa}
% @Value@ is designed to be less significant than @poly@ to allow function being generic over \CFA enumeration (see ~\ref{c:trait}).
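The ordering of the three candidate costs above can be checked with a small C++ sketch.
It is only a model under stated assumptions: the names @Cost@, @cheaper@ and @cheapestIndex@ are hypothetical, and the sketch assumes the cost tuple is compared lexicographically with the leftmost field most significant, which is how the examples in this section read; it is not the resolver's actual implementation.

```cpp
#include <array>
#include <cassert>

// Hypothetical model of the 8-tuple conversion cost; the field order follows the text:
// (unsafe, value, poly, safe, sign, vars, specialization, reference).
using Cost = std::array<int, 8>;

// std::array compares lexicographically, so an earlier field outweighs every later field.
inline bool cheaper( const Cost & a, const Cost & b ) { return a < b; }

// Costs quoted in the examples of this section.
const Cost valueConversion  = { 0, 1, 0, 0, 0, 0, 0, 0 };  // (1) Month to int through value()
const Cost unsafeConversion = { 1, 0, 0, 0, 0, 0, 0, 0 };  // (2) double to int
const Cost polyBinding      = { 0, 0, 1, 0, 0, 0, 0, 0 };  // (3) T bound to an enumeration type

// Returns the index of the cheapest candidate; assumes at least one candidate.
// On ties, the earliest candidate wins (an arbitrary choice made for this sketch).
inline int cheapestIndex( const Cost * candidates, int n ) {
    assert( n > 0 );
    int best = 0;
    for ( int i = 1; i < n; i += 1 ) {
        if ( cheaper( candidates[i], candidates[best] ) ) best = i;
    }
    return best;
}
```

Under this model, (1) beats (2) because the @unsafe@ field is compared first, and (3) beats (1) because a @poly@ cost sits to the right of the @value@ field.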
Being generic over the @CfaEnum@ trait (a pre-defined interface for \CFA enums) is a common practice in \CFA for implementing functions over \CFA enumerations, as shown in Chapter~\ref{c:trait}.
Because @value@ is a more significant cost than @poly@, if an overloaded function is defined for @CfaEnum@ (and other generic types), \CFA always tries to resolve it as a @CfaEnum@ rather than inserting a @value@ conversion.

\subsection{Explicit Conversion}
Explicit conversion from a \CFA enumeration to an integral type is allowed, in which case \CFA converts the enumeration into its underlying representation, which is its @position@.

\section{Auto Initialization}

C auto-initialization works for the integral type @int@ with constant expressions.
\begin{cfa}
enum Alphabet ! {
	A = 'A', B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z,
	a = 'a', b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
};
\end{cfa}
The complexity of the constant expression depends on the level of runtime computation the compiler implements, \eg \CC \lstinline[language={[GNU]C++}]{constexpr} provides complex compile-time computation across multiple types, which blurs the compilation/runtime boundary.

% The notion of auto-initialization is generalized in \CFA enumertation E with base type T in the following way:
When an enumerator @e@ of enumeration type @E@ with base type @T@ does not have an initializer, \CFA auto-initializes @e@ with the following scheme:
\begin{enumerate}
% \item Enumerator e is the first enumerator of \CFA enumeration E with base type T. If e declares no no initializer, e is auto-initialized by the $zero\_t$ constructor of T.
\item If @e@ is the first enumerator, @e@ is initialized with @T@'s @zero_t@.
\item Otherwise, if @d@ is the enumerator defined just before @e@, and @d@ has been initialized with expression @l@ (which may itself be auto-generated), @e@ is initialized with @l++@.
% \CFA reports a compile time error if T has no $zero\_t$ constructor.
% Enumerator e is an enumerator of base-type T enumeration E that position i, where $i \neq 0$. And d is the enumerator with position @i-1@, e is auto-initialized with
% the result of @value(d)++@. If operator @?++@ is not defined for type T, \CFA reports a compile time error.
% Unfortunately, auto-initialization is not implemented because \CFA is only a transpiler, relying on generated C code to perform the detail work.
% C does not have the equivalent of \CC \lstinline[language={[GNU]C++}]{constexpr}, and it is currently beyond the scope of the \CFA project to implement a complex runtime interpreter in the transpiler.
% Nevertheless, the necessary language concepts exist to support this feature.
\end{enumerate}
Here, @?++( T )@ can be explicitly overloaded, or implicitly provided through properly defined @one_t@ and @?+?( T, T )@.

Unfortunately, restricting auto-initialization to constant expressions is not enforced because \CFA is only a transpiler, relying on generated C code to perform the detailed work.
C does not have the equivalent of \CC \lstinline[language={[GNU]C++}]{constexpr}, and it is currently beyond the scope of the \CFA project to implement a complex runtime interpreter in the transpiler.
Nevertheless, the necessary language concepts exist to support this feature.

\section{Enumeration Inheritance}

\CFA Plan-9 inheritance may be used with enumerations, where Plan-9 inheritance is containment inheritance with implicit unscoping (like a nested unnamed @struct@/@union@ in C).

\begin{cfa}
enum( char * ) Names { /* as above */ };
enum( char * ) Names2 { @inline Names@, Jack = "JACK", Jill = "JILL" };
enum( char * ) Names3 { @inline Names2@, Sue = "SUE", Tom = "TOM" };
\end{cfa}

Enumeration @Names2@ inherits all the enumerators and their values from enumeration @Names@ by containment, and a @Names@ enumeration is a @subtype@ of enumeration @Names2@.
Note that enumerators must be unique in inheritance, but enumerator values may be repeated.
% The enumeration type for the inheriting type must be the same as the inherited type;
% hence the enumeration type may be omitted for the inheriting enumeration and it is inferred from the inherited enumeration, as for @Name3@.
% When inheriting from integral types, automatic numbering may be used, so the inheritance placement left to right is important.
Specifically, the inheritance relationship for @Names@ is:
\begin{cfa}
Names $\(\subset\)$ Names2 $\(\subset\)$ Names3 $\C{// enum type of Names}$
\end{cfa}

When a new enumeration @N@ inlines a \CFA enumeration @O@, @N@ copies all enumerators from @O@, including those @O@ obtains through inheritance.
Enumerators inherited from @O@ keep the same @label@ and @value@, but their @position@ may shift to the right if other enumerators or inline enumerations are declared before @inline O@.
\begin{cfa}
enum() Phynchocephalia { Tuatara };
enum() Squamata { Snake, Lizard };
enum() Lepidosauromorpha { inline Phynchocephalia, inline Squamata, Kuehneosauridae };
\end{cfa}
@Snake@, for example, has position 0 in @Squamata@, but position 1 in @Lepidosauromorpha@, because @Tuatara@, inherited from @Phynchocephalia@, has position 0 in @Lepidosauromorpha@.

A subtype enumeration can be cast, or implicitly converted, into its supertype, with a safe cost.
\begin{cfa}
enum Squamata squamata_lizard = Lizard;
posn( squamata_lizard ); // 1
enum Lepidosauromorpha lepidosauromorpha_lizard = squamata_lizard;
posn( lepidosauromorpha_lizard ); // 2
void foo( Lepidosauromorpha l );
foo( squamata_lizard );
posn( (Lepidosauromorpha) squamata_lizard ); // 2

Lepidosauromorpha s = Snake;
\end{cfa}
The last expression in the preceding example is unambiguous.
While both @Squamata.Snake@ and @Lepidosauromorpha.Snake@ are valid candidates, @Squamata.Snake@ has an associated safe cost, so \CFA selects the zero-cost candidate @Lepidosauromorpha.Snake@.

As discussed in \VRef{s:OpaqueEnum}, \CFA chooses position as the representation of a \CFA enum.
Conversion involves both a change of type and possibly a change of @position@.
When converting a subtype to a supertype, the position can only stay the same or become larger.
The difference between the position in the subtype and in the supertype is an ``offset''.

\CFA runs the following algorithm to determine the offset for an enumerator to a supertype.
% In a summary, \CFA loops over members (include enumerators and inline enums) of the supertype.
% If the member is the matching enumerator, the algorithm returns its position.
% If the member is a inline enumeration, the algorithm trys to find the enumerator in the inline enumeration. If success, it returns the position of enumerator in the inline enumeration, plus
% the position in the current enumeration. Otherwises, it increase the offset by the size of inline enumeration.

\begin{cfa}
struct Enumerator;
struct CFAEnum {
	vector<variant<CFAEnum, Enumerator>> members;
};
pair<bool, int> calculateEnumOffset( CFAEnum dst, Enumerator e ) {
	int offset = 0;
	for( auto v: dst.members ) {
		if ( v.holds_alternative<Enumerator>() ) {
			auto m = v.get<Enumerator>();
			if ( m == e ) return make_pair( true, 0 );
			offset++;
		} else {
			auto p = calculateEnumOffset( v, e );
			if ( p.first ) return make_pair( true, offset + p.second );
			offset += p.second;
		}
	}
	return make_pair( false, offset );
}
\end{cfa}

% \begin{cfa}
% Names fred = Name.Fred;
% (Names2) fred; (Names3) fred; (Name3) Names.Jack; $\C{// cast to super type}$
% Names2 fred2 = fred; Names3 fred3 = fred2; $\C{// assign to super type}$
% \end{cfa}
For the given function prototypes, the following calls are valid.
\begin{cquote}
\begin{tabular}{ll}
\begin{cfa}
void f( Names );
void g( Names2 );
void h( Names3 );
void j( const char * );
\end{cfa}
&
\begin{cfa}
f( Fred );
g( Fred ); g( Jill );
h( Fred ); h( Jill ); h( Sue );
j( Fred ); j( Jill ); j( Sue ); j( "WILL" );
\end{cfa}
\end{tabular}
\end{cquote}
Note, the validity of calls is the same for call-by-reference as for call-by-value, and @const@ restrictions are the same as for other types.
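The offset algorithm given earlier in this section is written as pseudocode, so the following is a compilable C++17 sketch of the same traversal, exercised on the reptile enumerations from the inheritance example.
Several details are assumptions made only for illustration: enumerators are identified by their label string, an inlined enumeration is held by pointer, and the type @Member@ is hypothetical; the compiler's real data structures are not shown in this chapter.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the compiler's data structures (illustrative names only).
struct Enumerator { std::string label; };
struct CFAEnum;
using Member = std::variant<Enumerator, const CFAEnum *>;  // a plain enumerator or "inline E"
struct CFAEnum { std::vector<Member> members; };

// Returns (found, n). If found, n is the offset to add to e's position in the enumeration
// that declares it to obtain its position in dst. Otherwise n is the enumerator count of dst.
std::pair<bool, int> calculateEnumOffset( const CFAEnum & dst, const std::string & e ) {
    int offset = 0;
    for ( const Member & v : dst.members ) {
        if ( std::holds_alternative<Enumerator>( v ) ) {
            if ( std::get<Enumerator>( v ).label == e ) return { true, 0 };
            offset += 1;                                   // skip a directly declared enumerator
        } else {
            const CFAEnum * inlined = std::get<const CFAEnum *>( v );
            assert( inlined != nullptr );
            std::pair<bool, int> p = calculateEnumOffset( *inlined, e );
            if ( p.first ) return { true, offset + p.second };
            offset += p.second;                            // skip the whole inlined enumeration
        }
    }
    return { false, offset };
}

// The reptile enumerations from the inheritance example.
const CFAEnum Phynchocephalia{ std::vector<Member>{ Enumerator{ "Tuatara" } } };
const CFAEnum Squamata{ std::vector<Member>{ Enumerator{ "Snake" }, Enumerator{ "Lizard" } } };
const CFAEnum Lepidosauromorpha{ std::vector<Member>{
    &Phynchocephalia, &Squamata, Enumerator{ "Kuehneosauridae" } } };
```

For @Lizard@ the sketch returns offset 1, so position 1 in @Squamata@ becomes position 2 in @Lepidosauromorpha@, agreeing with the @posn@ comments in the conversion example.
A direct match returns 0 because the offset is measured from the enumeration that declares the enumerator.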
\section{Enumerator Control Structures}

Enumerators can be used in multiple contexts.
In most programming languages, an enumerator is implicitly converted to its value (like a typed macro substitution).
However, enumerator synonyms and typed enumerations make this implicit conversion to value incorrect in some contexts.
In these contexts, a programmer's intuition assumes an implicit conversion to position.

For example, an intuitive use of enumerations is with the \CFA @switch@/@choose@ statement, where @choose@ performs an implicit @break@ rather than a fall-through at the end of a @case@ clause.
(For this discussion, ignore the fact that @case@ requires a compile-time constant.)
\begin{cfa}[belowskip=0pt]
enum Count { First, Second, Third, Fourth };
Count e;
\end{cfa}
\begin{cquote}
\setlength{\tabcolsep}{15pt}
\noindent
\begin{tabular}{@{}ll@{}}
\begin{cfa}[aboveskip=0pt]

choose( e ) {
	case @First@: ...;
	case @Second@: ...;
	case @Third@: ...;
	case @Fourth@: ...;
}
\end{cfa}
&
\begin{cfa}[aboveskip=0pt]
// rewrite
choose( @value@( e ) ) {
	case @value@( First ): ...;
	case @value@( Second ): ...;
	case @value@( Third ): ...;
	case @value@( Fourth ): ...;
}
\end{cfa}
\end{tabular}
\end{cquote}
Here, the intuitive code on the left is implicitly transformed into the standard implementation on the right, using the value of the enumeration variable and enumerators.
However, this implementation is fragile, \eg if the enumeration is changed to:
\begin{cfa}
enum Count { First, Second, Third @= First@, Fourth };
\end{cfa}
making @Third == First@ and @Fourth == Second@, causing a compilation error because of duplicate @case@ clauses.
To better match with programmer intuition, \CFA toggles between value and position semantics depending on the language context.
For conditional clauses and switch statements, \CFA uses the robust position implementation.
\begin{cfa}
if ( @posn@( e ) < posn( Third ) ) ...
choose( @posn@( e ) ) {
	case @posn@( First ): ...;
	case @posn@( Second ): ...;
	case @posn@( Third ): ...;
	case @posn@( Fourth ): ...;
}
\end{cfa}

\CFA provides a special form of for-control for enumerating through an enumeration, where the range is a type.
\begin{cfa}
for ( cx; @Count@ ) { sout | cx | nonl; } sout | nl;
for ( cx; +~= Count ) { sout | cx | nonl; } sout | nl;
for ( cx; -~= Count ) { sout | cx | nonl; } sout | nl;
First Second Third Fourth
First Second Third Fourth
Fourth Third Second First
\end{cfa}
The enumeration type is syntactic sugar for looping over all enumerators and assigning each enumerator to the loop index, whose type is inferred from the range type.
The prefix @+~=@ or @-~=@ iterates forward or backwards through the inclusive enumeration range, where no prefix defaults to @+~=@.

C has an idiom for @if@ and loop predicates of comparing the predicate result ``not equal to 0''.
\begin{cfa}
if ( x + y /* != 0 */ ) ...
while ( p /* != 0 */ ) ...
\end{cfa}
This idiom extends to enumerations because there is a boolean conversion in terms of the enumeration value, if and only if such a conversion is available.
For example, such a conversion exists for all numerical types (integral and floating-point).
It is possible to explicitly extend this idiom to any typed enumeration by overloading the @!=@ operator.
\begin{cfa}
bool ?!=?( Name n, zero_t ) { return n != Fred; }
Name n = Mary;
if ( n ) ... // result is true
\end{cfa}
Specialized meanings are also possible.
\begin{cfa}
enum(int) ErrorCode { Normal = 0, Slow = 1, Overheat = 1000, OutOfResource = 1001 };
bool ?!=?( ErrorCode ec, zero_t ) { return ec >= Overheat; }
ErrorCode code = ...;
if ( code ) { problem(); }
\end{cfa}

\section{Enumeration Dimension}

\VRef{s:EnumeratorTyping} introduced the harmonizing problem between an enumeration and secondary information.
When possible, using a typed enumeration for the secondary information is the best approach.
However, there are times when combining these two types is not possible.
For example, the secondary information might precede the enumeration and/or its type is needed directly to declare parameters of functions.
In these cases, having secondary arrays of the enumeration size is necessary.

To support some level of harmonizing in these cases, an array dimension can be defined using an enumerator type, and the enumerators used as subscripts.
\begin{cfa}
enum E { A, B, C, N }; // possibly predefined
float H1[N] = { [A] : 3.4, [B] : 7.1, [C] : 0.01 }; // C
float H2[@E@] = { [A] : 3.4, [B] : 7.1, [C] : 0.01 }; // CFA
\end{cfa}
(Note, C uses the symbol @'='@ for designator initialization, but \CFA had to change to @':'@ because of problems with tuple syntax.)
This approach is also necessary for a predefined typed enumeration (unchangeable), when additional secondary information needs to be added.

\section{Enumeration I/O}

As seen in multiple examples, enumerations can be printed and the default property printed is the enumerator's label, which is similar to other programming languages.
However, very few programming languages provide a mechanism to read in enumerator values.
Even the @boolean@ type in many languages does not have a mechanism for input using the enumerators @true@ or @false@.
\VRef[Figure]{f:EnumerationI/O} shows \CFA enumeration input based on the enumerator labels.
When the enumerator labels are packed together in the input stream, the input algorithm scans for the longest matching string.
For basic types in \CFA, the constants used to initialize a variable in a program are available to initialize a variable using input, where string constants can be quoted or unquoted.
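The longest-match rule for packed labels can be illustrated with a short C++ sketch.
This is not the \CFA input implementation: the function name @scanLabels@ is hypothetical, the scan is a simple greedy prefix match over an in-memory string, and it stops at unmatched text instead of raising an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the labels recognized in input, always taking the longest label that matches at
// the current position. Blanks, tabs and newlines separate tokens; scanning stops at the
// first text that matches no label. Assumes every label is a non-empty string.
std::vector<std::string> scanLabels( const std::vector<std::string> & labels,
                                     const std::string & input ) {
    std::vector<std::string> result;
    std::size_t pos = 0;
    while ( pos < input.size() ) {
        char c = input[pos];
        if ( c == ' ' || c == '\t' || c == '\n' ) { pos += 1; continue; }
        const std::string * best = nullptr;
        for ( const std::string & label : labels ) {
            assert( ! label.empty() );
            // compare() clips the substring at the end of input, so a label longer than
            // the remaining text never compares equal.
            bool matches = input.compare( pos, label.size(), label ) == 0;
            if ( matches && ( best == nullptr || label.size() > best->size() ) ) best = &label;
        }
        if ( best == nullptr ) break;                      // nothing matches: stop scanning
        result.push_back( *best );
        pos += best->size();
    }
    return result;
}
```

With the labels @BBB@, @AAA@, @AA@, @AB@ and @B@, the packed input @BBBABAAAAB@ is split into @BBB@, @AB@, @AAA@, @AB@, which is the order of the first four output lines in \VRef[Figure]{f:EnumerationI/O}.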
\begin{figure}
\begin{cquote}
\setlength{\tabcolsep}{15pt}
\begin{tabular}{@{}ll@{}}
\begin{cfa}
int main() {
	enum( int ) E { BBB = 3, AAA, AA, AB, B };
	E e;

	for () {
		try {
			@sin | e@;
		} catch( missing_data * ) {
			sout | "missing data";
			continue; // try again
		}
	  if ( eof( sin ) ) break;
		sout | e | "= " | value( e );
	}
}
\end{cfa}
&
\begin{cfa}
$\rm input$
BBBABAAAAB
BBB AAA AA AB B

$\rm output$
BBB = 3
AB = 6
AAA = 4
AB = 6
BBB = 3
AAA = 4
AA = 5
AB = 6
B = 7

\end{cfa}
\end{tabular}
\end{cquote}
\caption{Enumeration I/O}
\label{f:EnumerationI/O}
\end{figure}

\section{Planet Example}

\VRef[Figure]{f:PlanetExample} shows an archetypal enumeration example illustrating most of the \CFA enumeration features.
@Planet@ is an enumeration of type @MR@.
Each planet enumerator is initialized to a specific mass/radius, @MR@, value.
The unnamed enumeration provides the gravitational-constant enumerator @G@.
Function @surfaceGravity@ uses the @with@ clause to remove @p@ qualification from fields @mass@ and @radius@.
The program main uses the pseudo function @countof@ to obtain the number of enumerators in @Planet@, and safely converts the random value into a @Planet@ enumerator using @fromInt@.
The resulting random orbital-body is used in a @choose@ statement.
The enumerators in the @case@ clause use the enumerator position for testing.
The prints use @label@ to print an enumerator's name.
Finally, a loop enumerates through the planets computing the weight on each planet for a given earth mass.
The print statement does an equality comparison with an enumeration variable and enumerator (@p == MOON@).
\begin{figure}
\small
\begin{cfa}
struct MR { double mass, radius; }; $\C{// planet definition}$
enum( @MR@ ) Planet { $\C{// typed enumeration}$
	//          mass (kg)   radius (m)
	MERCURY = { 0.330_E24, 2.4397_E6 },
	VENUS = { 4.869_E24, 6.0518_E6 },
	EARTH = { 5.976_E24, 6.3781_E6 },
	MOON = { 7.346_E22, 1.7380_E6 }, $\C{// not a planet}$
	MARS = { 0.642_E24, 3.3972_E6 },
	JUPITER = { 1898._E24, 71.492_E6 },
	SATURN = { 568.8_E24, 60.268_E6 },
	URANUS = { 86.86_E24, 25.559_E6 },
	NEPTUNE = { 102.4_E24, 24.746_E6 },
	PLUTO = { 1.303_E22, 1.1880_E6 }, $\C{// not a planet}$
};
enum( double ) { G = 6.6743_E-11 }; $\C{// universal gravitational constant (m3 kg-1 s-2)}$
static double surfaceGravity( Planet p ) @with( p )@ {
	return G * mass / ( radius @\@ 2 ); $\C{// no qualification, exponentiation}$
}
static double surfaceWeight( Planet p, double otherMass ) {
	return otherMass * surfaceGravity( p );
}
int main( int argc, char * argv[] ) {
	if ( argc != 2 ) @exit@ | "Usage: " | argv[0] | "earth-weight"; // terminate program
	double earthWeight = convert( argv[1] );
	double earthMass = earthWeight / surfaceGravity( EARTH );
	Planet rp = @fromInt@( prng( @countof@( Planet ) ) ); $\C{// select random orbiting body}$
	@choose( rp )@ { $\C{// implicit breaks}$
	  case MERCURY, VENUS, EARTH, MARS:
		sout | @rp@ | "is a rocky planet";
	  case JUPITER, SATURN, URANUS, NEPTUNE:
		sout | rp | "is a gas-giant planet";
	  default:
		sout | rp | "is not a planet";
	}
	for ( @p; Planet@ ) { $\C{// enumerate}$
		sout | "Your weight on" | ( @p == MOON@ ? "the" : " " ) | p
			| "is" | wd( 1,1, surfaceWeight( p, earthMass ) ) | "kg";
	}
}
$\$$ planet 100
JUPITER is a gas-giant planet
Your weight on MERCURY is 37.7 kg
Your weight on VENUS is 90.5 kg
Your weight on EARTH is 100.0 kg
Your weight on the MOON is 16.6 kg
Your weight on MARS is 37.9 kg
Your weight on JUPITER is 252.8 kg
Your weight on SATURN is 106.6 kg
Your weight on URANUS is 90.5 kg
Your weight on NEPTUNE is 113.8 kg
Your weight on PLUTO is 6.3 kg
\end{cfa}
\caption{Planet Example}
\label{f:PlanetExample}
\end{figure}
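The arithmetic behind the figure's output can be reproduced with a plain C++ sketch.
It only mirrors the two gravity functions for three of the bodies: the \CFA enumeration features (@with@, @countof@, @fromInt@, enumerating loops) are not modelled, the radius values are taken to be in metres, and the parameter name @earthWeight@ is an assumption of this sketch.

```cpp
#include <cassert>

// Mass in kg and radius in m, copied from the figure (only three bodies are shown).
struct MR { double mass, radius; };
const MR MERCURY = { 0.330e24, 2.4397e6 };
const MR EARTH   = { 5.976e24, 6.3781e6 };
const MR MOON    = { 7.346e22, 1.7380e6 };

const double G = 6.6743e-11;  // universal gravitational constant (m3 kg-1 s-2)

// Surface gravity g = G * m / r^2.
inline double surfaceGravity( const MR & p ) {
    assert( p.radius > 0.0 );
    return G * p.mass / ( p.radius * p.radius );
}

// Weight on body p of an object whose weight on Earth is earthWeight (same unit in and out).
inline double surfaceWeight( const MR & p, double earthWeight ) {
    double mass = earthWeight / surfaceGravity( EARTH );  // the figure's earthMass
    return mass * surfaceGravity( p );
}
```

For an earth weight of 100, the sketch gives about 37.7 for @MERCURY@ and about 16.6 for @MOON@ after rounding to one decimal place, in line with the output listed in the figure.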