\chapter{\CFA Enumeration}

\CFA supports C enumeration using the same syntax and semantics for backwards compatibility.
\CFA also extends C-style enumeration by adding a number of new features that bring enumerations in line with other modern programming languages.
Any enumeration extensions must be intuitive to C programmers both in syntax and semantics.
The following sections detail all of my new contributions to enumerations in \CFA.


\section{Aliasing}

{\color{red}@***@}
C already provides @const@-style aliasing using the unnamed enumerator \see{\VRef{s:TypeName}}, even if the keyword @enum@ is misleading (@const@ is better).
However, given the existence of this form, it is straightforward to extend it with heterogeneous types, \ie types other than @int@.
\begin{cfa}
enum { Size = 20u, PI = 3.14159L, Jack = L"John" }; $\C{// not an ADT nor an enumeration}$
\end{cfa}
which matches with @const@ aliasing in other programming languages.
(See \VRef{s:CenumImplementation} for how @gcc@/@clang@ do this for integral types.)
Here, the type of each enumerator is the type of the initialization constant, \eg @typeof(20u)@ for @Size@ implies @unsigned int@.
Auto-initialization is impossible in this case because some types do not support arithmetic.
As seen in \VRef{s:EnumeratorTyping}, this feature is just a shorthand for multiple typed-enumeration declarations.


\section{Enumerator Visibility}
\label{s:EnumeratorVisibility}

In C, unscoped enumerators present a \newterm{naming problem} when multiple enumeration types appear in the same scope with duplicate enumerator names.
There is no mechanism in C to resolve these naming conflicts other than renaming one of the duplicates, which may be impossible if the conflict comes from system include files.
The \CFA type-system allows extensive overloading, including enumerators.
Furthermore, \CFA uses the environment, such as the left-hand side of an assignment and function arguments, to pinpoint the best overloaded name.
\VRef[Figure]{f:EnumeratorVisibility} shows enumeration overloading and how qualification and casting are used to disambiguate ambiguous situations.
\CFA overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
Experience from \CFA developers is that the type system implicitly and correctly disambiguates the majority of overloaded names.
That is, it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded variables and functions, that requires disambiguation using qualification or casting.

\begin{figure}
\begin{cfa}
enum E1 { First, Second, Third, Fourth };
enum E2 { @Fourth@, @Third@, @Second@, @First@ }; $\C{// same enumerator names}$
E1 f() { return Third; } $\C{// overloaded functions, different return types}$
E2 f() { return Fourth; }
void g( E1 e );
void h( E2 e );
void foo() { $\C{// different resolutions and dealing with ambiguities}$
	E1 e1 = First;   E2 e2 = First; $\C{// initialization}$
	e1 = Second;   e2 = Second; $\C{// assignment}$
	e1 = f();   e2 = f(); $\C{// function return}$
	g( First );   h( First ); $\C{// function argument}$
	int i = @E1.@First + @E2.@First; $\C{// disambiguate with qualification}$
	int j = @(E1)@First + @(E2)@First; $\C{// disambiguate with cast}$
}
\end{cfa}
\caption{Enumerator Visibility and Disambiguating}
\label{f:EnumeratorVisibility}
\end{figure}


\section{Enumerator Scoping}

An enumeration can be scoped, using @'!'@, so the enumerator constants are not projected into the enclosing scope.
\begin{cfa}
enum Week @!@ { Mon, Tue, Wed, Thu = 10, Fri, Sat, Sun };
enum RGB @!@ { Red, Green, Blue };
\end{cfa}
Now the enumerators \emph{must} be qualified with the associated enumeration type.
\begin{cfa}
Week week = @Week.@Mon;
week = @Week.@Sat;
RGB rgb = @RGB.@Red;
rgb = @RGB.@Blue;
\end{cfa}
{\color{red}@***@}It is possible to toggle back to unscoped using the \CFA @with@ clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
\begin{cfa}
with ( @Week@, @RGB@ ) { $\C{// type names}$
	week = @Sun@; $\C{// no qualification}$
	rgb = @Green@;
}
\end{cfa}
As in Section~\ref{s:EnumeratorVisibility}, opening multiple scoped enumerations in a @with@ can result in duplicate enumeration names, but \CFA implicit type resolution and explicit qualification/casting handle this localized scenario.


\section{Enumerator Typing}
\label{s:EnumeratorTyping}

\CFA extends the enumeration declaration by parameterizing with a type (like a generic type), allowing enumerators to be assigned any values from the declared type.
Figure~\ref{f:EumeratorTyping} shows a series of examples illustrating that all \CFA types can be used with an enumeration and each type's constants used to set the enumerator constants.
Note the synonyms @Liz@ and @Beth@ in the last declaration.
Because enumerators are constants, the enumeration type is implicitly @const@, so all the enumerator types in Figure~\ref{f:EumeratorTyping} are logically rewritten with @const@.

C has an implicit type conversion from an enumerator to its base type @int@.
Correspondingly, \CFA has an implicit (safe) conversion from a typed enumerator to its base type.
\begin{cfa}
char currency = Dollar;
string fred = Fred; $\C{// implicit conversion from char * to \CFA string type}$
Person student = Beth;
\end{cfa}

% \begin{cfa}
% struct S { int i, j; };
% enum( S ) s { A = { 3, 4 }, B = { 7, 8 } };
% enum( @char@ ) Currency { Dollar = '$\textdollar$', Euro = '$\texteuro$', Pound = '$\textsterling$' };
% enum( @double@ ) Planet { Venus = 4.87, Earth = 5.97, Mars = 0.642 }; // mass
% enum( @char *@ ) Colour { Red = "red", Green = "green", Blue = "blue" };
% enum( @Currency@ ) Europe { Euro = '$\texteuro$', Pound = '$\textsterling$' }; // intersection
% \end{cfa}

\begin{figure}
\begin{cfa}
// integral
	enum( @char@ ) Currency { Dollar = '$\textdollar$', Cent = '$\textcent$', Yen = '$\textyen$', Pound = '$\textsterling$', Euro = 'E' };
	enum( @signed char@ ) srgb { Red = -1, Green = 0, Blue = 1 };
	enum( @long long int@ ) BigNum { X = 123_456_789_012_345, Y = 345_012_789_456_123 };
// non-integral
	enum( @double@ ) Math { PI_2 = 1.570796, PI = 3.141593, E = 2.718282 };
	enum( @_Complex@ ) Plane { X = 1.5+3.4i, Y = 7+3i, Z = 0+0.5i };
// pointer
	enum( @const char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
	int i, j, k;
	enum( @int *@ ) ptr { I = &i, J = &j, K = &k };
	@***@enum( @int &@ ) ref { I = i, J = j, K = k };
// tuple
	@***@enum( @[int, int]@ ) { T = [ 1, 2 ] }; $\C{// new \CFA type}$
// function
	void f() {...}   void g() {...}
	enum( @void (*)()@ ) funs { F = f, G = g };
// aggregate
	struct Person { char * name; int age, height; };
	@***@enum( @Person@ ) friends { @Liz@ = { "ELIZABETH", 22, 170 }, @Beth@ = Liz,
									Jon = { "JONATHAN", 35, 190 } };
\end{cfa}
\caption{Enumerator Typing}
\label{f:EumeratorTyping}
\end{figure}

An advantage of the typed enumerations is eliminating the \emph{harmonizing} problem between an enumeration and companion data \see{\VRef{s:Usage}}:
\begin{cfa}
enum( char * ) integral_types {
	chr = "char", schar = "signed char", uschar = "unsigned char",
	sshort = "signed short int", ushort = "unsigned short int",
	sint = "signed int", usint = "unsigned int",
	...
};
\end{cfa}
Note, the enumeration type can be a structure (see @Person@ in Figure~\ref{f:EumeratorTyping}), so it is possible to have the equivalent of multiple arrays of companion data using an array of structures.

While the enumeration type can be any C aggregate, the aggregate's \CFA constructors are not used to evaluate an enumerator's value.
\CFA enumeration constants are compile-time values (static);
calling constructors happens at runtime (dynamic).


\section{Opaque Enumeration}

\CFA provides a special opaque (pure) enumeration type with only assignment and equality operations, and no implicit conversion to any base-type.
\begin{cfa}
enum@()@ Mode { O_RDONLY, O_WRONLY, O_CREAT, O_TRUNC, O_APPEND };
Mode mode = O_RDONLY;
if ( mode == O_CREAT ) ...
bool b = mode == O_RDONLY || mode @<@ O_APPEND; $\C{// disallowed}$
int www @=@ mode; $\C{// disallowed}$
\end{cfa}


\section{Enumeration Operators}


\subsection{Conversion}

\CFA only provides an implicit safe conversion from an enumeration to its base type (like \CC), whereas C also allows an unsafe conversion from base type to enumeration.
\begin{cfa}
enum(int) Colour { Red, Blue, Green };
int w = Red; $\C[1.5in]{// allowed}$
Colour color = 0; $\C{// disallowed}\CRT$
\end{cfa}
Unfortunately, there is one confusing case between C enumerations and \CFA enumerations for type @int@.
\begin{cfa}
enum Colour { Red = 42, Blue, Green };
enum(int) Colour2 { Red = 16, Blue, Green };
int w = Red; $\C[1.5in]{// 42}\CRT$
\end{cfa}
Programmer intuition is that the assignment to @w@ is ambiguous.
However, converting from @Colour@ to @int@ is zero cost (no conversion), while from @Colour2@ to @int@ is a safe conversion, which is a higher cost.
This semantics means fewer backwards-compatibility issues with overloaded C and \CFA enumerators.


\subsection{Properties}

\VRef{s:Terminology} introduced three fundamental enumeration properties: label, position, and value.
\CFA provides direct access to these three properties via the functions: @label@, @posn@, and @value@.
\begin{cfa}
enum( const char * ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
Name name = Fred;
sout | name | label( name ) | posn( name ) | value( name );
FRED Fred 0 FRED
\end{cfa}
The default meaning for an enumeration variable in an expression is its value.


\subsection{Range}

The following helper functions are used to access and control enumeration ranges (enumerating).

The pseudo-function @countof@ (like @sizeof@) provides the size (range) of an enumeration or an enumeration instance.
\begin{cfa}
enum(int) Colour { Red, Blue, Green };
Colour c = Red;
sout | countof( Colour ) | countof( c );
3 3
\end{cfa}
@countof@ is a pseudo-function because it takes a type as an argument.
The function @fromInt@ provides a safe subscript of the enumeration.
\begin{cfa}
Colour r = fromInt( prng( countof( Colour ) ) ); // select random colour
\end{cfa}
The functions @lowerBound@, @upperBound@, @succ@, and @pred@ are for enumerating.
\begin{cfa}
for ( Colour c = lowerBound();; ) {
	sout | c | nonl;
  if ( c == upperBound() ) break;
	c = succ( c );
}
\end{cfa}
Note, the mid-exit loop is necessary to prevent triggering a @succ@ bound check, as in:
\begin{cfa}
for ( Colour c = lowerBound(); c <= upperBound(); c = succ( c ) ) ... // generates error
\end{cfa}
When @c == upperBound()@, the loop control still invokes @succ( c )@, which causes an @enumBound@ exception.
Finally, there is operational overlap between @countof@ and @upperBound@.


\section{Enumeration Inheritance}

\CFA Plan-9 inheritance may be used with enumerations, where Plan-9 inheritance is containment inheritance with implicit unscoping (like a nested unnamed @struct@/@union@ in C).
\begin{cfa}
enum( const char * ) Names { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
enum( const char * ) Names2 { @inline Names@, Jack = "JACK", Jill = "JILL" };
enum( const char * ) Names3 { @inline Names2@, Sue = "SUE", Tom = "TOM" };
\end{cfa}
Enumeration @Names2@ inherits all the enumerators and their values from enumeration @Names@ by containment, and a @Names@ enumeration is a subtype of enumeration @Names2@.
Note that enumerators must be unique in inheritance but enumerator values may be repeated.

% The enumeration type for the inheriting type must be the same as the inherited type;
% hence the enumeration type may be omitted for the inheriting enumeration and it is inferred from the inherited enumeration, as for @Name3@.
% When inheriting from integral types, automatic numbering may be used, so the inheritance placement left to right is important.
Specifically, the inheritance relationship for @Names@ is:
\begin{cfa}
Names $\(\subset\)$ Names2 $\(\subset\)$ Names3 $\C{// enum type of Names}$
\end{cfa}
A subtype can be cast to its supertype, assigned to a supertype variable, or used as a function argument that expects the supertype.
\begin{cfa}
Names fred = Names.Fred;
(Names2)fred;   (Names3)fred;   (Names3)Names2.Jack; $\C{// cast to super type}$
Names2 fred2 = fred;   Names3 fred3 = fred2; $\C{// assign to super type}$
\end{cfa}
As well, there is the implicit cast to an enumerator's base-type.
\begin{cfa}
const char * name = fred;
\end{cfa}
For the given function prototypes, the following calls are valid.
\begin{cquote}
\begin{tabular}{ll}
\begin{cfa}
void f( Names );
void g( Names2 );
void h( Names3 );
void j( const char * );
\end{cfa}
&
\begin{cfa}
f( Fred );
g( Fred ); g( Jill );
h( Fred ); h( Jill ); h( Sue );
j( Fred ); j( Jill ); j( Sue ); j( "WILL" );
\end{cfa}
\end{tabular}
\end{cquote}
Note, the validity of calls is the same for call-by-reference as for call-by-value, and @const@ restrictions are the same as for other types.


\section{Enumerator Control Structures}

Enumerators can be used in multiple contexts.
In most programming languages, an enumerator is implicitly converted to its value (like a typed macro substitution).
However, enumerator synonyms and typed enumerations make this implicit conversion to value incorrect in some contexts.
In these contexts, a programmer's intuition assumes an implicit conversion to position.

For example, an intuitive use of enumerations is with the \CFA @switch@/@choose@ statement, where @choose@ performs an implicit @break@ rather than a fall-through at the end of a @case@ clause.
(For this discussion, ignore the fact that @case@ requires a compile-time constant.)
\begin{cfa}[belowskip=0pt]
enum Count { First, Second, Third, Fourth };
Count e;
\end{cfa}
\begin{cquote}
\setlength{\tabcolsep}{15pt}
\noindent
\begin{tabular}{@{}ll@{}}
\begin{cfa}[aboveskip=0pt]

choose( e ) {
	case @First@: ...;
	case @Second@: ...;
	case @Third@: ...;
	case @Fourth@: ...;
}
\end{cfa}
&
\begin{cfa}[aboveskip=0pt]
// rewrite
choose( @value@( e ) ) {
	case @value@( First ): ...;
	case @value@( Second ): ...;
	case @value@( Third ): ...;
	case @value@( Fourth ): ...;
}
\end{cfa}
\end{tabular}
\end{cquote}
Here, the intuitive code on the left is implicitly transformed into the standard implementation on the right, using the value of the enumeration variable and enumerators.
However, this implementation is fragile, \eg if the enumeration is changed to:
\begin{cfa}
enum Count { First, Second, Third @= First@, Fourth };
\end{cfa}
making @Third == First@ and @Fourth == Second@, causing a compilation error because of duplicate @case@ clauses.
To better match with programmer intuition, \CFA toggles between value and position semantics depending on the language context.
For conditional clauses and switch statements, \CFA uses the robust position implementation.
\begin{cfa}
if ( @posn@( e ) < posn( Third ) ) ...
choose( @posn@( e ) ) {
	case @posn@( First ): ...;
	case @posn@( Second ): ...;
	case @posn@( Third ): ...;
	case @posn@( Fourth ): ...;
}
\end{cfa}

\CFA provides a special form of for-control for enumerating through an enumeration, where the range is a type.
\begin{cfa}
for ( cx; @Count@ ) { sout | cx | nonl; } sout | nl;
for ( cx; +~= Count ) { sout | cx | nonl; } sout | nl;
for ( cx; -~= Count ) { sout | cx | nonl; } sout | nl;
First Second Third Fourth
First Second Third Fourth
Fourth Third Second First
\end{cfa}
The enumeration type is syntax sugar for looping over all enumerators and assigning each enumerator to the loop index, whose type is inferred from the range type.
The prefix @+~=@ or @-~=@ iterates forward or backwards through the inclusive enumeration range, where no prefix defaults to @+~=@.

C has an idiom for @if@ and loop predicates of comparing the predicate result ``not equal to 0''.
\begin{cfa}
if ( x + y /* != 0 */ ) ...
while ( p /* != 0 */ ) ...
\end{cfa}
This idiom extends to enumerations because there is a boolean conversion in terms of the enumeration value, if and only if such a conversion is available.
For example, such a conversion exists for all numerical types (integral and floating-point).
It is possible to explicitly extend this idiom to any typed enumeration by overloading the @!=@ operator.
\begin{cfa}
bool ?!=?( Name n, zero_t ) { return n != Fred; }
Name n = Mary;
if ( n ) ... // result is true
\end{cfa}
Specialized meanings are also possible.
\begin{cfa}
enum(int) ErrorCode { Normal = 0, Slow = 1, Overheat = 1000, OutOfResource = 1001 };
bool ?!=?( ErrorCode ec, zero_t ) { return ec >= Overheat; }
ErrorCode code = ...;
if ( code ) { problem(); }
\end{cfa}


\section{Enumeration Dimension}

\VRef{s:EnumeratorTyping} introduced the harmonizing problem between an enumeration and secondary information.
When possible, using a typed enumeration for the secondary information is the best approach.
However, there are times when combining these two types is not possible.
For example, the secondary information might precede the enumeration and/or its type is needed directly to declare parameters of functions.
In these cases, having secondary arrays of the enumeration size is necessary.

To support some level of harmonizing in these cases, an array dimension can be defined using an enumeration type, and the enumerators used as subscripts.
\begin{cfa}
enum E { A, B, C, N }; // possibly predefined
float H1[N] = { [A] : 3.4, [B] : 7.1, [C] : 0.01 }; // C
float H2[@E@] = { [A] : 3.4, [B] : 7.1, [C] : 0.01 }; // CFA
\end{cfa}
(Note, C uses the symbol @'='@ for designator initialization, but \CFA had to change to @':'@ because of problems with tuple syntax.)
This approach is also necessary for a predefined typed enumeration (unchangeable), when additional secondary-information needs to be added.


\section{Enumeration I/O}

As seen in multiple examples, enumerations can be printed and the default property printed is the enumerator's label, which is similar in other programming languages.
However, very few programming languages provide a mechanism to read in enumerator values.
Even the @boolean@ type in many languages does not have a mechanism for input using the enumerators @true@ or @false@.
\VRef[Figure]{f:EnumerationI/O} shows \CFA enumeration input based on the enumerator labels.
When the enumerator labels are packed together in the input stream, the input algorithm scans for the longest matching string.
For basic types in \CFA, the constants used to initialize a variable in a program are available to initialize a variable using input, where string constants can be quoted or unquoted.
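
The longest-match rule can be illustrated outside of \CFA.
The following is a minimal sketch in plain C, not the \CFA runtime's implementation: the table @labels@ and the functions @longest_match@ and @scan_labels@ are hypothetical names introduced only for this illustration, and the sketch assumes packed input with no whitespace handling.
\begin{lstlisting}[language=C]
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical label table mirroring enum(int) E { BBB = 3, AAA, AA, AB, B }. */
static const char *const labels[] = { "BBB", "AAA", "AA", "AB", "B" };
enum { NLABELS = sizeof(labels) / sizeof(labels[0]) };

/* Return the position of the longest label that is a prefix of input,
   or -1 if no label matches; *len receives the matched length. */
int longest_match(const char *input, size_t *len) {
    assert(input != NULL && len != NULL);
    int best = -1;
    *len = 0;
    for (int i = 0; i < NLABELS; i += 1) {
        size_t n = strlen(labels[i]);
        /* keep a candidate only if it is strictly longer than the best so far */
        if (n > *len && strncmp(input, labels[i], n) == 0) {
            best = i;
            *len = n;
        }
    }
    return best;
}

/* Split packed input into label positions; returns the number of labels
   read, stopping at the first character that starts no label. */
size_t scan_labels(const char *input, int out[], size_t max) {
    size_t count = 0, len;
    while (*input != '\0' && count < max) {
        int posn = longest_match(input, &len);
        if (posn < 0) break;
        out[count] = posn;
        count += 1;
        input += len;
    }
    return count;
}
\end{lstlisting}
For the packed input @BBBABAAAAB@, @scan_labels@ yields positions 0, 3, 1, 3, \ie labels BBB, AB, AAA, AB, which agrees with the first four output lines of \VRef[Figure]{f:EnumerationI/O}.
The length comparison makes the result independent of the order in which the labels are declared.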

\begin{figure}
\begin{cquote}
\setlength{\tabcolsep}{15pt}
\begin{tabular}{@{}ll@{}}
\begin{cfa}
int main() {
	enum( int ) E { BBB = 3, AAA, AA, AB, B };
	E e;

	for () {
		try {
			@sin | e@;
		} catch( missing_data * ) {
			sout | "missing data";
			continue; // try again
		}
	  if ( eof( sin ) ) break;
		sout | e | "= " | value( e );
	}
}
\end{cfa}
&
\begin{cfa}
$\rm input$
BBBABAAAAB
BBB AAA AA AB B

$\rm output$
BBB = 3
AB = 6
AAA = 4
AB = 6
BBB = 3
AAA = 4
AA = 5
AB = 6
B = 7
\end{cfa}
\end{tabular}
\end{cquote}
\caption{Enumeration I/O}
\label{f:EnumerationI/O}
\end{figure}


\section{Planet Example}

\VRef[Figure]{f:PlanetExample} shows an archetypal enumeration example illustrating most of the \CFA enumeration features.
@Planet@ is an enumeration of type @MR@.
Each planet enumerator is initialized to a specific mass/radius, @MR@, value.
The unnamed enumeration provides the gravitational-constant enumerator @G@.
Function @surfaceGravity@ uses the @with@ clause to remove @p@ qualification from fields @mass@ and @radius@.
The program main uses the pseudo-function @countof@ to obtain the number of enumerators in @Planet@, and safely converts the random value into a @Planet@ enumerator using @fromInt@.
The resulting random orbital-body is used in a @choose@ statement.
The enumerators in the @case@ clause use the enumerator position for testing.
The prints use @label@ to print an enumerator's name.
Finally, a loop enumerates through the planets computing the weight on each planet for a given earth mass.
The print statement does an equality comparison with an enumeration variable and enumerator (@p == MOON@).
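
The arithmetic behind the weights printed in \VRef[Figure]{f:PlanetExample} can be checked independently.
The following plain C sketch is an illustration only and not part of the \CFA program: the names @surface_gravity@ and @weight_on@ are hypothetical, only three of the bodies are copied from the figure, and the radius values are interpreted as metres, as the units of @G@ require.
\begin{lstlisting}[language=C]
#include <assert.h>

/* Mass (kg) and radius (m), copied from the Planet enumerators in the figure. */
struct MR { double mass, radius; };
const struct MR EARTH = { 5.976e24, 6.3781e6 };
const struct MR MOON  = { 7.346e22, 1.7380e6 };
const struct MR MARS  = { 0.642e24, 3.3972e6 };

const double G = 6.6743e-11;  /* universal gravitational constant, m^3 kg^-1 s^-2 */

/* Surface gravity g = G * m / r^2. */
double surface_gravity(struct MR p) {
    assert(p.radius > 0.0);
    return G * p.mass / (p.radius * p.radius);
}

/* Weight on body p of an object that weighs earth_weight on Earth. */
double weight_on(struct MR p, double earth_weight) {
    double mass = earth_weight / surface_gravity(EARTH);
    return mass * surface_gravity(p);
}
\end{lstlisting}
With an earth weight of 100, @weight_on( MARS, 100.0 )@ is approximately 37.87 and @weight_on( MOON, 100.0 )@ is approximately 16.55, which round to the 37.9 and 16.6 shown in the figure's output.
Because @G@ cancels in the ratio of the two surface gravities, the result depends only on the mass and radius ratios.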

\begin{figure}
\small
\begin{cfa}
struct MR { double mass, radius; }; $\C{// planet definition}$
enum( @MR@ ) Planet { $\C{// typed enumeration}$
	//          mass (kg)   radius (m)
	MERCURY = { 0.330_E24, 2.4397_E6 },
	VENUS = { 4.869_E24, 6.0518_E6 },
	EARTH = { 5.976_E24, 6.3781_E6 },
	MOON = { 7.346_E22, 1.7380_E6 }, $\C{// not a planet}$
	MARS = { 0.642_E24, 3.3972_E6 },
	JUPITER = { 1898._E24, 71.492_E6 },
	SATURN = { 568.8_E24, 60.268_E6 },
	URANUS = { 86.86_E24, 25.559_E6 },
	NEPTUNE = { 102.4_E24, 24.746_E6 },
	PLUTO = { 1.303_E22, 1.1880_E6 }, $\C{// not a planet}$
};
enum( double ) { G = 6.6743_E-11 }; $\C{// universal gravitational constant (m3 kg-1 s-2)}$
static double surfaceGravity( Planet p ) @with( p )@ {
	return G * mass / ( radius @\@ 2 ); $\C{// no qualification, exponentiation}$
}
static double surfaceWeight( Planet p, double otherMass ) {
	return otherMass * surfaceGravity( p );
}
int main( int argc, char * argv[] ) {
	if ( argc != 2 ) @exit@ | "Usage: " | argv[0] | "earth-weight"; // terminate program
	double earthWeight = convert( argv[1] );
	double earthMass = earthWeight / surfaceGravity( EARTH );
	Planet rp = @fromInt@( prng( @countof@( Planet ) ) ); $\C{// select random orbiting body}$
	@choose( rp )@ { $\C{// implicit breaks}$
	  case MERCURY, VENUS, EARTH, MARS:
		sout | @rp@ | "is a rocky planet";
	  case JUPITER, SATURN, URANUS, NEPTUNE:
		sout | rp | "is a gas-giant planet";
	  default:
		sout | rp | "is not a planet";
	}
	for ( @p; Planet@ ) { $\C{// enumerate}$
		sout | "Your weight on" | ( @p == MOON@ ? "the" : " " ) | p
			| "is" | wd( 1,1, surfaceWeight( p, earthMass ) ) | "kg";
	}
}
$\$$ planet 100
JUPITER is a gas-giant planet
Your weight on MERCURY is 37.7 kg
Your weight on VENUS is 90.5 kg
Your weight on EARTH is 100.0 kg
Your weight on the MOON is 16.6 kg
Your weight on MARS is 37.9 kg
Your weight on JUPITER is 252.8 kg
Your weight on SATURN is 106.6 kg
Your weight on URANUS is 90.5 kg
Your weight on NEPTUNE is 113.8 kg
Your weight on PLUTO is 6.3 kg
\end{cfa}
\caption{Planet Example}
\label{f:PlanetExample}
\end{figure}