Changes in / [478dade:a933489b]
Files (7 edited):
- doc/theses/jiada_liang_MMath/CFAenum.tex (modified) (4 diffs)
- doc/theses/jiada_liang_MMath/Cenum.tex (modified) (3 diffs)
- doc/theses/jiada_liang_MMath/background.tex (modified) (8 diffs)
- doc/theses/jiada_liang_MMath/conclusion.tex (modified) (4 diffs)
- doc/theses/jiada_liang_MMath/intro.tex (modified) (2 diffs)
- doc/theses/jiada_liang_MMath/relatedwork.tex (modified) (33 diffs)
- libcfa/src/enum.hfa (modified) (2 diffs)
doc/theses/jiada_liang_MMath/CFAenum.tex
r478dade ra933489b
   53  53  enum E { A, B, C, D, @N@ }; // N == 4
   54  54  \end{cfa}
+      55
+      56  The underlying representation of \CFA enumeration object is its position, saved as an integral type.
+      57  Therefore, the size of a \CFA enumeration is consistent with a C enumeration.
+      58  Attribute function @posn@ performs type substitution on an expression from \CFA type to an integral type.
+      59  The label and value of an enumerator are stored in a global data structure for each enumeration, where attribute functions @label@/@value@ map an \CFA enumeration object to the corresponding data.
+      60  These operations do not apply to C Enums because backward compatibility means the necessary backing data structures cannot be supplied.
+      61
   55  62
   56  63  \section{Opaque Enumeration}
… …
  131 138  calling constructors happens at runtime (dynamic).
  132 139
- 133      \section{Implementation}
- 134      \CFA-cc is is a transpiler that translates \CFA code into C, which can later be compiled by a C compiler.
- 135
- 136      During the transpilation, \CFA-cc breaks a \CFA enumeration definition into a definition of a C enumeration with the same name and auxiliary arrays: a label array and a value array for a typed enumeration.
- 137      For example:
- 138      \begin{cfa}
- 139      // CFA (source):
- 140      enum(T) E { E1=t1, E2=t2, E3=t3 };
- 141      \end{cfa}
- 142      is compiled into:
- 143      \begin{cfa}
- 144      // C (transpiled by cfa-cc):
- 145      enum E { E1, E2, E3 };
- 146      const char * E_labels[3] = { "E1", "E2", "E3" };
- 147      const T E_values [3] = { t1, t2, t3 };
- 148      \end{cfa}
- 149      The generated C enumeration will have enumerator values equals to their positions thanks to C's auto-initialization scheme. Notice that value and label arrays are dynamically allocated data structures that take up
- 150      memory. If an enumeration is globally defined, the arrays are allocated in the @.data@ section and will be initialized before the program execution.
- 151      Otherwise, if an enumeration has its definition in a local scope, these arrays will be allocated on the stack and be initialized when the program counter
- 152      reaches the code location of the enumeration definition.
- 153
- 154      % This bring a considerable overhead to the program, in terms of both execution time and storage.
- 155      % An opaque enumeration has no overhead
- 156      % for values, and it has been suggested as a future work to leave as an option to not generate the label array.
- 157
- 158      Alongs with the enumeration defintion, \CFA-cc adds defintions of attribute functions: @posn@, @label@ and @value@:
- 159      \begin{cfa}
- 160      inline int posn( E e ) { return (int) e; }
- 161      inline const * label( E e ) { return E_labels[ (int) e ]; }
- 162      inline const * E_value( E e ) { return E_values[ (int) e ]; }
- 163      \end{cfa}
- 164      These functions are not implemented in \CFA code: they are Abstract Syntax Tree (AST) nodes appends to the Abstract Syntax Tree (AST).
- 165      Notably, the AST subnode for the "cast to @int@" expression inside the functions is annotated as reinterpreted casts.
- 166      In order words, the effect of a case is only to change the type of an expression, and it stops further reduction on the expression \see{\VRef{s:ValueConversion}}.
- 167
- 168      Consequently, \CFA enumeration comes with space and runtime overhead, both for enumeration definition and function call to attribute functions. \CFA made efforts to reduce the runtime
- 169      overhead on function calls by aggressively reducing @label()@ and @value()@ function calls on an enumeration constant to a constant expression. The interpreted casts are extraneous
- 170      after type checking and removed in later steps. A @label()@ and @value()@ call on an enumeration variable is a lookup of an element of an array of constant values, and it is up to the
- 171      C compiler to optimize its runtime. While OpaqueEnum is effectively an "opt-out" of the value overhead, it has been suggested that an option to "opt-out" from labels be added as well.
- 172      A @label()@ function definition is still necessary to accomplish enumeration traits. But it will return an empty string for an enumeration label when "opt-out" or the enumerator name
- 173      when it is called on an enumeration constant. It will allow a user not to pay the overhead for labels when the enumerator names of a particular enumerated type are not helpful.
  174 140
  175 141  \section{Value Conversion}
- 176      \label{s:ValueConversion}
+     142
  177 143  C has an implicit type conversion from an enumerator to its base type @int@.
  178 144  Correspondingly, \CFA has an implicit conversion from a typed enumerator to its base type, allowing typed enumeration to be seamlessly used as the value of its base type
… …
  225 191  enum(S) E { A, B, C, D };
  226 192  \end{cfa}
- 227
- 228      The restriction on C's enumeration initializers being constant expression is relaxed on \CFA enumeration.
- 229      Therefore, an enumerator initializer allows function calls like @?+?( S & s, one_t )@ and @?{}( S & s, zero_t )@.
- 230      It is because the values of \CFA enumerators are not stored in the compiled enumeration body but in the @value@ array, which
- 231      allows dynamic initialization.
  232 193
  233 194  \section{Subset}
… …
  559 520  The enumerators in the @case@ clause use the enumerator position for testing.
  560 521  The prints use @label@ to print an enumerator's name.
- 561      Finally, a loop enumerates through the planets computing the weight on each planet for a given Earth mass.
+     522  Finally, a loop enumerates through the planets computing the weight on each planet for a given earth mass.
  562 523  The print statement does an equality comparison with an enumeration variable and enumerator (@p == MOON@).
  563 524
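The code generation described in the removed \section{Implementation} hunk of CFAenum.tex can be sketched in plain C. This is a minimal sketch under stated assumptions, not the code that cfa-cc emits: the base type T is taken to be int, the values 10, 20, 30 stand in for t1, t2, t3, and the bounds assertions are additions for illustration only. The names E_labels, E_values, posn, label and value follow the hunk; the hunk's label and E_value declarations omit part of the return type, which the sketch fills in.

```c
#include <assert.h>

/* Hypothetical CFA source (T = int): enum(int) E { E1 = 10, E2 = 20, E3 = 30 }; */

/* C enumeration with the same name: auto-initialization makes each
   enumerator's value equal to its position 0, 1, 2. */
enum E { E1, E2, E3 };

/* Auxiliary arrays, both indexed by enumerator position. */
static const char * const E_labels[3] = { "E1", "E2", "E3" };
static const int E_values[3] = { 10, 20, 30 };

/* Attribute functions: posn is a plain cast, label and value are array lookups. */
static inline int posn( enum E e ) { return (int)e; }

static inline const char * label( enum E e ) {
    assert( posn( e ) >= 0 && posn( e ) < 3 );   /* guard added for this sketch only */
    return E_labels[ posn( e ) ];
}

static inline int value( enum E e ) {
    assert( posn( e ) >= 0 && posn( e ) < 3 );   /* guard added for this sketch only */
    return E_values[ posn( e ) ];
}
```

With these definitions, posn( E2 ) is 1, label( E2 ) is the string E2, and value( E2 ) is 20. The enumeration object itself stays the size of a C enumeration because only the position is stored in it.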
doc/theses/jiada_liang_MMath/Cenum.tex
r478dade ra933489b
   17  17  There is no mechanism in C to resolve these naming conflicts other than renaming one of the duplicates, which may be impossible if the conflict comes from system include-files.
   18  18
-  19      The \CFA type-system allows extensive overloading, including enumerators. For example, enumerator First from E1 can exist at the scope as First from E2.
+      19  The \CFA type-system allows extensive overloading, including enumerators.
   20  20  Hence, most ambiguities among C enumerators are implicitly resolved by the \CFA type system, possibly without any programmer knowledge of the conflict.
   21  21  In addition, C Enum qualification is added, exactly like aggregate field-qualification, to disambiguate.
   22  22  \VRef[Figure]{f:EnumeratorVisibility} shows how resolution, qualification, and casting are used to disambiguate situations for enumerations @E1@ and @E2@.
-  23
-  24      Aside, name shadowing in \CFA only happens when a name has been redefined with the exact same type. Because an enumeration define its type and enumerators in one definition,
-  25      and enumeration cannot be changed afterward, shadowing an enumerator is not possible (it is impossible to have another @First@ with same type @E1@.).
   26  23
   27  24  \begin{figure}
… …
   60  57  \end{cfa}
   61  58  % with feature unimplemented
-  62      It is possible to introduce enumerators from a scoped enumeration to a block scope using the \CFA @with@ auto-qualification clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
+      59  It is possible to toggle back to unscoped using the \CFA @with@ auto-qualification clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
   63  60  \begin{cfa}
   64  61  with ( @Week@, @RGB@ ) { $\C{// type names}$
… …
   68  65  \end{cfa}
   69  66  As in Section~\ref{s:CVisibility}, opening multiple scoped enumerations in a @with@ can result in duplicate enumeration names, but \CFA implicit type resolution and explicit qualification/casting handle this localized scenario.
+      67
   70  68
   71  69  \section{Type Safety}
doc/theses/jiada_liang_MMath/background.tex
r478dade ra933489b
    6   6  \section{C}
    7   7
-   8      As mentioned in \VRef{s:Aliasing}, it is common for C programmers to believe there are three equivalent forms of named constants.
+       8  As mentioned in \VRef{s:Aliasing}, it is common for C programmers to ``believe'' there are three equivalent forms of named constants.
    9   9  \begin{clang}
   10  10  #define Mon 0
… …
   15  15  \item
   16  16  For @#define@, the programmer must explicitly manage the constant name and value.
-  17      Furthermore, these C preprocessor macro names are outside the C type system and can unintentionally change semantics of a program.
+      17  Furthermore, these C preprocessor macro names are outside the C type system and can incorrectly change random text in a program.
   18  18  \item
   19  19  The same explicit management is true for the @const@ declaration, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{
… …
   24  24  \end{clang}
   25  25  \item
-  26      Only the @enum@ form is managed by the compiler, is part of the language type-system, works in all C constant-expression locations, and does not occupy storage.
+      26  Only the @enum@ form is managed by the compiler, is part of the language type-system, works in all C constant-expression locations, and normally does not occupy storage.
   27  27  \end{enumerate}
   28  28
… …
   55  55  \end{cquote}
   56  56  However, statically initialized identifiers cannot appear in constant-expression contexts, \eg @case@.
-  57      Dynamically initialized identifiers may appear in initialization and array dimensions, which allows variable-sized arrays on the stack.
+      57  Dynamically initialized identifiers may appear in initialization and array dimensions in @g++@, which allows variable-sized arrays on the stack.
   58  58  Again, this form of aliasing is not an enumeration.
   59  59
… …
  132 132  Theoretically, a C enumeration \emph{variable} is an implementation-defined integral type large enough to hold all enumerator values.
  133 133  In practice, C defines @int@~\cite[\S~6.4.4.3]{C11} as the underlying type for enumeration variables, restricting initialization to integral constants, which have type @int@ (unless qualified with a size suffix).
- 134      According to the C standard, type @int@ is defined as the following:
+     134  However, type @int@ is defined as:
  135 135  \begin{quote}
  136 136  A ``plain'' @int@ object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range @INT_MIN@ to @INT_MAX@ as defined in the header @<limits.h>@).~\cite[\S~6.2.5(5)]{C11}
  137 137  \end{quote}
  138 138  However, @int@ means 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture.
- 139      %Whereas @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.
+     139  Whereas @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.
  140 140  \VRef[Figure]{f:gccEnumerationStorageSize} shows both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based on its initialization.
  141 141  Hence, initialization in the range @INT_MIN@..@INT_MAX@ results in a 4-byte enumerator, and outside this range, the enumerator is 8 bytes.
- 142      Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than its containing enumeration type, which seems inconsistent.
+     142  Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than is containing enumeration type, which seems inconsistent, \eg @sizeof( typeof( 3 ) ) == sizeof( int )@.
  143 143
  144 144  \begin{figure}
… …
  156 156  }
  157 157  8 4
- 158      4 8
  159 158  4 -2147483648 2147483647
  160 159  8 -9223372036854775808 9223372036854775807
… …
  169 168  \label{s:Usage}
  170 169
- 171      C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type and between different enumerations.
+     170  C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type and between two different enumerations.
  172 171  \begin{clang}
  173 172  enum Week week = Mon; $\C{// week == 0}$
… …
  256 255  Virtually all programming languages overload the arithmetic operators across the basic types using the number and type of parameters and returns.
  257 256  Like \CC, \CFA also allows these operators to be overloaded with user-defined types.
- 258      The syntax for operator names uses the @'?'@ character to denote a parameter, \eg unary operators: @?++@, @++?@, binary operator @?+?@.
+     257  The syntax for operator names uses the @'?'@ character to denote a parameter, \eg prefix and infix increment operators: @?++@, @++?@, and @?+?@.
  259 258  \begin{cfa}
  260 259  struct S { int i, j };
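Two claims in the background.tex hunks lend themselves to a short C illustration: that only the @enum@ form of a named constant works in every constant-expression location, and that C converts implicitly in both directions between an enumeration and its integral type. The following is a minimal sketch; the names MonMacro, MonConst, names_agree, is_weekend and round_trip are hypothetical and do not appear in the thesis.

```c
#include <assert.h>

#define MonMacro 0                 /* preprocessor macro: outside the C type system */
static const int MonConst = 0;     /* const object: not a constant expression in C */
enum Week { Mon, Tue, Wed, Thu, Fri, Sat, Sun };   /* enumerators: integer constant expressions */

/* All three forms name the same value, which is why they are often treated as equivalent. */
static int names_agree( void ) { return MonMacro == MonConst && MonConst == Mon; }

/* Enumerators may label a case; in C, MonConst could not be used here. */
static int is_weekend( enum Week day ) {
    switch ( day ) {
      case Sat: case Sun: return 1;
      default: return 0;
    }
}

/* Implicit conversions in both directions, none of which C rejects. */
static int round_trip( void ) {
    enum Week week = Mon;   /* enumerator to enumeration */
    int i = week;           /* enumeration to int */
    week = 42;              /* int to enumeration, although 42 names no enumerator */
    (void)i;
    return week;            /* the stored 42 converts back to int unchanged */
}
```

The assignment of 42 is the unsafe direction that the hunks' discussion of type safety is concerned with: the compiler accepts it, so an enumeration variable is not restricted to its enumerators.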
doc/theses/jiada_liang_MMath/conclusion.tex
r478dade ra933489b
    2   2  \label{c:conclusion}
    3   3
-   4      This work aims to extend the simple and unsafe enumeration type in the C programming language into a complex and safe enumeration type in the \CFA programming language while maintaining backward compatibility with C.
+       4  The goal of this work is to extend the simple and unsafe enumeration type in the C programming language into a complex and safe enumeration type in the \CFA programming language while maintaining backward compatibility with C.
    5   5  Within this goal, the new \CFA enumeration should align with the analogous enumeration features in other languages to match modern programming expectations.
    6   6  Hence, the \CFA enumeration features are borrowed from a number of programming languages, but engineered to work and play with \CFA's type system and feature set.
… …
    8   8  Strong type-checking of enumeration initialization and assignment provides additional safety, ensuring an enumeration only contains its enumerators.
    9   9  Overloading and scoping of enumerators significantly reduces the naming problem, providing a better software-engineering environment, with fewer name clashes and the ability to disambiguate those that cannot be implicitly resolved.
-  10      Typed enumerations solve the data-harmonization problem, increasing safety through better software engineering.
+      10  Typed enumerations solve the data-harmonization problem increasing safety through better software engineering.
   11  11  Moreover, integrating enumerations with existing control structures provides a consistent upgrade for programmers and a succinct and secure mechanism to enumerate with the new loop-range feature.
   12  12  Generalization and reuse are supported by incorporating the new enumeration type using the \CFA trait system.
-  13      Enumeration traits define the meaning of an enumeration, allowing functions to be written that work on any enumeration, such as the reading and printing of an enumeration.
-  14      With advanced structural typing, C enumerations can be extended so they work with all of the enumeration features, providing for legacy C code to be moved forward into the modern \CFA programming domain.
-  15      Finally, the \CFA project's test suite has been expanded with multiple enumeration features tests with respect to implicit conversions, control structures, inheritance, interaction with the polymorphic types, and the features built on top of enumeration traits.
+      13  Enumeration traits define the meaning of an enumeration, allowing functions to be written that work on any enumeration, such as the reading and printing an enumeration.
+      14  Using advanced duck typing, existing C enumerations can be extended so they work with all of the enumeration features, providing for legacy C code to be moved forward into the modern \CFA programming domain.
+      15  Finally, I expanded the \CFA project's test-suite with multiple enumeration features tests with respect to implicit conversions, control structures, inheritance, interaction with the polymorphic types, and the features built on top of enumeration traits.
   16  16  These tests ensure future \CFA work does not accidentally break the new enumeration system.
   17  17
-  18      In summary, the new \CFA enumeration mechanisms achieve the initial goals, providing C programmers with an intuitive enumeration mechanism for handling modern programming requirements.
+      18  The conclusion is that the new \CFA enumeration mechanisms achieve the initial goals, providing C programmers with an intuitive enumeration mechanism for handling modern programming requirements.
   19  19
   20  20
… …
   51  51  \end{cfa}
   52  52  \item
-  53      Currently, enumeration scoping is all or nothing. In some cases, it might be useful to
+      53  Currently enumeration scoping is all or nothing. In some cases, it might be useful to
   54  54  increase the scoping granularity to individual enumerators.
   55  55  \begin{cfa}
… …
   68  68  typedef RGB.Red OtherRed; // alias
   69  69  \end{cfa}
-  70      \item
-  71      Label arrays are auxiliary data structures that are always generated for \CFA enumeration, which is a considerable program overhead.
-  72      It is helpful to provide a new syntax or annotation for a \CFA enumeration definition that tells the compiler the labels will not be used
-  73      throughout the execution. Therefore, \CFA optimizes the label array away. The @value()@ function can still be used on an enumeration constant,
-  74      and the function called is reduced to a @char *@ constant expression that holds the name of the enumerator. But if @value()@ is called on
-  75      a variable with an enumerated type, it returns an empty string since the label information is lost for the runtime.
   76  70  \end{enumerate}
doc/theses/jiada_liang_MMath/intro.tex
r478dade ra933489b
   61  61  The alias names are constants, which follow transitively from their binding to other constants.
   62  62  \item
-  63      Defines a type for generating instances (variables).
+      63  Defines a type for generating instants (variables).
   64  64  \item
   65  65  For safety, an enumeration instance should be restricted to hold only its constant names.
… …
  232 232  % https://hackage.haskell.org/package/base-4.19.1.0/docs/GHC-Enum.html
  233 233
- 234      % The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@.
- 235      % Note, the unit type is not the same as \lstinline{void}.
- 236      In terms of functional programming linguistics, enumerations often refers to a @unit type@ ADT, where @unit type@ is a type
- 237      that carry no information.
- 238      % \begin{cfa}
- 239      % void foo( void );
- 240      % struct unit {} u; $\C[1.5in]{// empty type}$
- 241      % unit bar( unit );
- 242      % foo( @foo()@ ); $\C{// void argument does not match with void parameter}$
- 243      % bar( bar( u ) ); $\C{// unit argument does match with unit parameter}\CRT$
- 244      % \end{cfa}
+     234  The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@.
+     235  Note, the unit type is not the same as \lstinline{void}.
+     236  \begin{cfa}
+     237  void foo( void );
+     238  struct unit {} u; $\C[1.5in]{// empty type}$
+     239  unit bar( unit );
+     240  foo( @foo()@ ); $\C{// void argument does not match with void parameter}$
+     241  bar( bar( u ) ); $\C{// unit argument does match with unit parameter}\CRT$
+     242  \end{cfa}
  245 243
  246 244  For example, in the Haskell ADT:
doc/theses/jiada_liang_MMath/relatedwork.tex
r478dade ra933489b
    1   1  \chapter{Related Work}
    2   2  \label{s:RelatedWork}
+       3
+       4  \begin{comment}
+       5  An algebraic data type (ADT) can be viewed as a recursive sum of product types.
+       6  A sum type lists values as members.
+       7  A member in a sum type definition is known as a data constructor.
+       8  For example, C supports sum types union and enumeration (enum).
+       9  An enumeration in C can be viewed as the creation of a list of zero-arity data constructors.
+      10  A union instance holds a value of one of its member types.
+      11  Defining a union does not generate new constructors.
+      12  The definition of member types and their constructors are from the outer lexical scope.
+      13
+      14  In general, an \newterm{algebraic data type} (ADT) is a composite type, \ie, a type formed by combining other types.
+      15  Three common classes of algebraic types are \newterm{array type}, \ie homogeneous types, \newterm{product type}, \ie heterogeneous tuples and records (structures), and \newterm{sum type}, \ie tagged product-types (unions).
+      16  Enumerated types are a special case of product/sum types with non-mutable fields, \ie initialized (constructed) once at the type's declaration, possible restricted to compile-time initialization.
+      17  Values of algebraic types are access by subscripting, field qualification, or type (pattern) matching.
+      18  \end{comment}
    3  19
    4  20  Enumeration-like features exist in many popular programming languages, both past and present, \eg Pascal~\cite{Pascal}, Ada~\cite{Ada}, \Csharp~\cite{Csharp}, OCaml~\cite{OCaml} \CC, Go~\cite{Go}, Haskell~\cite{Haskell} \see{discussion in \VRef{s:AlgebraicDataType}}, Java~\cite{Java}, Rust~\cite{Rust}, Swift~\cite{Swift}, Python~\cite{Python}.
… …
   29  45  type Boolean = ( false, true );
   30  46  \end{pascal}
-  31      The enumeration supports the relational operators @=@, @<>@, @<@, @<=@, @>=@, and @>@, interpreted as as comparison in terms of declaration order.
+      47  The enumeration ordering supports the relational operators @=@, @<>@, @<@, @<=@, @>=@, and @>@, provided both operands are the same (sub)type.
   32  48
   33  49  The following auto-generated pseudo-functions exist for all enumeration types:
… …
   49  65  wend : Weekend;
   50  66  \end{pascal}
-  51      Hence, declaration order of enumerators is crucial to provide the necessary ranges.
+      67  Hence, the ordering of the enumerators is crucial to provide the necessary ranges.
   52  68  There is a bidirectional assignment between the enumeration and its subranges.
   53  69  \begin{pascal}
… …
  135 151
  136 152  The underlying type is an implementation-defined integral type large enough to hold all enumerated values; it does not have to be the smallest possible type.
- 137      The integral size can be explicitly specified using compiler directive \$@PACKENUM@~$N$, where $N$ is the number of bytes, \eg:
+     153  The integral size can be explicitly specified using compiler directive @$PACKENUM@~$N$, where $N$ is the number of bytes, \eg:
  138 154  \begin{pascal}
  139 155  Type @{$\color{red}\$$PACKENUM 1}@ SmallEnum = ( one, two, three );
… …
  183 199  \end{figure}
  184 200
- 185      Enumerators without initialization are auto-initialized from left to right, starting at zero and incrementing by 1.
+     201  Enumerators without initialization are auto-initialized from left to right, starting at zero, incrementing by 1.
  186 202  Enumerators with initialization must set \emph{all} enumerators in \emph{ascending} order, \ie there is no auto-initialization.
  187 203  \begin{ada}
… …
  366 382  \end{c++}
  367 383  \CC{11} added a scoped enumeration, \lstinline[language=c++]{enum class} (or \lstinline[language=c++]{enum struct})\footnote{
- 368      The use of keyword \lstinline[language=c++]{class} is reasonable because default visibility is \lstinline[language=c++]{private} (scoped).
+     384  The use of keyword \lstinline[language=c++]{class} is resonable because default visibility is \lstinline[language=c++]{private} (scoped).
  369 385  However, default visibility for \lstinline[language=c++]{struct} is \lstinline[language=c++]{public} (unscoped) making it an odd choice.},
  370 386  where the enumerators are accessed using type qualification.
… …
  380 396  E e = A; e = B; $\C{// direct access}$
  381 397  \end{c++}
- 382      \CC{11} added the ability to explicitly declare an underlying \emph{integral} type for \lstinline[language=c++]{enum class}.
+     398  \CC{11} added the ability to explicitly declare only an underlying \emph{integral} type for \lstinline[language=c++]{enum class}.
  383 399  \begin{c++}
  384 400  enum class RGB @: long@ { Red, Green, Blue };
… …
  414 430  \end{tabular}
  415 431  \end{cquote}
- 416      However, there is no mechanism to iterate through an enumeration.
- 417      A common workaround is to iterate over enumerator as integral values, but it only works if
- 418      enumerators resemble a sequence of natural, i.e., enumerators are auto-initialized.
- 419      Otherwises, the iteration would have integers that are not enumeration values.
+     432  However, there is no mechanism to iterate through an enumeration without an unsafe cast and it does not understand the enumerator values.
  420 433  \begin{c++}
  421 434  enum Week { Mon, Tue, Wed, Thu = 10, Fri, Sat, Sun };
… …
  436 449  % https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/enums
  437 450
- 438      \Csharp is a programming language with a scoped, integral enumeration similar to \CC \lstinline[language=C++]{enum class}.
+     451  \Csharp is a dynamically-typed programming language with a scoped, integral enumeration similar to \CC \lstinline[language=C++]{enum class}.
  439 452  \begin{csharp}
  440 453  enum Week : @long@ { Mon, Tue, Wed, Thu@ = 10@, Fri, Sat, Sun }
… …
  495 508  \end{cquote}
  496 509
- 497      To indirectly enumerate, \Csharp's Enum library provides @Enum.GetValues@,
- 498      % a pseudo-method that retrieves an array of the enumeration constants for looping over an enumeration type or variable (expensive operation).
- 499      a static memeber of abstract Enum type that return a reference to an array of all enumeration constants.
- 500      Internally, a Enum type has a static member called @fieldInfoHash@, a @Hashtable@ that stores enumerators information. The field is populated on-demand:
- 501      it only contains information if a @reflection@ like @GetValues@ is called. But the information will be cached, so the cost of reflection is paid only
- 502      once throughout the lifetime of a program. @GetValues@ then converts a @Hashtable@ to an @Array@, which supports enumerating.
+     510  To indirectly enumerate, \Csharp's Enum library has @Enum.GetValues@, a pseudo-method that retrieves an array of the enumeration constants for looping over an enumeration type or variable (expensive operation).
  503 511  \begin{csharp}
  504 512  foreach ( Week d in @Enum.GetValues@( typeof(Week) ) ) {
… …
  527 535  \label{s:Go}
  528 536
- 529      Go has @const@ aliasing declarations, similar to \CC \see{\VRef{s:C++RelatedWork}}, for basic types with type inferencing and static initialization (constant expression).
- 530      The most basic form of constant definition is a @const@ keyword, followed by the name of constant, an optional type declaration of the constant, and a mandatory initialize.
- 531      For exmaple:
+     537  Go has a no enumeration.
+     538  It has @const@ aliasing declarations, similar to \CC \see{\VRef{s:C++RelatedWork}}, for basic types with type inferencing and static initialization (constant expression).
  532 539  \begin{Go}
  533 540  const R @int@ = 0; const G @uint@ = 1; const B = 2; $\C{// explicit typing and type inferencing}$
  537 544  const V = 3.1; const W = 3.1;
  538 545  \end{Go}
- 539      Since these declarations are immutable variables, they are unscoped and Go has no overloading. If no type declaration provided, Go infers
- 540      type from the initializer expression.
- 541
- 542      % Go provides an enumeration-like feature to group together @const@ declaration into a block and introduces a form of auto-initialization.
- 543      These named constants can be grouped together in one @const@ declaration block and introduces a form of auto-initialization.
+     546  Since these declarations are immutable variables, they are unscoped and Go has no overloading.
+     547
+     548  Go provides an enumeration-like feature to group together @const@ declaration into a block and introduces a form of auto-initialization.
  544 549  \begin{Go}
  545 550  const ( R = 0; G; B ) $\C{// implicit initialization: 0 0 0}$
  546 551  const ( Fred = "Fred"; Mary = "Mary"; Jane = "Jane" ) $\C{// explicit initialization: Fred Mary Jane}$
- 547      const ( S = 0; T; USA = "USA"; U; V = 3.1; W ) $\C{// implicit/explicit: 0 0 USA USA 3.1 3.1}$
+     552  const ( S = 0; T; USA = "USA"; U; V = 3.1; W ) $\C{// type change, implicit/explicit: 0 0 USA USA 3.1 3.1}$
  548 553  \end{Go}
  549 554  The first identifier \emph{must} be explicitly initialized;
  550 555  subsequent identifiers can be implicitly or explicitly initialized.
- 551      Implicit initialization always uses the \emph{previous} (predecessor) constant expression initializer.
- 552
- 553      % Each @const@ declaration provides an implicit integer counter starting at zero, called \lstinline[language=Go]{iota}.
- 554      Each const declaration is often paired with a const expression \lstinline[language=Go]{iota} to re-define its
- 555      implicit initialization. \lstinline[language=Go]{iota} represents an sequence of natural number starting from zero.
- 556      Every implicit or explicit \lstinline[language=Go]{iota} increments the value of the expression by one.
+     556  Implicit initialization is the \emph{previous} (predecessor) identifier value.
+     557
+     558  Each @const@ declaration provides an implicit integer counter starting at zero, called \lstinline[language=Go]{iota}.
  557 559  Using \lstinline[language=Go]{iota} outside of a @const@ block always sets the identifier to zero.
  558 560  \begin{Go}
  559 561  const R = iota; $\C{// 0}$
  560 562  \end{Go}
- 561      % Inside a @const@ block, \lstinline[language=Go]{iota} is implicitly incremented for each \lstinline[language=golang]{const} identifier and used to initialize the next uninitialized identifier.
- 562      Inside a @const@ block, if a constant has \lstinline[language=Go]{iota} initializer, its successor will also use \lstinline[language=Go]{iota} initializer.
- 563      \lstinline[language=Go]{iota} is no different than other constant expression when it is used in implicit initialization, but
- 564      thanks to the increment natural of \lstinline[language=Go]{iota}, the successor will have a value equal to its predecessor plus 1.
+     563  Inside a @const@ block, \lstinline[language=Go]{iota} is implicitly incremented for each \lstinline[language=golang]{const} identifier and used to initialize the next uninitialized identifier.
  565 564  \begin{Go}
  566 565  const ( R = @iota@; G; B ) $\C{// implicit: 0 1 2}$
- 567      % const ( C = @iota + B + 1@; G; Y ) $\C{// implicit: 3 4 5}$
- 568      \end{Go}
- 569      The constant blocks from the previous example is equivalanet to:
- 570      \begin{Go}
- 571      const ( R = @iota@; G=@iota@; B=@iota@ ) $\C{// implicit: 0 1 2}$
- 572      \end{Go}
- 573      R, G, B have values 0, 1, 2, respectively, because \lstinline[language=Go]{iota} is an increasing.
- 574
- 575      Similarly,
- 576      \begin{Go}
  577 566  const ( C = @iota + B + 1@; G; Y ) $\C{// implicit: 3 4 5}$
  578 567  \end{Go}
- 579      can be rewritten as:
+     568  An underscore \lstinline[language=golang]{const} identifier advances \lstinline[language=Go]{iota}.
  580 569  \begin{Go}
- 581      const ( C = @iota + B + 1@; G = @iota + B + 1@; Y = @iota + B + 1@ ) $\C{// implicit: 3 4 5}$
+     570  const ( O1 = iota + 1; @_@; O3; @_@; O5 ) // 1, 3, 5
  582 571  \end{Go}
- 583      Go's grouped constants do not define a new type, and constants in the same block can have heterogeneous types.
- 584      These two characteristics differs a grouped constant from an enumeration, but also gives a direction on approximating enumeration in Go:
- 585      first to define a new type externally, and make sure all constants in the same group will have the new type.
+     572  Auto-initialization reverts from \lstinline[language=Go]{iota} to the previous value after an explicit initialization, but auto-incrementing of \lstinline[language=Go]{iota} continues.
  586 573  \begin{Go}
- 587      type Language int64
- 588      const (
- 589      C Language = iota
- 590      CPP
- 591      CSharp
- 592      CFA
- 593      Go
- 594      )
+     574  const ( Mon = iota; Tue; Wed; // 0, 1, 2
+     575  @Thu = 10@; Fri; Sat; @Sun = itoa@ ) $\C{// 10, 10, 10, {\color{red}6}}$
  595 576  \end{Go}
- 596      By typing the first constant as @Language@ and assigning initializer with \lstinline[language=Go]{iota}, all other constants will have the same type
- 597      and the same initialzer. It is a close approximation, but it is not a real enumeration. The definition of the "enumerated type" is separate from
- 598      the "enumerator definition", and nothing stop outside constants to have the type @Language@.
- 599
- 600      % An underscore \lstinline[language=golang]{const} identifier advances \lstinline[language=Go]{iota}.
- 601      % \begin{Go}
- 602      % const ( O1 = iota + 1; @_@; O3; @_@; O5 ) // 1, 3, 5
- 603      % \end{Go}
- 604      % Auto-initialization reverts from \lstinline[language=Go]{iota} to the previous value after an explicit initialization, but auto-incrementing of \lstinline[language=Go]{iota} continues.
- 605      % \begin{Go}
- 606      % const ( Mon = iota; Tue; Wed; // 0, 1, 2
- 607      % @Thu = 10@; Fri; Sat; @Sun = itoa@ ) $\C{// 10, 10, 10, {\color{red}6}}$
- 608      % \end{Go}
- 609      % Auto-initialization from \lstinline[language=Go]{iota} is restarted and \lstinline[language=Go]{iota} reinitialized with an expression containing at most \emph{one} \lstinline[language=Go]{iota}.
- 610      % \begin{Go}
- 611      % const ( V1 = iota; V2; @V3 = 7;@ V4 = @iota@ + 1; V5 ) // 0 1 7 4 5
- 612      % const ( Mon = iota; Tue; Wed; // 0, 1, 2
- 613      % @Thu = 10;@ Fri = @iota@ - Wed + Thu - 1; Sat; Sun ) // 10, 11, 12, 13
- 614      % \end{Go}
- 615      % Here, @V4@ and @Fri@ restart auto-incrementing from \lstinline[language=Go]{iota} and reset \lstinline[language=Go]{iota} to 4 and 11, respectively, because of the initialization expressions containing \lstinline[language=Go]{iota}.
- 616      % Note, because \lstinline[language=Go]{iota} is incremented for an explicitly initialized identifier or @_@,
- 617      % at @Fri@ \lstinline[language=Go]{iota} is 4 requiring the minus one to compute the value for @Fri@.
- 618
+     577  Auto-initialization from \lstinline[language=Go]{iota} is restarted and \lstinline[language=Go]{iota} reinitialized with an expression containing at most \emph{one} \lstinline[language=Go]{iota}.
578 \begin{Go} 579 const ( V1 = iota; V2; @V3 = 7;@ V4 = @iota@ + 1; V5 ) // 0 1 7 4 5 580 const ( Mon = iota; Tue; Wed; // 0, 1, 2 581 @Thu = 10;@ Fri = @iota@ - Wed + Thu - 1; Sat; Sun ) // 10, 11, 12, 13 582 \end{Go} 583 Here, @V4@ and @Fri@ restart auto-incrementing from \lstinline[language=Go]{iota} and reset \lstinline[language=Go]{iota} to 4 and 11, respectively, because of the initialization expressions containing \lstinline[language=Go]{iota}. 584 Note, because \lstinline[language=Go]{iota} is incremented for an explicitly initialized identifier or @_@, 585 at @Fri@ \lstinline[language=Go]{iota} is 4 requiring the minus one to compute the value for @Fri@. 619 586 620 587 Basic switch and looping are possible. … … 643 610 \end{tabular} 644 611 \end{cquote} 645 However, the loop in this exampleprints the values from 0 to 13 because there is no actual enumeration.612 However, the loop prints the values from 0 to 13 because there is no actual enumeration. 646 613 647 614 A constant variable can be used as an array dimension or a subscript. … … 660 627 Week day = Week.Sat; 661 628 \end{Java} 662 The enumerator's members are scoped and requiresqualification.629 The enumerator's members are scoped and cannot be made \lstinline[language=java]{public}, hence requiring qualification. 663 630 The value of an enumeration instance is restricted to its enumerators. 664 631 … … 696 663 If the implementation member is \lstinline[language=Java]{public}, the enumeration is unsafe, as any value of the underlying type can be assigned to it, \eg @day = 42@. 697 664 The implementation constructor must be private since it is only used internally to initialize the enumerators. 698 Initialization occurs at the enumeration-type declaration .665 Initialization occurs at the enumeration-type declaration for each enumerator in the first line. 699 666 700 667 Enumerations can be used in the @if@ and @switch@ statements but only for equality tests. 
… … 724 691 725 692 There are no arithmetic operations on enumerations, so there is no arithmetic way to iterate through an enumeration without making the implementation type \lstinline[language=Java]{public}. 726 Like \Csharp, looping over an enumeration is done using static method @values@, which returns an array of enumerator values. 727 Unfortunately, @values@ is an expensive @O(n)@ operation because it creates a new array every time it is called. 693 Like \Csharp, looping over an enumeration is done using method @values@, which returns an array of enumerator values (expensive operation). 728 694 \begin{Java} 729 695 for ( Week d : Week.values() ) { … … 734 700 Like \Csharp, enumerating is supplied indirectly through another enumerable type, not via the enumeration. 735 701 736 % Java provides an @EnumSet@ where the underlying type is an efficient set of bits, one per enumeration \see{\Csharp \lstinline{Flags}, \VRef{s:Csharp}}, providing (logical) operations on groups of enumerators.737 % There is also a specialized version of @HashMap@ with enumerator keys, which has performance benefits.738 Java provides @EnumSet@, an auxiliary data structure that takes an enum @class@ as parameter (Week.class) for its construction, and it contains members only with the supplied739 enum type. @EnumSet@ is enumerable because it extends @AbstractSet@ interfaces and thus supports direct enumerating via @forEach@. It also has subset operation740 @range@ and it is possible to add to and remove from members of the set.741 @EnumSet@ supports more enumeration features, but it is not an enumeration type: it is a set of enumerators from a pre-define enum.742 743 744 702 An enumeration type cannot declare an array dimension nor can an enumerator be used as a subscript. 745 703 Enumeration inheritence is disallowed because an enumeration is \lstinline[language=Java]{final}. 
704 705 Java provides an @EnumSet@ where the underlying type is an efficient set of bits, one per enumeration \see{\Csharp \lstinline{Flags}, \VRef{s:Csharp}}, providing (logical) operations on groups of enumerators. 706 There is also a specialized version of @HashMap@ with enumerator keys, which has performance benefits. 707 746 708 747 709 \section{Rust} … … 771 733 adt = ADT::S(s); println!( "{:?}", adt ); 772 734 @match@ adt { 773 ADT::I( i ) $=>$println!( "{:}", i ),774 ADT::F( f ) $=>$println!( "{:}", f ),775 ADT::S( s ) $=>$println!( "{:} {:}", s.i, s.j ),735 ADT::I( i ) => println!( "{:}", i ), 736 ADT::F( f ) => println!( "{:}", f ), 737 ADT::S( s ) => println!( "{:} {:}", s.i, s.j ), 776 738 } 777 739 \end{rust} … … 795 757 let mut week : Week = Week::Mon; 796 758 match week { 797 Week::Mon $=>$println!( "Mon" ),759 Week::Mon => println!( "Mon" ), 798 760 ... 799 Week::Sun $=>$println!( "Sun" ),761 Week::Sun => println!( "Sun" ), 800 762 } 801 763 \end{rust} … … 851 813 Week::Mon | Week:: Tue | Week::Wed | Week::Thu 852 814 | Week::Fri => println!( "weekday" ), 853 Week::Sat | Week:: Sun $=>$println!( "weekend" ),815 Week::Sat | Week:: Sun => println!( "weekend" ), 854 816 } 855 817 \end{c++} 856 818 \end{tabular} 857 819 \end{cquote} 858 % However, there is no mechanism to iterate through an enumeration without casting to integral and positions versus values is not handled. 859 Like C/\CC, there is no mechanism to iterate through an enumeration. It can only be approximated 860 by a loop over a range of enumerator and will not work if enumerator values is a sequence of natural numbers. 820 However, there is no mechanism to iterate through an enumeration without casting to integral and positions versus values is not handled. 861 821 \begin{c++} 862 822 for d in Week::Mon as isize ..= Week::Sun as isize { … … 865 825 0 1 2 @3 4 5 6 7 8 9@ 10 11 12 13 866 826 \end{c++} 867 % An enumeration type cannot declare an array dimension nor as a subscript. 
868     - There is no direct way to harmonize an enumeration and another data structure. For example,
869     - there is no mapping from an enumerated type to an array type.
870     - In terms of extensibility, there is no mechanism to subset or inherit from an enumeration.
    827 + An enumeration type cannot declare an array dimension, nor can an enumerator be used as a subscript.
    828 + There is no mechanism to subset or inherit from an enumeration.
871 829
872 830
873 831   \section{Swift}
874     - \label{s:Swift}
    832 +
875 833   % https://www.programiz.com/swift/online-compiler
876     - Despite being named an enumeration, a Swift @enum@ is in fact an ADT: cases (enumerators) of an @enum@ can have heterogeneous types and be recursive.
877     - %Like Rust, Swift @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.
    834 +
    835 + Like Rust, Swift @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.
878 836   When @enum@ is an ADT, pattern matching is used to discriminate among the variant types.
879 837   \begin{cquote}
… …
917 875   \end{tabular}
918 876   \end{cquote}
919     - % Note, after an @adt@'s type is known, the enumerator is inferred without qualification, \eg @.I(3)@.
920     - Normally an enumeration case needs a type qualification. But in the example when pattern matching @adt@, which
921     - has a type @ADT@, the context provides that the cases refer to @ADT@'s cases and no explicit type qualification is required.
922     -
923     - % An enumeration is created when \emph{all} the enumerators are unit-type, which is like a scoped, opaque enumeration.
924     - Without type declarations for enumeration cases, the Swift enum syntax defines a unit-type enumeration, which is like a scoped, opaque enumeration.
    877 + Note, after an @adt@'s type is known, the enumerator is inferred without qualification, \eg @.I(3)@.
    878 +
    879 + An enumeration is created when \emph{all} the enumerators are unit-type, which is like a scoped, opaque enumeration.
925 880 \begin{swift} 926 881 enum Week { case Mon, Tue, Wed, Thu, Fri, Sat, Sun }; // unit-type 927 882 var week : Week = @Week.Mon@; 928 883 \end{swift} 929 % As well, it is possible to type \emph{all} the enumerators with a common type, and set different values for each enumerator; 930 % for integral types, there is auto-incrementing. 931 As well, it is possible to type associated values of enumeration cases with a common types. 932 When enumeration cases are typed with a common integral type, Swift auto-initialize enumeration cases following the same initialization scheme as C language. 933 If enumeration is typed with @string@, its cases are auto-initialized to case names (labels). 884 As well, it is possible to type \emph{all} the enumerators with a common type, and set different values for each enumerator; 885 for integral types, there is auto-incrementing. 934 886 \begin{cquote} 935 887 \setlength{\tabcolsep}{15pt} … … 981 933 \end{tabular} 982 934 \end{cquote} 983 Enumerating is accomplished by inheriting from @CaseIterable@ protocol, which has a static 984 @enum.allCases@ property that returns a collection of all the cases for looping over an enumeration type or variable. 985 Like \CFA, Swift's default enumerator output is the case name (label). An enumerator of a typed enumeration has attribute 986 @rawValue@ that return its case value. 935 Enumerating is accomplished by inheriting from @CaseIterable@ without any associated values. 987 936 \begin{swift} 988 937 enum Week: Comparable, @CaseIterable@ { … … 994 943 Mon Tue Wed Thu Fri Sat Sun 995 944 \end{swift} 996 997 945 The @enum.allCases@ property returns a collection of all the cases for looping over an enumeration type or variable (expensive operation). 946 947 A typed enumeration is accomplished by inheriting from any Swift type, and accessing the underlying enumerator value is done with the attribute @rawValue@. 
948 Type @Int@ has auto-incrementing from the previous enumerator; 949 type @String@ has auto-incrementing of the enumerator label. 998 950 \begin{cquote} 999 951 \setlength{\tabcolsep}{15pt} … … 1023 975 \end{cquote} 1024 976 1025 There is a safebidirectional conversion from typed enumerator to @rawValue@ and vice versa.977 There is a bidirectional conversion from typed enumerator to @rawValue@ and vice versa. 1026 978 \begin{swift} 979 var weekInt : WeekInt = WeekInt.Mon; 1027 980 if let opt = WeekInt( rawValue: 0 ) { // test optional return value 1028 print( opt.rawValue, opt ) // 0 Mon981 print( weekInt.rawValue, opt ) // 0 Mon 1029 982 } else { 1030 983 print( "invalid weekday lookup" ) 1031 984 } 1032 985 \end{swift} 1033 % Conversion from @rawValue@ to enumerator may fail (bad lookup), so the result is an optional value. 1034 In the previous exmaple, the initialization of @opt@ fails when there is no enumeration cases has value equals 0, resulting in a 1035 @nil@ value. Initialization from a raw value is considered a expensive operation because it requires a value lookup. 986 Conversion from @rawValue@ to enumerator may fail (bad lookup), so the result is an optional value. 987 1036 988 1037 989 \section{Python 3.13} … … 1052 1004 class Week(!Enum!): Mon = 1; Tue = 2; Wed = 3; Thu = 4; Fri = 5; Sat = 6; Sun = 7 1053 1005 \end{python} 1054 and/or explicitly auto-initialized with @auto@ method, \eg:1006 and/or explicitly auto-initialized, \eg: 1055 1007 \begin{python} 1056 1008 class Week(Enum): Mon = 1; Tue = 2; Wed = 3; Thu = 10; Fri = !auto()!; Sat = 4; Sun = !auto()! 1057 1009 Mon : 1 Tue : 2 Wed : 3 Thu : 10 Fri : !11! Sat : 4 Sun : !12! 1058 1010 \end{python} 1059 @auto@ is controlled by member @_generate_next_value_()@, which by default return one plus the highest value among enumerators, and can be overridden: 1011 where @auto@ increments by 1 from the previous @auto@ value \see{Go \lstinline[language=Go]{iota}, \VRef{s:Go}}. 
1012 @auto@ is controlled by member @_generate_next_value_()@, which can be overridden: 1060 1013 \begin{python} 1061 1014 @staticmethod … … 1244 1197 % https://dev.realworldocaml.org/runtime-memory-layout.html 1245 1198 1246 Like Swift (\VRef{s:Swift}) and Haskell (\VRef{s:AlgebraicDataType}), OCaml @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.1199 Like Haskell, OCaml @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration. 1247 1200 When @enum@ is an ADT, pattern matching is used to discriminate among the variant types. 1248 1201 \begin{cquote} … … 1283 1236 \end{tabular} 1284 1237 \end{cquote} 1285 % (Note, after an @adtv@'s type is know, the enumerator is inferred without qualification, \eg @I(3)@.) 1286 1238 (Note, after an @adtv@'s type is know, the enumerator is inferred without qualification, \eg @I(3)@.) 1287 1239 The type names are independent of the type value and mapped to an opaque, ascending, integral tag, starting from 0, supporting relational operators @<@, @<=@, @>@, and @>=@. 1288 1240 \begin{cquote} … … 1326 1278 1327 1279 While OCaml enumerators have an ordering following the definition order, they are not enumerable. 1328 To iterate over all enumerators, an OCaml type needs to derive from the @enumerate@ PPX (Pre-Preocessor eXtension), which appends a list of all enumerators to the program abstract syntax tree (AST). 1329 However, as stated in the documentation, @enumerate@ PPX does not guarantee the order of the list. 1330 PPX is beyond the scope of OCaml native language and it is a preprocessor directly modifying a parsed AST. In conclusion, there is no enumerating mechanism within the scope of OCaml language. 1280 To iterate over all enumerators, an OCaml type needs to derive from the @enumerate@ preprocessor, which appends a list of all enumerators to the program abstract syntax tree (AST). 
1281 However, the list of values may not persist in the defined ordering. 1282 As a consequence, there is no meaningful enumerating mechanism. 1283 1284 Enumeration subsetting is allowed but inheritance is restricted to classes not types. 1285 \begin{ocaml} 1286 type weekday = Mon | Tue | Wed | Thu | Fri 1287 type weekend = Sat | Sun 1288 type week = Weekday of weekday | Weekend of weekend 1289 let day : week = Weekend Sun 1290 \end{ocaml} 1331 1291 1332 1292 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% … … 1567 1527 opaque & \CM & & & \CM & \CM & & \CM & \CM & & & & \CM \\ 1568 1528 \hline 1569 typed & Int & Int & Int & H & U & H & U/H & U/H & H & Int & Int& U \\1529 typed & Int & Int & Integral & H & U & H & U/H & U/H & H & Int & Integral& U \\ 1570 1530 \hline 1571 1531 safety & \CM & \CM & & \CM & \CM & & \CM & \CM & & & \CM & \CM \\ … … 1573 1533 posn ordered & Implied & Implied & & \CM & & & & & & & & \CM \\ 1574 1534 \hline 1575 unique values & \CM & \CM & & \CM& & & & \CM & & & & \\1535 unique values & \CM & \CM & & & & & & \CM & & & & \\ 1576 1536 \hline 1577 auto-init & \CM & all or none & \CM & N/A& & \CM & \CM & \CM & \CM & \CM & \CM & \CM \\1537 auto-init & \CM & all or none & \CM & & & \CM & \CM & \CM & \CM & \CM & \CM & \CM \\ 1578 1538 \hline 1579 1539 (Un)Scoped & U & U & S & S & S & U & S & S & S & U & U/S & U/S \\ … … 1585 1545 arr. dim. & \CM & \CM & & & & & & & & & & \CM \\ 1586 1546 \hline 1587 subset & \CM & \CM & & & & & & & & & & \CM \\1547 subset & \CM & \CM & & \CM & & & & & & & & \CM \\ 1588 1548 \hline 1589 1549 superset & & & & & & & & & & & & \CM \\ … … 1599 1559 Position ordered is implied if the enumerator values must be strictly increasingly. 1600 1560 \item unique value: enumerators must have a unique value. 1601 \item auto-init: Values are auto-initializable by language specification. \\ 1602 It is not appliable to OCaml because OCaml enumeration has unit type. 
1561 \item auto-init: Values are auto-initializable by language specification, often being "+1" of the predecessor. 1603 1562 \item (Un)Scoped: U $\Rightarrow$ enumerators are projected into the containing scope. 1604 1563 S $\Rightarrow$ enumerators are contained in the enumeration scope and require qualification. -
libcfa/src/enum.hfa
r478dade ra933489b
  4   4
  5   5   forall( E ) trait Bounded {
  6     -     E lowerBound( void);
  7     -     E upperBound( void);
      6 +     E lowerBound();
      7 +     E upperBound();
  8   8   };
  9   9
… …
 54  54
 55  55   static inline
 56     - forall( E | CfaEnum(E) | Serial(E) ) {
     56 + forall( E | Serial(E) | CfaEnum(E) ) {
 57  57       int ?==?( E l, E r ) { return posn( l ) == posn( r ); } // relational operators
 58  58       int ?!=?( E l, E r ) { return posn( l ) != posn( r ); }
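The enum.hfa hunks above define the relational operators purely in terms of `posn`. As a rough illustration only, the following C sketch shows what that amounts to if a typed enumeration is lowered to a plain C `enum` plus label and value arrays, as described in the CFAenum.tex text of this changeset. The array names `E_labels`/`E_values` follow the removed Implementation section. The enumerators, their values, and the helper names `eq`/`ne` are hypothetical. This is not the code cfa-cc actually emits.

```c
#include <stdio.h>

/* Hypothetical lowering of:  enum(int) E { A = 10, B = 20, C = 30 };
   The values 10/20/30 are made up for illustration. */
enum E { A, B, C };                       /* C auto-initialization: enumerator == position */
static const char * const E_labels[] = { "A", "B", "C" };
static const int E_values[] = { 10, 20, 30 };

/* Attribute functions: the position is the representation; label and value are table lookups. */
static inline int posn( enum E e ) { return (int)e; }
static inline const char * label( enum E e ) { return E_labels[posn( e )]; }
static inline int value( enum E e ) { return E_values[posn( e )]; }

/* Counterpart of the Bounded trait: first and last enumerators. */
static inline enum E lowerBound( void ) { return A; }
static inline enum E upperBound( void ) { return C; }

/* Counterparts of ?==? and ?!=?: compare positions, never values. */
static inline int eq( enum E l, enum E r ) { return posn( l ) == posn( r ); }
static inline int ne( enum E l, enum E r ) { return posn( l ) != posn( r ); }

/* Walk the enumeration by position and print position, label, and value. */
static inline void print_enum( void ) {
    for ( int p = posn( lowerBound() ); p <= posn( upperBound() ); p += 1 ) {
        enum E e = (enum E)p;
        printf( "%d %s %d\n", posn( e ), label( e ), value( e ) );
    }
}
```

Because every operation goes through the position, comparing two enumerators never touches the value array, which is consistent with the statement that a \CFA enumeration object has the same size as a C enumeration.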