Context Navigation

← Previous Change
Next Change →

Changeset 29c8675 for doc/theses/jiada_liang_MMath

Timestamp:

Sep 9, 2024, 6:29:59 AM (3 months ago)

Author:

JiadaL <j82liang@…>

Branches:

Children:

Parents:

Message:

update thesis

Location:

doc/theses/jiada_liang_MMath

Files:

: 6 edited

CFAenum.tex (modified) (4 diffs)
Cenum.tex (modified) (3 diffs)
background.tex (modified) (8 diffs)
conclusion.tex (modified) (4 diffs)
intro.tex (modified) (2 diffs)
relatedwork.tex (modified) (33 diffs)

Legend:

: Unmodified
: Added
: Removed

doc/theses/jiada_liang_MMath/CFAenum.tex

-                      r9bb6c5f
+                      r29c8675
 enum E { A, B, C, D, @N@ };  // N == 4
 \end{cfa}
-The underlying representation of \CFA enumeration object is its position, saved as an integral type.
-Therefore, the size of a \CFA enumeration is consistent with a C enumeration.
-Attribute function @posn@ performs type substitution on an expression from \CFA type to an integral type.
-The label and value of an enumerator are stored in a global data structure for each enumeration, where attribute functions @label@/@value@ map an \CFA enumeration object to the corresponding data.
-These operations do not apply to C Enums because backward compatibility means the necessary backing data structures cannot be supplied.
 \section{Opaque Enumeration}
 …
 calling constructors happens at runtime (dynamic).
+\section{Implementation}
+\CFA-cc is is a transpiler that translates \CFA code into C, which can later be compiled by a C compiler.
+During the transpilation, \CFA-cc breaks a \CFA enumeration definition into a definition of a C enumeration with the same name and auxiliary arrays: a label array and a value array for a typed enumeration.
+For example:
+\begin{cfa}
+// CFA (source):
+enum(T) E { E1=t1, E2=t2, E3=t3 };
+\end{cfa}
+is compiled into:
+\begin{cfa}
+// C (transpiled by cfa-cc):
+enum E { E1, E2, E3 };
+const char * E_labels[3] = { "E1", "E2", "E3" };
+const T E_values [3] = { t1, t2, t3 };
+\end{cfa}
+The generated C enumeration will have enumerator values equals to their positions thanks to C's auto-initialization scheme. Notice that value and label arrays are dynamically allocated data structures that take up
+memory. If an enumeration is globally defined, the arrays are allocated in the @.data@ section and will be initialized before the program execution.
+Otherwise, if an enumeration has its definition in a local scope, these arrays will be allocated on the stack and be initialized when the program counter
+reaches the code location of the enumeration definition.
+% This bring a considerable overhead to the program, in terms of both execution time and storage.
+% An opaque enumeration has no overhead
+% for values, and it has been suggested as a future work to leave as an option to not generate the label array.
+Alongs with the enumeration defintion, \CFA-cc adds defintions of attribute functions: @posn@, @label@ and @value@:
+\begin{cfa}
+inline int posn( E e ) { return (int) e; }
+inline const * label( E e ) { return E_labels[ (int) e ]; }
+inline const * E_value( E e ) { return E_values[ (int) e ]; }
+\end{cfa}
+These functions are not implemented in \CFA code: they are Abstract Syntax Tree (AST) nodes appends to the Abstract Syntax Tree (AST).
+Notably, the AST subnode for the "cast to @int@" expression inside the functions is annotated as reinterpreted casts.
+In order words, the effect of a case is only to change the type of an expression, and it stops further reduction on the expression \see{\VRef{s:ValueConversion}}.
+Consequently, \CFA enumeration comes with space and runtime overhead, both for enumeration definition and function call to attribute functions. \CFA made efforts to reduce the runtime
+overhead on function calls by aggressively reducing @label()@ and @value()@ function calls on an enumeration constant to a constant expression. The interpreted casts are extraneous
+after type checking and removed in later steps. A @label()@ and @value()@ call on an enumeration variable is a lookup of an element of an array of constant values, and it is up to the
+C compiler to optimize its runtime. While OpaqueEnum is effectively an "opt-out" of the value overhead, it has been suggested that an option to "opt-out" from labels be added as well.
+A @label()@ function definition is still necessary to accomplish enumeration traits. But it will return an empty string for an enumeration label when "opt-out" or the enumerator name
+when it is called on an enumeration constant. It will allow a user not to pay the overhead for labels when the enumerator names of a particular enumerated type are not helpful.
 \section{Value Conversion}
+\label{s:ValueConversion}
 C has an implicit type conversion from an enumerator to its base type @int@.
 Correspondingly, \CFA has an implicit conversion from a typed enumerator to its base type, allowing typed enumeration to be seamlessly used as the value of its base type
 …
 enum(S) E { A, B, C, D };
 \end{cfa}
+The restriction on C's enumeration initializers being constant expression is relaxed on \CFA enumeration.
+Therefore, an enumerator initializer allows function calls like @?+?( S & s, one_t )@ and @?{}( S & s, zero_t )@.
+It is because the values of \CFA enumerators are not stored in the compiled enumeration body but in the @value@ array, which
+allows dynamic initialization.
 \section{Subset}
 …
 The enumerators in the @case@ clause use the enumerator position for testing.
 The prints use @label@ to print an enumerator's name.
 Finally, a loop enumerates through the planets computing the weight on each planet for a given earth mass.
+Finally, a loop enumerates through the planets computing the weight on each planet for a given Earth mass.
 The print statement does an equality comparison with an enumeration variable and enumerator (@p == MOON@).

doc/theses/jiada_liang_MMath/Cenum.tex

-                      r9bb6c5f
+                      r29c8675
 There is no mechanism in C to resolve these naming conflicts other than renaming one of the duplicates, which may be impossible if the conflict comes from system include-files.
 The \CFA type-system allows extensive overloading, including enumerators.
+The \CFA type-system allows extensive overloading, including enumerators. For example, enumerator First from E1 can exist at the scope as First from E2.
 Hence, most ambiguities among C enumerators are implicitly resolved by the \CFA type system, possibly without any programmer knowledge of the conflict.
 In addition, C Enum qualification is added, exactly like aggregate field-qualification, to disambiguate.
 \VRef[Figure]{f:EnumeratorVisibility} shows how resolution, qualification, and casting are used to disambiguate situations for enumerations @E1@ and @E2@.
+Aside, name shadowing in \CFA only happens when a name has been redefined with the exact same type. Because an enumeration define its type and enumerators in one definition,
+and enumeration cannot be changed afterward, shadowing an enumerator is not possible (it is impossible to have another @First@ with same type @E1@.).
 \begin{figure}
 …
 \end{cfa}
 % with feature unimplemented
 It is possible to toggle back to unscoped using the \CFA @with@ auto-qualification clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
+It is possible to introduce enumerators from a scoped enumeration to a block scope using the \CFA @with@ auto-qualification clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
 \begin{cfa}
 with ( @Week@, @RGB@ ) {                                $\C{// type names}$
 …
 \end{cfa}
 As in Section~\ref{s:CVisibility}, opening multiple scoped enumerations in a @with@ can result in duplicate enumeration names, but \CFA implicit type resolution and explicit qualification/casting handle this localized scenario.
 \section{Type Safety}

doc/theses/jiada_liang_MMath/background.tex

-                      r9bb6c5f
+                      r29c8675
 \section{C}
 As mentioned in \VRef{s:Aliasing}, it is common for C programmers to ``believe'' there are three equivalent forms of named constants.
+As mentioned in \VRef{s:Aliasing}, it is common for C programmers to believe there are three equivalent forms of named constants.
 \begin{clang}
 #define Mon 0
 …
 \item
 For @#define@, the programmer must explicitly manage the constant name and value.
 Furthermore, these C preprocessor macro names are outside the C type system and can incorrectly change random text in a program.
+Furthermore, these C preprocessor macro names are outside the C type system and can unintentionally change semantics of a program.
 \item
 The same explicit management is true for the @const@ declaration, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{
 …
 \end{clang}
 \item
 Only the @enum@ form is managed by the compiler, is part of the language type-system, works in all C constant-expression locations, and normally does not occupy storage.
+Only the @enum@ form is managed by the compiler, is part of the language type-system, works in all C constant-expression locations, and does not occupy storage.
 \end{enumerate}
 …
 \end{cquote}
 However, statically initialized identifiers cannot appear in constant-expression contexts, \eg @case@.
 Dynamically initialized identifiers may appear in initialization and array dimensions in @g++@, which allows variable-sized arrays on the stack.
+Dynamically initialized identifiers may appear in initialization and array dimensions, which allows variable-sized arrays on the stack.
 Again, this form of aliasing is not an enumeration.
 …
 Theoretically, a C enumeration \emph{variable} is an implementation-defined integral type large enough to hold all enumerator values.
 In practice, C defines @int@~\cite[\S~6.4.4.3]{C11} as the underlying type for enumeration variables, restricting initialization to integral constants, which have type @int@ (unless qualified with a size suffix).
 However, type @int@ is defined as:
+According to the C standard, type @int@ is defined as the following:
 \begin{quote}
 A ``plain'' @int@ object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range @INT_MIN@ to @INT_MAX@ as defined in the header @<limits.h>@).~\cite[\S~6.2.5(5)]{C11}
 \end{quote}
 However, @int@ means 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture.
 Whereas @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.
+% Whereas @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.
 \VRef[Figure]{f:gccEnumerationStorageSize} shows both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based on its initialization.
 Hence, initialization in the range @INT_MIN@..@INT_MAX@ results in a 4-byte enumerator, and outside this range, the enumerator is 8 bytes.
 Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than is containing enumeration type, which seems inconsistent, \eg @sizeof( typeof( 3 ) ) == sizeof( int )@.
+Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than its containing enumeration type, which seems inconsistent.
 \begin{figure}
 …
+}
 4
+8
 -2147483648 2147483647
 -9223372036854775808 9223372036854775807
 …
 \label{s:Usage}
 C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type and between two different enumerations.
+C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type and between different enumerations.
 \begin{clang}
 enum Week week = Mon;                           $\C{// week == 0}$
 …
 Virtually all programming languages overload the arithmetic operators across the basic types using the number and type of parameters and returns.
 Like \CC, \CFA also allows these operators to be overloaded with user-defined types.
 The syntax for operator names uses the @'?'@ character to denote a parameter, \eg prefix and infix increment operators: @?++@, @++?@, and @?+?@.
+The syntax for operator names uses the @'?'@ character to denote a parameter, \eg unary operators: @?++@, @++?@, binary operator @?+?@.
 \begin{cfa}
 struct S { int i, j };

doc/theses/jiada_liang_MMath/conclusion.tex

-                      r9bb6c5f
+                      r29c8675
 \label{c:conclusion}
 The goal of this work is to extend the simple and unsafe enumeration type in the C programming language into a complex and safe enumeration type in the \CFA programming language while maintaining backward compatibility with C.
+This work aims to extend the simple and unsafe enumeration type in the C programming language into a complex and safe enumeration type in the \CFA programming language while maintaining backward compatibility with C.
 Within this goal, the new \CFA enumeration should align with the analogous enumeration features in other languages to match modern programming expectations.
 Hence, the \CFA enumeration features are borrowed from a number of programming languages, but engineered to work and play with \CFA's type system and feature set.
 …
 Strong type-checking of enumeration initialization and assignment provides additional safety, ensuring an enumeration only contains its enumerators.
 Overloading and scoping of enumerators significantly reduces the naming problem, providing a better software-engineering environment, with fewer name clashes and the ability to disambiguate those that cannot be implicitly resolved.
 Typed enumerations solve the data-harmonization problem increasing safety through better software engineering.
+Typed enumerations solve the data-harmonization problem, increasing safety through better software engineering.
 Moreover, integrating enumerations with existing control structures provides a consistent upgrade for programmers and a succinct and secure mechanism to enumerate with the new loop-range feature.
 Generalization and reuse are supported by incorporating the new enumeration type using the \CFA trait system.
 Enumeration traits define the meaning of an enumeration, allowing functions to be written that work on any enumeration, such as the reading and printing an enumeration.
 Using advanced duck typing, existing C enumerations can be extended so they work with all of the enumeration features, providing for legacy C code to be moved forward into the modern \CFA programming domain.
 Finally, I expanded the \CFA project's test-suite with multiple enumeration features tests with respect to implicit conversions, control structures, inheritance, interaction with the polymorphic types, and the features built on top of enumeration traits.
+Enumeration traits define the meaning of an enumeration, allowing functions to be written that work on any enumeration, such as the reading and printing of an enumeration.
+With advanced structural typing, C enumerations can be extended so they work with all of the enumeration features, providing for legacy C code to be moved forward into the modern \CFA programming domain.
+Finally, the \CFA project's test suite has been expanded with multiple enumeration features tests with respect to implicit conversions, control structures, inheritance, interaction with the polymorphic types, and the features built on top of enumeration traits.
 These tests ensure future \CFA work does not accidentally break the new enumeration system.
 The conclusion is that the new \CFA enumeration mechanisms achieve the initial goals, providing C programmers with an intuitive enumeration mechanism for handling modern programming requirements.
+In summary, the new \CFA enumeration mechanisms achieve the initial goals, providing C programmers with an intuitive enumeration mechanism for handling modern programming requirements.
 …
 \end{cfa}
 \item
 Currently enumeration scoping is all or nothing. In some cases, it might be useful to
+Currently, enumeration scoping is all or nothing. In some cases, it might be useful to
 increase the scoping granularity to individual enumerators.
 \begin{cfa}
 …
 typedef RGB.Red OtherRed; // alias
 \end{cfa}
+\item
+Label arrays are auxiliary data structures that are always generated for \CFA enumeration, which is a considerable program overhead.
+It is helpful to provide a new syntax or annotation for a \CFA enumeration definition that tells the compiler the labels will not be used
+throughout the execution. Therefore, \CFA optimizes the label array away. The @value()@ function can still be used on an enumeration constant,
+and the function called is reduced to a @char *@ constant expression that holds the name of the enumerator. But if @value()@ is called on
+a variable with an enumerated type, it returns an empty string since the label information is lost for the runtime.
 \end{enumerate}

doc/theses/jiada_liang_MMath/intro.tex

-                      r9bb6c5f
+                      r29c8675
 The alias names are constants, which follow transitively from their binding to other constants.
 \item
 Defines a type for generating instants (variables).
+Defines a type for generating instances (variables).
 \item
 For safety, an enumeration instance should be restricted to hold only its constant names.
 …
 % https://hackage.haskell.org/package/base-4.19.1.0/docs/GHC-Enum.html
+The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@.
+Note, the unit type is not the same as \lstinline{void}.
+\begin{cfa}
+void foo( void );
+struct unit {} u;       $\C[1.5in]{// empty type}$
+unit bar( unit );
+foo( @foo()@ );         $\C{// void argument does not match with void parameter}$
+bar( bar( u ) );        $\C{// unit argument does match with unit parameter}\CRT$
+\end{cfa}
+% The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@.
+% Note, the unit type is not the same as \lstinline{void}.
+In terms of functional programming linguistics, enumerations often refers to a @unit type@ ADT, where @unit type@ is a type
+that carry no information.
+% \begin{cfa}
+% void foo( void );
+% struct unit {} u;     $\C[1.5in]{// empty type}$
+% unit bar( unit );
+% foo( @foo()@ );               $\C{// void argument does not match with void parameter}$
+% bar( bar( u ) );      $\C{// unit argument does match with unit parameter}\CRT$
+% \end{cfa}
 For example, in the Haskell ADT:

doc/theses/jiada_liang_MMath/relatedwork.tex

-                      r9bb6c5f
+                      r29c8675
 \chapter{Related Work}
 \label{s:RelatedWork}
-\begin{comment}
-An algebraic data type (ADT) can be viewed as a recursive sum of product types.
-A sum type lists values as members.
-A member in a sum type definition is known as a data constructor.
-For example, C supports sum types union and enumeration (enum).
-An enumeration in C can be viewed as the creation of a list of zero-arity data constructors.
-A union instance holds a value of one of its member types.
-Defining a union does not generate new constructors.
-The definition of member types and their constructors are from the outer lexical scope.
-In general, an \newterm{algebraic data type} (ADT) is a composite type, \ie, a type formed by combining other types.
-Three common classes of algebraic types are \newterm{array type}, \ie homogeneous types, \newterm{product type}, \ie heterogeneous tuples and records (structures), and \newterm{sum type}, \ie tagged product-types (unions).
-Enumerated types are a special case of product/sum types with non-mutable fields, \ie initialized (constructed) once at the type's declaration, possible restricted to compile-time initialization.
-Values of algebraic types are access by subscripting, field qualification, or type (pattern) matching.
-\end{comment}
 Enumeration-like features exist in many popular programming languages, both past and present, \eg Pascal~\cite{Pascal}, Ada~\cite{Ada}, \Csharp~\cite{Csharp}, OCaml~\cite{OCaml} \CC, Go~\cite{Go}, Haskell~\cite{Haskell} \see{discussion in \VRef{s:AlgebraicDataType}}, Java~\cite{Java}, Rust~\cite{Rust}, Swift~\cite{Swift}, Python~\cite{Python}.
 …
 type Boolean = ( false, true );
 \end{pascal}
 The enumeration ordering supports the relational operators @=@, @<>@, @<@, @<=@, @>=@, and @>@, provided both operands are the same (sub)type.
+The enumeration supports the relational operators @=@, @<>@, @<@, @<=@, @>=@, and @>@, interpreted as as comparison in terms of declaration order.
 The following auto-generated pseudo-functions exist for all enumeration types:
 …
          wend : Weekend;
 \end{pascal}
 Hence, the ordering of the enumerators is crucial to provide the necessary ranges.
+Hence, declaration order of enumerators is crucial to provide the necessary ranges.
 There is a bidirectional assignment between the enumeration and its subranges.
 \begin{pascal}
 …
 The underlying type is an implementation-defined integral type large enough to hold all enumerated values; it does not have to be the smallest possible type.
 The integral size can be explicitly specified using compiler directive @$PACKENUM@~$N$, where $N$ is the number of bytes, \eg:
+The integral size can be explicitly specified using compiler directive \$@PACKENUM@~$N$, where $N$ is the number of bytes, \eg:
 \begin{pascal}
 Type @{$\color{red}\$$PACKENUM 1}@ SmallEnum = ( one, two, three );
 …
 \end{figure}
 Enumerators without initialization are auto-initialized from left to right, starting at zero, incrementing by 1.
+Enumerators without initialization are auto-initialized from left to right, starting at zero and incrementing by 1.
 Enumerators with initialization must set \emph{all} enumerators in \emph{ascending} order, \ie there is no auto-initialization.
 \begin{ada}
 …
 \end{c++}
 \CC{11} added a scoped enumeration, \lstinline[language=c++]{enum class} (or \lstinline[language=c++]{enum struct})\footnote{
 The use of keyword \lstinline[language=c++]{class} is resonable because default visibility is \lstinline[language=c++]{private} (scoped).
+The use of keyword \lstinline[language=c++]{class} is reasonable because default visibility is \lstinline[language=c++]{private} (scoped).
 However, default visibility for \lstinline[language=c++]{struct} is \lstinline[language=c++]{public} (unscoped) making it an odd choice.},
 where the enumerators are accessed using type qualification.
 …
 E e = A;    e = B;                                              $\C{// direct access}$
 \end{c++}
 \CC{11} added the ability to explicitly declare only an underlying \emph{integral} type for \lstinline[language=c++]{enum class}.
+\CC{11} added the ability to explicitly declare an underlying \emph{integral} type for \lstinline[language=c++]{enum class}.
 \begin{c++}
 enum class RGB @: long@ { Red, Green, Blue };
 …
 \end{tabular}
 \end{cquote}
+However, there is no mechanism to iterate through an enumeration without an unsafe cast and it does not understand the enumerator values.
+However, there is no mechanism to iterate through an enumeration.
+A common workaround is to iterate over enumerator as integral values, but it only works if
+enumerators resemble a sequence of natural, i.e., enumerators are auto-initialized.
+Otherwises, the iteration would have integers that are not enumeration values.
 \begin{c++}
 enum Week { Mon, Tue, Wed, Thu = 10, Fri, Sat, Sun };
 …
 % https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/enums
 \Csharp is a dynamically-typed programming language with a scoped, integral enumeration similar to \CC \lstinline[language=C++]{enum class}.
+\Csharp is a programming language with a scoped, integral enumeration similar to \CC \lstinline[language=C++]{enum class}.
 \begin{csharp}
 enum Week : @long@ { Mon, Tue, Wed, Thu@ = 10@, Fri, Sat, Sun }
 …
 \end{cquote}
+To indirectly enumerate, \Csharp's Enum library has @Enum.GetValues@, a pseudo-method that retrieves an array of the enumeration constants for looping over an enumeration type or variable (expensive operation).
+To indirectly enumerate, \Csharp's Enum library provides @Enum.GetValues@,
+% a pseudo-method that retrieves an array of the enumeration constants for looping over an enumeration type or variable (expensive operation).
+a static memeber of abstract Enum type that return a reference to an array of all enumeration constants.
+Internally, a Enum type has a static member called @fieldInfoHash@, a @Hashtable@ that stores enumerators information. The field is populated on-demand:
+it only contains information if a @reflection@ like @GetValues@ is called. But the information will be cached, so the cost of reflection is paid only
+once throughout the lifetime of a program. @GetValues@ then converts a @Hashtable@ to an @Array@, which supports enumerating.
 \begin{csharp}
 foreach ( Week d in @Enum.GetValues@( typeof(Week) ) ) {
 …
 \label{s:Go}
+Go has a no enumeration.
+It has @const@ aliasing declarations, similar to \CC \see{\VRef{s:C++RelatedWork}}, for basic types with type inferencing and static initialization (constant expression).
+Go has @const@ aliasing declarations, similar to \CC \see{\VRef{s:C++RelatedWork}}, for basic types with type inferencing and static initialization (constant expression).
+The most basic form of constant definition is a @const@ keyword, followed by the name of constant, an optional type declaration of the constant, and a mandatory initialize.
+For exmaple:
 \begin{Go}
 const R @int@ = 0;  const G @uint@ = 1;  const B = 2; $\C{// explicit typing and type inferencing}$
 …
 const V = 3.1;  const W = 3.1;
 \end{Go}
+Since these declarations are immutable variables, they are unscoped and Go has no overloading.
+Go provides an enumeration-like feature to group together @const@ declaration into a block and introduces a form of auto-initialization.
+Since these declarations are immutable variables, they are unscoped and Go has no overloading. If no type declaration provided, Go infers
+type from the initializer expression.
+% Go provides an enumeration-like feature to group together @const@ declaration into a block and introduces a form of auto-initialization.
+These named constants can be grouped together in one @const@ declaration block and introduces a form of auto-initialization.
 \begin{Go}
 const ( R = 0; G; B )                                   $\C{// implicit initialization: 0 0 0}$
 const ( Fred = "Fred"; Mary = "Mary"; Jane = "Jane" ) $\C{// explicit initialization: Fred Mary Jane}$
 const ( S = 0; T; USA = "USA"; U; V = 3.1; W ) $\C{// type change, implicit/explicit: 0 0 USA USA 3.1 3.1}$
+const ( S = 0; T; USA = "USA"; U; V = 3.1; W ) $\C{// implicit/explicit: 0 0 USA USA 3.1 3.1}$
 \end{Go}
 The first identifier \emph{must} be explicitly initialized;
 subsequent identifiers can be implicitly or explicitly initialized.
+Implicit initialization is the \emph{previous} (predecessor) identifier value.
+Each @const@ declaration provides an implicit integer counter starting at zero, called \lstinline[language=Go]{iota}.
+Implicit initialization always uses the \emph{previous} (predecessor) constant expression initializer.
+% Each @const@ declaration provides an implicit integer counter starting at zero, called \lstinline[language=Go]{iota}.
+Each const declaration is often paired with a const expression \lstinline[language=Go]{iota} to re-define its
+implicit initialization. \lstinline[language=Go]{iota} represents an sequence of natural number starting from zero.
+Every implicit or explicit \lstinline[language=Go]{iota} increments the value of the expression by one.
 Using \lstinline[language=Go]{iota} outside of a @const@ block always sets the identifier to zero.
 \begin{Go}
 const R = iota;                                                 $\C{// 0}$
 \end{Go}
+Inside a @const@ block, \lstinline[language=Go]{iota} is implicitly incremented for each \lstinline[language=golang]{const} identifier and used to initialize the next uninitialized identifier.
+% Inside a @const@ block, \lstinline[language=Go]{iota} is implicitly incremented for each \lstinline[language=golang]{const} identifier and used to initialize the next uninitialized identifier.
+Inside a @const@ block, if a constant has \lstinline[language=Go]{iota} initializer, its successor will also use \lstinline[language=Go]{iota} initializer.
+\lstinline[language=Go]{iota} is no different than other constant expression when it is used in implicit initialization, but
+thanks to the increment natural of \lstinline[language=Go]{iota}, the successor will have a value equal to its predecessor plus 1.
 \begin{Go}
 const ( R = @iota@; G; B )                              $\C{// implicit: 0 1 2}$
+% const ( C = @iota + B + 1@; G; Y )            $\C{// implicit: 3 4 5}$
+\end{Go}
+The constant blocks from the previous example is equivalanet to:
+\begin{Go}
+const ( R = @iota@; G=@iota@; B=@iota@ )                                $\C{// implicit: 0 1 2}$
+\end{Go}
+R, G, B have values 0, 1, 2, respectively, because \lstinline[language=Go]{iota} is an increasing.
+Similarly,
+\begin{Go}
 const ( C = @iota + B + 1@; G; Y )              $\C{// implicit: 3 4 5}$
 \end{Go}
+An underscore \lstinline[language=golang]{const} identifier advances \lstinline[language=Go]{iota}.
+can be rewritten as:
 \begin{Go}
 const ( O1 = iota + 1; @_@; O3; @_@; O5 ) // 1, 3, 5
+const ( C = @iota + B + 1@; G = @iota + B + 1@; Y = @iota + B + 1@ )            $\C{// implicit: 3 4 5}$
 \end{Go}
+Auto-initialization reverts from \lstinline[language=Go]{iota} to the previous value after an explicit initialization, but auto-incrementing of \lstinline[language=Go]{iota} continues.
+Go's grouped constants do not define a new type, and constants in the same block can have heterogeneous types.
+These two characteristics differs a grouped constant from an enumeration, but also gives a direction on approximating enumeration in Go:
+first to define a new type externally, and make sure all constants in the same group will have the new type.
 \begin{Go}
+const ( Mon = iota; Tue; Wed; // 0, 1, 2
+                @Thu = 10@; Fri; Sat; @Sun = itoa@ ) $\C{// 10, 10, 10, {\color{red}6}}$
+type Language int64
+const (
+        C Language = iota
+        CPP
+        CSharp
+        CFA
+        Go
+)
 \end{Go}
+Auto-initialization from \lstinline[language=Go]{iota} is restarted and \lstinline[language=Go]{iota} reinitialized with an expression containing at most \emph{one} \lstinline[language=Go]{iota}.
+\begin{Go}
+const ( V1 = iota; V2; @V3 = 7;@ V4 = @iota@ + 1; V5 ) // 0 1 7 4 5
+const ( Mon = iota; Tue; Wed; // 0, 1, 2
+                @Thu = 10;@ Fri = @iota@ - Wed + Thu - 1; Sat; Sun ) // 10, 11, 12, 13
+\end{Go}
+Here, @V4@ and @Fri@ restart auto-incrementing from \lstinline[language=Go]{iota} and reset \lstinline[language=Go]{iota} to 4 and 11, respectively, because of the initialization expressions containing \lstinline[language=Go]{iota}.
+Note, because \lstinline[language=Go]{iota} is incremented for an explicitly initialized identifier or @_@,
+at @Fri@ \lstinline[language=Go]{iota} is 4 requiring the minus one to compute the value for @Fri@.
+By typing the first constant as @Language@ and assigning initializer with \lstinline[language=Go]{iota}, all other constants will have the same type
+and the same initialzer. It is a close approximation, but it is not a real enumeration. The definition of the "enumerated type" is separate from
+the "enumerator definition", and nothing stop outside constants to have the type @Language@.
+% An underscore \lstinline[language=golang]{const} identifier advances \lstinline[language=Go]{iota}.
+% \begin{Go}
+% const ( O1 = iota + 1; @_@; O3; @_@; O5 ) // 1, 3, 5
+% \end{Go}
+% Auto-initialization reverts from \lstinline[language=Go]{iota} to the previous value after an explicit initialization, but auto-incrementing of \lstinline[language=Go]{iota} continues.
+% \begin{Go}
+% const ( Mon = iota; Tue; Wed; // 0, 1, 2
+%               @Thu = 10@; Fri; Sat; @Sun = itoa@ ) $\C{// 10, 10, 10, {\color{red}6}}$
+% \end{Go}
+% Auto-initialization from \lstinline[language=Go]{iota} is restarted and \lstinline[language=Go]{iota} reinitialized with an expression containing at most \emph{one} \lstinline[language=Go]{iota}.
+% \begin{Go}
+% const ( V1 = iota; V2; @V3 = 7;@ V4 = @iota@ + 1; V5 ) // 0 1 7 4 5
+% const ( Mon = iota; Tue; Wed; // 0, 1, 2
+%               @Thu = 10;@ Fri = @iota@ - Wed + Thu - 1; Sat; Sun ) // 10, 11, 12, 13
+% \end{Go}
+% Here, @V4@ and @Fri@ restart auto-incrementing from \lstinline[language=Go]{iota} and reset \lstinline[language=Go]{iota} to 4 and 11, respectively, because of the initialization expressions containing \lstinline[language=Go]{iota}.
+% Note, because \lstinline[language=Go]{iota} is incremented for an explicitly initialized identifier or @_@,
+% at @Fri@ \lstinline[language=Go]{iota} is 4 requiring the minus one to compute the value for @Fri@.
 Basic switch and looping are possible.
 …
 \end{tabular}
 \end{cquote}
 However, the loop prints the values from 0 to 13 because there is no actual enumeration.
+However, the loop in this example prints the values from 0 to 13 because there is no actual enumeration.
 A constant variable can be used as an array dimension or a subscript.
 …
 Week day = Week.Sat;
 \end{Java}
 The enumerator's members are scoped and cannot be made \lstinline[language=java]{public}, hence requiring qualification.
+The enumerator's members are scoped and requires qualification.
 The value of an enumeration instance is restricted to its enumerators.
 …
 If the implementation member is \lstinline[language=Java]{public}, the enumeration is unsafe, as any value of the underlying type can be assigned to it, \eg @day = 42@.
 The implementation constructor must be private since it is only used internally to initialize the enumerators.
 Initialization occurs at the enumeration-type declaration for each enumerator in the first line.
+Initialization occurs at the enumeration-type declaration.
 Enumerations can be used in the @if@ and @switch@ statements but only for equality tests.
 …
 There are no arithmetic operations on enumerations, so there is no arithmetic way to iterate through an enumeration without making the implementation type \lstinline[language=Java]{public}.
+Like \Csharp, looping over an enumeration is done using method @values@, which returns an array of enumerator values (expensive operation).
+Like \Csharp, looping over an enumeration is done using static method @values@, which returns an array of enumerator values.
+Unfortunately, @values@ is an expensive @O(n)@ operation because it creates a new array every time it is called.
 \begin{Java}
 for ( Week d : Week.values() ) {
 …
 Like \Csharp, enumerating is supplied indirectly through another enumerable type, not via the enumeration.
+% Java provides an @EnumSet@ where the underlying type is an efficient set of bits, one per enumeration \see{\Csharp \lstinline{Flags}, \VRef{s:Csharp}}, providing (logical) operations on groups of enumerators.
+% There is also a specialized version of @HashMap@ with enumerator keys, which has performance benefits.
+Java provides @EnumSet@, an auxiliary data structure that takes an enum @class@ as parameter (Week.class) for its construction, and it contains members only with the supplied
+enum type. @EnumSet@ is enumerable because it extends @AbstractSet@ interfaces and thus supports direct enumerating via @forEach@. It also has subset operation
+@range@ and it is possible to add to and remove from members of the set.
+@EnumSet@ supports more enumeration features, but it is not an enumeration type: it is a set of enumerators from a pre-define enum.
 An enumeration type cannot declare an array dimension nor can an enumerator be used as a subscript.
 Enumeration inheritence is disallowed because an enumeration is \lstinline[language=Java]{final}.
-Java provides an @EnumSet@ where the underlying type is an efficient set of bits, one per enumeration \see{\Csharp \lstinline{Flags}, \VRef{s:Csharp}}, providing (logical) operations on groups of enumerators.
-There is also a specialized version of @HashMap@ with enumerator keys, which has performance benefits.
 \section{Rust}
 …
 adt = ADT::S(s);  println!( "{:?}", adt );
 @match@ adt {
         ADT::I( i ) => println!( "{:}", i ),
         ADT::F( f ) => println!( "{:}", f ),
         ADT::S( s ) => println!( "{:} {:}", s.i, s.j ),
+        ADT::I( i ) $=>$ println!( "{:}", i ),
+        ADT::F( f ) $=>$ println!( "{:}", f ),
+        ADT::S( s ) $=>$ println!( "{:} {:}", s.i, s.j ),
+}
 \end{rust}
 …
 let mut week : Week = Week::Mon;
 match week {
         Week::Mon => println!( "Mon" ),
+        Week::Mon $=>$ println!( "Mon" ),
         ...
         Week::Sun => println!( "Sun" ),
+        Week::Sun $=>$ println!( "Sun" ),
+}
 \end{rust}
 …
         Week::Mon | Week:: Tue | Week::Wed | Week::Thu
                 | Week::Fri => println!( "weekday" ),
         Week::Sat | Week:: Sun => println!( "weekend" ),
+        Week::Sat | Week:: Sun $=>$ println!( "weekend" ),
+}
 \end{c++}
 \end{tabular}
 \end{cquote}
+However, there is no mechanism to iterate through an enumeration without casting to integral and positions versus values is not handled.
+% However, there is no mechanism to iterate through an enumeration without casting to integral and positions versus values is not handled.
+Like C/\CC, there is no mechanism to iterate through an enumeration. It can only be approximated
+by a loop over a range of enumerator and will not work if enumerator values is a sequence of natural numbers.
 \begin{c++}
 for d in Week::Mon as isize ..= Week::Sun as isize {
 …
 1 2 @3 4 5 6 7 8 9@ 10 11 12 13
 \end{c++}
+An enumeration type cannot declare an array dimension nor as a subscript.
+There is no mechanism to subset or inherit from an enumeration.
+% An enumeration type cannot declare an array dimension nor as a subscript.
+There is no direct way to harmonize an enumeration and another data structure. For example,
+there is no mapping from an enumerated type to an array type.
+In terms of extensibility, there is no mechanism to subset or inherit from an enumeration.
 \section{Swift}
+\label{s:Swift}
 % https://www.programiz.com/swift/online-compiler
 Like Rust, Swift @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.
+Despite being named as enumeration, A Swift @enum@ is in fact a ADT: cases (enumerators) of an @enum@ can have heterogeneous types and be recursive.
+% Like Rust, Swift @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.
 When @enum@ is an ADT, pattern matching is used to discriminate among the variant types.
 \begin{cquote}
 …
 \end{tabular}
 \end{cquote}
+Note, after an @adt@'s type is know, the enumerator is inferred without qualification, \eg @.I(3)@.
+An enumeration is created when \emph{all} the enumerators are unit-type, which is like a scoped, opaque enumeration.
+% Note, after an @adt@'s type is know, the enumerator is inferred without qualification, \eg @.I(3)@.
+Normally an enumeration case needs a type qualification. But in the example when pattern matching @adt@, which
+has a type @ADT@, the context provides that the cases refer to @ADT@'s cases and no explicit type qualification is required.
+% An enumeration is created when \emph{all} the enumerators are unit-type, which is like a scoped, opaque enumeration.
+Without type declaration for enumeration cases, a Swift enum syntax defined a unit-type enumeration, which is like a scoped, opaque enumeration.
 \begin{swift}
 enum Week { case Mon, Tue, Wed, Thu, Fri, Sat, Sun }; // unit-type
 var week : Week = @Week.Mon@;
 \end{swift}
+As well, it is possible to type \emph{all} the enumerators with a common type, and set different values for each enumerator;
+for integral types, there is auto-incrementing.
+% As well, it is possible to type \emph{all} the enumerators with a common type, and set different values for each enumerator;
+% for integral types, there is auto-incrementing.
+As well, it is possible to type associated values of enumeration cases with a common types.
+When enumeration cases are typed with a common integral type, Swift auto-initialize enumeration cases following the same initialization scheme as C language.
+If enumeration is typed with @string@, its cases are auto-initialized to case names (labels).
 \begin{cquote}
 \setlength{\tabcolsep}{15pt}
 …
 \end{tabular}
 \end{cquote}
+Enumerating is accomplished by inheriting from @CaseIterable@ without any associated values.
+Enumerating is accomplished by inheriting from @CaseIterable@ protocol, which has a static
+@enum.allCases@ property that returns a collection of all the cases for looping over an enumeration type or variable.
+Like \CFA, Swift's default enumerator output is the case name (label). An enumerator of a typed enumeration has attribute
+@rawValue@ that return its case value.
 \begin{swift}
 enum Week: Comparable, @CaseIterable@ {
 …
 Mon Tue Wed Thu Fri Sat Sun
 \end{swift}
+The @enum.allCases@ property returns a collection of all the cases for looping over an enumeration type or variable (expensive operation).
+A typed enumeration is accomplished by inheriting from any Swift type, and accessing the underlying enumerator value is done with the attribute @rawValue@.
+Type @Int@ has auto-incrementing from the previous enumerator;
+type @String@ has auto-incrementing of the enumerator label.
 \begin{cquote}
 \setlength{\tabcolsep}{15pt}
 …
 \end{cquote}
 There is a bidirectional conversion from typed enumerator to @rawValue@ and vice versa.
+There is a safe bidirectional conversion from typed enumerator to @rawValue@ and vice versa.
 \begin{swift}
-var weekInt : WeekInt = WeekInt.Mon;
 if let opt = WeekInt( rawValue: 0 ) {  // test optional return value
         print( weekInt.rawValue, opt )  // 0 Mon
+        print( opt.rawValue, opt )  // 0 Mon
 } else {
         print( "invalid weekday lookup" )
+}
 \end{swift}
+Conversion from @rawValue@ to enumerator may fail (bad lookup), so the result is an optional value.
+% Conversion from @rawValue@ to enumerator may fail (bad lookup), so the result is an optional value.
+In the previous exmaple, the initialization of @opt@ fails when there is no enumeration cases has value equals 0, resulting in a
+@nil@ value. Initialization from a raw value is considered a expensive operation because it requires a value lookup.
 \section{Python 3.13}
 …
 class Week(!Enum!): Mon = 1; Tue = 2; Wed = 3; Thu = 4; Fri = 5; Sat = 6; Sun = 7
 \end{python}
 and/or explicitly auto-initialized, \eg:
+and/or explicitly auto-initialized with @auto@ method, \eg:
 \begin{python}
 class Week(Enum): Mon = 1; Tue = 2; Wed = 3; Thu = 10; Fri = !auto()!; Sat = 4; Sun = !auto()!
 Mon : 1 Tue : 2 Wed : 3 Thu : 10 Fri : !11! Sat : 4 Sun : !12!
 \end{python}
+where @auto@ increments by 1 from the previous @auto@ value \see{Go \lstinline[language=Go]{iota}, \VRef{s:Go}}.
+@auto@ is controlled by member @_generate_next_value_()@, which can be overridden:
+@auto@ is controlled by member @_generate_next_value_()@, which by default return one plus the highest value among enumerators, and can be overridden:
 \begin{python}
 @staticmethod
 …
 % https://dev.realworldocaml.org/runtime-memory-layout.html
 Like Haskell, OCaml @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.
+Like Swift (\VRef{s:Swift}) and Haskell (\VRef{s:AlgebraicDataType}), OCaml @enum@ provides two largely independent mechanisms from a single language feature: an ADT and an enumeration.
 When @enum@ is an ADT, pattern matching is used to discriminate among the variant types.
 \begin{cquote}
 …
 \end{tabular}
 \end{cquote}
+(Note, after an @adtv@'s type is know, the enumerator is inferred without qualification, \eg @I(3)@.)
+% (Note, after an @adtv@'s type is know, the enumerator is inferred without qualification, \eg @I(3)@.)
 The type names are independent of the type value and mapped to an opaque, ascending, integral tag, starting from 0, supporting relational operators @<@, @<=@, @>@, and @>=@.
 \begin{cquote}
 …
 While OCaml enumerators have an ordering following the definition order, they are not enumerable.
+To iterate over all enumerators, an OCaml type needs to derive from the @enumerate@ preprocessor, which appends a list of all enumerators to the program abstract syntax tree (AST).
+However, the list of values may not persist in the defined ordering.
+As a consequence, there is no meaningful enumerating mechanism.
+Enumeration subsetting is allowed but inheritance is restricted to classes not types.
+\begin{ocaml}
+type weekday = Mon | Tue | Wed | Thu | Fri
+type weekend = Sat | Sun
+type week = Weekday of weekday | Weekend of weekend
+let day : week = Weekend Sun
+\end{ocaml}
+To iterate over all enumerators, an OCaml type needs to derive from the @enumerate@ PPX (Pre-Preocessor eXtension), which appends a list of all enumerators to the program abstract syntax tree (AST).
+However, as stated in the documentation, @enumerate@ PPX does not guarantee the order of the list.
+PPX is beyond the scope of OCaml native language and it is a preprocessor directly modifying a parsed AST. In conclusion, there is no enumerating mechanism within the scope of OCaml language.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 opaque                  & \CM    &                              &                  & \CM    & \CM   &             & \CM         & \CM                   &               &               &               & \CM   \\
 \hline
 typed                   & Int    & Int                  & Integral  & H     & U     & H       & U/H         & U/H           & H     & Int       & Integral& U   \\
+typed                   & Int    & Int                  & Int     & H     & U     & H       & U/H         & U/H           & H       & Int       & Int   & U     \\
 \hline
 safety          & \CM   & \CM                   &          & \CM        & \CM   &                 & \CM                 & \CM                   &               &               & \CM   & \CM   \\
 …
 posn ordered    & Implied & Implied     &          & \CM    &       &             &                         &                   &       &           &       & \CM       \\
 \hline
 unique values   & \CM   & \CM           &           &           &       &      &                            & \CM               &       &           &       &     \\
+unique values   & \CM   & \CM           &           &\CM        &       &      &                            & \CM               &       &           &       &     \\
 \hline
 auto-init               & \CM   & all or none   & \CM      &      &       & \CM     & \CM           & \CM               & \CM   & \CM   & \CM   & \CM   \\
+auto-init               & \CM   & all or none   & \CM      & N/A &       & \CM     & \CM            & \CM               & \CM   & \CM   & \CM   & \CM   \\
 \hline
 (Un)Scoped              & U     & U                     & S        & S      & S         & U       & S               & S                         & S     & U             & U/S   & U/S   \\
 …
 arr. dim.               & \CM   & \CM           &                  &              &                 &           &                 &                         &            &      &               & \CM \\
 \hline
 subset                  & \CM   & \CM                   &         & \CM     &           &                 &                         &                   &               &               &               & \CM   \\
+subset                  & \CM   & \CM                   &         &      &              &                 &                         &                   &               &               &               & \CM   \\
 \hline
 superset                &               &                               &                 &                 &           &                 &                         &               &           &               &               & \CM   \\
 …
 Position ordered is implied if the enumerator values must be strictly increasingly.
 \item unique value: enumerators must have a unique value.
+\item auto-init: Values are auto-initializable by language specification, often being "+1" of the predecessor.
+\item auto-init: Values are auto-initializable by language specification. \\
+It is not appliable to OCaml because OCaml enumeration has unit type.
 \item (Un)Scoped: U $\Rightarrow$ enumerators are projected into the containing scope.
 S $\Rightarrow$ enumerators are contained in the enumeration scope and require qualification.

Note: See TracChangeset for help on using the changeset viewer.

Download in other formats: