Index: doc/theses/jiada_liang_MMath/intro.tex
===================================================================
--- doc/theses/jiada_liang_MMath/intro.tex	(revision 566cc33ae374c458f72fbc1e3db85c204344608c)
+++ doc/theses/jiada_liang_MMath/intro.tex	(revision 314c9d8852a35c6736a8f3ae5920e3ca35122ba7)
@@ -5,15 +5,16 @@
 (In \CFA, the primary constants @0@ and @1@ can be overloaded for any type.)
 Hence, each primary constant has a symbolic name referring to its internal representation, and these names are dictated by language syntax related to types.
-In theory, there are an infinite set of primary names per type.
-
-\Newterm{Secondary naming} is a common practice in mathematics, engineering and computer science, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MB (mega byte, 1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
+In theory, there are an infinite set of primary constant names per type.
+
+\Newterm{Secondary naming} is a common practice in mathematics, engineering and computer science, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MB (megabyte, 1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
 Many programming languages capture this important software-engineering capability through a mechanism called \Newterm{constant} or \Newterm{literal} naming, where a secondary name is aliased to a primary name.
-Its common purpose is to eliminate duplication of the primary constant throughout a program.
-For example, the secondary name replaces its primary name, thereafter changing the binding of the secondary to primary name automatically distributes the rebinding throughout the program.
+Its purpose is for readability and to eliminate duplication of the primary constant throughout a program.
+For example, a meaningful secondary name replaces a primary name throughout a program;
+thereafter, changing the binding of the secondary to primary name automatically distributes the rebinding, preventing errors.
 In some cases, secondary naming is \Newterm{opaque}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@.
 Because a secondary name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \Newterm{rvalue}\footnote{
 The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}.
 
-Secondary names can form an (ordered) set, \eg days of the week, months of a year, floors of a building (basement, ground, 1st), colours in a rainbow, \etc.
+Secondary names can form an (ordered) set, \eg days of a week, months of a year, floors of a building (basement, ground, 1st), colours in a rainbow, \etc.
 Many programming languages capture these groupings through a mechanism called an \Newterm{enumeration}.
 \begin{quote}
@@ -34,21 +35,24 @@
 This independence from internal representation allows multiple names to have the same representation (eighth note, quaver), giving synonyms.
 A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Friday and Friday is after.
-Ordering allows iterating among the enumeration set using relational operators and advancement, \eg
+Ordering allows iterating among the enumeration set using relational operators and advancement, \eg:
 \begin{cfa}
 for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ...
 \end{cfa}
-Here the internal representations for the secondary names are logically \emph{generated} rather than listing a subset of names.
+Here the internal representation for the secondary names are logically \emph{generated} rather than listing a subset of names.
 
 Hence, the fundamental aspects of an enumeration are:
 \begin{enumerate}
 \item
-It defines a type from which instants can be generated.
-\item
-The type lists a finite set of secondary names, which become its primary constants.
-This differentiates an enumeration from general types with an infinite number of primary constants.
-\item
-An enumeration's secondary names represent constants, which follows from their binding (aliasing) to primary names, which are constants.
-\item
-For safety, an enumeration instance is restricted to hold only its type's secondary names.
+\begin{sloppypar}
+It provides a finite set of secondary names, which become its primary constants.
+This differentiates an enumeration from general types with an infinite set
+of primary constants.
+\end{sloppypar}
+\item
+The secondary names are constants, which follows transitively from their binding (aliasing) to primary names, which are constants.
+\item
+Defines a type for generating instants (variables).
+\item
+For safety, an enumeration instance should be restricted to hold only its type's secondary names.
 \item
 There is a mechanism for \emph{enumerating} over the secondary names, where the ordering can be implicit from the type, explicitly listed, or generated arithmetically.
@@ -59,15 +63,15 @@
 \label{s:Terminology}
 
-The term \Newterm{enumeration} defines a type with a set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name \see{\VRef{s:CEnumeration}}.
-As well, an enumerated type has three fundamental properties, \Newterm{label}, \Newterm{order}, and \Newterm{value}.
+The term \Newterm{enumeration} defines a type with a set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name \see{\VRef{s:CEnumeration} for the name derivation}.
+As well, an enumerated type can have three fundamental properties, \Newterm{label}, \Newterm{order}, and \Newterm{value}.
 \begin{cquote}
 \sf\setlength{\tabcolsep}{3pt}
 \begin{tabular}{rcccccccr}
 \it\color{red}enumeration	& \multicolumn{8}{c}{\it\color{red}enumerators}	\\
-$\downarrow$\hspace*{25pt}	& \multicolumn{8}{c}{$\downarrow$}				\\
-@enum@ Week \{				& Mon,	& Tue,	& Wed,	& Thu,	& Fri, 	& Sat,	& Sun = 42	& \};	\\
+$\downarrow$\hspace*{15pt}	& \multicolumn{8}{c}{$\downarrow$}				\\
+@enum@ Week \{				& Mon,	& Tue,	& Wed,	& Thu,	& Fri, 	& Sat,	& Sun {\color{red}= 42}	& \};	\\
 \it\color{red}label			& Mon	& Tue	& Wed	& Thu	& Fri 	& Sat	& Sun		&		\\
 \it\color{red}order			& 0		& 1		& 2		& 3		& 4		& 5		& 6	 		&		\\
-\it\color{red}value			& 0		& 1		& 2		& 3		& 4		& 5		& 42 		&
+\it\color{red}value			& 0		& 1		& 2		& 3		& 4		& 5		& {\color{red}42} 		&
 \end{tabular}
 \end{cquote}
@@ -88,11 +92,33 @@
 \section{Motivation}
 
-Some programming languages only provide direct secondary renaming.
+Many programming languages provide an enumeration-like mechanism, which may or may not cover the previous five fundamental enumeration aspects.
+Hence, the term \emph{enumeration} can be confusing and misunderstood.
+Furthermore, some languages conjoin the enumeration with other type features, making it difficult to tease apart which featuring is being used.
+This section discusses some language features that are sometimes called an enumeration but do not provide all enumeration aspects.
+
+
+\subsection{Aliasing}
+
+Some languages provide simple secondary aliasing (renaming), \eg:
 \begin{cfa}
 const Size = 20, Pi = 3.14159, Name = "Jane";
 \end{cfa}
-Here, it is possible to compare the secondary names, \eg @Size < Pi@, if that is meaningful.
-
-Secondary renaming can simulate an enumeration, but with extra effort.
+The secondary name is logically replaced in the program text by its corresponding primary name.
+Therefore, it is possible to compare the secondary names, \eg @Size < Pi@, only because the primary constants allow it, whereas \eg @Pi < Name@ might be disallowed depending on the language.
+
+Aliasing is not macro substitution, \eg @#define Size 20@, where a name is replaced by its value \emph{before} compilation, so the name is invisible to the programming language.
+With aliasing, each secondary name is part of the language, and hence, participates fully, such as name overloading in the type system.
+Aliasing is not an immutable variable, \eg:
+\begin{cfa}
+extern @const@ int Size = 20;
+extern void foo( @const@ int @&@ size );
+foo( Size ); // take the address of (reference) Size
+\end{cfa}
+Taking the address of an immutable variable makes it an \Newterm{lvalue}, which implies it has storage.
+With separate compilation, it is necessary to choose one translation unit to perform the initialization.
+If aliasing does require storage, its address and initialization are opaque (compiler only), similar to \CC rvalue reference @&&@.
+
+Aliasing does provide readability and automatic resubstitution.
+It also provides simple enumeration properties, but with extra effort.
 \begin{cfa}
 const Mon = 1, Tue = 2, Wed = 3, Thu = 4, Fri = 5, Sat = 6, Sun = 7;
@@ -102,130 +128,133 @@
 const Sun = 1, Mon = 2, Tue = 3, Wed = 4, Thu = 5, Fri = 6, Sat = 7;
 \end{cfa}
-Finally, there is no type to create a type-checked instance or iterator cursor.
-Hence, there is only a weak equivalence between secondary naming and enumerations, justifying a seperate enumeration type in a programming language.
-
-A variant (algebraic) type is often promoted as a kind of enumeration, \ie a variant type can simulate an enumeration.
-Fundamentally, a variant type is a tagged-union, where the tag is normally opaque and the types are usually heterogeneous.
-\begin{cfa}[morekeywords={variant}]
-@variant@ Variant {
-	@int tag;@  // optional/implicit: 0 => int, 1 => double, 2 => S
-	@union {@ // implicit
-		case int i;
-		case double d;
-		case struct S { int i, j; } s;
-	@};@
-};
-\end{cfa}
-Crucially, the union implies instance storage is shared by all the variant types, and therefore, before a variant type can be used in a statically-typed expression, it must be dynamically discriminated to its current contained type.
-Hence, knowing which type is in a variant instance is crucial for correctness.
-Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking is unnecessary.
-Otherwise, a tag is required to denote the particular type in the variant, and the tag is discriminated at runtime using some form of type pattern-matching, after which the value can be used in a statically-typed expression.
-
-A less frequent variant case is multiple variants with the same type, which normally requires explicit naming of the tag to disambiguate among the common types.
+For these reasons, aliasing is sometimes called an enumeration.
+However, there is no type to create a type-checked instance or iterator cursor, so there is no ability for enumerating.
+Hence, there are multiple enumeration aspects not provided by aliasing, justifying a separate enumeration type in a programming language.
+
+
+\subsection{Algebraic Data Type}
+
+An algebraic data type (ADT)\footnote{ADT is overloaded with abstract data type.} is another language feature often linked with enumeration, where an ADT conjoins an arbitrary type, possibly a \lstinline[language=C++]{class} or @union@, and a named constructor.
+For example, in Haskell:
+\begin{haskell}
+data S = S { i::Int, d::Double }		$\C{// structure}$
+data @Foo@ = A Int | B Double | C S		$\C{// ADT, composed of three types}$
+foo = A 3;								$\C{// type Foo is inferred}$
+bar = B 3.5
+baz = C S{ i = 7, d = 7.5 }
+\end{haskell}
+the ADT has three variants (constructors), @A@, @B@, @C@ with associated types @Int@, @Double@, and @S@.
+The constructors create an initialized value of the specific type that is bound to the immutable variables @foo@, @bar@, and @baz@.
+Hence, the ADT @Foo@ is like a union containing values of the associated types, and a constructor name is used to access the value using dynamic pattern-matching.
 \begin{cquote}
-\begin{tabular}{@{}l@{\hspace{30pt}}l@{}}
-\begin{cfa}[morekeywords={variant}]
-variant VariantCT {
-	case @car@: int i;  // explicitly typed
-	case @boat@: int i;
-	case @bridge@: int i;
-};
-\end{cfa}
+\setlength{\tabcolsep}{15pt}
+\begin{tabular}{@{}ll@{}}
+\begin{haskell}
+prtfoo val = -- function
+    -- pattern match on constructor
+    case val of
+      @A@ a -> print a
+      @B@ b -> print b
+      @C@ (S i d) -> do
+          print i
+          print d
+\end{haskell}
 &
-\begin{cfa}[morekeywords={variant}]
-variant VariantCU {
-	case @car@: ;  // empty or unit type
-	case @boat@: ;
-	case @bridge@: ;
-};
-\end{cfa}
+\begin{haskell}
+main = do
+    prtfoo foo
+    prtfoo bar
+    prtfoo baz
+3
+3.5
+7
+7.5
+\end{haskell}
 \end{tabular}
 \end{cquote}
-Here, the explicit tag name is used to give different meaning to the values in the common @int@ type, \eg the value 3 has different interpretations depending on the tag name.
-It is even possible to remove the type or use the empty @unit@ type (@struct unit {}@).
-It is this tag naming that is used to simulate an enumeration.
-
-Normally, the variant tag is implicitly set by the compiler based on type, but with common types, a tag name is required to resolve type ambiguity.
-\begin{cfa}
-Variant v = 3;							$\C{// implicitly set tag to 0 based on type of 3}$
-VariantCT ve = boats.3;					$\C{// explicitly set tag to 1 using tag name}$
-\end{cfa}
-Type pattern-matching is then used to dynamically test the tag and branch to a section of statically-typed code to safely manipulate the value, \eg:
+For safety, most languages require all assocaited types to be listed or a default case with no field accesses.
+
+A less frequent case is multiple constructors with the same type.
+\begin{haskell}
+data Bar = X Int | Y Int | Z Int;
+foo = X 3;
+bar = Y 3;
+baz = Z 5;
+\end{haskell}
+Here, the constructor name gives different meaning to the values in the common \lstinline[language=Haskell]{Int} type, \eg the value @3@ has different interpretations depending on the constructor name in the pattern matching.
+
+Note, the term \Newterm{variant} is often associated with ADTs.
+However, there are multiple languages with a @variant@ type that is not an ADT \see{Algol68~\cite{Algol68} or \CC \lstinline{variant}}.
+In these languages, the variant is often a union using RTTI tags, which cannot be used to simulate an enumeration.
+Hence, in this work the term variant is not a synonym for ADT.
+
+% https://downloads.haskell.org/ghc/latest/docs/libraries/base-4.19.1.0-179c/GHC-Enum.html
+% https://hackage.haskell.org/package/base-4.19.1.0/docs/GHC-Enum.html
+
+The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@.
+Note, the unit type is not the same as \lstinline{void}, \eg:
+\begin{cfa}
+void foo( void );
+struct unit {} u;  // empty type
+unit bar( unit );
+foo( foo() );        // void argument does not match with void parameter
+bar( bar( u ) );   // unit argument does match with unit parameter
+\end{cfa}
+
+For example, in the Haskell ADT:
+\begin{haskell}
+data Week = Mon | Tue | Wed | Thu | Fri | Sat | Sun deriving(Enum, Eq, Show)
+\end{haskell}
+the default type for each constructor is the unit type, and deriving from @Enum@ enforces no other type, @Eq@ allows equality comparison, and @Show@ is for printing.
+The nullary constructors for the unit types are numbered left-to-right from $0$ to @maxBound@$- 1$, and provides enumerating operations @succ@, @pred@, @enumFrom@ @enumFromTo@.
+\VRef[Figure]{f:HaskellEnumeration} shows enumeration comparison and iterating (enumerating).
+
+\begin{figure}
 \begin{cquote}
-\begin{tabular}{@{}l@{\hspace{30pt}}l@{}}
-\begin{cfa}[morekeywords={match}]
-@match@( v ) { // know type implicitly or test tag
-	case int { /* only access i field */ }
-	case double { /* only access d field */ }
-	case S { /* only access s field */ }
-}
-\end{cfa}
+\setlength{\tabcolsep}{15pt}
+\begin{tabular}{@{}ll@{}}
+\begin{haskell}
+day = Tue
+main = do
+    if day == Tue then
+        print day
+    else
+        putStr "not Tue"
+    print (enumFrom Mon)            -- week
+    print (enumFromTo Mon Fri)   -- weekday
+    print (enumFromTo Sat Sun)  -- weekend
+\end{haskell}
 &
-\begin{cfa}[morekeywords={match}]
-@match@( ve ) {
-	case car: int { /* car interpretation */ }
-	case boat: int { /* boat interpretation */ }
-	case bridge: int { /* bridge interpretation */ }
-}
-\end{cfa}
+\begin{haskell}
+Tue
+[Mon,Tue,Wed,Thu,Fri,Sat,Sun]
+[Mon,Tue,Wed,Thu,Fri]
+[Sat,Sun]
+
+
+
+
+
+\end{haskell}
 \end{tabular}
 \end{cquote}
-For safety, some languages require all variant types to be listed or a @default@ case with no field accesses.
-
-To further strengthen the simulate for an enumeration with different values, each variant type can be a @const@ type or the tag becomes non-opaque, possibly taking advantage of the opaque auto-numbering.
-\begin{cquote}
-\begin{tabular}{@{}l@{\hspace{30pt}}l@{}}
-\begin{cfa}
-variant Week {
-	case Mon: const int = 0;
-	...
-	case Sat: const int = 5;
-	case Sun: const int = 10;
-};
-\end{cfa}
-&
-\begin{cfa}
-variant Week {
-	case Mon: ;  // tag auto-numbering
-	...
-	case Sat: ;
-	case @Sun = 10@: ; // directly set tag value
-};
-\end{cfa}
-\end{tabular}
-\end{cquote}
-Directly setting the tag implies restrictions, like unique values.
-In both cases, instances of @Week@ are @const@ (immutable).
-However, usage between these two types becomes complex.
-\begin{cfa}
-Week day = Week.Mon;  // sets value or tag depending on type
-if ( day == Week.Mon )   // dereference value or tag ?
-\end{cfa}
-Here, the dereference of @day@ should return the value of the type stored in the variant, never the tag.
-If it does return the tag, some special meaning must be given to the empty/unit type, especially if a variant contains both regular and unit types.
-
-
-In general, the enumeration simulation and the variant extensions to support it, are deviating from the normal use of a variant (union) type.
-As well, the enumeration operations are limited to the available tag operations, \eg pattern matching.
-While enumerating among tag names is possible:
-\begin{cfa}[morekeywords={in}]
-for ( cursor in Week.Mon, Week.Wed, Week.Fri, Week.Sun ) ...
-\end{cfa}
-what is the type of @cursor@?
-If it the tag type (@int@), how is this value used?
-If it is the variant type, where is the instance variable, which only contains one value.
-Hence, either enumerating with a variant enumeration is disallowed or some unusual typing rule must be invented to make it work but only in restricted contexts.
-
-While functional programming systems regularly re-purposing variant types into enumeration types, this process seems contrived and confusing.
-A variant tag is not an enumeration, it is a discriminant among a restricted set of types stored in a storage block.
-Hence, there is only a weak equivalence between an enumeration and variant type, justifying a seperate enumeration type in a programming language.
+\caption{Haskell Enumeration}
+\label{f:HaskellEnumeration}
+\end{figure}
+
+The key observation is the dichotomy between an ADT and enumeration: the ADT uses the associated type resulting in a union-like data structure, and the enumeration does not use the associated type, and hence, is not a union.
+While the enumeration is constructed using the ADT mechanism, it is so restricted it is not really an ADT.
+Furthermore, a general ADT cannot be an enumeration because the constructors generate different values making enumerating meaningless.
+While functional programming languages regularly repurpose the ADT type into an enumeration type, this process seems contrived and confusing.
+Hence, there is only a weak equivalence between an enumeration and ADT, justifying a separate enumeration type in a programming language.
 
 
 \section{Contributions}
 
-The goal of this work is to to extend the simple and unsafe enumeration type in the C programming-language into a sophisticated and safe type in the \CFA programming-language, while maintain backwards compatibility with C.
+The goal of this work is to to extend the simple and unsafe enumeration type in the C programming-language into a complex and safe enumeration type in the \CFA programming-language, while maintaining backwards compatibility with C.
 On the surface, enumerations seem like a simple type.
 However, when extended with advanced features, enumerations become complex for both the type system and the runtime implementation.
 
+The contribution of this work are:
 \begin{enumerate}
 \item
@@ -236,5 +265,5 @@
 typing
 \item
-subset
+subseting
 \item
 inheritance
