Changeset 486caad for doc

Timestamp:

Mar 25, 2024, 7:15:30 PM (4 months ago)

Author:

JiadaL <j82liang@…>

Branches:

Children:

Parents:

df78cce (diff), bf050c5 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.

Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Location:

Files:

: 1 added
: 10 edited

bibliography/pl.bib (modified) (10 diffs)
papers/general/SPE_CforallModernFeatures.pdf (added)
theses/jiada_liang_MMath/CFAenum.tex (modified) (10 diffs)
theses/jiada_liang_MMath/background.tex (modified) (2 diffs)
theses/jiada_liang_MMath/implementation.tex (modified) (1 diff)
theses/jiada_liang_MMath/intro.tex (modified) (1 diff)
theses/jiada_liang_MMath/relatedwork.tex (modified) (24 diffs)
theses/mike_brooks_MMath/array.tex (modified) (1 diff)
theses/mike_brooks_MMath/background.tex (modified) (2 diffs)
theses/mike_brooks_MMath/intro.tex (modified) (2 diffs)
theses/mike_brooks_MMath/uw-ethesis.tex (modified) (1 diff)


doc/bibliography/pl.bib

-                      rdf78cce
+                      r486caad
     school      = {School of Computer Science, University of Waterloo},
     year        = 2019,
     optaddress  = {Waterloo, Ontario, Canada, N2L 3G1},
+    address     = {Waterloo, Ontario, Canada, N2L 3G1},
     note        = {\url{https://uwspace.uwaterloo.ca/handle/10012/14584}},
+}
 …
     school      = {School of Computer Sc., University of Waterloo},
     year        = 2015,
     optaddress  = {Waterloo, Ontario, Canada, N2L 3G1},
+    address     = {Waterloo, Ontario, Canada, N2L 3G1},
     note        = {\url{https://uwspace.uwaterloo.ca/handle/10012/10013}},
+}
 …
     school      = {School of Computer Sc., University of Waterloo},
     year        = 2019,
     optaddress  = {Waterloo, Ontario, Canada, N2L 3G1},
+    address     = {Waterloo, Ontario, Canada, N2L 3G1},
     note        = {\url{https://uwspace.uwaterloo.ca/handle/10012/14706}},
+}
 …
     school      = {School of Computer Science, University of Waterloo},
     year        = 2003,
     optaddress  = {Waterloo, Ontario, Canada, N2L 3G1},
+    address     = {Waterloo, Ontario, Canada, N2L 3G1},
     note        = {\url{http://plg.uwaterloo.ca/theses/BilsonThesis.pdf}},
+}
 …
     year        = 2018,
     month       = sep,
     optaddress  = {Waterloo, Ontario, Canada, N2L 3G1},
+    address     = {Waterloo, Ontario, Canada, N2L 3G1},
     note        = {\url{https://uwspace.uwaterloo.ca/handle/10012/13935}},
+}
 …
     key         = {OCaml},
     title       = {The {OC}aml system, release 5.1},
     optaddress  = {Rust Project Developers},
+    address     = {Rust Project Developers},
     year        = 2023,
     note        = {\url{https://v2.ocaml.org/manual/}},
 …
     organization= {United States Department of Defense},
     edition     = {{ANSI/MIL-STD-1815A-1983}},
+    address     = {Springer, New York},
     month       = feb,
     year        = 1983,
-    note        = {Springer, New York},
+}
 …
     school      = {School of Computer Science, University of Waterloo},
     year        = 2017,
     optaddress  = {Waterloo, Ontario, Canada, N2L 3G1},
+    address     = {Waterloo, Ontario, Canada, N2L 3G1},
     note        = {\url{https://uwspace.uwaterloo.ca/handle/10012/11830}},
+}
 …
     key         = {Rust},
     title       = {{R}ust Programming Language},
     optaddress  = {Rust Project Developers},
+    address     = {Rust Project Developers},
     year        = 2015,
     note        = {\url{https://doc.rust-lang.org/reference.html}},
 …
     publisher   = {Morgan \& Claypool},
     year        = 2013,
+}
-@inproceedings{Leissa14,
-    title       = {{S}ierra: a {SIMD} extension for {C}++},
-    author      = {Lei{\ss}a, Roland and Haffner, Immanuel and Hack, Sebastian},
-    booktitle   = {Proceedings of the 2014 Workshop on Workshop on programming models for SIMD/Vector processing},
-    pages       = {17-24},
-    year        = {2014},
-    organization= {ACM}
+}
-@article{Nickolls08,
-    author      = {Nickolls, John and Buck, Ian and Garland, Michael and Skadron, Kevin},
-    title       = {Scalable Parallel Programming with CUDA},
-    journal     = {Queue},
-    volume      = {6},
-    number      = {2},
-    month       = mar,
-    year        = 2008,
-    pages       = {40-53},
-    publisher   = {ACM},
-    address     = {New York, NY, USA},
+}

doc/theses/jiada_liang_MMath/CFAenum.tex

-                      rdf78cce
+                      r486caad
 \chapter{\CFA-Style Enum}
 \CFA supports C-Style enumeration using the same syntax and semantics for backwards compatibility.
+\chapter{\CFA Enumeration}
+\CFA supports C enumeration using the same syntax and semantics for backwards compatibility.
 \CFA also extends C-Style enumeration by adding a number of new features that bring enumerations in line with other modern programming languages.
 …
 Finally, qualification is provided to disambiguate any ambiguous situations.
 \begin{cfa}
 enum C1 { First, Second, Third, Fourth };
 enum C2 { @Fourth@, @Third@, @Second@, @First@ };
 C1 p() { return Third; }                                $\C{// correctly resolved duplicate names}$
 C2 p() { return Fourth; }
+enum E1 { First, Second, Third, Fourth };
+enum E2 { @Fourth@, @Third@, @Second@, @First@ };
+E1 p() { return Third; }                                $\C{// correctly resolved duplicate names}$
+E2 p() { return Fourth; }
 void foo() {
         C1 e1 = First;   C2 e2 = First;
+        E1 e1 = First;   E2 e2 = First;
         e1 = Second;   e2 = Second;
         e1 = p();   e2 = p();                           $\C{// correctly resolved function call}$
+        int i = @C1.@First + @C2.@First;        $\C{// ambiguous without qualification}$
+}
+\end{cfa}
+\CFA overloading allows programmers to use the most meaningful names without fear of unresolvable clashes from included files, which are correctable with qualification.
+        int i = @E1.@First + @E2.@First;        $\C{// disambiguate with qualification}$
+        int j = @(E1)@First + @(E2)@First;      $\C{// disambiguate with cast}$
+}
+\end{cfa}
+\CFA overloading allows programmers to use the most meaningful names without fear of name clashes from include files.
+Either the type system implicitly disambiguates or the programmer explicitly disambiguates using qualification or casting.
 …
 An enumeration can be scoped, so the enumerator constants are not projected into the enclosing scope, using @'!'@.
 \begin{cfa}
 enum Weekday @!@ { /* as above */ };
 enum( char * ) Names @!@ { /* as above */ };
+enum Weekday @!@ { Mon, Tue, Wed, Thu = 10, Fri, Sat, Sun };
+enum RGB @!@ { Red, Green, Blue };
 \end{cfa}
 Now the enumerators \emph{must} be qualified with the associated enumeration.
 \begin{cfa}
+Weekday weekday = @Weekday@.Monday;
+Names names = @Names.@Fred;
+names = @Names.@Jane;
+Weekday weekday = @Weekday@.Mon;
+weekday = @Weekday@.Sat;
+RGB rgb = RGB.Red;
+rgb = RGB.Blue;
 \end{cfa}
 It is possible to toggle back to unscoping using the \CFA @with@ clause/statement (see also \CC \lstinline[language=c++]{using enum} in Section~\ref{s:C++RelatedWork}).
 \begin{cfa}
+Weekday weekday;
+with ( @Weekday@, @Names@ ) {                   $\C{// type names}$
+         Names names = @Fred@;
+         names = @Jane@;
+         weekday = Saturday;
+}
+\end{cfa}
+As in Section~\ref{s:EnumeratorNameResolution}, opening multiple unscoped enumerations can result in duplicate enumeration names, but \CFA type resolution and falling back to explicit qualification handles name resolution.
+with ( @Weekday@, @RGB@ ) {                     $\C{// type names}$
+         weekday = @Sun@;                               $\C{// no qualification}$
+         rgb = @Green@;
+}
+\end{cfa}
+As in Section~\ref{s:EnumeratorNameResolution}, opening multiple unscoped enumerations can result in duplicate enumeration names, but \CFA implicit type resolution and explicit qualification/casting handle name resolution.
 \section{Enumerator Typing}
 …
 \begin{cfa}
 // integral
         enum( @char@ ) Currency { Dollar = '$\textdollar$', Euro = '$\texteuro$', Pound = '$\textsterling$'  };
+        enum( @char@ ) Currency { Dollar = '$\textdollar$', Cent = '$\textcent$', Yen = '$\textyen$', Pound = '$\textsterling$', Euro = 'E' };
         enum( @signed char@ ) srgb { Red = -1, Green = 0, Blue = 1 };
         enum( @long long int@ ) BigNum { X = 123_456_789_012_345,  Y = 345_012_789_456_123 };
 …
         enum( @_Complex@ ) Plane { X = 1.5+3.4i, Y = 7+3i, Z = 0+0.5i };
 // pointer
         enum( @char *@ ) Names { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
+        enum( @const char *@ ) Name { Fred = "FRED", Mary = "MARY", Jane = "JANE" };
         int i, j, k;
         enum( @int *@ ) ptr { I = &i,  J = &j,  K = &k };
 …
 // aggregate
         struct Person { char * name; int age, height; };
+@***@enum( @Person@ ) friends { @Liz@ = { "ELIZABETH", 22, 170 }, @Beth@ = Liz, Jon = { "JONATHAN", 35, 190 } };
+@***@enum( @Person@ ) friends { @Liz@ = { "ELIZABETH", 22, 170 }, @Beth@ = Liz,
+                                                                                        Jon = { "JONATHAN", 35, 190 } };
 \end{cfa}
 \caption{Enumerator Typing}
 …
 \begin{cfa}
 enum() Mode { O_RDONLY, O_WRONLY, O_CREAT, O_TRUNC, O_APPEND };
 @***@Mode iomode = O_RDONLY;
 bool b = iomode == O_RDONLY || iomode < O_APPEND;
 int i = iomode;                                                 $\C{\color{red}// disallowed}$
+Mode iomode = O_RDONLY;
+bool b = iomode == O_RDONLY || iomode < O_APPEND; $\C{// ordering}$
+@***@@int i = iomode;@                                                  $\C{// disallow conversion to int}$
 \end{cfa}
 …
 It follows from enumerator typing that the enumerator type can be another enumerator.
 \begin{cfa}
 enum( @char@ ) Currency { Dollar = '$\textdollar$', Euro = '$\texteuro$', Pound = '$\textsterling$'  };
 enum( @Currency@ ) Europe { Euro = Currency.Euro, Pound = Currency.Pound }; // intersection
+enum( @char@ ) Currency { Dollar = '$\textdollar$', Cent = '$\textcent$', Yen = '$\textyen$', Pound = '$\textsterling$', Euro = 'E' };
+enum( Currency ) Europe { Euro = Currency.Euro, Pound = Currency.Pound };
 enum( char ) Letter { A = 'A',  B = 'B', C = 'C', ..., Z = 'Z' };
 enum( @Letter@ ) Greek { Alph = A, Beta = B, ..., Zeta = Z }; // intersection
 …
 \begin{cfa}
 Letter letter = A;
 @***@Greak greek = Beta;
 letter = Beta;                                                  $\C{// allowed, letter == B}$
 greek = A;                                                              $\C{\color{red}// disallowed}$
+Greek greek = Beta;
+letter = Beta;                                                  $\C{// allowed, greek == B}$
+@greek = A;@                                                    $\C{// disallowed}$
 \end{cfa}
 …
 p(variable_d); // 3
 \end{cfa}
+\section{Planet Example}
+\VRef[Figure]{f:PlanetExample} shows an archetypal enumeration example illustrating all of the \CFA enumeration features.
+Enumeration @Planet@ is a typed enumeration of type @MR@.
+Each of the planet enumerators is initialized to a specific mass/radius, @MR@, value.
+The unnamed enumeration projects the gravitational-constant enumerator @G@.
+The program main iterates through the planets computing the weight on each planet for a given earth weight.
+\begin{figure}
+\begin{cfa}
+struct MR { double mass, radius; };
+enum( MR ) Planet {
+        //                           mass          radius
+        MERCURY = { 3.303_E23, 2.4397_E6 },
+        VENUS       = { 4.869_E24, 6.0518_E6 },
+        EARTH       = { 5.976_E24, 6.3781_E6 },
+        MARS         = { 6.421_E23, 3.3972_E6 },
+        JUPITER    = { 1.898_E27, 7.1492_E7 },
+        SATURN     = { 5.688_E26, 6.0268_E7 },
+        URANUS    = { 8.686_E25, 2.5559_E7 },
+        NEPTUNE  = { 1.024_E26, 2.4746_E7 },
+};
+enum( double ) { G = 6.6743E-11 }; // universal gravitational constant (m3 kg-1 s-2)
+static double surfaceGravity( Planet p ) with( p ) {
+        return G * mass / ( radius * radius );
+}
+static double surfaceWeight( Planet p, double otherMass ) {
+        return otherMass * surfaceGravity( p );
+}
+int main( int argc, char * argv[] ) {
+        if ( argc != 2 ) exit | "Usage: " | argv[0] | "earth-weight";
+        double earthWeight = convert( argv[1] );
+        double mass = earthWeight / surfaceGravity( EARTH );
+        for ( p; Planet ) {
+                sout | "Your weight on" | labelE(p) | "is" | surfaceWeight( p, mass );
+        }
+}
+\end{cfa}
+\caption{Planet Example}
+\label{f:PlanetExample}
+\end{figure}

doc/theses/jiada_liang_MMath/background.tex

-                      rdf78cce
+                      r486caad
 \chapter{Background}
+\lstnewenvironment{clang}[1][]{\lstset{language=[ANSI]C,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
+\CFA is a backwards-compatible extension of the C programming language.
+Therefore, it must support C-style enumerations and any enumeration extensions must be intuitive to C programmers both in syntax and semantics.
+It is common for C programmers to ``believe'' there are three equivalent forms of named constants.
+\begin{clang}
+#define Mon 0
+static const int Mon = 0;
+enum { Mon };
+\end{clang}
+\begin{enumerate}[leftmargin=*]
+\item
+For @#define@, the programmer has to explicitly manage the constant name and value.
+Furthermore, these C preprocessor macro names are outside of the C type-system, and hence cannot be overloaded, and can incorrectly change random text in a program.
+\item
+The same explicit management is true for the @const@ declaration, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{
+C allows variable-length array-declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} and immediate operands of assembler instructions; it also occupies storage.
+\begin{clang}
+$\$$ nm test.o
+0000000000000018 r Mon
+\end{clang}
+\item
+Only the @enum@ form is managed by the compiler, is part of the language type-system, and works in all C constant-expression locations.
+\end{enumerate}
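The three forms can be compared in a minimal C sketch (plain C, not \CFA). The names @MON_MACRO@, @MonConst@, @table@, @table_size@, @day_name@ and @same_value@ are illustrative assumptions, not taken from the thesis. Only the enumeration constants are used where the C standard requires an integer constant expression: the static assertion, the file-scope array dimension, and the @case@ labels.

```c
#include <assert.h>
#include <stddef.h>

#define MON_MACRO 0                 /* preprocessor text substitution: no type, no scope */
static const int MonConst = 0;      /* typed object: occupies storage, not a constant expression in C */
enum { Mon, Tue, Wed };             /* enumeration constants: integer constant expressions */

static_assert( Wed == 2, "enumeration constants are usable at compile time" );

static int table[Wed + 1];          /* file-scope dimension must be a constant expression */

/* Hypothetical helper: number of elements in table. */
size_t table_size( void ) { return sizeof( table ) / sizeof( table[0] ); }

/* Hypothetical helper: case labels must be integer constant expressions. */
const char * day_name( int d ) {
	switch ( d ) {
	  case Mon: return "Mon";
	  case Tue: return "Tue";
	  case Wed: return "Wed";
	  default: return "?";
	}
}

/* All three spellings denote the value 0 when used in an ordinary expression. */
int same_value( void ) { return MON_MACRO == MonConst && MonConst == Mon; }
```

Replacing @Mon@ with @MonConst@ in a @case@ label or in the dimension of @table@ is the situation the second item describes; whether a particular compiler accepts it as an extension is not something this sketch relies on.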
 \section{C-Style Enum}
+\section{C \lstinline{const}}
+The C-Style enumeration has the following syntax and semantics.
+\begin{cfa}
+enum Weekday { Monday, Tuesday, Wednesday, Thursday@ = 10@, Friday, Saturday, Sunday };
+\end{cfa}
+As noted, C has the equivalent of Pascal typed @const@ declarations \see{\VRef{s:Pascal}}, with static and dynamic initialization.
+\begin{clang}
+static const int one = 0 + 1;                   $\C{// static initialization}$
+static const void * NIL = NULL;
+static const double PI = 3.14159;
+static const char Plus = '+';
+static const char * Fred = "Fred";
+static const int Mon = 0, Tue = Mon + 1, Wed = Tue + 1, Thu = Wed + 1, Fri = Thu + 1,
+                                        Sat = Fri + 1, Sun = Sat + 1;
+void foo() {
+        const int r = random();                         $\C{// dynamic initialization}$
+        int sa[Sun];                                            $\C{// VLA, local scope only}$
+}
+\end{clang}
+Statically initialized identifiers may appear in any constant-expression context, \eg @case@.
+Dynamically initialized identifiers may appear as array dimensions in @g++@, which allows variable-sized arrays.
+\section{C Enumeration}
+The C enumeration has the following syntax and semantics.
+\begin{clang}
+enum Weekday { Mon, Tue, Wed, Thu@ = 10@, Fri, Sat, Sun, };
+\end{clang}
 Enumerators without an explicitly designated constant value are \Newterm{auto-initialized} by the compiler: from left to right, starting at zero or the next explicitly initialized constant, incrementing by @1@.
 For example, @Monday@ to @Wednesday@ are implicitly assigned with constants @0@--@2@, @Thursday@ is explicitly set to constant @10@, and @Friday@ to @Sunday@ are implicitly assigned with constants @11@--@13@.
+For example, @Mon@ to @Wed@ are implicitly assigned with constants @0@--@2@, @Thu@ is explicitly set to constant @10@, and @Fri@ to @Sun@ are implicitly assigned with constants @11@--@13@.
 Initialization may occur in any order.
 \begin{cfa}
 enum Weekday { Thursday@ = 10@, Friday, Saturday, Sunday, Monday@ = 0@, Tuesday, Wednesday };
 \end{cfa}
+\begin{clang}
+enum Weekday { Thu@ = 10@, Fri, Sat, Sun, Mon@ = 0@, Tue, Wed };
+\end{clang}
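The auto-initialization rule can be checked mechanically. The following is a small C sketch under stated assumptions: the @A_@ and @B_@ prefixes and the helper @same_values@ are hypothetical, and the prefixes exist only because C enumerators of different enumerations share one scope. It restates the two declaration orders above and confirms they yield the same constants.

```c
#include <assert.h>

/* A_/B_ prefixes are illustrative: C enumerators share one scope, so names must differ. */
enum WeekdayA { A_Mon, A_Tue, A_Wed, A_Thu = 10, A_Fri, A_Sat, A_Sun };
enum WeekdayB { B_Thu = 10, B_Fri, B_Sat, B_Sun, B_Mon = 0, B_Tue, B_Wed };

static_assert( A_Mon == 0 && A_Tue == 1 && A_Wed == 2, "auto-initialized from zero" );
static_assert( A_Thu == 10, "explicitly initialized" );
static_assert( A_Fri == 11 && A_Sat == 12 && A_Sun == 13, "continue from the previous value" );

/* Hypothetical helper: 1 if both declaration orders give every day the same value. */
int same_values( void ) {
	return A_Mon == B_Mon && A_Tue == B_Tue && A_Wed == B_Wed && A_Thu == B_Thu
		&& A_Fri == B_Fri && A_Sat == B_Sat && A_Sun == B_Sun;
}
```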
 Note, the comma in the enumerator list can be a terminator or a separator, allowing the list to end with a dangling comma.
 \begin{cfa}
+\begin{clang}
 enum Weekday {
         Thursday = 10, Friday, Saturday, Sunday,
         Monday = 0, Tuesday, Wednesday@,@ // terminating comma
+        Thu = 10, Fri, Sat, Sun,
+        Mon = 0, Tue, Wed@,@ // terminating comma
 };
 \end{cfa}
+\end{clang}
 This feature allows enumerator lines to be interchanged without moving a comma.\footnote{
 A terminating comma appears in other C syntax, \eg the initializer list.}
 …
 In practice, since integral constants are used, which have type @int@ (unless qualified with a size suffix), C uses @int@ as the underlying type for enumeration variables.
 Finally, there is an implicit bidirectional conversion between an enumeration and its integral type.
 \begin{cfa}
+\begin{clang}
+{
         enum Weekday { /* as above */ };        $\C{// enumerators implicitly projected into local scope}$
         Weekday weekday = Monday;                       $\C{// weekday == 0}$
         weekday = Friday;                                       $\C{// weekday == 11}$
         int i = Sunday;                                         $\C{// implicit conversion to int, i == 13}$
+        Weekday weekday = Mon;                          $\C{// weekday == 0}$
+        weekday = Fri;                                          $\C{// weekday == 11}$
+        int i = Sun;                                            $\C{// implicit conversion to int, i == 13}$
         weekday = 10000;                                        $\C{// UNDEFINED! implicit conversion to Weekday}$
+}
 int j = Wednesday;                                              $\C{// ERROR! Wednesday is not declared in this scope}$
 \end{cfa}
+int j = Wed;                                                    $\C{// ERROR! Wed is not declared in this scope}$
+\end{clang}
 The implicit conversion from @int@ to an enumeration type is an unnecessary source of error.
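A minimal C sketch of the bidirectional conversion follows. The helpers @to_int@ and @from_int@ are hypothetical names introduced for illustration. Note that plain C requires the @enum Weekday@ tag form (or a @typedef@), unlike the bare @Weekday@ used in the listing above.

```c
#include <assert.h>

enum Weekday { Mon, Tue, Wed, Thu = 10, Fri, Sat, Sun };

/* Enumeration to int: implicit, no cast needed. */
int to_int( enum Weekday d ) { return d; }

/* int to enumeration: also implicit in C, and no check that i names an enumerator. */
enum Weekday from_int( int i ) { return i; }
```

The second helper is the problematic direction: @from_int( 3 )@ compiles without a diagnostic being required, even though no enumerator of @Weekday@ has the value 3.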
-It is common for C programmers to ``believe'' there are 3 equivalent forms of constant enumeration.
-\begin{cfa}
-#define Monday 0
-static const int Monday = 0;
-enum { Monday };
-\end{cfa}
-For @#define@, the programmer has to play compiler and explicitly manage the enumeration values;
-furthermore, these are independent constants outside of any language type mechanism.
-The same explicit management is true for @const@ declarations, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{
-C allows variable-length array-declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} and immediate operands of assembler instructions.
-Only the @enum@ form is managed by the compiler, is part of the language type-system, and works in all C constant-expression locations.

doc/theses/jiada_liang_MMath/implementation.tex

-                      rdf78cce
+                      r486caad
 \begin{cfa}
 enum(int) Weekday {
-        Monday=10, Tuesday, ...
+        Mon = 10, Tue, ...
 };

doc/theses/jiada_liang_MMath/intro.tex

-                      rdf78cce
+                      r486caad
 \chapter{Introduction}
+Naming values is a common practice in mathematics and engineering, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MHz (1E6), etc.
+Naming is also commonly used to represent many other numerical phenomena, such as days of the week, months of a year, floors of a building (basement), specific times (noon, New Years).
+Many programming languages capture this important software engineering capability through a mechanism called an \Newterm{enumeration}.
+An enumeration is similar to other programming-language types by providing a set of constrained values, but adds the ability to name \emph{all} the values in its set.
+Note, all enumeration names must be unique but different names can represent the same value (eighth note, quaver), which are synonyms.
+All types in a programming language must have a set of constants, and these constants have \Newterm{primary names}, \eg integral types have constants @-1@, @17@, @12345@, \etc.
+Constants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integral and floating-point types.
+Hence, each primary constant has a symbolic name referring to its internal representation, and these names are dictated by language syntax related to types.
+In theory, there is an infinite set of primary names per type.
+Specifically, an enumerated type restricts its values to a fixed set of named constants.
+While all types are restricted to a fixed set of values because of the underlying von Neumann architecture, and hence, to a corresponding set of constants, \eg @3@, @3.5@, @3.5+2.1i@, @'c'@, @"abc"@, etc., these values are not named, other than the programming-language supplied constant names.
+\Newterm{Secondary naming} is a common practice in mathematics and engineering, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MHz (1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
+Many programming languages capture this important software-engineering capability through a mechanism called \Newterm{constant} or \Newterm{literal} naming, where a secondary name is aliased to a primary name.
+In some cases, secondary naming is \Newterm{pure}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@.
+(The name is the thing.)
+Because a secondary name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \Newterm{rvalue}\footnote{
+The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}.
+Fundamentally, all enumeration systems have an \Newterm{enumeration} type with an associated set of \Newterm{enumerator} names.
+An enumeration has three universal attributes, \Newterm{position}, \Newterm{label}, and \Newterm{value}, as shown by this representative enumeration, where position and value can be different.
+Secondary names can form an (ordered) set, \eg days of the week, months of a year, floors of a building (basement, ground, 1st), colours in a rainbow, \etc.
+Many programming languages capture these groupings through a mechanism called an \Newterm{enumeration}.
+\begin{quote}
+enumerate (verb, transitive).
+To count, ascertain the number of;
+\emph{more
+usually, to mention (a number of things or persons) separately, as if for the
+purpose of counting};
+to specify as in a list or catalogue.~\cite{OED}
+\end{quote}
+Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are restricted to hold only the secondary names.
+It is possible to enumerate among set names without having an ordering among the set elements.
+For example, the week, the weekdays, the weekend, and every second day of the week.
+\begin{cfa}[morekeywords={in}]
+for ( cursor in Mon, Tue, Wed, Thu, Fri, Sat, Sun ) ... $\C[3.75in]{// week}$
+for ( cursor in Mon, Tue, Wed, Thu, Fri ) ...   $\C{// weekday}$
+for ( cursor in Sat, Sun ) ...                                  $\C{// weekend}$
+for ( cursor in Mon, Wed, Fri, Sun ) ...                $\C{// every second day of week}\CRT$
+\end{cfa}
+This independence from internal representation allows multiple names to have the same representation (eighth note, quaver), giving synonyms.
+A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Friday and Friday is after.
+Ordering allows iterating among the enumeration set using relational operators and advancement, \eg
+\begin{cfa}
+for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ...
+\end{cfa}
+Here the internal representations for the secondary names are \emph{generated} rather than listing a subset of names.
+\section{Terminology}
+The term \Newterm{enumeration} defines the set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name.
+As well, an enumerated type has three fundamental properties, \Newterm{label}, \Newterm{order}, and \Newterm{value}.
 \begin{cquote}
 \small\sf\setlength{\tabcolsep}{3pt}
 \begin{tabular}{rccccccccccc}
 \it\color{red}enumeration & \multicolumn{7}{c}{\it\color{red}enumerators}       \\
 $\downarrow$\hspace*{25pt} & \multicolumn{7}{c}{$\downarrow$}                           \\
 @enum@ Weekday \{                               & Monday,       & Tuesday,      & Wednesday,    & Thursday,& Friday,    & Saturday,     & Sunday \}; \\
 \it\color{red}position                  & 0                     & 1                     & 2                             & 3                             & 4                     & 5                     & 6                     \\
 \it\color{red}label                             & Monday        & Tuesday       & Wednesday             & Thursday              & Friday        & Saturday      & Sunday        \\
 \it\color{red}value                             & 0                     & 1                     & 2                             & 3                             & 4                     & 5             & 6
+\sf\setlength{\tabcolsep}{3pt}
+\begin{tabular}{rcccccccr}
+\it\color{red}enumeration       & \multicolumn{8}{c}{\it\color{red}enumerators} \\
+$\downarrow$\hspace*{25pt}      & \multicolumn{8}{c}{$\downarrow$}                              \\
+@enum@ Week \{                          & Mon,  & Tue,  & Wed,  & Thu,  & Fri,  & Sat,  & Sun = 42      & \};   \\
+\it\color{red}label                     & Mon   & Tue   & Wed   & Thu   & Fri   & Sat   & Sun           &               \\
+\it\color{red}order                     & 0             & 1             & 2             & 3             & 4             & 5             & 6                     &               \\
+\it\color{red}value                     & 0             & 1             & 2             & 3             & 4             & 5             & 42            &
 \end{tabular}
 \end{cquote}
+Here, the \Newterm{enumeration} @Weekday@ defines the ordered \Newterm{enumerator}s @Monday@, @Tuesday@, @Wednesday@, @Thursday@, @Friday@, @Saturday@ and @Sunday@.
+By convention, the successor of @Tuesday@ is @Wednesday@ and the predecessor of @Tuesday@ is @Monday@, independent of the associated enumerator constant values.
+Because an enumerator is a constant, it cannot appear in a mutable context, \eg @Mon = Sun@ is meaningless, and an enumerator has no address, it is an \Newterm{rvalue}\footnote{
+The term rvalue defines an expression that can only appear on the right-hand side of an assignment.}.
+Here, the enumeration @Week@ defines the enumerator labels @Mon@, @Tue@, @Wed@, @Thu@, @Fri@, @Sat@ and @Sun@.
+The implicit ordering implies the successor of @Tue@ is @Wed@ and the predecessor of @Tue@ is @Mon@, independent of any associated enumerator values.
+The value is the constant represented by the secondary name, which can be implicitly or explicitly set.
+Specifying complex ordering is possible:
+\begin{cfa}
+enum E1 { $\color{red}[\(_1\)$ {A, B}, $\color{blue}[\(_2\)$ C $\color{red}]\(_1\)$, {D, E} $\color{blue}]\(_2\)$ }; $\C{// overlapping square brackets}$
+enum E2 { {A, {B, C} }, { {D, E}, F } };  $\C{// nesting}$
+\end{cfa}
+For @E1@, there is the partial ordering among @A@, @B@ and @C@, and @C@, @D@ and @E@, but not among @A@, @B@ and @D@, @E@.
+For @E2@, there is the total ordering @A@ $<$ @{B, C}@ $<$ @{D, E}@ $<$ @F@.
+Only flat total-ordering among enumerators is considered in this work.
+\section{Motivation}
+Some programming languages only provide secondary renaming, which can be simulated by an enumeration without ordering.
+\begin{cfa}
+const Size = 20, Pi = 3.14159;
+enum { Size = 20, Pi = 3.14159 };   // unnamed enumeration $\(\Rightarrow\)$ no ordering
+\end{cfa}
+In both cases, it is possible to compare the secondary names, \eg @Size < Pi@, if that is meaningful;
+however, without an enumeration type-name, it is impossible to create an iterator cursor.
+Secondary renaming can simulate an enumeration, but with extra effort.
+\begin{cfa}
+const Mon = 1, Tue = 2, Wed = 3, Thu = 4, Fri = 5, Sat = 6, Sun = 7;
+\end{cfa}
+Furthermore, reordering the enumerators requires manual renumbering.
+\begin{cfa}
+const Sun = 1, Mon = 2, Tue = 3, Wed = 4, Thu = 5, Fri = 6, Sat = 7;
+\end{cfa}
+Finally, there is no common type to create a type-checked instance or iterator cursor.
+Hence, there is only a weak equivalence between secondary naming and enumerations, justifying the enumeration type in a programming language.
+A variant (algebraic) type is often promoted as a kind of enumeration, \ie a variant type can simulate an enumeration.
+A variant type is a tagged-union, where the possible types may be heterogeneous.
+\begin{cfa}
+@variant@ Variant {
+        @int tag;@  // optional/implicit: 0 => int, 1 => double, 2 => S
+        @union {@ // implicit
+                case int i;
+                case double d;
+                case struct S { int i, j; } s;
+        @};@
+};
+\end{cfa}
+Crucially, the union implies instance storage is shared by all of the variant types.
+Hence, a variant is dynamically typed, as in a dynamic-typed programming-language, but the set of types is statically bound, similar to some aspects of dynamic gradual-typing~\cite{Gradual Typing}.
+Knowing which type is in a variant instance is crucial for correctness.
+Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking are unnecessary;
+otherwise, a tag is required to denote the particular type in the variant, and the tag is checked at runtime using some form of type pattern-matching.
+The tag can be implicitly set by the compiler on assignment, or explicitly set by the program\-mer.
+Type pattern-matching is then used to dynamically test the tag and branch to a section of code to safely manipulate the value, \eg:
+\begin{cfa}[morekeywords={match}]
+Variant v = 3;  // implicitly set tag to 0
+@match@( v ) {    // know the type or test the tag
+        case int { /* only access i field in v */ }
+        case double { /* only access d field in v */ }
+        case S { /* only access s field in v */ }
+}
+\end{cfa}
+For safety, either all variant types must be listed or a @default@ case must exist with no field accesses.
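For comparison, the tag-plus-union layout described above can be written by hand in plain C. This is a sketch under stated assumptions, not the \CFA design: the tag names @TAG_INT@, @TAG_DOUBLE@, @TAG_S@, the member name @u@, and the helpers @make_int@, @make_double@ and @as_double@ are all hypothetical, and a @switch@ on the tag stands in for type pattern-matching.

```c
#include <assert.h>

struct S { int i, j; };
enum Tag { TAG_INT, TAG_DOUBLE, TAG_S };        /* hypothetical tag names */

struct Variant {
	enum Tag tag;                               /* which alternative is currently stored */
	union { int i; double d; struct S s; } u;   /* storage shared by all alternatives */
};

/* Constructors set the tag and the matching field together. */
struct Variant make_int( int i ) { struct Variant v = { .tag = TAG_INT, .u.i = i }; return v; }
struct Variant make_double( double d ) { struct Variant v = { .tag = TAG_DOUBLE, .u.d = d }; return v; }

/* Test the tag, then access only the field that matches it. */
double as_double( struct Variant v ) {
	switch ( v.tag ) {
	  case TAG_INT: return v.u.i;
	  case TAG_DOUBLE: return v.u.d;
	  case TAG_S: return v.u.s.i + v.u.s.j;
	}
	return 0.0;                                 /* unreachable for a correctly tagged value */
}
```

Unlike a language-level variant, nothing here stops a programmer from setting the tag and reading a different field, which is the safety gap that compiler-managed tags and pattern matching close.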
+To simulate an enumeration with a variant, the tag is \emph{re-purposed} for either ordering or value and the variant types are omitted.
+\begin{cfa}
+variant Weekday {
+        int tag; // implicit 0 => Mon, ..., 6 => Sun
+        @case Mon;@ // no type
+        ...
+        @case Sun;@
+};
+\end{cfa}
+The type system ensures tag setting and testing are correctly done.
+However, the enumeration operations are limited to the available tag operations, \eg pattern matching.
+\begin{cfa}
+Weekday week = Mon;
+if ( @dynamic_cast(Mon)@week ) ... // test tag == Mon
+\end{cfa}
+While enumerating among tag names is possible:
+\begin{cfa}[morekeywords={in}]
+for ( cursor in Mon, Wed, Fri, Sun ) ...
+\end{cfa}
+ordering for iteration would require a \emph{magic} extension, such as a special @enum@ variant, because it has no meaning for a regular variant, \ie @int@ < @double@.
+However, if a special @enum@ variant allows the tags to be heterogeneously typed, ordering must fall back on case positioning, as many types have incomparable values.
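A minimal Python sketch of ordering by case position follows, assuming a hypothetical base class @PositionOrdered@ that is not part of the standard @enum@ module.
The enumerator values are deliberately of different types, so only the declaration position can supply an order.

```python
from enum import Enum


class PositionOrdered(Enum):
    """Hypothetical base class: order members by declaration position."""

    def position(self) -> int:
        # Iterating an Enum class yields its members in definition order.
        return list(type(self)).index(self)

    def __lt__(self, other):
        if type(self) is not type(other):
            return NotImplemented
        return self.position() < other.position()


class Week(PositionOrdered):
    Mon = 1        # int
    Tue = "two"    # str
    Wed = 3.5      # float


# The values are incomparable across types, but the positions are not.
ordered = sorted([Week.Wed, Week.Mon, Week.Tue])
```

Here @ordered@ lists the members in declaration order even though comparing the underlying values directly (an @int@ with a @str@) would fail.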
+Iterating using tag ordering and heterogeneous types also requires pattern matching.
+\begin{cfa}[morekeywords={match}]
+for ( cursor = Mon; cursor <= Fri; cursor = succ( cursor) ) {
+        match( cursor ) {
+                case Mon { /* access special type for Mon */ }
+                ...
+                case Fri { /* access special type for Fri */ }
+                default
+        }
+}
+\end{cfa}
+If the variant type is changed by adding/removing types or the loop range changes, the pattern matching must be adjusted.
+As well, if the start/stop values are dynamic, it may be impossible to statically determine if all variant types are listed.
+Re-purposing the notion of enumerating into variant types is ill formed and confusing.
+Hence, there is only a weak equivalence between an enumeration and variant type, justifying the enumeration type in a programming language.
 \section{Contributions}
+The goal of this work is to extend the simple and unsafe enumeration type in the C programming-language into a sophisticated and safe type in the \CFA programming-language, while maintaining backwards compatibility with C.
+On the surface, enumerations seem like a simple type.
+However, when extended with advanced features, enumerations become complex for both the type system and the runtime implementation.
+\begin{enumerate}
+\item
+overloading
+\item
+scoping
+\item
+typing
+\item
+subset
+\item
+inheritance
+\end{enumerate}

doc/theses/jiada_liang_MMath/relatedwork.tex

-                      rdf78cce
+                      r486caad
 \label{s:RelatedWork}
+\begin{comment}
 An algebraic data type (ADT) can be viewed as a recursive sum of product types.
 A sum type lists values as members.
 …
 Enumerated types are a special case of product/sum types with non-mutable fields, \ie initialized (constructed) once at the type's declaration, possibly restricted to compile-time initialization.
 Values of algebraic types are accessed by subscripting, field qualification, or type (pattern) matching.
+\end{comment}
 Enumeration types exist in many popular programming languages, both past and present, \eg Pascal~\cite{Pascal}, Ada~\cite{Ada}, \Csharp~\cite{Csharp}, OCaml~\cite{OCaml}, \CC, Go~\cite{Go}, Java~\cite{Java}, Modula-3~\cite{Modula-3}, Rust~\cite{Rust}, Swift~\cite{Swift}, Python~\cite{Python}.
 …
 \section{Pascal}
+\label{s:Pascal}
 \lstnewenvironment{pascal}[1][]{\lstset{language=pascal,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
 …
                  PI = 3.14159;   Plus = '+';   Fred = 'Fred';
 \end{pascal}
 Here, there is no enumeration because there is no specific type (pseudo enumeration).
+This mechanism is not an enumeration because there is no specific type (pseudo enumeration).
 Hence, there is no notion of a (possibly ordered) set, modulo the \lstinline[language=pascal]{set of} type.
 The type of each constant name (enumerator) is inferred from the constant-expression type.
 …
 with Ada.Text_IO; use Ada.Text_IO;
 procedure test is
    type RGB is ( @Red@, Green, Blue );
    type Traffic_Light is ( @Red@, Yellow, Green );         -- overload
    procedure @Red@( Colour : RGB ) is begin            -- overload
        Put_Line( "Colour is " & RGB'Image( Colour ) );
    end Red;
    procedure @Red@( TL : Traffic_Light ) is begin       -- overload
        Put_Line( "Light is " & Traffic_Light'Image( TL ) );
    end Red;
+        type RGB is ( @Red@, Green, Blue );
+        type Traffic_Light is ( @Red@, Yellow, Green );         -- overload
+        procedure @Red@( Colour : RGB ) is begin            -- overload
+                Put_Line( "Colour is " & RGB'Image( Colour ) );
+        end Red;
+        procedure @Red@( TL : Traffic_Light ) is begin       -- overload
+                Put_Line( "Light is " & Traffic_Light'Image( TL ) );
+        end Red;
 begin
     @Red@( Blue );                               -- RGB
     @Red@( Yellow );                            -- Traffic_Light
     @Red@( @RGB'(Red)@ );               -- ambiguous without cast
+        @Red@( Blue );                           -- RGB
+        @Red@( Yellow );                                -- Traffic_Light
+        @Red@( @RGB'(Red)@ );           -- ambiguous without cast
 end test;
 \end{ada}
 …
 \lstnewenvironment{c++}[1][]{\lstset{language=[GNU]C++,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
+\CC is largely backwards compatible with C, so it inherited C's enumerations.
+However, the following non-backwards compatible changes have been made.
+\CC has the equivalent of Pascal typed @const@ declarations \see{\VRef{s:Pascal}}, with static and dynamic initialization.
+\begin{c++}
+const auto one = 0 + 1;                                 $\C{// static initialization}$
+const auto NULL = nullptr;
+const auto PI = 3.14159;
+const auto Plus = '+';
+const auto Fred = "Fred";
+const auto Mon = 0, Tue = Mon + 1, Wed = Tue + 1, Thu = Wed + 1, Fri = Thu + 1,
+                                Sat = Fri + 1, Sun = Sat + 1;
+int sa[Sun];
+const auto r = random();                                $\C{// dynamic initialization}$
+int da[r];                                                              $\C{// VLA}$
+\end{c++}
+Statically initialized identifiers may appear in any constant-expression context, \eg @case@.
+Dynamically initialized identifiers may appear as array dimensions in @g++@, which allows variable-sized arrays.
+Interestingly, global \CC @const@ declarations are implicitly marked @static@ (@r@ rather than @R@).
+\begin{c++}
+$\$$ nm test.o
+0000000000000018 @r@ Mon
+\end{c++}
+\CC enumeration is largely backwards compatible with C, so it inherited C's enumerations.
+However, the following non-backwards compatible changes are made.
 \begin{cquote}
 …
 \begin{Go}
 const ( Mon = iota; Tue; Wed; // 0, 1, 2
          @Thu = 10@; Fri; Sat; Sun ) // 10, 10, 10, 10
+        @Thu = 10@; Fri; Sat; Sun ) // 10, 10, 10, 10
 \end{Go}
 Auto-incrementing can be restarted with an expression containing \emph{one} \lstinline[language=Go]{iota}.
 …
 const ( V1 = iota; V2; @V3 = 7;@ V4 = @iota@; V5 ) // 0 1 7 3 4
 const ( Mon = iota; Tue; Wed; // 0, 1, 2
          @Thu = 10;@ Fri = @iota - Wed + Thu - 1@; Sat; Sun ) // 10, 11, 12, 13
+        @Thu = 10;@ Fri = @iota - Wed + Thu - 1@; Sat; Sun ) // 10, 11, 12, 13
 \end{Go}
 Note, \lstinline[language=Go]{iota} is advanced for an explicitly initialized enumerator, like the underscore @_@ identifier.
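The two rules at work in these examples (an omitted expression repeats the previous one, and @iota@ is the position of the constant within the block) can be modelled with a small Python sketch.
The helper @const_block@ and its representation of expressions as functions of @iota@ and the earlier constants are assumptions made for illustration; this is not how the Go compiler is implemented.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An expression receives the current iota and the constants defined so far.
Expr = Callable[[int, Dict[str, int]], int]


def const_block(specs: List[Tuple[str, Optional[Expr]]]) -> Dict[str, int]:
    """Evaluate a Go-style const block given as (name, expression-or-None)."""
    env: Dict[str, int] = {}
    current: Optional[Expr] = None
    for iota, (name, expr) in enumerate(specs):
        if expr is not None:
            current = expr            # explicit initializer replaces the old one
        if current is None:
            raise ValueError("first constant needs an expression")
        env[name] = current(iota, env)  # omitted initializer repeats the last
    return env


# const ( Mon = iota; Tue; Wed; Thu = 10; Fri; Sat; Sun )
plain = const_block([
    ("Mon", lambda i, e: i), ("Tue", None), ("Wed", None),
    ("Thu", lambda i, e: 10), ("Fri", None), ("Sat", None), ("Sun", None),
])

# const ( Mon = iota; Tue; Wed; Thu = 10; Fri = iota - Wed + Thu - 1; Sat; Sun )
restarted = const_block([
    ("Mon", lambda i, e: i), ("Tue", None), ("Wed", None),
    ("Thu", lambda i, e: 10),
    ("Fri", lambda i, e: i - e["Wed"] + e["Thu"] - 1),
    ("Sat", None), ("Sun", None),
])
```

Because @iota@ keeps advancing through @Thu = 10@, the restart expression must subtract the offset already consumed, which is why @Fri@ evaluates to 11 rather than 4.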
 …
 \lstnewenvironment{rust}[1][]{\lstset{language=Rust,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
+Enumerations
+Rust provides a scoped enumeration based on variant types.
+% An enumeration, also referred to as an enum, is a simultaneous definition of a nominal enumerated type as well as a set of constructors, that can be used to create or pattern-match values of the corresponding enumerated type.
+An enumeration without constructors is called field-less.
 \begin{rust}
+        Syntax
+        Enumeration :
+           enum IDENTIFIER  GenericParams? WhereClause? { EnumItems? }
+        EnumItems :
+           EnumItem ( , EnumItem )* ,?
+        EnumItem :
+           OuterAttribute* Visibility?
+           IDENTIFIER ( EnumItemTuple | EnumItemStruct )? EnumItemDiscriminant?
+        EnumItemTuple :
+           ( TupleFields? )
+        EnumItemStruct :
+           { StructFields? }
+        EnumItemDiscriminant :
+           = Expression
+enum Week { Mon, Tues, Wed, Thu, Fri, Sat, Sun@,@ }
+let mut week: Week = Week::Mon;
+week = Week::Fri;
 \end{rust}
+An enumeration, also referred to as an enum, is a simultaneous definition of a nominal enumerated type as well as a set of constructors, that can be used to create or pattern-match values of the corresponding enumerated type.
+Enumerations are declared with the keyword enum.
+An example of an enum item and its use:
+A field-less enumeration with only unit variants is called unit-only.
 \begin{rust}
+enum Animal {
+        Dog,
+        Cat,
+}
+let mut a: Animal = Animal::Dog;
+a = Animal::Cat;
+enum Week { Mon = 0, Tues = 1, Wed = 2, Thu = 3, Fri = 4, Sat = 5, Sun = 6 }
 \end{rust}
 Enum constructors can have either named or unnamed fields:
 \begin{rust}
 enum Animal {
+        Dog(String, f64),
+        Cat { name: String, weight: f64 },
+}
+        Dog( String, f64 ),
+        Cat{ name: String, weight: f64 },
+}
 let mut a: Animal = Animal::Dog("Cocoa".to_string(), 37.2);
 a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
 \end{rust}
+In this example, Cat is a struct-like enum variant, whereas Dog is simply called an enum variant.
+An enum where no constructors contain fields is called a field-less enum. For example, this is a field-less enum:
+Here, @Dog@ is an @enum@ variant, whereas @Cat@ is a struct-like variant.
+Each @enum@ type has an implicit integer tag (discriminant), with a unique value for each variant type.
+Like a C enumeration, the tag values for the variant types start at 0 with auto incrementing.
+The tag is re-purposed for enumeration by allowing it to be explicitly set, and auto incrementing continues from that value.
+\begin{cquote}
+\sf\setlength{\tabcolsep}{3pt}
+\begin{tabular}{rcccccccr}
+@enum@ Week \{  & Mon,  & Tue,  & Wed = 2,      & Thu = 10,     & Fri,  & Sat = 5,      & Sun   & \};   \\
+\rm tags                & 0             & 1             & 2                     & 10            & 11    & 5                     & 6             &               \\
+\end{tabular}
+\end{cquote}
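The tag row in the table can be reproduced with a short Python sketch of the assignment rule: start at 0, take an explicit value when given, and otherwise continue from the previous tag plus one.
The function @assign_tags@ is hypothetical and only models the rule; it also rejects duplicate tags, mirroring Rust's requirement that discriminants be unique.

```python
from typing import Dict, List, Optional, Tuple


def assign_tags(variants: List[Tuple[str, Optional[int]]]) -> Dict[str, int]:
    """Assign tags to (name, explicit-tag-or-None) pairs in declaration order."""
    tags: Dict[str, int] = {}
    next_tag = 0                       # the first implicit tag is 0
    for name, explicit in variants:
        tag = next_tag if explicit is None else explicit
        if tag in tags.values():
            raise ValueError(f"{name}: tag {tag} is already in use")
        tags[name] = tag
        next_tag = tag + 1             # auto-increment continues from this tag
    return tags


week = assign_tags([
    ("Mon", None), ("Tue", None), ("Wed", 2), ("Thu", 10),
    ("Fri", None), ("Sat", 5), ("Sun", None),
])
```

Note that the resulting tags are not monotonic in declaration order (@Sat@ is 5 after @Fri@ is 11), so declaration order and tag order can differ.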
+In general, the tag can only be read as an opaque reference for comparison.
 \begin{rust}
+enum Fieldless {
+        Tuple(),
+        Struct{},
+        Unit,
+}
+if mem::discriminant(&week) == mem::discriminant(&Week::Mon) ...
 \end{rust}
 If a field-less enum only contains unit variants, the enum is called a unit-only enum. For example:
+If the enumeration is unit-only, or field-less with no explicit discriminants and where only unit variants are explicit, then the discriminant is accessible with a numeric cast.
 \begin{rust}
+enum Enum {
+        Foo = 3,
+        Bar = 2,
+        Baz = 1,
+}
+\end{rust}
+\subsection{Discriminants}
+Each enum instance has a discriminant: an integer logically associated to it that is used to determine which variant it holds.
+Under the default representation, the discriminant is interpreted as an isize value. However, the compiler is allowed to use a smaller type (or another means of distinguishing variants) in its actual memory layout.
+\subsection{Assigning discriminant values}
+\subsection{Explicit discriminants}
+In two circumstances, the discriminant of a variant may be explicitly set by following the variant name with = and a constant expression:
+        if the enumeration is "unit-only".
+        if a primitive representation is used. For example:
+\begin{rust}
+        #[repr(u8)]
+        enum Enum {
+                Unit = 3,
+                Tuple(u16),
+                Struct {
+                        a: u8,
+                        b: u16,
+                } = 1,
+        }
+\end{rust}
+\subsection{Implicit discriminants}
+If a discriminant for a variant is not specified, then it is set to one higher than the discriminant of the previous variant in the declaration. If the discriminant of the first variant in the declaration is unspecified, then it is set to zero.
+\begin{rust}
+enum Foo {
+        Bar,                    // 0
+        Baz = 123,        // 123
+        Quux,              // 124
+}
+let baz_discriminant = Foo::Baz as u32;
+assert_eq!(baz_discriminant, 123);
+\end{rust}
+\subsection{Restrictions}
+It is an error when two variants share the same discriminant.
+\begin{rust}
+enum SharedDiscriminantError {
+        SharedA = 1,
+        SharedB = 1
+}
+enum SharedDiscriminantError2 {
+        Zero,      // 0
+        One,            // 1
+        OneToo = 1  // 1 (collision with previous!)
+}
+\end{rust}
+It is also an error to have an unspecified discriminant where the previous discriminant is the maximum value for the size of the discriminant.
+\begin{rust}
+#[repr(u8)]
+enum OverflowingDiscriminantError {
+        Max = 255,
+        MaxPlusOne // Would be 256, but that overflows the enum.
+}
+#[repr(u8)]
+enum OverflowingDiscriminantError2 {
+        MaxMinusOne = 254, // 254
+        Max,                       // 255
+        MaxPlusOne               // Would be 256, but that overflows the enum.
+}
+\end{rust}
+\subsection{Accessing discriminant}
+\begin{rust}
+Via mem::discriminant
+\end{rust}
+@mem::discriminant@ returns an opaque reference to the discriminant of an enum value which can be compared. This cannot be used to get the value of the discriminant.
+\subsection{Casting}
+If an enumeration is unit-only (with no tuple and struct variants), then its discriminant can be directly accessed with a numeric cast; e.g.:
+\begin{rust}
+enum Enum {
+        Foo,
+        Bar,
+        Baz,
+}
+assert_eq!(0, Enum::Foo as isize);
+assert_eq!(1, Enum::Bar as isize);
+assert_eq!(2, Enum::Baz as isize);
+\end{rust}
+Field-less enums can be cast if they do not have explicit discriminants, or where only unit variants are explicit.
+\begin{rust}
+enum Fieldless {
+        Tuple(),
+        Struct{},
+        Unit,
+}
+assert_eq!(0, Fieldless::Tuple() as isize);
+assert_eq!(1, Fieldless::Struct{} as isize);
+assert_eq!(2, Fieldless::Unit as isize);
+\end{rust}
+\begin{rust}
+#[repr(u8)]
+enum FieldlessWithDiscrimants {
+        First = 10,
+        Tuple(),
+        Second = 20,
+        Struct{},
+        Unit,
+}
+assert_eq!(10, FieldlessWithDiscrimants::First as u8);
+assert_eq!(11, FieldlessWithDiscrimants::Tuple() as u8);
+assert_eq!(20, FieldlessWithDiscrimants::Second as u8);
+assert_eq!(21, FieldlessWithDiscrimants::Struct{} as u8);
+assert_eq!(22, FieldlessWithDiscrimants::Unit as u8);
+\end{rust}
+\subsection{Pointer casting}
+If the enumeration specifies a primitive representation, then the discriminant may be reliably accessed via unsafe pointer casting:
+\begin{rust}
+#[repr(u8)]
+enum Enum {
+        Unit,
+        Tuple(bool),
+        Struct{a: bool},
+}
+impl Enum {
+        fn discriminant(&self) -> u8 {
+                unsafe { *(self as *const Self as *const u8) }
+        }
+}
+let unit_like = Enum::Unit;
+let tuple_like = Enum::Tuple(true);
+let struct_like = Enum::Struct{a: false};
+assert_eq!(0, unit_like.discriminant());
+assert_eq!(1, tuple_like.discriminant());
+assert_eq!(2, struct_like.discriminant());
+\end{rust}
+\subsection{Zero-variant enums}
+Enums with zero variants are known as zero-variant enums. As they have no valid values, they cannot be instantiated.
+\begin{rust}
+enum ZeroVariants {}
+\end{rust}
+Zero-variant enums are equivalent to the never type, but they cannot be coerced into other types.
+\begin{rust}
+let x: ZeroVariants = panic!();
+let y: u32 = x; // mismatched type error
+\end{rust}
+\subsection{Variant visibility}
+Enum variants syntactically allow a Visibility annotation, but this is rejected when the enum is validated. This allows items to be parsed with a unified syntax across different contexts where they are used.
+\begin{rust}
+macro_rules! mac_variant {
+        ($vis:vis $name:ident) => {
+                enum $name {
+                        $vis Unit,
+                        $vis Tuple(u8, u16),
+                        $vis Struct { f: u8 },
+                }
+        }
+}
+// Empty `vis` is allowed.
+mac_variant! { E }
+// This is allowed, since it is removed before being validated.
+#[cfg(FALSE)]
+enum E {
+        pub U,
+        pub(crate) T(u8),
+        pub(super) T { f: String }
+}
+if week as isize == Week::Mon as isize ...
 \end{rust}
 …
 \lstnewenvironment{python}[1][]{\lstset{language=Python,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
+An @Enum@ is a set of symbolic names bound to unique values.
+They are similar to global variables, but they offer a more useful @repr()@, grouping, type-safety, and a few other features.
+They are most useful when you have a variable that can take one of a limited selection of values. For example, the days of the week:
+\begin{python}
+>>> from enum import Enum
+>>> class Weekday(Enum):
+...    MONDAY = 1
+...    TUESDAY = 2
+...    WEDNESDAY = 3
+...    THURSDAY = 4
+...    FRIDAY = 5
+...    SATURDAY = 6
+...    SUNDAY = 7
+\end{python}
+Or perhaps the RGB primary colors:
+\begin{python}
+>>> from enum import Enum
+>>> class Color(Enum):
+...    RED = 1
+...    GREEN = 2
+...    BLUE = 3
+\end{python}
+As you can see, creating an @Enum@ is as simple as writing a class that inherits from @Enum@ itself.
+Note: Case of Enum Members
+Because Enums are used to represent constants, and to help avoid issues with name clashes between mixin-class methods/attributes and enum names, we strongly recommend using @UPPER_CASE@ names for members, and will be using that style in our examples.
+A Python enumeration is a set of symbolic names bound to \emph{unique} values.
+They are similar to global variables, but offer a more useful @repr()@, grouping, type-safety, and additional features.
+Enumerations inherit from the @Enum@ class, \eg:
+\begin{python}
+class Weekday(@Enum@): Mon = 1; Tue = 2; Wed = 3; Thu = 4; Fri = 5; Sat = 6; Sun = 7
+class RGB(@Enum@): Red = 1; Green = 2; Blue = 3
+\end{python}
 Depending on the nature of the enum a member's value may or may not be important, but either way that value can be used to get the corresponding member:
 \begin{python}
 >>> Weekday(3)
 <Weekday.WEDNESDAY: 3>
+print( repr( Weekday( 3 ) ) )
+<Weekday.Wed: 3>
 \end{python}
 As you can see, the @repr()@ of a member shows the enum name, the member name, and the value.
 The @str()@ of a member shows only the enum name and member name:
 \begin{python}
 print(Weekday.THURSDAY)
 Weekday.THURSDAY
+print( str( Weekday.Thu ), Weekday.Thu )
+Weekday.Thu Weekday.Thu
 \end{python}
 The type of an enumeration member is the enum it belongs to:
 \begin{python}
 >>> type(Weekday.MONDAY)
+print( type( Weekday.Thu ) )
 <enum 'Weekday'>
 isinstance(Weekday.FRIDAY, Weekday)
+print( isinstance(Weekday.Fri, Weekday) )
 True
 \end{python}
 Enum members have an attribute that contains just their name:
 \begin{python}
 >>> print(Weekday.TUESDAY.name)
+print(Weekday.TUESDAY.name)
 TUESDAY
 \end{python}
 Likewise, they have an attribute for their value:
 \begin{python}
 >>> Weekday.WEDNESDAY.value
+Weekday.WEDNESDAY.value
 \end{python}
 Unlike many languages that treat enumerations solely as name/value pairs, Python @Enum@s can have behavior added.
 For example, @datetime.date@ has two methods for returning the weekday: @weekday()@ and @isoweekday()@.
 …
 Rather than keep track of that ourselves we can add a method to the @Weekday@ enum to extract the day from the date instance and return the matching enum member:
 \begin{python}
+class Weekday(Enum): Mon = 1; Tue = 2; Wed = 3; Thu = 10; Fri = 15; Sat = 16; Sun = 17
 $@$classmethod
 def from_date(cls, date):
+    return cls(date.isoweekday())
+\end{python}
+The complete Weekday enum now looks like this:
+\begin{python}
+>>> class Weekday(Enum):
+...    MONDAY = 1
+...    TUESDAY = 2
+...    WEDNESDAY = 3
+...    THURSDAY = 4
+...    FRIDAY = 5
+...    SATURDAY = 6
+...    SUNDAY = 7
+...    #
+...    $@$classmethod
+...    def from_date(cls, date):
+...        return cls(date.isoweekday())
+        return cls(date.isoweekday())
 \end{python}
 Now we can find out what today is! Observe:
 …
 This Weekday enum is great if our variable only needs one day, but what if we need several? Maybe we're writing a function to plot chores during a week, and don't want to use a @list@ -- we could use a different type of @Enum@:
 \begin{python}
+>>> from enum import Flag
+>>> class Weekday(Flag):
+...    MONDAY = 1
+...    TUESDAY = 2
+...    WEDNESDAY = 4
+...    THURSDAY = 8
+...    FRIDAY = 16
+...    SATURDAY = 32
+...    SUNDAY = 64
+from enum import Flag
+class WeekdayF(@Flag@): Mon = @1@; Tue = @2@; Wed = @4@; Thu = @8@; Fri = @16@; Sat = @32@; Sun = @64@
 \end{python}
 We've changed two things: we've inherited from @Flag@, and the values are all powers of 2.
+Just like the original @Weekday@ enum above, we can have a single selection:
+\begin{python}
+>>> first_week_day = Weekday.MONDAY
+>>> first_week_day
+<Weekday.MONDAY: 1>
+\end{python}
+But @Flag@ also allows us to combine several members into a single variable:
+\begin{python}
+>>> weekend = Weekday.SATURDAY | Weekday.SUNDAY
+>>> weekend
+<Weekday.SATURDAY|SUNDAY: 96>
+@Flag@ allows combining several members into a single variable:
+\begin{python}
+print( repr(WeekdayF.Sat | WeekdayF.Sun) )
+<WeekdayF.Sun|Sat: 96>
 \end{python}
 You can even iterate over a @Flag@ variable:
 \begin{python}
 >>> for day in weekend:
 ...    print(day)
+for day in weekend:
+        print(day)
 Weekday.SATURDAY
 Weekday.SUNDAY
 …
 \subsection{Duplicating enum members and values}
+Having two enum members with the same name is invalid:
+\begin{python}
+>>> class Shape(Enum):
+...    SQUARE = 2
+...    SQUARE = 3
+...
+Traceback (most recent call last):
+...
+TypeError: 'SQUARE' already defined as 2
+\end{python}
+However, an enum member can have other names associated with it.
+An enum member can have other names associated with it.
 Given two entries @A@ and @B@ with the same value (and @A@ defined first), @B@ is an alias for the member @A@.
 By-value lookup of the value of @A@ will return the member @A@.
 …
 By-name lookup of @B@ will also return the member @A@:
 \begin{python}
+>>> class Shape(Enum):
+...    SQUARE = 2
+...    DIAMOND = 1
+...    CIRCLE = 3
+...    ALIAS_FOR_SQUARE = 2
+...
+class Shape(Enum): SQUARE = 2; DIAMOND = 1; CIRCLE = 3; ALIAS_FOR_SQUARE = 2
 >>> Shape.SQUARE
 <Shape.SQUARE: 2>
 …
 When this behavior isn't desired, you can use the @unique()@ decorator:
 \begin{python}
+>>> from enum import Enum, unique
+>>> $@$unique
+... class Mistake(Enum):
+...     ONE = 1
+...     TWO = 2
+...     THREE = 3
+...     FOUR = 3
+...
+Traceback (most recent call last):
+...
+from enum import Enum, unique
+$@$unique
+class DupVal(Enum): ONE = 1; TWO = 2; THREE = 3; FOUR = 3
 ValueError: duplicate values found in <enum 'Mistake'>: FOUR -> THREE
 \end{python}
 …
 If the exact value is unimportant you can use @auto@:
 \begin{python}
+>>> from enum import Enum, auto
+>>> class Color(Enum):
+...     RED = auto()
+...     BLUE = auto()
+...     GREEN = auto()
+...
+>>> [member.value for member in Color]
+[1, 2, 3]
+\end{python}
+The values are chosen by \_generate\_next\_value\_(), which can be overridden:
+from enum import Enum, auto
+class RGBa(Enum): RED = auto(); BLUE = auto(); GREEN = auto()
+\end{python}
+(Like Golang @iota@.)
+The values are chosen by @_generate_next_value_()@, which can be overridden:
 \begin{python}
 >>> class AutoName(Enum):
 …
 \begin{python}
 class EnumName([mix-in, ...,] [data-type,] base-enum):
     pass
+        pass
 \end{python}
 Also, subclassing an enumeration is allowed only if the enumeration does not define any members.
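A small sketch of this rule, using hypothetical class names: a member-less enumeration that only supplies behaviour can be subclassed, whereas an attempt to extend an enumeration that already has members raises @TypeError@ when the class is created.

```python
from enum import Enum


class Describable(Enum):
    """Member-less enumeration: it only adds behaviour, so it may be subclassed."""

    def describe(self) -> str:
        return f"{type(self).__name__}.{self.name} = {self.value}"


class RGB(Describable):
    Red = 1
    Green = 2
    Blue = 3


def extension_rejected() -> bool:
    """Return True if extending RGB, which defines members, is refused."""
    try:
        class MoreRGB(RGB):    # RGB already has members
            Pink = 4
    except TypeError:
        return True
    return False
```

The exact error message differs between Python versions, so the sketch checks only the exception type.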
 …
 \begin{python}
 Enum(
     value='NewEnumName',
     names=<...>,
     *,
     module='...',
     qualname='...',
     type=<mixed-in class>,
     start=1,
+    )
+        value='NewEnumName',
+        names=<...>,
+        *,
+        module='...',
+        qualname='...',
+        type=<mixed-in class>,
+        start=1,
+        )
 \end{python}
 \begin{itemize}
 …
 \begin{python}
 class IntEnum(int, Enum):
     pass
+        pass
 \end{python}
 This demonstrates how similar derived enumerations can be defined;
 …
 \begin{python}
 def __bool__(self):
     return bool(self.value)
+        return bool(self.value)
 \end{python}
 Plain @Enum@ classes always evaluate as @True@.
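The difference can be seen in a short sketch with two hypothetical enumerations that share the same values: one plain, and one that adds the @__bool__@ method shown above.

```python
from enum import Enum


class Plain(Enum):
    Zero = 0
    One = 1


class ValueTruth(Enum):
    Zero = 0
    One = 1

    def __bool__(self) -> bool:
        # Delegate truthiness to the member's value.
        return bool(self.value)


plain_zero = bool(Plain.Zero)        # member of a plain Enum
value_zero = bool(ValueTruth.Zero)   # member whose value is 0
```

Without the override, a member with value 0 is still truthy, which can surprise code that tests a member directly in an @if@.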
 …
 If @__new__()@ or @__init__()@ is defined, the value of the enum member will be passed to those methods:
+\begin{python}
+>>> class Planet(Enum):
+...     MERCURY = (3.303e+23, 2.4397e6)
+...     VENUS   = (4.869e+24, 6.0518e6)
+...     EARTH   = (5.976e+24, 6.37814e6)
+...     MARS    = (6.421e+23, 3.3972e6)
+...     JUPITER = (1.9e+27,   7.1492e7)
+...     SATURN  = (5.688e+26, 6.0268e7)
+...     URANUS  = (8.686e+25, 2.5559e7)
+...     NEPTUNE = (1.024e+26, 2.4746e7)
+...     def __init__(self, mass, radius):
+...         self.mass = mass       # in kilograms
+...         self.radius = radius   # in meters
+...     $\@$property
+...     def surface_gravity(self):
+...         # universal gravitational constant  (m3 kg-1 s-2)
+...         G = 6.67300E-11
+...         return G * self.mass / (self.radius * self.radius)
+...
+>>> Planet.EARTH.value
+(5.976e+24, 6378140.0)
+>>> Planet.EARTH.surface_gravity
+9.802652743337129
+\end{python}
+\begin{figure}
+\begin{python}
+from enum import Enum
+class Planet(Enum):
+        MERCURY = ( 3.303E23, 2.4397E6 )
+        VENUS       = ( 4.869E24, 6.0518E6 )
+        EARTH       = (5.976E24, 6.37814E6)
+        MARS         = (6.421E23, 3.3972E6)
+        JUPITER    = (1.9E27,   7.1492E7)
+        SATURN     = (5.688E26, 6.0268E7)
+        URANUS    = (8.686E25, 2.5559E7)
+        NEPTUNE  = (1.024E26, 2.4746E7)
+        def __init__( self, mass, radius ):
+                self.mass = mass                # in kilograms
+                self.radius = radius    # in meters
+        def surface_gravity( self ):
+                # universal gravitational constant  (m3 kg-1 s-2)
+                G = 6.67300E-11
+                return G * self.mass / (self.radius * self.radius)
+for p in Planet:
+        print( f"{p.name}: {p.value}" )
+MERCURY: (3.303e+23, 2439700.0)
+VENUS: (4.869e+24, 6051800.0)
+EARTH: (5.976e+24, 6378140.0)
+MARS: (6.421e+23, 3397200.0)
+JUPITER: (1.9e+27, 71492000.0)
+SATURN: (5.688e+26, 60268000.0)
+URANUS: (8.686e+25, 25559000.0)
+NEPTUNE: (1.024e+26, 24746000.0)
+\end{python}
+\caption{Python Planet Example}
+\label{f:PythonPlanetExample}
+\end{figure}
 \subsection{TimePeriod}
 …
 \section{OCaml}
 \lstnewenvironment{ocaml}[1][]{\lstset{language=OCaml,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
+% https://ocaml.org/docs/basic-data-types#enumerated-data-types
 OCaml provides a variant (union) type, where multiple heterogeneously-typed objects share the same storage.
 …
 With valediction,
   - Gregor Richards
+Date: Thu, 14 Mar 2024 21:45:52 -0400
+Subject: Re: OCaml "enums" do come with ordering
+To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
+From: Gregor Richards <gregor.richards@uwaterloo.ca>
+On 3/14/24 21:30, Peter A. Buhr wrote:
+> I've marked 3 places with your name to shows places with enum ordering.
+>
+> type weekday = Mon | Tue | Wed | Thu | Fri | Sat | Sun
+> let day : weekday = Mon
+> let take_class( d : weekday ) =
+>       if d <= Fri then                                (* Gregor *)
+>               Printf.printf "weekday\n"
+>       else if d >= Sat then                   (* Gregor *)
+>               Printf.printf "weekend\n";
+>       match d with
+>               Mon | Wed -> Printf.printf "CS442\n" |
+>               Tue | Thu -> Printf.printf "CS343\n" |
+>               Fri -> Printf.printf "Tutorial\n" |
+>               _ -> Printf.printf "Take a break\n"
+>
+> let _ = take_class( Mon ); take_class( Sat );
+>
+> type colour = Red | Green of string | Blue of int * float
+> let c = Red
+> let _ = match c with Red -> Printf.printf "Red, "
+> let c = Green( "abc" )
+> let _ = match c with Green g -> Printf.printf "%s, " g
+> let c = Blue( 1, 1.5 )
+> let _ = match c with Blue( i, f ) -> Printf.printf "%d %g\n" i f
+>
+> let check_colour(c: colour): string =
+>       if c < Green( "xyz" ) then              (* Gregor *)
+>               Printf.printf "green\n";
+>       match c with
+>               Red -> "Red" |
+>               Green g -> g |
+>               Blue(i, f) -> string_of_int i ^ string_of_float f
+> let _ = check_colour( Red ); check_colour( Green( "xyz" ) );
+>
+> type stringList = Empty | Pair of string * stringList
+> let rec len_of_string_list(l: stringList): int =
+>       match l with
+>               Empty -> 0 |
+>               Pair(_ , r) -> 1 + len_of_string_list r
+>
+> let _ = for i = 1 to 10 do
+>       Printf.printf "%d, " i
+> done
+>
+> (* Local Variables: *)
+> (* tab-width: 4 *)
+> (* compile-command: "ocaml test.ml" *)
+> (* End: *)
+My functional-language familiarity is far more with Haskell than OCaml.  I
+mostly view OCaml through a lens of "it's Haskell but with cheating".  Haskell
+"enums" (ADTs) aren't ordered unless you specifically and manually put them in
+the Ord typeclass by defining the comparators.  Apparently, OCaml has some
+other rule, which I would guess is something like "sort by tag then by order of
+parameter". Having a default behavior for comparators is *bizarre*; my guess
+would be that it gained this behavior in its flirtation with object
+orientation, but that's just a guess (and irrelevant).
+This gives a total order, but not enumerability (which would still be
+effectively impossible or even meaningless since enums are just a special case
+of ADTs).
+With valediction,
+  - Gregor Richards
+Date: Wed, 20 Mar 2024 18:16:44 -0400
+Subject: Re:
+To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
+From: Gregor Richards <gregor.richards@uwaterloo.ca>
+On 3/20/24 17:26, Peter A. Buhr wrote:
+> Gregor, everyone at this end would like a definition of "enumerability". Can
+> you formulate one?
+According to the OED (emphasis added to the meaning I'm after):
+enumerate (verb, transitive). To count, ascertain the number of; **more
+usually, to mention (a number of things or persons) separately, as if for the
+purpose of counting**; to specify as in a list or catalogue.
+With C enums, if you know the lowest and highest value, you can simply loop
+over them in a for loop (this is, of course, why so many enums come with an
+ENUM_WHATEVER_LAST value). But, I would be hesitant to use the word "loop" to
+describe enumerability, since in functional languages, you would recurse for
+such a purpose.
+In Haskell, in order to do something with every member of an "enumeration", you
+would have to explicitly list them all. The type system will help a bit since
+it knows if you haven't listed them all, but you would have to statically have
+every element in the enumeration.  If somebody added new elements to the
+enumeration later, your code to enumerate over them would no longer work
+correctly, because you can't simply say "for each member of this enumeration do
+X". In Haskell that's because there aren't actually enumerations; what they use
+as enumerations are a degenerate form of algebraic datatypes, and ADTs are
+certainly not enumerable. In OCaml, you've demonstrated that they impose
+comparability, but I would still assume that you can't make a loop over every
+member of an enumeration. (But, who knows!)
+Since that's literally what "enumerate" means, it seems like a rather important
+property for enumerations to have ;)
+With valediction,
+  - Gregor Richards
+From: Andrew James Beach <ajbeach@uwaterloo.ca>
+To: Gregor Richards <gregor.richards@uwaterloo.ca>, Peter Buhr <pabuhr@uwaterloo.ca>
+CC: Michael Leslie Brooks <mlbrooks@uwaterloo.ca>, Fangren Yu <f37yu@uwaterloo.ca>,
+    Jiada Liang <j82liang@uwaterloo.ca>
+Subject: Re: Re:
+Date: Thu, 21 Mar 2024 14:26:36 +0000
+Does this mean that not all enum declarations in C create enumerations? If you
+declare an enumeration like:
+enum Example {
+    Label,
+    Name = 10,
+    Tag = 3,
+};
+I don't think there is any way to enumerate (iterate, loop, recurse) over these
+values without listing all of them.
+Date: Thu, 21 Mar 2024 10:31:49 -0400
+Subject: Re:
+To: Andrew James Beach <ajbeach@uwaterloo.ca>, Peter Buhr <pabuhr@uwaterloo.ca>
+CC: Michael Leslie Brooks <mlbrooks@uwaterloo.ca>, Fangren Yu <f37yu@uwaterloo.ca>,
+    Jiada Liang <j82liang@uwaterloo.ca>
+From: Gregor Richards <gregor.richards@uwaterloo.ca>
+I consider this conclusion reasonable. C enums can be nothing more than const
+ints, and if used in that way, I personally wouldn't consider them as
+enumerations in any meaningful sense, particularly since the type checker
+essentially does nothing for you there. Then they're a way of writing consts
+repeatedly with some textual indicator that these definitions are related; more
+namespace, less enum.
+When somebody writes bitfield members as an enum, is that *really* an
+enumeration, or just a use of the syntax for enums to keep related definitions
+together?
+With valediction,
+  - Gregor Richards
 \end{comment}
 …
 \section{Comparison}
+\begin{tabular}{r|ccccccccc}
+feat. / lang. & Pascal  & Ada   & \Csharp       & OCaml & Java  & Modula-3      & Rust  & Swift & Python        \\
+\VRef[Table]{t:FeatureLanguageComparison} shows a comparison of enumeration features and programming languages.
+The features are high level and may not capture nuances within a particular language.
+The @const@ feature is simple macro substitution and not a typed enumeration.
+\begin{table}
+\caption{Enumeration Feature / Language Comparison}
+\label{t:FeatureLanguageComparison}
+\small
+\setlength{\tabcolsep}{3pt}
+\newcommand{\CM}{\checkmark}
+\begin{tabular}{r|c|c|c|c|c|c|c|c|c|c|c|c|c}
+                                &Pascal & Ada   &\Csharp& OCaml & Java  &Modula-3&Golang& Rust  & Swift & Python& C             & \CC   & \CFA  \\
 \hline
+pure            &                       &               &                       &               &               &                       &               &               &                       \\
+ordered         &                       &               &                       &               &               &                       &               &               &                       \\
+setable         &                       &               &                       &               &               &                       &               &               &                       \\
+auto-init       &                       &               &                       &               &               &                       &               &               &                       \\
+scoped          &                       &               &                       &               &               &                       &               &               &                       \\
+typed           &                       &               &                       &               &               &                       &               &               &                       \\
+switch          &                       &               &                       &               &               &                       &               &               &                       \\
+loop            &                       &               &                       &               &               &                       &               &               &                       \\
+array           &                       &               &                       &               &               &                       &               &               &                       \\
+@const@                 & \CM   &               &               &               &               &               & \CM   &               &               &               &               & \CM   &               \\
+\hline
+\hline
+pure                    &               &               &               &               &               &               &               &               &               &               &               &               & \CM   \\
+\hline
+typed                   &               &               &               &               &               &               &               &               &               &               & @int@ & integral      & @T@   \\
+\hline
+safe                    &               &               &               &               &               &               &               &               &               &               &               & \CM   & \CM   \\
+\hline
+ordered                 &               &               &               &               &               &               &               &               &               &               & \CM   & \CM   & \CM   \\
+\hline
+dup. values             &               &               &               &               &               &               &               &               &               & alias & \CM   & \CM   & \CM   \\
+\hline
+setable                 &               &               &               &               &               &               &               &               &               &               & \CM   & \CM   & \CM   \\
+\hline
+auto-init               &               &               &               &               &               &               &               &               &               &               & \CM   & \CM   & \CM   \\
+\hline
+(un)scoped              &               &               &               &               &               &               &               &               &               &               & U             & U/S   & U/S   \\
+\hline
+overload                &               & \CM   &               &               &               &               &               &               &               &               &               & \CM   & \CM   \\
+\hline
+switch                  &               &               &               &               &               &               &               &               &               &               & \CM   & \CM   & \CM   \\
+\hline
+loop                    &               &               &               &               &               &               &               &               &               &               &               &               & \CM   \\
+\hline
+array                   &               &               &               &               &               &               &               &               &               &               & \CM   &               & \CM   \\
+\hline
+subtype                 &               &               &               &               &               &               &               &               &               &               &               &               & \CM   \\
+\hline
+inheritance             &               &               &               &               &               &               &               &               &               &               &               &               & \CM   \\
 \end{tabular}
+\end{table}

doc/theses/mike_brooks_MMath/array.tex

-                      rdf78cce
+                      r486caad
 \subsection{Retire pointer arithmetic}
+\section{\CFA}
+XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \\
+moved from background chapter \\
+XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \\
+Traditionally, fixing C meant leaving the C-ism alone, while providing a better alternative beside it.
+(For later:  That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.)
+\subsection{\CFA features interacting with arrays}
+Prior work on \CFA included making C arrays, as used in C code from the wild,
+work, if this code is fed into @cfacc@.
+The quality of this treatment was fine, with no more or fewer bugs than is typical.
+More mixed results arose with feeding these ``C'' arrays into preexisting \CFA features.
+A notable success was with the \CFA @alloc@ function,
+in which type information associated with a polymorphic return type
+replaces @malloc@'s use of programmer-supplied size information.
+\begin{cfa}
+// C, library
+void * malloc( size_t );
+// C, user
+struct tm * el1 = malloc(      sizeof(struct tm) );
+struct tm * ar1 = malloc( 10 * sizeof(struct tm) );
+// CFA, library
+forall( T * ) T * alloc();
+// CFA, user
+tm * el2 = alloc();
+tm (*ar2)[10] = alloc();
+\end{cfa}
+The polymorphic return of @alloc@ compiles into a hidden parameter, which receives a compiler-generated argument.
+The compiler generates this argument using type information from the left-hand side of the initialization to obtain the intended type.
+Using a compiler-produced value eliminates an opportunity for user error.
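The hidden-parameter idea can be approximated in plain C, as a rough sketch only. The names @alloc_sized@ and @ALLOC@ are hypothetical and are not part of \CFA or its library; here a macro, rather than a type system, derives the size argument from the declared type of the left-hand side.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the lowered form of a polymorphic alloc:
   the size arrives as an ordinary parameter that the user never writes. */
static void *alloc_sized(size_t size) {
    return malloc(size);
}

/* Hypothetical macro: derives the size from the declared type of the target. */
#define ALLOC(target) ((target) = alloc_sized(sizeof *(target)))

/* Allocates through a pointer-to-array target and reports the bytes requested. */
static size_t array_request_size(void) {
    struct tm (*ar)[10];
    ALLOC(ar);                        /* requests sizeof(struct tm[10]) bytes */
    assert(ar != NULL);
    size_t requested = sizeof *ar;
    free(ar);
    return requested;
}
```

Because the size is computed from the target's type, a mismatch such as forgetting the factor of 10 cannot be written at the call site.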
+TODO: fix in following: even the alloc call gives bad code gen: verify it was always this way; walk back the wording about things just working here; assignment (rebind) seems to offer workaround, as in bkgd-cfa-arrayinteract.cfa
+Bringing in another \CFA feature, reference types, both resolves a sore spot of the last example, and gives a first example of an array-interaction bug.
+In the last example, the choice of ``pointer to array'' @ar2@ breaks a parallel with @ar1@.
+They are not subscripted in the same way.
+\begin{cfa}
+ar1[5];
+(*ar2)[5];
+\end{cfa}
+Using ``reference to array'' resolves this issue.  TODO: discuss connection with Doug-Lea \CC proposal.
+\begin{cfa}
+tm (&ar3)[10] = *alloc();
+ar3[5];
+\end{cfa}
+The implicit size communication to @alloc@ still works in the same way as for @ar2@.
+Using proper array types (@ar2@ and @ar3@) addresses a concern about using raw element pointers (@ar1@), albeit a theoretical one.
+TODO xref C standard does not claim that @ar1@ may be subscripted,
+because no stage of interpreting the construction of @ar1@ has it be that ``there is an \emph{array object} here.''
+But both @*ar2@ and the referent of @ar3@ are the results of \emph{typed} @alloc@ calls,
+where the type requested is an array, making the result, much more obviously, an array object.
+The ``reference to array'' type has its sore spots too.
+TODO see also @dimexpr-match-c/REFPARAM_CALL@ (under @TRY_BUG_1@)
+TODO: I fixed a bug associated with using an array as a T.  I think.  Did I really?  What was the bug?

doc/theses/mike_brooks_MMath/background.tex

-                      rdf78cce
+                      r486caad
 \chapter{Background}
+This chapter states facts about the prior work, upon which my contributions build.
+Each receives a justification of the extent to which its statement is phrased to provoke controversy or surprise.
+\section{C}
+\subsection{Common knowledge}
+The reader is assumed to have used C or \CC for the coursework of at least four university-level courses, or have equivalent experience.
+The current discussion introduces facts of which such a functioning novice may be unaware.
+% TODO: decide if I'm also claiming this collection of facts, and test-oriented presentation is a contribution; if so, deal with (not) arguing for its originality
+\subsection{Convention: C is more touchable than its standard}
+When it comes to explaining how C works, I like illustrating definite program semantics.
+I prefer doing so over quoting a manual's suggested programmer's intuition, or showing how some compiler writers chose to model their problem.
+To illustrate definite program semantics, I devise a program, whose behaviour exercises the point at issue, and I show its behaviour.
+This behaviour is typically one of
+\begin{itemize}
+        \item my statement that the compiler accepts or rejects the program
+        \item the program's printed output, which I show
+        \item my implied assurance that its assertions do not fail when run
+\end{itemize}
+The compiler whose program semantics is shown is
+\begin{cfa}
+$ gcc --version
+gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
+\end{cfa}
+running on the @x86_64@ architecture, with the same environment targeted.
+Unless explicit discussion ensues about differences among compilers or with (versions of) the standard, it is further implied that there exists a second version of GCC and some version of Clang, running on and for the same platform, that give substantially similar behaviour.
+In this case, I do not argue that my sample of major Linux compilers is doing the right thing with respect to the C standard.
+\subsection{C reports many ill-typed expressions as warnings}
+These attempts to assign @y@ to @x@ and vice-versa are obviously ill-typed.
+\lstinput{12-15}{bkgd-c-tyerr.c}
+with warnings:
+\begin{cfa}
+warning: assignment to 'float *' from incompatible pointer type 'void (*)(void)'
+warning: assignment to 'void (*)(void)' from incompatible pointer type 'float *'
+\end{cfa}
+Similarly,
+\lstinput{17-19}{bkgd-c-tyerr.c}
+with warning:
+\begin{cfa}
+warning: passing argument 1 of 'f' from incompatible pointer type
+note: expected 'void (*)(void)' but argument is of type 'float *'
+\end{cfa}
+with a segmentation fault at runtime.
+That @f@'s attempt to call @g@ fails is not due to 3.14 being a particularly unlucky choice of value to put in the variable @pi@.
+Rather, it is because obtaining a program that includes this essential fragment, yet exhibits a behaviour other than ``doomed to crash,'' is a matter for an obfuscated coding competition.
+A ``tractable syntactic method for proving the absence of certain program behaviours by classifying phrases according to the kinds of values they compute''*1 rejected the program.
+The behaviour (whose absence is unprovable) is neither minor nor unlikely.
+The rejection shows that the program is ill-typed.
+Yet, the rejection presents as a GCC warning.
+In the discussion following, ``ill-typed'' means giving a nonzero @gcc -Werror@ exit condition with a message that discusses typing.
+*1  TAPL-pg1 definition of a type system
+\section{C Arrays}
+\subsection{C has an array type (!)}
+Since this work builds on C, it is necessary to explain the C mechanisms and their shortcomings for array, linked list, and string.
+\section{Array}
 When a programmer works with an array, C semantics provide access to a type that is different in every way from ``pointer to its first element.''
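Two observable differences between the array type and ``pointer to its first element'' can be checked directly. The following is a minimal sketch, not taken from the thesis sources; the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Size of a 10-element float array, taken from the array type itself. */
static size_t size_of_array(void) {
    float a[10];
    return sizeof a;                  /* 10 * sizeof(float) */
}

/* Size of a pointer to the first element of the same array. */
static size_t size_of_first_element_pointer(void) {
    float a[10];
    float *p = a;                     /* array converts to pointer to first element */
    return sizeof p;                  /* sizeof(float *) */
}

/* Pointer arithmetic steps by the pointed-to type:
   &a + 1 skips the whole array, &a[0] + 1 skips one element. */
static ptrdiff_t stride_ratio(void) {
    float a[10];
    char *base       = (char *)&a;
    char *next_array = (char *)(&a + 1);
    char *next_elem  = (char *)(&a[0] + 1);
    return (next_array - base) / (next_elem - base);
}
```

The expressions @&a@ and @&a[0]@ denote the same address but have different types, which is why their strides differ by the array length.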
 …
+\section{\CFA}
+Traditionally, fixing C meant leaving the C-ism alone, while providing a better alternative beside it.
+(For later:  That's what I offer with array.hfa, but in the future-work vision for arrays, the fix includes helping programmers stop accidentally using a broken C-ism.)
+\subsection{\CFA features interacting with arrays}
+Prior work on \CFA included making C arrays, as used in C code from the wild,
+work, if this code is fed into @cfacc@.
+The quality of this treatment was fine, with no more or fewer bugs than is typical.
+More mixed results arose with feeding these ``C'' arrays into preexisting \CFA features.
+A notable success was with the \CFA @alloc@ function,
+in which type information associated with a polymorphic return type
+replaces @malloc@'s use of programmer-supplied size information.
+\begin{cfa}
+// C, library
+void * malloc( size_t );
+// C, user
+struct tm * el1 = malloc(      sizeof(struct tm) );
+struct tm * ar1 = malloc( 10 * sizeof(struct tm) );
+// CFA, library
+forall( T * ) T * alloc();
+// CFA, user
+tm * el2 = alloc();
+tm (*ar2)[10] = alloc();
+\end{cfa}
+The polymorphic return of @alloc@ compiles into a hidden parameter, which receives a compiler-generated argument.
+The compiler generates this argument using type information from the left-hand side of the initialization to obtain the intended type.
+Using a compiler-produced value eliminates an opportunity for user error.
+TODO: fix in following: even the alloc call gives bad code gen: verify it was always this way; walk back the wording about things just working here; assignment (rebind) seems to offer workaround, as in bkgd-cfa-arrayinteract.cfa
+Bringing in another \CFA feature, reference types, both resolves a sore spot of the last example, and gives a first example of an array-interaction bug.
+In the last example, the choice of ``pointer to array'' @ar2@ breaks a parallel with @ar1@.
+They are not subscripted in the same way.
+\begin{cfa}
+ar1[5];
+(*ar2)[5];
+\end{cfa}
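The same contrast exists in plain C, independent of @alloc@. The sketch below uses @int@ elements and hypothetical function names; it shows that the element pointer subscripts directly but its type forgets the length, while the pointer to array keeps the length in its type at the cost of a dereference before each subscript.

```c
#include <assert.h>
#include <stdlib.h>

enum { N = 10 };

/* Element pointer: subscripted directly; the length is not part of the type. */
static int via_element_pointer(void) {
    int *ar1 = malloc(N * sizeof *ar1);
    assert(ar1 != NULL);
    ar1[5] = 42;
    int v = ar1[5];
    free(ar1);
    return v;
}

/* Pointer to array: a dereference precedes the subscript. */
static int via_array_pointer(void) {
    int (*ar2)[N] = malloc(sizeof *ar2);
    assert(ar2 != NULL);
    (*ar2)[5] = 42;
    int v = (*ar2)[5];
    free(ar2);
    return v;
}

/* The length is recoverable from the pointer-to-array type alone;
   sizeof does not evaluate its operand here, so NULL is never dereferenced. */
static size_t length_from_type(void) {
    int (*ar2)[N] = NULL;
    return sizeof *ar2 / sizeof (*ar2)[0];
}
```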
+Using ``reference to array'' resolves this issue.  TODO: discuss connection with Doug-Lea \CC proposal.
+\begin{cfa}
+tm (&ar3)[10] = *alloc();
+ar3[5];
+\end{cfa}
+The implicit size communication to @alloc@ still works in the same way as for @ar2@.
+Using proper array types (@ar2@ and @ar3@) addresses a concern about using raw element pointers (@ar1@), albeit a theoretical one.
+TODO xref C standard does not claim that @ar1@ may be subscripted,
+because no stage of interpreting the construction of @ar1@ has it be that ``there is an \emph{array object} here.''
+But both @*ar2@ and the referent of @ar3@ are the results of \emph{typed} @alloc@ calls,
+where the type requested is an array, making the result, much more obviously, an array object.
+The ``reference to array'' type has its sore spots too.
+TODO see also @dimexpr-match-c/REFPARAM_CALL@ (under @TRY_BUG_1@)
+TODO: I fixed a bug associated with using an array as a T.  I think.  Did I really?  What was the bug?
+\section{Linked List}
+\section{String}

doc/theses/mike_brooks_MMath/intro.tex

-                      rdf78cce
+                      r486caad
 \chapter{Introduction}
+All modern programming languages provide three high-level containers (collections): array, linked-list, and string.
+Often array is part of the programming language, while linked-list is built from pointer types, and string from a combination of array and linked-list.
 \cite{Blache19}
 …
 \cite{Ruef19}
 \section{Arrays}
+\section{Array}
+\section{Strings}
+Array provides a homogeneous container with $O(1)$ access to elements using subscripting.
+The array size can be static, dynamic but fixed after creation, or dynamic and variable after creation.
+For static and dynamic-fixed, an array can be stack allocated, while dynamic-variable requires the heap.
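The three sizing categories can be illustrated in C, as a sketch under the assumption that the compiler supports C99 variable-length arrays (optional since C11, but provided by @gcc@ and Clang). The function names are illustrative only.

```c
#include <assert.h>
#include <stdlib.h>

/* Static: the dimension is a compile-time constant; stack allocated. */
static int sum_static(void) {
    int a[4] = {1, 2, 3, 4};
    int sum = 0;
    for (size_t i = 0; i < sizeof a / sizeof a[0]; i++) sum += a[i];
    return sum;
}

/* Dynamic but fixed after creation: a variable-length array, stack allocated.
   Requires n > 0. */
static int sum_dynamic_fixed(size_t n) {
    int a[n];
    int sum = 0;
    for (size_t i = 0; i < n; i++) a[i] = (int)i + 1;
    for (size_t i = 0; i < n; i++) sum += a[i];
    return sum;
}

/* Dynamic and variable after creation: heap storage grown with realloc. */
static int sum_dynamic_variable(size_t n) {
    int *a = NULL;
    size_t capacity = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == capacity) {                      /* full: double the capacity */
            capacity = capacity ? capacity * 2 : 1;
            int *grown = realloc(a, capacity * sizeof *a);
            assert(grown != NULL);
            a = grown;
        }
        a[i] = (int)i + 1;
    }
    int sum = 0;
    for (size_t i = 0; i < n; i++) sum += a[i];
    free(a);
    return sum;
}
```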
+\section{Linked List}
+Linked-list provides a homogeneous container with $O(\log N)$/$O(N)$ access to elements using successor and predecessor operations.
+Subscripting by value is sometimes available, \eg hash table.
+Linked types are normally dynamically sized by adding/removing nodes using link fields internal or external to the elements (nodes).
+If a programming language allows pointer to stack storage, linked-list types can be allocated on the stack;
+otherwise, elements are heap allocated and explicitly/implicitly managed.
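Since C allows pointers to stack storage, a list can be built without the heap. The following minimal sketch uses a hypothetical @struct node@ with the link field internal to the element.

```c
#include <assert.h>
#include <stddef.h>

/* Link field stored inside the element (intrusive node). */
struct node {
    int value;
    struct node *next;
};

/* O(N) traversal using the successor operation. */
static int list_sum(const struct node *head) {
    int sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next) sum += p->value;
    return sum;
}

/* All three nodes live in this function's stack frame; no heap allocation. */
static int stack_list_demo(void) {
    struct node c = {3, NULL};
    struct node b = {2, &c};
    struct node a = {1, &b};
    return list_sum(&a);
}
```

The list is valid only while the enclosing stack frame exists, so such nodes must not outlive the function that declares them.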
+\section{String}
+String provides a dynamic array of homogeneous elements, where the elements are often human-readable characters.
+What differentiates string from other types is that string operations work on blocks of elements for scanning and changing the elements, rather than accessing individual elements.
+Nevertheless, subscripting is often available.
+The cost of string operations is less important than the power of the block operation to accomplish complex manipulation.
+The dynamic nature of string means storage is normally heap allocated but often implicitly managed, even in unmanaged languages.
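As a sketch of block-oriented string manipulation in C, the hypothetical helper @replace_first@ below scans with @strstr@ and copies with @memcpy@, never subscripting individual characters. Unlike languages with implicitly managed strings, the caller here manages the heap storage explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of text with the first occurrence of from
   replaced by to, or NULL if from does not occur or allocation fails.
   The caller frees the result. */
static char *replace_first(const char *text, const char *from, const char *to) {
    const char *hit = strstr(text, from);          /* block scan */
    if (hit == NULL) return NULL;
    size_t head = (size_t)(hit - text);            /* characters before the match */
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t tail = strlen(hit + from_len);          /* characters after the match */
    char *out = malloc(head + to_len + tail + 1);
    if (out == NULL) return NULL;
    memcpy(out, text, head);                                /* block copy: prefix */
    memcpy(out + head, to, to_len);                         /* block copy: replacement */
    memcpy(out + head + to_len, hit + from_len, tail + 1);  /* suffix plus terminator */
    return out;
}
```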
+\section{Motivation}
+The goal of this work is to introduce safe and complex versions of array, linked-list, and string into the programming language \CFA~\cite{CFA}, which is based on C.
+Unfortunately, making C better, while retaining a high level of backwards compatibility, requires significant knowledge of C's design.
+Hence, it is assumed the reader has a medium knowledge of C or \CC, on which extensive new C knowledge is built.
+\subsection{C?}
+Like many established programming languages, C has a standards committee and multiple ANSI/\-ISO language manuals~\cite{C99,C11,C18,C23}.
+However, most programming languages are only partially explained by their standards manuals.
+When it comes to explaining how C works, the definitive source is the @gcc@ compiler, which is mimicked by other C compilers, such as Clang~\cite{clang}.
+Often other C compilers must \emph{ape} @gcc@ because a large part of the C library (runtime) system contains @gcc@ features.
+While some key aspects of C need to be explained by quoting from the language reference manual, to illustrate definite program semantics, I devise a program, whose behaviour exercises the point at issue, and show its behaviour.
+These example programs show
+\begin{itemize}
+        \item the compiler accepts or rejects certain syntax,
+        \item the program prints output to buttress a claim of behaviour,
+        \item the program executes without triggering any embedded assertions testing pre/post-conditions or invariants.
+\end{itemize}
+This work has been tested across @gcc@ versions 8--12 and clang version 10 running on ARM, AMD, and Intel architectures.
+Any discovered anomalies among compilers or versions are discussed.
+In this case, I do not argue that my sample of major Linux compilers is doing the right thing with respect to the C standard.
+\subsection{Ill-Typed Expressions}
+C reports many ill-typed expressions as warnings.
+For example, these attempts to assign @y@ to @x@ and vice-versa are obviously ill-typed.
+\lstinput{12-15}{bkgd-c-tyerr.c}
+with warnings:
+\begin{cfa}
+warning: assignment to 'float *' from incompatible pointer type 'void (*)(void)'
+warning: assignment to 'void (*)(void)' from incompatible pointer type 'float *'
+\end{cfa}
+Similarly,
+\lstinput{17-19}{bkgd-c-tyerr.c}
+with warning:
+\begin{cfa}
+warning: passing argument 1 of 'f' from incompatible pointer type
+note: expected 'void (*)(void)' but argument is of type 'float *'
+\end{cfa}
+with a segmentation fault at runtime.
+Clearly, @gcc@ understands these ill-typed cases, and yet allows the program to compile, which seems like madness.
+Compiling with flag @-Werror@, which turns warnings into errors, is often too strong, because some warnings are just warnings.
+In the following discussion, ``ill-typed'' means giving a nonzero @gcc@ exit condition with a message that discusses typing.
+Note, \CFA's type-system rejects all these ill-typed cases as type mismatch errors.
+% That @f@'s attempt to call @g@ fails is not due to 3.14 being a particularly unlucky choice of value to put in the variable @pi@.
+% Rather, it is because obtaining a program that includes this essential fragment, yet exhibits a behaviour other than "doomed to crash," is a matter for an obfuscated coding competition.
+% A "tractable syntactic method for proving the absence of certain program behaviours by classifying phrases according to the kinds of values they compute"*1 rejected the program.
+% The behaviour (whose absence is unprovable) is neither minor nor unlikely.
+% The rejection shows that the program is ill-typed.
+%
+% Yet, the rejection presents as a GCC warning.
+% *1  TAPL-pg1 definition of a type system
 \section{Contributions}
+\subsection{Linked List}
+\subsection{Array}
+\subsection{String}

doc/theses/mike_brooks_MMath/uw-ethesis.tex

-                      rdf78cce
+                      r486caad
 \input{intro}
 \input{background}
+\input{array}
 \input{list}
-\input{array}
 \input{string}
 \input{conclusion}
