Ignore:
Timestamp:
Apr 18, 2024, 10:23:34 PM (5 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
2b6db03
Parents:
c148966
Message:

more proofreading on enumerations

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/jiada_liang_MMath/intro.tex

    rc148966 r4da9142  
    11\chapter{Introduction}
    22
    3 All types in a programming language must have a set of constants, and these constants have \Newterm{primary names}, \eg integral types have constants @-1@, @17@, @12345@, \etc.
    4 Constants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integral and floating-point types.
     3All types in a programming language must have a set of constants, and these constants have \Newterm{primary names}, \eg integral types have constants @-1@, @17@, @0xff@, floating-point types have constants @5.3@, @2.3E-5@, @0xff.ffp0@, character types have constants @'a'@, @"abc\n"@, \mbox{\lstinline{u8"}\texttt{\guillemotleft{na\"{i}ve}\guillemotright}\lstinline{"}}, \etc.
     4Con\-stants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integral and floating-point types.
     5(In \CFA, the primary constants @0@ and @1@ can be overloaded for any type.)
    56Hence, each primary constant has a symbolic name referring to its internal representation, and these names are dictated by language syntax related to types.
    67In theory, there are an infinite set of primary names per type.
    78
    8 \Newterm{Secondary naming} is a common practice in mathematics and engineering, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MHz (1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
     9\Newterm{Secondary naming} is a common practice in mathematics, engineering and computer science, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MB (mega byte, 1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
    910Many programming languages capture this important software-engineering capability through a mechanism called \Newterm{constant} or \Newterm{literal} naming, where a secondary name is aliased to a primary name.
    10 In some cases, secondary naming is \Newterm{pure}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@.
    11 (The names the thing.)
     11Its common purpose is to eliminate duplication of the primary constant throughout a program.
     12For example, the secondary name replaces its primary name, thereafter changing the binding of the secondary to primary name automatically distributes the rebinding throughout the program.
     13In some cases, secondary naming is \Newterm{opaque}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@.
    1214Because a secondary name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \Newterm{rvalue}\footnote{
    1315The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}.
     
    1820enumerate (verb, transitive).
    1921To count, ascertain the number of;
    20 \emph{more
    21 usually, to mention (a number of things or persons) separately, as if for the
    22 purpose of counting};
    23 to specify as in a list or catalogue.~\cite{OED}
     22more usually, to mention (a number of things or persons) separately, as if for the purpose of counting;
     23to specify as in a list or catalogue.~\cite{OEDenumerate}
    2424\end{quote}
    25 Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are restricted to hold only the secondary names.
     25Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are \emph{often} restricted to hold only the secondary names.
    2626It is possible to enumerate among set names without having an ordering among the set elements.
    2727For example, the week, the weekdays, the weekend, and every second day of the week.
     
    2929for ( cursor in Mon, Tue, Wed, Thu, Fri, Sat, Sun } ... $\C[3.75in]{// week}$
    3030for ( cursor in Mon, Tue, Wed, Thu, Fri } ...   $\C{// weekday}$
    31 for ( cursor in Thu, Fri } ...                                  $\C{// weekend}$
     31for ( cursor in Sat, Sun } ...                                  $\C{// weekend}$
    3232for ( cursor in Mon, Wed, Fri, Sun } ...                $\C{// every second day of week}\CRT$
    3333\end{cfa}
    34 This independence from internal representation allows multiple names to have the same representation (eight note, quaver), giving synonyms.
     34This independence from internal representation allows multiple names to have the same representation (eighth note, quaver), giving synonyms.
    3535A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Friday and Friday is after.
    3636Ordering allows iterating among the enumeration set using relational operators and advancement, \eg
     
    3838for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ...
    3939\end{cfa}
    40 Here the internal representations for the secondary names are \emph{generated} rather than listing a subset of names.
     40Here the internal representations for the secondary names are logically \emph{generated} rather than listing a subset of names.
     41
     42Hence, the fundamental aspects of an enumeration are:
     43\begin{enumerate}
     44\item
     45It defines a type from which instants can be generated.
     46\item
     47The type lists a finite set of secondary names, which become its primary constants.
     48This differentiates an enumeration from general types with an infinite number of primary constants.
     49\item
     50An enumeration's secondary names represent constants, which follows from their binding (aliasing) to primary names, which are constants.
     51\item
     52For safety, an enumeration instance is restricted to hold only its type's secondary names.
     53\item
     54There is a mechanism for \emph{enumerating} over the secondary names, where the ordering can be implicit from the type, explicitly listed, or generated arithmetically.
     55\end{enumerate}
    4156
    4257
    4358\section{Terminology}
    44 
    45 The term \Newterm{enumeration} defines the set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name.
     59\label{s:Terminology}
     60
     61The term \Newterm{enumeration} defines a type with a set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name \see{\VRef{s:CEnumeration}}.
    4662As well, an enumerated type has three fundamental properties, \Newterm{label}, \Newterm{order}, and \Newterm{value}.
    4763\begin{cquote}
     
    7288\section{Motivation}
    7389
    74 Some programming languages only provide secondary renaming, which can be simulated by an enumeration without ordering.
    75 \begin{cfa}
    76 const Size = 20, Pi = 3.14159;
    77 enum { Size = 20, Pi = 3.14159 };   // unnamed enumeration $\(\Rightarrow\)$ no ordering
    78 \end{cfa}
    79 In both cases, it is possible to compare the secondary names, \eg @Size < Pi@, if that is meaningful;
    80 however, without an enumeration type-name, it is impossible to create an iterator cursor.
    81 
    82 Secondary renaming can similate an enumeration, but with extra effort.
     90Some programming languages only provide direct secondary renaming.
     91\begin{cfa}
     92const Size = 20, Pi = 3.14159, Name = "Jane";
     93\end{cfa}
     94Here, it is possible to compare the secondary names, \eg @Size < Pi@, if that is meaningful.
     95
     96Secondary renaming can simulate an enumeration, but with extra effort.
    8397\begin{cfa}
    8498const Mon = 1, Tue = 2, Wed = 3, Thu = 4, Fri = 5, Sat = 6, Sun = 7;
    8599\end{cfa}
    86 Furthermore, reordering the enumerators requires manual renumbering.
     100Any reordering of the enumerators requires manual renumbering.
    87101\begin{cfa}
    88102const Sun = 1, Mon = 2, Tue = 3, Wed = 4, Thu = 5, Fri = 6, Sat = 7;
    89103\end{cfa}
    90 Finally, there is no common type to create a type-checked instance or iterator cursor.
    91 Hence, there is only a weak equivalence between secondary naming and enumerations, justifying the enumeration type in a programming language.
    92 
    93 A variant (algebraic) type is often promoted as a kind of enumeration, \ie a varient type can simulate an enumeration.
    94 A variant type is a tagged-union, where the possible types may be heterogeneous.
    95 \begin{cfa}
     104Finally, there is no type to create a type-checked instance or iterator cursor.
     105Hence, there is only a weak equivalence between secondary naming and enumerations, justifying a seperate enumeration type in a programming language.
     106
     107A variant (algebraic) type is often promoted as a kind of enumeration, \ie a variant type can simulate an enumeration.
     108Fundamentally, a variant type is a tagged-union, where the tag is normally opaque and the types are usually heterogeneous.
     109\begin{cfa}[morekeywords={variant}]
    96110@variant@ Variant {
    97111        @int tag;@  // optional/implicit: 0 => int, 1 => double, 2 => S
     
    103117};
    104118\end{cfa}
    105 Crucially, the union implies instance storage is shared by all of the variant types.
    106 Hence, a variant is dynamically typed, as in a dynamic-typed programming-language, but the set of types is statically bound, similar to some aspects of dynamic gradual-typing~\cite{Gradual Typing}.
    107 Knowing which type is in a variant instance is crucial for correctness.
    108 Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking is unnecessary;
    109 otherwise, a tag is required to denote the particular type in the variant and the tag checked at runtime using some form of type pattern-matching.
    110 
    111 The tag can be implicitly set by the compiler on assignment, or explicitly set by the program\-mer.
    112 Type pattern-matching is then used to dynamically test the tag and branch to a section of code to safely manipulate the value, \eg:
     119Crucially, the union implies instance storage is shared by all the variant types, and therefore, before a variant type can be used in a statically-typed expression, it must be dynamically discriminated to its current contained type.
     120Hence, knowing which type is in a variant instance is crucial for correctness.
     121Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking is unnecessary.
     122Otherwise, a tag is required to denote the particular type in the variant, and the tag is discriminated at runtime using some form of type pattern-matching, after which the value can be used in a statically-typed expression.
     123
     124A less frequent variant case is multiple variants with the same type, which normally requires explicit naming of the tag to disambiguate among the common types.
     125\begin{cquote}
     126\begin{tabular}{@{}l@{\hspace{30pt}}l@{}}
     127\begin{cfa}[morekeywords={variant}]
     128variant VariantCT {
     129        case @car@: int i;  // explicitly typed
     130        case @boat@: int i;
     131        case @bridge@: int i;
     132};
     133\end{cfa}
     134&
     135\begin{cfa}[morekeywords={variant}]
     136variant VariantCU {
     137        case @car@: ;  // empty or unit type
     138        case @boat@: ;
     139        case @bridge@: ;
     140};
     141\end{cfa}
     142\end{tabular}
     143\end{cquote}
     144Here, the explicit tag name is used to give different meaning to the values in the common @int@ type, \eg the value 3 has different interpretations depending on the tag name.
     145It is even possible to remove the type or use the empty @unit@ type (@struct unit {}@).
     146It is this tag naming that is used to simulate an enumeration.
     147
     148Normally, the variant tag is implicitly set by the compiler based on type, but with common types, a tag name is required to resolve type ambiguity.
     149\begin{cfa}
     150Variant v = 3;                                                  $\C{// implicitly set tag to 0 based on type of 3}$
     151VariantCT ve = boats.3;                                 $\C{// explicitly set tag to 1 using tag name}$
     152\end{cfa}
     153Type pattern-matching is then used to dynamically test the tag and branch to a section of statically-typed code to safely manipulate the value, \eg:
     154\begin{cquote}
     155\begin{tabular}{@{}l@{\hspace{30pt}}l@{}}
    113156\begin{cfa}[morekeywords={match}]
    114 Variant v = 3;  // implicitly set tag to 0
    115 @match@( v ) {    // know the type or test the tag
    116         case int { /* only access i field in v */ }
    117         case double { /* only access d field in v */ }
    118         case S { /* only access s field in v */ }
     157@match@( v ) { // know type implicitly or test tag
     158        case int { /* only access i field */ }
     159        case double { /* only access d field */ }
     160        case S { /* only access s field */ }
    119161}
    120162\end{cfa}
    121 For safety, either all variant types must be listed or a @default@ case must exist with no field accesses.
    122 
    123 To simulate an enumeration with a variant, the tag is \emph{re-purposed} for either ordering or value and the variant types are omitted.
    124 \begin{cfa}
    125 variant Weekday {
    126         int tag; // implicit 0 => Mon, ..., 6 => Sun
    127         @case Mon;@ // no type
     163&
     164\begin{cfa}[morekeywords={match}]
     165@match@( ve ) {
     166        case car: int { /* car interpretation */ }
     167        case boat: int { /* boat interpretation */ }
     168        case bridge: int { /* bridge interpretation */ }
     169}
     170\end{cfa}
     171\end{tabular}
     172\end{cquote}
     173For safety, some languages require all variant types to be listed or a @default@ case with no field accesses.
     174
     175To further strengthen the simulate for an enumeration with different values, each variant type can be a @const@ type or the tag becomes non-opaque, possibly taking advantage of the opaque auto-numbering.
     176\begin{cquote}
     177\begin{tabular}{@{}l@{\hspace{30pt}}l@{}}
     178\begin{cfa}
     179variant Week {
     180        case Mon: const int = 0;
    128181        ...
    129         @case Sun;@
    130 };
    131 \end{cfa}
    132 The type system ensures tag setting and testing are correctly done.
    133 However, the enumeration operations are limited to the available tag operations, \eg pattern matching.
    134 \begin{cfa}
    135 Week week = Mon;
    136 if ( @dynamic_cast(Mon)@week ) ... // test tag == Mon
    137 \end{cfa}
     182        case Sat: const int = 5;
     183        case Sun: const int = 10;
     184};
     185\end{cfa}
     186&
     187\begin{cfa}
     188variant Week {
     189        case Mon: ;  // tag auto-numbering
     190        ...
     191        case Sat: ;
     192        case @Sun = 10@: ; // directly set tag value
     193};
     194\end{cfa}
     195\end{tabular}
     196\end{cquote}
     197Directly setting the tag implies restrictions, like unique values.
     198In both cases, instances of @Week@ are @const@ (immutable).
     199However, usage between these two types becomes complex.
     200\begin{cfa}
     201Week day = Week.Mon;  // sets value or tag depending on type
     202if ( day == Week.Mon )   // dereference value or tag ?
     203\end{cfa}
     204Here, the dereference of @day@ should return the value of the type stored in the variant, never the tag.
     205If it does return the tag, some special meaning must be given to the empty/unit type, especially if a variant contains both regular and unit types.
     206
     207
     208In general, the enumeration simulation and the variant extensions to support it, are deviating from the normal use of a variant (union) type.
     209As well, the enumeration operations are limited to the available tag operations, \eg pattern matching.
    138210While enumerating among tag names is possible:
    139211\begin{cfa}[morekeywords={in}]
    140 for ( cursor in Mon, Wed, Fri, Sun ) ...
    141 \end{cfa}
    142 ordering for iteration would require a \emph{magic} extension, such as a special @enum@ variant, because it has no meaning for a regular variant, \ie @int@ < @double@.
    143 
    144 However, if a special @enum@ variant allows the tags to be heterogeneously typed, ordering must fall back on case positioning, as many types have incomparable values.
    145 Iterating using tag ordering and heterogeneous types, also requires pattern matching.
    146 \begin{cfa}[morekeywords={match}]
    147 for ( cursor = Mon; cursor <= Fri; cursor = succ( cursor) ) {
    148         match( cursor ) {
    149                 case Mon { /* access special type for Mon */ }
    150                 ...
    151                 case Fri { /* access special type for Fri */ }
    152                 default
    153         }
    154 }
    155 \end{cfa}
    156 If the variant type is changed by adding/removing types or the loop range changes, the pattern matching must be adjusted.
    157 As well, if the start/stop values are dynamic, it may be impossible to statically determine if all variant types are listed.
    158 
    159 Re-purposing the notion of enumerating into variant types is ill formed and confusing.
    160 Hence, there is only a weak equivalence between an enumeration and variant type, justifying the enumeration type in a programming language.
     212for ( cursor in Week.Mon, Week.Wed, Week.Fri, Week.Sun ) ...
     213\end{cfa}
     214what is the type of @cursor@?
     215If it the tag type (@int@), how is this value used?
     216If it is the variant type, where is the instance variable, which only contains one value.
     217Hence, either enumerating with a variant enumeration is disallowed or some unusual typing rule must be invented to make it work but only in restricted contexts.
     218
     219While functional programming systems regularly re-purposing variant types into enumeration types, this process seems contrived and confusing.
     220A variant tag is not an enumeration, it is a discriminant among a restricted set of types stored in a storage block.
     221Hence, there is only a weak equivalence between an enumeration and variant type, justifying a seperate enumeration type in a programming language.
    161222
    162223
Note: See TracChangeset for help on using the changeset viewer.