Changeset 41fb996


Ignore:
Timestamp:
Mar 25, 2024, 9:02:18 AM (9 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
051aec4
Parents:
6a8c773
Message:

word smithing and poking at rust enumerations

Location:
doc/theses/jiada_liang_MMath
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/jiada_liang_MMath/intro.tex

    r6a8c773 r41fb996  
    11\chapter{Introduction}
    22
    3 All types in a programming language must have a set of constants, and these constants have primary names, \eg integral types have constants @-1@, @17@, @12345@, \etc.
     3All types in a programming language must have a set of constants, and these constants have \Newterm{primary names}, \eg integral types have constants @-1@, @17@, @12345@, \etc.
    44Constants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integral and floating-point types.
    55Hence, each primary constant has a symbolic name referring to its internal representation, and these names are dictated by language syntax related to types.
    66In theory, there are an infinite set of primary names per type.
    77
    8 Secondary naming is a common practice in mathematics and engineering, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MHz (1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
     8\Newterm{Secondary naming} is a common practice in mathematics and engineering, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MHz (1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.
    99Many programming languages capture this important software-engineering capability through a mechanism called \Newterm{constant} or \Newterm{literal} naming, where a secondary name is aliased to a primary name.
    1010In some cases, secondary naming is \Newterm{pure}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@.
     11(The names the thing.)
    1112Because a secondary name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \Newterm{rvalue}\footnote{
    1213The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}.
     
    2223to specify as in a list or catalogue.~\cite{OED}
    2324\end{quote}
    24 Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are restricted to its secondary names.
     25Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are restricted to hold only the secondary names.
    2526It is possible to enumerate among set names without having an ordering among the set elements.
    2627For example, the week, the weekdays, the weekend, and every second day of the week.
     
    4950\it\color{red}enumeration       & \multicolumn{8}{c}{\it\color{red}enumerators} \\
    5051$\downarrow$\hspace*{25pt}      & \multicolumn{8}{c}{$\downarrow$}                              \\
    51 @enum@ Weekday \{                       & Mon,  & Tue,  & Wed,  & Thu,  & Fri,  & Sat,  & Sun = 42      & \};   \\
     52@enum@ Week \{                          & Mon,  & Tue,  & Wed,  & Thu,  & Fri,  & Sat,  & Sun = 42      & \};   \\
    5253\it\color{red}label                     & Mon   & Tue   & Wed   & Thu   & Fri   & Sat   & Sun           &               \\
    5354\it\color{red}order                     & 0             & 1             & 2             & 3             & 4             & 5             & 6                     &               \\
     
    5556\end{tabular}
    5657\end{cquote}
    57 Here, the enumeration @Weekday@ defines the enumerator labels @Mon@, @Tue@, @Wed@, @Thu@, @Fri@, @Sat@ and @Sun@.
     58Here, the enumeration @Week@ defines the enumerator labels @Mon@, @Tue@, @Wed@, @Thu@, @Fri@, @Sat@ and @Sun@.
    5859The implicit ordering implies the successor of @Tue@ is @Mon@ and the predecessor of @Tue@ is @Wed@, independent of any associated enumerator values.
    5960The value is the constant represented by the secondary name, which can be implicitly or explicitly set.
     
    105106Hence, a variant is dynamically typed, as in a dynamic-typed programming-language, but the set of types is statically bound, similar to some aspects of dynamic gradual-typing~\cite{Gradual Typing}.
    106107Knowing which type is in a variant instance is crucial for correctness.
    107 Occasionally, it is possible to statically determine, all regions where each variant type is used, so a tag and runtime checking is unnecessary;
     108Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking is unnecessary;
    108109otherwise, a tag is required to denote the particular type in the variant and the tag checked at runtime using some form of type pattern-matching.
    109110
     
    120121For safety, either all variant types must be listed or a @default@ case must exist with no field accesses.
    121122
    122 To simulate an enumeration with a variant, the tag is re-purposed for either ordering or value and the variant types are omitted.
     123To simulate an enumeration with a variant, the tag is \emph{re-purposed} for either ordering or value and the variant types are omitted.
    123124\begin{cfa}
    124125variant Weekday {
     
    129130};
    130131\end{cfa}
    131 The type system ensures tag setting and testing are correct.
     132The type system ensures tag setting and testing are correctly done.
    132133However, the enumeration operations are limited to the available tag operations, \eg pattern matching.
    133134\begin{cfa}
    134 Weekday weekday = Mon;
    135 if ( @dynamic_cast(Mon)@weekday ) ... // test tag == Mon
     135Week week = Mon;
     136if ( @dynamic_cast(Mon)@week ) ... // test tag == Mon
    136137\end{cfa}
    137138While enumerating among tag names is possible:
     
    143144However, if a special @enum@ variant allows the tags to be heterogeneously typed, ordering must fall back on case positioning, as many types have incomparable values.
    144145Iterating using tag ordering and heterogeneous types, also requires pattern matching.
    145 \begin{cfa}
     146\begin{cfa}[morekeywords={match}]
    146147for ( cursor = Mon; cursor <= Fri; cursor = succ( cursor) ) {
    147         switch( cursor ) {
     148        match( cursor ) {
    148149                case Mon { /* access special type for Mon */ }
    149150                ...
    150151                case Fri { /* access special type for Fri */ }
     152                default
    151153        }
    152154}
    153155\end{cfa}
    154 If the variant type adds/removes types or the loop range changes, the pattern matching must be adjusted.
    155 As well, if the start/stop values are dynamic, it is impossible to statically determine if all variant types are listed.
     156If the variant type is changed by adding/removing types or the loop range changes, the pattern matching must be adjusted.
     157As well, if the start/stop values are dynamic, it may be impossible to statically determine if all variant types are listed.
    156158
    157 Forcing the notion of enumerating into variant types is ill formed and confusing.
     159Re-purposing the notion of enumerating into variant types is ill formed and confusing.
    158160Hence, there is only a weak equivalence between an enumeration and variant type, justifying the enumeration type in a programming language.
    159161
     
    163165The goal of this work is to to extend the simple and unsafe enumeration type in the C programming-language into a sophisticated and safe type in the \CFA programming-language, while maintain backwards compatibility with C.
    164166On the surface, enumerations seem like a simple type.
    165 However, when extended with advanced features, enumerations become complex for both the type system and the implementation.
     167However, when extended with advanced features, enumerations become complex for both the type system and the runtime implementation.
    166168
    167169\begin{enumerate}
  • doc/theses/jiada_liang_MMath/relatedwork.tex

    r6a8c773 r41fb996  
    608608\lstnewenvironment{rust}[1][]{\lstset{language=Rust,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{}
    609609
    610 Enumerations
     610Rust provides a scoped enumeration based on variant types.
     611% An enumeration, also referred to as an enum, is a simultaneous definition of a nominal enumerated type as well as a set of constructors, that can be used to create or pattern-match values of the corresponding enumerated type.
     612An enumeration without constructors is called field-less.
    611613\begin{rust}
    612         Syntax
    613         Enumeration :
    614            enum IDENTIFIER  GenericParams? WhereClause? { EnumItems? }
    615 
    616         EnumItems :
    617            EnumItem ( , EnumItem )* ,?
    618 
    619         EnumItem :
    620            OuterAttribute* Visibility?
    621            IDENTIFIER ( EnumItemTuple | EnumItemStruct )? EnumItemDiscriminant?
    622 
    623         EnumItemTuple :
    624            ( TupleFields? )
    625 
    626         EnumItemStruct :
    627            { StructFields? }
    628 
    629         EnumItemDiscriminant :
    630            = Expression
     614enum Week { Mon, Tues, Wed, Thu, Fri, Sat, Sun@,@ }
     615let mut week: Week = Week::Mon;
     616week = Week::Fri;
    631617\end{rust}
    632 An enumeration, also referred to as an enum, is a simultaneous definition of a nominal enumerated type as well as a set of constructors, that can be used to create or pattern-match values of the corresponding enumerated type.
    633 
    634 Enumerations are declared with the keyword enum.
    635 
    636 An example of an enum item and its use:
     618A field-less enumeration with only unit variants is called unit-only.
    637619\begin{rust}
    638 enum Animal {
    639         Dog,
    640         Cat,
    641 }
    642 
    643 let mut a: Animal = Animal::Dog;
    644 a = Animal::Cat;
     620enum Week { Mon = 0, Tues = 1, Wed = 2, Thu = 3, Fri = 4, Sat = 5, Sun = 6 }
    645621\end{rust}
    646622Enum constructors can have either named or unnamed fields:
    647623\begin{rust}
    648624enum Animal {
    649         Dog(String, f64),
    650         Cat { name: String, weight: f64 },
    651 }
    652 
     625        Dog( String, f64 ),
     626        Cat{ name: String, weight: f64 },
     627}
    653628let mut a: Animal = Animal::Dog("Cocoa".to_string(), 37.2);
    654629a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
    655630\end{rust}
    656 In this example, Cat is a struct-like enum variant, whereas Dog is simply called an enum variant.
    657 
    658 An enum where no constructors contain fields are called a field-less enum. For example, this is a fieldless enum:
     631Here, @Dog@ is an @enum@ variant, whereas @Cat@ is a struct-like variant.
     632
     633Each @enum@ type has an implicit integer tag (discriminant), with a unique value for each variant type.
     634Like a C enumeration, the tag values for the variant types start at 0 with auto incrementing.
     635The tag is re-purposed for enumeration by allowing it to be explicitly set, and auto incrmenting continues from that value.
     636\begin{cquote}
     637\sf\setlength{\tabcolsep}{3pt}
     638\begin{tabular}{rcccccccr}
     639@enum@ Week \{  & Mon,  & Tue,  & Wed = 2,      & Thu = 10,     & Fri,  & Sat = 5,      & Sun   & \};   \\
     640\rm tags                & 0             & 1             & 2                     & 10            & 11    & 5                     & 6             &               \\
     641\end{tabular}
     642\end{cquote}
     643In general, the tag can only be read as an opaque reference for comparison.
    659644\begin{rust}
    660 enum Fieldless {
    661         Tuple(),
    662         Struct{},
    663         Unit,
    664 }
     645if mem::discriminant(&week) == mem::discriminant(&Week::Mon) ...
    665646\end{rust}
    666 If a field-less enum only contains unit variants, the enum is called an unit-only enum. For example:
     647If the enumeration is unit-only, or field-less with no explicit discriminants and where only unit variants are explicit, then the discriminant is accessible with a numeric cast.
    667648\begin{rust}
    668 enum Enum {
    669         Foo = 3,
    670         Bar = 2,
    671         Baz = 1,
    672 }
    673 \end{rust}
    674 
    675 \subsection{Discriminants}
    676 
    677 Each enum instance has a discriminant: an integer logically associated to it that is used to determine which variant it holds.
    678 
    679 Under the default representation, the discriminant is interpreted as an isize value. However, the compiler is allowed to use a smaller type (or another means of distinguishing variants) in its actual memory layout.
    680 
    681 \subsection{Assigning discriminant values}
    682 
    683 \subsection{Explicit discriminants}
    684 
    685 In two circumstances, the discriminant of a variant may be explicitly set by following the variant name with = and a constant expression:
    686 
    687         if the enumeration is "unit-only".
    688 
    689         if a primitive representation is used. For example:
    690 \begin{rust}
    691         #[repr(u8)]
    692         enum Enum {
    693                 Unit = 3,
    694                 Tuple(u16),
    695                 Struct {
    696                         a: u8,
    697                         b: u16,
    698                 } = 1,
    699         }
    700 \end{rust}
    701 
    702 \subsection{Implicit discriminants}
    703 
    704 If a discriminant for a variant is not specified, then it is set to one higher than the discriminant of the previous variant in the declaration. If the discriminant of the first variant in the declaration is unspecified, then it is set to zero.
    705 \begin{rust}
    706 enum Foo {
    707         Bar,                    // 0
    708         Baz = 123,        // 123
    709         Quux,              // 124
    710 }
    711 
    712 let baz_discriminant = Foo::Baz as u32;
    713 assert_eq!(baz_discriminant, 123);
    714 \end{rust}
    715 
    716 \subsection{Restrictions}
    717 
    718 It is an error when two variants share the same discriminant.
    719 \begin{rust}
    720 enum SharedDiscriminantError {
    721         SharedA = 1,
    722         SharedB = 1
    723 }
    724 
    725 enum SharedDiscriminantError2 {
    726         Zero,      // 0
    727         One,            // 1
    728         OneToo = 1  // 1 (collision with previous!)
    729 }
    730 \end{rust}
    731 It is also an error to have an unspecified discriminant where the previous discriminant is the maximum value for the size of the discriminant.
    732 \begin{rust}
    733 #[repr(u8)]
    734 enum OverflowingDiscriminantError {
    735         Max = 255,
    736         MaxPlusOne // Would be 256, but that overflows the enum.
    737 }
    738 
    739 #[repr(u8)]
    740 enum OverflowingDiscriminantError2 {
    741         MaxMinusOne = 254, // 254
    742         Max,                       // 255
    743         MaxPlusOne               // Would be 256, but that overflows the enum.
    744 }
    745 \end{rust}
    746 
    747 \subsection{Accessing discriminant}
    748 
    749 \begin{rust}
    750 Via mem::discriminant
    751 \end{rust}
    752 @mem::discriminant@ returns an opaque reference to the discriminant of an enum value which can be compared. This cannot be used to get the value of the discriminant.
    753 
    754 \subsection{Casting}
    755 
    756 If an enumeration is unit-only (with no tuple and struct variants), then its discriminant can be directly accessed with a numeric cast; e.g.:
    757 \begin{rust}
    758 enum Enum {
    759         Foo,
    760         Bar,
    761         Baz,
    762 }
    763 
    764 assert_eq!(0, Enum::Foo as isize);
    765 assert_eq!(1, Enum::Bar as isize);
    766 assert_eq!(2, Enum::Baz as isize);
    767 \end{rust}
    768 Field-less enums can be casted if they do not have explicit discriminants, or where only unit variants are explicit.
    769 \begin{rust}
    770 enum Fieldless {
    771         Tuple(),
    772         Struct{},
    773         Unit,
    774 }
    775 
    776 assert_eq!(0, Fieldless::Tuple() as isize);
    777 assert_eq!(1, Fieldless::Struct{} as isize);
    778 assert_eq!(2, Fieldless::Unit as isize);
    779 \end{rust}
    780 \begin{rust}
    781 #[repr(u8)]
    782 enum FieldlessWithDiscrimants {
    783         First = 10,
    784         Tuple(),
    785         Second = 20,
    786         Struct{},
    787         Unit,
    788 }
    789 
    790 assert_eq!(10, FieldlessWithDiscrimants::First as u8);
    791 assert_eq!(11, FieldlessWithDiscrimants::Tuple() as u8);
    792 assert_eq!(20, FieldlessWithDiscrimants::Second as u8);
    793 assert_eq!(21, FieldlessWithDiscrimants::Struct{} as u8);
    794 assert_eq!(22, FieldlessWithDiscrimants::Unit as u8);
    795 \end{rust}
    796 
    797 \subsection{Pointer casting}
    798 
    799 If the enumeration specifies a primitive representation, then the discriminant may be reliably accessed via unsafe pointer casting:
    800 \begin{rust}
    801 #[repr(u8)]
    802 enum Enum {
    803         Unit,
    804         Tuple(bool),
    805         Struct{a: bool},
    806 }
    807 
    808 impl Enum {
    809         fn discriminant(&self) -> u8 {
    810                 unsafe { *(self as *const Self as *const u8) }
    811         }
    812 }
    813 
    814 let unit_like = Enum::Unit;
    815 let tuple_like = Enum::Tuple(true);
    816 let struct_like = Enum::Struct{a: false};
    817 
    818 assert_eq!(0, unit_like.discriminant());
    819 assert_eq!(1, tuple_like.discriminant());
    820 assert_eq!(2, struct_like.discriminant());
    821 \end{rust}
    822 
    823 \subsection{Zero-variant enums}
    824 
    825 Enums with zero variants are known as zero-variant enums. As they have no valid values, they cannot be instantiated.
    826 \begin{rust}
    827 enum ZeroVariants {}
    828 \end{rust}
    829 Zero-variant enums are equivalent to the never type, but they cannot be coerced into other types.
    830 \begin{rust}
    831 let x: ZeroVariants = panic!();
    832 let y: u32 = x; // mismatched type error
    833 \end{rust}
    834 
    835 \subsection{Variant visibility}
    836 
    837 Enum variants syntactically allow a Visibility annotation, but this is rejected when the enum is validated. This allows items to be parsed with a unified syntax across different contexts where they are used.
    838 \begin{rust}
    839 macro_rules! mac_variant {
    840         ($vis:vis $name:ident) => {
    841                 enum $name {
    842                         $vis Unit,
    843 
    844                         $vis Tuple(u8, u16),
    845 
    846                         $vis Struct { f: u8 },
    847                 }
    848         }
    849 }
    850 
    851 // Empty `vis` is allowed.
    852 mac_variant! { E }
    853 
    854 // This is allowed, since it is removed before being validated.
    855 #[cfg(FALSE)]
    856 enum E {
    857         pub U,
    858         pub(crate) T(u8),
    859         pub(super) T { f: String }
    860 }
     649if week as isize == Week::Mon as isize ...
    861650\end{rust}
    862651
Note: See TracChangeset for help on using the changeset viewer.