Changeset 4da9142 for doc/theses/jiada_liang_MMath/intro.tex
- Timestamp:
- Apr 18, 2024, 10:23:34 PM (5 months ago)
- Branches:
- master
- Children:
- 2b6db03
- Parents:
- c148966
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/jiada_liang_MMath/intro.tex
rc148966 r4da9142 1 1 \chapter{Introduction} 2 2 3 All types in a programming language must have a set of constants, and these constants have \Newterm{primary names}, \eg integral types have constants @-1@, @17@, @12345@, \etc. 4 Constants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integral and floating-point types. 3 All types in a programming language must have a set of constants, and these constants have \Newterm{primary names}, \eg integral types have constants @-1@, @17@, @0xff@, floating-point types have constants @5.3@, @2.3E-5@, @0xff.ffp0@, character types have constants @'a'@, @"abc\n"@, \mbox{\lstinline{u8"}\texttt{\guillemotleft{na\"{i}ve}\guillemotright}\lstinline{"}}, \etc. 4 Con\-stants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integral and floating-point types. 5 (In \CFA, the primary constants @0@ and @1@ can be overloaded for any type.) 5 6 Hence, each primary constant has a symbolic name referring to its internal representation, and these names are dictated by language syntax related to types. 6 7 In theory, there are an infinite set of primary names per type. 7 8 8 \Newterm{Secondary naming} is a common practice in mathematics and engineering, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MHz (1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc.9 \Newterm{Secondary naming} is a common practice in mathematics, engineering and computer science, \eg $\pi$, $\tau$ (2$\pi$), $\phi$ (golden ratio), MB (mega byte, 1E6), and in general situations, \eg specific times (noon, New Years), cities (Big Apple), flowers (Lily), \etc. 9 10 Many programming languages capture this important software-engineering capability through a mechanism called \Newterm{constant} or \Newterm{literal} naming, where a secondary name is aliased to a primary name. 10 In some cases, secondary naming is \Newterm{pure}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@. 11 (The names the thing.) 11 Its common purpose is to eliminate duplication of the primary constant throughout a program. 12 For example, the secondary name replaces its primary name, thereafter changing the binding of the secondary to primary name automatically distributes the rebinding throughout the program. 13 In some cases, secondary naming is \Newterm{opaque}, where the matching internal representation can be chosen arbitrarily, and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@. 12 14 Because a secondary name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \Newterm{rvalue}\footnote{ 13 15 The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}. … … 18 20 enumerate (verb, transitive). 19 21 To count, ascertain the number of; 20 \emph{more 21 usually, to mention (a number of things or persons) separately, as if for the 22 purpose of counting}; 23 to specify as in a list or catalogue.~\cite{OED} 22 more usually, to mention (a number of things or persons) separately, as if for the purpose of counting; 23 to specify as in a list or catalogue.~\cite{OEDenumerate} 24 24 \end{quote} 25 Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are restricted to hold only the secondary names.25 Within an enumeration set, the enumeration names must be unique, and instances of an enumerated type are \emph{often} restricted to hold only the secondary names. 26 26 It is possible to enumerate among set names without having an ordering among the set elements. 27 27 For example, the week, the weekdays, the weekend, and every second day of the week. … … 29 29 for ( cursor in Mon, Tue, Wed, Thu, Fri, Sat, Sun } ... $\C[3.75in]{// week}$ 30 30 for ( cursor in Mon, Tue, Wed, Thu, Fri } ... $\C{// weekday}$ 31 for ( cursor in Thu, Fri} ... $\C{// weekend}$31 for ( cursor in Sat, Sun } ... $\C{// weekend}$ 32 32 for ( cursor in Mon, Wed, Fri, Sun } ... $\C{// every second day of week}\CRT$ 33 33 \end{cfa} 34 This independence from internal representation allows multiple names to have the same representation (eight note, quaver), giving synonyms.34 This independence from internal representation allows multiple names to have the same representation (eighth note, quaver), giving synonyms. 35 35 A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Friday and Friday is after. 36 36 Ordering allows iterating among the enumeration set using relational operators and advancement, \eg … … 38 38 for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ... 39 39 \end{cfa} 40 Here the internal representations for the secondary names are \emph{generated} rather than listing a subset of names. 40 Here the internal representations for the secondary names are logically \emph{generated} rather than listing a subset of names. 41 42 Hence, the fundamental aspects of an enumeration are: 43 \begin{enumerate} 44 \item 45 It defines a type from which instants can be generated. 46 \item 47 The type lists a finite set of secondary names, which become its primary constants. 48 This differentiates an enumeration from general types with an infinite number of primary constants. 49 \item 50 An enumeration's secondary names represent constants, which follows from their binding (aliasing) to primary names, which are constants. 51 \item 52 For safety, an enumeration instance is restricted to hold only its type's secondary names. 53 \item 54 There is a mechanism for \emph{enumerating} over the secondary names, where the ordering can be implicit from the type, explicitly listed, or generated arithmetically. 55 \end{enumerate} 41 56 42 57 43 58 \section{Terminology} 44 45 The term \Newterm{enumeration} defines the set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name. 59 \label{s:Terminology} 60 61 The term \Newterm{enumeration} defines a type with a set of secondary names, and the term \Newterm{enumerator} represents an arbitrary secondary name \see{\VRef{s:CEnumeration}}. 46 62 As well, an enumerated type has three fundamental properties, \Newterm{label}, \Newterm{order}, and \Newterm{value}. 47 63 \begin{cquote} … … 72 88 \section{Motivation} 73 89 74 Some programming languages only provide secondary renaming, which can be simulated by an enumeration without ordering. 75 \begin{cfa} 76 const Size = 20, Pi = 3.14159; 77 enum { Size = 20, Pi = 3.14159 }; // unnamed enumeration $\(\Rightarrow\)$ no ordering 78 \end{cfa} 79 In both cases, it is possible to compare the secondary names, \eg @Size < Pi@, if that is meaningful; 80 however, without an enumeration type-name, it is impossible to create an iterator cursor. 81 82 Secondary renaming can similate an enumeration, but with extra effort. 90 Some programming languages only provide direct secondary renaming. 91 \begin{cfa} 92 const Size = 20, Pi = 3.14159, Name = "Jane"; 93 \end{cfa} 94 Here, it is possible to compare the secondary names, \eg @Size < Pi@, if that is meaningful. 95 96 Secondary renaming can simulate an enumeration, but with extra effort. 83 97 \begin{cfa} 84 98 const Mon = 1, Tue = 2, Wed = 3, Thu = 4, Fri = 5, Sat = 6, Sun = 7; 85 99 \end{cfa} 86 Furthermore, reorderingthe enumerators requires manual renumbering.100 Any reordering of the enumerators requires manual renumbering. 87 101 \begin{cfa} 88 102 const Sun = 1, Mon = 2, Tue = 3, Wed = 4, Thu = 5, Fri = 6, Sat = 7; 89 103 \end{cfa} 90 Finally, there is no commontype to create a type-checked instance or iterator cursor.91 Hence, there is only a weak equivalence between secondary naming and enumerations, justifying the enumeration type in a programming language.92 93 A variant (algebraic) type is often promoted as a kind of enumeration, \ie a vari ent type can simulate an enumeration.94 A variant type is a tagged-union, where the possible types may beheterogeneous.95 \begin{cfa} 104 Finally, there is no type to create a type-checked instance or iterator cursor. 105 Hence, there is only a weak equivalence between secondary naming and enumerations, justifying a seperate enumeration type in a programming language. 106 107 A variant (algebraic) type is often promoted as a kind of enumeration, \ie a variant type can simulate an enumeration. 108 Fundamentally, a variant type is a tagged-union, where the tag is normally opaque and the types are usually heterogeneous. 109 \begin{cfa}[morekeywords={variant}] 96 110 @variant@ Variant { 97 111 @int tag;@ // optional/implicit: 0 => int, 1 => double, 2 => S … … 103 117 }; 104 118 \end{cfa} 105 Crucially, the union implies instance storage is shared by all of the variant types. 106 Hence, a variant is dynamically typed, as in a dynamic-typed programming-language, but the set of types is statically bound, similar to some aspects of dynamic gradual-typing~\cite{Gradual Typing}. 107 Knowing which type is in a variant instance is crucial for correctness. 108 Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking is unnecessary; 109 otherwise, a tag is required to denote the particular type in the variant and the tag checked at runtime using some form of type pattern-matching. 110 111 The tag can be implicitly set by the compiler on assignment, or explicitly set by the program\-mer. 112 Type pattern-matching is then used to dynamically test the tag and branch to a section of code to safely manipulate the value, \eg: 119 Crucially, the union implies instance storage is shared by all the variant types, and therefore, before a variant type can be used in a statically-typed expression, it must be dynamically discriminated to its current contained type. 120 Hence, knowing which type is in a variant instance is crucial for correctness. 121 Occasionally, it is possible to statically determine all regions where each variant type is used, so a tag and runtime checking is unnecessary. 122 Otherwise, a tag is required to denote the particular type in the variant, and the tag is discriminated at runtime using some form of type pattern-matching, after which the value can be used in a statically-typed expression. 123 124 A less frequent variant case is multiple variants with the same type, which normally requires explicit naming of the tag to disambiguate among the common types. 125 \begin{cquote} 126 \begin{tabular}{@{}l@{\hspace{30pt}}l@{}} 127 \begin{cfa}[morekeywords={variant}] 128 variant VariantCT { 129 case @car@: int i; // explicitly typed 130 case @boat@: int i; 131 case @bridge@: int i; 132 }; 133 \end{cfa} 134 & 135 \begin{cfa}[morekeywords={variant}] 136 variant VariantCU { 137 case @car@: ; // empty or unit type 138 case @boat@: ; 139 case @bridge@: ; 140 }; 141 \end{cfa} 142 \end{tabular} 143 \end{cquote} 144 Here, the explicit tag name is used to give different meaning to the values in the common @int@ type, \eg the value 3 has different interpretations depending on the tag name. 145 It is even possible to remove the type or use the empty @unit@ type (@struct unit {}@). 146 It is this tag naming that is used to simulate an enumeration. 147 148 Normally, the variant tag is implicitly set by the compiler based on type, but with common types, a tag name is required to resolve type ambiguity. 149 \begin{cfa} 150 Variant v = 3; $\C{// implicitly set tag to 0 based on type of 3}$ 151 VariantCT ve = boats.3; $\C{// explicitly set tag to 1 using tag name}$ 152 \end{cfa} 153 Type pattern-matching is then used to dynamically test the tag and branch to a section of statically-typed code to safely manipulate the value, \eg: 154 \begin{cquote} 155 \begin{tabular}{@{}l@{\hspace{30pt}}l@{}} 113 156 \begin{cfa}[morekeywords={match}] 114 Variant v = 3; // implicitly set tag to 0 115 @match@( v ) { // know the type or test the tag 116 case int { /* only access i field in v */ } 117 case double { /* only access d field in v */ } 118 case S { /* only access s field in v */ } 157 @match@( v ) { // know type implicitly or test tag 158 case int { /* only access i field */ } 159 case double { /* only access d field */ } 160 case S { /* only access s field */ } 119 161 } 120 162 \end{cfa} 121 For safety, either all variant types must be listed or a @default@ case must exist with no field accesses. 122 123 To simulate an enumeration with a variant, the tag is \emph{re-purposed} for either ordering or value and the variant types are omitted. 124 \begin{cfa} 125 variant Weekday { 126 int tag; // implicit 0 => Mon, ..., 6 => Sun 127 @case Mon;@ // no type 163 & 164 \begin{cfa}[morekeywords={match}] 165 @match@( ve ) { 166 case car: int { /* car interpretation */ } 167 case boat: int { /* boat interpretation */ } 168 case bridge: int { /* bridge interpretation */ } 169 } 170 \end{cfa} 171 \end{tabular} 172 \end{cquote} 173 For safety, some languages require all variant types to be listed or a @default@ case with no field accesses. 174 175 To further strengthen the simulate for an enumeration with different values, each variant type can be a @const@ type or the tag becomes non-opaque, possibly taking advantage of the opaque auto-numbering. 176 \begin{cquote} 177 \begin{tabular}{@{}l@{\hspace{30pt}}l@{}} 178 \begin{cfa} 179 variant Week { 180 case Mon: const int = 0; 128 181 ... 129 @case Sun;@ 130 }; 131 \end{cfa} 132 The type system ensures tag setting and testing are correctly done. 133 However, the enumeration operations are limited to the available tag operations, \eg pattern matching. 134 \begin{cfa} 135 Week week = Mon; 136 if ( @dynamic_cast(Mon)@week ) ... // test tag == Mon 137 \end{cfa} 182 case Sat: const int = 5; 183 case Sun: const int = 10; 184 }; 185 \end{cfa} 186 & 187 \begin{cfa} 188 variant Week { 189 case Mon: ; // tag auto-numbering 190 ... 191 case Sat: ; 192 case @Sun = 10@: ; // directly set tag value 193 }; 194 \end{cfa} 195 \end{tabular} 196 \end{cquote} 197 Directly setting the tag implies restrictions, like unique values. 198 In both cases, instances of @Week@ are @const@ (immutable). 199 However, usage between these two types becomes complex. 200 \begin{cfa} 201 Week day = Week.Mon; // sets value or tag depending on type 202 if ( day == Week.Mon ) // dereference value or tag ? 203 \end{cfa} 204 Here, the dereference of @day@ should return the value of the type stored in the variant, never the tag. 205 If it does return the tag, some special meaning must be given to the empty/unit type, especially if a variant contains both regular and unit types. 206 207 208 In general, the enumeration simulation and the variant extensions to support it, are deviating from the normal use of a variant (union) type. 209 As well, the enumeration operations are limited to the available tag operations, \eg pattern matching. 138 210 While enumerating among tag names is possible: 139 211 \begin{cfa}[morekeywords={in}] 140 for ( cursor in Mon, Wed, Fri, Sun ) ... 141 \end{cfa} 142 ordering for iteration would require a \emph{magic} extension, such as a special @enum@ variant, because it has no meaning for a regular variant, \ie @int@ < @double@. 143 144 However, if a special @enum@ variant allows the tags to be heterogeneously typed, ordering must fall back on case positioning, as many types have incomparable values. 145 Iterating using tag ordering and heterogeneous types, also requires pattern matching. 146 \begin{cfa}[morekeywords={match}] 147 for ( cursor = Mon; cursor <= Fri; cursor = succ( cursor) ) { 148 match( cursor ) { 149 case Mon { /* access special type for Mon */ } 150 ... 151 case Fri { /* access special type for Fri */ } 152 default 153 } 154 } 155 \end{cfa} 156 If the variant type is changed by adding/removing types or the loop range changes, the pattern matching must be adjusted. 157 As well, if the start/stop values are dynamic, it may be impossible to statically determine if all variant types are listed. 158 159 Re-purposing the notion of enumerating into variant types is ill formed and confusing. 160 Hence, there is only a weak equivalence between an enumeration and variant type, justifying the enumeration type in a programming language. 212 for ( cursor in Week.Mon, Week.Wed, Week.Fri, Week.Sun ) ... 213 \end{cfa} 214 what is the type of @cursor@? 215 If it the tag type (@int@), how is this value used? 216 If it is the variant type, where is the instance variable, which only contains one value. 217 Hence, either enumerating with a variant enumeration is disallowed or some unusual typing rule must be invented to make it work but only in restricted contexts. 218 219 While functional programming systems regularly re-purposing variant types into enumeration types, this process seems contrived and confusing. 220 A variant tag is not an enumeration, it is a discriminant among a restricted set of types stored in a storage block. 221 Hence, there is only a weak equivalence between an enumeration and variant type, justifying a seperate enumeration type in a programming language. 161 222 162 223
Note: See TracChangeset
for help on using the changeset viewer.