- Timestamp:
- Aug 3, 2024, 9:40:56 AM (5 months ago)
- Branches:
- master
- Children:
- 4e107bf, e15293b
- Parents:
- 8789ae4
- Location:
- doc/theses/jiada_liang_MMath
- Files:
-
- 2 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/jiada_liang_MMath/intro.tex
r8789ae4 r433e2c3 1 1 \chapter{Introduction} 2 2 3 All types in a programming language have a set of constants (symbols), and these constants representvalues, \eg integer types have constants @-1@, @17@, @0xff@ representing whole numbers, floating-point types have constants @5.3@, @2.3E-5@, @0xff.ffp0@ representing real numbers, character types have constants @'a'@, @"abc\n"@, \mbox{\lstinline{u8"}\texttt{\guillemotleft{na\"{i}ve}\guillemotright}\lstinline{"}} representing (human readable) text, \etc.3 All basic types in a programming language have a set of constants (symbols), and these constants represent computable values, \eg integer types have constants @-1@, @17@, @0xff@ representing whole numbers, floating-point types have constants @5.3@, @2.3E-5@, @0xff.ffp0@ representing real numbers, character types have constants @'a'@, @"abc\n"@, \mbox{\lstinline{u8"}\texttt{\guillemotleft{na\"{i}ve}\guillemotright}\lstinline{"}} representing (human readable) text, \etc. 4 4 Constants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integer and floating-point types. 5 5 (In \CFA, the constants @0@ and @1@ can be overloaded for any type.) 6 Higher-level types compose constants from the basic constants. 7 \begin{cfa} 8 struct S { int i, j, k; } s; 9 s = (S){ 1, 2, 3 }; $\C[2in]{// structure constant}$ 10 int x[5] = { 1, 2, 3, 4, 5 }; $\C{// array constant}\CRT$ 11 \end{cfa} 6 12 A constant's symbolic name is dictated by language syntax related to types, \eg @5.@ (double), @5.0f@ (float), @5l@ (long double). 7 In general, the representation of a constant's value is \newterm{opaque}, so the internal representation can be chosen arbitrarily .13 In general, the representation of a constant's value is \newterm{opaque}, so the internal representation can be chosen arbitrarily, \eg two's complement, IEEE floating-point. 8 14 In theory, there are an infinite set of constant names per type representing an infinite set of values. 9 15 … … 13 19 14 20 Many programming languages capture this important software-engineering capability through a mechanism called \newterm{constant} or \newterm{literal} naming, where a new constant is aliased to an existing constant. 15 Its purpose is for readability: replacing a constant name that directly represents a value with a name that is more symbolic and meaningful in the context of the program. 16 Thereafter, changing the aliasing of the new constant to another constant automatically distributes the rebinding, preventing errors. 17 % and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@. 21 Its purpose is for readability: replacing constant values in a program with symbolic names that are more meaningful to programmers in the context of the application. 22 Thereafter, associating a name to a different value automatically distributes this rebinding, preventing errors. 18 23 Because an aliased name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \newterm{rvalue}\footnote{ 19 24 The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}. … … 39 44 for ( cursor in Mon, Wed, Fri, Sun } ... $\C{// every second day of week}\CRT$ 40 45 \end{cfa} 41 A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Friday and Friday is after.46 A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Tuesday and Tuesday is after. 42 47 Ordering allows iterating among the enumeration set using relational operators and advancement, \eg: 43 48 \begin{cfa} 44 for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ... 49 for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ... // weekdays 45 50 \end{cfa} 46 51 Here the values for the set names are logically \emph{generated} rather than listing a subset of names. … … 50 55 \item 51 56 \begin{sloppypar} 52 It provides a finite set of new constants, which are implicitly or explicitly assigned values that must be appropriate for any set operations .57 It provides a finite set of new constants, which are implicitly or explicitly assigned values that must be appropriate for any set operations, \eg increasing order. 53 58 This aspect differentiates an enumeration from general types with an infinite set of constants. 54 59 \end{sloppypar} … … 106 111 \label{s:Aliasing} 107 112 108 Some languages provide simple aliasing (renaming) , \eg:113 Some languages provide simple aliasing (renaming). 109 114 \begin{cfa} 110 115 const Size = 20, Pi = 3.14159, Name = "Jane"; … … 113 118 It is possible to compare aliases, if the constants allow it, \eg @Size < Pi@, whereas @Pi < Name@ might be disallowed depending on the language. 114 119 115 Aliasing is notmacro substitution, \eg @#define Size 20@, where a name is replaced by its value \emph{before} compilation, so the name is invisible to the programming language.120 Aliasing is \emph{not} macro substitution, \eg @#define Size 20@, where a name is replaced by its value \emph{before} compilation, so the name is invisible to the programming language. 116 121 With aliasing, each new name is part of the language, and hence, participates fully, such as name overloading in the type system. 117 Aliasing is not an immutable variable , \eg:122 Aliasing is not an immutable variable. 118 123 \begin{cfa} 119 124 extern @const@ int Size = 20; … … 123 128 Taking the address of an immutable variable makes it an \newterm{lvalue}, which implies it has storage. 124 129 With separate compilation, it is necessary to choose one translation unit to perform the initialization. 125 If aliasing does requirestorage, its address and initialization are opaque (compiler only), similar to \CC rvalue reference @&&@.130 If aliasing requires storage, its address and initialization are opaque (compiler only), similar to \CC rvalue reference @&&@. 126 131 127 132 Aliasing does provide readability and automatic resubstitution. … … 151 156 baz = C S{ i = 7, d = 7.5 } 152 157 \end{haskell} 153 the ADT has three variants (constructors), @A@, @B@, @C@ with associated types @Int@, @Double@, and @S@.158 the ADT has three variants (constructors), @A@, @B@, @C@, with associated types @Int@, @Double@, and @S@. 154 159 The constructors create an initialized value of the specific type that is bound to the immutable variables @foo@, @bar@, and @baz@. 155 160 Hence, the ADT @Foo@ is like a union containing values of the associated types, and a constructor name is used to intialize and access the value using dynamic pattern-matching. 156 161 \begin{cquote} 157 \setlength{\tabcolsep}{ 15pt}162 \setlength{\tabcolsep}{20pt} 158 163 \begin{tabular}{@{}ll@{}} 159 164 \begin{haskell} 160 165 prtfoo val = -- function 161 162 163 164 165 166 167 166 -- pattern match on constructor 167 case val of 168 @A@ a -> print a 169 @B@ b -> print b 170 @C@ (S i d) -> do 171 print i 172 print d 168 173 \end{haskell} 169 174 & 170 175 \begin{haskell} 171 176 main = do 172 173 174 177 prtfoo foo 178 prtfoo bar 179 prtfoo baz 175 180 3 176 181 3.5 … … 193 198 Note, the term \newterm{variant} is often associated with ADTs. 194 199 However, there are multiple languages with a @variant@ type that is not an ADT \see{Algol68~\cite{Algol68} or \CC \lstinline{variant}}. 195 In these languages, the variant is often a union using RTTI tags for discrimination, which cannot be used to simulate an enumeration. 200 Here, the type (and possibly the position for equivalent types) is used to discriminant the specific \emph{variant} within the variant instance. 201 For example, \VRef[Figure]{f:C++variant} shows the \CC equivalent of the two Haskell ADT types using variant types. 202 In these languages, the variant cannot be used to simulate an enumeration. 196 203 Hence, in this work the term variant is not a synonym for ADT. 204 205 \begin{figure} 206 \begin{c++} 207 struct S { char s[32]; }; 208 variant< int, double, S > vd; 209 variant< int, int, int > vs; 210 211 // discrimination based on type 212 vd = 3; 213 if ( holds_alternative<int>(vd) ) cout << "int " << get<int>(vd ) << endl; 214 vd = 3.5; 215 if ( holds_alternative<double>(vd) ) cout << "double " << get<double>(vd) << endl; 216 vd = (S){ "abc" }; 217 if ( holds_alternative<S>(vd) ) cout << "S.s " << get<S>(vd).s << endl; 218 219 // discrimination based on type and position within type 220 vs = (variant<int,int,int>){ in_place_index<0>, 12 }; 221 if ( vs.index() == 0 ) cout << "posn 0 " << get<0>(vs) << endl; 222 vs = (variant<int,int,int>){ in_place_index<1>, 4 }; 223 if ( vs.index() == 1 ) cout << "posn 1 " << get<1>(vs) << endl; 224 vs = (variant<int,int,int>){ in_place_index<2>, 5 }; 225 if ( vs.index() == 2 ) cout << "posn 2 " << get<2>(vs) << endl; 226 \end{c++} 227 \caption{\CC \lstinline[language=C++]{variant} Discrimination Using RTTI/Position} 228 \label{f:C++variant} 229 \end{figure} 197 230 198 231 % https://downloads.haskell.org/ghc/latest/docs/libraries/base-4.19.1.0-179c/GHC-Enum.html … … 200 233 201 234 The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@. 202 Note, the unit type is not the same as \lstinline{void} , \eg:235 Note, the unit type is not the same as \lstinline{void}. 203 236 \begin{cfa} 204 237 void foo( void ); 205 238 struct unit {} u; $\C[1.5in]{// empty type}$ 206 239 unit bar( unit ); 207 foo( foo()); $\C{// void argument does not match with void parameter}$240 foo( @foo()@ ); $\C{// void argument does not match with void parameter}$ 208 241 bar( bar( u ) ); $\C{// unit argument does match with unit parameter}\CRT$ 209 242 \end{cfa} … … 214 247 \end{haskell} 215 248 the default type for each constructor is the unit type, and deriving from @Enum@ enforces no other associated types, @Eq@ allows equality comparison, and @Show@ is for printing. 216 The nullary constructors for the unit types are numbered left-to-right from $0$ to @maxBound@$- 1$, and provides enumerating operations @succ@, @pred@, @enumFrom@ @enumFromTo@.249 The nullary constructors for the unit types are numbered left-to-right from $0$ to @maxBound@$- 1$, and provides enumerating operations @succ@, @pred@, @enumFrom@, @enumFromTo@. 217 250 \VRef[Figure]{f:HaskellEnumeration} shows enumeration comparison and iterating (enumerating). 218 251 219 252 \begin{figure} 220 253 \begin{cquote} 221 \setlength{\tabcolsep}{ 15pt}254 \setlength{\tabcolsep}{40pt} 222 255 \begin{tabular}{@{}ll@{}} 223 256 \begin{haskell} 224 257 day = Tue 225 258 main = do 226 227 228 229 230 print (enumFrom Mon) -- week 231 print (enumFromTo Mon Fri) -- weekday 232 print (enumFromTo Sat Sun) -- weekend 259 if day == Tue then 260 print day 261 else 262 putStr "not Tue" 263 print (enumFrom Mon) $\C[2.25in]{-- week}$ 264 print (enumFromTo Mon Fri) $\C{-- weekday}$ 265 print (enumFromTo Sat Sun) $\C{-- weekend}\CRT$ 233 266 \end{haskell} 234 267 & … … 251 284 252 285 The key observation is the dichotomy between an ADT and enumeration: the ADT uses the associated type resulting in a union-like data structure, and the enumeration does not use the associated type, and hence, is not a union. 253 While an enumeration is constructed using the ADT mechanism,it is so restricted it is not an ADT.286 In contrast, an enumeration may be constructed using the ADT mechanism, but it is so restricted it is not an ADT. 254 287 Furthermore, a general ADT cannot be an enumeration because the constructors generate different values making enumerating meaningless. 255 288 While functional programming languages regularly repurpose the ADT type into an enumeration type, this process seems contrived and confusing. … … 266 299 \begin{enumerate} 267 300 \item 268 overloading 301 overloading: 269 302 \item 270 303 scoping -
doc/theses/jiada_liang_MMath/uw-ethesis-frontpgs.tex
r8789ae4 r433e2c3 131 131 \begin{center}\textbf{Abstract}\end{center} 132 132 133 % An enumeration is a type defining an ordered set of named constant values, where a name abstracts a value, \eg @PI@ versus @3.145159@. 134 % C restrict an enumeration type to the integral type @signed int@, which \CC support, meaning enumeration names bind to integer constants. 135 % \CFA extends C enumerations to allow all basic and custom types for the enumeration type, like other modern programming languages. 136 % Furthermore, \CFA adds other useful features for enumerations to support better software-engineering practices and simplify program development. 137 The \CFA (C-for-all) programming language is an evolutionary refinement of C programing language. One of its distinctive feature is the generic 138 types. But legacy data type from C, such as enumerations, does not adapt well into the \CFA generic type system. 139 140 This thesis presents an adaptation of enumerated types, in a way that integrates naturallly with the generic type feature of \CFA while being 141 backward-compatiable to C. This thesis also presents a number of smaller refinement to the \CFA overload resolution rules for enumerated types, 142 each of which improves the intuitive nature of enumerations. 143 The enumeration types improvement has been implemented into \CFA compiler and run-time environment. The root ideas behinds the change of 144 enumeration is to discover the approach of C data types working with \CFA generic types in order to improve the language expressiveity and safety. 133 An \emph{enumeration} is a type defining a (ordered) set of named constant values. 134 \begin{cfa} 135 enum Week { Mon, Tue, Wed, Thu, Fri, Sat, Sun }; 136 enum Math { PI = 3.14159, Tau = 6.28318, Phi = 1.61803 }; 137 enum RGB { Red = 100b, Green = 010b, Blue = 001b }; 138 \end{cfa} 139 Its purpose is for readability: replacing constant values in a program with symbolic names that are more meaningful to programmers in the context of the application. 140 Thereafter, associating a name to a different value automatically distributes this rebinding, preventing errors. 141 One of the key properties of an enumeration is the ability to enumerate (iterate) through the constants, and hence, access their values, if present. 142 C restricts an enumeration to the integral type @signed int@, while \CC extends enumerations to all integral types, meaning enumeration names must bind to integer constants. 143 Other modern programming languages provide bindings to any type and additional features to extend enumeration capabilities for better software-engineering practices. 144 145 The \CFA (C-for-all) programming language is an evolutionary refinement of the C programing language. 146 One of its distinctive feature is a parametric-polymorphic generic type. 147 However, legacy data types from C, such as enumerations, do not adapt well into the \CFA generic type-system. 148 149 This thesis extends the simple and unsafe enumeration type in the C programming language into a complex and safe enumeration type in the \CFA programming-language, while maintaining backwards compatibility with C. 150 The major contribution is an adaptation of enumerated types with the \CFA type-system in a way that integrates naturally with the generic types. 151 This thesis also presents a number of smaller refinement to the \CFA overload resolution rules for enumerated types, each of which improves the intuitive nature of enumeration name resolution by the compiler. 152 Finally, this work adds other useful features to enumerations that better support software-engineering practices and simplify program development. 153 145 154 \cleardoublepage 146 155 \phantomsection % allows hyperref to link to the correct page … … 151 160 \begin{center}\textbf{Acknowledgements}\end{center} 152 161 153 To begin, I would like to thank my supervisor Proferssor Peter Buhr. Thank you for your guidance and 154 support throughout my study and research. I would not be here without you. 155 156 Thanks Gregor Richards and Yzihou Zhang for reading my thesis. 162 To begin, I would like to thank my supervisor Professor Peter Buhr. 163 Thank you for your guidance and support throughout my study and research. 164 I would not be here without you. 165 166 Thanks to Gregor Richards and Yzihou Zhang for reading my thesis. 157 167 158 168 Special thanks to Andrew James Beach for your insight on the theory development on the thesis. 159 169 160 170 Thanks to Michael Brooks, Fangran Yu, Colby Parsons, Thierry Delisle, Mubeen Zulifiqar, 161 and entire Cforallteam for development of the \CFA language, making it the best language it can be.171 and the entire \CFA team for development of the \CFA language, making it the best language it can be. 162 172 163 173 Finally, a special thank you to Huawei Canada for funding this work.
Note: See TracChangeset
for help on using the changeset viewer.