Changeset 433e2c3


Ignore:
Timestamp:
Aug 3, 2024, 9:40:56 AM (3 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
4e107bf, e15293b
Parents:
8789ae4
Message:

proofread thesis frontpgs and intro

Location:
doc/theses/jiada_liang_MMath
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/jiada_liang_MMath/intro.tex

    r8789ae4 r433e2c3  
    11\chapter{Introduction}
    22
    3 All types in a programming language have a set of constants (symbols), and these constants represent values, \eg integer types have constants @-1@, @17@, @0xff@ representing whole numbers, floating-point types have constants @5.3@, @2.3E-5@, @0xff.ffp0@ representing  real numbers, character types have constants @'a'@, @"abc\n"@, \mbox{\lstinline{u8"}\texttt{\guillemotleft{na\"{i}ve}\guillemotright}\lstinline{"}} representing (human readable) text, \etc.
     3All basic types in a programming language have a set of constants (symbols), and these constants represent computable values, \eg integer types have constants @-1@, @17@, @0xff@ representing whole numbers, floating-point types have constants @5.3@, @2.3E-5@, @0xff.ffp0@ representing  real numbers, character types have constants @'a'@, @"abc\n"@, \mbox{\lstinline{u8"}\texttt{\guillemotleft{na\"{i}ve}\guillemotright}\lstinline{"}} representing (human readable) text, \etc.
    44Constants can be overloaded among types, \eg @0@ is a null pointer for all pointer types, and the value zero for integer and floating-point types.
    55(In \CFA, the constants @0@ and @1@ can be overloaded for any type.)
     6Higher-level types compose constants from the basic constants.
     7\begin{cfa}
     8struct S { int i, j, k; } s;
     9s = (S){ 1, 2, 3 };                             $\C[2in]{// structure constant}$
     10int x[5] = { 1, 2, 3, 4, 5 };   $\C{// array constant}\CRT$
     11\end{cfa}
    612A constant's symbolic name is dictated by language syntax related to types, \eg @5.@ (double), @5.0f@ (float), @5l@ (long double).
    7 In general, the representation of a constant's value is \newterm{opaque}, so the internal representation can be chosen arbitrarily.
     13In general, the representation of a constant's value is \newterm{opaque}, so the internal representation can be chosen arbitrarily, \eg two's complement, IEEE floating-point.
    814In theory, there are an infinite set of constant names per type representing an infinite set of values.
    915
     
    1319
    1420Many programming languages capture this important software-engineering capability through a mechanism called \newterm{constant} or \newterm{literal} naming, where a new constant is aliased to an existing constant.
    15 Its purpose is for readability: replacing a constant name that directly represents a value with a name that is more symbolic and meaningful in the context of the program.
    16 Thereafter, changing the aliasing of the new constant to another constant automatically distributes the rebinding, preventing errors.
    17 % and only equality operations are available, \eg @O_RDONLY@, @O_WRONLY@, @O_CREAT@, @O_TRUNC@, @O_APPEND@.
     21Its purpose is for readability: replacing constant values in a program with symbolic names that are more meaningful to programmers in the context of the application.
     22Thereafter, associating a name to a different value automatically distributes this rebinding, preventing errors.
    1823Because an aliased name is a constant, it cannot appear in a mutable context, \eg \mbox{$\pi$ \lstinline{= 42}} is meaningless, and a constant has no address, \ie it is an \newterm{rvalue}\footnote{
    1924The term rvalue defines an expression that can only appear on the right-hand side of an assignment expression.}.
     
    3944for ( cursor in Mon, Wed, Fri, Sun } ...                $\C{// every second day of week}\CRT$
    4045\end{cfa}
    41 A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Friday and Friday is after.
     46A set can have a partial or total ordering, making it possible to compare set elements, \eg Monday is before Tuesday and Tuesday is after.
    4247Ordering allows iterating among the enumeration set using relational operators and advancement, \eg:
    4348\begin{cfa}
    44 for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ...
     49for ( cursor = Monday; cursor @<=@ Friday; cursor = @succ@( cursor ) ) ... // weekdays
    4550\end{cfa}
    4651Here the values for the set names are logically \emph{generated} rather than listing a subset of names.
     
    5055\item
    5156\begin{sloppypar}
    52 It provides a finite set of new constants, which are implicitly or explicitly assigned values that must be appropriate for any set operations.
     57It provides a finite set of new constants, which are implicitly or explicitly assigned values that must be appropriate for any set operations, \eg increasing order.
    5358This aspect differentiates an enumeration from general types with an infinite set of constants.
    5459\end{sloppypar}
     
    106111\label{s:Aliasing}
    107112
    108 Some languages provide simple aliasing (renaming), \eg:
     113Some languages provide simple aliasing (renaming).
    109114\begin{cfa}
    110115const Size = 20, Pi = 3.14159, Name = "Jane";
     
    113118It is possible to compare aliases, if the constants allow it, \eg @Size < Pi@, whereas @Pi < Name@ might be disallowed depending on the language.
    114119
    115 Aliasing is not macro substitution, \eg @#define Size 20@, where a name is replaced by its value \emph{before} compilation, so the name is invisible to the programming language.
     120Aliasing is \emph{not} macro substitution, \eg @#define Size 20@, where a name is replaced by its value \emph{before} compilation, so the name is invisible to the programming language.
    116121With aliasing, each new name is part of the language, and hence, participates fully, such as name overloading in the type system.
    117 Aliasing is not an immutable variable, \eg:
     122Aliasing is not an immutable variable.
    118123\begin{cfa}
    119124extern @const@ int Size = 20;
     
    123128Taking the address of an immutable variable makes it an \newterm{lvalue}, which implies it has storage.
    124129With separate compilation, it is necessary to choose one translation unit to perform the initialization.
    125 If aliasing does require storage, its address and initialization are opaque (compiler only), similar to \CC rvalue reference @&&@.
     130If aliasing requires storage, its address and initialization are opaque (compiler only), similar to \CC rvalue reference @&&@.
    126131
    127132Aliasing does provide readability and automatic resubstitution.
     
    151156baz = C S{ i = 7, d = 7.5 }
    152157\end{haskell}
    153 the ADT has three variants (constructors), @A@, @B@, @C@ with associated types @Int@, @Double@, and @S@.
     158the ADT has three variants (constructors), @A@, @B@, @C@, with associated types @Int@, @Double@, and @S@.
    154159The constructors create an initialized value of the specific type that is bound to the immutable variables @foo@, @bar@, and @baz@.
    155160Hence, the ADT @Foo@ is like a union containing values of the associated types, and a constructor name is used to intialize and access the value using dynamic pattern-matching.
    156161\begin{cquote}
    157 \setlength{\tabcolsep}{15pt}
     162\setlength{\tabcolsep}{20pt}
    158163\begin{tabular}{@{}ll@{}}
    159164\begin{haskell}
    160165prtfoo val = -- function
    161     -- pattern match on constructor
    162     case val of
    163       @A@ a -> print a
    164       @B@ b -> print b
    165       @C@ (S i d) -> do
    166           print i
    167           print d
     166        -- pattern match on constructor
     167        case val of
     168          @A@ a -> print a
     169          @B@ b -> print b
     170          @C@ (S i d) -> do
     171                print i
     172                print d
    168173\end{haskell}
    169174&
    170175\begin{haskell}
    171176main = do
    172     prtfoo foo
    173     prtfoo bar
    174     prtfoo baz
     177        prtfoo foo
     178        prtfoo bar
     179        prtfoo baz
    1751803
    1761813.5
     
    193198Note, the term \newterm{variant} is often associated with ADTs.
    194199However, there are multiple languages with a @variant@ type that is not an ADT \see{Algol68~\cite{Algol68} or \CC \lstinline{variant}}.
    195 In these languages, the variant is often a union using RTTI tags for discrimination, which cannot be used to simulate an enumeration.
     200Here, the type (and possibly the position for equivalent types) is used to discriminant the specific \emph{variant} within the variant instance.
     201For example, \VRef[Figure]{f:C++variant} shows the \CC equivalent of the two Haskell ADT types using variant types.
     202In these languages, the variant cannot be used to simulate an enumeration.
    196203Hence, in this work the term variant is not a synonym for ADT.
     204
     205\begin{figure}
     206\begin{c++}
     207struct S { char s[32]; };
     208variant< int, double, S > vd;
     209variant< int, int, int > vs;
     210
     211// discrimination based on type
     212vd = 3;
     213if ( holds_alternative<int>(vd) ) cout << "int " << get<int>(vd ) << endl;
     214vd = 3.5;
     215if ( holds_alternative<double>(vd) ) cout << "double " << get<double>(vd) << endl;
     216vd = (S){ "abc" };
     217if ( holds_alternative<S>(vd) ) cout << "S.s " << get<S>(vd).s << endl;
     218
     219// discrimination based on type and position within type
     220vs = (variant<int,int,int>){ in_place_index<0>, 12 };
     221if ( vs.index() == 0 ) cout << "posn 0 " << get<0>(vs) << endl;
     222vs = (variant<int,int,int>){ in_place_index<1>, 4 };
     223if ( vs.index() == 1 ) cout << "posn 1 " << get<1>(vs) << endl;
     224vs = (variant<int,int,int>){ in_place_index<2>, 5 };
     225if ( vs.index() == 2 ) cout << "posn 2 " << get<2>(vs) << endl;
     226\end{c++}
     227\caption{\CC \lstinline[language=C++]{variant} Discrimination Using RTTI/Position}
     228\label{f:C++variant}
     229\end{figure}
    197230
    198231% https://downloads.haskell.org/ghc/latest/docs/libraries/base-4.19.1.0-179c/GHC-Enum.html
     
    200233
    201234The association between ADT and enumeration occurs if all the constructors have a unit (empty) type, \eg @struct unit {}@.
    202 Note, the unit type is not the same as \lstinline{void}, \eg:
     235Note, the unit type is not the same as \lstinline{void}.
    203236\begin{cfa}
    204237void foo( void );
    205238struct unit {} u;       $\C[1.5in]{// empty type}$
    206239unit bar( unit );
    207 foo( foo() );           $\C{// void argument does not match with void parameter}$
     240foo( @foo()@ );         $\C{// void argument does not match with void parameter}$
    208241bar( bar( u ) );        $\C{// unit argument does match with unit parameter}\CRT$
    209242\end{cfa}
     
    214247\end{haskell}
    215248the default type for each constructor is the unit type, and deriving from @Enum@ enforces no other associated types, @Eq@ allows equality comparison, and @Show@ is for printing.
    216 The nullary constructors for the unit types are numbered left-to-right from $0$ to @maxBound@$- 1$, and provides enumerating operations @succ@, @pred@, @enumFrom@ @enumFromTo@.
     249The nullary constructors for the unit types are numbered left-to-right from $0$ to @maxBound@$- 1$, and provides enumerating operations @succ@, @pred@, @enumFrom@, @enumFromTo@.
    217250\VRef[Figure]{f:HaskellEnumeration} shows enumeration comparison and iterating (enumerating).
    218251
    219252\begin{figure}
    220253\begin{cquote}
    221 \setlength{\tabcolsep}{15pt}
     254\setlength{\tabcolsep}{40pt}
    222255\begin{tabular}{@{}ll@{}}
    223256\begin{haskell}
    224257day = Tue
    225258main = do
    226     if day == Tue then
    227         print day
    228     else
    229         putStr "not Tue"
    230     print (enumFrom Mon)            -- week
    231     print (enumFromTo Mon Fri)   -- weekday
    232     print (enumFromTo Sat Sun)  -- weekend
     259        if day == Tue then
     260                print day
     261        else
     262                putStr "not Tue"
     263        print (enumFrom Mon)            $\C[2.25in]{-- week}$
     264        print (enumFromTo Mon Fri)      $\C{-- weekday}$
     265        print (enumFromTo Sat Sun)      $\C{-- weekend}\CRT$
    233266\end{haskell}
    234267&
     
    251284
    252285The key observation is the dichotomy between an ADT and enumeration: the ADT uses the associated type resulting in a union-like data structure, and the enumeration does not use the associated type, and hence, is not a union.
    253 While an enumeration is constructed using the ADT mechanism, it is so restricted it is not an ADT.
     286In contrast, an enumeration may be constructed using the ADT mechanism, but it is so restricted it is not an ADT.
    254287Furthermore, a general ADT cannot be an enumeration because the constructors generate different values making enumerating meaningless.
    255288While functional programming languages regularly repurpose the ADT type into an enumeration type, this process seems contrived and confusing.
     
    266299\begin{enumerate}
    267300\item
    268 overloading
     301overloading:
    269302\item
    270303scoping
  • doc/theses/jiada_liang_MMath/uw-ethesis-frontpgs.tex

    r8789ae4 r433e2c3  
    131131\begin{center}\textbf{Abstract}\end{center}
    132132
    133 % An enumeration is a type defining an ordered set of named constant values, where a name abstracts a value, \eg @PI@ versus @3.145159@.
    134 % C restrict an enumeration type to the integral type @signed int@, which \CC support, meaning enumeration names bind to integer constants.
    135 % \CFA extends C enumerations to allow all basic and custom types for the enumeration type, like other modern programming languages.
    136 % Furthermore, \CFA adds other useful features for enumerations to support better software-engineering practices and simplify program development.
    137 The \CFA (C-for-all) programming language is an evolutionary refinement of C programing language. One of its distinctive feature is the generic
    138 types. But legacy data type from C, such as enumerations, does not adapt well into the \CFA generic type system.
    139 
    140 This thesis presents an adaptation of enumerated types, in a way that integrates naturallly with the generic type feature of \CFA while being
    141 backward-compatiable to C. This thesis also presents a number of smaller refinement to the \CFA overload resolution rules for enumerated types,
    142 each of which improves the intuitive nature of enumerations.
    143 The enumeration types improvement has been implemented into \CFA compiler and run-time environment. The root ideas behinds the change of
    144 enumeration is to discover the approach of C data types working with \CFA generic types in order to improve the language expressiveity and safety.
     133An \emph{enumeration} is a type defining a (ordered) set of named constant values.
     134\begin{cfa}
     135enum Week { Mon, Tue, Wed, Thu, Fri, Sat, Sun };
     136enum Math { PI = 3.14159, Tau = 6.28318, Phi = 1.61803 };
     137enum RGB { Red = 100b, Green = 010b, Blue = 001b };
     138\end{cfa}
     139Its purpose is for readability: replacing constant values in a program with symbolic names that are more meaningful to programmers in the context of the application.
     140Thereafter, associating a name to a different value automatically distributes this rebinding, preventing errors.
     141One of the key properties of an enumeration is the ability to enumerate (iterate) through the constants, and hence, access their values, if present.
     142C restricts an enumeration to the integral type @signed int@, while \CC extends enumerations to all integral types, meaning enumeration names must bind to integer constants.
     143Other modern programming languages provide bindings to any type and additional features to extend enumeration capabilities for better software-engineering practices.
     144
     145The \CFA (C-for-all) programming language is an evolutionary refinement of the C programing language.
     146One of its distinctive feature is a parametric-polymorphic generic type.
     147However, legacy data types from C, such as enumerations, do not adapt well into the \CFA generic type-system.
     148
     149This thesis extends the simple and unsafe enumeration type in the C programming language into a complex and safe enumeration type in the \CFA programming-language, while maintaining backwards compatibility with C.
     150The major contribution is an adaptation of enumerated types with the \CFA type-system in a way that integrates naturally with the generic types.
     151This thesis also presents a number of smaller refinement to the \CFA overload resolution rules for enumerated types, each of which improves the intuitive nature of enumeration name resolution by the compiler.
     152Finally, this work adds other useful features to enumerations that better support software-engineering practices and simplify program development.
     153
    145154\cleardoublepage
    146155\phantomsection    % allows hyperref to link to the correct page
     
    151160\begin{center}\textbf{Acknowledgements}\end{center}
    152161
    153 To begin, I would like to thank my supervisor Proferssor Peter Buhr. Thank you for your guidance and
    154 support throughout my study and research. I would not be here without you.
    155 
    156 Thanks Gregor Richards and Yzihou Zhang for reading my thesis.
     162To begin, I would like to thank my supervisor Professor Peter Buhr.
     163Thank you for your guidance and support throughout my study and research.
     164I would not be here without you.
     165
     166Thanks to Gregor Richards and Yzihou Zhang for reading my thesis.
    157167
    158168Special thanks to Andrew James Beach for your insight on the theory development on the thesis.
    159169
    160170Thanks to Michael Brooks, Fangran Yu, Colby Parsons, Thierry Delisle, Mubeen Zulifiqar,
    161  and entire Cforall team for development of the \CFA language, making it the best language it can be.
     171and the entire \CFA team for development of the \CFA language, making it the best language it can be.
    162172
    163173Finally, a special thank you to Huawei Canada for funding this work.
Note: See TracChangeset for help on using the changeset viewer.