Changes in / [d73e667:b5978ca]


Ignore:
File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/fangren_yu_MMath/intro.tex

    rd73e667 rb5978ca  
    2323A programming language and its compiler present ways to declare types that ultimately map into the ones provided by the underlying hardware.
    2424These language types are thrust upon programmers with their syntactic/semantic rules and restrictions.
    25 These rules are then used to transform a language expression to a unique hardware expression.
     25These rules are then used to transform a language expression to a hardware expression.
    2626Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
    2727Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
     
    4040\newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
    4141This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique.
    42 This approach defeats the purpose of overloading and places an additional coding burden on both the developer and user.
     42This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user.
    4343As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
    4444While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
    4545Similarly, lexical nesting is another place where overloading occurs.
    46 For example, in object-oriented programming, class member names \newterm{shadow} names within members.
     46For example, in object-oriented programming, class memeber names \newterm{shadow} names within members.
    4747Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
    4848Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
     
    115115In the following C example:
    116116\begin{cfa}
    117 struct S { int S; };
    118 struct S s;
    119 void S( struct S S ) { S.S = 1; };
     117struct S {};
     118struct S S;
    120119enum E { E };
    121 enum E e = E;
    122 \end{cfa}
    123 the overloaded names @S@ and @E@ are separated into the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
    124 In general, types are not overloaded because inferencing them is difficult to imagine in a statically programming language.
    125 \begin{cquote}
    126 \setlength{\tabcolsep}{26pt}
    127 \begin{tabular}{@{}ll@{}}
    128 \begin{cfa}
    129 struct S { int i; }; // 1
    130 struct S { int i, j }; // 2
    131 union S { int i; double d; }; // 3
    132 typedef int S; // 4
    133 \end{cfa}
    134 &
    135 \begin{cfa}
    136 S s;
    137 if ( ... ) s = (S){ 3 }; // 1
    138 else s = 3; // 4
    139 
    140 \end{cfa}
    141 \end{tabular}
    142 \end{cquote}
     120\end{cfa}
     121the names @S@ and @E@ exist is the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
     122
    143123On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type.
    144124\begin{cfa}
    145 @double@ foo( @int@ );    @char@ foo( @void@ );    @int@ foo( @double, double@ );
     125@double@ foo( @int@ );    @int@ foo( @void@ );    @int@ foo( @double, double@ );
    146126@double@ foo;    @char@ foo;    @int@ foo;
    147127\end{cfa}
    148 Here, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
    149 To disambiguate the type of @foo@ requires additional information from the call-site or a cast.
    150 \begin{cfa}
    151 double d = foo( 3 );    char c = foo();    int i = foo( 3.5, 4.1 );  $\C{// exact match}$
    152 d = foo;    c = foo;    i = foo;  $\C{// exact match}$
    153 i = ((double (*)( int ))foo)( 3.5 ); $\C{// no exact match, explicit cast => conversions}$
    154 \end{cfa}
    155 Hence, depending on your perspective, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
     128Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
     129Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
    156130
    157131\item
     
    161135For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing.
    162136This matching is just as robust as other polymorphic analysis.
    163 The ad-hoc aspect might be the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
    164 Note, conversion issues apply to both non-overloaded and overloaded functions.
    165 Here, the selection of the conversion functions can be considered the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
     137The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
     138Note, conversion issues apply to non-overloaded and overloaded functions.
     139Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
    166140The difference in opinion results when the language conversion rules differ from a programmer's expectations.
    167141However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types.
    168142\CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects.
    169 That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that maximize safety.
    170 Hence, removing implicit conversions from \CFA is not an option (as is true in most other programming languages), so it must do the best possible job to get it right.
    171 
    172 \item
    173 Why are there two forms of \emph{overloading} (general and parametric) in different programming languages?
     143That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns.
     144Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right.
     145
     146\item
     147Why are there two forms of \emph{overloading} (regular and type class) in different programming languages?
    174148
    175149\noindent
    176 \newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading).
    177 In functional programming-languages, there is always a return type (except for a monad).
     150\newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading).
    178151If a return type is specified, the compiler does not have to inference the routine body.
    179152For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
     
    181154Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context.
    182155If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions).
    183 As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with possibly converted arguments.
    184 
    185 \noindent
    186 \newterm{Parametric overloading} occurs when the type-system does not know a function's parameter and return types.
    187 Types are inferred from constant types and/or constraint information that includes typing.
    188 Parametric overloading starts with a universal/generic type used to define parameters and returns.
    189 \begin{cfa}
    190 forall( T ) T identity( T t ) { return t; }
    191 int i = identify( 3 );
    192 double d;
    193 double e = identity( d );
    194 \end{cfa}
    195 Here, the type system is substituting the argument type for @T@, and ensuring the return type also matches.
    196 This mechanism can be extended to overloaded parametric functions with multiple type parameters.
    197 \begin{cfa}
    198 forall( T ) [T, T] identity( T t1, T t2 ) { return [t1, t2 ]; } // return tuple
    199 int j;
    200 [i, j] = identity( 3, 4 );
    201 forall( T, U ) [T, U] identity( T t1, U t2 ) { return [t1, t2 ]; }
    202 [i, d] = identity( i, d );
    203 \end{cfa}
    204 Here, the type system is using techniques from general overloading, number and argument types, to disambiguate the overloaded parametric functions.
    205 However, this exercise is moot because the types are so universal they have no practical operations within the function, modulo any general operations carried by every type, like copy or print.
    206 This kind of polymorphism is restricted to moving type instances as abstract entities;
    207 if type erasure is used, there is no way to recover the original type to perform useful actions on its value(s).
    208 Hence, parametric overloading requires additional information about the universal types to make them useful.
    209 
    210 This additional information often comes as a set of operations a type must supply (@trait@/-@concept@) and these operations can then be used in the body of the function.
    211 \begin{cfa}
    212 forall( T | T ?@++@( T, T ) ) T inc( T t ) { return t@++@; }
    213 int i = 3
    214 i = inc( i )
    215 double d = inc( 3.5 );
    216 \end{cfa}
    217 Given a qualifying trait, are its elements inferred or declared?
    218 In the above example, the type system infers @int@ for @T@, infers it needs a @++@ operator that takes an @int@ and returns an @int@, and finds this function in the enclosing environment (\eg standard prelude).
    219 This implicit inferencing is expensive if matched with implicit conversions when there is no exact match.
    220 Alternatively, types opt-in to traits via declarations.
    221 \begin{cfa}
    222 trait postinc { T ?@++@( T, T ) }
    223 type int provide postinc;
    224 \end{cfa}
    225 In this approach, after inferring @int@ for @T@, @int@ is examined for any necessary traits required in the function, so there is no searching.
    226 Once all traits are satisfied, the compiler can inline them, pass functions pointers directly, or pass a virtual table composing the trait.
    227 \end{enumerate}
    228 The following sections illustrate how overloading is provided in \CFA.
    229 
    230 \begin{comment}
     156As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments.
     157
     158\newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing.
    231159\begin{lstlisting}[language=Haskell]
    232160f( int, int );  f( int, float ); -- return types to be determined
     
    257185\end{cfa}
    258186
     187
    259188In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one.
    260189Hence, the programmer does specify any additional information for the overloading to work.
    261190Explicit overloading occurs when the compiler has to be told what operations are associated with a type  programmer separately defines the associate type and subsequently associates the type with overloaded name.
    262 \end{comment}
     191\end{enumerate}
    263192
    264193\begin{comment}
     
    414343\subsection{Operator Overloading}
    415344
    416 Virtually all programming languages provide general overloading of the arithmetic operators across the basic computational types using the number and type of parameters and returns.
    417 However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded.
    418 Like \CC, \CFA does allow general operator overloading for user-defined types.
    419 The \CFA syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
     345Virtually all programming languages overload the arithmetic operators across the basic computational types using the number and type of parameters and returns.
     346However, in many programming languages, arithmetic operators are not first class, and hence, they cannot be overloaded by programmers.
     347Like \CC, \CFA maps operators to named functions allowing them to be overloaded with user-defined types.
     348The syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
    420349Here, a user-defined type is extended with an addition operation with the same syntax as a builtin type.
    421350\begin{cfa}
     
    430359Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type.
    431360Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
    432 This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
     361This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
    433362Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast.
    434363
     
    436365\subsection{Function Overloading}
    437366
    438 Both \CFA and \CC allow general overloading for functions, as long as their prototypes differ in the number and type of parameters and returns.
     367Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns.
    439368\begin{cfa}
    440369void f( void );                 $\C[2in]{// (1): no parameter}$
     
    443372f( 'A' );                               $\C{// select (2)}\CRT$
    444373\end{cfa}
    445 The type system examines each call size and first looks for an exact match and then a best match using conversions.
     374In this case, the name @f@ is overloaded depending on the number and parameter types.
     375The type system examines each call size and selects the best match based on the number and types of the arguments.
     376Here, there is a perfect match for the call, @f( 'A' )@ with the number and parameter type of function (2).
    446377
    447378Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.
    448 Essentailly, the return types are \emph{reversed curried} into output parameters of the function.
    449379For example, in many programming languages with overloading, the following functions are ambiguous without using the return type.
    450380\begin{cfa}
     
    463393\end{cfa}
    464394Again, there is an exact match for each call.
    465 As for operator overloading, if there is no exact match, a set of minimal implicit conversions can be added to find a best match.
     395As for operator overloading, if there is no exact match, a set of minimal, an implicit conversion can be added to find a best match.
    466396\begin{cfa}
    467397short int = random();   $\C[2in]{// select (1), unsafe}$
     
    486416\end{cfa}
    487417It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems.
    488 However, variable overloading within a scope is often considered dangerous, without any evidence to corroborate this claim.
     418However, variable overloading within a scope is considered extremely dangerous, without any evidence to corroborate this claim.
    489419In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems.
    490420
     
    500430\end{cfa}
    501431Hence, the name @MAX@ can replace all the C type-specific names, \eg @INT_MAX@, @LONG_MAX@, @DBL_MAX@, \etc.
    502 The result is a significant reduction in names to access common constants.
     432The result is a significant reduction in names to access typed constants.
     433
     434As an aside, C has a separate namespace for types and variables allowing overloading between the namespaces, using @struct@ (qualification) to disambiguate.
     435\begin{cfa}
     436void S() {
     437        struct @S@ { int S; };
     438        @struct S@ S;
     439        void S( @struct S@ S ) { S.S = 1; };
     440}
     441\end{cfa}
     442Here the name @S@ is an aggregate type and field, and a variable and parameter of type @S@.
    503443
    504444
     
    532472
    533473
     474\section{Overload Resolution Strategies}
     475
     476For languages with user-defined overloading,
     477Given an overloaded constant, variable, or (generic) function, there must exist strategies for differentiating among them and selecting the most appropriate one in a given context.
     478The criteria commonly used to match operator/function/method names with definitions are: number of parameters, parameter types, parameter order or name, return type, implicit argument type conversions (safe/unsafe), generic, where some features are missing in certain programming languages.
     479\VRef[Table]{t:OverloadingFeatures} shows a subset of popular programming languages with overloading and the discriminating features used to disambiguate among overloadings.
     480Language C, Go and Rust have no overloading beyond basic types and operators.
     481
     482\begin{table}
     483\caption{Overload Discriminating Features in Programming Languages}
     484\label{t:OverloadingFeatures}
     485\centering
     486
     487% https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
     488% https://dl.acm.org/doi/10.1145/75277.75283
     489
     490\begin{minipage}{\linewidth}
     491\setlength{\tabcolsep}{5pt}
     492\begin{tabular}{@{}r|cccccccc@{}}
     493Feature\,{\textbackslash}\,Language     & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
     494\hline
     495Operator/Function/Method name   & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M & X     & X     \\
     496generic name                                    & no    & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes   & X     & X \\
     497parameter number                                & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
     498parameter types                                 & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
     499parameter name                                  & no    & no    & no    & no    & yes   & yes   & X     & X     \\
     500return type                                             & yes   & no    & yes   & no    & no    & yes   & X     & X     \\
     501Safe/Unsafe argument conversion & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U
     502        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S
     503        & no\footnote{literals only, Int -> Double (Safe)}      & X     & X
     504\end{tabular}
     505\end{minipage}
     506\end{table}
     507
     508
    534509\section{Type Inferencing}
    535510\label{s:IntoTypeInferencing}
     
    886861
    887862
    888 \section{Language Comparison}
    889 
    890 Because \CFA's type system is focused on overloading and overload resolution, \VRef[Table]{t:OverloadingFeatures} compares \CFA with a representive set of popular programming languages and their take on overloading.
    891 The first row classifies whether there is general overloading, and what enities may be overloaded.
    892 \begin{cfa}
    893 int foo( int );                                 int foo;
    894 double foo( int, int );                 double foo;
    895 \end{cfa}
    896 The second row classifies the specialzation mechanisms used distinished among the general overloads.
    897 The third row classifies if generic functions can be overloaded based on the number and differing type variables.
    898 \begin{cfa}
    899 forall( T ) T foo( T t );
    900 forall( T ) T foo( T t, T s );
    901 forall( T, U ) T foo( T t, U s );
    902 \end{cfa}
    903 The fourth row classifies the mechnaism used to specialize  provide  if generic functions can be overloaded based on the number and differing type variables.
    904 The fifth row classifies if conversions are attempted beyond exact match.
    905 
    906 \begin{table}
    907 \caption{Overload Discriminating Features in Programming Languages}
    908 \label{t:OverloadingFeatures}
    909 \centering
    910 \begin{minipage}{\linewidth}
    911 \setlength{\tabcolsep}{5pt}
    912 \begin{tabular}{@{}r|cccccccc@{}}
    913 Feature\,{\textbackslash}\,Language     & Ada   & \CC           & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
    914 \hline
    915 general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member}
    916                                                 & O\footnote{except assignment}/F       & O/F/M & V/O/F & M\footnote{not universal}     & O/M   & O/F/M & no    & no    \\
    917 general constraints\footnote{\# $\Rightarrow$ number, T $\Rightarrow$ type, N $\Rightarrow$ name, R $\Rightarrow$ return type}
    918                                                 & \#/T/R        & \#/T  & \#/T/R        & \#/T  & \#/T/N/R      & \#/T/N/R      & \#/T/N        & T/R \\
    919 type parameters                 & no            & yes   & yes           & yes   & yes   & yes   & yes   & yes \\
    920 type constraint\footnote{T $\Rightarrow$ trait/concept, B $\Rightarrow$ type bounds, TC $\Rightarrow$ type class}
    921                                                 & no            & T             & T                     & T             & B             & T     & T     & TC \\
    922 %parameter number               & yes   & yes           & yes   & yes   & yes   & yes   & maybe & no    \\
    923 %parameter types                & yes   & yes           & yes   & yes   & yes   & yes   & yes   & yes   \\
    924 %parameter name                 & no    & no            & no    & no    & yes   & yes   & no    & no    \\
    925 %return type                    & yes   & no            & yes   & no    & no    & yes   & no    & yes   \\
    926 Safe/Unsafe arg. conv.  & no    & S/U\footnote{no conversions allowed during template parameter deduction}      & S/U
    927         & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive (builtin) type}  & S
    928         & no\footnote{literals only, Int $\rightarrow$ Double (Safe)}   & no    & no
    929 \end{tabular}
    930 \end{minipage}
    931 % https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
    932 % https://dl.acm.org/doi/10.1145/75277.75283
    933 \end{table}
    934 
    935 
    936863\section{Contributions}
    937864
Note: See TracChangeset for help on using the changeset viewer.