Ignore:
Timestamp:
Mar 25, 2025, 12:24:05 PM (4 days ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
d73e667
Parents:
f5bf3c2
Message:

update overload comparison table

File:
1 edited

Legend:

Unmodified
Added
Removed
  • TabularUnified doc/theses/fangren_yu_MMath/intro.tex

    rf5bf3c2 r399074a  
    2323A programming language and its compiler present ways to declare types that ultimately map into the ones provided by the underlying hardware.
    2424These language types are thrust upon programmers with their syntactic/semantic rules and restrictions.
    25 These rules are then used to transform a language expression to a hardware expression.
     25These rules are then used to transform a language expression to a unique hardware expression.
    2626Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
    2727Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
     
    4040\newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
    4141This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique.
    42 This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user.
     42This approach defeats the purpose of overloading and places an additional coding burden on both the developer and user.
    4343As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
    4444While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
    4545Similarly, lexical nesting is another place where overloading occurs.
    46 For example, in object-oriented programming, class memeber names \newterm{shadow} names within members.
     46For example, in object-oriented programming, class member names \newterm{shadow} names within members.
    4747Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
    4848Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
     
    115115In the following C example:
    116116\begin{cfa}
    117 struct S {};
    118 struct S S;
     117struct S { int S; };
     118struct S s;
     119void S( struct S S ) { S.S = 1; };
    119120enum E { E };
    120 \end{cfa}
    121 the names @S@ and @E@ exist is the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
    122 
     121enum E e = E;
     122\end{cfa}
     123the overloaded names @S@ and @E@ are separated into the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
     124In general, types are not overloaded because inferencing them is difficult to imagine in a statically programming language.
     125\begin{cquote}
     126\setlength{\tabcolsep}{26pt}
     127\begin{tabular}{@{}ll@{}}
     128\begin{cfa}
     129struct S { int i; }; // 1
     130struct S { int i, j }; // 2
     131union S { int i; double d; }; // 3
     132typedef int S; // 4
     133\end{cfa}
     134&
     135\begin{cfa}
     136S s;
     137if ( ... ) s = (S){ 3 }; // 1
     138else s = 3; // 4
     139
     140\end{cfa}
     141\end{tabular}
     142\end{cquote}
    123143On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type.
    124144\begin{cfa}
    125 @double@ foo( @int@ );    @int@ foo( @void@ );    @int@ foo( @double, double@ );
     145@double@ foo( @int@ );    @char@ foo( @void@ );    @int@ foo( @double, double@ );
    126146@double@ foo;    @char@ foo;    @int@ foo;
    127147\end{cfa}
    128 Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
    129 Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
     148Here, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
     149To disambiguate the type of @foo@ requires additional information from the call-site or a cast.
     150\begin{cfa}
     151double d = foo( 3 );    char c = foo();    int i = foo( 3.5, 4.1 );  $\C{// exact match}$
     152d = foo;    c = foo;    i = foo;  $\C{// exact match}$
     153i = ((double (*)( int ))foo)( 3.5 ); $\C{// no exact match, explicit cast => conversions}$
     154\end{cfa}
     155Hence, depending on your perspective, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
    130156
    131157\item
     
    135161For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing.
    136162This matching is just as robust as other polymorphic analysis.
    137 The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
    138 Note, conversion issues apply to non-overloaded and overloaded functions.
    139 Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
     163The ad-hoc aspect might be the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
     164Note, conversion issues apply to both non-overloaded and overloaded functions.
     165Here, the selection of the conversion functions can be considered the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
    140166The difference in opinion results when the language conversion rules differ from a programmer's expectations.
    141167However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types.
    142168\CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects.
    143 That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns.
    144 Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right.
    145 
    146 \item
    147 Why are there two forms of \emph{overloading} (regular and type class) in different programming languages?
     169That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that maximize safety.
     170Hence, removing implicit conversions from \CFA is not an option (as is true in most other programming languages), so it must do the best possible job to get it right.
     171
     172\item
     173Why are there two forms of \emph{overloading} (general and parametric) in different programming languages?
    148174
    149175\noindent
    150 \newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading).
     176\newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading).
     177In functional programming-languages, there is always a return type (except for a monad).
    151178If a return type is specified, the compiler does not have to inference the routine body.
    152179For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
     
    154181Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context.
    155182If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions).
    156 As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments.
    157 
    158 \newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing.
     183As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with possibly converted arguments.
     184
     185\noindent
     186\newterm{Parametric overloading} occurs when the type-system does not know a function's parameter and return types.
     187Types are inferred from constant types and/or constraint information that includes typing.
     188Parametric overloading starts with a universal/generic type used to define parameters and returns.
     189\begin{cfa}
     190forall( T ) T identity( T t ) { return t; }
     191int i = identify( 3 );
     192double d;
     193double e = identity( d );
     194\end{cfa}
     195Here, the type system is substituting the argument type for @T@, and ensuring the return type also matches.
     196This mechanism can be extended to overloaded parametric functions with multiple type parameters.
     197\begin{cfa}
     198forall( T ) [T, T] identity( T t1, T t2 ) { return [t1, t2 ]; } // return tuple
     199int j;
     200[i, j] = identity( 3, 4 );
     201forall( T, U ) [T, U] identity( T t1, U t2 ) { return [t1, t2 ]; }
     202[i, d] = identity( i, d );
     203\end{cfa}
     204Here, the type system is using techniques from general overloading, number and argument types, to disambiguate the overloaded parametric functions.
     205However, this exercise is moot because the types are so universal they have no practical operations within the function, modulo any general operations carried by every type, like copy or print.
     206This kind of polymorphism is restricted to moving type instances as abstract entities;
     207if type erasure is used, there is no way to recover the original type to perform useful actions on its value(s).
     208Hence, parametric overloading requires additional information about the universal types to make them useful.
     209
     210This additional information often comes as a set of operations a type must supply (@trait@/-@concept@) and these operations can then be used in the body of the function.
     211\begin{cfa}
     212forall( T | T ?@++@( T, T ) ) T inc( T t ) { return t@++@; }
     213int i = 3
     214i = inc( i )
     215double d = inc( 3.5 );
     216\end{cfa}
     217Given a qualifying trait, are its elements inferred or declared?
     218In the above example, the type system infers @int@ for @T@, infers it needs a @++@ operator that takes an @int@ and returns an @int@, and finds this function in the enclosing environment (\eg standard prelude).
     219This implicit inferencing is expensive if matched with implicit conversions when there is no exact match.
     220Alternatively, types opt-in to traits via declarations.
     221\begin{cfa}
     222trait postinc { T ?@++@( T, T ) }
     223type int provide postinc;
     224\end{cfa}
     225In this approach, after inferring @int@ for @T@, @int@ is examined for any necessary traits required in the function, so there is no searching.
     226Once all traits are satisfied, the compiler can inline them, pass functions pointers directly, or pass a virtual table composing the trait.
     227\end{enumerate}
     228The following sections illustrate how overloading is provided in \CFA.
     229
     230\begin{comment}
    159231\begin{lstlisting}[language=Haskell]
    160232f( int, int );  f( int, float ); -- return types to be determined
     
    185257\end{cfa}
    186258
    187 
    188259In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one.
    189260Hence, the programmer does specify any additional information for the overloading to work.
    190261Explicit overloading occurs when the compiler has to be told what operations are associated with a type  programmer separately defines the associate type and subsequently associates the type with overloaded name.
    191 \end{enumerate}
     262\end{comment}
    192263
    193264\begin{comment}
     
    343414\subsection{Operator Overloading}
    344415
    345 Virtually all programming languages overload the arithmetic operators across the basic computational types using the number and type of parameters and returns.
    346 However, in many programming languages, arithmetic operators are not first class, and hence, they cannot be overloaded by programmers.
    347 Like \CC, \CFA maps operators to named functions allowing them to be overloaded with user-defined types.
    348 The syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
     416Virtually all programming languages provide general overloading of the arithmetic operators across the basic computational types using the number and type of parameters and returns.
     417However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded.
     418Like \CC, \CFA does allow general operator overloading for user-defined types.
     419The \CFA syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
    349420Here, a user-defined type is extended with an addition operation with the same syntax as a builtin type.
    350421\begin{cfa}
     
    359430Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type.
    360431Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
    361 This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
     432This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
    362433Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast.
    363434
     
    365436\subsection{Function Overloading}
    366437
    367 Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns.
     438Both \CFA and \CC allow general overloading for functions, as long as their prototypes differ in the number and type of parameters and returns.
    368439\begin{cfa}
    369440void f( void );                 $\C[2in]{// (1): no parameter}$
     
    372443f( 'A' );                               $\C{// select (2)}\CRT$
    373444\end{cfa}
    374 In this case, the name @f@ is overloaded depending on the number and parameter types.
    375 The type system examines each call size and selects the best match based on the number and types of the arguments.
    376 Here, there is a perfect match for the call, @f( 'A' )@ with the number and parameter type of function (2).
     445The type system examines each call size and first looks for an exact match and then a best match using conversions.
    377446
    378447Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.
     448Essentailly, the return types are \emph{reversed curried} into output parameters of the function.
    379449For example, in many programming languages with overloading, the following functions are ambiguous without using the return type.
    380450\begin{cfa}
     
    393463\end{cfa}
    394464Again, there is an exact match for each call.
    395 As for operator overloading, if there is no exact match, a set of minimal, an implicit conversion can be added to find a best match.
     465As for operator overloading, if there is no exact match, a set of minimal implicit conversions can be added to find a best match.
    396466\begin{cfa}
    397467short int = random();   $\C[2in]{// select (1), unsafe}$
     
    416486\end{cfa}
    417487It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems.
    418 However, variable overloading within a scope is considered extremely dangerous, without any evidence to corroborate this claim.
     488However, variable overloading within a scope is often considered dangerous, without any evidence to corroborate this claim.
    419489In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems.
    420490
     
    430500\end{cfa}
    431501Hence, the name @MAX@ can replace all the C type-specific names, \eg @INT_MAX@, @LONG_MAX@, @DBL_MAX@, \etc.
    432 The result is a significant reduction in names to access typed constants.
    433 
    434 As an aside, C has a separate namespace for types and variables allowing overloading between the namespaces, using @struct@ (qualification) to disambiguate.
    435 \begin{cfa}
    436 void S() {
    437         struct @S@ { int S; };
    438         @struct S@ S;
    439         void S( @struct S@ S ) { S.S = 1; };
    440 }
    441 \end{cfa}
    442 Here the name @S@ is an aggregate type and field, and a variable and parameter of type @S@.
     502The result is a significant reduction in names to access common constants.
    443503
    444504
     
    472532
    473533
    474 \section{Overload Resolution Strategies}
    475 
    476 For languages with user-defined overloading,
    477 Given an overloaded constant, variable, or (generic) function, there must exist strategies for differentiating among them and selecting the most appropriate one in a given context.
    478 The criteria commonly used to match operator/function/method names with definitions are: number of parameters, parameter types, parameter order or name, return type, implicit argument type conversions (safe/unsafe), generic, where some features are missing in certain programming languages.
    479 \VRef[Table]{t:OverloadingFeatures} shows a subset of popular programming languages with overloading and the discriminating features used to disambiguate among overloadings.
    480 Language C, Go and Rust have no overloading beyond basic types and operators.
    481 
    482 \begin{table}
    483 \caption{Overload Discriminating Features in Programming Languages}
    484 \label{t:OverloadingFeatures}
    485 \centering
    486 
    487 % https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
    488 % https://dl.acm.org/doi/10.1145/75277.75283
    489 
    490 \begin{minipage}{\linewidth}
    491 \setlength{\tabcolsep}{5pt}
    492 \begin{tabular}{@{}r|cccccccc@{}}
    493 Feature\,{\textbackslash}\,Language     & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
    494 \hline
    495 Operator/Function/Method name   & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M & X     & X     \\
    496 generic name                                    & no    & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes   & X     & X \\
    497 parameter number                                & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
    498 parameter types                                 & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
    499 parameter name                                  & no    & no    & no    & no    & yes   & yes   & X     & X     \\
    500 return type                                             & yes   & no    & yes   & no    & no    & yes   & X     & X     \\
    501 Safe/Unsafe argument conversion & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U
    502         & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S
    503         & no\footnote{literals only, Int -> Double (Safe)}      & X     & X
    504 \end{tabular}
    505 \end{minipage}
    506 \end{table}
    507 
    508 
    509534\section{Type Inferencing}
    510535\label{s:IntoTypeInferencing}
     
    861886
    862887
     888\section{Language Comparison}
     889
     890Because \CFA's type system is focused on overloading and overload resolution, \VRef[Table]{t:OverloadingFeatures} compares \CFA with a representive set of popular programming languages and their take on overloading.
     891The first row classifies whether there is general overloading, and what enities may be overloaded.
     892\begin{cfa}
     893int foo( int );                                 int foo;
     894double foo( int, int );                 double foo;
     895\end{cfa}
     896The second row classifies the specialzation mechanisms used distinished among the general overloads.
     897The third row classifies if generic functions can be overloaded based on the number and differing type variables.
     898\begin{cfa}
     899forall( T ) T foo( T t );
     900forall( T ) T foo( T t, T s );
     901forall( T, U ) T foo( T t, U s );
     902\end{cfa}
     903The fourth row classifies the mechnaism used to specialize  provide  if generic functions can be overloaded based on the number and differing type variables.
     904The fifth row classifies if conversions are attempted beyond exact match.
     905
     906\begin{table}
     907\caption{Overload Discriminating Features in Programming Languages}
     908\label{t:OverloadingFeatures}
     909\centering
     910\begin{minipage}{\linewidth}
     911\setlength{\tabcolsep}{5pt}
     912\begin{tabular}{@{}r|cccccccc@{}}
     913Feature\,{\textbackslash}\,Language     & Ada   & \CC           & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
     914\hline
     915general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member}
     916                                                & O\footnote{except assignment}/F       & O/F/M & V/O/F & M\footnote{not universal}     & O/M   & O/F/M & no    & no    \\
     917general constraints\footnote{\# $\Rightarrow$ number, T $\Rightarrow$ type, N $\Rightarrow$ name, R $\Rightarrow$ return type}
     918                                                & \#/T/R        & \#/T  & \#/T/R        & \#/T  & \#/T/N/R      & \#/T/N/R      & \#/T/N        & T/R \\
     919type parameters                 & no            & yes   & yes           & yes   & yes   & yes   & yes   & yes \\
     920type constraint\footnote{T $\Rightarrow$ trait/concept, B $\Rightarrow$ type bounds, TC $\Rightarrow$ type class}
     921                                                & no            & T             & T                     & T             & B             & T     & T     & TC \\
     922%parameter number               & yes   & yes           & yes   & yes   & yes   & yes   & maybe & no    \\
     923%parameter types                & yes   & yes           & yes   & yes   & yes   & yes   & yes   & yes   \\
     924%parameter name                 & no    & no            & no    & no    & yes   & yes   & no    & no    \\
     925%return type                    & yes   & no            & yes   & no    & no    & yes   & no    & yes   \\
     926Safe/Unsafe arg. conv.  & no    & S/U\footnote{no conversions allowed during template parameter deduction}      & S/U
     927        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive (builtin) type}  & S
     928        & no\footnote{literals only, Int $\rightarrow$ Double (Safe)}   & no    & no
     929\end{tabular}
     930\end{minipage}
     931% https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
     932% https://dl.acm.org/doi/10.1145/75277.75283
     933\end{table}
     934
     935
    863936\section{Contributions}
    864937
Note: See TracChangeset for help on using the changeset viewer.