Context Navigation

-                      rf5bf3c2
+                      r399074a
 A programming language and its compiler present ways to declare types that ultimately map into the ones provided by the underlying hardware.
 These language types are thrust upon programmers with their syntactic/semantic rules and restrictions.
 These rules are then used to transform a language expression to a hardware expression.
+These rules are then used to transform a language expression to a unique hardware expression.
 Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
 Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
 …
 \newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
 This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique.
 This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user.
+This approach defeats the purpose of overloading and places an additional coding burden on both the developer and user.
 As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
 While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
 Similarly, lexical nesting is another place where overloading occurs.
 For example, in object-oriented programming, class memeber names \newterm{shadow} names within members.
+For example, in object-oriented programming, class member names \newterm{shadow} names within members.
 Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
 …
 In the following C example:
 \begin{cfa}
+struct S {};
+struct S S;
+struct S { int S; };
+struct S s;
+void S( struct S S ) { S.S = 1; };
 enum E { E };
+\end{cfa}
+the names @S@ and @E@ exist is the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
+enum E e = E;
+\end{cfa}
+the overloaded names @S@ and @E@ are separated into the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
+In general, types are not overloaded because inferencing them is difficult to imagine in a statically programming language.
+\begin{cquote}
+\setlength{\tabcolsep}{26pt}
+\begin{tabular}{@{}ll@{}}
+\begin{cfa}
+struct S { int i; }; // 1
+struct S { int i, j }; // 2
+union S { int i; double d; }; // 3
+typedef int S; // 4
+\end{cfa}
+&
+\begin{cfa}
+S s;
+if ( ... ) s = (S){ 3 }; // 1
+else s = 3; // 4
+\end{cfa}
+\end{tabular}
+\end{cquote}
 On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type.
 \begin{cfa}
 @double@ foo( @int@ );    @int@ foo( @void@ );    @int@ foo( @double, double@ );
+@double@ foo( @int@ );    @char@ foo( @void@ );    @int@ foo( @double, double@ );
 @double@ foo;    @char@ foo;    @int@ foo;
 \end{cfa}
+Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
+Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
+Here, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
+To disambiguate the type of @foo@ requires additional information from the call-site or a cast.
+\begin{cfa}
+double d = foo( 3 );    char c = foo();    int i = foo( 3.5, 4.1 );  $\C{// exact match}$
+d = foo;    c = foo;    i = foo;  $\C{// exact match}$
+i = ((double (*)( int ))foo)( 3.5 ); $\C{// no exact match, explicit cast => conversions}$
+\end{cfa}
+Hence, depending on your perspective, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
 \item
 …
 For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing.
 This matching is just as robust as other polymorphic analysis.
 The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
 Note, conversion issues apply to non-overloaded and overloaded functions.
 Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
+The ad-hoc aspect might be the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
+Note, conversion issues apply to both non-overloaded and overloaded functions.
+Here, the selection of the conversion functions can be considered the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
 The difference in opinion results when the language conversion rules differ from a programmer's expectations.
 However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types.
 \CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects.
 That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns.
 Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right.
 \item
 Why are there two forms of \emph{overloading} (regular and type class) in different programming languages?
+That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that maximize safety.
+Hence, removing implicit conversions from \CFA is not an option (as is true in most other programming languages), so it must do the best possible job to get it right.
+\item
+Why are there two forms of \emph{overloading} (general and parametric) in different programming languages?
 \noindent
+\newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading).
+\newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading).
+In functional programming-languages, there is always a return type (except for a monad).
 If a return type is specified, the compiler does not have to inference the routine body.
 For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
 …
 Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context.
 If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions).
+As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments.
+\newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing.
+As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with possibly converted arguments.
+\noindent
+\newterm{Parametric overloading} occurs when the type-system does not know a function's parameter and return types.
+Types are inferred from constant types and/or constraint information that includes typing.
+Parametric overloading starts with a universal/generic type used to define parameters and returns.
+\begin{cfa}
+forall( T ) T identity( T t ) { return t; }
+int i = identify( 3 );
+double d;
+double e = identity( d );
+\end{cfa}
+Here, the type system is substituting the argument type for @T@, and ensuring the return type also matches.
+This mechanism can be extended to overloaded parametric functions with multiple type parameters.
+\begin{cfa}
+forall( T ) [T, T] identity( T t1, T t2 ) { return [t1, t2 ]; } // return tuple
+int j;
+[i, j] = identity( 3, 4 );
+forall( T, U ) [T, U] identity( T t1, U t2 ) { return [t1, t2 ]; }
+[i, d] = identity( i, d );
+\end{cfa}
+Here, the type system is using techniques from general overloading, number and argument types, to disambiguate the overloaded parametric functions.
+However, this exercise is moot because the types are so universal they have no practical operations within the function, modulo any general operations carried by every type, like copy or print.
+This kind of polymorphism is restricted to moving type instances as abstract entities;
+if type erasure is used, there is no way to recover the original type to perform useful actions on its value(s).
+Hence, parametric overloading requires additional information about the universal types to make them useful.
+This additional information often comes as a set of operations a type must supply (@trait@/-@concept@) and these operations can then be used in the body of the function.
+\begin{cfa}
+forall( T | T ?@++@( T, T ) ) T inc( T t ) { return t@++@; }
+int i = 3
+i = inc( i )
+double d = inc( 3.5 );
+\end{cfa}
+Given a qualifying trait, are its elements inferred or declared?
+In the above example, the type system infers @int@ for @T@, infers it needs a @++@ operator that takes an @int@ and returns an @int@, and finds this function in the enclosing environment (\eg standard prelude).
+This implicit inferencing is expensive if matched with implicit conversions when there is no exact match.
+Alternatively, types opt-in to traits via declarations.
+\begin{cfa}
+trait postinc { T ?@++@( T, T ) }
+type int provide postinc;
+\end{cfa}
+In this approach, after inferring @int@ for @T@, @int@ is examined for any necessary traits required in the function, so there is no searching.
+Once all traits are satisfied, the compiler can inline them, pass functions pointers directly, or pass a virtual table composing the trait.
+\end{enumerate}
+The following sections illustrate how overloading is provided in \CFA.
+\begin{comment}
 \begin{lstlisting}[language=Haskell]
 f( int, int );  f( int, float ); -- return types to be determined
 …
 \end{cfa}
 In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one.
 Hence, the programmer does specify any additional information for the overloading to work.
 Explicit overloading occurs when the compiler has to be told what operations are associated with a type  programmer separately defines the associate type and subsequently associates the type with overloaded name.
 \end{enumerate}
+\end{comment}
 \begin{comment}
 …
 \subsection{Operator Overloading}
 Virtually all programming languages overload the arithmetic operators across the basic computational types using the number and type of parameters and returns.
 However, in many programming languages, arithmetic operators are not first class, and hence, they cannot be overloaded by programmers.
 Like \CC, \CFA maps operators to named functions allowing them to be overloaded with user-defined types.
 The syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
+Virtually all programming languages provide general overloading of the arithmetic operators across the basic computational types using the number and type of parameters and returns.
+However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded.
+Like \CC, \CFA does allow general operator overloading for user-defined types.
+The \CFA syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
 Here, a user-defined type is extended with an addition operation with the same syntax as a builtin type.
 \begin{cfa}
 …
 Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type.
 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
 This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
+This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
 Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast.
 …
 \subsection{Function Overloading}
 Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns.
+Both \CFA and \CC allow general overloading for functions, as long as their prototypes differ in the number and type of parameters and returns.
 \begin{cfa}
 void f( void );                 $\C[2in]{// (1): no parameter}$
 …
 f( 'A' );                               $\C{// select (2)}\CRT$
 \end{cfa}
+In this case, the name @f@ is overloaded depending on the number and parameter types.
+The type system examines each call size and selects the best match based on the number and types of the arguments.
+Here, there is a perfect match for the call, @f( 'A' )@ with the number and parameter type of function (2).
+The type system examines each call size and first looks for an exact match and then a best match using conversions.
 Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.
+Essentailly, the return types are \emph{reversed curried} into output parameters of the function.
 For example, in many programming languages with overloading, the following functions are ambiguous without using the return type.
 \begin{cfa}
 …
 \end{cfa}
 Again, there is an exact match for each call.
 As for operator overloading, if there is no exact match, a set of minimal, an implicit conversion can be added to find a best match.
+As for operator overloading, if there is no exact match, a set of minimal implicit conversions can be added to find a best match.
 \begin{cfa}
 short int = random();   $\C[2in]{// select (1), unsafe}$
 …
 \end{cfa}
 It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems.
 However, variable overloading within a scope is considered extremely dangerous, without any evidence to corroborate this claim.
+However, variable overloading within a scope is often considered dangerous, without any evidence to corroborate this claim.
 In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems.
 …
 \end{cfa}
 Hence, the name @MAX@ can replace all the C type-specific names, \eg @INT_MAX@, @LONG_MAX@, @DBL_MAX@, \etc.
+The result is a significant reduction in names to access typed constants.
+As an aside, C has a separate namespace for types and variables allowing overloading between the namespaces, using @struct@ (qualification) to disambiguate.
+\begin{cfa}
+void S() {
+        struct @S@ { int S; };
+        @struct S@ S;
+        void S( @struct S@ S ) { S.S = 1; };
+}
+\end{cfa}
+Here the name @S@ is an aggregate type and field, and a variable and parameter of type @S@.
+The result is a significant reduction in names to access common constants.
 …
-\section{Overload Resolution Strategies}
-For languages with user-defined overloading,
-Given an overloaded constant, variable, or (generic) function, there must exist strategies for differentiating among them and selecting the most appropriate one in a given context.
-The criteria commonly used to match operator/function/method names with definitions are: number of parameters, parameter types, parameter order or name, return type, implicit argument type conversions (safe/unsafe), generic, where some features are missing in certain programming languages.
-\VRef[Table]{t:OverloadingFeatures} shows a subset of popular programming languages with overloading and the discriminating features used to disambiguate among overloadings.
-Language C, Go and Rust have no overloading beyond basic types and operators.
-\begin{table}
-\caption{Overload Discriminating Features in Programming Languages}
-\label{t:OverloadingFeatures}
-\centering
-% https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
-% https://dl.acm.org/doi/10.1145/75277.75283
-\begin{minipage}{\linewidth}
-\setlength{\tabcolsep}{5pt}
-\begin{tabular}{@{}r|cccccccc@{}}
-Feature\,{\textbackslash}\,Language     & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
-\hline
-Operator/Function/Method name   & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M & X     & X     \\
-generic name                                    & no    & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes   & X     & X \\
-parameter number                                & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
-parameter types                                 & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
-parameter name                                  & no    & no    & no    & no    & yes   & yes   & X     & X     \\
-return type                                             & yes   & no    & yes   & no    & no    & yes   & X     & X     \\
-Safe/Unsafe argument conversion & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U
-        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S
-        & no\footnote{literals only, Int -> Double (Safe)}      & X     & X
-\end{tabular}
-\end{minipage}
-\end{table}
 \section{Type Inferencing}
 \label{s:IntoTypeInferencing}
 …
+\section{Language Comparison}
+Because \CFA's type system is focused on overloading and overload resolution, \VRef[Table]{t:OverloadingFeatures} compares \CFA with a representive set of popular programming languages and their take on overloading.
+The first row classifies whether there is general overloading, and what enities may be overloaded.
+\begin{cfa}
+int foo( int );                                 int foo;
+double foo( int, int );                 double foo;
+\end{cfa}
+The second row classifies the specialzation mechanisms used distinished among the general overloads.
+The third row classifies if generic functions can be overloaded based on the number and differing type variables.
+\begin{cfa}
+forall( T ) T foo( T t );
+forall( T ) T foo( T t, T s );
+forall( T, U ) T foo( T t, U s );
+\end{cfa}
+The fourth row classifies the mechnaism used to specialize  provide  if generic functions can be overloaded based on the number and differing type variables.
+The fifth row classifies if conversions are attempted beyond exact match.
+\begin{table}
+\caption{Overload Discriminating Features in Programming Languages}
+\label{t:OverloadingFeatures}
+\centering
+\begin{minipage}{\linewidth}
+\setlength{\tabcolsep}{5pt}
+\begin{tabular}{@{}r|cccccccc@{}}
+Feature\,{\textbackslash}\,Language     & Ada   & \CC           & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
+\hline
+general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member}
+                                                & O\footnote{except assignment}/F       & O/F/M & V/O/F & M\footnote{not universal}     & O/M   & O/F/M & no    & no    \\
+general constraints\footnote{\# $\Rightarrow$ number, T $\Rightarrow$ type, N $\Rightarrow$ name, R $\Rightarrow$ return type}
+                                                & \#/T/R        & \#/T  & \#/T/R        & \#/T  & \#/T/N/R      & \#/T/N/R      & \#/T/N        & T/R \\
+type parameters                 & no            & yes   & yes           & yes   & yes   & yes   & yes   & yes \\
+type constraint\footnote{T $\Rightarrow$ trait/concept, B $\Rightarrow$ type bounds, TC $\Rightarrow$ type class}
+                                                & no            & T             & T                     & T             & B             & T     & T     & TC \\
+%parameter number               & yes   & yes           & yes   & yes   & yes   & yes   & maybe & no    \\
+%parameter types                & yes   & yes           & yes   & yes   & yes   & yes   & yes   & yes   \\
+%parameter name                 & no    & no            & no    & no    & yes   & yes   & no    & no    \\
+%return type                    & yes   & no            & yes   & no    & no    & yes   & no    & yes   \\
+Safe/Unsafe arg. conv.  & no    & S/U\footnote{no conversions allowed during template parameter deduction}      & S/U
+        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive (builtin) type}  & S
+        & no\footnote{literals only, Int $\rightarrow$ Double (Safe)}   & no    & no
+\end{tabular}
+\end{minipage}
+% https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
+% https://dl.acm.org/doi/10.1145/75277.75283
+\end{table}
 \section{Contributions}

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 399074a

Legend:

TabularUnified doc/theses/fangren_yu_MMath/intro.tex ¶

Download in other formats: