Changeset f5bf3c2


Ignore:
Timestamp:
Mar 20, 2025, 1:21:53 PM (31 hours ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Parents:
62b5940
Message:

more proofreading on introduction chapter

Location:
doc/theses/fangren_yu_MMath
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • TabularUnified doc/theses/fangren_yu_MMath/intro.tex

    r62b5940 rf5bf3c2  
    1414In certain cases, if the resolver fails to find an exact assertion match, it attempts to find a \emph{best} match using reasonable type conversions.
    1515Hence, \CFA follows the current trend of replacing nominal inheritance with traits composed of assertions for type matching.
    16 The over-arching goal is to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language.
    17 Together, the resulting \CFA type-system has a number of unique features making it different from other programming languages with expressive, static, type-systems.
     16The over-arching goal in \CFA is to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language.
     17Together, the resulting type-system has a number of unique features making it different from other programming languages with expressive, static, type-systems.
    1818
    1919
     
    2525These rules are then used to transform a language expression to a hardware expression.
    2626Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
    27 Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined once at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
     27Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
    2828Expressibility, generalization, and safety are all bound up in a language's type system, and hence, directly affect the capability, build time, and correctness of program development.
    2929
     
    3535\end{quote}
    3636Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
    37 Experience from \CC and \CFA developers is that the type system implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
    38 In many cases, a programmer has no idea there are name clashes, as they are silently resolved, simplifying the development process.
    39 Depending on the language, any ambiguous cases are resolved explicitly using some form of qualification and/or cast.
     37Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
     38In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process.
     39
     40\newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
     41This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique.
     42This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user.
     43As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
     44While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
     45Similarly, lexical nesting is another place where overloading occurs.
     46For example, in object-oriented programming, class memeber names \newterm{shadow} names within members.
     47Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
     48Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
     49Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing.
     50Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast.
     51
     52Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}:
     53\begin{quote}
     54In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments.
     55There may be several rules of limited extent which reduce the number of cases, but these are themselves ad hoc both in scope and content.
     56All the ordinary arithmetic operators and functions come into this category.
     57It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00}
     58\end{quote}
     59where a \newterm{transfer function} is an implicit conversion to help find a matching overload:
     60\begin{quote}
     61The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap.
     62Thus for example 3 may be an integer or a real and it may be necessary to change it from one type to the other.
     63The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00}
     64\end{quote}
     65The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument.
     66A similar differentiation is applicable for overloading and default parameters.
     67\begin{cquote}
     68\setlength{\tabcolsep}{10pt}
     69\begin{tabular}{@{}lll@{}}
     70\multicolumn{1}{c}{\textbf{different implementations}}  & \multicolumn{1}{c}{\textbf{same implementation}} \\
     71\begin{cfa}
     72void foo( int );
     73void foo( int, int );
     74\end{cfa}
     75&
     76\begin{cfa}
     77void foo( int, int = 5 ); // default value
     78
     79\end{cfa}
     80\end{tabular}
     81\end{cquote}
     82However, this distinguishing characteristic is vague.
     83For example, should the operation @abs@ be overloaded or polymorphic or both?
     84\begin{cquote}
     85\setlength{\tabcolsep}{10pt}
     86\begin{tabular}{@{}lll@{}}
     87\multicolumn{1}{c}{\textbf{overloading}}        & \multicolumn{1}{c}{\textbf{polymorphic}} \\
     88\begin{cfa}
     89int abs( int );
     90double abs( double );
     91\end{cfa}
     92&
     93\begin{cfa}
     94forall( T | { void ?{}( T &, zero_t ); int ?<?( T, T ); T -?( T ); } )
     95T abs( T );
     96\end{cfa}
     97\end{tabular}
     98\end{cquote}
     99Here, there are performance advantages for having specializations and code-reuse advantages for the generalization.
     100
     101The Strachey definitions raise several questions.
     102\begin{enumerate}[leftmargin=*]
     103\item
     104Is overloading polymorphism?
     105
     106\noindent
     107In type theory, polymorphism allows an overloaded type name to represent multiple different types.
     108For example, generic types overload the type name for a container type.
     109\begin{cfa}
     110@List@<int> li;    @List@<double> ld;    @List@<struct S> ls;
     111\end{cfa}
     112For subtyping, a derived type masquerades as a base type, where the base and derived names cannot be overloaded.
     113Instead, the mechanism relies on structural typing among the types.
     114In both cases, the polymorphic mechanisms apply in the type domain and the names are in the type namespace.
     115In the following C example:
     116\begin{cfa}
     117struct S {};
     118struct S S;
     119enum E { E };
     120\end{cfa}
     121the names @S@ and @E@ exist is the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
     122
     123On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type.
     124\begin{cfa}
     125@double@ foo( @int@ );    @int@ foo( @void@ );    @int@ foo( @double, double@ );
     126@double@ foo;    @char@ foo;    @int@ foo;
     127\end{cfa}
     128Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
     129Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
     130
     131\item
     132Does ad-hoc polymorphism have a single systematic way of determining the type of the result from the type of the arguments?
     133
     134\noindent
     135For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing.
     136This matching is just as robust as other polymorphic analysis.
     137The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
     138Note, conversion issues apply to non-overloaded and overloaded functions.
     139Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
     140The difference in opinion results when the language conversion rules differ from a programmer's expectations.
     141However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types.
     142\CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects.
     143That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns.
     144Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right.
     145
     146\item
     147Why are there two forms of \emph{overloading} (regular and type class) in different programming languages?
     148
     149\noindent
     150\newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading).
     151If a return type is specified, the compiler does not have to inference the routine body.
     152For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
     153In this context, there is a fixed set of overloads for a given name that are completely specified.
     154Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context.
     155If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions).
     156As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments.
     157
     158\newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing.
     159\begin{lstlisting}[language=Haskell]
     160f( int, int );  f( int, float ); -- return types to be determined
     161g( int, int );  g( float, int );
     162let x = curry f( 3, _ ); -- which f
     163let y = curry g( _ , 3 ); -- which g
     164\end{lstlisting}
     165For the currying to succeed, there cannot be overloaded function names resulting in ambiguities.
     166To allow currying to succeed requires an implicit disambiguating mechanism, \ie a kind of transfer function.
     167A type class~\cite{typeclass} is a mechanism to convert overloading into parametric polymorphism.
     168Parametric polymorphism has enough information to disambiguate the overloaded names because it removes the type inferencing.
     169\begin{cfa}
     170forall( T | T has + $and$ - ) T f$\(_1\)$( T );
     171forall( T | T has * $and$ - ) T f$\(_2\)$( T );
     172x = f$\(_1\)$( x ); // if x has + and - but not *
     173y = f$\(_2\)$( y ); // if y has * and - but not +
     174\end{cfa}
     175Here, the types of @x@ and @y@ are combined in the type-class contraints to provide secondary infomration for disambiguation.
     176This approach handles many overloading cases because the contraints overlap completely or are disjoint
     177
     178A type class (trait) generically abstracts the set of the operations used in a function's implementation.
     179A type-class instance binds a specific type to the generic operations to form concrete instances, giving a name type-class.
     180Then Qualified types concisely express the operations required to convert an overloaded
     181The name type-class is used as a transfer function to convert an overloaded routine into a polymorphic routine that is uniquely qualified with the  name type-class.
     182\begin{cfa}
     183void foo_int_trait( special int trait for operations in this foo );
     184void foo_int_int_trait( special (int, int) trait for operations in this foo );
     185\end{cfa}
     186
     187
     188In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one.
     189Hence, the programmer does specify any additional information for the overloading to work.
     190Explicit overloading occurs when the compiler has to be told what operations are associated with a type  programmer separately defines the associate type and subsequently associates the type with overloaded name.
     191\end{enumerate}
     192
     193\begin{comment}
     194Date: Mon, 24 Feb 2025 11:26:12 -0500
     195Subject: Re: overloading
     196To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
     197CC: <f37yu@uwaterloo.ca>, <ajbeach@uwaterloo.ca>, <mlbrooks@uwaterloo.ca>,
     198        <alvin.zhang@uwaterloo.ca>, <lseo@plg.uwaterloo.ca>,
     199        <j82liang@uwaterloo.ca>
     200From: Gregor Richards <gregor.richards@uwaterloo.ca>
     201
     202Yes.
     203
     204With valediction,
     205  - Gregor Richards
     206
     207On 2/24/25 11:22, Peter A. Buhr wrote:
     208>      Gregor Richards <gregor.richards@uwaterloo.ca> writes:
     209>      In Haskell, `+` works for both because of typeclasses (inclusion
     210>      polymorphism), and so is also not an unresolved type.
     211>
     212> I'm making this up. The Haskell type-class is a trait, like an interface or
     213> abstract class, and its usage/declaration/binding creates a specific trait
     214> instance for bound types, which is a vtable filled with the typed routines
     215> instantiated/located for the trait. The vtables are present at runtime and
     216> passed implicitly to ad-hoc polymorphic routines allowing differentiate of
     217> overloaded functions based on the number of traits and their specialization.
     218> (Major geek talk, YA! 8-)
     219>
     220>      On 2/21/25 23:04, Fangren Yu wrote:
     221>      > In a statically typed language I would rather have definitions like
     222>      > double x = x+x be ambiguous than "an unresolved type" as the latter
     223>      > sounds like a weaker version of a generic type, and being able to make
     224>      > something generic without explicitly saying so is probably not a good
     225>      > idea. Giving the unspecified parameter type an arbitrary preference is
     226>      > the second best option IMO (does ML give you a warning on such not
     227>      > fully inferred types?)
     228>      > ------------------------------------------------------------------------
     229>      > *From:* Gregor Richards <gregor.richards@uwaterloo.ca>
     230>      > *Sent:* Wednesday, February 19, 2025 9:55:23 PM
     231>      > *To:* Peter Buhr <pabuhr@uwaterloo.ca>
     232>      > *Cc:* Andrew James Beach <ajbeach@uwaterloo.ca>; Michael Leslie Brooks
     233>      > <mlbrooks@uwaterloo.ca>; Fangren Yu <f37yu@uwaterloo.ca>;
     234>      > j82liang@uwaterloo.ca <j82liang@uwaterloo.ca>; Alvin Zhang
     235>      > <alvin.zhang@uwaterloo.ca>; lseo@plg.uwaterloo.ca <lseo@plg.uwaterloo.ca>
     236>      > *Subject:* Re: overloading
     237>      > Jives with what I was saying, albeit not exactly the same; it's a result
     238>      > of the same problem.
     239>      >
     240>      > 'doubles' refers to an unresolved 'double', and the latter can't be
     241>      > resolved without the former, so you can't compile 'double' unless you
     242>      > know what its arguments are. The solutions are:
     243>      >
     244>      > * Typeclasses make it possible by compiling with a handle. When you
     245>      > call a function that takes a typeclass value as an argument, it takes an
     246>      > extra, hidden argument internally which is the typeclass handle. That
     247>      > handle tells the callee how to use the typeclass functions with this
     248>      > particular value. And, of course, you hope that some aggressive inlining
     249>      > gets rid of the dynamic dispatch :). But, no per se overloading
     250>      > supported, only inclusion polymorphism.
     251>      >
     252>      > * If you do whole-world compilation, then you just compile what you
     253>      > particularly need in context. If you call 'doubles' with a
     254>      > float,int,int, then you compile that version. But, currying is unsolved.
     255>      >
     256>      > * If you do C++-style templating, this is a less severe problem, as
     257>      > you compile it with the use of 'doubles', not with the definition. But,
     258>      > either no currying, or you have to specify the extra types explicitly so
     259>      > it knows what to curry, so no inference.
     260>      >
     261>      > * If you do Java-style generics, ... just kidding.
     262>      >
     263>      > In a language like Haskell or OCaml, if you want to compile this
     264>      > modularly and have the code with the implementation, then naively it
     265>      > would have to make eight implementations. But, even that is only true if
     266>      > (x, y, z) is a single tuple argument. Currying is still the killer. If
     267>      > you call `doubles 3`, the return is supposed to be some x -> y -> (int, x,
     268>      > y), where 'x' and 'y' are "some type on which I can call a 'double'
     269>      > function, but I don't know which double function yet because I don't
     270>      > know what type". Even *writing* that type is hard enough, but having
     271>      > values of that type float around at runtime? Yikes.
     272>      >
     273>      > To put it a different way: In C++ (and presumably CFA?), you can
     274>      > overload all you want to, but you can't get a function pointer to an
     275>      > unresolved overload. The function pointer is to a *particular* overload,
     276>      > not the set of possible overloads. Well, in a functional language,
     277>      > function pointers are the lifeblood of the language.
     278>      >
     279>      > With valediction,
     280>      > - Gregor Richards
     281>      >
     282>      > On 2/19/25 21:25, Peter A. Buhr wrote:
     283>      > > In the "Type Classes" chapter I sent out, the author says the
     284>      > following. Does it
     285>      > > jive with what you are saying about currying? BTW, I do not know who
     286>      > wrote the
     287>      > > book chapter.
     288>      > >
     289>      > >
     290>      > ==========================================================================
     291>      > >
     292>      > > Suppose we have a language that overloads addition + and
     293>      > multiplication *,
     294>      > > providing versions that work over values of type Int and type Float.
     295>      > Now,
     296>      > > consider the double function, written in terms of the overloaded
     297>      > addition
     298>      > > operation:
     299>      > >
     300>      > > double x = x + x
     301>      > >
     302>      > > What does this definition mean? A naive interpretation would be to
     303>      > say that
     304>      > > double is also overloaded, defining one function of type Int -> Int
     305>      > -> Int and a
     306>      > > second of type Float -> Float -> Float. All seems fine, until we
     307>      > consider the
     308>      > > function
     309>      > >
     310>      > > doubles (x,y,z) = (double x, double y, double z)
     311>      > >
     312>      > > Under the proposed scheme, this definition would give rise to eight
     313>      > different
     314>      > > versions! This approach has not been widely used because of the
     315>      > exponential
     316>      > > growth in the number of versions.
     317>      > >
     318>      > > To avoid this blow-up, language designers have sometimes restricted the
     319>      > > definition of overloaded functions. In this approach, which was
     320>      > adopted in
     321>      > > Standard ML, basic operations can be overloaded, but not functions
     322>      > defined in
     323>      > > terms of them. Instead, the language design specifies one of the
     324>      > possible
     325>      > > versions as the meaning of the function. For example, Standard ML give
     326>      > > preference to the type int over real, so the type (and
     327>      > implementation) of the
     328>      > > function double would be int -> int. If the programmer wanted to
     329>      > define a double
     330>      > > function over floating point numbers, she would have to explicitly
     331>      > write the
     332>      > > type of the function in its definition and give the function a name
     333>      > distinct
     334>      > > from the double function on integers. This approach is not particularly
     335>      > > satisfying, because it violates a general principle of language
     336>      > design: giving
     337>      > > the compiler the ability to define features that programmers cannot.
     338>
     339>      [2:text/html Show Save:noname (10kB)]
     340\end{comment}
    40341
    41342
     
    58359Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type.
    59360Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
    60 Without implicit conversions, programmers must write an exponential number of functions covering all possible exact-match cases among all possible types.
    61361This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
    62362Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast.
     
    188488% https://dl.acm.org/doi/10.1145/75277.75283
    189489
    190 \begin{minipage}{0.75\textwidth}
    191 \begin{tabular}{@{}r|ccccccc@{}}
    192 Feat.\,{\textbackslash}\,Lang.  & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Haskell       \\
     490\begin{minipage}{\linewidth}
     491\setlength{\tabcolsep}{5pt}
     492\begin{tabular}{@{}r|cccccccc@{}}
     493Feature\,{\textbackslash}\,Language     & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
    193494\hline
    194 Oper./Func./Meth. name  & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M \\
    195 parameter number                & yes   & yes   & yes   & yes   & yes   & yes   \\
    196 parameter types                 & yes   & yes   & yes   & yes   & yes   & yes   \\
    197 parameter name                  & no    & no    & no    & no    & yes   & yes   \\
    198 return type                             & yes   & no    & yes   & no    & no    & yes   \\
    199 Safe/Unsafe arg. conv.  & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U   & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S             & no\footnote{literals only, Int -> Double (Safe)}      \\
    200 generic                                 & no            & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes
     495Operator/Function/Method name   & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M & X     & X     \\
     496generic name                                    & no    & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes   & X     & X \\
     497parameter number                                & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
     498parameter types                                 & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
     499parameter name                                  & no    & no    & no    & no    & yes   & yes   & X     & X     \\
     500return type                                             & yes   & no    & yes   & no    & no    & yes   & X     & X     \\
     501Safe/Unsafe argument conversion & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U
     502        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S
     503        & no\footnote{literals only, Int -> Double (Safe)}      & X     & X
    201504\end{tabular}
    202505\end{minipage}
  • TabularUnified doc/theses/fangren_yu_MMath/test.adb

    r62b5940 rf5bf3c2  
    1616        Function Func( V1 : Integer; V2 : Float ) return Float is begin return Float(V1) + V2; end;
    1717       
    18         Function "-"( L : Integer; R : Integer ) return Integer is begin return 3; end;
     18        Function "-"( L, R : Integer ) return Integer is begin return L + (-R); end; --  prevent infinite recusion
    1919       
    2020        i : Integer;
     
    2626            Re, Im : Float;
    2727        end record;
     28        c : Complex := (Re => 1.0, Im => 1.0);
    2829    Procedure Grind (X : Complex) is begin Put_Line( "Grind1" ); end;
    2930    Procedure Grind (X : Unbounded_String) is begin Put_Line( "Grind2" ); end;
    30         c : Complex;
    3131       
    3232        generic
     
    3939           Put_Line( "XXX" ); return X + X;   -- The formal operator "*".
    4040        end twice;
    41        
    42         generic
    43            type T is private;
    44            with function "+"( X, Y: T ) return T;
    45         function twice( X : T; Y : T ) return T;
    46        
    47         -- generic units cannot be overloaded
    48         function twice( X: T; Y : T ) return T is
    49         begin
    50            Put_Line( "XXX" ); return X + X;   -- The formal operator "*".
    51         end twice;
    5241
    5342        function Int_Twice is new Twice( Integer, "+" => "+" );
    5443        function float_Twice is new Twice( float, "+" => "+" );
     44       
     45        -- generic units cannot be overloaded
     46        -- generic
     47        --    type T is private;
     48        --    with function "+"( X, Y: T ) return T;
     49        -- function twice( X : T; Y : T ) return T;
     50        --
     51        -- function twice( X: T; Y : T ) return T is
     52        -- begin
     53        --    Put_Line( "XXX" ); return X + X;   -- The formal operator "*".
     54        -- end twice;
    5555begin
    5656        i := Random;
Note: See TracChangeset for help on using the changeset viewer.