Context Navigation

← Previous Changeset
Next Changeset →

Changeset f5bf3c2

Timestamp:

Mar 20, 2025, 1:21:53 PM (31 hours ago)

Author:

Peter A. Buhr <pabuhr@…>

Branches:

master

Parents:

62b5940

Message:

more proofreading on introduction chapter

Location:

doc/theses/fangren_yu_MMath

Files:

: 2 edited

intro.tex (modified) (5 diffs)
test.adb (modified) (3 diffs)

Legend:

: Unmodified
: Added
: Removed

TabularUnified doc/theses/fangren_yu_MMath/intro.tex ¶

-                      r62b5940
+                      rf5bf3c2
 In certain cases, if the resolver fails to find an exact assertion match, it attempts to find a \emph{best} match using reasonable type conversions.
 Hence, \CFA follows the current trend of replacing nominal inheritance with traits composed of assertions for type matching.
 The over-arching goal is to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language.
 Together, the resulting \CFA type-system has a number of unique features making it different from other programming languages with expressive, static, type-systems.
+The over-arching goal in \CFA is to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language.
+Together, the resulting type-system has a number of unique features making it different from other programming languages with expressive, static, type-systems.
 …
 These rules are then used to transform a language expression to a hardware expression.
 Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
 Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined once at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
+Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
 Expressibility, generalization, and safety are all bound up in a language's type system, and hence, directly affect the capability, build time, and correctness of program development.
 …
 \end{quote}
 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
+Experience from \CC and \CFA developers is that the type system implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
+In many cases, a programmer has no idea there are name clashes, as they are silently resolved, simplifying the development process.
+Depending on the language, any ambiguous cases are resolved explicitly using some form of qualification and/or cast.
+Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
+In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process.
+\newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
+This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique.
+This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user.
+As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
+While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
+Similarly, lexical nesting is another place where overloading occurs.
+For example, in object-oriented programming, class memeber names \newterm{shadow} names within members.
+Some programmers, qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
+Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
+Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing.
+Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast.
+Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}:
+\begin{quote}
+In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments.
+There may be several rules of limited extent which reduce the number of cases, but these are themselves ad hoc both in scope and content.
+All the ordinary arithmetic operators and functions come into this category.
+It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00}
+\end{quote}
+where a \newterm{transfer function} is an implicit conversion to help find a matching overload:
+\begin{quote}
+The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap.
+Thus for example 3 may be an integer or a real and it may be necessary to change it from one type to the other.
+The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00}
+\end{quote}
+The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument.
+A similar differentiation is applicable for overloading and default parameters.
+\begin{cquote}
+\setlength{\tabcolsep}{10pt}
+\begin{tabular}{@{}lll@{}}
+\multicolumn{1}{c}{\textbf{different implementations}}  & \multicolumn{1}{c}{\textbf{same implementation}} \\
+\begin{cfa}
+void foo( int );
+void foo( int, int );
+\end{cfa}
+&
+\begin{cfa}
+void foo( int, int = 5 ); // default value
+\end{cfa}
+\end{tabular}
+\end{cquote}
+However, this distinguishing characteristic is vague.
+For example, should the operation @abs@ be overloaded or polymorphic or both?
+\begin{cquote}
+\setlength{\tabcolsep}{10pt}
+\begin{tabular}{@{}lll@{}}
+\multicolumn{1}{c}{\textbf{overloading}}        & \multicolumn{1}{c}{\textbf{polymorphic}} \\
+\begin{cfa}
+int abs( int );
+double abs( double );
+\end{cfa}
+&
+\begin{cfa}
+forall( T | { void ?{}( T &, zero_t ); int ?<?( T, T ); T -?( T ); } )
+T abs( T );
+\end{cfa}
+\end{tabular}
+\end{cquote}
+Here, there are performance advantages for having specializations and code-reuse advantages for the generalization.
+The Strachey definitions raise several questions.
+\begin{enumerate}[leftmargin=*]
+\item
+Is overloading polymorphism?
+\noindent
+In type theory, polymorphism allows an overloaded type name to represent multiple different types.
+For example, generic types overload the type name for a container type.
+\begin{cfa}
+@List@<int> li;    @List@<double> ld;    @List@<struct S> ls;
+\end{cfa}
+For subtyping, a derived type masquerades as a base type, where the base and derived names cannot be overloaded.
+Instead, the mechanism relies on structural typing among the types.
+In both cases, the polymorphic mechanisms apply in the type domain and the names are in the type namespace.
+In the following C example:
+\begin{cfa}
+struct S {};
+struct S S;
+enum E { E };
+\end{cfa}
+the names @S@ and @E@ exist is the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.
+On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type.
+\begin{cfa}
+@double@ foo( @int@ );    @int@ foo( @void@ );    @int@ foo( @double, double@ );
+@double@ foo;    @char@ foo;    @int@ foo;
+\end{cfa}
+Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
+Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types.
+\item
+Does ad-hoc polymorphism have a single systematic way of determining the type of the result from the type of the arguments?
+\noindent
+For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing.
+This matching is just as robust as other polymorphic analysis.
+The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
+Note, conversion issues apply to non-overloaded and overloaded functions.
+Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
+The difference in opinion results when the language conversion rules differ from a programmer's expectations.
+However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types.
+\CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects.
+That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns.
+Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right.
+\item
+Why are there two forms of \emph{overloading} (regular and type class) in different programming languages?
+\noindent
+\newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading).
+If a return type is specified, the compiler does not have to inference the routine body.
+For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
+In this context, there is a fixed set of overloads for a given name that are completely specified.
+Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context.
+If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions).
+As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments.
+\newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing.
+\begin{lstlisting}[language=Haskell]
+f( int, int );  f( int, float ); -- return types to be determined
+g( int, int );  g( float, int );
+let x = curry f( 3, _ ); -- which f
+let y = curry g( _ , 3 ); -- which g
+\end{lstlisting}
+For the currying to succeed, there cannot be overloaded function names resulting in ambiguities.
+To allow currying to succeed requires an implicit disambiguating mechanism, \ie a kind of transfer function.
+A type class~\cite{typeclass} is a mechanism to convert overloading into parametric polymorphism.
+Parametric polymorphism has enough information to disambiguate the overloaded names because it removes the type inferencing.
+\begin{cfa}
+forall( T | T has + $and$ - ) T f$\(_1\)$( T );
+forall( T | T has * $and$ - ) T f$\(_2\)$( T );
+x = f$\(_1\)$( x ); // if x has + and - but not *
+y = f$\(_2\)$( y ); // if y has * and - but not +
+\end{cfa}
+Here, the types of @x@ and @y@ are combined in the type-class contraints to provide secondary infomration for disambiguation.
+This approach handles many overloading cases because the contraints overlap completely or are disjoint
+A type class (trait) generically abstracts the set of the operations used in a function's implementation.
+A type-class instance binds a specific type to the generic operations to form concrete instances, giving a name type-class.
+Then Qualified types concisely express the operations required to convert an overloaded
+The name type-class is used as a transfer function to convert an overloaded routine into a polymorphic routine that is uniquely qualified with the  name type-class.
+\begin{cfa}
+void foo_int_trait( special int trait for operations in this foo );
+void foo_int_int_trait( special (int, int) trait for operations in this foo );
+\end{cfa}
+In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one.
+Hence, the programmer does specify any additional information for the overloading to work.
+Explicit overloading occurs when the compiler has to be told what operations are associated with a type  programmer separately defines the associate type and subsequently associates the type with overloaded name.
+\end{enumerate}
+\begin{comment}
+Date: Mon, 24 Feb 2025 11:26:12 -0500
+Subject: Re: overloading
+To: "Peter A. Buhr" <pabuhr@uwaterloo.ca>
+CC: <f37yu@uwaterloo.ca>, <ajbeach@uwaterloo.ca>, <mlbrooks@uwaterloo.ca>,
+        <alvin.zhang@uwaterloo.ca>, <lseo@plg.uwaterloo.ca>,
+        <j82liang@uwaterloo.ca>
+From: Gregor Richards <gregor.richards@uwaterloo.ca>
+Yes.
+With valediction,
+  - Gregor Richards
+On 2/24/25 11:22, Peter A. Buhr wrote:
+>      Gregor Richards <gregor.richards@uwaterloo.ca> writes:
+>      In Haskell, `+` works for both because of typeclasses (inclusion
+>      polymorphism), and so is also not an unresolved type.
+>
+> I'm making this up. The Haskell type-class is a trait, like an interface or
+> abstract class, and its usage/declaration/binding creates a specific trait
+> instance for bound types, which is a vtable filled with the typed routines
+> instantiated/located for the trait. The vtables are present at runtime and
+> passed implicitly to ad-hoc polymorphic routines allowing differentiate of
+> overloaded functions based on the number of traits and their specialization.
+> (Major geek talk, YA! 8-)
+>
+>      On 2/21/25 23:04, Fangren Yu wrote:
+>      > In a statically typed language I would rather have definitions like
+>      > double x = x+x be ambiguous than "an unresolved type" as the latter
+>      > sounds like a weaker version of a generic type, and being able to make
+>      > something generic without explicitly saying so is probably not a good
+>      > idea. Giving the unspecified parameter type an arbitrary preference is
+>      > the second best option IMO (does ML give you a warning on such not
+>      > fully inferred types?)
+>      > ------------------------------------------------------------------------
+>      > *From:* Gregor Richards <gregor.richards@uwaterloo.ca>
+>      > *Sent:* Wednesday, February 19, 2025 9:55:23 PM
+>      > *To:* Peter Buhr <pabuhr@uwaterloo.ca>
+>      > *Cc:* Andrew James Beach <ajbeach@uwaterloo.ca>; Michael Leslie Brooks
+>      > <mlbrooks@uwaterloo.ca>; Fangren Yu <f37yu@uwaterloo.ca>;
+>      > j82liang@uwaterloo.ca <j82liang@uwaterloo.ca>; Alvin Zhang
+>      > <alvin.zhang@uwaterloo.ca>; lseo@plg.uwaterloo.ca <lseo@plg.uwaterloo.ca>
+>      > *Subject:* Re: overloading
+>      > Jives with what I was saying, albeit not exactly the same; it's a result
+>      > of the same problem.
+>      >
+>      > 'doubles' refers to an unresolved 'double', and the latter can't be
+>      > resolved without the former, so you can't compile 'double' unless you
+>      > know what its arguments are. The solutions are:
+>      >
+>      > * Typeclasses make it possible by compiling with a handle. When you
+>      > call a function that takes a typeclass value as an argument, it takes an
+>      > extra, hidden argument internally which is the typeclass handle. That
+>      > handle tells the callee how to use the typeclass functions with this
+>      > particular value. And, of course, you hope that some aggressive inlining
+>      > gets rid of the dynamic dispatch :). But, no per se overloading
+>      > supported, only inclusion polymorphism.
+>      >
+>      > * If you do whole-world compilation, then you just compile what you
+>      > particularly need in context. If you call 'doubles' with a
+>      > float,int,int, then you compile that version. But, currying is unsolved.
+>      >
+>      > * If you do C++-style templating, this is a less severe problem, as
+>      > you compile it with the use of 'doubles', not with the definition. But,
+>      > either no currying, or you have to specify the extra types explicitly so
+>      > it knows what to curry, so no inference.
+>      >
+>      > * If you do Java-style generics, ... just kidding.
+>      >
+>      > In a language like Haskell or OCaml, if you want to compile this
+>      > modularly and have the code with the implementation, then naively it
+>      > would have to make eight implementations. But, even that is only true if
+>      > (x, y, z) is a single tuple argument. Currying is still the killer. If
+>      > you call `doubles 3`, the return is supposed to be some x -> y -> (int, x,
+>      > y), where 'x' and 'y' are "some type on which I can call a 'double'
+>      > function, but I don't know which double function yet because I don't
+>      > know what type". Even *writing* that type is hard enough, but having
+>      > values of that type float around at runtime? Yikes.
+>      >
+>      > To put it a different way: In C++ (and presumably CFA?), you can
+>      > overload all you want to, but you can't get a function pointer to an
+>      > unresolved overload. The function pointer is to a *particular* overload,
+>      > not the set of possible overloads. Well, in a functional language,
+>      > function pointers are the lifeblood of the language.
+>      >
+>      > With valediction,
+>      > - Gregor Richards
+>      >
+>      > On 2/19/25 21:25, Peter A. Buhr wrote:
+>      > > In the "Type Classes" chapter I sent out, the author says the
+>      > following. Does it
+>      > > jive with what you are saying about currying? BTW, I do not know who
+>      > wrote the
+>      > > book chapter.
+>      > >
+>      > >
+>      > ==========================================================================
+>      > >
+>      > > Suppose we have a language that overloads addition + and
+>      > multiplication *,
+>      > > providing versions that work over values of type Int and type Float.
+>      > Now,
+>      > > consider the double function, written in terms of the overloaded
+>      > addition
+>      > > operation:
+>      > >
+>      > > double x = x + x
+>      > >
+>      > > What does this definition mean? A naive interpretation would be to
+>      > say that
+>      > > double is also overloaded, defining one function of type Int -> Int
+>      > -> Int and a
+>      > > second of type Float -> Float -> Float. All seems fine, until we
+>      > consider the
+>      > > function
+>      > >
+>      > > doubles (x,y,z) = (double x, double y, double z)
+>      > >
+>      > > Under the proposed scheme, this definition would give rise to eight
+>      > different
+>      > > versions! This approach has not been widely used because of the
+>      > exponential
+>      > > growth in the number of versions.
+>      > >
+>      > > To avoid this blow-up, language designers have sometimes restricted the
+>      > > definition of overloaded functions. In this approach, which was
+>      > adopted in
+>      > > Standard ML, basic operations can be overloaded, but not functions
+>      > defined in
+>      > > terms of them. Instead, the language design specifies one of the
+>      > possible
+>      > > versions as the meaning of the function. For example, Standard ML give
+>      > > preference to the type int over real, so the type (and
+>      > implementation) of the
+>      > > function double would be int -> int. If the programmer wanted to
+>      > define a double
+>      > > function over floating point numbers, she would have to explicitly
+>      > write the
+>      > > type of the function in its definition and give the function a name
+>      > distinct
+>      > > from the double function on integers. This approach is not particularly
+>      > > satisfying, because it violates a general principle of language
+>      > design: giving
+>      > > the compiler the ability to define features that programmers cannot.
+>
+>      [2:text/html Show Save:noname (10kB)]
+\end{comment}
 …
 Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be converted to a common type.
 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
-Without implicit conversions, programmers must write an exponential number of functions covering all possible exact-match cases among all possible types.
 This approach does not match with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
 Depending on the language, mix-mode conversions can be explicitly controlled using some form of cast.
 …
 % https://dl.acm.org/doi/10.1145/75277.75283
+\begin{minipage}{0.75\textwidth}
+\begin{tabular}{@{}r|ccccccc@{}}
+Feat.\,{\textbackslash}\,Lang.  & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Haskell       \\
+\begin{minipage}{\linewidth}
+\setlength{\tabcolsep}{5pt}
+\begin{tabular}{@{}r|cccccccc@{}}
+Feature\,{\textbackslash}\,Language     & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
 \hline
+Oper./Func./Meth. name  & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M \\
+parameter number                & yes   & yes   & yes   & yes   & yes   & yes   \\
+parameter types                 & yes   & yes   & yes   & yes   & yes   & yes   \\
+parameter name                  & no    & no    & no    & no    & yes   & yes   \\
+return type                             & yes   & no    & yes   & no    & no    & yes   \\
+Safe/Unsafe arg. conv.  & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U   & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S             & no\footnote{literals only, Int -> Double (Safe)}      \\
+generic                                 & no            & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes
+Operator/Function/Method name   & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M & X     & X     \\
+generic name                                    & no    & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes   & X     & X \\
+parameter number                                & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
+parameter types                                 & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
+parameter name                                  & no    & no    & no    & no    & yes   & yes   & X     & X     \\
+return type                                             & yes   & no    & yes   & no    & no    & yes   & X     & X     \\
+Safe/Unsafe argument conversion & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U
+        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S
+        & no\footnote{literals only, Int -> Double (Safe)}      & X     & X
 \end{tabular}
 \end{minipage}

TabularUnified doc/theses/fangren_yu_MMath/test.adb ¶

-                      r62b5940
+                      rf5bf3c2
         Function Func( V1 : Integer; V2 : Float ) return Float is begin return Float(V1) + V2; end;
         Function "-"( L : Integer; R : Integer ) return Integer is begin return 3; end;
+        Function "-"( L, R : Integer ) return Integer is begin return L + (-R); end; --  prevent infinite recusion
         i : Integer;
 …
             Re, Im : Float;
         end record;
+        c : Complex := (Re => 1.0, Im => 1.0);
     Procedure Grind (X : Complex) is begin Put_Line( "Grind1" ); end;
     Procedure Grind (X : Unbounded_String) is begin Put_Line( "Grind2" ); end;
-        c : Complex;
         generic
 …
            Put_Line( "XXX" ); return X + X;   -- The formal operator "*".
         end twice;
-        generic
-           type T is private;
-           with function "+"( X, Y: T ) return T;
-        function twice( X : T; Y : T ) return T;
-        -- generic units cannot be overloaded
-        function twice( X: T; Y : T ) return T is
-        begin
-           Put_Line( "XXX" ); return X + X;   -- The formal operator "*".
-        end twice;
         function Int_Twice is new Twice( Integer, "+" => "+" );
         function float_Twice is new Twice( float, "+" => "+" );
+        -- generic units cannot be overloaded
+        -- generic
+        --    type T is private;
+        --    with function "+"( X, Y: T ) return T;
+        -- function twice( X : T; Y : T ) return T;
+        --
+        -- function twice( X: T; Y : T ) return T is
+        -- begin
+        --    Put_Line( "XXX" ); return X + X;   -- The formal operator "*".
+        -- end twice;
 begin
         i := Random;

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset f5bf3c2

Legend:

TabularUnified doc/theses/fangren_yu_MMath/intro.tex ¶

TabularUnified doc/theses/fangren_yu_MMath/test.adb ¶

Download in other formats: