source: doc/theses/fangren_yu_MMath/intro.tex @ f5bf3c2

Last change on this file since f5bf3c2 was f5bf3c2, checked in by Peter A. Buhr <pabuhr@…>, 4 days ago

more proofreading on introduction chapter

\chapter{Introduction}

This thesis is exploratory work I did to understand, fix, and extend the \CFA type-system, specifically, the \newterm{type-resolver} used to satisfy call-site assertions among overloaded variable and function names to allow polymorphic routine calls.
Assertions are the operations a function uses within its body to perform its computation.
For example, a polymorphic function summing an array needs a size, zero, assignment, and plus for the array element-type, and a subscript operation for the array type.
\begin{cfa}
T sum( T a[$\,$], size_t size ) {
        @T@ total = { @0@ };$\C[1.75in]{// size, 0 for type T}$
        for ( size_t i = 0; i < size; i += 1 )
                total @+=@ a@[@i@]@; $\C{// + and subscript for T}\CRT$
        return total;
}
\end{cfa}
In certain cases, if the resolver fails to find an exact assertion match, it attempts to find a \emph{best} match using reasonable type conversions.
Hence, \CFA follows the current trend of replacing nominal inheritance with traits composed of assertions for type matching.
The over-arching goal in \CFA is to push the boundary on localized assertion matching, with advanced overloading resolution and type conversions that match programmer expectations in the C programming language.
Together, the resulting type-system has a number of unique features making it different from other programming languages with expressive, static type-systems.


\section{Types}

All computers have multiple types because computer architects optimize the hardware around a few basic types with well-defined (mathematical) operations: boolean, integral, floating-point, and occasionally strings.
A programming language and its compiler present ways to declare types that ultimately map into the ones provided by the underlying hardware.
These language types are thrust upon programmers with their syntactic/semantic rules and restrictions.
These rules are then used to transform a language expression to a hardware expression.
Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
Type systems can be static, where each variable has a fixed type during execution and an expression's type is determined at compile time, or dynamic, where each variable can change type during execution and so an expression's type is reconstructed on each evaluation.
Expressibility, generalization, and safety are all bound up in a language's type system, and hence, directly affect the capability, build time, and correctness of program development.


\section{Overloading}

\begin{quote}
There are only two hard things in Computer Science: cache invalidation and \emph{naming things}. --- Phil Karlton
\end{quote}
Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguate the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process.

\newterm{Namespace pollution} refers to loading the global or other namespaces with many names, resulting in paranoia that the compiler could make wrong choices for overloaded names causing failure.
This fear leads to coding styles where names are partitioned by language mechanisms and qualification is used to make names unique.
This approach defeats the purpose of overloading and places an additional coding burden on both the code developer and user.
As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification.
While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated.
Similarly, lexical nesting is another place where overloading occurs.
For example, in object-oriented programming, names declared within member functions \newterm{shadow} class member names.
Some programmers qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.
Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@.
Again, coding styles exist requiring all variables in nested blocks to be unique to prevent name shadowing.
Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast.

Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}:
\begin{quote}
In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments.
There may be several rules of limited extent which reduce the number of cases, but these are themselves ad hoc both in scope and content.
All the ordinary arithmetic operators and functions come into this category.
It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00}
\end{quote}
where a \newterm{transfer function} is an implicit conversion to help find a matching overload:
\begin{quote}
The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap.
Thus for example 3 may be an integer or a real and it may be necessary to change it from one type to the other.
The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00}
\end{quote}
The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument.
A similar differentiation is applicable for overloading and default parameters.
\begin{cquote}
\setlength{\tabcolsep}{10pt}
\begin{tabular}{@{}lll@{}}
\multicolumn{1}{c}{\textbf{different implementations}}  & \multicolumn{1}{c}{\textbf{same implementation}} \\
\begin{cfa}
void foo( int );
void foo( int, int );
\end{cfa}
&
\begin{cfa}
void foo( int, int = 5 ); // default value

\end{cfa}
\end{tabular}
\end{cquote}
However, this distinguishing characteristic is vague.
For example, should the operation @abs@ be overloaded or polymorphic or both?
\begin{cquote}
\setlength{\tabcolsep}{10pt}
\begin{tabular}{@{}lll@{}}
\multicolumn{1}{c}{\textbf{overloading}}        & \multicolumn{1}{c}{\textbf{polymorphic}} \\
\begin{cfa}
int abs( int );
double abs( double );
\end{cfa}
&
\begin{cfa}
forall( T | { void ?{}( T &, zero_t ); int ?<?( T, T ); T -?( T ); } )
T abs( T );
\end{cfa}
\end{tabular}
\end{cquote}
Here, there are performance advantages for having specializations and code-reuse advantages for the generalization.
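
As a rough \CC analogue (a sketch only; the names @abs_overload@, @abs_generic@, and @demo_abs@ are hypothetical and appear nowhere in \CFA or this thesis), both forms can coexist: the overloads supply one specialized body per type, while the template supplies a single body for any type providing a zero value, @<@, and unary @-@.
Using @T{}@ as the zero value is an assumption standing in for the @zero_t@ constructor assertion.
\begin{lstlisting}[language=C++]
#include <cassert>

// Overloading: a separate implementation is written for each type.
int    abs_overload( int x )    { return x < 0 ? -x : x; }
double abs_overload( double x ) { return x < 0.0 ? -x : x; }

// Parametric polymorphism: one implementation for every T that provides
// a zero value (written T{} here), operator<, and unary operator-.
template<typename T>
T abs_generic( T x ) { return x < T{} ? -x : x; }

// Usage: the overloads are chosen by argument type; the template is
// instantiated once per distinct argument type.
void demo_abs() {
	assert( abs_overload( -3 ) == 3 );      // int version
	assert( abs_overload( -2.5 ) == 2.5 );  // double version
	assert( abs_generic( -7L ) == 7L );     // T deduced as long
	assert( abs_generic( -1.5f ) == 1.5f ); // T deduced as float
}
\end{lstlisting}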

The Strachey definitions raise several questions.
\begin{enumerate}[leftmargin=*]
\item
Is overloading polymorphism?

\noindent
In type theory, polymorphism allows an overloaded type name to represent multiple different types.
For example, generic types overload the type name for a container type.
\begin{cfa}
@List@<int> li;    @List@<double> ld;    @List@<struct S> ls;
\end{cfa}
For subtyping, a derived type masquerades as a base type, where the base and derived names cannot be overloaded.
Instead, the mechanism relies on structural typing among the types.
In both cases, the polymorphic mechanisms apply in the type domain and the names are in the type namespace.
In the following C example:
\begin{cfa}
struct S {};
struct S S;
enum E { E };
\end{cfa}
the names @S@ and @E@ exist in the type and object domains, and C uses the type kinds @struct@ and @enum@ to disambiguate the names.

On the other hand, in ad-hoc overloading of variables and/or functions, the names are in the object domain and each overloaded object name has an anonymous associated type.
\begin{cfa}
@double@ foo( @int@ );    @int@ foo( @void@ );    @int@ foo( @double, double@ );
@double@ foo;    @char@ foo;    @int@ foo;
\end{cfa}
Notice, the associated type cannot be extracted using @typeof@/\lstinline[language={[11]C++}]{decltype} for typing purposes, as @typeof( foo )@ is always ambiguous.
Hence, overloading may not be polymorphism, as no single overloaded entity represents multiple types.

\item
Does ad-hoc polymorphism have a single systematic way of determining the type of the result from the type of the arguments?

\noindent
For exact type matches in overloading, there is a systematic way of matching arguments to parameters, and a function denotes its return type rather than using type inferencing.
This matching is just as robust as other polymorphic analysis.
The ad-hoc aspect is the implicit transfer functions (conversions) applied to arguments to create an exact parameter type-match, as there may be multiple conversions leading to different exact matches.
Note, conversion issues apply to non-overloaded and overloaded functions.
Here, the selection of the conversion functions is based on the \emph{opinion} of the language (type system), even if the technique used is based on sound rules, like maximizing conversion accuracy (non-lossy).
The difference in opinion results when the language conversion rules differ from a programmer's expectations.
However, without implicit conversions, programmers may have to write an exponential number of functions covering all possible exact-match cases among all reasonable types.
\CFA's \emph{opinion} on conversions must match C's and then rely on programmers to understand the effects.
That is, let the compiler do the heavy-lifting of selecting a \emph{best} set of conversions that minimizes safety concerns.
Hence, removing implicit conversions from \CFA is not an option, so it must do the best possible job to get it right.

\item
Why are there two forms of \emph{overloading} (regular and type class) in different programming languages?

\noindent
\newterm{Regular overloading} occurs when the type-system \emph{knows} a function's argument and return types (or a variable's type for variable overloading).
If a return type is specified, the compiler does not have to infer it from the routine body.
For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators.
In this context, there is a fixed set of overloads for a given name that are completely specified.
Overload resolution then involves finding an exact match between a call and the overload prototypes based on argument type(s) and possibly return context.
If an \emph{exact} match is not found, the call is either ill formed (ambiguous) or further attempts are made to find a \emph{best} match using transfer functions (conversions).
As a consequence, no additional runtime information is needed per call, \ie the call is a direct transfer (branch) with pushed arguments.

\newterm{Type-class overloading} occurs when the compiler is using currying for type inferencing.
\begin{lstlisting}[language=Haskell]
f( int, int );  f( int, float ); -- return types to be determined
g( int, int );  g( float, int );
let x = curry f( 3, _ ); -- which f
let y = curry g( _ , 3 ); -- which g
\end{lstlisting}
For the currying to succeed, there cannot be overloaded function names resulting in ambiguities.
Allowing currying to succeed requires an implicit disambiguating mechanism, \ie a kind of transfer function.
A type class~\cite{typeclass} is a mechanism to convert overloading into parametric polymorphism.
Parametric polymorphism has enough information to disambiguate the overloaded names because it removes the type inferencing.
\begin{cfa}
forall( T | T has + $and$ - ) T f$\(_1\)$( T );
forall( T | T has * $and$ - ) T f$\(_2\)$( T );
x = f$\(_1\)$( x ); // if x has + and - but not *
y = f$\(_2\)$( y ); // if y has * and - but not +
\end{cfa}
Here, the types of @x@ and @y@ are combined in the type-class constraints to provide secondary information for disambiguation.
This approach handles many overloading cases because the constraints overlap completely or are disjoint.

A type class (trait) generically abstracts the set of the operations used in a function's implementation.
A type-class instance binds a specific type to the generic operations to form concrete instances, giving a named type-class.
Qualified types then concisely express the operations required to convert an overloaded routine into a polymorphic one.
The named type-class is used as a transfer function to convert an overloaded routine into a polymorphic routine that is uniquely qualified with the named type-class.
\begin{cfa}
void foo_int_trait( special int trait for operations in this foo );
void foo_int_int_trait( special (int, int) trait for operations in this foo );
\end{cfa}


In this case, the compiler implicitly changes the overloaded function to a parametrically polymorphic one.
Hence, the programmer does not specify any additional information for the overloading to work.
Explicit overloading occurs when the compiler has to be told what operations are associated with a type: the programmer separately defines the associated type and subsequently associates the type with the overloaded name.
\end{enumerate}

\subsection{Operator Overloading}

Virtually all programming languages overload the arithmetic operators across the basic computational types using the number and type of parameters and returns.
However, in many programming languages, arithmetic operators are not first class, and hence, they cannot be overloaded by programmers.
Like \CC, \CFA maps operators to named functions allowing them to be overloaded with user-defined types.
The syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
Here, a user-defined type is extended with an addition operation with the same syntax as a builtin type.
\begin{cfa}
struct S { int i, j; };
S @?+?@( S op1, S op2 ) { return (S){ op1.i + op2.i, op1.j + op2.j }; }
S s1, s2;
s1 = s1 @+@ s2;                 $\C[1.75in]{// infix call}$
s1 = @?+?@( s1, s2 );   $\C{// direct call}\CRT$
\end{cfa}
The type system examines each call site and selects the best matching overloaded function based on the number and types of arguments.
If there are mixed-mode operands, @2 + 3.5@, the type system attempts (safe) conversions, like in C/\CC, converting the argument type(s) to the parameter type(s).
Conversions are necessary because the hardware rarely supports mixed-mode operations, so both operands must be converted to a common type.
Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process.
This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values.
Depending on the language, mixed-mode conversions can be explicitly controlled using some form of cast.
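
For comparison, the same extension can be sketched in \CC, which the text above names as using the same operator-to-function mapping.
This is an illustrative sketch, not code from the \CFA distribution; the helper names @add_infix@, @add_direct@, @mixed@, and @demo_operator@ are hypothetical.
\begin{lstlisting}[language=C++]
#include <cassert>

struct S { int i, j; };

// User-defined addition: fieldwise sum, playing the role of ?+? for S.
S operator+( S op1, S op2 ) { return S{ op1.i + op2.i, op1.j + op2.j }; }

// The infix call and the direct call resolve to the same function.
S add_infix( S a, S b )  { return a + b; }
S add_direct( S a, S b ) { return operator+( a, b ); }

// Mixed-mode operands: the int 2 is implicitly converted to double
// so both operands have a common type before the addition.
double mixed() { return 2 + 3.5; }

void demo_operator() {
	S s1{ 1, 2 }, s2{ 3, 4 };
	S r = add_infix( s1, s2 );
	S q = add_direct( s1, s2 );
	assert( r.i == 4 && r.j == 6 );
	assert( q.i == r.i && q.j == r.j );
	assert( mixed() == 5.5 );
}
\end{lstlisting}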


\subsection{Function Overloading}

Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns.
\begin{cfa}
void f( void );                 $\C[2in]{// (1): no parameter}$
void f( char );                 $\C{// (2): overloaded on the number and parameter type}$
void f( int, int );             $\C{// (3): overloaded on the number and parameter type}$
f( 'A' );                               $\C{// select (2)}\CRT$
\end{cfa}
In this case, the name @f@ is overloaded depending on the number and parameter types.
The type system examines each call site and selects the best match based on the number and types of the arguments.
Here, there is a perfect match for the call @f( 'A' )@ with the number and parameter type of function (2).
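
The selection by number and type of arguments can be observed directly in \CC.
In this sketch, the overload set mirrors (1) to (3) above, except that each function returns a label identifying itself; that return value and the name @demo_overload_set@ are assumptions added purely for illustration.
\begin{lstlisting}[language=C++]
#include <cassert>
#include <string>

// Hypothetical overload set mirroring (1)-(3); each returns its own label.
std::string f()           { return "(1)"; }
std::string f( char )     { return "(2)"; }
std::string f( int, int ) { return "(3)"; }

void demo_overload_set() {
	assert( f() == "(1)" );           // selected by argument count
	assert( f( 'A' ) == "(2)" );      // exact match on the parameter type
	assert( f( 1, 2 ) == "(3)" );     // exact match on both parameters
	assert( f( 'A', 'B' ) == "(3)" ); // best match: each char is promoted to int
	assert( f( 65 ) == "(2)" );       // only one-parameter candidate: int converted to char
}
\end{lstlisting}
The last two calls have no exact match, so the compiler falls back on implicit conversions to find a viable candidate.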

Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.
For example, in many programming languages with overloading, the following functions are ambiguous without using the return type.
\begin{cfa}
int f( int );                   $\C[2in]{// (1); overloaded on return type and parameter}$
double f( int );                $\C{// (2); overloaded on return type and parameter}$
int i = f( 3 );                 $\C{// select (1)}$
double d = f( 3 );              $\C{// select (2)}\CRT$
\end{cfa}
Alternatively, if the type system uses the return type, there is an exact match for each call, which again matches with programmer intuition and expectation.
This capability can be taken to the extreme, where the only differentiating factor is the return type.
\begin{cfa}
int random( void );             $\C[2in]{// (1); overloaded on return type}$
double random( void );  $\C{// (2); overloaded on return type}$
int i = random();               $\C{// select (1)}$
double d = random();    $\C{// select (2)}\CRT$
\end{cfa}
Again, there is an exact match for each call.
As for operator overloading, if there is no exact match, a minimal set of implicit conversions can be added to find a best match.
\begin{cfa}
short int si = random();        $\C[2in]{// select (1), unsafe}$
long double ld = random();      $\C{// select (2), safe}\CRT$
\end{cfa}
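
\CC does not overload on return type, but the effect can be approximated by returning a proxy object with one conversion operator per desired result type.
The sketch below is an emulation under that assumption, not how \CFA implements the feature; @Result@, @pick@, and @demo_return_type@ are hypothetical names, and the fixed values 42 and 0.5 stand in for the bodies of (1) and (2).
\begin{lstlisting}[language=C++]
#include <cassert>

// Proxy whose conversion operators stand in for the two overloads.
struct Result {
	operator int() const    { return 42; }  // plays the role of (1)
	operator double() const { return 0.5; } // plays the role of (2)
};

// A single function; the initialization context picks the conversion.
Result pick() { return Result{}; }

void demo_return_type() {
	int i = pick();    // context requires int: operator int is an exact match
	double d = pick(); // context requires double: operator double is an exact match
	assert( i == 42 );
	assert( d == 0.5 );
}
\end{lstlisting}
Unlike the \CFA examples above, initializing a @short@ or @long double@ from @pick()@ is expected to be rejected as ambiguous under the \CC rules, because both conversion operators then need a further conversion of equal rank.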


\subsection{Variable Overloading}

Unlike most programming languages, \CFA has variable overloading within a scope, along with shadow overloading in nested scopes.
Shadow overloading is also possible for functions, in languages supporting nested-function declarations, \eg \CC named, nested, lambda functions.
\begin{cfa}
void foo( double d );
int v;                              $\C[2in]{// (1)}$
double v;                               $\C{// (2) variable overloading}$
foo( v );                               $\C{// select (2)}$
{
        int v;                          $\C{// (3) shadow overloading}$
        double v;                       $\C{// (4) and variable overloading}$
        foo( v );                       $\C{// select (4)}\CRT$
}
\end{cfa}
It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems.
However, variable overloading within a scope is considered extremely dangerous, without any evidence to corroborate this claim.
In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems.

In \CFA, the type system simply treats an overloaded variable as an overloaded function returning a value with no parameters.
Hence, no effort is required to support this feature as it is available for differentiating among overloaded functions with no parameters.
\begin{cfa}
int MAX = 2147483647;   $\C[2in]{// (1); overloaded on return type}$
long int MAX = ...;             $\C{// (2); overloaded on return type}$
double MAX = ...;               $\C{// (3); overloaded on return type}$
int i = MAX;                    $\C{// select (1)}$
long int i = MAX;               $\C{// select (2)}$
double d = MAX;                 $\C{// select (3)}\CRT$
\end{cfa}
Hence, the name @MAX@ can replace all the C type-specific names, \eg @INT_MAX@, @LONG_MAX@, @DBL_MAX@, \etc.
The result is a significant reduction in names to access typed constants.
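
A similar single-name effect can be approximated in \CC with one object whose templated conversion operator yields the maximum of whatever type the context requests.
This is a sketch of an emulation, not the \CFA mechanism; the type @MaxValue@ and the function @demo_max@ are hypothetical, while @std::numeric_limits@ is the standard-library facility supplying the values.
\begin{lstlisting}[language=C++]
#include <cassert>
#include <cfloat>
#include <climits>
#include <limits>

// One object whose conversion operator is instantiated per target type.
struct MaxValue {
	template<typename T>
	constexpr operator T() const { return std::numeric_limits<T>::max(); }
};
constexpr MaxValue MAX{};

void demo_max() {
	int i = MAX;      // T deduced as int
	long int l = MAX; // T deduced as long int
	double d = MAX;   // T deduced as double
	assert( i == INT_MAX );
	assert( l == LONG_MAX );
	assert( d == DBL_MAX );
}
\end{lstlisting}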

As an aside, C has a separate namespace for types and variables allowing overloading between the namespaces, using @struct@ (qualification) to disambiguate.
\begin{cfa}
void S() {
        struct @S@ { int S; };
        @struct S@ S;
        void S( @struct S@ S ) { S.S = 1; };
}
\end{cfa}
Here the name @S@ is an aggregate type and field, and a variable and parameter of type @S@.


\subsection{Constant Overloading}

\CFA is unique in providing restricted constant overloading for the values @0@ and @1@, which have special status in C.
For example, the value @0@ is both an integer and a pointer literal, so its meaning depends on context.
In addition, several operations are defined in terms of values @0@ and @1@.
For example, @if@ and iteration statements in C compare the condition with @0@, and the increment and decrement operators are semantically equivalent to adding or subtracting the value @1@.
\begin{cfa}
if ( x ) ++x;        =>    if ( x @!= 0@ ) x @+= 1@;
for ( ; x; --x )   =>    for ( ; x @!= 0@; x @-= 1@ )
\end{cfa}
To generalize this feature, both constants are given types @zero_t@ and @one_t@ in \CFA, which allows overloading various operations for new types that seamlessly work within the special @0@ and @1@ contexts.
The types @zero_t@ and @one_t@ have special builtin implicit conversions to the various integral types, and a conversion to pointer types for @0@, which allows standard C code involving @0@ and @1@ to work.
\begin{cfa}
struct S { int i, j; };
void ?{}( S & s, zero_t ) { s.[i,j] = 0; } $\C{// constant constructors}$
void ?{}( S & s, one_t ) { s.[i,j] = 1; }
S ?=?( S & dst, zero_t ) { dst.[i,j] = 0; return dst; } $\C{// constant assignments}$
S ?=?( S & dst, one_t ) { dst.[i,j] = 1; return dst; }
S ?+=?( S & s, one_t ) { s.[i,j] += 1; return s; } $\C{// increment/decrement each field}$
S ?-=?( S & s, one_t ) { s.[i,j] -= 1; return s; }
int ?!=?( S s, zero_t ) { return s.i != 0 && s.j != 0; } $\C{// constant comparison}$
S s = @0@;                      $\C{// initialization}$
s = @0@;                        $\C{// assignments}$
s = @1@;
if ( @s@ ) @++s@;       $\C{// unary ++/-\,- come implicitly from +=/-=}$
\end{cfa}
Here, type @S@ is first-class with respect to the basic types, working with all existing implicit C mechanisms.


\section{Overload Resolution Strategies}

For languages with user-defined overloading, given an overloaded constant, variable, or (generic) function, there must exist strategies for differentiating among them and selecting the most appropriate one in a given context.
The criteria commonly used to match operator/function/method names with definitions are: number of parameters, parameter types, parameter order or name, return type, implicit argument type conversions (safe/unsafe), and generics, where some features are missing in certain programming languages.
\VRef[Table]{t:OverloadingFeatures} shows a subset of popular programming languages with overloading and the discriminating features used to disambiguate among overloadings.
The languages C, Go, and Rust have no overloading beyond basic types and operators.

\begin{table}
\caption{Overload Discriminating Features in Programming Languages}
\label{t:OverloadingFeatures}
\centering

% https://doc.rust-lang.org/rust-by-example/trait/impl_trait.html
% https://dl.acm.org/doi/10.1145/75277.75283

\begin{minipage}{\linewidth}
\setlength{\tabcolsep}{5pt}
\begin{tabular}{@{}r|cccccccc@{}}
Feature\,{\textbackslash}\,Language     & Ada   & \CC   & \CFA  & Java  & Scala & Swift & Rust & Haskell        \\
\hline
Operator/Function/Method name   & O\footnote{except assignment}/F       & O/F/M & O/F   & M     & O/M   & O/F/M & X     & X     \\
generic name                                    & no    & yes\footnote{compile-time only, using template expansion}     & yes   & yes   & yes   & yes   & X     & X \\
parameter number                                & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
parameter types                                 & yes   & yes   & yes   & yes   & yes   & yes   & X     & X     \\
parameter name                                  & no    & no    & no    & no    & yes   & yes   & X     & X     \\
return type                                             & yes   & no    & yes   & no    & no    & yes   & X     & X     \\
Safe/Unsafe argument conversion & none  & yes\footnote{no conversions allowed during template parameter deduction}      & S/U
        & S\footnote{unsafe (narrowing) conversion only allowed in assignment or initialization to a primitive variable}        & S
        & no\footnote{literals only, Int -> Double (Safe)}      & X     & X
\end{tabular}
\end{minipage}
\end{table}
507
508
509\section{Type Inferencing}
510\label{s:IntoTypeInferencing}
511
512Every variable has a type, but association between them can occur in different ways:
513at the point where the variable comes into existence (declaration) and/or on each assignment to the variable.
514\begin{cfa}
515double x;                               $\C{// type only}$
516float y = 3.1D;                 $\C{// type and initialization}$
517auto z = y;                             $\C{// initialization only}$
518z = "abc";                              $\C{// assignment}$
519\end{cfa}
520For type-only, the programmer specifies the initial type, which remains fixed for the variable's lifetime in statically typed languages.
For type-and-initialization, the specified and initialization types may not agree, requiring an implicit/explicit conversion.
522For initialization-only, the compiler may select the type by melding programmer and context information.
523When the compiler participates in type selection, it is called \newterm{type inferencing}.
524Note, type inferencing is different from type conversion: type inferencing \emph{discovers} a variable's type before setting its value, whereas conversion has two typed variables and performs a (possibly lossy) value conversion from one type to the other.
525Finally, for assignment, the current variable and expression types may not agree.
526Discovering a variable or function type is complex and has limitations.
The following covers these issues, and why this scheme is not amenable to the \CFA type system.
528
One of the first and most powerful type-inferencing systems is Hindley--Milner~\cite{Damas82}.
530Here, the type resolver starts with the types of the program constants used for initialization and these constant types flow throughout the program, setting all variable and expression types.
531\begin{cfa}
532auto f() {
533        x = 1;   y = 3.5;       $\C{// set types from constants}$
534        x = // expression involving x, y and other local initialized variables
535        y = // expression involving x, y and other local initialized variables
536        return x, y;
537}
538auto w = f();                   $\C{// typing flows outwards}$
539
540void f( auto x, auto y ) {
541        x = // expression involving x, y and other local initialized variables
542        y = // expression involving x, y and other local initialized variables
543}
544s = 1;   t = 3.5;               $\C{// set types from constants}$
545f( s, t );                              $\C{// typing flows inwards}$
546\end{cfa}
547In both overloads of @f@, the type system works from the constant initializations inwards and/or outwards to determine the types of all variables and functions.
548Like template meta-programming, there can be a new function generated for the second @f@ depending on the types of the arguments, assuming these types are meaningful in the body of @f@.
Inferring type constraints by analysing the body of @f@ is possible, and these constraints must be satisfied at each call site by the argument types;
550in this case, parametric polymorphism can allow separate compilation.
551In languages with type inferencing, there is often limited overloading to reduce the search space, which introduces the naming problem.
552Note, return-type inferencing goes in the opposite direction to Hindley--Milner: knowing the type of the result and flowing back through an expression to help select the best possible overloads, and possibly converting the constants for a best match.
553
554In simpler type-inferencing systems, such as C/\CC/\CFA, there are more specific usages.
555\begin{cquote}
556\setlength{\tabcolsep}{10pt}
557\begin{tabular}{@{}lll@{}}
558\multicolumn{1}{c}{\textbf{gcc / \CFA}} & \multicolumn{1}{c}{\textbf{\CC}} \\
559\begin{cfa}
560#define expr 3.0 * i
561typeof(expr) x = expr;
562int y;
563typeof(y) z = y;
564\end{cfa}
565&
566\begin{cfa}
567
568auto x = 3.0 * i;
569int y;
570auto z = y;
571\end{cfa}
572&
573\begin{cfa}
574
575// use type of initialization expression
576
577// use type of initialization expression
578\end{cfa}
579\end{tabular}
580\end{cquote}
581The two important capabilities are:
582\begin{itemize}[topsep=0pt]
583\item
584Not determining or writing long generic types, \eg, given deeply nested generic types.
585\begin{cfa}
586typedef T1(int).T2(float).T3(char).T @ST@;  $\C{// \CFA nested type declaration}$
@ST@ x, y, z;
588\end{cfa}
This issue is exacerbated with \CC templates, where type names are 100s of characters long, resulting in unreadable error messages.
590\item
Ensuring the types of secondary variables match a primary variable.
592\begin{cfa}
593int x; $\C{// primary variable}$
594typeof(x) y, z, w; $\C{// secondary variables match x's type}$
595\end{cfa}
596If the type of @x@ changes, the type of the secondary variables correspondingly updates.
597There can be strong software-engineering reasons for binding the types of these variables.
598\end{itemize}
599Note, the use of @typeof@ is more restrictive, and possibly safer, than general type-inferencing.
600\begin{cfa}
601int x;
typeof(x) y = ... // complex expression
typeof(x) z = ... // complex expression
604\end{cfa}
605Here, the types of @y@ and @z@ are fixed (branded), whereas with type inferencing, the types of @y@ and @z@ are potentially unknown.
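As a concrete illustration of branding, the following plain C sketch uses the gcc spelling @__typeof__@ (standardized as @typeof@ in C23); the function @brand@ is a hypothetical name used only here.
\begin{cfa}
// Store a value in a variable branded with the type of a primary variable.
static long int brand( double value ) {
	long int primary = 0;                     // primary variable fixes the type
	__typeof__( primary ) secondary = value;  // branded: double is converted to long int
	return primary + secondary;
}
\end{cfa}
Here, @secondary@ is always @long int@, so the @double@ argument is converted (truncated) rather than changing the variable's type, \eg @brand( 2.75 )@ returns 2.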
606
607
608\subsection{Type-Inferencing Issues}
609
610Each kind of type-inferencing system has its own set of issues that flow onto the programmer in the form of convenience, restrictions, or confusions.
611
612A convenience is having the compiler use its overarching program knowledge to select the best type for each variable based on some notion of \emph{best}, which simplifies the programming experience.
613
614A restriction is the conundrum in type inferencing of when to \emph{brand} a type.
615That is, when is the type of the variable/function more important than the type of its initialization expression(s).
616For example, if a change is made in an initialization expression, it can cascade type changes producing many other changes and/or errors.
617At some point, a variable's type needs to remain constant and the initializing expression needs to be modified or be in error when it changes.
Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the compiler can report a mismatch with the constant initialization.
619\begin{cfa}
620void f( @int@ x, @int@ y ) {  // brand function prototype
621        x = // expression involving x, y and other local initialized variables
622        y = // expression involving x, y and other local initialized variables
623}
624s = 1;   t = 3.5;
625f( s, @t@ ); // type mismatch
626\end{cfa}
627In Haskell, it is common for programmers to brand (type) function parameters.
628
629A confusion is blocks of code where all declarations are @auto@, as is now common in \CC.
630As a result, understanding and changing the code becomes almost impossible.
631Types provide important clues as to the behaviour of the code, and correspondingly to correctly change or add new code.
632In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on a fancy IDE that can re-engineer types for them.
633For example, given:
634\begin{cfa}
635auto x = @...@
636\end{cfa}
637and the need to write a routine to compute using @x@
638\begin{cfa}
639void rtn( @type of x@ parm );
640rtn( x );
641\end{cfa}
642A programmer must re-engineer the type of @x@'s initialization expression, reconstructing the possibly long generic type-name.
643In this situation, having the type name or its short alias is essential.
644
645\CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint the right types within an expression.
646Type inferencing defeats this goal because there is no left-hand type.
647Fundamentally, type inferencing tries to magic away variable types from the programmer.
648However, this results in lazy programming with the potential for poor performance and safety concerns.
649Types are as important as control-flow in writing a good program, and should not be masked, even if it requires the programmer to think!
650A similar issue is garbage collection, where storage management is magicked away, often resulting in poor program design and performance.\footnote{
651There are full-time Java consultants, who are hired to find memory-management problems in large Java programs.}
652The entire area of Computer-Science data-structures is obsessed with time and space, and that obsession should continue into regular programming.
653Understanding space and time issues is an essential part of the programming craft.
654Given @typedef@ and @typeof@ in \CFA, and the strong desire to use the left-hand type in resolution, the decision was made not to support implicit type-inferencing in the type system.
655Should a significant need arise, this decision can be revisited.
656
657
658\section{Polymorphism}
659
660\CFA provides polymorphic functions and types, where a polymorphic function can constrain types using assertions based on traits.
661
662
663\subsection{Polymorphic Function}
664
665The signature feature of the \CFA type-system is parametric-polymorphic functions~\cite{forceone:impl,Cormack90,Duggan96}, generalized using a @forall@ clause (giving the language its name).
666\begin{cfa}
667@forall( T )@ T identity( T val ) { return val; }
668int forty_two = identity( 42 );         $\C{// T is bound to int, forty\_two == 42}$
669\end{cfa}
670This @identity@ function can be applied to an \newterm{object type}, \ie a type with a known size and alignment, which is sufficient to stack allocate, default or copy initialize, assign, and delete.
671The \CFA implementation passes the size and alignment for each type parameter, as well as auto-generated default and copy constructors, assignment operator, and destructor.
672For an incomplete \newterm{data type}, \eg pointer/reference types, this information is not needed.
673\begin{cfa}
674forall( T * ) T * identity( T * val ) { return val; }
675int i, * ip = identity( &i );
676\end{cfa}
677Unlike \CC template functions, \CFA polymorphic functions are compatible with C \emph{separate compilation}, preventing compilation and code bloat.
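One way to picture how a single compiled function body can serve every object type is the following plain C sketch, where the size of @T@ is an explicit parameter and @memcpy@ stands in for the copy constructor.
This is an illustration of the idea only, not the code generated by the \CFA compiler; @identity_poly@ and its parameter names are hypothetical.
\begin{cfa}
#include <stddef.h>
#include <string.h>

// One compiled body for all types: the caller supplies the size of T and result storage.
static void identity_poly( size_t size_T, void * ret, const void * val ) {
	memcpy( ret, val, size_T );   // stands in for the copy constructor of T
}
\end{cfa}
A caller binds @T@ by passing @sizeof@ for the concrete type, together with the addresses of the argument and the result.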
678
679To constrain polymorphic types, \CFA uses \newterm{type assertions}~\cite[pp.~37-44]{Alphard} to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type variable.
680Here, the function @twice@ works for any type @T@ with a matching addition operator.
681\begin{cfa}
682forall( T @| { T ?+?(T, T); }@ ) T twice( T x ) { return x @+@ x; }
int val = twice( twice( 3 ) );	$\C{// val == 12}$
684\end{cfa}
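Conceptually, an assertion behaves like an extra function-pointer parameter supplied at the call site.
The following plain C sketch is an illustration under that assumption rather than the actual \CFA translation; @add_fn@, @twice_poly@, and @add_int@ are hypothetical names.
\begin{cfa}
// The assertion T ?+?( T, T ) becomes an explicit function-pointer parameter.
typedef void (* add_fn)( void * ret, const void * lhs, const void * rhs );

static void twice_poly( add_fn add, void * ret, const void * x ) {
	add( ret, x, x );             // x + x using the caller-supplied addition
}

// Addition for int, passed at call sites where T is bound to int.
static void add_int( void * ret, const void * lhs, const void * rhs ) {
	*(int *)ret = *(const int *)lhs + *(const int *)rhs;
}
\end{cfa}
Calling @twice_poly@ twice with @add_int@, starting from 3, reproduces the value 12 computed by the \CFA version.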
685Parametric polymorphism and assertions occur in existing type-unsafe (@void *@) C functions, like @qsort@ for sorting an array of unknown values.
686\begin{cfa}
687void qsort( void * base, size_t nmemb, size_t size, int (*cmp)( const void *, const void * ) );
688\end{cfa}
689Here, the polymorphism is type-erasure, and the parametric assertion is the comparison routine, which is explicitly passed.
690\begin{cfa}
691enum { N = 5 };
692double val[N] = { 5.1, 4.1, 3.1, 2.1, 1.1 };
693int cmp( const void * v1, const void * v2 ) { $\C{// compare two doubles}$
694        return *(double *)v1 < *(double *)v2 ? -1 : *(double *)v2 < *(double *)v1 ? 1 : 0;
695}
696qsort( val, N, sizeof( double ), cmp );
697\end{cfa}
698The equivalent type-safe version in \CFA is a wrapper over the C version.
699\begin{cfa}
700forall( ET | { int @?<?@( ET, ET ); } ) $\C{// type must have < operator}$
701void qsort( ET * vals, size_t dim ) {
702        int cmp( const void * t1, const void * t2 ) { $\C{// nested function}$
703                return *(ET *)t1 @<@ *(ET *)t2 ? -1 : *(ET *)t2 @<@ *(ET *)t1 ? 1 : 0;
704        }
705        qsort( vals, dim, sizeof(ET), cmp ); $\C{// call C version}$
706}
qsort( val, N );	$\C{// deduce type double, and pass builtin < for double}$
708\end{cfa}
709The nested function @cmp@ is implicitly built and provides the interface from typed \CFA to untyped (@void *@) C.
710Providing a hidden @cmp@ function in \CC is awkward as lambdas do not use C calling conventions and template declarations cannot appear in block scope.
711% In addition, an alternate kind of return is made available: position versus pointer to found element.
712% \CC's type system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a template @bsearch@.
713Call-site inferencing and nested functions provide a localized form of inheritance.
714For example, the \CFA @qsort@ can be made to sort in descending order by locally changing the behaviour of @<@.
715\begin{cfa}
716{
717        int ?<?( double x, double y ) { return x @>@ y; } $\C{// locally override behaviour}$
	qsort( val, N );	$\C{// descending sort}$
719}
720\end{cfa}
721The local version of @?<?@ overrides the built-in @?<?@ so it is passed to @qsort@.
722The local version performs @?>?@, making @qsort@ sort in descending order.
723Hence, any number of assertion functions can be overridden locally to maximize the reuse of existing functions and types, without the construction of a named inheritance hierarchy.
724A final example is a type-safe wrapper for C @malloc@, where the return type supplies the type/size of the allocation, which is impossible in most type systems.
725\begin{cfa}
726static inline forall( T & | sized(T) )
727T * malloc( void ) {
728        if ( _Alignof(T) <= __BIGGEST_ALIGNMENT__ ) return (T *)malloc( sizeof(T) ); // C allocation
729        else return (T *)memalign( _Alignof(T), sizeof(T) );
730}
731// select type and size from left-hand side
int * ip = malloc();  double * dp = malloc();  [[aligned(64)]] struct S {...} * sp = malloc();
733\end{cfa}
The @sized@ assertion passes size and alignment, because a data type has no implicit assertions.
Both values are used in @malloc@ via @sizeof@ and @_Alignof@.
In practice, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided, producing a type-safe call to @malloc@ or @memalign@.
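For comparison, a plain C approximation must name the type explicitly, because C cannot take it from the left-hand side.
The sketch below assumes C11 (@_Alignof@, @max_align_t@, @aligned_alloc@) instead of the gcc-specific @__BIGGEST_ALIGNMENT__@ and @memalign@ used above; @alloc@ and @alloc_sized@ are hypothetical names.
\begin{cfa}
#include <stddef.h>
#include <stdlib.h>

// Allocate storage for one T; the type is an explicit macro argument.
#define alloc( T ) ( (T *)alloc_sized( sizeof( T ), _Alignof( T ) ) )

static void * alloc_sized( size_t size, size_t align ) {
	if ( align <= _Alignof( max_align_t ) ) return malloc( size );  // ordinary alignment
	return aligned_alloc( align, size );   // over-aligned; size is a multiple of align
}
\end{cfa}
A call such as @int * ip = alloc( int )@ repeats the type on both sides, which is the redundancy the \CFA version removes.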
737
738This mechanism is used to construct type-safe wrapper-libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions.
739Here, existing C legacy code is leveraged as much as possible;
740other programming languages must build supporting libraries from scratch, even in \CC.
741
742
743\subsection{Traits}
744
745\CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each function declaration.
746\begin{cquote}
747\begin{tabular}{@{}l|@{\hspace{10pt}}l@{}}
748\begin{cfa}
749trait @sumable@( T ) {
750        void @?{}@( T &, zero_t ); // 0 literal constructor
751        T ?+?( T, T );           // assortment of additions
752        T @?+=?@( T &, T );
753        T ++?( T & );
754        T ?++( T & );
755};
756\end{cfa}
757&
758\begin{cfa}
759forall( T @| sumable( T )@ ) // use trait
760T sum( T a[$\,$], size_t size ) {
761        @T@ total = { @0@ };  // initialize by 0 constructor
762        for ( size_t i = 0; i < size; i += 1 )
763                total @+=@ a[i]; // select appropriate +
764        return total;
765}
766\end{cfa}
767\end{tabular}
768\end{cquote}
Traits are implemented by flattening them at use points, as if written in full by the programmer.
770Flattening often results in overlapping assertions, \eg operator @+@.
771Hence, trait names play no part in type equivalence.
772In the example, type @T@ is an object type, and hence, has the implicit internal trait @otype@.
773\begin{cfa}
774trait otype( T & | sized(T) ) {
775        void ?{}( T & );                                                $\C{// default constructor}$
776        void ?{}( T &, T );                                             $\C{// copy constructor}$
777        void ?=?( T &, T );                                             $\C{// assignment operator}$
778        void ^?{}( T & );                                               $\C{// destructor}$
779};
780\end{cfa}
These implicit routines are used by the @sumable@ operator @?+=?@ to copy its right-hand operand and its return value.
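A rough plain C picture of what @sum@ needs from its caller is a set of operations on the element type.
In the sketch below, the operations are grouped into a structure purely for readability (as noted above, \CFA flattens a trait into its individual assertions), and all names (@sumable_ops@, @sum_poly@, @double_ops@) are hypothetical.
\begin{cfa}
#include <stddef.h>

// Hypothetical dictionary of the operations sum needs for one element type.
typedef struct {
	size_t size;                                          // size of one element
	void (* zero)( void * dst );                          // 0 literal constructor
	void (* add_assign)( void * dst, const void * src );  // ?+=?
} sumable_ops;

// One compiled sum for every element type described by a sumable_ops.
static void sum_poly( const sumable_ops * ops, void * total, const void * a, size_t dim ) {
	ops->zero( total );
	for ( size_t i = 0; i < dim; i += 1 )
		ops->add_assign( total, (const char *)a + i * ops->size );
}

static void zero_double( void * dst ) { *(double *)dst = 0.0; }
static void add_assign_double( void * dst, const void * src ) {
	*(double *)dst += *(const double *)src;
}
static const sumable_ops double_ops = { sizeof( double ), zero_double, add_assign_double };
\end{cfa}
Supporting another element type only requires another @sumable_ops@ value; the body of @sum_poly@ is compiled once.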
782
783If the array type is not a builtin type, an extra type parameter and assertions are required, like subscripting.
784This case is generalized in polymorphic container-types, such as a list with @insert@ and @remove@ operations, and an element type with copy and assignment.
785
786
787\subsection{Generic Types}
788
789A significant shortcoming of standard C is the lack of reusable type-safe abstractions for generic data structures and algorithms.
790Broadly speaking, there are three approaches to implement abstract data structures in C.
791\begin{enumerate}[leftmargin=*]
792\item
793Write bespoke data structures for each context.
794While this approach is flexible and supports integration with the C type checker and tooling, it is tedious and error prone, especially for more complex data structures.
795\item
796Use @void *@-based polymorphism, \eg the C standard library functions @bsearch@ and @qsort@, which allow for the reuse of code with common functionality.
797However, this approach eliminates the type checker's ability to ensure argument types are properly matched, often requiring a number of extra function parameters, pointer indirection, and dynamic allocation that is otherwise unnecessary.
798\item
799Use preprocessor macros, similar to \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret.
800Furthermore, writing and using complex preprocessor macros is difficult and inflexible.
801\end{enumerate}
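As a small illustration of the third approach, the following hypothetical macro @DEFINE_PAIR@ generates a concrete pair type and accessor for each use; the macro and the generated names are assumptions for this sketch only.
\begin{cfa}
// Generate a pair type and an accessor for the given field types.
#define DEFINE_PAIR( NAME, F, S ) \
	typedef struct { F first; S second; } NAME; \
	static S NAME##_second( NAME p ) { return p.second; }

DEFINE_PAIR( pair_double_int, double, int )   // expands to a concrete struct and function
\end{cfa}
The programmer must invent a distinct name for every instantiation, because the preprocessor cannot build an identifier from a type such as @int *@.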
802
803\CC, Java, and other languages use \newterm{generic types} to produce type-safe abstract data-types.
804\CFA generic types integrate efficiently and naturally with the existing polymorphic functions, while retaining backward compatibility with C and providing separate compilation.
805For concrete parameters, the generic-type definition can be inlined, like \CC templates, if its definition appears in a header file (\eg @static inline@).
806
807A generic type can be declared by placing a @forall@ specifier on a @struct@ or @union@ declaration and instantiated using a parenthesized list of types after the type name.
808\begin{cquote}
809\begin{tabular}{@{}l|@{\hspace{10pt}}l@{}}
810\begin{cfa}
811@forall( F, S )@ struct pair {
812        F first;        S second;
813};
814@forall( F, S )@ // object
815S second( pair( F, S ) p ) { return p.second; }
816@forall( F *, S * )@ // sized
817S * second( pair( F *, S * ) p ) { return p.second; }
818\end{cfa}
819&
820\begin{cfa}
821pair( double, int ) dpr = { 3.5, 42 };
822int i = second( dpr );
823pair( void *, int * ) vipr = { 0p, &i };
824int * ip = second( vipr );
825double d = 1.0;
826pair( int *, double * ) idpr = { &i, &d };
827double * dp = second( idpr );
828\end{cfa}
829\end{tabular}
830\end{cquote}
831\CFA generic types are \newterm{fixed} or \newterm{dynamic} sized.
832Fixed-size types have a fixed memory layout regardless of type parameters, whereas dynamic types vary in memory layout depending on the type parameters.
833For example, the type variable @T *@ is fixed size and is represented by @void *@ in code generation;
834whereas, the type variable @T@ is dynamic and set at the point of instantiation.
835The difference between fixed and dynamic is the complexity and cost of field access.
836For fixed, field offsets are computed (known) at compile time and embedded as displacements in instructions.
837For dynamic, field offsets are compile-time computed at the call site, stored in an array of offset values, passed as a polymorphic parameter, and added to the structure address for each field dereference within a polymorphic routine.
838See~\cite[\S~3.2]{Moss19} for complete implementation details.
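The dynamic case can be pictured with the following plain C sketch, in which the call site knows the concrete layout and passes an offset array to a routine that does not.
It is a simplification under stated assumptions, not the \CFA implementation; @pair_di@, @pair_di_offsets@, and @field_addr@ are hypothetical names.
\begin{cfa}
#include <stddef.h>

// Concrete layout known at the call site, corresponding to pair( double, int ).
typedef struct { double first; int second; } pair_di;

// Offsets computed at compile time at the call site and passed to the polymorphic routine.
static const size_t pair_di_offsets[2] = {
	offsetof( pair_di, first ), offsetof( pair_di, second )
};

// Polymorphic routine: a field address is the structure address plus the passed offset.
static const void * field_addr( const void * pair, const size_t offsets[], size_t field ) {
	return (const char *)pair + offsets[field];
}
\end{cfa}
The extra addition on every field access is the cost that fixed-size generic types avoid.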
839
Currently, \CFA generic types allow assertions.
841For example, the following declaration of a sorted set-type ensures the set key supports equality and relational comparison.
842\begin{cfa}
843forall( Elem, @Key@ | { _Bool ?==?( Key, Key ); _Bool ?<?( Key, Key ); } )
844struct Sorted_Set { Elem elem; @Key@ key; ... };
845\end{cfa}
846However, the operations that insert/remove elements from the set should not appear as part of the generic-types assertions.
847\begin{cfa}
848forall( @Elem@ | /* any assertions on element type */ ) {
849        void insert( Sorted_Set set, @Elem@ elem ) { ... }
850        bool remove( Sorted_Set set, @Elem@ elem ) { ... } // false => element not present
851        ... // more set operations
852} // distribution
853\end{cfa}
854(Note, the @forall@ clause can be distributed across multiple functions.)
855For software-engineering reasons, the set assertions would be refactored into a trait to allow alternative implementations, like a Java \lstinline[language=java]{interface}.
856
In summary, the \CFA type system inherits \newterm{nominal typing} for concrete types from C, and adds \newterm{structural typing} for polymorphic types.
858Traits are used like interfaces in Java or abstract base-classes in \CC, but without the nominal inheritance relationships.
859Instead, each polymorphic function or generic type defines the structural type needed for its execution, which is fulfilled at each call site from the lexical environment, like Go~\cite{Go} or Rust~\cite{Rust} interfaces.
860Hence, new lexical scopes and nested functions are used extensively to create local subtypes, as in the @qsort@ example, without having to manage a nominal inheritance hierarchy.
861
862
863\section{Contributions}
864
865The \CFA compiler performance and type capability have been greatly improved through my development work.
866\begin{enumerate}
867\item
868The compilation time of various \CFA library units and test programs has been reduced by an order of magnitude, from minutes to seconds \see{\VRef[Table]{t:SelectedFileByCompilerBuild}}, which made it possible to develop and test more complicated \CFA programs that utilize sophisticated type system features.
869The details of compiler optimization work are covered in a previous technical report~\cite{Yu20}, which essentially forms part of this thesis.
870\item
871The thesis presents a systematic review of the new features added to the \CFA language and its type system.
872Some of the more recent inclusions to \CFA, such as tuples and generic structure types, were not well tested during development due to the limitation of compiler performance.
873Several issues coming from the interactions of various language features are identified and discussed in this thesis;
874some of them I have resolved, while others are given temporary fixes and need to be reworked in the future.
875\item
876Finally, this thesis provides constructive ideas for fixing a number of high-level issues in the \CFA language design and implementation, and gives a path for future improvements to the language and compiler.
877\end{enumerate}
878
879
880\begin{comment}
881From: Andrew James Beach <ajbeach@uwaterloo.ca>
882To: Peter Buhr <pabuhr@uwaterloo.ca>, Michael Leslie Brooks <mlbrooks@uwaterloo.ca>,
883    Fangren Yu <f37yu@uwaterloo.ca>, Jiada Liang <j82liang@uwaterloo.ca>
884Subject: Re: Haskell
885Date: Fri, 30 Aug 2024 16:09:06 +0000
886
887Do you mean:
888
889one = 1
890
891And then write a bunch of code that assumes it is an Int or Integer (which are roughly int and Int in Cforall) and then replace it with:
892
893one = 1.0
894
895And have that crash? That is actually enough, for some reason Haskell is happy to narrow the type of the first literal (Num a => a) down to Integer but will not do the same for (Fractional a => a) and Rational (which is roughly Integer for real numbers). Possibly a compatibility thing since before Haskell had polymorphic literals.
896
897Now, writing even the first version will fire a -Wmissing-signatures warning, because it does appear to be understood that just from a documentation perspective, people want to know what types are being used. Now, if you have the original case and start updating the signatures (adding one :: Fractional a => a), you can eventually get into issues, for example:
898
899import Data.Array (Array, Ix, (!))
900atOne :: (Ix a, Frational a) => Array a b -> b - - In CFA: forall(a | Ix(a) | Frational(a), b) b atOne(Array(a, b) const & array)
901atOne = (! one)
902
903Which compiles and is fine except for the slightly awkward fact that I don't know of any types that are both Ix and Fractional types. So you might never be able to find a way to actually use that function. If that is good enough you can reduce that to three lines and use it.
904
905Something that just occurred to me, after I did the above examples, is: Are there any classic examples in literature I could adapt to Haskell?
906
907Andrew
908
909PS, I think it is too obvious of a significant change to work as a good example but I did mock up the structure of what I am thinking you are thinking about with a function. If this helps here it is.
910
911doubleInt :: Int -> Int
912doubleInt x = x * 2
913
914doubleStr :: String -> String
915doubleStr x = x ++ x
916
917-- Missing Signature
918action = doubleInt - replace with doubleStr
919
920main :: IO ()
921main = print $ action 4
922\end{comment}