Context Navigation

-                      re15293b
+                      rd39d8a4
 \chapter{Background}
+\vspace*{-8pt}
+\CFA is a backwards-compatible extension of the C programming language, therefore, it must support C-style enumerations.
+The following discussion covers C enumerations.
+This chapter covers background material for C enumerations and \CFA features used in later discussion.
+\section{C}
 As mentioned in \VRef{s:Aliasing}, it is common for C programmers to ``believe'' there are three equivalent forms of named constants.
 …
 \item
 The same explicit management is true for the @const@ declaration, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{
 C allows variable-length array-declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} immediate oper\-ands of assembler instructions, and occupy storage.
+C allows variable-length array-declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} immediate oper\-ands of assembler instructions, and occupies storage.
 \begin{clang}
 $\$$ nm test.o
 …
 \section{C \lstinline{const}}
+\subsection{C \lstinline{const}}
 \label{s:Cconst}
 …
 \begin{cquote}
 \begin{tabular}{@{}ll@{}}
 \multicolumn{1}{@{}c}{\textbf{static initialization}} &  \multicolumn{1}{c@{}}{\textbf{dynamic intialization}} \\
+\multicolumn{1}{@{}c}{\textbf{static initialization}} &  \multicolumn{1}{c@{}}{\textbf{dynamic initialization}} \\
 \begin{clang}
 static const int one = 0 + 1;
 …
 Again, this form of aliasing is not an enumeration.
+\section{C Enumeration}
+\subsection{C Enumeration}
 \label{s:CEnumeration}
 …
 \subsection{Type Name}
+\subsubsection{Type Name}
 \label{s:TypeName}
 …
 Here, the aliased constants are: 20, 10, 20, 21, and -7.
 Direct initialization is by a compile-time expression generating a constant value.
 Indirect initialization (without initialization, @Max10Plus1@) is \newterm{auto-initialized}: from left to right, starting at zero or the next explicitly initialized constant, incrementing by @1@.
+Indirect initialization (without initializer, @Max10Plus1@) is called \newterm{auto-initialization}, where enumerators are assigned values from left to right, starting at zero or the next explicitly initialized constant, incrementing by @1@.
 Because multiple independent enumerators can be combined, enumerators with the same values can occur.
 The enumerators are rvalues, so assignment is disallowed.
 Finally, enumerators are \newterm{unscoped}, \ie enumerators declared inside of an @enum@ are visible (projected) into the enclosing scope of the @enum@ type.
+Finally, enumerators are \newterm{unscoped}, \ie enumerators declared inside of an @enum@ are visible (projected) outside into the enclosing scope of the @enum@ type.
 For unnamed enumerations, this semantic is required because there is no type name for scoped qualification.
 …
 \subsection{Implementation}
+\subsubsection{Implementation}
 \label{s:CenumImplementation}
 …
 A ``plain'' @int@ object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range @INT_MIN@ to @INT_MAX@ as defined in the header @<limits.h>@).~\cite[\S~6.2.5(5)]{C11}
 \end{quote}
 Howeveer, @int@ means a 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture.
+However, @int@ means a 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture.
 Whereas, @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.
+In reality, both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based its initialization.
+\VRef[Figure]{f:gccEnumerationStorageSize} shows both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based its initialization.
+Hence, initialization in the range @INT_MIN@..@INT_MAX@ results in a 4-byte enumerator, and outside this range the enumerator is 8 bytes.
+Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than is containing enumeration type, which seems inconsistent, \eg @sizeof( typeof( 3 ) ) == sizeof( int )@.
+\begin{figure}
 \begin{cfa}
 enum E { IMin = INT_MIN, IMax = INT_MAX,
 …
                          ILLMin = LLONG_MIN, ILLMax = LLONG_MAX };
 int main() {
+        printf( "%zd %d %d\n%zd %ld %ld\n%zd %ld %ld\n",
+                         sizeof(IMin), IMin, IMax,
+                         sizeof(ILMin), ILMin, ILMax,
+                         sizeof(ILLMin), ILLMin, ILLMax );
+}
+        printf( "%zd %zd\n%zd %zd\n%zd %d %d\n%zd %ld %ld\n%zd %ld %ld\n",
+                        sizeof(enum E), sizeof(typeof(IMin)),
+                        sizeof(int), sizeof(long int),
+                        sizeof(IMin), IMin, IMax,
+                        sizeof(ILMin), ILMin, ILMax,
+                        sizeof(ILLMin), ILLMin, ILLMax );
+}
+4
 -2147483648 2147483647
 -9223372036854775808 9223372036854775807
 -9223372036854775808 9223372036854775807
 \end{cfa}
+Hence, initialization in the range @INT_MIN@..@INT_MAX@ is 4 bytes, and outside this range is 8 bytes.
+\subsection{Usage}
+\caption{\lstinline{gcc}/\lstinline{clang} Enumeration Storage Size}
+\label{f:gccEnumerationStorageSize}
+\end{figure}
+\subsubsection{Usage}
 \label{s:Usage}
 C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type, and between two different enumeration.
+C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type, and between two different enumerations.
 \begin{clang}
 enum Week week = Mon;                           $\C{// week == 0}$
 …
 @week = 10000;@                                         $\C{// UNDEFINED! implicit conversion to Week}$
 enum Season {Spring, Summer, Fall, Winter };
+enum Season { Spring, Summer, Fall, Winter };
 @week = Winter;@                                        $\C{// UNDEFINED! implicit conversion to Week}$
 \end{clang}
 While converting an enumerator to its underlying type is useful, the implicit conversion from the base type to an enumeration type is a common source of error.
+While converting an enumerator to its underlying type is useful, the implicit conversion from the base or another enumeration type to an enumeration is a common source of error.
 Enumerators can appear in @switch@ and looping statements.
 …
+}
 \end{cfa}
 For iterating to make sense, the enumerator values \emph{must} have a consecutive ordering with a fixed step between values.
 For example, a gap introduced by @Thu = 10@, results in iterating over the values 0--13, where values 3--9 are not @Week@ values.
+For iterating using arithmetic to make sense, the enumerator values \emph{must} have a consecutive ordering with a fixed step between values.
+For example, a previous gap introduced by @Thu = 10@, results in iterating over the values 0--13, where values 3--9 are not @Week@ values.
 Note, it is the bidirectional conversion that allows incrementing @day@: @day@ is converted to @int@, integer @1@ is added, and the result is converted back to @Week@ for the assignment to @day@.
 For safety, \CC does not support the bidirectional conversion, and hence, an unsafe cast is necessary to increment @day@: @day = (Week)(day + 1)@.
 …
 for ( enum E e = A; e < @N@; e += 1 ) ...
 \end{cfa}
 Here, the auto-incrementing counts the number of enumerators and puts the total into the last enumerator @N@.
 @N@ is often used as the dimension for an array assocated with the enumeration.
+Here, serendipitously the auto-incrementing counts the number of enumerators and puts the total into the last enumerator @N@.
+This @N@ is often used as the dimension for an array assocated with the enumeration.
 \begin{cfa}
 E array[@N@];
 …
+}
 \end{cfa}
 However, for non-integral typed enumerations, \see{\VRef{f:EumeratorTyping}}, this idiom fails.
 This idiom is used in another C idiom for matching companion information.
 For example, an enumeration is linked with a companion array of printable strings.
+However, for non-consecutive ordering and non-integral typed enumerations, \see{\VRef{f:EumeratorTyping}}, this idiom fails.
+This idiom is often used with another C idiom for matching companion information.
+For example, an enumeration may be linked with a companion array of printable strings.
 \begin{cfa}
 enum Integral_Type { chr, schar, uschar, sshort, ushort, sint, usint, ..., NO_OF_ITYPES };
 …
         "signed int", "unsigned int", ...
 };
 enum Integral_Type integral_type = ...
+enum Integral_Type @integral_type@ = ...
 printf( "%s\n", Integral_Name[@integral_type@] ); // human readable type name
 \end{cfa}
 However, the companion idiom results in the \emph{harmonizing} problem because an update to the enumeration @Integral_Type@ often requires a corresponding update to the companion array \snake{Integral_Name}.
 The need to harmonize is at best indicated by a comment before the enumeration.
+The requirement to harmonize is at best indicated by a comment before the enumeration.
 This issue is exacerbated if enumeration and companion array are in different translation units.
 \bigskip
+While C provides a true enumeration, it is restricted, has unsafe semantics, and does not provide useful enumeration features in other programming languages.
+\section{\CFA Polymorphism}
+While C provides a true enumeration, it is restricted, has unsafe semantics, and does not provide useful/advanced enumeration features found in other programming languages.
+\section{\CFA}
+\CFA in \emph{not} an object-oriented programming-language, \ie functions cannot be nested in aggregate types, and hence, there is no receive notation for calling functions, \eg @obj.method(...)@, where the first argument proceeds the call.
+The following section provide short descriptions of \CFA features mentioned further in the thesis.
+\subsection{Operator Overloading}
+Virtually all programming languages overload the arithmetic operators across the basic types using the number and type of parameters and returns.
+Like \CC, \CFA also allows these operators to be overloaded with user-defined types.
+The syntax for operator names uses the @'?'@ character to denote a parameter, \eg prefix and infix increment operators: @?++@, @++?@, and @?+?@.
+\begin{cfa}
+struct S { int i, j };
+S @?+?@( S op1, S op2 ) { return (S){ op1.i + op2.i, op1.j + op2.j }; }
+S s1, s2;
+s1 = s1 @+@ s2;                 $\C[1.75in]{// infix call}$
+s1 = @?+?@( s1, s2 );   $\C{// direct call}\CRT$
+\end{cfa}
+The type system examines each call size and selects the best matching overloaded function based on the number and types of the arguments.
+If there are intermixed operands, @2 + 3.5@, the type system attempts (safe) conversions changing the arguments to one or more of the parameter type(s).
 \subsection{Function Overloading}
+Function overloading is programming languages feature wherein functions may share the same name, but with different function signatures. In both C++ and \CFA, function names can be overloaded
+with different entities as long as they are different in terms of the number and type of parameters.
+\begin{cfa}
+void f(); // (1)
+void f(int); // (2); Overloaded on the number of parameters
+void f(char); // (3); Overloaded on parameter type
+f('A');
+\end{cfa}
+In this case, the name f is overloaded with a nullity function and two arity functions with different parameters types. Exactly which precedures being executed
+is determined based on the passing arguments. The last expression of the preceding example calls f with one arguments, narrowing the possible candidates down to (2) and (3).
+Between those, function argument 'A' is an exact match to the parameter expected by (3), while needing an @implicit conversion@ to call (2). The compiler determines (3) is the better candidates among
+and procedure (3) is being executed.
+\begin{cfa}
+int f(int); // (4); Overloaded on return type
+[int, int] f(int); // (5) Overloaded on the number of return value
+\end{cfa}
+The function declarations (4) and (5) show the ability of \CFA functions overloaded with different return value, a feature that is not shared by C++.
+\subsection{Operator Overloading}
+Operators in \CFA are specialized function and are overloadable by with specially-named functions represents the syntax used to call the operator.
+% For example, @bool ?==?T(T lhs, T rhs)@ overloads equality operator for type T, where @?@ is the placeholders for operands for the operator.
+\begin{cfa}
+enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday };
+bool ?<?(const Weekday a, const Weekday b) {
+        return ((int)a + 1);
+}
+Monday < Sunday; // False
+?<?( Monday, Sunday ); // Equivalent syntax
+\end{cfa}
+Unary operators are functions that takes one argument and have name @operator?@ or @?operator@, where @?@ is the placeholders for operands.
+Binary operators are function with two parameters. They are overloadable with function name @?operator?@.
+Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns.
+\begin{cfa}
+void f( void );                 $\C[1.75in]{// (1): no parameter}$
+void f( char );                 $\C{// (2): overloaded on the number and parameter type}$
+void f( int, int );             $\C{// (3): overloaded on the number and parameter type}$
+f( 'A' );                               $\C{// select (2)}\CRT$
+\end{cfa}
+In this case, the name @f@ is overloaded depending on the number and parameter types.
+The type system examines each call size and selects the best matching based on the number and types of the arguments.
+Here, there is a perfect match for the call, @f( 'A' )@ with the number and parameter type of function (2).
+Ada, Scala, and \CFA type-systems also use the return type in resolving a call.
+\begin{cfa}
+int f( void );                  $\C[1.75in]{// (4); overloaded on return type}$
+double f( void );               $\C{// (5); overloaded on return type}$
+int i = f();                    $\C{// select (4)}$
+double d = f();                 $\C{// select (5)}\CRT$
+\end{cfa}
+\subsection{Variable Overloading}
+Unlike almost all programming languages, \CFA has variable overloading within a scope, along with shadow overloading in nested scopes.
+\begin{cfa}
+void foo( double d );
+int v;                              $\C[1.75in]{// (1)}$
+double v;                               $\C{// (2) variable overloading}$
+foo( v );                               $\C{// select (2)}$
+{
+        int v;                          $\C{// (3) shadow overloading}$
+        double v;                       $\C{// (4) and variable overloading}$
+        foo( v );                       $\C{// select (4)}\CRT$
+}
+\end{cfa}
+The \CFA type system simply treats overloaded variables as an overloaded function returning a value with no parameters.
 \subsection{Constructor and Destructor}
+In \CFA, all objects are initialized by @constructors@ during its allocation, including basic types,
+which are initialized by auto-generated basic type constructors.
+Constructors are overloadable functions with name @?{}@, return @void@, and have at least one parameter, which is a reference
+to the object being constructored (Colloquially referred to "this" or "self" in other language).
+While \CFA in not object oriented, it adopts many language features commonly used in object-oriented languages;
+these features are independent from object-oriented programming.
+All objects in \CFA are initialized by @constructors@ \emph{after} allocation and de-initialized \emph{before} deallocation.
+\CC cannot have constructors for basic-types because they have no aggregate type \lstinline[language=C++]{struct/class} in which to insert a constructor definition.
+Like \CC, \CFA has multiple auto-generated constructors for every type.
+The prototype for the constructor/destructor are @void ?{}( T &, ... )@ and @void ^?{}( T &, ... )@, respectively.
+The first parameter is logically, the \lstinline[language=C++]{this} or \lstinline[language=java]{self} in other object-oriented languages, and implicitly passed.
 \begin{cfa}
 struct Employee {
         const char * name;
+        char * name;
         double salary;
 };
+void ?{}( Employee& this, const char * name, double salary ) {
+    this.name = name;
+    this.salary = salary;
+}
+Employee Sara { "Sara Schmidt", 20.5 };
+\end{cfa}
+Like Python, the "self" reference is implicitly passed to a constructor. The Employee constructors takes two additional arugments used in its
+field initialization.
+A destructor in \CFA is a function that has name @^?{}@. It returns void, and take only one arugment as its "self".
+\begin{cfa}
+void ^?{}( Employee& this ) {
+    free(this.name);
+    this.name = 0p;
+    this.salary = 0;
+}
+\end{cfa}
+Destructor can be explicitly evoked as a function call, or implicitly called at the end of the block in which the object is delcared.
+\begin{cfa}
+void @?{}@( Employee & this, char * name, double salary ) {
+        this.name = aalloc( sizeof(name) );
+        strcpy( this.name, name );
+        this.salary = salary;
+}
+void @^?{}@( Employee & this ) {
+        free( this.name );
+}
+{
+^Sara{};
+Sara{ "Sara Craft", 20 };
+} // ^Sara{}
+\end{cfa}
+\subsection{Variable Overloading}
+C and C++ disallow more than one variable declared in the same scope with the same name. When a variable declare in a inner scope has the same name as
+a variable in an outer scope, the outer scope variable is "shadowed" by the inner scope variable and cannot be accessed directly.
+\CFA has variable overloading: multiple variables can share the same name in the same scope, as long as they have different type. Name shadowing only
+happens when the inner scope variable and the outer scope ones have the same type.
+\begin{cfa}
+double i = 6.0;
+int i = 5;
+void foo( double i ) { sout | i; } // 6.0
+\end{cfa}
+        Employee name = { "Sara Schmidt", 20.5 };
+} // implicit destructor call
+\end{cfa}
+Both constructor and destructor can be explicitly called.
+\begin{cfa}
+        Employee name = { "Sara Schmidt", 20.5 };
+        ... // use name
+        ^?{}( name ); // de-initialize
+        ?{}( name, "Jack Smith", 10.5 }; // re-initialize
+        ... // use name
+\end{cfa}
 \subsection{Special Literals}
+Literal 0 has special meanings within different contexts: it can means "nothing" or "empty", an additive identity in arithmetic, a default value as in C (null pointer),
+or an initial state.
+Awaring of its significance, \CFA provides a special type for the 0 literal, @zero_t@, to define the logical @zero@ for custom types.
+The C constants @0@ and @1@ have special meaning.
+@0@ is the null pointer and used in conditional expressions, where @if ( p )@ is rewritten @if ( p != 0 )@;
+@1@ is an additive identity in unary operators @++@ and @--@.
+Aware of their significance, \CFA provides a special type @zero_t@ and @one_t@ for custom types.
 \begin{cfa}
 struct S { int i, j; };
 void ?{}( S & this, @zero_t@ ) { this.i = 0; this.j = 0; } // zero_t, no parameter name allowed
-S s0 = @0@;
-\end{cfa}
-Overloading @zero_t@ for S provides new definition for @zero@ of type S.
-According to the C standard, @0@ is the @only@ false value. Any values compares equals to @0@ is false, and not euqals @0@ is true. As a consequence, control structure
-such as @if()@ and @while()@ only runs it true clause when its predicate @not equals@ to @0@.
-\CFA generalizes this concept and allows to logically overloads the boolean value for any type by overloading @not equal@ comparison against @zero_t@.
-\begin{cfa}
 int ?@!=@?( S this, @zero_t@ ) { return this.i != 0 && this.j != 0; }
+\end{cfa}
+% In C, the literal 0 represents the Boolean value false. The expression such as @if (x)@ is equivalent to @if (x != 0)@ .
+% \CFA allows user to define the logical zero for a custom type by overloading the @!=@ operation against a special type, @zero_t@,
+% so that an expression with the custom type can be used as a predicate without the need of conversion to the literal 0.
+% \begin{cfa}
+% struct S s;
+% int ?!=?(S, zero_t);
+% if (s) {}
+% \end{cfa}
+Literal 1 is also special. Particularly in C, the pre-increment operator and post-increment operator can be interpreted in terms of @+= 1@.
+The logical @1@ in \CFA is represented by special type @one_t@.
+\begin{cfa}
+void ?{}( S & this, one_t ) { this.i = 1; this.j = 1; } // one_t, no parameter name allowed
+S & ?+=?( S & this, one_t ) { this.i += 1; this.j += 1; return op; }
+\end{cfa}
+Without explictly overloaded by a user, \CFA uses the user-defined @+=(S&, one_t)@ to interpret @?++@ and @++?@, as both are polymorphic functions in \CFA.
+\subsection{Polymorphics Functions}
+Parametric-Polymorphics functions are the functions that applied to all types. \CFA functions are parametric-polymorphics when
+they are written with the @forall@ clause.
+\begin{cfa}
+forall(T)
+T identity(T x) { return x; }
+identity(42);
+\end{cfa}
+The identity function accepts a value from any type as an arugment, and the type parameter @T@ is bounded to @int@ when the function
+is called with 42.
+The forall clause can takes @type assertions@ that restricts the polymorphics type.
+\begin{cfa}
+forall( T | { void foo(T); } )
+void bar(T t) { foo(t); }
+struct S {} s;
+void foo(struct S);
+bar(s);
+\end{cfa}
+The assertion on @T@ restricts the range of types for bar to only those implements foo with the matching a signature, so that bar()
+can call @foo@ in its body with type safe.
+Calling on type with no mathcing @foo()@ implemented, such as int, causes a compile time type assertion error.
+\subsection{trait}
+A @forall@ clause can asserts on multiple types and with multiple asserting functions. A common practice in \CFA is to group
+the asserting functions in to a named \newterm{trait}.
+\begin{cfa}
+trait Bird(T) {
+        int days_can_fly(T i);
+        void fly(T t);
+S s = @0@;
+if ( s @!= 0@ ) ...
+\end{cfa}
+Similarity, for @one_t@.
+\begin{cfa}
+void ?{}( S & this, @one_t@ ) { this.i = 1; this.j = 1; } // one_t, no parameter name allowed
+S & ?++( S & this, @one_t@ ) { return (S){ this.i++, this.j++ }; }
+\end{cfa}
+\subsection{Polymorphic Functions}
+Polymorphic functions are the functions that apply to all types.
+\CFA provides \newterm{parametric polymorphism} written with the @forall@ clause.
+\begin{cfa}
+@forall( T )@ T identity( T v ) { return v; }
+identity( 42 );
+\end{cfa}
+The @identity@ function accepts a value from any type as an argument and returns that value.
+At the call size, the type parameter @T@ is bounded to @int@ from the argument @42@.
+For polymorphic functions to be useful, the @forall@ clause needs \newterm{type assertion}s that restricts the polymorphic types it accepts.
+\begin{cfa}
+forall( T @| { void foo( T ); }@ ) void bar( T t ) { @foo( t );@ }
+struct S { ... } s;
+void foo( struct S );
+bar( s );
+\end{cfa}
+The assertion on @T@ restricts the range of types that can be manipulated by @bar@ to only those that have an implementation of @foo@ with the matching signature, allowing @bar@'s call to @foo@ in its body.
+\subsection{Trait}
+A @forall@ clause can assert many restrictions on multiple types.
+A common practice is to refactor the assertions into a named \newterm{trait}.
+\begin{cfa}
+forall(T) trait @Bird@ {
+        int days_can_fly( T );
+        void fly( T );
 };
+forall(B | Bird(B)) {
+        void bird_fly(int days_since_born, B bird) {
+                if (days_since_born > days_can_fly(bird)) {
+                        fly(bird);
+                }
+        }
+}
+struct Robin {} r;
+int days_can_fly(Robin r) { return 23; }
+void fly(Robin r) {}
+bird_fly( r );
+\end{cfa}
+Grouping type assertions into named trait effectively create a reusable interface for parametrics polymorphics types.
+forall( B | @Bird@( B ) )
+void bird_fly( int days_since_born, B bird ) {
+        if ( days_since_born > days_can_fly( bird )) fly( bird );
+}
+struct Robin {} robin;
+int days_can_fly( Robin robin ) { return 23; }
+void fly( Robin robin ) {}
+bird_fly( 23, robin );
+\end{cfa}
+Grouping type assertions into a named trait effectively creates a reusable interface for parametric-polymorphic types.
 \section{Expression Resolution}
+The overloading feature poses a challenge in \CFA expression resolution. Overloadeded identifiers can refer multiple
+candidates, with multiples being simultaneously valid. The main task of \CFA resolver is to identity a best candidate that
+involes less implicit conversion and polymorphism.
+Overloading poses a challenge for all expression-resolution systems.
+Multiple overloaded names give multiple candidates at a call site, and a resolver must pick a \emph{best} match, where ``best'' is defined by a series of heuristics based on safety and programmer intuition/expectation.
+When multiple best matches exist, the resolution is ambiguous.
+The \CFA resolver attempts to identity a best candidate based on: first, the number of parameters and types, and second, when no exact match exists, the fewest implicit conversions and polymorphic variables.
+Finding an exact match is not discussed here, because the mechanism is fairly straightforward, even when the search space is large;
+only finding a non-exact match is discussed in detail.
 \subsection{Conversion Cost}
 \label{s:ConversionCost}
+In C, function call arguments and function parameters do not need to be a exact match. When types mismatch, C performs an \newterm{implicit conversion}
+on argument to parameter type. The process is trivial with the exception on binary operators;  When types of operands are different,
+C nees to decide which operands need implicit conversion. C defines the resolution pattern as "usual arithmetic conversion",
+in which C looks for a \newterm{common type} between operands, and convert either one or both operands to the common type.
+Loosely defined, a common type is a the smallest type in terms of size of representation that both operands can be converted into without losing their precision.
+Such conversion is called "widening" or "safe conversion".
+C binary operators can be restated as 2-arity functions that overloaded with different parameters. "Usual arithmetic conversion" is to find a overloaded
+instance that for both arguments, the conversion to parameter type is a widening conversion to the smallest type.
+\CFA generalizes "usual arithmetic conversion" to \newterm{conversion cost}. In the first design by Bilson, conversion cost is a 3-tuple,
+@(unsafe, poly, safe)@, where @unsafe@ the number of unsafe (narrorow conversion) from argument to parameter,
+@poly@ is the number of polymorphic function parameter,
+and @safe@ is sum of degree of safe (widening) conversion.
+Most programming languages perform some implicit conversions among basic types to facilitate mixed-mode arithmetic;
+otherwise, the program becomes littered with many explicit casts, which is not match programmer expectation.
+C is an aggressive language as it provides conversions among almost all of the basic types, even when the conversion is potentially unsafe or not meaningful, \ie @float@ to @bool@.
+C defines the resolution pattern as ``usual arithmetic conversion''~\cite[\S~6.3.1.8]{C11}, in which C looks for a \newterm{common type} between operands, and converts one or both operands to the common type.
+Loosely defined, a common type is a the smallest type in terms of size of representation that both operands can be converted into without losing their precision, called a \newterm{widening} or \newterm{safe conversion}.
+\CFA generalizes ``usual arithmetic conversion'' to \newterm{conversion cost}.
+In the first design by Bilson~\cite{Bilson03}, conversion cost is a 3-tuple, @(unsafe, poly, safe)@ applied between each argument/parameter type, where:
+\begin{enumerate}
+\item
+@unsafe@ is the number of precision losing (\newterm{narrowing} conversions),
+\item
+@poly@ is the number of polymorphic function parameters, and
+\item
+@safe@ is sum of the degree of safe (widening) conversions.
+\end{enumerate}
 Sum of degree is a method to quantify C's integer and floating-point rank.
+Every pair of widening conversion types has been assigned with a \newterm{distance}, and distance between the two same type is 0.
+For example, the distance from char to int is 2, distance from integer to long is 1, and distance from int to long long int is 2.
+The distance does not mirror C's rank system. For example, the rank of char and signed char are the same in C, but the distance from char to signed char is assigned with 1.
+@safe@ cost is summing all pair of argument to parameter safe conversion distance.
+Among the three in Bilson's model, @unsafe@ is the most significant cost and @safe@ is the least significant one, with an implication that \CFA always choose
+a candidate with the lowest @unsafe@ if possible.
+Assume the overloaded function @foo@ is called with two @int@ parameter. The cost for every overloaded @foo@ has been list along:
+\begin{cfa}
+void foo(char, char);                           // (2, 0, 0)
+void foo(char, int);                            // (1, 0, 0)
+forall(T, V) void foo(T, V);            // (0, 2, 0)
+forall(T)        void foo(T, T);                // (0, 2, 0)
+forall(T)        void foo(T, int);              // (0, 1, 0)
+void foo(long long, long);                      // (0, 0, 3)
+void foo(long, long);                           // (0, 0, 2)
+void foo(int, long);                            // (0, 0, 1)
+int i, j; foo(i, j);
+\end{cfa}
+The overloaded instances are ordered from the highest to the lowest cost, and \CFA select the last candidate.
+In the later iteration of \CFA, Schluntz and Aaron expanded conversion cost to a 7-tuple with 4 additional categories,
+@@(unsafe, poly, safe, sign, vars, specialization, reference)@@.
+with interpretation listed below:
+\begin{itemize}
+\item Unsafe
+\item Poly
+\item Safe
+\item Sign is the number of sign/unsign variable conversion.
+\item Vars is the number of polymorphics type variable.
+\item Specialization is negative value of the number of type assertion.
+\item Reference is number of reference-to-rvalue conversion.
+\end{itemize}
+The extended conversion cost models looks for candidates that are more specific and less generic.
+@Var@s was introduced by Aaron to disambugate @forall(T, V) void foo(T, V)@ and @forall(T) void foo(T, T)@. The extra type parameter @V@
+makes it more generic and less specific. More generic type means less constraints on types of its parameters. \CFA is in favor of candidates with more
+restrictions on polymorphism so @forall(T) void foo(T, T)@ has lower cost. @Specialization@ is a value that always less than or equal to zero. For every type assertion in @forall@ clause, \CFA subtracts one from
+@specialization@, starting from zero. More type assertions often means more constraints on argument type, and making the function less generic.
+\CFA defines two special cost value: @zero@ and @infinite@ A conversion cost is @zero@ when argument and parameter has exact match, and a conversion cost is @infinite@ when
+there is no defined conversion between two types. For example, the conversion cost from int to a struct type S is @infinite@.
+Every pair of widening conversion types is assigned a \newterm{distance}, and distance between the two same type is 0.
+For example, the distance from @char@ to @int@ is 2, distance from @int@ to @long@ is 1, and distance from @int@ to @long long int@ is 2.
+This distance does not mirror C's rank system.
+For example, the rank of @char@ and @signed char@ are the same in C, but the distance from @char@ to @signed char@ is assigned 1.
+@safe@ cost is summing all pairs of argument to parameter safe conversion distances.
+Among the three costs in Bilson's model, @unsafe@ is the most significant cost and @safe@ is the least significant, with an implication that \CFA always choose a candidate with the lowest @unsafe@, if possible.
+For example, assume the overloaded function @foo@ is called with two @int@ parameter.
+The cost for every overloaded @foo@ has been list along:
+\begin{cfa}
+void foo( char, char );                         $\C[2.5in]{// (1) (2, 0, 0)}$
+void foo( char, int );                          $\C{// (2) (1, 0, 0)}$
+forall( T, V ) void foo( T, V );        $\C{// (3) (0, 2, 0)}$
+forall( T ) void foo( T, T );           $\C{// (4) (0, 2, 0)}$
+forall( T ) void foo( T, int );         $\C{// (5) (0, 1, 0)}$
+void foo( long long, long );            $\C{// (6) (0, 0, 3)}$
+void foo( long, long );                         $\C{// (7) (0, 0, 2)}$
+void foo( int, long );                          $\C{// (8) (0, 0, 1)}$
+int i, j;
+foo( i, j );                                            $\C{// convert j to long and call (8)}\CRT$
+\end{cfa}
+The overloaded instances are ordered from the highest to the lowest cost, and \CFA select the last candidate (8).
+In the next iteration of \CFA, Schluntz and Aaron~\cite{Moss18} expanded conversion cost to a 7-tuple with 4 additional categories, @(unsafe, poly, safe, sign, vars, specialization, reference)@, with the following interpretations:
+\begin{enumerate}
+\item @unsafe@ from Bilson
+\item @poly@
+\item @safe@
+\item @sign@ is the number of sign/unsigned variable conversions
+\item @vars@ is the number of polymorphic type variables
+\item @specialization@ is a negative value of the number of type assertions
+\item @reference@ is the number of reference-to-rvalue conversions
+\end{enumerate}
+The extended conversion-cost model looks for candidates that are more specific and less generic.
+@vars@ disambiguates @forall( T, V ) foo( T, V )@ and @forall( T ) void foo( T, T )@, where the extra type parameter @V@ makes is more generic.
+A more generic type means less constraints on its parameter types.
+\CFA favours candidates with more restrictions on polymorphism, so @forall( T ) void foo( T, T )@ has lower cost.
+@specialization@ is an arbitrary count-down value starting at zero.
+For every type assertion in @forall@ clause (no assertions in the above example), \CFA subtracts one from @specialization@.
+More type assertions means more constraints on argument types, making the function less generic.
+\CFA defines two special cost values: @zero@ and @infinite@.
+A conversion cost is @zero@ when argument and parameter has an exact match, and a conversion cost is @infinite@ when there is no defined conversion between two types.
+For example, the conversion cost from @int@ to a @struct S@ is @infinite@.

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset d39d8a4

Legend:

TabularUnified doc/theses/jiada_liang_MMath/background.tex ¶

Download in other formats: