Changeset b797fe36


Ignore:
Timestamp:
Aug 4, 2024, 11:35:26 AM (3 months ago)
Author:
JiadaL <j82liang@…>
Branches:
master
Children:
1697c40
Parents:
ecaedf35 (diff), 1e12f07 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge branch 'master' of plg.uwaterloo.ca:software/cfa/cfa-cc

Location:
doc/theses/jiada_liang_MMath
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/jiada_liang_MMath/background.tex

    recaedf35 rb797fe36  
    11\chapter{Background}
    22
    3 \vspace*{-8pt}
    4 
    5 \CFA is a backwards-compatible extension of the C programming language, therefore, it must support C-style enumerations.
    6 The following discussion covers C enumerations.
     3This chapter covers background material for C enumerations and \CFA features used in later discussion.
     4
     5
     6\section{C}
    77
    88As mentioned in \VRef{s:Aliasing}, it is common for C programmers to ``believe'' there are three equivalent forms of named constants.
     
    1818\item
    1919The same explicit management is true for the @const@ declaration, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{
    20 C allows variable-length array-declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} immediate oper\-ands of assembler instructions, and occupy storage.
     20C allows variable-length array-declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} immediate oper\-ands of assembler instructions, and occupies storage.
    2121\begin{clang}
    2222$\$$ nm test.o
     
    2828
    2929
    30 \section{C \lstinline{const}}
     30\subsection{C \lstinline{const}}
    3131\label{s:Cconst}
    3232
     
    3434\begin{cquote}
    3535\begin{tabular}{@{}ll@{}}
    36 \multicolumn{1}{@{}c}{\textbf{static initialization}} &  \multicolumn{1}{c@{}}{\textbf{dynamic intialization}} \\
     36\multicolumn{1}{@{}c}{\textbf{static initialization}} &  \multicolumn{1}{c@{}}{\textbf{dynamic initialization}} \\
    3737\begin{clang}
    3838static const int one = 0 + 1;
     
    5858Again, this form of aliasing is not an enumeration.
    5959
    60 \section{C Enumeration}
     60
     61\subsection{C Enumeration}
    6162\label{s:CEnumeration}
    6263
     
    7879
    7980
    80 \subsection{Type Name}
     81\subsubsection{Type Name}
    8182\label{s:TypeName}
    8283
     
    8889Here, the aliased constants are: 20, 10, 20, 21, and -7.
    8990Direct initialization is by a compile-time expression generating a constant value.
    90 Indirect initialization (without initialization, @Max10Plus1@) is \newterm{auto-initialized}: from left to right, starting at zero or the next explicitly initialized constant, incrementing by @1@.
     91Indirect initialization (without initializer, @Max10Plus1@) is called \newterm{auto-initialization}, where enumerators are assigned values from left to right, starting at zero or the next explicitly initialized constant, incrementing by @1@.
    9192Because multiple independent enumerators can be combined, enumerators with the same values can occur.
    9293The enumerators are rvalues, so assignment is disallowed.
    93 Finally, enumerators are \newterm{unscoped}, \ie enumerators declared inside of an @enum@ are visible (projected) into the enclosing scope of the @enum@ type.
     94Finally, enumerators are \newterm{unscoped}, \ie enumerators declared inside of an @enum@ are visible (projected) outside into the enclosing scope of the @enum@ type.
    9495For unnamed enumerations, this semantic is required because there is no type name for scoped qualification.
    9596
     
    126127
    127128
    128 \subsection{Implementation}
     129\subsubsection{Implementation}
    129130\label{s:CenumImplementation}
    130131
     
    135136A ``plain'' @int@ object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range @INT_MIN@ to @INT_MAX@ as defined in the header @<limits.h>@).~\cite[\S~6.2.5(5)]{C11}
    136137\end{quote}
    137 Howeveer, @int@ means a 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture.
     138However, @int@ means a 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture.
    138139Whereas, @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.
    139 In reality, both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based its initialization.
     140\VRef[Figure]{f:gccEnumerationStorageSize} shows both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based its initialization.
     141Hence, initialization in the range @INT_MIN@..@INT_MAX@ results in a 4-byte enumerator, and outside this range the enumerator is 8 bytes.
     142Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than is containing enumeration type, which seems inconsistent, \eg @sizeof( typeof( 3 ) ) == sizeof( int )@.
     143
     144\begin{figure}
    140145\begin{cfa}
    141146enum E { IMin = INT_MIN, IMax = INT_MAX,
     
    143148                         ILLMin = LLONG_MIN, ILLMax = LLONG_MAX };
    144149int main() {
    145         printf( "%zd %d %d\n%zd %ld %ld\n%zd %ld %ld\n",
    146                          sizeof(IMin), IMin, IMax,
    147                          sizeof(ILMin), ILMin, ILMax,
    148                          sizeof(ILLMin), ILLMin, ILLMax );
    149 }
     150        printf( "%zd %zd\n%zd %zd\n%zd %d %d\n%zd %ld %ld\n%zd %ld %ld\n",
     151                        sizeof(enum E), sizeof(typeof(IMin)),
     152                        sizeof(int), sizeof(long int),
     153                        sizeof(IMin), IMin, IMax,
     154                        sizeof(ILMin), ILMin, ILMax,
     155                        sizeof(ILLMin), ILLMin, ILLMax );
     156}
     1578 4
    1501584 -2147483648 2147483647
    1511598 -9223372036854775808 9223372036854775807
    1521608 -9223372036854775808 9223372036854775807
    153161\end{cfa}
    154 Hence, initialization in the range @INT_MIN@..@INT_MAX@ is 4 bytes, and outside this range is 8 bytes.
    155 
    156 \subsection{Usage}
     162\caption{\lstinline{gcc}/\lstinline{clang} Enumeration Storage Size}
     163\label{f:gccEnumerationStorageSize}
     164\end{figure}
     165
     166
     167\subsubsection{Usage}
    157168\label{s:Usage}
    158169
    159 C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type, and between two different enumeration.
     170C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type, and between two different enumerations.
    160171\begin{clang}
    161172enum Week week = Mon;                           $\C{// week == 0}$
     
    164175@week = 10000;@                                         $\C{// UNDEFINED! implicit conversion to Week}$
    165176
    166 enum Season {Spring, Summer, Fall, Winter };
     177enum Season { Spring, Summer, Fall, Winter };
    167178@week = Winter;@                                        $\C{// UNDEFINED! implicit conversion to Week}$
    168179\end{clang}
    169 While converting an enumerator to its underlying type is useful, the implicit conversion from the base type to an enumeration type is a common source of error.
     180While converting an enumerator to its underlying type is useful, the implicit conversion from the base or another enumeration type to an enumeration is a common source of error.
    170181
    171182Enumerators can appear in @switch@ and looping statements.
     
    182193}
    183194\end{cfa}
    184 For iterating to make sense, the enumerator values \emph{must} have a consecutive ordering with a fixed step between values.
    185 For example, a gap introduced by @Thu = 10@, results in iterating over the values 0--13, where values 3--9 are not @Week@ values.
     195For iterating using arithmetic to make sense, the enumerator values \emph{must} have a consecutive ordering with a fixed step between values.
     196For example, a previous gap introduced by @Thu = 10@, results in iterating over the values 0--13, where values 3--9 are not @Week@ values.
    186197Note, it is the bidirectional conversion that allows incrementing @day@: @day@ is converted to @int@, integer @1@ is added, and the result is converted back to @Week@ for the assignment to @day@.
    187198For safety, \CC does not support the bidirectional conversion, and hence, an unsafe cast is necessary to increment @day@: @day = (Week)(day + 1)@.
     
    192203for ( enum E e = A; e < @N@; e += 1 ) ...
    193204\end{cfa}
    194 Here, the auto-incrementing counts the number of enumerators and puts the total into the last enumerator @N@.
    195 @N@ is often used as the dimension for an array assocated with the enumeration.
     205Here, serendipitously the auto-incrementing counts the number of enumerators and puts the total into the last enumerator @N@.
     206This @N@ is often used as the dimension for an array assocated with the enumeration.
    196207\begin{cfa}
    197208E array[@N@];
     
    200211}
    201212\end{cfa}
    202 However, for non-integral typed enumerations, \see{\VRef{f:EumeratorTyping}}, this idiom fails.
    203 
    204 This idiom is used in another C idiom for matching companion information.
    205 For example, an enumeration is linked with a companion array of printable strings.
     213However, for non-consecutive ordering and non-integral typed enumerations, \see{\VRef{f:EumeratorTyping}}, this idiom fails.
     214
     215This idiom is often used with another C idiom for matching companion information.
     216For example, an enumeration may be linked with a companion array of printable strings.
    206217\begin{cfa}
    207218enum Integral_Type { chr, schar, uschar, sshort, ushort, sint, usint, ..., NO_OF_ITYPES };
     
    211222        "signed int", "unsigned int", ...
    212223};
    213 enum Integral_Type integral_type = ...
     224enum Integral_Type @integral_type@ = ...
    214225printf( "%s\n", Integral_Name[@integral_type@] ); // human readable type name
    215226\end{cfa}
    216227However, the companion idiom results in the \emph{harmonizing} problem because an update to the enumeration @Integral_Type@ often requires a corresponding update to the companion array \snake{Integral_Name}.
    217 The need to harmonize is at best indicated by a comment before the enumeration.
     228The requirement to harmonize is at best indicated by a comment before the enumeration.
    218229This issue is exacerbated if enumeration and companion array are in different translation units.
    219230
    220231\bigskip
    221 While C provides a true enumeration, it is restricted, has unsafe semantics, and does not provide useful enumeration features in other programming languages.
    222 
    223 \section{\CFA Polymorphism}
     232While C provides a true enumeration, it is restricted, has unsafe semantics, and does not provide useful/advanced enumeration features found in other programming languages.
     233
     234
     235\section{\CFA}
     236
     237\CFA in \emph{not} an object-oriented programming-language, \ie functions cannot be nested in aggregate types, and hence, there is no receive notation for calling functions, \eg @obj.method(...)@, where the first argument proceeds the call.
     238The following section provide short descriptions of \CFA features mentioned further in the thesis.
     239
     240
     241\subsection{Operator Overloading}
     242
     243Virtually all programming languages overload the arithmetic operators across the basic types using the number and type of parameters and returns.
     244Like \CC, \CFA also allows these operators to be overloaded with user-defined types.
     245The syntax for operator names uses the @'?'@ character to denote a parameter, \eg prefix and infix increment operators: @?++@, @++?@, and @?+?@.
     246\begin{cfa}
     247struct S { int i, j };
     248S @?+?@( S op1, S op2 ) { return (S){ op1.i + op2.i, op1.j + op2.j }; }
     249S s1, s2;
     250s1 = s1 @+@ s2;                 $\C[1.75in]{// infix call}$
     251s1 = @?+?@( s1, s2 );   $\C{// direct call}\CRT$
     252\end{cfa}
     253The type system examines each call size and selects the best matching overloaded function based on the number and types of the arguments.
     254If there are intermixed operands, @2 + 3.5@, the type system attempts (safe) conversions changing the arguments to one or more of the parameter type(s).
     255
     256
    224257\subsection{Function Overloading}
    225 Function overloading is programming languages feature wherein functions may share the same name, but with different function signatures. In both C++ and \CFA, function names can be overloaded
    226 with different entities as long as they are different in terms of the number and type of parameters.
    227 
    228 \begin{cfa}
    229 void f(); // (1)
    230 void f(int); // (2); Overloaded on the number of parameters
    231 void f(char); // (3); Overloaded on parameter type
    232 
    233 f('A');
    234 \end{cfa}
    235 In this case, the name f is overloaded with a nullity function and two arity functions with different parameters types. Exactly which precedures being executed
    236 is determined based on the passing arguments. The last expression of the preceding example calls f with one arguments, narrowing the possible candidates down to (2) and (3).
    237 Between those, function argument 'A' is an exact match to the parameter expected by (3), while needing an @implicit conversion@ to call (2). The compiler determines (3) is the better candidates among
    238 and procedure (3) is being executed.
    239 
    240 \begin{cfa}
    241 int f(int); // (4); Overloaded on return type
    242 [int, int] f(int); // (5) Overloaded on the number of return value
    243 \end{cfa}
    244 The function declarations (4) and (5) show the ability of \CFA functions overloaded with different return value, a feature that is not shared by C++.
    245 
    246 
    247 \subsection{Operator Overloading}
    248 Operators in \CFA are specialized function and are overloadable by with specially-named functions represents the syntax used to call the operator.
    249 % For example, @bool ?==?T(T lhs, T rhs)@ overloads equality operator for type T, where @?@ is the placeholders for operands for the operator.
    250 \begin{cfa}
    251 enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday };
    252 bool ?<?(const Weekday a, const Weekday b) {
    253         return ((int)a + 1);
    254 }
    255 Monday < Sunday; // False
    256 ?<?( Monday, Sunday ); // Equivalent syntax
    257 \end{cfa}
    258 Unary operators are functions that takes one argument and have name @operator?@ or @?operator@, where @?@ is the placeholders for operands.
    259 Binary operators are function with two parameters. They are overloadable with function name @?operator?@.
     258
     259Both \CFA and \CC allow function names to be overloaded, as long as their prototypes differ in the number and type of parameters and returns.
     260\begin{cfa}
     261void f( void );                 $\C[1.75in]{// (1): no parameter}$
     262void f( char );                 $\C{// (2): overloaded on the number and parameter type}$
     263void f( int, int );             $\C{// (3): overloaded on the number and parameter type}$
     264f( 'A' );                               $\C{// select (2)}\CRT$
     265\end{cfa}
     266In this case, the name @f@ is overloaded depending on the number and parameter types.
     267The type system examines each call size and selects the best matching based on the number and types of the arguments.
     268Here, there is a perfect match for the call, @f( 'A' )@ with the number and parameter type of function (2).
     269
     270Ada, Scala, and \CFA type-systems also use the return type in resolving a call.
     271\begin{cfa}
     272int f( void );                  $\C[1.75in]{// (4); overloaded on return type}$
     273double f( void );               $\C{// (5); overloaded on return type}$
     274int i = f();                    $\C{// select (4)}$
     275double d = f();                 $\C{// select (5)}\CRT$
     276\end{cfa}
     277
     278
     279\subsection{Variable Overloading}
     280Unlike almost all programming languages, \CFA has variable overloading within a scope, along with shadow overloading in nested scopes.
     281\begin{cfa}
     282void foo( double d );
     283int v;                              $\C[1.75in]{// (1)}$
     284double v;                               $\C{// (2) variable overloading}$
     285foo( v );                               $\C{// select (2)}$
     286{
     287        int v;                          $\C{// (3) shadow overloading}$
     288        double v;                       $\C{// (4) and variable overloading}$
     289        foo( v );                       $\C{// select (4)}\CRT$
     290}
     291\end{cfa}
     292The \CFA type system simply treats overloaded variables as an overloaded function returning a value with no parameters.
     293
    260294
    261295\subsection{Constructor and Destructor}
    262 In \CFA, all objects are initialized by @constructors@ during its allocation, including basic types,
    263 which are initialized by auto-generated basic type constructors.
    264 
    265 Constructors are overloadable functions with name @?{}@, return @void@, and have at least one parameter, which is a reference
    266 to the object being constructored (Colloquially referred to "this" or "self" in other language).
    267 
     296
     297While \CFA in not object oriented, it adopts many language features commonly used in object-oriented languages;
     298these features are independent from object-oriented programming.
     299
     300All objects in \CFA are initialized by @constructors@ \emph{after} allocation and de-initialized \emph{before} deallocation.
     301\CC cannot have constructors for basic-types because they have no aggregate type \lstinline[language=C++]{struct/class} in which to insert a constructor definition.
     302Like \CC, \CFA has multiple auto-generated constructors for every type.
     303
     304The prototype for the constructor/destructor are @void ?{}( T &, ... )@ and @void ^?{}( T &, ... )@, respectively.
     305The first parameter is logically, the \lstinline[language=C++]{this} or \lstinline[language=java]{self} in other object-oriented languages, and implicitly passed.
    268306\begin{cfa}
    269307struct Employee {
    270         const char * name;
     308        char * name;
    271309        double salary;
    272310};
    273 
    274 void ?{}( Employee& this, const char * name, double salary ) {
    275     this.name = name;
    276     this.salary = salary;
    277 }
    278 
    279 Employee Sara { "Sara Schmidt", 20.5 };
    280 \end{cfa}
    281 Like Python, the "self" reference is implicitly passed to a constructor. The Employee constructors takes two additional arugments used in its
    282 field initialization.
    283 
    284 A destructor in \CFA is a function that has name @^?{}@. It returns void, and take only one arugment as its "self".
    285 \begin{cfa}
    286 void ^?{}( Employee& this ) {
    287     free(this.name);
    288     this.name = 0p;
    289     this.salary = 0;
    290 }
    291 \end{cfa}
    292 Destructor can be explicitly evoked as a function call, or implicitly called at the end of the block in which the object is delcared.
    293 \begin{cfa}
     311void @?{}@( Employee & this, char * name, double salary ) {
     312        this.name = aalloc( sizeof(name) );
     313        strcpy( this.name, name );
     314        this.salary = salary;
     315}
     316void @^?{}@( Employee & this ) {
     317        free( this.name );
     318}
    294319{
    295 ^Sara{};
    296 Sara{ "Sara Craft", 20 };
    297 } // ^Sara{}
    298 \end{cfa}
    299 
    300 \subsection{Variable Overloading}
    301 C and C++ disallow more than one variable declared in the same scope with the same name. When a variable declare in a inner scope has the same name as
    302 a variable in an outer scope, the outer scope variable is "shadowed" by the inner scope variable and cannot be accessed directly.
    303 
    304 \CFA has variable overloading: multiple variables can share the same name in the same scope, as long as they have different type. Name shadowing only
    305 happens when the inner scope variable and the outer scope ones have the same type.
    306 \begin{cfa}
    307 double i = 6.0;
    308 int i = 5;
    309 void foo( double i ) { sout | i; } // 6.0
    310 \end{cfa}
     320        Employee name = { "Sara Schmidt", 20.5 };
     321} // implicit destructor call
     322\end{cfa}
     323Both constructor and destructor can be explicitly called.
     324\begin{cfa}
     325        Employee name = { "Sara Schmidt", 20.5 };
     326        ... // use name
     327        ^?{}( name ); // de-initialize
     328        ?{}( name, "Jack Smith", 10.5 }; // re-initialize
     329        ... // use name
     330\end{cfa}
     331
    311332
    312333\subsection{Special Literals}
    313 Literal 0 has special meanings within different contexts: it can means "nothing" or "empty", an additive identity in arithmetic, a default value as in C (null pointer),
    314 or an initial state.
    315 Awaring of its significance, \CFA provides a special type for the 0 literal, @zero_t@, to define the logical @zero@ for custom types.
     334
     335The C constants @0@ and @1@ have special meaning.
     336@0@ is the null pointer and used in conditional expressions, where @if ( p )@ is rewritten @if ( p != 0 )@;
     337@1@ is an additive identity in unary operators @++@ and @--@.
     338Aware of their significance, \CFA provides a special type @zero_t@ and @one_t@ for custom types.
    316339\begin{cfa}
    317340struct S { int i, j; };
    318341void ?{}( S & this, @zero_t@ ) { this.i = 0; this.j = 0; } // zero_t, no parameter name allowed
    319 S s0 = @0@;
    320 \end{cfa}
    321 Overloading @zero_t@ for S provides new definition for @zero@ of type S.
    322 
    323 According to the C standard, @0@ is the @only@ false value. Any values compares equals to @0@ is false, and not euqals @0@ is true. As a consequence, control structure
    324 such as @if()@ and @while()@ only runs it true clause when its predicate @not equals@ to @0@.
    325 
    326 \CFA generalizes this concept and allows to logically overloads the boolean value for any type by overloading @not equal@ comparison against @zero_t@.
    327 \begin{cfa}
    328342int ?@!=@?( S this, @zero_t@ ) { return this.i != 0 && this.j != 0; }
    329 \end{cfa}
    330 
    331 % In C, the literal 0 represents the Boolean value false. The expression such as @if (x)@ is equivalent to @if (x != 0)@ .
    332 % \CFA allows user to define the logical zero for a custom type by overloading the @!=@ operation against a special type, @zero_t@,
    333 % so that an expression with the custom type can be used as a predicate without the need of conversion to the literal 0.
    334 % \begin{cfa}
    335 % struct S s;
    336 % int ?!=?(S, zero_t);
    337 % if (s) {}
    338 % \end{cfa}
    339 Literal 1 is also special. Particularly in C, the pre-increment operator and post-increment operator can be interpreted in terms of @+= 1@.
    340 The logical @1@ in \CFA is represented by special type @one_t@.
    341 \begin{cfa}
    342 void ?{}( S & this, one_t ) { this.i = 1; this.j = 1; } // one_t, no parameter name allowed
    343 S & ?+=?( S & this, one_t ) { this.i += 1; this.j += 1; return op; }
    344 \end{cfa}
    345 Without explictly overloaded by a user, \CFA uses the user-defined @+=(S&, one_t)@ to interpret @?++@ and @++?@, as both are polymorphic functions in \CFA.
    346 
    347 \subsection{Polymorphics Functions}
    348 Parametric-Polymorphics functions are the functions that applied to all types. \CFA functions are parametric-polymorphics when
    349 they are written with the @forall@ clause.
    350 
    351 \begin{cfa}
    352 forall(T)
    353 T identity(T x) { return x; }
    354 identity(42);
    355 \end{cfa}
    356 The identity function accepts a value from any type as an arugment, and the type parameter @T@ is bounded to @int@ when the function
    357 is called with 42.
    358 
    359 The forall clause can takes @type assertions@ that restricts the polymorphics type.
    360 \begin{cfa}
    361 forall( T | { void foo(T); } )
    362 void bar(T t) { foo(t); }
    363 
    364 struct S {} s;
    365 void foo(struct S);
    366 
    367 bar(s);
    368 \end{cfa}
    369 The assertion on @T@ restricts the range of types for bar to only those implements foo with the matching a signature, so that bar()
    370 can call @foo@ in its body with type safe.
    371 Calling on type with no mathcing @foo()@ implemented, such as int, causes a compile time type assertion error.
    372 
    373 \subsection{trait}
    374 A @forall@ clause can asserts on multiple types and with multiple asserting functions. A common practice in \CFA is to group
    375 the asserting functions in to a named \newterm{trait}.
    376 
    377 \begin{cfa}
    378 trait Bird(T) {
    379         int days_can_fly(T i);
    380         void fly(T t);
     343S s = @0@;
     344if ( s @!= 0@ ) ...
     345\end{cfa}
     346Similarity, for @one_t@.
     347\begin{cfa}
     348void ?{}( S & this, @one_t@ ) { this.i = 1; this.j = 1; } // one_t, no parameter name allowed
     349S & ?++( S & this, @one_t@ ) { return (S){ this.i++, this.j++ }; }
     350\end{cfa}
     351
     352
     353\subsection{Polymorphic Functions}
     354
     355Polymorphic functions are the functions that apply to all types.
     356\CFA provides \newterm{parametric polymorphism} written with the @forall@ clause.
     357\begin{cfa}
     358@forall( T )@ T identity( T v ) { return v; }
     359identity( 42 );
     360\end{cfa}
     361The @identity@ function accepts a value from any type as an argument and returns that value.
     362At the call size, the type parameter @T@ is bounded to @int@ from the argument @42@.
     363
     364For polymorphic functions to be useful, the @forall@ clause needs \newterm{type assertion}s that restricts the polymorphic types it accepts.
     365\begin{cfa}
     366forall( T @| { void foo( T ); }@ ) void bar( T t ) { @foo( t );@ }
     367struct S { ... } s;
     368void foo( struct S );
     369bar( s );
     370\end{cfa}
     371The assertion on @T@ restricts the range of types that can be manipulated by @bar@ to only those that have an implementation of @foo@ with the matching signature, allowing @bar@'s call to @foo@ in its body.
     372
     373
     374\subsection{Trait}
     375
     376A @forall@ clause can assert many restrictions on multiple types.
     377A common practice is to refactor the assertions into a named \newterm{trait}.
     378\begin{cfa}
     379forall(T) trait @Bird@ {
     380        int days_can_fly( T );
     381        void fly( T );
    381382};
    382 
    383 forall(B | Bird(B)) {
    384         void bird_fly(int days_since_born, B bird) {
    385                 if (days_since_born > days_can_fly(bird)) {
    386                         fly(bird);
    387                 }
    388         }
    389 }
    390 
    391 struct Robin {} r;
    392 int days_can_fly(Robin r) { return 23; }
    393 void fly(Robin r) {}
    394 
    395 bird_fly( r );
    396 \end{cfa}
    397 
    398 Grouping type assertions into named trait effectively create a reusable interface for parametrics polymorphics types.
     383forall( B | @Bird@( B ) )
     384void bird_fly( int days_since_born, B bird ) {
     385        if ( days_since_born > days_can_fly( bird )) fly( bird );
     386}
     387struct Robin {} robin;
     388int days_can_fly( Robin robin ) { return 23; }
     389void fly( Robin robin ) {}
     390bird_fly( 23, robin );
     391\end{cfa}
     392Grouping type assertions into a named trait effectively creates a reusable interface for parametric-polymorphic types.
     393
    399394
    400395\section{Expression Resolution}
    401396
    402 The overloading feature poses a challenge in \CFA expression resolution. Overloadeded identifiers can refer multiple
    403 candidates, with multiples being simultaneously valid. The main task of \CFA resolver is to identity a best candidate that
    404 involes less implicit conversion and polymorphism.
     397Overloading poses a challenge for all expression-resolution systems.
     398Multiple overloaded names give multiple candidates at a call site, and a resolver must pick a \emph{best} match, where ``best'' is defined by a series of heuristics based on safety and programmer intuition/expectation.
     399When multiple best matches exist, the resolution is ambiguous.
     400
     401The \CFA resolver attempts to identity a best candidate based on: first, the number of parameters and types, and second, when no exact match exists, the fewest implicit conversions and polymorphic variables.
     402Finding an exact match is not discussed here, because the mechanism is fairly straightforward, even when the search space is large;
     403only finding a non-exact match is discussed in detail.
     404
    405405
    406406\subsection{Conversion Cost}
    407407\label{s:ConversionCost}
    408 In C, function call arguments and function parameters do not need to be a exact match. When types mismatch, C performs an \newterm{implicit conversion}
    409 on argument to parameter type. The process is trivial with the exception on binary operators;  When types of operands are different,
    410 C nees to decide which operands need implicit conversion. C defines the resolution pattern as "usual arithmetic conversion",
    411 in which C looks for a \newterm{common type} between operands, and convert either one or both operands to the common type.
    412 Loosely defined, a common type is a the smallest type in terms of size of representation that both operands can be converted into without losing their precision.
    413 Such conversion is called "widening" or "safe conversion".
    414 
    415 C binary operators can be restated as 2-arity functions that overloaded with different parameters. "Usual arithmetic conversion" is to find a overloaded
    416 instance that for both arguments, the conversion to parameter type is a widening conversion to the smallest type.
    417 
    418 \CFA generalizes "usual arithmetic conversion" to \newterm{conversion cost}. In the first design by Bilson, conversion cost is a 3-tuple,
    419 @(unsafe, poly, safe)@, where @unsafe@ the number of unsafe (narrorow conversion) from argument to parameter,
    420 @poly@ is the number of polymorphic function parameter,
    421 and @safe@ is sum of degree of safe (widening) conversion.
     408
     409Most programming languages perform some implicit conversions among basic types to facilitate mixed-mode arithmetic;
     410otherwise, the program becomes littered with many explicit casts, which is not match programmer expectation.
     411C is an aggressive language as it provides conversions among almost all of the basic types, even when the conversion is potentially unsafe or not meaningful, \ie @float@ to @bool@.
     412C defines the resolution pattern as ``usual arithmetic conversion''~\cite[\S~6.3.1.8]{C11}, in which C looks for a \newterm{common type} between operands, and converts one or both operands to the common type.
     413Loosely defined, a common type is a the smallest type in terms of size of representation that both operands can be converted into without losing their precision, called a \newterm{widening} or \newterm{safe conversion}.
     414
     415\CFA generalizes ``usual arithmetic conversion'' to \newterm{conversion cost}.
     416In the first design by Bilson~\cite{Bilson03}, conversion cost is a 3-tuple, @(unsafe, poly, safe)@ applied between each argument/parameter type, where:
     417\begin{enumerate}
     418\item
     419@unsafe@ is the number of precision losing (\newterm{narrowing} conversions),
     420\item
     421@poly@ is the number of polymorphic function parameters, and
     422\item
     423@safe@ is sum of the degree of safe (widening) conversions.
     424\end{enumerate}
    422425Sum of degree is a method to quantify C's integer and floating-point rank.
    423 Every pair of widening conversion types has been assigned with a \newterm{distance}, and distance between the two same type is 0.
    424 For example, the distance from char to int is 2, distance from integer to long is 1, and distance from int to long long int is 2.
    425 The distance does not mirror C's rank system. For example, the rank of char and signed char are the same in C, but the distance from char to signed char is assigned with 1.
    426 @safe@ cost is summing all pair of argument to parameter safe conversion distance.
    427 Among the three in Bilson's model, @unsafe@ is the most significant cost and @safe@ is the least significant one, with an implication that \CFA always choose
    428 a candidate with the lowest @unsafe@ if possible.
    429 
    430 Assume the overloaded function @foo@ is called with two @int@ parameter. The cost for every overloaded @foo@ has been list along:
    431 \begin{cfa}
    432 void foo(char, char);                           // (2, 0, 0)
    433 void foo(char, int);                            // (1, 0, 0)
    434 forall(T, V) void foo(T, V);            // (0, 2, 0)
    435 forall(T)        void foo(T, T);                // (0, 2, 0)
    436 forall(T)        void foo(T, int);              // (0, 1, 0)
    437 void foo(long long, long);                      // (0, 0, 3)
    438 void foo(long, long);                           // (0, 0, 2)
    439 void foo(int, long);                            // (0, 0, 1)
    440 
    441 int i, j; foo(i, j);
    442 \end{cfa}
    443 The overloaded instances are ordered from the highest to the lowest cost, and \CFA select the last candidate.
    444 
    445 In the later iteration of \CFA, Schluntz and Aaron expanded conversion cost to a 7-tuple with 4 additional categories,
    446 @@(unsafe, poly, safe, sign, vars, specialization, reference)@@.
    447 with interpretation listed below:
    448 \begin{itemize}
    449 \item Unsafe
    450 \item Poly
    451 \item Safe
    452 \item Sign is the number of sign/unsign variable conversion.
    453 \item Vars is the number of polymorphics type variable.
    454 \item Specialization is negative value of the number of type assertion.
    455 \item Reference is number of reference-to-rvalue conversion.
    456 \end{itemize}
    457 The extended conversion cost models looks for candidates that are more specific and less generic.
    458 @Var@s was introduced by Aaron to disambugate @forall(T, V) void foo(T, V)@ and @forall(T) void foo(T, T)@. The extra type parameter @V@
    459 makes it more generic and less specific. More generic type means less constraints on types of its parameters. \CFA is in favor of candidates with more
    460 restrictions on polymorphism so @forall(T) void foo(T, T)@ has lower cost. @Specialization@ is a value that always less than or equal to zero. For every type assertion in @forall@ clause, \CFA subtracts one from
    461 @specialization@, starting from zero. More type assertions often means more constraints on argument type, and making the function less generic.
    462 
    463 \CFA defines two special cost value: @zero@ and @infinite@ A conversion cost is @zero@ when argument and parameter has exact match, and a conversion cost is @infinite@ when
    464 there is no defined conversion between two types. For example, the conversion cost from int to a struct type S is @infinite@.
     426Every pair of widening conversion types is assigned a \newterm{distance}, and distance between the two same type is 0.
     427For example, the distance from @char@ to @int@ is 2, distance from @int@ to @long@ is 1, and distance from @int@ to @long long int@ is 2.
     428This distance does not mirror C's rank system.
     429For example, the rank of @char@ and @signed char@ are the same in C, but the distance from @char@ to @signed char@ is assigned 1.
     430@safe@ cost is summing all pairs of argument to parameter safe conversion distances.
     431Among the three costs in Bilson's model, @unsafe@ is the most significant cost and @safe@ is the least significant, with an implication that \CFA always choose a candidate with the lowest @unsafe@, if possible.
     432
     433For example, assume the overloaded function @foo@ is called with two @int@ parameter.
     434The cost for every overloaded @foo@ has been list along:
     435\begin{cfa}
     436void foo( char, char );                         $\C[2.5in]{// (1) (2, 0, 0)}$
     437void foo( char, int );                          $\C{// (2) (1, 0, 0)}$
     438forall( T, V ) void foo( T, V );        $\C{// (3) (0, 2, 0)}$
     439forall( T ) void foo( T, T );           $\C{// (4) (0, 2, 0)}$
     440forall( T ) void foo( T, int );         $\C{// (5) (0, 1, 0)}$
     441void foo( long long, long );            $\C{// (6) (0, 0, 3)}$
     442void foo( long, long );                         $\C{// (7) (0, 0, 2)}$
     443void foo( int, long );                          $\C{// (8) (0, 0, 1)}$
     444int i, j;
     445foo( i, j );                                            $\C{// convert j to long and call (8)}\CRT$
     446\end{cfa}
     447The overloaded instances are ordered from the highest to the lowest cost, and \CFA select the last candidate (8).
     448
     449In the next iteration of \CFA, Schluntz and Aaron~\cite{Moss18} expanded conversion cost to a 7-tuple with 4 additional categories, @(unsafe, poly, safe, sign, vars, specialization, reference)@, with the following interpretations:
     450\begin{enumerate}
     451\item @unsafe@ from Bilson
     452\item @poly@
     453\item @safe@
     454\item @sign@ is the number of sign/unsigned variable conversions
     455\item @vars@ is the number of polymorphic type variables
     456\item @specialization@ is a negative value of the number of type assertions
     457\item @reference@ is the number of reference-to-rvalue conversions
     458\end{enumerate}
     459The extended conversion-cost model looks for candidates that are more specific and less generic.
     460@vars@ disambiguates @forall( T, V ) foo( T, V )@ and @forall( T ) void foo( T, T )@, where the extra type parameter @V@ makes is more generic.
     461A more generic type means less constraints on its parameter types.
     462\CFA favours candidates with more restrictions on polymorphism, so @forall( T ) void foo( T, T )@ has lower cost.
     463@specialization@ is an arbitrary count-down value starting at zero.
     464For every type assertion in @forall@ clause (no assertions in the above example), \CFA subtracts one from @specialization@.
     465More type assertions means more constraints on argument types, making the function less generic.
     466
     467\CFA defines two special cost values: @zero@ and @infinite@.
     468A conversion cost is @zero@ when argument and parameter has an exact match, and a conversion cost is @infinite@ when there is no defined conversion between two types.
     469For example, the conversion cost from @int@ to a @struct S@ is @infinite@.
  • doc/theses/jiada_liang_MMath/uw-ethesis.tex

    recaedf35 rb797fe36  
    155155\urlstyle{sf}
    156156
     157\setcounter{secnumdepth}{4}     % number subsubsection
     158\setcounter{tocdepth}{4} % subsubsection in TOC
     159
    157160%\usepackage[automake,toc,abbreviations]{glossaries-extra} % Exception to the rule of hyperref being the last add-on package
    158161%\renewcommand*{\glstextformat}[1]{\textcolor{black}{#1}}
Note: See TracChangeset for help on using the changeset viewer.