Changes in / [4ff7ea3:94aa202]


Ignore:
File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/fangren_yu_MMath/intro.tex

    r4ff7ea3 r94aa202  
    1212Therefore, one of the key goals in \CFA is to push the boundary on overloading, and hence, overload resolution.
    1313As well, \CFA follows the current trend of replacing nominal inheritance with traits.
    14 Together, the resulting \CFA type-system has a number of unique features making it different from all other programming languages.
     14Together, the resulting \CFA type-system has a number of unique features making it different from other programming languages with expressive, static type-systems.
    1515
    1616
     
    2323All computers have multiple types because computer architects optimize the hardware around a few basic types with well defined (mathematical) operations: boolean, integral, floating-point, and occasionally strings.
    2424A programming language and its compiler present ways to declare types that ultimately map into the ones provided by the underlying hardware.
    25 These language types are thrust upon programmers, with their syntactic and semantic rules and restrictions.
     25These language types are \emph{thrust} upon programmers, with their syntactic and semantic rules and restrictions.
    2626These rules are used to transform a language expression to a hardware expression.
    2727Modern programming-languages allow user-defined types and generalize across multiple types using polymorphism.
     
    3434Virtually all programming languages overload the arithmetic operators across the basic computational types using the number and type of parameters and returns.
    3535Like \CC, \CFA also allows these operators to be overloaded with user-defined types.
    36 The syntax for operator names uses the @'?'@ character to denote a parameter, \eg unary left and right operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
     36The syntax for operator names uses the @'?'@ character to denote a parameter, \eg left and right unary operators: @?++@ and @++?@, and binary operators @?+?@ and @?<=?@.
    3737Here, a user-defined type is extended with an addition operation with the same syntax as builtin types.
    3838\begin{cfa}
     
    4444\end{cfa}
    4545The type system examines each call site and selects the best matching overloaded function based on the number and types of arguments.
    46 If there are mixed-mode operands, @2 + 3.5@, the type system, like in C/\CC, attempts (safe) conversions, converting the argument type(s) to the parameter type(s).
     46If there are mixed-mode operands, @2 + 3.5@, the type system attempts (safe) conversions, like in C/\CC, converting the argument type(s) to the parameter type(s).
    4747Conversions are necessary because the hardware rarely supports mix-mode operations, so both operands must be the same type.
    4848Note, without implicit conversions, programmers must write an exponential number of functions covering all possible exact-match cases among all possible types.
     
    7171double d = f( 3 );              $\C{// select (2)}\CRT$
    7272\end{cfa}
    73 Alternatively, if the type system looks at the return type, there is an exact match for each call, which matches with programmer intuition and expectation.
     73Alternatively, if the type system looks at the return type, there is an exact match for each call, which again matches with programmer intuition and expectation.
    7474This capability can be taken to the extreme, where there are no function parameters.
    7575\begin{cfa}
     
    8080\end{cfa}
    8181Again, there is an exact match for each call.
    82 If there is no exact match, a set of minimal conversions can be added to find a best match, as for operator overloading.
     82If there is no exact match, a set of minimal, safe conversions can be added to find a best match, as for operator overloading.
    8383
    8484
     
    9999\end{cfa}
    100100It is interesting that shadow overloading is considered a normal programming-language feature with only slight software-engineering problems.
    101 However, variable overloading within a scope is considered extremely dangerous, without any evidence to corroborate this claim.
    102 Similarly, function overloading occurs silently within the global scope in \CC from @#include@ files all the time without problems.
     101However, variable overloading within a scope is often considered extremely dangerous, without any evidence to corroborate this claim.
     102In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems.
    103103
    104104In \CFA, the type system simply treats overloaded variables as an overloaded function returning a value with no parameters.
     
    114114Hence, the name @MAX@ can replace all the C type-specific names, \eg @INT_MAX@, @LONG_MAX@, @DBL_MAX@, \etc.
    115115The result is a significant reduction in names to access typed constants.
    116 % Paraphrasing Shakespeare, ``The \emph{name} is the thing.''.
     116
     117As an aside, C has a separate namespace for type and variables allowing overloading between the namespaces, using @struct@ (qualification) to disambiguate.
     118\begin{cfa}
     119void S() {
     120        struct @S@ { int S; };
     121        @struct S@ S;
     122        void S( @struct S@ S ) { S.S = 1; };
     123}
     124\end{cfa}
    117125
    118126
     
    120128
    121129\CFA is unique in providing restricted constant overloading for the values @0@ and @1@, which have special status in C, \eg the value @0@ is both an integer and a pointer literal, so its meaning depends on context.
    122 In addition, several operations are defined in terms of values @0@ and @1@, \eg:
    123 \begin{cfa}
    124 if ( x ) ++x                    $\C{// if ( x != 0 ) x += 1;}$
    125 \end{cfa}
    126 Every @if@ and iteration statement in C compares the condition with @0@, and every increment and decrement operator is semantically equivalent to adding or subtracting the value @1@ and storing the result.
    127 These two constants are given types @zero_t@ and @one_t@ in \CFA, which allows overloading various operations for new types that seamlessly connect to all special @0@ and @1@ contexts.
     130In addition, several operations are defined in terms of values @0@ and @1@.
     131For example, every @if@ and iteration statement in C compares the condition with @0@, and every increment and decrement operator is semantically equivalent to adding or subtracting the value @1@ and storing the result.
     132\begin{cfa}
     133if ( x ) ++x;        =>    if ( x @!= 0@ ) x @+= 1@;
     134for ( ; x; --x )   =>    for ( ; x @!= 0@; x @-= 1@ )
     135\end{cfa}
     136To generalize this feature, both constants are given types @zero_t@ and @one_t@ in \CFA, which allows overloading various operations for new types that seamlessly work with the special @0@ and @1@ contexts.
    128137The types @zero_t@ and @one_t@ have special builtin implicit conversions to the various integral types, and a conversion to pointer types for @0@, which allows standard C code involving @0@ and @1@ to work.
    129138\begin{cfa}
     
    133142S ?=?( S & dst, zero_t ) { dst.[i,j] = 0; return dst; } $\C{// assignment}$
    134143S ?=?( S & dst, one_t ) { dst.[i,j] = 1; return dst; }
    135 S ?+=?( S & s, one_t ) { s.[i,j] += 1; return s; } $\C{// increment/decrement}$
     144S ?+=?( S & s, one_t ) { s.[i,j] += 1; return s; } $\C{// increment/decrement each field}$
    136145S ?-=?( S & s, one_t ) { s.[i,j] -= 1; return s; }
    137146int ?!=?( S s, zero_t ) { return s.i != 0 && s.j != 0; } $\C{// comparison}$
    138 S s = @0@;
    139 s = @0@;
     147S s = @0@;                      $\C{// initialization}$
     148s = @0@;                        $\C{// assignments}$
    140149s = @1@;
    141 if ( @s@ ) @++s@;       $\C{// unary ++/-\,- come from +=/-=}$
    142 \end{cfa}
    143 Hence, type @S@ is first-class with respect to the basic types, working with all existing implicit C mechanisms.
     150if ( @s@ ) @++s@;       $\C{// special, unary ++/-\,- come implicitly from +=/-=}$
     151\end{cfa}
     152Here, type @S@ is first-class with respect to the basic types, working with all existing implicit C mechanisms.
    144153
    145154
     
    180189\end{cfa}
    181190In both overloads of @f@, the type system works from the constant initializations inwards and/or outwards to determine the types of all variables and functions.
    182 Note, like template meta-programming, there could be a new function generated for the second @f@ depending on the types of the arguments, assuming these types are meaningful in the body of @f@.
     191Note, like template meta programming, there could be a new function generated for the second @f@ depending on the types of the arguments, assuming these types are meaningful in the body of @f@.
    183192Inferring type constraints, by analysing the body of @f@ is possible, and these constraints must be satisfied at each call site by the argument types;
    184193in this case, parametric polymorphism can allow separate compilation.
     
    200209\begin{cfa}
    201210
    202 auto x = 3.0 * 4;
     211auto x = 3.0 * i;
    203212int y;
    204213auto z = y;
     
    223232This issue is exaggerated with \CC templates, where type names are 100s of characters long, resulting in unreadable error messages.
    224233\item
    225 Ensuring the type of secondary variables, always match a primary variable.
     234Ensuring the type of secondary variables, match a primary variable(s).
    226235\begin{cfa}
    227236int x; $\C{// primary variable}$
     
    284293There are full-time Java consultants, who are hired to find memory-management problems in large Java programs.}
    285294The entire area of Computer-Science data-structures is obsessed with time and space, and that obsession should continue into regular programming.
    286 Understanding space and time issues are an essential part of the programming craft.
    287 Given @typedef@ and @typeof@ in \CFA, and the strong need to use the left-hand type in resolution, implicit type-inferencing is unsupported.
     295Understanding space and time issues is an essential part of the programming craft.
     296Given @typedef@ and @typeof@ in \CFA, and the strong desire to use the left-hand type in resolution, implicit type-inferencing is unsupported.
    288297Should a significant need arise, this feature can be revisited.
    289298
     
    291300\section{Polymorphism}
    292301
    293 \CFA provides polymorphic functions and types, where the polymorphic function can be the constraints types using assertions based on traits.
    294 
    295 \subsection{\texorpdfstring{\protect\lstinline{forall} functions}{forall functions}}
    296 \label{sec:poly-fns}
    297 
    298 The signature feature of \CFA is parametric-polymorphic functions~\cite{forceone:impl,Cormack90,Duggan96} with functions generalized using a @forall@ clause (giving the language its name).
     302\CFA provides polymorphic functions and types, where a polymorphic function can constrain types using assertions based on traits.
     303
     304
     305\subsection{Polymorphic Function}
     306
     307The signature feature of the \CFA type-system is parametric-polymorphic functions~\cite{forceone:impl,Cormack90,Duggan96}, generalized using a @forall@ clause (giving the language its name).
    299308\begin{cfa}
    300309@forall( T )@ T identity( T val ) { return val; }
    301310int forty_two = identity( 42 );         $\C{// T is bound to int, forty\_two == 42}$
    302311\end{cfa}
    303 This @identity@ function can be applied to any complete \newterm{object type} (or @otype@).
    304 The type variable @T@ is transformed into a set of additional implicit parameters encoding sufficient information about @T@ to create and return a variable of that type.
    305 The \CFA implementation passes the size and alignment of the type represented by an @otype@ parameter, as well as an assignment operator, constructor, copy constructor, and destructor.
    306 If this extra information is not needed, for instance, for a pointer, the type parameter can be declared as a \newterm{data type} (or @dtype@).
    307 
    308 In \CFA, the polymorphic runtime cost is spread over each polymorphic call, because more arguments are passed to polymorphic functions;
    309 the experiments in Section~\ref{sec:eval} show this overhead is similar to \CC virtual function calls.
    310 A design advantage is that, unlike \CC template functions, \CFA polymorphic functions are compatible with C \emph{separate compilation}, preventing compilation and code bloat.
    311 
    312 Since bare polymorphic types provide a restricted set of available operations, \CFA provides a \newterm{type assertion}~\cite[pp.~37-44]{Alphard} mechanism to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type variable.
    313 For example, the function @twice@ can be defined using the \CFA syntax for operator overloading.
    314 \begin{cfa}
    315 forall( T @| { T ?+?(T, T); }@ ) T twice( T x ) { return x @+@ x; }  $\C{// ? denotes operands}$
    316 int val = twice( twice( 3.7 ) );  $\C{// val == 14}$
    317 \end{cfa}
    318 This works for any type @T@ with a matching addition operator.
    319 The polymorphism is achieved by creating a wrapper function for calling @+@ with the @T@ bound to @double@ and then passing this function to the first call of @twice@.
    320 There is now the option of using the same @twice@ and converting the result into @int@ on assignment or creating another @twice@ with the type parameter @T@ bound to @int@ because \CFA uses the return type~\cite{Cormack81,Baker82,Ada} in its type analysis.
    321 The first approach has a late conversion from @double@ to @int@ on the final assignment, whereas the second has an early conversion to @int@.
    322 \CFA minimizes the number of conversions and their potential to lose information;
    323 hence, it selects the first approach, which corresponds with C programmer intuition.
    324 
    325 Crucial to the design of a new programming language are the libraries to access thousands of external software features.
    326 Like \CC, \CFA inherits a massive compatible library base, where other programming languages must rewrite or provide fragile interlanguage communication with C.
    327 A simple example is leveraging the existing type-unsafe (@void *@) C @bsearch@ to binary search a sorted float array.
    328 \begin{cfa}
    329 void * bsearch( const void * key, const void * base, size_t nmemb, size_t size,
    330                                 int (* compar)( const void *, const void * ));
    331 int comp( const void * t1, const void * t2 ) {
    332          return *(double *)t1 < *(double *)t2 ? -1 : *(double *)t2 < *(double *)t1 ? 1 : 0;
    333 }
    334 double key = 5.0, vals[10] = { /* 10 sorted float values */ };
    335 double * val = (double *)bsearch( &key, vals, 10, sizeof(vals[0]), comp ); $\C{// search sorted array}$
    336 \end{cfa}
    337 This can be augmented simply with generalized, type-safe, \CFA-overloaded wrappers.
    338 \begin{cfa}
    339 forall( T | { int ?<?( T, T ); } ) T * bsearch( T key, const T * arr, size_t size ) {
    340         int comp( const void * t1, const void * t2 ) { /* as above with double changed to T */ }
    341         return (T *)bsearch( &key, arr, size, sizeof(T), comp );
    342 }
    343 forall( T | { int ?<?( T, T ); } ) unsigned int bsearch( T key, const T * arr, size_t size ) {
    344         T * result = bsearch( key, arr, size ); $\C{// call first version}$
    345         return result ? result - arr : size; $\C{// pointer subtraction includes sizeof(T)}$
    346 }
    347 double * val = bsearch( 5.0, vals, 10 ); $\C{// selection based on return type}$
    348 int posn = bsearch( 5.0, vals, 10 );
    349 \end{cfa}
    350 The nested function @comp@ provides the hidden interface from typed \CFA to untyped (@void *@) C, plus the cast of the result.
    351 % FIX
    352 Providing a hidden @comp@ function in \CC is awkward as lambdas do not use C calling conventions and template declarations cannot appear in block scope.
    353 In addition, an alternate kind of return is made available: position versus pointer to found element.
    354 \CC's type system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a template @bsearch@.
    355 
    356 \CFA has replacement libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions, all without rewriting the actual computations (see Section~\ref{sec:libraries}).
    357 For example, it is possible to write a type-safe \CFA wrapper @malloc@ based on the C @malloc@, where the return type supplies the type/size of the allocation, which is impossible in most type systems.
    358 \begin{cfa}
    359 forall( T & | sized(T) ) T * malloc( void ) { return (T *)malloc( sizeof(T) ); }
     312This @identity@ function can be applied to an \newterm{object type}, \ie a type with a known size and alignment, which is sufficient to stack allocate, default or copy initialize, assign, and delete.
     313The \CFA implementation passes the size and alignment for each type parameter, as well as any implicit/explicit constructor, copy constructor, assignment operator, and destructor.
     314For an incomplete \newterm{data type}, \eg pointer/reference types, this information is not needed.
     315\begin{cfa}
     316forall( T * ) T * identity( T * val ) { return val; }
     317int i, * ip = identity( &i );
     318\end{cfa}
     319Unlike \CC template functions, \CFA polymorphic functions are compatible with C \emph{separate compilation}, preventing compilation and code bloat.
     320
     321To constrain polymorphic types, \CFA uses \newterm{type assertions}~\cite[pp.~37-44]{Alphard} to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type variable.
     322For example, the function @twice@ works for any type @T@ with a matching addition operator.
     323\begin{cfa}
     324forall( T @| { T ?+?(T, T); }@ ) T twice( T x ) { return x @+@ x; }
     325int val = twice( twice( 3 ) );  $\C{// val == 12}$
     326\end{cfa}
     327For example. parametric polymorphism and assertions occurs in existing type-unsafe (@void *@) C @qsort@ to sort an array.
     328\begin{cfa}
     329void qsort( void * base, size_t nmemb, size_t size, int (*cmp)( const void *, const void * ) );
     330\end{cfa}
     331Here, the polymorphism is type-erasure, and the parametric assertion is the comparison routine, which is explicitly passed.
     332\begin{cfa}
     333enum { N = 5 };
     334double val[N] = { 5.1, 4.1, 3.1, 2.1, 1.1 };
     335int cmp( const void * v1, const void * v2 ) { $\C{// compare two doubles}$
     336        return *(double *)v1 < *(double *)v2 ? -1 : *(double *)v2 < *(double *)v1 ? 1 : 0;
     337}
     338qsort( val, N, sizeof( double ), cmp );
     339\end{cfa}
     340The equivalent type-safe version in \CFA is a wrapper over the C version.
     341\begin{cfa}
     342forall( ET | { int @?<?@( ET, ET ); } ) $\C{// type must have < operator}$
     343void qsort( ET * vals, size_t dim ) {
     344        int cmp( const void * t1, const void * t2 ) { $\C{// nested function}$
     345                return *(ET *)t1 @<@ *(ET *)t2 ? -1 : *(ET *)t2 @<@ *(ET *)t1 ? 1 : 0;
     346        }
     347        qsort( vals, dim, sizeof(ET), cmp ); $\C{// call C version}$
     348}
     349qsort( val, N );  $\C{// deduct type double, and pass builtin < for double}$
     350\end{cfa}
     351The nested function @cmp@ is implicitly built and provides the interface from typed \CFA to untyped (@void *@) C.
     352Providing a hidden @cmp@ function in \CC is awkward as lambdas do not use C calling conventions and template declarations cannot appear in block scope.
     353% In addition, an alternate kind of return is made available: position versus pointer to found element.
     354% \CC's type system cannot disambiguate between the two versions of @bsearch@ because it does not use the return type in overload resolution, nor can \CC separately compile a template @bsearch@.
     355Call-site inferencing and nested functions provide a localized form of inheritance.
     356For example, the \CFA @qsort@ can be made to sort in descending order by locally changing the behaviour of @<@.
     357\begin{cfa}
     358{
     359        int ?<?( double x, double y ) { return x @>@ y; } $\C{// locally override behaviour}$
     360        qsort( vals, 10 );                                                      $\C{// descending sort}$
     361}
     362\end{cfa}
     363The local version of @?<?@ overrides the built-in @?<?@ so it is passed to @qsort@.
     364The local version performs @?>?@, making @qsort@ sort in descending order.
     365Hence, any number of assertion functions can be overridden locally to maximize the reuse of existing functions and types, without the construction of a named inheritance hierarchy.
     366A final example is a type-safe wrapper for C @malloc@, where the return type supplies the type/size of the allocation, which is impossible in most type systems.
     367\begin{cfa}
     368static inline forall( T & | sized(T) )
     369T * malloc( void ) {
     370        if ( _Alignof(T) <= __BIGGEST_ALIGNMENT__ ) return (T *)malloc( sizeof(T) ); // C allocation
     371        else return (T *)memalign( _Alignof(T), sizeof(T) );
     372}
    360373// select type and size from left-hand side
    361 int * ip = malloc();  double * dp = malloc();  struct S {...} * sp = malloc();
    362 \end{cfa}
    363 
    364 Call site inferencing and nested functions provide a localized form of inheritance.
    365 For example, the \CFA @qsort@ only sorts in ascending order using @<@.
    366 However, it is trivial to locally change this behavior.
    367 \begin{cfa}
    368 forall( T | { int ?<?( T, T ); } ) void qsort( const T * arr, size_t size ) { /* use C qsort */ }
    369 int main() {
    370         int ?<?( double x, double y ) { return x @>@ y; } $\C{// locally override behavior}$
    371         qsort( vals, 10 );                                                      $\C{// descending sort}$
    372 }
    373 \end{cfa}
    374 The local version of @?<?@ performs @?>?@ overriding the built-in @?<?@ so it is passed to @qsort@.
    375 Therefore, programmers can easily form local environments, adding and modifying appropriate functions, to maximize the reuse of other existing functions and types.
    376 
    377 To reduce duplication, it is possible to distribute a group of @forall@ (and storage-class qualifiers) over functions/types, such that each block declaration is prefixed by the group (see the example in Appendix~\ref{s:CforallStack}).
    378 \begin{cfa}
    379 forall( @T@ ) {                                                 $\C{// distribution block, add forall qualifier to declarations}$
    380         struct stack { stack_node(@T@) * head; };       $\C{// generic type}$
    381         inline {                                                                        $\C{// nested distribution block, add forall/inline to declarations}$
    382                 void push( stack(@T@) & s, @T@ value ) ...      $\C{// generic operations}$
    383         }
    384 }
    385 \end{cfa}
     374int * ip = malloc();  double * dp = malloc();  $@$[aligned(64)] struct S {...} * sp = malloc();
     375\end{cfa}
     376The @sized@ assertion passes size and alignment as a data object has no implicit assertions.
     377Both assertions are used in @malloc@ via @sizeof@ and @_Alignof@.
     378
     379These mechanism are used to construct type-safe wrapper-libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions.
     380Hence, existing C legacy code is leveraged as much as possible;
     381other programming languages must build supporting libraries from scratch, even in \CC.
    386382
    387383
     
    390386\CFA provides \newterm{traits} to name a group of type assertions, where the trait name allows specifying the same set of assertions in multiple locations, preventing repetition mistakes at each function declaration.
    391387\begin{cquote}
    392 \lstDeleteShortInline@%
    393 \begin{tabular}{@{}l@{\hspace{\parindentlnth}}|@{\hspace{\parindentlnth}}l@{}}
     388\begin{tabular}{@{}l|@{\hspace{10pt}}l@{}}
    394389\begin{cfa}
    395390trait @sumable@( T ) {
    396391        void @?{}@( T &, zero_t ); // 0 literal constructor
    397         T ?+?( T, T );                  // assortment of additions
     392        T ?+?( T, T );          // assortment of additions
    398393        T @?+=?@( T &, T );
    399394        T ++?( T & );
     
    412407\end{cfa}
    413408\end{tabular}
    414 \lstMakeShortInline@%
    415409\end{cquote}
    416 
    417 Note that the @sumable@ trait does not include a copy constructor needed for the right side of @?+=?@ and return;
    418 it is provided by @otype@, which is syntactic sugar for the following trait.
    419 \begin{cfa}
    420 trait otype( T & | sized(T) ) {  // sized is a pseudo-trait for types with known size and alignment
     410Traits are simply flatten at the use point, as if written in full by the programmer, where traits often contain overlapping assertions, \eg operator @+@.
     411Hence, trait names play no part in type equivalence.
     412Note, the type @T@ is an object type, and hence, has the implicit internal trait @otype@.
     413\begin{cfa}
     414trait otype( T & | sized(T) ) {
    421415        void ?{}( T & );                                                $\C{// default constructor}$
    422416        void ?{}( T &, T );                                             $\C{// copy constructor}$
     
    425419};
    426420\end{cfa}
    427 Given the information provided for an @otype@, variables of polymorphic type can be treated as if they were a complete type: stack allocatable, default or copy initialized, assigned, and deleted.
    428 
    429 In summation, the \CFA type system uses \newterm{nominal typing} for concrete types, matching with the C type system, and \newterm{structural typing} for polymorphic types.
    430 Hence, trait names play no part in type equivalence;
    431 the names are simply macros for a list of polymorphic assertions, which are expanded at usage sites.
    432 Nevertheless, trait names form a logical subtype hierarchy with @dtype@ at the top, where traits often contain overlapping assertions, \eg operator @+@.
    433 Traits are used like interfaces in Java or abstract base classes in \CC, but without the nominal inheritance relationships.
    434 Instead, each polymorphic function (or generic type) defines the structural type needed for its execution (polymorphic type key), and this key is fulfilled at each call site from the lexical environment, which is similar to the Go~\cite{Go} interfaces.
    435 Hence, new lexical scopes and nested functions are used extensively to create local subtypes, as in the @qsort@ example, without having to manage a nominal inheritance hierarchy.
    436 % (Nominal inheritance can be approximated with traits using marker variables or functions, as is done in Go.)
    437 
    438 % Nominal inheritance can be simulated with traits using marker variables or functions:
    439 % \begin{cfa}
    440 % trait nominal(T) {
    441 %     T is_nominal;
    442 % };
    443 % int is_nominal;                                                               $\C{// int now satisfies the nominal trait}$
    444 % \end{cfa}
    445 %
    446 % Traits, however, are significantly more powerful than nominal-inheritance interfaces; most notably, traits may be used to declare a relationship \emph{among} multiple types, a property that may be difficult or impossible to represent in nominal-inheritance type systems:
    447 % \begin{cfa}
    448 % trait pointer_like(Ptr, El) {
    449 %     lvalue El *?(Ptr);                                                $\C{// Ptr can be dereferenced into a modifiable value of type El}$
    450 % }
    451 % struct list {
    452 %     int value;
    453 %     list * next;                                                              $\C{// may omit "struct" on type names as in \CC}$
    454 % };
    455 % typedef list * list_iterator;
    456 %
    457 % lvalue int *?( list_iterator it ) { return it->value; }
    458 % \end{cfa}
    459 % In the example above, @(list_iterator, int)@ satisfies @pointer_like@ by the user-defined dereference function, and @(list_iterator, list)@ also satisfies @pointer_like@ by the built-in dereference operator for pointers. Given a declaration @list_iterator it@, @*it@ can be either an @int@ or a @list@, with the meaning disambiguated by context (\eg @int x = *it;@ interprets @*it@ as an @int@, while @(*it).value = 42;@ interprets @*it@ as a @list@).
    460 % While a nominal-inheritance system with associated types could model one of those two relationships by making @El@ an associated type of @Ptr@ in the @pointer_like@ implementation, few such systems could model both relationships simultaneously.
     421The implicit routines are used by the @sumable@ operator @?+=?@ for the right side of @?+=?@ and return.
    461422
    462423
     
    465426A significant shortcoming of standard C is the lack of reusable type-safe abstractions for generic data structures and algorithms.
    466427Broadly speaking, there are three approaches to implement abstract data structures in C.
    467 One approach is to write bespoke data structures for each context in which they are needed.
    468 While this approach is flexible and supports integration with the C type checker and tooling, it is also tedious and error prone, especially for more complex data structures.
    469 A second approach is to use @void *@-based polymorphism, \eg the C standard library functions @bsearch@ and @qsort@, which allow for the reuse of code with common functionality.
    470 However, basing all polymorphism on @void *@ eliminates the type checker's ability to ensure that argument types are properly matched, often requiring a number of extra function parameters, pointer indirection, and dynamic allocation that is otherwise not needed.
    471 A third approach to generic code is to use preprocessor macros, which does allow the generated code to be both generic and type checked, but errors may be difficult to interpret.
    472 Furthermore, writing and using preprocessor macros is unnatural and inflexible.
    473 
    474 \CC, Java, and other languages use \newterm{generic types} to produce type-safe abstract data types.
     428\begin{enumerate}[leftmargin=*]
     429\item
     430Write bespoke data structures for each context they are needed.
     431While this approach is flexible and supports integration with the C type checker and tooling, it is tedious and error prone, especially for more complex data structures.
     432\item
     433Use @void *@-based polymorphism, \eg the C standard library functions @bsearch@ and @qsort@, which allow for the reuse of code with common functionality.
     434However, this approach eliminates the type checker's ability to ensure argument types are properly matched, often requiring a number of extra function parameters, pointer indirection, and dynamic allocation that is otherwise unnecessary.
     435\item
     436Use preprocessor macros, similar to \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret.
     437Furthermore, writing and using preprocessor macros is difficult and inflexible.
     438\end{enumerate}
     439
     440\CC, Java, and other languages use \newterm{generic types} to produce type-safe abstract data-types.
    475441\CFA generic types integrate efficiently and naturally with the existing polymorphic functions, while retaining backward compatibility with C and providing separate compilation.
    476442However, for known concrete parameters, the generic-type definition can be inlined, like \CC templates.
     
    478444A generic type can be declared by placing a @forall@ specifier on a @struct@ or @union@ declaration and instantiated using a parenthesized list of types after the type name.
    479445\begin{cquote}
    480 \lstDeleteShortInline@%
    481 \begin{tabular}{@{}l|@{\hspace{\parindentlnth}}l@{}}
    482 \begin{cfa}
    483 @forall( R, S )@ struct pair {
    484         R first;        S second;
     446\begin{tabular}{@{}l|@{\hspace{10pt}}l@{}}
     447\begin{cfa}
     448@forall( F, S )@ struct pair {
     449        F first;        S second;
    485450};
    486 @forall( T )@ // dynamic
    487 T value( pair(const char *, T) p ) { return p.second; }
    488 @forall( dtype F, T )@ // dtype-static (concrete)
    489 T value( pair(F *, T * ) p) { return *p.second; }
     451@forall( F, S )@ // object
     452S second( pair( F, S ) p ) { return p.second; }
     453@forall( F *, S * )@ // sized
     454S * second( pair( F *, S * ) p ) { return p.second; }
    490455\end{cfa}
    491456&
    492457\begin{cfa}
    493 pair(const char *, int) p = {"magic", 42}; // concrete
    494 int i = value( p );
    495 pair(void *, int *) q = { 0, &p.second }; // concrete
    496 i = value( q );
     458pair( double, int ) dpr = { 3.5, 42 };
     459int i = second( dpr );
     460pair( void *, int * ) vipr = { 0p, &i };
     461int * ip = second( vipr );
    497462double d = 1.0;
    498 pair(double *, double *) r = { &d, &d }; // concrete
    499 d = value( r );
     463pair( int *, double * ) idpr = { &i, &d };
     464double * dp = second( idpr );
    500465\end{cfa}
    501466\end{tabular}
    502 \lstMakeShortInline@%
    503467\end{cquote}
    504 
    505 \CFA classifies generic types as either \newterm{concrete} or \newterm{dynamic}.
    506 Concrete types have a fixed memory layout regardless of type parameters, whereas dynamic types vary in memory layout depending on their type parameters.
    507 A \newterm{dtype-static} type has polymorphic parameters but is still concrete.
    508 Polymorphic pointers are an example of dtype-static types;
    509 given some type variable @T@, @T@ is a polymorphic type, as is @T *@, but @T *@ has a fixed size and can, therefore, be represented by @void *@ in code generation.
    510 
    511 \CFA generic types also allow checked argument constraints.
    512 For example, the following declaration of a sorted set type ensures the set key supports equality and relational comparison.
    513 \begin{cfa}
    514 forall( Key | { _Bool ?==?(Key, Key); _Bool ?<?(Key, Key); } ) struct sorted_set;
    515 \end{cfa}
    516 
    517 
    518 \subsection{Concrete generic types}
    519 
    520 The \CFA translator template expands concrete generic types into new structure types, affording maximal inlining.
    521 To enable interoperation among equivalent instantiations of a generic type, the translator saves the set of instantiations currently in scope and reuses the generated structure declarations where appropriate.
    522 A function declaration that accepts or returns a concrete generic type produces a declaration for the instantiated structure in the same scope, which all callers may reuse.
    523 For example, the concrete instantiation for @pair( const char *, int )@ is
    524 \begin{cfa}
    525 struct _pair_conc0 {
    526         const char * first;  int second;
    527 };
    528 \end{cfa}
    529 
    530 A concrete generic type with dtype-static parameters is also expanded to a structure type, but this type is used for all matching instantiations.
    531 In the above example, the @pair( F *, T * )@ parameter to @value@ is such a type; its expansion is below, and it is used as the type of the variables @q@ and @r@ as well, with casts for member access where appropriate.
    532 \begin{cfa}
    533 struct _pair_conc1 {
    534         void * first, * second;
    535 };
    536 \end{cfa}
    537 
    538 
    539 \subsection{Dynamic generic types}
    540 
    541 Though \CFA implements concrete generic types efficiently, it also has a fully general system for dynamic generic types.
    542 As mentioned in Section~\ref{sec:poly-fns}, @otype@ function parameters (in fact, all @sized@ polymorphic parameters) come with implicit size and alignment parameters provided by the caller.
    543 Dynamic generic types also have an \newterm{offset array} containing structure-member offsets.
    544 A dynamic generic @union@ needs no such offset array, as all members are at offset 0, but size and alignment are still necessary.
    545 Access to members of a dynamic structure is provided at runtime via base displacement addressing
    546 % FIX
    547 using the structure pointer and the member offset (similar to the @offsetof@ macro), moving a compile-time offset calculation to runtime.
    548 
    549 The offset arrays are statically generated where possible.
    550 If a dynamic generic type is declared to be passed or returned by value from a polymorphic function, the translator can safely assume that the generic type is complete (\ie has a known layout) at any call site, and the offset array is passed from the caller;
    551 if the generic type is concrete at the call site, the elements of this offset array can even be statically generated using the C @offsetof@ macro.
    552 As an example, the body of the second @value@ function is implemented as
    553 \begin{cfa}
    554 _assign_T( _retval, p + _offsetof_pair[1] ); $\C{// return *p.second}$
    555 \end{cfa}
    556 \newpage
    557 \noindent
    558 Here, @_assign_T@ is passed in as an implicit parameter from @T@, and takes two @T *@ (@void *@ in the generated code), a destination and a source, and @_retval@ is the pointer to a caller-allocated buffer for the return value, the usual \CFA method to handle dynamically sized return types.
    559 @_offsetof_pair@ is the offset array passed into @value@;
    560 this array is generated at the call site as
    561 \begin{cfa}
    562 size_t _offsetof_pair[] = { offsetof( _pair_conc0, first ), offsetof( _pair_conc0, second ) }
    563 \end{cfa}
    564 
    565 In some cases, the offset arrays cannot be statically generated.
    566 For instance, modularity is generally provided in C by including an opaque forward declaration of a structure and associated accessor and mutator functions in a header file, with the actual implementations in a separately compiled @.c@ file.
    567 \CFA supports this pattern for generic types, but the caller does not know the actual layout or size of the dynamic generic type and only holds it by a pointer.
    568 The \CFA translator automatically generates \newterm{layout functions} for cases where the size, alignment, and offset array of a generic struct cannot be passed into a function from that function's caller.
    569 These layout functions take as arguments pointers to size and alignment variables and a caller-allocated array of member offsets, as well as the size and alignment of all @sized@ parameters to the generic structure (un@sized@ parameters are forbidden from being used in a context that affects layout).
    570 Results of these layout functions are cached so that they are only computed once per type per function. %, as in the example below for @pair@.
    571 Layout functions also allow generic types to be used in a function definition without reflecting them in the function signature.
    572 For instance, a function that strips duplicate values from an unsorted @vector(T)@ likely has a pointer to the vector as its only explicit parameter, but uses some sort of @set(T)@ internally to test for duplicate values.
    573 This function could acquire the layout for @set(T)@ by calling its layout function with the layout of @T@ implicitly passed into the function.
    574 
    575 Whether a type is concrete, dtype-static, or dynamic is decided solely on the @forall@'s type parameters.
    576 This design allows opaque forward declarations of generic types, \eg @forall(T)@ @struct Box@ -- like in C, all uses of @Box(T)@ can be separately compiled, and callers from other translation units know the proper calling conventions to use.
    577 If the definition of a structure type is included in deciding whether a generic type is dynamic or concrete, some further types may be recognized as dtype-static (\eg @forall(T)@ @struct unique_ptr { T * p }@ does not depend on @T@ for its layout, but the existence of an @otype@ parameter means that it \emph{could}.);
    578 however, preserving separate compilation (and the associated C compatibility) in the existing design is judged to be an appropriate trade-off.
     468\CFA generic types are \newterm{fixed} or \newterm{dynamic} sized.
     469Fixed-size types have a fixed memory layout regardless of type parameters, whereas dynamic types vary in memory layout depending on their type parameters.
     470For example, the type variable @T *@ is fixed size and is represented by @void *@ in code generation;
     471whereas, the type variable @T@ is dynamic and set at the point of instantiation.
     472The difference between fixed and dynamic is the complexity and cost of field access.
     473For fixed, field offsets are computed (known) at compile time and embedded as displacements in instructions.
     474For dynamic, field offsets are computed at compile time at the call site, stored in an array of offset values, passed as a polymorphic parameter, and added to the structure address for each field dereference within a polymorphic routine.
     475See~\cite[\S~3.2]{Moss19} for complete implementation details.
     476
     477Currently, \CFA generic types allow assertion.
     478For example, the following declaration of a sorted set-type ensures the set key supports equality and relational comparison.
     479\begin{cfa}
     480forall( Elem, @Key@ | { _Bool ?==?( Key, Key ); _Bool ?<?( Key, Key ); } )
     481struct Sorted_Set { Elem elem; @Key@ key; ... };
     482\end{cfa}
     483However, the operations that insert/remove elements from the set should not appear as part of the generic-types assertions.
     484\begin{cfa}
     485forall( @Elem@ | /* any assertions on element type */ ) {
     486        void insert( Sorted_Set set, @Elem@ elem ) { ... }
     487        bool remove( Sorted_Set set, @Elem@ elem ) { ... } // false => element not present
     488        ... // more set operations
     489} // distribution
     490\end{cfa}
     491(Note, the @forall@ clause can be distributed across multiple functions.)
     492For software-engineering reasons, the set assertions would be refactored into a trait to allow alternative implementations, like a Java \lstinline[language=java]{interface}.
     493
     494In summation, the \CFA type system inherits \newterm{nominal typing} for concrete types from C, and adds \newterm{structural typing} for polymorphic types.
     495Traits are used like interfaces in Java or abstract base-classes in \CC, but without the nominal inheritance relationships.
     496Instead, each polymorphic function or generic type defines the structural type needed for its execution, which is fulfilled at each call site from the lexical environment, like Go~\cite{Go} or Rust~\cite{Rust} interfaces.
     497Hence, new lexical scopes and nested functions are used extensively to create local subtypes, as in the @qsort@ example, without having to manage a nominal inheritance hierarchy.
    579498
    580499
    581500\section{Contributions}
    582501
     502\begin{enumerate}
     503\item
     504\item
     505\item
     506\end{enumerate}
    583507
    584508
Note: See TracChangeset for help on using the changeset viewer.