Changeset 19518e8 for doc/generic_types


Ignore:
Timestamp:
Apr 12, 2017, 10:57:36 PM (7 years ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
ffc9f5a
Parents:
50b7e8c
Message:

more shortening tuple section

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/generic_types/generic_types.tex

    r50b7e8c r19518e8  
    709709For example, in
    710710\begin{lstlisting}
    711   [int, int, int] f();
    712   [int, [int, int], int] g();
    713 
    714   ([int, double])f();           $\C{// (1)}$
    715   ([int, int, int])g();         $\C{// (2)}$
    716   ([void, [int, int]])g();      $\C{// (3)}$
    717   ([int, int, int, int])g();    $\C{// (4)}$
    718   ([int, [int, int, int]])g();  $\C{// (5)}$
     711[int, int, int] f();
     712[int, [int, int], int] g();
     713
     714([int, double])f();           $\C{// (1)}$
     715([int, int, int])g();         $\C{// (2)}$
     716([void, [int, int]])g();      $\C{// (3)}$
     717([int, int, int, int])g();    $\C{// (4)}$
     718([int, [int, int, int]])g();  $\C{// (5)}$
    719719\end{lstlisting}
    720720
     
    767767\subsection{Variadic Tuples}
    768768
    769 To define variadic functions, \CFA adds a new kind of type parameter, @ttype@. Matching against a @ttype@ (``tuple type'') parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types. In a given parameter list, there should be at most one @ttype@ parameter that must occur last, otherwise the call can never resolve, given the previous rule. This idea essentially matches normal variadic semantics, with a strong feeling of similarity to \CCeleven variadic templates. As such, @ttype@ variables are also referred to as \emph{argument} or \emph{parameter packs} in this paper.
    770 
    771 Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is through recursion. Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful. Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled.
    772 
    773 For example, the C @sum@ function at the beginning of Section~\ref{sec:tuples} could be written using @ttype@ as:
    774 \begin{lstlisting}
    775 int sum(){ return 0; }        // (0)
    776 forall(ttype Params | { int sum(Params); })
    777 int sum(int x, Params rest) { // (1)
    778   return x+sum(rest);
    779 }
    780 sum(10, 20, 30);
    781 \end{lstlisting}
    782 Since (0) does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@.
    783 In order to call (1), @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list and @Params@ is bound to @[20, 30]@.
    784 In order to finish the resolution of @sum@, an assertion parameter that matches @int sum(int, int)@ is required.
    785 Like in the previous iteration, (0) is not a valid candidate, so (1) is examined with @Params@ bound to @[int]@, requiring the assertion @int sum(int)@.
    786 Next, (0) fails, and to satisfy (1) @Params@ is bound to @[]@, requiring an assertion @int sum()@.
    787 Finally, (0) matches and (1) fails, which terminates the recursion.
    788 Effectively, this algorithm traces as @sum(10, 20, 30)@ $\rightarrow$ @10+sum(20, 30)@ $\rightarrow$ @10+(20+sum(30))@ $\rightarrow$ @10+(20+(30+sum()))@ $\rightarrow$ @10+(20+(30+0))@.
    789 
    790 As a point of note, this version does not require any form of argument descriptor, since the \CFA type system keeps track of all of these details. It might be reasonable to take the @sum@ function a step further to enforce a minimum number of arguments:
    791 \begin{lstlisting}
    792 int sum(int x, int y){
    793   return x+y;
    794 }
    795 forall(ttype Params | { int sum(int, Params); })
    796 int sum(int x, int y, Params rest) {
    797   return sum(x+y, rest);
    798 }
    799 \end{lstlisting}
    800 
    801 One more iteration permits the summation of any summable type, as long as all arguments are the same type:
     769To define variadic functions, \CFA adds a new kind of type parameter, @ttype@ (tuple type).
     770Matching against a @ttype@ parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types.
     771In a given parameter list, there must be at most one @ttype@ parameter that occurs last, which matches normal variadic semantics, with a strong feeling of similarity to \CCeleven variadic templates.
     772As such, @ttype@ variables are also called \emph{argument packs}.
     773
     774Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is via recursion.
     775Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful.
     776Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled.
     777For example, a generalized @sum@ function written using @ttype@:
     778\begin{lstlisting}
     779int sum$\(_0\)$() { return 0; }
     780forall(ttype Params | { int sum( Params ); } ) int sum$\(_1\)$( int x, Params rest ) {
     781        return x + sum( rest );
     782}
     783sum( 10, 20, 30 );
     784\end{lstlisting}
     785Since @sum@\(_0\) does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@.
     786In order to call @sum@\(_1\), @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list and @Params@ is bound to @[20, 30]@.
     787The process continues, @Params@ is bound to @[]@, requiring an assertion @int sum()@, which matchs @sum@\(_0\) and terminates the recursion.
     788Effectively, this algorithm traces as @sum(10, 20, 30)@ $\rightarrow$ @10 + sum(20, 30)@ $\rightarrow$ @10 + (20 + sum(30))@ $\rightarrow$ @10 + (20 + (30 + sum()))@ $\rightarrow$ @10 + (20 + (30 + 0))@.
     789
     790It is reasonable to take the @sum@ function a step further to enforce a minimum number of arguments:
     791\begin{lstlisting}
     792int sum( int x, int y ) { return x + y; }
     793forall(ttype Params | { int sum( int, Params ); } ) int sum( int x, int y, Params rest ) {
     794        return sum( x + y, rest );
     795}
     796\end{lstlisting}
     797One more step permits the summation of any summable type with all arguments of the same type:
    802798\begin{lstlisting}
    803799trait summable(otype T) {
    804   T ?+?(T, T);
     800        T ?+?( T, T );
    805801};
    806 forall(otype R | summable(R))
    807 R sum(R x, R y){
    808   return x+y;
    809 }
    810 forall(otype R, ttype Params
    811   | summable(R)
    812   | { R sum(R, Params); })
    813 R sum(R x, R y, Params rest) {
    814   return sum(x+y, rest);
    815 }
    816 \end{lstlisting}
    817 Unlike C, it is not necessary to hard code the expected type. This code is naturally open to extension, in that any user-defined type with a @?+?@ operator is automatically able to be used with the @sum@ function. That is to say, the programmer who writes @sum@ does not need full program knowledge of every possible data type, unlike what is necessary to write an equivalent function using the standard C mechanisms. Summing arbitrary heterogeneous lists is possible with similar code by adding the appropriate type variables and addition operators.
    818 
    819 It is also possible to write a type-safe variadic print function which can replace @printf@:
     802forall(otype R | summable( R ) ) R sum( R x, R y ) {
     803        return x + y;
     804}
     805forall(otype R, ttype Params | summable(R) | { R sum(R, Params); } ) R sum(R x, R y, Params rest) {
     806        return sum( x + y, rest );
     807}
     808\end{lstlisting}
     809Unlike C variadic functions, it is unnecessary to hard code the number and expected types.
     810Furthermore, this code is extendable so any user-defined type with a @?+?@ operator.
     811Summing arbitrary heterogeneous lists is possible with similar code by adding the appropriate type variables and addition operators.
     812
     813It is also possible to write a type-safe variadic print function to replace @printf@:
    820814\begin{lstlisting}
    821815struct S { int x, y; };
    822 forall(otype T, ttype Params |
    823   { void print(T); void print(Params); })
    824 void print(T arg, Params rest) {
    825   print(arg);
    826   print(rest);
    827 }
    828 void print(char * x) { printf("%s", x); }
    829 void print(int x) { printf("%d", x);  }
    830 void print(S s) { print("{ ", s.x, ",", s.y, " }"); }
    831 
    832 print("s = ", (S){ 1, 2 }, "\n");
    833 \end{lstlisting}
    834 This example function showcases a variadic-template-like decomposition of the provided argument list. The individual @print@ functions allow printing a single element of a type. The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function. The individual print functions can be used to build up more complicated @print@ functions, such as for @S@, which is something that cannot be done with @printf@ in C.
    835 
    836 It is also possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions. For example, it is possible to write @new@ as a library function:
    837 \begin{lstlisting}
    838 struct pair(otype R, otype S);
    839 forall(otype R, otype S)
    840 void ?{}(pair(R, S) *, R, S);  // (1)
    841 
    842 forall(dtype T, ttype Params | sized(T) | { void ?{}(T *, Params); })
    843 T * new(Params p) {
    844   return ((T*)malloc( sizeof(T) )){ p }; // construct into result of malloc
    845 }
    846 
    847 pair(int, char) * x = new(42, '!');
    848 \end{lstlisting}
    849 The @new@ function provides the combination of type-safe @malloc@ with a constructor call, so that it becomes impossible to forget to construct dynamically allocated objects. This function provides the type-safety of @new@ in \CC, without the need to specify the allocated type again, thanks to return-type inference.
    850 
    851 In the call to @new@, @pair(double, char)@ is selected to match @T@, and @Params@ is expanded to match @[double, char]@. The constructor (1) may be specialized to  satisfy the assertion for a constructor with an interface compatible with @void ?{}(pair(int, char) *, int, char)@.
     816forall(otype T, ttype Params | { void print(T); void print(Params); }) void print(T arg, Params rest) {
     817        print(arg);
     818        print(rest);
     819}
     820void print( char * x ) { printf( "%s", x ); }
     821void print( int x ) { printf( "%d", x ); }
     822void print( S s ) { print( "{ ", s.x, ",", s.y, " }" ); }
     823print( "s = ", (S){ 1, 2 }, "\n" );
     824\end{lstlisting}
     825This example showcases a variadic-template-like decomposition of the provided argument list.
     826The individual @print@ functions allow printing a single element of a type.
     827The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function.
     828The individual print functions can be used to build up more complicated @print@ functions, such as for @S@, which is something that cannot be done with @printf@ in C.
     829
     830Finally, it is possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions.
     831For example, it is possible to write @new@ as a library function:
     832\begin{lstlisting}
     833struct pair( otype R, otype S );
     834forall( otype R, otype S ) void ?{}( pair(R, S) *, R, S );  // (1)
     835forall( dtype T, ttype Params | sized(T) | { void ?{}( T *, Params ); } ) T * new( Params p ) {
     836        return ((T*)malloc( sizeof(T) )){ p }; // construct into result of malloc
     837}
     838pair( int, char ) * x = new( 42, '!' );
     839\end{lstlisting}
     840The @new@ function provides the combination of type-safe @malloc@ with a \CFA constructor call, making it impossible to forget constructing dynamically allocated objects.
     841This function provides the type-safety of @new@ in \CC, without the need to specify the allocated type again, thanks to return-type inference.
     842
     843% In the call to @new@, @pair(double, char)@ is selected to match @T@, and @Params@ is expanded to match @[double, char]@. The constructor (1) may be specialized to  satisfy the assertion for a constructor with an interface compatible with @void ?{}(pair(int, char) *, int, char)@.
     844
    852845
    853846\subsection{Implementation}
    854847
    855 Tuples are implemented in the \CFA translator via a transformation into generic types. For each $N$, the first time an $N$-tuple is seen in a scope a generic type with $N$ type parameters is generated. For example:
     848Tuples are implemented in the \CFA translator via a transformation into generic types.
     849For each $N$, the first time an $N$-tuple is seen in a scope a generic type with $N$ type parameters is generated. \eg:
    856850\begin{lstlisting}
    857851[int, int] f() {
    858   [double, double] x;
    859   [int, double, int] y;
    860 }
    861 \end{lstlisting}
    862 Is transformed into:
    863 \begin{lstlisting}
    864 forall(dtype T0, dtype T1 | sized(T0) | sized(T1))
    865 struct _tuple2 {  // generated before the first 2-tuple
    866   T0 field_0;
    867   T1 field_1;
     852        [double, double] x;
     853        [int, double, int] y;
     854}
     855\end{lstlisting}
     856is transformed into:
     857\begin{lstlisting}
     858// generated before the first 2-tuple
     859forall(dtype T0, dtype T1 | sized(T0) | sized(T1)) struct _tuple2 {
     860        T0 field_0;
     861        T1 field_1;
    868862};
    869863_tuple2(int, int) f() {
    870   _tuple2(double, double) x;
    871   forall(dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2))
    872   struct _tuple3 {  // generated before the first 3-tuple
    873         T0 field_0;
    874         T1 field_1;
    875         T2 field_2;
    876   };
    877   _tuple3_(int, double, int) y;
    878 }
    879 \end{lstlisting}
    880 
     864        _tuple2(double, double) x;
     865        // generated before the first 3-tuple
     866        forall(dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2)) struct _tuple3 {
     867                T0 field_0;
     868                T1 field_1;
     869                T2 field_2;
     870        };
     871        _tuple3_(int, double, int) y;
     872}
     873\end{lstlisting}
    881874Tuple expressions are then simply converted directly into compound literals:
    882875\begin{lstlisting}
    883876[5, 'x', 1.24];
    884877\end{lstlisting}
    885 Becomes:
     878becomes:
    886879\begin{lstlisting}
    887880(_tuple3(int, char, double)){ 5, 'x', 1.24 };
    888881\end{lstlisting}
    889882
     883\begin{comment}
    890884Since tuples are essentially structures, tuple indexing expressions are just field accesses:
    891885\begin{lstlisting}
     
    922916[int, double] _unq0;
    923917g(
    924   (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).0,
    925   (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).1,
     918        (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).0,
     919        (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).1,
    926920);
    927921\end{lstlisting}
     
    931925
    932926The various kinds of tuple assignment, constructors, and destructors generate GNU C statement expressions. A variable is generated to store the value produced by a statement expression, since its fields may need to be constructed with a non-trivial constructor and it may need to be referred to multiple time, \eg in a unique expression. The use of statement expressions allows the translator to arbitrarily generate additional temporary variables as needed, but binds the implementation to a non-standard extension of the C language. However, there are other places where the \CFA translator makes use of GNU C extensions, such as its use of nested functions, so this restriction is not new.
     927\end{comment}
     928
    933929
    934930\section{Evaluation}
    935931
    936932\TODO{Magnus suggests we need some graphs, it's kind of a done thing that the reviewers will be looking for. Also, we've made some unsubstantiated claims about the runtime performance of \CFA, which some micro-benchmarks could help with. I'm thinking a simple stack push and pop, with an idiomatic \lstinline@void*@, \CFA, \CC template and \CC virtual inheritance versions (the void* and virtual inheritance versions likely need to be linked lists, or clumsy in their API -- possibly both versions) to test generics, and variadic print to test tuples. We measure SLOC, runtime performance, executable size (making sure to include benchmarks for multiple types in the executable), and possibly manually count the number of places where the programmer must provide un-type-checked type information. Appendices don't count against our page limit, so we might want to include the source code for the benchmarks (or at least the relevant implementation details) in one.}
     933
    937934
    938935\section{Related Work}
Note: See TracChangeset for help on using the changeset viewer.