Changeset 19518e8
- Timestamp:
- Apr 12, 2017, 10:57:36 PM (8 years ago)
- Branches:
- ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
- Children:
- ffc9f5a
- Parents:
- 50b7e8c
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/generic_types/generic_types.tex
r50b7e8c r19518e8 709 709 For example, in 710 710 \begin{lstlisting} 711 712 713 714 715 716 717 718 711 [int, int, int] f(); 712 [int, [int, int], int] g(); 713 714 ([int, double])f(); $\C{// (1)}$ 715 ([int, int, int])g(); $\C{// (2)}$ 716 ([void, [int, int]])g(); $\C{// (3)}$ 717 ([int, int, int, int])g(); $\C{// (4)}$ 718 ([int, [int, int, int]])g(); $\C{// (5)}$ 719 719 \end{lstlisting} 720 720 … … 767 767 \subsection{Variadic Tuples} 768 768 769 To define variadic functions, \CFA adds a new kind of type parameter, @ttype@. Matching against a @ttype@ (``tuple type'') parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types. In a given parameter list, there should be at most one @ttype@ parameter that must occur last, otherwise the call can never resolve, given the previous rule. This idea essentially matches normal variadic semantics, with a strong feeling of similarity to \CCeleven variadic templates. As such, @ttype@ variables are also referred to as \emph{argument} or \emph{parameter packs} in this paper. 770 771 Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is through recursion. Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful. Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled. 772 773 For example, the C @sum@ function at the beginning of Section~\ref{sec:tuples} could be written using @ttype@ as: 774 \begin{lstlisting} 775 int sum(){ return 0; } // (0) 776 forall(ttype Params | { int sum(Params); }) 777 int sum(int x, Params rest) { // (1) 778 return x+sum(rest); 779 } 780 sum(10, 20, 30); 781 \end{lstlisting} 782 Since (0) does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@. 783 In order to call (1), @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list and @Params@ is bound to @[20, 30]@. 784 In order to finish the resolution of @sum@, an assertion parameter that matches @int sum(int, int)@ is required. 785 Like in the previous iteration, (0) is not a valid candidate, so (1) is examined with @Params@ bound to @[int]@, requiring the assertion @int sum(int)@. 786 Next, (0) fails, and to satisfy (1) @Params@ is bound to @[]@, requiring an assertion @int sum()@. 787 Finally, (0) matches and (1) fails, which terminates the recursion. 788 Effectively, this algorithm traces as @sum(10, 20, 30)@ $\rightarrow$ @10+sum(20, 30)@ $\rightarrow$ @10+(20+sum(30))@ $\rightarrow$ @10+(20+(30+sum()))@ $\rightarrow$ @10+(20+(30+0))@. 789 790 As a point of note, this version does not require any form of argument descriptor, since the \CFA type system keeps track of all of these details. It might be reasonable to take the @sum@ function a step further to enforce a minimum number of arguments: 791 \begin{lstlisting} 792 int sum(int x, int y){ 793 return x+y; 794 } 795 forall(ttype Params | { int sum(int, Params); }) 796 int sum(int x, int y, Params rest) { 797 return sum(x+y, rest); 798 } 799 \end{lstlisting} 800 801 One more iteration permits the summation of any summable type, as long as all arguments are the same type: 769 To define variadic functions, \CFA adds a new kind of type parameter, @ttype@ (tuple type). 770 Matching against a @ttype@ parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types. 771 In a given parameter list, there must be at most one @ttype@ parameter that occurs last, which matches normal variadic semantics, with a strong feeling of similarity to \CCeleven variadic templates. 772 As such, @ttype@ variables are also called \emph{argument packs}. 773 774 Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is via recursion. 775 Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful. 776 Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled. 777 For example, a generalized @sum@ function written using @ttype@: 778 \begin{lstlisting} 779 int sum$\(_0\)$() { return 0; } 780 forall(ttype Params | { int sum( Params ); } ) int sum$\(_1\)$( int x, Params rest ) { 781 return x + sum( rest ); 782 } 783 sum( 10, 20, 30 ); 784 \end{lstlisting} 785 Since @sum@\(_0\) does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@. 786 In order to call @sum@\(_1\), @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list and @Params@ is bound to @[20, 30]@. 787 The process continues, @Params@ is bound to @[]@, requiring an assertion @int sum()@, which matchs @sum@\(_0\) and terminates the recursion. 788 Effectively, this algorithm traces as @sum(10, 20, 30)@ $\rightarrow$ @10 + sum(20, 30)@ $\rightarrow$ @10 + (20 + sum(30))@ $\rightarrow$ @10 + (20 + (30 + sum()))@ $\rightarrow$ @10 + (20 + (30 + 0))@. 789 790 It is reasonable to take the @sum@ function a step further to enforce a minimum number of arguments: 791 \begin{lstlisting} 792 int sum( int x, int y ) { return x + y; } 793 forall(ttype Params | { int sum( int, Params ); } ) int sum( int x, int y, Params rest ) { 794 return sum( x + y, rest ); 795 } 796 \end{lstlisting} 797 One more step permits the summation of any summable type with all arguments of the same type: 802 798 \begin{lstlisting} 803 799 trait summable(otype T) { 804 T ?+?(T, T);800 T ?+?( T, T ); 805 801 }; 806 forall(otype R | summable(R)) 807 R sum(R x, R y){ 808 return x+y; 809 } 810 forall(otype R, ttype Params 811 | summable(R) 812 | { R sum(R, Params); }) 813 R sum(R x, R y, Params rest) { 814 return sum(x+y, rest); 815 } 816 \end{lstlisting} 817 Unlike C, it is not necessary to hard code the expected type. This code is naturally open to extension, in that any user-defined type with a @?+?@ operator is automatically able to be used with the @sum@ function. That is to say, the programmer who writes @sum@ does not need full program knowledge of every possible data type, unlike what is necessary to write an equivalent function using the standard C mechanisms. Summing arbitrary heterogeneous lists is possible with similar code by adding the appropriate type variables and addition operators. 818 819 It is also possible to write a type-safe variadic print function which can replace @printf@: 802 forall(otype R | summable( R ) ) R sum( R x, R y ) { 803 return x + y; 804 } 805 forall(otype R, ttype Params | summable(R) | { R sum(R, Params); } ) R sum(R x, R y, Params rest) { 806 return sum( x + y, rest ); 807 } 808 \end{lstlisting} 809 Unlike C variadic functions, it is unnecessary to hard code the number and expected types. 810 Furthermore, this code is extendable so any user-defined type with a @?+?@ operator. 811 Summing arbitrary heterogeneous lists is possible with similar code by adding the appropriate type variables and addition operators. 812 813 It is also possible to write a type-safe variadic print function to replace @printf@: 820 814 \begin{lstlisting} 821 815 struct S { int x, y; }; 822 forall(otype T, ttype Params | 823 { void print(T); void print(Params); }) 824 void print(T arg, Params rest) { 825 print(arg); 826 print(rest); 827 } 828 void print(char * x) { printf("%s", x); } 829 void print(int x) { printf("%d", x); } 830 void print(S s) { print("{ ", s.x, ",", s.y, " }"); } 831 832 print("s = ", (S){ 1, 2 }, "\n"); 833 \end{lstlisting} 834 This example function showcases a variadic-template-like decomposition of the provided argument list. The individual @print@ functions allow printing a single element of a type. The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function. The individual print functions can be used to build up more complicated @print@ functions, such as for @S@, which is something that cannot be done with @printf@ in C. 835 836 It is also possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions. For example, it is possible to write @new@ as a library function: 837 \begin{lstlisting} 838 struct pair(otype R, otype S); 839 forall(otype R, otype S) 840 void ?{}(pair(R, S) *, R, S); // (1) 841 842 forall(dtype T, ttype Params | sized(T) | { void ?{}(T *, Params); }) 843 T * new(Params p) { 844 return ((T*)malloc( sizeof(T) )){ p }; // construct into result of malloc 845 } 846 847 pair(int, char) * x = new(42, '!'); 848 \end{lstlisting} 849 The @new@ function provides the combination of type-safe @malloc@ with a constructor call, so that it becomes impossible to forget to construct dynamically allocated objects. This function provides the type-safety of @new@ in \CC, without the need to specify the allocated type again, thanks to return-type inference. 850 851 In the call to @new@, @pair(double, char)@ is selected to match @T@, and @Params@ is expanded to match @[double, char]@. The constructor (1) may be specialized to satisfy the assertion for a constructor with an interface compatible with @void ?{}(pair(int, char) *, int, char)@. 816 forall(otype T, ttype Params | { void print(T); void print(Params); }) void print(T arg, Params rest) { 817 print(arg); 818 print(rest); 819 } 820 void print( char * x ) { printf( "%s", x ); } 821 void print( int x ) { printf( "%d", x ); } 822 void print( S s ) { print( "{ ", s.x, ",", s.y, " }" ); } 823 print( "s = ", (S){ 1, 2 }, "\n" ); 824 \end{lstlisting} 825 This example showcases a variadic-template-like decomposition of the provided argument list. 826 The individual @print@ functions allow printing a single element of a type. 827 The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function. 828 The individual print functions can be used to build up more complicated @print@ functions, such as for @S@, which is something that cannot be done with @printf@ in C. 829 830 Finally, it is possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions. 831 For example, it is possible to write @new@ as a library function: 832 \begin{lstlisting} 833 struct pair( otype R, otype S ); 834 forall( otype R, otype S ) void ?{}( pair(R, S) *, R, S ); // (1) 835 forall( dtype T, ttype Params | sized(T) | { void ?{}( T *, Params ); } ) T * new( Params p ) { 836 return ((T*)malloc( sizeof(T) )){ p }; // construct into result of malloc 837 } 838 pair( int, char ) * x = new( 42, '!' ); 839 \end{lstlisting} 840 The @new@ function provides the combination of type-safe @malloc@ with a \CFA constructor call, making it impossible to forget constructing dynamically allocated objects. 841 This function provides the type-safety of @new@ in \CC, without the need to specify the allocated type again, thanks to return-type inference. 842 843 % In the call to @new@, @pair(double, char)@ is selected to match @T@, and @Params@ is expanded to match @[double, char]@. The constructor (1) may be specialized to satisfy the assertion for a constructor with an interface compatible with @void ?{}(pair(int, char) *, int, char)@. 844 852 845 853 846 \subsection{Implementation} 854 847 855 Tuples are implemented in the \CFA translator via a transformation into generic types. For each $N$, the first time an $N$-tuple is seen in a scope a generic type with $N$ type parameters is generated. For example: 848 Tuples are implemented in the \CFA translator via a transformation into generic types. 849 For each $N$, the first time an $N$-tuple is seen in a scope a generic type with $N$ type parameters is generated. \eg: 856 850 \begin{lstlisting} 857 851 [int, int] f() { 858 859 860 } 861 \end{lstlisting} 862 Is transformed into:863 \begin{lstlisting} 864 forall(dtype T0, dtype T1 | sized(T0) | sized(T1)) 865 struct _tuple2 { // generated before the first 2-tuple 866 867 852 [double, double] x; 853 [int, double, int] y; 854 } 855 \end{lstlisting} 856 is transformed into: 857 \begin{lstlisting} 858 // generated before the first 2-tuple 859 forall(dtype T0, dtype T1 | sized(T0) | sized(T1)) struct _tuple2 { 860 T0 field_0; 861 T1 field_1; 868 862 }; 869 863 _tuple2(int, int) f() { 870 _tuple2(double, double) x; 871 forall(dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2)) 872 struct _tuple3 { // generated before the first 3-tuple 873 T0 field_0; 874 T1 field_1; 875 T2 field_2; 876 }; 877 _tuple3_(int, double, int) y; 878 } 879 \end{lstlisting} 880 864 _tuple2(double, double) x; 865 // generated before the first 3-tuple 866 forall(dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2)) struct _tuple3 { 867 T0 field_0; 868 T1 field_1; 869 T2 field_2; 870 }; 871 _tuple3_(int, double, int) y; 872 } 873 \end{lstlisting} 881 874 Tuple expressions are then simply converted directly into compound literals: 882 875 \begin{lstlisting} 883 876 [5, 'x', 1.24]; 884 877 \end{lstlisting} 885 Becomes:878 becomes: 886 879 \begin{lstlisting} 887 880 (_tuple3(int, char, double)){ 5, 'x', 1.24 }; 888 881 \end{lstlisting} 889 882 883 \begin{comment} 890 884 Since tuples are essentially structures, tuple indexing expressions are just field accesses: 891 885 \begin{lstlisting} … … 922 916 [int, double] _unq0; 923 917 g( 924 925 918 (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).0, 919 (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).1, 926 920 ); 927 921 \end{lstlisting} … … 931 925 932 926 The various kinds of tuple assignment, constructors, and destructors generate GNU C statement expressions. A variable is generated to store the value produced by a statement expression, since its fields may need to be constructed with a non-trivial constructor and it may need to be referred to multiple time, \eg in a unique expression. The use of statement expressions allows the translator to arbitrarily generate additional temporary variables as needed, but binds the implementation to a non-standard extension of the C language. However, there are other places where the \CFA translator makes use of GNU C extensions, such as its use of nested functions, so this restriction is not new. 927 \end{comment} 928 933 929 934 930 \section{Evaluation} 935 931 936 932 \TODO{Magnus suggests we need some graphs, it's kind of a done thing that the reviewers will be looking for. Also, we've made some unsubstantiated claims about the runtime performance of \CFA, which some micro-benchmarks could help with. I'm thinking a simple stack push and pop, with an idiomatic \lstinline@void*@, \CFA, \CC template and \CC virtual inheritance versions (the void* and virtual inheritance versions likely need to be linked lists, or clumsy in their API -- possibly both versions) to test generics, and variadic print to test tuples. We measure SLOC, runtime performance, executable size (making sure to include benchmarks for multiple types in the executable), and possibly manually count the number of places where the programmer must provide un-type-checked type information. Appendices don't count against our page limit, so we might want to include the source code for the benchmarks (or at least the relevant implementation details) in one.} 933 937 934 938 935 \section{Related Work}
Note: See TracChangeset
for help on using the changeset viewer.