# Changeset 19518e8

Ignore:
Timestamp:
Apr 12, 2017, 10:57:36 PM (6 years ago)
Branches:
ADT, aaron-thesis, arm-eh, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
ffc9f5a
Parents:
50b7e8c
Message:

more shortening tuple section

File:
1 edited

### Legend:

Unmodified
 r50b7e8c For example, in \begin{lstlisting} [int, int, int] f(); [int, [int, int], int] g(); ([int, double])f();           $\C{// (1)}$ ([int, int, int])g();         $\C{// (2)}$ ([void, [int, int]])g();      $\C{// (3)}$ ([int, int, int, int])g();    $\C{// (4)}$ ([int, [int, int, int]])g();  $\C{// (5)}$ [int, int, int] f(); [int, [int, int], int] g(); ([int, double])f();           $\C{// (1)}$ ([int, int, int])g();         $\C{// (2)}$ ([void, [int, int]])g();      $\C{// (3)}$ ([int, int, int, int])g();    $\C{// (4)}$ ([int, [int, int, int]])g();  $\C{// (5)}$ \end{lstlisting} \subsection{Variadic Tuples} To define variadic functions, \CFA adds a new kind of type parameter, @ttype@. Matching against a @ttype@ (tuple type'') parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types. In a given parameter list, there should be at most one @ttype@ parameter that must occur last, otherwise the call can never resolve, given the previous rule. This idea essentially matches normal variadic semantics, with a strong feeling of similarity to \CCeleven variadic templates. As such, @ttype@ variables are also referred to as \emph{argument} or \emph{parameter packs} in this paper. Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is through recursion. Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful. Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled. For example, the C @sum@ function at the beginning of Section~\ref{sec:tuples} could be written using @ttype@ as: \begin{lstlisting} int sum(){ return 0; }        // (0) forall(ttype Params | { int sum(Params); }) int sum(int x, Params rest) { // (1) return x+sum(rest); } sum(10, 20, 30); \end{lstlisting} Since (0) does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@. In order to call (1), @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list and @Params@ is bound to @[20, 30]@. In order to finish the resolution of @sum@, an assertion parameter that matches @int sum(int, int)@ is required. Like in the previous iteration, (0) is not a valid candidate, so (1) is examined with @Params@ bound to @[int]@, requiring the assertion @int sum(int)@. Next, (0) fails, and to satisfy (1) @Params@ is bound to @[]@, requiring an assertion @int sum()@. Finally, (0) matches and (1) fails, which terminates the recursion. Effectively, this algorithm traces as @sum(10, 20, 30)@ $\rightarrow$ @10+sum(20, 30)@ $\rightarrow$ @10+(20+sum(30))@ $\rightarrow$ @10+(20+(30+sum()))@ $\rightarrow$ @10+(20+(30+0))@. As a point of note, this version does not require any form of argument descriptor, since the \CFA type system keeps track of all of these details. It might be reasonable to take the @sum@ function a step further to enforce a minimum number of arguments: \begin{lstlisting} int sum(int x, int y){ return x+y; } forall(ttype Params | { int sum(int, Params); }) int sum(int x, int y, Params rest) { return sum(x+y, rest); } \end{lstlisting} One more iteration permits the summation of any summable type, as long as all arguments are the same type: To define variadic functions, \CFA adds a new kind of type parameter, @ttype@ (tuple type). Matching against a @ttype@ parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types. In a given parameter list, there must be at most one @ttype@ parameter that occurs last, which matches normal variadic semantics, with a strong feeling of similarity to \CCeleven variadic templates. As such, @ttype@ variables are also called \emph{argument packs}. Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is via recursion. Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful. Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled. For example, a generalized @sum@ function written using @ttype@: \begin{lstlisting} int sum$$$_0$$$() { return 0; } forall(ttype Params | { int sum( Params ); } ) int sum$$$_1$$$( int x, Params rest ) { return x + sum( rest ); } sum( 10, 20, 30 ); \end{lstlisting} Since @sum@$$_0$$ does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@. In order to call @sum@$$_1$$, @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list and @Params@ is bound to @[20, 30]@. The process continues, @Params@ is bound to @[]@, requiring an assertion @int sum()@, which matchs @sum@$$_0$$ and terminates the recursion. Effectively, this algorithm traces as @sum(10, 20, 30)@ $\rightarrow$ @10 + sum(20, 30)@ $\rightarrow$ @10 + (20 + sum(30))@ $\rightarrow$ @10 + (20 + (30 + sum()))@ $\rightarrow$ @10 + (20 + (30 + 0))@. It is reasonable to take the @sum@ function a step further to enforce a minimum number of arguments: \begin{lstlisting} int sum( int x, int y ) { return x + y; } forall(ttype Params | { int sum( int, Params ); } ) int sum( int x, int y, Params rest ) { return sum( x + y, rest ); } \end{lstlisting} One more step permits the summation of any summable type with all arguments of the same type: \begin{lstlisting} trait summable(otype T) { T ?+?(T, T); T ?+?( T, T ); }; forall(otype R | summable(R)) R sum(R x, R y){ return x+y; } forall(otype R, ttype Params | summable(R) | { R sum(R, Params); }) R sum(R x, R y, Params rest) { return sum(x+y, rest); } \end{lstlisting} Unlike C, it is not necessary to hard code the expected type. This code is naturally open to extension, in that any user-defined type with a @?+?@ operator is automatically able to be used with the @sum@ function. That is to say, the programmer who writes @sum@ does not need full program knowledge of every possible data type, unlike what is necessary to write an equivalent function using the standard C mechanisms. Summing arbitrary heterogeneous lists is possible with similar code by adding the appropriate type variables and addition operators. It is also possible to write a type-safe variadic print function which can replace @printf@: forall(otype R | summable( R ) ) R sum( R x, R y ) { return x + y; } forall(otype R, ttype Params | summable(R) | { R sum(R, Params); } ) R sum(R x, R y, Params rest) { return sum( x + y, rest ); } \end{lstlisting} Unlike C variadic functions, it is unnecessary to hard code the number and expected types. Furthermore, this code is extendable so any user-defined type with a @?+?@ operator. Summing arbitrary heterogeneous lists is possible with similar code by adding the appropriate type variables and addition operators. It is also possible to write a type-safe variadic print function to replace @printf@: \begin{lstlisting} struct S { int x, y; }; forall(otype T, ttype Params | { void print(T); void print(Params); }) void print(T arg, Params rest) { print(arg); print(rest); } void print(char * x) { printf("%s", x); } void print(int x) { printf("%d", x);  } void print(S s) { print("{ ", s.x, ",", s.y, " }"); } print("s = ", (S){ 1, 2 }, "\n"); \end{lstlisting} This example function showcases a variadic-template-like decomposition of the provided argument list. The individual @print@ functions allow printing a single element of a type. The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function. The individual print functions can be used to build up more complicated @print@ functions, such as for @S@, which is something that cannot be done with @printf@ in C. It is also possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions. For example, it is possible to write @new@ as a library function: \begin{lstlisting} struct pair(otype R, otype S); forall(otype R, otype S) void ?{}(pair(R, S) *, R, S);  // (1) forall(dtype T, ttype Params | sized(T) | { void ?{}(T *, Params); }) T * new(Params p) { return ((T*)malloc( sizeof(T) )){ p }; // construct into result of malloc } pair(int, char) * x = new(42, '!'); \end{lstlisting} The @new@ function provides the combination of type-safe @malloc@ with a constructor call, so that it becomes impossible to forget to construct dynamically allocated objects. This function provides the type-safety of @new@ in \CC, without the need to specify the allocated type again, thanks to return-type inference. In the call to @new@, @pair(double, char)@ is selected to match @T@, and @Params@ is expanded to match @[double, char]@. The constructor (1) may be specialized to  satisfy the assertion for a constructor with an interface compatible with @void ?{}(pair(int, char) *, int, char)@. forall(otype T, ttype Params | { void print(T); void print(Params); }) void print(T arg, Params rest) { print(arg); print(rest); } void print( char * x ) { printf( "%s", x ); } void print( int x ) { printf( "%d", x ); } void print( S s ) { print( "{ ", s.x, ",", s.y, " }" ); } print( "s = ", (S){ 1, 2 }, "\n" ); \end{lstlisting} This example showcases a variadic-template-like decomposition of the provided argument list. The individual @print@ functions allow printing a single element of a type. The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function. The individual print functions can be used to build up more complicated @print@ functions, such as for @S@, which is something that cannot be done with @printf@ in C. Finally, it is possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions. For example, it is possible to write @new@ as a library function: \begin{lstlisting} struct pair( otype R, otype S ); forall( otype R, otype S ) void ?{}( pair(R, S) *, R, S );  // (1) forall( dtype T, ttype Params | sized(T) | { void ?{}( T *, Params ); } ) T * new( Params p ) { return ((T*)malloc( sizeof(T) )){ p }; // construct into result of malloc } pair( int, char ) * x = new( 42, '!' ); \end{lstlisting} The @new@ function provides the combination of type-safe @malloc@ with a \CFA constructor call, making it impossible to forget constructing dynamically allocated objects. This function provides the type-safety of @new@ in \CC, without the need to specify the allocated type again, thanks to return-type inference. % In the call to @new@, @pair(double, char)@ is selected to match @T@, and @Params@ is expanded to match @[double, char]@. The constructor (1) may be specialized to  satisfy the assertion for a constructor with an interface compatible with @void ?{}(pair(int, char) *, int, char)@. \subsection{Implementation} Tuples are implemented in the \CFA translator via a transformation into generic types. For each $N$, the first time an $N$-tuple is seen in a scope a generic type with $N$ type parameters is generated. For example: Tuples are implemented in the \CFA translator via a transformation into generic types. For each $N$, the first time an $N$-tuple is seen in a scope a generic type with $N$ type parameters is generated. \eg: \begin{lstlisting} [int, int] f() { [double, double] x; [int, double, int] y; } \end{lstlisting} Is transformed into: \begin{lstlisting} forall(dtype T0, dtype T1 | sized(T0) | sized(T1)) struct _tuple2 {  // generated before the first 2-tuple T0 field_0; T1 field_1; [double, double] x; [int, double, int] y; } \end{lstlisting} is transformed into: \begin{lstlisting} // generated before the first 2-tuple forall(dtype T0, dtype T1 | sized(T0) | sized(T1)) struct _tuple2 { T0 field_0; T1 field_1; }; _tuple2(int, int) f() { _tuple2(double, double) x; forall(dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2)) struct _tuple3 {  // generated before the first 3-tuple T0 field_0; T1 field_1; T2 field_2; }; _tuple3_(int, double, int) y; } \end{lstlisting} _tuple2(double, double) x; // generated before the first 3-tuple forall(dtype T0, dtype T1, dtype T2 | sized(T0) | sized(T1) | sized(T2)) struct _tuple3 { T0 field_0; T1 field_1; T2 field_2; }; _tuple3_(int, double, int) y; } \end{lstlisting} Tuple expressions are then simply converted directly into compound literals: \begin{lstlisting} [5, 'x', 1.24]; \end{lstlisting} Becomes: becomes: \begin{lstlisting} (_tuple3(int, char, double)){ 5, 'x', 1.24 }; \end{lstlisting} \begin{comment} Since tuples are essentially structures, tuple indexing expressions are just field accesses: \begin{lstlisting} [int, double] _unq0; g( (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).0, (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).1, (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).0, (_unq0_finished_ ? _unq0 : (_unq0 = f(), _unq0_finished_ = 1, _unq0)).1, ); \end{lstlisting} The various kinds of tuple assignment, constructors, and destructors generate GNU C statement expressions. A variable is generated to store the value produced by a statement expression, since its fields may need to be constructed with a non-trivial constructor and it may need to be referred to multiple time, \eg in a unique expression. The use of statement expressions allows the translator to arbitrarily generate additional temporary variables as needed, but binds the implementation to a non-standard extension of the C language. However, there are other places where the \CFA translator makes use of GNU C extensions, such as its use of nested functions, so this restriction is not new. \end{comment} \section{Evaluation} \TODO{Magnus suggests we need some graphs, it's kind of a done thing that the reviewers will be looking for. Also, we've made some unsubstantiated claims about the runtime performance of \CFA, which some micro-benchmarks could help with. I'm thinking a simple stack push and pop, with an idiomatic \lstinline@void*@, \CFA, \CC template and \CC virtual inheritance versions (the void* and virtual inheritance versions likely need to be linked lists, or clumsy in their API -- possibly both versions) to test generics, and variadic print to test tuples. We measure SLOC, runtime performance, executable size (making sure to include benchmarks for multiple types in the executable), and possibly manually count the number of places where the programmer must provide un-type-checked type information. Appendices don't count against our page limit, so we might want to include the source code for the benchmarks (or at least the relevant implementation details) in one.} \section{Related Work}