%======================================================================
\chapter{Variadic Functions}
%======================================================================

\section{Design Criteria} % TODO: better section name???
C provides variadic functions through the manipulation of @va_list@ objects.
A variadic function is one that contains at least one parameter, followed by @...@ as the last token in the parameter list.
Since the number and types of the variadic arguments are not part of the function's type, some form of \emph{argument descriptor} is needed to inform the function of the number of arguments and their types.
Two common argument descriptors are format strings and counter parameters.
Both of these mechanisms are inherently redundant, because they require the user to explicitly specify information that the compiler already knows.
This required repetition is error-prone, because it is easy for the user to add or remove arguments without updating the argument descriptor.
In addition, C requires the programmer to hard code all of the possible expected types.
As a result, it is cumbersome to write a function that is open to extension.
For example, consider a simple function that sums $N$ @int@s,
\begin{cfacode}
#include <stdarg.h>  // va_list, va_start, va_arg, va_end
int sum(int N, ...) {
    va_list args;
    va_start(args, N);
    int ret = 0;
    while(N) {
        ret += va_arg(args, int);  // have to specify type
        N--;
    }
    va_end(args);
    return ret;
}
sum(3, 10, 20, 30);  // need to keep counter in sync
\end{cfacode}
The @va_list@ type is a special C data type that abstracts variadic argument manipulation.
The @va_start@ macro initializes a @va_list@, given the last named parameter.
Each use of the @va_arg@ macro allows access to the next variadic argument, given a type.
Since the function signature does not provide any information on what types can be passed to a variadic function, the compiler does not perform any error checks on a variadic call.
As such, it is possible to pass any value to the @sum@ function, including pointers, floating-point numbers, and structures.
In the case where the provided type is not compatible with the argument's actual type after default argument promotions, or if too many arguments are accessed, the behaviour is undefined \cite[p.~81]{C11}.
Furthermore, there is no way to perform the necessary error checks in the @sum@ function at run-time, since type information is not carried into the function body.
Since they rely on programmer convention rather than compile-time checks, variadic functions are generally unsafe.

In practice, compilers can provide warnings to help mitigate some of the problems.
For example, GCC provides the @format@ attribute to specify that a function uses a format string, which allows the compiler to perform some checks related to the standard format specifiers.
Unfortunately, this approach does not permit extensions to the format string syntax, so a programmer cannot extend the attribute to warn for mismatches with custom types.
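For instance, the @format@ attribute can be applied to a @printf@-style wrapper so that GCC checks the variadic arguments against the format string.
The @log_msg@ routine below is a hypothetical example, not part of any library; the attribute arguments state that parameter 2 is the format string and that checking of the variadic arguments starts at parameter 3.
\begin{cfacode}
// hypothetical printf-style wrapper, for illustration only
__attribute__((format(printf, 2, 3)))
void log_msg(int level, const char * fmt, ...);

log_msg(1, "%s: %d\n", "x", 10);  // ok: arguments match the format string
log_msg(1, "%d\n", "x");          // GCC warning: %d expects an int
\end{cfacode}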
As a result, C's variadic functions are a deficient language feature.
Two options were examined to provide better, type-safe variadic functions in \CFA.

\subsection{Whole Tuple Matching}
Option 1 is to change the argument matching algorithm so that type parameters can match whole tuples, rather than just their components.
This option could be implemented with two phases of argument matching when a function contains type parameters and the argument list contains tuple arguments.
If flattening and structuring fail to produce a match, a second attempt at matching the function and argument combination is made, where tuple arguments are not expanded and structure must match exactly, modulo non-tuple implicit conversions.
For example:
\begin{cfacode}
forall(otype T, otype U | { T g(U); }) void f(T, U);

[int, int] g([int, int, int, int]);
f([1, 2], [3, 4, 5, 6]);
\end{cfacode}
With flattening and structuring, the call is first transformed into @f(1, 2, 3, 4, 5, 6)@.
Since the first argument of type @T@ does not have a tuple type, unification decides that @T=int@ and @1@ is matched as the first parameter.
Likewise, @U@ does not have a tuple type, so @U=int@ and @2@ is accepted as the second parameter.
There are now no remaining formal parameters, but there are remaining arguments and the function is not variadic, so the match fails.

With the addition of an exact matching attempt, @T=[int, int]@ and @U=[int, int, int, int]@, and so the arguments type check.
Likewise, when inferring assertion @g@, an exact match is found.
By nature, this approach is strict with respect to argument structure, which makes it syntactically awkward to use in ways that the existing tuple design is not.
For example, consider a @new@ function that allocates memory using @malloc@ and constructs the result using arbitrary arguments.
\begin{cfacode}
struct Array;
void ?{}(Array *, int, int, int);

forall(dtype T, otype Params | sized(T) | { void ?{}(T *, Params); })
T * new(Params p) {
    return malloc(){ p };
}
Array * x = new([1, 2, 3]);
\end{cfacode}
The call to @new@ is not particularly appealing, since it requires the use of square brackets at the call site, which is not required in any other function call.
This shifts the burden from the compiler to the programmer, which is almost always the wrong choice, and creates an odd inconsistency within the language.
Similarly, in order to pass 0 variadic arguments, an explicit empty tuple must be passed into the argument list, otherwise the exact matching rule would not have an argument to bind against.

It should be noted that the addition of an exact matching rule only affects the outcome of polymorphic type binding when tuples are involved.
For non-tuple arguments, exact matching and flattening and structuring are equivalent.
For tuple arguments to a function without polymorphic formal parameters, flattening and structuring work whenever an exact match would have worked, since the tuple is flattened and implicitly restructured to its original structure.
Thus there is nothing to be gained from permitting the exact matching rule to take effect when a function does not contain polymorphism or none of the arguments are tuples.
Overall, this option takes a step in the right direction, but it is contrary to the flexibility of the existing tuple design.

\subsection{A New Typeclass}
A second option is the addition of another kind of type parameter, @ttype@.
Matching against a @ttype@ parameter consumes all remaining argument components and packages them into a tuple, binding to the resulting tuple of types.
In a given parameter list, there should be at most one @ttype@ parameter, and it must occur last; otherwise, given the previous rule, the call can never resolve.
This idea essentially matches normal variadic semantics and is strongly reminiscent of \CCeleven variadic templates.
As such, @ttype@ variables are also referred to as argument packs.
This approach is the option that has been added to \CFA.
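As a brief illustration of the binding rule, consider the following hypothetical declaration; the routine @f@ exists only for this example.
\begin{cfacode}
forall(otype T, ttype Params) void f(T arg, Params rest);

f(5.0, 1, 2);  // T = double, Params = [int, int]
f(0);          // T = int, Params = []
\end{cfacode}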
Like variadic templates, the main way to manipulate @ttype@ polymorphic functions is through recursion.
Since nothing is known about a parameter pack by default, assertion parameters are key to doing anything meaningful.
Unlike variadic templates, @ttype@ polymorphic functions can be separately compiled.
For example, a simple translation of the C @sum@ function using @ttype@ is
\begin{cfacode}
int sum(void) { return 0; }     // (0)
forall(ttype Params | { int sum(Params); })
int sum(int x, Params rest) {   // (1)
    return x+sum(rest);
}
sum(10, 20, 30);
\end{cfacode}
Since (0) does not accept any arguments, it is not a valid candidate function for the call @sum(10, 20, 30)@.
In order to call (1), @10@ is matched with @x@, and the argument resolution moves on to the argument pack @rest@, which consumes the remainder of the argument list; @Params@ is bound to @[int, int]@.
In order to finish the resolution of @sum@, an assertion parameter that matches @int sum(int, int)@ is required.
As in the previous iteration, (0) is not a valid candidate, so (1) is examined with @Params@ bound to @[int]@, requiring the assertion @int sum(int)@.
Next, (0) fails again, and to satisfy (1), @Params@ is bound to @[]@, requiring an assertion @int sum()@.
Finally, (0) matches and (1) fails, which terminates the recursion.
Effectively, this call traces as @sum(10, 20, 30)@ $\rightarrow$ @10+sum(20, 30)@ $\rightarrow$ @10+(20+sum(30))@ $\rightarrow$ @10+(20+(30+sum()))@ $\rightarrow$ @10+(20+(30+0))@.
Interestingly, this version does not require any form of argument descriptor, since the \CFA type system keeps track of all of these details.

It might be reasonable to take the @sum@ function a step further to enforce a minimum number of arguments, which can be done simply
\begin{cfacode}
int sum(int x, int y) { return x+y; }
forall(ttype Params | { int sum(int, Params); })
int sum(int x, int y, Params rest) {
    return sum(x+y, rest);
}
sum(10);          // invalid
sum(10, 20);      // valid
sum(10, 20, 30);  // valid
...
\end{cfacode}
One more iteration permits the summation of any summable type, as long as all arguments are the same type.
\begin{cfacode}
trait summable(otype T) {
    T ?+?(T, T);
};
forall(otype R | summable(R))
R sum(R x, R y) {
    return x+y;
}
forall(otype R, ttype Params | summable(R) | { R sum(R, Params); })
R sum(R x, R y, Params rest) {
    return sum(x+y, rest);
}
sum(3, 10, 20, 30);
\end{cfacode}
Unlike C, it is not necessary to hard code the expected type.
This @sum@ function is naturally open to extension, in that any user-defined type with a @?+?@ operator can automatically be used with the @sum@ function.
That is to say, the programmer who writes @sum@ does not need full program knowledge of every possible data type, unlike what is necessary to write an equivalent function using the standard C mechanisms.
Going one last step, it is possible to achieve full generality in \CFA, allowing the summation of arbitrary lists of summable types.
\begin{cfacode}
trait summable(otype T1, otype T2, otype R) {
    R ?+?(T1, T2);
};
forall(otype T1, otype T2, otype R | summable(T1, T2, R))
R sum(T1 x, T2 y) {
    return x+y;
}
forall(otype T1, otype T2, otype T3, ttype Params, otype R
    | summable(T1, T2, T3) | { R sum(T3, Params); })
R sum(T1 x, T2 y, Params rest) {
    return sum(x+y, rest);
}
sum(3, 10.5, 20, 30.3);
\end{cfacode}
The \CFA translator requires adding explicit @double ?+?(int, double)@ and @double ?+?(double, int)@ functions for this call to work, since implicit conversions are not supported for assertions.
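For example, definitions along the following lines are sufficient to satisfy the assertions for the call @sum(3, 10.5, 20, 30.3)@; they are a minimal sketch, not part of the \CFA prelude, and simply promote the @int@ operand so the built-in @double@ addition applies.
\begin{cfacode}
// promote the int operand and reuse the built-in double addition
double ?+?(int x, double y) { return (double)x + y; }
double ?+?(double x, int y) { return x + (double)y; }
\end{cfacode}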
A notable limitation of this approach is that it heavily relies on recursive assertions.
The \CFA translator imposes a limit on the depth of recursion for assertion satisfaction.
Currently, the limit is set to 4, which means that the first version of the @sum@ function is limited to at most 5 arguments, while the second version can support up to 6 arguments.
The limit is set low due to inefficiencies in the current implementation of the \CFA expression resolver.
There is ongoing work to improve the performance of the resolver, and with noticeable gains, the limit can be relaxed to allow longer argument lists to @ttype@ functions.

C variadic syntax and @ttype@ polymorphism probably should not be mixed, since it is not clear where to draw the line to decide which arguments belong where.
Furthermore, it might be desirable to disallow polymorphic functions from using C variadic syntax, to encourage a \CFA style.
Aside from calling C variadic functions, it is not obvious that there is anything that can be done with C variadics that could not also be done with @ttype@ parameters.

Variadic templates in \CC require an ellipsis token to express that a parameter is a parameter pack and to expand a parameter pack.
\CFA does not need an ellipsis in either case, since the type class @ttype@ is only used for variadics.
An alternative design is to use an ellipsis combined with an existing type class.
This approach was not taken because the largest benefit of the ellipsis token in \CC is the ability to expand a parameter pack within an expression, e.g., in fold expressions, which requires compile-time knowledge of the structure of the parameter pack, and this information is not available in \CFA.
\begin{cppcode}
template<typename... Args>
void f(Args &... args) {
    g(&args...);  // expand to addresses of pack elements
}
\end{cppcode}
As such, the addition of an ellipsis token would be purely an aesthetic change in \CFA today.

It is possible to write a type-safe variadic print routine, which can replace @printf@
\begin{cfacode}
struct S { int x, y; };
forall(otype T, ttype Params | { void print(T); void print(Params); })
void print(T arg, Params rest) {
    print(arg);
    print(rest);
}
void print(char * x) { printf("%s", x); }
void print(int x) { printf("%d", x); }
void print(S s) { print("{ ", s.x, ",", s.y, " }"); }

print("s = ", (S){ 1, 2 }, "\n");
\end{cfacode}
This example routine showcases a variadic-template-like decomposition of the provided argument list.
The individual @print@ routines allow printing a single element of a type.
The polymorphic @print@ allows printing any list of types, as long as each individual type has a @print@ function.
The individual print functions can be used to build up more complicated @print@ routines, such as the one for @S@, which is something that cannot be done with @printf@ in C.

It is also possible to use @ttype@ polymorphism to provide arbitrary argument forwarding functions.
For example, it is possible to write @new@ as a library function.
\begin{cfacode}
struct Array;
void ?{}(Array *, int, int, int);

forall(dtype T, ttype Params | sized(T) | { void ?{}(T *, Params); })
T * new(Params p) {
    return malloc(){ p };  // construct result of malloc
}
Array * x = new(1, 2, 3);
\end{cfacode}
The @new@ function provides the combination of type-safe @malloc@ with a constructor call, so that it becomes impossible to forget to construct dynamically allocated objects.
This approach provides the type-safety of @new@ in \CC, without the need to specify the allocated type, thanks to return-type inference.
In the call to @new@, @Array@ is selected to match @T@, and @Params@ is expanded to match @[int, int, int]@.
To satisfy the assertions, a constructor with an interface compatible with @void ?{}(Array *, int, int, int)@ must exist in the current scope.
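Because the argument pack simply forwards to whichever constructor is visible for the allocated type, the same @new@ routine works unchanged for other types.
The @Pair@ type below is a hypothetical example, included only to illustrate the forwarding.
\begin{cfacode}
struct Pair { int first, second; };
void ?{}(Pair * p, int f, int s) { p->first = f; p->second = s; }

Pair * p = new(4, 5);  // T = Pair, Params = [int, int]
\end{cfacode}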
\section{Implementation}
The definition of @new@
\begin{cfacode}
forall(dtype T | sized(T)) T * malloc();

forall(dtype T, ttype Params | sized(T) | { void ?{}(T *, Params); })
T * new(Params p) {
    return malloc(){ p };  // construct result of malloc
}
\end{cfacode}
generates the following
\begin{cfacode}
void *malloc(long unsigned int _sizeof_T, long unsigned int _alignof_T);

void *new(
    void (*_adapter_)(void (*)(), void *, void *),
    long unsigned int _sizeof_T,
    long unsigned int _alignof_T,
    long unsigned int _sizeof_Params,
    long unsigned int _alignof_Params,
    void (* _ctor_T)(void *, void *),
    void *p
){
    void *_retval_new;
    void *_tmp_cp_ret0;
    void *_tmp_ctor_expr0;
    _retval_new=
        (_adapter_(_ctor_T,
            (_tmp_ctor_expr0=(_tmp_cp_ret0=malloc(_sizeof_2tT, _alignof_2tT),
                _tmp_cp_ret0)),
            p), _tmp_ctor_expr0);  // ?{}
    *(void **)&_tmp_cp_ret0;  // ^?{}
    return _retval_new;
}
\end{cfacode}
The constructor for @T@ is called indirectly through the adapter function on the result of @malloc@ and the parameter pack.
The variable that was allocated and constructed is then returned from @new@.

A call to @new@
\begin{cfacode}
struct S { int x, y; };
void ?{}(S *, int, int);

S * s = new(3, 4);
\end{cfacode}
generates the following
\begin{cfacode}
struct _tuple2_ {  // _tuple2_(T0, T1)
    void *field_0;
    void *field_1;
};
struct _conc__tuple2_0 {  // _tuple2_(int, int)
    int field_0;
    int field_1;
};
struct _conc__tuple2_0 _tmp_cp1;  // tuple argument to new
struct S *_tmp_cp_ret1;           // return value from new
void _thunk0(                     // ?{}(S *, [int, int])
    struct S *_p0,
    struct _conc__tuple2_0 _p1
){
    _ctor_S(_p0, _p1.field_0, _p1.field_1);  // restructure tuple parameter
}
void _adapter(void (*_adaptee)(), void *_p0, void *_p1){
    // apply adaptee to arguments after casting to actual types
    ((void (*)(struct S *, struct _conc__tuple2_0))_adaptee)(
        _p0,
        *(struct _conc__tuple2_0 *)_p1
    );
}
struct S *s = (struct S *)(_tmp_cp_ret1=
    new(
        _adapter,
        sizeof(struct S),
        __alignof__(struct S),
        sizeof(struct _conc__tuple2_0),
        __alignof__(struct _conc__tuple2_0),
        (void (*)(void *, void *))&_thunk0,
        (({  // copy construct tuple argument to new
            int *__multassign_L0 = (int *)&_tmp_cp1.field_0;
            int *__multassign_L1 = (int *)&_tmp_cp1.field_1;
            int __multassign_R0 = 3;
            int __multassign_R1 = 4;
            ((*__multassign_L0=__multassign_R0 /* ?{} */),
             (*__multassign_L1=__multassign_R1 /* ?{} */));
        }), &_tmp_cp1)
    ), _tmp_cp_ret1);
*(struct S **)&_tmp_cp_ret1;  // ^?{} destroy return value from new
({  // destroy argument temporary
    int *__massassign_L0 = (int *)&_tmp_cp1.field_0;
    int *__massassign_L1 = (int *)&_tmp_cp1.field_1;
    ((*__massassign_L0 /* ^?{} */), (*__massassign_L1 /* ^?{} */));
});
\end{cfacode}
Of note, @_thunk0@ is generated to translate calls to @?{}(S *, [int, int])@ into calls to @?{}(S *, int, int)@.
The call to @new@ constructs a tuple argument using the supplied arguments.
The @print@ function
\begin{cfacode}
forall(otype T, ttype Params | { void print(T); void print(Params); })
void print(T arg, Params rest) {
    print(arg);
    print(rest);
}
\end{cfacode}
generates
\begin{cfacode}
void print_variadic(
    void (*_adapterF_7tParams__P)(void (*)(), void *),
    void (*_adapterF_2tT__P)(void (*)(), void *),
    void (*_adapterF_P2tT2tT__MP)(void (*)(), void *, void *),
    void (*_adapterF2tT_P2tT2tT_P_MP)(void (*)(), void *, void *, void *),
    long unsigned int _sizeof_T,
    long unsigned int _alignof_T,
    long unsigned int _sizeof_Params,
    long unsigned int _alignof_Params,
    void *(*_assign_TT)(void *, void *),
    void (*_ctor_T)(void *),
    void (*_ctor_TT)(void *, void *),
    void (*_dtor_T)(void *),
    void (*print_T)(void *),
    void (*print_Params)(void *),
    void *arg,
    void *rest
){
    void *_tmp_cp0 = __builtin_alloca(_sizeof_T);
    _adapterF_2tT__P(  // print(arg)
        ((void (*)())print_T),
        (_adapterF_P2tT2tT__MP(  // copy construct argument
            ((void (*)())_ctor_TT),
            _tmp_cp0,
            arg
        ), _tmp_cp0)
    );
    _dtor_T(_tmp_cp0);  // destroy argument temporary
    _adapterF_7tParams__P(  // print(rest)
        ((void (*)())print_Params),
        rest
    );
}
\end{cfacode}
The @print_T@ routine is called indirectly through an adapter function with a copy-constructed argument, followed by an indirect call to @print_Params@.

A call to @print@
\begin{cfacode}
void print(const char * x) { printf("%s", x); }
void print(int x) { printf("%d", x); }

print("x = ", 123, ".\n");
\end{cfacode}
generates the following
\begin{cfacode}
void print_string(const char *x){
    int _tmp_cp_ret0;
    (_tmp_cp_ret0=printf("%s", x)) , _tmp_cp_ret0;
    *(int *)&_tmp_cp_ret0;  // ^?{}
}
void print_int(int x){
    int _tmp_cp_ret1;
    (_tmp_cp_ret1=printf("%d", x)) , _tmp_cp_ret1;
    *(int *)&_tmp_cp_ret1;  // ^?{}
}
struct _tuple2_ {  // _tuple2_(T0, T1)
    void *field_0;
    void *field_1;
};
struct _conc__tuple2_0 {  // _tuple2_(int, const char *)
    int field_0;
    const char *field_1;
};
struct _conc__tuple2_0 _tmp_cp6;  // _tuple2_(int, const char *)
const char *_thunk0(const char **_p0, const char *_p1){
    // const char * ?=?(const char **, const char *)
    return *_p0=_p1;
}
void _thunk1(const char **_p0){  // void ?{}(const char **)
    *_p0;  // ?{}
}
void _thunk2(const char **_p0, const char *_p1){
    // void ?{}(const char **, const char *)
    *_p0=_p1;  // ?{}
}
void _thunk3(const char **_p0){  // void ^?{}(const char **)
    *_p0;  // ^?{}
}
void _thunk4(struct _conc__tuple2_0 _p0){  // void print([int, const char *])
    struct _tuple1_ {  // _tuple1_(T0)
        void *field_0;
    };
    struct _conc__tuple1_1 {  // _tuple1_(const char *)
        const char *field_0;
    };
    void _thunk5(struct _conc__tuple1_1 _pp0){  // void print([const char *])
        print_string(_pp0.field_0);  // print(rest.0)
    }
    void _adapter_i_pii_(void (*_adaptee)(), void *_ret, void *_p0, void *_p1){
        *(int *)_ret=((int (*)(int *, int))_adaptee)(_p0, *(int *)_p1);
    }
    void _adapter_pii_(void (*_adaptee)(), void *_p0, void *_p1){
        ((void (*)(int *, int))_adaptee)(_p0, *(int *)_p1);
    }
    void _adapter_i_(void (*_adaptee)(), void *_p0){
        ((void (*)(int))_adaptee)(*(int *)_p0);
    }
    void _adapter_tuple1_5_(void (*_adaptee)(), void *_p0){
        ((void (*)(struct _conc__tuple1_1))_adaptee)(*(struct _conc__tuple1_1 *)_p0);
    }
    print_variadic(
        _adapter_tuple1_5,
        _adapter_i_,
        _adapter_pii_,
        _adapter_i_pii_,
        sizeof(int),
        __alignof__(int),
        sizeof(struct _conc__tuple1_1),
        __alignof__(struct _conc__tuple1_1),
        (void *(*)(void *, void *))_assign_i,  // int ?=?(int *, int)
        (void (*)(void *))_ctor_i,             // void ?{}(int *)
        (void (*)(void *, void *))_ctor_ii,    // void ?{}(int *, int)
        (void (*)(void *))_dtor_ii,            // void ^?{}(int *)
        (void (*)(void *))print_int,           // void print(int)
        (void (*)(void *))&_thunk5,            // void print([const char *])
        &_p0.field_0,                          // rest.0
        &(struct _conc__tuple1_1){ _p0.field_1 }  // [rest.1]
    );
}
struct _tuple1_ {  // _tuple1_(T0)
    void *field_0;
};
struct _conc__tuple1_6 {  // _tuple1_(const char *)
    const char *field_0;
};
const char *_temp0;
_temp0="x = ";
void _adapter_pstring_pstring_string(
    void (*_adaptee)(), void *_ret, void *_p0, void *_p1
){
    *(const char **)_ret=
        ((const char *(*)(const char **, const char *))_adaptee)(
            _p0,
            *(const char **)_p1
        );
}
void _adapter_pstring_string(void (*_adaptee)(), void *_p0, void *_p1){
    ((void (*)(const char **, const char *))_adaptee)(_p0, *(const char **)_p1);
}
void _adapter_string_(void (*_adaptee)(), void *_p0){
    ((void (*)(const char *))_adaptee)(*(const char **)_p0);
}
void _adapter_tuple2_0_(void (*_adaptee)(), void *_p0){
    ((void (*)(struct _conc__tuple2_0))_adaptee)(*(struct _conc__tuple2_0 *)_p0);
}
print_variadic(
    _adapter_tuple2_0_,
    _adapter_string_,
    _adapter_pstring_string_,
    _adapter_pstring_pstring_string_,
    sizeof(const char *),
    __alignof__(const char *),
    sizeof(struct _conc__tuple2_0),
    __alignof__(struct _conc__tuple2_0),
    (void *(*)(void *, void *))&_thunk0,  // const char * ?=?(const char **, const char *)
    (void (*)(void *))&_thunk1,           // void ?{}(const char **)
    (void (*)(void *, void *))&_thunk2,   // void ?{}(const char **, const char *)
    (void (*)(void *))&_thunk3,           // void ^?{}(const char **)
    (void (*)(void *))print_string,       // void print(const char *)
    (void (*)(void *))&_thunk4,           // void print([int, const char *])
    &_temp0,                              // "x = "
    (({  // copy construct tuple argument to print
        int *__multassign_L0 = (int *)&_tmp_cp6.field_0;
        const char **__multassign_L1 = (const char **)&_tmp_cp6.field_1;
        int __multassign_R0 = 123;
        const char *__multassign_R1 = ".\n";
        ((*__multassign_L0=__multassign_R0 /* ?{} */),
         (*__multassign_L1=__multassign_R1 /* ?{} */));
    }), &_tmp_cp6)  // [123, ".\n"]
);
({  // destroy argument temporary
    int *__massassign_L0 = (int *)&_tmp_cp6.field_0;
    const char **__massassign_L1 = (const char **)&_tmp_cp6.field_1;
    ((*__massassign_L0 /* ^?{} */), (*__massassign_L1 /* ^?{} */));
});
\end{cfacode}
The type @_tuple2_@ is generated to allow passing the @rest@ argument to @print_variadic@.
Thunks 0 through 3 provide wrappers for the @otype@ parameters of @const char *@, while @_thunk4@ translates a call to @print([int, const char *])@ into a call to @print_variadic(int, [const char *])@.
This all builds to a call to @print_variadic@, with the appropriate copy construction of the tuple argument.