Changeset 52931c5


Ignore:
Timestamp:
Mar 28, 2025, 4:58:25 AM (3 days ago)
Author:
Fangren Yu <f37yu@…>
Branches:
master
Parents:
185cd94
Message:

add some discussion regarding type environment

File:
1 edited

Legend:

Unmodified
Added
Removed
  • TabularUnified doc/theses/fangren_yu_MMath/content2.tex

    r185cd94 r52931c5  
    248248}
    249249\end{cfa}
    250 This helper function is used for performance logging as part of computing a geometric;
     250This helper function is used for performance logging as part of computing a geometric mean;
    251251it is called during summing of logarithmic values.
    252252However, the function name @log2@ is overloaded in \CFA for both integer and floating point types.
     
    255255When asked, the developer expected the floating-point overload because of return-type overloading.
    256256This mistake went unnoticed because the truncated component was insignificant in the performance logging.
    257 \PAB{Not sure I understand this: The conclusion is that matching the return type higher up in the expression tree is better, in cases where the total expression cost is equal.}
     257Investigation of this example leads to the decision that the resolution algorithm favors a lower conversion cost up the expression tree when the total global cost is equal.
    258258
    259259Another change addresses the issue that C arithmetic expressions have unique meanings governed by its arithmetic conversion rules.
     
    352352\CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing).
    353353This restriction makes the unification problem more tractable in \CFA, as the argument types at each call site are usually all specified.
    354 There is a single exception case when the function return type contains a free type variable that does not occur in any of the argument types, and subsequently passed into the parent expression.
    355 \begin{cfa}
    356 example ... does malloc work here
     354There is a single exception case when the function return type contains a free type variable that does not occur in any of the argument types, and subsequently passed into the parent expression. One such example is the \CFA wrapper for @malloc@ which also infers size argument from the deducted return type.
     355\begin{cfa}
     356forall (T*) T* malloc() {
     357        return (T*) malloc (sizeof(T)); // calls C malloc with the size inferred from context
     358}
    357359\end{cfa}
    358360A top level expression whose type still contains an unbounded type variable is considered ill-formed as such an expression is inherently ambiguous.
     
    410412max (42, 3.14); // OK, T=double; requires explicit type annotation in C++ such as max<double>(42, 3.14);
    411413\end{cfa}
    412 The current \CFA documentation does not include a formal set of rules for type unification.
    413 In practice, the algorithm implemented in the \CFA translator can be summarized as follows, given a function signature forall$(T_1,..., T_n) f(p_1, ..., p_m)$ and argument types $(a_1, ..., a_m)$, the unification algorithm performs the following steps: \footnote{This assumes argument tuples are already expanded to the individual components.}
    414 \begin{enumerate}
    415 \item The type environment is initialized as the union of all type environments of the arguments, plus $(T_1,...,T_n)$ as free variables.
    416 The inclusion of argument environments serves the purpose of resolving polymorphic return types that needs to be deduced.
    417 \item Initially, all type variables
    418 \end{enumerate}
     414
     415From a theoretical point of view, the simplified implementation of type environment has its shortcomings. There are some cases that do not work nicely with this implementation and some compromise has to be made. A more detailed discussion follows at the end of this chapter.
    419416
    420417
     
    515512
    516513\section{Compiler Implementation Considerations}
     514\CFA is still an experimental language and there is no formal specification of expression resolution rules yet.
     515Presently there is also only one reference implementation, namely the cfa-cc translator, which is under active development and the specific behavior of the implementation can change frequently as new features are added.
     516
     517Ideally, the goal of expression resolution involving polymorphic functions would be to find the set of type variable assignments such that the global conversion cost is minimal and all assertion variables can be satisfied.
     518Unfortunately, with a lot of complications involving implicit conversions and assertion variables, fully achieving this goal is not realistic. The \CFA compiler is specifically not designed to accommodate for all edge cases, either.
     519Instead it makes a few restrictions to simplify the algorithm so that most expressions that will be encountered in actual code can still pass type checking within a reasonable amount of time.
     520
     521As previously mentioned, \CFA polymorphic type resolution is based on a modified version of unification algorithm, where both equivalence (exact) and subtyping (inexact) relations are considered. However, the implementation of type environment is simplified; it only stores a tentative type binding with a flag indicating whether \textit{widening} is possible for an equivalence class of type variables.
     522Formally speaking, this means the type environment used in \CFA compiler is only capable of representing \textit{lower bound} constraints.
     523This simplification can still work well most of the time, given the following properties of the existing \CFA type system and the resolution algorithms in use:
     524
     525\begin{enumerate}
     526        \item Type resolution almost exclusively proceeds in bottom-up order, which naturally produces lower bound constraints. Since all identifiers can be overloaded in \CFA, not much definite information can be gained from top-down. In principle it would be possible to detect non-overloaded function names and perform top-down resolution for those; however, the prototype experiments have shown that such optimization does not give a meaningful performance benefit, and therefore it is not implemented.
     527        \item Few nontrivial subtyping relationships are present in \CFA, namely the arithmetic types presented in \VRef[Figure]{f:CFACurrArithmeticConversions}, and qualified pointer/reference types. In particular, \CFA lacks the nominal inheritance subtyping present in object-oriented languages, and the generic types do not support covariance on type parameters. As a result, named types such as structs are always matched by strict equivalence, including any type parameters should they exist.
     528        \item Unlike in functional programming where subtyping between arrow types exists, \ie if $T_2 <: T_1$ and $U_1 <: U_2$ then $T_1 \rightarrow T_2 <: U_1 \rightarrow U_2$, \CFA uses C function pointer types and the parameter/return types must match exactly to be compatible.
     529\end{enumerate}
     530
     531\CFA does attempt to incorporate type information propagated from upstream in the case of variable declaration with initializer, since the type of the variable being initialized is definitely known. It is known that the current type environment representation is flawed in handling such type deduction when the return type in the initializer is polymorphic, and an inefficient workaround has to be performed in certain cases. An annotated example is included in the \CFA compiler source code:
     532
     533\begin{cfa}
     534// If resolution is unsuccessful with a target type, try again without, since it
     535// will sometimes succeed when it wouldn't with a target type binding.
     536// For example:
     537forall( otype T ) T & ?[]( T *, ptrdiff_t );
     538const char * x = "hello world";
     539int ch = x[0];
     540// Fails with simple return type binding (xxx -- check this!) as follows:
     541// * T is bound to int
     542// * (x: const char *) is unified with int *, which fails
     543\end{cfa}
     544
     545The problem here is that we can only represent the constraints $T = int$ and $int <: T$, but since the type information flows in the opposite direction, the proper constraint for this case is $T <: int$, which cannot be represented in the simplified type environment. Currently, an attempt to resolve with equality constraint generated from the initialized variable is still made, since it is often the correct type binding (especially in the case where the initialized variable is a struct), and when such attempt fails, the resolution algorithm is rerun without the initialization context.
     546
     547One additional remark to make here is that \CFA does not provide a mechanism to explicitly specify values for polymorphic type parameters. In \CC for example, users may specify template arguments in angle brackets, which could be useful when automatic deduction fails, \eg @max<double>(42, 3.14)@.
     548There are some partial workarounds such as adding casts to the arguments, but they are not guaranteed to work in all cases. If a type parameter appears in the function return type, however, using the ascription (return) cast will force the desired type binding, since the cast only compiles when the expression type matches exactly with the target.
     549
     550\section{Related Work}
     551
     552
     553
Note: See TracChangeset for help on using the changeset viewer.