Changeset bd72f517 for doc/theses/fangren_yu_MMath/resolution.tex
- Timestamp: May 13, 2025, 1:17:50 PM
- Branches:
- master, stuck-waitfor-destruct
- Children:
- 0528d79
- Parents:
- 7d02d35 (diff), 2410424 (diff)
Note: this is a merge changeset; the changes displayed below correspond to the merge itself.
- Files:
- 1 edited
- doc/theses/fangren_yu_MMath/resolution.tex (modified) (28 diffs)
doc/theses/fangren_yu_MMath/resolution.tex
r7d02d35 → rbd72f517

@@ -2,5 +2,5 @@
 \label{c:content2}
 
-Recapping, the \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
+Recapping, \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;
 in addition, C's multiple implicit type-conversions must be respected.
 This generality leads to internal complexity and correspondingly higher compilation cost directly related to type resolution.
@@ -24,5 +24,5 @@
 \end{enumerate}
 \VRef[Table]{t:SelectedFileByCompilerBuild} shows improvements for selected tests with accumulated reductions in compile time across each of the 5 fixes.
-To this day, the large reduction in compilation time significantly improves the development of the \CFA's runtime because of its frequent compilation cycles.
+The large reduction in compilation time significantly improves the development of the \CFA's runtime because of its frequent compilation cycles.
 
 \begin{table}[htb]
@@ -54,5 +54,5 @@
 Some of those problems arise from the newly introduced language features described in the previous chapter.
 In addition, fixing unexpected interactions within the type system has presented challenges.
-This chapter describes in detail the type-resolution rules currently in use and some major problems that have been identified.
+This chapter describes in detail the type-resolution rules currently in use and some major problems \PAB{I} have identified.
 Not all of those problems have immediate solutions, because fixing them may require redesigning parts of the \CFA type system at a larger scale, which correspondingly affects the language design.
 
@@ -69,5 +69,5 @@
 \begin{enumerate}[leftmargin=*]
 \item \textbf{Unsafe} cost representing a narrowing conversion of arithmetic types, \eg @int@ to @short@, and qualifier-dropping conversions for pointer and reference types.
-Narrowing conversions have the potential to lose (truncation) data.
+Narrowing conversions have the potential to lose (truncate) data.
 A programmer must decide if the computed data-range can safely be shorted in the smaller storage.
 Warnings for unsafe conversions are helpful.
@@ -86,5 +86,5 @@
 
 \item \textbf{Safe} cost representing a widening conversion \eg @short@ to @int@, qualifier-adding conversions for pointer and reference types, and value conversion for enumeration constants.
-Even when conversions are safe, the fewest conversions it ranked better, \eg @short@ to @int@ versus @short@ to @long int@.
+When all conversions are safe, closer conversions are ranked better, \eg @short@ to @int@ versus @short@ to @long int@.
 \begin{cfa}
 void f( long int p ); $\C[2.5in]{// 1}$
@@ -103,5 +103,5 @@
 
 \item \textbf{Specialization} cost counting the number of restrictions introduced by type assertions.
-Fewer restriction means fews parametric variables passed at the function call giving better performance.
+Fewer restriction means fewer parametric variables passed at the function call giving better performance.
 \begin{cfa}
 forall( T | { T ?+?( T, T ) } ) void f( T ); $\C[3.25in]{// 1}$
@@ -110,9 +110,9 @@
 \end{cfa}
 \end{enumerate}
-Cost tuples are compared by lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.
+Cost tuples are compared in lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.
 At a subexpression level, the lowest cost candidate for each result type is included as a possible interpretation of the expression;
 at the top level, all possible interpretations of different types are considered (generating a total ordering) and the overall lowest cost is selected as the final interpretation of the expression.
 Glen Ditchfield first proposed this costing model~\cite[\S~4.4.5]{Ditchfield92} to generate a resolution behaviour that is reasonable to C programmers based on existing conversions in the C programming language.
 This model carried over into the first implementation of the \CFA type-system by Richard Bilson~\cite[\S~2.2]{Bilson03}, and was extended but not redesigned by Aaron Moss~\cite[chap.~4]{Moss19}.
-Moss's work began to show problems with the underlying costing model;
+Moss's work began to show problems with the underlying cost model;
 these design issues are part of this work.
@@ -152,5 +152,5 @@
 Therefore, at each resolution step, the arguments are already given unique interpretations, so the ordering only needs to compare different sets of conversion targets (function parameter types) on the same set of input.
 
-In \CFA, trying to use such a system is problematic because of the presence of return-type overloading of functions and variable.
+\PAB{My conclusion} is that trying to use such a system in \CFA is problematic because of the presence of return-type overloading of functions and variables.
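The lexicographic cost-tuple comparison described above can be sketched outside the diff. The following is an illustrative Python model, not \CFA or cfa-cc code; the element order is taken from the thesis text (unsafe, polymorphic, safe, variable, specialization), while the concrete cost values are made up:

```python
# Sketch of the five-element CFA cost tuple described above:
# (unsafe, polymorphic, safe, variables, specialization), compared
# lexicographically from unsafe (highest priority) downward.
# Values are illustrative, not the compiler's actual data structures.

def better(a, b):
    """True if candidate cost a beats candidate cost b."""
    return a < b  # Python tuples already compare lexicographically

# A candidate needing one polymorphic binding...
poly_candidate = (0, 1, 0, 1, 0)
# ...loses to one needing only two safe (widening) conversions,
# because the polymorphic element is compared before the safe one.
safe_candidate = (0, 0, 2, 0, 0)

assert better(safe_candidate, poly_candidate)
```

This mirrors why, in the first problem case below, a non-polymorphic candidate reachable by safe conversions can beat a polymorphic exact match.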
 Specifically, \CFA expression resolution considers multiple interpretations of argument subexpressions with different types, \eg:
 so it is possible that both the selected function and the set of arguments are different, and cannot be compared with a partial-ordering system.
@@ -165,5 +165,5 @@
 \end{quote}
 However, I was unable to generate any Ada example program that demonstrates this preference.
-In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order every legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
+In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order all legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
 
 Interestingly, the \CFA cost-based model can sometimes make expression resolution too permissive because it always attempts to select the lowest cost option, and only when there are multiple options tied at the lowest cost does it report the expression is ambiguous.
@@ -171,5 +171,5 @@
 Other than the case of multiple exact matches, where all have cost zero, incomparable candidates under a partial ordering can often have different expression costs since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections.
 
-There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is clearly justifiable, and one of them is straight up wrong.
+There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is justifiable, and one of them is clearly wrong.
 \begin{enumerate}[leftmargin=*]
 \item Polymorphic exact match versus non-polymorphic inexact match.
@@ -193,5 +193,5 @@
 \end{itemize}
 In this example, option 1 produces the prototype @void f( int )@, which gives an exact match and therefore takes priority.
-The \CC resolution rules effectively makes option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types below @long@.
+The \CC resolution rules effectively make option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types ranked lower than @long@ as well.
 This difference could be explained as compensating for \CFA polymorphic functions being separately compiled versus template inlining;
 hence, calling them requires passing type information and assertions increasing the runtime cost.
@@ -211,5 +211,5 @@
 Although it is true that both the sequence 1, 2 and 1, 3, 4 are increasingly more constrained on the argument types, option 2 is not comparable to either of option 3 or 4;
 they actually describe independent constraints on the two arguments.
-Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@,
+Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@.
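The incomparability argument above (one overload constrains both arguments to a single type, the other pins only the second argument) can be modelled as a subset partial order over constraint sets. This Python sketch uses a hypothetical encoding of the constraints, not the \CFA or \CC specialization-ordering machinery:

```python
# Sketch: partial ordering of the constraints discussed above.
# Encode each overload's parameter constraints as a set of facts;
# one overload is more specialized only if its facts are a strict
# superset.  Hypothetical encoding, not the CFA/C++ implementation.

def compare(a, b):
    """Return 'a', 'b', or 'ambiguous' under the subset partial order."""
    if a > b:                    # a strictly more constrained than b
        return 'a'
    if b > a:
        return 'b'
    return 'ambiguous'           # incomparable (or equal) constraint sets

f2 = {'args_same_type'}          # like option 2: forall(T) f(T, T)
f3 = {'arg2_is_int'}             # like option 3: forall(T) f(T, int)

# Neither constraint set contains the other, so a call satisfying both,
# such as f(int, int), should be reported ambiguous, as argued above.
assert compare(f2, f3) == 'ambiguous'
```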
 Because two constraints can independently be satisfied, neither should be considered a better match when trying to resolve a call to @f@ with argument types @(int, int)@;
 reporting such an expression as ambiguous is more appropriate.
@@ -227,14 +227,14 @@
 Passing a @pair@ variable to @f@
 \begin{cfa}
-pair p;
+pair(int, double) p;
 f( p );
 \end{cfa}
 gives a cost of 1 poly, 2 variable for the @pair@ overload, versus a cost of 1 poly, 1 variable for the unconstrained overload.
 Programmer expectation is to select option 1 because of the exact match, but the cost model selects 2;
-while either could work, the type system should select a call that meets expectation of say the call is ambiguous, forcing the programmer to mediate.
+it is not possible to write a specialization for @f@ that works on any pair type and gets selected by the type resolver as intended.
 As a result, simply counting the number of polymorphic type variables is no longer correct to order the function candidates as being more constrained.
 \end{enumerate}
 
-These inconsistencies are not easily solvable in the current cost-model, meaning the currently \CFA codebase has to workaround these defects.
+These inconsistencies are not easily solvable in the current cost-model, meaning that currently the \CFA codebase has to workaround these defects.
 One potential solution is to mix the conversion cost and \CC-like partial ordering of specializations.
 For example, observe that the first three elements (unsafe, polymorphic and safe conversions) in the \CFA cost-tuple are related to the argument/parameter types, while the other two elements (polymorphic variable and assertion counts) are properties of the function declaration.
@@ -352,5 +352,5 @@
 Here, the unsafe cost of signed to unsigned is factored into the ranking, so the safe conversion is selected over an unsafe one.
 Furthermore, an integral option is taken before considering a floating option.
-This model locally matches the C approach, but provides an ordering when there are many overloaded alternative.
+This model locally matches the C approach, but provides an ordering when there are many overload alternatives.
 However, as Moss pointed out overload resolution by total cost has problems, \eg handling cast expressions.
 \begin{cquote}
@@ -379,5 +379,5 @@
 if an expression has any legal interpretations as a C builtin operation, only the lowest cost one is kept, regardless of the result type.
 
-\VRef[Figure]{f:CFAArithmeticConversions} shows an alternative \CFA partial-order arithmetic-conversions graphically.
+\VRef[Figure]{f:CFAArithmeticConversions} shows \PAB{my} alternative \CFA partial-order arithmetic-conversions graphically.
 The idea here is to first look for the best integral alternative because integral calculations are exact and cheap.
 If no integral solution is found, than there are different rules to select among floating-point alternatives.
@@ -387,5 +387,5 @@
 \section{Type Unification}
 
-Type unification is the algorithm that assigns values to each (free) type parameters such that the types of the provided arguments and function parameters match.
+Type unification is the algorithm that assigns values to each (free) type parameter such that the types of the provided arguments and function parameters match.
 
 \CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing).
@@ -408,5 +408,5 @@
 With the introduction of generic record types, the parameters must match exactly as well; currently there are no covariance or contravariance supported for the generics.
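A rough model of the unification step just defined, exact matching of type constructors with one binding per type variable and no variance, can be sketched as follows. This is an illustrative Python sketch with a toy type representation, not the cfa-cc implementation:

```python
# Minimal first-order unification sketch for the algorithm described
# above: free type variables get one binding per equivalence class,
# and type constructors (e.g. generic records) must match exactly,
# since CFA generics support no covariance or contravariance.

def unify(param, arg, env):
    """Match a parameter type against an argument type, binding free
    type variables in env; returns True on success."""
    if isinstance(param, str) and param.isupper():    # type variable
        if param in env:
            return env[param] == arg                  # one binding per class
        env[param] = arg
        return True
    if isinstance(param, tuple) and isinstance(arg, tuple):
        return (param[0] == arg[0] and len(param) == len(arg)
                and all(unify(p, a, env) for p, a in zip(param[1:], arg[1:])))
    return param == arg                               # concrete types

env = {}
# forall(T) f( pair(T, T) ) called with a pair(int, int) argument:
assert unify(('pair', 'T', 'T'), ('pair', 'int', 'int'), env)
assert env['T'] == 'int'
# but pair(int, double) fails: both occurrences of T must agree
assert not unify(('pair', 'T', 'T'), ('pair', 'int', 'double'), {})
```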
 
-One simplification was made to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed.
+\PAB{I made} one simplification to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed.
 The polymorphic function declarations themselves are still treated as function pointer types internally, however the change means that formal parameter types can no longer be polymorphic.
 Previously it was possible to write function prototypes such as
@@ -418,5 +418,5 @@
 A function operates on the call-site arguments together with any local and global variables.
 When the function is polymorphic, the types are inferred at each call site.
-On each invocation, the types to be operate on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
+On each invocation, the types to be operated on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
 For example, consider a polymorphic function that takes one argument of type @T@ and polymorphic function pointer.
 \begin{cfa}
@@ -441,5 +441,5 @@
 The assertion set that needs to be resolved is just the declarations on the function prototype, which also simplifies the assertion satisfaction algorithm, which is discussed further in the next section.
 
-An implementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened.
+\PAB{My} implementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened.
 In the general approach commonly used in functional languages, the unification variables are given a lower bound and an upper bound to account for covariance and contravariance of types.
 \CFA does not implement any variance with its generic types and does not allow polymorphic function types, therefore no explicit upper bound is needed and one binding value for each equivalence class suffices.
@@ -469,5 +469,5 @@
 
 
-In previous versions of \CFA, this number was set at 4; as the compiler becomes more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it does not lead to problems.
+In previous versions of \CFA, this number was set at 4; as the compiler has become more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it has not led to problems.
 Only rarely is there a case where the infinite recursion produces an exponentially growing assertion set, causing minutes of time wasted before the limit is reached.
 Fortunately, it is very hard to generate this situation with realistic \CFA code, and the ones that have occurred have clear characteristics, which can be prevented by alternative approaches.
@@ -475,5 +475,5 @@
 One example is analysed in this section.
 
-While the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible.
+\PAB{My analysis shows that} while the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible.
 The most significant impact is that type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means if one expression has multiple associated assertions it is dependent, as the changes to the type environment must be compatible for all the assertions to be resolved.
 Particularly, if one assertion parameter can be resolved in multiple different ways, all of the results need to be checked to make sure the change to type variable bindings are compatible with other assertions to be resolved.
@@ -481,5 +481,5 @@
 In many cases, these problems can be avoided by examining other assertions that provide insight on the desired type binding: if one assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking.
 
-The Moss algorithm currently used in \CFA was developed using a simplified type-simulator that capture most of \CFA type-system features.
+The Moss algorithm currently used in \CFA was developed using a simplified type system that captures most of \CFA's type system features.
 The simulator results were then ported back to the actual language.
 The simulator used a mix of breadth- and depth-first search in a staged approach.
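The unique-match heuristic just mentioned, binding a type variable from an assertion with a single candidate before attempting ambiguous ones, can be sketched as below. The candidate sets and the single type variable @T@ are hypothetical simplifications, not taken from cfa-cc:

```python
# Sketch of the heuristic described above: resolve assertion
# parameters with a unique matching candidate first, so type
# variables are bound confidently before ambiguous assertions are
# attempted.  Candidate sets are hypothetical, not from cfa-cc.

def resolve(assertions, env):
    """assertions: name -> set of possible bindings for type var T."""
    # visit unique-candidate assertions first, so they bind T
    for name, candidates in sorted(assertions.items(),
                                   key=lambda kv: len(kv[1])):
        if 'T' in env:
            if env['T'] not in candidates:   # must agree with binding
                return False
        elif len(candidates) == 1:
            env['T'] = next(iter(candidates))  # unique match: bind now
        else:
            return False       # only ambiguous choices: would backtrack
    return True

env = {}
# g matches two overloads, f only one; resolving f first fixes T = int
assert resolve({'g': {'int', 'double'}, 'f': {'int'}}, env)
assert env['T'] == 'int'
```

Visiting the assertions in declaration order instead would hit the ambiguous candidate set for @g@ first and be forced to backtrack, which is the order-sensitivity the surrounding text discusses.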
@@ -494,5 +494,5 @@
 If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until there are none pending resolution or the recursion limit is reached, which results in a failure.
 
-However, in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions.
+However, \PAB{I identify that} in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions.
 Suppose an unbound type variable @T@ appears in two assertions:
 \begin{cfa}
@@ -517,8 +517,8 @@
 A type variable introduced by the @forall@ clause of function declaration can appear in parameter types, return types and assertion variables.
 If it appears in parameter types, it can be bound when matching the arguments to parameters at the call site.
-If it only appears in the return type, it can be eventually be determined from the call-site context.
+If it only appears in the return type, it can be eventually determined from the call-site context.
 Currently, type resolution cannot do enough return-type inferencing while performing eager assertion resolution: the return type information is unknown before the parent expression is resolved, unless the expression is an initialization context where the variable type is known.
 By delaying the assertion resolution until the return type becomes known, this problem can be circumvented.
-The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables (associate types).
+The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables (\newterm{associate types}).
 \begin{cfa}
 forall( T | { void foo( @T@ ) } ) int f( float ) {
@@ -526,6 +526,6 @@
 }
 \end{cfa}
-This case is rare so forcing every type variable to appear at least once in parameter or return types limits the expressiveness of \CFA type system to a significant extent.
-The next section presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which is equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}.
+This case is rare so forcing every type variable to appear at least once in parameter or return types does not limit the expressiveness of \CFA type system to a significant extent.
+\VRef{s:AssociatedTypes} presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which provides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}.
 
 
@@ -535,5 +535,5 @@
 Based on the experiment results, this approach can improve the performance of expression resolution in general, and sometimes allow difficult instances of assertion resolution problems to be solved that are otherwise infeasible, \eg when the resolution encounters an infinite loop.
 
-The tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
+\PAB{I identify that} the tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.
 If the modifications are cached, \ie the results that cause the type bindings to be modified, it is also necessary to store the changes to type bindings, too.
 Furthermore, in cases where multiple candidates can be used to satisfy one assertion parameter, all of them must be cached including those that are not eventually selected, since the side effect can produce different results depending on the context.
@@ -583,5 +583,5 @@
 However, the implementation of the type environment is simplified;
 it only stores a tentative type binding with a flag indicating whether \emph{widening} is possible for an equivalence class of type variables.
-Formally speaking, this means the type environment used in \CFA is only capable of representing \emph{lower-bound} constraints.
+Formally speaking, \PAB{I concluded} the type environment used in \CFA is only capable of representing \emph{lower-bound} constraints.
 This simplification works most of the time, given the following properties of the existing \CFA type system and the resolution algorithms:
 \begin{enumerate}
@@ -599,5 +599,5 @@
 \end{enumerate}
 
-\CFA does attempt to incorporate upstream type information propagated from variable a declaration with initializer, since the type of the variable being initialized is known.
+\CFA does attempt to incorporate upstream type information propagated from a variable declaration with initializer, since the type of the variable being initialized is known.
 However, the current type-environment representation is flawed in handling such type inferencing, when the return type in the initializer is polymorphic.
 Currently, an inefficient workaround is performed to create the necessary effect.
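The lower-bound-only environment described in the hunks above, one tentative binding plus a widening flag per equivalence class, can be modelled as follows. Field names and the toy safe-conversion table are illustrative, not the cfa-cc data structure:

```python
# Sketch of the simplified type environment described above: each
# equivalence class of type variables keeps one tentative
# (lower-bound) binding plus a flag saying whether widening is still
# allowed; no upper bounds are representable.

WIDENS = {('short', 'int'), ('int', 'long')}   # toy safe conversions

class EqvClass:
    def __init__(self):
        self.bound = None        # tentative lower-bound binding
        self.allow_widening = True

    def bind(self, ty):
        """Try to reconcile a new constraint with the current binding."""
        if self.bound is None or self.bound == ty:
            self.bound = self.bound or ty
            return True
        if self.allow_widening and (self.bound, ty) in WIDENS:
            self.bound = ty      # widen the lower bound upward
            return True
        return False             # cannot narrow: no upper bound exists

T = EqvClass()
assert T.bind('short') and T.bind('int')   # widened short -> int
assert T.bound == 'int'
assert not T.bind('short')                 # cannot narrow back down
```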