Changeset 57c7e6c4 for doc/theses


Ignore:
Timestamp:
May 3, 2025, 12:46:23 AM (5 months ago)
Author:
Fangren Yu <f37yu@…>
Branches:
master
Children:
c9c1a7e6
Parents:
ef05cf0
Message:

proofreading fix as suggested by Ondrej

Location:
doc/theses/fangren_yu_MMath
Files:
5 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/fangren_yu_MMath/background.tex

    ref05cf0 r57c7e6c4  
    2121Furthermore, Cyclone's polymorphic functions and types are restricted to abstraction over types with the same layout and calling convention as @void *@, \ie only pointer types and @int@.
    2222In \CFA terms, all Cyclone polymorphism must be dtype-static.
    23 While the Cyclone design provides the efficiency benefits discussed in Section~\ref{sec:generic-apps} for dtype-static polymorphism, it is more restrictive than \CFA's general model.
     23While the Cyclone design provides the efficiency benefits discussed in~\VRef{s:GenericImplementation} for dtype-static polymorphism, it is more restrictive than \CFA's general model.
    2424Smith and Volpano~\cite{Smith98} present Polymorphic C, an ML dialect with polymorphic functions, C-like syntax, and pointer types;
    2525it lacks many of C's features, most notably structure types, and hence, is not a practical C replacement.
  • doc/theses/fangren_yu_MMath/features.tex

    ref05cf0 r57c7e6c4  
    2323\CFA adopts a uniform policy between pointers and references where mutability is a separate property made at the declaration.
    2424
    25 The following examples shows how pointers and references are treated uniformly in \CFA.
     25The following examples show how pointers and references are treated uniformly in \CFA.
    2626\begin{cfa}[numbers=left,numberblanklines=false]
    2727int x = 1, y = 2, z = 3;$\label{p:refexamples}$
     
    3636@&@r3 = @&@y; @&&@r3 = @&&@r4;                          $\C{// change r1, r2}$
    3737\end{cfa}
    38 Like pointers, reference can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{
     38Like pointers, references can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{
    3939\CC uses \lstinline{&&} for rvalue reference, a feature for move semantics and handling the \lstinline{const} Hell problem.}
    4040Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r2@ becomes @**r2@.
     
    101101Interestingly, C does not give a warning/error if a @const@ pointer is not initialized, while \CC does.
    102102Hence, type @& const@ is similar to a \CC reference, but \CFA does not preclude initialization with a non-variable address.
    103 For example, in system's programming, there are cases where an immutable address is initialized to a specific memory location.
     103For example, in systems programming, there are cases where an immutable address is initialized to a specific memory location.
    104104\begin{cfa}
    105105int & const mem_map = *0xe45bbc67@p@; $\C{// hardware mapped registers ('p' for pointer)}$
     
    157157\end{cfa}
    158158While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument, a fairly common use case.
    159 Even if the object trait can be made optional, the current type system often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended.
     159Even if the object trait can be made optional, the current compiler implementation often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended.
    160160Some tweaks are necessary to accommodate reference types in polymorphic contexts and it is unclear what can or cannot be achieved.
    161161Currently, there are contexts where the \CFA programmer is forced to use a pointer type, giving up the benefits of auto-dereference operations and better syntax with reference types.
     
    188188@[x, y, z]@ = foo( 3, 4 );  // return 3 values into a tuple
    189189\end{cfa}
    190 Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common context that normally require multiple statements and/or additional declarations, all of which reduces coding time and errors.
     190Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common contexts that normally require multiple statements and/or additional declarations, all of which reduces coding time and errors.
    191191\begin{cfa}
    192192[x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types may be different}$
     
    213213\end{cfa}
    214214The type resolver only has the tuple return types to resolve the call to @bar@ as the @foo@ parameters are identical.
    215 The resultion involves unifying the flattened @foo@ return values with @bar@'s parameter list.
     215The resulution involves unifying the flattened @foo@ return values with @bar@'s parameter list.
    216216However, no combination of @foo@s is an exact match with @bar@'s parameters;
    217217thus, the resolver applies C conversions to obtain a best match.
     
    303303\section{Tuple Implementation}
    304304
    305 As noted, tradition languages manipulate multiple values by in/out parameters and/or structures.
     305As noted, traditional languages manipulate multiple values by in/out parameters and/or structures.
    306306K-W C adopted the structure for tuple values or variables, and as needed, the fields are extracted by field access operations.
    307307As well, for the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects.
     
    576576Unfortunately, packing the variadic arguments into a rigid @struct@ type and generating all the required wrapper functions is significant work and largely wasted because most are never called.
    577577Interested readers can refer to pages 77-80 of Robert Schluntz's thesis to see how verbose the translator output is to implement a simple variadic call with 3 arguments.
    578 As the number of arguments increases, \eg a call with 5 arguments, the translator generates a concrete @struct@ types for a 4-tuple and a 3-tuple along with all the polymorphic type data for them.
     578As the number of arguments increases, \eg a call with 5 arguments, the translator generates concrete @struct@ types for a 4-tuple and a 3-tuple along with all the polymorphic type data for them.
    579579An alternative approach is to put the variadic arguments into an array, along with an offset array to retrieve each individual argument.
    580580This method is similar to how the C @va_list@ object is used (and how \CFA accesses polymorphic fields in a generic type), but the \CFA variadics generate the required type information to guarantee type safety (like the @printf@ format string).
     
    849849While \CC has no direct syntax to disambiguate @x@, \ie @d.B.x@ or @d.C.x@, it is possible with casts, @((B)d).x@ or @((C)d).x@.
    850850Like \CC, \CFA compiles the Plan-9 version and provides direct qualification and casts to disambiguate @x@.
    851 While ambiguous definitions are allowed, duplicate field names is poor practice and should be avoided if possible.
     851While ambiguous definitions are allowed, duplicate field names are poor practice and should be avoided if possible.
    852852However, when a programmer does not control all code, this problem can occur and a naming workaround must exist.
  • doc/theses/fangren_yu_MMath/future.tex

    ref05cf0 r57c7e6c4  
    9292\section{Associated Types}
    9393
    94 The analysis presented in \VRef{s:AssertionSatisfaction} shows if all type parameters have to be bound before assertion resolution, the complexity of resolving assertions become much lower as every assertion parameter can be resolved independently.
     94The analysis presented in \VRef{s:AssertionSatisfaction} shows if all type parameters have to be bound before assertion resolution, the complexity of resolving assertions becomes much lower as every assertion parameter can be resolved independently.
    9595That is, by utilizing information from higher up the expression tree for return value overloading, most of the type bindings can be resolved.
    9696However, there are scenarios where some intermediate types need to be involved in certain operations, which are neither input nor output types.
     
    152152Note that the type @list *@ satisfies both @pointer_like( list *, int )@ and @pointer_like( list *,@ @list )@ (the latter by the built-in pointer dereference operator) and the expression @*it@ can be either a @struct list@ or an @int@.
    153153Requiring associated types to be unique makes the @pointer_like@ trait not applicable to @list *@, which is undesirable.
    154 I have not attempted to implement associated types in \CFA compiler, but based on the above discussions, one option is to make associated type resolution and return type overloading coexist:
     154I have not attempted to implement associated types in the \CFA compiler, but based on the above discussions, one option is to make associated type resolution and return type overloading coexist:
    155155when the associated type appears in returns, it is deduced from the context and then verify the trait with ordinary assertion resolution;
    156156when it does not appear in the returns, the type is required to be uniquely determined by the expression that defines the associated type.
  • doc/theses/fangren_yu_MMath/intro.tex

    ref05cf0 r57c7e6c4  
    3535\end{quote}
    3636Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files.
    37 Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguates the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
     37Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguate the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.
    3838In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process.
    3939
     
    445445f( 'A' );                               $\C{// select (2)}\CRT$
    446446\end{cfa}
    447 The type system examines each call size and first looks for an exact match and then a best match using conversions.
     447The type system examines each call site and first looks for an exact match and then a best match using conversions.
    448448
    449449Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.
    450 Essentailly, the return types are \emph{reversed curried} into output parameters of the function.
     450Essentially, the return types are \emph{reversed curried} into output parameters of the function.
    451451For example, in many programming languages with overloading, the following functions are ambiguous without using the return type.
    452452\begin{cfa}
     
    643643For example, if a change is made in an initialization expression, it can cascade type changes producing many other changes and/or errors.
    644644At some point, a variable's type needs to remain constant and the initializing expression needs to be modified or be in error when it changes.
    645 Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the complier can report a mismatch with the constant initialization.
     645Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the compiler can report a mismatch with the constant initialization.
    646646\begin{cfa}
    647647void f( @int@ x, @int@ y ) {  // brand function prototype
     
    795795\end{tabular}
    796796\end{cquote}
    797 Traits are implemented by flatten them at use points, as if written in full by the programmer.
     797Traits are implemented by flattening them at use points, as if written in full by the programmer.
    798798Flattening often results in overlapping assertions, \eg operator @+@.
    799799Hence, trait names play no part in type equivalence.
     
    857857\end{tabular}
    858858\end{cquote}
     859\label{s:GenericImplementation}
    859860\CFA generic types are \newterm{fixed} or \newterm{dynamic} sized.
    860861Fixed-size types have a fixed memory layout regardless of type parameters, whereas dynamic types vary in memory layout depending on the type parameters.
     
    10091010\end{swift}
    10101011To make a universal function useable, an abstract description is needed for the operations used on the parameters within the function body.
    1011 Type matching these operations can occur by discover using techniques like \CC template expansion, or explicit stating, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds.
     1012Type matching these operations can be done by using techniques like \CC template expansion, or explicit stating, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds.
    10121013The mechanism chosen can affect separate compilation or require runtime type information (RTTI).
    10131014\begin{description}
  • doc/theses/fangren_yu_MMath/resolution.tex

    ref05cf0 r57c7e6c4  
    6969\begin{enumerate}[leftmargin=*]
    7070\item \textbf{Unsafe} cost representing a narrowing conversion of arithmetic types, \eg @int@ to @short@, and qualifier-dropping conversions for pointer and reference types.
    71 Narrowing conversions have the potential to lose (truncation) data.
     71Narrowing conversions have the potential to lose (truncate) data.
    7272A programmer must decide if the computed data-range can safely be shorted in the smaller storage.
    7373Warnings for unsafe conversions are helpful.
     
    103103
    104104\item \textbf{Specialization} cost counting the number of restrictions introduced by type assertions.
    105 Fewer restriction means fews parametric variables passed at the function call giving better performance.
     105Fewer restriction means fewer parametric variables passed at the function call giving better performance.
    106106\begin{cfa}
    107107forall( T | { T ?+?( T, T ) } ) void f( T ); $\C[3.25in]{// 1}$
     
    152152Therefore, at each resolution step, the arguments are already given unique interpretations, so the ordering only needs to compare different sets of conversion targets (function parameter types) on the same set of input.
    153153
    154 In \CFA, trying to use such a system is problematic because of the presence of return-type overloading of functions and variable.
     154In \CFA, trying to use such a system is problematic because of the presence of return-type overloading of functions and variables.
    155155Specifically, \CFA expression resolution considers multiple interpretations of argument subexpressions with different types, \eg:
    156156so it is possible that both the selected function and the set of arguments are different, and cannot be compared with a partial-ordering system.
     
    165165\end{quote}
    166166However, I was unable to generate any Ada example program that demonstrates this preference.
    167 In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order every legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
     167In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order all legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.
    168168
    169169Interestingly, the \CFA cost-based model can sometimes make expression resolution too permissive because it always attempts to select the lowest cost option, and only when there are multiple options tied at the lowest cost does it report the expression is ambiguous.
     
    193193\end{itemize}
    194194In this example, option 1 produces the prototype @void f( int )@, which gives an exact match and therefore takes priority.
    195 The \CC resolution rules effectively makes option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types below @long@.
     195The \CC resolution rules effectively make option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types below @long@.
    196196This difference could be explained as compensating for \CFA polymorphic functions being separately compiled versus template inlining;
    197197hence, calling them requires passing type information and assertions increasing the runtime cost.
     
    211211Although it is true that both the sequence 1, 2 and 1, 3, 4 are increasingly more constrained on the argument types, option 2 is not comparable to either of option 3 or 4;
    212212they actually describe independent constraints on the two arguments.
    213 Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@,
     213Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@.
    214214Because two constraints can independently be satisfied, neither should be considered a better match when trying to resolve a call to @f@ with argument types @(int, int)@;
    215215reporting such an expression as ambiguous is more appropriate.
     
    236236\end{enumerate}
    237237
    238 These inconsistencies are not easily solvable in the current cost-model, meaning the currently \CFA codebase has to workaround these defects.
     238These inconsistencies are not easily solvable in the current cost-model, meaning the current \CFA codebase has to workaround these defects.
    239239One potential solution is to mix the conversion cost and \CC-like partial ordering of specializations.
    240240For example, observe that the first three elements (unsafe, polymorphic and safe conversions) in the \CFA cost-tuple are related to the argument/parameter types, while the other two elements (polymorphic variable and assertion counts) are properties of the function declaration.
     
    352352Here, the unsafe cost of signed to unsigned is factored into the ranking, so the safe conversion is selected over an unsafe one.
    353353Furthermore, an integral option is taken before considering a floating option.
    354 This model locally matches the C approach, but provides an ordering when there are many overloaded alternative.
     354This model locally matches the C approach, but provides an ordering when there are many overload alternatives.
    355355However, as Moss pointed out overload resolution by total cost has problems, \eg handling cast expressions.
    356356\begin{cquote}
     
    387387\section{Type Unification}
    388388
    389 Type unification is the algorithm that assigns values to each (free) type parameters such that the types of the provided arguments and function parameters match.
     389Type unification is the algorithm that assigns values to each (free) type parameter such that the types of the provided arguments and function parameters match.
    390390
    391391\CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing).
     
    418418A function operates on the call-site arguments together with any local and global variables.
    419419When the function is polymorphic, the types are inferred at each call site.
    420 On each invocation, the types to be operate on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
     420On each invocation, the types to be operated on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.
    421421For example, consider a polymorphic function that takes one argument of type @T@ and polymorphic function pointer.
    422422\begin{cfa}
     
    481481In many cases, these problems can be avoided by examining other assertions that provide insight on the desired type binding: if one assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking.
    482482
    483 The Moss algorithm currently used in \CFA was developed using a simplified type-simulator that capture most of \CFA type-system features.
     483The Moss algorithm currently used in \CFA was developed using a simplified type system that captures most of \CFA type system features.
    484484The simulation results were then ported back to the actual language.
    485485The simulator used a mix of breadth- and depth-first search in a staged approach.
     
    517517A type variable introduced by the @forall@ clause of function declaration can appear in parameter types, return types and assertion variables.
    518518If it appears in parameter types, it can be bound when matching the arguments to parameters at the call site.
    519 If it only appears in the return type, it can be eventually be determined from the call-site context.
     519If it only appears in the return type, it can be eventually determined from the call-site context.
    520520Currently, type resolution cannot do enough return-type inferencing while performing eager assertion resolution: the return type information is unknown before the parent expression is resolved, unless the expression is an initialization context where the variable type is known.
    521521By delaying the assertion resolution until the return type becomes known, this problem can be circumvented.
     
    526526}
    527527\end{cfa}
    528 This case is rare so forcing every type variable to appear at least once in parameter or return types limits does not limit the expressiveness of \CFA type system to a significant extent.
     528This case is rare so forcing every type variable to appear at least once in parameter or return types does not limit the expressiveness of \CFA type system to a significant extent.
    529529The next section presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which is provides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}.
    530530
Note: See TracChangeset for help on using the changeset viewer.