Changeset 21ce2c7


Timestamp:
Dec 13, 2023, 2:47:16 PM (5 months ago)
Author:
JiadaL <j82liang@…>
Branches:
master
Children:
dc80280
Parents:
81da3da4
Message:

Change the unification scheme

File:
1 edited

  • doc/proposals/enum.tex

    r81da3da4 r21ce2c7  
    231231These generated functions are $Companion Functions$; they take a $companion$ object and the position as parameters.
    232232
     233\section{Unification}
     234
    233235\subsection{Enumeration as Value}
    234 An \CFA enumeration with base type T can be used seamlessly as T.
     236\label{section:enumeration_as_value}
     237An \CFA enumeration with base type T can be used seamlessly as T, without explicitly calling the pseudo-function value.
    235238\begin{lstlisting}[label=lst:implicit_conversion]
    236239char * green_value = Colour.Green; // "G"
    237240// Is equivalent to
    238 char * green_value = value( Color.Green ); "G"
    239 \end{lstlisting}
    240 \CFA recognizes @Colour.Green@ as an Expression with enumeration type. [reference to resolution distance] An enumeration type can be safely converted into its value type T, @char *@ in the example. When assigning @Colour.Green@ to a reference @green_value@, which has type @char *@, the compiler adds the distance between an enumeration and type T, and the distance between type T and @char *@. If the distance is safe, \CFA will replace the expression @Colour.Green@ with @value( Colour.Green )@.
    241 
    242 \subsection{Variable Overloading}
     241// char * green_value = value( Colour.Green ); // "G"
     242\end{lstlisting}
     243
     244\subsection{Unification Distance}
     245\begin{lstlisting}[label=lst:unification_distance_example]
     246T2 Foo(T1);
     247\end{lstlisting}
     248The @Foo@ function expects a parameter with type @T1@. In C, only a value with the exact type @T1@ can be used as a parameter for @Foo@. In \CFA, @Foo@ accepts a value of some type @T3@ as long as @distance(T1, T3)@ is not @Infinite@.
     249
     250@path(A, B)@ is a compiler concept that returns one of the following:
     251\begin{itemize}
     252    \item Zero or 0, if and only if $A == B$.
     253    \item Safe, if B can be used as A without losing its precision, or B is a subtype of A.
     254    \item Unsafe, if B loses its precision when used as A, or A is a subtype of B.
     255    \item Infinite, if B cannot be used as A. A is not a subtype of B and B is not a subtype of A.
     256\end{itemize}
     257
     258For example, @path(int, int)==Zero@, @path(int, char)==Safe@, @path(int, double)==Unsafe@, @path(int, struct S)@ is @Infinite@ for @struct S{}@.
     259@distance(A, C)@ is the minimum sum of paths from A to C. For example, if @path(A, B)==i@, @path(B, C)==j@, and @path(A, C)==k@, then $$distance(A,C)==min(path(A,C), path(A,B)+path(B,C))==min(k, i+j)$$.
     260
     261(Skip over the distance matrix here because it is mostly irrelevant for enumeration discussion. In the actual implementation, distance( E, T ) is 1.)
     262
     263The arithmetic of distance is the following:
     264\begin{itemize}
     265    \item $Zero + v = v$, for any value v.
     266    \item $Safe * k < Unsafe$, for finite k.
     267    \item $Unsafe * k < Infinite$, for finite k.
     268    \item $Infinite + v = Infinite$, for any value v.
     269\end{itemize}
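
The arithmetic above can be modelled with a short Python sketch. This is only an illustration, not the \CFA implementation: the encoding of a cost as a pair (unsafe steps, safe steps), and the names @add@, @times@, @distance@, and @EXAMPLE_PATHS@, are assumptions introduced here. Tuples compare lexicographically, so any finite number of safe steps stays below a single unsafe step.

```python
import math

# Assumed encoding (not from the proposal): a cost is the pair
# (number of unsafe steps, number of safe steps).
ZERO = (0, 0)
SAFE = (0, 1)
UNSAFE = (1, 0)
INFINITE = (math.inf, math.inf)


def add(a, b):
    """Sum two costs; Infinite + v = Infinite and Zero + v = v."""
    if INFINITE in (a, b):
        return INFINITE
    return (a[0] + b[0], a[1] + b[1])


def times(cost, k):
    """Repeat a cost k times (k finite), e.g. Safe * k."""
    total = ZERO
    for _ in range(k):
        total = add(total, cost)
    return total


def distance(paths, src, dst):
    """Minimum summed path cost from src to dst.

    paths maps (A, B) to path(A, B); the costs are relaxed repeatedly
    (Bellman-Ford style) until the cheapest chain of paths is found.
    """
    nodes = {n for edge in paths for n in edge} | {src, dst}
    best = {n: INFINITE for n in nodes}
    best[src] = ZERO
    for _ in range(len(nodes)):
        for (a, b), cost in paths.items():
            candidate = add(best[a], cost)
            if candidate < best[b]:
                best[b] = candidate
    return best[dst]


# Hypothetical types A, B, C with i = Safe, j = Safe, k = Unsafe:
# distance(A, C) == min(k, i + j) == Safe * 2, which is below Unsafe.
EXAMPLE_PATHS = {("A", "B"): SAFE, ("B", "C"): SAFE, ("A", "C"): UNSAFE}
```

With these definitions, @distance(EXAMPLE_PATHS, "A", "C")@ evaluates to @times(SAFE, 2)@: the two-step safe route is preferred over the direct unsafe path.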
     270
     271For @enum(T) E@, @path(T, E)==Safe@ and @path(E, T)==Infinite@. In other words, the enumeration type @E@ can be safely used as type @T@, but type @T@ cannot be used when the resolution context expects a variable with enumeration type @E@.
     272
     273
     274\subsection{Variable Overloading and Parameter Unification}
     275\CFA allows variable names to be overloaded. It is possible to overload a variable that has type T and an enumeration with type T.
    243276\begin{lstlisting}[label=lst:variable_overload]
     277char * green = "Green";
     278Colour green = Colour.Green; // "G"
     279
     280char * bar(char * s) { return s; }
    244281char * foo(Colour c) { return value( c ); }
    245 void bar(char * s) { return s; }
    246 Colour green = Colour.Green; // "G"
    247 char * green = "Green";
     282
    248283foo( green ); // "G"
    249284bar( green ); // "Green"
    250285\end{lstlisting}
     286\CFA's conversion distance helps disambiguate this overloading. The function @bar@ expects the parameter @s@ to have type @char *@. Because $distance(char *, char *) == Zero$ while $distance(char *, Colour) == Safe$ (the path from @char *@ to the enumeration with base type @char *@), \CFA unambiguously chooses the @green@ with type @char *@. On the other hand, for the function @foo@, $distance(Colour, Colour) == Zero$ while @distance(Colour, char *)@ is @Infinite@, so @foo@ picks the @green@ with type @Colour@.
    251287
    252288\subsection{Function Overloading}
     289Similarly, functions can be overloaded with different signatures. \CFA picks the correct function entity based on the distance between parameter types and the arguments.
    253290\begin{lstlisting}[label=lst:function_overload]
    254 void foo(Colour c) { return "It is an enum"; }
    255 void foo(char * s) { return "It is a string"; }
     291Colour green = Colour.Green;
     292void foo(Colour c) { sout | "It is an enum"; } // First foo
     293void foo(char * s) { sout | "It is a string"; } // Second foo
    256294foo( green ); // "It is an enum"
    257295\end{lstlisting}
    258 
    259 As a consequence, the semantics of using \CFA enumeration as a flag for selection is identical to C enumeration.
    260 
    261 
    262 % \section{Enumeration Features}
    263 
    264 A trait is a collection of constraints in \CFA that can be used to describe types.
    265 The \CFA standard library defines traits to categorize types with related enumeration features.
     296Because @distance(Colour, Colour)@ is @Zero@ and @distance(char *, Colour)@ is @Safe@, \CFA determines that @foo( green )@ is a call to the first @foo@.
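
The selection rule used in the variable and function overloading examples can be sketched in Python as picking the candidate with the smallest finite distance. The sketch is a simplified model under stated assumptions: costs are plain numbers, the table @PATH@ lists only the three paths named in the text, and the helper names @resolve_variable@ and @pick_function@ are hypothetical, not part of the \CFA resolver.

```python
import math

# Simplified scalar costs (an assumption made for this sketch).
ZERO, SAFE, INFINITE = 0, 1, math.inf

# path(expected, actual) for enum(char *) Colour, as stated in the text.
PATH = {
    ("char *", "char *"): ZERO,
    ("Colour", "Colour"): ZERO,
    ("char *", "Colour"): SAFE,  # Colour can be used safely as char *
}


def path(expected, actual):
    """Cost of using a value of type `actual` where `expected` is required."""
    return PATH.get((expected, actual), INFINITE)


def resolve_variable(param_type, variable_types):
    """Pick which overloaded variable (by type) best fits a parameter."""
    best = min(variable_types, key=lambda t: path(param_type, t))
    return best if path(param_type, best) != INFINITE else None


def pick_function(arg_type, param_types):
    """Pick which overloaded function (by parameter type) best fits an argument."""
    best = min(param_types, key=lambda p: path(p, arg_type))
    return best if path(best, arg_type) != INFINITE else None


# `green` is overloaded as both a char * and a Colour.
GREEN_TYPES = ["Colour", "char *"]
```

Here @resolve_variable("char *", GREEN_TYPES)@ yields @"char *"@ (the call to @bar@), @resolve_variable("Colour", GREEN_TYPES)@ yields @"Colour"@ (the call to @foo@), and @pick_function("Colour", ["Colour", "char *"])@ selects the overload taking @Colour@.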
     297
     298\subsection{Attribute Functions}
     299The pseudo-function @value()@ ``unboxes'' the enumeration, and the type of the expression is the underlying type. Therefore, in Section~\ref{section:enumeration_as_value}, assigning @Colour.Green@ to a variable of type @char *@ has resolution distance @Safe@, while assigning @value(Colour.Green)@ to @char *@ has resolution distance @Zero@.
     300
     301\begin{lstlisting}[label=lst:declaration_code]
     302int s1;
     303\end{lstlisting}
     304The generated code for an enumeration instance is simply an @int@ that holds the position of the enumerator. Any usage of the variable @s1@ is converted to return one of its attributes (label, value, or position) according to the @Unification@ rule.
     305
     306% \subsection{Unification and Resolution (this implementation will probably not be used, safe as reference for now)}
     307
     308% \begin{lstlisting}
     309% enum Colour( char * ) { Red = "R", Green = "G", Blue = "B"  };
     310% \end{lstlisting}
     311% The @EnumInstType@ is convertible to other types.
     312% A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
     313% The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
     314
     315% \begin{lstlisting}[caption={Null Context}, label=lst:null_context]
     316% {
     317%       Colour.Green;
     318% }
     319% \end{lstlisting}
     320% In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
     321% In this case, any of the attributes is resolvable.
     322% According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
     323% The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
     324% When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.
     325% \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context]
     326% {
     327%       "G";
     328% }
     329% \end{lstlisting}
     330% \begin{lstlisting}[caption={int Context}, label=lst:int_context]
     331% {
     332%       int g = Colour.Green;
     333% }
     334% \end{lstlisting}
     335% The assignment expression gives a context for the EnumInstType resolution.
     336% The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.
     337% The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.
     338% In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.
     339% $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call
     340% \begin{lstlisting}[label=lst:attr_resolution_1]
     341% {
     342%       Unify( int, char * ); // unify with the type of value
     343%       Unify( int, int ); // unify with the type of position
     344%       Unify( int, char * ); // unify with the type of label
     345% }
     346% \end{lstlisting}
     347% \begin{lstlisting}[label=lst:attr_resolution_precedence]
     348% {
     349%       Unification( T1, EnumInstType<T2> ) {
     350%               if ( Unify( T1, T2 ) ) return T2;
     351%               if ( Unify( T1, int ) ) return int;
     352%               if ( Unify( T1, char * ) ) return char *;
     353%               Error: Cannot Unify T1 with EnumInstType<T2>;
     354%       }
     355% }
     356% \end{lstlisting}
     357% After the unification, @EnumInstType@ is replaced by its attributes.
     358
     359% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
     360% {
     361%       T2 foo ( T1 ); // function take variable with T1 as a parameter
     362%       foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>
     363%       >>>> Unification( T1, EnumInstType<T3> )
     364% }
     365% \end{lstlisting}
     366% % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.
     367% Backward conversion:
     368% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
     369% {
     370%       enum Colour colour = 1;
     371% }
     372% \end{lstlisting}
     373
     374% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
     375% {
     376%    Unification( EnumInstType<Colour>, int ) >>> label
     377% }
     378% \end{lstlisting}
     379% @int@ can be unified with the label of Colour.
     380% @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into
     381% \begin{lstlisting}
     382% {
     383%    enum Colour colour = Colour.Green;
     384% }
     385% \end{lstlisting}
     386% Steps:
     387% \begin{enumerate}
     388% \item
     389% identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
     390% \item
     391% @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@
     392% \item
     393% return the enumeration constant at position 1
     394% \end{enumerate}
     395% \begin{lstlisting}
     396% {
     397%       enum T (int) { ... } // Declaration
     398%       enum T t = 1;
     399% }
     400% \end{lstlisting}
     401% Steps:
     402% \begin{enumerate}
     403% \item
     404% identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
     405% \item
     406% @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@
     407% \item
     408% return the FIRST enumeration constant that has the value 1, by searching through the values array
     409% \end{enumerate}
     410% The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness
     411
     412% \subsection{Casting}
     413% Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:
     414% \begin{lstlisting}
     415% enum( int ) Foo { A = 10, B = 100, C = 1000 };
     416% (int) Foo.A;
     417% \end{lstlisting}
     418% The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.
     419
     420% \subsection{Value Conversion}
     421% As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.
     422
     423% \begin{lstlisting}
     424% Foo a; // int a;
     425% int j = a;
     426% char * s = a;
     427% \end{lstlisting}
     428% Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. The generated code for the second line will be
     429% \begin{lstlisting}
     430% int j = value( Foo, a )
     431% \end{lstlisting}
     432% Similarly, the generated code for the third line is
     433% \begin{lstlisting}
     434% char * j = label( Foo, a )
     435% \end{lstlisting}
     436
    266437
    267438\section{Enumerator Initialization}
     
    436607\section{Implementation}
    437608
    438 \subsection{Compiler Representation}
     609\subsection{Compiler Representation (Reworking)}
    439610The definition of an enumeration is represented by an internal type called @EnumDecl@. At the minimum, it stores all the information needed to construct the companion object. Therefore, an @EnumDecl@ can be represented as the following:
    440611\begin{lstlisting}[label=lst:EnumDecl]
     
    464635Companion data are needed only if the corresponding pseudo-functions are called. For example, the values of the enumeration Workday are loaded only if at least one compilation calls $value(Workday)$. Once the values are loaded, all compilations share the values array to reduce memory usage.
    465636
    466 <Investiage: how to implement this is huge>
    467637
    468638\subsection{(Rework) Companion Object and Companion Function}
     
    636806
    637807The declaration of a \CFA-enumeration variable has the same syntax as that of a C-enumeration. Internally, such a variable is represented as an @EnumInstType@.
    638 \begin{lstlisting}[label=lst:declaration_code]
    639 int s1;
    640 \end{lstlisting}
    641 The generated code for an enumeration instance is simply an int. It is to hold the position of an enumeration. And usage of variable @s1@ will be converted to return one of its attributes: label, value, or position, with respect to the @Unification@ rule
    642 
    643 \subsection{Unification and Resolution }
    644 
    645 
    646 \begin{lstlisting}
    647 enum Colour( char * ) { Red = "R", Green = "G", Blue = "B"  };
    648 \end{lstlisting}
    649 The @EnumInstType@ is convertible to other types.
    650 A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
    651 The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
    652 
    653 \begin{lstlisting}[caption={Null Context}, label=lst:null_context]
    654 {
    655         Colour.Green;
    656 }
    657 \end{lstlisting}
    658 In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
    659 In this case, any of the attributes is resolvable.
    660 According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
    661 The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
    662 When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.
    663 \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context]
    664 {
    665         "G";
    666 }
    667 \end{lstlisting}
    668 \begin{lstlisting}[caption={int Context}, label=lst:int_context]
    669 {
    670         int g = Colour.Green;
    671 }
    672 \end{lstlisting}
    673 The assignment expression gives a context for the EnumInstType resolution.
    674 The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.
    675 The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.
    676 In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.
    677 $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call
    678 \begin{lstlisting}[label=lst:attr_resolution_1]
    679 {
    680         Unify( int, char * ); // unify with the type of value
    681         Unify( int, int ); // unify with the type of position
    682         Unify( int, char * ); // unify with the type of label
    683 }
    684 \end{lstlisting}
    685 \begin{lstlisting}[label=lst:attr_resolution_precedence]
    686 {
    687         Unification( T1, EnumInstType<T2> ) {
    688                 if ( Unify( T1, T2 ) ) return T2;
    689                 if ( Unify( T1, int ) ) return int;
    690                 if ( Unify( T1, char * ) ) return char *;
    691                 Error: Cannot Unify T1 with EnumInstType<T2>;
    692         }
    693 }
    694 \end{lstlisting}
    695 After the unification, @EnumInstType@ is replaced by its attributes.
    696 
    697 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
    698 {
    699         T2 foo ( T1 ); // function take variable with T1 as a parameter
    700         foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>
    701         >>>> Unification( T1, EnumInstType<T3> )
    702 }
    703 \end{lstlisting}
    704 % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.
    705 Backward conversion:
    706 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
    707 {
    708         enum Colour colour = 1;
    709 }
    710 \end{lstlisting}
    711 
    712 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
    713 {
    714    Unification( EnumInstType<Colour>, int ) >>> label
    715 }
    716 \end{lstlisting}
    717 @int@ can be unified with the label of Colour.
    718 @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into
    719 \begin{lstlisting}
    720 {
    721    enum Colour colour = Colour.Green;
    722 }
    723 \end{lstlisting}
    724 Steps:
    725 \begin{enumerate}
    726 \item
    727 identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
    728 \item
    729 @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@
    730 \item
    731 return the enumeration constant at position 1
    732 \end{enumerate}
    733 \begin{lstlisting}
    734 {
    735         enum T (int) { ... } // Declaration
    736         enum T t = 1;
    737 }
    738 \end{lstlisting}
    739 Steps:
    740 \begin{enumerate}
    741 \item
    742 identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
    743 \item
    744 @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@
    745 \item
    746 return the FIRST enumeration constant that has the value 1, by searching through the values array
    747 \end{enumerate}
    748 The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness
    749 
    750 \subsection{Casting}
    751 Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:
    752 \begin{lstlisting}
    753 enum( int ) Foo { A = 10, B = 100, C = 1000 };
    754 (int) Foo.A;
    755 \end{lstlisting}
    756 The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.
    757 
    758 \subsection{Value Conversion}
    759 As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.
    760 
    761 \begin{lstlisting}
    762 Foo a; // int a;
    763 int j = a;
    764 char * s = a;
    765 \end{lstlisting}
    766 Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. The generated code for the second line will be
    767 \begin{lstlisting}
    768 int j = value( Foo, a )
    769 \end{lstlisting}
    770 Similarly, the generated code for the third line is
    771 \begin{lstlisting}
    772 char * j = label( Foo, a )
    773 \end{lstlisting}
    774808
    775809