Changes in / [69ab896:5546eee4]


File:
1 edited

  • doc/proposals/enum.tex

    r69ab896 r5546eee4  
    205205\end{lstlisting}
    206206
    207 \section{Enumeration Characteristic}
    208 
    209 \subsection{Enumerator Storage}
     207\section{Enumeration Storage}
     208
     209\subsection{Enumeration Variable}
    210210
    211211Although \CFA enumeration captures three different attributes, an enumeration instance does not store all this information.
     
    231231These generated functions are $Companion Functions$; they take a $companion$ object and the position as parameters.
    232232
     233\subsection{Enumeration Data}
     234\begin{lstlisting}[label=lst:enumeration_backing_data]
     235enum(T) E { ... };
     236// backing data
     237T* E_values;
     238char** E_labels;
     239\end{lstlisting}
     240Storing values and labels in arrays supports enumeration features that require a runtime lookup. However, these data structures add memory overhead to a program. We want to reduce the memory usage of enumeration support by:
     241\begin{itemize}
     242    \item Generating the data arrays only when necessary.
     243    \item Sharing the data structures among compilation units, so that no extra overhead is incurred when the data structures are requested multiple times.
     244\end{itemize}
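As a minimal sketch of what the backing data could look like after lowering, consider the C fragment below. It assumes an enumeration @enum(int) OddNumber { A = 1, B = 3 }@ whose instances store only a position; the names @OddNumber_count@, @OddNumber_values@, @OddNumber_labels@, @OddNumber_value@ and @OddNumber_label@ are hypothetical and are not taken from the \CFA implementation.

```c
#include <assert.h>

/* Hypothetical lowering of: enum(int) OddNumber { A = 1, B = 3 }; */
enum { OddNumber_count = 2 };

/* Backing data, indexed by enumerator position (0 for A, 1 for B). */
static const int OddNumber_values[OddNumber_count] = { 1, 3 };
static const char * const OddNumber_labels[OddNumber_count] = { "A", "B" };

/* Illustrative stand-ins for the value and label pseudo-functions. */
int OddNumber_value(int position) {
    assert(position >= 0 && position < OddNumber_count);
    return OddNumber_values[position];
}

const char * OddNumber_label(int position) {
    assert(position >= 0 && position < OddNumber_count);
    return OddNumber_labels[position];
}
```

Both arrays exist only to serve runtime lookups; if every use of the enumeration can be resolved at compile time, neither array needs to be emitted.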
     245
     246
     247\subsection{Aggressive Inline}
     248To avoid allocating memory for enumeration data structures, \CFA inlines the result of an enumeration attribute pseudo-function whenever possible.
     249\begin{lstlisting}[label=lst:enumeration_inline]
     250enum(int) OddNumber { A=1, B=3 };
     251sout | "A: " | OddNumber.A | "B: " | OddNumber.B | "A+B: " | OddNumber.A + OddNumber.B;
     252\end{lstlisting}
     253Instead of calling the pseudo-function @value@ on the expressions @OddNumber.A@ and @OddNumber.B@, \CFA inlines the constant expressions 1 and 3, respectively, because the results are known statically. Because no runtime lookup of an enumeration value is necessary, \CFA does not generate the data structures for the enumeration @OddNumber@.
     254
     255\subsection{Weak Reference}
     256\begin{lstlisting}[label=lst:week_ref]
     257enum(int) OddNumber { A=1, B=3 };
     258enum OddNumber i = ...;
     259...
     260sout | i;
     261\end{lstlisting}
     262In this example, \CFA cannot statically determine the value of the enumeration variable @i@, so a runtime lookup is necessary. @OddNumber@ can be referenced in multiple compilation units, and allocating the arrays in every one of them is undesirable. \CFA addresses this by declaring the value array as a weak reference. All compilation units that reference @OddNumber@ hold weak references to the same enumeration data structure. No extra memory is allocated as more compilation units reference @OddNumber@, and the data for @OddNumber@ is initialized once.
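One way the sharing could be realized in generated C is with a weak definition, sketched below. This is an assumption for illustration only: the @weak@ attribute is a GCC/Clang extension, and the names @OddNumber_shared_values@ and @OddNumber_lookup@ are hypothetical rather than taken from the \CFA code generator.

```c
#include <assert.h>

/* Hypothetical generated fragment, emitted into every compilation unit
   that needs a runtime lookup for OddNumber. The weak attribute is a
   GCC/Clang extension (assumed here): when several units carry the same
   weak definition, the linker keeps a single copy. */
__attribute__((weak)) const int OddNumber_shared_values[] = { 1, 3 };

/* Runtime lookup by position, shown for one compilation unit only. */
int OddNumber_lookup(int position) {
    assert(position >= 0 && position < 2);
    return OddNumber_shared_values[position];
}
```

With this scheme, a compilation unit that never performs a runtime lookup emits nothing, and units that do perform one all resolve to the same array at link time.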
     263
     264\section{Unification}
     265
    233266\subsection{Enumeration as Value}
    234 An \CFA enumeration with base type T can be used seamlessly as T.
     267\label{section:enumeration_as_value}
     268A \CFA enumeration with base type T can be used seamlessly as T, without explicitly calling the pseudo-function @value@.
    235269\begin{lstlisting}[label=lst:implicit_conversion]
    236270char * green_value = Colour.Green; // "G"
    237271// Is equivalent to
    238 char * green_value = value( Color.Green ); "G"
    239 \end{lstlisting}
    240 \CFA recognizes @Colour.Green@ as an Expression with enumeration type. [reference to resolution distance] An enumeration type can be safely converted into its value type T, @char *@ in the example. When assigning @Colour.Green@ to a reference @green_value@, which has type @char *@, the compiler adds the distance between an enumeration and type T, and the distance between type T and @char *@. If the distance is safe, \CFA will replace the expression @Colour.Green@ with @value( Colour.Green )@.
    241 
    242 \subsection{Variable Overloading}
     272// char * green_value = value( Colour.Green ); // "G"
     273\end{lstlisting}
     274
     275\subsection{Unification Distance}
     276\begin{lstlisting}[label=lst:unification_distance_example]
     277T2 Foo(T1);
     278\end{lstlisting}
     279The @Foo@ function expects a parameter of type @T1@. In C, only a value with the exact type @T1@ can be used as an argument to @Foo@. In \CFA, @Foo@ accepts a value of some type @T3@ as long as @distance(T1, T3)@ is not @Infinite@.
     280
     281@path(A, B)@ is a compiler concept that returns one of the following:
     282\begin{itemize}
     283    \item Zero or 0, if and only if $A == B$.
     284    \item Safe, if B can be used as A without losing its precision, or B is a subtype of A.
     285    \item Unsafe, if B loses its precision when used as A, or A is a subtype of B.
     286    \item Infinite, if B cannot be used as A. A is not a subtype of B and B is not a subtype of A.
     287\end{itemize}
     288
     289For example, @path(int, int)==Zero@, @path(int, char)==Safe@, @path(int, double)==Unsafe@, @path(int, struct S)@ is @Infinite@ for @struct S{}@.
     290@distance(A, C)@ is the minimum sum of paths from A to C. For example, if @path(A, B)==i@, @path(B, C)==j@, and @path(A, C)==k@, then $$distance(A,C) = \min(path(A,B) + path(B,C),\ path(A,C)) = \min(i+j,\ k).$$
     291
     292(Skip over the distance matrix here because it is mostly irrelevant for enumeration discussion. In the actual implementation, distance( E, T ) is 1.)
     293
     294The arithmetic of distance is the following:
     295\begin{itemize}
     296    \item $Zero + v = v$, for any value v.
     297    \item $Safe * k <  Unsafe$, for finite k.
     298    \item $Unsafe * k < Infinite$, for finite k.
     299    \item $Infinite + v = Infinite$, for any value v.
     300\end{itemize}
     301
     302For @enum(T) E@, @path(T, E)==Safe@ and @path(E,T)==Infinite@. In other words, enumeration type E can be @safely@ used as type T, but type T cannot be used when the resolution context expects a variable with enumeration type @E@.
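The arithmetic above can be sketched in C by counting steps and comparing lexicographically. This is an illustrative model under stated assumptions, not the resolver's actual cost representation: the type @Cost@ and the functions @cost_zero@, @cost_safe@, @cost_unsafe@, @cost_infinite@, @cost_add@, @cost_cmp@, @path_base_from_enum@ and @path_enum_from_base@ are all hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical cost model: a distance counts the Unsafe and Safe steps
   on a conversion path, or is marked Infinite when no path exists. */
typedef struct {
    bool infinite;     /* true when no conversion exists */
    unsigned unsafe;   /* number of Unsafe steps on the path */
    unsigned safe;     /* number of Safe steps on the path */
} Cost;

Cost cost_zero(void)     { return (Cost){ false, 0, 0 }; }
Cost cost_safe(void)     { return (Cost){ false, 0, 1 }; }
Cost cost_unsafe(void)   { return (Cost){ false, 1, 0 }; }
Cost cost_infinite(void) { return (Cost){ true, 0, 0 }; }

/* Zero + v = v, and Infinite + v = Infinite. */
Cost cost_add(Cost a, Cost b) {
    if (a.infinite || b.infinite) return cost_infinite();
    return (Cost){ false, a.unsafe + b.unsafe, a.safe + b.safe };
}

/* Lexicographic order: any finite cost is below Infinite, and any number
   of Safe steps is below a single Unsafe step.
   Returns -1, 0 or 1 for a < b, a == b, a > b. */
int cost_cmp(Cost a, Cost b) {
    if (a.infinite != b.infinite) return a.infinite ? 1 : -1;
    if (a.infinite) return 0;
    if (a.unsafe != b.unsafe) return a.unsafe < b.unsafe ? -1 : 1;
    if (a.safe != b.safe) return a.safe < b.safe ? -1 : 1;
    return 0;
}

/* For enum(T) E: path(T, E) is Safe and path(E, T) is Infinite. */
Cost path_base_from_enum(void) { return cost_safe(); }
Cost path_enum_from_base(void) { return cost_infinite(); }
```

Under this model, a candidate that needs no conversion (@Zero@) always beats one that converts an enumeration to its base type (@Safe@), and a candidate that would need to convert a base-type value into an enumeration is never viable.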
     303
     304
     305\subsection{Variable Overloading and Parameter Unification}
     306\CFA allows variable names to be overloaded. It is possible to overload a variable that has type T and an enumeration with type T.
    243307\begin{lstlisting}[label=lst:variable_overload]
     308char * green = "Green";
     309Colour green = Colour.Green; // "G"
     310
     311char * bar(char * s) { return s; }
    244312char * foo(Colour c) { return value( c ); }
    245 void bar(char * s) { return s; }
    246 Colour green = Colour.Green; // "G"
    247 char * green = "Green";
     313
    248314foo( green ); // "G"
    249315bar( green ); // "Green"
    250316\end{lstlisting}
     317\CFA's conversion distance disambiguates this overloading. The function @bar@ expects the parameter @s@ to have type @char *@. Because $distance(char *,char *) == Zero$ while $distance(char *, Colour) == Safe$ (the path from @char *@ to the enumeration with base type @char *@), \CFA unambiguously chooses the @green@ with type @char *@. On the other hand, for the function @foo@, @distance(Colour, char *)@ is @Infinite@, so @foo@ picks the @green@ with type @Colour@.
    251318
    252319\subsection{Function Overloading}
     320Similarly, functions can be overloaded with different signatures. \CFA picks the correct function entity based on the distance between parameter types and the arguments.
    253321\begin{lstlisting}[label=lst:function_overload]
    254 void foo(Colour c) { return "It is an enum"; }
    255 void foo(char * s) { return "It is a string"; }
     322Colour green = Colour.Green;
     323void foo(Colour c) { sout | "It is an enum"; } // First foo
     324void foo(char * s) { sout | "It is a string"; } // Second foo
    256325foo( green ); // "It is an enum"
    257326\end{lstlisting}
    258 
    259 As a consequence, the semantics of using \CFA enumeration as a flag for selection is identical to C enumeration.
    260 
    261 
    262 % \section{Enumeration Features}
    263 
    264 A trait is a collection of constraints in \CFA that can be used to describe types.
    265 The \CFA standard library defines traits to categorize types with related enumeration features.
     327Because @distance(Colour, Colour)@ is @Zero@ and @distance(char *, Colour)@ is @Safe@, \CFA determines that @foo( green )@ is a call to the first @foo@.
     328
     329\subsection{Attribute Functions}
     330The pseudo-function @value()@ ``unboxes'' the enumeration, and the type of the resulting expression is the underlying type. Therefore, in Section~\ref{section:enumeration_as_value}, assigning @Colour.Green@ to a variable of type @char *@ has resolution distance @Safe@, while assigning @value(Colour.Green)@ to a @char *@ has resolution distance @Zero@.
     331
     332\begin{lstlisting}[label=lst:declaration_code]
     333int s1;
     334\end{lstlisting}
     335The generated code for an enumeration instance is simply an @int@ that holds the position of the enumerator. Any usage of the variable @s1@ is converted to return one of its attributes (label, value, or position) according to the @Unification@ rule.
     336
     337% \subsection{Unification and Resolution (this implementation will probably not be used, saved as reference for now)}
     338
     339% \begin{lstlisting}
     340% enum Colour( char * ) { Red = "R", Green = "G", Blue = "B"  };
     341% \end{lstlisting}
     342% The @EnumInstType@ is convertible to other types.
     343% A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
     344% The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
     345
     346% \begin{lstlisting}[caption={Null Context}, label=lst:null_context]
     347% {
     348%       Colour.Green;
     349% }
     350% \end{lstlisting}
     351% In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
     352% In this case, any of the attributes is resolvable.
     353% According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
     354% The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
     355% When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.
     356% \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context]
     357% {
     358%       "G";
     359% }
     360% \end{lstlisting}
     361% \begin{lstlisting}[caption={int Context}, label=lst:int_context]
     362% {
     363%       int g = Colour.Green;
     364% }
     365% \end{lstlisting}
     366% The assignment expression gives a context for the EnumInstType resolution.
     367% The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.
     368% The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.
     369% In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.
     370% $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call
     371% \begin{lstlisting}[label=lst:attr_resolution_1]
     372% {
     373%       Unify( int, char * ); // unify with the type of value
     374%       Unify( int, int ); // unify with the type of position
     375%       Unify( int, char * ); // unify with the type of label
     376% }
     377% \end{lstlisting}
     378% \begin{lstlisting}[label=lst:attr_resolution_precedence]
     379% {
     380%       Unification( T1, EnumInstType<T2> ) {
     381%               if ( Unify( T1, T2 ) ) return T2;
     382%               if ( Unify( T1, int ) ) return int;
     383%               if ( Unify( T1, char * ) ) return char *;
     384%               Error: Cannot Unify T1 with EnumInstType<T2>;
     385%       }
     386% }
     387% \end{lstlisting}
     388% After the unification, @EnumInstType@ is replaced by its attributes.
     389
     390% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
     391% {
     392%       T2 foo ( T1 ); // function take variable with T1 as a parameter
     393%       foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>
     394%       >>>> Unification( T1, EnumInstType<T3> )
     395% }
     396% \end{lstlisting}
     397% % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.
     398% Backward conversion:
     399% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
     400% {
     401%       enum Colour colour = 1;
     402% }
     403% \end{lstlisting}
     404
     405% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
     406% {
     407%    Unification( EnumInstType<Colour>, int ) >>> label
     408% }
     409% \end{lstlisting}
     410% @int@ can be unified with the label of Colour.
     411% @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into
     412% \begin{lstlisting}
     413% {
     414%    enum Colour colour = Colour.Green;
     415% }
     416% \end{lstlisting}
     417% Steps:
     418% \begin{enumerate}
     419% \item
     420% identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
     421% \item
     422% @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@
     423% \item
     424% return the enumeration constant at position 1
     425% \end{enumerate}
     426% \begin{lstlisting}
     427% {
     428%       enum T (int) { ... } // Declaration
     429%       enum T t = 1;
     430% }
     431% \end{lstlisting}
     432% Steps:
     433% \begin{enumerate}
     434% \item
     435% identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
     436% \item
     437% @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@
     438% \item
     439% return the FIRST enumeration constant that has the value 1, by searching through the values array
     440% \end{enumerate}
     441% The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness
     442
     443% \subsection{Casting}
     444% Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:
     445% \begin{lstlisting}
     446% enum( int ) Foo { A = 10, B = 100, C = 1000 };
     447% (int) Foo.A;
     448% \end{lstlisting}
     449% The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.
     450
     451% \subsection{Value Conversion}
     452% As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.
     453
     454% \begin{lstlisting}
     455% Foo a; // int a;
     456% int j = a;
     457% char * s = a;
     458% \end{lstlisting}
     459% Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. The generated code for the second line will be
     460% \begin{lstlisting}
     461% int j = value( Foo, a )
     462% \end{lstlisting}
     463% Similarly, the generated code for the third line is
     464% \begin{lstlisting}
     465% char * j = label( Foo, a )
     466% \end{lstlisting}
     467
    266468
    267469\section{Enumerator Initialization}
     
    436638\section{Implementation}
    437639
    438 \subsection{Compiler Representation}
     640\subsection{Compiler Representation (Reworking)}
    439641The definition of an enumeration is represented by an internal type called @EnumDecl@. At the minimum, it stores all the information needed to construct the companion object. Therefore, an @EnumDecl@ can be represented as the following:
    440642\begin{lstlisting}[label=lst:EnumDecl]
     
    464666Companion data are needed only if the corresponding pseudo-functions are called. For example, the values of the enumeration Workday are loaded only if at least one compilation unit calls @value(Workday)@. Once the values are loaded, all compilation units share the same value array to reduce memory usage.
    465667
    466 <Investiage: how to implement this is huge>
    467668
    468669\subsection{(Rework) Companion Object and Companion Function}
     
    636837
    637838The declaration of a \CFA-enumeration variable has the same syntax as that of a C-enumeration variable. Internally, such a variable is represented as an EnumInstType.
    638 \begin{lstlisting}[label=lst:declaration_code]
    639 int s1;
    640 \end{lstlisting}
    641 The generated code for an enumeration instance is simply an int. It is to hold the position of an enumeration. And usage of variable @s1@ will be converted to return one of its attributes: label, value, or position, with respect to the @Unification@ rule
    642 
    643 \subsection{Unification and Resolution }
    644 
    645 
    646 \begin{lstlisting}
    647 enum Colour( char * ) { Red = "R", Green = "G", Blue = "B"  };
    648 \end{lstlisting}
    649 The @EnumInstType@ is convertible to other types.
    650 A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
    651 The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
    652 
    653 \begin{lstlisting}[caption={Null Context}, label=lst:null_context]
    654 {
    655         Colour.Green;
    656 }
    657 \end{lstlisting}
    658 In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
    659 In this case, any of the attributes is resolvable.
    660 According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
    661 The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
    662 When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.
    663 \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context]
    664 {
    665         "G";
    666 }
    667 \end{lstlisting}
    668 \begin{lstlisting}[caption={int Context}, label=lst:int_context]
    669 {
    670         int g = Colour.Green;
    671 }
    672 \end{lstlisting}
    673 The assignment expression gives a context for the EnumInstType resolution.
    674 The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.
    675 The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.
    676 In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.
    677 $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call
    678 \begin{lstlisting}[label=lst:attr_resolution_1]
    679 {
    680         Unify( int, char * ); // unify with the type of value
    681         Unify( int, int ); // unify with the type of position
    682         Unify( int, char * ); // unify with the type of label
    683 }
    684 \end{lstlisting}
    685 \begin{lstlisting}[label=lst:attr_resolution_precedence]
    686 {
    687         Unification( T1, EnumInstType<T2> ) {
    688                 if ( Unify( T1, T2 ) ) return T2;
    689                 if ( Unify( T1, int ) ) return int;
    690                 if ( Unify( T1, char * ) ) return char *;
    691                 Error: Cannot Unify T1 with EnumInstType<T2>;
    692         }
    693 }
    694 \end{lstlisting}
    695 After the unification, @EnumInstType@ is replaced by its attributes.
    696 
    697 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
    698 {
    699         T2 foo ( T1 ); // function take variable with T1 as a parameter
    700         foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>
    701         >>>> Unification( T1, EnumInstType<T3> )
    702 }
    703 \end{lstlisting}
    704 % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.
    705 Backward conversion:
    706 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
    707 {
    708         enum Colour colour = 1;
    709 }
    710 \end{lstlisting}
    711 
    712 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
    713 {
    714    Unification( EnumInstType<Colour>, int ) >>> label
    715 }
    716 \end{lstlisting}
    717 @int@ can be unified with the label of Colour.
    718 @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into
    719 \begin{lstlisting}
    720 {
    721    enum Colour colour = Colour.Green;
    722 }
    723 \end{lstlisting}
    724 Steps:
    725 \begin{enumerate}
    726 \item
    727 identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
    728 \item
    729 @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@
    730 \item
    731 return the enumeration constant at position 1
    732 \end{enumerate}
    733 \begin{lstlisting}
    734 {
    735         enum T (int) { ... } // Declaration
    736         enum T t = 1;
    737 }
    738 \end{lstlisting}
    739 Steps:
    740 \begin{enumerate}
    741 \item
    742 identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
    743 \item
    744 @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@
    745 \item
    746 return the FIRST enumeration constant that has the value 1, by searching through the values array
    747 \end{enumerate}
    748 The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness
    749 
    750 \subsection{Casting}
    751 Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:
    752 \begin{lstlisting}
    753 enum( int ) Foo { A = 10, B = 100, C = 1000 };
    754 (int) Foo.A;
    755 \end{lstlisting}
    756 The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.
    757 
    758 \subsection{Value Conversion}
    759 As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.
    760 
    761 \begin{lstlisting}
    762 Foo a; // int a;
    763 int j = a;
    764 char * s = a;
    765 \end{lstlisting}
    766 Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. The generated code for the second line will be
    767 \begin{lstlisting}
    768 int j = value( Foo, a )
    769 \end{lstlisting}
    770 Similarly, the generated code for the third line is
    771 \begin{lstlisting}
    772 char * j = label( Foo, a )
    773 \end{lstlisting}
    774839
    775840