Changes in / [69ab896:5546eee4]
doc/proposals/enum.tex
r69ab896 r5546eee4
205 205 \end{lstlisting}
206 206
207 \section{Enumeration Characteristic}
208
209 \subsection{Enumerator Storage}
207 \section{Enumeration Storage}
208
209 \subsection{Enumeration Variable}
210 210
211 211 Although a \CFA enumeration captures three different attributes, an enumeration instance does not store all of this information.
… …
231 231 These generated functions are $Companion Functions$; they take a $companion$ object and the position as parameters.
232 232
233 \subsection{Enumeration Data}
234 \begin{lstlisting}[label=lst:enumeration_backing_data]
235 enum(T) E { ... };
236 // backing data
237 T* E_values;
238 char** E_labels;
239 \end{lstlisting}
240 Storing values and labels as arrays can sometimes help support enumeration features. However, these data structures add memory overhead to programs. We want to reduce the memory used for enumeration support by:
241 \begin{itemize}
242 \item generating the data arrays only when necessary;
243 \item sharing the data structures across compilation units, so that requesting them more than once adds no extra overhead.
244 \end{itemize}
245
246
247 \subsection{Aggressive Inline}
248 To avoid allocating memory for enumeration data structures, \CFA inlines the result of an enumeration attribute pseudo-function whenever possible.
249 \begin{lstlisting}[label=lst:enumeration_inline]
250 enum(int) OddNumber { A=1, B=3 };
251 sout | "A: " | OddNumber.A | "B: " | OddNumber.B | "A+B: " | OddNumber.A + OddNumber.B;
252 \end{lstlisting}
253 Because the results are known statically, \CFA does not call the pseudo-function @value@ on the expressions $OddNumber.A$ and $OddNumber.B$; it inlines the constant expressions 1 and 3, respectively. Since no runtime lookup of an enumeration value is needed, \CFA does not generate the data structures for the enumeration OddNumber.
254
255 \subsection{Weak Reference}
256 \begin{lstlisting}[label=lst:week_ref]
257 enum(int) OddNumber { A=1, B=3 };
258 enum OddNumber i = ...;
259 ...
260 sout | i;
261 \end{lstlisting}
262 In this example, \CFA cannot determine the value of the enumeration variable i statically, so a runtime lookup is necessary. OddNumber can be referenced in multiple compilation units, and allocating the arrays in every one of them is undesirable. \CFA addresses this by declaring the value array as a weak reference: all compilation units that reference OddNumber hold weak references to the same enumeration data structure. No extra memory is allocated when more compilation units reference OddNumber, and the data for OddNumber is initialized once.
263
264 \section{Unification}
265
233 266 \subsection{Enumeration as Value}
234 An \CFA enumeration with base type T can be used seamlessly as T.
267 \label{section:enumeration_as_value}
268 A \CFA enumeration with base type T can be used seamlessly as T, without explicitly calling the pseudo-function @value@.
235 269 \begin{lstlisting}[label=lst:implicit_conversion]
236 270 char * green_value = Colour.Green; // "G"
237 271 // Is equivalent to
238 char * green_value = value( Color.Green ); "G"
239 \end{lstlisting}
240 \CFA recognizes @Colour.Green@ as an Expression with enumeration type. [reference to resolution distance] An enumeration type can be safely converted into its value type T, @char *@ in the example. When assigning @Colour.Green@ to a reference @green_value@, which has type @char *@, the compiler adds the distance between an enumeration and type T, and the distance between type T and @char *@. If the distance is safe, \CFA will replace the expression @Colour.Green@ with @value( Colour.Green )@.
241
242 \subsection{Variable Overloading}
272 // char * green_value = value( Colour.Green ); // "G"
273 \end{lstlisting}
274
275 \subsection{Unification Distance}
276 \begin{lstlisting}[label=lst:unification_distance_example]
277 T2 Foo(T1);
278 \end{lstlisting}
279 The @Foo@ function expects a parameter with type @T1@. In C, only a value with the exact type T1 can be used as an argument to @Foo@.
In \CFA, @Foo@ accepts a value of some type @T3@ as long as @distance(T1, T3)@ is not @Infinite@.
280
281 @path(A, B)@ is a compiler concept that returns one of the following:
282 \begin{itemize}
283 \item Zero or 0, if and only if $A == B$.
284 \item Safe, if B can be used as A without losing its precision, or B is a subtype of A.
285 \item Unsafe, if B loses its precision when used as A, or A is a subtype of B.
286 \item Infinite, if B cannot be used as A: A is not a subtype of B and B is not a subtype of A.
287 \end{itemize}
288
289 For example, @path(int, int)==Zero@, @path(int, char)==Safe@, @path(int, double)==Unsafe@, and @path(int, struct S)@ is @Infinite@ for @struct S{}@.
290 @distance(A, C)@ is the minimum sum of paths from A to C. For example, if @path(A, B)==i@, @path(B, C)==j@, and @path(A, C)==k@, then $$distance(A,C)==min(path(A,C),\ path(A,B)+path(B,C))==min(k,\ i+j)$$.
291
292 (The distance matrix is skipped here because it is mostly irrelevant to the enumeration discussion. In the actual implementation, distance( E, T ) is 1.)
293
294 The arithmetic of distance is the following:
295 \begin{itemize}
296 \item $Zero + v = v$, for any value v.
297 \item $Safe * k < Unsafe$, for finite k.
298 \item $Unsafe * k < Infinite$, for finite k.
299 \item $Infinite + v = Infinite$, for any value v.
300 \end{itemize}
301
302 For @enum(T) E@, @path(T, E)==Safe@ and @path(E,T)==Infinite@. In other words, enumeration type E can be @safely@ used as type T, but type T cannot be used when the resolution context expects a variable with enumeration type @E@.
303
304
305 \subsection{Variable Overloading and Parameter Unification}
306 \CFA allows variable names to be overloaded. It is possible to overload a variable that has type T and an enumeration with type T.
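Such overloads are resolved with the distance arithmetic of the previous subsection. As a sketch of how those rules compose, assume a hypothetical enumeration @enum(char) E@ (introduced here only for illustration) and assume that no direct path from @int@ to @E@ is cheaper than the route through @char@. With @path(int, char)==Safe@ and @path(char, E)==Safe@, the rules as stated give
$$distance(int, E) == path(int, char) + path(char, E) == Safe + Safe == Safe * 2 < Unsafe$$
so an @E@ can still be used where an @int@ is expected. In the opposite direction, @path(E, char)==Infinite@, and because $Infinite + v = Infinite$, the route through @char@ yields
$$path(E, char) + path(char, int) == Infinite + path(char, int) == Infinite$$
so, under these rules, an @int@ cannot reach @E@ by going through @char@.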
243 307 \begin{lstlisting}[label=lst:variable_overload]
308 char * green = "Green";
309 Colour green = Colour.Green; // "G"
310
311 char * bar(char * s) { return s; }
244 312 char * foo(Colour c) { return value( c ); }
245 void bar(char * s) { return s; }
246 Colour green = Colour.Green; // "G"
247 char * green = "Green";
313
248 314 foo( green ); // "G"
249 315 bar( green ); // "Green"
250 316 \end{lstlisting}
317 \CFA's conversion distance disambiguates this overloading. The function @bar@ expects its parameter s to have type @char *@. Since $distance(char *, char *) == Zero$ while $distance(char *, Colour) == Safe$ (the path from @char *@ to the enumeration with base type @char *@), \CFA unambiguously chooses the @green@ with type @char *@. On the other hand, for the function @foo@, @distance(Colour, Colour)@ is @Zero@ while @distance(Colour, char *)@ is @Infinite@, so @foo@ picks the @green@ with type @Colour@.
251 318
252 319 \subsection{Function Overloading}
320 Similarly, functions can be overloaded with different signatures. \CFA picks the correct function entity based on the distance between the parameter types and the arguments.
253 321 \begin{lstlisting}[label=lst:function_overload]
254 void foo(Colour c) { return "It is an enum"; }
255 void foo(char * s) { return "It is a string"; }
322 Colour green = Colour.Green;
323 void foo(Colour c) { sout | "It is an enum"; } // First foo
324 void foo(char * s) { sout | "It is a string"; } // Second foo
256 325 foo( green ); // "It is an enum"
257 326 \end{lstlisting}
258
259 As a consequence, the semantics of using \CFA enumeration as a flag for selection is identical to C enumeration.
260
261
262 % \section{Enumeration Features}
263
264 A trait is a collection of constraints in \CFA that can be used to describe types.
265 The \CFA standard library defines traits to categorize types with related enumeration features.
327 Because @distance(Colour, Colour)@ is @Zero@ and @distance(char *, Colour)@ is @Safe@, \CFA determines that @foo( green )@ is a call to the first @foo@.
328
329 \subsection{Attribute Functions}
330 The pseudo-function @value()@ "unboxes" the enumeration, and the type of the expression is the underlying type. Therefore, in Section~\ref{section:enumeration_as_value}, assigning @Colour.Green@ to a variable of type @char *@ has resolution distance @Safe@, while assigning @value( Colour.Green )@ to a @char *@ has resolution distance @Zero@.
331
332 \begin{lstlisting}[label=lst:declaration_code]
333 int s1;
334 \end{lstlisting}
335 The generated code for an enumeration instance is simply an @int@ that holds the position of the enumerator. Each use of the variable @s1@ is converted to return one of its attributes (label, value, or position) according to the @Unification@ rule.
336
337 % \subsection{Unification and Resolution (this implementation will probably not be used, saved as a reference for now)}
338
339 % \begin{lstlisting}
340 % enum Colour( char * ) { Red = "R", Green = "G", Blue = "B" };
341 % \end{lstlisting}
342 % The @EnumInstType@ is convertible to other types.
343 % A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
344 % The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
345
346 % \begin{lstlisting}[caption={Null Context}, label=lst:null_context]
347 % {
348 % Colour.Green;
349 % }
350 % \end{lstlisting}
351 % In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
352 % In this case, any of the attributes is resolvable.
353 % According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
354 % The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
355 % When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@. 356 % \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context] 357 % { 358 % "G"; 359 % } 360 % \end{lstlisting} 361 % \begin{lstlisting}[caption={int Context}, label=lst:int_context] 362 % { 363 % int g = Colour.Green; 364 % } 365 % \end{lstlisting} 366 % The assignment expression gives a context for the EnumInstType resolution. 367 % The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type. 368 % The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another. 369 % In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@. 370 % $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call 371 % \begin{lstlisting}[label=lst:attr_resolution_1] 372 % { 373 % Unify( int, char * ); // unify with the type of value 374 % Unify( int, int ); // unify with the type of position 375 % Unify( int, char * ); // unify with the type of label 376 % } 377 % \end{lstlisting} 378 % \begin{lstlisting}[label=lst:attr_resolution_precedence] 379 % { 380 % Unification( T1, EnumInstType<T2> ) { 381 % if ( Unify( T1, T2 ) ) return T2; 382 % if ( Unify( T1, int ) ) return int; 383 % if ( Unify( T1, char * ) ) return char *; 384 % Error: Cannot Unify T1 with EnumInstType<T2>; 385 % } 386 % } 387 % \end{lstlisting} 388 % After the unification, @EnumInstType@ is replaced by its attributes. 
389 390 % \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call] 391 % { 392 % T2 foo ( T1 ); // function take variable with T1 as a parameter 393 % foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3> 394 % >>>> Unification( T1, EnumInstType<T3> ) 395 % } 396 % \end{lstlisting} 397 % % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType. 398 % Backward conversion: 399 % \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call] 400 % { 401 % enum Colour colour = 1; 402 % } 403 % \end{lstlisting} 404 405 % \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call] 406 % { 407 % Unification( EnumInstType<Colour>, int ) >>> label 408 % } 409 % \end{lstlisting} 410 % @int@ can be unified with the label of Colour. 411 % @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into 412 % \begin{lstlisting} 413 % { 414 % enum Colour colour = Colour.Green; 415 % } 416 % \end{lstlisting} 417 % Steps: 418 % \begin{enumerate} 419 % \item 420 % identify @1@ as a constant expression with type @int@, and the value is statically known as @1@ 421 % \item 422 % @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@ 423 % \item 424 % return the enumeration constant at position 1 425 % \end{enumerate} 426 % \begin{lstlisting} 427 % { 428 % enum T (int) { ... 
} // Declaration 429 % enum T t = 1; 430 % } 431 % \end{lstlisting} 432 % Steps: 433 % \begin{enumerate} 434 % \item 435 % identify @1@ as a constant expression with type @int@, and the value is statically known as @1@ 436 % \item 437 % @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@ 438 % \item 439 % return the FIRST enumeration constant that has the value 1, by searching through the values array 440 % \end{enumerate} 441 % The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness 442 443 % \subsection{Casting} 444 % Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example: 445 % \begin{lstlisting} 446 % enum( int ) Foo { A = 10, B = 100, C = 1000 }; 447 % (int) Foo.A; 448 % \end{lstlisting} 449 % The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression. 450 451 % \subsection{Value Conversion} 452 % As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes. 453 454 % \begin{lstlisting} 455 % Foo a; // int a; 456 % int j = a; 457 % char * s = a; 458 % \end{lstlisting} 459 % Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. 
The generated code for the second line will be 460 % \begin{lstlisting} 461 % int j = value( Foo, a ) 462 % \end{lstlisting} 463 % Similarly, the generated code for the third line is 464 % \begin{lstlisting} 465 % char * j = label( Foo, a ) 466 % \end{lstlisting} 467 266 468 267 469 \section{Enumerator Initialization} … … 436 638 \section{Implementation} 437 639 438 \subsection{Compiler Representation }640 \subsection{Compiler Representation (Reworking)} 439 641 The definition of an enumeration is represented by an internal type called @EnumDecl@. At the minimum, it stores all the information needed to construct the companion object. Therefore, an @EnumDecl@ can be represented as the following: 440 642 \begin{lstlisting}[label=lst:EnumDecl] … … 464 666 Companion data are needed only if the according pseudo-functions are called. For example, the value of the enumeration Workday is loaded only if there is at least one compilation that has call $value(Workday)$. Once the values are loaded, all compilations share these values array to reduce memory usage. 465 667 466 <Investiage: how to implement this is huge>467 668 468 669 \subsection{(Rework) Companion Object and Companion Function} … … 636 837 637 838 The declaration \CFA-enumeration variable has the same syntax as the C-enumeration. Internally, such a variable will be represented as an EnumInstType. 638 \begin{lstlisting}[label=lst:declaration_code]639 int s1;640 \end{lstlisting}641 The generated code for an enumeration instance is simply an int. It is to hold the position of an enumeration. 
And usage of variable @s1@ will be converted to return one of its attributes: label, value, or position, with respect to the @Unification@ rule642 643 \subsection{Unification and Resolution }644 645 646 \begin{lstlisting}647 enum Colour( char * ) { Red = "R", Green = "G", Blue = "B" };648 \end{lstlisting}649 The @EnumInstType@ is convertible to other types.650 A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.651 The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.652 653 \begin{lstlisting}[caption={Null Context}, label=lst:null_context]654 {655 Colour.Green;656 }657 \end{lstlisting}658 In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.659 In this case, any of the attributes is resolvable.660 According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.661 The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.662 When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.663 \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context]664 {665 "G";666 }667 \end{lstlisting}668 \begin{lstlisting}[caption={int Context}, label=lst:int_context]669 {670 int g = Colour.Green;671 }672 \end{lstlisting}673 The assignment expression gives a context for the EnumInstType resolution.674 The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.675 The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.676 In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.677 $$Unification( int, EnumInstType<Colour> )$$ which turns into three 
Unification call678 \begin{lstlisting}[label=lst:attr_resolution_1]679 {680 Unify( int, char * ); // unify with the type of value681 Unify( int, int ); // unify with the type of position682 Unify( int, char * ); // unify with the type of label683 }684 \end{lstlisting}685 \begin{lstlisting}[label=lst:attr_resolution_precedence]686 {687 Unification( T1, EnumInstType<T2> ) {688 if ( Unify( T1, T2 ) ) return T2;689 if ( Unify( T1, int ) ) return int;690 if ( Unify( T1, char * ) ) return char *;691 Error: Cannot Unify T1 with EnumInstType<T2>;692 }693 }694 \end{lstlisting}695 After the unification, @EnumInstType@ is replaced by its attributes.696 697 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]698 {699 T2 foo ( T1 ); // function take variable with T1 as a parameter700 foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>701 >>>> Unification( T1, EnumInstType<T3> )702 }703 \end{lstlisting}704 % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.705 Backward conversion:706 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]707 {708 enum Colour colour = 1;709 }710 \end{lstlisting}711 712 \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]713 {714 Unification( EnumInstType<Colour>, int ) >>> label715 }716 \end{lstlisting}717 @int@ can be unified with the label of Colour.718 @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into719 \begin{lstlisting}720 {721 enum Colour colour = Colour.Green;722 }723 \end{lstlisting}724 Steps:725 \begin{enumerate}726 \item727 identify @1@ as a constant expression with type @int@, and the value is statically known as @1@728 \item729 @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@730 \item731 return the enumeration constant at position 1732 
\end{enumerate}733 \begin{lstlisting}734 {735 enum T (int) { ... } // Declaration736 enum T t = 1;737 }738 \end{lstlisting}739 Steps:740 \begin{enumerate}741 \item742 identify @1@ as a constant expression with type @int@, and the value is statically known as @1@743 \item744 @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@745 \item746 return the FIRST enumeration constant that has the value 1, by searching through the values array747 \end{enumerate}748 The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness749 750 \subsection{Casting}751 Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:752 \begin{lstlisting}753 enum( int ) Foo { A = 10, B = 100, C = 1000 };754 (int) Foo.A;755 \end{lstlisting}756 The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.757 758 \subsection{Value Conversion}759 As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.760 761 \begin{lstlisting}762 Foo a; // int a;763 int j = a;764 char * s = a;765 \end{lstlisting}766 Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. 
The generated code for the second line will be767 \begin{lstlisting}768 int j = value( Foo, a )769 \end{lstlisting}770 Similarly, the generated code for the third line is771 \begin{lstlisting}772 char * j = label( Foo, a )773 \end{lstlisting}774 839 775 840