source: doc/proposals/enum.tex @ 116d7e2

Last change on this file since 116d7e2 was 0fa0201d, checked in by JiadaL <j82liang@…>, 7 months ago

Update enumeration data structure

  • Property mode set to 100644
File size: 40.3 KB
Line 
1\documentclass[12pt]{article}
2\usepackage{fullpage,times}
3\usepackage{pslatex}                    % reduce size of san serif font
4\usepackage{xcolor}
5\usepackage{listings}
6%\usepackage{array}
7\usepackage{graphics}
8\usepackage{xspace}
9
10\makeatletter
11\renewcommand\section{\@startsection{section}{1}{\z@}{-3.0ex \@plus -1ex \@minus -.2ex}{1.5ex \@plus .2ex}{\normalfont\large\bfseries}}
12\renewcommand\subsection{\@startsection{subsection}{2}{\z@}{-2.75ex \@plus -1ex \@minus -.2ex}{1.25ex \@plus .2ex}{\normalfont\normalsize\bfseries}}
13\renewcommand\subsubsection{\@startsection{subsubsection}{3}{\z@}{-2.5ex \@plus -1ex \@minus -.2ex}{1.0ex \@plus .2ex}{\normalfont\normalsize\bfseries}}
14\renewcommand\paragraph{\@startsection{paragraph}{4}{\z@}{-2.0ex \@plus -1ex \@minus -.2ex}{-1em}{\normalfont\normalsize\bfseries}}
15\renewcommand\subparagraph{\@startsection{subparagraph}{4}{\z@}{-1.5ex \@plus -1ex \@minus -.2ex}{-1em}{\normalfont\normalsize\bfseries\itshape}}
16
17% Denote newterms in particular font and index them without particular font and in lowercase, e.g., \newterm{abc}.
18% The option parameter provides an index term different from the new term, e.g., \newterm[\texttt{abc}]{abc}
19% The star version does not lowercase the index information, e.g., \newterm*{IBM}.
20\newcommand{\newtermFontInline}{\emph}
21\newcommand{\newterm}{\protect\@ifstar\@snewterm\@newterm}
22\newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\index{\temp}\else\index{#1@{\protect#2}}\fi}
23\newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\index{#2}\else\index{#1@{\protect#2}}\fi}
24\makeatother
25
26\usepackage[ignoredisplayed]{enumitem}  % do not affect trivlist
27\setlist{labelsep=1ex}% global
28\setlist[itemize]{topsep=0.5ex,parsep=0.25ex,itemsep=0.25ex,listparindent=\parindent,leftmargin=\parindent}% global
29\setlist[itemize,1]{label=\textbullet}% local
30%\renewcommand{\labelitemi}{{\raisebox{0.25ex}{\footnotesize$\bullet$}}}
31\setlist[enumerate]{topsep=0.5ex,parsep=0.25ex,itemsep=0.25ex,listparindent=\parindent}% global
32\setlist[enumerate,2]{leftmargin=\parindent,labelsep=*,align=parleft,label=\alph*.}% local
33\setlist[description]{topsep=0.5ex,itemsep=0pt,listparindent=\parindent,leftmargin=\parindent,labelsep=1.5ex}
34
35\newenvironment{cquote}{%
36        \list{}{\lstset{resetmargins=true,aboveskip=0pt,belowskip=0pt}\topsep=4pt\parsep=0pt\leftmargin=\parindent\rightmargin\leftmargin}%
37        \item\relax
38}{%
39        \endlist
40}% cquote
41
42\setlength{\topmargin}{-0.45in}                                                 % move running title into header
43\setlength{\headsep}{0.25in}
44\setlength{\textheight}{9.0in}
45
46\newcommand{\CFAIcon}{\textsf{C\raisebox{\depth}{\rotatebox{180}A}}} % Cforall icon
47\newcommand{\CFA}{\protect\CFAIcon\xspace}                              % CFA symbolic name
48\newcommand{\CCIcon}{\textrm{C}\kern-.1em\hbox{+\kern-.25em+}} % C++ icon
49\newcommand{\CC}[1][]{\protect\CCIcon{#1}\xspace}               % C++ symbolic name
50\newcommand{\PAB}[1]{{\color{red}PAB: #1}}
51
52% \definecolor{mGreen}{rgb}{0,0.6,0}
53% \definecolor{mGray}{rgb}{0.5,0.5,0.5}
54% \definecolor{mPurple}{rgb}{0.58,0,0.82}
55% \definecolor{backgroundColour}{rgb}{0.95,0.95,0.92}
56
57\lstdefinestyle{CStyle}{
58%    backgroundcolor=\color{backgroundColour},   
59%    commentstyle=\color{mGreen},
60%    keywordstyle=\color{magenta},
61        stringstyle=\small\tt,                                  % use typewriter font
62%    stringstyle=\color{mPurple},
63    columns=fullflexible,
64    basicstyle=\small\linespread{0.9}\sf,       % reduce line spacing and use sanserif font
65%   basicstyle=\footnotesize,
66    breakatwhitespace=false,         
67%    breaklines=true,                 
68    captionpos=b,                   
69    keepspaces=true,                 
70        escapechar=\$,                                                  % LaTeX escape in CFA code
71%    numbers=left,                   
72%    numbersep=5pt,                 
73%    numberstyle=\tiny\color{mGray},
74%    showspaces=false,               
75    showstringspaces=false,
76%    showtabs=false,                 
77        showlines=true,                                                 % show blank lines at end of code
78    tabsize=5,
79    language=C,
80        aboveskip=4pt,                                                  % spacing above/below code block
81        belowskip=2pt,
82        xleftmargin=\parindent,                 % indent code to paragraph indentation
83}
84\lstset{style=CStyle,moredelim=**[is][\color{red}]{@}{@}}
85\lstMakeShortInline@                            % single-character for \lstinline
86
87\begin{document}
88
89\title{\vspace*{-0.5in}Enumeration in \CFA}
90\author{Jiada Liang}
91
92\maketitle
93
94\begin{abstract}
95An enumeration is a type that defines a list of named constant values in C (and other languages).
96C and \CC use an integral type as the underlying representation of an enumeration.
97\CFA extends C enumerations to allow all basic and custom types for the inner representation.
98\end{abstract}
99
100\section{C-Style Enum}
101
102\CFA supports the C-Style enumeration using the same syntax and semantics.
103\begin{lstlisting}[label=lst:weekday]
104enum Weekday { Monday, Tuesday, Wednesday, Thursday=10, Friday, Saturday, Sunday };
105                $\(\uparrow\)$                                                                      $\(\uparrow\)$
106    ${\rm \newterm{enumeration name}}$                                        ${\rm \newterm{enumerator names}}
107\end{lstlisting}
108The example defines an enumeration type @Weekday@ with ordered enumerators @Monday@, @Tuesday@, @Wednesday@, @Thursday@, @Friday@, @Saturday@ and @Sunday@.
109The successor of @Tuesday@ is @Monday@ and the predecessor of @Tuesday@ is @Wednesday@.
110A C enumeration is an integral type, with consecutive enumerator values assigned by the compiler starting at zero or the next explicitly initialized value by the programmer.
111For example, @Monday@ to @Wednesday@ have values 0--2 implicitly set by the compiler, @Thursday@ is explicitly set to @10@ by the programmer, and @Friday@ to @Sunday@ have values 11--13 implicitly set by the compiler.
112
113There are 3 attributes for an enumeration: \newterm{position}, \newterm{label}, and \newterm{value}:
114\begin{cquote}
115\small\sf\setlength{\tabcolsep}{3pt}
116\begin{tabular}{rccccccccccc}
117@enum@ Weekday \{       & Monday,       & Tuesday,      & Wednesday,    & Thursday=10,  & Friday,       & Saturday,     & Sunday \}; \\
118\it position            & 0                     & 1                     & 2                             & 3                             & 4                     & 5                     & 6                     \\
119\it label                       & Monday        & Tuesday       & Wednesday             & Thursday              & Friday        & Saturday      & Sunday        \\
120\it value                       & 0                     & 1                     & 2                             & 10                    & 11            & 12            & 13
121\end{tabular}
122\end{cquote}
123
124The enumerators of an enumeration are unscoped, i.e., enumerators declared inside of an @enum@ are visible in the enclosing scope of the @enum@ type.
125\begin{lstlisting}[label=lst:enum_scope]
126{
127        enum Weekday { ... };   // enumerators implicitly projected into local scope
128        Weekday weekday = Monday;
129        weekday = Friday;
130        int i = Sunday  // i == 13
131}
132int j = Wednesday; // ERROR! Wednesday is not declared in this scope
133\end{lstlisting}
134
135\section{\CFA-Style Enum}
136
137A \CFA enumeration is parameterized by a type specifying each enumerator's type.
138\CFA allows any object type for the enumerators, and values assigned to enumerators must be from the declared type.
139\begin{lstlisting}[label=lst:color]
140enum Colour( @char *@ ) { Red = "R", Green = "G", Blue = "B"  };
141\end{lstlisting}
142The type of @Colour@ is @char *@ and each enumerator is initialized with a C string.
143Only types with a defined ordering can be automatically initialized (see Section~\ref{s:AutoInitializable}).
144
145% An instance of \CFA-enum (denoted as @<enum_instance>@) is a label for the defined enum name.
146% The label can be retrieved by calling the function @label( <enum_instance> )@.
147% Similarly, the @value()@ function returns the value used to initialize the \CFA-enum.
148
149\subsection{Enumerator Scoping}
150
151A \CFA-enum can be scoped, meaning the enumerator constants are not projected into the enclosing scope.
152\begin{lstlisting}
153enum Colour( char * ) @!@ { ... };
154\end{lstlisting}
155where the @'!'@ implies the enumerators are \emph{not} projected.
156The enumerators of a scoped enumeration are accessed using qualifications, like the fields of an aggregate.
157% The syntax of $qualified\_expression$ for \CFA-enum is the following:
158% $$<qualified\_expression> := <enum\_type>.<enumerator>$$
159\begin{lstlisting}
160Colour colour = @Colour.@Red;   // qualification
161colour = @Colour.@Blue;
162\end{lstlisting}
163
164\section{Enumeration Pseudo-functions}
165Pseudo-functions are function-like operators that do not result in any run-time computations, i.e., like @sizeof@. Instead, the call to functions will be substituted into other expressions in compilation time.
166
167\subsection{Enumerator Attributes}
168The attributes of an enumerator are accessed by pseudo-functions @position@, @value@, and @label@.
169\begin{lstlisting}
170int green_pos = @position@( Colour.Green );     // 1
171char * green_value = @value@( Colour.Green ); / "G"
172char * green_label = @label@( Colour.Green ); // "Green"
173\end{lstlisting}
174
175\subsection{enumerate()}
176\begin{lstlisting}[label=lst:c_switch]
177enum(int) C_ENUM { First, Second, Third = First, Fourth };
178int v(C_ENUM e) { 
179    switch( e ) {
180        case First: return 0; break;
181        case Second: return 1; break;
182        // case Thrid: return 2; break;
183        // case Fourth: return 3; break;
184    };
185};
186\end{lstlisting}
187In the @C_ENUM@ example, @Third@ is an alias of @First@ and @Fourth@ is an alias of @Second@. Programmers cannot make case branches for @Third@ and @Fourth@ because the switch statement matches cases by the enumerator's value. Case First and Third, or Second and Fourth, has duplicate case values.
188
189@enumerate()@ is a pseudo-function that makes the switch statement match by an enumerator instead.
190\begin{lstlisting}[label=lst:c_switch_enumerate]
191enum(double) C_ENUM { First, Second, Third = First, Fourth };
192C_ENUM variable_a = First, variable_b = Second, variable_c = Thrid, variable_d = Fourth;
193int v(C_ENUM e) { 
194    switch( enumeratate( e ) ) {
195        case First: return e; break;
196        case Second: return value( e ); break;
197        case Thrid: return label( e ); break;
198        case Fourth: return position( e ); break;
199    };
200};
201p(variable_a); // 0
202p(variable_b); // 1
203p(variable_c); // "Third"
204p(variable_d); // 3
205\end{lstlisting}
206
207\section{Enumeration Storage}
208
209\subsection{Enumeration Variable}
210
211Although \CFA enumeration captures three different attributes, an enumeration instance does not store all this information.
212The @sizeof@ a \CFA enumeration instance is always 4 bytes, the same size as a C enumeration instance (@sizeof( int )@).
213It comes from the fact that:
214\begin{enumerate}
215\item
216a \CFA enumeration is always statically typed;
217\item
218it is always resolved as one of its attributes regarding real usage.
219\end{enumerate}
220When creating an enumeration instance @colour@ and assigning it with the enumerator @Color.Green@, the compiler allocates an integer variable and stores the position 1.
221The invocations of $positions()$, $value()$, and $label()$ turn into calls to special functions defined in the prelude:
222\begin{lstlisting}[label=lst:companion_call]
223position( green );
224>>> position( Colour, 1 ) -> int
225value( green );
226>>> value( Colour, 1 ) -> T
227label( green );
228>>> label( Colour, 1) -> char *
229\end{lstlisting}
230@T@ represents the type declared in the \CFA enumeration defined and @char *@ in the example.
231These generated functions are $Companion Functions$, they take an $companion$ object and the position as parameters.
232
233\subsection{Enumeration Data}
234\begin{lstlisting}[label=lst:enumeration_backing_data]
235enum(T) E { ... };
236// backing data
237T* E_values;
238char** E_labels;
239\end{lstlisting}
240Storing values and labels as arrays can sometimes help support enumeration features. However, the data structures are the overhead for the programs. We want to reduce the memory usage for enumeration support by:
241\begin{itemize}
242    \item Only generates the data array if necessary
243    \item The compilation units share the data structures. No extra overhead if the data structures are requested multiple times.
244\end{itemize}
245
246
247\subsection{Aggressive Inline}
248To avoid allocating memory for enumeration data structures, \CFA inline the result of enumeration attribute pseudo-function whenever it is possible.
249\begin{lstlisting}[label=lst:enumeration_inline]
250enum(int) OddNumber { A=1, B=3 };
251sout | "A: " | OddNumber.A | "B: " | OddNumber.B | "A+B: " | OddNumber.A + OddNumber.B
252\end{lstlisting}
253Instead of calling pseudo-function @value@ on expression $OddNumber.A$ and $OddNumber.B$, because the result is known statistically, \CFA will inline the constant expression 1 and 3, respectively. Because no runtime lookup for enumeration value is necessary, \CFA will not generate data structure for enumeration OddNumber.
254
255\subsection{Weak Reference}
256\begin{lstlisting}[label=lst:week_ref]
257enum(int) OddNumber { A=1, B=3 };
258enum OddNumber i = ...;
259...
260sout | OddNumber;
261\end{lstlisting}
262In this example, \CFA cannot determine the static value of the enum variable i, and Runtime lookup is necessary. The OddNumber can be referenced in multiple compilations, and allocating the arrays in all compilation units is not desirable. \CFA addresses this by declaring the value array as a weak reference. All compilation units reference OddNumber have weak references to the same enumeration data structure. No extra memory is allocated if more compilation units reference OddNumber, and the OddNumber is initialized once.
263
264\section{Unification}
265
266\subsection{Enumeration as Value}
267\label{section:enumeration_as_value}
268An \CFA enumeration with base type T can be used seamlessly as T, without explicitly calling the pseudo-function value.
269\begin{lstlisting}[label=lst:implicit_conversion]
270char * green_value = Colour.Green; // "G"
271// Is equivalent to
272// char * green_value = value( Color.Green ); "G"
273\end{lstlisting}
274
275\subsection{Unification Distance}
276\begin{lstlisting}[label=lst:unification_distance_example]
277T_2 Foo(T1);
278\end{lstlisting}
279The @Foo@ function expects a parameter with type @T1@. In C, only a value with the exact type T1 can be used as a parameter for @Foo@. In \CFA, @Foo@ accepts value with some type @T3@ as long as @distance(T1, T3)@ is not @Infinite@.
280
281@path(A, B)@ is a compiler concept that returns one of the following:
282\begin{itemize}
283    \item Zero or 0, if and only if $A == B$.
284    \item Safe, if B can be used as A without losing its precision, or B is a subtype of A.
285    \item Unsafe, if B loses its precision when used as A, or A is a subtype of B.
286    \item Infinite, if B cannot be used as A. A is not a subtype of B and B is not a subtype of A.
287\end{itemize}
288
289For example, @path(int, int)==Zero@, @path(int, char)==Safe@, @path(int, double)==Unsafe@, @path(int, struct S)@ is @Infinite@ for @struct S{}@.
290@distance(A, C)@ is the minimum sum of paths from A to C. For example, if @path(A, B)==i@, @path(B, C)==j@, and @path(A, C)=k@, then $$distance(A,C)==min(path(A,B), path(B,C))==i+j$$.
291
292(Skip over the distance matrix here because it is mostly irrelevant for enumeration discussion. In the actual implementation, distance( E, T ) is 1.)
293
294The arithmetic of distance is the following:
295\begin{itemize}
296    \item $Zero + v= v$, for some value v.
297    \item $Safe * k <  Unsafe$, for finite k.
298    \item $Unsafe * k < Infinite$, for finite k.
299    \item $Infinite + v = Infinite$, for some value v.
300\end{itemize}
301
302For @enum(T) E@, @path(T, E)==Safe@ and @path(E,T)==Infinite@. In other words, enumeration type E can be @safely@ used as type T, but type T cannot be used when the resolution context expects a variable with enumeration type @E@.
303
304
305\subsection{Variable Overloading and Parameter Unification}
306\CFA allows variable names to be overloaded. It is possible to overload a variable that has type T and an enumeration with type T.
307\begin{lstlisting}[label=lst:variable_overload]
308char * green = "Green";
309Colour green = Colour.Green; // "G"
310
311void bar(char * s) { return s; }
312void foo(Colour c) { return value( c ); }
313
314foo( green ); // "G"
315bar( green ); // "Green"
316\end{lstlisting}
317\CFA's conversion distance helps disambiguation in this overloading. For the function @bar@ which expects the parameter s to have type @char *@, $distance(char *,char *) == Zero$ while $distance(char *, Colour) == Safe$, the path from @char *@ to the enumeration with based type @char *@, \CFA chooses the @green@ with type @char *@ unambiguously. On the other hand, for the function @foo@, @distance(Colour, char *)@ is @Infinite@, @foo@ picks the @green@ with type @char *@.
318
319\subsection{Function Overloading}
320Similarly, functions can be overloaded with different signatures. \CFA picks the correct function entity based on the distance between parameter types and the arguments.
321\begin{lstlisting}[label=lst:function_overload]
322Colour green = Colour.Green;
323void foo(Colour c) { sout | "It is an enum"; } // First foo
324void foo(char * s) { sout | "It is a string"; } // Second foo
325foo( green ); // "It is an enum"
326\end{lstlisting}
327Because @distance(Colour, Colour)@ is @Zero@ and @distance(char *, Colour)@ is @Safe@, \CFA determines the @foo( green )@ is a call to the first foo.
328
329\subsection{Attributes Functions}
330The pseudo-function @value()@ "unboxes" the enumeration and the type of the expression is the underlying type. Therefore, in the section~\ref{section:enumeration_as_value} when assigning @Colour.Green@ to variable typed @char *@, the resolution distance is @Safe@, while assigning @value(Color.Green) to @char *) has resolution distance @Zero@.
331
332\begin{lstlisting}[label=lst:declaration_code]
333int s1;
334\end{lstlisting}
335The generated code for an enumeration instance is simply an int. It is to hold the position of an enumeration. And usage of variable @s1@ will be converted to return one of its attributes: label, value, or position, concerning the @Unification@ rule
336
337% \subsection{Unification and Resolution (this implementation will probably not be used, safe as reference for now)}
338
339% \begin{lstlisting}
340% enum Colour( char * ) { Red = "R", Green = "G", Blue = "B"  };
341% \end{lstlisting}
342% The @EnumInstType@ is convertible to other types.
343% A \CFA enumeration expression is implicitly \emph{overloaded} with its three different attributes: value, position, and label.
344% The \CFA compilers need to resolve an @EnumInstType@ as one of its attributes based on the current context.
345
346% \begin{lstlisting}[caption={Null Context}, label=lst:null_context]
347% {
348%       Colour.Green;
349% }
350% \end{lstlisting}
351% In example~\ref{lst:null_context}, the environment gives no information to help with the resolution of @Colour.Green@.
352% In this case, any of the attributes is resolvable.
353% According to the \textit{precedence rule}, the expression with @EnumInstType@ resolves as @value( Colour.Green )@.
354% The @EnumInstType@ is converted to the type of the value, which is statically known to the compiler as @char *@.
355% When the compilation reaches the code generation, the compiler outputs code for type @char *@ with the value @"G"@.
356% \begin{lstlisting}[caption={Null Context Generated Code}, label=lst:null_context]
357% {
358%       "G";
359% }
360% \end{lstlisting}
361% \begin{lstlisting}[caption={int Context}, label=lst:int_context]
362% {
363%       int g = Colour.Green;
364% }
365% \end{lstlisting}
366% The assignment expression gives a context for the EnumInstType resolution.
367% The EnumInstType is used as an @int@, and \CFA needs to determine which of the attributes can be resolved as an @int@ type.
368% The functions $Unify( T1, T2 ): bool$ take two types as parameters and determine if one type can be used as another.
369% In example~\ref{lst:int_context}, the compiler is trying to unify @int@ and @EnumInstType@ of @Colour@.
370% $$Unification( int, EnumInstType<Colour> )$$ which turns into three Unification call
371% \begin{lstlisting}[label=lst:attr_resolution_1]
372% {
373%       Unify( int, char * ); // unify with the type of value
374%       Unify( int, int ); // unify with the type of position
375%       Unify( int, char * ); // unify with the type of label
376% }
377% \end{lstlisting}
378% \begin{lstlisting}[label=lst:attr_resolution_precedence]
379% {
380%       Unification( T1, EnumInstType<T2> ) {
381%               if ( Unify( T1, T2 ) ) return T2;
382%               if ( Unify( T1, int ) ) return int;
383%               if ( Unify( T1, char * ) ) return char *;
384%               Error: Cannot Unify T1 with EnumInstType<T2>;
385%       }
386% }
387% \end{lstlisting}
388% After the unification, @EnumInstType@ is replaced by its attributes.
389
390% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
391% {
392%       T2 foo ( T1 ); // function take variable with T1 as a parameter
393%       foo( EnumInstType<T3> ); // Call foo with a variable has type EnumInstType<T3>
394%       >>>> Unification( T1, EnumInstType<T3> )
395% }
396% \end{lstlisting}
397% % The conversion can work backward: in restrictive cases, attributes of can be implicitly converted back to the EnumInstType.
398% Backward conversion:
399% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
400% {
401%       enum Colour colour = 1;
402% }
403% \end{lstlisting}
404
405% \begin{lstlisting}[caption={Unification Functions}, label=lst:unification_func_call]
406% {
407%    Unification( EnumInstType<Colour>, int ) >>> label
408% }
409% \end{lstlisting}
410% @int@ can be unified with the label of Colour.
411% @5@ is a constant expression $\Rightarrow$ Compiler knows the value during the compilation $\Rightarrow$ turns it into
412% \begin{lstlisting}
413% {
414%    enum Colour colour = Colour.Green;
415% }
416% \end{lstlisting}
417% Steps:
418% \begin{enumerate}
419% \item
420% identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
421% \item
422% @unification( EnumInstType<Colour>, int )@: @position( EnumInstType< Colour > )@
423% \item
424% return the enumeration constant at position 1
425% \end{enumerate}
426% \begin{lstlisting}
427% {
428%       enum T (int) { ... } // Declaration
429%       enum T t = 1;
430% }
431% \end{lstlisting}
432% Steps:
433% \begin{enumerate}
434% \item
435% identify @1@ as a constant expression with type @int@, and the value is statically known as @1@
436% \item
437% @unification( EnumInstType<Colour>, int )@: @value( EnumInstType< Colour > )@
438% \item
439% return the FIRST enumeration constant that has the value 1, by searching through the values array
440% \end{enumerate}
441% The downside of the precedence rule: @EnumInstType@ $\Rightarrow$ @int ( value )@ $\Rightarrow$ @EnumInstType@ may return a different @EnumInstType@ because the value can be repeated and there is no way to know which one is expected $\Rightarrow$ want uniqueness
442
443% \subsection{Casting}
444% Casting an EnumInstType to some other type T works similarly to unify the EnumInstType with T. For example:
445% \begin{lstlisting}
446% enum( int ) Foo { A = 10, B = 100, C = 1000 };
447% (int) Foo.A;
448% \end{lstlisting}
449% The \CFA-compiler unifies @EnumInstType<int>@ with int, with returns @value( Foo.A )@, which has statically known value 10. In other words, \CFA-compiler is aware of a cast expression, and it forms the context for EnumInstType resolution. The expression with type @EnumInstType<int>@ can be replaced by the compile with a constant expression 10, and optionally discard the cast expression.
450
451% \subsection{Value Conversion}
452% As discussed in section~\ref{lst:var_declaration}, \CFA only saves @position@ as the necessary information. It is necessary for \CFA to generate intermediate code to retrieve other attributes.
453
454% \begin{lstlisting}
455% Foo a; // int a;
456% int j = a;
457% char * s = a;
458% \end{lstlisting}
459% Assume stores a value x, which cannot be statically determined. When assigning a to j in line 2, the compiler @Unify@ j with a, and returns @value( a )@. The generated code for the second line will be
460% \begin{lstlisting}
461% int j = value( Foo, a )
462% \end{lstlisting}
463% Similarly, the generated code for the third line is
464% \begin{lstlisting}
465% char * j = label( Foo, a )
466% \end{lstlisting}
467
468
469\section{Enumerator Initialization}
470An enumerator must have a deterministic immutable value, either be explicitly initialized in the enumeration definition, or implicitly initialized by rules.
471
472\subsection{C Enumeration Rule}
473A C enumeration has an integral type. If not initialized, the first enumerator implicitly has the integral value 0, and other enumerators have a value equal to its $predecessor + 1$.
474
475\subsection{Auto Initializable}
476\label{s:AutoInitializable}
477
478
479\CFA enumerations have the same rule in enumeration constant initialization.
480However, only \CFA types that have defined traits for @zero_t@, @one_t@, and an addition operator can be automatically initialized by \CFA.
481
482Specifically, a type is auto-initializable only if it satisfies the trait @AutoInitializable@:
483\begin{lstlisting}
484forall(T)
485trait AutoInitializable {
486        void ?()( T & t, zero_t );
487        S ?++( T & t);
488};
489\end{lstlisting}
490An example of a user-defined @AutoInitializable@ is:
491\begin{lstlisting}[label=lst:sample_auto_Initializable]
492struct Odd { int i; };
493void ?()( Odd & t, zero_t ) { t.i = 1; };
494Odd ?++( Odd t1 ) { return Odd( t1.i + 2); };
495\end{lstlisting}
496When the type of an enumeration is @AutoInitializable@, implicit initialization is available.
497\begin{lstlisting}[label=lst:sample_auto_Initializable_usage]
498enum AutoInitUsage(Odd) {
499        A, B, C = 7, D
500};
501\end{lstlisting}
502In the example, no initializer is specified for the first enumeration constant @A@, so \CFA initializes it with the value of @zero_t@, which is 1.
503@B@ and @D@ have the values of their $predecessor++$, where @one_t@ has the value 2.
504Therefore, the enumeration is initialized as follows:
505\begin{lstlisting}[label=lst:sample_auto_Initializable_usage_gen]
506enum AutoInitUsage(Odd) {
507        A = 1, B = 3, C = 7, D = 9
508};
509\end{lstlisting}
510Note that there is no mechanism to prevent an even value for the direct initialization, such as @C = 6@.
511
512In \CFA, character, integral, float, and imaginary types are all @AutoInitialiable@.
513\begin{lstlisting}[label=lst:letter]
514enum Alphabet( int ) {
515        A = 'A', B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z,
516        a = 'a', b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
517};
518print( "%c, %c, %c", Alphabet.F, Alphabet.o, Alphabet.z );
519>>> F, o, z
520\end{lstlisting}
521\section{Enumeration Features}
522\subsection{Iteration and Range}
523
524It is convenient to iterate over a \CFA enumeration value, e.g.:
525\begin{lstlisting}[label=lst:range_functions]
526for ( Alphabet alph; Alphabet ) { sout | alph; }
527>>> A B C ... D
528\end{lstlisting}
529The for-loop uses the enumeration type @Alphabet@ its range, and iterates through all enumerators in the order defined in the enumeration.
530@alph@ is the iterating enumeration object, which returns the value of an @Alphabet@ in this context according to the precedence rule.
531
532\textbullet\ \CFA offers a shorthand for iterating all enumeration constants:
533\begin{lstlisting}[label=lst:range_functions]
534for ( Alphabet alph ) { sout | alph; }
535>>> A B C ... D
536\end{lstlisting}
537
538The following are examples for constructing for-control using an enumeration. Note that the type declaration of the iterating variable is optional, because \CFA can infer the type as EnumInstType based on the range expression, and possibly convert it to one of its attribute types.
539
540\textbullet\ H is implicit up-to exclusive range [0, H).
541\begin{lstlisting}[label=lst:range_function_1]
542for ( alph; Alphabet.D ) { sout | alph; }
543>>> A B C
544\end{lstlisting}
545
546\textbullet\ ~= H is implicit up-to inclusive range [0,H].
547\begin{lstlisting}[label=lst:range_function_2]
548for ( alph; ~= Alphabet.D ) { sout | alph; }
549>>> A B C D
550\end{lstlisting}
551
552\textbullet\ L ~ H is explicit up-to exclusive range [L,H).
553\begin{lstlisting}[label=lst:range_function_3]
554for ( alph; Alphabet.B ~ Alphabet.D  ) { sout | alph; }
555// for ( Alphabet alph = Alphabet.B; alph < Alphabet.D; alph += 1  ); 1 is one_t
556>>> B C
557\end{lstlisting}
558
559\textbullet\ L ~= H is explicit up-to inclusive range [L,H].
560\begin{lstlisting}[label=lst:range_function_4]
561for ( alph; Alphabet.B ~= Alphabet.D  ) { sout | alph; }
562>>> B C D
563\end{lstlisting}
564
565\textbullet\ L -~ H is explicit down-to exclusive range [H,L), where L and H are implicitly interchanged to make the range down-to.
566\begin{lstlisting}[label=lst:range_function_5]
567for ( alph; Alphabet.D -~ Alphabet.B  ) { sout | alph; }
568>>> D C
569\end{lstlisting}
570
571\textbullet\ L -~= H is explicit down-to exclusive range [H,L], where L and H are implicitly interchanged to make the range down-to.
572\begin{lstlisting}[label=lst:range_function_6]
573for ( alph; Alphabet.D -~= Alphabet.B  ) { sout | alph; }
574>>> D C B
575\end{lstlisting}
576
577A user can specify the ``step size'' of an iteration. There are two different stepping schemes of enumeration for-loop.
578\begin{lstlisting}[label=lst:range_function_stepping]
579enum(int) Sequence { A = 10, B = 12, C = 14, D = 16, D  = 18 };
580for ( s; Sequence.A ~= Sequence.D ~ 1  ) { sout | alph; }
581>>> 10 12 14 16 18
582for ( s; Sequence.A ~= Sequence.D; s+=1  ) { sout | alph; }
583>>> 10 11 12 13 14 15 16 17 18
584\end{lstlisting}
585The first syntax is stepping to the next enumeration constant, which is the default stepping scheme if not explicitly specified. The second syntax, on the other hand, is to call @operator+=@ @one_type@ on the @value( s )@. Therefore, the second syntax is equivalent to
586\begin{lstlisting}[label=lst:range_function_stepping_converted]
587for ( typeof( value(Sequence.A) ) s=value( Sequence.A ); s <= Sequence.D; s+=1  ) { sout | alph; }
588>>> 10 11 12 13 14 15 16 17 18
589\end{lstlisting}
590
591% \PAB{Explain what each loop does.}
592
593It is also possible to iterate over an enumeration's labels, implicitly or explicitly:
594\begin{lstlisting}[label=lst:range_functions_label_implicit]
595for ( char * alph; Alphabet )
596\end{lstlisting}
597This for-loop implicitly iterates every label of the enumeration, because a label is the only valid resolution to the ch with type @char *@ in this case.
598If the value can also be resolved as the @char *@, you might iterate the labels explicitly with the array iteration.
599\begin{lstlisting}[label=lst:range_functions_label_implicit]
600for ( char * ch; labels( Alphabet ) )
601\end{lstlisting}
602
603
604% \subsection{Non-uniform Type}
605% TODO: Working in Progress, might need to change other sections. Conflict with the resolution right now.
606
607% \begin{lstlisting}
608% enum T( int, char * ) {
609%     a=42, b="Hello World"
610% };
611% \end{lstlisting}
612% The enum T declares two different types: int and char *. The enumerators of T hold values of one of the declared types.
613
614\subsection{Enumeration Inheritance}
615
616\begin{lstlisting}[label=lst:EnumInline]
617enum( char * ) Name { Jack = "Jack", Jill = "Jill" };
618enum /* inferred */ Name2 { inline Name, Sue = "Sue", Tom = "Tom" };
619\end{lstlisting}
620\lstinline{Inline} allows Enumeration Name2 to inherit enumerators from Name1 by containment, and a Name enumeration is a subtype of enumeration Name2. An enumeration instance of type Name can be used where an instance of Name2 is expected.
621\begin{lstlisting}[label=lst:EnumInline]
622Name Fred;
623void f( Name2 );
624f( Fred );
625\end{lstlisting}
626If enumeration A declares @inline B@ in its enumeration body, enumeration A is the "inlining enum" and enumeration B is the "inlined enum".
627
628An enumeration can inline at most one other enumeration. The inline declaration must be placed before the first enumerator of the inlining enum. The inlining enum has all the enumerators from the inlined enum, with the same labels, values, and position.
629\begin{lstlisting}[label=lst:EnumInline]
630enum /* inferred */ Name2 { inline Name, Sue = "Sue", Tom = "Tom" };
631// is equivalent to enum Name2 { Jack = "Jack", Jill="Jill", Sue = "Sue", Tom = "Tom" };
632\end{lstlisting}
633Name.Jack is equivalent to Name2.Jack. Their attributes are all identical. Opening both Name and Name2 in the same scope will not introduce ambiguity.
634\begin{lstlisting}[label=lst:EnumInline]
635with( Name, Name2 ) { Jack; } // Name.Jack and Name2.Jack are equivalent. No ambiguity
636\end{lstlisting}
637
638\section{Implementation}
639
640\subsection{Compiler Representation (Reworking)}
641The definition of an enumeration is represented by an internal type called @EnumDecl@. At the minimum, it stores all the information needed to construct the companion object. Therefore, an @EnumDecl@ can be represented as the following:
642\begin{lstlisting}[label=lst:EnumDecl]
643forall(T)
644class EnumDecl {
645    T* values;
646    char** label;
647};
648\end{lstlisting}
649
650The internal representation of an enumeration constant is @EnumInstType@.
651An @EnumInstType@ has a reference to the \CFA-enumeration declaration and the position of the enumeration constant.
652\begin{lstlisting}[label=lst:EnumInstType]
653class EnumInstType {
654    EnumDecl enumDecl;
655    int position;
656};
657\end{lstlisting}
658In the later discussion, we will use @EnumDecl<T>@ to symbolize a @EnumDecl@ parameterized by type T, and @EnumInstType<T>@ is a declared instance of @EnumDecl<T>@.
659
660\begin{lstlisting}[caption={Enum Type Functions}, label=lst:cforall_enum_data]
661const T * const values;
662const char * label;
663int length;
664\end{lstlisting}
665Companion data are necessary information to represent an enumeration. They are stored as standalone pieces, rather than a structure. Those data will be loaded "on demand".
666Companion data are needed only if the according pseudo-functions are called. For example, the value of the enumeration Workday is loaded only if there is at least one compilation that has call $value(Workday)$. Once the values are loaded, all compilations share these values array to reduce memory usage.
667
668
669\subsection{(Rework) Companion Object and Companion Function}
670
671\begin{lstlisting}[caption={Enum Type Functions}, label=lst:cforall_enum_functions]
672forall( T )
673struct Companion {
674        const T * const values;
675        const char * label;
676        int length;
677};
678\end{lstlisting}
679\CFA generates companion objects, an instance of structure that encloses @necessary@ data to represent an enumeration. The size of the companion is unknown at the compilation time, and it "grows" in size to compensate for the @usage@.
680
681The companion object is singleton across the compilation (investigation). 
682
683\CFA generates the definition of companion functions.
684Because \CFA implicitly stores an enumeration instance as its position, the companion function @position@ does nothing but return the position it is passed.
685Companions function @value@ and @label@ return the array item at the given position of @values@ and @labels@, respectively.
686\begin{lstlisting}[label=lst:companion_definition]
687int position( Companion o, int pos ) { return pos; }
688T value( Companion o, int pos ) { return o.values[ pos ]; }
689char * label( Companion o, int pos ) { return o.labels[ pos ]; }
690\end{lstlisting}
691Notably, the @Companion@ structure definition, and all companion objects, are visible to users.
692A user can retrieve values and labels defined in an enumeration by accessing the values and labels directly, or indirectly by calling @Companion@ functions @values@ and @labels@
693\begin{lstlisting}[label=lst:companion_definition_values_labels]
694Colour.values; // read the Companion's values
695values( Colour ); // same as Colour.values
696\end{lstlisting}
697
698\subsection{Companion Traits (experimental)}
699Not sure its semantics yet, and it might replace a companion object.
700\begin{lstlisting}[label=lst:companion_trait]
701forall(T1) {
702    trait Companion(otype T2<otype T1>) {
703        T1 value((otype T2<otype T1> const &);
704        int position(otype T2<otype T1> const &);
705        char * label(otype T2<otype T1> const &);
706    }
707}
708\end{lstlisting}
709All enumerations implicitly implement the Companion trait, an interface to access attributes. The Companion can be a data type because it fulfills to requirements to have concrete instances, which are:
710
711\begin{enumerate}
712  \item The instance of enumeration has a single polymorphic type.
713  \item Each assertion should use the type once as a parameter.
714\end{enumerate}
715
716\begin{lstlisting}
717enum(int) Weekday {
718    Monday=10, Tuesday, ...
719};
720
721T value( enum Weekday<T> & this);
722int position( enum Weekday<T> & this )
723char * label( enum Weekday<T> & this )
724
725trait Companion obj = (enum(int)) Workday.Weekday;
726value(obj); // 10
727\end{lstlisting}
728The enumeration comes with default implementation to the Companion traits functions. The usage of Companion functions would make \CFA allocates and initializes the necessary companion arrays, and return the data at the position represented by the enumeration.
729(...)
730
731\subsection{User Define Enumeration Functions}
732
733Companion objects make extending features for \CFA enumeration easy.
734\begin{lstlisting}[label=lst:companion_user_definition]
735char * charastic_string( Companion o, int position ) { 
736        return sprintf( "Label: %s; Value: %s", label( o, position ), value( o, position) );
737}
738printf( charactic_string ( Color, 1 ) );
739>>> Label: Green; Value: G
740\end{lstlisting}
741Defining a function takes a Companion object effectively defines functions for all \CFA enumeration.
742
743The \CFA compiler turns a function call that takes an enumeration instance as a parameter into a function call with a companion object plus a position.
744Therefore, a user can use the syntax with a user-defined enumeration function call:
745\begin{lstlisting}[label=lst:companion_user_definition]
746charactic_string( Color.Green ); // equivalent to charactic_string( Color, 1 )
747>>> Label: Green; Value: G
748\end{lstlisting}
749Similarly, the user can work with the enumeration type itself: (see section ref...)
750\begin{lstlisting}[ label=lst:companion_user_definition]
751void print_enumerators ( Companion o ) { 
752        for ( c : Companion o ) {
753                sout | label (c) | value( c ) ;
754        } 
755}
756print_enumerators( Colour );
757\end{lstlisting}
758
759
760\subsection{Declaration}
761
762The qualified enumeration syntax is dedicated to \CFA enumeration.
763\begin{lstlisting}[label=lst:range_functions]
764enum (type_declaration) name { enumerator = const_expr, enumerator = const_expr, ... }
765\end{lstlisting}
766A compiler stores the name, the underlying type, and all enumerators in an @enumeration table@.
767During the $Validation$ pass, the compiler links the type declaration to the type's definition.
768It ensures that the name of an enumerator is unique within the enumeration body, and checks if all values of the enumerator have the declaration type.
769If the declared type is not @AutoInitializable@, \CFA rejects the enumeration definition.
770Otherwise, it attempts to initialize enumerators with the enumeration initialization pattern. (a reference to a future initialization pattern section)
771
772\begin{lstlisting}[label=lst:init]
773struct T { ... };
774void ?{}( T & t, zero_t ) { ... };
775void ?{}( T & t, one_t ) { ... };
776T ?+?( T & lhs, T & rhs ) { ... };
777
778enum (T) Sample { 
779        Zero: 0 /* zero_t */,
780        One: Zero + 1 /* ?+?( Zero, one_t ) */ , ...
781};
782\end{lstlisting}
783Challenge: \\
784The value of an enumerator, or the initializer, requires @const_expr@.
785While previously getting around the issue by pushing it to the C compiler, it might not work anymore because of the user-defined types, user-defined @zero_t@, @one_t@, and addition operation.
786Might not be able to implement a \emph{correct} static check.
787
788\CFA $autogens$ a Companion object for the declared enumeration.
789\begin{lstlisting}[label=lst:companion]
790Companion( T ) Sample {
791        .values: { 0, 0+1, 0+1+1, 0+1+1+1, ... }, /* 0: zero_t, 1: one_t, +: ?+?{} */
792        .labels: { "Zero", "One", "Two", "Three", ...},
793        .length: /* number of enumerators */
794};
795\end{lstlisting}
796\CFA stores values as intermediate expressions because the result of the function call to the function @?+?{}(T&, T&)@ is statically unknown to \CFA.
797But the result is computed at run time, and the compiler ensures the @values@ are not changed.
798
799\subsection{Qualified Expression}
800
801\CFA uses qualified expression to address the scoping of \CFA-enumeration.
802\begin{lstlisting}[label=lst:qualified_expression]
803aggregation_name.field;
804\end{lstlisting}
805The qualified expression is not dedicated to \CFA enumeration.
806It is a feature that is supported by other aggregation in \CFA as well, including a C enumeration.
807When C enumerations are unscoped, the qualified expression syntax still helps to disambiguate names in the context.
808\CFA recognizes if the expression references a \CFA aggregation by searching the presence of @aggregation_name@ in the \CFA enumeration table.
809If the @aggregation_name@ is identified as a \CFA enumeration, the compiler checks if @field@ presents in the declared \CFA enumeration.
810
811\subsection{\lstinline{with} Clause/Statement}
812Instead of qualifying an enumeration expression every time, the @with@ can be used to expose enumerators to the current scope, making them directly accessible.
813\begin{lstlisting}[label=lst:declaration]
814enum Color( char * ) { Red="R", Green="G", Blue="B" };
815enum Animal( int ) { Cat=10, Dog=20 };
816with ( Color, Animal ) {
817    char * red_string = Red; // value( Color.Red )
818    int cat = Cat; // value( Animal.Cat )
819}
820\end{lstlisting}
821The \lstinline{with} might introduce ambiguity to a scope. Consider the example:
822\begin{lstlisting}[label=lst:declaration]
823enum Color( char * ) { Red="R", Green="G", Blue="B" };
824enum RGB( int ) { Red=0, Green=1, Blue=2 };
825with ( Color, RGB ) {
826    // int red = Red;
827}
828\end{lstlisting}
829\CFA will not try to resolve the expression with ambiguity. It would report an error. In this case, it is necessary to qualify @Red@ even inside of the \lstinline{with} clause.
830
831\subsection{Instance Declaration}
832
833
834\begin{lstlisting}[label=lst:var_declaration]
835enum Sample s1;
836\end{lstlisting}
837
838The declaration \CFA-enumeration variable has the same syntax as the C-enumeration. Internally, such a variable will be represented as an EnumInstType.
839
840
841\end{document}
842
843% Local Variables: %
844% tab-width: 4 %
845% compile-command: "pdflatex enum.tex" %
846% End: %
Note: See TracBrowser for help on using the repository browser.