Context Navigation

source: doc/theses/jiada_liang_MMath/trait.tex@ 7d65715f

Visit:

Last change on this file since 7d65715f was 0bda8d7, checked in by JiadaL <j82liang@…>, 12 months ago
Small updates
Property mode set to `100644`
File size: 15.6 KB

Line
1	\chapter{Enumeration Traits}
2	\label{c:trait}
3
4	\CC introduced the @std::is_enum@ trait in \CC{11} and concept feature in \CC{20}.
5	This combination makes it possible to write a polymorphic function over an enumerated type.
6	\begin{c++}
7	#include <type_traits>
8	template<typename T> @concept Enumerable@ = std::is_enum<T>::value;
9	template<@Enumerable@ E> E f( E e ) { $\C{// constrained type}$
10	E w = e; $\C{// allocation and copy}$
11	cout << e << ' ' << w << endl; $\C{// value}$
12	return w; $\C{// copy}$
13	}
14	int main() {
15	enum E { A = 42, B, C } e = C;
16	e = f( e );
17	}
18	44 44
19	\end{c++}
20	The @std::is_enum@ and other \CC @traits@ are compile-time interfaces to query type information.
21	While named the same as @trait@ in other programming languages, it is orthogonal to the \CFA trait, with the latter being defined as a collection of assertions to be satisfied by a polymorphic type.
22
23	The following sections cover the underlying implementation features I created to generalize and restrict enumerations in the \CFA type-system using the @trait@ mechanism.
24
25
26	\section{Traits \texorpdfstring{\lstinline{CfaEnum}}{CfaEnum} and \texorpdfstring{\lstinline{TypedEnum}}{TypedEnum}}
27
28	Traits @CfaEnum@ and @TypedEnum@ define the enumeration attributes: @label@, @posn@, @value@, and @Countof@.
29	These traits support polymorphic functions for \CFA enumeration, \eg:
30	\begin{cfa}
31	forall( E ) \| @CfaEnum( E )@ )
32	void f( E e ) {
33	// access enumeration properties for e
34	}
35	\end{cfa}
36	\newpage
37	Trait @CfaEnum@ defines attribute functions @label@ and @posn@ for all \CFA enumerations, and internally \CFA enumerations fulfill this assertion.
38	\begin{cfa}
39	forall( E ) trait CfaEnum {
40	const char * @label@( E e );
41	unsigned int @posn@( E e );
42	};
43	\end{cfa}
44	This trait covers opaque enumerations that do not have an explicit @value@.
45
46	The trait @TypedEnum@ extends @CfaEnum@ with the @value@ assertion for typed enumerations.
47	\begin{cfa}
48	forall( E, V \| CfaEnum( E ) ) trait TypedEnum {
49	V @value@( E e );
50	};
51	\end{cfa}
52	Here, the associate type-parameter @V@ is the base type of the typed enumeration, and hence, the return type of @value@.
53	These two traits provide a way to define functions over all \CFA enumerations.
54
55	For example, \VRef[Figure]{f:GeneralizedEnumerationFormatter} shows a generalized enumeration formatter for any enumeration type.
56	The formatter prints an enumerator name and its value in the form @"label( value )"@.
57	The trait for @format_enum@ requires a function named @str@ to print the value (payload) of the enumerator.
58	Hence, enumeration defines how its value appears, and @format_enum@ displays this value within the label name.
59
60	\begin{figure}
61	\begin{cfa}
62	forall( @E, V \| TypedEnum( E, V )@ \| { string str( V ); } ) $\C{// format any enumeration}$
63	string format_enum( E e ) {
64	return label( e ) + '(' + str( value( e ) ) + ')'; $\C{// "label( value )"}$
65	}
66	enum(size_t) RGB { Red = 0xFF0000, Green = 0x00FF00, Blue = 0x0000FF };
67	// string library has conversion function str from size_t to string
68
69	struct color_code { int R, G, B; };
70	enum(color_code) Rainbow {
71	Red = {255, 0, 0}, Orange = {255, 127, 0}, Yellow = {255, 255, 0}, Green = {0, 255, 0}, // ...
72	};
73	string str( color_code cc ) with( cc ) { $\C{// format payload, "ddd,ddd,ddd"}$
74	return str( R ) + ',' + str( G ) + ',' + str( B ); $\C{// "R,G,B"}$
75	}
76	int main() {
77	sout \| format_enum( RGB.Green ); $\C{// "Green(65280)"}$
78	sout \| format_enum( Rainbow.Green ); $\C{// "Green(0,255,0)"}$
79	}
80	\end{cfa}
81	\caption{Generalized Enumeration Formatter}
82	\label{f:GeneralizedEnumerationFormatter}
83	\end{figure}
84
85	Other types may work with traits @CfaEnum@ and @TypedEnum@, by supplying appropriate @label@, @posn@, and @value@ functions.
86	For example, \VRef[Figure]{f:ExtendCEnumeration} extends a (possibly predefined) C enumeration to work with all the \CFA extensions.
87
88	\begin{figure}
89	\begin{cfa}
90	enum Fruit { Apple, Banana, Cherry }; $\C{// C enum}$
91	const char * @label@( Fruit f ) {
92	static const char * labels[] = { "Apple", "Banana", "Cherry" };
93	return labels[f];
94	}
95	int @posn@( Fruit f ) { return f; }
96	int @value@( Fruit f ) {
97	static const char values[] = { 'a', 'b', 'c' };
98	return values[f];
99	}
100	sout \| format_enum( Cherry ); $\C{// "Cherry(c)"}$
101	\end{cfa}
102	\caption{Extend C Enumeration to \CFA Enumeration}
103	\label{f:ExtendCEnumeration}
104	\end{figure}
105
106
107	\section{Discussion: Genericity}
108
109	At the start of this chapter, the \CC concept is introduced to constrained template types, \eg:
110	\begin{c++}
111	concept Enumerable = std::is_enum<T>::value;
112	\end{c++}
113	Here, @concept@ is referring directly to types with kind @enum@;
114	other @concept@s can refer to all types with kind @int@ with @long@ or @long long@ qualifiers, \etc.
115	Hence, the @concept@ is the first level of restriction, allowing only the specified kinds of types and rejecting others.
116	The template expansion is the second level of restriction verifying if the type passing the @concept@ test provides the necessary functionality.
117	Hence, a @concept@ is querying precise aspects of the programming language set of types.
118
119	Alternatively, languages using traits, like \CFA, Scala, Go, and Rust, are defining a restriction based on a set of operations, variables, or structure fields that must exist to match with usages in a function or aggregate type.
120	Hence, the \CFA enumeration traits are never connected with the specific @enum@ kind.
121	Instead, anything that can look like the @enum@ kind is considered an enumeration (static structural typing).
122	However, Scala, Go, and Rust traits are nominative: a type explicitly declares a named trait to be of its type, while in \CFA, any type implementing all requirements declared in a trait implicitly satisfy its restrictions.
123
124	One of the key differences between concepts and traits, which is leveraged heavily by \CFA, is the ability to apply new \CFA features to C legacy code.
125	For example, \VRef[Figure]{f:GeneralizedEnumerationFormatter} shows that pre-existing C enumerations can be upgraded to work and play with new \CFA enumeration facilities.
126	Another example is adding constructors and destructors to pre-existing C types by simply declaring them for the old C type.
127	\CC fails at certain levels of legacy extension because many of the new \CC features must appear \emph{within} an aggregate definition due to the object-oriented nature of the type system, where it is impossible to change legacy library types.
128
129
130	\section{Bounded and Serial}
131
132	A bounded trait defines a lower and upper bound for a type.
133	\begin{cfa}
134	forall( E ) trait Bounded {
135	E lowerBound();
136	E upperBound();
137	};
138	\end{cfa}
139	Both functions are necessary for the implementation of \CFA enumeration, with @lowerBound@ returning the first enumerator and @upperBound@ returning the last enumerator.
140	\begin{cfa}
141	enum(int) Week { Mon, Tue, Wed, Thu, Fri, Sat, Sun };
142	enum(int) Fruit { Apple, Banana, Cherry };
143	Week first_day = lowerBound(); $\C{// Mon}$
144	Fruit last_fruit = upperBound(); $\C{// Cherry}$
145	\end{cfa}
146	The @lowerBound@ and @upperBound@ are functions overloaded on return type only, meaning their type resolution depends solely on the call-site context, such as the parameter type for a function argument or the left-hand side of an assignment expression.
147	Calling either function without a context results in a type ambiguity, unless the type environment has only one type overloading the functions.
148	\begin{cfa}
149	sout \| @lowerBound()@; $\C[2.5in]{// ambiguous as Week and Fruit implement Bounded}$
150	void foo( Fruit );
151	foo( lowerBound() ); $\C{// parameter provides type Fruit}$
152	Week day = upperBound(); $\C{// day provides type Week}\CRT$
153	\end{cfa}
154
155	Trait @Serial@ is a subset of @Bounded@, with functions mapping enumerators to integers, and implementing a sequential order between enumerators.
156	\begin{cfa}
157	forall( E \| Bounded( E ) ) trait Serial {
158	int fromInstance( E e );
159	E fromInt( unsigned int i );
160	E pred( E e );
161	E succ( E e );
162	unsigned Countof( E );
163	};
164	\end{cfa}
165	Function @fromInstance@ projects a @Bounded@ member to a number and @fromInt@ is the inverse.
166	Function @pred@ and @succ@ are advancement functions:
167	@pred@ takes an enumerator and returns the previous enumerator, if there is one, in sequential order, and @succ@ returns the next enumerator.
168	\begin{cfa}
169	sout \| fromInstance( Wed ) \| fromInt( 2 ) \| succ( Wed ) \| pred( Wed );
170	2 Wed Thu Tue
171	\end{cfa}
172	Bound checking is provided for @fromInt@, @pred@, and @succ@, and the program is terminated if the lower or upper bound is exceeded, \eg:
173	\begin{cfa}
174	fromInt( 100 );
175	Cforall Runtime error: call to fromInt has index 100 outside of enumeration range 0-6.
176	\end{cfa}
177	Function @fromInstance@ or a position cast using @(int)@ is always safe, \ie within the enumeration range.
178
179	Function @Countof@ is the generic counterpart to the built-in pseudo-function @countof@.
180	@countof@ only works on enumeration types and instances, so it is locked into the language type system;
181	as such, @countof( enum-type)@ becomes a compile-time constant.
182	@Countof@ works on any type that matches the @Serial@ trait.
183	Hence, @Countof@ does not use its argument;
184	only the parameter type is needed to compute the range size.
185	\begin{cfa}
186	int Countof( E ) {
187	E upper = upperBound();
188	E lower = lowerBound();
189	return fromInstance( upper ) - fromInstance( lower ) + 1;
190	}
191	\end{cfa}
192
193	@countof@ also works for any type @E@ that defines @Countof@ and @lowerBound@, becoming a call to @Countof( E )@.
194	The resolution step on expression @countof( E )@ are:
195	\begin{enumerate}
196	\item Look for an enumeration named @E@, such as @enum E {... }@.
197	If such an enumeration @E@ exists, replace @countof( E )@ with the number of enumerators.
198	\item Look for a non-enumeration type named @E@ that defines @Countof@ and @lowerBound@, including @E@ being a polymorphic type, such as @forall( E )@.
199	If type @E@ exists, replace it with @Countof(lowerBound())@, where @lowerBound@ is defined for type @E@.
200	\item Look for an enumerator @A@ defined in enumeration @E@.
201	If such an enumerator @A@ exists, replace @countof( A )@ with the number of enumerators in @E@.
202	\item Look for a name @A@ in the lexical context with the type @E@.
203	If the name @A@ exists, replace @countof( A )@ with a function call @Countof( E )@.
204	\item If 1-4 fail, report a type error on expression @countof( E )@.
205	\end{enumerate}
206
207
208	\section{Enumerating}
209
210	The fundamental aspect of an enumeration type is the ability to enumerate over its enumerators.
211	\CFA supports \newterm{for} loops, \newterm{while} loop, and \newterm{range} loop. This section covers @for@ loops and @range@ loops for enumeration, but the concept transitions to @while@ loop.
212
213
214	\subsection{For Loop}
215
216	A for-loop consists of loop control and body.
217	The loop control is often a 3-tuple: initializers, looping condition, and advancement.
218	It is a common practice to declare one or more loop-index variables in initializers, whether the variables satisfy the loop condition, and update the variables in advancement.
219	Such a variable is called an \newterm{index} and is available for reading and writing within the loop body.
220	(Some languages make the index read-only in the loop body.)
221	This style of iteration can be written for an enumeration using functions from the @Bounded@ and @Serial@ traits:
222	\begin{cfa}
223	enum() E { A, B, C, D };
224	for ( unsigned int i = 0; i < countof(E); i += 1 ) $\C{// (1)}$
225	sout \| label( fromInt( i ) ) \| nonl;
226	sout \| nl;
227	for ( E e = lowerBound(); ; e = succ(e) ) { $\C{// (2)}$
228	sout \| label(e) \| nonl;
229	if (e == upperBound()) break;
230	}
231	sout \| nl;
232	A B C D
233	A B C D
234	\end{cfa}
235
236	A caveat in writing loop control using @pred@ and @succ@ is unintentionally exceeding the range.
237	\begin{cfa}
238	for ( E e = upperBound(); e >= lowerBound(); e = pred( e ) ) {}
239	for ( E e = lowerBound(); e <= upperBound(); e = succ( e ) ) {}
240	\end{cfa}
241	Both of these loops look correct but fail because there is an additional bound check within the advancement \emph{before} the conditional test to stop the loop, resulting in a failure at the endpoints of the iteration.
242	These loops must be restructured by moving the loop test to the end of the loop (@do-while@), as in loop (2) above, which is safe because an enumeration always has at least one enumerator.
243
244
245	\subsection{Range Loop}
246
247	Instead of writing the traditional 3-tuple loop control, \CFA supports a \newterm{range loop}.
248	\begin{cfa}
249	for ( @E e; A ~= D@ ) { sout \| label( e ) \| nonl; } sout \| nl;
250	for ( @e; A ~= D@ ) { sout \| label( e ) \| nonl; } sout \| nl;
251	for ( @E e; A -~= D@ ) { sout \| label( e ) \| nonl; } sout \| nl;
252	for ( @e; A -~= D ~ 2@ ) { sout \| label( e ) \| nonl; } sout \| nl;
253	\end{cfa}
254	Every range loop above has an index declaration and a @range@ bounded by \newterm{left bound} @A@ and \newterm{right bound} @D@.
255	If the index declaration-type is omitted, the index type is the type of the lower bound (@typeof( A )@).
256	If a range is joined by @~=@ (range up equal) operator, the index variable is initialized by the left bound and advanced by 1 until it is greater than the right bound.
257	If a range is joined by @-~=@ (range down equal) operator, the index variable is initialized by the right bound and advanced by -1 until it is less than the left bound.
258	(Note, functions @pred@ and @succ@ are not used for advancement, so the advancement problem does not occur.)
259	A range can be suffixed by a positive \newterm{step}, \eg @~ 2@, so advancement is incremented/decremented by step.
260
261	Finally, a shorthand for enumerating over the entire set of enumerators (the most common case) is using the enumeration type for the range.
262	\begin{cfa}
263	for ( e; @E@ ) sout \| label( e ) \| nonl; sout \| nl; $\C{// A B C D}$
264	for ( e; @-~= E@ ) sout \| label( e ) \| nonl; sout \| nl; $\C{// D C B A}$
265	\end{cfa}
266	For a \CFA enumeration, the loop enumerates over all enumerators of the enumeration.
267	For a type matching the @Serial@ trait: the index variable is initialized to @lowerBound@ and loop control checks the index's value for greater than the @upperBound@.
268	If the range type is not a \CFA enumeration or does not match trait @Serial@, it is compile-time error.
269
270
271	\section{Overload Operators}
272
273	\CFA overloads the comparison operators for \CFA enumeration satisfying traits @Serial@ and @CfaEnum@.
274	These definitions require the operand types to be the same, and the appropriate comparison is made using the the positions of the operands.
275	\begin{cfa}
276	forall( E \| CfaEnum( E ) \| Serial( E ) ) @{@ $\C{// distribution block}$
277	// comparison
278	int ?==?( E l, E r ); $\C{// true if l and r are same enumerators}$
279	int ?!=?( E l, E r ); $\C{// true if l and r are different enumerators}$
280	int ?<?( E l, E r ); $\C{// true if l is an enumerator before r}$
281	int ?<=?( E l, E r ); $\C{// true if l before or the same as r}$
282	int ?>?( E l, E r ); $\C{// true if l is an enumerator after r}$
283	int ?>=?( E l, E r ); $\C{// true if l after or the same as r}$
284	@}@
285	\end{cfa}
286	(Note, all the function prototypes are wrapped in a distribution block, where all qualifiers preceding the block are distributed to each declaration with the block, which eliminates tedious repeated qualification.
287	Distribution blocks can be nested.)
288
289	\CFA implements a few arithmetic operators for @CfaEnum@.
290	% Unlike advancement functions in @Serial@, these operators perform direct arithmetic, so there is no implicit bound checks.
291	Bound checks are added to these operations to ensure the outputs fulfill the @Bounded@ invariant.
292	\begin{cfa}
293	forall( E \| CfaEnum( E ) \| Serial( E ) ) { $\C{// distribution block}$
294	// comparison
295	E ++?( E & l );
296	E --?( E & l );
297	E ?+=? ( E & l, one_t );
298	E ?-=? ( E & l, one_t );
299	E ?+=? ( E & l, int i );
300	E ?-=? ( E & l, int i );
301	E ?++( E & l );
302	E ?--( E & l );
303	}
304	\end{cfa}
305
306	Lastly, \CFA does not define @zero_t@ for \CFA enumeration.
307	Users can define the boolean @false@ for \CFA enumerations on their own, \eg:
308	\begin{cfa}
309	forall( E \| CfaEnum( E ) )
310	int ?!=?( E lhs, zero_t ) {
311	return posn( lhs ) != 0;
312	}
313	\end{cfa}
314	which effectively turns the first enumeration into a logical @false@ and @true@ for the others.

Note: See TracBrowser for help on using the repository browser.

Download in other formats: