Changeset 7568e5c for doc/theses/jiada_liang_MMath/background.tex
- Timestamp:
- Aug 8, 2024, 10:39:40 PM (13 hours ago)
- Branches:
- master
- Children:
- acab1bd
- Parents:
- c1c0efdb
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/jiada_liang_MMath/background.tex
rc1c0efdb r7568e5c 1 1 \chapter{Background} 2 2 3 This chapter covers background material for C enumerations and \CFA features used in later discussion .3 This chapter covers background material for C enumerations and \CFA features used in later discussions. 4 4 5 5 … … 14 14 \begin{enumerate}[leftmargin=*] 15 15 \item 16 For @#define@, the programmer has toexplicitly manage the constant name and value.17 Furthermore, these C preprocessor macro names are outside of the C type-system and can incorrectly change random text in a program.16 For @#define@, the programmer must explicitly manage the constant name and value. 17 Furthermore, these C preprocessor macro names are outside the C type system and can incorrectly change random text in a program. 18 18 \item 19 19 The same explicit management is true for the @const@ declaration, and the @const@ variable cannot appear in constant-expression locations, like @case@ labels, array dimensions,\footnote{ 20 C allows variable-length array -declarations (VLA), so this case does work, but it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} immediate oper\-ands of assembler instructions,and occupies storage.20 C allows variable-length array declarations (VLA), so this case does work. Still, it fails in \CC, which does not support VLAs, unless it is \lstinline{g++}.} immediate oper\-ands of assembler instructions and occupies storage. 21 21 \begin{clang} 22 22 $\$$ nm test.o … … 87 87 enum { Size = 20, Max = 10, MaxPlus10 = Max + 10, @Max10Plus1@, Fred = -7 }; 88 88 \end{clang} 89 Here, the aliased constants are :20, 10, 20, 21, and -7.90 Direct initialization is by a compile-time expression generatinga constant value.89 Here, the aliased constants are 20, 10, 20, 21, and -7. 90 Direct initialization is achieved by a compile-time expression that generates a constant value. 91 91 Indirect initialization (without an initializer, @Max10Plus1@) is called \newterm{auto-initialization}, where enumerators are assigned values from left to right, starting at zero or the next explicitly initialized constant, incrementing by @1@. 92 92 Because multiple independent enumerators can be combined, enumerators with the same values can occur. 93 The enumerators are rvalues, soassignment is disallowed.93 The enumerators are @rvalues@, so the assignment is disallowed. 94 94 Finally, enumerators are \newterm{unscoped}, \ie enumerators declared inside of an @enum@ are visible (projected) outside into the enclosing scope of the @enum@ type. 95 For unnamed enumerations, this semantic is requiredbecause there is no type name for scoped qualification.96 97 As noted, this kind ofaliasing declaration is not an enumeration, even though it is declared using an @enum@ in C.95 This semantic is required for unnamed enumerations because there is no type name for scoped qualification. 96 97 As noted, this aliasing declaration is not an enumeration, even though it is declared using an @enum@ in C. 98 98 While the semantics is misleading, this enumeration form matches with aggregate types: 99 99 \begin{cfa} … … 121 121 }; 122 122 \end{clang} 123 Note ,the comma in the enumerator list can be a terminator or a separator, allowing the list to end with a dangling comma.\footnote{123 Note the comma in the enumerator list can be a terminator or a separator, allowing the list to end with a dangling comma.\footnote{ 124 124 A terminating comma appears in other C syntax, \eg the initializer list.} 125 125 This feature allows enumerator lines to be interchanged without moving a comma. … … 130 130 \label{s:CenumImplementation} 131 131 132 In theory, a C enumeration \emph{variable} is an implementation-defined integral type large enough to hold all enumerator values.132 Theoretically, a C enumeration \emph{variable} is an implementation-defined integral type large enough to hold all enumerator values. 133 133 In practice, C defines @int@~\cite[\S~6.4.4.3]{C11} as the underlying type for enumeration variables, restricting initialization to integral constants, which have type @int@ (unless qualified with a size suffix). 134 134 However, type @int@ is defined as: … … 137 137 \end{quote} 138 138 However, @int@ means 4 bytes on both 32/64-bit architectures, which does not seem like the ``natural'' size for a 64-bit architecture. 139 Whereas ,@long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures.140 \VRef[Figure]{f:gccEnumerationStorageSize} shows both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based its initialization.141 Hence, initialization in the range @INT_MIN@..@INT_MAX@ results in a 4-byte enumerator, and outside this range the enumerator is 8 bytes.139 Whereas @long int@ means 4 bytes on a 32-bit and 8 bytes on 64-bit architectures, and @long long int@ means 8 bytes on both 32/64-bit architectures, where 64-bit operations are simulated on 32-bit architectures. 140 \VRef[Figure]{f:gccEnumerationStorageSize} shows both @gcc@ and @clang@ partially ignore this specification and type the integral size of an enumerator based on its initialization. 141 Hence, initialization in the range @INT_MIN@..@INT_MAX@ results in a 4-byte enumerator, and outside this range, the enumerator is 8 bytes. 142 142 Note that @sizeof( typeof( IMin ) ) != sizeof( E )@, making the size of an enumerator different than is containing enumeration type, which seems inconsistent, \eg @sizeof( typeof( 3 ) ) == sizeof( int )@. 143 143 … … 168 168 \label{s:Usage} 169 169 170 C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type ,and between two different enumerations.170 C proves an implicit \emph{bidirectional} conversion between an enumeration and its integral type and between two different enumerations. 171 171 \begin{clang} 172 172 enum Week week = Mon; $\C{// week == 0}$ … … 178 178 @week = Winter;@ $\C{// UNDEFINED! implicit conversion to Week}$ 179 179 \end{clang} 180 While converting an enumerator to its underlying type is useful, the implicit conversion from the base or another enumeration type to an enumeration is a common source of error.180 While converting an enumerator to its underlying type is sound, the implicit conversion from the base or another enumeration type to an enumeration is a common source of error. 181 181 182 182 Enumerators can appear in @switch@ and looping statements. … … 194 194 \end{cfa} 195 195 For iterating using arithmetic to make sense, the enumerator values \emph{must} have a consecutive ordering with a fixed step between values. 196 For example, a previous gap introduced by @Thu = 10@ ,results in iterating over the values 0--13, where values 3--9 are not @Week@ values.197 Note , it is the bidirectional conversion thatallows incrementing @day@: @day@ is converted to @int@, integer @1@ is added, and the result is converted back to @Week@ for the assignment to @day@.196 For example, a previous gap introduced by @Thu = 10@ results in iterating over the values 0--13, where values 3--9 are not @Week@ values. 197 Note that the bidirectional conversion allows incrementing @day@: @day@ is converted to @int@, integer @1@ is added, and the result is converted back to @Week@ for the assignment to @day@. 198 198 For safety, \CC does not support the bidirectional conversion, and hence, an unsafe cast is necessary to increment @day@: @day = (Week)(day + 1)@. 199 199 200 There is a C idiom t o automatically compute the number of enumerators in an enumeration.200 There is a C idiom that computes the number of enumerators in an enumeration automatically. 201 201 \begin{cfa} 202 202 enum E { A, B, C, D, @N@ }; // N == 4 203 203 for ( enum E e = A; e < @N@; e += 1 ) ... 204 204 \end{cfa} 205 Here, serendipitouslythe auto-incrementing counts the number of enumerators and puts the total into the last enumerator @N@.205 Serendipitously, the auto-incrementing counts the number of enumerators and puts the total into the last enumerator @N@. 206 206 This @N@ is often used as the dimension for an array associated with the enumeration. 207 207 \begin{cfa} … … 226 226 \end{cfa} 227 227 However, the companion idiom results in the \emph{harmonizing} problem because an update to the enumeration @Integral_Type@ often requires a corresponding update to the companion array \snake{Integral_Name}. 228 The requirement to harmonize is at bestindicated by a comment before the enumeration.228 The requirement to harmonize is, at best, indicated by a comment before the enumeration. 229 229 This issue is exacerbated if enumeration and companion array are in different translation units. 230 230 231 231 \bigskip 232 While C provides a true enumeration, it is restricted, has unsafe semantics, and does not provide useful/advanced enumeration features foundin other programming languages.232 While C provides a true enumeration, it is restricted, has unsafe semantics, and does not provide helpful/advanced enumeration features in other programming languages. 233 233 234 234 235 235 \section{\texorpdfstring{\CFA}{Cforall}} 236 236 237 \CFA in \emph{not} an object-oriented programming -language, \ie functions cannot be nested in aggregate types, and hence, there is no \newterm{receiver} notation for calling functions, \eg @obj.method(...)@, where the first argument proceeds the call and becomes an implicit first (\lstinline[language=C++]{this}) parameter.237 \CFA in \emph{not} an object-oriented programming language, \ie functions cannot be nested in aggregate types, and hence, there is no \newterm{receiver} notation for calling functions, \eg @obj.method(...)@, where the first argument proceeds the call and becomes an implicit first (\lstinline[language=C++]{this}) parameter. 238 238 The following sections provide short descriptions of \CFA features needed further in the thesis. 239 Other \CFA features are presented in -situ with short explanations,or no explanation because the feature is obvious to C programmers.239 Other \CFA features are presented in situ with short or no explanation because the feature is obvious to C programmers. 240 240 241 241 242 242 \subsection{Overloading} 243 243 244 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources , like includefiles.244 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources like included files. 245 245 \begin{quote} 246 246 There are only two hard things in Computer Science: cache invalidation and naming things. --- Phil Karlton … … 269 269 \subsection{Function Overloading} 270 270 271 Both \CFA and \CC allow function names to be overloaded ,as long as their prototypes differ in the number and type of parameters and returns.271 Both \CFA and \CC allow function names to be overloaded as long as their prototypes differ in the number and type of parameters and returns. 272 272 \begin{cfa} 273 273 void f( void ); $\C[1.75in]{// (1): no parameter}$ … … 277 277 \end{cfa} 278 278 In this case, the name @f@ is overloaded depending on the number and parameter types. 279 The type system examines each call size and selects the best match based on the number and types of thearguments.280 Here, the re is a perfect match for the call, @f( 'A' )@ withthe number and parameter type of function (2).281 282 Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name.279 The type system examines each call size and selects the best match based on the number and types of arguments. 280 Here, the call @f( 'A' )@ is a perfect match for the number and parameter type of function (2). 281 282 Ada, Scala, and \CFA type-systems also use the return type to pinpoint the best-overloaded name in resolving a call. 283 283 \begin{cfa} 284 284 int f( void ); $\C[1.75in]{// (4); overloaded on return type}$ … … 302 302 } 303 303 \end{cfa} 304 The \CFA type system simplytreats overloaded variables as an overloaded function returning a value with no parameters.304 The \CFA type system treats overloaded variables as an overloaded function returning a value with no parameters. 305 305 Hence, no significant effort is required to support this feature. 306 306 … … 312 312 313 313 All objects in \CFA are initialized by @constructors@ \emph{after} allocation and de-initialized \emph{before} deallocation. 314 \CC cannot have constructors for basic -types because they have no aggregate type \lstinline[language=C++]{struct/class} in which to insert a constructor definition.314 \CC cannot have constructors for basic types because they have no aggregate type \lstinline[language=C++]{struct/class} in which to insert a constructor definition. 315 315 Like \CC, \CFA has multiple auto-generated constructors for every type. 316 316 317 317 The prototype for the constructor/destructor are @void ?{}( T &, ... )@ and @void ^?{}( T &, ... )@, respectively. 318 The first parameter is logically the \lstinline[language=C++]{this} or \lstinline[language=Python]{self} in other object-oriented languages ,and implicitly passed.318 The first parameter is logically the \lstinline[language=C++]{this} or \lstinline[language=Python]{self} in other object-oriented languages and implicitly passed. 319 319 \VRef[Figure]{f:CFAConstructorDestructor} shows an example of creating and using a constructor and destructor. 320 320 Both constructor and destructor can be explicitly called to reuse a variable. … … 350 350 351 351 The C constants @0@ and @1@ have special meaning. 352 @0@ is the null pointer and used in conditional expressions, where @if ( p )@ is rewritten @if ( p != 0 )@;352 @0@ is the null pointer and is used in conditional expressions, where @if ( p )@ is rewritten @if ( p != 0 )@; 353 353 @1@ is an additive identity in unary operators @++@ and @--@. 354 354 Aware of their significance, \CFA provides a special type @zero_t@ and @one_t@ for custom types. … … 385 385 bar( s ); 386 386 \end{cfa} 387 The assertion on @T@ restricts the range of types that can be manipulated by @bar@ to only those that have an implementation of@foo@ with the matching signature, allowing @bar@'s call to @foo@ in its body.387 The assertion on @T@ restricts the range of types that can be manipulated by @bar@ to only those that implement @foo@ with the matching signature, allowing @bar@'s call to @foo@ in its body. 388 388 Unlike templates in \CC, which are macro expansions at the call site, \CFA polymorphic functions are compiled, passing the call-site assertion functions as hidden parameters. 389 389 … … 392 392 393 393 A @forall@ clause can assert many restrictions on multiple types. 394 A common practice is to refactor the assertions into a named \newterm{trait}, similar to other languages,like Go and Rust.394 A common practice is refactoring the assertions into a named \newterm{trait}, similar to other languages like Go and Rust. 395 395 \begin{cfa} 396 396 forall(T) trait @Bird@ { … … 407 407 bird_fly( 23, robin ); 408 408 \end{cfa} 409 Grouping type assertions into a named trait effectively creates a reusable interface for parametric -polymorphic types.409 Grouping type assertions into a named trait effectively creates a reusable interface for parametric polymorphic types. 410 410 411 411 … … 417 417 418 418 The \CFA resolver attempts to identify the best candidate based on: first, the number of parameters and types, and second, when no exact match exists, the fewest implicit conversions and polymorphic variables. 419 Finding an exact match is not discussed here, because the mechanism is fairly straightforward, even when the search space is large;419 Finding an exact match is not discussed here, because the mechanism is fairly straightforward, even when the search space is ample; 420 420 only finding a non-exact match is discussed in detail. 421 421 … … 425 425 426 426 Most programming languages perform some implicit conversions among basic types to facilitate mixed-mode arithmetic; 427 otherwise, the program becomes littered with many explicit casts , which does not match withprogrammer's expectations.428 C is an aggressive language as it provides conversions among almost all of thebasic types, even when the conversion is potentially unsafe or not meaningful, \ie @float@ to @bool@.427 otherwise, the program becomes littered with many explicit casts which do not match the programmer's expectations. 428 C is an aggressive language, providing conversions among almost all basic types, even when the conversion is potentially unsafe or not meaningful, \ie @float@ to @bool@. 429 429 C defines the resolution pattern as ``usual arithmetic conversion''~\cite[\S~6.3.1.8]{C11}, in which C looks for a \newterm{common type} between operands, and converts one or both operands to the common type. 430 Loosely defined, acommon type is the smallest type in terms of the size of representation that both operands can be converted into without losing their precision, called a \newterm{widening} or \newterm{safe conversion}.430 A common type is the smallest type in terms of the size of representation that both operands can be converted into without losing their precision, called a \newterm{widening} or \newterm{safe conversion}. 431 431 432 432 \CFA generalizes ``usual arithmetic conversion'' to \newterm{conversion cost}. … … 438 438 @poly@ is the number of polymorphic function parameters, and 439 439 \item 440 @safe@ is sum of the degree of safe (widening) conversions.440 @safe@ is the sum of the degree of safe (widening) conversions. 441 441 \end{enumerate} 442 442 Sum of degree is a method to quantify C's integer and floating-point rank. 443 443 Every pair of widening conversion types is assigned a \newterm{distance}, and the distance between the two same types is 0. 444 For example, the distance from @char@ to @int@ is 2, the distance from @int@ to @long@ is 1, and the distancefrom @int@ to @long long int@ is 2.444 For example, the distance from @char@ to @int@ is 2, from @int@ to @long@ is 1, and from @int@ to @long long int@ is 2. 445 445 This distance does not mirror C's rank system. 446 For example, the rank of @char@ and @signed char@are the same in C, but the distance from @char@ to @signed char@ is assigned 1.446 For example, the @char@ and @signed char@ ranks are the same in C, but the distance from @char@ to @signed char@ is assigned 1. 447 447 @safe@ cost is summing all pairs of arguments to parameter safe conversion distances. 448 Among the three costs in Bilson's model, @unsafe@ is the most significant cost and @safe@ is the least significant, with an implication that \CFA always choosea candidate with the lowest @unsafe@, if possible.449 450 For example, assume the overloaded function @foo@ is called with two @int@ parameter .451 The cost for every overloaded @foo@ has been listed along :448 Among the three costs in Bilson's model, @unsafe@ is the most significant cost, and @safe@ is the least significant, implying that \CFA always chooses a candidate with the lowest @unsafe@, if possible. 449 450 For example, assume the overloaded function @foo@ is called with two @int@ parameters. 451 The cost for every overloaded @foo@ has been listed along with the following: 452 452 \begin{cfa} 453 453 void foo( char, char ); $\C[2.5in]{// (1) (2, 0, 0)}$ … … 479 479 \CFA favours candidates with more restrictions on polymorphism, so @forall( T ) void foo( T, T )@ has lower cost. 480 480 @specialization@ is an arbitrary count-down value starting at zero. 481 For every type assertion in @forall@ clause (no assertions in the above example), \CFA subtracts one from @specialization@.482 More type assertions mean smore constraints on argument types, making the function less generic.481 For every type assertion in the @forall@ clause (no assertions in the above example), \CFA subtracts one from @specialization@. 482 More type assertions mean more constraints on argument types, making the function less generic. 483 483 484 484 \CFA defines two special cost values: @zero@ and @infinite@. 485 A conversion cost is @zero@ when the argument and parameter have an exact match, and a conversion cost is @infinite@ when there is no defined conversion between t wo types.485 A conversion cost is @zero@ when the argument and parameter have an exact match, and a conversion cost is @infinite@ when there is no defined conversion between the two types. 486 486 For example, the conversion cost from @int@ to a @struct S@ is @infinite@. 487 487 488 488 In \CFA, the meaning of a C-style cast is determined by its @Cast Cost@. 489 For most cast-expression resolutions, a cast cost is equal toa conversion cost.490 Cast cost exists as an independent matrix for conversion that cannot happen implicitly ,while being possible with an explicit cast.491 These conversions are often defined to have ainfinite conversion cost and a non-infinite cast cost.489 For most cast-expression resolutions, a cast cost equals a conversion cost. 490 Cast cost exists as an independent matrix for conversion that cannot happen implicitly while being possible with an explicit cast. 491 These conversions are often defined as having an infinite conversion cost and a non-infinite cast cost.
Note: See TracChangeset
for help on using the changeset viewer.