-                      rbf4fe05
+                      rca4f2b2
 class seqnode : public uSeqable { ... }
 %]
 A node inheriting from @uSeqable@ can appear in a sequence/collection but a node inherting from @uColable@ can only appear in a collection.
+A node inheriting from @uSeqable@ can appear in a sequence/collection but a node inheriting from @uColable@ can only appear in a collection.
 Along with providing the appropriate link fields, the types @uColable@ and @uSeqable@ also provide one member routine:
 %[
 …
 supplying the link fields by inheritance makes them implicit and relies on compiler placement, such as the start or end of @req@.
 An example of an explicit attribute is cache alignment of the link fields in conjunction with other @req@ fields, improving locality and/or avoiding false sharing.
 Wrapped reference has no control over the link fields, but the seperate data allows some control;
+Wrapped reference has no control over the link fields, but the separate data allows some control;
 wrapped value has no control over data or links.
 …
 Each group of intrusive links becomes the links for a separate STL list.
 The upside is the unlimited number of lists a node can be associated with simultaneously, as any number of STL lists can be created dynamically.
 The downside is the dynamic allocation of the link nodes and manging multiple lists.
+The downside is the dynamic allocation of the link nodes and managing multiple lists.
 Note, it might be possible to wrap the multiple lists in another type to hide this implementation issue.
 …
 \section{String}
 A string is a logical sequence of symbols, where the form of the symbols can vary significantly: 7/8-bit characters (ASCII/Latin-1), or 2/4/8-byte (UNICODE) characters/symbols or variable length (UTF-8/16/32) characters.
+A string is a sequence of symbols, where the form of a symbol can vary significantly: 7/8-bit characters (ASCII/Latin-1), or 2/4/8-byte (UNICODE) characters/symbols or variable length (UTF-8/16/32) characters.
 A string can be read left-to-right, right-to-left, top-to-bottom, and have stacked elements (Arabic).
+An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in @'x'@.
+A wide character constant is the same, except prefixed by the letter @L@, @u@, or @U@.
+Except for escape sequences, the elements of the sequence are any members of the source character set;
+they are mapped in an implementation-defined manner to members of the execution character set.
+A C character-string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in @"xyz"@.
+A UTF-8 string literal is the same, except prefixed by @u8@.
+A wide string literal is the same, except prefixed by the letter @L@, @u@, or @U@.
+For UTF-8 string literals, the array elements have type @char@, and are initialized with the characters of the multibyte character sequence, as encoded in UTF-8.
+For wide string literals prefixed by the letter @L@, the array elements have type @wchar_t@ and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by the @mbstowcs@ function with an implementation-defined current locale.
+For wide string literals prefixed by the letter @u@ or @U@, the array elements have type @char16_t@ or @char32_t@, respectively, and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by successive calls to the @mbrtoc16@, or @mbrtoc32@ function as appropriate for its type, with an implementation-defined current locale.
+A C character constant is an ASCII/Latin-1 character enclosed in single-quotes, \eg @'x'@, @'@\textsterling@'@.
+A wide C character constant is the same, except prefixed by the letter @L@, @u@, or @U@, \eg @u'\u25A0'@ (black square), where the @\u@ identifies a universal character name.
+A character can be formed from an escape sequence, which expresses a non-typable character (@'\n'@), a delimiter character (@'\''@), or a raw character (@'\x2f'@).
+A character sequence is zero or more regular, wide, or escape characters enclosed in double-quotes @"xyz\n"@.
+The kind of characters in the string is denoted by a prefix: UTF-8 characters are prefixed by @u8@, wide characters are prefixed by @L@, @u@, or @U@.
+For UTF-8 string literals, the array elements have type @char@ and are initialized with the characters of the multibyte character sequences, \eg @u8"\xe1\x90\x87"@ (Canadian syllabics Y-Cree OO).
+For wide string literals prefixed by the letter @L@, the array elements have type @wchar_t@ and are initialized with the wide characters corresponding to the multibyte character sequence, \eg @L"abc@$\mu$@"@, which are read/printed using @wscanf@/@wprintf@.
+The value of a wide character is implementation-defined, commonly a UTF-16 code unit on Windows and a UTF-32 code point on Linux.
+For wide string literals prefixed by the letter @u@ or @U@, the array elements have type @char16_t@ or @char32_t@, respectively, and are initialized with wide characters corresponding to the multibyte character sequence, \eg @u"abc@$\mu$@"@, @U"abc@$\mu$@"@.
+The value of a @"u"@ character is a UTF-16 character;
+the value of a @"U"@ character is a UTF-32 character.
 The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set is implementation-defined.
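The prefix rules above can be checked with a minimal C11 sketch. The names @COUNT@, @utf8@, @wide@, @u16@, @u32@, and @check_literals@ are hypothetical, and $\mu$ is assumed to be Greek small mu (U+03BC), written as a universal character name so the source file encoding does not matter. The expected element values assume a compiler that encodes @u@/@U@ literals as UTF-16/UTF-32, which is common but not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>   /* char16_t, char32_t */
#include <wchar.h>   /* wchar_t */

/* Number of array elements; for a literal this includes the terminator. */
#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical example data: one array per literal prefix. */
static const char     utf8[] = u8"\xe1\x90\x87"; /* one symbol, three UTF-8 bytes */
static const wchar_t  wide[] = L"abc\u03bc";     /* element width is implementation-defined */
static const char16_t u16[]  = u"abc\u03bc";     /* 16-bit elements */
static const char32_t u32[]  = U"abc\u03bc";     /* 32-bit elements */

/* Assumption: U+03BC is encoded as a single element in all three wide forms. */
void check_literals(void) {
    assert(COUNT(utf8) == 4);   /* 3 bytes + terminator */
    assert(COUNT(wide) == 5);   /* 4 characters + terminator */
    assert(COUNT(u16) == 5);
    assert(COUNT(u32) == 5);
    assert(u16[3] == 0x03bc && u32[3] == 0x03bc);
}
```

Note that the UTF-8 array holds one symbol in three elements, so element count and symbol count differ even before null termination is considered.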
+Another bad C design decision is to have null-terminated strings rather than maintaining a separate string length.
+C strings are null-terminated rather than maintaining a separate string length.
 \begin{quote}
 Technically, a string is an array whose elements are single characters.
 …
 This representation means that there is no real limit to how long a string can be, but programs have to scan one completely to determine its length.
 \end{quote}
+Unfortunately, this design decision is both unsafe and inefficient.
+It is a common error in C to forget the space in a character array for the terminator or to overwrite the terminator, resulting in array overruns in string operations.
+The need to repeatedly scan an entire string to determine its length can result in significant cost, as it is not possible to cache the length in many cases.
+C strings are fixed size because arrays are used for the implementation.
+However, string manipulation commonly results in dynamically-sized temporary and final string values.
+As a result, storage management for C strings is a nightmare, quickly resulting in array overruns and incorrect results.
+Collectively, these design decisions make working with strings in C awkward, time consuming, and very unsafe.
+While there are companion string routines that take the maximum lengths of strings to prevent array overruns, the operation can then fail semantically because the resulting strings are truncated.
+Suffice it to say, C is not a go-to language for string applications, which is why \CC introduced the @string@ type.
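The truncation problem with length-bounded routines can be illustrated with a minimal sketch, assuming the standard @snprintf@ and @strlen@. The names @bounded_copy@, @counted@, @counted_from@, and @check_truncation@ are hypothetical and are not part of any library; the @counted@ structure only hints at what storing a separate length looks like.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst, which holds cap bytes (cap > 0). The result is always
   terminated. Returns 1 if src fit entirely, 0 if it was truncated. */
int bounded_copy(char *dst, size_t cap, const char *src) {
    int n = snprintf(dst, cap, "%s", src);  /* n = length src would need */
    return n >= 0 && (size_t)n < cap;
}

/* Hypothetical length-carrying view: the length is computed once,
   instead of rescanning for the terminator on every query. */
struct counted { size_t len; const char *data; };

struct counted counted_from(const char *s) {
    struct counted c = { strlen(s), s };    /* the single O(n) scan */
    return c;
}

void check_truncation(void) {
    char buf[4];                                       /* room for 3 characters + terminator */
    assert(bounded_copy(buf, sizeof buf, "abc"));      /* fits exactly */
    assert(!bounded_copy(buf, sizeof buf, "abcdef"));  /* no overrun, but truncated */
    assert(strcmp(buf, "abc") == 0);                   /* result is not the requested string */
    assert(counted_from("hello").len == 5);
}
```

The bounded copy avoids the overrun, but the caller still has to check the return value, otherwise the program continues with a shortened string.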

Changeset ca4f2b2

doc/theses/mike_brooks_MMath/background.tex