Changeset ca4f2b2
- Timestamp:
- May 13, 2024, 10:07:35 AM
- Branches:
- master
- Children:
- 31f4837, ccfbfd9
- Parents:
- bf4fe05
- File:
- 1 edited
doc/theses/mike_brooks_MMath/background.tex
rbf4fe05 rca4f2b2
437 437 class seqnode : public uSeqable { ... }
438 438 %]
439 A node inheriting from @uSeqable@ can appear in a sequence/collection but a node inher ting from @uColable@ can only appear in a collection.
439 A node inheriting from @uSeqable@ can appear in a sequence/collection but a node inheriting from @uColable@ can only appear in a collection.
440 440 Along with providing the appropriate link fields, the types @uColable@ and @uSeqable@ also provide one member routine:
441 441 %[
… …
604 604 supplying the link fields by inheritance makes them implicit and relies on compiler placement, such as the start or end of @req@.
605 605 An example of an explicit attribute is cache alignment of the link fields in conjunction with other @req@ fields, improving locality and/or avoiding false sharing.
606 Wrapped reference has no control over the link fields, but the sep erate data allows some control;
606 Wrapped reference has no control over the link fields, but the separate data allows some control;
607 607 wrapped value has no control over data or links.
608 608
… …
690 690 Each group of intrusive links become the links for each separate STL list.
691 691 The upside is the unlimited number of a lists a node can be associated with simultaneously, any number of STL lists can be created dynamically.
692 The downside is the dynamic allocation of the link nodes and man ging multiple lists.
692 The downside is the dynamic allocation of the link nodes and managing multiple lists.
693 693 Note, it might be possible to wrap the multiple lists in another type to hide this implementation issue.
694 694
… …
776 776 \section{String}
777 777
778 A string is a logical sequence of symbols, where the form of the symbolscan vary significantly: 7/8-bit characters (ASCII/Latin-1), or 2/4/8-byte (UNICODE) characters/symbols or variable length (UTF-8/16/32) characters.
778 A string is a sequence of symbols, where the form of a symbol can vary significantly: 7/8-bit characters (ASCII/Latin-1), or 2/4/8-byte (UNICODE) characters/symbols or variable length (UTF-8/16/32) characters.
779 779 A string can be read left-to-right, right-to-left, top-to-bottom, and have stacked elements (Arabic).
780 780
781 An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in @'x'@.
782 A wide character constant is the same, except prefixed by the letter @L@, @u@, or @U@.
783 Except for escape sequences, the elements of the sequence are any members of the source character set;
784 they are mapped in an implementation-defined manner to members of the execution character set.
785
786 A C character-string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in @"xyz"@.
787 A UTF-8 string literal is the same, except prefixed by @u8@.
788 A wide string literal is the same, except prefixed by the letter @L@, @u@, or @U@.
789
790 For UTF-8 string literals, the array elements have type @char@, and are initialized with the characters of the multibyte character sequence, as encoded in UTF-8.
791 For wide string literals prefixed by the letter @L@, the array elements have type @wchar_t@ and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by the @mbstowcs@ function with an implementation-defined current locale.
792 For wide string literals prefixed by the letter @u@ or @U@, the array elements have type @char16_t@ or @char32_t@, respectively, and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by successive calls to the @mbrtoc16@, or @mbrtoc32@ function as appropriate for its type, with an implementation-defined current locale.
781 A C character constant is an ASCII/Latin-1 character enclosed in single-quotes, \eg @'x'@, @'@\textsterling@'@.
782 A wide C character constant is the same, except prefixed by the letter @L@, @u@, or @U@, \eg @u'\u25A0'@ (black square), where the @\u@ identifies a universal character name.
783 A character can be formed from an escape sequence, which expresses a non-typable character (@'\n'@), a delimiter character @'\''@, or a raw character @'\x2f'@.
784
785 A character sequence is zero or more regular, wide, or escape characters enclosed in double-quotes @"xyz\n"@.
786 The kind of characters in the string is denoted by a prefix: UTF-8 characters are prefixed by @u8@, wide characters are prefixed by @L@, @u@, or @U@.
787
788 For UTF-8 string literals, the array elements have type @char@ and are initialized with the characters of the multibyte character sequences, \eg @u8"\xe1\x90\x87"@ (Canadian syllabics Y-Cree OO).
789 For wide string literals prefixed by the letter @L@, the array elements have type @wchar_t@ and are initialized with the wide characters corresponding of the multibyte character sequence, \eg @L"abc@$\mu$@"@ and read/print using @wsanf@/@wprintf@.
790 The value of a wide-character is implementation-defined, usually a UTF-16 character.
791 For wide string literals prefixed by the letter @u@ or @U@, the array elements have type @char16_t@ or @char32_t@, respectively, and are initialized with wide characters corresponding to the multibyte character sequence, \eg @u"abc@$\mu$@"@, @U"abc@$\mu$@"@.
792 The value of a @"u"@ character is an UTF-16 character;
793 the value of a @"U"@ character is an UTF-32 character.
793 794 The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set is implementation-defined.
794 795
795
796 Another bad C design decision is to have null-terminated strings rather than maintaining a separate string length.
796 C strings are null-terminated rather than maintaining a separate string length.
797 797 \begin{quote}
798 798 Technically, a string is an array whose elements are single characters.
… …
800 800 This representation means that there is no real limit to how long a string can be, but programs have to scan one completely to determine its length.
801 801 \end{quote}
802 Unfortunately, this design decision is both unsafe and inefficient.
803 It is common error in C to forget the space in a character array for the terminator or overwrite the terminator, resulting in array overruns in string operations.
804 The need to repeatedly scan an entire string to determine its length can result in significant cost, as it is not possible to cache the length in many cases.
805
806 C strings are fixed size because arrays are used for the implementation.
807 However, string manipulation commonly results in dynamically-sized temporary and final string values.
808 As a result, storage management for C strings is a nightmare, quickly resulting in array overruns and incorrect results.
809
810 Collectively, these design decisions make working with strings in C, awkward, time consuming, and very unsafe.
811 While there are companion string routines that take the maximum lengths of strings to prevent array overruns, that means the semantics of the operation can fail because strings are truncated.
812 Suffice it to say, C is not a go-to language for string applications, which is why \CC introduced the @string@ type.