Ignore:
Timestamp:
Mar 14, 2025, 7:15:01 PM (7 months ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Children:
8617ee90
Parents:
7611631
Message:

change shareEdit to share and proofreading of string chapter

File:
1 edited

Legend:

Unmodified
Added
Removed
  • TabularUnified doc/theses/mike_brooks_MMath/string.tex

    r7611631 r06ffa95  
    99
    1010To prepare for the following discussion, comparisons among C, \CC, Java and \CFA strings are presented, beginning in \VRef[Figure]{f:StrApiCompare}.
    11 It provides a classic ``cheat sheet'' presentation, summarizing the names of the most common closely-equivalent operations.
     11It provides a classic ``cheat sheet'' presentation, summarizing the names of the most-common closely-equivalent operations.
    1212The over-arching commonality is that operations work on groups of characters for assigning, copying, scanning, and updating.
    1313
     
    3939hence, most of its group operations are error prone and expensive.
    4040Most high-level string libraries use a separate length field and specialized storage management to implement group operations.
    41 Interestingly, \CC strings retain null termination just in case it is needed to interface with C library functions.
     41Interestingly, \CC strings retain null termination in case it is needed to interface with C library functions.
    4242\begin{cfa}
    4343int open( @const char * pathname@, int flags );
     
    5252While \VRef[Figure]{f:StrApiCompare} emphasizes cross-language similarities, it elides many specific operational differences.
    5353For example, the @replace@ function selects a substring in the target and substitutes it with the source string, which can be smaller or larger than the substring.
    54 \CC performs the modification on the mutable receiver object.
     54\CC performs the modification on the mutable receiver object
    5555\begin{cfa}
    5656string s1 = "abcde";
     
    6060\end{cfa}
    6161while Java allocates and returns a new string with the result, leaving the receiver unmodified.
     62\label{p:JavaReplace}
    6263\begin{java}
    6364String s = "abcde";
    6465String r = s.replace( "cde", "xy" );  $\C[2.25in]{// replace by text, immutable}$
     66System.out.println( s + ' ' + r );
     67$\texttt{abcde abxy}$
     68\end{java}
     69% Generally, Java's @String@ type is immutable.
     70Java provides a @StringBuffer@ near-analog that is mutable.
     71\begin{java}
    6572StringBuffer sb = new StringBuffer( "abcde" );
    66 sb.replace( 2, 5, "xy" );  $\C{// replace by position, mutable}\CRT$
    67 System.out.println( s + ' ' + r + ' ' + sb );
    68 $\texttt{abcde abxy abxy}$
     73sb.replace( 2, 5, "xy" );  $\C[2.25in]{// replace by position, mutable}\CRT$
     74System.out.println( sb );
     75$\texttt{abxy}$
    6976\end{java}
    70 Generally, Java's @String@ type is immutable.
    71 Java provides a @StringBuffer@ near-analog that is mutable, but the differences are significant; for example, @StringBuffer@'s @substring@ functions still return copies rather than mutable selections.
    72 Finally, the operations between these type are asymmetric, \eg @string@ has @replace@ by text but not replace by position and vice versa for @StringBuffer@.
     77However, there are significant differences;
     78\eg, @StringBuffer@'s @substring@ function returns a @String@ copy that is immutable.
     79Finally, the operations between these type are asymmetric, \eg @String@ has @replace@ by text but not replace by position and vice versa for @StringBuffer@.
    7380
    7481More significant operational differences relate to storage management, often appearing through assignment (@target = source@), and are summarized in \VRef[Figure]{f:StrSemanticCompare}.
     
    140147$\texttt{\small abcde abcde abcde bcde cde}$
    141148\end{cfa}
    142 Note, all of these strings rely on the single null termination character at the end of @s@.
     149Note, all of these aliased strings rely on the single null termination character at the end of @s@.
    143150The issue of symmetry does not apply to C strings because the value and pointer strings are essentially different types, and so this feature is scored as not applicable for C.
    144 With the type not managing the text buffer, there is no ownership question, \ie operations on @s1@ or @s2@ never leads to their memory becoming reusable.
     151With the type not managing the text storage, there is no ownership question, \ie operations on @s1@ or @s2@ never leads to their memory becoming reusable.
    145152While @s3@ is a valid C-string that contains a proper substring of @s1@, the @s3@ technique does not constitute having a fragment referent because null termination implies the substring cannot be chosen arbitrarily; the technique works only for suffixes.
    146153
     
    157164\end{cfa}
    158165The lax symmetry reflects how the validity of @s1@ depends on the content and lifetime of @s@.
    159 It is common practice in \CC to use the @s1@-style pass by reference, with the understanding that the callee only use the referenced string for the duration of the call, \ie no side-effect using the parameter.
    160 So, when the called function is a constructor, it is typical to use an @s2@-style copy initialization to string-object-typed member.
     166It is common practice in \CC to use the @s1@-style pass by reference, with the understanding that the callee only uses the referenced string for the duration of the call, \ie no side-effect using the parameter.
     167So, when the called function is a constructor, it is typical to use an @s2@-style copy-initialization to string-object-typed member.
    161168Exceptions to this pattern are possible, but require the programmer to assure safety where the type system does not.
    162 The @s3@ initialization is constrained to copy the substring because @c_str@ always provides a null-terminated character, which is different from source string.
     169The @s3@ initialization is constrained to copy the substring because @c_str@ always provides a null-terminated character, which may be different from the source string.
    163170@s3@ assignment could be fast by reference counting the text area and using copy-on-write, but would require an implementation upgrade.
    164171
    165172In Java, @String@ also offers a high-level abstraction:
    166 \begin{cfa}
     173\begin{java}
    167174String s = "abcde";
    168175String s1 = s;  $\C[2.25in]{// snapshot state, strict symmetry, variable-constrained referent}$
     
    173180$\texttt{\small abcde abcde bc c}$
    174181$\texttt{\small true false false}$
    175 \end{cfa}
     182\end{java}
    176183Note, @substring@ takes a start and end position, rather than a start position and length.
    177184Here, facts about Java's implicit pointers and pointer equality can over complicate the picture, and so are ignored.
     
    180187With @s2@, the case for fast-copy is more subtle.
    181188Certainly, its value is not pointer-equal to @s@, implying at least a further allocation.
    182 TODO: finish the fast-copy case.
    183 Java's immutable strings mean aliasing is impossible with the @String@ type.
    184 Java's @StringBuffer@ provides aliasing, though without supporting symmetric treatment of a fragment referent; as a result, @StringBuffer@ scores as \CC.
    185 The easy symmetry that the Java string enjoys is aided by Java's garbage collection; Java's @s2@ is doing effectively the operation of \CC's @s2@, though without the consequence to of complicating memory management.
    186 
    187 Finally, In \CFA, @string@ also offers a high-level abstraction:
     189\PAB{TODO: finish the fast-copy case.}
     190Java does not meet the aliasing requirement because immutability make it impossible to modify.
     191Java's @StringBuffer@ provides aliasing (see @replace@ example on \VPageref{p:JavaReplace}), though without supporting symmetric treatment of a fragment referent, \eg @substring@ of a @StringBuffer@ is a @String@;
     192as a result, @StringBuffer@ scores as \CC.
     193The easy symmetry that the Java string enjoys is aided by Java's garbage collection; Java's @s2@ is doing effectively the operation of \CC's @s3@, though without the consequence of complicating memory management.
     194\PAB{What complex storage management is going on here?}
     195
     196Finally, in \CFA, @string@ also offers a high-level abstraction:
    188197\begin{cfa}
    189198string s = "abcde";
    190199string & s1 = s; $\C[2.25in]{// alias state, strict symmetry, variable-constrained referent}$
    191200string s2 = s; $\C{// snapshot state, strict symmetry, variable-constrained referent}$
    192 string s3 = s`shareEdits; $\C{// alias state, strict symmetry, variable-constrained referent}\CRT$
     201string s3 = s`share; $\C{// alias state, strict symmetry, variable-constrained referent}\CRT$
    193202string s4 = s( 1, 2 );
    194203string s5 = s4( 1, 1 );
     
    198207% all helpful criteria of \VRef[Figure]{f:StrSemanticCompare} are satisfied.
    199208The \CFA string manages storage, handles all assignments, including those of fragment referents with fast initialization, provides the choice between snapshot and alias semantics, and does so symmetrically with one type (which assures text validity according to the lifecycles of the string variables).
    200 With aliasing, the intuition is that each string is an editor on an open shared document.
    201 With fragment aliasing, the intuition is that these editor views have been scrolled or zoomed to overlapping, but potentially different, ranges.
     209The intended metaphor for \CFA stings is similar to a GUI text-editor or web browser.
     210Select a consecutive block of text using the mouse generates an aliased substring in the file/dialog-box.
     211Typing into the selected area is like assigning to an aliased substring, where the highlighted text is replaced with more or less text;
     212depending on the text entered, the file/dialog-box content grows or shrinks.
     213\PAB{Need to discuss the example, as for the other languages.}
    202214
    203215The remainder of this chapter explains how the \CFA string achieves this usage style.
    204216
    205217
    206 
    207218\section{Storage Management}
    208219
    209220This section discusses issues related to storage management of strings.
    210 Specifically, it is common for strings to logically overlap completely or partially.
     221Specifically, it is common for strings to logically overlap partially or completely.
    211222\begin{cfa}
    212223string s1 = "abcdef";
     
    218229s3[1] = 'w'; $\C{// what happens to s1 and s2?}$
    219230\end{cfa}
    220 This question is the notion of mutable or immutable strings.
    221 For example, Java has immutable strings that are copied when any overlapping string changes.
    222 Note, the notion of underlying string mutability is not specified by @const@, \eg:
     231which is restricted by a string's mutable or immutable property.
     232For example, Java's immutable strings require copy-on-write when any overlapping string changes.
     233Note, the notion of underlying string mutability is not specified by @const@; \eg in \CC:
    223234\begin{cfa}
    224235const string s1 = "abc";
    225236\end{cfa}
    226 Here, @const@ applies to the @s1@ pointer to @"abc"@, and @"abc"@ is an immutable constant that is \emph{copied} into the string's storage.
    227 Hence, @s1@ is not pointing at an immutable constant, meaning its underlying string is always mutable, unless some other designation is specified, such as Java's global rule.
     237the @const@ applies to the @s1@ pointer to @"abc"@, and @"abc"@ is an immutable constant that is \emph{copied} into the string's storage.
     238Hence, @s1@ is not pointing at an immutable constant, meaning its underlying string can be mutable, unless some other designation is specified, such as Java's global immutable rule.
    228239
    229240
    230241\subsection{Logical overlap}
    231242
    232 \CFA provides a dynamic mechanism to indicate mutable or immutable as an assignment attribute: @`shareEdits@.
    233 
    234 Consider two strings @s1@ and @s1a@ that are in an aliasing relationship, and a third, @s2@, made by a simple copy from @s1@.
    235 Aliasing (@`shareEdits@) means that changes flow in both directions; with a simple copy, they do not.
     243\CFA provides a dynamic mechanism to indicate mutable or immutable using the attribute @`share@.
     244This aliasing relationship is a sticky-property established at initialization.
     245For example, here strings @s1@ and @s1a@ are in an aliasing relationship, while @s2@ is in a copy relationship.
    236246\input{sharing1.tex}
    237 
    238 Aliasing (@`shareEdits@) means that changes flow in both directions; with a simple copy, they do not.
     247Here, the aliasing (@`share@) causes partial changes (subscripting) to flow in both directions.
    239248\input{sharing2.tex}
    240 
    241 Assignment of a value is just a modification.
    242 The aliasing relationship is established at construction and is unaffected by assignment of a value.
     249Similarly for complete changes.
    243250\input{sharing3.tex}
    244251
    245 Assignment from a string is just assignment of a value.
    246 Whether of not the RHS participates in aliasing is irrelevant.  Any aliasing of the LHS is unaffected.
     252Because string assignment copies the value, RHS aliasing is irrelevant.
     253Hence, aliasing of the LHS is unaffected.
    247254\input{sharing4.tex}
    248255
    249 Consider new strings @s1_mid@ being an alias for a run in the middle of @s1@, along with @s2@, made by a simple copy from the middle of @s1@.
     256Now, consider string @s1_mid@ being an alias in the middle of @s1@, along with @s2@, made by a simple copy from the middle of @s1@.
    250257\input{sharing5.tex}
    251 
    252 Again, @`shareEdits@ passes changes in both directions; copy does not.
    253 Note the difference in index values, with the \emph{b} position being 1 in the longer string and 0 in the shorter strings.
    254 In the case of s1 aliasing with @s1_mid@, the very same character is being accessed by different positions.
     258Again, @`share@ passes changes in both directions; copy does not.
     259As a result, the index values for the position of @b@ are 1 in the longer string @"abcd"@ and 0 in the shorter aliased string @"bc"@.
     260This alternate positioning also applies to subscripting.
    255261\input{sharing6.tex}
    256262
    257 Once again, assignment of a value is a modification that flows through the aliasing relationship, without affecting its structure.
     263Finally, assignment flows through the aliasing relationship without affecting its structure.
    258264\input{sharing7.tex}
    259 
    260 In the \emph{ff} step, which is a positive example of flow across an aliasing relationship, the result is straightforward to accept because the flow direction is from contained (small) to containing (large).
    261 The following rules for edits through aliasing substrings will guide how to flow in the opposite direction.
    262 
    263 Growth and shrinkage are natural extensions.
    264 An empty substring is a real thing, at a well-defined location, whose meaning is extrapolated from the examples so far.
    265 The intended metaphor is to operating a GUI text editor.
    266 Having an aliasing substring is like using the mouse to select a few words.
    267 Assigning onto an aliasing substring is like typing from having a few words selected: depending how much you type, the file being edited can get shorter or longer.
     265In the @"ff"@ assignment, the result is straightforward to accept because the flow direction is from contained (small) to containing (large).
     266The following rules explain aliasing substrings that flow in the opposite direction, large to small.
     267
     268Growth and shrinkage are natural extensions, as for the text-editor example mentioned earlier, where an empty substring is as real real as an empty string.
    268269\input{sharing8.tex}
    269270
    270 Multiple portions can be aliased.
    271 When there are several aliasing substrings at once, the text editor analogy becomes an online multi-user editor.
    272 I should be able to edit a paragraph in one place (changing the document's length), without my edits affecting which letters are within a mouse-selection that you had made previously, somewhere else.
     271Multiple portions of a string can be aliased.
     272% When there are several aliasing substrings at once, the text editor analogy becomes an online multi-user editor.
     273%I should be able to edit a paragraph in one place (changing the document's length), without my edits affecting which letters are within a mouse-selection that you had made previously, somewhere else.
    273274\input{sharing9.tex}
    274 
    275 When an edit happens on an aliasing substring that overlaps another, an effect is unavoidable.
    276 Here, the passive party sees its selection shortened, to exclude the characters that were not part of the original selection.
     275When @s1_bgn@'s size increases by 3, @s1_mid@'s starting location moves from 1 to 4 and @s1_end@'s from 3 to 6,
     276
     277When changes happens on an aliasing substring that overlap.
    277278\input{sharing10.tex}
    278 
    279 TODO: finish typesetting the demo
     279Strings @s1_crs@ and @s1_mid@ overlap at character 4, @j@ because the substrings are 3,2 and 4,2.
     280When @s1_crs@'s size increases by 1, @s1_mid@'s starting location moves from 4 to 5, but the overlaping character remains, changing to @'+'@.
     281
     282\PAB{TODO: finish typesetting the demo}
    280283
    281284%\input{sharing-demo.tex}
Note: See TracChangeset for help on using the changeset viewer.