Changeset c185ca9

Timestamp:

Feb 12, 2024, 1:09:10 PM (8 months ago)

Author:

Peter A. Buhr <pabuhr@…>

Branches:

master

Children:

e7b04a3

Parents:

77bc259

Message:


doc/user/user.tex

-                      r77bc259
+                      rc185ca9
 %% Created On       : Wed Apr  6 14:53:29 2016
 %% Last Modified By : Peter A. Buhr
 %% Last Modified On : Tue Jan 30 09:02:41 2024
 %% Update Count     : 6046
+%% Last Modified On : Mon Feb 12 11:50:26 2024
+%% Update Count     : 6199
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 …
 The \CFA header file for the I/O library is \Indexc{fstream.hfa}.
+\subsubsection{Stream Output}
 For implicit formatted output, the common case is printing a series of variables separated by whitespace.
 \begin{cquote}
 …
 Note, \CFA stream variables ©stdin©, ©stdout©, ©stderr©, ©exit©, and ©abort© overload C variables ©stdin©, ©stdout©, ©stderr©, and functions ©exit© and ©abort©, respectively.
+\subsubsection{Stream Input}
 For implicit formatted input, the common case is reading a sequence of values separated by whitespace, where the type of an input constant must match with the type of the input variable.
 \begin{cquote}
 \begin{lrbox}{\myboxA}
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
+int x;   double y;   char z;
+char c;   int i;   double d;
 \end{cfa}
 \end{lrbox}
 …
 \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CFA}}        & \multicolumn{1}{c@{\hspace{2em}}}{\textbf{\CC}}       & \multicolumn{1}{c}{\textbf{Python}}   \\
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 sin | x | y | z;
+sin | c | i | d;
 \end{cfa}
+&
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 cin >> x >> y >> z;
+cin >> c >> i >> d;
 \end{cfa}
+&
 \begin{cfa}[aboveskip=0pt,belowskip=0pt]
 x = int(input());  y = float(input());  z = input();
+c = input();   i = int(input());   d = float(input());
 \end{cfa}
 \\
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
 ®1® ®2.5® ®A®
+®A® ®1® ®2.5®
 …
+&
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
 ®1® ®2.5® ®A®
+®A® ®1® ®2.5®
 …
+&
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®A®
 ®1®
 ®2.5®
-®A®
 \end{cfa}
 \end{tabular}
 …
 For floating-point types, any number of decimal digits, optionally preceded by a sign (©+© or ©-©), optionally containing a decimal point, and optionally followed by an exponent, ©e© or ©E©, with signed (optional) decimal digits.
 Floating-point values can also be written in hexadecimal format preceded by ©0x© or ©0X© with hexadecimal digits and exponent denoted by ©p© or ©P©.
+In all cases, all whitespace characters are skipped until an appropriate value is found.
+\Textbf{If an appropriate value is not found, the exception ©missing_data© is raised.}
+For the C-string type, there are two input forms: any number of \Textbf{non-whitespace} characters or a quoted sequence containing any characters except the closing quote, \ie there is no escape character supported in the string.
+In both cases, the string is null terminated ©'\0'©.
+For the quoted string, the start and end quote characters can be any character and do not have to match \see{\ref{XXX}}.
+\VRef[Figure]{f:IOStreamFunctions} shows the I/O stream operations for interacting with files other than ©cin©, ©cout©, and ©cerr©.
+In all cases, whitespace characters are skipped until an appropriate value is found.
+\begin{cfa}[belowskip=0pt]
+char ch;  int i;  float f; double d;  _Complex double cxd;
+sin | ch | i | f | d | cxd;
+X   42   1234.5     0xfffp-2    3.5+7.1i
+\end{cfa}
+It is also possible to scan and ignore specific strings and whitespace using a string format.
+\begin{cfa}[belowskip=0pt]
+sin | "abc def";                                                §\C{// space matches arbitrary whitespace (2 blanks, 2 tabs)}§
+\end{cfa}
+\begin{cfa}[showspaces=true,showtabs=true,aboveskip=0pt,belowskip=0pt]
+®abc            def®
+\end{cfa}
+A non-whitespace format character reads the next input character, compares the format and input characters, and if equal, the input character is discarded and the next format character is tested.
+Note, a single whitespace in the format string matches \Textbf{any} quantity of whitespace characters from the stream (including none).
+For the C-string type, the default input format is any number of \Textbf{non-whitespace} characters.
+There is no escape character supported in an input string, but any Latin-1 character can be typed directly in the input string.
+For example, if the following non-whitespace output is redirected into a file by the shell:
+\begin{cfa}[belowskip=0pt]
+sout | "\n\t\f\0234\x23";
+\end{cfa}
+it can be read back from the file by redirecting the file as input using:
+\begin{cfa}[belowskip=0pt]
+char s[64];
+sin | wdi( sizeof(s), s );                              §\C{// must specify string size}§
+\end{cfa}
+The input string is always null terminated ©'\0'© in the input variable.
+Because of potential buffer overrun when reading C strings, strings are restricted to work with input manipulators \see{\VRef{s:InputManipulators}}.
+As well, there are multiple input-manipulators for scanning complex input string formats, \eg a quoted character or string.
+\Textbf{In all cases, if a valid data value is not found for a type or format string, the exception ©missing_data© is raised and the input variable is unchanged.}
+For example, when reading an integer and the string ©"abc"© is found, the exception ©missing_data© is raised to ensure the program does not proceed erroneously.
+If a valid data value is found but is larger than the capacity of the input variable, the result of the read is undefined.
+\subsubsection{Stream Files}
+\VRef[Figure]{f:IOStreamFunctions} shows the I/O stream operations for interacting with files other than ©sin©, ©sout©, and ©serr©.
 \begin{itemize}[topsep=4pt,itemsep=2pt,parsep=0pt]
 \item
 …
 \subsection{Input Manipulators}
+The following \Index{manipulator}s control scanning of input values (reading), and only affect the format of the argument.
+Certain manipulators support a \newterm{scanset}, which is a simple regular expression, where the matching set contains any Latin-1 character (8-bits) or character ranges using minus.
+\label{s:InputManipulators}
+A string variable \emph{must} be large enough to contain the input sequence.
+To force programmers to consider buffer overruns for C-string input, C-strings may only be read with a width field, which should specify a size less than or equal to the C-string size, \eg:
+\begin{cfa}
+char line[64];
+sin | wdi( ®sizeof(line)®, line );              §\C{// must specify string size}§
+\end{cfa}
+Certain input manipulators support a \newterm{scanset}, which is a simple regular expression, where the matching set contains any Latin-1 character (8-bits) or character ranges using minus.
 For example, the scanset \lstinline{"a-zA-Z -/?§"} matches any number of characters between ©'a'© and ©'z'©, between ©'A'© and ©'Z'©, between space and ©'/'©, and characters ©'?'© and (Latin-1) ©'§'©.
 The following string is matched by this scanset:
 \begin{cfa}
+!&%$  abAA () ZZZ  ??  xx§\S\S\S§
+\end{cfa}
+To match a minus, put it as the first character, ©"-0-9"©.
+Other complex forms of regular-expression matching are not supported.
+A string variable \emph{must} be large enough to contain the input sequence.
+To force programmers to consider buffer overruns for C-string input, C-strings can only be read with a width field, which should specify a size less than or equal to the C-string size, \eg:
+\begin{cfa}
+char line[64];
+sin | wdi( ®sizeof(line)®, line ); // must specify size
+\end{cfa}
+Currently, there is no mechanism to detect if a value read exceeds the capacity of the input variable. Most types are finite sized, \eg integral types only store values that fit into their corresponding storage, 8, 16, 32, 64, 128 bits.
+Hence, an input value may be too large, and the result of the read is often considered undefined, which leads to difficult-to-locate runtime errors.
+All reads in \CFA check if values do not fit into the argument variable's type and raise the exception
+All types are
+!&%$  abAA () ZZZ  ??§\S§  xx§\S\S§
+\end{cfa}
+To match a minus, make it the first character in the set, \eg ©"©{\color{red}\raisebox{-1pt}{\texttt{-}}}©0-9"©.
+Other complex forms of regular-expression matching are unsupported.
+The following \Index{manipulator}s control scanning of input values (reading) and only affect the format of the argument.
 \begin{enumerate}
 \item
+\Indexc{skip}( scanset )\index{manipulator!skip@©skip©}, ©skip©( $N$ )
+The first form uses a scanset to skip matching characters.
+The second form skips the next $N$ characters, including newline.
+If the match succeeds, the input characters are discarded, and input continues with the next character.
+\Indexc{skip}( \textit{scanset} )\index{manipulator!skip@©skip©}, ©skip©( $N$ )
+consumes either the \textit{scanset} or the next $N$ characters, including newlines.
+If the match succeeds, the input characters are ignored, and input continues with the next character.
 If the match fails, the input characters are left unread.
 \begin{cfa}[belowskip=0pt]
 char sk[§\,§] = "abc";
 sin | "abc " | skip( sk ) | skip( 5 ); // match input sequence
+char scanset[§\,§] = "abc";
+sin | "abc§\textvisiblespace§" | skip( scanset ) | skip( 5 ); §\C{// match and skip input sequence}§
 \end{cfa}
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®abc   ®
+®abc  ®
+®xx®
+\end{cfa}
+\item
+\Indexc{wdi}( maximum, variable )\index{manipulator!wdi@©wdi©}
+For all types except ©char *©, whitespace is skipped until an appropriate value is found for the specified variable type.
+maximum is the maximum number of characters read for the current operation.
+®abc   abc  xxx®
+\end{cfa}
+Again, the blank in the format string ©"abc©\textvisiblespace©"© matches any number of whitespace characters.
+\item
+\Indexc{wdi}( \textit{maximum}, ©T & v© )\index{manipulator!wdi@©wdi©}
+For all types except ©char *©, whitespace is skipped and the longest sequence of non-whitespace characters matching an appropriately typed (©T©) value is read, converted into its corresponding internal form, and written into the ©T© variable.
+\textit{maximum} is the maximum number of characters read for the current value rather than the longest sequence.
 \begin{cfa}[belowskip=0pt]
 char ch;   char ca[3];   int i;   double d;
 …
 \end{cfa}
 Here, ©ca[0]© is type ©char©, so the width reads 3 characters \Textbf{without} a null terminator.
+If an input value is not found for a variable, the exception ©missing_data© is raised, and the input variable is unchanged.
 Note, input ©wdi© cannot be overloaded with output ©wd© because both have the same parameters but return different types.
 …
 \item
+\Indexc{wdi}( maximum size, ©char s[]© )\index{manipulator!wdi@©wdi©}
+For type ©char *©, maximum is the maximum number of characters read for the current operation.
+Any number of non-whitespace characters are read, stopping at the first whitespace character found. A terminating null character is automatically added at the end of the stored sequence.
+\Indexc{wdi}( $maximum\ size$, ©char s[]© )\index{manipulator!wdi@©wdi©}
+For type ©char *©, whitespace is skipped and the longest sequence of non-whitespace characters is read, without conversion, and written into the string variable (null terminated).
+$maximum\ size$ is the maximum number of characters in the string variable.
+If the non-whitespace sequence of input characters is greater than $maximum\ size - 1$ (null termination), the exception ©cstring_length© is raised.
 \begin{cfa}[belowskip=0pt]
 char cstr[10];
 sin | wdi( sizeof(cstr), cstr );
+char cs[10];
+sin | wdi( sizeof(cs), cs );
 \end{cfa}
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®abcd1233.456E+2®
+\end{cfa}
+\item
+\Indexc{wdi}( maximum size, maximum read, ©char s[]© )\index{manipulator!wdi@©wdi©}
+For type ©char *©, maximum is the maximum number of characters read for the current operation.
+®012345678®
+\end{cfa}
+Nine non-whitespace characters are read and the null character is added to make ten.
+\item
+\Indexc{wdi}( $maximum\ size$, $maximum\ read$, ©char s[]© )\index{manipulator!wdi@©wdi©}
+This manipulator is the same as the previous one, except $maximum\ read$ is the maximum number of characters read for the current value rather than the longest sequence, where $maximum\ read$ $\le$ $maximum\ size$.
 \begin{cfa}[belowskip=0pt]
 char ch;   char ca[3];   int i;   double d;
 sin | wdi( sizeof(ch), ch ) | wdi( sizeof(ca), ca[0] ) | wdi( 3, i ) | wdi( 8, d );  // ch == 'a', ca == "bcd", i == 123, d == 345.6
+char cs[10];
+sin | wdi( sizeof(cs), 9, cs );
 \end{cfa}
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®abcd1233.456E+2®
+\end{cfa}
+\item
+\Indexc{ignore}( reference-value )\index{manipulator!ignore@©ignore©}
+For all types, the data is read from the stream depending on the argument type but ignored, \ie it is not stored in the argument.
+®012345678®9
+\end{cfa}
+The exception ©cstring_length© is not raised, because the read stops after nine characters.
+\item
+\Indexc{getline}( $wdi\ manipulator$, ©const char delimiter = '\n'© )\index{manipulator!getline@©getline©}
+consumes the scanset ©"[^D]D"©, where ©D© is the ©delimiter© character, which reads all characters from the current input position to the delimiter character into the string (null terminated), and consumes and ignores the delimiter.
+If the delimiter character is omitted, it defaults to ©'\n'© (newline).
 \begin{cfa}[belowskip=0pt]
+double d;
+sin | ignore( d );  // d is unchanged
+char cs[10];
+sin | getline( wdi( sizeof(cs), cs ) );
+sin | getline( wdi( sizeof(cs), cs ), 'X' ); §\C{// X is the line delimiter}§
 \end{cfa}
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®  -75.35e-4® 25
+\end{cfa}
+\item
+\Indexc{incl}( scanset, wdi-input-string )\index{manipulator!incl@©incl©}
+For C-string types only, the scanset matches any number of characters \emph{in} the set.
+Matching characters are read into the C input-string and null terminated.
+®abc ?? #@%®
+®abc ?? #@%X® w
+\end{cfa}
+The same value is read for both input strings.
+\item
+\Indexc{quoted}( ©char & ch©, ©const char Ldelimiter = '\''©, ©const char Rdelimiter = '\0'© )\index{manipulator!quoted@©quoted©}
+consumes the string ©"LCR"©, where ©L© is the left ©delimiter© character, ©C© is the value in ©ch©, and ©R© is the right delimiter character, which skips whitespace, consumes and ignores the left delimiter, reads a single character into ©ch©, and consumes and ignores the right delimiter (3 characters).
+If the delimiter character is omitted, it defaults to ©'\''© (single quote).
 \begin{cfa}[belowskip=0pt]
+char s[10];
+sin | incl( "abc", s );
+char ch;
+sin | quoted( ch );   sin | quoted( ch, '"' );   sin | quoted( ch, '[', ']' );
+\end{cfa}
+\begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®   'a'  "a"[a]®
+\end{cfa}
+\item
+\begin{sloppypar}
+\Indexc{quoted}( $wdi\ manipulator$, ©const char Ldelimiter = '\''©, ©const char Rdelimiter = '\0'© )\index{manipulator!quoted@©quoted©}
+consumes the scanset ©"L[^R]R"©, where ©L© is the left ©delimiter© character and ©R© is the right delimiter character, which skips whitespace, consumes and ignores the left delimiter, reads characters until the right delimiter into the string variable (null terminated), and consumes and ignores the right delimiter.
+If the delimiter character is omitted, it defaults to ©'\''© (single quote).
+\end{sloppypar}
+\begin{cfa}[belowskip=0pt]
+char cs[10];
+sin | quoted( wdi( sizeof(cs), cs ) ); §\C[3in]{// " is the start/end delimiter}§
+sin | quoted( wdi( sizeof(cs), cs ), '\'' ); §\C{// ' is the start/end delimiter}§
+sin | quoted( wdi( sizeof(cs), cs ), '[', ']' ); §\C{// [ is the start and ] is the end delimiter}\CRT§
+\end{cfa}
+\begin{cfa}[showspaces=true]
+®   "abc"  'abc'[abc]®
+\end{cfa}
+\item
+\Indexc{incl}( scanset, $wdi\ manipulator$ )\index{manipulator!incl@©incl©}
+consumes the scanset, which reads all the scanned characters into the string variable (null terminated).
+\begin{cfa}[belowskip=0pt]
+char cs[10];
+sin | incl( "abc", cs );
 \end{cfa}
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
 …
 \item
+\Indexc{excl}( scanset, wdi-input-string )\index{manipulator!excl@©excl©}
+For C-string types, the scanset matches any number of characters \emph{not in} the set.
+Non-matching characters are read into the C input-string and null terminated.
+\Indexc{excl}( scanset, $wdi\ manipulator$ )\index{manipulator!excl@©excl©}
+consumes the \emph{not} scanset, which reads all the scanned characters into the string variable (null terminated).
 \begin{cfa}[belowskip=0pt]
 char s[10];
 sin | excl( "abc", s );
+char cs[10];
+sin | excl( "abc", cs );
 \end{cfa}
 \begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
 …
 \end{cfa}
+\Indexc{quoted}( char delimit, wdi-input-string )\index{manipulator!quoted@©quoted©}
+Is an ©excl© with scanset ©"delimit"©, which consumes all characters up to the delimit character.
+If the delimit character is omitted, it defaults to ©'\n'© (newline).
+\item
+\Indexc{getline}( char delimit, wdi-input-string )\index{manipulator!getline@©getline©}
+Is an ©excl© with scanset ©"delimit"©, which consumes all characters up to the delimit character.
+If the delimit character is omitted, it defaults to ©'\n'© (newline).
+\item
+\Indexc{ignore}( ©T & v© or ©const char cs[]© or $string\ manipulator$ )\index{manipulator!ignore@©ignore©}
+consumes the appropriate characters for the type and ignores them, so the input variable is unchanged.
+\begin{cfa}
+double d;
+char cs[10];
+sin | ignore( d );                                              §\C{// d is unchanged}§
+sin | ignore( cs );                                             §\C{// cs is unchanged, no wdi required}§
+sin | ignore( quoted( wdi( sizeof(cs), cs ) ) ); §\C{// cs is unchanged}§
+\end{cfa}
+\begin{cfa}[showspaces=true,aboveskip=0pt,belowskip=0pt]
+®  -75.35e-4 25 "abc"®
+\end{cfa}
 \end{enumerate}
