- Timestamp:
- May 13, 2025, 1:17:50 PM (4 months ago)
- Branches:
- master
- Children:
- 0528d79
- Parents:
- 7d02d35 (diff), 2410424 (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the(diff)
links above to see all the changes relative to each parent. - Location:
- doc
- Files:
-
- 23 added
- 17 deleted
- 21 edited
- 3 moved
Legend:
- Unmodified
- Added
- Removed
-
doc/LaTeXmacros/common.sty
r7d02d35 rbd72f517 11 11 %% Created On : Sat Apr 9 10:06:17 2016 12 12 %% Last Modified By : Peter A. Buhr 13 %% Last Modified On : Wed Mar 19 21:22:28202514 %% Update Count : 66 413 %% Last Modified On : Mon May 5 21:37:13 2025 14 %% Update Count : 666 15 15 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 16 17 % This latex idiom for checking empty optional parameters 18 % \ifx#1\@empty\else\if\relax\detokenize{#1}\relax 19 % first checks if there is no optional parameter specified: \name{...} versus \name[]{...} 20 % second checks if the optional parameter is specified but empty: \name[]{...} versus \name[...]{...} 16 21 17 22 \setlength{\textheight}{9in} … … 142 147 \newcommand{\Index}{\@ifstar\@sIndex\@Index} 143 148 % inline text and as-in index: \Index[as-is index text]{inline text} 144 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\ index{\temp}\else\index{#1@{\protect#2}}\fi}149 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi} 145 150 % inline text but index with different as-is text: \Index[index text]{inline text} 146 \newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\ index{#2}\else\index{#1@{\protect#2}}\fi}151 \newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi} 147 152 148 153 % inline text and code index (cannot use ©) … … 156 161 \newcommand{\newtermFontInline}{\emph} 157 162 \newcommand{\newterm}{\protect\@ifstar\@snewterm\@newterm} 158 \newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\ index{#2}\else\index{#1@{\protect#2}}\fi}159 \newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\ index{\temp}\else\index{#1@{\protect#2}}\fi}163 \newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi} 164 \newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi} 160 165 161 166 % \snake{<identifier>} … … 254 259 \renewcommand{\reftextfaraway}[1]{\unskip, p.~\pageref{#1}} 255 260 \renewcommand{\reftextpagerange}[2]{\unskip, pp.~\pageref{#1}--\pageref{#2}} 256 \newcommand{\VRef}[2][Section]{\ifx#1\@empty\else {#1}\nobreakspace\fi\vref{#2}}257 \newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else {#1}\nobreakspace\fi\vrefrange{#2}{#3}}258 \newcommand{\VPageref}[2][page]{\ifx#1\@empty\else {#1}\nobreakspace\fi\pageref{#2}}259 \newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else {#1}\nobreakspace\fi\pageref{#2}{#3}}261 \newcommand{\VRef}[2][Section]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vref{#2}} 262 \newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vrefrange{#2}{#3}} 263 \newcommand{\VPageref}[2][page]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}} 264 \newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}{#3}} 260 265 261 266 \let\Oldthebibliography\thebibliography … … 282 287 \setlength{\columnposn}{\gcolumnposn} 283 288 \newcommand{\setgcolumn}[1]{\global\gcolumnposn=#1\global\columnposn=\gcolumnposn} 284 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\ global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}}285 \newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\ global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}}289 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}} 290 \newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}} 286 291 \newcommand{\CRT}{\global\columnposn=\gcolumnposn} 287 292 -
doc/LaTeXmacros/common.tex
r7d02d35 rbd72f517 11 11 %% Created On : Sat Apr 9 10:06:17 2016 12 12 %% Last Modified By : Peter A. Buhr 13 %% Last Modified On : Wed Mar 19 07:37:17202514 %% Update Count : 68813 %% Last Modified On : Mon May 5 21:34:53 2025 14 %% Update Count : 709 15 15 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 16 17 % This latex idiom for checking empty optional parameters 18 % \ifx#1\@empty\else\if\relax\detokenize{#1}\relax 19 % first checks if there is no optional parameter specified: \name{...} versus \name[]{...} 20 % second checks if the optional parameter is specified but empty: \name[]{...} versus \name[...]{...} 16 21 17 22 \setlength{\textheight}{9in} … … 143 148 \newcommand{\Index}{\@ifstar\@sIndex\@Index} 144 149 % inline text and as-in index: \Index[as-is index text]{inline text} 145 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\ index{\temp}\else\index{#1@{\protect#2}}\fi}150 \newcommand{\@Index}[2][\@empty]{\lowercase{\def\temp{#2}}#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi} 146 151 % inline text but index with different as-is text: \Index[index text]{inline text} 147 \newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\ index{#2}\else\index{#1@{\protect#2}}\fi}152 \newcommand{\@sIndex}[2][\@empty]{#2\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi} 148 153 149 154 % inline text and code index (cannot use ©) … … 157 162 \newcommand{\newtermFontInline}{\emph} 158 163 \newcommand{\newterm}{\protect\@ifstar\@snewterm\@newterm} 159 \newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\ index{#2}\else\index{#1@{\protect#2}}\fi}160 \newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\ index{\temp}\else\index{#1@{\protect#2}}\fi}164 \newcommand{\@snewterm}[2][\@empty]{{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{#2}\else\index{#1@{\protect#2}}\fi\fi} 165 \newcommand{\@newterm}[2][\@empty]{\lowercase{\def\temp{#2}}{\newtermFontInline{#2}}\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\index{\temp}\else\index{#1@{\protect#2}}\fi\fi} 161 166 162 167 % \snake{<identifier>} … … 256 261 \renewcommand{\reftextfaraway}[1]{\unskip, p.~\pageref{#1}} 257 262 \renewcommand{\reftextpagerange}[2]{\unskip, pp.~\pageref{#1}--\pageref{#2}} 258 \newcommand{\VRef}[2][Section]{\ifx#1\@empty\else {#1}\nobreakspace\fi\vref{#2}}259 \newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else {#1}\nobreakspace\fi\vrefrange{#2}{#3}}260 \newcommand{\VPageref}[2][page]{\ifx#1\@empty\else {#1}\nobreakspace\fi\pageref{#2}}261 \newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else {#1}\nobreakspace\fi\pageref{#2}{#3}}263 \newcommand{\VRef}[2][Section]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vref{#2}} 264 \newcommand{\VRefrange}[3][Sections]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\vrefrange{#2}{#3}} 265 \newcommand{\VPageref}[2][page]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}} 266 \newcommand{\VPagerefrange}[3][pages]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else{#1}\nobreakspace\fi\fi\pageref{#2}{#3}} 262 267 263 268 \let\Oldthebibliography\thebibliography … … 285 290 \setlength{\columnposn}{\gcolumnposn} 286 291 \newcommand{\setgcolumn}[1]{\global\gcolumnposn=#1\global\columnposn=\gcolumnposn} 287 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\ global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}}288 \newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\ global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}}292 \newcommand{\C}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstCommentStyle{#2}}} 293 \newcommand{\CD}[2][\@empty]{\ifx#1\@empty\else\if\relax\detokenize{#1}\relax\else\global\setlength{\columnposn}{#1}\global\columnposn=\columnposn\fi\fi\hfill\makebox[\textwidth-\columnposn][l]{\LstBasicStyle{#2}}} 289 294 \newcommand{\CRT}{\global\columnposn=\gcolumnposn} 290 295 -
doc/bibliography/pl.bib
r7d02d35 rbd72f517 5214 5214 % M 5215 5215 5216 @misc{M4, 5217 keywords = {macros, preprocessor}, 5218 contributer = {pabuhr@plg}, 5219 author = {Brian W. Kernighan and Dennis M. Ritchie}, 5220 title = {The M4 Macro Processor}, 5221 year = 1977, 5222 howpublished= {\url{https://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}}, 5223 optnote = {Accessed: 2016-09}, 5224 } 5225 5216 5226 @book{M68K, 5217 5227 keywords = {M680XX, Motorola}, … … 9081 9091 } 9082 9092 9093 9094 @inproceedings{valgind, 9095 keywords = {Memcheck, Valgrind, dynamic binary analysis, dynamic binary instrumentation, shadow values}, 9096 contributer = {pabuhr@plg}, 9097 author = {Nethercote, Nicholas and Seward, Julian}, 9098 title = {{V}algrind: a framework for heavyweight dynamic binary instrumentation}, 9099 publisher = {Association for Computing Machinery}, 9100 address = {New York, NY, USA}, 9101 booktitle = {Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation}, 9102 pages = {89-100}, 9103 location = {San Diego, California, USA}, 9104 series = {PLDI'07} 9105 year = {2007}, 9106 } 9107 9083 9108 @misc{Vala, 9084 9109 keywords = {GObject, Vala}, -
doc/theses/fangren_yu_MMath/background.tex
r7d02d35 rbd72f517 21 21 Furthermore, Cyclone's polymorphic functions and types are restricted to abstraction over types with the same layout and calling convention as @void *@, \ie only pointer types and @int@. 22 22 In \CFA terms, all Cyclone polymorphism must be dtype-static. 23 While the Cyclone design provides the efficiency benefits discussed in Section~\ref{sec:generic-apps} for dtype-static polymorphism, it is more restrictive than \CFA's general model.23 While the Cyclone design provides the efficiency benefits discussed in~\VRef{s:GenericImplementation} for dtype-static polymorphism, it is more restrictive than \CFA's general model. 24 24 Smith and Volpano~\cite{Smith98} present Polymorphic C, an ML dialect with polymorphic functions, C-like syntax, and pointer types; 25 25 it lacks many of C's features, most notably structure types, and hence, is not a practical C replacement. -
doc/theses/fangren_yu_MMath/features.tex
r7d02d35 rbd72f517 13 13 Here, manipulating the pointer address is the primary operation, while dereferencing the pointer to its value is the secondary operation. 14 14 For example, \emph{within} a data structure, \eg stack or queue, all operations involve pointer addresses and the pointer may never be dereferenced because the referenced object is opaque. 15 Alternatively, use a reference when its primary purpose is to alias a value, \eg a function parameter that does not copy the argument (performance reason).15 Alternatively, use a reference when its primary purpose is to alias a value, \eg a function parameter that does not copy the argument, for performance reasons. 16 16 Here, manipulating the value is the primary operation, while changing the pointer address is the secondary operation. 17 17 Succinctly, if the address changes often, use a pointer; … … 23 23 \CFA adopts a uniform policy between pointers and references where mutability is a separate property made at the declaration. 24 24 25 The following examples show show pointers and references are treated uniformly in \CFA.25 The following examples show how pointers and references are treated uniformly in \CFA. 26 26 \begin{cfa}[numbers=left,numberblanklines=false] 27 27 int x = 1, y = 2, z = 3;$\label{p:refexamples}$ … … 36 36 @&@r3 = @&@y; @&&@r3 = @&&@r4; $\C{// change r1, r2}$ 37 37 \end{cfa} 38 Like pointers, reference can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{38 Like pointers, references can be cascaded, \ie a reference to a reference, \eg @&& r2@.\footnote{ 39 39 \CC uses \lstinline{&&} for rvalue reference, a feature for move semantics and handling the \lstinline{const} Hell problem.} 40 40 Usage of a reference variable automatically performs the same number of dereferences as the number of references in its declaration, \eg @r2@ becomes @**r2@. … … 64 64 The call applies an implicit dereference once to @x@ so the call is typed @f( int & )@ with @T = int@, rather than with @T = int &@. 65 65 66 As fora pointer type, a reference type may have qualifiers, where @const@ is most common.66 As with a pointer type, a reference type may have qualifiers, where @const@ is most common. 67 67 \begin{cfa} 68 68 int x = 3; $\C{// mutable}$ … … 101 101 Interestingly, C does not give a warning/error if a @const@ pointer is not initialized, while \CC does. 102 102 Hence, type @& const@ is similar to a \CC reference, but \CFA does not preclude initialization with a non-variable address. 103 For example, in system 's programming, there are cases where an immutable address is initialized to a specific memory location.103 For example, in systems programming, there are cases where an immutable address is initialized to a specific memory location. 104 104 \begin{cfa} 105 105 int & const mem_map = *0xe45bbc67@p@; $\C{// hardware mapped registers ('p' for pointer)}$ … … 122 122 \end{cfa} 123 123 the call to @foo@ must pass @x@ by value, implying auto-dereference, while the call to @bar@ must pass @x@ by reference, implying no auto-dereference. 124 Without any restrictions, this ambiguity limits the behaviour of reference types in \CFA polymorphic functions, where a type @T@ can bind to a reference or non-reference type. 124 125 \PAB{My analysis shows} without any restrictions, this ambiguity limits the behaviour of reference types in \CFA polymorphic functions, where a type @T@ can bind to a reference or non-reference type. 125 126 This ambiguity prevents the type system treating reference types the same way as other types, even if type variables could be bound to reference types. 126 127 The reason is that \CFA uses a common \emph{object trait}\label{p:objecttrait} (constructor, destructor and assignment operators) to handle passing dynamic concrete type arguments into polymorphic functions, and the reference types are handled differently in these contexts so they do not satisfy this common interface. … … 157 158 \end{cfa} 158 159 While it is possible to write a reference type as the argument to a generic type, it is disallowed in assertion checking, if the generic type requires the object trait \see{\VPageref{p:objecttrait}} for the type argument, a fairly common use case. 159 Even if the object trait can be made optional, the current type systemoften misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended.160 Even if the object trait can be made optional, the current compiler implementation often misbehaves by adding undesirable auto-dereference on the referenced-to value rather than the reference variable itself, as intended. 160 161 Some tweaks are necessary to accommodate reference types in polymorphic contexts and it is unclear what can or cannot be achieved. 161 162 Currently, there are contexts where the \CFA programmer is forced to use a pointer type, giving up the benefits of auto-dereference operations and better syntax with reference types. … … 188 189 @[x, y, z]@ = foo( 3, 4 ); // return 3 values into a tuple 189 190 \end{cfa} 190 Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common context that normally require multiple statements and/or additional declarations, all of which reduces coding time and errors.191 Along with making returning multiple values a first-class feature, tuples were extended to simplify a number of other common contexts that normally require multiple statements and/or additional declarations. 191 192 \begin{cfa} 192 193 [x, y, z] = 3; $\C[2in]{// x = 3; y = 3; z = 3, where types may be different}$ … … 205 206 Only when returning a tuple from a function is there the notion of a tuple value. 206 207 207 Overloading in the \CFA type-system must support complex composition of tuples and C type conversions using a co stingscheme giving lower cost to widening conversions that do not truncate a value.208 Overloading in the \CFA type-system must support complex composition of tuples and C type conversions using a conversion cost scheme giving lower cost to widening conversions that do not truncate a value. 208 209 \begin{cfa} 209 210 [ int, int ] foo$\(_1\)$( int ); $\C{// overloaded foo functions}$ … … 213 214 \end{cfa} 214 215 The type resolver only has the tuple return types to resolve the call to @bar@ as the @foo@ parameters are identical. 215 The resul tion involves unifying the flattened @foo@ return values with @bar@'s parameter list.216 The resulution involves unifying the flattened @foo@ return values with @bar@'s parameter list. 216 217 However, no combination of @foo@s is an exact match with @bar@'s parameters; 217 218 thus, the resolver applies C conversions to obtain a best match. … … 223 224 bar( foo( 3 ) ) // only one tuple returning call 224 225 \end{lstlisting} 225 Hence, program ers cannot take advantage of the full power of tuples but type match is straightforward.226 Hence, programmers cannot take advantage of the full power of tuples but type match is straightforward. 226 227 227 228 K-W C also supported tuple variables, but with a strong distinction between tuples and tuple values/variables. … … 286 287 \end{figure} 287 288 288 The primary issues for tuples in the \CFA type system are polymorphism and conversions.289 \PAB{I identified} the primary issues for tuples in the \CFA type system are polymorphism and conversions. 289 290 Specifically, does it make sense to have a generic (polymorphic) tuple type, as is possible for a structure? 290 291 \begin{cfa} … … 303 304 \section{Tuple Implementation} 304 305 305 As noted, tradition languages manipulate multiple values by in/out parameters and/or structures.306 As noted, traditional languages manipulate multiple values by in/out parameters and/or structures. 306 307 K-W C adopted the structure for tuple values or variables, and as needed, the fields are extracted by field access operations. 307 308 As well, for the tuple-assignment implementation, the left-hand tuple expression is expanded into assignments of each component, creating temporary variables to avoid unexpected side effects. … … 356 357 \end{figure} 357 358 358 Interestingly, in the third implementation of \CFA tuples by Robert Schluntz~\cite[\S~3]{Schluntz17}, the MVR functions revert back to structure based, where itremains in the current version of \CFA.359 Interestingly, in the third implementation of \CFA tuples by Robert Schluntz~\cite[\S~3]{Schluntz17}, the MVR functions revert back to structure based, and this remains in the current version of \CFA. 359 360 The reason for the reversion is a uniform approach for tuple values/variables making tuples first-class types in \CFA, \ie allow tuples with corresponding tuple variables. 360 361 This reversion was possible, because in parallel with Schluntz's work, generic types were added independently by Moss~\cite{Moss19}, and the tuple variables leveraged the same implementation techniques as for generic variables~\cite[\S~3.7]{Schluntz17}. … … 384 385 Scala, like \CC, provides tuple types through a library using this structural expansion, \eg Scala provides tuple sizes 1 through 22 via hand-coded generic data-structures. 385 386 386 However, after experience gained building the \CFA runtime system, making tuple-types first-class seems to add little benefit.387 However, after experience gained building the \CFA runtime system, \PAB{I convinced them} making tuple-types first-class seems to add little benefit. 387 388 The main reason is that tuples usages are largely unstructured, 388 389 \begin{cfa} … … 512 513 looping is used to traverse the argument pack from left to right. 513 514 The @va_list@ interface is walking up the stack (by address) looking at the arguments pushed by the caller. 514 ( Magicknowledge is needed for arguments pushed using registers.)515 (Compiler-specific ABI knowledge is needed for arguments pushed using registers.) 515 516 516 517 \begin{figure} … … 571 572 572 573 Currently in \CFA, variadic polymorphic functions are the only place tuple types are used. 573 And because \CFA compiles polymorphic functions versus template expansion, many wrapper functions are generated to implement both user-defined generic-types and polymorphism with variadics.574 \PAB{My analysis showed} many wrapper functions are generated to implement both user-defined generic-types and polymorphism with variadics, because \CFA compiles polymorphic functions versus template expansion. 574 575 Fortunately, the only permitted operations on polymorphic function parameters are given by the list of assertion (trait) functions. 575 576 Nevertheless, this small set of functions eventually needs to be called with flattened tuple arguments. 576 577 Unfortunately, packing the variadic arguments into a rigid @struct@ type and generating all the required wrapper functions is significant work and largely wasted because most are never called. 577 578 Interested readers can refer to pages 77-80 of Robert Schluntz's thesis to see how verbose the translator output is to implement a simple variadic call with 3 arguments. 578 As the number of arguments increases, \eg a call with 5 arguments, the translator generates aconcrete @struct@ types for a 4-tuple and a 3-tuple along with all the polymorphic type data for them.579 As the number of arguments increases, \eg a call with 5 arguments, the translator generates concrete @struct@ types for a 4-tuple and a 3-tuple along with all the polymorphic type data for them. 579 580 An alternative approach is to put the variadic arguments into an array, along with an offset array to retrieve each individual argument. 580 581 This method is similar to how the C @va_list@ object is used (and how \CFA accesses polymorphic fields in a generic type), but the \CFA variadics generate the required type information to guarantee type safety (like the @printf@ format string). … … 683 684 684 685 Nested \emph{named} aggregates are allowed in C but there is no qualification operator, like the \CC type operator `@::@', to access an inner type. 685 \emph{To compensate for the missing type operator, all named nested aggregates are hoisted to global scope, regardless of the nesting depth, and type usages within the nested type are replaced with global type name.} 686 To compensate for the missing type operator, all named nested aggregates are hoisted to global scope, regardless of the nesting depth, and type usages within the nested type are replaced with global type name. 686 687 Hoisting nested types can result in name collisions among types at the global level, which defeats the purpose of nesting the type. 687 688 \VRef[Figure]{f:NestedNamedAggregate} shows the nested type @T@ is hoisted to the global scope and the declaration rewrites within structure @S@. … … 729 730 \end{figure} 730 731 731 For good reasons,\CC chose to change this semantics:732 \CC chose to change this semantics: 732 733 \begin{cquote} 733 734 \begin{description}[leftmargin=*,topsep=0pt,itemsep=0pt,parsep=0pt] … … 769 770 Like an anonymous nested type, a named Plan-9 nested type has its field names hoisted into @struct S@, so there is direct access, \eg @s.x@ and @s.i@. 770 771 Hence, the field names must be unique, unlike \CC nested types, but the type names are at a nested scope level, unlike type nesting in C. 771 In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $ \subset$ @S@ and @W@ $\subset$ @S@, \eg:772 In addition, a pointer to a structure is automatically converted to a pointer to an anonymous field for assignments and function calls, providing containment inheritance with implicit subtyping, \ie @U@ $<:$ @S@ and @W@ $<:$ @S@, \eg: 772 773 \begin{cfa} 773 774 void f( union U * u ); … … 781 782 Note, there is no value assignment, such as, @w = s@, to copy the @W@ field from @S@. 782 783 783 Unfortunately, the Plan-9 designers did not look ahead to other useful features, specifically nested types.784 Unfortunately, the Plan-9 designers did not look ahead to other useful features, specifically nested types. 784 785 This nested type compiles in \CC and \CFA. 785 786 \begin{cfa} … … 808 809 In addition, a semi-non-compatible change is made so that Plan-9 syntax means a forward declaration in a nested type. 809 810 Since the Plan-9 extension is not part of C and rarely used, this change has minimal impact. 810 Hence, all Plan-9 semantics are denoted by the @inline@ qualifier, which is good ``eye-candy'' when reading a structure definition to spotPlan-9 definitions.811 Hence, all Plan-9 semantics are denoted by the @inline@ qualifier, which clearly indicates the usage of Plan-9 definitions. 811 812 Finally, the following code shows the value and pointer polymorphism. 812 813 \begin{cfa} … … 821 822 822 823 In general, non-standard C features (@gcc@) do not need any special treatment, as they are directly passed through to the C compiler. 823 However, the Plan-9 semantics allow implicit conversions from the outer type to the inner type, which means the \CFA type resolver must take this information into account.824 However, \PAB{I found} the Plan-9 semantics allow implicit conversions from the outer type to the inner type, which means the \CFA type resolver must take this information into account. 824 825 Therefore, the \CFA resolver must implement the Plan-9 features and insert necessary type conversions into the translated code output. 825 826 In the current version of \CFA, this is the only kind of implicit type conversion other than the standard C arithmetic conversions. … … 847 848 \end{c++} 848 849 and again the expression @d.x@ is ambiguous. 849 While \CC has no direct syntax to disambiguate @x@, \ ie@d.B.x@ or @d.C.x@, it is possible with casts, @((B)d).x@ or @((C)d).x@.850 While \CC has no direct syntax to disambiguate @x@, \eg @d.B.x@ or @d.C.x@, it is possible with casts, @((B)d).x@ or @((C)d).x@. 850 851 Like \CC, \CFA compiles the Plan-9 version and provides direct qualification and casts to disambiguate @x@. 851 While ambiguous definitions are allowed, duplicate field names ispoor practice and should be avoided if possible.852 While ambiguous definitions are allowed, duplicate field names are poor practice and should be avoided if possible. 852 853 However, when a programmer does not control all code, this problem can occur and a naming workaround must exist. -
doc/theses/fangren_yu_MMath/future.tex
r7d02d35 rbd72f517 4 4 The following are feature requests related to type-system enhancements that have surfaced during the development of the \CFA language and library, but have not been implemented yet. 5 5 Currently, developers must work around these missing features, sometimes resulting in inefficiency. 6 \PAB{The following sections discuss new features I am proposing to fix these problems.} 6 7 7 8 8 9 \section{Closed Trait Types} 9 10 10 Currently, \CFA does not have any closed types, as open type are the basis of its unique type-system, allowing new functions to be added at any time to override existing ones for trait satisfaction.11 Currently, \CFA does not have any closed types, as open types are the basis of its unique type-system, allowing new functions to be added at any time to override existing ones for trait satisfaction. 11 12 Locally-declared nested-functions,\footnote{ 12 13 Nested functions are not a feature in C but supported by \lstinline{gcc} for multiple decades and are used heavily in \CFA.} … … 17 18 Library implementers normally do not want users to override certain operations and cause the behaviour of polymorphic invocations to change. 18 19 \item 19 Caching and reusing resolution results in the compiler is effected, as newly introduced declarations can participate in assertion resolution;20 Caching and reusing resolution results in the compiler is affected, as newly introduced declarations can participate in assertion resolution; 20 21 as a result, previously invalid subexpressions suddenly become valid, or alternatively cause ambiguity in assertions. 21 22 \end{enumerate} … … 70 71 \end{figure} 71 72 72 A \CFA closed trait type is similar to a Haskell type class requiringan explicit instance declaration.73 A \CFA closed trait type is planned to be working similarly to a Haskell type class that requires an explicit instance declaration. 73 74 The syntax for the closed trait might look like: 74 75 \begin{cfa} … … 91 92 92 93 \section{Associated Types} 94 \label{s:AssociatedTypes} 93 95 94 The analysis presented in \VRef{s:AssertionSatisfaction} shows if all type parameters have to be bound before assertion resolution, the complexity of resolving assertions become much lower as every assertion parameter can be resolved independently.96 The analysis presented in \VRef{s:AssertionSatisfaction} shows if all type parameters have to be bound before assertion resolution, the complexity of resolving assertions becomes much lower as every assertion parameter can be resolved independently. 95 97 That is, by utilizing information from higher up the expression tree for return value overloading, most of the type bindings can be resolved. 96 98 However, there are scenarios where some intermediate types need to be involved in certain operations, which are neither input nor output types. … … 152 154 Note that the type @list *@ satisfies both @pointer_like( list *, int )@ and @pointer_like( list *,@ @list )@ (the latter by the built-in pointer dereference operator) and the expression @*it@ can be either a @struct list@ or an @int@. 153 155 Requiring associated types to be unique makes the @pointer_like@ trait not applicable to @list *@, which is undesirable. 154 I have not attempted to implement associated types in \CFA compiler, but based on the above discussions, one option is to make associated type resolution and return type overloading coexist:156 I have not attempted to implement associated types in the \CFA compiler, but based on the above discussions, one option is to make associated type resolution and return type overloading coexist: 155 157 when the associated type appears in returns, it is deduced from the context and then verify the trait with ordinary assertion resolution; 156 158 when it does not appear in the returns, the type is required to be uniquely determined by the expression that defines the associated type. … … 159 161 \section{User-defined Conversions} 160 162 161 Missing type-system featureis a scheme for user-defined conversions.163 A missing type-system feature in \CFA is a scheme for user-defined conversions. 162 164 Conversion means one type goes through an arbitrary complex process of changing its value to some meaningful value in another type. 163 165 Because the conversion process can be arbitrarily complex, it requires the power of a function. -
doc/theses/fangren_yu_MMath/intro.tex
r7d02d35 rbd72f517 30 30 31 31 \section{Overloading} 32 32 \label{s:Overloading} 33 34 \vspace*{-5pt} 33 35 \begin{quote} 34 36 There are only two hard things in Computer Science: cache invalidation and \emph{naming things}. --- Phil Karlton 35 37 \end{quote} 38 \vspace*{-5pt} 36 39 Overloading allows programmers to use the most meaningful names without fear of name clashes within a program or from external sources, like include files. 37 Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguate sthe majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions.40 Experience from \CC and \CFA developers shows the type system can implicitly and correctly disambiguate the majority of overloaded names, \ie it is rare to get an incorrect selection or ambiguity, even among hundreds of overloaded (variables and) functions. 38 41 In many cases, a programmer is unaware of name clashes, as they are silently resolved, simplifying the development process. 39 42 40 43 Disambiguating among overloads is implemented by examining each call site and selecting the best matching overloaded function based on criteria like the types and number of arguments and the return context. 41 Since the hardware does not support mixed-mode operands, @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type.44 Since the hardware does not support mixed-mode operands, such as @2 + 3.5@, the type system must disallow it or (safely) convert the operands to a common type. 42 45 Like overloading, the majority of mixed-mode conversions are silently resolved, simplifying the development process. 43 46 This approach matches with programmer intuition and expectation, regardless of any \emph{safety} issues resulting from converted values. … … 49 52 As well, many namespace systems provide a mechanism to open their scope returning to normal overloading, \ie no qualification. 50 53 While namespace mechanisms are very important and provide a number of crucial program-development features, protection from overloading is overstated. 51 Similarly, lexical nesting is another place where overloading occurs.54 Similarly, lexical nesting is another place where duplicate naming issues arise. 52 55 For example, in object-oriented programming, class member names \newterm{shadow} names within members. 53 Some programmers ,qualify all member names with @class::@ or @this->@ to make them unique from names defined in members.54 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@ .55 Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing .56 Some programmers qualify all member names with @class::@ or @this->@ to make them unique from names defined in members. 57 Even nested lexical blocks result in shadowing, \eg multiple nested loop-indices called @i@, silently changing the meaning of @i@ at lower scope levels. 58 Again, coding styles exist requiring all variables in nested block to be unique to prevent name shadowing problems. 56 59 Depending on the language, these possible ambiguities can be reported (as warnings or errors) and resolved explicitly using some form of qualification and/or cast. 57 58 Formally, overloading is defined by Strachey as \newterm{ad hoc polymorphism}: 60 For example, if variables can be overloaded, shadowed variables of different type can produce ambiguities, indicating potential problems in lower scopes. 61 62 Formally, overloading is defined by Strachey as one kind of \newterm{ad hoc polymorphism}: 63 \vspace*{-5pt} 59 64 \begin{quote} 60 65 In ad hoc polymorphism there is no single systematic way of determining the type of the result from the type of the arguments. … … 63 68 It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this.~\cite[p.~37]{Strachey00} 64 69 \end{quote} 70 \vspace*{-5pt} 65 71 where a \newterm{transfer function} is an implicit conversion to help find a matching overload: 72 \vspace*{-5pt} 66 73 \begin{quote} 67 74 The problem of dealing with polymorphic operators is complicated by the fact that the range of types sometimes overlap. … … 69 76 The functions which perform this operation are known as transfer functions and may either be used explicitly by the programmer, or, in some systems, inserted automatically by the compiling system.~\cite[p.~35]{Strachey00} 70 77 \end{quote} 78 \vspace*{-5pt} 71 79 The differentiating characteristic between parametric polymorphism and overloading is often stated as: polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions use a different algorithm for each type of argument. 72 80 A similar differentiation is applicable for overloading and default parameters. … … 128 136 \end{cfa} 129 137 the overloaded names @S@ and @E@ are separated into the type and object domain, and C uses the type kinds @struct@ and @enum@ to disambiguate the names. 130 In general, types are not overloaded because inferencing them is difficult to imagine in a statically programming language.138 In general, types are not overloaded because inferencing them is difficult to imagine in a statically typed programming language. 131 139 \begin{cquote} 132 140 \setlength{\tabcolsep}{26pt} … … 181 189 \noindent 182 190 \newterm{General overloading} occurs when the type-system \emph{knows} a function's parameters and return types (or a variable's type for variable overloading). 183 In functional programming-languages, there is always a return type (except for a monad).191 In functional programming-languages, there is always a return type. 184 192 If a return type is specified, the compiler does not have to inference the function body. 185 193 For example, the compiler has complete knowledge about builtin types and their overloaded arithmetic operators. … … 214 222 Hence, parametric overloading requires additional information about the universal types to make them useful. 215 223 216 This additional information often comes as a set of operations a type must supply (@trait@/-@concept@) and these operations can then be used in the body of the function. 217 \begin{cfa} 218 forall( T | T ?@++@( T, T ) ) T inc( T t ) { return t@++@; } 224 This additional information often comes as a set of operations that must be supply for a type, \eg \CFA/Rust/Go have traits, \CC template has concepts, Haskell has type-classes. 225 These operations can then be used in the body of the function to manipulate the type's value. 226 Here, a type binding to @T@ must have available a @++@ operation with the specified signature. 227 \begin{cfa} 228 forall( T | @T ?++( T, T )@ ) // trait 229 T inc( T t ) { return t@++@; } // change type value 219 230 int i = 3 220 231 i = inc( i ) … … 222 233 \end{cfa} 223 234 Given a qualifying trait, are its elements inferred or declared? 224 In the above example, the type system infers @int@ for @T@, infers it needs a @++@ operator that takes an @int@ and returns an @int@, and finds this function in the enclosing environment (\eg standard prelude).235 In the example, the type system infers @int@ for @T@, infers it needs an appropriately typed @++@ operator, and finds it in the enclosing environment, possibly in the language's prelude defining basic types and their operations. 225 236 This implicit inferencing is expensive if matched with implicit conversions when there is no exact match. 226 237 Alternatively, types opt-in to traits via declarations. … … 420 431 \subsection{Operator Overloading} 421 432 422 Virtually all programming languages provide general overloading of the arithmetic operatorsacross the basic computational types using the number and type of parameters and returns.433 Many programming languages provide general overloading of the arithmetic operators~\cite{OperOverloading} across the basic computational types using the number and type of parameters and returns. 423 434 However, in some languages, arithmetic operators may not be first class, and hence, cannot be overloaded. 424 435 Like \CC, \CFA allows general operator overloading for user-defined types. … … 438 449 \subsection{Function Overloading} 439 450 440 Both \CFA and \CC allow general overloading for functions, as long as their prototypes differ in the number and type of parameters and returns. 451 Many programming languages provide general overloading for functions~\cite{FuncOverloading}, as long as their prototypes differ in the number and type of parameters. 452 A few programming languages also use the return type for selecting overloaded functions \see{below}. 441 453 \begin{cfa} 442 454 void f( void ); $\C[2in]{// (1): no parameter}$ … … 445 457 f( 'A' ); $\C{// select (2)}\CRT$ 446 458 \end{cfa} 447 The type system examines each call si ze and first looks for an exact match and then a best match using conversions.459 The type system examines each call site and first looks for an exact match and then a best match using conversions. 448 460 449 461 Ada, Scala, and \CFA type-systems also use the return type in resolving a call, to pinpoint the best overloaded name. 450 Essent ailly, the return types are \emph{reversed curried} into output parameters of the function.462 Essentially, the return types are \emph{reversed curried} into output parameters of the function. 451 463 For example, in many programming languages with overloading, the following functions are ambiguous without using the return type. 452 464 \begin{cfa} … … 478 490 \begin{cfa} 479 491 void foo( double d ); 480 int v; 492 int v; $\C[2in]{// (1)}$ 481 493 double v; $\C{// (2) variable overloading}$ 482 494 foo( v ); $\C{// select (2)}$ … … 487 499 } 488 500 \end{cfa} 489 It is interesting that shadow overloadingis considered a normal programming-language feature with only slight software-engineering problems.501 It is interesting that shadowing \see{namespace pollution in \VRef{s:Overloading}} is considered a normal programming-language feature with only slight software-engineering problems. 490 502 However, variable overloading within a scope is often considered dangerous, without any evidence to corroborate this claim. 491 503 In contrast, function overloading in \CC occurs silently within the global scope from @#include@ files all the time without problems. … … 554 566 The following covers these issues, and why this scheme is not amenable with the \CFA type system. 555 567 556 One of the first and powerful type-inferencing systemis Hindley--Milner~\cite{Damas82}.568 One of the first and most powerful type-inferencing systems is Hindley--Milner~\cite{Damas82}. 557 569 Here, the type resolver starts with the types of the program constants used for initialization and these constant types flow throughout the program, setting all variable and expression types. 558 570 \begin{cfa} … … 579 591 Note, return-type inferencing goes in the opposite direction to Hindley--Milner: knowing the type of the result and flowing back through an expression to help select the best possible overloads, and possibly converting the constants for a best match. 580 592 581 In simpler type-inferencing systems, such as C/\CC/\CFA, there are more specific usages.593 There are multiple ways to indirectly specify a variable's type, \eg from a prior variable or expression. 582 594 \begin{cquote} 583 595 \setlength{\tabcolsep}{10pt} … … 606 618 \end{tabular} 607 619 \end{cquote} 608 The two important capabilities are: 620 Here, @type(expr)@ computes the same type as @auto@ righ-hand expression. 621 The advantages are: 609 622 \begin{itemize}[topsep=0pt] 610 623 \item … … 616 629 This issue is exaggerated with \CC templates, where type names are 100s of characters long, resulting in unreadable error messages. 617 630 \item 618 Ensuring the type of secondary variables ,match a primary variable.631 Ensuring the type of secondary variables match a primary variable. 619 632 \begin{cfa} 620 633 int x; $\C{// primary variable}$ … … 625 638 \end{itemize} 626 639 Note, the use of @typeof@ is more restrictive, and possibly safer, than general type-inferencing. 640 \begin{cquote} 641 \setlength{\tabcolsep}{20pt} 642 \begin{tabular}{@{}ll@{}} 627 643 \begin{cfa} 628 644 int x; … … 630 646 type(x) z = ... // complex expression 631 647 \end{cfa} 632 Here, the types of @y@ and @z@ are fixed (branded), whereas with type inferencing, the types of @y@ and @z@ are potentially unknown. 648 & 649 \begin{cfa} 650 int x; 651 auto y = ... // complex expression 652 auto z = ... // complex expression 653 \end{cfa} 654 \end{tabular} 655 \end{cquote} 656 On the left, the types of @y@ and @z@ are fixed (branded), whereas on the right, the types of @y@ and @z@ can fluctuate. 633 657 634 658 635 659 \subsection{Type-Inferencing Issues} 636 660 637 Each kind of type-inferencing system has its own set of issues that flow ontothe programmer in the form of convenience, restrictions, or confusions.661 Each kind of type-inferencing system has its own set of issues that affect the programmer in the form of convenience, restrictions, or confusions. 638 662 639 663 A convenience is having the compiler use its overarching program knowledge to select the best type for each variable based on some notion of \emph{best}, which simplifies the programming experience. … … 643 667 For example, if a change is made in an initialization expression, it can cascade type changes producing many other changes and/or errors. 644 668 At some point, a variable's type needs to remain constant and the initializing expression needs to be modified or be in error when it changes. 645 Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the comp lier can report a mismatch with the constant initialization.669 Often type-inferencing systems allow restricting (\newterm{branding}) a variable or function type, so the compiler can report a mismatch with the constant initialization. 646 670 \begin{cfa} 647 671 void f( @int@ x, @int@ y ) { // brand function prototype … … 657 681 As a result, understanding and changing the code becomes almost impossible. 658 682 Types provide important clues as to the behaviour of the code, and correspondingly to correctly change or add new code. 659 In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on a fancyIDE that can re-engineer types for them.683 In these cases, a programmer is forced to re-engineer types, which is fragile, or rely on an IDE that can re-engineer types for them. 660 684 For example, given: 661 685 \begin{cfa} … … 670 694 In this situation, having the type name or its short alias is essential. 671 695 672 \CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint the right types within an expression.696 \CFA's type system tries to prevent type-resolution mistakes by relying heavily on the type of the left-hand side of assignment to pinpoint correct types within an expression. 673 697 Type inferencing defeats this goal because there is no left-hand type. 674 Fundamentally, type inferencing tries to magic away variable types from the programmer. 675 However, this results in lazy programming with the potential for poor performance and safety concerns. 676 Types are as important as control-flow in writing a good program, and should not be masked, even if it requires the programmer to think! 677 A similar issue is garbage collection, where storage management is magicked away, often resulting in poor program design and performance.\footnote{ 678 There are full-time Java consultants, who are hired to find memory-management problems in large Java programs.} 679 The entire area of Computer-Science data-structures is obsessed with time and space, and that obsession should continue into regular programming. 680 Understanding space and time issues is an essential part of the programming craft. 681 Given @typedef@ and @typeof@ in \CFA, and the strong desire to use the left-hand type in resolution, the decision was made not to support implicit type-inferencing in the type system. 698 Fundamentally, type inferencing tries to remove explicit typing from programming. 699 However, writing down types is an important aspect of good programming, as it provides a check of the programmer's expected type and the actual type. 700 Thinking carefully about types is similar to thinking carefully about date structures, often resulting in better performance and safety. 701 Similarly, thinking carefully about storage management in unmanaged languages is an important aspect of good programming, versus implicit storage management (garbage collection) in managed language.\footnote{ 702 There are full-time Java consultants, who are hired to find memory-management problems in large Java programs, \eg Monika Beckworth.} 703 Given @typedef@ and @typeof@, and the strong desire to use the left-hand type in resolution, no attempt has been made in \CFA to support implicit type-inferencing. 682 704 Should a significant need arise, this decision can be revisited. 683 705 … … 702 724 int i, * ip = identity( &i ); 703 725 \end{cfa} 704 Unlike \CC template functions, \CFA polymorphic functions are compatible with C\emph{separate compilation}, preventing compilation and code bloat.726 Unlike \CC template functions, \CFA polymorphic functions are compatible with \emph{separate compilation}, preventing compilation and code bloat. 705 727 706 728 To constrain polymorphic types, \CFA uses \newterm{type assertions}~\cite[pp.~37-44]{Alphard} to provide further type information, where type assertions may be variable or function declarations that depend on a polymorphic type variable. … … 710 732 int val = twice( twice( 3 ) ); $\C{// val == 12}$ 711 733 \end{cfa} 712 Parametric polymorphism and assertions occur in existing type-unsafe (@void *@) Cfunctions, like @qsort@ for sorting an array of unknown values.734 The closest approximation to parametric polymorphism and assertions in C is type-unsafe (@void *@) functions, like @qsort@ for sorting an array of unknown values. 713 735 \begin{cfa} 714 736 void qsort( void * base, size_t nmemb, size_t size, int (*cmp)( const void *, const void * ) ); … … 761 783 The @sized@ assertion passes size and alignment as a data object has no implicit assertions. 762 784 Both assertions are used in @malloc@ via @sizeof@ and @_Alignof@. 763 In practi se, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided producing a type-safe call to @malloc@ or @memalign@.785 In practice, this polymorphic @malloc@ is unwrapped by the C compiler and the @if@ statement is elided producing a type-safe call to @malloc@ or @memalign@. 764 786 765 787 This mechanism is used to construct type-safe wrapper-libraries condensing hundreds of existing C functions into tens of \CFA overloaded functions. … … 787 809 forall( T @| sumable( T )@ ) // use trait 788 810 T sum( T a[$\,$], size_t size ) { 789 @T@ total = 0; 811 @T@ total = 0; // initialize by 0 constructor 790 812 for ( i; size ) 791 total @+=@ a[i]; 813 total @+=@ a[i]; // select appropriate + 792 814 return total; 793 815 } … … 795 817 \end{tabular} 796 818 \end{cquote} 797 Traits are implemented by flatten them at use points, as if written in full by the programmer.819 Traits are implemented by flattening them at use points, as if written in full by the programmer. 798 820 Flattening often results in overlapping assertions, \eg operator @+@. 799 821 Hence, trait names play no part in type equivalence. … … 821 843 Write bespoke data structures for each context. 822 844 While this approach is flexible and supports integration with the C type checker and tooling, it is tedious and error prone, especially for more complex data structures. 845 823 846 \item 824 847 Use @void *@-based polymorphism, \eg the C standard library functions @bsearch@ and @qsort@, which allow for the reuse of code with common functionality. 825 848 However, this approach eliminates the type checker's ability to ensure argument types are properly matched, often requiring a number of extra function parameters, pointer indirection, and dynamic allocation that is otherwise unnecessary. 849 826 850 \item 827 Use preprocessor macros, similar to \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret. 828 Furthermore, writing and using complex preprocessor macros is difficult and inflexible. 851 Use an internal macro capability, like \CC @templates@, to generate code that is both generic and type checked, but errors may be difficult to interpret. 852 Furthermore, writing complex template macros is difficult and complex. 853 854 \item 855 Use an external macro capability, like M4~\cite{M4}, to generate code that is generic code, but errors may be difficult to interpret. 856 Like internal macros, writing and using external macros is equally difficult and complex. 829 857 \end{enumerate} 830 858 … … 857 885 \end{tabular} 858 886 \end{cquote} 887 \label{s:GenericImplementation} 859 888 \CFA generic types are \newterm{fixed} or \newterm{dynamic} sized. 860 889 Fixed-size types have a fixed memory layout regardless of type parameters, whereas dynamic types vary in memory layout depending on the type parameters. … … 883 912 For software-engineering reasons, the set assertions would be refactored into a trait to allow alternative implementations, like a Java \lstinline[language=java]{interface}. 884 913 885 In summation, the \CFA type system inherits \newterm{nominal typing} for concrete types from C, and adds \newterm{structural typing} for polymorphic types. 886 Traits are used like interfaces in Java or abstract base-classes in \CC, but without the nominal inheritance relationships. 887 Instead, each polymorphic function or generic type defines the structural type needed for its execution, which is fulfilled at each call site from the lexical environment, like Go~\cite{Go} or Rust~\cite{Rust} interfaces. 888 Hence, new lexical scopes and nested functions are used extensively to create local subtypes, as in the @qsort@ example, without having to manage a nominal inheritance hierarchy. 914 In summation, the \CFA type system inherits \newterm{nominal typing} for concrete types from C; 915 however, without inheritance in \CFA, nominal typing cannot be extended to polymorphic subtyping. 916 Instead, \CFA adds \newterm{structural typing} and uses it to generate polymorphism. 917 Here, traits are like interfaces in Java or abstract base-classes in \CC, but without the nominal inheritance relationships. 918 Instead, each polymorphic function or generic type defines the structural requirements needed for its execution, which is fulfilled at each call site from the lexical environment, like Go~\cite{Go} or Rust~\cite{Rust} interfaces. 919 Hence, lexical scopes and nested functions are used extensively to mimic subtypes, as in the @qsort@ example, without managing a nominal inheritance hierarchy. 889 920 890 921 … … 902 933 general\footnote{overloadable entities: V $\Rightarrow$ variable, O $\Rightarrow$ operator, F $\Rightarrow$ function, M $\Rightarrow$ member} 903 934 & O\footnote{except assignment}/F & O/F/M & V/O/F & M\footnote{not universal} & O/M & O/F/M & no & no \\ 904 general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter number, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type}935 general constraints\footnote{T $\Rightarrow$ parameter type, \# $\Rightarrow$ parameter count, N $\Rightarrow$ parameter name; R $\Rightarrow$ return type} 905 936 & T/\#//R\footnote{parameter names can be used to disambiguate among overloads but not create overloads} 906 937 & T/\# & T/\#/R & T/\# & T/\#/N/R & T/\#/N/R & T/\#/N & T/R \\ … … 999 1030 1000 1031 However, the parameter operations are severely restricted because universal types have few operations. 1001 For example, swift provides a @print@ operation for its universal type, and the java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc.1032 For example, Swift provides a @print@ operation for its universal type, and the Java @Object@ class provides general methods: @toString@, @hashCode@, @equals@, @finalize@, \etc. 1002 1033 This restricted mechanism still supports a few useful functions, where the parameters are abstract entities, \eg: 1003 1034 \begin{swift} … … 1009 1040 \end{swift} 1010 1041 To make a universal function useable, an abstract description is needed for the operations used on the parameters within the function body. 1011 Type matching these operations can occur by discoverusing techniques like \CC template expansion, or explicit stating, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds.1042 Type matching these operations can be done by using techniques like \CC template expansion, or explicit stating, \eg interfaces, subtyping (inheritance), assertions (traits), type classes, type bounds. 1012 1043 The mechanism chosen can affect separate compilation or require runtime type information (RTTI). 1013 1044 \begin{description} … … 1030 1061 1031 1062 \begin{figure} 1032 \setlength{\tabcolsep}{1 5pt}1063 \setlength{\tabcolsep}{12pt} 1033 1064 \begin{tabular}{@{}ll@{}} 1034 1065 \multicolumn{1}{c}{\textbf{\CFA}} & \multicolumn{1}{c}{\textbf{Haskell}} \\ … … 1036 1067 forall( T ) trait sumable { 1037 1068 void ?{}( T &, zero_t ); 1038 T ?+=?( T &, T ); 1039 }; 1069 T ?+=?( T &, T ); }; 1040 1070 forall( T | sumable( T ) ) 1041 1071 T sum( T a[], size_t size ) { 1042 1072 T total = 0; 1043 1073 for ( i; size ) total += a[i]; 1044 return total; 1045 } 1074 return total; } 1046 1075 struct S { int i, j; }; 1047 1076 void ?{}( S & s, zero_t ) { s.[i, j] = 0; } … … 1049 1078 void ?{}( S & s, int i, int j ) { s.[i, j] = [i, j]; } 1050 1079 S ?+=?( S & l, S r ) { l.[i, j] += r.[i, j]; } 1051 1052 int main() { 1053 int ia[] = { 1, 2, 3 }; 1054 sout | sum( ia, 3 ); // trait inference 1055 double da[] = { 1.5, 2.5, 3.5 }; 1056 sout | sum( da, 3 ); // trait inference 1057 S sa[] = { {1, 1}, {2, 2}, {3, 3 } }; 1058 sout | sum( sa, 3 ).[i, j]; // trait inference 1059 } 1080 int main() { // trait inferencing 1081 sout | sum( (int []){ 1, 2, 3 }, 3 ); 1082 sout | sum( (double []){ 1.5, 2.5, 3.5 }, 3 ); 1083 sout | sum( (S []){ {1,1}, {2,2}, {3,3} }, 3 ).[i, j]; } 1084 1085 1086 1060 1087 \end{cfa} 1061 1088 & … … 1064 1091 szero :: a 1065 1092 sadd :: a -> a -> a 1066 1067 1093 ssum :: Sumable a $=>$ [a] -> a 1068 1094 ssum (x:xs) = sadd x (ssum xs) 1069 1095 ssum [] = szero 1070 1071 1072 1073 1096 data S = S Int Int deriving Show 1074 1097 @instance Sumable Int@ where … … 1077 1100 @instance Sumable Float@ where 1078 1101 szero = 0.0 1079 1102 sadd = (+) 1080 1103 @instance Sumable S@ where 1081 1104 szero = S 0 0 … … 1087 1110 \end{haskell} 1088 1111 \end{tabular} 1089 \caption{Implicit ly/ExplicitlyTrait Inferencing}1090 \label{f:Implicit lyExplicitlyTraitInferencing}1112 \caption{Implicit/Explicit Trait Inferencing} 1113 \label{f:ImplicitExplicitTraitInferencing} 1091 1114 \end{figure} 1092 1115 1093 1116 One differentiating feature among these specialization techniques is the ability to implicitly or explicitly infer the trait information at a class site. 1094 \VRef[Figure]{f:Implicit lyExplicitlyTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell.1117 \VRef[Figure]{f:ImplicitExplicitTraitInferencing} compares the @sumable@ trait and polymorphic @sum@ function \see{\VRef{s:Traits}} for \CFA and Haskell. 1095 1118 Here, the \CFA type system inferences the trait functions at each call site, so no additional specification is necessary by the programmer. 1096 1119 The Haskell program requires the programmer to explicitly bind the trait and to each type that can be summed. … … 1102 1125 \end{ada} 1103 1126 Finally, there is a belief that certain type systems cannot support general overloading, \eg Haskell. 1104 As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on the opinion of the language designers and the type system they choose,not any reason in type theory.1127 As \VRef[Table]{t:OverloadingFeatures} shows, there are multiple languages with both general and parametric overloading, so the decision to not support general overloading is based on design choices made by the language designers not any reason in type theory. 1105 1128 1106 1129 The fourth row classifies if conversions are attempted beyond exact match. … … 1121 1144 The details of compiler optimization work are covered in a previous technical report~\cite{Yu20}, which essentially forms part of this thesis. 1122 1145 \item 1123 Th ethesis presents a systematic review of the new features added to the \CFA language and its type system.1146 This thesis presents a systematic review of the new features added to the \CFA language and its type system. 1124 1147 Some of the more recent inclusions to \CFA, such as tuples and generic structure types, were not well tested during development due to the limitation of compiler performance. 1125 1148 Several issues coming from the interactions of various language features are identified and discussed in this thesis; -
doc/theses/fangren_yu_MMath/resolution.tex
r7d02d35 rbd72f517 2 2 \label{c:content2} 3 3 4 Recapping, the\CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions;4 Recapping, \CFA's type-system provides expressive polymorphism: variables can be overloaded, functions can be overloaded by argument and return types, tuple types, generic (polymorphic) functions and types (aggregates) can have multiple type parameters with assertion restrictions; 5 5 in addition, C's multiple implicit type-conversions must be respected. 6 6 This generality leads to internal complexity and correspondingly higher compilation cost directly related to type resolution. … … 24 24 \end{enumerate} 25 25 \VRef[Table]{t:SelectedFileByCompilerBuild} shows improvements for selected tests with accumulated reductions in compile time across each of the 5 fixes. 26 T o this day, the large reduction in compilation time significantly improves the development of the \CFA's runtime because of its frequent compilation cycles.26 The large reduction in compilation time significantly improves the development of the \CFA's runtime because of its frequent compilation cycles. 27 27 28 28 \begin{table}[htb] … … 54 54 Some of those problems arise from the newly introduced language features described in the previous chapter. 55 55 In addition, fixing unexpected interactions within the type system has presented challenges. 56 This chapter describes in detail the type-resolution rules currently in use and some major problems that have beenidentified.56 This chapter describes in detail the type-resolution rules currently in use and some major problems \PAB{I} have identified. 57 57 Not all of those problems have immediate solutions, because fixing them may require redesigning parts of the \CFA type system at a larger scale, which correspondingly affects the language design. 58 58 … … 69 69 \begin{enumerate}[leftmargin=*] 70 70 \item \textbf{Unsafe} cost representing a narrowing conversion of arithmetic types, \eg @int@ to @short@, and qualifier-dropping conversions for pointer and reference types. 71 Narrowing conversions have the potential to lose (truncat ion) data.71 Narrowing conversions have the potential to lose (truncate) data. 72 72 A programmer must decide if the computed data-range can safely be shorted in the smaller storage. 73 73 Warnings for unsafe conversions are helpful. … … 86 86 87 87 \item \textbf{Safe} cost representing a widening conversion \eg @short@ to @int@, qualifier-adding conversions for pointer and reference types, and value conversion for enumeration constants. 88 Even when conversions are safe, the fewest conversions itranked better, \eg @short@ to @int@ versus @short@ to @long int@.88 When all conversions are safe, closer conversions are ranked better, \eg @short@ to @int@ versus @short@ to @long int@. 89 89 \begin{cfa} 90 90 void f( long int p ); $\C[2.5in]{// 1}$ … … 103 103 104 104 \item \textbf{Specialization} cost counting the number of restrictions introduced by type assertions. 105 Fewer restriction means few sparametric variables passed at the function call giving better performance.105 Fewer restriction means fewer parametric variables passed at the function call giving better performance. 106 106 \begin{cfa} 107 107 forall( T | { T ?+?( T, T ) } ) void f( T ); $\C[3.25in]{// 1}$ … … 110 110 \end{cfa} 111 111 \end{enumerate} 112 Cost tuples are compared bylexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item.112 Cost tuples are compared in lexicographical order, from unsafe (highest) to specialization (lowest), with ties moving to the next lowest item. 113 113 At a subexpression level, the lowest cost candidate for each result type is included as a possible interpretation of the expression; 114 114 at the top level, all possible interpretations of different types are considered (generating a total ordering) and the overall lowest cost is selected as the final interpretation of the expression. 115 115 Glen Ditchfield first proposed this costing model~\cite[\S~4.4.5]{Ditchfield92} to generate a resolution behaviour that is reasonable to C programmers based on existing conversions in the C programming language. 116 116 This model carried over into the first implementation of the \CFA type-system by Richard Bilson~\cite[\S~2.2]{Bilson03}, and was extended but not redesigned by Aaron Moss~\cite[chap.~4]{Moss19}. 117 Moss's work began to show problems with the underlying cost ingmodel;117 Moss's work began to show problems with the underlying cost model; 118 118 these design issues are part of this work. 119 119 … … 152 152 Therefore, at each resolution step, the arguments are already given unique interpretations, so the ordering only needs to compare different sets of conversion targets (function parameter types) on the same set of input. 153 153 154 In \CFA, trying to use such a system is problematic because of the presence of return-type overloading of functions and variable.154 \PAB{My conclusion} is that trying to use such a system in \CFA is problematic because of the presence of return-type overloading of functions and variables. 155 155 Specifically, \CFA expression resolution considers multiple interpretations of argument subexpressions with different types, \eg: 156 156 so it is possible that both the selected function and the set of arguments are different, and cannot be compared with a partial-ordering system. … … 165 165 \end{quote} 166 166 However, I was unable to generate any Ada example program that demonstrates this preference. 167 In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order everylegal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity.167 In contrast, the \CFA overload resolution-system is at the other end of the spectrum, as it tries to order all legal interpretations of an expression and chooses the best one according to cost, occasionally giving unexpected results rather than an ambiguity. 168 168 169 169 Interestingly, the \CFA cost-based model can sometimes make expression resolution too permissive because it always attempts to select the lowest cost option, and only when there are multiple options tied at the lowest cost does it report the expression is ambiguous. … … 171 171 Other than the case of multiple exact matches, where all have cost zero, incomparable candidates under a partial ordering can often have different expression costs since different kinds of implicit conversions are involved, resulting in seemingly arbitrary overload selections. 172 172 173 There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is clearly justifiable, and one of them is straight upwrong.173 There are currently at least three different situations where the polymorphic cost element of the cost model does not yield a candidate selection that is justifiable, and one of them is clearly wrong. 174 174 \begin{enumerate}[leftmargin=*] 175 175 \item Polymorphic exact match versus non-polymorphic inexact match. … … 193 193 \end{itemize} 194 194 In this example, option 1 produces the prototype @void f( int )@, which gives an exact match and therefore takes priority. 195 The \CC resolution rules effectively make s option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types below @long@.195 The \CC resolution rules effectively make option 2 a specialization that only applies to type @long@ exactly,\footnote{\CC does have explicit template specializations, however they do not participate directly in overload resolution and can sometimes lead to unintuitive results.} while the current \CFA rules make option 2 apply for all integral types ranked lower than @long@ as well. 196 196 This difference could be explained as compensating for \CFA polymorphic functions being separately compiled versus template inlining; 197 197 hence, calling them requires passing type information and assertions increasing the runtime cost. … … 211 211 Although it is true that both the sequence 1, 2 and 1, 3, 4 are increasingly more constrained on the argument types, option 2 is not comparable to either of option 3 or 4; 212 212 they actually describe independent constraints on the two arguments. 213 Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@ ,213 Specifically, option 2 says the two arguments must have the same type, while option 3 states the second argument must have type @int@. 214 214 Because two constraints can independently be satisfied, neither should be considered a better match when trying to resolve a call to @f@ with argument types @(int, int)@; 215 215 reporting such an expression as ambiguous is more appropriate. … … 227 227 Passing a @pair@ variable to @f@ 228 228 \begin{cfa} 229 pair p;229 pair(int, double) p; 230 230 f( p ); 231 231 \end{cfa} 232 232 gives a cost of 1 poly, 2 variable for the @pair@ overload, versus a cost of 1 poly, 1 variable for the unconstrained overload. 233 233 Programmer expectation is to select option 1 because of the exact match, but the cost model selects 2; 234 while either could work, the type system should select a call that meets expectation of say the call is ambiguous, forcing the programmer to mediate.234 it is not possible to write a specialization for @f@ that works on any pair type and gets selected by the type resolver as intended. 235 235 As a result, simply counting the number of polymorphic type variables is no longer correct to order the function candidates as being more constrained. 236 236 \end{enumerate} 237 237 238 These inconsistencies are not easily solvable in the current cost-model, meaning th e currently\CFA codebase has to workaround these defects.238 These inconsistencies are not easily solvable in the current cost-model, meaning that currently the \CFA codebase has to workaround these defects. 239 239 One potential solution is to mix the conversion cost and \CC-like partial ordering of specializations. 240 240 For example, observe that the first three elements (unsafe, polymorphic and safe conversions) in the \CFA cost-tuple are related to the argument/parameter types, while the other two elements (polymorphic variable and assertion counts) are properties of the function declaration. … … 352 352 Here, the unsafe cost of signed to unsigned is factored into the ranking, so the safe conversion is selected over an unsafe one. 353 353 Furthermore, an integral option is taken before considering a floating option. 354 This model locally matches the C approach, but provides an ordering when there are many overload ed alternative.354 This model locally matches the C approach, but provides an ordering when there are many overload alternatives. 355 355 However, as Moss pointed out overload resolution by total cost has problems, \eg handling cast expressions. 356 356 \begin{cquote} … … 379 379 if an expression has any legal interpretations as a C builtin operation, only the lowest cost one is kept, regardless of the result type. 380 380 381 \VRef[Figure]{f:CFAArithmeticConversions} shows analternative \CFA partial-order arithmetic-conversions graphically.381 \VRef[Figure]{f:CFAArithmeticConversions} shows \PAB{my} alternative \CFA partial-order arithmetic-conversions graphically. 382 382 The idea here is to first look for the best integral alternative because integral calculations are exact and cheap. 383 383 If no integral solution is found, than there are different rules to select among floating-point alternatives. … … 387 387 \section{Type Unification} 388 388 389 Type unification is the algorithm that assigns values to each (free) type parameter ssuch that the types of the provided arguments and function parameters match.389 Type unification is the algorithm that assigns values to each (free) type parameter such that the types of the provided arguments and function parameters match. 390 390 391 391 \CFA does not attempt to do any type \textit{inference} \see{\VRef{s:IntoTypeInferencing}}: it has no anonymous functions (\ie lambdas, commonly found in functional programming and also used in \CC and Java), and the variable types must all be explicitly defined (no auto typing). … … 408 408 With the introduction of generic record types, the parameters must match exactly as well; currently there are no covariance or contravariance supported for the generics. 409 409 410 One simplification was madeto the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed.410 \PAB{I made} one simplification to the \CFA language that makes modelling the type system easier: polymorphic function pointer types are no longer allowed. 411 411 The polymorphic function declarations themselves are still treated as function pointer types internally, however the change means that formal parameter types can no longer be polymorphic. 412 412 Previously it was possible to write function prototypes such as … … 418 418 A function operates on the call-site arguments together with any local and global variables. 419 419 When the function is polymorphic, the types are inferred at each call site. 420 On each invocation, the types to be operate on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle.420 On each invocation, the types to be operated on are determined from the arguments provided, and therefore, there is no need to pass a polymorphic function pointer, which can take any type in principle. 421 421 For example, consider a polymorphic function that takes one argument of type @T@ and polymorphic function pointer. 422 422 \begin{cfa} … … 441 441 The assertion set that needs to be resolved is just the declarations on the function prototype, which also simplifies the assertion satisfaction algorithm, which is discussed further in the next section. 442 442 443 Animplementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened.443 \PAB{My} implementation sketch stores type unification results in a type-environment data-structure, which represents all the type variables currently in scope as equivalent classes, together with their bound types and information such as whether the bound type is allowed to be opaque (\ie a forward declaration without definition in scope) and whether the bounds are allowed to be widened. 444 444 In the general approach commonly used in functional languages, the unification variables are given a lower bound and an upper bound to account for covariance and contravariance of types. 445 445 \CFA does not implement any variance with its generic types and does not allow polymorphic function types, therefore no explicit upper bound is needed and one binding value for each equivalence class suffices. … … 469 469 470 470 471 In previous versions of \CFA, this number was set at 4; as the compiler becomes more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it does not lead to problems.471 In previous versions of \CFA, this number was set at 4; as the compiler has become more optimized and capable of handling more complex expressions in a reasonable amount of time, I have increased the limit to 8 and it has not led to problems. 472 472 Only rarely is there a case where the infinite recursion produces an exponentially growing assertion set, causing minutes of time wasted before the limit is reached. 473 473 Fortunately, it is very hard to generate this situation with realistic \CFA code, and the ones that have occurred have clear characteristics, which can be prevented by alternative approaches. … … 475 475 One example is analysed in this section. 476 476 477 While the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible.477 \PAB{My analysis shows that} while the assertion satisfaction problem in isolation looks like just another expression to resolve, its recursive nature makes some techniques for expression resolution no longer possible. 478 478 The most significant impact is that type unification has a side effect, namely editing the type environment (equivalence classes and bindings), which means if one expression has multiple associated assertions it is dependent, as the changes to the type environment must be compatible for all the assertions to be resolved. 479 479 Particularly, if one assertion parameter can be resolved in multiple different ways, all of the results need to be checked to make sure the change to type variable bindings are compatible with other assertions to be resolved. … … 481 481 In many cases, these problems can be avoided by examining other assertions that provide insight on the desired type binding: if one assertion parameter can only be matched by a unique option, the type bindings can be updated confidently without the need for backtracking. 482 482 483 The Moss algorithm currently used in \CFA was developed using a simplified type -simulator that capture most of \CFA type-system features.483 The Moss algorithm currently used in \CFA was developed using a simplified type system that captures most of \CFA's type system features. 484 484 The simulation results were then ported back to the actual language. 485 485 The simulator used a mix of breadth- and depth-first search in a staged approach. … … 494 494 If any new assertions are introduced by the selected candidates, the algorithm is applied recursively, until there are none pending resolution or the recursion limit is reached, which results in a failure. 495 495 496 However, in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions.496 However, \PAB{I identify that} in practice the efficiency of this algorithm can be sensitive to the order of resolving assertions. 497 497 Suppose an unbound type variable @T@ appears in two assertions: 498 498 \begin{cfa} … … 517 517 A type variable introduced by the @forall@ clause of function declaration can appear in parameter types, return types and assertion variables. 518 518 If it appears in parameter types, it can be bound when matching the arguments to parameters at the call site. 519 If it only appears in the return type, it can be eventually bedetermined from the call-site context.519 If it only appears in the return type, it can be eventually determined from the call-site context. 520 520 Currently, type resolution cannot do enough return-type inferencing while performing eager assertion resolution: the return type information is unknown before the parent expression is resolved, unless the expression is an initialization context where the variable type is known. 521 521 By delaying the assertion resolution until the return type becomes known, this problem can be circumvented. 522 The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables ( associate types).522 The truly problematic case occurs if a type variable does not appear in either of the parameter or return types and only appears in assertions or variables (\newterm{associate types}). 523 523 \begin{cfa} 524 524 forall( T | { void foo( @T@ ) } ) int f( float ) { … … 526 526 } 527 527 \end{cfa} 528 This case is rare so forcing every type variable to appear at least once in parameter or return types limitsdoes not limit the expressiveness of \CFA type system to a significant extent.529 The next section presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which isprovides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}.528 This case is rare so forcing every type variable to appear at least once in parameter or return types does not limit the expressiveness of \CFA type system to a significant extent. 529 \VRef{s:AssociatedTypes} presents a proposal for including type declarations in traits rather than having all type variables appear in the trait parameter list, which provides equivalent functionality to an unbound type parameter in assertion variables, and also addresses some of the variable cost issue discussed in \VRef{s:ExpressionCostModel}. 530 530 531 531 … … 535 535 Based on the experiment results, this approach can improve the performance of expression resolution in general, and sometimes allow difficult instances of assertion resolution problems to be solved that are otherwise infeasible, \eg when the resolution encounters an infinite loop. 536 536 537 The tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment.537 \PAB{I identify that} the tricky problem in implementing this approach is that the resolution algorithm has side effects, namely modifying the type bindings in the environment. 538 538 If the modifications are cached, \ie the results that cause the type bindings to be modified, it is also necessary to store the changes to type bindings, too. 539 539 Furthermore, in cases where multiple candidates can be used to satisfy one assertion parameter, all of them must be cached including those that are not eventually selected, since the side effect can produce different results depending on the context. … … 583 583 However, the implementation of the type environment is simplified; 584 584 it only stores a tentative type binding with a flag indicating whether \emph{widening} is possible for an equivalence class of type variables. 585 Formally speaking, this meansthe type environment used in \CFA is only capable of representing \emph{lower-bound} constraints.585 Formally speaking, \PAB{I concluded} the type environment used in \CFA is only capable of representing \emph{lower-bound} constraints. 586 586 This simplification works most of the time, given the following properties of the existing \CFA type system and the resolution algorithms: 587 587 \begin{enumerate} … … 599 599 \end{enumerate} 600 600 601 \CFA does attempt to incorporate upstream type information propagated from variable adeclaration with initializer, since the type of the variable being initialized is known.601 \CFA does attempt to incorporate upstream type information propagated from a variable declaration with initializer, since the type of the variable being initialized is known. 602 602 However, the current type-environment representation is flawed in handling such type inferencing, when the return type in the initializer is polymorphic. 603 603 Currently, an inefficient workaround is performed to create the necessary effect. -
doc/theses/fangren_yu_MMath/uw-ethesis.bib
r7d02d35 rbd72f517 2 2 % For use with BibTeX 3 3 4 @misc{OperOverloading, 5 contributer = {pabuhr@plg}, 6 key = {Operator Overloading}, 7 title = {Operator Overloading}, 8 author = {{WikipediA}}, 9 howpublished= {\url{https://en.wikipedia.org/wiki/Operator_overloading}}, 10 year = 2025, 11 } 12 13 @misc{FuncOverloading, 14 contributer = {pabuhr@plg}, 15 key = {Function Overloading}, 16 title = {Function Overloading}, 17 author = {{WikipediA}}, 18 howpublished= {\url{https://en.wikipedia.org/wiki/Function_overloading}}, 19 year = 2025, 20 } -
doc/theses/fangren_yu_MMath/uw-ethesis.tex
r7d02d35 rbd72f517 100 100 \lstnewenvironment{ada}[1][]{\lstset{language=Ada,escapechar=\$,moredelim=**[is][\color{red}]{@}{@},}\lstset{#1}}{} 101 101 102 \newcommand{\PAB}[1]{{\color{ red}PAB:#1}}102 \newcommand{\PAB}[1]{{\color{magenta}#1}} 103 103 \newcommand{\newtermFont}{\emph} 104 104 \newcommand{\Newterm}[1]{\newtermFont{#1}} -
doc/theses/mike_brooks_MMath/Makefile
r7d02d35 rbd72f517 13 13 TeXSRC = ${wildcard *.tex} 14 14 PicSRC = ${notdir ${wildcard ${Pictures}/*.png}} ${notdir ${wildcard ${Pictures}/*.fig}} 15 PicSRC := ${PicSRC:.fig=.pdf} # substitute ".fig" with ".pdf" 16 GraphSRC_OLD = ${notdir ${wildcard ${Pictures}/*.dat}} 17 GraphSRC_OLD := ${GraphSRC_OLD:.dat=.pdf} # substitute ".dat" with ".pdf" 18 PlotINPUTS = ${wildcard ${Plots}/*.gp} ${wildcard ${Plots}/*.py} 19 PlotINPUTS := ${addsuffix .INPUTS,${PlotINPUTS}} 15 PicSRC := ${PicSRC:.fig=.pdf} # substitute ".fig" with ".pdf" 20 16 PlotSRC = ${notdir ${wildcard ${Plots}/*.gp}} 21 PlotSRC := ${addprefix ${Build}/plot-,${PlotSRC:.gp=.pdf}} 17 PlotSRC := ${addprefix ${Build}/plot-,${PlotSRC:.gp=.pdf}} # substitute ".gp" with ".pdf" 22 18 DemoPgmSRC = ${notdir ${wildcard ${Programs}/*-demo.cfa}} 23 19 PgmSRC = ${notdir ${wildcard ${Programs}/*}} … … 49 45 # Rules and Recipes 50 46 51 .PHONY : all clean # not file names47 .PHONY : all clean # not file names 52 48 .SECONDARY: 53 49 #.PRECIOUS : ${Build}/% # don't delete intermediates … … 61 57 # File Dependencies 62 58 63 ${DOCUMENT}: ${TeXSRC} $(RunPgmOut) ${DemoPgmOut} ${ GraphSRC_OLD} ${PlotSRC} ${PicSRC} ${BibSRC} ${BibRep}/pl.bib ${LaTMac}/common.tex Makefile | ${Build}59 ${DOCUMENT}: ${TeXSRC} $(RunPgmOut) ${DemoPgmOut} ${PlotSRC} ${PicSRC} ${BibSRC} ${BibRep}/pl.bib ${LaTMac}/common.tex Makefile | ${Build} 64 60 echo ${PicSRC} 65 61 echo ${GraphSRC_OLD} … … 82 78 ${CFA} $< -o $@ 83 79 84 ${Build}/%: ${Programs}/%.run.cfa | ${Build} 80 ${Build}/%: ${Programs}/%.run.cfa | ${Build} # cfa cannot handle pipe 85 81 sed -f ${Programs}/sedcmd $< > ${Build}/tmp.cfa; ${CFA} ${Build}/tmp.cfa -o $@ 86 82 … … 94 90 $< > $@ 95 91 96 string-graph-peq-sharing.pdf: string-graph-peq-sharing.dat plot-peq-sharing.gp | ${Build}97 gnuplot plot-peq-sharing.gp98 99 string-graph-pta-sharing.pdf: string-graph-pta-sharing.dat plot-pta-sharing.gp | ${Build}100 gnuplot plot-pta-sharing.gp101 102 string-graph-pbv.pdf: string-graph-pbv.dat plot-pbv.gp | ${Build}103 gnuplot plot-pbv.gp104 105 string-graph-allocn.pdf: string-graph-allocn.dat plot-allocn.gp | ${Build}106 gnuplot plot-allocn.gp107 108 92 %.pdf: %.fig | ${Build} 109 93 fig2dev -L pdf $< > ${Build}/$@ 110 94 111 -include $(Plots)/ string-peq-cppemu.d95 -include $(Plots)/*.d 112 96 113 97 ${Build}/plot-%.dat: ${Plots}/%.py ${Plots}/%.py.INPUTS | ${Build} 114 echo ${PlotINPUTS}115 98 python3 $< > $@ 116 99 -
doc/theses/mike_brooks_MMath/array.tex
r7d02d35 rbd72f517 17 17 though using a new style of generic parameter. 18 18 \begin{cfa} 19 @array( float, 99 )@ x; $\C[2. 75in]{// x contains 99 floats}$20 \end{cfa} 21 Here, the arguments to the @array@ type are @float@ (element type) and @99@ ( length).22 When this type is used as a function parameter, the type-system requires th at a call'sargument is a perfect match.19 @array( float, 99 )@ x; $\C[2.5in]{// x contains 99 floats}$ 20 \end{cfa} 21 Here, the arguments to the @array@ type are @float@ (element type) and @99@ (dimension). 22 When this type is used as a function parameter, the type-system requires the argument is a perfect match. 23 23 \begin{cfa} 24 24 void f( @array( float, 42 )@ & p ) {} $\C{// p accepts 42 floats}$ 25 25 f( x ); $\C{// statically rejected: type lengths are different, 99 != 42}$ 26 27 26 test2.cfa:3:1 error: Invalid application of existing declaration(s) in expression. 28 27 Applying untyped: Name: f ... to: Name: x 29 28 \end{cfa} 30 Here, the function @f@'s parameter @p@ is declared with length 42. 31 However, the call @f( x )@ is invalid, because @x@'s length is @99@, which does not match @42@. 32 33 A function declaration can be polymorphic over these @array@ arguments by using the \CFA @forall@ declaration prefix. 29 Function @f@'s parameter expects an array with dimension 42, but the argument dimension 99 does not match. 30 31 A function can be polymorphic over @array@ arguments using the \CFA @forall@ declaration prefix. 34 32 \begin{cfa} 35 33 forall( T, @[N]@ ) … … 38 36 } 39 37 g( x, 0 ); $\C{// T is float, N is 99, dynamic subscript check succeeds}$ 40 g( x, 1000 ); $\C{// T is float, N is 99, dynamic subscript check fails}\CRT$ 41 38 g( x, 1000 ); $\C{// T is float, N is 99, dynamic subscript check fails}$ 42 39 Cforall Runtime error: subscript 1000 exceeds dimension range [0,99) $for$ array 0x555555558020. 43 40 \end{cfa} 44 Function @g@ takes an arbitrary type parameter @T@ and a \emph{dimension parameter} @N@. 45 A dimension parameter represents a to-be-determined count of elements, managed by the type system. 46 The call @g( x, 0 )@ is valid because @g@ accepts any length of array, where the type system infers @float@ for @T@ and length @99@ for @N@. 47 Inferring values for @T@ and @N@ is implicit. 48 Furthermore, in this case, the runtime subscript @x[0]@ (parameter @i@ being @0@) in @g@ is valid because 0 is in the dimension range $[0,99)$ of argument @x@. 49 However, the call @g( x, 1000 )@ is also accepted through compile time; 50 however, this case's subscript, @x[1000]@, generates an error, because @1000@ is outside the dimension range $[0,99)$ of argument @x@. 41 Function @g@ takes an arbitrary type parameter @T@ and an unsigned integer \emph{dimension} @N@. 42 The dimension represents a to-be-determined number of elements, managed by the type system, where 0 represents an empty array. 43 The type system implicitly infers @float@ for @T@ and @99@ for @N@. 44 Furthermore, the runtime subscript @x[0]@ (parameter @i@ being @0@) in @g@ is valid because 0 is in the dimension range $[0,99)$ for argument @x@. 45 The call @g( x, 1000 )@ is also accepted at compile time. 46 However, the subscript, @x[1000]@, generates a runtime error, because @1000@ is outside the dimension range $[0,99)$ of argument @x@. 51 47 In general, the @forall( ..., [N] )@ participates in the user-relevant declaration of the name @N@, which becomes usable in parameter/return declarations and within a function. 52 48 The syntactic form is chosen to parallel other @forall@ forms: 53 49 \begin{cfa} 54 forall( @[N]@ ) ... $\C [1.5in]{// dimension}$55 forall( T ) ... $\C{// value datatype (formerly, "otype")}$56 forall( T & ) ... $\C{// opaque datatype (formerly, "dtype")}\CRT$50 forall( @[N]@ ) ... $\C{// dimension}$ 51 forall( T ) ... $\C{// value datatype}$ 52 forall( T & ) ... $\C{// opaque datatype}$ 57 53 \end{cfa} 58 54 % The notation @array(thing, N)@ is a single-dimensional case, giving a generic type instance. … … 63 59 \begin{cfa} 64 60 forall( [N] ) 65 void declDemo( ... ) { 66 float x1[N]; $\C{// built-in type ("C array")}$ 67 array(float, N) x2; $\C{// type from library}$ 68 } 69 \end{cfa} 70 Both of the locally-declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@. 71 The two variables have identical size and layout; they both encapsulate 42-float stack allocations, with no additional ``bookkeeping'' allocations or headers. 61 void f( ... ) { 62 float x1[@N@]; $\C{// C array, no subscript checking}$ 63 array(float, N) x2; $\C{// \CFA array, subscript checking}\CRT$ 64 } 65 \end{cfa} 66 Both of the stack declared array variables, @x1@ and @x2@, have 42 elements, each element being a @float@. 67 The two variables have identical size and layout, with no additional ``bookkeeping'' allocations or headers. 68 The C array, @x1@, has no subscript checking, while \CFA array, @x2@, does. 72 69 Providing this explicit generic approach requires a significant extension to the \CFA type system to support a full-feature, safe, efficient (space and time) array-type, which forms the foundation for more complex array forms in \CFA. 73 In all following discussion, ``C array'' means the types like that of @x@ and ``\CFA array'' means the standard-library @array@ type (instantiations), like the type of @x2@. 74 75 Admittedly, the @array@ library type for @x2@ is syntactically different from its C counterpart. 76 A future goal (TODO xref) is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@). 70 In all following discussion, ``C array'' means types like @x1@ and ``\CFA array'' means types like @x2@. 71 72 A future goal is to provide the new @array@ features with syntax approaching C's (declaration style of @x1@). 77 73 Then, the library @array@ type could be removed, giving \CFA a largely uniform array type. 78 At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis; 79 feature support and C compatibility are revisited in Section ? TODO. 74 At present, the C-syntax @array@ is only partially supported, so the generic @array@ is used exclusively in the thesis. 80 75 81 76 My contributions in this chapter are: 82 \begin{enumerate} 77 \begin{enumerate}[leftmargin=*] 83 78 \item A type system enhancement that lets polymorphic functions and generic types be parameterized by a numeric value: @forall( [N] )@. 84 \item Provide a length-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued.79 \item Provide a dimension/subscript-checked array-type in the \CFA standard library, where the array's length is statically managed and dynamically valued. 85 80 \item Provide argument/parameter passing safety for arrays and subscript safety. 86 \item TODO: general parking...87 81 \item Identify the interesting specific abilities available by the new @array@ type. 88 82 \item Where there is a gap concerning this feature's readiness for prime-time, identification of specific workable improvements that are likely to close the gap. … … 90 84 91 85 86 \begin{comment} 92 87 \section{Dependent Typing} 93 88 94 General dependent typing allows the type system to encode arbitrary predicates (\eg behavioural specifications for functions), 95 which is an anti-goal for my work. 89 General dependent typing allows a type system to encode arbitrary predicates, \eg behavioural specifications for functions, which is an anti-goal for my work. 96 90 Firstly, this application is strongly associated with pure functional languages, 97 91 where a characterization of the return value (giving it a precise type, generally dependent upon the parameters) … … 101 95 Secondly, TODO: bash Rust. 102 96 TODO: cite the crap out of these claims. 97 \end{comment} 103 98 104 99 105 100 \section{Features Added} 106 101 107 This section shows more about using the \CFA array and dimension parameters, demonstrating theirsyntax and semantics by way of motivating examples.102 This section shows more about using the \CFA array and dimension parameters, demonstrating syntax and semantics by way of motivating examples. 108 103 As stated, the core capability of the new array is tracking all dimensions within the type system, where dynamic dimensions are represented using type variables. 109 104 110 105 By declaring type variables at the front of object declarations, an array dimension is lexically referenceable where it is needed. 111 For example, a declaration can share one length, @N@, among a pair of parameters and the return, 112 meaning that it requires both input arrays to be of the same length, and guarantees that the result is of that length as well. 106 For example, a declaration can share one length, @N@, among a pair of parameters and return type, meaning the input arrays and return array are the same length. 113 107 \lstinput{10-17}{hello-array.cfa} 114 108 Function @f@ does a pointwise comparison of its two input arrays, checking if each pair of numbers is within half a percent of each other, returning the answers in a newly allocated @bool@ array. 115 The dynamic allocation of the @ret@ array , bythe library @alloc@ function,109 The dynamic allocation of the @ret@ array uses the library @alloc@ function, 116 110 \begin{cfa} 117 111 forall( T & | sized(T) ) … … 120 114 } 121 115 \end{cfa} 122 uses the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type.123 Note that @alloc@ only sees one whole type for its @T@ (which is @f@'s @array(bool, N)@);this type's size is a computation based on @N@.116 which captures the parameterized dimension information implicitly within its @sizeof@ determination, and casts the return type. 117 Note, @alloc@ only sees the whole type for its @T@, @array(bool, N)@, where this type's size is a computation based on @N@. 124 118 This example illustrates how the new @array@ type plugs into existing \CFA behaviour by implementing necessary \emph{sized} assertions needed by other types. 125 119 (\emph{sized} implies a concrete \vs abstract type with a runtime-available size, exposed as @sizeof@.) … … 129 123 \lstinput{30-43}{hello-array.cfa} 130 124 \lstinput{45-48}{hello-array.cfa} 131 \caption{\lstinline{f} Harness}132 \label{f:f Harness}125 \caption{\lstinline{f} Example} 126 \label{f:fExample} 133 127 \end{figure} 134 128 135 \VRef[Figure]{f:f Harness} shows a harness that usesfunction @f@, illustrating how dynamic values are fed into the @array@ type.129 \VRef[Figure]{f:fExample} shows an example using function @f@, illustrating how dynamic values are fed into the @array@ type. 136 130 Here, the dimension of arrays @x@, @y@, and @result@ is specified from a command-line value, @dim@, and these arrays are allocated on the stack. 137 131 Then the @x@ array is initialized with decreasing values, and the @y@ array with amounts offset by constant @0.005@, giving relative differences within tolerance initially and diverging for later values. … … 144 138 145 139 In summary: 146 \begin{itemize} 140 \begin{itemize}[leftmargin=*] 147 141 \item 148 142 @[N]@ within a @forall@ declares the type variable @N@ to be a managed length. 149 143 \item 150 @N@ can be used an expression of type @size_t@ within the declaredfunction body.144 @N@ can be used in an expression with type @size_t@ within the function body. 151 145 \item 152 146 The value of an @N@-expression is the acquired length, derived from the usage site, \ie generic declaration or function call. … … 158 152 \begin{enumerate}[leftmargin=*] 159 153 \item 160 The \CC template @N@ can only be compile-time value, while the \CFA @N@ may be a runtime value. 161 % agreed, though already said 154 The \CC template @N@ can only be a compile-time value, while the \CFA @N@ may be a runtime value. 162 155 \item 163 156 \CC does not allow a template function to be nested, while \CFA lets its polymorphic functions to be nested. 164 % why is this important? 165 \item 157 Hence, \CC precludes a simple form of information hiding. 158 \item 159 \label{p:DimensionPassing} 166 160 The \CC template @N@ must be passed explicitly at the call, unless @N@ has a default value, even when \CC can deduct the type of @T@. 167 161 The \CFA @N@ is part of the array type and passed implicitly at the call. … … 169 163 % mycode/arrr/thesis-examples/check-peter/cs-cpp.cpp, v2 170 164 \item 171 \CC cannot have an array of references, but can have an array of pointers.165 \CC cannot have an array of references, but can have an array of @const@ pointers. 172 166 \CC has a (mistaken) belief that references are not objects, but pointers are objects. 173 167 In the \CC example, the arrays fall back on C arrays, which have a duality with references with respect to automatic dereferencing. … … 178 172 % https://stackoverflow.com/questions/922360/why-cant-i-make-a-vector-of-references 179 173 \item 174 \label{p:ArrayCopy} 180 175 C/\CC arrays cannot be copied, while \CFA arrays can be copied, making them a first-class object (although array copy is often avoided for efficiency). 181 176 % fixed by comparing to std::array 182 177 % mycode/arrr/thesis-examples/check-peter/cs-cpp.cpp, v10 183 178 \end{enumerate} 184 T ODO: settle Mike's concerns with this comparison (perhaps, remove)179 The \CC template @array@ type mitigates points \VRef[]{p:DimensionPassing} and \VRef[]{p:ArrayCopy}, but it is also trying to accomplish a similar mechanism to \CFA @array@. 185 180 186 181 \begin{figure} … … 226 221 Just as the first example in \VRef[Section]{s:ArrayIntro} shows a compile-time rejection of a length mismatch, 227 222 so are length mismatches stopped when they involve dimension parameters. 228 While \VRef[Figure]{f:f Harness} shows successfully calling a function @f@ expecting two arrays of the same length,223 While \VRef[Figure]{f:fExample} shows successfully calling a function @f@ expecting two arrays of the same length, 229 224 \begin{cfa} 230 225 array( bool, N ) & f( array( float, N ) &, array( float, N ) & ); … … 246 241 The same argument safety and the associated implicit communication of array length occurs. 247 242 Preexisting \CFA allowed aggregate types to be generalized with type parameters, enabling parameterizing of element types. 248 Now, \CFA also allows parameterizing them by length. 249 Doing so gives a refinement of C's ``flexible array member'' pattern[TODO: cite ARM 6.7.2.1 pp18]\cite{arr:gnu-flex-mbr}. 250 While a C flexible array member can only occur at the end of the enclosing structure, 251 \CFA allows length-parameterized array members to be nested at arbitrary locations. 252 This flexibility, in turn, allows for multiple array members. 243 This has been extended to allow parameterizing by dimension. 244 Doing so gives a refinement of C's ``flexible array member''~\cite[\S~6.7.2.1.18]{C11}. 245 \begin{cfa} 246 struct S { 247 ... 248 double d []; // incomplete array type => flexible array member 249 } * s = malloc( sizeof( struct S ) + sizeof( double [10] ) ); 250 \end{cfa} 251 which creates a VLA of size 10 @double@s at the end of the structure. 252 A C flexible array member can only occur at the end of a structure; 253 \CFA allows length-parameterized array members to be nested at arbitrary locations, with intervening member declarations. 253 254 \lstinput{10-15}{hello-accordion.cfa} 254 255 The structure has course- and student-level metatdata (their respective field names) and a position-based preferences' matrix. 255 256 Its layout has the starting offset of @studentIds@ varying according to the generic parameter @C@, and the offset of @preferences@ varying according to both generic parameters. 256 257 257 \VRef[Figure]{f:check Harness} shows a program main using @School@ and results with different array sizes.258 \VRef[Figure]{f:checkExample} shows a program main using @School@ and results with different array sizes. 258 259 The @school@ variable holds many students' course-preference forms. 259 260 It is on the stack and its initialization does not use any casting or size arithmetic. … … 285 286 \end{cquote} 286 287 287 \caption{\lstinline{School} harness, input and output}288 \label{f:check Harness}288 \caption{\lstinline{School} Example, Input and Output} 289 \label{f:checkExample} 289 290 \end{figure} 290 291 291 292 When a function operates on a @School@ structure, the type system handles its memory layout transparently. 292 293 \lstinput{30-37}{hello-accordion.cfa} 293 In the example, this @getPref@ function answers, for the student at position @is@, what is the position of its@pref@\textsuperscript{th}-favoured class?294 In the example, function @getPref@ returns, for the student at position @is@, what is the position of their @pref@\textsuperscript{th}-favoured class? 294 295 295 296 … … 324 325 325 326 The repurposed heavy equipment is 326 \begin{itemize} 327 \begin{itemize}[leftmargin=*] 327 328 \item 328 329 Resolver provided values for a used declaration's type-system variables, … … 348 349 int main() { 349 350 thing( @10@ ) x; f( x ); $\C{// prints 10, [4]}$ 350 thing( 100 ) y; f( y ); $\C{// prints 100}$ 351 return 0; 351 thing( @100@ ) y; f( y ); $\C{// prints 100}$ 352 352 } 353 353 \end{cfa} 354 354 This example has: 355 \begin{enumerate} 355 \begin{enumerate}[leftmargin=*] 356 356 \item 357 357 The symbol @N@ being declared as a type variable (a variable of the type system). … … 369 369 Because the box pass handles a type's size as its main datum, the encoding is chosen to use it. 370 370 The production and recovery are then straightforward. 371 \begin{itemize} 371 \begin{itemize}[leftmargin=*] 372 372 \item 373 373 The value $n$ is encoded as a type whose size is $n$. 374 374 \item 375 Given a dimension expression $e$, produce type @char[@$e$@]@ to represent it.375 Given a dimension expression $e$, produce an internal type @char[@$e$@]@ to represent it. 376 376 If $e$ evaluates to $n$ then the encoded type has size $n$. 377 377 \item … … 389 389 } 390 390 int main() { 391 thing( char[@10@] ) x; f( x ); $\C{// prints 10, [4]}$ 392 thing( char[100] ) y; f( y ); $\C{// prints 100}$ 393 return 0; 391 thing( @char[10]@ ) x; f( x ); $\C{// prints 10, [4]}$ 392 thing( @char[100]@ ) y; f( y ); $\C{// prints 100}$ 394 393 } 395 394 \end{cfa} 396 395 Observe: 397 \begin{enumerate} 396 \begin{enumerate}[leftmargin=*] 398 397 \item 399 398 @N@ is now declared to be a type. 400 It is declared to be \emph{sized} (by the @*@), meaning that the box pass shall do its @sizeof(N)@ --@__sizeof_N@ extra parameter and expression translation.399 It is declared to be \emph{sized} (by the @*@), meaning that the box pass shall do its @sizeof(N)@$\rightarrow$@__sizeof_N@ extra parameter and expression translation. 401 400 \item 402 401 @thing(N)@ is a type; the argument to the generic @thing@ is a type (type variable). … … 404 403 The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is an ordinary expression. 405 404 \item 406 The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type of bytes, @char @).405 The type of variable @x@ is another @thing(-)@ type; the argument to the generic @thing@ is a type (array type of bytes, @char[@$e$@]@). 407 406 \end{enumerate} 408 407 … … 415 414 struct __conc_thing_10 {} x; f( @10@, &x ); $\C{// prints 10, [4]}$ 416 415 struct __conc_thing_100 {} y; f( @100@, &y ); $\C{// prints 100}$ 417 return 0;418 416 } 419 417 \end{cfa} 420 418 Observe: 421 \begin{enumerate} 419 \begin{enumerate}[leftmargin=*] 422 420 \item 423 421 The type parameter @N@ is gone. … … 427 425 The @sout...@ expression (being an application of the @?|?@ operator) has a regular variable (parameter) usage for its second argument. 428 426 \item 429 Information about the particular @thing@ instantiation (value 10) has moved, from the type, to a regular function-call argument.427 Information about the particular @thing@ instantiation (value 10) is moved, from the type, to a regular function-call argument. 430 428 \end{enumerate} 431 429 At the end of the desugaring and downstream processing, the original C idiom of ``pass both a length parameter and a pointer'' has been reconstructed. … … 433 431 The compiler's action produces the more complex form, which if handwritten, would be error-prone. 434 432 435 Back at the compiler front end, the parsing changes AST schema extensions and validation rules for enabling the sugared user input.436 \begin{itemize} 433 At the compiler front end, the parsing changes AST schema extensions and validation rules for enabling the sugared user input. 434 \begin{itemize}[leftmargin=*] 437 435 \item 438 436 Recognize the form @[N]@ as a type-variable declaration within a @forall@. … … 440 438 Have the new brand of type-variable, \emph{Dimension}, in the AST form of a type-variable, to represent one parsed from @[-]@. 441 439 \item 442 Allow a type variable to occur in an expression. Validate (after parsing) that only dimension-branded type 440 Allow a type variable to occur in an expression. Validate (after parsing) that only dimension-branded type-variables are used here. 443 441 \item 444 442 Allow an expression to occur in type-argument position. Brand the resulting type argument as a dimension. … … 457 455 \label{s:ArrayTypingC} 458 456 459 Essential in giving a guarantee of accurate length is the compiler's ability 460 to reject a program that presumes to mishandle length. 461 By contrast, most discussion so far dealt with communicating length, 462 from one party who knows it, to another who is willing to work with any given length. 463 For scenarios where the concern is a mishandled length, 464 the interaction is between two parties who both claim to know something about it. 465 Such a scenario occurs in this pure C fragment, which today's C compilers accept: 466 \begin{cfa} 467 int n = @42@; 468 float x[n]; 469 float (*xp)[@999@] = &x; 457 Essential in giving a guarantee of accurate length is the compiler's ability to reject a program that presumes to mishandle length. 458 By contrast, most discussion so far deals with communicating length, from one party who knows it, to another willing to work with any given length. 459 For scenarios where the concern is a mishandled length, the interaction is between two parties who both claim to know something about it. 460 461 C and \CFA can check when working with two static values. 462 \begin{cfa} 463 enum { n = 42 }; 464 float x[@n@]; // or just 42 465 float (*xp1)[@42@] = &x; // accept 466 float (*xp2)[@999@] = &x; // reject 467 warning: initialization of 'float (*)[999]' from incompatible pointer type 'float (*)[42]' 468 \end{cfa} 469 When a variable is involved, C and \CFA take two different approaches. 470 Today's C compilers accept the following without warning. 471 \begin{cfa} 472 static const int n = 42; 473 float x[@n@]; 474 float (* xp)[@999@] = &x; $\C{// should be static rejection here}$ 470 475 (*xp)[@500@]; $\C{// in "bound"?}$ 471 476 \end{cfa} 472 477 Here, the array @x@ has length 42, while a pointer to it (@xp@) claims length 999. 473 So, while the subscript of @xp@ at position 500 is out of bound ofits referent @x@,478 So, while the subscript of @xp@ at position 500 is out of bound with its referent @x@, 474 479 the access appears in-bound of the type information available on @xp@. 475 Truly, length is being mishandled in the previous step, 476 where the type-carried length information on @x@ is not compatible with that of @xp@. 477 478 The \CFA new-array rejects the analogous case: 479 \begin{cfa} 480 int n = @42@; 481 array(float, n) x; 482 array(float, 999) * xp = x; $\C{// static rejection here}$ 483 (*xp)[@500@]; $\C{// runtime check vs len 999}$ 484 \end{cfa} 485 The way the \CFA array is implemented, the type analysis of this case reduces to a case similar to the earlier C version. 480 In fact, length is being mishandled in the previous step, where the type-carried length information on @x@ is not compatible with that of @xp@. 481 482 In \CFA, I choose to reject this C example at the point where the type-carried length information on @x@ is not compatible with that of @xp@, and correspondingly, its array counterpart at the same location: 483 \begin{cfa} 484 static const int n = 42; 485 array( float, @n@ ) x; 486 array( float, @999@ ) * xp = &x; $\C{// static rejection here}$ 487 (*xp)[@500@]; $\C{// runtime check passes}$ 488 \end{cfa} 489 The way the \CFA array is implemented, the type analysis for this case reduces to a case similar to the earlier C version. 486 490 The \CFA compiler's compatibility analysis proceeds as: 487 491 \begin{itemize}[parsep=0pt] 488 492 \item 489 Is @array(float, 999)@ type-compatible with @array(float, n)@? 490 \item 491 Is @arrayX(float, char[999])@ type-compatible with @arrayX(float, char[n])@?\footnote{ 492 Here, \lstinline{arrayX} represents the type that results 493 from desugaring the \lstinline{array} type 494 into a type whose generic parameters are all types. 495 This presentation elides the noisy fact that 496 \lstinline{array} is actually a macro for something bigger; 497 the reduction to \lstinline{char[-]} still proceeds as sketched.} 498 \item 499 Is @char[999]@ type-compatible with @char[n]@? 493 Is @array( float, 999 )@ type-compatible with @array( float, n )@? 494 \item 495 Is desugared @array( float, char[999] )@ type-compatible with desugared @array( float, char[n] )@? 496 % \footnote{ 497 % Here, \lstinline{arrayX} represents the type that results from desugaring the \lstinline{array} type into a type whose generic parameters are all types. 498 % This presentation elides the noisy fact that \lstinline{array} is actually a macro for something bigger; 499 % the reduction to \lstinline{char [-]} still proceeds as sketched.} 500 \item 501 Is internal type @char[999]@ type-compatible with internal type @char[n]@? 500 502 \end{itemize} 501 To achieve the necessary \CFA rejections meant rejecting the corresponding C case, which is not backward compatible. 502 There are two complementary mitigations for this incompatibility. 503 504 First, a simple recourse is available to a programmer who intends to proceed 505 with the statically unsound assignment. 506 This situation might arise if @n@ were known to be 999, 507 rather than 42, as in the introductory examples. 508 The programmer can add a cast in the \CFA code. 509 \begin{cfa} 510 xp = @(float (*)[999])@ &x; 511 \end{cfa} 512 This addition causes \CFA to accept, because now, the programmer has accepted blame. 513 This addition is benign in plain C, because the cast is valid, just unnecessary there. 514 Moreover, the addition can even be seen as appropriate ``eye candy,'' 515 marking where the unchecked length knowledge is used. 516 Therefore, a program being onboarded to \CFA can receive a simple upgrade, 517 to satisfy the \CFA rules (and arguably become clearer), 518 without giving up its validity to a plain C compiler. 519 520 Second, the incompatibility only affects types like pointer-to-array, 521 which are are infrequently used in C. 522 The more common C idiom for aliasing an array is to use a pointer-to-first-element type, 523 which does not participate in the \CFA array's length checking.\footnote{ 503 The answer is false because, in general, the value of @n@ is unknown at compile time, and hence, an error is raised. 504 For safety, it makes sense to reject the corresponding C case, which is a non-backwards compatible change. 505 There are two mitigations for this incompatibility. 506 507 First, a simple recourse is available in a situation where @n@ is \emph{known} to be 999 by using a cast. 508 \begin{cfa} 509 float (* xp)[999] = @(float (*)[999])@&x; 510 \end{cfa} 511 The cast means the programmer has accepted blame. 512 Moreover, the cast is ``eye candy'' marking where the unchecked length knowledge is used. 513 Therefore, a program being onboarded to \CFA requires some upgrading to satisfy the \CFA rules (and arguably become clearer), without giving up its validity to a plain C compiler. 514 Second, the incompatibility only affects types like pointer-to-array, which are infrequently used in C. 515 The more common C idiom for aliasing an array is to use a pointer-to-first-element type, which does not participate in the \CFA array's length checking.\footnote{ 524 516 Notably, the desugaring of the \lstinline{array} type avoids letting any \lstinline{-[-]} type decay, 525 517 in order to preserve the length information that powers runtime bound-checking.} 526 Therefore, the frequency of needing to upgrade legacy C code (as discussed in the first mitigation) 527 is anticipated to be low. 528 529 Because the incompatibility represents a low cost to a \CFA onboarding effort 530 (with a plausible side benefit of linting the original code for a missing annotation), 531 no special measures were added to retain the compatibility. 532 It would be possible to flag occurrences of @-[-]@ types that come from @array@ desugaring, 533 treating those with stricter \CFA rules, while treating others with classic C rules. 534 If future lessons from C project onboarding warrant it, 535 this special compatibility measure can be added. 536 537 Having allowed that both the initial C example's check 538 \begin{itemize} 539 \item 540 Is @float[999]@ type-compatible with @float[n]@? 541 \end{itemize} 542 and the second \CFA example's induced check 543 \begin{itemize} 544 \item 545 Is @char[999]@ type-compatible with @char[n]@? 546 \end{itemize} 547 shall have the same answer, (``no''), 548 discussion turns to how I got the \CFA compiler to produce this answer. 549 In its preexisting form, it produced a (buggy) approximation of the C rules. 550 To implement the new \CFA rules, I took the syntactic recursion a step further, obtaining, 551 in both cases: 552 \begin{itemize} 553 \item 554 Is @999@ compatible with @n@? 555 \end{itemize} 556 This compatibility question applies to a pair of expressions, where the earlier implementation were to types. 557 Such an expression-compatibility question is a new addition to the \CFA compiler. 558 Note, these questions only arise in the context of dimension expressions on (C) array types. 559 560 TODO: ensure these compiler implementation matters are treated under \CFA compiler background: 561 type unification, 562 cost calculation, 563 GenPoly. 564 565 The relevant technical component of the \CFA compiler is the type unification procedure within the type resolver. 566 I added rules for continuing this unification into expressions that occur within types. 567 It is still fundamentally doing \emph{type} unification 568 because it is participating in binding type variables, 569 and not participating in binding any variables that stand in for expression fragments 570 (for there is no such sort of variable in \CFA's analysis.) 571 An unfortunate fact about the \CFA compiler's preexisting implementation is that 572 type unification suffers from two forms of duplication. 573 574 The first duplication has (many of) the unification rules stated twice. 575 As a result, my additions for dimension expressions are stated twice. 576 The extra statement of the rules occurs in the @GenPoly@ module, 577 where concrete types like @array(int, 5)@\footnote{ 578 Again, the presentation is simplified 579 by leaving the \lstinline{array} macro unexpanded.} 580 are lowered into corresponding C types @struct __conc_array_1234@ (the suffix being a generated index). 581 In this case, the struct's definition contains fields that hardcode the argument values of @float@ and @5@. 582 The next time an @array(-,-)@ concrete instance is encountered, it checks if the previous @struct __conc_array_1234@ is suitable for it. 583 Yes, for another occurrence of @array(int, 5)@; 584 no, for either @array(rational(int), 5)@ or @array(int, 42)@. 585 By the last example, this phase must ``reject'' 586 the hypothesis that it should reuse the dimension-5 instance's C-lowering for a dimension-42 instance. 587 588 The second duplication has unification (proper) being invoked at two stages of expression resolution. 589 As a result, my added rule set needs to handle more cases than the preceding discussion motivates. 590 In the program 591 \begin{cfa} 592 void @f@( double ); 593 forall( T & ) void @f@( T & ); 594 void g( int n ) { 595 array( float, n + 1 ) x; 596 f(x); // overloaded 597 } 598 \end{cfa} 599 when resolving the function call, @g@, the first unification stage 600 compares the type @T@ of the parameter with @array( float, n + 1 )@, of the argument. 601 TODO: finish. 602 603 The actual rules for comparing two dimension expressions are conservative. 604 To answer, ``yes, consider this pair of expressions to be matching,'' 605 is to imply, ``all else being equal, allow an array with length calculated by $e_1$ 606 to be passed to a function expecting a length-$e_2$ array.''\footnote{ 607 TODO: Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping. 608 Should it be an earlier scoping principle? Feels like it should matter in more places than here.} 609 So, a ``yes'' answer must represent a guarantee that both expressions evaluate the 610 same result, while a ``no'' can tolerate ``they might, but we're not sure'', 611 provided that practical recourses are available 612 to let programmers express better knowledge. 613 The new rule-set in the current release is, in fact, extremely conservative. 614 I chose to keep things simple, 615 and allow future needs to drive adding additional complexity, within the new framework. 616 617 For starters, the original motivating example's rejection 618 is not based on knowledge that 619 the @xp@ length of (the literal) 999 is value-unequal to 620 the (obvious) runtime value of the variable @n@, which is the @x@ length. 621 Rather, the analysis assumes a variable's value can be anything, 622 and so there can be no guarantee that its value is 999. 623 So, a variable and a literal can never match. 624 625 Two occurrences of the same literal value are obviously a fine match. 626 For two occurrences of the same variable, more information is needed. 627 For example, this one is fine 628 \begin{cfa} 629 void f( const int n ) { 630 float x[n]; 631 float (*xp)[n] = x; // accept 632 } 633 \end{cfa} 634 while this one is not: 635 \begin{cfa} 518 Therefore, the need to upgrade legacy C code is low. 519 Finally, if this incompatibility is a problem onboarding C programs to \CFA, it is should be possible to change the C type check to a warning rather than an error, acting as a \emph{lint} of the original code for a missing type annotation. 520 521 To handle two occurrences of the same variable, more information is needed, \eg, this is fine, 522 \begin{cfa} 523 int n = 42; 524 float x[@n@]; 525 float (*xp)[@n@] = x; // accept 526 \end{cfa} 527 where @n@ remains fixed across a contiguous declaration context. 528 However, intervening dynamic statement cause failures. 529 \begin{cfa} 530 int n = 42; 531 float x[@n@]; 532 @n@ = 999; // dynamic change 533 float (*xp)[@n@] = x; // reject 534 \end{cfa} 535 However, side-effects can occur in a contiguous declaration context. 536 \begin{cquote} 537 \setlength{\tabcolsep}{20pt} 538 \begin{tabular}{@{}ll@{}} 539 \begin{cfa} 540 // compile unit 1 541 extern int @n@; 542 extern float g(); 636 543 void f() { 637 int n = 42; 638 float x[n]; 639 n = 999; 640 float (*xp)[n] = x; // reject 641 } 642 \end{cfa} 643 Furthermore, the fact that the first example sees @n@ as @const@ 644 is not actually sufficient. 645 In this example, @f@'s length expression's declaration is as @const@ as it can be, 646 yet its value still changes between the two invocations: 647 \begin{cquote} 648 \setlength{\tabcolsep}{15pt} 649 \begin{tabular}{@{}ll@{}} 650 \begin{cfa} 651 // compile unit 1 652 void g(); 653 void f( const int & const nr ) { 654 float x[nr]; 655 g(); // change n 656 @float (*xp)[nr] = x;@ // reject 544 float x[@n@] = { g() }; 545 float (*xp)[@n@] = x; // reject 657 546 } 658 547 \end{cfa} … … 660 549 \begin{cfa} 661 550 // compile unit 2 662 static int n= 42;551 int @n@ = 42; 663 552 void g() { 664 n= 99;665 } 666 667 f( n ); 553 @n@ = 99; 554 } 555 556 668 557 \end{cfa} 669 558 \end{tabular} … … 671 560 The issue here is that knowledge needed to make a correct decision is hidden by separate compilation. 672 561 Even within a translation unit, static analysis might not be able to provide all the information. 673 674 My rule set also respects a traditional C feature: In spite of the several limitations of the C rules 675 accepting cases that produce different values, there are a few mismatches that C stops. 676 C is quite precise when working with two static values. 677 \begin{cfa} 678 enum { fortytwo = 42 }; 679 float x[fortytwo]; 680 float (*xp1)[42] = &x; // accept 681 float (*xp2)[999] = &x; // reject 682 \end{cfa} 683 My \CFA rules agree with C's on these cases. 562 However, if the example uses @const@, the check is possible. 563 \begin{cquote} 564 \setlength{\tabcolsep}{20pt} 565 \begin{tabular}{@{}ll@{}} 566 \begin{cfa} 567 // compile unit 1 568 extern @const@ int n; 569 extern float g(); 570 void f() { 571 float x[n] = { g() }; 572 float (*xp)[n] = x; // reject 573 } 574 \end{cfa} 575 & 576 \begin{cfa} 577 // compile unit 2 578 @const@ int n = 42; 579 void g() { 580 @n = 99@; // allowed 581 } 582 583 584 \end{cfa} 585 \end{tabular} 586 \end{cquote} 684 587 685 588 In summary, the new rules classify expressions into three groups: 686 589 \begin{description} 687 590 \item[Statically Evaluable] 688 Expressions for which a specific value can be calculated (conservatively) 689 at compile-time. 690 A preexisting \CFA compiler module defines which literals, enumerators, and expressions qualify, 691 and evaluates them. 591 Expressions for which a specific value can be calculated (conservatively) at compile-time. 592 A preexisting \CFA compiler module defines which literals, enumerators, and expressions qualify and evaluates them. 692 593 \item[Dynamic but Stable] 693 594 The value of a variable declared as @const@, including a @const@ parameter. 694 595 \item[Potentially Unstable] 695 596 The catch-all category. Notable examples include: 696 any function-call result, @float x[foo()];@, 697 the particular function-call result that is a pointer dereference, @void f(const int * n)@ @{ float x[*n]; }@, and 698 any use of a reference-typed variable. 597 any function-call result, @float x[foo()]@, the particular function-call result that is a pointer dereference, @void f(const int * n)@ @{ float x[*n]; }@, and any use of a reference-typed variable. 699 598 \end{description} 700 599 Within these groups, my \CFA rules are: 701 \begin{itemize} 600 \begin{itemize}[leftmargin=*] 702 601 \item 703 602 Accept a Statically Evaluable pair, if both expressions have the same value. … … 710 609 \end{itemize} 711 610 The traditional C rules are: 712 \begin{itemize} 611 \begin{itemize}[leftmargin=*] 713 612 \item 714 613 Reject a Statically Evaluable pair, if the expressions have two different values. … … 716 615 Otherwise, accept. 717 616 \end{itemize} 617 \VRef[Figure]{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets. 618 It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafe. 619 It also shows that C-incompatibilities only occur in cases that C treats unsafe. 718 620 719 621 \begin{figure} … … 746 648 where \lstinline{expr1} and \lstinline{expr2} are meta-variables varying according to the row's Case. 747 649 Each row's claim applies to other harnesses too, including, 748 \begin{itemize} 650 \begin{itemize}[leftmargin=*] 749 651 \item 750 652 calling a function with a parameter like \lstinline{x} and an argument of the \lstinline{xp} type, … … 762 664 The table treats symbolic identity (Same/Different on rows) 763 665 apart from value equality (Equal/Unequal on columns). 764 \begin{itemize} 666 \begin{itemize}[leftmargin=*] 765 667 \item 766 668 The expressions \lstinline{1}, \lstinline{0+1} and \lstinline{n} … … 781 683 \end{figure} 782 684 783 784 \VRef[Figure]{f:DimexprRuleCompare} gives a case-by-case comparison of the consequences of these rule sets. 785 It demonstrates that the \CFA false alarms occur in the same cases as C treats unsafe. 786 It also shows that C-incompatibilities only occur in cases that C treats unsafe. 787 788 789 The conservatism of the new rule set can leave a programmer needing a recourse, 790 when needing to use a dimension expression whose stability argument 791 is more subtle than current-state analysis. 685 \begin{comment} 686 Given that the above check 687 \begin{itemize} 688 \item 689 Is internal type @char[999]@ type-compatible with internal type @char[n]@? 690 \end{itemize} 691 answers false, discussion turns to how I got the \CFA compiler to produce this answer. 692 In its preexisting form, the type system had a buggy approximation of the C rules. 693 To implement the new \CFA rules, I added one further step. 694 \begin{itemize} 695 \item 696 Is @999@ compatible with @n@? 697 \end{itemize} 698 This question applies to a pair of expressions, where the earlier question applies to types. 699 An expression-compatibility question is a new addition to the \CFA compiler, and occurs in the context of dimension expressions, and possibly enumerations assigns, which must be unique. 700 701 % TODO: ensure these compiler implementation matters are treated under \CFA compiler background: type unification, cost calculation, GenPoly. 702 703 The relevant technical component of the \CFA compiler is the standard type-unification within the type resolver. 704 \begin{cfa} 705 example 706 \end{cfa} 707 I added rules for continuing this unification into expressions that occur within types. 708 It is still fundamentally doing \emph{type} unification because it is participating in binding type variables, and not participating in binding any variables that stand in for expression fragments (for there is no such sort of variable in \CFA's analysis.) 709 An unfortunate fact about the \CFA compiler's preexisting implementation is that type unification suffers from two forms of duplication. 710 711 In detail, the first duplication has (many of) the unification rules stated twice. 712 As a result, my additions for dimension expressions are stated twice. 713 The extra statement of the rules occurs in the @GenPoly@ module, where concrete types like @array( int, 5 )@\footnote{ 714 Again, the presentation is simplified 715 by leaving the \lstinline{array} macro unexpanded.} 716 are lowered into corresponding C types @struct __conc_array_1234@ (the suffix being a generated index). 717 In this case, the struct's definition contains fields that hardcode the argument values of @float@ and @5@. 718 The next time an @array( -, - )@ concrete instance is encountered, it checks if the previous @struct __conc_array_1234@ is suitable for it. 719 Yes, for another occurrence of @array( int, 5 )@; 720 no, for examples like @array( int, 42 )@ or @array( rational(int), 5 )@. 721 In the first example, it must reject the reuse hypothesis for a dimension-@5@ and a dimension-@42@ instance. 722 723 The second duplication has unification (proper) being invoked at two stages of expression resolution. 724 As a result, my added rule set needs to handle more cases than the preceding discussion motivates. 725 In the program 726 \begin{cfa} 727 void @f@( double ); // overload 728 forall( T & ) void @f@( T & ); // overload 729 void g( int n ) { 730 array( float, n + 1 ) x; 731 f(x); // overloaded 732 } 733 \end{cfa} 734 when resolving a function call to @g@, the first unification stage compares the type @T@ of the parameter with @array( float, n + 1 )@, of the argument. 735 \PAB{TODO: finish.} 736 737 The actual rules for comparing two dimension expressions are conservative. 738 To answer, ``yes, consider this pair of expressions to be matching,'' 739 is to imply, ``all else being equal, allow an array with length calculated by $e_1$ 740 to be passed to a function expecting a length-$e_2$ array.''\footnote{ 741 TODO: Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping. 742 Should it be an earlier scoping principle? Feels like it should matter in more places than here.} 743 So, a ``yes'' answer must represent a guarantee that both expressions evaluate the 744 same result, while a ``no'' can tolerate ``they might, but we're not sure'', 745 provided that practical recourses are available 746 to let programmers express better knowledge. 747 The new rule-set in the current release is, in fact, extremely conservative. 748 I chose to keep things simple, 749 and allow future needs to drive adding additional complexity, within the new framework. 750 751 For starters, the original motivating example's rejection is not based on knowledge that the @xp@ length of (the literal) 999 is value-unequal to the (obvious) runtime value of the variable @n@, which is the @x@ length. 752 Rather, the analysis assumes a variable's value can be anything, and so there can be no guarantee that its value is 999. 753 So, a variable and a literal can never match. 754 755 TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation 756 \end{comment} 757 758 The conservatism of the new rule set can leave a programmer needing a recourse, when needing to use a dimension expression whose stability argument is more subtle than current-state analysis. 792 759 This recourse is to declare an explicit constant for the dimension value. 793 Consider these two dimension expressions, 794 whose reuses are rejected by the blunt current-state rules: 795 \begin{cfa} 796 void f( int & nr, const int nv ) { 797 float x[nr]; 798 float (*xp)[nr] = &x; // reject: nr varying (no references) 799 float y[nv + 1]; 800 float (*yp)[nv + 1] = &y; // reject: ?+? unpredictable (no functions) 760 Consider these two dimension expressions, whose uses are rejected by the blunt current-state rules: 761 \begin{cfa} 762 void f( int @&@ nr, @const@ int nv ) { 763 float x[@nr@]; 764 float (*xp)[@nr@] = &x; // reject: nr varying (no references) 765 float y[@nv + 1@]; 766 float (*yp)[@nv + 1@] = &y; // reject: ?+? unpredictable (no functions) 801 767 } 802 768 \end{cfa} 803 769 Yet, both dimension expressions are reused safely. 804 The @nr@ reference is never written, not volatile 805 and control does not leave the function between the uses. 806 The name @?+?@ resolves to a function that is quite predictable. 807 Here, the programmer can add the constant declarations (cast does not work): 770 The @nr@ reference is never written, not volatile meaning no implicit code (load) between declarations, and control does not leave the function between the uses. 771 As well, the build-in @?+?@ function is predictable. 772 To make these cases work, the programmer must add the follow constant declarations (cast does not work): 808 773 \begin{cfa} 809 774 void f( int & nr, const int nv ) { … … 819 784 achieved by adding a superfluous ``snapshot it as of now'' directive. 820 785 821 The snapshot ting trick is also used by thetranslation, though to achieve a different outcome.786 The snapshot trick is also used by the \CFA translation, though to achieve a different outcome. 822 787 Rather obviously, every array must be subscriptable, even a bizarre one: 823 788 \begin{cfa} 824 array( float, rand(10) ) x; 825 x[0]; // 10% chance of bound-check failure 826 \end{cfa} 827 Less obvious is that the mechanism of subscripting is a function call, 828 which must communicate length accurately. 829 The bound-check above (callee logic) must use the actual allocated length of @x@, 830 without mistakenly reevaluating the dimension expression, @rand(10)@. 789 array( float, @rand(10)@ ) x; 790 x[@0@]; // 10% chance of bound-check failure 791 \end{cfa} 792 Less obvious is that the mechanism of subscripting is a function call, which must communicate length accurately. 793 The bound-check above (callee logic) must use the actual allocated length of @x@, without mistakenly reevaluating the dimension expression, @rand(10)@. 831 794 Adjusting the example to make the function's use of length more explicit: 832 795 \begin{cfa} 833 forall 834 void f( T * x ) { sout | sizeof( *x); }796 forall( T * ) 797 void f( T * x ) { sout | sizeof( *x ); } 835 798 float x[ rand(10) ]; 836 799 f( x ); … … 840 803 void f( size_t __sizeof_T, void * x ) { sout | __sizeof_T; } 841 804 \end{cfa} 842 the translation must callthe dimension argument twice:805 the translation calls the dimension argument twice: 843 806 \begin{cfa} 844 807 float x[ rand(10) ]; 845 808 f( rand(10), &x ); 846 809 \end{cfa} 847 Rather, the translationis:810 The correct form is: 848 811 \begin{cfa} 849 812 size_t __dim_x = rand(10); … … 851 814 f( __dim_x, &x ); 852 815 \end{cfa} 853 The occurrence of this dimension hoisting during translation was in the preexisting\CFA compiler.854 But its cases werebuggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases.855 For example, when the programmer has already done so manually. \PAB{I don't know what this means.}816 Dimension hoisting already existed in the \CFA compiler. 817 But its was buggy, particularly with determining, ``Can hoisting the expression be skipped here?'', for skipping this hoisting is clearly desirable in some cases. 818 For example, when a programmer has already hoisted to perform an optimiation to prelude duplicate code (expression) and/or expression evaluation. 856 819 In the new implementation, these cases are correct, harmonized with the accept/reject criteria. 857 858 TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation859 820 860 821 … … 863 824 864 825 A multidimensional array implementation has three relevant levels of abstraction, from highest to lowest, where the array occupies \emph{contiguous memory}. 865 \begin{enumerate} 826 \begin{enumerate}[leftmargin=*] 866 827 \item 867 828 Flexible-stride memory: … … 1100 1061 Preexisting \CFA mechanisms achieve this requirement, but with poor performance. 1101 1062 Furthermore, advanced array users need an exception to the basic mechanism, which does not occur with other aggregates. 1102 Hence, arrays introduce sub leties in supporting an element's lifecycle.1063 Hence, arrays introduce subtleties in supporting an element's lifecycle. 1103 1064 1104 1065 The preexisting \CFA support for contained-element lifecycle is based on recursive occurrences of the object-type (@otype@) pseudo-trait. … … 1319 1280 The @worker@ type is designed this way to work with the threading system. 1320 1281 A thread type forks a thread at the end of each constructor and joins with it at the start of each destructor. 1321 But a @worker@ cannot begin its forked-th ead work without knowing its @id@.1282 But a @worker@ cannot begin its forked-thread work without knowing its @id@. 1322 1283 Therefore, there is a conflict between the implicit actions of the builtin @thread@ type and a user's desire to defer these actions. 1323 1284 … … 1460 1421 1461 1422 The \CFA array and the field of ``array language'' comparators all leverage dependent types to improve on the expressiveness over C and Java, accommodating examples such as: 1462 \begin{itemize} 1423 \begin{itemize}[leftmargin=*] 1463 1424 \item a \emph{zip}-style operation that consumes two arrays of equal length 1464 1425 \item a \emph{map}-style operation whose produced length matches the consumed length … … 1544 1505 The details in this presentation aren't meant to be taken too precisely as suggestions for how it should look in \CFA. 1545 1506 But the example shows these abilities: 1546 \begin{itemize} 1507 \begin{itemize}[leftmargin=*] 1547 1508 \item a built-in way (the @is_enum@ trait) for a generic routine to require enumeration-like information about its instantiating type 1548 1509 \item an implicit implementation of the trait whenever a user-written enum occurs (@weekday@'s declaration implies @is_enum@) -
doc/theses/mike_brooks_MMath/background.tex
r7d02d35 rbd72f517 995 995 Illustrated by pseudocode implementation of an STL-compatible API fragment using LQ as the underlying implementation. 996 996 The gap that makes it pseudocode is that 997 the LQ C macros do not expand to valid C++when instantiated with template parameters---there is no \lstinline{struct El}.997 the LQ C macros do not expand to valid \CC when instantiated with template parameters---there is no \lstinline{struct El}. 998 998 When using a custom-patched version of LQ to work around this issue, 999 999 the programs of \VRef[Figure]{f:WrappedRef} and wrapped value work with this shim in place of real STL. -
doc/theses/mike_brooks_MMath/benchmarks/string/result-append-pbv.csv
r7d02d35 rbd72f517 1 perfexp-cfa-pta-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,219460000,10.000260 2 perfexp-cfa-pta-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,180250000,10.000486 3 perfexp-cfa-pta-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,152790000,10.000441 4 perfexp-cfa-pta-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,206090000,10.000311 5 perfexp-cfa-pta-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,184330000,10.000328 6 perfexp-cfa-pta-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,125090000,10.000138 7 perfexp-cfa-pta-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,199130000,10.000180 8 perfexp-cfa-pta-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,167720000,10.000327 9 perfexp-cfa-pta-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,93560000,10.001058 10 perfexp-cfa-pta-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,225090000,10.000393 11 perfexp-cfa-pta-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,196300000,10.000221 12 perfexp-cfa-pta-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,150670000,10.000337 13 perfexp-cfa-pta-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,206600000,10.000182 14 perfexp-cfa-pta-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,188400000,10.000199 15 perfexp-cfa-pta-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,125880000,10.000489 16 perfexp-cfa-pta-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,185930000,10.000231 17 perfexp-cfa-pta-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,170660000,10.000491 18 perfexp-cfa-pta-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,91520000,10.000640 19 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,146200000,10.000520 20 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,114140000,10.000734 21 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,17630000,10.000889 22 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,139700000,10.000460 23 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,71910000,10.000768 24 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,8540000,10.009186 25 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,129810000,10.000379 26 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,45280000,10.000006 27 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,3300000,10.021088 28 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,146050000,10.000551 29 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,102800000,10.000490 30 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,17060000,10.001677 31 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,137470000,10.000361 32 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,69520000,10.001142 33 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,8830000,10.010528 34 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,117120000,10.000681 35 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,42960000,10.001950 36 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,3220000,10.010203 37 perfexp-cfa-peq-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,583560000,10.000070 38 perfexp-cfa-peq-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,451400000,10.000013 39 perfexp-cfa-peq-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,253260000,10.000275 40 perfexp-cfa-peq-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,483580000,10.000140 41 perfexp-cfa-peq-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,396550000,10.000060 42 perfexp-cfa-peq-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,199760000,10.000416 43 perfexp-cfa-peq-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,454790000,10.000069 44 perfexp-cfa-peq-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,339690000,10.000243 45 perfexp-cfa-peq-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,123840000,10.000724 46 perfexp-cfa-peq-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,577650000,10.000157 47 perfexp-cfa-peq-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,445260000,10.000186 48 perfexp-cfa-peq-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,259650000,10.000273 49 perfexp-cfa-peq-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,485650000,10.000026 50 perfexp-cfa-peq-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,386150000,10.000120 51 perfexp-cfa-peq-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,197690000,10.000077 52 perfexp-cfa-peq-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,443650000,10.000006 53 perfexp-cfa-peq-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,339190000,10.000037 54 perfexp-cfa-peq-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,122740000,10.000753 55 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,595700000,10.000119 56 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,452000000,10.000055 57 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,280570000,10.000281 58 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,501040000,10.000073 59 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,422280000,10.000131 60 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,235640000,10.000126 61 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,461250000,10.000197 62 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,369020000,10.000057 63 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,135050000,10.000682 64 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,529900000,10.000150 65 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,408530000,10.000108 66 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,217530000,10.000334 67 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,463860000,10.000166 68 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,360110000,10.000008 69 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,176490000,10.000131 70 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,424710000,10.000106 71 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,290930000,10.000172 72 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,90430000,10.000065 73 perfexp-cfa-pbv-ll-share-na,corpus-100-1-1.txt,xxx,100,1.000000,578040000,10.000159 74 perfexp-cfa-pbv-ll-share-na,corpus-100-10-1.txt,xxx,100,9.500000,573200000,10.000098 75 perfexp-cfa-pbv-ll-share-na,corpus-100-100-1.txt,xxx,100,106.370000,575160000,10.000149 76 perfexp-cfa-pbv-ll-share-na,corpus-100-2-1.txt,xxx,100,2.030000,573780000,10.000134 77 perfexp-cfa-pbv-ll-share-na,corpus-100-20-1.txt,xxx,100,22.960000,574500000,10.000156 78 perfexp-cfa-pbv-ll-share-na,corpus-100-200-1.txt,xxx,100,177.280000,577170000,10.000125 79 perfexp-cfa-pbv-ll-share-na,corpus-100-5-1.txt,xxx,100,5.270000,577820000,10.000046 80 perfexp-cfa-pbv-ll-share-na,corpus-100-50-1.txt,xxx,100,43.320000,578770000,10.000033 81 perfexp-cfa-pbv-ll-share-na,corpus-100-500-1.txt,xxx,100,557.260000,579540000,10.000128 82 perfexp-cfa-pbv-ll-noshare-na,corpus-100-1-1.txt,xxx,100,1.000000,191420000,10.000232 83 perfexp-cfa-pbv-ll-noshare-na,corpus-100-10-1.txt,xxx,100,9.500000,186330000,10.000046 84 perfexp-cfa-pbv-ll-noshare-na,corpus-100-100-1.txt,xxx,100,106.370000,164610000,10.000463 85 perfexp-cfa-pbv-ll-noshare-na,corpus-100-2-1.txt,xxx,100,2.030000,182390000,10.000409 86 perfexp-cfa-pbv-ll-noshare-na,corpus-100-20-1.txt,xxx,100,22.960000,182280000,10.000252 87 perfexp-cfa-pbv-ll-noshare-na,corpus-100-200-1.txt,xxx,100,177.280000,149840000,10.000281 88 perfexp-cfa-pbv-ll-noshare-na,corpus-100-5-1.txt,xxx,100,5.270000,152370000,10.000284 89 perfexp-cfa-pbv-ll-noshare-na,corpus-100-50-1.txt,xxx,100,43.320000,177430000,10.000397 90 perfexp-cfa-pbv-ll-noshare-na,corpus-100-500-1.txt,xxx,100,557.260000,113440000,10.000150 91 perfexp-stl-pta-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,152870000,10.000280 92 perfexp-stl-pta-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,98530000,10.000299 93 perfexp-stl-pta-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,16690000,10.005783 94 perfexp-stl-pta-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,136230000,10.000196 95 perfexp-stl-pta-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,62110000,10.001423 96 perfexp-stl-pta-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,8960000,10.005548 97 perfexp-stl-pta-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,104790000,10.000889 98 perfexp-stl-pta-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,39170000,10.000011 99 perfexp-stl-pta-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,3100000,10.015093 100 perfexp-stl-pta-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,154450000,10.000054 101 perfexp-stl-pta-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,96570000,10.000834 102 perfexp-stl-pta-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,16400000,10.000697 103 perfexp-stl-pta-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,133450000,10.000440 104 perfexp-stl-pta-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,62540000,10.001476 105 perfexp-stl-pta-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,8960000,10.006817 106 perfexp-stl-pta-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,106470000,10.000109 107 perfexp-stl-pta-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,37460000,10.000100 108 perfexp-stl-pta-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,3090000,10.000541 109 perfexp-stl-peq-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,863350000,10.000092 110 perfexp-stl-peq-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,471070000,10.000189 111 perfexp-stl-peq-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,287660000,10.000105 112 perfexp-stl-peq-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,669380000,10.000082 113 perfexp-stl-peq-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,432290000,10.000131 114 perfexp-stl-peq-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,241690000,10.000290 115 perfexp-stl-peq-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,510990000,10.000082 116 perfexp-stl-peq-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,396380000,10.000235 117 perfexp-stl-peq-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,135830000,10.000603 118 perfexp-stl-peq-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,785420000,10.000062 119 perfexp-stl-peq-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,418030000,10.000094 120 perfexp-stl-peq-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,225290000,10.000237 121 perfexp-stl-peq-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,550120000,10.000151 122 perfexp-stl-peq-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,386080000,10.000206 123 perfexp-stl-peq-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,176890000,10.000155 124 perfexp-stl-peq-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,441830000,10.000135 125 perfexp-stl-peq-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,310200000,10.000299 126 perfexp-stl-peq-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,90360000,10.000474 127 perfexp-stl-pbv-na-na-na,corpus-100-1-1.txt,xxx,100,1.000000,1267670000,10.000039 128 perfexp-stl-pbv-na-na-na,corpus-100-10-1.txt,xxx,100,9.500000,482210000,10.000013 129 perfexp-stl-pbv-na-na-na,corpus-100-100-1.txt,xxx,100,106.370000,268680000,10.000097 130 perfexp-stl-pbv-na-na-na,corpus-100-2-1.txt,xxx,100,2.030000,806650000,10.000104 131 perfexp-stl-pbv-na-na-na,corpus-100-20-1.txt,xxx,100,22.960000,369490000,10.000159 132 perfexp-stl-pbv-na-na-na,corpus-100-200-1.txt,xxx,100,177.280000,227020000,10.000244 133 perfexp-stl-pbv-na-na-na,corpus-100-5-1.txt,xxx,100,5.270000,534150000,10.000061 134 perfexp-stl-pbv-na-na-na,corpus-100-50-1.txt,xxx,100,43.320000,298950000,10.000190 135 perfexp-stl-pbv-na-na-na,corpus-100-500-1.txt,xxx,100,557.260000,158310000,10.000104 1 perfexp-cfa-pta-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,220120000,10.000178 2 perfexp-cfa-pta-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,177430000,10.000414 3 perfexp-cfa-pta-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,142410000,10.000162 4 perfexp-cfa-pta-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,195500000,10.000161 5 perfexp-cfa-pta-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,164560000,10.000548 6 perfexp-cfa-pta-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,122260000,10.000279 7 perfexp-cfa-pta-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,193960000,10.000071 8 perfexp-cfa-pta-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,163430000,10.000175 9 perfexp-cfa-pta-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,87960000,10.001073 10 perfexp-cfa-pta-ll-share-reuse,corpus-1-1-1.txt,100,1,1.000000,224420000,10.000135 11 perfexp-cfa-pta-ll-share-reuse,corpus-1-10-1.txt,100,1,10.000000,223740000,10.000014 12 perfexp-cfa-pta-ll-share-reuse,corpus-1-100-1.txt,100,1,100.000000,153300000,10.000091 13 perfexp-cfa-pta-ll-share-reuse,corpus-1-2-1.txt,100,1,2.000000,223430000,10.000120 14 perfexp-cfa-pta-ll-share-reuse,corpus-1-20-1.txt,100,1,20.000000,210640000,10.000385 15 perfexp-cfa-pta-ll-share-reuse,corpus-1-200-1.txt,100,1,200.000000,129790000,10.000596 16 perfexp-cfa-pta-ll-share-reuse,corpus-1-5-1.txt,100,1,5.000000,222850000,10.000361 17 perfexp-cfa-pta-ll-share-reuse,corpus-1-50-1.txt,100,1,50.000000,201700000,10.000220 18 perfexp-cfa-pta-ll-share-reuse,corpus-1-500-1.txt,100,1,500.000000,110000000,10.000407 19 perfexp-cfa-pta-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,225030000,10.000360 20 perfexp-cfa-pta-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,192640000,10.000254 21 perfexp-cfa-pta-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,143960000,10.000633 22 perfexp-cfa-pta-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,204500000,10.000450 23 perfexp-cfa-pta-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,185400000,10.000274 24 perfexp-cfa-pta-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,126420000,10.000791 25 perfexp-cfa-pta-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,194450000,10.000396 26 perfexp-cfa-pta-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,173140000,10.000364 27 perfexp-cfa-pta-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,92390000,10.000098 28 perfexp-cfa-pta-ll-share-fresh,corpus-1-1-1.txt,100,1,1.000000,222210000,10.000426 29 perfexp-cfa-pta-ll-share-fresh,corpus-1-10-1.txt,100,1,10.000000,209110000,10.000235 30 perfexp-cfa-pta-ll-share-fresh,corpus-1-100-1.txt,100,1,100.000000,154750000,10.000076 31 perfexp-cfa-pta-ll-share-fresh,corpus-1-2-1.txt,100,1,2.000000,222030000,10.000114 32 perfexp-cfa-pta-ll-share-fresh,corpus-1-20-1.txt,100,1,20.000000,208680000,10.000050 33 perfexp-cfa-pta-ll-share-fresh,corpus-1-200-1.txt,100,1,200.000000,133490000,10.000231 34 perfexp-cfa-pta-ll-share-fresh,corpus-1-5-1.txt,100,1,5.000000,217740000,10.000425 35 perfexp-cfa-pta-ll-share-fresh,corpus-1-50-1.txt,100,1,50.000000,200340000,10.000126 36 perfexp-cfa-pta-ll-share-fresh,corpus-1-500-1.txt,100,1,500.000000,109570000,10.000365 37 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,146130000,10.000557 38 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,110430000,10.000456 39 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,17440000,10.003114 40 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,139540000,10.000128 41 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,70380000,10.000395 42 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,8670000,10.001712 43 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,127040000,10.000370 44 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,44250000,10.002214 45 perfexp-cfa-pta-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,3290000,10.007370 46 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-1-1.txt,100,1,1.000000,139870000,10.000356 47 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-10-1.txt,100,1,10.000000,115500000,10.000281 48 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-100-1.txt,100,1,100.000000,18830000,10.003277 49 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-2-1.txt,100,1,2.000000,144880000,10.000426 50 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-20-1.txt,100,1,20.000000,82050000,10.001071 51 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-200-1.txt,100,1,200.000000,8870000,10.002904 52 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-5-1.txt,100,1,5.000000,138400000,10.000130 53 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-50-1.txt,100,1,50.000000,38130000,10.002351 54 perfexp-cfa-pta-ll-noshare-reuse,corpus-1-500-1.txt,100,1,500.000000,3890000,10.003849 55 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,143100000,10.000056 56 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,97990000,10.000081 57 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,16950000,10.004190 58 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,135210000,10.000137 59 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,69270000,10.000092 60 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,8840000,10.000491 61 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,112610000,10.000397 62 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,42480000,10.001402 63 perfexp-cfa-pta-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,3250000,10.027871 64 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-1-1.txt,100,1,1.000000,139830000,10.000681 65 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-10-1.txt,100,1,10.000000,102320000,10.000624 66 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-100-1.txt,100,1,100.000000,17610000,10.000917 67 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-2-1.txt,100,1,2.000000,134520000,10.000287 68 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-20-1.txt,100,1,20.000000,78150000,10.000982 69 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-200-1.txt,100,1,200.000000,8930000,10.010066 70 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-5-1.txt,100,1,5.000000,119920000,10.000537 71 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-50-1.txt,100,1,50.000000,38540000,10.001545 72 perfexp-cfa-pta-ll-noshare-fresh,corpus-1-500-1.txt,100,1,500.000000,3900000,10.024468 73 perfexp-cfa-peq-ll-share-reuse,corpus-100-1-1.txt,100,100,1.000000,580710000,10.000065 74 perfexp-cfa-peq-ll-share-reuse,corpus-100-10-1.txt,100,100,9.500000,430790000,10.000116 75 perfexp-cfa-peq-ll-share-reuse,corpus-100-100-1.txt,100,100,106.370000,247640000,10.000266 76 perfexp-cfa-peq-ll-share-reuse,corpus-100-2-1.txt,100,100,2.030000,464050000,10.000189 77 perfexp-cfa-peq-ll-share-reuse,corpus-100-20-1.txt,100,100,22.960000,377820000,10.000065 78 perfexp-cfa-peq-ll-share-reuse,corpus-100-200-1.txt,100,100,177.280000,195030000,10.000477 79 perfexp-cfa-peq-ll-share-reuse,corpus-100-5-1.txt,100,100,5.270000,430190000,10.000121 80 perfexp-cfa-peq-ll-share-reuse,corpus-100-50-1.txt,100,100,43.320000,331580000,10.000295 81 perfexp-cfa-peq-ll-share-reuse,corpus-100-500-1.txt,100,100,557.260000,123230000,10.000186 82 perfexp-cfa-peq-ll-share-reuse,corpus-1-1-1.txt,100,1,1.000000,572750000,10.000172 83 perfexp-cfa-peq-ll-share-reuse,corpus-1-10-1.txt,100,1,10.000000,558790000,10.000101 84 perfexp-cfa-peq-ll-share-reuse,corpus-1-100-1.txt,100,1,100.000000,291780000,10.000230 85 perfexp-cfa-peq-ll-share-reuse,corpus-1-2-1.txt,100,1,2.000000,571220000,10.000023 86 perfexp-cfa-peq-ll-share-reuse,corpus-1-20-1.txt,100,1,20.000000,461020000,10.000045 87 perfexp-cfa-peq-ll-share-reuse,corpus-1-200-1.txt,100,1,200.000000,220880000,10.000260 88 perfexp-cfa-peq-ll-share-reuse,corpus-1-5-1.txt,100,1,5.000000,555180000,10.000153 89 perfexp-cfa-peq-ll-share-reuse,corpus-1-50-1.txt,100,1,50.000000,433290000,10.000123 90 perfexp-cfa-peq-ll-share-reuse,corpus-1-500-1.txt,100,1,500.000000,165210000,10.000260 91 perfexp-cfa-peq-ll-share-fresh,corpus-100-1-1.txt,100,100,1.000000,591360000,10.000013 92 perfexp-cfa-peq-ll-share-fresh,corpus-100-10-1.txt,100,100,9.500000,432580000,10.000103 93 perfexp-cfa-peq-ll-share-fresh,corpus-100-100-1.txt,100,100,106.370000,253100000,10.000162 94 perfexp-cfa-peq-ll-share-fresh,corpus-100-2-1.txt,100,100,2.030000,470710000,10.000018 95 perfexp-cfa-peq-ll-share-fresh,corpus-100-20-1.txt,100,100,22.960000,381580000,10.000172 96 perfexp-cfa-peq-ll-share-fresh,corpus-100-200-1.txt,100,100,177.280000,197910000,10.000400 97 perfexp-cfa-peq-ll-share-fresh,corpus-100-5-1.txt,100,100,5.270000,437470000,10.000123 98 perfexp-cfa-peq-ll-share-fresh,corpus-100-50-1.txt,100,100,43.320000,337150000,10.000065 99 perfexp-cfa-peq-ll-share-fresh,corpus-100-500-1.txt,100,100,557.260000,127310000,10.000685 100 perfexp-cfa-peq-ll-share-fresh,corpus-1-1-1.txt,100,1,1.000000,581300000,10.000103 101 perfexp-cfa-peq-ll-share-fresh,corpus-1-10-1.txt,100,1,10.000000,566650000,10.000166 102 perfexp-cfa-peq-ll-share-fresh,corpus-1-100-1.txt,100,1,100.000000,295340000,10.000202 103 perfexp-cfa-peq-ll-share-fresh,corpus-1-2-1.txt,100,1,2.000000,579220000,10.000012 104 perfexp-cfa-peq-ll-share-fresh,corpus-1-20-1.txt,100,1,20.000000,470040000,10.000180 105 perfexp-cfa-peq-ll-share-fresh,corpus-1-200-1.txt,100,1,200.000000,223060000,10.000188 106 perfexp-cfa-peq-ll-share-fresh,corpus-1-5-1.txt,100,1,5.000000,563440000,10.000100 107 perfexp-cfa-peq-ll-share-fresh,corpus-1-50-1.txt,100,1,50.000000,438260000,10.000200 108 perfexp-cfa-peq-ll-share-fresh,corpus-1-500-1.txt,100,1,500.000000,166830000,10.000225 109 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-1-1.txt,100,100,1.000000,603080000,10.000107 110 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-10-1.txt,100,100,9.500000,439540000,10.000078 111 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-100-1.txt,100,100,106.370000,279990000,10.000309 112 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-2-1.txt,100,100,2.030000,509720000,10.000099 113 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-20-1.txt,100,100,22.960000,405590000,10.000206 114 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-200-1.txt,100,100,177.280000,230400000,10.000124 115 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-5-1.txt,100,100,5.270000,454270000,10.000057 116 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-50-1.txt,100,100,43.320000,375090000,10.000225 117 perfexp-cfa-peq-ll-noshare-reuse,corpus-100-500-1.txt,100,100,557.260000,134440000,10.000290 118 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-1-1.txt,100,1,1.000000,588100000,10.000124 119 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-10-1.txt,100,1,10.000000,577110000,10.000002 120 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-100-1.txt,100,1,100.000000,319990000,10.000151 121 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-2-1.txt,100,1,2.000000,586540000,10.000010 122 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-20-1.txt,100,1,20.000000,480940000,10.000047 123 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-200-1.txt,100,1,200.000000,300590000,10.000162 124 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-5-1.txt,100,1,5.000000,577530000,10.000120 125 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-50-1.txt,100,1,50.000000,454950000,10.000114 126 perfexp-cfa-peq-ll-noshare-reuse,corpus-1-500-1.txt,100,1,500.000000,186210000,10.000221 127 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-1-1.txt,100,100,1.000000,546170000,10.000079 128 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-10-1.txt,100,100,9.500000,403120000,10.000222 129 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-100-1.txt,100,100,106.370000,214740000,10.000444 130 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-2-1.txt,100,100,2.030000,449080000,10.000157 131 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-20-1.txt,100,100,22.960000,351690000,10.000146 132 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-200-1.txt,100,100,177.280000,174630000,10.000540 133 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-5-1.txt,100,100,5.270000,419160000,10.000085 134 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-50-1.txt,100,100,43.320000,296590000,10.000200 135 perfexp-cfa-peq-ll-noshare-fresh,corpus-100-500-1.txt,100,100,557.260000,78000000,10.000539 136 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-1-1.txt,100,1,1.000000,541890000,10.000021 137 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-10-1.txt,100,1,10.000000,511140000,10.000142 138 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-100-1.txt,100,1,100.000000,243680000,10.000252 139 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-2-1.txt,100,1,2.000000,532730000,10.000135 140 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-20-1.txt,100,1,20.000000,413610000,10.000113 141 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-200-1.txt,100,1,200.000000,192770000,10.000185 142 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-5-1.txt,100,1,5.000000,495980000,10.000162 143 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-50-1.txt,100,1,50.000000,367590000,10.000269 144 perfexp-cfa-peq-ll-noshare-fresh,corpus-1-500-1.txt,100,1,500.000000,111560000,10.000455 145 perfexp-cfa-pbv-ll-share-na,corpus-100-1-1.txt,xxx,100,1.000000,638780000,10.000008 146 perfexp-cfa-pbv-ll-share-na,corpus-100-10-1.txt,xxx,100,9.500000,637840000,10.000004 147 perfexp-cfa-pbv-ll-share-na,corpus-100-100-1.txt,xxx,100,106.370000,635130000,10.000003 148 perfexp-cfa-pbv-ll-share-na,corpus-100-2-1.txt,xxx,100,2.030000,639810000,10.000140 149 perfexp-cfa-pbv-ll-share-na,corpus-100-20-1.txt,xxx,100,22.960000,552670000,10.000089 150 perfexp-cfa-pbv-ll-share-na,corpus-100-200-1.txt,xxx,100,177.280000,639550000,10.000019 151 perfexp-cfa-pbv-ll-share-na,corpus-100-5-1.txt,xxx,100,5.270000,636230000,10.000044 152 perfexp-cfa-pbv-ll-share-na,corpus-100-50-1.txt,xxx,100,43.320000,631470000,10.000125 153 perfexp-cfa-pbv-ll-share-na,corpus-100-500-1.txt,xxx,100,557.260000,628330000,10.000127 154 perfexp-cfa-pbv-ll-share-na,corpus-1-1-1.txt,xxx,1,1.000000,589760000,10.000044 155 perfexp-cfa-pbv-ll-share-na,corpus-1-10-1.txt,xxx,1,10.000000,589790000,10.000151 156 perfexp-cfa-pbv-ll-share-na,corpus-1-100-1.txt,xxx,1,100.000000,587540000,10.000128 157 perfexp-cfa-pbv-ll-share-na,corpus-1-2-1.txt,xxx,1,2.000000,580790000,10.000102 158 perfexp-cfa-pbv-ll-share-na,corpus-1-20-1.txt,xxx,1,20.000000,586470000,10.000154 159 perfexp-cfa-pbv-ll-share-na,corpus-1-200-1.txt,xxx,1,200.000000,587510000,10.000005 160 perfexp-cfa-pbv-ll-share-na,corpus-1-5-1.txt,xxx,1,5.000000,582120000,10.000163 161 perfexp-cfa-pbv-ll-share-na,corpus-1-50-1.txt,xxx,1,50.000000,587990000,10.000127 162 perfexp-cfa-pbv-ll-share-na,corpus-1-500-1.txt,xxx,1,500.000000,587590000,10.000046 163 perfexp-cfa-pbv-ll-noshare-na,corpus-100-1-1.txt,xxx,100,1.000000,218340000,10.000321 164 perfexp-cfa-pbv-ll-noshare-na,corpus-100-10-1.txt,xxx,100,9.500000,189550000,10.000174 165 perfexp-cfa-pbv-ll-noshare-na,corpus-100-100-1.txt,xxx,100,106.370000,169280000,10.000141 166 perfexp-cfa-pbv-ll-noshare-na,corpus-100-2-1.txt,xxx,100,2.030000,197840000,10.000383 167 perfexp-cfa-pbv-ll-noshare-na,corpus-100-20-1.txt,xxx,100,22.960000,182700000,10.000041 168 perfexp-cfa-pbv-ll-noshare-na,corpus-100-200-1.txt,xxx,100,177.280000,157120000,10.000522 169 perfexp-cfa-pbv-ll-noshare-na,corpus-100-5-1.txt,xxx,100,5.270000,155160000,10.000322 170 perfexp-cfa-pbv-ll-noshare-na,corpus-100-50-1.txt,xxx,100,43.320000,179110000,10.000218 171 perfexp-cfa-pbv-ll-noshare-na,corpus-100-500-1.txt,xxx,100,557.260000,113620000,10.000140 172 perfexp-cfa-pbv-ll-noshare-na,corpus-1-1-1.txt,xxx,1,1.000000,216270000,10.000367 173 perfexp-cfa-pbv-ll-noshare-na,corpus-1-10-1.txt,xxx,1,10.000000,214390000,10.000157 174 perfexp-cfa-pbv-ll-noshare-na,corpus-1-100-1.txt,xxx,1,100.000000,165440000,10.000095 175 perfexp-cfa-pbv-ll-noshare-na,corpus-1-2-1.txt,xxx,1,2.000000,217150000,10.000044 176 perfexp-cfa-pbv-ll-noshare-na,corpus-1-20-1.txt,xxx,1,20.000000,216760000,10.000321 177 perfexp-cfa-pbv-ll-noshare-na,corpus-1-200-1.txt,xxx,1,200.000000,176930000,10.000100 178 perfexp-cfa-pbv-ll-noshare-na,corpus-1-5-1.txt,xxx,1,5.000000,200840000,10.000229 179 perfexp-cfa-pbv-ll-noshare-na,corpus-1-50-1.txt,xxx,1,50.000000,212960000,10.000273 180 perfexp-cfa-pbv-ll-noshare-na,corpus-1-500-1.txt,xxx,1,500.000000,163340000,10.000196 181 perfexp-stl-pta-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,151210000,10.000032 182 perfexp-stl-pta-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,92400000,10.000662 183 perfexp-stl-pta-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,16700000,10.003595 184 perfexp-stl-pta-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,132700000,10.000666 185 perfexp-stl-pta-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,61670000,10.001135 186 perfexp-stl-pta-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,8950000,10.005903 187 perfexp-stl-pta-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,105760000,10.000126 188 perfexp-stl-pta-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,38020000,10.001290 189 perfexp-stl-pta-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,3080000,10.009583 190 perfexp-stl-pta-na-na-reuse,corpus-1-1-1.txt,100,1,1.000000,150070000,10.000505 191 perfexp-stl-pta-na-na-reuse,corpus-1-10-1.txt,100,1,10.000000,96240000,10.000747 192 perfexp-stl-pta-na-na-reuse,corpus-1-100-1.txt,100,1,100.000000,17380000,10.005677 193 perfexp-stl-pta-na-na-reuse,corpus-1-2-1.txt,100,1,2.000000,136340000,10.000556 194 perfexp-stl-pta-na-na-reuse,corpus-1-20-1.txt,100,1,20.000000,69290000,10.000979 195 perfexp-stl-pta-na-na-reuse,corpus-1-200-1.txt,100,1,200.000000,9140000,10.005445 196 perfexp-stl-pta-na-na-reuse,corpus-1-5-1.txt,100,1,5.000000,114030000,10.000605 197 perfexp-stl-pta-na-na-reuse,corpus-1-50-1.txt,100,1,50.000000,33470000,10.000871 198 perfexp-stl-pta-na-na-reuse,corpus-1-500-1.txt,100,1,500.000000,3760000,10.021431 199 perfexp-stl-pta-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,151890000,10.000693 200 perfexp-stl-pta-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,97910000,10.000289 201 perfexp-stl-pta-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,16740000,10.000756 202 perfexp-stl-pta-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,134890000,10.000666 203 perfexp-stl-pta-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,61040000,10.000514 204 perfexp-stl-pta-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,8950000,10.004888 205 perfexp-stl-pta-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,101780000,10.000043 206 perfexp-stl-pta-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,38440000,10.000510 207 perfexp-stl-pta-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,3060000,10.007733 208 perfexp-stl-pta-na-na-fresh,corpus-1-1-1.txt,100,1,1.000000,149360000,10.000168 209 perfexp-stl-pta-na-na-fresh,corpus-1-10-1.txt,100,1,10.000000,98400000,10.000118 210 perfexp-stl-pta-na-na-fresh,corpus-1-100-1.txt,100,1,100.000000,17440000,10.004379 211 perfexp-stl-pta-na-na-fresh,corpus-1-2-1.txt,100,1,2.000000,130340000,10.000520 212 perfexp-stl-pta-na-na-fresh,corpus-1-20-1.txt,100,1,20.000000,69280000,10.001377 213 perfexp-stl-pta-na-na-fresh,corpus-1-200-1.txt,100,1,200.000000,9070000,10.004963 214 perfexp-stl-pta-na-na-fresh,corpus-1-5-1.txt,100,1,5.000000,114390000,10.000315 215 perfexp-stl-pta-na-na-fresh,corpus-1-50-1.txt,100,1,50.000000,34350000,10.001033 216 perfexp-stl-pta-na-na-fresh,corpus-1-500-1.txt,100,1,500.000000,3720000,10.009015 217 perfexp-stl-peq-na-na-reuse,corpus-100-1-1.txt,100,100,1.000000,867730000,10.000040 218 perfexp-stl-peq-na-na-reuse,corpus-100-10-1.txt,100,100,9.500000,470370000,10.000155 219 perfexp-stl-peq-na-na-reuse,corpus-100-100-1.txt,100,100,106.370000,287440000,10.000190 220 perfexp-stl-peq-na-na-reuse,corpus-100-2-1.txt,100,100,2.030000,667180000,10.000145 221 perfexp-stl-peq-na-na-reuse,corpus-100-20-1.txt,100,100,22.960000,430260000,10.000102 222 perfexp-stl-peq-na-na-reuse,corpus-100-200-1.txt,100,100,177.280000,232720000,10.000418 223 perfexp-stl-peq-na-na-reuse,corpus-100-5-1.txt,100,100,5.270000,515130000,10.000118 224 perfexp-stl-peq-na-na-reuse,corpus-100-50-1.txt,100,100,43.320000,401280000,10.000122 225 perfexp-stl-peq-na-na-reuse,corpus-100-500-1.txt,100,100,557.260000,135350000,10.000692 226 perfexp-stl-peq-na-na-reuse,corpus-1-1-1.txt,100,1,1.000000,847560000,10.000010 227 perfexp-stl-peq-na-na-reuse,corpus-1-10-1.txt,100,1,10.000000,641250000,10.000095 228 perfexp-stl-peq-na-na-reuse,corpus-1-100-1.txt,100,1,100.000000,300130000,10.000199 229 perfexp-stl-peq-na-na-reuse,corpus-1-2-1.txt,100,1,2.000000,680950000,10.000050 230 perfexp-stl-peq-na-na-reuse,corpus-1-20-1.txt,100,1,20.000000,515190000,10.000051 231 perfexp-stl-peq-na-na-reuse,corpus-1-200-1.txt,100,1,200.000000,271800000,10.000194 232 perfexp-stl-peq-na-na-reuse,corpus-1-5-1.txt,100,1,5.000000,611640000,10.000133 233 perfexp-stl-peq-na-na-reuse,corpus-1-50-1.txt,100,1,50.000000,483780000,10.000084 234 perfexp-stl-peq-na-na-reuse,corpus-1-500-1.txt,100,1,500.000000,191470000,10.000243 235 perfexp-stl-peq-na-na-fresh,corpus-100-1-1.txt,100,100,1.000000,779650000,10.000085 236 perfexp-stl-peq-na-na-fresh,corpus-100-10-1.txt,100,100,9.500000,419300000,10.000184 237 perfexp-stl-peq-na-na-fresh,corpus-100-100-1.txt,100,100,106.370000,224270000,10.000410 238 perfexp-stl-peq-na-na-fresh,corpus-100-2-1.txt,100,100,2.030000,545330000,10.000073 239 perfexp-stl-peq-na-na-fresh,corpus-100-20-1.txt,100,100,22.960000,385000000,10.000210 240 perfexp-stl-peq-na-na-fresh,corpus-100-200-1.txt,100,100,177.280000,174520000,10.000360 241 perfexp-stl-peq-na-na-fresh,corpus-100-5-1.txt,100,100,5.270000,443460000,10.000165 242 perfexp-stl-peq-na-na-fresh,corpus-100-50-1.txt,100,100,43.320000,310460000,10.000174 243 perfexp-stl-peq-na-na-fresh,corpus-100-500-1.txt,100,100,557.260000,92820000,10.000352 244 perfexp-stl-peq-na-na-fresh,corpus-1-1-1.txt,100,1,1.000000,774230000,10.000110 245 perfexp-stl-peq-na-na-fresh,corpus-1-10-1.txt,100,1,10.000000,554850000,10.000064 246 perfexp-stl-peq-na-na-fresh,corpus-1-100-1.txt,100,1,100.000000,227540000,10.000041 247 perfexp-stl-peq-na-na-fresh,corpus-1-2-1.txt,100,1,2.000000,616830000,10.000134 248 perfexp-stl-peq-na-na-fresh,corpus-1-20-1.txt,100,1,20.000000,436800000,10.000038 249 perfexp-stl-peq-na-na-fresh,corpus-1-200-1.txt,100,1,200.000000,185050000,10.000439 250 perfexp-stl-peq-na-na-fresh,corpus-1-5-1.txt,100,1,5.000000,569030000,10.000125 251 perfexp-stl-peq-na-na-fresh,corpus-1-50-1.txt,100,1,50.000000,387710000,10.000249 252 perfexp-stl-peq-na-na-fresh,corpus-1-500-1.txt,100,1,500.000000,113890000,10.000075 253 perfexp-stl-pbv-na-na-na,corpus-100-1-1.txt,xxx,100,1.000000,1267570000,10.000072 254 perfexp-stl-pbv-na-na-na,corpus-100-10-1.txt,xxx,100,9.500000,476260000,10.000192 255 perfexp-stl-pbv-na-na-na,corpus-100-100-1.txt,xxx,100,106.370000,271870000,10.000171 256 perfexp-stl-pbv-na-na-na,corpus-100-2-1.txt,xxx,100,2.030000,807830000,10.000110 257 perfexp-stl-pbv-na-na-na,corpus-100-20-1.txt,xxx,100,22.960000,373160000,10.000221 258 perfexp-stl-pbv-na-na-na,corpus-100-200-1.txt,xxx,100,177.280000,233700000,10.000081 259 perfexp-stl-pbv-na-na-na,corpus-100-5-1.txt,xxx,100,5.270000,536240000,10.000165 260 perfexp-stl-pbv-na-na-na,corpus-100-50-1.txt,xxx,100,43.320000,297400000,10.000317 261 perfexp-stl-pbv-na-na-na,corpus-100-500-1.txt,xxx,100,557.260000,159500000,10.000290 262 perfexp-stl-pbv-na-na-na,corpus-1-1-1.txt,xxx,1,1.000000,1089370000,10.000024 263 perfexp-stl-pbv-na-na-na,corpus-1-10-1.txt,xxx,1,10.000000,722490000,10.000040 264 perfexp-stl-pbv-na-na-na,corpus-1-100-1.txt,xxx,1,100.000000,311250000,10.000116 265 perfexp-stl-pbv-na-na-na,corpus-1-2-1.txt,xxx,1,2.000000,747630000,10.000103 266 perfexp-stl-pbv-na-na-na,corpus-1-20-1.txt,xxx,1,20.000000,348820000,10.000149 267 perfexp-stl-pbv-na-na-na,corpus-1-200-1.txt,xxx,1,200.000000,302220000,10.000223 268 perfexp-stl-pbv-na-na-na,corpus-1-5-1.txt,xxx,1,5.000000,725430000,10.000110 269 perfexp-stl-pbv-na-na-na,corpus-1-50-1.txt,xxx,1,50.000000,335730000,10.000280 270 perfexp-stl-pbv-na-na-na,corpus-1-500-1.txt,xxx,1,500.000000,258380000,10.000052 -
doc/theses/mike_brooks_MMath/plots/string-pbv.gp
r7d02d35 rbd72f517 3 3 #set terminal wxt size 950,1250 4 4 5 DIR="pictures" 5 INDIR="build" 6 OUTDIR="build" 6 7 7 8 set macros 8 set output "build/string-graph-pbv.pdf" 9 set output OUTDIR."/plot-string-pbv.pdf" 10 11 set multiplot layout 1, 2 ; 12 13 9 14 #set pointsize 2.0 10 15 set grid … … 14 19 set logscale x 15 20 set logscale y 2 16 set xlabel "String Length being passed (interp. varies)" offset 2,0 17 set ylabel "Time per append (ns, mean), log_{2} scale" 21 set xlabel "String length passed, varying (mean)" 22 set ylabel "Time per pass (ns, mean), log_{2} scale" 23 set yrange [4:64] 18 24 set linetype 3 dashtype 2 19 25 set linetype 4 dashtype 2 20 plot DIR."/string-graph-pbv.dat" \ 21 i 0 using 1:2 title columnheader(1) with linespoints lt rgb "blue" pt 2 ps 1 lw 1, \ 22 '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "red" pt 3 ps 1 lw 1, \ 23 '' i 2 using 1:2 title columnheader(1) with linespoints lt rgb "blue" pt 6 ps 1 lw 1 26 plot INDIR."/plot-string-pbv-varcorp.dat" \ 27 i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red" pt 3 ps 1 lw 1, \ 28 '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "blue" pt 6 ps 1 lw 1 29 30 set xlabel "String length passed, fixed" 31 set ylabel 32 plot INDIR."/plot-string-pbv-fixcorp.dat" \ 33 i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red" pt 3 ps 1 lw 1, \ 34 '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "blue" pt 6 ps 1 lw 1 35 36 unset multiplot -
doc/theses/mike_brooks_MMath/plots/string-peq-cppemu.gp
r7d02d35 rbd72f517 13 13 set xtics (1,2,5,10,20,50,100,200,500) 14 14 set logscale x 15 set logscale y16 set yrange [ 10:200]15 #set logscale y 16 set yrange [0:115] 17 17 set xlabel "String Length being appended (mean, geo. dist.), log scale" offset 2,0 18 18 set ylabel "Time per append (ns, mean)" -
doc/theses/mike_brooks_MMath/plots/string-peq-cppemu.py
r7d02d35 rbd72f517 12 12 import pandas as pd 13 13 import numpy as np 14 import sys 14 15 import os 15 16 16 infile = os.path.dirname(os.path.abspath(__file__)) + '/../benchmarks/string/result-append-pbv.csv' 17 sys.path.insert(0, os.path.dirname(__file__)) 18 from common import * 17 19 18 20 prettyFieldNames = { … … 23 25 } 24 26 25 timings = pd.read_csv( 26 infile, 27 names=['test', 'corpus', 'concatsPerReset', 'corpusItemCount', 'corpusMeanLenChars', 'concatDoneActualCount', 'execTimeActualSec'], 28 dtype={'test': str, 29 'corpus': str, 30 'concatsPerReset': 'Int64', # allows missing; https://stackoverflow.com/a/70626154 31 'corpusItemCount': np.int64, 32 'corpusMeanLenChars': np.float64, 33 'concatDoneActualCount': np.int64, 34 'execTimeActualSec': np.float64}, 35 na_values=['xxx'], 36 ) 37 # print(timings.head()) 27 timings = loadParseTimingData('result-append-pbv.csv') 38 28 29 # Filter operation=peq, corpus=100-*-1 39 30 40 # project: parse executable and corpus names 41 42 timings[['test-slug', 43 'sut-platform', 44 'operation', 45 'sut-cfa-level', 46 'sut-cfa-sharing', 47 'op-alloc']] = timings['test'].str.strip().str.split('-', expand=True) 48 timings['sut'] = timings[['sut-platform', 49 'sut-cfa-level', 50 'sut-cfa-sharing', 51 'op-alloc']].agg('-'.join, axis=1) 52 53 timings[['corpus-basename', 54 'corpus-ext']] = timings['corpus'].str.strip().str.split('.', expand=True) 55 timings[['corpus-slug', 56 'corpus-nstrs', 57 'corpus-meanlen', 58 'corpus-runid']] = timings['corpus-basename'].str.strip().str.split('-', expand=True) 59 timings["corpus-nstrs"] = pd.to_numeric(timings["corpus-nstrs"]) 60 timings["corpus-meanlen"] = pd.to_numeric(timings["corpus-meanlen"]) 61 timings["corpus-runid"] = pd.to_numeric(timings["corpus-runid"]) 62 63 64 # project: calculate fact 65 66 timings['op-duration-s'] = timings['execTimeActualSec'] / timings['concatDoneActualCount'] 67 timings['op-duration-ns'] = timings['op-duration-s'] * 1000 * 1000 * 1000 68 69 70 # Filter operation=peq 71 72 groupedOp = timings.groupby('operation') 73 tgtOpTimings = groupedOp.get_group('peq') 74 31 timings = timings.groupby('operation').get_group('peq') 32 timings = timings.groupby('corpus-nstrs').get_group(100) 33 timings = timings.groupby('corpus-runid').get_group(1) 75 34 76 35 # Emit in groups 77 36 78 groupedSut = t gtOpTimings.groupby('sut')37 groupedSut = timings.groupby('sut') 79 38 80 39 for sut, sgroup in groupedSut: -
doc/theses/mike_brooks_MMath/plots/string-peq-sharing.gp
r7d02d35 rbd72f517 3 3 #set terminal wxt size 950,1250 4 4 5 DIR="pictures" 5 INDIR="build" 6 OUTDIR="build" 6 7 7 8 set macros 8 set output "build/string-graph-peq-sharing.pdf"9 set output OUTDIR."/plot-string-peq-sharing.pdf" 9 10 #set pointsize 2.0 10 11 set grid … … 12 13 set xtics (1,2,5,10,20,50,100,200,500) 13 14 set logscale x 14 #set logscale y 2 15 #set logscale y 16 set yrange [10:115] 15 17 set xlabel "String Length being appended (mean, geo. dist.), log scale" offset 2,0 16 18 set ylabel "Time per append (ns, mean)" 17 19 set linetype 2 dashtype 2 18 20 set linetype 4 dashtype 2 19 plot DIR."/string-graph-peq-sharing.dat" \21 plot INDIR."/plot-string-peq-sharing.dat" \ 20 22 i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red" pt 2 ps 1 lw 1, \ 21 23 '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "red" pt 3 ps 1 lw 1, \ -
doc/theses/mike_brooks_MMath/plots/string-pta-sharing.gp
r7d02d35 rbd72f517 3 3 #set terminal wxt size 950,1250 4 4 5 DIR="pictures" 5 INDIR="build" 6 OUTDIR="build" 6 7 7 8 set macros 8 set output "build/string-graph-pta-sharing.pdf"9 set output OUTDIR."/plot-string-pta-sharing.pdf" 9 10 #set pointsize 2.0 10 11 set grid … … 12 13 set xtics (1,2,5,10,20,50,100,200,500) 13 14 set logscale x 15 set yrange [8:4096] 14 16 set logscale y 2 15 17 set xlabel "String Length being appended (mean, geo. dist.), log scale" offset 2,0 16 18 set ylabel "Time per append (ns, mean), log_{2} scale" 17 set linetype 5 dashtype 218 19 #show colornames 19 plot DIR."/string-graph-pta-sharing.dat" \20 plot INDIR."/plot-string-pta-sharing.dat" \ 20 21 i 0 using 1:2 title columnheader(1) with linespoints lt rgb "red" pt 2 ps 1 lw 1, \ 21 22 '' i 1 using 1:2 title columnheader(1) with linespoints lt rgb "dark-green" pt 4 ps 1 lw 1, \ 22 23 '' i 2 using 1:2 title columnheader(1) with linespoints lt rgb "blue" pt 6 ps 1 lw 1, \ 23 '' i 3 using 1:2 title columnheader(1) with linespoints lt rgb "dark-green" pt 12 ps 1 lw 1, \ 24 '' i 4 using 1:2 title columnheader(1) with linespoints lt rgb "blue" pt 8 ps 1 lw 1 24 '' i 3 using 1:2 title columnheader(1) with linespoints lt rgb "dark-green" pt 12 ps 1 lw 1 -
doc/theses/mike_brooks_MMath/programs/bkgd-cfa-arrayinteract.cfa
r7d02d35 rbd72f517 3 3 4 4 struct tm { int x; }; 5 forall (T*) T* alloc();5 forall( T * ) T * alloc(); 6 6 7 7 int main () { -
doc/theses/mike_brooks_MMath/programs/hello-accordion.cfa
r7d02d35 rbd72f517 31 31 int getPref( @School( C, S ) & school@, int is, int pref ) { 32 32 for ( ic; C ) { 33 int curPref = @school.preferences@[ic][is]; $\C{// offset calculation implicit}$ 34 if ( curPref == pref ) return ic; 33 if ( pref == @school.preferences@[ic][is]; ) return ic; $\C{// offset calculation implicit}$ 35 34 } 36 35 assert( false ); 37 36 } 37 38 38 39 39 … … 91 91 sout | school.student_ids[is] | ": " | nonl; 92 92 for ( pref; 1 ~= nc ) { 93 int ic = getPref( school, is, pref);93 int ic = getPref( school, is, pref ); 94 94 sout | school.course_codes[ ic ] | nonl; 95 95 } -
doc/theses/mike_brooks_MMath/string.tex
r7d02d35 rbd72f517 17 17 \begin{cquote} 18 18 \begin{tabular}{@{}l|l|l|l@{}} 19 C @char [ ]@ & \CC @string@ & Java @String@ 19 C @char [ ]@ & \CC @string@ & Java @String@ & \CFA @string@ \\ 20 20 \hline 21 @strcpy@, @strncpy@ & @=@ & @=@ 22 @strcat@, @strncat@ & @+@, @+=@ & @+@, @+=@ 21 @strcpy@, @strncpy@ & @=@ & @=@ & @=@ \\ 22 @strcat@, @strncat@ & @+@, @+=@ & @+@, @+=@ & @+@, @+=@ \\ 23 23 @strcmp@, @strncmp@ & @==@, @!=@, @<@, @<=@, @>@, @>=@ 24 25 26 @strlen@ & @length@, @size@ & @length@ 27 @[ ]@ & @[ ]@ & @charAt@ 28 @strncpy@ & @substr@ & @substring@ 29 @strncpy@ & @replace@ & @replace@ 30 @strstr@ & @find@ & @indexOf@ 31 @strcspn@ & @find_first_of@ & @matches@ 32 @strspn@ & @find_first_not_of@ & @matches@ 33 n/a & @c_str@, @data@ & n/a& @strcpy@, @strncpy@ \\24 & @equals@, @compareTo@ 25 & @==@, @!=@, @<@, @<=@, @>@, @>=@ \\ 26 @strlen@ & @length@, @size@ & @length@ & @size@ \\ 27 @[ ]@ & @[ ]@ & @charAt@ & @[ ]@ \\ 28 @strncpy@ & @substr@ & @substring@ & @( )@, on RHS of @=@ \\ 29 @strncpy@ & @replace@ & @replace@ & @( )@, on LHS of @=@ \\ 30 @strstr@ & @find@ & @indexOf@ & @find@ \\ 31 @strcspn@ & @find_first_of@ & @matches@ & @include@ \\ 32 @strspn@ & @find_first_not_of@ & @matches@ & @exclude@ \\ 33 N/A & @c_str@, @data@ & N/A & @strcpy@, @strncpy@ \\ 34 34 \end{tabular} 35 35 \end{cquote} … … 53 53 54 54 55 \section{\CFA \lstinline{string} type}55 \section{\CFA \lstinline{string} Type} 56 56 \label{s:stringType} 57 57 … … 272 272 ch = ch + 'b'; $\C[2in]{// LHS disambiguate, add character values}$ 273 273 s = 'a' + 'b'; $\C{// LHS disambiguate, concatenate characters}$ 274 printf( "%c\n", @'a' + 'b'@ ); $\C [2in]{// no LHS information, ambiguous}$275 printf( "%c\n", @(return char)@('a' + 'b') ); $\C{// disambiguate with ascription cast} $274 printf( "%c\n", @'a' + 'b'@ ); $\C{// no LHS information, ambiguous}$ 275 printf( "%c\n", @(return char)@('a' + 'b') ); $\C{// disambiguate with ascription cast}\CRT$ 276 276 \end{cfa} 277 277 The ascription cast, @(return T)@, disambiguates by stating a (LHS) type to use during expression resolution (not a conversion). … … 319 319 ch = ch * 3; $\C[2in]{// LHS disambiguate, multiply character values}$ 320 320 s = 'a' * 3; $\C{// LHS disambiguate, concatenate characters}$ 321 printf( "%c\n", @'a' * 3@ ); $\C [2in]{// no LHS information, ambiguous}$322 printf( "%c\n", @(return char)@('a' * 3) ); $\C{// disambiguate with ascription cast} $321 printf( "%c\n", @'a' * 3@ ); $\C{// no LHS information, ambiguous}$ 322 printf( "%c\n", @(return char)@('a' * 3) ); $\C{// disambiguate with ascription cast}\CRT$ 323 323 \end{cfa} 324 324 Fortunately, character multiplication without LHS information is even rarer than addition, so repurposing the operator @*@ for @string@ types is not a problem. … … 600 600 & 601 601 \begin{cfa} 602 for ( ;;) {602 for () { 603 603 size_t posn = exclude( line, alpha ); 604 604 if ( posn == len( line ) ) break; … … 722 722 \end{tabular} 723 723 \end{cquote} 724 Input text can be gulped, including whitespace, from the current point to an arbitrary delimiter character using @getline@.724 Input text can be \emph{gulped}, including whitespace, from the current point to an arbitrary delimiter character using @getline@. 725 725 726 726 The \CFA philosophy for input is that, for every constant type in C, these constants should be usable as input. … … 760 760 \end{tabular} 761 761 \end{cquote} 762 Note, the ability to read in quoted strings to match with program string constants.762 Note, the ability to read in quoted strings with whitespace to match with program string constants. 763 763 The @nl@ at the end of an input ignores the rest of the line. 764 764 … … 845 845 & Laxed: The target's type is anything string-like; it may have a different status concerning ownership. 846 846 & Strict: The target's type is the same as the source; both strings are equivalent peers concerning ownership. 847 & n/a& no & yes & yes \\847 & N/A & no & yes & yes \\ 848 848 \hline 849 849 Referent … … 863 863 The C ``string'' is @char *@, under the conventions of @<string.h>@. Because this type does not manage a text allocation, symmetry does not apply. 864 864 \item 865 The Java @String@ class is analyzed; its @StringBuffer@ class behaves similarly to @C++@.865 The Java @String@ class is analyzed; its @StringBuffer@ class behaves similarly to \CC. 866 866 \end{itemize} 867 867 \caption{Comparison of languages' strings, storage management perspective.} … … 869 869 \end{figure} 870 870 871 In C, these declarations give very different things.871 In C, these declarations are very different. 872 872 \begin{cfa} 873 873 char x[$\,$] = "abcde"; 874 874 char * y = "abcde"; 875 875 \end{cfa} 876 Both associate the declared name with fixed-six contiguous bytes, filled as@{'a', 'b', 'c', 'd', 'e', 0}@.877 But @x@ gets them allocated in the active stack frame (with values filled in as control passes the declaration), while @y@ refers into the executable's read-only datasection.876 Both associate the declared name with the fixed, six contiguous bytes: @{'a', 'b', 'c', 'd', 'e', 0}@. 877 But @x@ is allocated on the stack (with values filled at the declaration), while @y@ refers to the executable's read-only data-section. 878 878 With @x@ representing an allocation, it offers information in @sizeof(x)@ that @y@ does not. 879 But this extra information is second-class, as it can only be used in the immediate lexical context, \ie it cannot be passed onto string operations or user functions.879 But this extra information is second-class, as it can only be used in the immediate lexical context, \ie it cannot be passed to string operations or user functions. 880 880 Only pointers to text buffers are first-class, and discussed further. 881 882 881 \begin{cfa} 883 882 char * s = "abcde"; 884 char * s1 = s; $\C {// alias state, n/asymmetry, variable-constrained referent}$885 char * s2 = &s[1]; $\C{// alias state, n/a symmetry, variable-constrained referent}\CRT$886 char * s3 = &s2[1]; $\C{// alias state, n/a symmetry, variable-constrained referent}883 char * s1 = s; $\C[2.25in]{// alias state, N/A symmetry, variable-constrained referent}$ 884 char * s2 = &s[1]; $\C{// alias state, N/A symmetry, variable-constrained referent}$ 885 char * s3 = &s2[1]; $\C{// alias state, N/A symmetry, variable-constrained referent}\CRT$ 887 886 printf( "%s %s %s %s\n", s, s1, s2, s3 ); 888 887 $\texttt{\small abcde abcde bcde cde}$ … … 904 903 string & s5 = s.substr(2,4); $\C{// error: cannot point to temporary}\CRT$ 905 904 \end{cfa} 906 The @s1@ lax symmetry reflects how its validity ofdepends on the lifetime of @s@.905 The @s1@ lax symmetry reflects how its validity depends on the lifetime of @s@. 907 906 It is common practice in \CC to use the @s1@-style for a by-reference function parameter. 908 907 Doing so assumes that the callee only uses the referenced string for the duration of the call, \ie no storing the parameter (as a reference) for later. 909 908 So, when the called function is a constructor, its definition typically uses an @s2@-style copy-initialization. 910 909 Exceptions to this pattern are possible, but require the programmer to assure safety where the type system does not. 911 The @s3@ initialization must copy the substring because it must support a subsequent @c_str@ call, which provides anull-termination, generally at a different position than the source string's.910 The @s3@ initialization must copy the substring to support a subsequent @c_str@ call, which provides null-termination, generally at a different position than the source string's. 912 911 @s2@ assignment could be made fast, by reference-counting the text area and using copy-on-write, but would require an implementation upgrade. 913 912 … … 929 928 With @s2@, the case for fast-copy is more subtle. 930 929 Certainly, its value is not pointer-equal to @s@, implying at least a further allocation. 931 But because Java is notconstrained to use a null-terminated representation, a standard-library implementation is free to refer to the source characters in-place.930 But because Java is \emph{not} constrained to use a null-terminated representation, a standard-library implementation is free to refer to the source characters in-place. 932 931 Java does not meet the aliasing requirement because immutability makes it impossible to modify. 933 932 Java's @StringBuffer@ provides aliasing (see @replace@ example on \VPageref{p:JavaReplace}), though without supporting symmetric treatment of a fragment referent, \eg @substring@ of a @StringBuffer@ is a @String@; … … 960 959 961 960 962 963 \subsection{Logical overlap} 961 \subsection{Logical Overlap} 964 962 965 963 It may be unfamiliar to combine \VRef[Figure]{f:StrSemanticCompare}'s alias state and fragment referent in one API, or at the same time. 966 964 This section shows the capability in action. 967 965 968 In summary, the metaphor of a GUI text editor is intended. 969 Selecting a consecutive block of text using the mouse defines an aliased substring within the file. 970 Typing in this state overwrites what was there before, replacing the originally selected text with more or less text. 971 But the \emph{whole file} grows or shrinks as a result, not just the selection. 972 This action models assigning to an aliased substring when the two strings overlap by total containment: one string is the selection, the other is the whole file. 973 974 Now extend the metaphor to a multi-user online editor. 975 If Alice selects a range of text at the bottom of the file, wile Bob is rewriting a paragraph at the top, Alice's selection holds onto the logical characters initially selected, unaffected by Bob making the total file grow/shrink, and unaffectd by Bob causing the start index of Alice's selction to vary. 976 This action models assigning to an aliased substring when the two strings do not overlap at all: one string is Alice's selection, the other is Bob's. 977 978 If a third office worker were also watching Alice's and Bob's actions on the whole file (a string with ``all the text'' is kept around), then two further single-user-edit cases give the semantics of the individual edits flowing into the whole. 979 But, departing from the document analogy, it is not necessary to keep a such a third string: 980 no one has to resource-manage ``the document.'' 981 When an original string, from which both the Alice- and Bob-parts came, ceases to exist, Alice and Bob are left with two independent strings. 982 They are independent because Alice and Bob have no API for growing the bounds of a string to subsume text that may once have been around it. 983 984 Edge cases, notably ``Venn-diagram overlap,'' had to have handlings chosen. 985 The intent in fleshing out these details was to achieve the above story, with a single API, while keeping the rest as simple as possible. 986 The remainder of this section shows the resulting decisions, played out at the API level. 987 988 \CFA uses the marker @`share@ as a dynamic mechanism to indicate alias (mutations shared) \vs snapshot (not quite an immutable result, but one with subsequent mutations isolated). 966 \begin{comment} 967 The metaphor of a GUI text-editor is used to illustrate combining these features. 968 Most editors allow selecting a consecutive block of text (highlighted) to define an aliased substring within a document. 969 Typing in this area overwrites the prior text, replacing the selected text with less, same, or more text. 970 Importantly, the document also changes size, not just the selection. 971 %This alias model is assigning to an aliased substring for two strings overlapping by total containment: one is the selected string, the other is the document. 972 Extend the metaphor to two selected areas, where one area can be drag-and-dropped into another, changing the text in the drop area and correspondingly changing the document. 973 When the selected areas are indenpendent, the semantics of the drag-and-drop are straightforward. 974 However, for overlapping selections, either partial or full, there are multiple useful semantics. 975 For example, two areas overlap at the top, or bottom, or a block at a corner, where one areas is dropped into the other. 976 For selecting a smaller area within a larger, and dropping the smaller area into the larger to replace it. 977 In both cases, meaningful semantics must be constructed or the operation precluded. 978 However, without this advanced capability, certain operations become multi-step, possible requiring explicit temporaries. 979 \end{comment} 980 981 A GUI text-editor provides a metaphor. 982 Selecting a block of text using the mouse defines an aliased substring within a document. 983 Typing in this area overwrites what was there, replacing the originally selected text with more or less text. 984 But the \emph{containing document} also grows or shrinks, not just the selection. 985 This action models assigning to an aliased substring when one string is completely contained in the other. 986 987 Extend the metaphor to a multi-user editor. 988 If Alice selects a range of text at the bottom, while Bob is rewriting a paragraph at the top, Alice's selection holds onto the characters initially selected, unaffected by Bob making the document grow/shrink even though Alice's start index in the document is changing. 989 This action models assigning to an aliased substring when the two strings do not overlap. 990 991 Logically, Alice's and Bob's actions on the whole document are like two single-user-edit cases, giving the semantics of the individual edits flowing into a whole. 992 But, there is no need to have two separate document strings. 993 Even if a third selection removes all the text, both Alice's and Bob's strings remain. 994 The independence of their selections assumes that the editor API does not allow the selection to be enlarged, \ie adding text from the containing environment, which may have disappeared. 995 996 This leaves the ``Venn-diagram overlap'' cases, where Alice's and Bob's selections overlap at the top, bottom, or corner. 997 In this case, the selection areas are dependent, and so, changes in content and size in one may have an affect in the other. 998 There are multiple possible semantics for this case. 999 The remainder of this section shows the chosen semantics for all of the cases. 1000 1001 String sharing is expressed using the @`share@ marker to indicate aliasing (mutations shared) \vs snapshot (not quite an immutable result, but one with subsequent mutations isolated). 989 1002 This aliasing relationship is a sticky property established at initialization. 990 1003 For example, here strings @s1@ and @s1a@ are in an aliasing relationship, while @s2@ is in a copy relationship. 991 1004 \input{sharing1.tex} 992 1005 Here, the aliasing (@`share@) causes partial changes (subscripting) to flow in both directions. 993 (In the following examples, watchhow @s1@ and @s1a@ change together, and @s2@ is independent.)1006 (In the following examples, note how @s1@ and @s1a@ change together, and @s2@ is independent.) 994 1007 \input{sharing2.tex} 995 1008 Similarly for complete changes. … … 999 1012 \input{sharing4.tex} 1000 1013 1001 Now, consider string @s1_mid@ being an alias in the middle of @s1@, along with @s2@, made by a simplecopy from the middle of @s1@.1014 Now, consider string @s1_mid@ being an alias in the middle of @s1@, along with @s2@, made by a copy from the middle of @s1@. 1002 1015 \input{sharing5.tex} 1003 1016 Again, @`share@ passes changes in both directions; copy does not. … … 1020 1033 When @s1_bgn@'s size increases by 3, @s1_mid@'s starting location moves from 1 to 4 and @s1_end@'s from 3 to 6, 1021 1034 1022 When changes happen son an aliasing substring that overlap.1035 When changes happen on an aliasing substring that overlap. 1023 1036 \input{sharing10.tex} 1024 Strings @s1_crs@ and @s1_mid@ overlap at character 4, @j@ because the substrings are 3,2 and 4,2.1037 Strings @s1_crs@ and @s1_mid@ overlap at character 4, @j@, because the substrings are 3,2 and 4,2. 1025 1038 When @s1_crs@'s size increases by 1, @s1_mid@'s starting location moves from 4 to 5, but the overlapping character remains, changing to @'+'@. 1026 1039 … … 1079 1092 1080 1093 1081 1082 \section{Storage management} 1094 \section{Storage Management} 1083 1095 1084 1096 This section discusses issues related to storage management of strings. … … 1099 1111 const string s1 = "abc"; 1100 1112 \end{cfa} 1101 the @const@ applies to the @s1@ pointer to @"abc"@, and @"abc"@ is an immutable constant that is \emph{copied} into the string's storage. 1102 Hence, @s1@ is not pointing at an immutable constant, meaning its underlying string can be mutable, unless some other designation is specified, such as Java's global immutable rule. 1103 1104 1105 1106 \subsection{General implementation} 1113 @const@ applies to the @s1@ pointer to @"abc"@, and @"abc"@ is an immutable constant that is \emph{copied} into the string's storage. 1114 Hence, @s1@ is not pointing at an immutable constant and its underlying string is mutable, unless some other designation is specified, such as Java's global immutable rule. 1115 1116 1117 \subsection{General Implementation} 1107 1118 \label{string-general-impl} 1108 1119 … … 1113 1124 A string is a smart pointer into this buffer. 1114 1125 1115 This cycle of frequent cheap allocations, interspersed with infrequent expensive compactions, has obvious similarities to a general-purpose memory 1126 This cycle of frequent cheap allocations, interspersed with infrequent expensive compactions, has obvious similarities to a general-purpose memory-manager based on garbage collection (GC). 1116 1127 A few differences are noteworthy. 1117 1128 First, in a general purpose manager, the allocated objects may contain pointers to other objects, making the transitive reachability of these objects a crucial property. 1118 1129 Here, the allocations are text, so one allocation never keeps another alive. 1119 1130 Second, in a general purpose manager, the handle that keeps an allocation alive is a bare pointer. 1120 For strings, a fatter representation is acceptable because this pseudo-pointer is only used for ent y into the string-heap, not for general data-sub-structure linking around the general heap.1131 For strings, a fatter representation is acceptable because this pseudo-pointer is only used for entry into the string-heap, not for general data-substructure linking around the general heap. 1121 1132 1122 1133 \begin{figure} … … 1128 1139 \VRef[Figure]{f:memmgr-basic} shows the representation. 1129 1140 The heap header and text buffer define a sharing context. 1130 Normally, one global sharingcontext is appropriate for an entire program;1131 concurren t exceptions arediscussed in \VRef{s:ControllingImplicitSharing}.1132 A string is a handle into the buffer and node within a linked list.1141 Normally, one global context is appropriate for an entire program; 1142 concurrency is discussed in \VRef{s:ControllingImplicitSharing}. 1143 A string is a handle to a node in a linked list containing a information about a string text in the buffer. 1133 1144 The list is doubly linked for $O(1)$ insertion and removal at any location. 1134 1145 Strings are ordered in the list by text start address. 1135 The hea der maintains a next-allocation pointer, @alloc@, pointing to the last live allocation in the buffer.1146 The heap header maintains a next-allocation pointer, @alloc@, pointing to the last live allocation in the buffer. 1136 1147 No external references point into the buffer and the management procedure relocates the text allocations as needed. 1137 1148 A string handle references a containing string, while its string is contiguous and not null terminated. … … 1139 1150 String handles can be allocated in the stack or heap, and represent the string variables in a program. 1140 1151 Normal C life-time rules apply to guarantee correctness of the string linked-list. 1141 The text buffer is large enough with good management so that often only one dynamic allocation is necessary during program execution .1152 The text buffer is large enough with good management so that often only one dynamic allocation is necessary during program execution, but not so large as to cause program bloat. 1142 1153 % During this period, strings can vary in size dynamically. 1143 1154 1144 1155 When the text buffer fills, \ie the next new string allocation causes @alloc@ to point beyond the end of the buffer, the strings are compacted. 1145 1156 The linked handles define all live strings in the buffer, which indirectly defines the allocated and free space in the buffer. 1146 Since the string handles are in sorted order,the handle list can be traversed, copying the first live text to the start of the buffer, and subsequent strings after each other.1147 If, upon compaction, the amount of free storage wouldstill be less than the new string allocation, a larger text buffer is heap-allocated, the current buffer is copied into the new buffer, and the original buffer is freed.1157 The string handles are maintained in sorted order, so the handle list can be traversed, copying the first live text to the start of the buffer, and subsequent strings after each other. 1158 After compaction, if free storage is still be less than the new string allocation, a larger text buffer is heap-allocated, the current buffer is copied into the new buffer, and the original buffer is freed. 1148 1159 Note, the list of string handles is structurally unaffected during a compaction; 1149 1160 only the text pointers in the handles are modified to new buffer locations. … … 1157 1168 Both string initialization styles preserve the string module's internal invariant that the linked-list order matches the buffer order. 1158 1169 For string destruction, handles are removed from the list. 1159 As a result, once a last handle using a run of buffer characters is destroyed, that buffer space gets excluded from the next compaction, making its character-count available in the compacted buffer.1160 1161 Certain string operations canresult in a substring of another string.1162 The resulting handle is then placed in the correct sorted position in the list, possible witha short linear search to locate the position.1170 Once the last handle using a run of buffer characters is destroyed, that buffer space is excluded from use until the next compaction. 1171 1172 Certain string operations result in a substring of another string. 1173 The resulting handle is then placed in the correct sorted position in the list, possible requiring a short linear search to locate the position. 1163 1174 For string operations resulting in a new string, that string is allocated at the end of the buffer. 1164 1175 For shared-edit strings, handles that originally referenced containing locations need to see the new value at the new buffer location. … … 1175 1186 1176 1187 1177 \subsection{RAII limitations}1188 \subsection{RAII Limitations} 1178 1189 \label{string-raii-limit} 1179 1190 1180 1191 Earlier work on \CFA~\cite[ch.~2]{Schluntz17} implemented object constructors and destructors for all types (basic and user defined). 1181 A constructor is a user-defined function run implicitly \emph{after} an object's storage is allocated, and a destructor is a user-defined function run \emph{before} an object's storage is deall cated.1192 A constructor is a user-defined function run implicitly \emph{after} an object's storage is allocated, and a destructor is a user-defined function run \emph{before} an object's storage is deallocated. 1182 1193 This feature, called Resource Acquisition Is Initialization (RAII)~\cite[p.~389]{Stroustrup94}, helps guarantee invariants for users before accessing an object and for the programming environment after an object terminates. 1183 1194 … … 1213 1224 \end{cfa} 1214 1225 A module providing the @T@ type can traverse @all_T@ at relevant times, to keep the objects ``good.'' 1215 Hence, declaring a @T@ not only ensures that it begins with an initially ``good'' value, but it also provides an implicit subscription to a service that keeps the value ``good'' in the future.1226 Hence, declaring a @T@ not only ensures that it begins with an initially ``good'' value, but it also provides an implicit subscription to a service that keeps the value ``good'' during its lifetime. 1216 1227 Again, both \CFA and \CC support this usage style. 1217 1228 1218 1229 A third capability concerns \emph{implicitly} requested copies. 1219 1230 When stack-allocated objects are used as parameter and return values, a sender's version exists in one stack frame and a receiver's version exists in another. 1220 In the parameter direction, the language's function-call handling must arrange for a copy-constructor call to happen\footnote{ 1221 \CC also offers move constructors and return-value optimization~\cite{RVO20}. 1222 These features help reduce unhelpful copy-constructor calls, which, for types like the example \lstinline{S}, would lead to extra memory allocations. 1223 \CFA does not currently have these features; adding similarly-intended features to \CFA is desirable. 1224 However, this section is about a problem in the realization of features that \CFA already supports. 1225 To understand the problem presented, the appropriate comparison is with classic versions of \CC that treated such copy-constructor calls as necessary.} 1226 at a time near the control transfer into the callee, with the source as the caller's (sender's) version and the target as the callee's (receiver's) version. 1227 (In the return direction, the roles are reversed and the copy-constructor call happens near the return of control.) 1228 \CC supports this capability without qualification. 1229 \CFA offers limited support here; simple examples work, but implicit copying does not combine successfully with the other RAII capabilities discussed. 1231 In the parameter direction, the language's function-call handling must arrange for a copy-constructor call to happen, at a time near the control transfer into the callee. %, with the source as the caller's (sender's) version and the target as the callee's (receiver's) version. 1232 In the return direction, the roles are reversed and the copy-constructor call happens near the return of control. 1233 \CC supports this capability.% without qualification. 1234 \CFA offers limited support; 1235 simple examples work, but implicit copying does not combine successfully with the other RAII capabilities discussed. 1236 1237 \CC also offers move constructors and return-value optimization~\cite{RVO20}. 1238 These features help reduce unhelpful copy-constructor calls, which, for types like the @S@ example, would lead to extra memory allocations. 1239 \CFA does not currently have these features; adding similarly-intended features to \CFA is desirable. 1240 However, this section is about a problem in the realization of features that \CFA already supports. 1241 Hence, the comparison continues with the classic version of \CC that treated such copy-constructor calls as necessary. 1230 1242 1231 1243 To summarize the unsupported combinations, the relevant features are: … … 1243 1255 At that time, adhering to a principal of minimal intervention, this code could always be treated as passthrough: 1244 1256 \begin{cfa} 1245 struct U { ...};1257 struct U { ... }; 1246 1258 // RAII to go here 1247 1259 void f( U u ) { F_BODY(u) } … … 1249 1261 f( x ); 1250 1262 \end{cfa} 1251 But adding custom RAII (at ``... here'') changes things.1252 The common C++ lowering~\cite[Sec. 3.1.2.3]{cxx:raii-abi} proceeds differently than the presentCFA lowering.1253 1254 \ noindent1255 \begin{tabular}{ l|l}1256 \begin{cfa} 1257 // C++, likely CFA to be 1263 But adding custom RAII (at ``...go here'') changes things. 1264 The common \CC lowering~\cite[Sec. 3.1.2.3]{cxx:raii-abi} proceeds differently than the present \CFA lowering. 1265 \begin{cquote} 1266 \setlength{\tabcolsep}{15pt} 1267 \begin{tabular}{@{}l|l@{}} 1268 \begin{cfa} 1269 $\C[0.0in]{// \CC, \CFA future}\CRT$ 1258 1270 struct U {...}; 1259 1271 // RAII elided 1260 1272 void f( U * __u_orig ) { 1261 1273 U u = * __u_orig; // call copy ctor 1262 F_BODY( u)1274 F_BODY( u ); 1263 1275 // call dtor, u 1264 1276 } 1265 1277 U x; // call default ctor 1266 f( & x ) ; 1278 1279 1280 f( &x ) ; 1281 1282 1267 1283 // call dtor, x 1268 1284 \end{cfa} 1269 1285 & 1270 1286 \begin{cfa} 1271 // CFA today 1287 $\C[0.0in]{// \CFA today}\CRT$ 1272 1288 struct U {...}; 1273 1289 // RAII elided 1274 1290 void f( U u ) { 1275 F_BODY(u) 1291 1292 F_BODY( u ); 1293 1276 1294 } 1277 1295 U x; // call default ctor … … 1284 1302 \end{cfa} 1285 1303 \end{tabular} 1286 1287 In the CFA-today scheme, the lowered form is still using a by-value C call. 1288 C does a @memcpy@ on structs passed by value. 1289 And so, @F_BDY@ sees the bits of @__u_for_f@ occurring at an address that has never been presented to the @U@ lifecycle functions. 1290 If @U@ is trying to have a style-\#2 invariant, it shows up broken in @F_BDY@: references that are supposed to be to @u@ are actually to the different location @__u_for_f@. 1291 The \CC scheme does not have this problem because it constructs the for-@f@ copy in the correct location. 1292 1293 Yet, the \CFA-today scheme is sufficient to deliver style-\#1 invariants (in this style-\#3 use case) because this scheme still does the correct number of lifecycle calls, using correct values, at correct times. So, reference-counting or simple ownership applications get their invariants respected under call/return-by-value. 1304 \end{cquote} 1305 The current \CFA scheme is still using a by-value C call. 1306 C does a @memcpy@ on structures passed by value. 1307 And so, @F_BODY@ sees the bits of @__u_for_f@ occurring at an address that has never been presented to the @U@ lifecycle functions. 1308 If @U@ is trying to have a style-\#2 invariant, it shows up broken in @F_BODY@: references supposedly to @u@ are actually to @__u_for_f@. 1309 The \CC scheme does not have this problem because it constructs the for @f@ copy in the correct location within @f@. 1310 1311 Yet, the current \CFA scheme is sufficient to deliver style-\#1 invariants (in this style-\#3 use case) because this scheme still does the correct number of lifecycle calls, using correct values, at correct times. 1312 So, reference-counting or simple ownership applications get their invariants respected under call/return-by-value. 1294 1313 1295 1314 % [Mike is not currently seeing how distinguishing initialization from assignment is relevant] … … 1325 1344 % The following discusses the consequences of this semantics with respect to lifetime management of \CFA strings. 1326 1345 1327 1328 The string API offers style \#3's pass-by-value in, for example, in the return of @"a" + "b"@. 1346 The string API offers style \#3's pass-by-value in, \eg in the return of @"a" + "b"@. 1329 1347 Its implementation uses the style-\#2 invariant of the string handles being linked to each other, helping to achieve high performance. 1330 Since these two RAII styles canno nt coexist, a workaround splits the API into two layers: one that provides pass-by-value, built upon the other with inter-linked handles.1348 Since these two RAII styles cannot coexist, a workaround splits the API into two layers: one that provides pass-by-value, built upon the other with inter-linked handles. 1331 1349 The layer with pass-by-value incurs a performance penalty, while the layer without delivers the desired runtime performance. 1332 The slower, friendlier High Level API (HL, type @string@) wrap ps the faster, more primitive Low Level API (LL, type @string_res@, abbreviating ``resource'').1350 The slower, friendlier High Level API (HL, type @string@) wraps the faster, more primitive Low Level API (LL, type @string_res@, abbreviating ``resource''). 1333 1351 Both APIs present the same features, up to return-by-value operations being unavailable in LL and implemented via the workaround in HL. 1334 1352 The intention is for most future code to target HL. 1335 When the RAII issue is fixed, the full HL feature set will be acheivable using the LL-style lifetime management. 1336 So then, there will be no need for two API levels; HL will be removed; LL's type will be renamed to @string@; programs written for current HL will run faster. 1353 When the RAII issue is fixed, the full HL feature set will be achievable using the LL-style lifetime management. 1354 Then, HL will be removed; 1355 LL's type will be renamed @string@ and programs written for current HL will run faster. 1337 1356 In the meantime, performance-critical sections of applications must use LL. 1338 1357 Subsequent performance experiments \see{\VRef{s:PerformanceAssessment}} use the LL API when comparing \CFA to other languages. 1339 1358 This measurement gives a fair estimate of the goal state for \CFA. 1340 1359 A separate measure of the HL overhead is also included. 1341 1342 \VRef[Section]{string-general-impl} described the goal state for \CFA.In present state, the type @string_res@ replaces its mention of @string@ as inter-linked handle.1343 1344 To use LL, a programmer rewrites invocations that used pass-by-value APIs into invocations where the resourcing is more explicit.1345 Many invocations are unaffected, notably includingassignment and comparison.1346 Of the capabilities listed in \VRef[Figure]{f:StrApiCompare}, only the following three cases haverevisions.1347 1348 \ noindent1360 hence, \VRef[Section]{string-general-impl} us describing the goal state for \CFA. 1361 In present state, the type @string_res@ replaces its mention of @string@ as inter-linked handle. 1362 1363 To use LL, a programmer rewrites invocations using pass-by-value APIs into invocations where resourcing is more explicit. 1364 Many invocations are unaffected, notably assignment and comparison. 1365 Of the capabilities listed in \VRef[Figure]{f:StrApiCompare}, only the following three cases need revisions. 1366 \begin{cquote} 1367 \setlength{\tabcolsep}{15pt} 1349 1368 \begin{tabular}{ll} 1350 1369 HL & LL \\ 1351 1370 \hline 1352 1371 \begin{cfa} 1372 1353 1373 string s = "a" + "b"; 1354 1374 \end{cfa} … … 1363 1383 string s = "abcde"; 1364 1384 string s2 = s(2, 3); // s2 == "cde" 1385 1365 1386 s(2,3) = "x"; // s == "abx" && s2 == "cde" 1366 1387 \end{cfa} … … 1376 1397 \begin{cfa} 1377 1398 string s = "abcde"; 1399 1378 1400 s[2] = "xxx"; // s == "abxxxde" 1379 1401 \end{cfa} … … 1385 1407 \end{cfa} 1386 1408 \end{tabular} 1387 1409 \end{cquote} 1388 1410 The actual HL workaround is having @string@ wrap a pointer to a uniquely owned, heap-allocated @string_res@. This arrangement has @string@ being style-\#1 RAII, which is compatible with pass-by-value. 1389 1411 1390 1412 1391 1392 \subsection{Sharing implementation} 1413 \subsection{Sharing Implementation} 1393 1414 \label{sharing-impl} 1394 1415 1395 The \CFA string module has two mechanisms to handle the case when string handles share a run of text. 1396 1416 The \CFA string module has two mechanisms to deal with string handles sharing text. 1397 1417 In the first type of sharing, the user requests that both string handles be views of the same logical, modifiable string. 1398 1418 This state is typically produced by the substring operation. … … 1404 1424 $\texttt{\small axcde xc}$ 1405 1425 \end{cfa} 1406 In a typical substring call, the source string-handle is referencing an entire string, and the resulting, newly made, string handle is referencing aportion of the original.1407 In this state, a subsequent modification made by either is visible in both.1426 Here, the source string-handle is referencing an entire string, and the resulting, newly made, string handle is referencing a contained portion of the original. 1427 In this state, a modification made in the overlapping area is visible in both strings. 1408 1428 1409 1429 The second type of sharing happens when the system implicitly delays the physical execution of a logical \emph{copy} operation, as part of its copy-on-write optimization. … … 1418 1438 In this state, a subsequent modification done on one handle triggers the deferred copy action, leaving the handles referencing different text within the buffer, holding distinct values. 1419 1439 1420 A further abstraction , in the string module's implementation,helps distinguish the two senses of sharing.1440 A further abstraction helps distinguish the two senses of sharing. 1421 1441 A share-edit set (SES) is an equivalence class over string handles, being the reflexive, symmetric and transitive closure of the relationship of one string being constructed from another, with the ``share'' option given. 1422 1442 The SES is represented by a second linked list among the handles. … … 1427 1447 1428 1448 1429 \subsection{Controlling implicit sharing}1449 \subsection{Controlling Implicit Sharing} 1430 1450 \label{s:ControllingImplicitSharing} 1431 1451 … … 1434 1454 In detail, string sharing has inter-linked string handles, so managing one string is also managing the neighbouring strings, and from there, a data structure of the ``set of all strings.'' 1435 1455 Therefore, it is useful to toggle this capability on or off when it is not providing any application benefit. 1436 1437 \begin{figure}1438 \begin{tabular}{ll}1439 \lstinputlisting[language=CFA, firstline=10, lastline=55]{sharectx.run.cfa}1440 &1441 \raisebox{-0.17\totalheight}{\includegraphics{string-sharectx.pdf}} % lower1442 \end{tabular}1443 \caption{Controlling copying vs sharing of strings using \lstinline{string_sharectx}.}1444 \label{fig:string-sharectx}1445 \end{figure}1446 1456 1447 1457 The \CFA string library provides the type @string_sharectx@ to control an ambient sharing context. … … 1456 1466 Executing the example does not produce an interesting outcome, but the comments in the picture indicate when the logical copy operation runs with 1457 1467 \begin{description} 1458 1459 1468 \item[share:] the copy being deferred, as described through the rest of this section (fast), or 1469 \item[copy:] the copy performed eagerly (slow). 1460 1470 \end{description} 1461 1471 Only eager copies can cross @string_sharectx@ boundaries. 1462 1472 The intended use is with stack-managed lifetimes, in which the established context lasts until the current function returns, and affects all functions called that do not create their own contexts. 1463 1473 1464 [ TODO: true up with ``is thread local'' (implement that and expand this discussion to give a concurrent example, or adjust this wording) ] 1465 1466 1467 \subsection{Sharing and threading} 1474 \begin{figure} 1475 \begin{tabular}{ll} 1476 \lstinputlisting[language=CFA, firstline=10, lastline=55]{sharectx.run.cfa} 1477 & 1478 \raisebox{-0.17\totalheight}{\includegraphics{string-sharectx.pdf}} % lower 1479 \end{tabular} 1480 \caption{Controlling copying vs sharing of strings using \lstinline{string_sharectx}.} 1481 \label{fig:string-sharectx} 1482 \end{figure} 1483 1484 1485 \subsection{Sharing and Threading} 1468 1486 1469 1487 The \CFA string library provides no thread safety, the same as \CC string, providing similar performance goals. … … 1474 1492 1475 1493 1476 \subsection{Future work} 1477 1478 Implementing the small-string optimization is straightforward, as a string header contains a pointer to the string text in the buffer. 1479 This pointer could be marked with a flag and contain a small string. 1480 However, there is now a conditional check required on the fast-path to switch between small and large string operations. 1481 1482 It might be possible to pack 16- or 32-bit Unicode characters within the same string buffer as 8-bit characters. 1483 Again, locations for identification flags must be found and checked along the fast path to select the correct actions. 1484 Handling utf8 (variable length), is more problematic because simple pointer arithmetic cannot be used to stride through the variable-length characters. 1485 Trying to use a secondary array of fixed-sized pointers/offsets to the characters is possible, but raises the question of storage management for the utf8 characters themselves. 1486 1487 1488 \section{Performance assessment} 1489 \label{s:PerformanceAssessment} 1490 1491 I assessed the \CFA string library's speed and memory usage against strings in \CC STL. 1492 1493 Overall, this analysis shows that adding support for the features shown earlier in the chapter comes at no substantial cost in the performance of featrues common to both APIs. 1494 1495 Moreover, the results support the \CFA string's position as a high-level enabler of simplified text processing. 1496 STL makes its user think about memory management. 1497 When the user does, and is successful, STL's performance can be very good. 1498 But when the user fails to think through the consequences of the STL representation, performance becomes poor. 1499 The \CFA string lets the user work at the level of just putting the right text into right variables, with corresponding performance degradations reduced or eliminated. 1500 1501 % The final test shows the overall win of the \CFA text-sharing mechanism. 1502 % It exercises several operations together, showing \CFA enabling clean user code to achieve performance that STL requires less-clean user code to achieve. 1503 1504 1505 \subsection{Methodology} 1506 1507 These tests use a \emph{corpus} of strings. 1508 Their lengths are important; the specific characters occurring in them are immaterial. 1509 In a result graph, a corpus's mean string length is often the independent variable shown on the X axis. 1510 1511 When a corpus contains strings of different lenghths, the lengths are drawn from a geometric distribution. 1512 Therefore, strings much longer than the mean occur nontrivially and strings slightly shorter than the mean occur most often. 1513 A corpus's string sizes are one of: 1514 \begin{description} 1515 \item [Fixed-size] all string lengths are of the stated size. 1516 \item [Varying 1 and up] the string lengths are drawn from the geometric distribution with a stated mean and all lengths occur. 1517 \item [Varying 16 and up] string lengths are drawn from the geometric distribution with the stated mean, but only lengths 16 and above occur; thus, the stated mean is above 16. \PAB{Is this one unused? May have just been for ``normalize.''} 1518 \end{description} 1519 The special treatment of length 16 deals with the short-string optimization (SSO) in STL @string@, currently not implemented in \CFA, though a fine future improvement to \CFA. 1520 In the general case, an STL string handle is a pointer (to separately allocated text) and a length. 1521 But when the text is shorter than this representation, the optimization repurposes the handle's storage to eliminate using the heap. 1494 \subsection{Short-String Optimization} 1495 1496 \CC implements a short-string ($\le$16) optimization (SSO). 1497 As a string header contains a pointer to the string text, this pointer can be tagged and used to contain a short string, removing a dynamic memory allocation/deallocation. 1522 1498 \begin{c++} 1523 1499 class string { … … 1529 1505 char sstr[sizeof(lstr)]; $\C{// short string <16 characters, text in situ}$ 1530 1506 }; 1531 $\C{// tagging for kind (short or long) elided}$1507 // some tagging for short or long strings 1532 1508 }; 1533 1509 \end{c++} 1534 1510 However, there is now a conditional check required on the fast-path to switch between short and long string operations. 1511 1512 It might be possible to pack 16- or 32-bit Unicode characters within the same string buffer as 8-bit characters. 1513 Again, locations for identification flags must be found and checked along the fast path to select the correct actions. 1514 Handling utf8 (variable length), is more problematic because simple pointer arithmetic cannot be used to stride through the variable-length characters. 1515 Trying to use a secondary array of fixed-sized pointers/offsets to the characters is possible, but raises the question of storage management for the utf8 characters themselves. 1516 1517 1518 \section{Performance Assessment} 1519 \label{s:PerformanceAssessment} 1520 1521 I assessed the \CFA string library's speed and memory usage against strings in \CC STL. 1522 Overall, this analysis shows that adding support for the features shown earlier in the chapter comes at no substantial cost in the performance of features common to both APIs. 1523 1524 Moreover, the results support the \CFA string's position as a high-level enabler of simplified text processing. 1525 STL makes its user think about memory management. 1526 When the user does, and is successful, STL's performance can be very good. 1527 But when the user fails to think through the consequences of the STL representation, performance becomes poor. 1528 The \CFA string lets the user work at the level of just putting the right text into the right variables, with corresponding performance degradations reduced or eliminated. 1529 1530 % The final test shows the overall win of the \CFA text-sharing mechanism. 1531 % It exercises several operations together, showing \CFA enabling clean user code to achieve performance that STL requires less-clean user code to achieve. 1532 1533 1534 \subsection{Methodology} 1535 1536 These tests use a \emph{corpus} of strings. 1537 Their lengths are important; the specific characters occurring in them are immaterial. 1538 In a result graph, a corpus's mean string length is often the independent variable shown on the X axis. 1539 1540 When a corpus contains strings of different lengths, the lengths are drawn from a geometric distribution. 1541 Therefore, strings much longer than the mean occur less often and strings slightly shorter than the mean occur most often. 1542 A corpus's string sizes are one of: 1543 \begin{description} 1544 \item [Fixed-size] all string lengths are of the stated size. 1545 \item [Varying 1 and up] the string lengths are drawn from the geometric distribution with a stated mean and all lengths occur. 1546 \item [Varying 16 and up] string lengths are drawn from the geometric distribution with the stated mean, but only lengths 16 and above occur; thus, the stated mean is above 16. 1547 \end{description} 1548 The special treatment of length 16 deals with the SSO in STL @string@, currently not implemented in \CFA. 1535 1549 A fixed-size or from-16 distribution ensures that \CC's extra-optimized cases are isolated within, or removed from, the comparison. 1536 1537 1550 In all experiments that use a corpus, its text is generated and loaded into the system under test before the timed phase begins. 1538 1539 1551 To ensure comparable results, a common memory allocator is used for \CFA and \CC. 1540 \CFA runs the llheap allocator~\cite{Zulfiqar22} ; the test rig plugs this same allocatorinto \CC.1552 \CFA runs the llheap allocator~\cite{Zulfiqar22}, which is also plugged into \CC. 1541 1553 1542 1554 The operations being measured take dozens of nanoseconds, so a succession of many invocations is run and timed as a group. 1543 The experiments run with fixed duration (targeting approximately 5 seconds), stopping upon passing a goal time, as determined by re-checking @clock()@ every 10,000 invocations, which is never more often than once per 80 ms.1544 Timing outcomes rep rt mean nanoseconds per invocation, which includes harness overhead and the targeted string API execution.1555 The experiments run for a fixed duration (5 seconds), as determined by re-checking @clock()@ every 10,000 invocations, which is never more often than once per 80 ms. 1556 Timing outcomes report mean nanoseconds per invocation, which includes harness overhead and the targeted string API execution. 1545 1557 1546 1558 \PAB{To discuss: hardware and such} 1547 1559 1548 As discussed in \VRef[Section]{string-raii-limit}, general performance comparisons are made using \CFA's faster, low-level string API, whose string type is named @string_res@. 1549 1550 \VRef{s:ControllingImplicitSharing} presents an operational mode where \CFA string sharing is turned off. In this mode, the \CFA string operates similarly to \CC's, by using a distinct heap allocation for each string's text. 1551 Some experiments include measurements in this mode for baselining purposes. 1552 It is called ``\CC emulation mode'' or ``nosharing'' here. 1553 1560 As discussed in \VRef[Section]{string-raii-limit}, general performance comparisons are made using \CFA's faster, low-level string API, named @string_res@. 1561 \VRef{s:ControllingImplicitSharing} presents an operational mode where \CFA string sharing is turned off. 1562 In this mode, the \CFA string operates similarly to \CC's, by using a heap allocation for string text. 1563 Some experiments include measurements in this mode for baselining purposes, called ``\CC emulation mode'' or ``nosharing''. 1554 1564 1555 1565 1556 1566 \subsection{Test: Append} 1557 1567 1558 These tests measure the speed of appending strings from the corpus onto a larger, growing string. They show \CFA performing comparably to \CC overall, though with reduced penalties for simple API misuses for which \CC programmers may not know to watch out.1559 1568 These tests measure the speed of appending strings from the corpus onto a larger, growing string. 1569 They show \CFA performing comparably to \CC overall, though with penalties for simple API misuses. 1560 1570 The basic harness is: 1561 \begin{cquote} 1562 \setlength{\tabcolsep}{20pt} 1563 \begin{cfa} 1564 START_TIMER 1565 for ( ... ) { 1566 string_res accum; 1567 for ( i; 100 ) { 1568 accum += corpus[ f(i) ]; // importing from char * here 1569 COUNT_ONE_OP_DONE 1571 \begin{cfa} 1572 // set alarm duration 1573 for ( ... ) { $\C[1.5in]{// loop for duration}$ 1574 for ( i; N ) { $\C{// perform multiple appends (concatenations)}$ 1575 accum += corpus[ f( i ) ]; 1570 1576 } 1577 count += N; $\C{// count number of appends}\CRT$ 1571 1578 } 1572 STOP_TIMER 1573 \end{cfa} 1574 \end{cquote} 1575 The harness's outer loop executes until a sample-worthy amount of execution has happened. 1576 The inner loop builds up the desired-length string with successive appends, before the outer makes it start over from a blank accumulator. 1577 Each harness run targets a specific (mean) corpus string length and produces one data point on the result graph. 1578 1579 \end{cfa} 1580 The harness's outer loop executes for the experiment duration. 1581 The string is reset to empty before appending (not shown). 1582 The inner loop builds up a growing-length string with successive appends. 1583 Each run targets a specific (mean) corpus string length and produces one data point on the result graph. 1579 1584 Three specific comparisons are made with this harness. 1580 1585 Each picks its own independent-variable basis of comparison. 1581 1582 All three comparisons use the varying-from-1 corpus construction, \ie they allow the STL to show its advantage from small-string optimization. 1586 All three comparisons use the varying-from-1 corpus construction, \ie they allow the STL to show its advantage for SSO. 1583 1587 1584 1588 1585 1589 \subsubsection{Fresh vs Reuse in \CC, Emulation Baseline} 1586 1590 1587 The first experiment compares \CFA with \CC, with \CFA operating in nosharing mode (and \CC having no other mode).1588 This experiment simply baselines how \CFA modestly lags \CC's optimization/tuning level generally, yet reproduces a coarser phenomenon.1589 1590 This experiment also introduces the first \CC coding pitfall, which the next experiment will show is helped by turning on \CFA sharing. By this pitfall, a \CC programmer must pay attention to string variable reuse.1591 1592 \begin{cquote} 1593 \setlength{\tabcolsep}{ 20pt}1591 The first experiment compares \CFA with \CC, with \CFA operating in nosharing mode and \CC having no other mode, hence both string package are using @malloc@/@free@. 1592 % This experiment establishes a baseline for other experiments. 1593 This experiment also introduces the first \CC coding pitfall, which the next experiment shows is helped by turning on \CFA sharing. 1594 % This pitfall shows, a \CC programmer must pay attention to string variable reuse. 1595 In the following, both programs are doing the same thing: start with @accum@ empty and build it up by appending @N@ strings (type @string@ in \CC and the faster @string_res@ in \CFA). 1596 \begin{cquote} 1597 \setlength{\tabcolsep}{40pt} 1594 1598 \begin{tabular}{@{}ll@{}} 1595 1599 % \multicolumn{1}{c}{\textbf{fresh}} & \multicolumn{1}{c}{\textbf{reuse}} \\ … … 1597 1601 1598 1602 for ( ... ) { 1599 @string_res accum;@ // fresh1600 for ( ...)1601 accum @+=@ ... 1603 @string_res accum;@ $\C[1.5in]{// fresh}$ 1604 for ( N ) 1605 accum @+=@ ... $\C{// append}\CRT$ 1602 1606 } 1603 1607 \end{cfa} … … 1606 1610 string_res accum; 1607 1611 for ( ... ) { 1608 @accum = "";@ $\C[1 in]{// reuse\CRT}$1609 for ( ...)1610 accum @+=@ ... 1612 @accum = "";@ $\C[1.5in]{// reuse}$ 1613 for ( N ) 1614 accum @+=@ ... $\C{// append}\CRT$ 1611 1615 } 1612 1616 \end{cfa} 1613 1617 \end{tabular} 1614 1618 \end{cquote} 1615 1616 Both programs are doing the same thing: start with @x@ empty and build it up by appending the same chunks. 1617 A programmer should not have to consider this difference. 1618 But from under the covers, each string being an individual allocation leaks through. 1619 While the inner loop is appending text to an @x@ that had not yet grown to have a large capacity, the program is, naturally, paying to extend the variable-length allocation, occasionally. 1620 This capacity stretching is a sticky property that survives assigning a (short, empty-string) value into an existing initialization. 1621 So, the ``reuse'' version benefits from not growing the allocation on subsequent runs of the inner loop. 1622 Yet, the ``fresh'' version is constantly restarting from a small buffer. 1619 The difference is creating a new or reusing an existing string variable. 1620 The pitfall is that most programmers do not consider this difference. 1621 However, creating a new variable implies deallocating the previous string storage and allocating new empty storage. 1622 As the string grows, further deallocations/allocations are required to release the previous and extend the current string storage. 1623 So, the fresh version is constantly restarting with zero string storage, while the reuse version benefits from having its prior large storage from the last append sequence. 1623 1624 1624 1625 \begin{figure} … … 1626 1627 \includegraphics{plot-string-peq-cppemu.pdf} 1627 1628 % \includegraphics[width=\textwidth]{string-graph-peq-cppemu.png} 1628 \caption{Fresh vs Reuse in \CC, Emulation Baseline. Average time per iteration with one \lstinline{x += y} invocation (lower is better). Comparing \CFA's STL emulation mode with STL implementations, and comparing the ``fresh'' with ``reused'' reset styles.} 1629 \caption{Fresh vs Reuse in \CC, Emulation Baseline. 1630 Average time per iteration with one \lstinline{x += y} invocation (lower is better). 1631 Comparing \CFA's STL emulation mode with STL implementations, and comparing the fresh with reused reset styles.} 1629 1632 \label{fig:string-graph-peq-cppemu} 1630 \end{figure} 1631 1632 \VRef[Figure]{fig:string-graph-peq-cppemu} shows the resulting performance. 1633 The fresh \vs reuse penalty is the dominant difference. 1634 The cost is 40\% averaged over the cases shown and minimally 24\%. 1635 It shows up consistently on both the \CFA and STL implementations, and this cost is more prominent with larger strings. 1636 1637 The lesser \CFA \vs STL difference shows \CFA reproducing STL's performance, up to a 15\% penalty averaged over the cases shown, diminishing with larger strings, and 50\% in the worst case. 1638 This penalty characterizes implementation fine tuning done with STL and not done yet done with \CFA. 1639 1640 1641 \subsubsection{\CFA's Fresh-Reuse Compromise} 1642 1643 This comparison has the same setup as the last one, except that the \CFA implementation is switched to use its sharing mode. The outcome is that the fresh/reuse difference vanishes in \CFA, with \CFA consistently delivering performance that compromises between the two \CC cases. 1644 1645 \begin{figure} 1646 \centering 1647 \includegraphics{string-graph-peq-sharing.pdf} 1633 \bigskip 1634 \bigskip 1635 \includegraphics{plot-string-peq-sharing.pdf} 1648 1636 % \includegraphics[width=\textwidth]{string-graph-peq-sharing.png} 1649 \caption{\CFA Compromise for Fresh \vs Reuse. Average time per iteration with one \lstinline{x += y} invocation (lower is better). Comparing \CFA's sharing mode with STL, and comparing the ``fresh'' with ``reused'' reset styles. The \CC results are repeated from \ref{fig:string-graph-peq-cppemu}.} 1637 \caption{\CFA Compromise for Fresh \vs Reuse. 1638 Average time per iteration with one \lstinline{x += y} invocation (lower is better). 1639 Comparing \CFA's sharing mode with STL, and comparing the fresh with reused reset styles. 1640 The \CC results are repeated from \VRef[Figure]{fig:string-graph-peq-cppemu}.} 1650 1641 \label{fig:string-graph-peq-sharing} 1651 1642 \end{figure} 1652 1643 1653 \VRef[Figure]{fig:string-graph-peq-sharing} has the result. 1654 At append lengths 5 and above, \CFA not only splits the two STL cases, but its slowdown of 16\% over STL with user-managed reuse is close to the baseline \CFA-v-STL implementation difference seen with \CFA in STL-emulation mode. 1655 1656 1657 \subsubsection{\CFA's low overhead for misusing \lstinline{+}} 1658 1659 A further pitfall occurs when the user writes @x = x + y@, rather than @x += y@. Again, they are logically equivalent. 1644 \VRef[Figure]{fig:string-graph-peq-cppemu} shows the resulting performance. 1645 The two fresh (solid) lines and the two reuse (dash) lines are identical, except for lengths $\le$10, where the \CC SSO has a 40\% average and minimally 24\% advantage. 1646 The gap between the fresh and reuse lines is the removal of the dynamic memory allocates and reuse of prior storage, \eg 100M allocations for fresh \vs 100 allocations for reuse across all experiments. 1647 While allocation reduction is huge, data copying dominates the cost, so the lines are still reasonably close together. 1648 1649 1650 \subsubsection{\CFA's Sharing Mode} 1651 1652 This comparison is the same as the last one, except the \CFA implementation is using sharing mode. 1653 Hence, both \CFA's fresh and reuse versions have no memory allocations, and as before, only for reuse does \CC have no memory allocations. 1654 \VRef[Figure]{fig:string-graph-peq-sharing} shows the resulting performance. 1655 For fresh at append lengths 5 and above, \CFA is now closer to the \CC reuse performance, because of removing the dynamic allocations. 1656 However, for reuse, \CFA has slowed down slightly, to performance matching the new fresh version, as the two versions are now implemented virtually the same. 1657 The reason for the \CFA reuse slow-down is the overhead of managing the sharing scheme (primarily maintaining the list of handles), without gaining any benefit. 1658 1659 \begin{comment} 1660 FIND A HOME!!! 1661 The potential benefits of the sharing scheme do not give \CFA an edge over \CC when appending onto a reused string, though the first one helps \CFA win at going onto a fresh string. These abilities are: 1662 \begin{itemize} 1663 \item 1664 To grow a text allocation repeatedly without copying it elsewhere. 1665 This ability is enabled by \CFA's most-recently modified string being located immediately before the text buffer's \emph{shared} bump-pointer area, \ie often a very large greenfield, relative to the \emph{individual} string being grown. 1666 With \CC-reuse, this benefit is already reaped by the user's reuse of a pre-stretched allocation. 1667 Yet \CC-fresh pays the higher cost because its room to grow for free is at most a constant times the original string's length. 1668 \item 1669 To share an individual text allocation across multiple related strings. 1670 This ability is not applicable to appending with @+=@. 1671 It in play in [xref sub-experiment pta] and [xref experiment pbv]. 1672 \item 1673 To share a text arena across unrelated strings, sourcing disparate allocations from a common place. 1674 That is, always allocating from a bump pointer, and never maintaining free lists. 1675 This ability is not relevant to running any append scenario on \CFA with sharing, because appending modifies an existing allocation and is not driving several allocations. 1676 This ability is assessed in [xref experiment allocn]. 1677 \end{itemize} 1678 This cost, of slowing down append-with-reuse, is \CFA paying the piper for other scenarios done well. 1679 \CFA prioritizes the fresh use case because it is more natural. 1680 The \emph{user-invoked} reuse scheme is an unnatural programming act because it deliberately misuses lexical scope: a variable (@accum@) gets its lifetime extended beyond the scope in which it is used. 1681 1682 A \CFA user needing the best performance on an append scenario can still access the \CC-like speed by invoking noshare. 1683 This (indirect) resource management is memory-safe, as compared to that required in \CC to use @string&@, where knowledge of another string's lifetime comes into play. 1684 This abstraction opt-out is also different from invoking the LL API-level option. 1685 In fact, these considerations are orthogonal. 1686 But the key difference is that invoking the LL API would be a temporary measure, to use a workaround of a known \CFA language issue; choosing to exempt a string from sharing is a permanent act of program tuning. 1687 Beyond these comparisons, opting for noshare actually provides program ``eye candy,'' indicating that under-the-hood thinking is becoming relevant here. 1688 \end{comment} 1689 1690 1691 \subsubsection{Misusing Concatenation} 1692 1693 A further pitfall occurs writing the apparently equivalent @x = x + y@ \vs @x += y@. 1660 1694 For numeric types, the generated code is equivalent, giving identical performance. 1661 1695 However, for string types there can be a significant difference. 1662 This pitfall is a particularly likely hazardfor beginners.1696 This pitfall is a particularly likely for beginners. 1663 1697 1664 1698 In earlier experiments, the choice of \CFA API among HL and LL had no impact on the functionality being tested. … … 1708 1742 \end{tabular} 1709 1743 \end{cquote} 1710 Note that this ``Goal'' code functions today in HL.1744 Note, the goal code functions today in HL but with slower performance. 1711 1745 1712 1746 \begin{figure} 1713 1747 \centering 1714 \includegraphics{ string-graph-pta-sharing.pdf}1748 \includegraphics{plot-string-pta-sharing.pdf} 1715 1749 % \includegraphics[width=\textwidth]{string-graph-pta-sharing.png} 1716 \caption{CFA's low overhead for misusing \lstinline{+}. Average time per iteration with one \lstinline{x += y} invocation (lower is better). Comparing \CFA (having implicit sharing activated) with STL, and comparing the \lstinline{+}-then-\lstinline{=} with the \lstinline{+=} append styles. The \lstinline{+=} results are repeated from \VRef[Figure]{fig:string-graph-peq-sharing}.}1750 \caption{CFA's low overhead for misusing concatenation. Average time per iteration with one \lstinline{x += y} invocation (lower is better). Comparing \CFA (having implicit sharing activated) with STL, and comparing the \lstinline{+}-then-\lstinline{=} with the \lstinline{+=} append styles. The \lstinline{+=} results are repeated from \VRef[Figure]{fig:string-graph-peq-sharing}.} 1717 1751 \label{fig:string-graph-pta-sharing} 1718 1752 \end{figure} 1719 1753 1720 \VRef[Figure]{fig:string-graph-pta-sharing} gives the outcome. The STL's penalty is $8 \times$ while \CFA's is only $2 \times$, averaged across the cases shown here. 1754 \VRef[Figure]{fig:string-graph-pta-sharing} gives the outcome, where the Y-axis is log scale because of the large differences. 1755 The STL's penalty is $8 \times$ while \CFA's is only $2 \times$, averaged across the cases shown here. 1721 1756 Moreover, the STL's gap increases with string size, while \CFA's converges. 1722 1757 So again, \CFA helps users who just want to treat strings as values, and not think about the resource management under the covers. 1723 1758 1724 While not a design goal, and not graphed out, \CFA in STL-emulation mode heppened to outperform STL in this case. User-managed allocation reuse did not affect either implementation in this case; only ``fresh'' results are shown. 1725 1726 1727 \subsection{Test: Pass argument} 1759 While not a design goal, and not graphed, \CFA in STL-emulation mode outperformed STL in this case. 1760 User-managed allocation reuse did not affect either implementation in this case; only ``fresh'' results are shown. 1761 1762 1763 \subsection{Test: Pass Argument} 1728 1764 1729 1765 STL has a penalty for passing a string by value, which forces users to think about memory management when communicating values with a function. 1730 The key \CFA value-add is that a user can think of a string simply as a value; this test shows that \CC charges a stiff penalty for thin ing this way, while \CFA does not.1766 The key \CFA value-add is that a user can think of a string simply as a value; this test shows that \CC charges a stiff penalty for thinking this way, while \CFA does not. 1731 1767 This test illustrates a main advantage of the \CFA sharing algorithm (in one case). 1732 It shows STL's small-string optimizationproviding a successful mitigation (in the other case).1768 It shows STL's SSO providing a successful mitigation (in the other case). 1733 1769 1734 1770 The basic operation considered is: … … 1749 1785 1750 1786 } 1751 START_TIMER 1752 for ( i; ... ) { 1753 helper( corpus[ f(i) ] ); // imported from char * previously 1754 COUNT_ONE_OP_DONE 1787 for ( ... ) { // loop for duration 1788 helper( corpus[ f( i ) ] ); 1789 count += 1; 1755 1790 } 1756 STOP_TIMER1757 1791 \end{cfa} 1758 1792 & … … 1761 1795 string_res q = { qref, COPY_VALUE }; 1762 1796 } 1763 // rest same, elided 1764 1765 1766 1767 1768 1769 \end{cfa} 1770 \end{tabular} 1771 \end{cquote} 1772 The Goal (HL) version gives the simplest sketch of the test harness. 1773 It uses a single level of looping. 1774 Each iteration uses a corpus item as the argument to a function call. 1797 1798 1799 1800 1801 \end{cfa} 1802 \end{tabular} 1803 \end{cquote} 1804 The goal (HL) version gives the modified test harness, with a single loop. 1805 Each iteration uses a corpus item as the argument to the function call. 1775 1806 These corpus items were imported to the string heap before beginning the timed run. 1776 1777 1807 1778 1808 \begin{figure} 1779 1809 \centering 1780 \includegraphics{ string-graph-pbv.pdf}1810 \includegraphics{plot-string-pbv.pdf} 1781 1811 % \includegraphics[width=\textwidth]{string-graph-pbv.png} 1812 \begin{tabularx}{\linewidth}{>{\centering\arraybackslash}X >{\centering\arraybackslash}X} (a) & (b) \end{tabularx} 1782 1813 \caption{Average time per iteration (lower is better) with one call to a function that takes a by-value string argument, comparing \CFA (having implicit sharing activated) with STL. 1783 (a) With \emph{Varying-from-1} corpus construction, in which the STL-only benefit of small-string optimization occurs, in varying degrees, at all string sizes. 1784 (b) With \emph{Fixed-size} corpus construction, in which this benefit applies exactly to strings with length below 16. 1785 [TODO: show version (b)]} 1814 (a) With \emph{Varying-from-1} corpus construction, in which the STL-only benefit of SSO optimization occurs, in varying degrees, at all string sizes. 1815 (b) With \emph{Fixed-size} corpus construction, in which this benefit applies exactly to strings with length below 16.} 1786 1816 \label{fig:string-graph-pbv} 1787 1817 \end{figure} 1788 1818 1789 1790 1819 \VRef[Figure]{fig:string-graph-pbv} shows the costs for calling a function that receives a string argument by value. 1791 STL's performance worsens as string length increases, while \CFA has the same performance at all sizes.1792 1820 STL's performance worsens uniformly as string length increases, while \CFA has the same performance at all sizes. 1821 Although the STL is better than \CFA until string length 10 because of the SSO. 1793 1822 While improved, the \CFA cost to pass a string is still nontrivial. 1794 1823 The contributor is adding and removing the callee's string handle from the global list. 1795 This cost is $1.5 \times$ to $2 \times$ over STL's when small-string optimization applies, though this cost should be avoidable in the same case, upon a \CFA realization ofthis optimization.1796 At the larger sizes, when STL has to manage storage for the string, STL runs more than $3 \times$ slower, mainly due to time spent in the general-purpose memory allocator.1797 \PAB{Need to check that. Expecting copying to dominate.} 1824 This cost is $1.5 \times$ to $2 \times$ over STL's when SSO applies, but is avoidable once \CFA realizes this optimization. 1825 At the larger sizes, the STL runs more than $3 \times$ slower, because it has to allocation/deallocate storage for the parameter and copy the argument string to the parameter. 1826 If the \CC string is passed by reference, the results are better and flat across string lengths like \CFA. 1798 1827 1799 1828 … … 1805 1834 1806 1835 A garbage collector, afforded the freedom of managed memory (where it knows about all the pointers and is allowed to modify them), often runs faster than malloc-free in an amortized analysis, even though it must occasionally stop to collect. 1807 The sppedup happens because GC is able to use its collection time to move objects. 1808 (In the case of the mini-allocator powering the \CFA string library, objects are runs of text.) Moving objects lets fresh allocations consume from a large contiguous store of available memory; the ``bump pointer'' bookkeeping for such a scheme is very light. 1836 The speedup happens because GC is able to use its collection time to move objects. 1837 (In the case of the mini-allocator powering the \CFA string library, objects are runs of text.) 1838 Moving objects lets fresh allocations consume from a large contiguous store of available memory; the ``bump pointer'' bookkeeping for such a scheme is very light. 1809 1839 A malloc-free implementation without the freedom to move objects must, in the general case, allocate in the spaces between existing objects; doing so entails the heavier bookkeeping of maintaining a linked structure of freed allocations and/or coalescing freed allocations. 1810 1840 … … 1862 1892 \begin{figure} 1863 1893 \centering 1864 \includegraphics{ string-graph-allocn.pdf}1894 \includegraphics{plot-string-allocn.pdf} 1865 1895 % \includegraphics[width=\textwidth]{string-graph-allocn.png} 1866 1896 \caption{Space and time performance, under varying fraction-live targets, for the five string lengths shown, at \emph{Fixed-size} corpus construction.
Note:
See TracChangeset
for help on using the changeset viewer.