Changeset eb0d9b7
- Timestamp:
- Dec 20, 2025, 4:52:54 AM (6 hours ago)
- Branches:
- master
- Parents:
- 0210a543
- Files:
-
- 9 added
- 3 deleted
- 4 edited
-
doc/theses/mike_brooks_MMath/Makefile (modified) (3 diffs)
-
doc/theses/mike_brooks_MMath/array.tex (modified) (17 diffs)
-
doc/theses/mike_brooks_MMath/pictures/ar-bchk.pdf (added)
-
doc/theses/mike_brooks_MMath/pictures/ar-bchk.xlsx (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/Makefile (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/control.c (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/control.cc (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/control.cfa (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/treatment.c (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/treatment.cc (added)
-
doc/theses/mike_brooks_MMath/programs/ar-bchk/treatment.cfa (added)
-
doc/theses/mike_brooks_MMath/programs/array-boundcheck-removal-matmul.cfa (deleted)
-
doc/theses/mike_brooks_MMath/programs/array-boundcheck-removal-stdvec.cpp (deleted)
-
doc/theses/mike_brooks_MMath/programs/array-boundcheck-removal.cfa (deleted)
-
libcfa/src/collections/array.hfa (modified) (1 diff)
-
tests/array-collections/array-md-sbscr-cases.cfa (modified) (2 diffs)
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/mike_brooks_MMath/Makefile
r0210a543 reb0d9b7 11 11 BibRep = ../../bibliography 12 12 13 strip-top-dir = $(foreach p,$1,$(patsubst %/,%,$(subst $(firstword $(subst /, ,$(p)))/,, $(p)))) 14 15 .SUFFIXES: # disable make built-in rules 16 13 17 TeXSRC = ${wildcard *.tex} 14 18 PicSRC = ${notdir ${wildcard ${Pictures}/*.png}} ${notdir ${wildcard ${Pictures}/*.fig}} … … 19 23 PgmSRC = ${notdir ${wildcard ${Programs}/*}} 20 24 RunPgmSRC = ${notdir ${wildcard ${Programs}/*.run.*}} 25 NondemoPgmSRC = ${call strip-top-dir,${wildcard ${Programs}/*/*.c*}} 21 26 BibSRC = ${wildcard *.bib} 22 27 … … 57 62 # File Dependencies 58 63 59 ${DOCUMENT}: ${TeXSRC} $(RunPgmOut) ${DemoPgmOut} ${ PlotSRC} ${PicSRC} ${BibSRC} ${BibRep}/pl.bib ${LaTMac}/common.tex Makefile | ${Build}64 ${DOCUMENT}: ${TeXSRC} $(RunPgmOut) ${DemoPgmOut} ${NondemoPgmSRC} ${PlotSRC} ${PicSRC} ${BibSRC} ${BibRep}/pl.bib ${LaTMac}/common.tex Makefile | ${Build} 60 65 echo ${PicSRC} 61 66 echo ${GraphSRC_OLD} -
doc/theses/mike_brooks_MMath/array.tex
r0210a543 reb0d9b7 92 92 is a sufficient postcondition. 93 93 In an imperative language like C and \CFA, it is also necessary to discuss side effects, for which an even heavier formalism, like separation logic, is required. 94 Secondly , TODO:bash Rust.95 % TODO:cite the crap out of these claims.94 Secondly: ... bash Rust. 95 ... cite the crap out of these claims. 96 96 \end{comment} 97 97 … … 698 698 An expression-compatibility question is a new addition to the \CFA compiler, and occurs in the context of dimension expressions, and possibly enumerations assigns, which must be unique. 699 699 700 % TODO:ensure these compiler implementation matters are treated under \CFA compiler background: type unification, cost calculation, GenPoly.700 % ... ensure these compiler implementation matters are treated under \CFA compiler background: type unification, cost calculation, GenPoly. 701 701 702 702 The relevant technical component of the \CFA compiler is the standard type-unification within the type resolver. … … 732 732 \end{cfa} 733 733 when resolving a function call to @g@, the first unification stage compares the type @T@ of the parameter with @array( float, n + 1 )@, of the argument. 734 \PAB{TODO: finish.}735 734 736 735 The actual rules for comparing two dimension expressions are conservative. … … 738 737 is to imply, ``all else being equal, allow an array with length calculated by $e_1$ 739 738 to be passed to a function expecting a length-$e_2$ array.''\footnote{ 740 TODO:Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping.739 ... Deal with directionality, that I'm doing exact-match, no ``at least as long as,'' no subtyping. 741 740 Should it be an earlier scoping principle? Feels like it should matter in more places than here.} 742 741 So, a ``yes'' answer must represent a guarantee that both expressions evaluate the … … 752 751 So, a variable and a literal can never match. 753 752 754 TODO:Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation753 ... Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation 755 754 \end{comment} 756 755 … … 1025 1024 1026 1025 1027 \section{Bound Checks, Added and Removed} 1028 1026 \section{Zero Overhead} 1027 1028 At runtime, the \CFA array is exactly a C array. 1029 1029 \CFA array subscripting is protected with runtime bound checks. 1030 The array dependent-typing provides information to the C optimizer allowing it remove many of the bound checks. 1031 This section provides a demonstration of the effect. 1032 1033 The experiment compares the \CFA array system with the padded-room system [TODO:xref] most typically exemplified by Java arrays, but also reflected in the \CC pattern where restricted vector usage models a checked array. 1034 The essential feature of this padded-room system is the one-to-one correspondence between array instances and the symbolic bounds on which dynamic checks are based. 1035 The experiment compares with the \CC version to keep access to generated assembly code simple. 1030 The array dependent-typing provides information to the C optimizer, allowing it remove many of the bound checks. 1031 This section provides a demonstration of these effects. 1032 1033 The experiment compares the \CFA array system with the simple-safety system most typically exemplified by Java arrays (\VRef[Section]{JavaCompare}), but also reflected in the \CC pattern where restricted vector usage models a checked array (\VRef[Section]{CppCompare}). 1034 The essential feature of this simple system is the one-to-one correspondence between array instances and the symbolic bounds on which dynamic checks are based. 1035 The experiment uses the \CC version to simplify access to generated assembly code. 1036 While ``\CC'' labels a participant, it is really the simple-safety system (of @vector@ with @.at@) whose limitaitons are being explained, and not limitations of \CC optimization. 1036 1037 1037 1038 As a control case, a simple loop (with no reused dimension sizes) is seen to get the same optimization treatment in both the \CFA and \CC versions. 1038 When the programmer treats the array's bound correctly (making the subscript ``obviously fine''), no dynamic bound check is observed in the program's optimized assembly code.1039 Firstly, when the programmer treats the array's bound correctly (making the subscript ``obviously fine''), no dynamic bound check is observed in the program's optimized assembly code. 1039 1040 But when the bounds are adjusted, such that the subscript is possibly invalid, the bound check appears in the optimized assembly, ready to catch an occurrence the mistake. 1040 1041 1041 TODO: paste source and assembly codes 1042 1043 Incorporating reuse among dimension sizes is seen to give \CFA an advantage at being optimized. 1044 The case is naive matrix multiplication over a row-major encoding. 1045 1046 TODO: paste source codes 1047 1042 \begin{figure} 1043 \setlength{\tabcolsep}{10pt} 1044 \begin{tabular}{llll@{}} 1045 \multicolumn{1}{c}{C} & 1046 \multicolumn{1}{c}{\CFA} & 1047 \multicolumn{1}{c}{\CC (nested vector)} 1048 \\[1em] 1049 \lstinput{20-28}{ar-bchk/control.c} & 1050 \lstinput{20-28}{ar-bchk/control.cfa} & 1051 \lstinput{20-28}{ar-bchk/control.cc} 1052 \\ 1053 \multicolumn{1}{c}{(a)} & 1054 \multicolumn{1}{c}{(b)} & 1055 \multicolumn{1}{c}{(c)} 1056 \\[1em] 1057 \multicolumn{4}{l}{ 1058 \includegraphics[page=1,trim=0 9.125in 2in 0,clip]{ar-bchk} 1059 } 1060 \\ 1061 \multicolumn{1}{c}{(d)} & 1062 \multicolumn{1}{c}{(e)} & 1063 \multicolumn{1}{c}{(f)} 1064 \\[1em] 1065 \multicolumn{4}{l}{ 1066 \includegraphics[page=2,trim=0 8.75in 2in 0,clip]{ar-bchk} 1067 } 1068 \\ 1069 \multicolumn{1}{c}{(g)} & 1070 \multicolumn{1}{c}{(h)} & 1071 \multicolumn{1}{c}{(i)} 1072 \end{tabular} 1073 \caption{Overhead comparison, control case. 1074 All three programs/compilers generate equally performant code when the programmer ``self-checks'' bounds via a loop's condition. 1075 Yet both \CFA and \CC versions generate code that is slower and safer than C's when the programmer might overrun the bounds.} 1076 \label{f:ovhd-ctl} 1077 \end{figure} 1078 1079 \VRef[Figure]{f:ovhd-ctl} gives a control-case example, of summing values in an array. 1080 Each column shows a this program in a different language: C (with no bound checking ever, a,~d,~g), \CFA (b,~e,~h), and \CC (c,~f,~i). 1081 The source-code functions in (a, b, c) can be compiled to have either correct or incorrect uses of bounds. 1082 When compiling for correct bound use, the @BND@ macro passes its argument through, so the loops cover exactly the passed array sizes. 1083 When compiling for incorrect bound use (a programming error), the @BND@ macro hardcodes the loops for 100 iterations, which implies out-of-bound access attempts when the passed array has fewer than 100 elements. 1084 The assembly code excerpts show optimized translations, for correct-bound mode in (d, e, f) and incorrect-bound mode in (g, h, i). 1085 Because incorrect-bound mode hardcodes 100 iterations, the loop always executes a first time, so this mode does not have the @n == 0@ branch seen in correct-bound mode. 1086 For C, this difference is the only (d--g) difference. 1087 For correct bounds, the \CFA translation equals the C translation, \ie~there is no (d--e) difference. 1088 It is practically so for \CC too, modulo the additional indirection of dereferencing into the vector's auxiliary allocation. 1089 1090 Differences arise when the bound-incorrect programs are passed an array shorter than 100. 1091 The C version, (g), proceeds with undefined behaviour, reading past the end of the passed array. 1092 The \CFA and \CC versions, (h, i), flag the mistake with the runtime check, the @i >= n@ branch. 1093 This check is provided implicitly by their library types, @array@ and @vector@ respectively. 1094 The significant result here is these runtime checks being \emph{absent} from the bound-correct translations, (e, f). 1095 The code optimizer has removed them because, at the point where they would occur (immediately past @.L28@/@.L3@), it knows from the surrounding control flow either @i == 0 && 0 < n@ or @i < n@ (directly), \i.e. @i < n@ either way. 1096 So a repeat of this check, the branch and its consequent code (raise error) are all removable. 1097 1098 Progressing from the control case, a deliberately bound-incorrect mode is no longer informative. 1099 Rather, given a (well-typed) program that does good work on good data, the issue is whether it is (determinably) bound-correct on all data. 1100 1101 When dimension sizes get reused, \CFA has an advantage over \CC-vector at getting simply written programs well optimized. 1102 1103 \begin{figure} 1104 \setlength{\tabcolsep}{10pt} 1105 \begin{tabular}{llll@{}} 1106 \multicolumn{1}{c}{C} & 1107 \multicolumn{1}{c}{\CFA} & 1108 \multicolumn{1}{c}{\CC (nested vector)} 1109 \\[1em] 1110 \lstinput{20-37}{ar-bchk/treatment.c} & 1111 \lstinput{20-37}{ar-bchk/treatment.cfa} & 1112 \lstinput{20-37}{ar-bchk/treatment.cc} 1113 \end{tabular} 1114 \caption{Overhead comparison, differentiation case, source. 1115 } 1116 \label{f:ovhd-treat-src} 1117 \end{figure} 1118 1119 \begin{figure} 1120 \newcolumntype{C}[1]{>{\centering\arraybackslash}p{#1}} 1121 \begin{tabular}{C{1.5in}C{1.5in}C{1.5in}p{1in}} 1122 C & 1123 \CFA & 1124 \CC (nested vector) 1125 \\[1em] 1126 \multicolumn{4}{l}{ 1127 \includegraphics[page=3,trim=0 4in 2in 0,clip]{ar-bchk} 1128 } 1129 \end{tabular} 1130 \caption{Overhead comparison, differentiation case, assembly. 1131 } 1132 \label{f:ovhd-treat-asm} 1133 \end{figure} 1134 1135 The example case of \VRef[Figures]{f:ovhd-treat-src} and \VRef{f:ovhd-treat-asm} is simple matrix multiplication over a row-major encoding. 1136 Simple means coding directly to the intuition of the mathematical definition, without trying to optimize for memory layout. 1137 In the assembly code of \VRef[Figures]{f:ovhd-treat-asm}, the looping pattern of \VRef[Figure]{f:ovhd-ctl} (d, e, f), ``Skip aheas on zero; loop back for next,'' occurs with three nesting levels. 1138 Simultaneously, the dynamic bound-check pattern of \VRef[Figure]{f:ovhd-ctl} (h,~i), ``Get out on invalid,'' occurs, targeting @.L7@, @.L9@ and @.L8@. 1139 1140 Here, \VRef[Figures]{f:ovhd-treat-asm} shows the \CFA solution optimizing into practically the C solution, while the \CC solution shows added runtime bound checks. 1141 Like in \VRef[Figure]{f:ovhd-ctl} (e), the significance is the \emph{absence} of library-imposed runtime checks, even though the source code is working through the the \CFA @array@ library. 1142 The optimizer removed the library-imposed checks because the data strructure @array@-of-@array@ is constained by its type to be shaped correctly for the intuitive looping. 1143 In \CC, the same constraint does not apply to @vector@-of-@vector@. 1144 Because every individual @vector@ carries its own size, two types of bound mistakes are possible. 1145 1146 Firstly, the three structures received may not be matrices at all, per the obvious/dense/total interpretation; rather, any one might be ragged-right in its rows. 1147 The \CFA solution guards against this possibility by encoding the minor length (number of columns) in the major element (row) type. 1148 In @res@, for example, each of the @M@ rows is @array(float, N)@, guaranteeing @N@ cells within it. 1149 Or more technically, guaranteeing @N@ as the basis for the imposed bound check \emph{of every row.} 1150 1151 The second type of \CC bound mistake is that its types do not impose the mathematically familiar constraint of $(M \times P) (P \times N) \rightarrow (M \times N)$. 1152 Even assuming away the first issue, \eg that in @lhs@, all minor/cell counts agree, the data format allows the @rhs@ major/row count to disagree. 1153 Or, the possibility that the row count @res.size()@ disagrees with the row count @lhs.size()@ illustrates this bound-mistake type in isolation. 1154 The \CFA solution guards against this possibility by keeping length information separate from the array data, and therefore eligible for sharing. 1155 This capability lets \CFA escape the one-to-one correspondence between array instances and symbolic bounds, where this correspondence leaves a \CC-vector programmer stuck with a matrix representation that repeats itself. 1156 1157 It is important to clarify that the \CFA solution does not become unsafe (like C) in losing its dynamic checks, even though it does become fast (as C) in losing them. 1158 The dynamic chekcs were dismissed as unnecessary \emph{because} the program was safe to begin with. 1159 1160 To regain performance, a \CC programmer is left needing to state appropriate assertions or assumptions, to convince the optimizer to dismiss the runtime checks. 1161 Especially considering that two of them are in the inner-most loop. 1162 The solution is nontrivial. 1163 It requires doing the work of the inner-loop checks as a preflight step. 1164 But this work still takes looping; doing it upfront gives too much separation for the optimizer to see ``has been checked already'' in the deep loop. 1165 So, the programmer must restate the preflight observation within the deep loop, but this time as an unchecked assumption. 1166 Such assumptions are risky because they introduce further undefined behaviour when misused. 1167 Only the programmer's discipline remains to ensure this work is done without error. 1168 1169 The \CFA solution lets a simply stated program have dynamic guards that catch bugs, while letting a simply stated bug-free program run as fast as the unguarded C equivalent. 1170 1171 \begin{comment} 1172 The ragged-right issue brings with it a source-of-truth difficulty: Where, in the \CC version, is one to find the value of $N$? $M$ can come from either @res@'s or @lhs@'s major/row count, and checking these for equality is straightforward. $P$ can come from @rhs@'s major/row count. But $N$ is only available from columns, \ie minor/cell counts, which are ragged. So any choice of initial source of truth, \eg 1173 \end{comment} 1048 1174 1049 1175 \section{Array Lifecycle} … … 1296 1422 \section{Array Comparison} 1297 1423 1298 1299 1424 \subsection{Rust} 1300 1425 … … 1413 1538 1414 1539 \subsection{Java} 1540 \label{JavaCompare} 1541 1415 1542 1416 1543 Java arrays are references, so multi-dimension arrays are arrays-of-arrays \see{\VRef{toc:mdimpl}}. … … 1501 1628 1502 1629 \subsection{\CC} 1630 \label{CppCompare} 1503 1631 1504 1632 Because C arrays are difficult and dangerous, the mantra for \CC programmers is to use @std::vector@ in place of the C array. 1505 While the vector size can grow and shrink dynamically, \vs a fixed-size dynamic size with VLAs, the cost of this extra feature is mitigated by preallocating the maximum size (like the VLA) at the declaration (one dynamic call) to avoid using @push_back@. 1633 While the vector size can grow and shrink dynamically, \vs an unchanging dynamic size with VLAs, the cost of this extra feature is mitigated by preallocating the maximum size (like the VLA) at the declaration. 1634 So, it costs one upfront dynamic allocation and avoids growing the arry through pushing. 1506 1635 \begin{c++} 1507 1636 vector< vector< int > > m( 5, vector<int>(8) ); // initialize size of 5 x 8 with 6 dynamic allocations 1508 1637 \end{c++} 1509 1638 Multidimensional arrays are arrays-of-arrays with associated costs. 1510 Each @vector@ array has an array descriptor contain the dimension, which allows bound checked using @x.at(i)@ in place of @x[i]@. 1511 Used with these restrictions, out-of-bound accesses are caught, and in-bound accesses never exercise the vector's ability to grow, preventing costly reallocate and copy, and never invalidate references to contained values. 1512 This scheme matches Java's safety and expressiveness exactly, but with the inherent costs. 1513 1639 Each @vector@ array has an array descriptor containing the dimension, which allows bound checked using @x.at(i)@ in place of @x[i]@. 1640 Used with these restrictions, out-of-bound accesses are caught, and in-bound accesses never exercise the vector's ability to grow, preventing costly reallocate-and-copy, and never invalidating references to contained values. 1641 1642 This scheme matches a Java array's safety and expressiveness exactly, with the same inherent costs. 1643 Notably, a \CC vector of vectors does not provide contiguous element storage (even when upfront allocation is done carefully) because a vector puts its elements in an auxiliary allocation. 1644 So, in spite of @vector< vector< int > >@ appearing to be ``everything by value,'' it is still a first array whose elements include pointers to further arrays. 1645 1646 \CC has options for working with memory-adjacent data in desirable ways, particularly in recent revisions. 1647 But none makes an allocation with a dynamically given fixed size less awkward than the vector arrangement just described. 1648 1649 \CC~26 adds @std::inplace_vector@, which provides an interesting vector--array hybrid,\footnote{ 1650 Like a vector, it lets a user construct elements in a loop, rather than imposing uniform construction. 1651 Yet it preserves \lstinline{std::array}'s ability to be entirely stack-allocated, by avoiding an auxiliary elements' allocation. 1652 } but does not change the fundamental limit of \lstinline{std::array}, that the length, being a template parameter, must be a static value. 1653 1654 \CC~20's @std::span@ is a view that unifies true arrays, vectors, static sizes and dynamic sizes, under a common API that adds bound checking. 1655 When wrapping a vector, bound checking occurs on regular subscripting; one needn't remember to use @.at@. 1656 When wrapping a locally declared fixed-size array, bound communication is implicit. 1657 But it has a soundness gap by offering construction from pointer and user-given length. 1658 1659 \CC~23's @std::mdspan@ adds multidimensional indexing and reshaping capabilities analogous to those built into \CFA's @array@. 1660 Like @span@, it works over multiple underlying container types. 1661 But neither @span@ nor @mdspan@ augments the available allocation options. 1662 1663 Thus, these options do not offer an allocation with a dynamically given fixed size. 1664 And furthermore, they do not provide any improvement to the C flexible array member pattern, for making a dynamic amount of storage contiguous with its header, as do \CFA's accordions. 1514 1665 1515 1666 \subsection{Levels of Dependently Typed Arrays} 1516 1667 1668 TODO: fix the position; checked c does not do linear types 1517 1669 \CFA's array is the first lightweight application of dependently-typed bound tracking to an extension of C. 1518 1670 Other extensions of C that apply dependently-typed bound tracking are heavyweight, in that the bound tracking is part of a linearly-typed ownership-system, which further helps guarantee statically the validity of every pointer deference. … … 1548 1700 it can also do these other cool checks, but watch how I can mess with its conservativeness and termination 1549 1701 1550 Two c urrent, state-of-the-art array languages, Dex\cite{arr:dex:long} and Futhark\cite{arr:futhark:tytheory}, offer novel contributions concerningsimilar, restricted dependent types for tracking array length.1702 Two contemporary array-centric languages, Dex\cite{arr:dex:long} and Futhark\cite{arr:futhark:tytheory}, contribute similar, restricted dependent types for tracking array length. 1551 1703 Unlike \CFA, both are garbage-collected functional languages. 1552 1704 Because they are garbage-collected, referential integrity is built-in, meaning that the heavyweight analysis, that \CFA aims to avoid, is unnecessary. … … 1559 1711 There is a particular emphasis on an existential type, enabling callee-determined return shapes. 1560 1712 1561 1562 Dex uses a novel conception of size, embedding its quantitative information completely into an ordinary type. 1563 1564 Futhark and full-strength dependently typed languages treat array sizes are ordinary values. 1713 Dex uses an Ada-style conception of size, embedding quantitative information completely into an ordinary type. 1714 1715 Futhark and full-strength dependently typed languages treat array sizes as ordinary values. 1565 1716 Futhark restricts these expressions syntactically to variables and constants, while a full-strength dependent system does not. 1566 1717 1567 \CFA's hybrid presentation, @forall( [N] )@, has @N@ belonging to the type system, yet has no instances.1718 \CFA's hybrid presentation, @forall( [N] )@, has @N@ belonging to the type system, yet no objects of type @[N]@ occur. 1568 1719 Belonging to the type system means it is inferred at a call site and communicated implicitly, like in Dex and unlike in Futhark. 1569 1720 Having no instances means there is no type for a variable @i@ that constrains @i@ to be in the range for @N@, unlike Dex, [TODO: verify], but like Futhark. … … 1574 1725 1575 1726 In Ada and Dex, an array is conceived as a function whose domain must satisfy only certain structural assumptions, while in C, \CC, Java, Futhark and \CFA today, the domain is a prefix of the natural numbers. 1576 The generality has obvious aesthetic benefits for programmers working on scheduling resources to weekdays, and for programmers who prefer to count from an initial number of their own choosing. 1727 The Ada--Dex generality has aesthetic benefits for certain programmers. For those working on scheduling resources to weekdays: 1728 For those who prefer to count from an initial number of their own choosing: 1577 1729 1578 1730 This change of perspective also lets us remove ubiquitous dynamic bound checks. … … 1610 1762 (unsafe_from_ordinal a (idiv o bs), unsafe_from_ordinal b (rem o bs)) 1611 1763 \end{cfa} 1764 % fix mike's syntax highlighter 1612 1765 and by a user-provided adapter expression at the call site that shows how to indexing with a tuple is backed by indexing each dimension at a time 1613 1766 \begin{cfa} … … 1702 1855 Using a compiler-produced value eliminates an opportunity for user error. 1703 1856 1704 TODO:fix in following: even the alloc call gives bad code gen: verify it was always this way; walk back the wording about things just working here; assignment (rebind) seems to offer workaround, as in bkgd-cfa-arrayinteract.cfa1857 ...someday... fix in following: even the alloc call gives bad code gen: verify it was always this way; walk back the wording about things just working here; assignment (rebind) seems to offer workaround, as in bkgd-cfa-arrayinteract.cfa 1705 1858 1706 1859 Bringing in another \CFA feature, reference types, both resolves a sore spot of the last example, and gives a first example of an array-interaction bug. … … 1711 1864 (*ar2)[5]; 1712 1865 \end{cfa} 1713 Using ``reference to array'' works at resolving this issue. TODO:discuss connection with Doug-Lea \CC proposal.1866 Using ``reference to array'' works at resolving this issue. ...someday... discuss connection with Doug-Lea \CC proposal. 1714 1867 \begin{cfa} 1715 1868 tm (&ar3)[10] = *alloc(); … … 1719 1872 1720 1873 Using proper array types (@ar2@ and @ar3@) addresses a concern about using raw element pointers (@ar1@), albeit a theoretical one. 1721 TODOxref C standard does not claim that @ar1@ may be subscripted,1874 ...someday... xref C standard does not claim that @ar1@ may be subscripted, 1722 1875 because no stage of interpreting the construction of @ar1@ has it be that ``there is an \emph{array object} here.'' 1723 1876 But both @*ar2@ and the referent of @ar3@ are the results of \emph{typed} @alloc@ calls, … … 1725 1878 1726 1879 The ``reference to array'' type has its sore spots too. 1727 TODOsee also @dimexpr-match-c/REFPARAM_CALL@ (under @TRY_BUG_1@)1728 1729 TODO:I fixed a bug associated with using an array as a T. I think. Did I really? What was the bug?1880 ...someday... see also @dimexpr-match-c/REFPARAM_CALL@ (under @TRY_BUG_1@) 1881 1882 ...someday... I fixed a bug associated with using an array as a T. I think. Did I really? What was the bug? 1730 1883 \end{comment} -
libcfa/src/collections/array.hfa
r0210a543 reb0d9b7 242 242 } 243 243 244 // Further form of -[-,-,-] that avoids using the trait system. 245 // Above overloads work for any type with (recursively valid) subscript operator, 246 // provided said subscript is passed as an assertion. 247 // Below works only on arpk variations but never passes its subscript though an assertion. 248 // 249 // When arpk implements the trait used above, 250 // the critical assertion is backed by a nontrivial thunk. 251 // There is no "thunk problem" (lifetime) issue, when used as shown in the test suite. 252 // But the optimizer has shown difficulty removing these thunks in cases where "it should," 253 // i.e. when all user code is in one compilation unit. 254 // Not that every attempt at removing such a thunk fails; cases have been found going both ways. 255 // Cases have been found with unnecessary bound-checks removed successfully, 256 // on user code written against the overloads below, 257 // but where these bound checks (which occur within `call`ed thunks) are not removed, 258 // on user code written against the overloads above. 259 // 260 // The overloads below provide specializations of the above 261 // that are a little harder to use than the ones above, 262 // but where array API erasure has been seen to be more effective. 263 // Note that the style below does not appeal to a case where thunk inlining is more effective; 264 // rather, it simply does not rely on thunks in the first place. 265 // 266 // Both usage styles are shown in test array-md-sbscr-cases#numSubscrTypeCompatibility, 267 // with the more general one above being "high abstraction," 268 // and the more performant one below being "mid abstraction" and "low abstraction." 269 // 270 // A breadth of index types is not given here (providing -[size_t,size_t,...] only) 271 // because these declarations are not feeding a trait, so safe implicit arithmetic conversion kiks in. 272 // Even so, there may still be an un-met need for accepting 273 // either ptrdiff_t or size_t (signed or unsigned) 274 // because Mike has seen the optimizer resist removing bound checks when sign-conversion is in play. 275 // "Only size_t" is meeting today's need 276 // and no solution is known that avoids 2^D overloads for D dimensions 277 // while offering multiple subscript types and staying assertion-free. 278 // 279 // This approach, of avoiding traits entirely, is likely incompatible with the original desire 280 // to have one recursive multidimensional subscript operator (TRY_BROKEN_DESIRED_MD_SUBSCRIPT). 281 // To make a single declaration work, 282 // we would probably have to get better at coaxing the optimizer into inlining thunks. 283 284 forall( [N2], S2*, [N1], S1*, Timmed1, Tbase ) 285 static inline Timmed1 & ?[?]( arpk( N2, S2, arpk( N1, S1, Timmed1, Tbase ), Tbase ) & this, size_t ix2, size_t ix1 ) { 286 return this[ix2][ix1]; 287 } 288 289 forall( [N3], S3*, [N2], S2*, [N1], S1*, Timmed1, Tbase ) 290 static inline Timmed1 & ?[?]( arpk( N3, S3, arpk( N2, S2, arpk( N1, S1, Timmed1, Tbase ), Tbase ), Tbase ) & this, size_t ix3, size_t ix2, size_t ix1 ) { 291 return this[ix3][ix2][ix1]; 292 } 293 294 forall( [N4], S4*, [N3], S3*, [N2], S2*, [N1], S1*, Timmed1, Tbase ) 295 static inline Timmed1 & ?[?]( arpk( N4, S4, arpk( N3, S3, arpk( N2, S2, arpk( N1, S1, Timmed1, Tbase ), Tbase ), Tbase ), Tbase ) & this, size_t ix4, size_t ix3, size_t ix2, size_t ix1 ) { 296 return this[ix4][ix3][ix2][ix1]; 297 } 298 299 300 244 301 #endif 245 302 -
tests/array-collections/array-md-sbscr-cases.cfa
r0210a543 reb0d9b7 231 231 } 232 232 233 // common function body, working on parameter wxyz, but for wxyz being different types 234 #define CHECK_NUM_SUBSCR_TYPE_COMPAT \ 235 valExpected = getMagicNumber(2, 3, 4, 5); \ 236 assert(( wxyz[2] [3] [4] [5] == valExpected )); \ 237 assert(( wxyz[2, 3] [4] [5] == valExpected )); \ 238 assert(( wxyz[2] [3, 4] [5] == valExpected )); \ 239 assert(( wxyz[2] [3] [4, 5] == valExpected )); \ 240 assert(( wxyz[2, 3, 4] [5] == valExpected )); \ 241 assert(( wxyz[2] [3, 4, 5] == valExpected )); \ 242 assert(( wxyz[2, 3, 4, 5] == valExpected )); \ 243 for ( i; Nw ) { \ 244 assert(( wxyz[ i, 3, 4, 5 ] == getMagicNumber(i, 3, 4, 5) )); \ 245 } \ 246 for ( i; Nx ) { \ 247 assert(( wxyz[ 2, i, 4, 5 ] == getMagicNumber(2, i, 4, 5) )); \ 248 } \ 249 for ( i; Ny ) { \ 250 assert(( wxyz[ 2, 3, i, 5 ] == getMagicNumber(2, 3, i, 5) )); \ 251 } \ 252 for ( i; Nz ) { \ 253 assert(( wxyz[ 2, 3, 4, i ] == getMagicNumber(2, 3, 4, i) )); \ 254 } 255 #define CHECK_NUM_SUBSCR_TYPE_COMPAT_ADDENDUM_RESHAPE \ 256 for ( i; Nw ) { \ 257 assert(( wxyz[ i, all, 4, 5 ][3] == getMagicNumber(i, 3, 4, 5) )); \ 258 } \ 259 for ( i; Nw ) { \ 260 assert(( wxyz[ all, 3, 4, 5 ][i] == getMagicNumber(i, 3, 4, 5) )); \ 261 } 262 263 // Low abstraction: simple declaration, cannot send a slice, can make a slice, runs fast 264 forall( [Nw], [Nx], [Ny], [Nz] ) 265 void test_numSubscrTypeCompatibility_lo( array( float, Nw, Nx, Ny, Nz ) & wxyz ) { 266 CHECK_NUM_SUBSCR_TYPE_COMPAT 267 CHECK_NUM_SUBSCR_TYPE_COMPAT_ADDENDUM_RESHAPE 268 } 269 270 // Medium abstraction: complex declaration, can send or make a slice, runs fast 271 forall( [Nw], Sw* 272 , [Nx], Sx* 273 , [Ny], Sy* 274 , [Nz], Sz* ) 275 void test_numSubscrTypeCompatibility_mid( 276 arpk( Nw, Sw, 277 arpk( Nx, Sx, 278 arpk( Ny, Sy, 279 arpk( Nz, Sz, float, float) 280 , float) 281 , float) 282 , float) & wxyz 283 ) { 284 CHECK_NUM_SUBSCR_TYPE_COMPAT 285 CHECK_NUM_SUBSCR_TYPE_COMPAT_ADDENDUM_RESHAPE 286 } 287 288 // High abstraction: mid-complexity declaration, can send a slice or a non-arpk, cannot make a slice, may not run fast 289 forall( [Nw], Awxyz & 290 , [Nx], Axyz & 291 , [Ny], Ayz & 292 , [Nz], Az & 293 | ar( Awxyz, Axyz, Nw ) 294 | ar( Axyz, Ayz, Nx ) 295 | ar( Ayz, Az, Ny ) 296 | ar( Az, float, Nz ) ) 297 void test_numSubscrTypeCompatibility_hi( Awxyz & wxyz ) { 298 CHECK_NUM_SUBSCR_TYPE_COMPAT 299 } 300 233 301 forall( [Nw], [Nx], [Ny], [Nz] ) 234 302 void test_numSubscrTypeCompatibility( tag(Nw), tag(Nx), tag(Ny), tag(Nz) ) { … … 237 305 fillHelloData(wxyz); 238 306 239 valExpected = getMagicNumber(2, 3, 4, 5); 240 assert(( wxyz [2] [3] [4] [5] == valExpected )); 241 assert(( wxyz[2, 3][4] [5] == valExpected )); 242 assert(( wxyz [2][3, 4][5] == valExpected )); 243 assert(( wxyz [2] [3][4, 5] == valExpected )); 244 assert(( wxyz[2, 3, 4][5] == valExpected )); 245 assert(( wxyz [2][3, 4, 5] == valExpected )); 246 assert(( wxyz[2, 3, 4, 5] == valExpected )); 247 248 for ( i; Nw ) { 249 assert(( wxyz[ i, 3, 4, 5 ] == getMagicNumber(i, 3, 4, 5) )); 250 } 251 252 for ( i; Nx ) { 253 assert(( wxyz[ 2, i, 4, 5 ] == getMagicNumber(2, i, 4, 5) )); 254 } 255 256 for ( i; Ny ) { 257 assert(( wxyz[ 2, 3, i, 5 ] == getMagicNumber(2, 3, i, 5) )); 258 } 259 260 for ( i; Nz ) { 261 assert(( wxyz[ 2, 3, 4, i ] == getMagicNumber(2, 3, 4, i) )); 262 } 263 264 for ( i; Nw ) { 265 assert(( wxyz[ i, all, 4, 5 ][3] == getMagicNumber(i, 3, 4, 5) )); 266 } 267 268 for ( i; Nw ) { 269 assert(( wxyz[ all, 3, 4, 5 ][i] == getMagicNumber(i, 3, 4, 5) )); 270 } 307 test_numSubscrTypeCompatibility_lo ( wxyz ); 308 test_numSubscrTypeCompatibility_mid( wxyz ); 309 test_numSubscrTypeCompatibility_hi ( wxyz ); 271 310 } 272 311
Note:
See TracChangeset
for help on using the changeset viewer.