Changeset 5faa3a5
- Timestamp:
- Apr 28, 2026, 8:56:08 AM (13 hours ago)
- Branches:
- master
- Parents:
- bf8112b
- Location:
- doc/theses/mike_brooks_MMath
- Files:
-
- 4 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/theses/mike_brooks_MMath/array.tex
rbf8112b r5faa3a5 1766 1766 it can also do these other cool checks, but watch how I can mess with its conservativeness and termination 1767 1767 1768 Two contemporary array-centric languages, Dex \cite{arr:dex:long} and Futhark\cite{arr:futhark:tytheory}, contribute similar, restricted dependent types for tracking array length.1768 Two contemporary array-centric languages, Dex~\cite{arr:dex:long} and Futhark~\cite{arr:futhark:tytheory}, contribute similar, restricted dependent types for tracking array length. 1769 1769 Unlike \CFA, both are garbage-collected functional languages. 1770 1770 Because they are garbage-collected, referential integrity is built-in, meaning that the heavyweight analysis, that \CFA aims to avoid, is unnecessary. … … 1936 1936 \begin{itemize} 1937 1937 \item 1938 Alice holds an darray and knows its size.1938 Alice holds an array and knows its size. 1939 1939 \item 1940 1940 Bob offers a service that can apply to an array of any size. … … 1955 1955 \end{itemize} 1956 1956 This interaction is the dual notion of existential polymorphism, working upon size information. 1957 Work \cite{arr:futhark:tytheory} accompanying the array-centric language Futhark has detailed its theory.1957 Work~\cite{arr:futhark:tytheory} accompanying the array-centric language Futhark has detailed its theory. 1958 1958 1959 1959 An existential size helps cope with unnecessary repetition that occurs as program size increases and more intermediaries separate an array producer and consumer. … … 1975 1975 The intermediaries no longer need size information in scope to pass one along. 1976 1976 @SchoolRef@ stops the polymorphism of @School@ from flowing farther outward. 1977 The final consumer need only open the box, discover what 's inside, and proceed.1977 The final consumer need only open the box, discover what is inside, and proceed. 1978 1978 \begin{cfa} 1979 1979 forall( [C], [S] ) static void process( School(C, S) * ); 1980 1980 void receive( SchoolRef r ) { 1981 1981 with( r ) { 1982 classes; students; // are in scope1983 School( classes, students ) * s2 = school; // we can state the type1984 process( s2 ); // we can communicate the type1982 classes; students; $\C{// are in scope}$ 1983 School( classes, students ) * s2 = school; $\C{// state the type}$ 1984 process( s2 ); $\C{// communicate the type}$ 1985 1985 } 1986 1986 } … … 1994 1994 The programmer writes code in which it appears that a bound check ought to be unnecessary, but feedback about success or failure there is indirect. 1995 1995 1996 Dex \cite{arr:dex:long} is a contemporary array-centric language that follows in the Ada array tradition\cite{arr:ada:learn} of designating a type to be used for subscripting an array.1996 Dex~\cite{arr:dex:long} is a contemporary array-centric language that follows in the Ada array tradition~\cite{arr:ada:learn} of designating a type to be used for subscripting an array. 1997 1997 While \CFA has its new @[N]@ living in the type system, this @N@ is used to represent an array's size. 1998 1998 An indexing type would be the type of @i@, in @ar[i]@. 1999 1999 \begin{cfa} 2000 2000 const size_t n = ask_the_user(); 2001 typedef range(0, n) till_n_t; // does not exist in CFA yet2002 till_n_t x = 42; // static reject2003 till_n_t y = check(42); // static accept, dynamic check2001 typedef @range(0, n)@ till_n_t; $\C{// does not exist in CFA yet}$ 2002 till_n_t x = 42; $\C{// static reject}$ 2003 till_n_t y = check(42); $\C{// static accept, dynamic check}$ 2004 2004 void f( array(int, n) & ar, till_n_t i ) { 2005 sout | ar[ i ]; // bound check unnecessary2005 sout | ar[ i ]; $\C{// bound check unnecessary}$ 2006 2006 } 2007 2007 \end{cfa} … … 2009 2009 \begin{cfa} 2010 2010 void f2( array(int, n) & ar, size_t i ) { 2011 sout | ar[ i ]; // bound check necessary2011 sout | ar[ i ]; $\C{// bound check necessary}$ 2012 2012 } 2013 2013 \end{cfa} … … 2019 2019 If the user succeeds at needing no explicit bound checks, then the user is assured that there are no bound checks. 2020 2020 2021 Prior \CFA work \cite{Liang24} has touched upon this topic.2021 Prior \CFA work~\cite{Liang24} has touched upon this topic. 2022 2022 Therein, a user's enumeration type is adapted to serve as the domain of an array. 2023 2023 However, this work progressed as far as inferring an array size from the enumeration type, \eg that @int happy_hours[ weekday_t ]@ should be a declaration with length seven. -
doc/theses/mike_brooks_MMath/background.tex
rbf8112b r5faa3a5 463 463 however, it requires all dimensions except the first to be specified at compile time, \eg @m[][6]@, allowing all subscripting stride calculations to be generated with constants. 464 464 Hence, every matrix passed to @fp1@ must have exactly 6 columns but the row size can vary. 465 The variable-dimension approach (right) ignores (violates) the type system, \ie the parameter type has no suggestion of mu tidimensionality and some acrobatics are required for a w\footnote{465 The variable-dimension approach (right) ignores (violates) the type system, \ie the parameter type has no suggestion of multidimensionality and some acrobatics are required for a w\footnote{ 466 466 One may be tempted to phrase a call as \lstinline{fp2( 4, 4, vm1 )}, but this call is ill-typed. Argument \lstinline{vm1} could match parameter declarations \lstinline{int m[][4]} or \lstinline{int (*m)[4]}. But only the argument \lstinline{&vm1[0][0]}, or its equivalent, but confusing, \lstinline{vm1[0]}, relate \lstinline{vm1} to parameter type \lstinline{int*}. 467 467 }, and subscripting is performed manually using pointer arithmetic in the macro @sub@. … … 534 534 Nevertheless, the C array-of-array form is still important for special circumstances. 535 535 536 \VRef[Figure]{f:ContiguousNon-contiguous} shows a powerful extension made in C99, for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays. Though GNU-flavoured C++ eventually got VLAs, it never got this enhancement for managing a multidimension l VLA parameter. }536 \VRef[Figure]{f:ContiguousNon-contiguous} shows a powerful extension made in C99, for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays. Though GNU-flavoured C++ eventually got VLAs, it never got this enhancement for managing a multidimensional VLA parameter. } 537 537 For contiguous-array arguments (including VLA), C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@. 538 538 Hence, if the declaration of @fc@ is changed to: … … 1184 1184 \end{c++} 1185 1185 It turns out that @TAILQ_REMOVE@ uses its ``which element to remove'' parameter at several places, importantly, one occurring after the removal's changes are in progress. 1186 When the second use encounters the macro substitution @TAILQ_LAST(reqs, reql)@, it obtains a different node than the first use got, with the removal's changes having alre dy started.1186 When the second use encounters the macro substitution @TAILQ_LAST(reqs, reql)@, it obtains a different node than the first use got, with the removal's changes having already started. 1187 1187 This macro-induced phenomenon led to an invalid pointer dereference (safety violation), at a run-time well after the removal at issue (costly to resolve). 1188 1188 -
doc/theses/mike_brooks_MMath/intro.tex
rbf8112b r5faa3a5 71 71 Among the linked structures, the list's defining characteristic is maintaining a user-explicit total order (chain shape). 72 72 Element search can be by position (as for array) or by value; it is sometimes presented as a subscript. 73 In a more complex structure, the linked shape may be system managed (user imp icit) by being a function of a designated ``key'' datum, \eg often for a hash table.73 In a more complex structure, the linked shape may be system managed (user implicit) by being a function of a designated ``key'' datum, \eg often for a hash table. 74 74 Though such schemes often give better-than-$O(n)$ lookup, the linked list's user-explicit shape limits what the system can provide, to fast step/iteration for user-``nearby'' data, and slow exhaustive search/seek otherwise. 75 75 Linked types are dynamically sized by adding and removing nodes using link fields internal or external to the list elements (nodes). -
doc/theses/mike_brooks_MMath/list.tex
rbf8112b r5faa3a5 1010 1010 Therefore, the duration's response to size is not a steady worsening as size increases. 1011 1011 Often, each size-independent configuration responds to size increases in steps of slowdown. 1012 Occasionally a slowdown step is followed by some perfor amnce increase, where an incurred penalty begins to amortize away.1012 Occasionally a slowdown step is followed by some performance increase, where an incurred penalty begins to amortize away. 1013 1013 Hence, performance results can have interesting jitter as size increases. 1014 1014 The analysis treats these behaviours as incidental. … … 1016 1016 Rather, size zones are picked, specific effects inside of a zone are averaged away, and the story at one zone is compared to that at another. 1017 1017 1018 % It is true, but perhaps not obvious, that buildin dand destroying long lists is slower than building and destroying short lists.1018 % It is true, but perhaps not obvious, that building and destroying long lists is slower than building and destroying short lists. 1019 1019 % Obviously, indeed, it takes longer to fuse and divide a hundred neighbours than five. 1020 1020 % But the key metric in this work, AII, is about a single link--unlink. … … 1028 1028 % So, duration's response to size is not a steady worsening as size increases. 1029 1029 % Rather, each size-independent configuration often responds to size increases with leaps of worsening. 1030 % Occasionally a leap is even followed size-run of retrograde response, where a suddenly incurred penalty has a chance to am mortize away.1030 % Occasionally a leap is even followed size-run of retrograde response, where a suddenly incurred penalty has a chance to amortize away. 1031 1031 % The frameworks tend to leapfrog over each other, at different points, as size increases. 1032 1032 % … … 1046 1046 } 1047 1047 \end{tabular} 1048 \caption[Variety of IR duration responses to list length, at small--medium lengths]{Variety of IR duration responses to list length, at small--medium lengths. Two example use cases are shown: I, stack movement with head-only access (plot a); IX, queue movement with element-oriented removal access (plot b); both use cases have insert-first polarity. One example is run on each machine: UC-I on AMD (plo at a); UC-IX on Intel (plot b). Lower is better.}1048 \caption[Variety of IR duration responses to list length, at small--medium lengths]{Variety of IR duration responses to list length, at small--medium lengths. Two example use cases are shown: I, stack movement with head-only access (plot a); IX, queue movement with element-oriented removal access (plot b); both use cases have insert-first polarity. One example is run on each machine: UC-I on AMD (plot a); UC-IX on Intel (plot b). Lower is better.} 1049 1049 \label{fig:plot-list-zoomin-abs} 1050 1050 \end{figure} … … 1122 1122 \subsubsection{Recap and Master Legend} 1123 1123 1124 For experiments performed in later sections, there are 12 use cases, which are all combinations of 2 move nts, 2 polarities and 3 accessors.1125 There are 4 p ysical contexts, which are all combinations of 2 machines and 2 size (length) zones.1124 For experiments performed in later sections, there are 12 use cases, which are all combinations of 2 movements, 2 polarities and 3 accessors. 1125 There are 4 physical contexts, which are all combinations of 2 machines and 2 size (length) zones. 1126 1126 There are 3.25 frameworks. 1127 1127 This accounting considers how LQ-list supports only the movement--polarity combination ``stack, insert first.'' … … 1154 1154 } 1155 1155 \end{tabular} 1156 \caption[IR duration, transformed for general ana ysis]{1157 IR duration, transformed for general ana ysis.1156 \caption[IR duration, transformed for general analysis]{ 1157 IR duration, transformed for general analysis. 1158 1158 The analysis follows the single example setup of \VRef[Figure]{f:zoomin-abs-i-swift}, \ie Use Case I on AMD; there, IR is given as absolute duration. 1159 1159 Plot (a) transforms the source dataset by conditioning on specific size. … … 1180 1180 1181 1181 The effect of conditioning on specific size erases the fact that \VRef[Figure]{fig:plot-list-zoomin-abs} shows, aside from the coarse hops, all frameworks getting smoothly slower as size increases. 1182 This effect is parti ularly relevant \emph{within} a size zone, most noticeable as the data lines all going up across the Small box.1182 This effect is particularly relevant \emph{within} a size zone, most noticeable as the data lines all going up across the Small box. 1183 1183 Now, with size conditioned, \VRef[Figure]{f:zoomin-rel-i-swift} has the trends inside a zone box being flat. 1184 1184 This flatness gives \subref*{f:histo-rel-i-swift} nicely separated histograms. … … 1186 1186 Specific size is the only factor conditioned in this view. 1187 1187 This choice was made to keep the relationship between \VRef[Figures]{f:zoomin-abs-i-swift} and \VRef[]{:zoomin-rel-i-swift} perceptible. 1188 By contrast, general comparisons like \VRef[Figure]{fig:plot-list-1ord} condition on more, generally, everyt ing not presented.1188 By contrast, general comparisons like \VRef[Figure]{fig:plot-list-1ord} condition on more, generally, everything not presented. 1189 1189 Its physical-factor breakdown conditions on use case and framework, but not on physical factors; its other two breakdowns are defined similarly. 1190 1190 1191 1191 The noteworthy performance hop, in this example, is LQ-@list@, which \VRef[Figures]{f:zoomin-abs-i-swift} has as consistently slow in the Small range, and consistently fast in the Medium range. 1192 Therefore, in \VRef{f:zoomin-histo-i-swift}, its two per-size-zone histograms are far apart, and its cross-size-zone histogram is bim dal.1192 Therefore, in \VRef{f:zoomin-histo-i-swift}, its two per-size-zone histograms are far apart, and its cross-size-zone histogram is bimodal. 1193 1193 Hops and distribution contributions, like this one, are common. 1194 1194 They are attention-grabbing curiosities when comparing nearly any pair of individual configurations. … … 1208 1208 1209 1209 A condensed graphing style is used in subsequent plots to present this amount of data. 1210 \VRef[Figure]{fig:plot-list-rel} shows how the condensed graphing style is generated from ind vidual-configuration measures.1210 \VRef[Figure]{fig:plot-list-rel} shows how the condensed graphing style is generated from individual-configuration measures. 1211 1211 \VRef[Figure]{f:zoomin-rel-i-swift} is formed from the data in \VRef[Figure]{f:zoomin-abs-i-swift}, transformed on the Y-axis to show duration relative to the mean across all four frameworks, at each specific size. 1212 \VRef[Figure]{f:zoomin-histo-i-swift} con senses the interesting data within the two boxes (Small/Medium) and their combination (Both).1212 \VRef[Figure]{f:zoomin-histo-i-swift} condenses the interesting data within the two boxes (Small/Medium) and their combination (Both). 1213 1213 This graph plots a vertical histogram for each of the 4 frameworks. 1214 1214 A data point on \VRef[Figure]{f:zoomin-abs-i-swift} is one-to-one with a point on \VRef[Figure]{f:zoomin-rel-i-swift}; each gives one IC. 1215 Repeatability of the experime t being established previously, retry variance information (error bar on an individual-configuration point) is discarded, and only the expected performance of an IC (mean of its middle three out of five trials) is promoted into the histograms.1215 Repeatability of the experiment being established previously, retry variance information (error bar on an individual-configuration point) is discarded, and only the expected performance of an IC (mean of its middle three out of five trials) is promoted into the histograms. 1216 1216 Each histogram bin (light-shaded area) counts the number of ICs whose expected performance falls in the bin's range. 1217 1217 A histogram's girth indicates the diversity of its qualifying configurations' performance expectations. … … 1445 1445 1446 1446 The second breakdown, movement and polarity (middle), the responses are more subdued. 1447 Note, LQ-list has no repres ntation in these comparisons because it only supports stacks that push and pop with the first element.1447 Note, LQ-list has no representation in these comparisons because it only supports stacks that push and pop with the first element. 1448 1448 \CFA is completely stable under movement and polarity changes. 1449 1449 \uCpp and LQ show modest responses favouring queues and insertion at last. … … 1472 1472 Speculation is that \CFA's increased data dependency, a result of the tagging scheme, pairs poorly with the aliasing implied by queue movement. 1473 1473 The aliasing, at length 1, is: the head's first element is the head's last element. 1474 With stack movement, one of these names for the first element is re aused for both insert and remove.1474 With stack movement, one of these names for the first element is reused for both insert and remove. 1475 1475 While with queue movement, both names are used in alternation. 1476 1476 … … 1479 1479 The insight for contextualizing this issue was to inspect both length and width. 1480 1480 1481 The issue is seen as practically mitigated by noticing that the difficu tly fades away as width increases.1481 The issue is seen as practically mitigated by noticing that the difficulty fades away as width increases. 1482 1482 This effect is seen both in \VRef[Figure]{fig:plot-list-short}'s easement across the top triangle rows, and, zoomed farther out, in \VRef[Figure]{fig:plot-list-wide}. 1483 1483 1484 1484 Increasing the width matters to the aliasing hypothesis. 1485 1485 In a narrow experiment, one element's insert and remove happen in rapid succession. 1486 So, the two aliases are exerci ed closer together, making a data hazard (that lacks ideal hardware treatment) stretch the instruction-pipeline schedule more significantly.1486 So, the two aliases are exercised closer together, making a data hazard (that lacks ideal hardware treatment) stretch the instruction-pipeline schedule more significantly. 1487 1487 Increasing the width adds harness-induced gaps between the uses of each alias, behind which a potential hazard can hide. 1488 1488 1489 1489 In the practical scenario that judges length-1 performance as relevant, width 1 is contrived. 1490 A thread putting itself on an often-empty waiters' list is not doing so on one such list repeatedly, at least not without taking other situation-i duced pauses.1490 A thread putting itself on an often-empty waiters' list is not doing so on one such list repeatedly, at least not without taking other situation-induced pauses. 1491 1491 1492 1492 Thus, the congestion at low width + length comes from the harness using repetition (in order to obtain a measurable time). 1493 1493 It does not reflect the situation that motivates the legitimate desire for good length-1 performance. 1494 1494 1495 There likely is a real hazard, unique to the \CFA framework, when a queue movement is repeated on a tiny list \emph{without other interven ting action}.1495 There likely is a real hazard, unique to the \CFA framework, when a queue movement is repeated on a tiny list \emph{without other intervening action}. 1496 1496 Doing so is believed to occur only in contrived situations. 1497 1497 … … 1574 1574 1575 1575 From the perspective of assessing winning/losing frameworks, these physical effects are noise. 1576 So, subsequent analysis conditions on the ph isical effects.1576 So, subsequent analysis conditions on the physical effects. 1577 1577 That is, it supposes you are put into an unknown physical situation (that is one of the four being tested), then presents all the ways your outcome could change as a result of non-physical factors, assuming that the physical situation is kept constant. 1578 It does do by presenting results relative to the mean of the physical quadrant (\VRef[fig]{fig:plot-list-mchn-szz} histogram) to which it belo gs.1579 With this adjustment, absolute duration values (in n onsecods) are lost.1578 It does do by presenting results relative to the mean of the physical quadrant (\VRef[fig]{fig:plot-list-mchn-szz} histogram) to which it belongs. 1579 With this adjustment, absolute duration values (in nanoseconds) are lost. 1580 1580 In return, the physical quadrants are re-combined, enabling assessment of the non-physical factors. 1581 1581 \end{comment} … … 1630 1630 This state is how LQ leaves a removed element; LQ does not offer an is-listed query. 1631 1631 \item[no-iter(ation)] Removing support for well-terminating iteration. 1632 The \CFA list uses bit-manipulation tagging on link poi ters (rather than \eg null links) to express, ``No more elements this way.''1632 The \CFA list uses bit-manipulation tagging on link pointers (rather than \eg null links) to express, ``No more elements this way.'' 1633 1633 This tagging has the cost of submitting a retrieved value to the ALU, and awaiting this operation's completion, before dereferencing a link pointer. 1634 1634 In some cases, the is-terminating bit is transferred from one link to another, or has a similar influence on a resulting link value; this logic adds register pressure and more data dependency.
Note:
See TracChangeset
for help on using the changeset viewer.