Changeset 5faa3a5


Ignore:
Timestamp:
Apr 28, 2026, 8:56:08 AM (23 hours ago)
Author:
Peter A. Buhr <pabuhr@…>
Branches:
master
Parents:
bf8112b
Message:

spelling corrections, and final proofreading

Location:
doc/theses/mike_brooks_MMath
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/array.tex

    rbf8112b r5faa3a5  
    17661766it can also do these other cool checks, but watch how I can mess with its conservativeness and termination
    17671767
    1768 Two contemporary array-centric languages, Dex\cite{arr:dex:long} and Futhark\cite{arr:futhark:tytheory}, contribute similar, restricted dependent types for tracking array length.
     1768Two contemporary array-centric languages, Dex~\cite{arr:dex:long} and Futhark~\cite{arr:futhark:tytheory}, contribute similar, restricted dependent types for tracking array length.
    17691769Unlike \CFA, both are garbage-collected functional languages.
    17701770Because they are garbage-collected, referential integrity is built-in, meaning that the heavyweight analysis, that \CFA aims to avoid, is unnecessary.
     
    19361936\begin{itemize}
    19371937        \item
    1938         Alice holds and array and knows its size.
     1938        Alice holds an array and knows its size.
    19391939        \item
    19401940        Bob offers a service that can apply to an array of any size.
     
    19551955\end{itemize}
    19561956This interaction is the dual notion of existential polymorphism, working upon size information.
    1957 Work\cite{arr:futhark:tytheory} accompanying the array-centric language Futhark has detailed its theory.
     1957Work~\cite{arr:futhark:tytheory} accompanying the array-centric language Futhark has detailed its theory.
    19581958
    19591959An existential size helps cope with unnecessary repetition that occurs as program size increases and more intermediaries separate an array producer and consumer.
     
    19751975The intermediaries no longer need size information in scope to pass one along.
    19761976@SchoolRef@ stops the polymorphism of @School@ from flowing farther outward.
    1977 The final consumer need only open the box, discover what's inside, and proceed.
     1977The final consumer need only open the box, discover what is inside, and proceed.
    19781978\begin{cfa}
    19791979forall( [C], [S] ) static void process( School(C, S) * );
    19801980void receive( SchoolRef r ) {
    19811981        with( r ) {
    1982                 classes; students; // are in scope
    1983                 School( classes, students ) * s2 = school; // we can state the type
    1984                 process( s2 ); // we can communicate the type
     1982                classes; students;                              $\C{// are in scope}$
     1983                School( classes, students ) * s2 = school; $\C{// state the type}$
     1984                process( s2 );                                  $\C{// communicate the type}$
    19851985        }
    19861986}
     
    19941994The programmer writes code in which it appears that a bound check ought to be unnecessary, but feedback about success or failure there is indirect.
    19951995
    1996 Dex\cite{arr:dex:long} is a contemporary array-centric language that follows in the Ada array tradition\cite{arr:ada:learn} of designating a type to be used for subscripting an array.
     1996Dex~\cite{arr:dex:long} is a contemporary array-centric language that follows in the Ada array tradition~\cite{arr:ada:learn} of designating a type to be used for subscripting an array.
    19971997While \CFA has its new @[N]@ living in the type system, this @N@ is used to represent an array's size.
    19981998An indexing type would be the type of @i@, in @ar[i]@.
    19991999\begin{cfa}
    20002000const size_t n = ask_the_user();
    2001 typedef range(0, n) till_n_t; // does not exist in CFA yet
    2002 till_n_t x = 42; // static reject
    2003 till_n_t y = check(42); // static accept, dynamic check
     2001typedef @range(0, n)@ till_n_t; $\C{// does not exist in CFA yet}$
     2002till_n_t x = 42; $\C{// static reject}$
     2003till_n_t y = check(42); $\C{// static accept, dynamic check}$
    20042004void f( array(int, n) & ar, till_n_t i ) {
    2005         sout | ar[ i ];  // bound check unnecessary
     2005        sout | ar[ i ];  $\C{// bound check unnecessary}$
    20062006}
    20072007\end{cfa}
     
    20092009\begin{cfa}
    20102010void f2( array(int, n) & ar, size_t i ) {
    2011         sout | ar[ i ];  // bound check necessary
     2011        sout | ar[ i ];  $\C{// bound check necessary}$
    20122012}
    20132013\end{cfa}
     
    20192019If the user succeeds at needing no explicit bound checks, then the user is assured that there are no bound checks.
    20202020
    2021 Prior \CFA work\cite{Liang24} has touched upon this topic.
     2021Prior \CFA work~\cite{Liang24} has touched upon this topic.
    20222022Therein, a user's enumeration type is adapted to serve as the domain of an array.
    20232023However, this work progressed as far as inferring an array size from the enumeration type, \eg that @int happy_hours[ weekday_t ]@ should be a declaration with length seven.
  • doc/theses/mike_brooks_MMath/background.tex

    rbf8112b r5faa3a5  
    463463however, it requires all dimensions except the first to be specified at compile time, \eg @m[][6]@, allowing all subscripting stride calculations to be generated with constants.
    464464Hence, every matrix passed to @fp1@ must have exactly 6 columns but the row size can vary.
    465 The variable-dimension approach (right) ignores (violates) the type system, \ie the parameter type has no suggestion of mutidimensionality and some acrobatics are required for a w\footnote{
     465The variable-dimension approach (right) ignores (violates) the type system, \ie the parameter type has no suggestion of multidimensionality and some acrobatics are required for a w\footnote{
    466466        One may be tempted to phrase a call as \lstinline{fp2( 4, 4, vm1 )}, but this call is ill-typed.  Argument \lstinline{vm1} could match parameter declarations \lstinline{int m[][4]} or \lstinline{int (*m)[4]}.  But only the argument \lstinline{&vm1[0][0]}, or its equivalent, but confusing, \lstinline{vm1[0]}, relate \lstinline{vm1} to parameter type \lstinline{int*}.
    467467}, and subscripting is performed manually using pointer arithmetic in the macro @sub@.
     
    534534Nevertheless, the C array-of-array form is still important for special circumstances.
    535535
    536 \VRef[Figure]{f:ContiguousNon-contiguous} shows a powerful extension made in C99, for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.  Though GNU-flavoured C++ eventually got VLAs, it never got this enhancement for managing a multidimensionl VLA parameter. }
     536\VRef[Figure]{f:ContiguousNon-contiguous} shows a powerful extension made in C99, for manipulating contiguous \vs non-contiguous arrays.\footnote{C90 also supported non-contiguous arrays.  Though GNU-flavoured C++ eventually got VLAs, it never got this enhancement for managing a multidimensional VLA parameter. }
    537537For contiguous-array arguments (including VLA), C99 conjoins one or more of the parameters as a downstream dimension(s), \eg @cols@, implicitly using this parameter to compute the row stride of @m@.
    538538Hence, if the declaration of @fc@ is changed to:
     
    11841184\end{c++}
    11851185It turns out that @TAILQ_REMOVE@ uses its ``which element to remove'' parameter at several places, importantly, one occurring after the removal's changes are in progress.
    1186 When the second use encounters the macro substitution @TAILQ_LAST(reqs, reql)@, it obtains a different node than the first use got, with the removal's changes having alredy started.
     1186When the second use encounters the macro substitution @TAILQ_LAST(reqs, reql)@, it obtains a different node than the first use got, with the removal's changes having already started.
    11871187This macro-induced phenomenon led to an invalid pointer dereference (safety violation), at a run-time well after the removal at issue (costly to resolve).
    11881188
  • doc/theses/mike_brooks_MMath/intro.tex

    rbf8112b r5faa3a5  
    7171Among the linked structures, the list's defining characteristic is maintaining a user-explicit total order (chain shape).
    7272Element search can be by position (as for array) or by value; it is sometimes presented as a subscript.
    73 In a more complex structure, the linked shape may be system managed (user impicit) by being a function of a designated ``key'' datum, \eg often for a hash table.
     73In a more complex structure, the linked shape may be system managed (user implicit) by being a function of a designated ``key'' datum, \eg often for a hash table.
    7474Though such schemes often give better-than-$O(n)$ lookup, the linked list's user-explicit shape limits what the system can provide, to fast step/iteration for user-``nearby'' data, and slow exhaustive search/seek otherwise.
    7575Linked types are dynamically sized by adding and removing nodes using link fields internal or external to the list elements (nodes).
  • doc/theses/mike_brooks_MMath/list.tex

    rbf8112b r5faa3a5  
    10101010Therefore, the duration's response to size is not a steady worsening as size increases.
    10111011Often, each size-independent configuration responds to size increases in steps of slowdown.
    1012 Occasionally a slowdown step is followed by some perforamnce increase, where an incurred penalty begins to amortize away.
     1012Occasionally a slowdown step is followed by some performance increase, where an incurred penalty begins to amortize away.
    10131013Hence, performance results can have interesting jitter as size increases.
    10141014The analysis treats these behaviours as incidental.
     
    10161016Rather, size zones are picked, specific effects inside of a zone are averaged away, and the story at one zone is compared to that at another.
    10171017
    1018 % It is true, but perhaps not obvious, that buildind and destroying long lists is slower than building and destroying short lists.
     1018% It is true, but perhaps not obvious, that building and destroying long lists is slower than building and destroying short lists.
    10191019% Obviously, indeed, it takes longer to fuse and divide a hundred neighbours than five.
    10201020% But the key metric in this work, AII, is about a single link--unlink.
     
    10281028% So, duration's response to size is not a steady worsening as size increases.
    10291029% Rather, each size-independent configuration often responds to size increases with leaps of worsening.
    1030 % Occasionally a leap is even followed size-run of retrograde response, where a suddenly incurred penalty has a chance to ammortize away.
     1030% Occasionally a leap is even followed size-run of retrograde response, where a suddenly incurred penalty has a chance to amortize away.
    10311031% The frameworks tend to leapfrog over each other, at different points, as size increases.
    10321032%
     
    10461046        }
    10471047  \end{tabular}
    1048   \caption[Variety of IR duration responses to list length, at small--medium lengths]{Variety of IR duration responses to list length, at small--medium lengths.  Two example use cases are shown: I, stack movement with head-only access (plot a); IX, queue movement with element-oriented removal access (plot b); both use cases have insert-first polarity.  One example is run on each machine: UC-I on AMD (ploat a); UC-IX on Intel (plot b).  Lower is better.}
     1048  \caption[Variety of IR duration responses to list length, at small--medium lengths]{Variety of IR duration responses to list length, at small--medium lengths.  Two example use cases are shown: I, stack movement with head-only access (plot a); IX, queue movement with element-oriented removal access (plot b); both use cases have insert-first polarity.  One example is run on each machine: UC-I on AMD (plot a); UC-IX on Intel (plot b).  Lower is better.}
    10491049  \label{fig:plot-list-zoomin-abs}
    10501050\end{figure}
     
    11221122\subsubsection{Recap and Master Legend}
    11231123
    1124 For experiments performed in later sections, there are 12 use cases, which are all combinations of 2 movents, 2 polarities and 3 accessors.
    1125 There are 4 pysical contexts, which are all combinations of 2 machines and 2 size (length) zones.
     1124For experiments performed in later sections, there are 12 use cases, which are all combinations of 2 movements, 2 polarities and 3 accessors.
     1125There are 4 physical contexts, which are all combinations of 2 machines and 2 size (length) zones.
    11261126There are 3.25 frameworks.
    11271127This accounting considers how LQ-list supports only the movement--polarity combination ``stack, insert first.''
     
    11541154        }
    11551155  \end{tabular}
    1156   \caption[IR duration, transformed for general anaysis]{
    1157         IR duration, transformed for general anaysis.
     1156  \caption[IR duration, transformed for general analysis]{
     1157        IR duration, transformed for general analysis.
    11581158        The analysis follows the single example setup of \VRef[Figure]{f:zoomin-abs-i-swift}, \ie Use Case I on AMD; there, IR is given as absolute duration.
    11591159        Plot (a) transforms the source dataset by conditioning on specific size.
     
    11801180
    11811181The effect of conditioning on specific size erases the fact that \VRef[Figure]{fig:plot-list-zoomin-abs} shows, aside from the coarse hops, all frameworks getting smoothly slower as size increases.
    1182 This effect is partiularly relevant \emph{within} a size zone, most noticeable as the data lines all going up across the Small box.
     1182This effect is particularly relevant \emph{within} a size zone, most noticeable as the data lines all going up across the Small box.
    11831183Now, with size conditioned, \VRef[Figure]{f:zoomin-rel-i-swift} has the trends inside a zone box being flat.
    11841184This flatness gives \subref*{f:histo-rel-i-swift} nicely separated histograms.
     
    11861186Specific size is the only factor conditioned in this view.
    11871187This choice was made to keep the relationship between \VRef[Figures]{f:zoomin-abs-i-swift} and \VRef[]{:zoomin-rel-i-swift} perceptible.
    1188 By contrast, general comparisons like \VRef[Figure]{fig:plot-list-1ord} condition on more, generally, everyting not presented.
     1188By contrast, general comparisons like \VRef[Figure]{fig:plot-list-1ord} condition on more, generally, everything not presented.
    11891189Its physical-factor breakdown conditions on use case and framework, but not on physical factors; its other two breakdowns are defined similarly.
    11901190
    11911191The noteworthy performance hop, in this example, is LQ-@list@, which \VRef[Figures]{f:zoomin-abs-i-swift} has as consistently slow in the Small range, and consistently fast in the Medium range.
    1192 Therefore, in \VRef{f:zoomin-histo-i-swift}, its two per-size-zone histograms are far apart, and its cross-size-zone histogram is bimdal.
     1192Therefore, in \VRef{f:zoomin-histo-i-swift}, its two per-size-zone histograms are far apart, and its cross-size-zone histogram is bimodal.
    11931193Hops and distribution contributions, like this one, are common.
    11941194They are attention-grabbing curiosities when comparing nearly any pair of individual configurations.
     
    12081208
    12091209A condensed graphing style is used in subsequent plots to present this amount of data.
    1210 \VRef[Figure]{fig:plot-list-rel} shows how the condensed graphing style is generated from indvidual-configuration measures.
     1210\VRef[Figure]{fig:plot-list-rel} shows how the condensed graphing style is generated from individual-configuration measures.
    12111211\VRef[Figure]{f:zoomin-rel-i-swift} is formed from the data in \VRef[Figure]{f:zoomin-abs-i-swift}, transformed on the Y-axis to show duration relative to the mean across all four frameworks, at each specific size.
    1212 \VRef[Figure]{f:zoomin-histo-i-swift} consenses the interesting data within the two boxes (Small/Medium) and their combination (Both).
     1212\VRef[Figure]{f:zoomin-histo-i-swift} condenses the interesting data within the two boxes (Small/Medium) and their combination (Both).
    12131213This graph plots a vertical histogram for each of the 4 frameworks.
    12141214A data point on \VRef[Figure]{f:zoomin-abs-i-swift} is one-to-one with a point on \VRef[Figure]{f:zoomin-rel-i-swift}; each gives one IC.
    1215 Repeatability of the experimet being established previously, retry variance information (error bar on an individual-configuration point) is discarded, and only the expected performance of an IC (mean of its middle three out of five trials) is promoted into the histograms.
     1215Repeatability of the experiment being established previously, retry variance information (error bar on an individual-configuration point) is discarded, and only the expected performance of an IC (mean of its middle three out of five trials) is promoted into the histograms.
    12161216Each histogram bin (light-shaded area) counts the number of ICs whose expected performance falls in the bin's range.
    12171217A histogram's girth indicates the diversity of its qualifying configurations' performance expectations.
     
    14451445
    14461446The second breakdown, movement and polarity (middle), the responses are more subdued.
    1447 Note, LQ-list has no represntation in these comparisons because it only supports stacks that push and pop with the first element.
     1447Note, LQ-list has no representation in these comparisons because it only supports stacks that push and pop with the first element.
    14481448\CFA is completely stable under movement and polarity changes.
    14491449\uCpp and LQ show modest responses favouring queues and insertion at last.
     
    14721472Speculation is that \CFA's increased data dependency, a result of the tagging scheme, pairs poorly with the aliasing implied by queue movement.
    14731473The aliasing, at length 1, is: the head's first element is the head's last element.
    1474 With stack movement, one of these names for the first element is reaused for both insert and remove.
     1474With stack movement, one of these names for the first element is reused for both insert and remove.
    14751475While with queue movement, both names are used in alternation.
    14761476
     
    14791479The insight for contextualizing this issue was to inspect both length and width.
    14801480
    1481 The issue is seen as practically mitigated by noticing that the difficutly fades away as width increases.
     1481The issue is seen as practically mitigated by noticing that the difficulty fades away as width increases.
    14821482This effect is seen both in \VRef[Figure]{fig:plot-list-short}'s easement across the top triangle rows, and, zoomed farther out, in \VRef[Figure]{fig:plot-list-wide}.
    14831483
    14841484Increasing the width matters to the aliasing hypothesis.
    14851485In a narrow experiment, one element's insert and remove happen in rapid succession.
    1486 So, the two aliases are exercied closer together, making a data hazard (that lacks ideal hardware treatment) stretch the instruction-pipeline schedule more significantly.
     1486So, the two aliases are exercised closer together, making a data hazard (that lacks ideal hardware treatment) stretch the instruction-pipeline schedule more significantly.
    14871487Increasing the width adds harness-induced gaps between the uses of each alias, behind which a potential hazard can hide.
    14881488
    14891489In the practical scenario that judges length-1 performance as relevant, width 1 is contrived.
    1490 A thread putting itself on an often-empty waiters' list is not doing so on one such list repeatedly, at least not without taking other situation-iduced pauses.
     1490A thread putting itself on an often-empty waiters' list is not doing so on one such list repeatedly, at least not without taking other situation-induced pauses.
    14911491
    14921492Thus, the congestion at low width + length comes from the harness using repetition (in order to obtain a measurable time).
    14931493It does not reflect the situation that motivates the legitimate desire for good length-1 performance.
    14941494
    1495 There likely is a real hazard, unique to the \CFA framework, when a queue movement is repeated on a tiny list \emph{without other interventing action}.
     1495There likely is a real hazard, unique to the \CFA framework, when a queue movement is repeated on a tiny list \emph{without other intervening action}.
    14961496Doing so is believed to occur only in contrived situations.
    14971497
     
    15741574
    15751575From the perspective of assessing winning/losing frameworks, these physical effects are noise.
    1576 So, subsequent analysis conditions on the phisical effects.
     1576So, subsequent analysis conditions on the physical effects.
    15771577That is, it supposes you are put into an unknown physical situation (that is one of the four being tested), then presents all the ways your outcome could change as a result of non-physical factors, assuming that the physical situation is kept constant.
    1578 It does do by presenting results relative to the mean of the physical quadrant (\VRef[fig]{fig:plot-list-mchn-szz} histogram) to which it belogs.
    1579 With this adjustment, absolute duration values (in nonsecods) are lost.
     1578It does do by presenting results relative to the mean of the physical quadrant (\VRef[fig]{fig:plot-list-mchn-szz} histogram) to which it belongs.
     1579With this adjustment, absolute duration values (in nanoseconds) are lost.
    15801580In return, the physical quadrants are re-combined, enabling assessment of the non-physical factors.
    15811581\end{comment}
     
    16301630  This state is how LQ leaves a removed element; LQ does not offer an is-listed query.
    16311631\item[no-iter(ation)] Removing support for well-terminating iteration.
    1632   The \CFA list uses bit-manipulation tagging on link poiters (rather than \eg null links) to express, ``No more elements this way.''
     1632  The \CFA list uses bit-manipulation tagging on link pointers (rather than \eg null links) to express, ``No more elements this way.''
    16331633  This tagging has the cost of submitting a retrieved value to the ALU, and awaiting this operation's completion, before dereferencing a link pointer.
    16341634  In some cases, the is-terminating bit is transferred from one link to another, or has a similar influence on a resulting link value; this logic adds register pressure and more data dependency.
Note: See TracChangeset for help on using the changeset viewer.