Changeset 6224eeb for doc/theses


Ignore:
Timestamp:
Apr 27, 2026, 8:05:23 PM (3 days ago)
Author:
Michael Brooks <mlbrooks@…>
Branches:
master
Children:
c910d308
Parents:
b3e77cc
Message:

change to consistent "collection" over "container"

Location:
doc/theses/mike_brooks_MMath
Files:
7 edited

Legend:

Unmodified
Added
Removed
  • doc/theses/mike_brooks_MMath/array.tex

    rb3e77cc r6224eeb  
    919919However, function @print1d_cstyle@ is asserting too much knowledge about its parameter @r@ for printing either a row slice or a column slice.
    920920Specifically, declaring the parameter @r@ with type @array@ means that @r@ is contiguous, which is unnecessarily restrictive.
    921 That is, @r@ need only be of a container type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@) with managed length @N@.
     921That is, @r@ need only be of a collection type that offers a subscript operator (of type @ptrdiff_t@ $\rightarrow$ @float@) with managed length @N@.
    922922The new-array library provides the trait @ar@, so-defined.
    923923With it, the original declaration can be generalized with the same body.
     
    17211721
    17221722\CC~23's @std::mdspan@ adds multidimensional indexing and reshaping capabilities analogous to those built into \CFA's @array@.
    1723 Like @span@, it works over multiple underlying container types.
     1723Like @span@, it works over multiple underlying collection types.
    17241724But neither @span@ nor @mdspan@ augments the available allocation options.
    17251725
  • doc/theses/mike_brooks_MMath/background.tex

    rb3e77cc r6224eeb  
    225225\end{cquote}
    226226% I believe it is possible to refute any code examples purporting to show pointer arithmetic is faster than subscripting.
    227 % This believe stems from the performance work I did on \CFA arrays, where it is possible to generate equivalent \CFA subscripting and performance to C subscripting.
     227% This belief stems from the performance work I did on \CFA arrays, where it is possible to generate equivalent \CFA subscripting and performance to C subscripting.
    228228
    229229Unfortunately, C semantics want a programmer to \emph{believe} an array variable is a \emph{pointer to its first element}.
  • doc/theses/mike_brooks_MMath/conclusion.tex

    rb3e77cc r6224eeb  
    11\chapter{Conclusion}
    22
    3 This thesis performed a detailed examination of the three most important high-level containers in many programming languages: array, linked-list, and string.
    4 The goal of the work is to make containers easier to use, performant and safer.
    5 Since some subset of these three containers are used in almost every program in every programming language, this is a laudable goal.
     3This thesis performed a detailed examination of the three most important high-level collections in many programming languages: array, linked-list, and string.
     4The goal of the work is to make collections easier to use, performant and safer.
     5Since some subset of these three collections are used in almost every program in every programming language, this is a laudable goal.
    66Accomplishing this goal in C is difficult because these features are poorly designed.
    77In contrast, \CFA's advanced type system and language features plus my critical design choices made it possible to provide better support with significant safety.
    88The result is application code that is easier to write, understand, maintain, and significantly safer from hacker attach-vectors.
    9 Finally, the performance sections for each container demonstrated that safety does not have to reduce performance, \ie rolling-your-own container is now an excuse not a reason.
     9Finally, the performance sections for each collection demonstrated that safety does not have to reduce performance, \ie rolling-your-own collection is now an excuse not a reason.
    1010
    1111
     
    4242\section{Future Work}
    4343
    44 All three forms of containers presented in this work are in their nascence, both in design and implementation.
     44All three forms of collections presented in this work are in their nascence, both in design and implementation.
    4545This work provides the foundation for future \CFA students to add more functionality along with robust and performant implementations.
    4646Completing the syntax change from @array@ to C-style for dimensions and subscripts is an important first step to fit with C-programmer expectation resulting in faster adoption.
  • doc/theses/mike_brooks_MMath/intro.tex

    rb3e77cc r6224eeb  
    11\chapter{Introduction}
    22
    3 All modern programming languages provide at least these three high-level containers (collections): array, linked-list, and string.
     3All modern programming languages provide at least these three high-level collections (containers): array, linked-list, and string.
    44Often, the array is part of the programming language, while linked lists are built from (recursive) pointer types, and strings from arrays and/or linked lists.
    55For all three types, languages and/or their libraries supply varying degrees of high-level mechanisms for manipulating these objects at the bulk and component levels, such as copying, slicing, extracting, iterating among, adding and removing elements.
     
    3939Therefore, hardening the three primary collections goes a long way to making the majority of C programs safer, eliminating major hacker attack vectors.
    4040
    41 My work looks at extending these foundational container types in the programming language \CFA, which is a new dialect of the C programming language.
     41My work looks at extending these foundational collection types in the programming language \CFA, which is a new dialect of the C programming language.
    4242A primary goal of \CFA~\cite{Cforall} is 99\% backward compatibility with C, while maintaining a look and feel that matches with C programmer experience and intuition.
    4343This goal requires ``thinking inside the box'' to engineer new features that ``work and play'' with C and its massive legacy code-base.
    4444An equally important goal is balancing good performance with safety.
    4545
    46 The thesis describes improvements I made to the \CFA language design, both syntax and semantics, to support the container features, and the source code created within the \CFA compiler, libraries, and runtime system to implement these features.
     46The thesis describes improvements I made to the \CFA language design, both syntax and semantics, to support the collection features, and the source code created within the \CFA compiler, libraries, and runtime system to implement these features.
    4747This work leverages preexisting work within the compiler's type and runtime systems generated by prior graduate students working on the \CFA project.
    4848
     
    5050\section{Array}
    5151
    52 An array provides a homogeneous container with $O(1)$ access to elements using subscripting.
     52An array provides a homogeneous collection with $O(1)$ access to elements using subscripting.
    5353Higher-level operations like array slicing (single or multidimensional) may have significantly higher cost, but provide a better programmer experience.
    5454The array size can be static, dynamic but fixed after creation, or dynamic and variable after creation.
     
    6464\section{Linked List}
    6565
    66 A linked-list provides a homogeneous container often with $O(n)$ access to elements using successor and predecessor operations that involve pointer chasing.
     66A linked-list provides a homogeneous collection often with $O(n)$ access to elements using successor and predecessor operations that involve pointer chasing.
    6767It is the most foundational case of the linked data structures, which include trees and hash tables.
    6868Among the linked structures, the list's defining characteristic is maintaining a user-explicit total order (chain shape).
     
    8383hash search table consisting of a key (string) with associated data (@<search.h>@)
    8484\end{itemize}
    85 Because these container libraries can be restrictive or awkward to use, C programmers often build bespoke linked data-structures, which further increases the potential for memory errors.
     85Because these collection libraries can be restrictive or awkward to use, C programmers often build bespoke linked data-structures, which further increases the potential for memory errors.
    8686
    8787
     
    117117\begin{enumerate}[leftmargin=*]
    118118\item
    119 The three primary container types in C are difficult to understand, teach, and get right because they are too low level.
    120 Providing higher-level, feature-rich versions of these containers in \CFA is a major component of the primary goal.
     119The three primary collection types in C are difficult to understand, teach, and get right because they are too low level.
     120Providing higher-level, feature-rich versions of these collections in \CFA is a major component of the primary goal.
    121121The result is a simplified programming experience, which increases productivity and maintainability.
    122122\item
    123 The new container types must be as correct and safe to use as those in other modern programming languages, to elide the concerns of industry, government, and military~\cite{ONCD}.
     123The new collection types must be as correct and safe to use as those in other modern programming languages, to elide the concerns of industry, government, and military~\cite{ONCD}.
    124124Prior approaches focus on out-of-bound array accesses using a model-based approach (ASCET) in embedded systems (\eg cars)~\cite{Blache19}, and general and null-terminated string arrays using a \CC template syntax (Checked C)~\cite{Elliott18,Ruef19}.
    125125Both White House~\cite{WhiteHouse24} and DARPA~\cite{DARPA24} recently released a recommendation to \emph{move away} from C and \CC, because of cybersecurity threats exploiting vulnerabilities in these languages.
  • doc/theses/mike_brooks_MMath/list.tex

    rb3e77cc r6224eeb  
    199199Within the @with@, the code acts as if there is only one list axis, without explicit casting.
    200200
    201 Unlike the \CC template container-types, the \CFA library works completely within the type system;
     201Unlike the \CC template collection types, the \CFA library works completely within the type system;
    202202both @dlink@ and @dlist@ are ordinary types, not language macros.
    203203There is no textual expansion other than header-included static-inline function for performance.
     
    544544Hence, to be \emph{exception safe}, all internal list operations must complete before the copy/move so the list is consistent should the return fail.
    545545This coding style can result in contrived code, but is usually possible;
    546 however, it requires the container designer to anticipate the potential throw.
    547 (Note, this anticipation issue is pervasive in exception systems, not just with containers.)
    548 The solution moves the coding complexity from the container designer to the programer~\cite[ch~10, part 3]{Sutter99}.
    549 First, obtain the node, which might fail, but the container is unmodified.
    550 Second, remove the node, which modifies the container without the possibly of an exception.
     546however, it requires the collection designer to anticipate the potential throw.
     547(Note, this anticipation issue is pervasive in exception systems, not just with collections.)
     548The solution moves the coding complexity from the collection designer to the programer~\cite[ch~10, part 3]{Sutter99}.
     549First, obtain the node, which might fail, but the collection is unmodified.
     550Second, remove the node, which modifies the collection without the possibly of an exception.
    551551This movement of responsibility increases the cognitive effort for programmers.
    552552Unfortunately, this \emph{single-responsibility principle}, \ie preferring separate operations, is often repeated as a necessary requirement rather an an optional one.
  • doc/theses/mike_brooks_MMath/uw-ethesis-frontpgs.tex

    rb3e77cc r6224eeb  
    1313        \vspace*{1.0cm}
    1414
    15   % TODO: punch up the title, thinking getting interest in the department-wide posting of my presentation
    16   % Modern collections for C
    17         {\Huge\bf \CFA Container Library}
     15        {\Huge\bf \CFA Collection Library}
    1816
    1917        \vspace*{1.0cm}
     
    131129\CFA strives to fix issues in C, chief among them safety.
    132130This thesis presents a significant step forward in \CFA's goal to remove unsafe pointer operations.
    133 It describes improvements to the \CFA language design to support advanced container features.
     131It describes improvements to the \CFA language design to support advanced collection features.
    134132These features are implemented across the \CFA compiler and runtime libraries.
    135133The results maintain another \CFA goal of offering strong backwards compatibility with C.
    136134To achieve these goals, this work leverages preexisting \CFA contributions by prior students, particularly novel applications of the compiler's type system.
    137135
    138 All modern programming languages provide these three high-level containers (collections): array, linked-list, and string.
     136All modern programming languages provide these three high-level collections (containers): array, linked-list, and string.
    139137Often, the array is part of the programming language, while linked lists are built from (recursive) pointer types, and strings from arrays and/or linked lists.
    140138For all three types, languages and/or their libraries supply varying degrees of high-level mechanisms for manipulating these objects at the bulk and component levels, such as copying, slicing, extracting, and iterating among elements.
     
    149147With the array, this case is made by showing complete erasure down to a naked C array, modulo runtime bound checks, which are removable more often than with Java-style length management.
    150148With the linked list and string, empirical measures are compared with C and \CC comparable libraries.
    151 These containers offer programmers workable alternatives to hand-rolling specialized libraries, which is a huge safety benefit, eliminating many system vulnerabilities.
     149These collections offer programmers workable alternatives to hand-rolling specialized libraries, which is a huge safety benefit, eliminating many system vulnerabilities.
    152150The results establish \CFA's position as a safety-forward programming alternative.
    153151
  • doc/theses/mike_brooks_MMath/uw-ethesis.tex

    rb3e77cc r6224eeb  
    183183    pdffitwindow=false,     % window fit to page when opened
    184184    pdfstartview={FitH},    % fits the width of the page to the window
    185     pdftitle={\CFA Container Library}, % title: CHANGE THIS TEXT!
     185    pdftitle={\CFA Collection Library}, % title: CHANGE THIS TEXT!
    186186    pdfauthor={Mike Brooks},    % author: CHANGE THIS TEXT! and uncomment this line
    187187    pdfsubject={Cforall},  % subject: CHANGE THIS TEXT! and uncomment this line
    188     pdfkeywords={Cforall} {container library} {C language}, % optional list of keywords
     188    pdfkeywords={Cforall} {container library} {collection library} {array} {linked list} {string} {C language}, % optional list of keywords
    189189    pdfnewwindow=true,      % links in new window
    190190    colorlinks=true,        % false: boxed links; true: colored links
Note: See TracChangeset for help on using the changeset viewer.