Context Navigation

-                      r292d6cf6
+                      rb7921d8
 \label{f:checkHarness}
 \end{figure}
+\section{Dimension Parameter Implementation}
+The core of the preexisting \CFA compiler already had the ``heavy equipment'' needed
+to provide the feature set just reviewed (up to bugs in cases not yet exercised).
+To apply this equipment in tracking array lengths, I encoded a dimension (array's length) as a type.
+The type in question does not describe any data that the program actually uses at runtime.
+It simply carries information through intermediate stages of \CFA-to-C lowering.
+And this type takes a form such that, once \emph{it} gets lowered, the result does the right thing.
+Furthermore, the @array@ type itself is really ``icing on the cake.''
+Presenting its full version is deferred until \VRef[Section]{s:ArrayMdImpl}
+(where added complexity needed for multiple dimensions is considered).
+But simplifications close enough for the present discussion are:
+\begin{cfa}
+        forall( [N] )
+        struct array_1d_float {
+                float items[N];
+        };
+        forall( T, [N] )
+        struct array_1d {
+                T items[N];
+        };
+\end{cfa}
+This structure pattern, plus a subscript operator, is all that @array@ provides.
+My main work is letting a programmer define
+such a structre (one whose type is parameterized by @[N]@)
+and functions that operate on it (these being similarly parameterized).
+The repurposed heavy equipment is
+\begin{itemize}
+\item
+        The resolver, providing values for a used declaration's type-system variables,
+        gathered from type information in scope at the usage site.
+\item
+        The box pass, encoding information about type parameters
+        into ``extra'' regular parameters/arguments on declarations and calls.
+        Notably, it conveys the size of a type @foo@ as a @__sizeof_foo@ parameter,
+        and rewrites the @sizeof(foo)@ expression as @__sizeof_foo@, i.e. a use of the parameter.
+\end{itemize}
+The rules for resolution had to be restricted slightly, in order to achieve important refusal cases.
+This work is detailed in \VRef[Section]{s:ArrayTypingC}.
+However, the resolution--boxing scheme, in its preexisting state, was already equipped to work on (desugared) dimension parameters.
+The discussion following explains the desugaring and how correct lowered code results.
+An even simpler structure, and a toy function on it, demonstrate what's needed for the encoding.
+\begin{cfa}
+        forall( [@N@] ) { // [1]
+                struct thing {};
+                void f( thing(@N@) ) { sout | @N@; } // [2], [3]
+        }
+        int main() {
+                thing( @10@ ) x;  f(x);  // prints 10, [4]
+                thing( 100 ) y;  f(y);  // prints 100
+                return 0;
+        }
+\end{cfa}
+This example has:
+\begin{enumerate}
+\item
+        The symbol @N@ being declared as a type variable (a variable of the type system).
+\item
+        The symbol @N@ being used to parameterize a type.
+\item
+        The symbol @N@ being used as an expression (value).
+\item
+        A value like 10 being used as an argument to the parameter @N@.
+\end{enumerate}
+The chosen solution being to encode the value @N@ \emph{as a type}, items 1 and 2 are immediately available for free.
+Item 3 needs a way to recover the encoded value from a (valid) type (and to reject invalid types occurring here).
+Item 4 needs a way to produce a type that encodes the given value.
+Because the box pass handles a type's size as its main datum, the encoding is chosen to use it.
+The production and recovery are then straightforward.
+\begin{itemize}
+\item
+        The value $n$ is encoded as a type whose size is $n$.
+\item
+        Given a dimension expression $e$, produce type @char[@$e$@]@ to represent it.
+        If $e$ evaluates to $n$ then the endoded type has size $n$.
+\item
+        Given a type $T$ (produced by these rules), recover the value that it represents with the expression @sizeof(@$T$@)@.
+        If $T$ has size $n$ then the recovery expression evaluates to $n$.
+\end{itemize}
+This desugaring is applied in a translation step before the resolver.
+The ``validate'' pass hosts it, along with several other canonicalizing and desugaring transformations (the pass's name notwithstanding).
+The running example is lowered to:
+\begin{cfa}
+        forall( @N*@ ) { // [1]
+                struct thing {};
+                void f( thing(@N@) ) { sout | @sizeof(N)@; } // [2], [3]
+        }
+        int main() {
+                thing( char[@10@] ) x;  f(x);  // prints 10, [4]
+                thing( char[100] ) y;  f(y);  // prints 100
+                return 0;
+        }
+\end{cfa}
+Observe:
+\begin{enumerate}
+\item
+        @N@ is now declared to be a type.
+        It is declared to be \emph{sized} (by the @*@), meaning that the box pass shall do its @sizeof(N)@--@__sizeof_N@ extra parameter and expression translation.
+\item
+        @thing(N)@ is a type; the argument to the generic @thing@ is a type (type variable).
+\item
+        The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is an ordinary expression.
+\item
+        The type of variable @x@ is another @thing(-)@ type;  the argument to the generic @thing@ is a type (array type).
+\end{enumerate}
+From this point, preexisting \CFA compilation takes over lowering it the rest of the way to C.
+Inspecting the result shows what the above translation achieves.
+A form that shows only the relevant changes of the box pass (as informed by the resolver), leaving the rest unadulterated, is:
+\begin{cfa}
+        // [1]
+        void f( size_t __sizeof_N, @void *@ ) { sout | @__sizeof_N@; } // [2], [3]
+        int main() {
+                struct __conc_thing_10 {} x;  f(@10@, &x);  // prints 10, [4]
+                struct __conc_thing_100 {} y;  f(@100@, &y);  // prints 100
+                return 0;
+        }
+\end{cfa}
+Observe:
+\begin{enumerate}
+\item
+        The type parameter @N@ is gone.
+\item
+        The type @thing(N)@ is (replaced by @void *@, but thereby effectively) gone.
+\item
+        The @sout...@ expression (being an application of the @?|?@ operator) has a second argument that is a regular variable (parameter) usage.
+\item
+        Information about the particular @thing@ instantiation (value 10) has moved, from the type, to a regular function-call argument.
+\end{enumerate}
+At the end of the desugaring and downstream processing, the original C idiom of ``pass both a pointer and a length parameter'' has been reconstructed.
+In the programmer-written form, only the thing is passed.
+The compiler's action produces the more complex form, which if handwritten, would be error-prone.
+Back at the very front end, the parsing changes, AST schema extensions, and validation rules for enabling the sugared user input are:
+\begin{itemize}
+\item
+        Recognize the form @[N]@ as a type-variable declaration within a @forall@.
+\item
+        Have the new brand of type-variable, \emph{Dimension}, in the AST form of a type-variable, to represent one parsed from @[-]@.
+\item
+        Allow a type variable to occur in expression position.  Validate (after parsing) that only dimension-branded type variables are used here.
+\item
+        Allow an expression to occur in type-argument position.  Brand the resulting type argument as a dimension.
+\item
+        Validate (after parsing), on a generic-type usage, \eg the type part of the declaration
+        \begin{cfa}
+                @array_1d( foo, bar ) x;@
+        \end{cfa}
+        that the brands on the generic arguments match the brands of the declared type variables.
+        Here, that @foo@ is a type and @bar@ is a dimension.
+\end{itemize}
 …
 TODO: Discuss the interaction of this dimension hoisting with the challenge of extra unification for cost calculation
 \section{Multidimensional Arrays}
 \label{toc:mdimpl}
 …
 \subsection{Multidimensional array implementation}
+\label{s:ArrayMdImpl}
 A multidimensional array implementation has three relevant levels of abstraction, from highest to lowest, where the array occupies \emph{contiguous memory}.

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset b7921d8 for doc/theses/mike_brooks_MMath

Legend:

doc/theses/mike_brooks_MMath/array.tex

Download in other formats: