Changeset 28bc8c8


Ignore:
Timestamp:
Mar 8, 2018, 3:16:30 PM (6 years ago)
Author:
Aaron Moss <a3moss@…>
Branches:
ADT, aaron-thesis, arm-eh, ast-experimental, cleanup-dtors, deferred_resn, demangler, enum, forall-pointer-decay, jacob/cs343-translation, jenkins-sandbox, master, new-ast, new-ast-unique-expr, new-env, no_list, persistent-indexer, pthread-emulation, qualifiedEnum, resolv-new, with_gc
Children:
9d6f011
Parents:
fb2ce27
Message:

Update evaluation section of paper

Location:
doc/papers/general
Files:
5 edited

Legend:

Unmodified
Added
Removed
  • doc/papers/general/Paper.tex

    rfb2ce27 r28bc8c8  
    25472547In fact, \CFA's features for generic programming can enable faster runtime execution than idiomatic @void *@-based C code.
    25482548This claim is demonstrated through a set of generic-code-based micro-benchmarks in C, \CFA, and \CC (see stack implementations in Appendix~\ref{sec:BenchmarkStackImplementation}).
    2549 Since all these languages share a subset essentially comprising standard C, maximal-performance benchmarks would show little runtime variance, other than in length and clarity of source code.
     2549Since all these languages share a subset essentially comprising standard C, maximal-performance benchmarks would show little runtime variance, differing only in length and clarity of source code.
    25502550A more illustrative benchmark measures the costs of idiomatic usage of each language's features.
    2551 Figure~\ref{fig:BenchmarkTest} shows the \CFA benchmark tests for a generic stack based on a singly linked-list, a generic pair-data-structure, and a variadic @print@ function similar to that in Section~\ref{sec:variadic-tuples}.
     2551Figure~\ref{fig:BenchmarkTest} shows the \CFA benchmark tests for a generic stack based on a singly linked-list.
    25522552The benchmark test is similar for C and \CC.
    2553 The experiment uses element types @int@ and @pair(_Bool, char)@, and pushes $N=40M$ elements on a generic stack, copies the stack, clears one of the stacks, finds the maximum value in the other stack, and prints $N/2$ (to reduce graph height) constants.
     2553The experiment uses element types @int@ and @pair(short, char)@, and pushes $N=40M$ elements on a generic stack, copies the stack, clears one of the stacks, and finds the maximum value in the other stack.
    25542554
    25552555\begin{figure}
    25562556\begin{cfa}[xleftmargin=3\parindentlnth,aboveskip=0pt,belowskip=0pt]
    2557 int main( int argc, char * argv[] ) {
     2557int main() {
    25582558        int max = 0, val = 42;
    25592559        stack( int ) si, ti;
    25602560
    25612561        REPEAT_TIMED( "push_int", N, push( si, val ); )
    2562         TIMED( "copy_int", ti = si; )
     2562        TIMED( "copy_int", ti{ si }; )
    25632563        TIMED( "clear_int", clear( si ); )
    25642564        REPEAT_TIMED( "pop_int", N,
    25652565                int x = pop( ti ); if ( x > max ) max = x; )
    25662566
    2567         pair( _Bool, char ) max = { (_Bool)0, '\0' }, val = { (_Bool)1, 'a' };
    2568         stack( pair( _Bool, char ) ) sp, tp;
     2567        pair( short, char ) max = { 0h, '\0' }, val = { 42h, 'a' };
     2568        stack( pair( short, char ) ) sp, tp;
    25692569
    25702570        REPEAT_TIMED( "push_pair", N, push( sp, val ); )
    2571         TIMED( "copy_pair", tp = sp; )
     2571        TIMED( "copy_pair", tp{ sp }; )
    25722572        TIMED( "clear_pair", clear( sp ); )
    25732573        REPEAT_TIMED( "pop_pair", N,
    2574                 pair(_Bool, char) x = pop( tp ); if ( x > max ) max = x; )
     2574                pair(short, char) x = pop( tp ); if ( x > max ) max = x; )
    25752575}
    25762576\end{cfa}
     
    25832583hence runtime checks are necessary to safely down-cast objects.
    25842584The most notable difference among the implementations is in memory layout of generic types: \CFA and \CC inline the stack and pair elements into corresponding list and pair nodes, while C and \CCV lack such a capability and instead must store generic objects via pointers to separately-allocated objects.
    2585 For the print benchmark, idiomatic printing is used: the C and \CFA variants used @stdio.h@, while the \CC and \CCV variants used @iostream@; preliminary tests show this distinction has negligible runtime impact.
    2586 Note, the C benchmark uses unchecked casts as there is no runtime mechanism to perform such checks, while \CFA and \CC provide type-safety statically.
     2585Note that the C benchmark uses unchecked casts as there is no runtime mechanism to perform such checks, while \CFA and \CC provide type-safety statically.
    25872586
    25882587Figure~\ref{fig:eval} and Table~\ref{tab:eval} show the results of running the benchmark in Figure~\ref{fig:BenchmarkTest} and its C, \CC, and \CCV equivalents.
    25892588The graph plots the median of 5 consecutive runs of each program, with an initial warm-up run omitted.
    2590 All code is compiled at \texttt{-O2} by gcc or g++ 6.2.0, with all \CC code compiled as \CCfourteen.
     2589All code is compiled at \texttt{-O2} by gcc or g++ 6.3.0, with all \CC code compiled as \CCfourteen.
    25912590The benchmarks are run on an Ubuntu 16.04 workstation with 16 GB of RAM and a 6-core AMD FX-6300 CPU with 3.5 GHz maximum clock frequency.
    25922591
     
    26062605                                                                        & \CT{C}        & \CT{\CFA}     & \CT{\CC}      & \CT{\CCV}             \\ \hline
    26072606maximum memory usage (MB)                       & 10001         & 2502          & 2503          & 11253                 \\
    2608 source code size (lines)                        & 247           & 222           & 165           & 339                   \\
    2609 redundant type annotations (lines)      & 39            & 2                     & 2                     & 15                    \\
    2610 binary size (KB)                                        & 14            & 229           & 18            & 38                    \\
     2607source code size (lines)                        & 187           & 188           & 133           & 303                   \\
     2608redundant type annotations (lines)      & 25            & 0                     & 2                     & 16                    \\
     2609binary size (KB)                                        & 14            & 257           & 14            & 37                    \\
    26112610\end{tabular}
    26122611\end{table}
    26132612
    26142613The C and \CCV variants are generally the slowest with the largest memory footprint, because of their less-efficient memory layout and the pointer-indirection necessary to implement generic types;
    2615 this inefficiency is exacerbated by the second level of generic types in the pair-based benchmarks.
    2616 By contrast, the \CFA and \CC variants run in roughly equivalent time for both the integer and pair of @_Bool@ and @char@ because the storage layout is equivalent, with the inlined libraries (\ie no separate compilation) and greater maturity of the \CC compiler contributing to its lead.
     2614this inefficiency is exacerbated by the second level of generic types in the pair benchmarks.
     2615By contrast, the \CFA and \CC variants run in roughly equivalent time for both the integer and pair of @short@ and @char@ because the storage layout is equivalent, with the inlined libraries (\ie no separate compilation) and greater maturity of the \CC compiler contributing to its lead.
    26172616\CCV is slower than C largely due to the cost of runtime type-checking of down-casts (implemented with @dynamic_cast@);
    2618 There are two outliers in the graph for \CFA: all prints and pop of @pair@.
    2619 Both of these cases result from the complexity of the C-generated polymorphic code, so that the gcc compiler is unable to optimize some dead code and condense nested calls.
    2620 A compiler designed for \CFA could easily perform these optimizations.
     2617The outlier in the graph for \CFA, pop @pair@, results from the complexity of the generated-C polymorphic code.
     2618The gcc compiler is unable to optimize some dead code and condense nested calls; a compiler designed for \CFA could easily perform these optimizations.
    26212619Finally, the binary size for \CFA is larger because of static linking with the \CFA libraries.
    26222620
    2623 \CFA is also competitive in terms of source code size, measured as a proxy for programmer effort. The line counts in Table~\ref{tab:eval} include implementations of @pair@ and @stack@ types for all four languages for purposes of direct comparison, though it should be noted that \CFA and \CC have pre-written data structures in their standard libraries that programmers would generally use instead. Use of these standard library types has minimal impact on the performance benchmarks, but shrinks the \CFA and \CC benchmarks to 73 and 54 lines, respectively.
     2621\CFA is also competitive in terms of source code size, measured as a proxy for programmer effort. The line counts in Table~\ref{tab:eval} include implementations of @pair@ and @stack@ types for all four languages for purposes of direct comparison, though it should be noted that \CFA and \CC have pre-written data structures in their standard libraries that programmers would generally use instead. Use of these standard library types has minimal impact on the performance benchmarks, but shrinks the \CFA and \CC benchmarks to 41 and 42 lines, respectively.
    26242622On the other hand, C does not have a generic collections-library in its standard distribution, resulting in frequent reimplementation of such collection types by C programmers.
    2625 \CCV does not use the \CC standard template library by construction, and in fact includes the definition of @object@ and wrapper classes for @bool@, @char@, @int@, and @const char *@ in its line count, which inflates this count somewhat, as an actual object-oriented language would include these in the standard library;
     2623\CCV does not use the \CC standard template library by construction, and in fact includes the definition of @object@ and wrapper classes for @char@, @short@, and @int@ in its line count, which inflates this count somewhat, as an actual object-oriented language would include these in the standard library;
    26262624with their omission, the \CCV line count is similar to C.
    26272625We justify the given line count by noting that many object-oriented languages do not allow implementing new interfaces on library types without subclassing or wrapper types, which may be similarly verbose.
     
    26292627Raw line-count, however, is a fairly rough measure of code complexity;
    26302628another important factor is how much type information the programmer must manually specify, especially where that information is not checked by the compiler.
    2631 Such unchecked type information produces a heavier documentation burden and increased potential for runtime bugs, and is much less common in \CFA than C, with its manually specified function pointers arguments and format codes, or \CCV, with its extensive use of un-type-checked downcasts (\eg @object@ to @integer@ when popping a stack, or @object@ to @printable@ when printing the elements of a @pair@).
     2629Such unchecked type information produces a heavier documentation burden and increased potential for runtime bugs, and is much less common in \CFA than C, with its manually specified function pointer arguments and format codes, or \CCV, with its extensive use of un-type-checked downcasts (\eg @object@ to @integer@ when popping a stack, or @object@ to @printable@ when printing the elements of a @pair@).
    26322630To quantify this, the ``redundant type annotations'' line in Table~\ref{tab:eval} counts the number of lines on which the type of a known variable is re-specified, either as a format specifier, explicit downcast, type-specific function, or by name in a @sizeof@, struct literal, or @new@ expression.
    26332631The \CC benchmark uses two redundant type annotations to create a new stack nodes, while the C and \CCV benchmarks have several such annotations spread throughout their code.
    2634 The two instances in which the \CFA benchmark still uses redundant type specifiers are to cast the result of a polymorphic @malloc@ call (the @sizeof@ argument is inferred by the compiler).
    2635 These uses are similar to the @new@ expressions in \CC, though the \CFA compiler's type resolver should shortly render even these type casts superfluous.
    2636 
     2632The \CFA benchmark was able to eliminate all redundant type annotations through use of the polymorphic @alloc@ function discussed in Section~\ref{sec:libraries}.
    26372633
    26382634\section{Related Work}
    2639 
    26402635
    26412636\subsection{Polymorphism}
     
    27652760\CFA
    27662761\begin{cfa}[xleftmargin=2\parindentlnth,aboveskip=0pt,belowskip=0pt]
    2767 forall(otype T) struct stack_node;
    2768 forall(otype T) struct stack {
    2769         stack_node(T) * head;
    2770 };
    27712762forall(otype T) struct stack_node {
    27722763        T value;
    27732764        stack_node(T) * next;
    27742765};
     2766forall(otype T) struct stack { stack_node(T) * head; };
    27752767forall(otype T) void ?{}( stack(T) & s ) { (s.head){ 0 }; }
    27762768forall(otype T) void ?{}( stack(T) & s, stack(T) t ) {
    27772769        stack_node(T) ** crnt = &s.head;
    27782770        for ( stack_node(T) * next = t.head; next; next = next->next ) {
    2779                 stack_node(T) * new_node = ((stack_node(T)*)malloc());
    2780                 (*new_node){ next->value }; /***/
    2781                 *crnt = new_node;
    2782                 stack_node(T) * acrnt = *crnt;
    2783                 crnt = &acrnt->next;
     2771                *crnt = alloc();
     2772                ((*crnt)->value){ next->value };
     2773                crnt = &(*crnt)->next;
    27842774        }
    27852775        *crnt = 0;
     
    27942784forall(otype T) _Bool empty( const stack(T) & s ) { return s.head == 0; }
    27952785forall(otype T) void push( stack(T) & s, T value ) {
    2796         stack_node(T) * new_node = ((stack_node(T)*)malloc());
    2797         (*new_node){ value, s.head }; /***/
    2798         s.head = new_node;
     2786        stack_node(T) * n = alloc();
     2787        (*n){ value, head };
     2788        head = n;
    27992789}
    28002790forall(otype T) T pop( stack(T) & s ) {
    2801         stack_node(T) * n = s.head;
    2802         s.head = n->next;
    2803         T v = n->value;
    2804         delete( n );
    2805         return v;
     2791        stack_node(T) * n = head;
     2792        head = n->next;
     2793        T x = n->value;
     2794        ^(*n){};
     2795        free( n );
     2796        return x;
    28062797}
    28072798forall(otype T) void clear( stack(T) & s ) {
    2808         for ( stack_node(T) * next = s.head; next; ) {
     2799        for ( stack_node(T) * next = head; next; ) {
    28092800                stack_node(T) * crnt = next;
    28102801                next = crnt->next;
    2811                 delete( crnt );
     2802                ^(*crnt){};
     2803                free(crnt);
    28122804        }
    2813         s.head = 0;
     2805        head = 0;
    28142806}
    28152807\end{cfa}
     
    28182810\CC
    28192811\begin{cfa}[xleftmargin=2\parindentlnth,aboveskip=0pt,belowskip=0pt]
    2820 template<typename T> class stack {
     2812template<typename T> struct stack {
    28212813        struct node {
    28222814                T value;
     
    28252817        };
    28262818        node * head;
    2827         void copy(const stack<T>& o) {
     2819        void copy(const stack<T> & o) {
    28282820                node ** crnt = &head;
    28292821                for ( node * next = o.head;; next; next = next->next ) {
     
    28332825                *crnt = nullptr;
    28342826        }
    2835   public:
    28362827        stack() : head(nullptr) {}
    2837         stack(const stack<T>& o) { copy(o); }
     2828        stack(const stack<T> & o) { copy(o); }
    28382829        stack(stack<T> && o) : head(o.head) { o.head = nullptr; }
    28392830        ~stack() { clear(); }
    2840         stack & operator= (const stack<T>& o) {
     2831        stack & operator= (const stack<T> & o) {
    28412832                if ( this == &o ) return *this;
    28422833                clear();
     
    28772868        struct stack_node * next;
    28782869};
     2870struct stack { struct stack_node* head; };
    28792871struct stack new_stack() { return (struct stack){ NULL }; /***/ }
    28802872void copy_stack(struct stack * s, const struct stack * t, void * (*copy)(const void *)) {
     
    28822874        for ( struct stack_node * next = t->head; next; next = next->next ) {
    28832875                *crnt = malloc(sizeof(struct stack_node)); /***/
    2884                 **crnt = (struct stack_node){ copy(next->value) }; /***/
     2876                (*crnt)->value = copy(next->value);
    28852877                crnt = &(*crnt)->next;
    28862878        }
    2887         *crnt = 0;
     2879        *crnt = NULL;
    28882880}
    28892881_Bool stack_empty(const struct stack * s) { return s->head == NULL; }
     
    29142906\CCV
    29152907\begin{cfa}[xleftmargin=2\parindentlnth,aboveskip=0pt,belowskip=0pt]
    2916 stack::node::node( const object & v, node * n ) : value( v.new_copy() ), next( n ) {}
    2917 void stack::copy(const stack & o) {
    2918         node ** crnt = &head;
    2919         for ( node * next = o.head; next; next = next->next ) {
    2920                 *crnt = new node{ *next->value };
    2921                 crnt = &(*crnt)->next;
     2908struct stack {
     2909        struct node {
     2910                ptr<object> value;
     2911                node* next;
     2912                node( const object & v, node * n ) : value( v.new_copy() ), next( n ) {}
     2913        };
     2914        node* head;
     2915        void copy(const stack & o) {
     2916                node ** crnt = &head;
     2917                for ( node * next = o.head; next; next = next->next ) {
     2918                        *crnt = new node{ *next->value }; /***/
     2919                        crnt = &(*crnt)->next;
     2920                }
     2921                *crnt = nullptr;
    29222922        }
    2923         *crnt = nullptr;
    2924 }
    2925 stack::stack() : head(nullptr) {}
    2926 stack::stack(const stack & o) { copy(o); }
    2927 stack::stack(stack && o) : head(o.head) { o.head = nullptr; }
    2928 stack::~stack() { clear(); }
    2929 stack & stack::operator= (const stack & o) {
    2930         if ( this == &o ) return *this;
    2931         clear();
    2932         copy(o);
    2933         return *this;
    2934 }
    2935 stack & stack::operator= (stack && o) {
    2936         if ( this == &o ) return *this;
    2937         head = o.head;
    2938         o.head = nullptr;
    2939         return *this;
    2940 }
    2941 bool stack::empty() const { return head == nullptr; }
    2942 void stack::push(const object & value) { head = new node{ value, head }; /***/ }
    2943 ptr<object> stack::pop() {
    2944         node * n = head;
    2945         head = n->next;
    2946         ptr<object> x = std::move(n->value);
    2947         delete n;
    2948         return x;
    2949 }
    2950 void stack::clear() {
    2951         for ( node * next = head; next; ) {
    2952                 node * crnt = next;
    2953                 next = crnt->next;
    2954                 delete crnt;
     2923        stack() : head(nullptr) {}
     2924        stack(const stack & o) { copy(o); }
     2925        stack(stack && o) : head(o.head) { o.head = nullptr; }
     2926        ~stack() { clear(); }
     2927        stack & operator= (const stack & o) {
     2928                if ( this == &o ) return *this;
     2929                clear();
     2930                copy(o);
     2931                return *this;
    29552932        }
    2956         head = nullptr;
    2957 }
     2933        stack & operator= (stack && o) {
     2934                if ( this == &o ) return *this;
     2935                head = o.head;
     2936                o.head = nullptr;
     2937                return *this;
     2938        }
     2939        bool empty() const { return head == nullptr; }
     2940        void push(const object & value) { head = new node{ value, head }; /***/ }
     2941        ptr<object> pop() {
     2942                node * n = head;
     2943                head = n->next;
     2944                ptr<object> x = std::move(n->value);
     2945                delete n;
     2946                return x;
     2947        }
     2948        void clear() {
     2949                for ( node * next = head; next; ) {
     2950                        node * crnt = next;
     2951                        next = crnt->next;
     2952                        delete crnt;
     2953                }
     2954                head = nullptr;
     2955        }
     2956};
    29582957\end{cfa}
    29592958
  • doc/papers/general/evaluation/cfa-bench.c

    rfb2ce27 r28bc8c8  
    33#include "cfa-pair.h"
    44
    5 int main( int argc, char * argv[] ) {
     5int main() {
    66        int max = 0, val = 42;
    77        stack( int ) si, ti;
  • doc/papers/general/evaluation/cfa-stack.c

    rfb2ce27 r28bc8c8  
    1212        stack_node(T) ** crnt = &s.head;
    1313        for ( stack_node(T) * next = t.head; next; next = next->next ) {
    14                 *crnt = malloc();
     14                *crnt = alloc();
    1515                ((*crnt)->value){ next->value };
    1616                crnt = &(*crnt)->next;
     
    3131
    3232forall(otype T) void push( stack(T) & s, T value ) with( s ) {
    33         stack_node(T)* n = malloc();
     33        stack_node(T)* n = alloc();
    3434        (*n){ value, head };
    3535        head = n;
  • doc/papers/general/evaluation/timing.dat

    rfb2ce27 r28bc8c8  
    11"400 million repetitions"       "C"     "\\CFA{}"       "\\CC{}"        "\\CC{obj}"
    2 "push\nint"     2976    2225    1522    3266
    3 "copy\nnt"      2932    7072    1526    3110
    4 "clear\nint"    1380    731     750     1488
    5 "pop\nint"      1444    1196    756     5156
    6 "push\npair"    3695    2257    953     6840
    7 "copy\npair"    6034    6650    994     7224
    8 "clear\npair"   2832    848     742     3297
    9 "pop\npair"     3009    5348    797     25235
    10 
     2"push\nint"     3002    2459    1520    3305
     3"copy\nint"     2985    2057    1521    3152
     4"clear\nint"    1374    827     718     1469
     5"pop\nint"      1416    1221    717     5467
     6"push\npair"    4214    2752    946     6826
     7"copy\npair"    6127    2105    993     7330
     8"clear\npair"   2881    885     711     3564
     9"pop\npair"     3046    5434    783     26538
Note: See TracChangeset for help on using the changeset viewer.