- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
doc/papers/concurrency/Paper.tex
r016b1eb rfd2f4a9 224 224 {} 225 225 \lstnewenvironment{C++}[1][] % use C++ style 226 {\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`} ,#1}\lstset{#1}}226 {\lstset{language=C++,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}} 227 227 {} 228 228 \lstnewenvironment{uC++}[1][] 229 {\lstset{language=uC++,moredelim=**[is][\protect\color{red}]{`}{`} ,#1}\lstset{#1}}229 {\lstset{language=uC++,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}} 230 230 {} 231 231 \lstnewenvironment{Go}[1][] 232 {\lstset{language=Golang,moredelim=**[is][\protect\color{red}]{`}{`} ,#1}\lstset{#1}}232 {\lstset{language=Golang,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}} 233 233 {} 234 234 \lstnewenvironment{python}[1][] 235 {\lstset{language=python,moredelim=**[is][\protect\color{red}]{`}{`} ,#1}\lstset{#1}}235 {\lstset{language=python,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}} 236 236 {} 237 237 \lstnewenvironment{java}[1][] 238 {\lstset{language=java,moredelim=**[is][\protect\color{red}]{`}{`} ,#1}\lstset{#1}}238 {\lstset{language=java,moredelim=**[is][\protect\color{red}]{`}{`}}\lstset{#1}} 239 239 {} 240 240 … … 284 284 285 285 \begin{document} 286 \linenumbers % comment out to turn off line numbering286 %\linenumbers % comment out to turn off line numbering 287 287 288 288 \maketitle … … 450 450 \hline 451 451 stateful & thread & \multicolumn{1}{c|}{No} & \multicolumn{1}{c}{Yes} \\ 452 \hline 453 \hline 452 \hline 453 \hline 454 454 No & No & \textbf{1}\ \ \ @struct@ & \textbf{2}\ \ \ @mutex@ @struct@ \\ 455 \hline 455 \hline 456 456 Yes (stackless) & No & \textbf{3}\ \ \ @generator@ & \textbf{4}\ \ \ @mutex@ @generator@ \\ 457 \hline 457 \hline 458 458 Yes (stackful) & No & \textbf{5}\ \ \ @coroutine@ & \textbf{6}\ \ \ @mutex@ @coroutine@ \\ 459 \hline 459 \hline 460 460 No & Yes & \textbf{7}\ \ \ {\color{red}rejected} & \textbf{8}\ \ \ {\color{red}rejected} \\ 461 \hline 461 \hline 462 462 Yes (stackless) & Yes & \textbf{9}\ \ \ {\color{red}rejected} & \textbf{10}\ \ \ {\color{red}rejected} \\ 463 \hline 463 \hline 464 464 Yes (stackful) & Yes & \textbf{11}\ \ \ @thread@ & \textbf{12}\ \ @mutex@ @thread@ \\ 465 465 \end{tabular} … … 2896 2896 \label{s:RuntimeStructureCluster} 2897 2897 2898 A \newterm{cluster} is a collection of user and kernel threads, where the kernel threads run the user threads from the cluster's ready queue, and the operating system runs the kernel threads on the processors from its ready queue .2898 A \newterm{cluster} is a collection of user and kernel threads, where the kernel threads run the user threads from the cluster's ready queue, and the operating system runs the kernel threads on the processors from its ready queue~\cite{Buhr90a}. 2899 2899 The term \newterm{virtual processor} is introduced as a synonym for kernel thread to disambiguate between user and kernel thread. 2900 2900 From the language perspective, a virtual processor is an actual processor (core). … … 2992 2992 \end{cfa} 2993 2993 where CPU time in nanoseconds is from the appropriate language clock. 2994 Each benchmark is performed @N@ times, where @N@ is selected so the benchmark runs in the range of 2--20 seconds for the specific programming language. 2994 Each benchmark is performed @N@ times, where @N@ is selected so the benchmark runs in the range of 2--20 seconds for the specific programming language; 2995 each @N@ appears after the experiment name in the following tables. 2995 2996 The total time is divided by @N@ to obtain the average time for a benchmark. 2996 2997 Each benchmark experiment is run 13 times and the average appears in the table. 2998 For languages with a runtime JIT (Java, Node.js, Python), a single half-hour long experiment is run to check stability; 2999 all long-experiment results are statistically equivalent, \ie median/average/standard-deviation correlate with the short-experiment results, indicating the short experiments reached a steady state. 2997 3000 All omitted tests for other languages are functionally identical to the \CFA tests and available online~\cite{CforallConcurrentBenchmarks}. 2998 % tar --exclude-ignore=exclude -cvhf benchmark.tar benchmark2999 % cp -p benchmark.tar /u/cforall/public_html/doc/concurrent_benchmark.tar3000 3001 3001 3002 \paragraph{Creation} … … 3006 3007 3007 3008 \begin{multicols}{2} 3008 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}} 3009 \begin{cfa} 3010 @coroutine@ MyCoroutine {}; 3009 \begin{cfa}[xleftmargin=0pt] 3010 `coroutine` MyCoroutine {}; 3011 3011 void ?{}( MyCoroutine & this ) { 3012 3012 #ifdef EAGER … … 3016 3016 void main( MyCoroutine & ) {} 3017 3017 int main() { 3018 BENCH( for ( N ) { @MyCoroutine c;@} )3018 BENCH( for ( N ) { `MyCoroutine c;` } ) 3019 3019 sout | result; 3020 3020 } … … 3030 3030 3031 3031 \begin{tabular}[t]{@{}r*{3}{D{.}{.}{5.2}}@{}} 3032 \multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3033 \CFA generator & 0.6 & 0.6 & 0.0 \\ 3034 \CFA coroutine lazy & 13.4 & 13.1 & 0.5 \\ 3035 \CFA coroutine eager & 144.7 & 143.9 & 1.5 \\ 3036 \CFA thread & 466.4 & 468.0 & 11.3 \\ 3037 \uC coroutine & 155.6 & 155.7 & 1.7 \\ 3038 \uC thread & 523.4 & 523.9 & 7.7 \\ 3039 Python generator & 123.2 & 124.3 & 4.1 \\ 3040 Node.js generator & 33.4 & 33.5 & 0.3 \\ 3041 Goroutine thread & 751.0 & 750.5 & 3.1 \\ 3042 Rust tokio thread & 1860.0 & 1881.1 & 37.6 \\ 3043 Rust thread & 53801.0 & 53896.8 & 274.9 \\ 3044 Java thread & 120274.0 & 120722.9 & 2356.7 \\ 3045 Pthreads thread & 31465.5 & 31419.5 & 140.4 3032 \multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3033 \CFA generator (1B) & 0.6 & 0.6 & 0.0 \\ 3034 \CFA coroutine lazy (100M) & 13.4 & 13.1 & 0.5 \\ 3035 \CFA coroutine eager (10M) & 144.7 & 143.9 & 1.5 \\ 3036 \CFA thread (10M) & 466.4 & 468.0 & 11.3 \\ 3037 \uC coroutine (10M) & 155.6 & 155.7 & 1.7 \\ 3038 \uC thread (10M) & 523.4 & 523.9 & 7.7 \\ 3039 Python generator (10M) & 123.2 & 124.3 & 4.1 \\ 3040 Node.js generator (10M) & 33.4 & 33.5 & 0.3 \\ 3041 Goroutine thread (10M) & 751.0 & 750.5 & 3.1 \\ 3042 Rust tokio thread (10M) & 1860.0 & 1881.1 & 37.6 \\ 3043 Rust thread (250K) & 53801.0 & 53896.8 & 274.9 \\ 3044 Java thread (250K) & 119256.0 & 119679.2 & 2244.0 \\ 3045 % Java thread (1 000 000) & 123100.0 & 123052.5 & 751.6 \\ 3046 Pthreads thread (250K) & 31465.5 & 31419.5 & 140.4 3046 3047 \end{tabular} 3047 3048 \end{multicols} … … 3052 3053 Internal scheduling is measured using a cycle of two threads signalling and waiting. 3053 3054 Figure~\ref{f:schedint} shows the code for \CFA, with results in Table~\ref{t:schedint}. 3054 Note, the incremental cost of bulk acquire for \CFA, which is largely a fixed cost for small numbers of mutex objects. 3055 Java scheduling is significantly greater because the benchmark explicitly creates multiple threads in order to prevent the JIT from making the program sequential, \ie removing all locking. 3055 Note, the \CFA incremental cost for bulk acquire is a fixed cost for small numbers of mutex objects. 3056 User-level threading has one kernel thread, eliminating contention between the threads (direct handoff of the kernel thread). 3057 Kernel-level threading has two kernel threads allowing some contention. 3056 3058 3057 3059 \begin{multicols}{2} 3058 \ lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}3059 \begin{cfa} 3060 \setlength{\tabcolsep}{3pt} 3061 \begin{cfa}[xleftmargin=0pt] 3060 3062 volatile int go = 0; 3061 @condition c;@ 3062 @monitor@M {} m1/*, m2, m3, m4*/;3063 void call( M & @mutex p1/*, p2, p3, p4*/@) {3064 @signal( c );@3065 } 3066 void wait( M & @mutex p1/*, p2, p3, p4*/@) {3063 `condition c;` 3064 `monitor` M {} m1/*, m2, m3, m4*/; 3065 void call( M & `mutex p1/*, p2, p3, p4*/` ) { 3066 `signal( c );` 3067 } 3068 void wait( M & `mutex p1/*, p2, p3, p4*/` ) { 3067 3069 go = 1; // continue other thread 3068 for ( N ) { @wait( c );@} );3070 for ( N ) { `wait( c );` } ); 3069 3071 } 3070 3072 thread T {}; … … 3091 3093 3092 3094 \begin{tabular}{@{}r*{3}{D{.}{.}{5.2}}@{}} 3093 \multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3094 \CFA @signal@, 1 monitor & 364.4 & 364.2 & 4.4 \\ 3095 \CFA @signal@, 2 monitor & 484.4 & 483.9 & 8.8 \\ 3096 \CFA @signal@, 4 monitor & 709.1 & 707.7 & 15.0 \\ 3097 \uC @signal@ monitor & 328.3 & 327.4 & 2.4 \\ 3098 Rust cond. variable & 7514.0 & 7437.4 & 397.2 \\ 3099 Java @notify@ monitor & 9623.0 & 9654.6 & 236.2 \\ 3100 Pthreads cond. variable & 5553.7 & 5576.1 & 345.6 3095 \multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3096 \CFA @signal@, 1 monitor (10M) & 364.4 & 364.2 & 4.4 \\ 3097 \CFA @signal@, 2 monitor (10M) & 484.4 & 483.9 & 8.8 \\ 3098 \CFA @signal@, 4 monitor (10M) & 709.1 & 707.7 & 15.0 \\ 3099 \uC @signal@ monitor (10M) & 328.3 & 327.4 & 2.4 \\ 3100 Rust cond. variable (1M) & 7514.0 & 7437.4 & 397.2 \\ 3101 Java @notify@ monitor (1M) & 8717.0 & 8774.1 & 471.8 \\ 3102 % Java @notify@ monitor (100 000 000) & 8634.0 & 8683.5 & 330.5 \\ 3103 Pthreads cond. variable (1M) & 5553.7 & 5576.1 & 345.6 3101 3104 \end{tabular} 3102 3105 \end{multicols} … … 3107 3110 External scheduling is measured using a cycle of two threads calling and accepting the call using the @waitfor@ statement. 3108 3111 Figure~\ref{f:schedext} shows the code for \CFA with results in Table~\ref{t:schedext}. 3109 Note, the incremental cost of bulk acquire for \CFA, which is largelya fixed cost for small numbers of mutex objects.3112 Note, the \CFA incremental cost for bulk acquire is a fixed cost for small numbers of mutex objects. 3110 3113 3111 3114 \begin{multicols}{2} 3112 \ lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}3115 \setlength{\tabcolsep}{5pt} 3113 3116 \vspace*{-16pt} 3114 \begin{cfa} 3115 @monitor@M {} m1/*, m2, m3, m4*/;3116 void call( M & @mutex p1/*, p2, p3, p4*/@) {}3117 void wait( M & @mutex p1/*, p2, p3, p4*/@) {3118 for ( N ) { @waitfor( call : p1/*, p2, p3, p4*/ );@}3117 \begin{cfa}[xleftmargin=0pt] 3118 `monitor` M {} m1/*, m2, m3, m4*/; 3119 void call( M & `mutex p1/*, p2, p3, p4*/` ) {} 3120 void wait( M & `mutex p1/*, p2, p3, p4*/` ) { 3121 for ( N ) { `waitfor( call : p1/*, p2, p3, p4*/ );` } 3119 3122 } 3120 3123 thread T {}; … … 3133 3136 \columnbreak 3134 3137 3135 \vspace*{-1 6pt}3138 \vspace*{-18pt} 3136 3139 \captionof{table}{External-scheduling comparison (nanoseconds)} 3137 3140 \label{t:schedext} 3138 3141 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}} 3139 \multicolumn{1}{@{} c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\3140 \CFA @waitfor@, 1 monitor & 367.1 & 365.3 & 5.0 \\3141 \CFA @waitfor@, 2 monitor & 463.0 & 464.6 & 7.1 \\3142 \CFA @waitfor@, 4 monitor & 689.6 & 696.2 & 21.5 \\3143 \uC \lstinline[language=uC++]|_Accept| monitor & 328.2 & 329.1 & 3.4 \\3144 Go \lstinline[language=Golang]|select| channel & 365.0 & 365.5 & 1.23142 \multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3143 \CFA @waitfor@, 1 monitor (10M) & 367.1 & 365.3 & 5.0 \\ 3144 \CFA @waitfor@, 2 monitor (10M) & 463.0 & 464.6 & 7.1 \\ 3145 \CFA @waitfor@, 4 monitor (10M) & 689.6 & 696.2 & 21.5 \\ 3146 \uC \lstinline[language=uC++]|_Accept| monitor (10M) & 328.2 & 329.1 & 3.4 \\ 3147 Go \lstinline[language=Golang]|select| channel (10M) & 365.0 & 365.5 & 1.2 3145 3148 \end{tabular} 3146 3149 \end{multicols} … … 3155 3158 3156 3159 \begin{multicols}{2} 3157 \ lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}}3158 \begin{cfa} 3159 @monitor@M {} m1/*, m2, m3, m4*/;3160 call( M & @mutex p1/*, p2, p3, p4*/@) {}3160 \setlength{\tabcolsep}{3pt} 3161 \begin{cfa}[xleftmargin=0pt] 3162 `monitor` M {} m1/*, m2, m3, m4*/; 3163 call( M & `mutex p1/*, p2, p3, p4*/` ) {} 3161 3164 int main() { 3162 3165 BENCH( for( N ) call( m1/*, m2, m3, m4*/ ); ) … … 3173 3176 \label{t:mutex} 3174 3177 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}} 3175 \multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3176 test-and-test-set lock & 19.1 & 18.9 & 0.4 \\ 3177 \CFA @mutex@ function, 1 arg. & 48.3 & 47.8 & 0.9 \\ 3178 \CFA @mutex@ function, 2 arg. & 86.7 & 87.6 & 1.9 \\ 3179 \CFA @mutex@ function, 4 arg. & 173.4 & 169.4 & 5.9 \\ 3180 \uC @monitor@ member rtn. & 54.8 & 54.8 & 0.1 \\ 3181 Goroutine mutex lock & 34.0 & 34.0 & 0.0 \\ 3182 Rust mutex lock & 33.0 & 33.2 & 0.8 \\ 3183 Java synchronized method & 31.0 & 31.0 & 0.0 \\ 3184 Pthreads mutex Lock & 31.0 & 31.1 & 0.4 3178 \multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3179 test-and-test-set lock (50M) & 19.1 & 18.9 & 0.4 \\ 3180 \CFA @mutex@ function, 1 arg. (50M) & 48.3 & 47.8 & 0.9 \\ 3181 \CFA @mutex@ function, 2 arg. (50M) & 86.7 & 87.6 & 1.9 \\ 3182 \CFA @mutex@ function, 4 arg. (50M) & 173.4 & 169.4 & 5.9 \\ 3183 \uC @monitor@ member rtn. (50M) & 54.8 & 54.8 & 0.1 \\ 3184 Goroutine mutex lock (50M) & 34.0 & 34.0 & 0.0 \\ 3185 Rust mutex lock (50M) & 33.0 & 33.2 & 0.8 \\ 3186 Java synchronized method (50M) & 31.0 & 30.9 & 0.5 \\ 3187 % Java synchronized method (10 000 000 000) & 31.0 & 30.2 & 0.9 \\ 3188 Pthreads mutex Lock (50M) & 31.0 & 31.1 & 0.4 3185 3189 \end{tabular} 3186 3190 \end{multicols} … … 3201 3205 % To: "Peter A. Buhr" <pabuhr@plg2.cs.uwaterloo.ca> 3202 3206 % Date: Fri, 24 Jan 2020 13:49:18 -0500 3203 % 3207 % 3204 3208 % I can also verify that the previous version, which just tied a bunch of promises together, *does not* go back to the 3205 3209 % event loop at all in the current version of Node. Presumably they're taking advantage of the fact that the ordering of … … 3211 3215 3212 3216 \begin{multicols}{2} 3213 \lstset{language=CFA,moredelim=**[is][\color{red}]{@}{@},deletedelim=**[is][]{`}{`}} 3214 \begin{cfa}[aboveskip=0pt,belowskip=0pt] 3215 @coroutine@ C {}; 3216 void main( C & ) { for () { @suspend;@ } } 3217 \begin{cfa}[xleftmargin=0pt] 3218 `coroutine` C {}; 3219 void main( C & ) { for () { `suspend;` } } 3217 3220 int main() { // coroutine test 3218 3221 C c; 3219 BENCH( for ( N ) { @resume( c );@} )3222 BENCH( for ( N ) { `resume( c );` } ) 3220 3223 sout | result; 3221 3224 } 3222 3225 int main() { // thread test 3223 BENCH( for ( N ) { @yield();@} )3226 BENCH( for ( N ) { `yield();` } ) 3224 3227 sout | result; 3225 3228 } … … 3234 3237 \label{t:ctx-switch} 3235 3238 \begin{tabular}{@{}r*{3}{D{.}{.}{3.2}}@{}} 3236 \multicolumn{1}{@{}c}{} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3237 C function & 1.8 & 1.8 & 0.0 \\ 3238 \CFA generator & 1.8 & 2.0 & 0.3 \\ 3239 \CFA coroutine & 32.5 & 32.9 & 0.8 \\ 3240 \CFA thread & 93.8 & 93.6 & 2.2 \\ 3241 \uC coroutine & 50.3 & 50.3 & 0.2 \\ 3242 \uC thread & 97.3 & 97.4 & 1.0 \\ 3243 Python generator & 40.9 & 41.3 & 1.5 \\ 3244 Node.js await & 1852.2 & 1854.7 & 16.4 \\ 3245 Node.js generator & 33.3 & 33.4 & 0.3 \\ 3246 Goroutine thread & 143.0 & 143.3 & 1.1 \\ 3247 Rust async await & 32.0 & 32.0 & 0.0 \\ 3248 Rust tokio thread & 143.0 & 143.0 & 1.7 \\ 3249 Rust thread & 332.0 & 331.4 & 2.4 \\ 3250 Java thread & 405.0 & 415.0 & 17.6 \\ 3251 Pthreads thread & 334.3 & 335.2 & 3.9 3239 \multicolumn{1}{@{}r}{N\hspace*{10pt}} & \multicolumn{1}{c}{Median} &\multicolumn{1}{c}{Average} & \multicolumn{1}{c@{}}{Std Dev} \\ 3240 C function (10B) & 1.8 & 1.8 & 0.0 \\ 3241 \CFA generator (5B) & 1.8 & 2.0 & 0.3 \\ 3242 \CFA coroutine (100M) & 32.5 & 32.9 & 0.8 \\ 3243 \CFA thread (100M) & 93.8 & 93.6 & 2.2 \\ 3244 \uC coroutine (100M) & 50.3 & 50.3 & 0.2 \\ 3245 \uC thread (100M) & 97.3 & 97.4 & 1.0 \\ 3246 Python generator (100M) & 40.9 & 41.3 & 1.5 \\ 3247 Node.js await (5M) & 1852.2 & 1854.7 & 16.4 \\ 3248 Node.js generator (100M) & 33.3 & 33.4 & 0.3 \\ 3249 Goroutine thread (100M) & 143.0 & 143.3 & 1.1 \\ 3250 Rust async await (100M) & 32.0 & 32.0 & 0.0 \\ 3251 Rust tokio thread (100M) & 143.0 & 143.0 & 1.7 \\ 3252 Rust thread (25M) & 332.0 & 331.4 & 2.4 \\ 3253 Java thread (100M) & 405.0 & 415.0 & 17.6 \\ 3254 % Java thread ( 100 000 000) & 413.0 & 414.2 & 6.2 \\ 3255 % Java thread (5 000 000 000) & 415.0 & 415.2 & 6.1 \\ 3256 Pthreads thread (25M) & 334.3 & 335.2 & 3.9 3252 3257 \end{tabular} 3253 3258 \end{multicols} … … 3258 3263 Languages using 1:1 threading based on pthreads can at best meet or exceed, due to language overhead, the pthread results. 3259 3264 Note, pthreads has a fast zero-contention mutex lock checked in user space. 3260 Languages with M:N threading have better performance than 1:1 because there is no operating-system interactions. 3265 Languages with M:N threading have better performance than 1:1 because there is no operating-system interactions (context-switching or locking). 3266 As well, for locking experiments, M:N threading has less contention if only one kernel thread is used. 3261 3267 Languages with stackful coroutines have higher cost than stackless coroutines because of stack allocation and context switching; 3262 3268 however, stackful \uC and \CFA coroutines have approximately the same performance as stackless Python and Node.js generators. 3263 3269 The \CFA stackless generator is approximately 25 times faster for suspend/resume and 200 times faster for creation than stackless Python and Node.js generators. 3270 The Node.js context-switch is costly when asynchronous await must enter the event engine because a promise is not fulfilled. 3271 Finally, the benchmark results correlate across programming languages with and without JIT, indicating the JIT has completed any runtime optimizations. 3264 3272 3265 3273 … … 3319 3327 3320 3328 The authors recognize the design assistance of Aaron Moss, Rob Schluntz, Andrew Beach, and Michael Brooks; David Dice for commenting and helping with the Java benchmarks; and Gregor Richards for helping with the Node.js benchmarks. 3321 This research is funded by a grant fromWaterloo-Huawei (\url{http://www.huawei.com}) Joint Innovation Lab. %, and Peter Buhr is partially funded by the Natural Sciences and Engineering Research Council of Canada.3329 This research is funded by the NSERC/Waterloo-Huawei (\url{http://www.huawei.com}) Joint Innovation Lab. %, and Peter Buhr is partially funded by the Natural Sciences and Engineering Research Council of Canada. 3322 3330 3323 3331 {%
Note: See TracChangeset
for help on using the changeset viewer.