Index: doc/papers/concurrency/response3
===================================================================
--- doc/papers/concurrency/response3	(revision e0116c4ebd1fd32d563981e524ecfe5a8e92dbfb)
+++ doc/papers/concurrency/response3	(revision e0116c4ebd1fd32d563981e524ecfe5a8e92dbfb)
@@ -0,0 +1,64 @@
+    I would like you address the comments of Reviewer 2, particularly with
+    regard to the description of the adaptation Java harness to deal with
+    warmup. I would expect to see a convincing argument that the computation
+    has reached a steady state.
+
+We understand Referee 2's and your concern about the JIT experiments, which is
+why we verified our experiments with two experts in JIT development for both
+Java and Node.js before submitting the paper. We also read the supplied papers,
+but most of the information is not applicable to our work for the following
+reasons.
+
+1. SPEC benchmarks are medium to large. In contrast, our benchmarks are 5-15
+   lines in length for each programming language (see code for the Cforall
+   tests in the paper). Hence, there are no significant computations, complex
+   control flow, or use of memory.  Each tests one specific language feature
+   (context switch, mutex call, etc.) in isolation over and over again. These
+   language features are fixed (e.g., acquiring and releasing a lock is a fixed
+   cost). Therefore, unless the feature can be removed, there is nothing to
+   optimize at runtime. But these features cannot be removed without changing
+   the meaning of the benchmark. If the feature is removed, the timing result
+   would be 0. In fact, it was difficult to prevent the JIT from completely
+   eliding some benchmarks because there are no side-effects.
+
+2. All of our benchmark results correlate across programming languages with and
+   without JIT, indicating the JIT has completed any runtime optimizations
+   (we added this sentence to Section 8.1). Any large differences are explained
+   by how a language implements a feature, not by how the compiler/JIT
+   processes that feature.  Section 8.1 discusses these points in detail.
+
+3. We also added a sentence about running all JIT-based programming language
+   experiments for 30 minutes: there was no statistical difference, and
+   med/avg/std correlated with the short-run experiments, which seems a
+   convincing argument that the benchmark has reached a steady state. If the
+   JIT takes longer than 30 minutes to achieve its optimization goals, it is
+   unlikely to be useful.
+
+4. The purpose of the performance section is not to draw conclusions about
+   improvements. It is to contrast programming-language implementation
+   approaches. Section 8.1 discusses the ramifications of certain design and
+   implementation decisions with respect to overall performance. The only
+   conclusion we draw about performance is:
+
+     Performance comparisons with other concurrent systems and languages show
+     the Cforall approach is competitive across all basic operations, which
+     translates directly into good performance in well-written applications
+     with advanced control-flow.
+
+
+   I would also like you to provide the values for N for each benchmark run.
+
+Done.
+
+
+Referee 2 suggested
+
+   * don't start sentences with "However"
+
+However, there are numerous grammar sites on the web indicating "however" (a
+conjunction) at the start of a sentence is acceptable, e.g.:
+
+ https://www.merriam-webster.com/words-at-play/can-you-start-a-sentence-with-however
+ This is a stylistic choice, more than anything else, as we have a
+ considerable body of evidence of writers using however to begin sentences,
+ frequently with the meaning of "nevertheless."
