- Timestamp: Aug 13, 2022, 4:54:32 PM (2 years ago)
- Branches: ADT, ast-experimental, master, pthread-emulation
- Children: 2ae6a99
- Parents: 111d993
- Files: 1 edited
doc/theses/thierry_delisle_PhD/thesis/text/eval_macro.tex
\begin{figure}
	\centering
	\subfloat[][Throughput]{
		\input{result.memcd.forall.qps.pstex_t}
	}

	\subfloat[][Latency]{
		\input{result.memcd.forall.lat.pstex_t}
	}
	\caption[forall Latency results at different update rates]{forall Latency results at different update rates\smallskip\newline Description}
	\label{fig:memcd:updt:forall}
\end{figure}

\begin{figure}
	\centering
	\subfloat[][Throughput]{
		\input{result.memcd.fibre.qps.pstex_t}
	}

	\subfloat[][Latency]{
		\input{result.memcd.fibre.lat.pstex_t}
	}
	\caption[fibre Latency results at different update rates]{fibre Latency results at different update rates\smallskip\newline Description}
	\label{fig:memcd:updt:fibre}
\end{figure}

\begin{figure}
	\centering
	\subfloat[][Throughput]{
		\input{result.memcd.vanilla.qps.pstex_t}
	}

	\subfloat[][Latency]{
		\input{result.memcd.vanilla.lat.pstex_t}
	}
	\caption[vanilla Latency results at different update rates]{vanilla Latency results at different update rates\smallskip\newline Description}
	\label{fig:memcd:updt:vanilla}
\end{figure}

…

There are two aspects of the \io subsystem the memcached experiment does not exercise: accepting new connections and interacting with disks.
On the other hand, static webservers, servers that offer static webpages, do stress disk \io since they serve files from disk\footnote{Dynamic webservers, which construct pages as they are sent, are not as interesting since the construction of the pages does not exercise the runtime in a meaningfully different way.}.
The static webserver experiments compare NGINX~\cit{nginx} with a custom webserver developed for this experiment.

\subsection{\CFA webserver}

…

Normally, webservers use @sendfile@\cite{MAN:sendfile} to send files over the socket.
@io_uring@ does not support @sendfile@; it supports @splice@\cite{MAN:splice} instead, which is strictly more powerful.
However, because of how Linux implements file \io (see Subsection~\ref{ononblock}), @io_uring@'s implementation must delegate calls to @splice@ to worker threads inside the kernel.
As of Linux 5.13, @io_uring@ caps the number of these worker threads at @RLIMIT_NPROC@; therefore, when tens of thousands of @splice@ requests are made, it can create tens of thousands of \glspl{kthrd}.

…

\subsection{Throughput}
To measure the throughput of both webservers, each server is loaded with over 30,000 files totalling over 4.5 gigabytes.
Each client runs httperf~\cit{httperf}, which establishes a connection, makes an HTTP request for one or more files, closes the connection, and repeats the process.
The connections and requests are made according to a Zipfian distribution~\cite{zipf}.
Throughput is measured by aggregating the httperf results from all the clients.
\begin{figure}
	\subfloat[][Throughput]{
…
	\label{fig:swbsrv}
\end{figure}

Figure~\ref{fig:swbsrv} shows the results comparing \CFA to NGINX in terms of throughput.
These results are fairly straightforward.
Both servers achieve the same throughput until around 57,500 requests per second.
Since the clients are asking for the same files, the fact that the throughput matches exactly is expected as long as both servers are able to serve the desired rate.
Once the saturation point is reached, both servers remain very close, with NGINX achieving slightly better throughput.
However, Figure~\ref{fig:swbsrv:err} shows the rate of errors, a gross approximation of tail latency, where \CFA achieves notably fewer errors once the machine reaches saturation.
This suggests that \CFA is slightly fairer, and that NGINX may sacrifice some fairness for improved throughput.
It demonstrates that the \CFA webserver described above is able to match the performance of NGINX up to and beyond the saturation point of the machine.

\subsection{Disk Operations}
The throughput experiments used a server with 25 GB of memory, which was sufficient to hold the entire fileset in addition to all the code and data needed to run the webserver and the rest of the machine.
Previous work like \cit{Cite Ashif's stuff} demonstrates that an interesting follow-up experiment is to rerun the same throughput experiment while allowing significantly less memory on the machine.
If the machine is constrained enough, the OS is forced to evict files from the file cache, causing calls to @sendfile@ to read from disk.
However, what these low-memory experiments primarily demonstrate is how the memory footprint of the webserver affects performance.
Since what I aim to evaluate in this thesis is the runtime of \CFA, I decided to forgo experiments on low-memory servers.
The implementation of the webserver itself is simply too impactful to be an interesting evaluation of the underlying runtime.
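As context for the @sendfile@/@splice@ distinction discussed earlier, the following plain-C sketch moves file data through a pipe with @splice@, with no userspace buffer, which is roughly the work @io_uring@'s kernel worker threads perform on the server's behalf. This is a hypothetical standalone harness, not the webserver's code; in the webserver the output fd would be a socket rather than a file:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Copy in_fd to out_fd through a pipe using splice(2).  Data never
 * enters userspace, mirroring what sendfile(2) does internally.
 * Returns total bytes moved, or -1 on error. */
static ssize_t splice_copy(int in_fd, int out_fd) {
	int p[2];
	if (pipe(p) < 0) return -1;
	ssize_t total = 0;
	for (;;) {
		/* file -> pipe: up to 64 KB per iteration */
		ssize_t n = splice(in_fd, NULL, p[1], NULL, 65536, SPLICE_F_MOVE);
		if (n <= 0) { if (n < 0) total = -1; break; }   /* 0 == EOF */
		ssize_t left = n;
		while (left > 0) {                 /* drain pipe -> out_fd */
			ssize_t m = splice(p[0], NULL, out_fd, NULL, left, SPLICE_F_MOVE);
			if (m <= 0) { total = -1; goto done; }
			left -= m;
			total += m;
		}
	}
done:
	close(p[0]); close(p[1]);
	return total;
}
```

Note that each @splice@ call on a regular file can block on disk \io, which is why @io_uring@ hands these calls to kernel worker threads rather than executing them inline on the submission path.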