They therefore offer a stringent performance benchmark for \CFA.
Indeed, existing solutions are likely to have close to optimal performance, while the homogeneity of the workloads means the additional fairness is not needed.

\section{Memcached}
     
This server also has the notable added benefit that there exists a full-featured front-end for performance testing called @mutilate@~\cite{GITHUB:mutilate}.
Experimenting on memcached allows for a simple test of the \CFA runtime as a whole: it exercises the scheduler, the idle-sleep mechanism, as well as the \io subsystem for sockets.
This experiment does not exercise the \io subsystem with regard to disk operations.
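For reference, each request in these experiments is a short exchange over memcached's ASCII protocol.
The following sketch shows how a client could format the two request types used, @get@ and @set@; the helpers are purely illustrative and not part of memcached or @mutilate@.
\begin{lstlisting}
// Sketch of client-side formatting for memcached's ASCII protocol.
#include <stdio.h>
#include <string.h>

// "get <key>\r\n"; the server answers "VALUE <key> <flags> <bytes>\r\n<data>\r\nEND\r\n".
int format_get( char * buf, size_t size, const char * key ) {
	return snprintf( buf, size, "get %s\r\n", key );
}

// "set <key> <flags> <exptime> <bytes>\r\n<data>\r\n"; the server answers "STORED\r\n".
int format_set( char * buf, size_t size, const char * key, const char * data ) {
	return snprintf( buf, size, "set %s 0 0 %zu\r\n%s\r\n", key, strlen( data ), data );
}
\end{lstlisting}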

\subsection{Benchmark Environment}
     
This model adds flexibility to the implementation, as the serving logic can now block on user-level primitives without affecting other connections.
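As a minimal sketch of this model, the per-connection logic in \CFA could look as follows, assuming a hypothetical @handle_request@ holding the actual serving logic:
\begin{lstlisting}
#include <unistd.h>    // read, close
#include <thread.hfa>  // CFA user-level threads

thread Connection { int fd; };                 // one user-level thread per connection
void ?{}( Connection & this, int fd ) { this.fd = fd; }

void main( Connection & this ) {               // thread body, started on allocation
	char buf[8192];
	for () {                                   // CFA infinite loop
		ssize_t len = read( this.fd, buf, sizeof( buf ) ); // parks only this user thread
		if ( len <= 0 ) break;
		handle_request( this.fd, buf, len );   // hypothetical serving logic; may also block
	}
	close( this.fd );
}
\end{lstlisting}
Because @read@ parks only the calling user-level thread, a blocked connection does not prevent the underlying kernel thread from serving other connections.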

Memcached is not built according to a thread-per-connection model, but there exists a port of it that is, built for libfibre~\cite{DBLP:journals/pomacs/KarstenB20}.
Therefore, this version can be compared both to the original version and to a port to the \CFA runtime.

The three webservers compared are:
\begin{itemize}
\item \emph{vanilla}: the official release of memcached, version~1.6.9.
\item \emph{fibre}: a modification of vanilla which uses the thread-per-connection model on top of the libfibre runtime.
\item \emph{cfa}: a modification of the fibre webserver that replaces the libfibre runtime with \CFA.
\end{itemize}
     
\begin{figure}
	\centering
	\input{result.memcd.rate.qps.pstex_t}
	\caption[Memcached Benchmark: Throughput]{Memcached Benchmark: Throughput\smallskip\newline Desired vs actual request rate for 15360 connections. Target QPS is the request rate that the clients attempt to maintain and Actual QPS is the rate at which the server is able to respond.}
	\label{fig:memcd:rate:qps}
\end{figure}
Figure~\ref{fig:memcd:rate:qps} shows the throughput results for all three webservers.
This experiment is done by having the clients establish 15360 total connections, which persist for the duration of the experiment.
The clients then send requests, attempting to follow a desired request rate.
The servers respond to the offered load as best they can, and the figure plots the difference between the desired rate, ``Target \underline{Q}ueries \underline{P}er \underline{S}econd'', and the actual rate, ``Actual QPS''.
The results show that \CFA achieves equivalent throughput even when the server starts to reach saturation.
Only then does it start to fall behind slightly.
This demonstrates the \CFA runtime achieving its performance goal.
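For context, holding a target request rate on the client side amounts to open-loop pacing: each connection derives an inter-request interval from the desired QPS and sleeps between sends.
The sketch below illustrates the general technique, not the actual @mutilate@ implementation; @send_request@ is a hypothetical helper.
\begin{lstlisting}
// Sketch of open-loop pacing: issue requests at a fixed target rate.
#include <stdint.h>
#include <time.h>

void paced_send_loop( int fd, uint64_t target_qps, uint64_t total ) {
	// Assumes target_qps > 1 so the interval fits in tv_nsec.
	struct timespec gap = {
		.tv_sec  = 0,
		.tv_nsec = (long)( 1000000000ULL / target_qps )  // inter-request interval
	};
	for ( uint64_t i = 0; i < total; i += 1 ) {
		send_request( fd );       // hypothetical helper issuing one request
		// Sleeping the full gap ignores the send time, so the achieved rate
		// lands slightly below the target.
		nanosleep( &gap, NULL );
	}
}
\end{lstlisting}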

\subsection{Tail Latency}
     
\begin{figure}
	\centering
	\input{result.memcd.rate.99th.pstex_t}
	\caption[Memcached Benchmark: 99th Percentile Latency]{Memcached Benchmark: 99th Percentile Latency\smallskip\newline 99th percentile of the response latency as a function of the \emph{desired} request rate for 15360 connections.}
	\label{fig:memcd:rate:tail}
\end{figure}
Another important performance metric is \newterm{tail} latency.
Since many web applications rely on a combination of different requests made in parallel, the latency of the slowest response, \ie tail latency, can dictate overall performance.
Figure~\ref{fig:memcd:rate:tail} shows the 99th percentile latency results for the same memcached experiment.
As expected, the latency starts low and increases as the server gets closer to saturation, at which point the latency increases dramatically.
Note that the figure shows the \emph{target} request rate; the actual response rate is given in Figure~\ref{fig:memcd:rate:qps}, as this is the same underlying experiment.
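As a side note, a 99th-percentile figure is obtained by sorting the raw latency samples and reading the value at the 99\% rank.
The following sketch shows a common nearest-rank computation; the exact method used by @mutilate@ may differ.
\begin{lstlisting}
// Sketch of a nearest-rank 99th-percentile computation over latency samples.
#include <stdlib.h>

static int cmp_double( const void * a, const void * b ) {
	double x = *(const double *)a, y = *(const double *)b;
	return ( x > y ) - ( x < y );
}

double percentile_99( double * samples, size_t n ) {
	qsort( samples, n, sizeof( double ), cmp_double );
	size_t idx = (size_t)( 0.99 * (double)( n - 1 ) );  // index of the 99% rank
	return samples[ idx ];
}
\end{lstlisting}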

\subsection{Update Rate}
Since memcached is effectively a simple database, an aspect that can significantly affect performance is writes.
The information cached by memcached can be written to concurrently with other requests.
It is therefore interesting to see how this update rate affects performance.
Figures~\ref{fig:memcd:updt:forall}, \ref{fig:memcd:updt:fibre} and \ref{fig:memcd:updt:vanilla} show the results of the same experiment as the throughput and latency experiments, but run at multiple update rates.
Each experiment was repeated with an update percentage of 3\%, 5\%, 10\% and 50\%; the previous experiments were run with a 3\% update rate.
In the end, this experiment mostly demonstrates that the performance of memcached is affected very little by the update rate.
I believe this is because the underlying locking pattern is fairly similar: since values can be much bigger than what the server can read atomically, a lock must be acquired while the value is read.
These results suggest that memcached does not use a readers-writer lock to protect each value and instead relies on having a sufficient number of keys to limit contention.
Overall, this shows yet again that \CFA achieves equivalent performance.
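To illustrate the locking pattern alluded to above, the sketch below shows one plausible arrangement: a mutual-exclusion lock per hash bucket, where contention is limited by the number of buckets rather than by a readers-writer lock per value.
This is an assumption about the general technique, not memcached's actual code; @lookup_and_copy@ is a hypothetical helper.
\begin{lstlisting}
// Sketch of striped locking: one mutex per hash bucket limits contention.
#include <pthread.h>
#include <stddef.h>

#define NBUCKETS 1024
static pthread_mutex_t bucket_locks[NBUCKETS];   // initialized once at startup

void locks_init( void ) {
	for ( int i = 0; i < NBUCKETS; i += 1 )
		pthread_mutex_init( &bucket_locks[i], NULL );
}

static size_t bucket_of( const char * key ) {    // djb2-style string hash
	size_t h = 5381;
	for ( ; *key; key += 1 ) h = h * 33 + (unsigned char)*key;
	return h % NBUCKETS;
}

// Reads take the same bucket lock as writes, because a value can be larger
// than what can be read atomically.
void read_value( const char * key, char * out, size_t size ) {
	size_t b = bucket_of( key );
	pthread_mutex_lock( &bucket_locks[b] );
	lookup_and_copy( key, out, size );           // hypothetical lookup helper
	pthread_mutex_unlock( &bucket_locks[b] );
}
\end{lstlisting}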
\begin{figure}
	\centering
	\subfloat[][Throughput]{
		\input{result.memcd.forall.qps.pstex_t}
	}

	\subfloat[][Latency]{
		\input{result.memcd.forall.lat.pstex_t}
	}
	\caption[\CFA: Throughput and latency at different update rates]{\CFA: Throughput and latency results at different update rates (percentage of writes).}
	\label{fig:memcd:updt:forall}
\end{figure}

\begin{figure}
	\centering
	\subfloat[][Throughput]{
		\input{result.memcd.fibre.qps.pstex_t}
	}

	\subfloat[][Latency]{
		\input{result.memcd.fibre.lat.pstex_t}
	}
	\caption[LibFibre: Throughput and latency at different update rates]{LibFibre: Throughput and latency results at different update rates (percentage of writes).}
	\label{fig:memcd:updt:fibre}
\end{figure}

\begin{figure}
	\centering
	\subfloat[][Throughput]{
		\input{result.memcd.vanilla.qps.pstex_t}
	}

	\subfloat[][Latency]{
		\input{result.memcd.vanilla.lat.pstex_t}
	}
	\caption[Vanilla: Throughput and latency at different update rates]{Vanilla: Throughput and latency results at different update rates (percentage of writes).}
	\label{fig:memcd:updt:vanilla}
\end{figure}