Changes in / [fe7b281:9059213]


Ignore:
Location:
doc
Files:
1 deleted
4 edited

Legend:

Unmodified
Added
Removed
  • doc/LaTeXmacros/common.tex

    rfe7b281 r9059213  
    228228
    229229% CFA programming language, based on ANSI C (with some gcc additions)
    230 \lstdefinelanguage{Pseudo}{
    231         morekeywords={string,uint,int,bool,float},%
    232         sensitive=true,%
    233         morecomment=[l]{//},%
    234         morecomment=[s]{/*}{*/},%
    235         morestring=[b]',%
    236         morestring=[b]",%
    237         morestring=[s]{`}{`},%
    238 }%
    239 
    240 \lstset{
    241 language=Pseudo,
    242 columns=fullflexible,
    243 basicstyle=\linespread{0.9}\tt\small,           % reduce line spacing and use typewriter font
    244 stringstyle=\sf\color{Mahogany},                        % use sanserif font
    245 commentstyle=\itshape\color{OliveGreen},                % green and italic comments
    246 tabsize=4,                                                      % 4 space tabbing
    247 xleftmargin=\parindentlnth,                             % indent code to paragraph indentation
    248 extendedchars=true,                                     % allow ASCII characters in the range 128-255
    249 escapechar=§,                                           % escape to latex in CFA code
    250 mathescape=true,                                                % allow $...$ LaTeX math escapes in code
    251 %keepspaces=true,                                               %
    252 showstringspaces=false,                                 % do not show spaces with cup
    253 showlines=true,                                         % show blank lines at end of code
    254 aboveskip=4pt,                                          % spacing above/below code block
    255 belowskip=3pt,
    256 moredelim=**[is][\color{red}]{®}{®},    % red highlighting
    257 moredelim=**[is][\color{blue}]{ß}{ß},   % blue highlighting
    258 moredelim=**[is][\color{OliveGreen}]{¢}{¢}, % green highlighting
    259 moredelim=[is][\lstset{keywords={}}]{¶}{¶}, % temporarily turn off keywords
    260 % replace/adjust listing characters that look bad in sanserif
    261 literate={-}{\raisebox{-0.15ex}{\texttt{-}}}1 {^}{\raisebox{0.6ex}{$\scriptscriptstyle\land\,$}}1 {©}{{\"u}}1
    262         {~}{\raisebox{0.3ex}{$\scriptstyle\sim\,$}}1 {_}{\makebox[1.2ex][c]{\rule{1ex}{0.1ex}}}1 {`}{\ttfamily\upshape\hspace*{-0.1ex}`}1
    263         {<-}{$\leftarrow$}2 {=>}{$\Rightarrow$}2,
    264 }%
    265 
    266 % CFA programming language, based on ANSI C (with some gcc additions)
    267230\lstdefinelanguage{CFA}[ANSI]{C}{
    268231        morekeywords=[1]{_Alignas,_Alignof,__alignof,__alignof__,asm,__asm,__asm__,_At,_Atomic,__attribute,__attribute__,auto,
     
    302265\lstMakeShortInline©    % single-character for \lstinline
    303266
    304 
    305267\let\Oldthebibliography\thebibliography
    306268\renewcommand\thebibliography[1]{
  • doc/proposals/concurrency/concurrency.tex

    rfe7b281 r9059213  
    6262\newcommand{\cit}{\textsuperscript{[Citation Needed]}\xspace}
    6363\newcommand{\code}[1]{\lstinline{#1}}
    64 \newcommand{\pseudo}[1]{\lstinline[language=Pseudo]{#1}}
    6564
    6665\input{glossary}
     
    511510\begin{center}
    512511\begin{tabular}{l}
    513 \begin{lstlisting}[language=Pseudo]
    514         if monitor is free :
     512\begin{lstlisting}
     513        ¶if¶ critical section is free :
    515514                enter
    516         elif monitor accepts me :
     515        elif critical section accepts me :
    517516                enter
    518         else :
     517        ¶else¶ :
    519518                block
    520519\end{lstlisting}
     
    522521\end{center}
    523522
    524 For the \pseudo{monitor is free} condition it is easy to implement a check that can evaluate the condition in a few instruction. However, a fast check for \pseudo{monitor accepts me} is much harder to implement depending on the constraints put on the monitors. Indeed, monitors are often expressed as an entry queue and some acceptor queue as in the following figure :
     523For the \code{critical section is free} condition it is easy to implement a check that can evaluate the condition in a few instruction. However, a fast check for \code{critical section accepts me} is much harder to implement depending on the constraints put on the monitors. Indeed, monitors are often expressed as an entry queue and some acceptor queue as in the following figure :
    525524
    526525\begin{center}
    527 {\resizebox{0.4\textwidth}{!}{\input{monitor}}}
     526{\resizebox{0.5\textwidth}{!}{\input{monitor}}}
    528527\end{center}
    529528
    530 There are other alternatives to these pictures but in the case of this picture implementing a fast accept check is relatively easy. Indeed simply updating a bitmask when the acceptor queue changes is enough to have a check that executes in a single instruction, even with a fairly large number of acceptor. However, this relies on the fact that all the acceptable routines are declared with the monitor type. For OO languages this doesn't compromise much since monitors already have an exhaustive list of member routines. However, for \CFA this isn't the case, routines can be added to a type anywhere after its declaration. Its important to note that the bitmask approach does not actually require an exhaustive list of routines, but it requires a dense unique ordering of routines with an upper-bound and that ordering must be consistent across translation units.
    531 The alternative would be to have a picture more like this one:
    532 
    533 \begin{center}
    534 {\resizebox{0.4\textwidth}{!}{\input{ext_monitor}}}
    535 \end{center}
    536 
    537 Not storing the queues inside the monitor means that the storage can vary between routines, allowing for more flexibility and extensions. Storing an array of function-pointers would solve the issue of uniquely identifying acceptable routines. However, the single instruction bitmask compare has been replaced by dereferencing a pointer followed by a linear search. Furthermore, supporting nested external scheduling may now require additionnal searches on calls to accept to check if a routine is already queued in.
    538 
    539 At this point we must make a decision between flexibility and performance. Many design decisions in \CFA achieve both flexibility and performance, for example polymorphic routines add significant flexibility but inlining them means the optimizer can easily remove any runtime cost. Here however, the cost of flexibility cannot be trivially removed.
    540 
    541 In either cases here are a few alternatives for the different syntaxes this syntax : \\
    542 \begin{center}
    543 {\renewcommand{\arraystretch}{1.5}
    544 \begin{tabular}[t]{l @{\hskip 0.35in} l}
    545 \hline
    546 \multicolumn{2}{ c }{\code{accept} on type}\\
    547 \hline
     529There are other alternatives to these pictures but in the case of this picture implementing a fast accept check is relatively easy. Indeed simply updating a bitmask when the acceptor queue changes is enough to have a check that executes in a single instruction, even with a fairly large number of acceptor. However, this requires all the acceptable routines to be declared with the monitor declaration. For OO languages this doesn't compromise much since monitors already have an exhaustive list of member routines. However, for \CFA this isn't the case, routines can be added to a type anywhere after its declaration. A more flexible
     530
     531
     532At this point we must make a decision between flexibility and performance. Many design decisions in \CFA achieve both flexibility and performance, for example polymorphic routines add significant flexibility but inlining them means the optimizer can easily remove any runtime cost.
     533
     534This approach leads to the \uC example being translated to :
     535\begin{lstlisting}
     536        accept( void g(mutex struct A & mutex a) )
     537        mutex struct A {};
     538
     539        void f(A & mutex a) { accept(g); }
     540        void g(A & mutex a);
     541\end{lstlisting}
     542
     543This syntax is the most consistent with the language since it somewhat mimics the \code{forall} declarations. However, the fact that it comes before the struct declaration does means the type needs to be forward declared (done inline in the example). Here are a few alternatives to this syntax : \\
     544\begin{tabular}[t]{l l}
    548545Alternative 1 & Alternative 2 \\
    549546\begin{lstlisting}
    550547mutex struct A
    551 accept( void f(A & mutex a) )
     548accept( void g(A & mutex a) )
    552549{};
    553550\end{lstlisting} &\begin{lstlisting}
    554551mutex struct A {}
    555 accept( void f(A & mutex a) );
     552accept( void g(A & mutex a) );
    556553
    557554\end{lstlisting} \\
     
    559556\begin{lstlisting}
    560557mutex struct A {
    561         accept( void f(A & mutex a) )
     558        accept( void g(A & mutex a) )
    562559};
    563560
     
    565562mutex struct A {
    566563        accept :
    567                 void f(A & mutex a) );
     564                void g(A & mutex a) );
    568565};
    569 \end{lstlisting}\\
    570 \hline
    571 \multicolumn{2}{ c }{\code{accept} on routine}\\
    572 \hline
    573 \begin{lstlisting}
    574 mutex struct A {};
    575 
    576 void f(A & mutex a)
    577 
    578 accept( void f(A & mutex a) )
    579 void g(A & mutex a) {
    580         /*...*/
    581 }
    582 \end{lstlisting}&\\
     566\end{lstlisting}
    583567\end{tabular}
    584 }
    585 \end{center}
     568
    586569
    587570An other aspect to consider is what happens if multiple overloads of the same routine are used. For the time being it is assumed that multiple overloads of the same routine should be scheduled regardless of the overload used. However, this could easily be extended in the future.
     
    614597\end{lstlisting}
    615598
    616 This is unambiguous. Both locks will be acquired and kept, when routine \code{f} is called the lock for monitor \code{a} will be temporarily transferred from \code{g} to \code{f} (while \code{g} still holds lock \code{b}). This behavior can be extended to multi-monitor accept statment as follows.
     599This is unambiguous. The both locks will be acquired and kept, when routine \code{f} is called the lock for monitor \code{a} will be temporarily transferred from \code{g} to \code{f} (while \code{g} still holds lock \code{b}). This behavior can be extended to multi-monitor accept statment as follows.
    617600
    618601\begin{lstlisting}
     
    630613
    631614\subsubsection{Implementation Details: External scheduling queues}
    632 To support multi-monitor external scheduling means that some kind of entry-queues must be used that is aware of both monitors. However, acceptable routines must be aware of the entry queues which means they must be stored inside at least one of the monitors that will be acquired. This in turn adds the requirement a systematic algorithm of disambiguating which queue is relavant regardless of user ordering. The proposed algorithm is to fall back on monitors lock ordering and specify that the monitor that is acquired first is the lock with the relevant entry queue. This assumes that the lock acquiring order is static for the lifetime of all concerned objects but that is a reasonnable constraint. This algorithm choice has two consequences, the entry queue of the highest priority monitor is no longer a true FIFO queue and the queue of the lowest priority monitor is both required and probably unused. The queue can no longer be a FIFO queue because instead of simply containing the waiting threads in order arrival, they also contain the second mutex. Therefore, another thread with the same highest priority monitor but a different lowest priority monitor may arrive first but enter the critical section after a thread with the correct pairing. Secondly, since it may not be known at compile time which monitor will be the lowest priority monitor, every monitor needs to have the correct queues even though it is probable that half the multi-monitor queues will go unused for the entire duration of the program.
     615To support multi-monitor external scheduling means that some kind of entry-queues must be used that is aware of both monitors. However, acceptable routines must be aware of the entry queues which means they most be stored inside at least one of the monitors that will be acquired. This in turn adds the requirement a systematic algorithm of disambiguating which queue is relavant regardless of user ordering. The proposed algorithm is to fall back on monitors lock ordering and specify that the monitor that is acquired first is the lock with the relevant entry queue. This assumes that the lock acquiring order is static for the lifetime of all concerned objects gut that is a reasonnable contraint. This algorithm choice has two consequences, the ofthe highest priority monitor is no longer a true FIFO queue and the queue of the lowest priority monitor is both required and probably unused. The queue can no longer be a FIFO queue because instead of simply containing the waiting threads in order arrival, they also contain the second mutex. Therefore, another thread with the same highest priority monitor but a different lowest priority monitor may arrive first but enter the critical section after a thread with the correct pairing. Secondly, since it may not be known at compile time which monitor will be the lowest priority monitor, every monitor needs to have the correct queues even though it is probably that half the multi-monitor queues will go unused for the entire duration of the program.
    633616
    634617\subsection{Other concurrency tools}
    635 TO BE CONTINUED...
    636 
    637 \newpage
     618
    638619\section{Parallelism}
    639 Historically, computer performance was about processor speeds and instructions count. However, with heat dissipation being an ever growing challenge, parallelism has become the new source of greatest performance \cite{Sutter05, Sutter05b}. In this decade, it is not longer reasonnable to create high-performance application without caring about parallelism. Indeed, parallelism is an important aspect of performance and more specifically throughput and hardware utilization. The lowest level approach of parallelism is to use \glspl{kthread}. However since these have significant costs and limitations \glspl{kthread} are now mostly used as an implementation tool rather than a user oriented one. There are several alternatives to solve these issues which all have strengths and weaknesses.
     620Historically, computer performance was about processor speeds and instructions count. However, with heat dissipaction being an ever growing challenge, parallelism has become the new source of greatest performance \cite{Sutter05, Sutter05b}. In this decade, it is not longer reasonnable create high-performance application without caring about parallelism. Indeed, parallelism an important aspect of performance and more specifically throughput and hardware utilization. The lowest level approach parallelism is to use \glspl{kthread}. However since these have significant costs and limitations, \glspl{kthread} are now mostly used as an implementation tool rather than a user oriented one. There are several alternatives to solve these issues which all have strengths and weaknesses.
    640621
    641622\subsection{User-level threads}
     
    643624
    644625Examples of languages that support are Java\cite{Java}, Haskell\cite{Haskell} and \uC\cite{uC++book}.
    645 
    646626\subsection{Jobs and thread pools}
    647 The approach on the opposite end of the spectrum is to base parallelism on \glspl{job}. Indeed, \glspl{job} offer limited flexibility but at the benefit of a simpler user interface. In \gls{job} based systems users express parallelism as units of work and the dependency graph (either explicit or implicit) that tie them together. This means users need not to worry about concurrency but significantly limits the interaction that can occur between different jobs. Indeed, any \gls{job} that blocks also blocks the underlying \gls{kthread}, this effectively mean the CPU utilization, and therefore throughput, will suffer noticeably.
    648 The golden standard of this implementation is Intel's TBB library\cite{TBB}.
     627The opposite approach is to base parallelism on \glspl{job}. Indeed, \glspl{job} offer limited flexibility but at the benefit of a simpler user interface. In \gls{job} based systems users express parallelism as units of work and the dependency graph (either explicit or implicit) that tie them together. This means users need not to worry about concurrency but significantly limits the interaction that can occur between different jobs. Indeed, any \gls{job} that blocks also blocks the underlying \gls{kthread}, this effectively mean the CPU utilization, and therefore throughput, will suffer noticeably. The golden standard of this implementation is Intel's TBB library\cite{TBB}.
    649628
    650629\subsection{Fibers : user-level threads without preemption}
    651630Finally, in the middle of the flexibility versus complexity spectrum lay \glspl{fiber} which offer \glspl{uthread} without the complexity of preemption. This means users don't have to worry about other \glspl{fiber} suddenly executing between two instructions which signficantly reduces complexity. However, any call to IO or other concurrency primitives can lead to context switches. Furthermore, users can also block \glspl{fiber} in the middle of their execution without blocking a full processor core. This means users still have to worry about mutual exclusion, deadlocks and race conditions in their code, raising the complexity significantly.
    652 An example of a language that uses fibers is Go\cite{Go}
     631\cite{Go}
    653632
    654633\subsection{Paradigm performance}
    655 While the choice between the three paradigms listed above may have significant performance implication, it is difficult to pin the performance implications of chosing a model at the language level. Indeed, in many situations one of these paradigms will show better performance but it all strongly depends on the usage. Having mostly indepent units of work to execute almost guarantess that the \gls{job} based system will have the best performance. However, add interactions between jobs and the processor utilisation might suffer. User-level threads may allow maximum ressource utilisation but context switches will be more expansive and it is also harder for users to get perfect tunning. As with every example, fibers sit somewhat in the middle of the spectrum. Furthermore, if the units of uninterrupted work are large enough the paradigm choice will be largely amorticised by the actual work done.
     634While the choice between the three paradigms listed above may have significant performance implication, it is difficult to pin the performance implications of chosing a model at the language level. Indeed, in many situations own of these paradigms will show better performance but it all strongly depends on the usage. Having mostly indepent units of work to execute almost guarantess that the \gls{job} based system will have the best performance. However, add interactions between jobs and the processor utilisation might suffer. User-level threads may allow maximum ressource utilisation but context switches will be more expansive and it is also harder for users to get perfect tunning. As with every example, fibers sit somewhat in the middle of the spectrum. Furthermore, if the units of uninterrupted work are large enough the paradigm choice will be fully armoticised by the actual work done.
    656635
    657636\section{\CFA 's Thread Building Blocks}
     
    708687\end{lstlisting}
    709688
    710 % In this example \code{func} is a function pointer stored in \acrfull{tls}, which is \CFA is both easy to use and completly typesafe.
    711 
    712 Of course for threads to be useful, it must be possible to start and stop threads and wait for them to complete execution. While using an \acrshort{api} such as \code{fork} and \code{join} is relatively common in the literature, such an interface is not needed. Indeed, the simplest approach is to use \acrshort{raii} principles and have threads \code{fork} once the constructor has completed and \code{join} before the destructor runs.
     689In this example \code{func} is a function pointer stored in \acrfull{tls}, which is \CFA is both easy to use and completly typesafe.
     690
     691Of course for threads to be useful, it must be possible to start and stop threads and wait for them to complete execution. While using \acrshort{api} such as \code{fork} and \code{join} is relatively common in the literature, such an interface is not needed. Indeed, the simplest approach is to use \acrshort{raii} principles and have threads \code{fork} once the constructor has completed and \code{join} before the destructor runs.
    713692\begin{lstlisting}
    714693thread struct FuncRunner; //FuncRunner declared above
     
    754733\end{lstlisting}
    755734
    756 \newpage
    757 \large{\textbf{WORK IN PROGRESS}}
    758735\subsection{The \CFA Kernel : Processors, Clusters and Threads}\label{kernel}
    759736
  • doc/proposals/concurrency/ext_monitor.fig

    rfe7b281 r9059213  
    88-2
    991200 2
     106 1275 1200 1575 1500
     111 3 0 1 -1 -1 1 0 4 0.000 1 0.0000 1425 1350 105 105 1425 1350 1530 1350
     124 1 -1 0 0 0 10 0.0000 2 105 90 1425 1410 b\001
     13-6
     146 1275 1500 1575 1800
     151 3 0 1 -1 -1 1 0 4 0.000 1 0.0000 1425 1650 105 105 1425 1650 1530 1650
     164 1 -1 0 0 0 10 0.0000 2 75 75 1425 1695 a\001
     17-6
     186 2175 1500 2475 1800
     191 3 0 1 -1 -1 1 0 4 0.000 1 0.0000 2325 1650 105 105 2325 1650 2430 1650
     204 1 -1 0 0 0 10 0.0000 2 75 75 2325 1695 c\001
     21-6
     226 2175 1200 2475 1500
     231 3 0 1 -1 -1 1 0 4 0.000 1 0.0000 2325 1350 105 105 2325 1350 2430 1350
     244 1 -1 0 0 0 10 0.0000 2 105 90 2325 1410 d\001
     25-6
     266 2775 1200 7350 5700
    10275 1 0 1 -1 -1 0 0 -1 0.000 0 1 0 0 3150.000 3450.000 3150 3150 2850 3450 3150 3750
    11285 1 0 1 -1 -1 0 0 -1 0.000 0 1 0 0 3150.000 4350.000 3150 4050 2850 4350 3150 4650
     
    18354 1 -1 0 0 0 10 0.0000 2 105 90 6000 1860 b\001
    1936-6
    20 6 5100 1800 5400 2100
    21 1 3 0 1 -1 -1 1 0 4 0.000 1 0.0000 5250 1950 105 105 5250 1950 5355 1950
    22 4 1 -1 0 0 0 10 0.0000 2 105 120 5250 2010 Y\001
    23 -6
    24 6 5100 2100 5400 2400
    25 1 3 0 1 -1 -1 1 0 4 0.000 1 0.0000 5250 2250 105 105 5250 2250 5355 2250
    26 4 1 -1 0 0 0 10 0.0000 2 105 120 5250 2295 X\001
    27 -6
    28 6 3000 5400 7200 5700
     376 3000 5400 6975 5700
    29381 3 0 1 -1 -1 0 0 20 0.000 1 0.0000 3150 5550 80 80 3150 5550 3230 5630
    30391 3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 4500 5550 105 105 4500 5550 4605 5655
    31401 3 0 1 -1 -1 0 0 4 0.000 1 0.0000 6000 5550 105 105 6000 5550 6105 5655
     414 0 -1 0 0 0 12 0.0000 2 180 765 6225 5625 duplicate\001
    32424 0 -1 0 0 0 12 0.0000 2 135 1035 4725 5625 blocked task\001
    33434 0 -1 0 0 0 12 0.0000 2 135 870 3300 5625 active task\001
    34 4 0 -1 0 0 0 12 0.0000 2 180 930 6225 5625 routine ptrs\001
    3544-6
    36451 3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 3300 3600 105 105 3300 3600 3405 3705
     
    41501 3 0 1 -1 -1 0 0 -1 0.000 1 0.0000 6000 2400 105 105 6000 2400 6105 2505
    42511 3 0 1 -1 -1 0 0 20 0.000 1 0.0000 5100 4575 80 80 5100 4575 5180 4655
    43 2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    44          4050 2925 5475 2925 5475 3225 4050 3225 4050 2925
    45522 1 0 1 -1 -1 0 0 -1 0.000 0 0 -1 0 0 2
    4653         5850 2850 6075 3000
     
    70772 1 0 1 -1 -1 0 0 -1 0.000 0 0 -1 0 0 4
    7178         3150 3150 3750 3150 3750 2850 5325 2850
    72 2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 1 2
    73         1 1 1.00 60.00 120.00
    74         7 1 1.00 60.00 120.00
    75          5250 3150 5250 2400
    76 2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    77          5100 1800 5400 1800 5400 2400 5100 2400 5100 1800
    78794 1 -1 0 0 0 10 0.0000 2 75 75 6000 2745 a\001
    79804 1 -1 0 0 0 10 0.0000 2 75 75 6000 2445 c\001
     
    92934 1 -1 0 0 0 12 0.0000 2 135 525 5100 3675 shared\001
    93944 1 -1 0 0 0 12 0.0000 2 135 735 5100 3975 variables\001
    94 4 0 0 50 -1 0 11 0.0000 2 165 855 4275 3150 Acceptables\001
     95-6
     962 1 0 1 -1 -1 0 0 -1 0.000 0 0 -1 0 0 2
     97         1275 1800 1500 1950
     982 1 0 1 -1 -1 0 0 -1 0.000 0 0 -1 0 0 2
     99         2175 1800 2400 1950
     1002 1 0 1 -1 -1 0 0 -1 0.000 0 0 -1 0 0 4
     101         1575 1200 1575 1800 2175 1800 2175 1200
     1022 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
     103         1275 1200 1275 2100 2475 2100 2475 1200
     1042 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     105         4050 2925 5475 2925 5475 3225 4050 3225 4050 2925
     1062 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2
     107        1 1 1.00 60.00 120.00
     108         1875 2400 1875 2175
     1092 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 1 2
     110        7 1 1.00 60.00 120.00
     111         5250 3075 5250 2400
     1122 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
     113         5250 2400 1875 2400
     1144 1 -1 0 0 0 12 0.0000 2 135 135 2325 1125 Y\001
     1154 1 -1 0 0 0 12 0.0000 2 120 510 1875 675 mutex\001
     1164 1 -1 0 0 0 12 0.0000 2 135 570 1875 900 queues\001
     1174 1 -1 0 0 0 12 0.0000 2 135 135 1425 1125 X\001
     1184 0 0 50 -1 0 11 0.0000 2 150 795 4275 3150 Queues Ptr\001
  • doc/proposals/concurrency/version

    rfe7b281 r9059213  
    1 0.4.95
     10.4.61
Note: See TracChangeset for help on using the changeset viewer.