Changeset 7a0f1d25 for doc


Ignore:
Timestamp:
Mar 23, 2020, 2:19:13 PM
Author:
Thierry Delisle <tdelisle@…>
Branches:
ADT, arm-eh, ast-experimental, enum, forall-pointer-decay, jacob/cs343-translation, master, new-ast, new-ast-unique-expr, pthread-emulation, qualifiedEnum
Children:
3a3b2b9
Parents:
9640813
Message:

Latest updates of CompII

Location:
doc/theses/thierry_delisle_PhD/comp_II
Files:
2 added
4 edited

  • doc/theses/thierry_delisle_PhD/comp_II/Makefile

    r9640813 r7a0f1d25  
    2222        emptybit \
    2323        emptytree \
     24        emptytls \
    2425        resize \
     26        system \
    2527}
    2628
  • doc/theses/thierry_delisle_PhD/comp_II/comp_II.tex

    r9640813 r7a0f1d25  
    131131\begin{figure}
    132132        \begin{center}
    133 %               {\resizebox{0.8\textwidth}{!}{\input{base}}}
    134133                \input{base}
    135134        \end{center}
    136         \caption{Relaxed FIFO list at the base of the scheduler: an array of strictly FIFO lists. }
     135        \caption{Relaxed FIFO list at the base of the scheduler: an array of strictly FIFO lists. A timestamp is stored in every node and in every array cell.}
    137136        \label{fig:base}
    138137\end{figure}
     
    140139\begin{figure}
    141140        \begin{center}
    142 %               {\resizebox{0.8\textwidth}{!}{\input{empty}}}
    143141                \input{empty}
    144142        \end{center}
     
    199197Finally, a third approach is to use dense information, similar to the bitmap, but have each thread keep its own independent copy of it. While this approach can offer good scalability \emph{and} low latency, the liveness of the information can become a problem. In the simple cases, local copies of which underlying queues are empty can become stale and end up not being useful for the pop operation. A more serious problem is that reliable information is necessary for some parts of this algorithm to be correct. As mentioned in this section, processors must know \emph{reliably} whether the list is empty or not to decide if they can return \texttt{NULL} or if they must keep looking during a pop operation. Section~\ref{sec:sleep} discusses another case where reliable information is required for the algorithm to be correct.
    200198
     199\begin{figure}
     200        \begin{center}
     201                {\resizebox{0.8\textwidth}{!}{\input{emptytls}}}
     202        \end{center}
     203        \caption{``More empty'' queue with added per processor bitmask to indicate which array cells have items.}
     204        \label{fig:emptytls}
     205\end{figure}
     206
    201207There is a fundamental tradeoff among these approaches. Dense global information about empty underlying queues helps zero-contention cases at the cost of high-contention cases. Sparse global information helps high-contention cases but increases latency in zero-contention cases, to read and ``aggregate'' the information\footnote{Hierarchical structures, e.g., a binary search tree, effectively aggregate information, but following pointer chains means learning information at each node. Similarly, other sparse schemes would need to read multiple cachelines to acquire all the information needed.}. Finally, dense local information has both the advantages of low latency in zero-contention cases and scalability in high-contention cases; however, the information can become stale, making it difficult to use to ensure correctness. The fact that these solutions have these fundamental limits suggests to me that a better solution combines these properties in interesting ways. The lock discussed in Section~\ref{sec:resize} also allows for solutions that adapt to the number of processors, which could also prove useful.
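
To make the dense-information approach concrete, the following C sketch shows how a global bitmask of non-empty queues could be searched with the BMI intrinsics used in the prototype. This is a minimal sketch, not the changeset's code: `nonempty_mask`, the helper name, and the random index `r` are hypothetical, and because the mask can be stale a pop must still re-check the queue it selects.

    #include <stdint.h>
    #include <immintrin.h>   /* BMI1/BMI2 intrinsics; compile with -mbmi -mbmi2 */

    /* Hypothetical global mask: bit i is set when underlying queue i may
     * hold items.  Push sets the bit; pop clears it when the queue drains. */
    extern _Atomic uint64_t nonempty_mask;

    /* Select the r-th set bit uniformly among the (apparently) non-empty
     * queues: _pdep_u64 deposits a single 1 into the r-th set position of
     * the mask, and _tzcnt_u64 recovers that position's index. */
    static inline int pick_nonempty(uint64_t mask, unsigned r) {
        if (mask == 0) return -1;                 /* everything looks empty */
        r %= (unsigned)__builtin_popcountll(mask);
        uint64_t bit = _pdep_u64(1ULL << r, mask);
        return (int)_tzcnt_u64(bit);
    }

A pop that finds the selected queue empty after all must clear the bit and retry, which is exactly where the reliability concern discussed above bites.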
    202208
    203209\paragraph{Objectives and Existing Work}
    204210
    205 How much scalability is actually needed is highly debatable, libfibre\cit has compared favorably to other schedulers in webserver tests\cit and uses a single atomic counter in its scheduling algorithm similarly to the proposed bitmask. As such, the single atomic instruction on a shared cacheline may be sufficiently performant.
    206 
    207 I have built a prototype of this ready-queue (including the bitmask and BMI2 usage, but not the sharded bitmask) and ran performance experiments on it but it is difficult to compare this prototype to a thread scheduler as the prototype is used as a data-queue. I have also integrated this prototype into the \CFA runtime, but have not yet created performance experiments to compare results. I believe that the bitmask approach is currently one of the larger risks of the proposal, early tests lead me to believe it may work but it is not clear that the contention problem can be overcome. The worst-case scenario is a case where the number of processors and the number of ready threads are similar, yet scheduling events are very frequent. Fewer threads should lead to the Idle Sleep mechanism discussed in Section~\ref{sec:sleep} to reduce contention while having many threads ready leads to optimal performance. It is difficult to evaluate the likeliness of this worst-case scenario in real workloads. I believe, frequent scheduling events suggest a more ``bursty'' workload where new work is finely divided among many threads which race to completion. This type of workload would only see a peek of contention close to the end of the work, but no sustained contention. Very fine-grained pipelines are less ``bursty'', these may lead to more sustained contention. However, they could also easily benefit from a direct hand-off strategy which would circumvent the problem entirely.
     211How much scalability is actually needed is highly debatable; \emph{libfibre}\cit has compared favorably to other schedulers in webserver tests\cit and uses a single atomic counter in its scheduling algorithm, similar to the proposed bitmask. As such, the single atomic instruction on a shared cacheline may be sufficiently performant.
     212
     213I have built a prototype of this ready-queue in the shape of a data-queue, i.e., nodes on the queue are structures with a single int and the intrusive data fields. Using this prototype I ran preliminary performance experiments which confirm the expected performance in Table~\ref{tab:perfcases}. However, these experiments only offer a hint at the actual performance of the scheduler, since scheduling threads involves more complex operations than pushing and popping integer nodes, e.g., threads are not independent of each other: when a thread blocks, some other thread must intervene to wake it.
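
For illustration, a node in such a data-queue prototype might look like the following C sketch (the field names are hypothetical; the point is that the link and timestamp are intrusive, i.e., stored in the node itself, as in the nodes of Figure~\ref{fig:base}).

    #include <stdint.h>

    struct node {
        struct node * next;   /* intrusive FIFO link within one underlying queue */
        uint64_t      ts;     /* timestamp, as shown in the figure               */
        int           value;  /* the single int standing in for a thread         */
    };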
     214
     215I have also integrated this prototype into the \CFA runtime, but have not yet created performance experiments to compare results, as creating one-to-one comparisons with the prototype will be complex.
    208216
    209217\subsection{Dynamic Resizing} \label{sec:resize}
    210 The \CFA runtime system currently handles dynamically adding and removing processors from clusters at any time. Since this is part of the existing design, the proposed scheduler must also support this behaviour. However, dynamicly resizing the clusters is considered a rare event associated with setup, teardown and major configuration changes. This assumptions is made both in the design of the proposed scheduler as well as in the original design of the \CFA runtime system. As such, the proposed scheduler must honor the correctness of these behaviour but does not have any performance objectives with regards to resizing a cluster. How long adding or removing processors take and how much this disrupts the performance of other threads is considered a secondary concern since it should be amortized over long period of times. However, as mentionned in Section~\ref{sec:queue}, contention on the underlying queues can have a direct impact on performance, the number of underlying queues must therefore be adjusted as the number of processors grows or shrinks. Since the underlying queues are stored in a dense array, changing the number of queues requires resizing the array and therefore moving it. This can introduce memory reclamation problems if not done correctly.
    211 
    212 \begin{figure}
    213         \begin{center}
    214 %               {\resizebox{0.8\textwidth}{!}{\input{resize}}}
     218\begin{figure}
     219        \begin{center}
     220                \input{system}
     221        \end{center}
     222        \caption{Global structure of the \CFA runtime system.}
     223        \label{fig:system}
     224\end{figure}
     225
     226The \CFA runtime system groups processors together as clusters. Threads on a cluster are always scheduled on one of the processors of the cluster. Currently, the runtime handles dynamically adding and removing processors from clusters at any time. Since this is part of the existing design, the proposed scheduler must also support this behaviour. However, dynamically resizing a cluster is considered a rare event associated with setup, teardown and major configuration changes. This assumption is made both in the design of the proposed scheduler as well as in the original design of the \CFA runtime system. As such, the proposed scheduler must honor the correctness of this behaviour but does not have any performance objectives with regard to resizing a cluster. How long adding or removing processors takes and how much this disrupts the performance of other threads are considered secondary concerns, since they should be amortized over long periods of time. However, as mentioned in Section~\ref{sec:queue}, contention on the underlying queues can have a direct impact on performance. The number of underlying queues must therefore be adjusted as the number of processors grows or shrinks. Since the underlying queues are stored in a dense array, changing the number of queues requires resizing the array, and expanding the array requires moving it. This can introduce memory-reclamation problems if not done correctly.
     227
     228\begin{figure}
     229        \begin{center}
    215230                \input{resize}
    216231        \end{center}
    217         \caption{Copy of data structure shown in Figure~\ref{fig:base}. The cells of the array can be modified concurrently but resizing the array, which requires moving it, is not safe to do concurrently. This can also be true of the accompanying data structures used to find non-empty queues.}
     232        \caption{Copy of data structure shown in Figure~\ref{fig:base}. }
    218233        \label{fig:base2}
    219234\end{figure}
    220235
    221 It is important to note that how the array is used in this case. While the array cells are modified by every push and pop operation, the array itself, i.e., the pointer that would change when resized, is only read during these operations. Therefore the use is this pointer can be described as frequent reads and in frequent writes. This description effectively matches with the description of a Reader-Writer lock, infrequent but invasive updates among frequent read operations. In the case of the Ready-Queue described above, read operations are operations that push or pop from the ready-queue but do not invalidate any references to the ready queue data structures. Writes on the other-hand would add or remove inner queues, invalidating references to the array of inner queues in the process. Therefore, the current proposed approach to this problem is the add a per-cluster Reader Writer lock around the ready queue to prevent restructuring of the ready-queue data structure while threads are being pushed or popped.
    222 
    223 There are possible alternatives to the Reader Writer lock solution. This problem is effectively a memory reclamation problem and as such there is a large body of research on the subject. However, the RWlock solution is simple and can be leveraged to solve other problems (e.g. processor ordering and memory reclamation of threads) which makes it an attractive solution.
     236It is important to note how the array is used in this case. While the array cells are modified by every push and pop operation, the array itself, i.e., the pointer that would change when resized, is only read during these operations. Therefore the use of this pointer can be described as frequent reads and infrequent writes. This description effectively matches the description of a Reader-Writer lock: infrequent but invasive updates among frequent read operations. In the case of the ready-queue described above, read operations are operations that push or pop from the ready-queue but do not invalidate any references to the ready-queue data structures. Writes, on the other hand, would add or remove inner queues, invalidating references to the array of inner queues in the process. Therefore, the current proposed approach to this problem is to add a per-cluster Reader-Writer lock around the ready-queue to prevent restructuring of the ready-queue data structure while threads are being pushed or popped.
     237
     238There are possible alternatives to the Reader-Writer lock solution. This problem is effectively a memory-reclamation problem and as such there is a large body of research on the subject\cit. However, the RWlock solution is simple and can be leveraged to solve other problems (e.g., processor ordering and memory reclamation of threads), which makes it an attractive solution.
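
A minimal sketch of this per-cluster Reader-Writer lock in C, using pthreads for brevity (the \CFA runtime has its own locks; `thread_desc`, the function names, and the elided bodies are hypothetical):

    #include <pthread.h>

    struct thread_desc;                               /* hypothetical thread type  */
    static pthread_rwlock_t queues_lock = PTHREAD_RWLOCK_INITIALIZER;

    void ready_push(struct thread_desc * t) {
        pthread_rwlock_rdlock(&queues_lock);          /* many concurrent "readers";
                                                       * they mutate cells but never
                                                       * move the array itself      */
        /* ... push t onto one of the underlying queues ... */
        pthread_rwlock_unlock(&queues_lock);
    }

    void cluster_resize(unsigned nqueues) {
        pthread_rwlock_wrlock(&queues_lock);          /* excludes every push/pop   */
        /* ... allocate a new array, move the queues, free the old array ... */
        pthread_rwlock_unlock(&queues_lock);
    }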
    224239
    225240\paragraph{Objectives and Existing Work}
     
    227242
    228243\subsection{Idle Sleep} \label{sec:sleep}
    229 As mentionned above, idle sleep is the process of putting processors to sleep while they do not have threads to execute. In this context, processors are kernel-threads and sleeping refers to asking the kernel to block a thread. This can be achieved with either thread synchronization operations like pthread\_cond\_wait or using signal operations like sigsuspend. The goal of putting idle processors to sleep is two-fold, it reduces energy consumption in cases where more idle kernel-threads translate to idle hardware threads, and reduces contention on the ready queue, since the otherwise idle processors generally contend trying to pop items from the queue. Since energy efficiency is a growing concern in many industry sectors\cit, there is not real need to solve the contention problem without using idle sleep.
    230 
    231 Support for idle sleep broadly involves calling the operating system to block the kernel thread but also handling the race between the sleeping and the waking up, and handling which kernel thread should sleep or wake-up.
    232 
    233 When a processor decides to sleep, there is a race that occurs between it signalling that it will go to sleep (so other processors can find sleeping processors) and actually blocking the kernel thread. This is equivalent to the classic problem of missing signals when using condition variables, the ``sleepy'' processor indicates that it will sleep but has not yet gone to sleep, if another processor attempts to wake it up, the waking-up operation may claim nothing needs to be done and the signal will have been missed. In cases where threads are scheduled from processors on the current cluster, loosing signals is not necessarily critical, because at least some processors on the cluster are awake. Individual processors always finish scheduling threads before looking for new work, which means that the last processor to go to sleep cannot miss threads scheduled from inside the cluster (if they do, that demonstrates the ready-queue is not linearizable). However, this guarantee does not hold if threads are scheduled from outside the cluster, either due to an external event like timers and I/O, or due to a thread migrating from a different cluster. In this case, missed signals can lead to the cluster deadlocking where it should not\footnote{Clusters ``should'' never deadlock, but for this proposal, cases where \CFA users \emph{actually} wrote \CFA code that leads to a deadlock it is considered as a deadlock that ``should'' happen. }. Therefore, it is important that the scheduling of threads include a mechanism where signals \emph{cannot} be missed. For performance reasons, it can be advantageous to have a secondary mechanism that allows signals to be missed in cases where it cannot lead to a deadlock. To be safe, this process must include a ``handshake'' where it is guaranteed that either~: the sleepy processor notices that a thread was scheduled after it signalled its intent to block or code scheduling threads sees the intent to sleep before scheduling and be able to wake-up the processor. This matter is complicated by the fact that pthread offers few tools to implement this solution and offers no guarantee of ordering of threads waking up for most of these tools.
    234 
    235 Another issues is trying to avoid kernel sleeping and waking frequently. A possible partial solution is to order the processors so that the one which most recently went to sleep is woken up. This allows other sleeping processors to reach deeper sleep state (when these are available) while keeping ``hot'' processors warmer. Note that while this generally means organising the processors in a stack, I believe that the unique index provided by the ReaderWriter lock can be reused to strictly order the waking order of processors, causing a LIFO like waking order. While a strict LIFO stack is probably better, using the processor index could prove useful and offer a sufficiently LIFO ordering.
    236 
    237 Finally, another important aspect of Idle Sleep is when should processors make the decision to sleep and when it is appropriate for sleeping processors to be woken up. Processors that are unnecessarily awake lead to unnecessary contention and power consumption, while too many sleeping processors can lead to sub-optimal throughput. Furthermore, transitions from sleeping to awake and vice-versa also add unnecessary latency. There is already a wealth of research on the subject and I do not plan to implement a novel idea for the Idle Sleep heuristic in this project.
     244As mentioned, idle sleep is the process of putting processors to sleep when they have no threads to execute. In this context, processors are kernel-threads and sleeping refers to asking the kernel to block a thread. Sleeping can be achieved either with thread-synchronization operations like \texttt{pthread\_cond\_wait} or with signal operations like \texttt{sigsuspend}. The goal of putting idle processors to sleep is two-fold: it reduces energy consumption in cases where more idle kernel-threads translate to idle hardware threads, and it reduces contention on the ready queue, since the otherwise idle processors generally contend trying to pop items from the queue.
     245
     246Support for idle sleep broadly involves calling the operating system to block the kernel thread, handling the race between a blocking thread and a waking thread, and deciding which kernel thread should sleep or wake up.
     247
     248When a processor decides to sleep, there is a race between it signalling that it is going to sleep (so other processors can find sleeping processors) and actually blocking the kernel thread. This is equivalent to the classic problem of missing signals when using condition variables: the ``sleepy'' processor indicates that it will sleep but has not yet gone to sleep; if another processor attempts to wake it up in this window, the waking-up operation may claim nothing needs to be done and the signal is missed. In cases where threads are scheduled from processors on the current cluster, losing signals is not necessarily critical, because at least some processors on the cluster are awake and eventually check for more work. Individual processors always finish scheduling threads before looking for new work, which means that the last processor to go to sleep cannot miss threads scheduled from inside the cluster (if it does, that demonstrates the ready-queue is not linearizable). However, this guarantee does not hold if threads are scheduled from outside the cluster, either due to an external event like timers and I/O, or due to a thread migrating from a different cluster. In this case, missed signals can lead to the cluster deadlocking when it should not\footnote{Clusters ``should'' never deadlock, but for this proposal, cases where \CFA users \emph{actually} write \CFA code that leads to a deadlock are considered deadlocks that ``should'' happen.}. Therefore, it is important that the scheduling of threads include a mechanism where signals \emph{cannot} be missed. For performance reasons, it can be advantageous to have a secondary mechanism that allows signals to be missed in cases where it cannot lead to a deadlock. To be safe, this process must include a ``handshake'' where it is guaranteed that either the sleepy processor notices that a thread was scheduled after it signalled its intent to block, or the code scheduling threads sees the intent to sleep before scheduling and is able to wake up the processor. This matter is complicated by the fact that pthreads offers few tools to implement this solution and, for most of these tools, offers no guarantee on the order in which threads wake up.
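
The handshake can be sketched in C with pthreads and C11 atomics. This is a sketch only: `ready_queue_nonempty` is a hypothetical test, and the argument assumes sequentially consistent atomics, so the sleeper's publish-then-recheck and the waker's enqueue-then-check cannot both miss.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
    static atomic_bool sleepy = false;

    extern bool ready_queue_nonempty(void);      /* hypothetical */

    void processor_sleep(void) {
        pthread_mutex_lock(&m);
        atomic_store(&sleepy, true);             /* 1: publish intent to sleep    */
        if (!ready_queue_nonempty())             /* 2: re-check for late arrivals */
            pthread_cond_wait(&c, &m);           /* 3: block; a spurious wakeup
                                                  *    merely costs a re-check    */
        atomic_store(&sleepy, false);
        pthread_mutex_unlock(&m);
    }

    void processor_wake(void) {                  /* called after enqueuing a thread */
        if (atomic_load(&sleepy)) {
            pthread_mutex_lock(&m);              /* serializes with cond_wait,
                                                  * so the signal cannot be lost   */
            pthread_cond_signal(&c);
            pthread_mutex_unlock(&m);
        }
    }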
     249
     250Another issue is trying to avoid kernel threads sleeping and waking frequently. A possible partial solution is to order the processors so that the one that most recently went to sleep is woken up first. This allows other sleeping processors to reach deeper sleep states (when these are available) while keeping ``hot'' processors warm. Note that while this generally means organising the processors in a stack, I believe the unique index provided by the Reader-Writer lock can be reused to strictly order the waking of processors, causing a LIFO-like waking order. While a strict LIFO stack is probably better, using the processor index could prove useful and offer a sufficiently LIFO ordering.
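
A LIFO wake order amounts to a simple stack of idle processors, sketched below in C (`struct processor`, its link field, and the external lock guarding the stack are all hypothetical):

    struct processor { struct processor * idle_next; /* ... */ };
    static struct processor * idle_top = NULL;    /* guarded by an external lock */

    void idle_push(struct processor * p) {        /* called just before sleeping */
        p->idle_next = idle_top;
        idle_top = p;
    }

    struct processor * idle_pop(void) {           /* wake the most recent sleeper */
        struct processor * p = idle_top;
        if (p) idle_top = p->idle_next;
        return p;
    }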
     251
     252Finally, another important aspect of idle sleep is when processors should make the decision to sleep and when it is appropriate for sleeping processors to be woken up. Processors that are unnecessarily unblocked lead to unnecessary contention and power consumption, while too many sleeping processors can lead to sub-optimal throughput. Furthermore, transitions from sleeping to awake and vice versa add unnecessary latency. There is already a wealth of research on the subject and I may use an existing approach for the idle-sleep heuristic in this project.
    238253
    239254\subsection{Asynchronous I/O}
    240 The final aspect of this proposal is asynchronous I/O. Without it, user threads that execute I/O operations will block the underlying kernel thread. This leads to poor throughput, it would be preferrable to block the user-thread and reuse the underlying kernel-thread to run other ready threads. This requires intercepting the user-threads' calls to I/O operations, redirecting them to an asynchronous I/O interface and handling the multiplexing between the synchronous and asynchronous API. As such, these are the three components needed to implemented to support asynchronous I/O : an OS abstraction layer over the asynchronous interface, an event-engine to (de)multiplex the operations and a synchronous interface for users to use. None of these components currently exist in \CFA and I will need to build all three for this project.
     255The final aspect of this proposal is asynchronous I/O. Without it, user threads that execute I/O operations block the underlying kernel thread, which leads to poor throughput. It would be preferable to block the user-thread and reuse the underlying kernel-thread to run other ready threads. This approach requires intercepting the user-threads' calls to I/O operations, redirecting them to an asynchronous I/O interface, and handling the multiplexing between the synchronous and asynchronous APIs. As such, three components are needed to implement support for asynchronous I/O: an OS abstraction layer over the asynchronous interface, an event-engine to (de)multiplex the operations, and a synchronous interface for users to use. None of these components currently exist in \CFA and I will need to build all three for this project.
    241256
    242257\paragraph{OS Abstraction}
    243 One of the fundamental part of converting blocking I/O operations into non-blocking ones is having an underlying asynchronous I/O interface to direct the I/O operations. While there exists many different APIs for asynchronous I/O, it is not part of this proposal to create a novel API, simply to use an existing one that is sufficient. uC++ uses the \texttt{select} as its interface, which handles ttys, pipes and sockets, but not disk. It entails significant complexity and is being replaced which make it a less interesting alternative. Another interface which is becoming popular recently\cit is \texttt{epoll}, which is supposed to be cheaper than \texttt{select}. However, epoll also does not handle file system and seems to have problem to linux pipes and \texttt{TTY}s\cit. A very recent alternative that must still be investigated is \texttt{io\_uring}. It claims to address some of the issues with \texttt{epoll} but is too recent to be confident that it does. Finally, a popular cross-platform alternative is \texttt{libuv}, which offers asynchronous sockets and asynchronous file system operations (among other features). However, as a full-featured library it includes much more than what is needed and could conflict with other features of \CFA unless significant efforts are made to merge them together.
     258One fundamental part of converting blocking I/O operations into non-blocking ones is having an underlying asynchronous I/O interface to direct the I/O operations. While many different APIs for asynchronous I/O exist, it is not part of this proposal to create a novel API; I plan to use an existing one that is sufficient. uC++ uses \texttt{select} as its interface, which handles ttys, pipes and sockets, but not disk. It entails significant complexity and is being replaced, which makes it a less interesting alternative. Another popular interface is \texttt{epoll}\cit, which is supposed to be cheaper than \texttt{select}. However, \texttt{epoll} also does not handle the file system and seems to have problems with Linux pipes and \texttt{TTY}s\cit. A very recent alternative that must still be investigated is \texttt{io\_uring}. It claims to address some of the issues with \texttt{epoll} but is too recent to be confident that it does. Finally, a popular cross-platform alternative is \texttt{libuv}, which offers asynchronous sockets and asynchronous file-system operations (among other features). However, as a full-featured library it includes much more than what is needed and could conflict with other features of \CFA unless significant effort is made to merge them together.
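
For reference, the \texttt{epoll} interface being compared boils down to the following skeleton (a sketch only; the fd and what is done with the ready events are placeholders):

    #include <sys/epoll.h>
    #include <unistd.h>

    int demo_epoll(int fd) {
        int ep = epoll_create1(0);
        if (ep < 0) return -1;
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) { close(ep); return -1; }
        struct epoll_event out[16];
        int n = epoll_wait(ep, out, 16, -1);      /* block until fd is ready */
        /* ... hand the n ready events to the event-engine ... */
        close(ep);
        return n;
    }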
    244259
    245260\paragraph{Event-Engine}
    246 Laying on top of the asynchronous interface layer is the event-engine. This engine is responsible for multiplexing (batching) the synchronous I/O requests into an asynchronous I/O request and demultiplexing the results onto appropriate blocked threads. This can be straightforward for the simple cases, but can become quite complex. Decisions that will need to be made include : whether to poll from a seperate kernel thread or a regularly scheduled user thread, what should be the ordering used when results satisfy many requests, how to handle threads waiting for multiple operations, etc.
     261Layered on top of the asynchronous interface is the event-engine. This engine is responsible for multiplexing (batching) the synchronous I/O requests into an asynchronous I/O request and demultiplexing the results onto the appropriate blocked threads. This step can be straightforward for the simple cases, but can become quite complex. Decisions that need to be made include: whether to poll from a separate kernel thread or a regularly scheduled user thread, what ordering to use when results satisfy many requests, how to handle threads waiting for multiple operations, etc.
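
One plausible shape for the demultiplexing step, sketched in C (every name here is hypothetical; the essential idea is that each pending request records which blocked thread to make ready):

    struct thread_desc;                          /* hypothetical thread type      */
    extern void unpark(struct thread_desc *);    /* hypothetical scheduler hook   */

    struct io_request {
        int                  fd;
        int                  result;             /* filled in on completion       */
        struct thread_desc * waiter;             /* user thread blocked on the op */
    };

    void on_completion(struct io_request * req, int result) {
        req->result = result;                    /* publish the result first...   */
        unpark(req->waiter);                     /* ...then make the thread ready */
    }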
    247262
    248263\paragraph{Interface}
    249 Finally, for these components to be available, it is necessary to expose them through a synchronous interface. This can be a novel interface but it is preferrable to attempt to intercept the existing POSIX interface in order to be compatible with existing code. This allows C programs written using this interface to be transparently converted to \CFA with minimal effeort. Where this is not applicable, a novel interface will be created to fill the gaps.
     264Finally, for these components to be available, it is necessary to expose them through a synchronous interface. The interface can be novel, but it is preferable to match the existing POSIX interface in order to be compatible with existing code. Matching allows C programs written using this interface to be transparently converted to \CFA with minimal effort. Where this is not applicable, a novel interface will be created to fill the gaps.
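
As an illustration of matching the POSIX interface, a \texttt{read} shim might look like the following sketch (`async_submit_read` and `park` are hypothetical hooks into the event-engine and scheduler; interposing on libc \texttt{read} is assumed to be handled at link time):

    #include <stddef.h>
    #include <sys/types.h>

    struct io_request { int fd; int result; };   /* as sketched above             */
    extern void async_submit_read(struct io_request *, void *, size_t);
    extern void park(void);                      /* block only the calling user thread */

    ssize_t read(int fd, void * buf, size_t count) {
        struct io_request req = { .fd = fd };
        async_submit_read(&req, buf, count);     /* hand the operation to the engine */
        park();                                  /* kernel thread runs other work    */
        return req.result;                       /* resumed once the result is ready */
    }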
    250265
    251266
  • doc/theses/thierry_delisle_PhD/comp_II/img/base.fig

    r9640813 r7a0f1d25  
    1 #FIG 3.2  Produced by xfig version 3.2.5c
     1#FIG 3.2  Produced by xfig version 3.2.7a
    22Landscape
    33Center
    44Inches
    5 Letter 
     5Letter
    66100.00
    77Single
     
    1091092 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    110110         2400 3900 7200 3900 7200 4500 2400 4500 2400 3900
    111 4 2 0 50 -1 0 12 0.0000 2 180 660 2100 4200 Array of\001
    112 4 2 0 50 -1 0 12 0.0000 2 165 600 2100 4425 Queues\001
    113 4 2 0 50 -1 0 12 0.0000 2 135 645 2100 3075 Threads\001
    114 4 2 0 50 -1 0 12 0.0000 2 180 525 2100 2850 Ready\001
     1112 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
     112         2454 2520 2957 2520
     1132 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
     114         2454 3417 2957 3417
     1152 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
     116         2400 4350 3000 4350
     1174 2 0 50 -1 0 12 0.0000 2 165 720 2100 4200 Array of\001
     1184 2 0 50 -1 0 12 0.0000 2 150 540 2100 4425 Queues\001
     1194 2 0 50 -1 0 12 0.0000 2 135 630 2100 3075 Threads\001
     1204 2 0 50 -1 0 12 0.0000 2 165 450 2100 2850 Ready\001
     1214 0 0 50 -1 0 11 0.0000 2 135 180 2595 3561 TS\001
     1224 0 0 50 -1 0 11 0.0000 2 135 180 2595 2665 TS\001
     1234 0 0 50 -1 0 11 0.0000 2 135 180 2595 4479 TS\001
  • doc/theses/thierry_delisle_PhD/comp_II/img/resize.fig

    r9640813 r7a0f1d25  
    1 #FIG 3.2  Produced by xfig version 3.2.5c
     1#FIG 3.2  Produced by xfig version 3.2.7a
    22Landscape
    33Center
    44Inches
    5 Letter 
     5Letter
    66100.00
    77Single
     
    9898        1 1 1.00 45.00 90.00
    9999         7500 4200 7950 4200
    100 4 0 0 50 -1 0 12 0.0000 2 135 915 7500 3825 Grows with\001
    101 4 0 0 50 -1 0 12 0.0000 2 135 840 7500 4050 additional\001
    102 4 0 0 50 -1 0 12 0.0000 2 135 840 7500 4425 processors\001
     1004 0 0 50 -1 0 12 0.0000 2 135 900 7500 3825 Grows with\001
     1014 0 0 50 -1 0 12 0.0000 2 135 900 7500 4050 additional\001
     1024 0 0 50 -1 0 12 0.0000 2 120 900 7500 4425 processors\001
    103103-6
    1041042 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
     
    1181182 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
    119119         2400 3900 7200 3900 7200 4500 2400 4500 2400 3900
    120 4 2 0 50 -1 0 12 0.0000 2 180 660 2100 4200 Array of\001
    121 4 2 0 50 -1 0 12 0.0000 2 165 600 2100 4425 Queues\001
    122 4 2 0 50 -1 0 12 0.0000 2 135 645 2100 3075 Threads\001
    123 4 2 0 50 -1 0 12 0.0000 2 180 525 2100 2850 Ready\001
     1202 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5
     121         3600 5100 4500 5100 4500 6300 3600 6300 3600 5100
     1222 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3
     123        1 1 1.00 60.00 120.00
     124         3750 5550 2400 5550 2400 4500
     1254 2 0 50 -1 0 12 0.0000 2 165 720 2100 4200 Array of\001
     1264 2 0 50 -1 0 12 0.0000 2 150 540 2100 4425 Queues\001
     1274 2 0 50 -1 0 12 0.0000 2 135 630 2100 3075 Threads\001
     1284 2 0 50 -1 0 12 0.0000 2 165 450 2100 2850 Ready\001
     1294 0 0 50 -1 0 13 0.0000 2 135 1980 3600 5025 Cluster Data Structure\001
     1304 0 0 50 -1 0 13 0.0000 2 165 1170 2400 5775 Array Pointer\001